Chapter 7: The Wisconsin Breast Cancer Epidemiology Simulation Model

Abstract

The Wisconsin Breast Cancer Epidemiology Simulation Model is a discrete-event, stochastic simulation model using a systems-science modeling approach to replicate breast cancer incidence and mortality in the U.S. population from 1975 to 2000. Four interacting processes are modeled over time: (1) natural history of breast cancer, (2) breast cancer detection, (3) breast cancer treatment, and (4) competing cause mortality. These components form a complex interacting system simulating the lives of 2.95 million women (approximately 1/50 the U.S. population) from 1950 to 2000 in 6-month cycles. After a “burn in” of 25 years to stabilize prevalent occult cancers, the model outputs age-specific incidence rates by stage and age-specific mortality rates from 1975 to 2000. The model simulates occult as well as detected disease at the individual level and can be used to address “What if?” questions about effectiveness of screening and treatment protocols, as well as to estimate benefits to women of specific ages and screening histories.

As part of the National Cancer Institute's Cancer Intervention and Surveillance Modeling Network (CISNET) consortium we developed and calibrated the Wisconsin Breast Cancer Epidemiology Simulation Model. The model is a discrete-event, stochastic simulation model designed to replicate breast cancer incidence and mortality rates in a population with size and age structure of the Wisconsin female population but generalizing to breast cancer epidemiology in the U.S. population from 1975 to 2000. This paper describes the model development, structure, and calibration. A complete specification of the model and all parameters is available electronically at the CISNET Web site (http://cisnet.cancer.gov/profiles/).

MODEL HISTORY AND OBJECTIVES

The Wisconsin model evolved from a model constructed by Chang (1).
Chang's deterministic model replicated Wisconsin breast cancer incidence and mortality from 1980 to 1992, but only if a substantial fraction of all breast cancers are predestined from their occult biologic onset to have limited malignant potential, i.e., to grow to only a limited size, approximately 1-cm diameter, and not be a lethal threat to the woman. This simulation has two objectives: 1) to develop a simulation based on components similar to those in Chang's model, but to predict age- and stage-specific breast cancer incidence rates and age-specific mortality rates in the U.S. population and 2) to conduct simulations at the individual woman level allowing exploration of relative costs and effectiveness of alternative screening protocols in the U.S. population. MODELING APPROACH AND COMPONENT OVERVIEW We used a systems engineering approach to construct the model. We decomposed the complex, real-world system, which results in observed national breast cancer statistics, into interacting subsystems and modeled these component systems and their interactions (2). Our model is a discrete-event simulation with a fixed cycle time of 6 months beginning in calendar year 1950. The model is populated by 2.95 million women, divided into birth cohorts, making up a female population aged 20–100 years living between 1950 and 2000. The size and cohort structure in this simulated population is identical to that of Wisconsin in those years to allow us to generate a case-based cancer registry matching the number of cases in Wisconsin for comparison to breast cancer case counts in the Wisconsin Cancer Reporting System (WCRS) state cancer registry. However, model parameters were calibrated to breast cancer statistics reported in the National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) program (3). Thus, when the simulated registry results are reported as age-adjusted rates, they apply to the U.S. 
population as represented in SEER generally and are not Wisconsin specific. Women are individually simulated from 1950 (or the year in which they were age 20) until they die a simulated death, achieve age 100, or the simulated year 2000 is reached. The processes simulated are as follows: 1) the natural history of breast cancer from occult onset to breast cancer death, 2) detection of breast cancer by screening mammography or other diagnostic pathways, 3) effectiveness of treatment of breast cancer and diffusion of adjuvant therapies over time, and 4) death from non–breast cancer causes. Each of these processes is stochastic and unfolds over time in the population; jointly they determine observed breast cancer epidemiology.

A run of the model is initialized in 1950 assuming that all women in that year are breast cancer free. The breast cancer onset and progression submodels are invoked in 6-month cycles from simulated year 1950 to 1975 as “burn-in” for the simulation. The prevalence of occult breast cancer and detected breast cancer is stabilized after the 25-year burn-in so that the simulated population going forward from 1975 is appropriately represented with prevalent occult and detected breast cancer. For comparison of output to that of other CISNET collaborators when we computed age-adjusted rates, we adjusted results to the U.S. standard population (male and female) aged 30–79 years in 2000.

MODEL COMPONENTS AND PARAMETER DESCRIPTION

Natural History Submodel

An important assumption in the model is that in situ carcinoma is an early stage of invasive cancer. Breast cancers are simulated as idealized spherical tumors having occult onset with diameter 2 mm, a lower bound chosen to be minimally detectable with technologies prevalent in 2000. Simulated tumors grow according to a Gompertz-type function with asymptotic diameter of 8 cm (4).
The Gompertzian growth rate for an individual tumor is fixed at its onset by a random draw from a gamma distribution of growth rates common to all tumors in the model and women of all ages. The mean and variance of this gamma distribution (Mean gamma and Var gamma in Table 1) were fitted in the model calibration process described below. In each 6-month cycle possible metastatic spread of the tumor is simulated stochastically by a random draw from a Poisson distribution determining number of new positive lymph nodes in that period; the Poisson distribution mean is a function of current tumor size and instantaneous growth rate developed by Shwartz (5–8).

Table 1.  Input parameters, their sampled ranges and the best-fitting values

| Parameter | Use in the model | [Sampled range during phase 1, when LMP fraction = 0] (increment size for discretized sampling) | [Sampled range during phase 2, when LMP fraction allowed >0] (increment size for discretized sampling) | Final value |
|---|---|---|---|---|
| In situ boundary | The diameter (cm) below which the tumor is classified as in situ stage in the simulation if there are no associated positive lymph nodes | [0.75–1.0] (0.01) | [0.85–0.99] (0.01) | 0.95 cm |
| Onset proportion | Ratio of assumed age-specific biologic onset rate divided by age-specific incidence rate (the latter specified by age–period–cohort model estimated in absence of screening; see text). | [0.85–1.2] (0.01) | [0.8–1.0] (0.01) | 0.9 |
| Onset lag | Time interval (years) between year of index onset rate and incidence rate used in Onset Proportion. This is to “fill the pipeline” with biologically onset tumors which will be discovered at a given incidence rate some years later. Because the cycle time of the model is 0.5 years, this was taken to be step size. | [1–8] (0.5) | [1.5–4] (0.5) | 3 y |
| Mean gamma | The Gompertz growth rate is assumed to have a gamma distribution across all onset tumors. This parameter is the mean of this gamma distribution (see text). | [0.01–0.2] (0.01) | [0.08–0.18] (0.01) | 0.12 |
| Var gamma | The variance of the gamma distribution of Gompertz growth rates | [0.006–0.1] (0.001) | [0.01–0.05] (0.001) | 0.012 |
| Percent 4 nodes | Percentage of biologically onset, non-LMP tumors that are assigned 4 positive lymph nodes at onset. (This places these tumors at the upper limits of simulated regional tumors, which are presumed to have 1–4 positive nodes.) | [0%–5%] (1%) | [0%–1%] (1%) | 1% |
| Percent 5 nodes | Percentage of biologically onset, non-LMP tumors that are assigned 5 positive lymph nodes at onset. (This simulates these tumors in the distant stage from their initiation in the model.) | [0%–5%] (1%) | [2%–4%] (1%) | 2% |

In the model, tumors are mapped to the four SEER historical stages according to their simulated size and number of positive lymph nodes. Tumors without associated positive lymph nodes and below a critical size (In situ boundary in our parameterization; see Table 1) are defined as in situ. Beyond this size, but still without positive nodes, the tumor is defined to be localized invasive cancer. Any tumor with one to four simulated positive nodes is defined to be in the regional stage. Tumors with five or more positive nodes are defined as distant stage. However, simulation of number of positive lymph nodes is meant to be a simulacrum rather than facsimile of the physical disease process of metastatic spread and is used only crudely in our model to provide a continually updated probability of regionally spread and then of distantly spread disease. We maintained the “positive lymph node” metaphor from Shwartz's early model but do not use this other than as a way to indicate three categories of spread (none, regional, distant). More detailed modeling of the metastatic process is beyond scope of this model. This simulation process yields stage-specific incidence statistics to replicate WCRS and SEER data.
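The growth-rate draw and the stage mapping described above can be illustrated with a short sketch. This is not the authors' C++ implementation: the function names are hypothetical, the default values are the final fitted values from Table 1, and the conversion from a gamma mean and variance to shape and scale is the standard identity (shape = mean²/variance, scale = variance/mean). The Poisson node-spread step and the Gompertz curve itself are omitted because their functional forms are not given in this section.

```python
import random

# Final fitted values from Table 1.
IN_SITU_BOUNDARY_CM = 0.95
MEAN_GAMMA = 0.12
VAR_GAMMA = 0.012


def gamma_shape_scale(mean, variance):
    """Convert a gamma mean/variance pair to (shape, scale)."""
    shape = mean ** 2 / variance
    scale = variance / mean
    return shape, scale


def draw_growth_rate(rng, mean=MEAN_GAMMA, variance=VAR_GAMMA):
    """Draw one tumor's Gompertz growth rate, fixed at onset (hypothetical helper)."""
    shape, scale = gamma_shape_scale(mean, variance)
    # random.gammavariate takes (alpha=shape, beta=scale).
    return rng.gammavariate(shape, scale)


def seer_stage(diameter_cm, positive_nodes, in_situ_boundary_cm=IN_SITU_BOUNDARY_CM):
    """Map simulated size and node count to one of the four SEER historical stages."""
    if positive_nodes >= 5:
        return "distant"
    if positive_nodes >= 1:
        return "regional"
    if diameter_cm < in_situ_boundary_cm:
        return "in situ"
    return "localized"


if __name__ == "__main__":
    rng = random.Random(2000)  # arbitrary seed, for repeatability only
    print(gamma_shape_scale(MEAN_GAMMA, VAR_GAMMA))
    print(draw_growth_rate(rng))
    print(seer_stage(0.5, 0), seer_stage(1.2, 0), seer_stage(1.2, 3), seer_stage(0.4, 5))
```

With the fitted mean of 0.12 and variance of 0.012, the gamma distribution has shape 1.2 and scale 0.1, so it is strongly right-skewed: most simulated tumors grow slowly and a minority grow quickly.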
We stress that calibration of the model, described later, focused on producing stage-specific incidence rates over time to match SEER data for 1975–1999. The underlying simulated growth model was not calibrated directly with, or compared to, data concerning tumor sizes per se. Depending on the stage of cancer, “tumor size” in the model functions in two ways. Once a random growth rate is fixed for a simulated tumor, tumor size and passage of clock time are equivalent. In the in situ stage, where real-world breast carcinomas often are highly irregular in shape and physical extent, “tumor size” is used to mark clock time in the stage; variability in growth rate distribution induces variability of dwell times in this stage. In our model, a tumor transits to the localized invasive stage at 9.5 mm in “tumor size.” At this point “tumor size” is less metaphorical and codes for size of the spherical mass insofar as it influences average radiographic and clinical exam detectability of the tumor. Thus distribution of tumor size among simulated incident invasive cancers might be expected to be similar to actual data for tumors of 1 cm or larger, although such correspondence was not forced by the calibration process, which considered only incidence rates by stage. The model specifies that breast cancer death occurs only as an endpoint of the process of uncontrolled growth and spread of an invasive tumor. Once a simulated tumor enters the distant stage, its natural history is presumed to be lethal at a rate described by the survival of women entered in the SEER registry as diagnosed in the distant stage during the premammography era (1975–1982). We estimated an empirical distribution of survival times from time of diagnosis for such women from SEER data (3). The mean survival was 1.95 years and the median 5.22 years. 
This finding potentially underestimates true survival time at this point because SEER incident cases will have transited occultly to the distant stage some time before diagnosis. On the other hand, treatment after diagnosis acts to prolong survival somewhat in this stage (even pre-1982), and so this estimate is also biased toward overestimating survival with untreated time course. We do not know the net effect of these offsetting biases for estimated natural history time course after transit to the distant stage.

The rate of onset of occult breast tumors in women without breast cancer is a function of expected incidence in the absence of screening. For the CISNET consortium, Holford et al. developed an age–period–cohort (APC) model predicting total breast cancer incidence in the absence of screening discussed elsewhere in this volume (9). A signal characteristic of this CISNET base case input is a predicted secular increase in breast cancer incidence almost coincident in time with diffusion of mammography.

The model simulates tumors forward in time from a point of occult onset. Because the APC model predicts total breast cancer incidence rates (i.e., rate of diagnosed breast cancer) in absence of screening and not the occult onset rate for tumors, we cannot use the APC age- and calendar year–specific incidence rates directly as an input. First, not all onset tumors will become incident; a woman may die with an undiagnosed tumor, for example. So the APC incidence rates must underpredict the total onset population of tumors by some unknown amount. We fit a parameter termed “Onset proportion” to be the ratio of onset to incident tumors across women and time. Initially we bounded this parameter below at 1 and expected it to be in the range 1–1.2, reflecting the possibility that up to 20% of occult-onset tumors would never surface in the woman's lifetime (as discussed below, this changed when we found it necessary to introduce into the model tumors that regress).
Second, there must be some interval between the time of occult onset and the average time of diagnosis, so the APC incidence rate in year Y will reflect tumors which were onset in year Y − l where l reflects an average lag between onset and incidence; we term the average of this interval the “Onset lag.” Onset lag (Table 1) is a parameter fitted during model calibration. Thus in our initial model, for women of a given age in a given year, the risk of occult onset of a breast tumor was the APC model incidence rate for the same cohort of women Onset lag years in the future, multiplied by the Onset proportion. As discussed next, the meaning of the onset proportion parameter changed slightly at the next step. Both of these parameters should no doubt be age related to reflect differential all-cause mortality across women of different ages, which can effectively censor incidence of occult tumors. However, we fit them as constant across all women to simplify the modeling somewhat. Future iterations of the model will explore relaxing this assumption. An important finding of the Wisconsin Breast Cancer Epidemiology Simulation Model was that the natural history model as described to this point cannot account for the steep rise of in situ and small localized invasive cancers after the advent of mass screening. Far too many of these early stage tumors were detected to be accounted for by an occult pool of tumors that would have progressed to be late stage tumors if not found (10). Following Chang's (1) ideas we included the possibility that some proportion of tumors at onset were destined to be limited malignant potential (LMP) tumors. LMP tumors 1) start to grow at the same rate as lethal tumors, 2) stop growing at a small size (Max LMP size), and, 3) if undetected after a fixed length of time (LMP dwell time), disappear. The fraction of tumors at onset randomly selected to be LMP is LMP fraction. 
We believe these three parameters (Table 2) are needed to explain observed patterns of breast cancer incidence and mortality 1975–2000, under the assumption that in situ carcinoma of the breast is an early stage of invasive cancer as discussed below.

Table 2.  Parameters governing limited malignant potential tumors

| Parameter | Use in the model | [Sampled Range] (increment size for discretized sampling) | Final value |
|---|---|---|---|
| LMP fraction | Proportion of all tumors assumed to be limited malignant potential (LMP) | [0%–10%]; [30%–50%] (1%) | 42% |
| Max LMP size | LMP tumors assumed to grow no larger than this diameter (cm) | [0.8–1.5] (0.1 cm) | 1 cm |
| LMP dwell time | Maximum sojourn time (y) for LMP tumor after reaching max LMP size; after this time without discovery, the LMP tumor disappears next simulation cycle. | [1–3 y] (0.5 y) | 2 y |

Introduction of LMP tumors required redefinition of onset proportion as the ratio of onset of non-LMP tumors to APC-predicted incidence. It was no longer constrained to be greater than one, and during calibration we searched in a region from 0.85 to 1.2. Total onset rate is equal to non-LMP onset rate plus LMP onset rate: Total Onset = (non-LMP onset) + (LMP onset). We have two parameters, LMP fraction = (LMP onset)/(Total onset), and Onset Proportion = (non-LMP onset)/(APC incidence).
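The two ratios just defined determine the total onset rate algebraically: non-LMP onset = Onset Proportion × APC incidence, and because non-LMP onset is the fraction (1 − LMP fraction) of total onset, total onset = non-LMP onset / (1 − LMP fraction). A minimal sketch of that arithmetic follows. The function name is hypothetical, the defaults are the final fitted values from Tables 1 and 2, and the APC rate of 200 per 100,000 is an invented illustrative input, not a value from the APC model.

```python
def total_onset_rate(apc_incidence_at_lag, onset_proportion=0.9, lmp_fraction=0.42):
    """Total occult onset rate implied by the two fitted ratios.

    apc_incidence_at_lag: APC-predicted incidence (no screening) for the same
    cohort, Onset lag years after the onset year. Units are passed through.
    """
    if not 0.0 <= lmp_fraction < 1.0:
        raise ValueError("lmp_fraction must lie in [0, 1)")
    non_lmp_onset = onset_proportion * apc_incidence_at_lag
    return non_lmp_onset / (1.0 - lmp_fraction)


if __name__ == "__main__":
    apc_rate = 200.0  # hypothetical rate per 100,000 woman-years, illustration only
    total = total_onset_rate(apc_rate)
    lmp_onset = 0.42 * total
    non_lmp_onset = total - lmp_onset
    print(f"total={total:.1f}, LMP={lmp_onset:.1f}, non-LMP={non_lmp_onset:.1f}")
```

With an LMP fraction of 0.42, total onset is about 1.55 times the APC-predicted incidence (0.9/0.58), even though non-LMP onset alone is only 0.9 times that incidence.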
Putting these together, we determine the total onset rate in year y for a particular cohort of women to be:

\begin{equation*}
\mathit{TotalOnset} = \mathit{APC}_{(y+\mathit{OnsetLag})} \cdot \mathit{OnsetProportion} \cdot \frac{1}{1-\mathit{LMPfraction}}
\end{equation*}

where APC_{(y+OnsetLag)} is the APC model–predicted incidence rate in absence of screening for women in this cohort in year y + OnsetLag. To foreshadow results, the fitted value for Onset Proportion was determined to be 0.9 and LMP fraction to be 0.42.

In the natural history model, each newly onset tumor is assigned a random Gompertz growth rate and then assigned at random to be LMP or not. Among those tumors that are not LMP, two additional parameters govern a small population of hyperaggressive tumors: a fraction (Percent 4 nodes, Table 1) is assigned four associated positive lymph nodes at onset at 2 mm diameter, and another fraction (Percent 5 nodes, Table 1) is assigned five positive nodes. These hyperaggressive tumors were needed to match SEER data and avoid depleting the reservoir of tumors to be discovered as regional and distant tumors under all reasonable screening regimes.

Modeling Detection of Breast Cancer

In any given 6-month period a simulated woman's occult tumor may be detected either by screening mammography or by other means (collectively referred to below as “clinical surfacing”). Any schedule of mammography can be arbitrarily imposed by the simulation user. For present purposes we used U.S. data over the period 1975–2000 and a stochastic model of age-specific screening propensity and rates based on data from the Breast Cancer Surveillance Consortium to assign screening dates to women simulating historical screening patterns in the United States (11,12). This stochastic model is available at the CISNET Web site (http://cisnet.cancer.gov/interfaces/). When a woman with a simulated occult tumor is screened, the probability of detection is specified as a function of the idealized spherical tumor diameter.
Functions relating detection probability to tumor diameter are specified as a constant probability over the interval 0.2–0.5 cm and then specified pointwise at 0.75 cm, 1.5 cm, and 2 cm; detection probability at 5 cm is fixed at 0.99 and at 8 cm at 1.0. Separate functions were fitted for screening mammography in the year 1984 and 2000 to reflect technological improvements, and these were further differentiated for women younger and older than 50 years in these years to reflect postmenopausal changes in breast radiolucency (13–15). Detection probabilities for years before 1984 were assumed equal to those in 1984; linear interpolation for other tumor diameters and calendar years was used. Clinical surfacing is modeled similarly; however, instead of point probabilities, we fitted annualized rates of detection as a function of tumor diameter at the beginning of the 6-month interval. Separate functions were fit for the year 1990 and 2000 to reflect increasing awareness of breast cancer in the population in this decade. Clinical surfacing before 1990 was presumed to be at the 1990 rate. Figure 1 (left panel) shows four functions relating mammogram detection probability to tumor diameter. These four curves were fitted during calibration by assuming the detection probability was zero for tumor diameters less than 0.2 cm, a constant in the range 0.2–0.5 cm, a larger constant at 0.75 cm, a larger constant yet at 1.5 cm, larger yet at 2.0 cm, and equal to 0.99 at 5 cm, and 1.0 at 8 cm. Sixteen constants (subject to the ordinal constraints noted, and so that probability given size was at least as large in 2000 as in 1990 and at least as large for women over age 50 as under) were fitted during model calibration to specify the four mammogram sensitivity curves shown. On the right panel in Fig. 1 are two curves showing clinical surfacing rates for 1990 and 2000. 
Rates at tumor sizes less than 0.3 cm were assumed to be zero; constants specifying rates at 0.3, 1, 1.5, 2, 3, 4, and 5 cm were fitted during calibration subject to obvious ordering constraints, and all tumors 8 cm in diameter were presumed to surface within a year.

Fig. 1. Mammogram sensitivity and clinical surfacing rates used in the final model. Mammogram sensitivity is shown for women less than 50 years old and for women aged 50 years and older in 1984 (thin dotted and solid lines) and for the same two age groups in the year 2000 (thick dotted and solid lines). Clinical surfacing rate fraction as a function of tumor diameter is shown for 1990 and for 2000.

Adjuvant Treatment Effectiveness and Adjuvant Treatment Diffusion Submodel

For simplicity we model treatment as a cure/no-cure process. When a breast cancer is detected it is assumed to be treated according to prevailing practices for tumors of that stage and women that age in the year of detection. The result of simulated treatment is either “cure,” with total arrest of tumor progression and no possibility of progressing to a breast cancer death, or “no cure,” in which case the tumor continues to progress as if it were undetected and the woman may die of breast cancer, competing causes, or achieve age 100 before year 2000, depending on her individual circumstances.
Continued tumor progression in this case is used as a marker for progression to breast cancer death and is not meant to be biologically representative of tumor growth per se since the primary tumor may be gone. The treatment submodel has three logical parts. First, we specified “baseline” treatment effectiveness—cure fractions—in the pretamoxifen, preadjuvant multiagent chemotherapy era for tumors treated at different stages with a standard, baseline therapy. These baseline cure probabilities represent mastectomy with or without radiation as was common in the prescreening era. Second, we calculated implied cure fractions for the various combinations of adjuvant therapies added to the baseline therapy. Third, we specified the diffusion of adjuvant treatments over time as a function of characteristics of the woman and the stage of tumor at diagnosis. The model assumes that in addition to baseline treatment, a woman may receive one of five modes of adjuvant therapy depending on her tumor and the calendar year. The different modes of adjuvant therapy are chemotherapy alone, tamoxifen alone for 2 years, tamoxifen alone for 5 years, chemotherapy and a 2-year course of tamoxifen, or chemotherapy and a 5-year course of tamoxifen. A woman with breast cancer detected in localized or regional stage is assigned a mode of adjuvant treatment based on the calendar year, her current age, tumor size/stage (and revealed estrogen receptor [ER] status in years after which ER status was commonly measured). Tumors diagnosed in the in situ or distant stages are not assigned adjuvant therapy in the model. The likelihood of each mode of treatment is based on treatment data provided by NCI from the Patterns of Care study as well as combined data from numerous cancer registries (11,16). Revealed ER status is modeled as a function of the calendar year and true ER status, which is simulated based on the age of the woman at the time of tumor onset (Table 3) (17). 
In the simulation, the treatment probabilities are determined in part by whether the ER status is known. We used SEER data from 1990 forward (the first year this was recorded in the SEER data) to estimate the proportion of tumors with ER status determined; probabilities before this time were based on assessment of a local expert oncologist–breast cancer researcher (Table 4). The treatment administered is in part determined by whether the ER status is known and if so whether it is positive or negative. The treatment effectiveness is determined as a function of the true underlying ER status of the tumor and the treatment given.

Table 3.  Probability that a tumor is estrogen receptor positive by age

| Age of woman at detection of tumor, y | Probability tumor is estrogen receptor positive |
|---|---|
| <45 | 0.6 |
| 45–54 | 0.65 |
| 55–64 | 0.74 |
| 65–74 | 0.77 |
| ≥75 | 0.83 |

Table 4.  Probability that the true estrogen receptor status of a tumor will be known

| Year | Probability estrogen receptor status is known |
|---|---|
| <1975 | 0.1 |
| 1975–1979 | 0.2 |
| 1980–1984 | 0.5 |
| 1985–1989 | 0.63 |
| 1990 | 0.68 |
| ≥1991 | 0.69 |

Meta-analyses of adjuvant therapy trials for early stage breast cancer (18,19) showed that adjuvant chemotherapy produced a 27%, 14%, and 18% reduction in annualized odds of 10-year all-cause mortality for women under age 50, 50–59, and aged 60 or older, respectively. In women with ER-positive breast cancer, a 2-year course of tamoxifen resulted in an 18% reduction in annualized odds of 10-year all-cause mortality and a 5-year course resulted in a 28% reduction independent of age. The effect sizes for adjuvant chemotherapy and tamoxifen appeared independent.
We used these data to instantiate our treatment effectiveness model [see also (11)]. The model's final calibrated baseline cure fractions were 0.99 for in situ, 0.82 for localized, 0.45 for regional, and 0.05 for distant tumors. Baseline cure fractions are assumed representative of tumors treated in 1977–1981 before widespread use of adjuvant therapy. From the meta-analyses' results and these assumed baseline cure fractions, we calculated implied cure fractions for all combinations of adjuvant therapy and baseline treatment for women with localized or regional tumors. We did not adjust the already high cure fraction for in situ stage or the low rate for distant cancers as adjuvant treatments have little effect in the first case and data are sparse concerning their use in the latter. We illustrate our calculations with women aged 50–59 years diagnosed with a regional breast cancer and treated with multiagent adjuvant chemotherapy. Among simulated women this age and without breast cancer, the 10-year survival probability was 0.911 based on the CISNET common input for all-cause mortality with breast cancer removed (20); among corresponding women diagnosed with regional-stage breast cancer and no treatment (i.e., allowing the simulated tumors to progress according to the natural history model), the 10-year survival probability (all-cause mortality) was observed to be 0.198 in model output. As a model input, the baseline cure fraction without adjuvant therapy was 0.45 for women with regional cancer. So the 10-year survival probability for women with this cancer in this age range is modeled as a 45% mixture of the two survival probabilities, one for cured and one for uncured women: 0.45(0.911) + (1 − 0.45)(0.198) = 0.519. The annualized all-cause mortality rate corresponding to a 10-year survival probability of 0.519 is −(1/10)(ln(0.519)) = 0.066, and this corresponds to an annual mortality probability of 0.063, which in turn is an annual mortality odds of 0.067. 
Adjuvant chemotherapy is presumed to reduce the annual mortality odds by 14% (the reduction reported in the meta-analyses), leaving an annual mortality odds of 0.058. Reversing the calculation steps, this mortality odds corresponds to a 10-year survival fraction of 0.567. We now solve for the cure fraction such that a mixture of the 0.911 survival probability given cure and the 0.198 survival probability without cure gives a survival fraction of 0.567. The answer is a cure fraction of 0.518. This cure fraction was applied in the simulation to all women aged 50–59 years diagnosed with regional-stage breast cancer and treated with multiagent adjuvant chemotherapy. For women with both adjuvant chemotherapy and tamoxifen, the two mortality reductions were applied independently to the annual mortality odds in the middle step. Tamoxifen was assumed effective only in ER-positive tumors. In the final steps of model calibration, the baseline cure fractions for local and regional tumors were adjusted to make simulated breast cancer mortality rates match U.S. data for 1975–2000 as described at the end of the calibration section. The baseline cure fractions given above are the final ones for the calibrated model. Although overall fit was good to U.S. breast cancer mortality data, we observed in the simulation that breast cancer mortality among older women was consistently too low in the penultimate calibration of the model. This finding is consistent with observations that older women are less aggressively treated than younger women (21–26) since neither the baseline treatment effectiveness nor the treatment dissemination submodel depend on the patient's age, so older women were “cured” in the model at the same rates as younger women. 
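The implied cure fraction calculation worked through earlier in this section (women aged 50–59 years, regional stage, multiagent adjuvant chemotherapy) can be retraced in a few lines. The sketch below is an illustration of the stated arithmetic, not the authors' code; the function names are hypothetical, and the inputs 0.45, 0.911, 0.198, and 0.14 are the values quoted in the text. Because the text rounds at each intermediate step, the result should agree with the reported 0.518 only up to rounding.

```python
import math


def survival_to_annual_odds(survival, years=10):
    """10-year survival -> annualized rate -> annual probability -> annual odds."""
    rate = -math.log(survival) / years
    prob = 1.0 - math.exp(-rate)
    return prob / (1.0 - prob)


def annual_odds_to_survival(odds, years=10):
    """Reverse of survival_to_annual_odds."""
    prob = odds / (1.0 + odds)
    rate = -math.log(1.0 - prob)
    return math.exp(-rate * years)


def implied_cure_fraction(baseline_cure, surv_cured, surv_uncured,
                          odds_reductions, years=10):
    """Cure fraction whose cured/uncured mixture matches the treated survival."""
    baseline_surv = baseline_cure * surv_cured + (1.0 - baseline_cure) * surv_uncured
    odds = survival_to_annual_odds(baseline_surv, years)
    for reduction in odds_reductions:  # reductions are applied independently
        odds *= 1.0 - reduction
    treated_surv = annual_odds_to_survival(odds, years)
    return (treated_surv - surv_uncured) / (surv_cured - surv_uncured)


if __name__ == "__main__":
    cure = implied_cure_fraction(
        baseline_cure=0.45,      # regional stage, no adjuvant therapy
        surv_cured=0.911,        # 10-year survival without breast cancer
        surv_uncured=0.198,      # 10-year survival, untreated regional cancer
        odds_reductions=[0.14],  # chemotherapy, ages 50-59
    )
    print(f"implied cure fraction: {cure:.3f}")
```

Passing two entries in `odds_reductions`, for example chemotherapy and tamoxifen, applies both reductions multiplicatively to the annual odds, which is one reading of the statement that the two mortality reductions were applied independently.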
Rather than attempting to micromodel treatment of older women, for the final calibration we simply reduced the cure fractions of the two more advanced stages by one-half for women aged 70 years and older at diagnosis to better match age-specific mortality in this age group. Although not affecting overall age-adjusted mortality much, this correction appears to bring the simulation more in line with mortality among older women. Future versions of the simulation may address treatment of older women in more realistic detail. Treatment of LMP tumors is assumed to be 100% curative since these tumors are, by our definition, not lethal.
Mortality From Non–Breast Cancer Causes
We actuarially adjusted age-specific all-cause life tables of female mortality by birth cohort to develop mortality rates from non–breast cancer causes [see Berkeley Mortality Data Base (27)], using age-specific breast cancer mortality from the Centers for Disease Control and Prevention. Derivation of non–breast cancer mortality tables by one of our modeling team is described in chapter 3 of this volume (20). The resulting non–breast cancer mortality rates are a common input to all the CISNET breast cancer collaboration models.
The Modeled Population
The model was designed to produce counts of incident tumors and breast cancer deaths over time in a population with the size and age structure of the state of Wisconsin. Starting the model in 1950, we simulate a number of women equal to each single-year age cohort of women in the Wisconsin population aged 20–99 years in 1950. In each year 1951–2000, we add to the simulated cohort the number of women aged 20 years in that year. The total number of women simulated is 2.95 million. Each complete simulation of these women is one replication of the simulation and results in data simulating the number of breast cancer cases in Wisconsin from 1950 to 2000.
The simulation is programmed in C++ using Microsoft Visual Studio Version 6. It runs either in standalone mode or as multiple replications in parallel using the CONDOR sharing software (28–30). Each stochastic event in the simulation uses its own stream of random numbers, allowing use of the technique of common random numbers when performing experiments comparing, e.g., different screening protocols. A run of the model is initialized in 1950 assuming all women in that year are breast cancer free. The breast cancer natural history submodel is invoked in 6-month cycles from simulated year 1950 to 1974 as “burn-in” for our model. The prevalence of occult breast cancer has stabilized after the 25-year burn-in so that the population going forward from 1975 is appropriately seeded with prevalent occult and detected breast cancer. Our fit to observed prevalent cases in 1975 is good. For standardized comparison of output we computed age-adjusted rates for all results using the U.S. standard population aged 30–79 years in the year 2000.
MODEL CALIBRATION
Model components are governed by parameters falling into three broad classes:
Fixed inputs common to all CISNET collaborators:
1. Cohort-specific mortality from non–breast cancer causes (20)
2. APC-predicted total breast cancer incidence rates in the absence of screening (9)
3. Functions governing diffusion of mammography over time for women of different ages (11,12)
4. Functions governing diffusion of adjuvant therapy over time (11,16)
Fixed inputs derived from literature for the Wisconsin model:
5. Adjuvant treatment effectiveness for all-cause mortality reduction (see text)
6. Parameters governing proportions of estrogen receptor–positive tumors (Tables 3 and 4)
Parameters fitted during model calibration:
7. Natural history parameters (Tables 1 and 2); 10 parameters
8. Breast cancer detection probabilities and rates determining curve segments in Fig. 1; 29 parameters, with many ordinal constraints
9. Baseline cure fractions for in situ, local, regional, and distant stage tumors; four parameters
The simulation model is a complicated, nonlinear function of the input parameters. Its output is multivalued and may be summarized by five curves: the predicted U.S. age-adjusted incidence rates for in situ, localized, regional, and distant breast cancers over time for the years 1975–2000 (see Fig. 2), and the U.S. age-adjusted breast cancer mortality rates over time for the same years. Furthermore, because the simulation is stochastic, separate runs, each with the same set of input parameter values, can produce different outputs if allowed by program control to use different random-number seeds. The input parameters listed above as sets numbered 7 and 8 were manipulated to make the simulation output conform as closely as possible to observed SEER and WCRS incidence rates within the 26-year time frame.
Fig. 2. Incidence rates are shown for four stages of breast cancer observed in Surveillance, Epidemiology, and End Results (SEER) data and the Wisconsin Cancer Reporting System (WCRS) within the interval from 1975 to 2000, adjusted to the U.S. standard population aged 30–79 in year 2000. The dotted boundaries show the acceptance envelopes used for parameter sampling experiments during model calibration. Note, the vertical scales are different for each stage to allow comparison of the dynamic shapes of all four curves.
Several technical challenges were encountered in doing calibration. First, traditional measures of fit such as least-squared error were not useful because they emphasize correspondence of the simulation output to the observed data at points with highest incidence rates, thus de-emphasizing fitting the dynamic shapes of incidence curves which rise from low rates to high rates, and also de-emphasizing fit of the low-incidence, distant, and in situ stages in favor of high-incidence stages (localized, regional). Second, because of the complex and nonlinear nature of the simulation, traditional sensitivity analyses examining only one or a few parameters at a time yield little insight into the solution. We approached calibration by using acceptance sampling to fit the 29 parameters in sets 7 and 8. Biologically plausible ranges were set for each free parameter. A vector of parameter values was drawn at random by drawing each parameter from an independent uniform distribution over a discrete partition of its range so that all ordinal constraints on detection rates and probabilities were satisfied. The simulation was run using the sampled input values as fixed inputs, generating a cancer registry for the 2.95 million woman population from 1975 to 2000, and the age-adjusted (to year 2000 U.S. standard population aged 30–79 years) stage-specific incidence rates were computed for each year in this range. This output was then assigned a score by using acceptance envelopes around the observed SEER incidence rate curves. Fig. 2 shows acceptance envelopes placed around the SEER data. The envelopes were specified heuristically to encompass variation around SEER that might naturally be expected in a population the size we simulated. For example, the corresponding incidence curves from the Wisconsin Cancer Reporting System for 1978–1999 are plotted in Fig.
2 to show breast cancer incidence in a population the size we simulate and which is independent of the SEER data and not distinguishable from the set of similar curves for individual SEER registries (not shown here). The envelopes were set to also enforce the general shapes of the incidence curves, in particular requiring increasing incidence of in situ and local stage tumors throughout the period and flattening of regional and distant stage curves in the latter part of the covered years. Simulation output was scored by counting the number of time points across the four graphs at which the simulation-based incidence rate fell outside of the envelopes; the best possible score was zero, and the worst possible score was 104 (= 4 stages × 26 time points from 1975 to 2000). Empirically we determined a minimally acceptable simulation to have a score of 10 or less; exceptionally good output scored 5 or less. This scoring system was used to screen sampled parameter vectors. The sampling was carried out in two phases. First we assumed all onset breast cancers progress to be invasive breast cancer, and eventually lethal, if undetected and untreated. Under this assumption we did extensive natural history parameter sampling to find the best solution. The ranges for sampled parameters in this first phase are shown in the third column of Table 1. The constant Onset proportion (row 2, Table 1) is a fitted constant equaling the ratio of the age-specific and period-specific onset rate of preclinical occult tumors (malignant tumors that will progress and eventually threaten the woman's life) to the predicted breast cancer incidence rate in the absence of screening, this latter rate being a primary input to the simulation. Because some women may die of other causes with undetected tumors, this constant should be greater than 1.0. We sampled a range from 0.85 to 1.2 for this parameter allowing us to double check program logic. 
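The acceptance-envelope scoring rule described earlier in this section, one point for each stage and year at which the simulated rate falls outside its envelope, can be sketched as follows. The data layout, the function names, and the toy envelope values are assumptions made for illustration; the actual envelopes were set heuristically around the SEER curves and are not reproduced here.

```python
STAGES = ("in situ", "localized", "regional", "distant")
YEARS = range(1975, 2001)  # 26 annual time points

def envelope_score(simulated, lower, upper):
    """Count the (stage, year) points where the simulated rate lies outside
    the acceptance envelope [lower, upper]. 0 is best; 4 * 26 = 104 is worst."""
    score = 0
    for stage in STAGES:
        for rate, lo, hi in zip(simulated[stage], lower[stage], upper[stage]):
            if not (lo <= rate <= hi):
                score += 1
    return score

def grade(score):
    """Thresholds quoted in the text: 5 or less, 10 or less, otherwise rejected."""
    if score <= 5:
        return "exceptionally good"
    if score <= 10:
        return "minimally acceptable"
    return "rejected"

# Toy example with made-up rates and a flat made-up envelope.
n = len(YEARS)
lower = {s: [0.0] * n for s in STAGES}
upper = {s: [10.0] * n for s in STAGES}
inside = {s: [5.0] * n for s in STAGES}
outside = {s: [20.0] * n for s in STAGES}

print(envelope_score(inside, lower, upper), grade(envelope_score(inside, lower, upper)))
print(envelope_score(outside, lower, upper), grade(envelope_score(outside, lower, upper)))
```

Because every point counts equally regardless of the magnitude of the rate, a count of this kind weights the low-incidence stages as heavily as the high-incidence ones, which is the property the text contrasts with least-squares fitting.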
No adequate solutions were found in the first phase; in the second phase of sampling we introduced three more parameters, shown in Table 2. The first parameter (LMP fraction, row 1, Table 2) allows some fraction of onset breast lesions to be LMP, as was suggested by Chang's (1) analysis. The second parameter specifies a maximum diameter for LMP tumors. The last parameter specifies a dwell time at maximum size after which an LMP tumor regresses and disappears. With these three added parameters we again performed extensive sampling of the parameters in Tables 1 and 2 jointly (plus parameters for mammogram sensitivity and clinical surfacing rates) and evaluated the outputs using the acceptance envelopes as described earlier. Results of Model Calibration Final values for fitted parameters are shown in Tables 1 and 2 and in Figure 1. We have evaluated more than 475 000 randomly sampled parameter combinations. Constraining the LMP fraction to be zero, the best result had a score of 21. Characteristically, when LMP fraction is zero the best solutions have far too high incidence rates for early cancers before 1982 and far too low rates for these after 1990. When the two early-stage incidence curves appear more like the observed data, the regional and distant stages plummet in years after 1990, contrary to observed data. Allowing the LMP fraction to be nonzero, but less than or equal to 10%, improved the best score to 15 (among 289 000 sampled combinations of input parameters), and all solutions with scores near 15 had LMP fractions at 9%–10%. Ad hoc sampling indicated that much better solutions were available with LMP fraction between 30% and 50%, so we focused sampling for LMP fraction in this range (for efficiency, as each sample takes approximately 8 minutes of computer time on a Pentium III 1 GHz computer with 1 GB of RAM), still allowing other parameters to vary throughout their plausible ranges. 
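The acceptance-sampling procedure used in both phases can be outlined in code. The sketch below is a minimal illustration under stated assumptions: the parameter names and grids are hypothetical, ordinal constraints are enforced by simple redrawing (the text does not say how they were enforced), and `toy_score` is a stand-in for running the full simulation and scoring its output against the envelopes.

```python
import random

def draw_vector(grids, ordinal_constraints, rng, max_tries=10_000):
    """Draw each parameter uniformly from its discrete grid, redrawing until
    every ordinal constraint (a, b), meaning vec[a] <= vec[b], is satisfied."""
    for _ in range(max_tries):
        vec = {name: rng.choice(values) for name, values in grids.items()}
        if all(vec[a] <= vec[b] for a, b in ordinal_constraints):
            return vec
    raise RuntimeError("no draw satisfied the ordinal constraints")

def acceptance_sample(grids, ordinal_constraints, score_fn, n_draws,
                      threshold=10, seed=0):
    """Keep every sampled vector whose score is at or below the threshold."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        vec = draw_vector(grids, ordinal_constraints, rng)
        score = score_fn(vec)
        if score <= threshold:
            accepted.append((score, vec))
    return accepted

# Hypothetical grids (percent units keep the arithmetic in integers).
toy_grids = {
    "lmp_percent": list(range(0, 55, 5)),
    "sens_small_percent": [40, 50, 60, 70],
    "sens_large_percent": [60, 70, 80, 90],
}
# Hypothetical constraint: sensitivity for small tumors cannot exceed
# sensitivity for large tumors.
toy_constraints = [("sens_small_percent", "sens_large_percent")]

def toy_score(vec):
    # Stand-in for "run the simulation, then count envelope violations".
    return abs(vec["lmp_percent"] - 40)

accepted = acceptance_sample(toy_grids, toy_constraints, toy_score, n_draws=200)
print(len(accepted), "of 200 toy draws accepted")
```

The accepted vectors play the role of the 363 vectors with scores of 10 or less: they can be histogrammed parameter by parameter or passed back through the simulation to propagate parameter uncertainty.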
We sampled 30 188 parameter vectors in this focused sampling study and found 91 with scores of 5 or less and 363 with scores of 10 or less. The final “best” parameter values for all parameters are shown in the right-hand columns of Tables 1 and 2. Fig. 3 illustrates the fit of the final parameter vector solution for the simulation model. The final panel of Fig. 3 shows scores from each of 300 replications using the final parameter vector and illustrates stochastic variation in scores given fixed input parameters; no score was larger than 10.
Fig. 3. Five panels show simulation results (black line) for age-adjusted incidence and breast cancer mortality rates compared to Wisconsin Cancer Reporting System and Surveillance, Epidemiology, and End Results (SEER) data (light gray lines). These results are based on the final calibrated input parameters. Error bars on simulation results are ±2 standard deviations among 300 replications of the simulation run at each annual time point. All rates are per 100 000, adjusted to U.S. standard year 2000 population for ages 30–79 years. The lower right panel is a histogram of acceptance scores generated by the 300 replications; these scores ranged from 0 to 10.
The marginal posterior distributions of six parameters (selected to show a variety of posterior results beyond LMP fraction) are shown in Fig. 4. The distribution for LMP fraction peaks between 42% and 46%; our best model uses 42% LMP tumors. LMP tumors progress to a maximum of approximately 1-cm diameter, dwell at this size for 2 years, and then regress if undetected. Examination of the best scoring model without regression of LMP tumors reveals an apparent depletion of the occult pool of localized invasive cancers with the incidence rate peaking near 1992 and beginning to fall after that. No such depletion has been observed in SEER data to date. Apparently regression of LMP tumors is needed to maintain an occult reservoir of small tumors to be found by screening that is not depleted by screening in the late 1990s.
Fig. 4. Marginal posterior distributions for six of the input parameters in Tables 1 and 2. As defined in those tables the parameters shown in the upper row, left to right, are LMP Fraction, Onset Proportion, In Situ Boundary, Onset Lag, Mean Gamma, and Var Gamma. These histograms show values of the parameters for 363 sampled vectors with scores of 10 or less. The ranges correspond to the ranges sampled for these parameters in the second phase of sampling (see text). The parameters are not statistically independent within the solution set; for example, the correlation between LMP Fraction and Onset Proportion is –.28 (P<.001) and between Mean Gamma and Var Gamma is 0.88 (P<.001).
Biologic Plausibility of LMP and Hyperaggressive Breast Cancers
Our numerical results imply a large reservoir of exceedingly indolent breast cancers, the LMP tumors, at one end of the spectrum and a small fraction of hyperaggressive tumors at the other end of the spectrum. Are these biologically plausible? Welch and Black (31) discuss incidental autopsy evidence that a substantial pool of undetected invasive breast cancers may exist. A recent review of in situ breast cancer strongly suggests that ductal carcinoma in situ is an early stage of invasive cancer but notes a wide range in estimates (14%–60%) of the proportion of ductal carcinoma in situ that will progress to invasive stages in 10 years if left untreated (32). Our calculations are at the high end of this range, with 58% being non-LMP tumors. Love and Niederhuber review evidence suggesting that breast cancer growth may be viewed as “ … a problem of macro- and microenvironmental regulatory imbalance and dynamic chaos” within the host (33). They suggest a specific progesterone trigger affecting tumor cell kinetics and note that the same or other host-based regulatory systems may affect angiogenesis. It is not unreasonable under this model that in situ and small invasive cancers may be exceptionally indolent or even regress. At the other end of the spectrum, is there a biologic model for the small fraction of all tumors (1%–2%) we found necessary to create as hyperaggressive tumors? These are tumors that at a very small size of the focal primary tumors would be classified as regional or distant stage and are unlikely to be discovered by mammography screening. Mustafa et al.
(34) review a series of invasive cancers found between 1984 and 1995 at one hospital with diameter of the primary less than 1 cm; nodal involvement was found in 11% of those less than 0.5 cm and 17% of those 0.6–1.0 cm in maximum diameter. Clearly there is a biologic basis for small tumors with regional spread, even if this series of 2153 tumors at one medical center was exceptional. Another possible entity is inflammatory breast cancer (IBC), “ … the most aggressive form of [breast cancer] … estimated to makeup about 1%–6% of breast cancer cases …” (35). IBC is staged as regional or distant when diagnosed and its observed incidence rate should be uncorrelated with rates of mammography—the characteristics we needed to postulate to fit observed data. Although our model deals only with focal masses, the statistical behavior of IBC is similar to the hyperaggressive subset we found necessary. Calibration of Treatment Baseline Cure Fractions With incidence rates calibrated as described above, the mortality rate curve output by the model was slightly lower than the observed U.S. breast cancer mortality rate curve but identical in shape. A small upward adjustment to baseline cure fractions for localized and regional tumors lifted the simulation output mortality rates. The final values for cure fractions are given in the text above and the fit to observed mortality rates shown in Fig. 3. Base Case Results for the Wisconsin model Our CISNET base case estimate is that breast cancer mortality in the year 2000 was reduced overall by 38.3% from what it would have been without introduction of adjuvant therapy (tamoxifen and adjuvant chemotherapy) or of mammography screening over the interval 1975–2000. 
Under the base case assumptions, mortality would have risen beyond levels observed in the mid-1980s because of background secular increase in breast cancer risk in the 1980s and 1990s, explaining why the 38% is larger than the observed reduction from breast cancer mortality over the 1990s. Were screening alone introduced, with therapy remaining at 1975 levels, the mortality reduction would have been 20.3%. Were only tamoxifen and adjuvant chemotherapy introduced without screening, the reduction would have been 20.8%. Although there is some redundancy between the two innovations in which breast cancer deaths would have been averted, our model implies that their effects in 2000 were largely independent and about equal. We have explored variability in our base case estimate as a function of model parameter uncertainty. The set of 363 sampled vectors of model parameters with scores of 10 or less form a joint posterior distribution of acceptable model parameters. Across these vectors of input parameters (keeping all other base case inputs constant), the mean estimated total reduction in breast cancer mortality is 35.5%, with 95% posterior interval of 32.9% to 38.7%. The mean estimated reduction with screening alone is 17.9% (95% interval 14.7% to 20.3%). The mean estimated reduction with adjuvant therapy alone is 20.3% (95% interval 19.1% to 21.6%). The correlation between reduction due to screening and reduction due to treatment is −0.35. Our posterior interval for reduction due to screening includes the point estimates of two of the other six CISNET models (it excludes the highest estimate and the lowest three estimates among the six). Our posterior interval for reduction due to treatment includes only the highest three of the other six model's point estimates. The uncertainty we represent here is parameter uncertainty and does not include modeling uncertainty (36). 
Even so, one other model's bivariate estimate of reductions due to screening and adjuvant therapy is within our posterior bivariate distribution of results for these quantities (Fig. 5).
Fig. 5. Bivariate estimates of fraction reduction in year 2000 breast cancer mortality assuming only adjuvant therapy and assuming only screening. The double circle is the Wisconsin model's base case estimate. Solid points are estimates using each of the 363 acceptable parameter vectors in the Wisconsin simulation to compute base case results; the distribution of solid points represents uncertainty in the Wisconsin model's base case results due to parameter uncertainty. Open squares are base case point-estimate results of six other CISNET modeling groups.
Future Work
Traditionally, nonidentifiability of parameters in a highly parameterized model is problematic, and mutually compensatory parameters must be reduced for analysis. Using our method of acceptance sampling, non- or poor identifiability is less of a problem as we can identify sets of feasible solutions in the high-dimension parameter space. Our next step will be to characterize clusters of solutions in this high-dimension space indicating compensatory behavior of subsets of the parameters and then to find predictions and data that might distinguish among currently feasible solutions.
Our parameter sampling experiments continue at this writing with the goal being to characterize the topology of good parameter solution sets in parameter space. Preliminary analyses indicate that “good” solutions tend to fall in four relatively compact clusters in parameter space, connected by thin “bridges” of good solutions. These clusters seem distinguished by a faster versus slower mean for the tumor growth rate distribution, plus some compensatory changes in growth rate distribution variance and the lag between tumor onset and average tumor incidence. However, LMP fraction is still in the 30%–50% range for all solutions, with variation in this range compensated by overall onset proportion; we can find no solution sets with LMP fraction near zero. Exploring the biological implications of these clusters of solutions will be interesting, but whichever of these we are led to, there will not be drastic change in our model's implications for relative contribution of screening and adjuvant therapy to breast cancer mortality reduction since the solid points shown in Fig. 5 span all four solution clusters. CONCLUSION The Wisconsin Breast Cancer Epidemiology Simulation Model uses a systems analytic approach to modeling breast cancer as part of the NCI CISNET program. Particular attention has been devoted to developing and fitting a model of breast cancer natural history compatible with observed U.S. and Wisconsin statistics. We have found it necessary to postulate a class of limited malignant potential breast tumors to fit observed statistics. CISNET breast cancer base case results were produced using the technique of common random numbers to compare alternative counterfactual histories with and without introduction of mammography screening and with and without introduction of adjuvant therapy modalities after 1975 in the U.S. population. 
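The technique of common random numbers mentioned above, with a separate random-number stream for each kind of stochastic event, can be illustrated with a toy example. This is a sketch under stated assumptions, not the Wisconsin model's implementation: the stream names, the onset probability, and the per-woman detection probability are all hypothetical.

```python
import random

def make_streams(replication_seed, names):
    """One independent, reproducible generator per stochastic event type.
    Seeding with a string is supported by random.Random and is deterministic."""
    return {name: random.Random(f"{replication_seed}:{name}") for name in names}

def simulate(n_women, detect_prob, replication_seed):
    """Toy cohort: each woman may develop a tumor, which may then be detected.
    Returns (number of onsets, number detected)."""
    streams = make_streams(replication_seed, ["onset", "detection"])
    onsets = detected = 0
    for _ in range(n_women):
        has_tumor = streams["onset"].random() < 0.1  # hypothetical onset probability
        # Drawn for every woman so the detection stream stays aligned
        # woman-by-woman across scenarios.
        u_detect = streams["detection"].random()
        if has_tumor:
            onsets += 1
            if u_detect < detect_prob:
                detected += 1
    return onsets, detected

# Two counterfactual scenarios run on the same replication seed.
no_screening = simulate(10_000, detect_prob=0.0, replication_seed=1)
screening = simulate(10_000, detect_prob=0.6, replication_seed=1)
print(no_screening, screening)
```

Because onset draws come from their own stream, the same simulated women develop tumors in both scenarios, so the difference between scenarios reflects the change in detection rather than sampling noise in who becomes ill.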
Supported by U01CA88211 from National Cancer Institute to the University of Wisconsin; HS00083, T-32 training grant for population-based health services research at University of Wisconsin, U01CA82004 from National Cancer Institute and National Institute of Environmental Health Sciences to University of Wisconsin, and P30CA14520 supporting the University of Wisconsin Comprehensive Cancer Center. Since acceptance of this article, N. K. Stout received her PhD and is now with the Department of Health Policy and Management, Harvard School of Public Health, Boston, MA. V. Kuruchittam completed his PhD and is with the College of Public Health, Chulalongkorn University, Thailand. The authors thank Drs. Polly Newcomb, Richard Lore, Elizabeth Burnside, and Polun Chang for their expert advice during the development and calibration of the Wisconsin simulation model.
References
(1) Chang P. A simulation study of breast cancer epidemiology and detection since 1982: the case for limited malignant potential lesions. Ph.D. dissertation, Department of Industrial Engineering. Madison (WI): University of Wisconsin–Madison; 1993.
(2) Csete ME, Doyle JC. Reverse engineering of biological complexity. Science 2002; 295: 1664–9.
(3) SEER*Stat. Surveillance, Epidemiology, and End Results (SEER) (http://seer.cancer.gov) Program SEER*Stat Database: Mortality-All COD, Public-Use With State, Total U.S. (1969–2000) <18 Age Groups>; National Cancer Institute, DCCPS, Surveillance Research Program, Cancer Statistics Branch, released April 2003. Underlying mortality data provided by National Center for Health Statistics.
(4) Spratt JS, Spratt JA. Chapter 21. Growth rates. In: Donegan WL, editor. Cancer of the breast, 5th edition. Philadelphia (PA): Saunders; 2002. pp. 443–76.
(5) Shwartz M. Analysis of preventive screening to detect early breast cancer. Ph.D. dissertation, Department of Urban and Regional Planning. Ann Arbor (MI): University of Michigan; 1975.
(6) Shwartz M. A mathematical model used to analyze breast cancer screening strategies. Oper Res 1978; 26: 937–55.
(7) Shwartz M. Validation and use of a mathematical model to estimate the benefits of screening younger women for breast cancer. Cancer Detect Prev 1981; 4: 595–601.
(8) Shwartz M. Validation of a model of breast cancer screening: an outlier observation suggests the value of breast self-examinations. Med Decis Making 1992; 12: 222–8.
(9) Holford TR, Cronin KA, Mariotto AB, Feuer EJ. Changing patterns in breast cancer incidence trends. J Natl Cancer Inst Monogr 2006; 36: 19–25.
(10) Feuer EJ, Wun LM. How much of the recent rise in breast cancer incidence can be explained by increases in mammography utilization? A dynamic population model approach. Am J Epidemiol 1992; 136: 1423–36.
(11) Cronin KA, Mariotto AB, Clarke LD, Feuer EJ. Additional common inputs for analyzing impact of adjuvant therapy and mammography on U.S. mortality. J Natl Cancer Inst Monogr 2006; 36: 26–9.
(12) Cronin KA, Yu B, Krapchow M, Miglioretti DL, Fay MP, Izmirlian G, et al. Modeling the dissemination of mammography in the United States. Cancer Causes Control 2005; 16: 701–12.
(13) Heine JJ, Malhotra P. Mammographic tissue, breast cancer risk, serial image analysis, and digital mammography. Part 1. Tissue and related risk factors. Acad Radiol 2002; 9: 298–316.
(14) Heine JJ, Malhotra P. Mammographic tissue, breast cancer risk, serial image analysis, and digital mammography. Part 2. Serial breast tissue change and related temporal influences. Acad Radiol 2002; 9: 317–35.
(15) Boyd N, Martin L, Stone J, Little L, Minkin S, Yaffe M. A longitudinal study of the effects of menopause on mammographic features. Cancer Epidemiol Biomarkers Prev 2002; 11: 1048–53.
(16) Mariotto AB, Feuer EJ, Harlan LC, Abrams J. Dissemination of adjuvant multiagent chemotherapy and tamoxifen for breast cancer in the United States using estrogen receptor information: 1975–1999. J Natl Cancer Inst Monogr 2006; 36: 7–15.
(17) Ashba J, Traish AM. Estrogen and progesterone receptor concentrations and prevalence of tumor hormonal phenotypes in older breast cancer patients. Cancer Detect Prev 1999; 23: 238–44.
(18) Systemic treatment of early breast cancer by hormonal, cytotoxic, or immune therapy. 133 randomised trials involving 31,000 recurrences and 24,000 deaths among 75,000 women. Part II. Early Breast Cancer Trialists' Collaborative Group. Lancet 1992; 339: 71–85.
(19) Systemic treatment of early breast cancer by hormonal, cytotoxic, or immune therapy. 133 randomised trials involving 31,000 recurrences and 24,000 deaths among 75,000 women. Part I. Early Breast Cancer Trialists' Collaborative Group. Lancet 1992; 339: 1–15.
(20) Rosenberg MA. Competing risks to breast cancer mortality. J Natl Cancer Inst Monogr 2006; 36: 15–9.
(21) Hurria A, Leung D, Trainor K, Borgen P, Norton L, Hudis C. Factors influencing treatment patterns of breast cancer patients age 75 and older. Crit Rev Oncol Hematol 2003; 46: 121–6.
(22) Gilligan MA, Kneusel RT, Hoffmann RG, Greer AL, Nattinger AB. Persistent differences in sociodemographic determinants of breast conserving treatment despite overall increased adoption. Med Care 2002; 40: 181–9.
(23) Goodwin JS, Freeman JL, Mahnken JD, Freeman DH, Nattinger AB. Geographic variations in breast cancer survival among older women: implications for quality of breast cancer care. J Gerontol A Biol Sci Med Sci 2002; 57: M401–6.
(24) Newcomb PA, Carbone PP. Cancer treatment and age: patient perspectives. J Natl Cancer Inst 1993; 85: 1580–4.
(25) Kemeny MM, Peterson BL, Kornblith AB, Muss HB, Wheeler J, Levine E, et al. Barriers to clinical trial participation by older women with breast cancer. J Clin Oncol 2003; 21: 2268–75.
(26) Wyld L, Reed MW. The need for targeted research into breast cancer in the elderly. Br J Surg 2003; 90: 388–99.
(27) Human Mortality Database. US Female Death Rates by Year of Birth, University of California, Berkeley (USA), and Max Planck Institute for Demographic Research, 2005. Available at: http://www.demog.berkeley.edu/~bmd/. [Last accessed: August 18, 2006.]
(28) Thain D, Tannenbaum T, Livny M. Distributed computing in practice: the Condor experience. Concurrency: Practice and Experience 2005; 17: 323–56.
(29) Basney J, Livny M. Chapter 5: Deploying a high throughput computing cluster. In Buyya R, editor. High performance cluster computing. Vol. 1. Upper Saddle River (NJ): Prentice Hall PTR; 1999.
(30) Thain D, Tannenbaum T, Livny M. Condor and the grid. In Berman F, Hey AJG, Fox G, editors. Grid computing: making the global infrastructure a reality. New York (NY): John Wiley; 2003.
(31) Welch HG, Black WC. Using autopsy series to estimate the disease “reservoir” for ductal carcinoma in situ of the breast: how much more breast cancer can we find? Ann Intern Med 1997; 127: 1023–8.
(32) Burstein HJ, Polyak K, Wong JS, Lester SC, Kaelin CM. Ductal carcinoma in situ of the breast. N Engl J Med 2004; 350: 1430–41.
(33) Love RR, Niederhuber JE. Models of breast cancer growth and investigations of adjuvant surgical oophorectomy. Ann Surg Oncol 2004; 11: 818–28.
(34) Mustafa IA, Cole B, Wanebo HJ, Bland KI, Chang HR. The impact of histopathology on nodal metastases in minimal breast cancer. Arch Surg 1997; 132: 384–90.
(35) Wingo PA, Jamison PM, Young JL, Gargiullo P. Population-based statistics for women diagnosed with inflammatory breast cancer (United States). Cancer Causes Control 2004; 15: 321–8.
(36) Manning WG, Fryback DG, Weinstein MC. Reflecting uncertainty in cost-effectiveness analysis. In Gold MR, Siegel JE, Russell LB, Weinstein MC, editors. Cost-effectiveness in health and medicine. New York (NY): Oxford University Press; 1996.
© The Author 2006. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oxfordjournals.org.
JNCI Monographs, Oxford University Press

Chapter 7: The Wisconsin Breast Cancer Epidemiology Simulation Model

Publisher
Oxford University Press
Copyright
© The Author 2006. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oxfordjournals.org.
ISSN
1052-6773
eISSN
1745-6614
DOI
10.1093/jncimonographs/lgj007
pmid
17032893

Abstract

The Wisconsin Breast Cancer Epidemiology Simulation Model is a discrete-event, stochastic simulation model using a systems-science modeling approach to replicate breast cancer incidence and mortality in the U.S. population from 1975 to 2000. Four interacting processes are modeled over time: (1) natural history of breast cancer, (2) breast cancer detection, (3) breast cancer treatment, and (4) competing cause mortality. These components form a complex interacting system simulating the lives of 2.95 million women (approximately 1/50 the U.S. population) from 1950 to 2000 in 6-month cycles. After a “burn in” of 25 years to stabilize prevalent occult cancers, the model outputs age-specific incidence rates by stage and age-specific mortality rates from 1975 to 2000. The model simulates occult as well as detected disease at the individual level and can be used to address “What if?” questions about effectiveness of screening and treatment protocols, as well as to estimate benefits to women of specific ages and screening histories.

As part of the National Cancer Institute's Cancer Intervention and Surveillance Modeling Network (CISNET) consortium, we developed and calibrated the Wisconsin Breast Cancer Epidemiology Simulation Model. The model is a discrete-event, stochastic simulation model designed to replicate breast cancer incidence and mortality rates in a population with the size and age structure of the Wisconsin female population but generalizing to breast cancer epidemiology in the U.S. population from 1975 to 2000. This paper describes the model development, structure, and calibration. A complete specification of the model and all parameters is available electronically at the CISNET Web site (http://cisnet.cancer.gov/profiles/).

MODEL HISTORY AND OBJECTIVES

The Wisconsin model evolved from a model constructed by Chang (1).
Chang's deterministic model replicated Wisconsin breast cancer incidence and mortality from 1980 to 1992, but only under the assumption that a substantial fraction of all breast cancers are predestined from their occult biologic onset to have limited malignant potential, i.e., to grow to only a limited size (approximately 1 cm in diameter) and not to pose a lethal threat to the woman. This simulation has two objectives: 1) to develop a simulation based on components similar to those in Chang's model, but to predict age- and stage-specific breast cancer incidence rates and age-specific mortality rates in the U.S. population and 2) to conduct simulations at the individual woman level, allowing exploration of relative costs and effectiveness of alternative screening protocols in the U.S. population.

MODELING APPROACH AND COMPONENT OVERVIEW

We used a systems engineering approach to construct the model. We decomposed the complex, real-world system, which results in observed national breast cancer statistics, into interacting subsystems and modeled these component systems and their interactions (2). Our model is a discrete-event simulation with a fixed cycle time of 6 months beginning in calendar year 1950. The model is populated by 2.95 million women, divided into birth cohorts, making up a female population aged 20–100 years living between 1950 and 2000. The size and cohort structure of this simulated population are identical to those of Wisconsin in those years, which allows us to generate a case-based cancer registry matching the number of cases in Wisconsin for comparison to breast cancer case counts in the Wisconsin Cancer Reporting System (WCRS) state cancer registry. However, model parameters were calibrated to breast cancer statistics reported in the National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) program (3). Thus, when the simulated registry results are reported as age-adjusted rates, they apply to the U.S.
population as represented in SEER generally and are not Wisconsin specific.

Women are individually simulated from 1950 (or the year in which they were age 20) until they die a simulated death, reach age 100, or the simulated year 2000 is reached. The processes simulated are as follows: 1) the natural history of breast cancer from occult onset to breast cancer death, 2) detection of breast cancer by screening mammography or other diagnostic pathways, 3) effectiveness of treatment of breast cancer and diffusion of adjuvant therapies over time, and 4) death from non–breast cancer causes. Each of these processes is stochastic and unfolds over time in the population; jointly they determine observed breast cancer epidemiology.

A run of the model is initialized in 1950 assuming all women in that year are breast cancer free. The breast cancer onset and progression submodels are invoked in 6-month cycles from simulated year 1950 to 1975 as “burn-in” for the simulation. The prevalence of occult breast cancer and detected breast cancer is stabilized after the 25-year burn-in so that the simulated population going forward from 1975 is appropriately represented with prevalent occult and detected breast cancer. For comparison of output to that of other CISNET collaborators when we computed age-adjusted rates, we adjusted results to the U.S. standard population (male and female) aged 30–79 years in 2000.

MODEL COMPONENTS AND PARAMETER DESCRIPTION

Natural History Submodel

An important assumption in the model is that in situ carcinoma is an early stage of invasive cancer. Breast cancers are simulated as idealized spherical tumors having occult onset with diameter 2 mm, a lower bound chosen to be minimally detectable with technologies prevalent in 2000. Simulated tumors grow according to a Gompertz-type function with asymptotic diameter of 8 cm (4).
The Gompertzian growth rate for an individual tumor is fixed at its onset by a random draw from a gamma distribution of growth rates common to all tumors in the model and women of all ages. The mean and variance of this gamma distribution (Mean gamma and Var gamma in Table 1) were fitted in the model calibration process described below. In each 6-month cycle possible metastatic spread of the tumor is simulated stochastically by a random draw from a Poisson distribution determining the number of new positive lymph nodes in that period; the Poisson distribution mean is a function of current tumor size and instantaneous growth rate developed by Shwartz (5–8).

Table 1. Input parameters, their sampled ranges, and the best-fitting values

| Parameter | Use in the model | Sampled range during phase 1, when LMP fraction = 0 (increment size for discretized sampling) | Sampled range during phase 2, when LMP fraction allowed >0 (increment size for discretized sampling) | Final value |
|---|---|---|---|---|
| In situ boundary | The diameter (cm) below which the tumor is classified as in situ stage in the simulation if there are no associated positive lymph nodes | [0.75–1.0] (0.01) | [0.85–0.99] (0.01) | 0.95 cm |
| Onset proportion | Ratio of assumed age-specific biologic onset rate divided by age-specific incidence rate (the latter specified by age–period–cohort model estimated in absence of screening; see text) | [0.85–1.2] (0.01) | [0.8–1.0] (0.01) | 0.9 |
| Onset lag | Time interval (years) between year of index onset rate and incidence rate used in Onset proportion. This is to “fill the pipeline” with biologically onset tumors which will be discovered at a given incidence rate some years later. Because the cycle time of the model is 0.5 years, this was taken to be the step size. | [1–8] (0.5) | [1.5–4] (0.5) | 3 y |
| Mean gamma | The Gompertz growth rate is assumed to have a gamma distribution across all onset tumors. This parameter is the mean of this gamma distribution (see text). | [0.01–0.2] (0.01) | [0.08–0.18] (0.01) | 0.12 |
| Var gamma | The variance of the gamma distribution of Gompertz growth rates | [0.006–0.1] (0.001) | [0.01–0.05] (0.001) | 0.012 |
| Percent 4 nodes | Percentage of biologically onset, non-LMP tumors that are assigned 4 positive lymph nodes at onset. (This places these tumors at the upper limits of simulated regional tumors, which are presumed to have 1–4 positive nodes.) | [0%–5%] (1%) | [0%–1%] (1%) | 1% |
| Percent 5 nodes | Percentage of biologically onset, non-LMP tumors that are assigned 5 positive lymph nodes at onset. (This simulates these tumors in the distant stage from their initiation in the model.) | [0%–5%] (1%) | [2%–4%] (1%) | 2% |

In the model, tumors are mapped to the four SEER historical stages according to their simulated size and number of positive lymph nodes. Tumors without associated positive lymph nodes and below a critical size (In situ boundary in our parameterization; see Table 1) are defined as in situ. Beyond this size, but still without positive nodes, the tumor is defined to be localized invasive cancer. Any tumor with one to four simulated positive nodes is defined to be in the regional stage. Tumors with five or more positive nodes are defined as distant stage. However, simulation of the number of positive lymph nodes is meant to be a simulacrum rather than a facsimile of the physical disease process of metastatic spread and is used only crudely in our model to provide a continually updated probability of regionally spread and then of distantly spread disease. We maintained the “positive lymph node” metaphor from Shwartz's early model but do not use this other than as a way to indicate three categories of spread (none, regional, distant). More detailed modeling of the metastatic process is beyond the scope of this model. This simulation process yields stage-specific incidence statistics to replicate WCRS and SEER data.
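The natural history bookkeeping described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' C++ implementation. The constants are the final values quoted in Table 1 and in the text. The function names, the specific Gompertz form (applied here to diameter), and the reading of the growth rate as "per year" are assumptions made for illustration; the text does not specify them.

```python
import math
import random

# Values quoted in the text and in Table 1 (final calibrated values).
MEAN_GAMMA = 0.12            # mean of the gamma distribution of growth rates
VAR_GAMMA = 0.012            # variance of that gamma distribution
IN_SITU_BOUNDARY_CM = 0.95   # node-negative tumors below this size are in situ
ONSET_DIAMETER_CM = 0.2      # occult onset at 2 mm
MAX_DIAMETER_CM = 8.0        # asymptotic diameter


def gamma_shape_scale(mean, variance):
    """Moment-match a gamma distribution (mean = k * theta, variance = k * theta**2)."""
    scale = variance / mean
    shape = mean / scale
    return shape, scale


def draw_growth_rate(rng):
    """Draw one tumor's growth rate, fixed at onset."""
    shape, scale = gamma_shape_scale(MEAN_GAMMA, VAR_GAMMA)
    return rng.gammavariate(shape, scale)


def gompertz_diameter(years_since_onset, growth_rate):
    """ASSUMED Gompertz form on diameter: starts at 0.2 cm, approaches 8 cm.

    The paper cites a Gompertz-type function but this exact parameterization
    and the per-year time unit are illustrative assumptions.
    """
    log_ratio = math.log(MAX_DIAMETER_CM / ONSET_DIAMETER_CM)
    return MAX_DIAMETER_CM * math.exp(-log_ratio * math.exp(-growth_rate * years_since_onset))


def classify_stage(diameter_cm, positive_nodes):
    """Map simulated size and node count to the four SEER historical stages."""
    if positive_nodes >= 5:
        return "distant"
    if positive_nodes >= 1:
        return "regional"
    if diameter_cm < IN_SITU_BOUNDARY_CM:
        return "in situ"
    return "localized"
```

With the final Table 1 values, moment matching gives a gamma shape of 1.2 and a scale of 0.1, so the fitted growth-rate distribution is strongly right-skewed.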
We stress that calibration of the model, described later, focused on producing stage-specific incidence rates over time to match SEER data for 1975–1999. The underlying simulated growth model was not calibrated directly with, or compared to, data concerning tumor sizes per se.

Depending on the stage of cancer, “tumor size” in the model functions in two ways. Once a random growth rate is fixed for a simulated tumor, tumor size and passage of clock time are equivalent. In the in situ stage, where real-world breast carcinomas often are highly irregular in shape and physical extent, “tumor size” is used to mark clock time in the stage; variability in growth rate distribution induces variability of dwell times in this stage. In our model, a tumor transits to the localized invasive stage at 9.5 mm in “tumor size.” At this point “tumor size” is less metaphorical and codes for size of the spherical mass insofar as it influences average radiographic and clinical exam detectability of the tumor. Thus the distribution of tumor size among simulated incident invasive cancers might be expected to be similar to actual data for tumors of 1 cm or larger, although such correspondence was not forced by the calibration process, which considered only incidence rates by stage.

The model specifies that breast cancer death occurs only as an endpoint of the process of uncontrolled growth and spread of an invasive tumor. Once a simulated tumor enters the distant stage, its natural history is presumed to be lethal at a rate described by the survival of women entered in the SEER registry as diagnosed in the distant stage during the premammography era (1975–1982). We estimated an empirical distribution of survival times from time of diagnosis for such women from SEER data (3). The median survival was 1.95 years and the mean 5.22 years.
This finding potentially underestimates true survival time at this point because SEER incident cases will have transited occultly to the distant stage some time before diagnosis. On the other hand, treatment after diagnosis acts to prolong survival somewhat in this stage (even pre-1982), and so this estimate is also biased toward overestimating survival with untreated time course. We do not know the net effect of these offsetting biases for estimated natural history time course after transit to the distant stage.

The rate of onset of occult breast tumors in women without breast cancer is a function of expected incidence in the absence of screening. For the CISNET consortium, Holford et al. developed an age–period–cohort (APC) model predicting total breast cancer incidence in the absence of screening; it is discussed elsewhere in this volume (9). A signal characteristic of this CISNET base case input is a predicted secular increase in breast cancer incidence almost coincident in time with diffusion of mammography.

The model simulates tumors forward in time from a point of occult onset. Because the APC model predicts total breast cancer incidence rates (i.e., rate of diagnosed breast cancer) in absence of screening and not the occult onset rate for tumors, we cannot use the APC age- and calendar year–specific incidence rates directly as an input. First, not all onset tumors will become incident; a woman may die with an undiagnosed tumor, for example. So the APC incidence rates must underpredict the total onset population of tumors by some unknown amount. We fit a parameter termed “Onset proportion” to be the ratio of onset to incident tumors across women and time. Initially we bounded this parameter below at 1 and expected it to lie in the range 1–1.2, reflecting the possibility that up to 20% of occult-onset tumors would never surface in the woman's lifetime. (As discussed below, this changed when we found it necessary to introduce into the model tumors that regress.)
Second, there must be some interval between the time of occult onset and the average time of diagnosis, so the APC incidence rate in year Y will reflect tumors which were onset in year Y − l where l reflects an average lag between onset and incidence; we term the average of this interval the “Onset lag.” Onset lag (Table 1) is a parameter fitted during model calibration. Thus in our initial model, for women of a given age in a given year, the risk of occult onset of a breast tumor was the APC model incidence rate for the same cohort of women Onset lag years in the future, multiplied by the Onset proportion. As discussed next, the meaning of the onset proportion parameter changed slightly at the next step. Both of these parameters should no doubt be age related to reflect differential all-cause mortality across women of different ages, which can effectively censor incidence of occult tumors. However, we fit them as constant across all women to simplify the modeling somewhat. Future iterations of the model will explore relaxing this assumption. An important finding of the Wisconsin Breast Cancer Epidemiology Simulation Model was that the natural history model as described to this point cannot account for the steep rise of in situ and small localized invasive cancers after the advent of mass screening. Far too many of these early stage tumors were detected to be accounted for by an occult pool of tumors that would have progressed to be late stage tumors if not found (10). Following Chang's (1) ideas we included the possibility that some proportion of tumors at onset were destined to be limited malignant potential (LMP) tumors. LMP tumors 1) start to grow at the same rate as lethal tumors, 2) stop growing at a small size (Max LMP size), and, 3) if undetected after a fixed length of time (LMP dwell time), disappear. The fraction of tumors at onset randomly selected to be LMP is LMP fraction. 
We believe these three parameters (Table 2) are needed to explain observed patterns of breast cancer incidence and mortality 1975–2000, under the assumption that in situ carcinoma of the breast is an early stage of invasive cancer as discussed below.

Table 2. Parameters governing limited malignant potential tumors

| Parameter | Use in the model | Sampled range (increment size for discretized sampling) | Final value |
|---|---|---|---|
| LMP fraction | Proportion of all tumors assumed to be limited malignant potential (LMP) | [0%–10%]; [30%–50%] (1%) | 42% |
| Max LMP size | LMP tumors assumed to grow no larger than this diameter (cm) | [0.8–1.5] (0.1 cm) | 1 cm |
| LMP dwell time | Maximum sojourn time (y) for LMP tumor after reaching max LMP size; after this time without discovery, the LMP tumor disappears next simulation cycle. | [1–3 y] (0.5 y) | 2 y |

Introduction of LMP tumors required redefinition of onset proportion as the ratio of onset of non-LMP tumors to APC-predicted incidence. It was no longer constrained to be greater than one, and during calibration we searched in a region from 0.85 to 1.2. The total onset rate is equal to the non-LMP onset rate plus the LMP onset rate: Total onset = (non-LMP onset) + (LMP onset). We have two parameters, LMP fraction = (LMP onset)/(Total onset) and Onset proportion = (non-LMP onset)/(APC incidence).
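As a numerical illustration of the two ratio definitions just given, the sketch below solves them for the total onset rate. The function name and the example APC incidence value are hypothetical and chosen only for illustration. The parameter values are the final values listed in Tables 1 and 2. In the model, the APC incidence used would be the value for the cohort in the year shifted forward by the Onset lag.

```python
def total_onset_rate(apc_incidence, onset_proportion, lmp_fraction):
    """Solve the two ratio definitions for the total (LMP + non-LMP) onset rate.

    non-LMP onset = onset_proportion * apc_incidence
    LMP onset     = lmp_fraction * total onset
    so total onset = non-LMP onset / (1 - lmp_fraction).
    """
    non_lmp_onset = onset_proportion * apc_incidence
    return non_lmp_onset / (1.0 - lmp_fraction)


ONSET_PROPORTION = 0.9   # final value, Table 1
LMP_FRACTION = 0.42      # final value, Table 2

# Hypothetical APC-predicted incidence: 200 cases per 100,000 woman-years.
apc = 200e-5
total = total_onset_rate(apc, ONSET_PROPORTION, LMP_FRACTION)
lmp_onset = LMP_FRACTION * total
non_lmp_onset = total - lmp_onset

print(f"total onset multiplier: {total / apc:.3f}")
print(f"non-LMP onset per 100,000: {non_lmp_onset * 1e5:.1f}")
print(f"LMP onset per 100,000: {lmp_onset * 1e5:.1f}")
```

With these fitted values the total onset rate is roughly 1.55 times the APC-predicted incidence, even though the non-LMP onset rate alone is only 0.9 times that incidence.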
Putting these together, we determine the total onset rate in year y for a particular cohort of women to be:

\[
\mathit{TotalOnset} = \mathit{APC}_{(y+\mathit{OnsetLag})} \cdot \mathit{OnsetProportion} \cdot \frac{1}{1-\mathit{LMPfraction}}
\]

where APC(y + OnsetLag) is the APC model–predicted incidence rate in absence of screening for women in this cohort in year y + OnsetLag. To foreshadow results, the fitted value for Onset Proportion was determined to be 0.9 and LMP fraction to be 0.42.

In the natural history model, each newly onset tumor is assigned a random Gompertz growth rate and then assigned at random to be LMP or not. Among those tumors that are not LMP, two additional parameters govern a small population of hyperaggressive tumors: a fraction (Percent 4 nodes, Table 1) is assigned four associated positive lymph nodes at onset at 2 mm diameter, and another fraction (Percent 5 nodes, Table 1) is assigned five positive nodes. These hyperaggressive tumors were needed to match SEER data and avoid depleting the reservoir of tumors to be discovered as regional and distant tumors under all reasonable screening regimes.

Modeling Detection of Breast Cancer

In any given 6-month period a simulated woman's occult tumor may be detected either by screening mammography or by other means (collectively referred to below as “clinical surfacing”). Any schedule of mammography can be arbitrarily imposed by the simulation user. For present purposes we used U.S. data over the period 1975–2000 and a stochastic model of age-specific screening propensity and rates based on data from the Breast Cancer Surveillance Consortium to assign screening dates to women simulating historical screening patterns in the United States (11,12). This stochastic model is available at the CISNET Web site (http://cisnet.cancer.gov/interfaces/). When a woman with a simulated occult tumor is screened, the probability of detection is specified as a function of the idealized spherical tumor diameter.
Functions relating detection probability to tumor diameter are specified as a constant probability over the interval 0.2–0.5 cm and then specified pointwise at 0.75 cm, 1.5 cm, and 2 cm; detection probability at 5 cm is fixed at 0.99 and at 8 cm at 1.0. Separate functions were fitted for screening mammography in the years 1984 and 2000 to reflect technological improvements, and these were further differentiated for women younger and older than 50 years in these years to reflect postmenopausal changes in breast radiolucency (13–15). Detection probabilities for years before 1984 were assumed equal to those in 1984; linear interpolation for other tumor diameters and calendar years was used.

Clinical surfacing is modeled similarly; however, instead of point probabilities, we fitted annualized rates of detection as a function of tumor diameter at the beginning of the 6-month interval. Separate functions were fitted for the years 1990 and 2000 to reflect increasing awareness of breast cancer in the population in this decade. Clinical surfacing before 1990 was presumed to be at the 1990 rate.

Figure 1 (left panel) shows four functions relating mammogram detection probability to tumor diameter. These four curves were fitted during calibration by assuming the detection probability was zero for tumor diameters less than 0.2 cm, a constant in the range 0.2–0.5 cm, a larger constant at 0.75 cm, a larger constant yet at 1.5 cm, larger yet at 2.0 cm, and equal to 0.99 at 5 cm and 1.0 at 8 cm. Sixteen constants (subject to the ordinal constraints noted, and so that probability given size was at least as large in 2000 as in 1984 and at least as large for women over age 50 as under) were fitted during model calibration to specify the four mammogram sensitivity curves shown. On the right panel in Fig. 1 are two curves showing clinical surfacing rates for 1990 and 2000.
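The pointwise specification of mammogram sensitivity described above can be sketched as piecewise-linear interpolation over diameter, followed by linear interpolation over calendar year. In this sketch the four fitted constants passed to `make_sensitivity_curve` are hypothetical placeholders, since the fitted values appear only graphically in Fig. 1. The function names are likewise illustrative. Only the knot locations and the fixed values 0.99 at 5 cm and 1.0 at 8 cm come from the text.

```python
from bisect import bisect_right


def make_sensitivity_curve(p_small, p_075, p_15, p_20):
    """Build a sensitivity-vs-diameter function from four fitted constants.

    p_small applies over 0.2-0.5 cm; p_075, p_15, p_20 apply at 0.75, 1.5, 2.0 cm.
    The values at 5 cm (0.99) and 8 cm (1.0) are fixed, as stated in the text.
    """
    if not (0.0 <= p_small <= p_075 <= p_15 <= p_20 <= 0.99):
        raise ValueError("fitted constants must be non-decreasing and at most 0.99")
    xs = [0.2, 0.5, 0.75, 1.5, 2.0, 5.0, 8.0]
    ys = [p_small, p_small, p_075, p_15, p_20, 0.99, 1.0]

    def sensitivity(diameter_cm):
        if diameter_cm < xs[0]:
            return 0.0          # undetectable below 0.2 cm
        if diameter_cm >= xs[-1]:
            return 1.0
        i = bisect_right(xs, diameter_cm) - 1
        x0, x1 = xs[i], xs[i + 1]
        y0, y1 = ys[i], ys[i + 1]
        return y0 + (y1 - y0) * (diameter_cm - x0) / (x1 - x0)

    return sensitivity


def sensitivity_in_year(curve_1984, curve_2000, diameter_cm, year):
    """Interpolate linearly between the 1984 and 2000 curves; hold 1984 values earlier."""
    weight = min(max((year - 1984) / (2000 - 1984), 0.0), 1.0)
    return (1.0 - weight) * curve_1984(diameter_cm) + weight * curve_2000(diameter_cm)


# HYPOTHETICAL constants for one age group; not the calibrated values.
curve_1984 = make_sensitivity_curve(0.10, 0.30, 0.60, 0.80)
curve_2000 = make_sensitivity_curve(0.20, 0.50, 0.75, 0.90)
print(sensitivity_in_year(curve_1984, curve_2000, 1.5, 1992))
```

A second pair of curves built the same way would represent the other age group, and the ordering check in `make_sensitivity_curve` mirrors the ordinal constraints imposed during calibration.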
Clinical surfacing rates at tumor sizes less than 0.3 cm were assumed to be zero; constants specifying rates at 0.3, 1, 1.5, 2, 3, 4, and 5 cm were fitted during calibration subject to obvious ordering constraints, and all tumors 8 cm in diameter were presumed to surface within a year.

Fig. 1. Mammogram sensitivity and clinical surfacing rates used in the final model. Mammogram sensitivity is shown for women less than 50 years old and for women aged 50 years and older in 1984 (thin dotted and solid lines) and for the same two age groups in the year 2000 (thick dotted and solid lines). Clinical surfacing rate fraction as a function of tumor diameter is shown for 1990 and for 2000.

Adjuvant Treatment Effectiveness and Adjuvant Treatment Diffusion Submodel

For simplicity we model treatment as a cure/no-cure process. When a breast cancer is detected it is assumed to be treated according to prevailing practices for tumors of that stage and women that age in the year of detection. The result of simulated treatment is either “cure,” with total arrest of tumor progression and no possibility of progressing to a breast cancer death, or “no cure,” in which case the tumor continues to progress as if it were undetected and the woman may die of breast cancer, die of competing causes, or reach age 100 before the year 2000, depending on her individual circumstances.
Continued tumor progression in this case is used as a marker for progression to breast cancer death and is not meant to be biologically representative of tumor growth per se since the primary tumor may be gone. The treatment submodel has three logical parts. First, we specified “baseline” treatment effectiveness—cure fractions—in the pretamoxifen, preadjuvant multiagent chemotherapy era for tumors treated at different stages with a standard, baseline therapy. These baseline cure probabilities represent mastectomy with or without radiation as was common in the prescreening era. Second, we calculated implied cure fractions for the various combinations of adjuvant therapies added to the baseline therapy. Third, we specified the diffusion of adjuvant treatments over time as a function of characteristics of the woman and the stage of tumor at diagnosis. The model assumes that in addition to baseline treatment, a woman may receive one of five modes of adjuvant therapy depending on her tumor and the calendar year. The different modes of adjuvant therapy are chemotherapy alone, tamoxifen alone for 2 years, tamoxifen alone for 5 years, chemotherapy and a 2-year course of tamoxifen, or chemotherapy and a 5-year course of tamoxifen. A woman with breast cancer detected in localized or regional stage is assigned a mode of adjuvant treatment based on the calendar year, her current age, tumor size/stage (and revealed estrogen receptor [ER] status in years after which ER status was commonly measured). Tumors diagnosed in the in situ or distant stages are not assigned adjuvant therapy in the model. The likelihood of each mode of treatment is based on treatment data provided by NCI from the Patterns of Care study as well as combined data from numerous cancer registries (11,16). Revealed ER status is modeled as a function of the calendar year and true ER status, which is simulated based on the age of the woman at the time of tumor onset (Table 3) (17). 
In the simulation, the treatment probabilities are determined in part by whether the ER status is known. We used SEER data from 1990 forward (the first year this was recorded in the SEER data) to estimate the proportion of tumors with ER status determined; probabilities before this time were based on the assessment of a local expert oncologist–breast cancer researcher (Table 4). The treatment administered is in part determined by whether the ER status is known and, if so, whether it is positive or negative. The treatment effectiveness is determined as a function of the true underlying ER status of the tumor and the treatment given.

Table 3. Probability that a tumor is estrogen receptor positive by age

| Age of woman at detection of tumor, y | Probability tumor is estrogen receptor positive |
|---|---|
| <45 | 0.6 |
| 45–54 | 0.65 |
| 55–64 | 0.74 |
| 65–74 | 0.77 |
| ≥75 | 0.83 |

Table 4. Probability that the true estrogen receptor status of a tumor will be known

| Year | Probability estrogen receptor status is known |
|---|---|
| <1975 | 0.1 |
| 1975–1979 | 0.2 |
| 1980–1984 | 0.5 |
| 1985–1989 | 0.63 |
| 1990 | 0.68 |
| ≥1991 | 0.69 |

Meta-analyses of adjuvant therapy trials for early stage breast cancer (18,19) showed that adjuvant chemotherapy produced a 27%, 14%, and 18% reduction in annualized odds of 10-year all-cause mortality for women under age 50, 50–59, and aged 60 or older, respectively. In women with ER-positive breast cancer, a 2-year course of tamoxifen resulted in an 18% reduction in annualized odds of 10-year all-cause mortality and a 5-year course resulted in a 28% reduction independent of age. The effect sizes for adjuvant chemotherapy and tamoxifen appeared independent.
We used these data to instantiate our treatment effectiveness model [see also (11)]. The model's final calibrated baseline cure fractions were 0.99 for in situ, 0.82 for localized, 0.45 for regional, and 0.05 for distant tumors. Baseline cure fractions are assumed representative of tumors treated in 1977–1981 before widespread use of adjuvant therapy. From the meta-analyses' results and these assumed baseline cure fractions, we calculated implied cure fractions for all combinations of adjuvant therapy and baseline treatment for women with localized or regional tumors. We did not adjust the already high cure fraction for in situ stage or the low rate for distant cancers as adjuvant treatments have little effect in the first case and data are sparse concerning their use in the latter. We illustrate our calculations with women aged 50–59 years diagnosed with a regional breast cancer and treated with multiagent adjuvant chemotherapy. Among simulated women this age and without breast cancer, the 10-year survival probability was 0.911 based on the CISNET common input for all-cause mortality with breast cancer removed (20); among corresponding women diagnosed with regional-stage breast cancer and no treatment (i.e., allowing the simulated tumors to progress according to the natural history model), the 10-year survival probability (all-cause mortality) was observed to be 0.198 in model output. As a model input, the baseline cure fraction without adjuvant therapy was 0.45 for women with regional cancer. So the 10-year survival probability for women with this cancer in this age range is modeled as a 45% mixture of the two survival probabilities, one for cured and one for uncured women: 0.45(0.911) + (1 − 0.45)(0.198) = 0.519. The annualized all-cause mortality rate corresponding to a 10-year survival probability of 0.519 is −(1/10)(ln(0.519)) = 0.066, and this corresponds to an annual mortality probability of 0.063, which in turn is an annual mortality odds of 0.067. 
Adjuvant chemotherapy is presumed to reduce the annual mortality odds by 14% (the reduction reported in the meta-analyses), leaving an annual mortality odds of 0.058. Reversing the calculation steps, this mortality odds corresponds to a 10-year survival fraction of 0.567. We now solve for the cure fraction such that a mixture of the 0.911 survival probability given cure and the 0.198 survival probability without cure gives a survival fraction of 0.567. The answer is a cure fraction of 0.518. This cure fraction was applied in the simulation to all women aged 50–59 years diagnosed with regional-stage breast cancer and treated with multiagent adjuvant chemotherapy. For women with both adjuvant chemotherapy and tamoxifen, the two mortality reductions were applied independently to the annual mortality odds in the middle step. Tamoxifen was assumed effective only in ER-positive tumors.

In the final steps of model calibration, the baseline cure fractions for local and regional tumors were adjusted to make simulated breast cancer mortality rates match U.S. data for 1975–2000 as described at the end of the calibration section. The baseline cure fractions given above are the final ones for the calibrated model.

Although overall fit to U.S. breast cancer mortality data was good, we observed in the simulation that breast cancer mortality among older women was consistently too low in the penultimate calibration of the model. This finding is consistent with observations that older women are less aggressively treated than younger women (21–26): because neither the baseline treatment effectiveness nor the treatment dissemination submodel depends on the patient's age, older women were “cured” in the model at the same rates as younger women.
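The cure-fraction calculation illustrated earlier in this section for women aged 50–59 years with regional disease can be checked numerically. The sketch below follows the same steps (mixture survival, annualized rate, annual probability, annual odds, odds reduction, and back again). The function names are illustrative assumptions and not taken from the model's source code. All numeric inputs are the values quoted in the text.

```python
import math


def survival_to_annual_odds(survival, years=10):
    """Convert a multi-year survival probability to annual mortality odds."""
    rate = -math.log(survival) / years      # annualized mortality rate
    prob = 1.0 - math.exp(-rate)            # annual mortality probability
    return prob / (1.0 - prob)              # annual mortality odds


def annual_odds_to_survival(odds, years=10):
    """Reverse the steps above: annual odds back to multi-year survival."""
    prob = odds / (1.0 + odds)
    rate = -math.log(1.0 - prob)
    return math.exp(-rate * years)


def implied_cure_fraction(baseline_cure, s_cured, s_uncured, odds_reductions):
    """Cure fraction implied by applying proportional odds reductions."""
    s_baseline = baseline_cure * s_cured + (1.0 - baseline_cure) * s_uncured
    odds = survival_to_annual_odds(s_baseline)
    for reduction in odds_reductions:       # reductions are applied independently
        odds *= 1.0 - reduction
    s_treated = annual_odds_to_survival(odds)
    # Solve c * s_cured + (1 - c) * s_uncured = s_treated for c.
    return (s_treated - s_uncured) / (s_cured - s_uncured)


# Inputs quoted in the text: baseline cure 0.45, survival 0.911 if cured,
# 0.198 if uncured, and a 14% odds reduction for chemotherapy at ages 50-59.
chemo_only = implied_cure_fraction(0.45, 0.911, 0.198, [0.14])
print(round(chemo_only, 3))

# Chemotherapy plus 5 years of tamoxifen (28%) in an ER-positive tumor.
chemo_plus_tam = implied_cure_fraction(0.45, 0.911, 0.198, [0.14, 0.28])
print(chemo_plus_tam > chemo_only)
```

Carrying full precision through the intermediate steps gives a baseline annual mortality odds of about 0.068 rather than the rounded 0.067 quoted in the text, but the resulting cure fraction still rounds to 0.518.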
Rather than attempting to micromodel treatment of older women, for the final calibration we simply reduced the cure fractions of the two more advanced stages by one-half for women aged 70 years and older at diagnosis to better match age-specific mortality in this age group. Although not affecting overall age-adjusted mortality much, this correction appears to bring the simulation more in line with mortality among older women. Future versions of the simulation may address treatment of older women in more realistic detail. Treatment of LMP tumors is assumed to be 100% curative since these tumors are, by our definition, not lethal.

Mortality From Non–Breast Cancer Causes

We actuarially adjusted age-specific all-cause life tables of female mortality by birth cohort [see Berkeley Mortality Data Base (27)] to develop mortality rates from non–breast cancer causes, using age-specific breast cancer mortality from the Centers for Disease Control and Prevention. Derivation of non–breast cancer mortality tables by one of our modeling team is described in chapter 3 of this volume (20). The resulting non–breast cancer mortality rates are a common input to all the CISNET breast cancer collaboration models.

The Modeled Population

The model was designed to produce counts of incident tumors and breast cancer deaths over time in a population the size and age structure of the state of Wisconsin. Starting the model in 1950, we simulate a number of women equal to each single-year age cohort of women in the Wisconsin population aged 20–99 years in 1950. In each year 1951–2000, we add to the simulated cohort the number of women aged 20 years in that year. The total number of women simulated is 2.95 million. Each complete simulation of these women is one replication of the simulation and results in data simulating the number of breast cancer cases in Wisconsin from 1950 to 2000.
The simulation is programmed in C++ using Microsoft Visual Studio Version 6 and runs either in standalone mode or as multiple replications in parallel by using the CONDOR sharing software (28–30). Each stochastic event in the simulation uses its own stream of random numbers, allowing use of the technique of common random numbers when performing experiments comparing, e.g., different screening protocols.

A run of the model is initialized in 1950 assuming all women at that year are breast cancer free. The breast cancer natural history submodel is invoked in 6-month cycles from simulated year 1950 to 1974 as “burn-in” for our model. The prevalence of occult breast cancer has stabilized after the 25-year burn-in so that the population going forward from 1975 is appropriately seeded with prevalent occult and detected breast cancer. Our fit to observed prevalent cases in 1975 is good. For standardized comparison of output we computed age-adjusted rates for all results using the U.S. standard population aged 30–79 years in the year 2000.

MODEL CALIBRATION

Model components are governed by parameters falling into three broad classes:

Fixed inputs common to all CISNET collaborators:
1. Cohort-specific mortality from non–breast cancer causes (20)
2. APC-predicted total breast cancer incidence rates in the absence of screening (9)
3. Functions governing diffusion of mammography over time for women of different ages (11,12)
4. Functions governing diffusion of adjuvant therapy over time (11,16)

Fixed inputs derived from literature for the Wisconsin model:
5. Adjuvant treatment effectiveness for all-cause mortality reduction (see text)
6. Parameters governing proportions of estrogen receptor–positive tumors (Tables 3 and 4)

Parameters fitted during model calibration:
7. Natural history parameters (Tables 1 and 2); 10 parameters
8. Breast cancer detection probabilities and rates determining curve segments in Fig. 1; 29 parameters, with many ordinal constraints
9. Baseline cure fractions for in situ, local, regional, and distant stage tumors; four parameters

The simulation model is a complicated, nonlinear function of the input parameters. Its output is multivalued and may be summarized by five curves: the predicted U.S. age-adjusted incidence rates for in situ, localized, regional, and distant breast cancers over time for the years 1975–2000 (see Fig. 2), and the U.S. age-adjusted breast cancer mortality rates over time for the same years. Furthermore, because the simulation is stochastic, separate runs, each with the same set of input parameter values, can produce different outputs if allowed by program control to use different random-number seeds. The input parameters listed above as sets numbered 7 and 8 were manipulated to make the simulation output conform as closely as possible to observed SEER and WCRS incidence rates within the 26-year time frame.

Fig. 2. Incidence rates are shown for four stages of breast cancer observed in Surveillance, Epidemiology, and End Results (SEER) data and the Wisconsin Cancer Reporting System (WCRS) within the interval from 1975 to 2000, adjusted to the U.S. standard population aged 30–79 in year 2000. The dotted boundaries show the acceptance envelopes used for parameter sampling experiments during model calibration. Note, the vertical scales are different for each stage to allow comparison of the dynamic shapes of all four curves.

Several technical challenges were encountered in doing calibration. First, traditional measures of fit such as least-squared error were not useful because they emphasize correspondence of the simulation output to the observed data at points with highest incidence rates, thus de-emphasizing fit of the dynamic shapes of incidence curves that rise from low rates to high rates, and also de-emphasizing fit of the low-incidence distant and in situ stages in favor of high-incidence stages (localized, regional). Second, because of the complex and nonlinear nature of the simulation, traditional sensitivity analyses examining only one or a few parameters at a time yield little insight into the solution.

We approached calibration by using acceptance sampling to fit the 29 parameters in sets 7 and 8. Biologically plausible ranges were set for each free parameter. A vector of parameter values was drawn at random by drawing each parameter from an independent uniform distribution over a discrete partition of its range so that all ordinal constraints on detection rates and probabilities were satisfied. The simulation was run using the sampled input values as fixed inputs, generating a cancer registry for the 2.95 million woman population from 1975 to 2000, and the age-adjusted (to year 2000 U.S. standard population aged 30–79 years) stage-specific incidence rates were computed for each year in this range. This output was then assigned a score by using acceptance envelopes around the observed SEER incidence rate curves. Fig. 2 shows acceptance envelopes placed around the SEER data. The envelopes were specified heuristically to encompass variation around SEER that might naturally be expected in a population the size we simulated. For example, the corresponding incidence curves from the Wisconsin Cancer Reporting System for 1978–1999 are plotted in Fig.
2 to show breast cancer incidence in a population the size we simulate and which is independent of the SEER data and not distinguishable from the set of similar curves for individual SEER registries (not shown here). The envelopes were set to also enforce the general shapes of the incidence curves, in particular requiring increasing incidence of in situ and local stage tumors throughout the period and flattening of regional and distant stage curves in the latter part of the covered years. Simulation output was scored by counting the number of time points across the four graphs at which the simulation-based incidence rate fell outside of the envelopes; the best possible score was zero, and the worst possible score was 104 (= 4 stages × 26 time points from 1975 to 2000). Empirically we determined a minimally acceptable simulation to have a score of 10 or less; exceptionally good output scored 5 or less. This scoring system was used to screen sampled parameter vectors. The sampling was carried out in two phases. First we assumed all onset breast cancers progress to be invasive breast cancer, and eventually lethal, if undetected and untreated. Under this assumption we did extensive natural history parameter sampling to find the best solution. The ranges for sampled parameters in this first phase are shown in the third column of Table 1. The constant Onset proportion (row 2, Table 1) is a fitted constant equaling the ratio of the age-specific and period-specific onset rate of preclinical occult tumors (malignant tumors that will progress and eventually threaten the woman's life) to the predicted breast cancer incidence rate in the absence of screening, this latter rate being a primary input to the simulation. Because some women may die of other causes with undetected tumors, this constant should be greater than 1.0. We sampled a range from 0.85 to 1.2 for this parameter allowing us to double check program logic. 
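The envelope scoring rule can be sketched in a few lines. This is an illustrative sketch only: the function names and the dictionary layout are hypothetical, the envelope values in the demonstration are made up, and treating a rate exactly on an envelope boundary as inside is an assumption the text does not settle.

```python
STAGES = ("in_situ", "localized", "regional", "distant")
YEARS = range(1975, 2001)   # 26 annual time points

def envelope_score(simulated, lower, upper):
    """Count the time points, over all stages, at which the simulated
    incidence rate falls outside the acceptance envelope."""
    score = 0
    for stage in simulated:
        for rate, lo, hi in zip(simulated[stage], lower[stage], upper[stage]):
            if rate < lo or rate > hi:
                score += 1
    return score

def classify(score):
    """Thresholds quoted in the text: 5 or less, 10 or less."""
    if score <= 5:
        return "exceptionally good"
    if score <= 10:
        return "minimally acceptable"
    return "rejected"

# Made-up flat envelopes, used only to exercise the scoring rule.
lower = {s: [10.0] * len(YEARS) for s in STAGES}
upper = {s: [20.0] * len(YEARS) for s in STAGES}
inside = {s: [15.0] * len(YEARS) for s in STAGES}
outside = {s: [25.0] * len(YEARS) for s in STAGES}

print(envelope_score(inside, lower, upper), classify(0))
print(envelope_score(outside, lower, upper))   # every point misses: 4 x 26
```

Because every time point counts equally regardless of the size of the rate, a miss on the low-incidence distant curve costs as much as a miss on the localized curve, which is the property that least-squares error lacked.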
No adequate solutions were found in the first phase; in the second phase of sampling we introduced three more parameters, shown in Table 2. The first parameter (LMP fraction, row 1, Table 2) allows some fraction of onset breast lesions to be LMP, as was suggested by Chang's (1) analysis. The second parameter specifies a maximum diameter for LMP tumors. The last parameter specifies a dwell time at maximum size after which an LMP tumor regresses and disappears. With these three added parameters we again performed extensive sampling of the parameters in Tables 1 and 2 jointly (plus parameters for mammogram sensitivity and clinical surfacing rates) and evaluated the outputs using the acceptance envelopes as described earlier.

Results of Model Calibration

Final values for fitted parameters are shown in Tables 1 and 2 and in Fig. 1. We have evaluated more than 475 000 randomly sampled parameter combinations. Constraining the LMP fraction to be zero, the best result had a score of 21. Characteristically, when LMP fraction is zero the best solutions have far too high incidence rates for early cancers before 1982 and far too low rates for these after 1990. When the two early-stage incidence curves appear more like the observed data, the regional and distant stages plummet in years after 1990, contrary to observed data. Allowing the LMP fraction to be nonzero, but less than or equal to 10%, improved the best score to 15 (among 289 000 sampled combinations of input parameters), and all solutions with scores near 15 had LMP fractions at 9%–10%. Ad hoc sampling indicated that much better solutions were available with LMP fraction between 30% and 50%, so we focused sampling for LMP fraction in this range (for efficiency, as each sample takes approximately 8 minutes of computer time on a Pentium III 1 GHz computer with 1 GB of RAM), still allowing other parameters to vary throughout their plausible ranges.
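The parameter draw used in both sampling phases can be sketched as follows. This is a hedged illustration, not the authors' procedure: the parameter names other than the LMP fraction are hypothetical, the grid sizes are arbitrary, and redrawing the whole vector until the ordinal constraints hold is one simple way to satisfy them that the text does not confirm.

```python
import random

def discrete_grid(low, high, steps):
    """Evenly spaced candidate values forming a discrete partition of [low, high]."""
    return [low + i * (high - low) / (steps - 1) for i in range(steps)]

def sample_vector(grids, ordered_pairs, rng, max_tries=10_000):
    """Draw each parameter independently and uniformly from its grid, and
    redraw the whole vector until params[a] <= params[b] for every
    (a, b) in ordered_pairs."""
    for _ in range(max_tries):
        params = {name: rng.choice(grid) for name, grid in grids.items()}
        if all(params[a] <= params[b] for a, b in ordered_pairs):
            return params
    raise RuntimeError("no draw satisfied the ordinal constraints")

rng = random.Random(2006)   # fixed seed so the sketch is repeatable
grids = {
    # Hypothetical detection parameters with an ordinal constraint.
    "sensitivity_small_tumor": discrete_grid(0.1, 0.9, 9),
    "sensitivity_large_tumor": discrete_grid(0.1, 0.9, 9),
    # Focused range for the LMP fraction quoted in the text.
    "lmp_fraction": discrete_grid(0.30, 0.50, 11),
}
constraints = [("sensitivity_small_tumor", "sensitivity_large_tumor")]

draw = sample_vector(grids, constraints, rng)
print(draw)
```

Each accepted vector would then be passed to a full simulation run and scored against the acceptance envelopes; at roughly 8 minutes per run, that step dominates the cost, which is why the sampling for LMP fraction was narrowed.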
We sampled 30 188 parameter vectors in this focused sampling study and found 91 with scores of 5 or less and 363 with scores of 10 or less. The final “best” parameter values for all parameters are shown in the right-hand columns of Tables 1 and 2. Fig. 3 illustrates the fit of the final parameter vector solution for the simulation model. The final panel of Fig. 3 shows scores from each of 300 replications using the final parameter vector and illustrates stochastic variation in scores given fixed input parameters; no score was larger than 10.

Fig. 3. Five panels show simulation results (black line) for age-adjusted incidence and breast cancer mortality rates compared to Wisconsin Cancer Reporting System and Surveillance, Epidemiology, and End Results (SEER) data (light gray lines). These results are based on the final calibrated input parameters. Error bars on simulation results are ±2 standard deviations among 300 replications of the simulation run at each annual time point. All rates are per 100 000, adjusted to U.S. standard year 2000 population for ages 30–79 years. The lower right panel is a histogram of acceptance scores generated by the 300 replications; these scores ranged from 0 to 10.
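The error bars in Fig. 3 summarize replication-to-replication variation at each annual time point. A minimal sketch of that summary is given below; the function name and data layout are hypothetical, the numbers are toy values, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics

def error_bars(replications):
    """replications: one list of annual rates per replication, all the same
    length. Returns (mean, mean - 2*SD, mean + 2*SD) for each year."""
    bars = []
    for yearly_rates in zip(*replications):   # regroup values by year
        mean = statistics.mean(yearly_rates)
        sd = statistics.stdev(yearly_rates)   # sample SD across replications
        bars.append((mean, mean - 2 * sd, mean + 2 * sd))
    return bars

# Toy data: three replications, two years (not model output).
toy = [[100.0, 120.0], [104.0, 120.0], [102.0, 120.0]]
for mean, low, high in error_bars(toy):
    print(mean, low, high)
```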
The marginal posterior distributions of six parameters (selected to show a variety of posterior results beyond LMP fraction) are shown in Fig. 4. The distribution for LMP fraction peaks between 42% and 46%; our best model uses 42% LMP tumors. LMP tumors progress to a maximum of approximately 1-cm diameter, dwell at this size for 2 years, and then regress if undetected. Examination of the best scoring model without regression of LMP tumors reveals an apparent depletion of the occult pool of localized invasive cancers with the incidence rate peaking near 1992 and beginning to fall after that. No such depletion has been observed in SEER data to date. Apparently regression of LMP tumors is needed to maintain an occult reservoir of small tumors to be found by screening that is not depleted by screening in the late 1990s.

Fig. 4. Marginal posterior distributions for six of the input parameters in Tables 1 and 2. As defined in those tables the parameters shown in the upper row, left to right, are LMP Fraction, Onset Proportion, In Situ Boundary, Onset Lag, Mean Gamma, and Var Gamma. These histograms show values of the parameters for 363 sampled vectors with scores of 10 or less. The ranges correspond to the ranges sampled for these parameters in the second phase of sampling (see text). The parameters are not statistically independent within the solution set; for example, the correlation between LMP Fraction and Onset Proportion is −0.28 (P<.001) and between Mean Gamma and Var Gamma is 0.88 (P<.001).

Biologic Plausibility of LMP and Hyperaggressive Breast Cancers

Our numerical results imply a large reservoir of exceedingly indolent breast cancers, the LMP tumors, at one end of the spectrum and a small fraction of hyperaggressive tumors at the other end of the spectrum. Are these biologically plausible? Welch and Black (31) discuss incidental autopsy evidence that a substantial pool of undetected invasive breast cancers may exist. A recent review of in situ breast cancer strongly suggests that ductal carcinoma in situ is an early stage of invasive cancer but notes a wide range in estimates (14%–60%) of the proportion of ductal carcinoma in situ that will progress to invasive stages in 10 years if left untreated (32). Our calculations are at the high end of this range, with 58% being non-LMP tumors. Love and Niederhuber review evidence suggesting that breast cancer growth may be viewed as “ … a problem of macro- and microenvironmental regulatory imbalance and dynamic chaos” within the host (33). They suggest a specific progesterone trigger affecting tumor cell kinetics and note that the same or other host-based regulatory systems may affect angiogenesis. It is not unreasonable under this model that in situ and small invasive cancers may be exceptionally indolent or even regress.

At the other end of the spectrum, is there a biologic model for the small fraction of all tumors (1%–2%) we found necessary to create as hyperaggressive tumors? These are tumors that at a very small size of the focal primary tumors would be classified as regional or distant stage and are unlikely to be discovered by mammography screening. Mustafa et al.
(34) review a series of invasive cancers found between 1984 and 1995 at one hospital with diameter of the primary less than 1 cm; nodal involvement was found in 11% of those less than 0.5 cm and 17% of those 0.6–1.0 cm in maximum diameter. Clearly there is a biologic basis for small tumors with regional spread, even if this series of 2153 tumors at one medical center was exceptional. Another possible entity is inflammatory breast cancer (IBC), “ … the most aggressive form of [breast cancer] … estimated to makeup about 1%–6% of breast cancer cases …” (35). IBC is staged as regional or distant when diagnosed and its observed incidence rate should be uncorrelated with rates of mammography; these are the characteristics we needed to postulate to fit observed data. Although our model deals only with focal masses, the statistical behavior of IBC is similar to the hyperaggressive subset we found necessary.

Calibration of Treatment Baseline Cure Fractions

With incidence rates calibrated as described above, the mortality rate curve output by the model was slightly lower than the observed U.S. breast cancer mortality rate curve but identical in shape. A small upward adjustment to baseline cure fractions for localized and regional tumors lifted the simulation output mortality rates. The final values for cure fractions are given in the text above and the fit to observed mortality rates shown in Fig. 3.

Base Case Results for the Wisconsin Model

Our CISNET base case estimate is that breast cancer mortality in the year 2000 was reduced overall by 38.3% from what it would have been without introduction of adjuvant therapy (tamoxifen and adjuvant chemotherapy) or of mammography screening over the interval 1975–2000.
Under the base case assumptions, mortality would have risen beyond levels observed in the mid-1980s because of background secular increase in breast cancer risk in the 1980s and 1990s, explaining why the 38% is larger than the observed reduction in breast cancer mortality over the 1990s. Were screening alone introduced, with therapy remaining at 1975 levels, the mortality reduction would have been 20.3%. Were only tamoxifen and adjuvant chemotherapy introduced without screening, the reduction would have been 20.8%. Although there is some redundancy between the two innovations in which breast cancer deaths would have been averted, our model implies that their effects in 2000 were largely independent and about equal.

We have explored variability in our base case estimate as a function of model parameter uncertainty. The set of 363 sampled vectors of model parameters with scores of 10 or less forms a joint posterior distribution of acceptable model parameters. Across these vectors of input parameters (keeping all other base case inputs constant), the mean estimated total reduction in breast cancer mortality is 35.5%, with 95% posterior interval of 32.9% to 38.7%. The mean estimated reduction with screening alone is 17.9% (95% interval 14.7% to 20.3%). The mean estimated reduction with adjuvant therapy alone is 20.3% (95% interval 19.1% to 21.6%). The correlation between reduction due to screening and reduction due to treatment is −0.35. Our posterior interval for reduction due to screening includes the point estimates of two of the other six CISNET models (it excludes the highest estimate and the lowest three estimates among the six). Our posterior interval for reduction due to treatment includes only the highest three of the other six models' point estimates. The uncertainty we represent here is parameter uncertainty and does not include modeling uncertainty (36).
Even so, one other model's bivariate estimate of reductions due to screening and adjuvant therapy is within our posterior bivariate distribution of results for these quantities (Fig. 5).

Fig. 5. Bivariate estimates of fraction reduction in year 2000 breast cancer mortality assuming only adjuvant therapy and assuming only screening. The double circle is the Wisconsin model's base case estimate. Solid points are estimates using each of the 363 acceptable parameter vectors in the Wisconsin simulation to compute base case results; the distribution of solid points represents uncertainty in the Wisconsin model's base case results due to parameter uncertainty. Open squares are base case point-estimate results of six other CISNET modeling groups.

Future Work

Traditionally, nonidentifiability of parameters in a highly parameterized model is problematic, and mutually compensatory parameters must be reduced for analysis. Using our method of acceptance sampling, non- or poor identifiability is less of a problem as we can identify sets of feasible solutions in the high-dimension parameter space. Our next step will be to characterize clusters of solutions in this high-dimension space indicating compensatory behavior of subsets of the parameters and then to find predictions and data that might distinguish among currently feasible solutions.
Our parameter sampling experiments continue at this writing with the goal being to characterize the topology of good parameter solution sets in parameter space. Preliminary analyses indicate that “good” solutions tend to fall in four relatively compact clusters in parameter space, connected by thin “bridges” of good solutions. These clusters seem distinguished by a faster versus slower mean for the tumor growth rate distribution, plus some compensatory changes in growth rate distribution variance and the lag between tumor onset and average tumor incidence. However, LMP fraction is still in the 30%–50% range for all solutions, with variation in this range compensated by overall onset proportion; we can find no solution sets with LMP fraction near zero. Exploring the biological implications of these clusters of solutions will be interesting, but whichever of these we are led to, there will not be drastic change in our model's implications for relative contribution of screening and adjuvant therapy to breast cancer mortality reduction since the solid points shown in Fig. 5 span all four solution clusters.

CONCLUSION

The Wisconsin Breast Cancer Epidemiology Simulation Model uses a systems analytic approach to modeling breast cancer as part of the NCI CISNET program. Particular attention has been devoted to developing and fitting a model of breast cancer natural history compatible with observed U.S. and Wisconsin statistics. We have found it necessary to postulate a class of limited malignant potential breast tumors to fit observed statistics. CISNET breast cancer base case results were produced using the technique of common random numbers to compare alternative counterfactual histories with and without introduction of mammography screening and with and without introduction of adjuvant therapy modalities after 1975 in the U.S. population.
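The technique of common random numbers can be illustrated with a toy comparison of two counterfactual scenarios. The sketch below is not the Wisconsin model: the onset and detection probabilities are invented, the two cure probabilities simply reuse the baseline localized and regional cure fractions from the text as stand-ins, and all names are hypothetical. It shows only the mechanism, namely one random-number stream per stochastic event type, seeded identically across scenarios.

```python
import random

EVENTS = ("onset", "detection", "treatment")

def make_streams(base_seed):
    """One independent random-number stream per stochastic event type."""
    return {event: random.Random(f"{base_seed}:{event}") for event in EVENTS}

def simulate(n_women, screening, base_seed):
    """Toy stand-in for one simulation run; returns (onsets, deaths)."""
    streams = make_streams(base_seed)
    onsets = deaths = 0
    for _ in range(n_women):
        # Draw from every stream for every woman, whether or not the draw
        # is used, so the streams stay aligned across scenarios.
        u_onset = streams["onset"].random()
        u_detect = streams["detection"].random()
        u_treat = streams["treatment"].random()
        if u_onset < 0.10:                                  # invented onset probability
            onsets += 1
            found_early = screening and u_detect < 0.60     # invented detection probability
            cure_prob = 0.82 if found_early else 0.45       # stand-in cure fractions
            if u_treat >= cure_prob:
                deaths += 1
    return onsets, deaths

with_screening = simulate(50_000, screening=True, base_seed=1975)
without_screening = simulate(50_000, screening=False, base_seed=1975)
print(with_screening, without_screening)
```

Because both runs consume identical onset draws, the same simulated women develop tumors in each scenario, and the difference in deaths reflects the intervention alone rather than sampling noise between runs.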
Supported by U01CA88211 from National Cancer Institute to the University of Wisconsin; HS00083, T-32 training grant for population-based health services research at University of Wisconsin, U01CA82004 from National Cancer Institute and National Institute of Environmental Health Sciences to University of Wisconsin, and P30CA14520 supporting the University of Wisconsin Comprehensive Cancer Center.

Since acceptance of this article, N. K. Stout received her PhD and is now with the Department of Health Policy and Management, Harvard School of Public Health, Boston, MA. V. Kuruchittam completed his PhD and is with the College of Public Health, Chulalongkorn University, Thailand.

The authors thank Drs. Polly Newcomb, Richard Lore, Elizabeth Burnside, and Polun Chang for their expert advice during the development and calibration of the Wisconsin simulation model.

References

(1) Chang P. A simulation study of breast cancer epidemiology and detection since 1982: the case for limited malignant potential lesions. Ph.D. dissertation, Department of Industrial Engineering. Madison (WI): University of Wisconsin–Madison; 1993.
(2) Csete ME, Doyle JC. Reverse engineering of biological complexity. Science 2002; 295: 1664–9.
(3) SEER*Stat. Surveillance, Epidemiology, and End Results (SEER) (http://seer.cancer.gov) Program SEER*Stat Database: Mortality-All COD, Public-Use With State, Total U.S. (1969–2000) <18 Age Groups>; National Cancer Institute, DCCPS, Surveillance Research Program, Cancer Statistics Branch, released April 2003. Underlying mortality data provided by National Center for Health Statistics.
(4) Spratt JS, Spratt JA. Chapter 21. Growth rates. In: Donegan WL, editor. Cancer of the breast, 5th edition. Philadelphia (PA): Saunders; 2002. pp. 443–76.
(5) Shwartz M. Analysis of preventive screening to detect early breast cancer. Ph.D. dissertation, Department of Urban and Regional Planning. Ann Arbor (MI): University of Michigan; 1975.
(6) Shwartz M. A mathematical model used to analyze breast cancer screening strategies. Oper Res 1978; 26: 937–55.
(7) Shwartz M. Validation and use of a mathematical model to estimate the benefits of screening younger women for breast cancer. Cancer Detect Prev 1981; 4: 595–601.
(8) Shwartz M. Validation of a model of breast cancer screening: an outlier observation suggests the value of breast self-examinations. Med Decis Making 1992; 12: 222–8.
(9) Holford TR, Cronin KA, Mariotto AB, Feuer EJ. Changing patterns in breast cancer incidence trends. J Natl Cancer Inst Monogr 2006; 36: 19–25.
(10) Feuer EJ, Wun LM. How much of the recent rise in breast cancer incidence can be explained by increases in mammography utilization? A dynamic population model approach. Am J Epidemiol 1992; 136: 1423–36.
(11) Cronin KA, Mariotto AB, Clarke LD, Feuer EJ. Additional common inputs for analyzing impact of adjuvant therapy and mammography on U.S. mortality. J Natl Cancer Inst Monogr 2006; 36: 26–9.
(12) Cronin KA, Yu B, Krapchow M, Miglioretti DL, Fay MP, Izmirlian G, et al. Modeling the dissemination of mammography in the United States. Cancer Causes Control 2005; 16: 701–12.
(13) Heine JJ, Malhotra P. Mammographic tissue, breast cancer risk, serial image analysis, and digital mammography. Part 1. Tissue and related risk factors. Acad Radiol 2002; 9: 298–316.
(14) Heine JJ, Malhotra P. Mammographic tissue, breast cancer risk, serial image analysis, and digital mammography. Part 2. Serial breast tissue change and related temporal influences. Acad Radiol 2002; 9: 317–35.
(15) Boyd N, Martin L, Stone J, Little L, Minkin S, Yaffe M. A longitudinal study of the effects of menopause on mammographic features. Cancer Epidemiol Biomarkers Prev 2002; 11: 1048–53.
(16) Mariotto AB, Feuer EJ, Harlan LC, Abrams J. Dissemination of adjuvant multiagent chemotherapy and tamoxifen for breast cancer in the United States using estrogen receptor information: 1975–1999. J Natl Cancer Inst Monogr 2006; 36: 7–15.
(17) Ashba J, Traish AM. Estrogen and progesterone receptor concentrations and prevalence of tumor hormonal phenotypes in older breast cancer patients. Cancer Detect Prev 1999; 23: 238–44.
(18) Systemic treatment of early breast cancer by hormonal, cytotoxic, or immune therapy. 133 randomised trials involving 31,000 recurrences and 24,000 deaths among 75,000 women. Part II. Early Breast Cancer Trialists' Collaborative Group. Lancet 1992; 339: 71–85.
(19) Systemic treatment of early breast cancer by hormonal, cytotoxic, or immune therapy. 133 randomised trials involving 31,000 recurrences and 24,000 deaths among 75,000 women. Part I. Early Breast Cancer Trialists' Collaborative Group. Lancet 1992; 339: 1–15.
(20) Rosenberg MA. Competing risks to breast cancer mortality. J Natl Cancer Inst Monogr 2006; 36: 15–9.
(21) Hurria A, Leung D, Trainor K, Borgen P, Norton L, Hudis C. Factors influencing treatment patterns of breast cancer patients age 75 and older. Crit Rev Oncol Hematol 2003; 46: 121–6.
(22) Gilligan MA, Kneusel RT, Hoffmann RG, Greer AL, Nattinger AB. Persistent differences in sociodemographic determinants of breast conserving treatment despite overall increased adoption. Med Care 2002; 40: 181–9.
(23) Goodwin JS, Freeman JL, Mahnken JD, Freeman DH, Nattinger AB. Geographic variations in breast cancer survival among older women: implications for quality of breast cancer care. J Gerontol A Biol Sci Med Sci 2002; 57: M401–6.
(24) Newcomb PA, Carbone PP. Cancer treatment and age: patient perspectives. J Natl Cancer Inst 1993; 85: 1580–4.
(25) Kemeny MM, Peterson BL, Kornblith AB, Muss HB, Wheeler J, Levine E, et al. Barriers to clinical trial participation by older women with breast cancer. J Clin Oncol 2003; 21: 2268–75.
(26) Wyld L, Reed MW. The need for targeted research into breast cancer in the elderly. Br J Surg 2003; 90: 388–99.
(27) Human Mortality Database. US Female Death Rates by Year of Birth, University of California, Berkeley (USA), and Max Planck Institute for Demographic Research, 2005. Available at: http://www.demog.berkeley.edu/~bmd/. [Last accessed: August 18, 2006.]
(28) Thain D, Tannenbaum T, Livny M. Distributed computing in practice: the Condor experience. Concurrency: Practice and Experience 2005; 17: 323–56.
(29) Basney J, Livny M. Chapter 5: Deploying a high throughput computing cluster. In Buyya R, editor. High performance cluster computing. Vol. 1. Upper Saddle River (NJ): Prentice Hall PTR; 1999.
(30) Thain D, Tannenbaum T, Livny M. Condor and the grid. In Berman F, Hey AJG, Fox G, editors. Grid computing: making the global infrastructure a reality. New York (NY): John Wiley; 2003.
(31) Welch HG, Black WC. Using autopsy series to estimate the disease “reservoir” for ductal carcinoma in situ of the breast: how much more breast cancer can we find? Ann Intern Med 1997; 127: 1023–8.
(32) Burstein HJ, Polyak K, Wong JS, Lester SC, Kaelin CM. Ductal carcinoma in situ of the breast. N Engl J Med 2004; 350: 1430–41.
(33) Love RR, Niederhuber JE. Models of breast cancer growth and investigations of adjuvant surgical oophorectomy. Ann Surg Oncol 2004; 11: 818–28.
(34) Mustafa IA, Cole B, Wanebo HJ, Bland KI, Chang HR. The impact of histopathology on nodal metastases in minimal breast cancer. Arch Surg 1997; 132: 384–90.
(35) Wingo PA, Jamison PM, Young JL, Gargiullo P. Population-based statistics for women diagnosed with inflammatory breast cancer (United States). Cancer Causes Control 2004; 15: 321–8.
(36) Manning WG, Fryback DG, Weinstein MC. Reflecting uncertainty in cost-effectiveness analysis. In Gold MR, Siegel JE, Russell LB, Weinstein MC, editors. Cost-effectiveness in health and medicine. New York (NY): Oxford University Press; 1996.

© The Author 2006. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oxfordjournals.org.

Journal

JNCI Monographs, Oxford University Press

Published: Oct 1, 2006
