# Chapter 15: Impact of Adjuvant Therapy and Mammography on U.S. Mortality From 1975 to 2000: Comparison of Mortality Results From the CISNET Breast Cancer Base Case Analysis

## Abstract

The CISNET breast cancer program is a consortium of seven research groups modeling the impact of various cancer interventions on the national trends of breast cancer incidence and mortality. Each of the modeling groups participated in a CISNET breast cancer base case analysis with the objective of assessing the impact of mammography and adjuvant therapy on breast cancer mortality between 1975 and 2000. The comparative modeling approach used to address this question allowed for a unique view into the process of modeling. Results shown here expand on those recently reported in the New England Journal of Medicine (Berry et al., N Engl J Med 2005;353:1784–92) by presenting mortality impact in several different ways to facilitate comparisons between models. Comparisons of each group's results in the context of modeling assumptions made during the process gave insight into how specific model assumptions may have affected the results. The median estimate for the percent decline in breast cancer mortality due to mammography was 15% (range of 8%–23%), and the median estimate for the percent decline in mortality due to adjuvant treatment was 19% (range of 12%–21%). A detailed discussion of the differences in modeling approaches and how those differences may have influenced the mortality results concludes the chapter.

## INTRODUCTION

The objective of the base case analysis is to model the influence of mammography screening and adjuvant treatment on the mortality rates, as well as to estimate whether other factors also influence the decline (1). To do this we partitioned the decline into the portion that was due to screening, due to adjuvant therapy, and due to other causes not explained by those two factors.
Each of the seven CISNET breast cancer models undertook the same analysis, giving the opportunity not only to observe a range of results but also to compare the results in the context of the modeling assumptions used by each group. A summary of the mortality results from the base case was recently published (2). This report expands on the published results and examines why groups reported different results for the benefits of mammography screening and adjuvant therapy. Details of the models from the seven CISNET groups are described in chapters 6–12 of this monograph (3–9), and a comparison of the models' structure is given in chapter 13 (10). The purpose of this chapter is to compare the primary outcome, the impact of screening and adjuvant treatment on breast cancer mortality as estimated by the seven models, and to provide insight into how modeling assumptions influence these results. The controlled nature of the base case analysis makes possible the comparative modeling approach that provides a link between the modeling decisions and ultimate outcomes.

Models differed in their general approach to the problem (10). All groups incorporated information obtained from outside data sources, such as clinical trials, to develop the model and fit model parameters. Some groups focused strictly on mammography and adjuvant treatment, allowing for a component of the decline to be attributed to other causes not directly examined in this exercise, such as improved surgical techniques or improvements within adjuvant therapies. Other groups adjusted their models, either formally through Bayesian methods or informally by calibrating model parameters, to fit observed population trends. This alternative approach led to a more complete partitioning of the decline in mortality to either screening or adjuvant treatment.
The models differed in structure, for example, in whether and how the natural history of breast cancer was modeled and in how the benefits of early detection and treatment were quantified. Each of the models contained parameters unique to that model, reflecting the underlying structure. Several common, or base case, inputs (11) were used by all the models. The purpose of using the same input values for key variables describing population risk, mammography screening, and adjuvant treatment was to eliminate the uncertainty related to these base case inputs as a potential cause for differences among modeling results. Common population variables allow for a more detailed understanding of how the assumed model structure influences the modeling results.

This chapter presents the mortality results from the seven models in several different ways. Using a variety of metrics to measure mortality facilitates comparisons and provides insight into how the results are similar and how they differ. We discuss how differences in modeling approaches, inputs, and assumptions may translate into different mortality results, giving valuable insight into the effect of underlying assumptions on the final results.

## METHODS

The breast cancer CISNET models used several common, or base case, inputs to model the effect of mammography screening and adjuvant treatment on U.S. mortality trends. Each of these base case inputs is described in detail in other chapters of this monograph (11,13,14). The base case inputs include the dissemination and use of screening mammography in the United States (12), the dissemination and use of adjuvant chemotherapy and tamoxifen by stage of diagnosis (13), and the background trend in incidence rates that would have been observed without the introduction of screening based on an age–period–cohort model (14).
The seven CISNET breast cancer models were used to estimate breast cancer age-adjusted mortality rates between 1975 and 2000 for women aged 30–79 years with and without mammography screening and adjuvant therapy. All models assume an increasing background trend in breast cancer risk by birth cohort, leading to an increase in incidence and a subsequent increase in mortality over the years modeled in the absence of the interventions of screening and adjuvant treatment. The amount of increased incidence in the background trend varied, with five models using the base case input as provided and two models using a smaller increase. The relation between the background trend and the mortality outcome is considered in the discussion. This analysis considers six runs:

- Run 1. No screening and no adjuvant therapy [Background Run]
- Run 2. Base case screening dissemination but no adjuvant therapy [Screening-Only Run]
- Run 3. No screening and no tamoxifen use but base case chemotherapy use [Chemotherapy-Only Run]
- Run 4. No screening and no chemotherapy but base case tamoxifen use [Tamoxifen-Only Run]
- Run 5. No screening but base case chemotherapy and tamoxifen use [Adjuvant Treatment–Only Run]
- Run 6. Base case screening and chemotherapy and tamoxifen use [Screening and Adjuvant Treatment Run]

Model results are shown for various runs, either as generated by the individual models or adjusted to match the 1975 age-adjusted mortality level. This adjustment was made by shifting the entire curve by the difference between the modeled 1975 level and the observed 1975 level. The curves were shifted to match in 1975 in order to focus on the impact of the interventions after 1975 rather than on the challenge of recreating the pre-1975 experience (which leads to the 1975 mortality level). A variety of measures are used to assess the mortality benefits of screening, chemotherapy, and tamoxifen either individually or in combination. All measures involve comparing Run 1 with Runs 2–6 to estimate benefit.
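The 1975 alignment described above amounts to adding one constant to every point of a modeled curve. A minimal sketch follows, assuming each run is stored as a mapping from calendar year to age-adjusted rate per 100 000 women. The function name `align_to_observed_1975` and all rates shown are hypothetical, chosen only for illustration and not taken from any CISNET model.

```python
def align_to_observed_1975(modeled, observed_1975):
    """Shift a whole modeled curve so that its 1975 value equals the observed one.

    `modeled` maps calendar year -> age-adjusted mortality rate (per 100 000 women).
    The same constant is added to every year, so year-to-year changes are preserved.
    """
    shift = observed_1975 - modeled[1975]
    return {year: rate + shift for year, rate in modeled.items()}


# Hypothetical rates, for illustration only (not CISNET output).
modeled_run = {1975: 50.0, 1990: 54.0, 2000: 47.0}
aligned = align_to_observed_1975(modeled_run, observed_1975=48.0)

for year in sorted(aligned):
    print(year, aligned[year])
```

Because the shift is additive, differences between years within a curve are unchanged; only the level moves.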
Let $\mathrm{M}_{1}(\mathrm{Year})$–$\mathrm{M}_{6}(\mathrm{Year})$ represent the modeled mortality rates from each of the six runs described for a particular year. The first comparison made was to compute the absolute difference between model runs, obtained by taking the difference in the estimated mortality for each year modeled. The set of equations in [15.1] shows how the absolute benefit is calculated for screening and adjuvant treatment.

\begin{eqnarray*}
\text{Absolute benefit, screening only} &=& \mathrm{M}_{1}(\mathrm{Year}) - \mathrm{M}_{2}(\mathrm{Year})\\
\text{Absolute benefit, chemotherapy only} &=& \mathrm{M}_{1}(\mathrm{Year}) - \mathrm{M}_{3}(\mathrm{Year})\\
\text{Absolute benefit, tamoxifen only} &=& \mathrm{M}_{1}(\mathrm{Year}) - \mathrm{M}_{4}(\mathrm{Year})\\
\text{Absolute benefit, adjuvant treatment only} &=& \mathrm{M}_{1}(\mathrm{Year}) - \mathrm{M}_{5}(\mathrm{Year})\\
\text{Absolute benefit, screening and adjuvant treatment} &=& \mathrm{M}_{1}(\mathrm{Year}) - \mathrm{M}_{6}(\mathrm{Year})
\end{eqnarray*} [15.1]

A second approach to assessing the benefits is to compute the difference between runs with and without screening and/or treatment relative to the estimated mortality rate from the Background Run. The set of equations in [15.2] shows how relative benefits for screening and adjuvant treatment are calculated. This approach produces a percent decline from the background mortality level attributable to screening, treatment, or both.
\begin{eqnarray*}
\text{Relative benefit, screening only} &=& [\mathrm{M}_{1}(\mathrm{Year}) - \mathrm{M}_{2}(\mathrm{Year})]/\mathrm{M}_{1}(\mathrm{Year})\\
\text{Relative benefit, chemotherapy only} &=& [\mathrm{M}_{1}(\mathrm{Year}) - \mathrm{M}_{3}(\mathrm{Year})]/\mathrm{M}_{1}(\mathrm{Year})\\
\text{Relative benefit, tamoxifen only} &=& [\mathrm{M}_{1}(\mathrm{Year}) - \mathrm{M}_{4}(\mathrm{Year})]/\mathrm{M}_{1}(\mathrm{Year})\\
\text{Relative benefit, adjuvant treatment only} &=& [\mathrm{M}_{1}(\mathrm{Year}) - \mathrm{M}_{5}(\mathrm{Year})]/\mathrm{M}_{1}(\mathrm{Year})\\
\text{Relative benefit, screening and adjuvant treatment} &=& [\mathrm{M}_{1}(\mathrm{Year}) - \mathrm{M}_{6}(\mathrm{Year})]/\mathrm{M}_{1}(\mathrm{Year})
\end{eqnarray*} [15.2]

Relative estimates of benefit are less sensitive to misspecification in the models. For example, a relative benefit is not influenced by the background trend, which is an important source of uncertainty in this modeling effort (15). Similarly, the relative estimates of benefit for screening programs are unaffected by the treatment assumptions. See Feuer et al. (15) for a discussion on the benefits and drawbacks of absolute versus relative presentation of the impact of modeled interventions on mortality trends.

To display the combined results, we plot the average of the seven probability density functions representing each model's results for the percent change in mortality for 2000 due to screening and due to treatment as compared to the Background Run. Each pair of estimates is viewed as a sample from a hypothetical population of models.
Each of the seven pairs of data points was represented by a bivariate normal distribution centered on that model's point estimates for the percent decline in 2000 for screening and adjuvant treatment, with a covariance matrix estimated from the set of seven observations. The bivariate normal distributions were defined by making the implicit assumption that the unobserved within-model variability (i.e., parameter uncertainty and random number variability) was equal to the observed between-model variability (i.e., model structural uncertainty). This assumption was justified by the one model where within-model variability was explicitly observed (3). The seven densities are averaged using equal weights to obtain an estimate of the posterior joint distribution for the population of models using the equation shown in [15.3] (16).

$d(x,y) = \frac{1}{7}\sum_{i=1}^{7} d_{i}(x,y)$ [15.3]

where $x$ is the percent decline due to screening, $y$ is the percent decline due to adjuvant treatment, $d_{i}(x,y)$ is the probability density value at the point $(x,y)$ for model $i$, and $d(x,y)$ is the distribution for the population. The plot of $d(x,y)$ is meant to give a visual impression of the combined result and a measure of the uncertainty around those results.

Modeling results were also used to directly partition the observed trend in breast cancer mortality. Between 1975 and 2000 mortality for women aged between 30 and 79 years decreased 10.3 per 100 000 women (17). Comparing mortality rates in 2000 from various scenarios permitted estimating the effect that screening, adjuvant treatment, and the background trend in incidence had on mortality rates. Whereas screening and adjuvant treatment were moving the mortality rates lower, the estimated background incidence trend was pushing them higher.
Comparing the mortality rate in the background run in 1975 with the rate in 2000 estimates how many additional deaths are attributable to the increased level of risk for breast cancer in the population (i.e., the background trend). The mortality benefit attributable to screening and adjuvant treatment can also be calculated as shown in the set of equations below [15.4]. An estimate for the amount of the observed mortality trend that was not accounted for by the background trend in incidence, screening, and adjuvant treatment in a model (i.e., the amount of the mortality reduction attributed to “other causes” not included in the models) can be obtained by starting with the observed change in mortality of 10.3 per 100 000 women and subtracting the effects of the background trend, screening, and treatment.

\begin{eqnarray*}
\text{Background} &=& \mathrm{M}_{1}(2000) - \mathrm{M}_{1}(1975)\\
\text{Screening} &=& \mathrm{M}_{2}(2000) - \mathrm{M}_{1}(2000)\\
\text{Adjuvant Treatment} &=& \mathrm{M}_{6}(2000) - \mathrm{M}_{2}(2000)\\
\text{Other causes} &=& 10.3 - \text{Background} - \text{Screening} - \text{Adjuvant Treatment}
\end{eqnarray*} [15.4]

Different modeling approaches led some groups to adjust their model assumptions and parameters until they could better explain the entire observed trend in mortality (calibrating their model to observed mortality), whereas other models left open the possibility that other factors were influencing mortality that were not captured in their models (10). The implications of calibrating to observed mortality trends are discussed in the conclusions.

In addition to estimating the benefit associated with screening and adjuvant treatment, this modeling design also allows for the estimate of synergy between the two. It was expected that the sum of the impact of Screening Only and Adjuvant Therapy Only would be greater than the impact of both together.
By comparing the benefit of Screening Only and Adjuvant Treatment Only alone with the benefit of both combined, the synergy between the two can be estimated.

$\mathrm{Synergy} = \mathrm{M}_{2}(2000) + \mathrm{M}_{5}(2000) - \mathrm{M}_{6}(2000)$ [15.5]

## RESULTS

Figure 1, A shows the mortality estimates from each of the seven groups for the Background Run, M1(year), and Fig. 1, C shows mortality estimates from the Screening and Adjuvant Treatment Run, M6(year). The mortality results are also shown adjusted to align with the observed 1975 level (Fig. 1, B and D). Fig. 1, A and C illustrate how mortality level and trend differ among the models, whereas Fig. 1, B and D facilitate the comparison of mortality trends. For example, five of the groups (Dana-Farber, Erasmus, Georgetown, Stanford, Wisconsin) produce similar mortality trends in the Background Run, and two groups (M. D. Anderson and Rochester) model a smaller increase in the absence of screening and adjuvant treatment. This effect is due partly to different assumptions related to the background trend, as will be discussed later. Fig. 1, C and D include the observed U.S. mortality trend for comparison with the Screening and Adjuvant Treatment Run.

Fig. 1. Modeled age-adjusted breast cancer mortality rates for women aged 30–70 years from 1975 to 2000 from the seven CISNET models. A) Mortality estimates from the Background Run with no mammography screening or adjuvant treatment. B) Mortality estimates from the Background Run with no mammography screening or adjuvant treatment adjusted to begin at the observed rate in 1975. C) Mortality estimates from the Screening and Adjuvant Therapy Run and the observed U.S. mortality trends. D) Mortality estimates from the Screening and Adjuvant Therapy Run adjusted to begin at the observed rate in 1975 and the observed U.S. mortality trends. Panel C originally appeared in Berry et al. (2), copyright 2006 Massachusetts Medical Society.

Figure 2, A–G shows estimated mortality results for the six model runs, M1(year)–M6(year), for each of the models. The Background Run can be interpreted as an estimate of what mortality would have been had screening and adjuvant therapy never been introduced into the population. Similarly, the Screening-Only Run represents mortality without the introduction of adjuvant therapy and the Adjuvant Treatment–Only Run represents what mortality would have looked like without mammography screening. The Rochester group did not model chemotherapy and tamoxifen separately and reports only an estimate for adjuvant treatment together.

Fig. 2. Modeled age-adjusted breast cancer mortality rates for women aged 30–70 years from 1975 to 2000 for each of the six scenarios considered (Background Run, Screening-Only Run, Chemotherapy-Only Run, Tamoxifen-Only Run, Adjuvant Treatment–Only Run, Screening and Adjuvant Treatment Run) for each of the CISNET models. A) Dana-Farber, B) Erasmus, C) Georgetown, D) M. D. Anderson, E) Stanford, F) Rochester, G) Wisconsin. Panel G originally appeared in Berry et al. (2), copyright 2006 Massachusetts Medical Society.

Figure 3, A–E graphs the estimated absolute mortality benefits for the five scenarios considered compared to the Background Run, calculated as shown in the set of equations above [15.1]. The mortality reduction represents the deaths avoided per 100 000 women by the use of mammography and/or adjuvant treatment for 1975–2000.

Fig. 3. Absolute difference in mortality due to screening and adjuvant treatments compared to the background run for women aged 30–70 years from 1975 to 2000 for each of the CISNET models. A) Screening Only, B) Chemotherapy Only, C) Tamoxifen Only, D) Adjuvant Treatment Only, E) Screening and Adjuvant Treatment.

Figure 4, A–E gives the estimated percent decline in mortality for the five scenarios considered compared to the Background Run, calculated as shown in the set of equations in [15.2]. The percent decline represents a percent reduction from the mortality level without screening or adjuvant treatment. Table 1 gives similar results for 2000.
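The absolute and relative benefits plotted in Figures 3 and 4 follow directly from [15.1] and [15.2]. The sketch below is illustrative only: it assumes each run is stored as a mapping from calendar year to age-adjusted rate per 100 000 women, and the function names and rates are hypothetical rather than output from any CISNET model.

```python
def absolute_benefit(background, run):
    """[15.1]: M1(year) - Mk(year), in deaths avoided per 100 000 women."""
    return {year: background[year] - run[year] for year in background}


def relative_benefit(background, run):
    """[15.2]: [M1(year) - Mk(year)] / M1(year), a fraction of the background rate."""
    return {
        year: (background[year] - run[year]) / background[year]
        for year in background
    }


# Hypothetical rates, for illustration only (not CISNET output).
background_run = {1975: 50.0, 2000: 60.0}   # Run 1
screening_only = {1975: 50.0, 2000: 51.0}   # Run 2

abs_benefit = absolute_benefit(background_run, screening_only)
rel_benefit = relative_benefit(background_run, screening_only)

print(abs_benefit[2000])          # absolute benefit in 2000
print(100 * rel_benefit[2000])    # percent decline in 2000
```

The same two functions apply unchanged to Runs 3–6 by passing the corresponding run in place of `screening_only`.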
The estimated total mortality decline for both screening and treatment varied from 24.9% to 38.3%, with estimates of 7.5%–22.7% due to screening and 12.0%–20.9% due to adjuvant treatment. All models estimated a small negative interaction between screening and treatment, as demonstrated in the Synergy column in Table 1. A negative synergy suggests that the benefits of screening are larger without adjuvant treatment than they would be with adjuvant treatment. This is the direction of the synergy that was expected a priori. Taken in the extreme to demonstrate the concept, if treatment were completely curative there would be no additional mortality benefit associated with early detection.

Fig. 4. Percent decline in mortality due to screening and adjuvant treatment compared to the background run for women aged 30–70 years from 1975 to 2000 for each of the CISNET models. A) Screening Only, B) Chemotherapy Only, C) Tamoxifen Only, D) Adjuvant Treatment Only, E) Screening and Adjuvant Treatment.

Table 1. Percent decline for the year 2000 compared to background run*

| Model | Tamoxifen (% decline) | Chemotherapy (% decline) | Both treatments (% decline) | Screening (% decline) | Screening and treatment (% decline) | Synergy |
| --- | --- | --- | --- | --- | --- | --- |
| Dana-Farber | 6.1 | 6.1 | 12.0 | 22.7 | 32.9 | −1.8 |
| Erasmus | 12.0 | 9.6 | 20.9 | 15.3 | 30.9 | −5.3 |
| Georgetown | 7.7 | 7.0 | 14.6 | 12.4 | 24.9 | −2.1 |
| M. D. Anderson | 10.7 | 9.5 | 19.5 | 10.6 | 27.5 | −2.6 |
| Rochester | NA | NA | 19.0 | 7.5 | 25.6 | −0.9 |
| Stanford | 8.9 | 6.9 | 14.9 | 16.9 | 29.9 | −1.9 |
| Wisconsin | 12.5 | 8.9 | 20.8 | 20.3 | 38.3 | −2.8 |

\* Synergy is defined as Screening and treatment together – (Screening Alone + Treatment Alone). NA = not available.

Figure 5, A and B originally appeared in Berry et al. (2). Figure 5, A shows a contour plot of the estimated distribution of a larger population of model results from which our seven models represent a sample. Figure 5, B shows the three-dimensional rendering of the contour plot reflecting the joint results. Lines on the contour plot represent equal values of the average of the individual distributions, and the distance between adjacent contour lines represents equal differences in values as calculated using equation [15.3].

Fig. 5. A) Point estimates from the individual models and distribution contours for the combined model results derived by kernel density estimation. B) Three-dimensional rendering of the contour plot. This figure originally appeared in Berry et al. (2), copyright 2006 Massachusetts Medical Society.
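The Synergy column of Table 1 and the equal-weight average in [15.3] can both be reproduced from the Table 1 point estimates. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the use of the unbiased (n − 1) sample covariance is an assumption, since the text says only that the covariance matrix was estimated from the seven observations.

```python
import math
from statistics import median

# Table 1 percent declines in 2000 relative to the Background Run:
# model -> (screening only, both treatments, screening and treatment)
TABLE_1 = {
    "Dana-Farber": (22.7, 12.0, 32.9),
    "Erasmus": (15.3, 20.9, 30.9),
    "Georgetown": (12.4, 14.6, 24.9),
    "M. D. Anderson": (10.6, 19.5, 27.5),
    "Rochester": (7.5, 19.0, 25.6),
    "Stanford": (16.9, 14.9, 29.9),
    "Wisconsin": (20.3, 20.8, 38.3),
}


def synergy(screening, treatment, combined):
    """Table 1 footnote: combined decline minus the sum of the separate declines."""
    return round(combined - (screening + treatment), 1)


def sample_covariance(points):
    """2x2 sample covariance (sxx, sxy, syy); the n - 1 divisor is an assumption."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return sxx, sxy, syy


def bivariate_normal_pdf(x, y, center, cov):
    """Density at (x, y) of a bivariate normal with the given center and covariance."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = x - center[0], y - center[1]
    # Quadratic form (dx, dy) Sigma^{-1} (dx, dy)^T written out for a 2x2 matrix.
    q = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * q) / (2 * math.pi * math.sqrt(det))


def mixture_density(x, y, points, cov):
    """[15.3]: equal-weight average of the per-model densities d_i(x, y)."""
    return sum(bivariate_normal_pdf(x, y, p, cov) for p in points) / len(points)


# x = percent decline due to screening, y = percent decline due to adjuvant treatment
POINTS = [(s, t) for s, t, _ in TABLE_1.values()]
COV = sample_covariance(POINTS)

SYNERGY = {model: synergy(*row) for model, row in TABLE_1.items()}
MEDIAN_SCREENING = median(s for s, _ in POINTS)
MEDIAN_TREATMENT = median(t for _, t in POINTS)

print(SYNERGY)
print(MEDIAN_SCREENING, MEDIAN_TREATMENT)
print(mixture_density(15.0, 17.0, POINTS, COV))
```

Evaluating `mixture_density` on a grid of (x, y) values would give the surface that a contour plot such as Fig. 5, A displays; the medians correspond to the 15% and 19% figures quoted in the abstract.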
The final presentation of the mortality results is the partitioning of the observed decline in mortality of 10.3 deaths per 100 000 women into four categories (increase due to background trend, decrease due to screening, decrease due to adjuvant treatment, and other factors) as described in the set of equations [15.4]. Fig. 6, A demonstrates the concept of partitioning the observed difference in age-adjusted mortality into the four categories considered. Fig. 6, B and C show the partitioning for each of the seven groups. Although no group explained the entire decline in mortality with their model, several groups came quite close. Most notably, Wisconsin, Rochester, and M. D. Anderson had a very small portion of the trend that was not explained by screening and adjuvant treatment. These three models incorporated components into their models to better match observed incidence, survival, and mortality data. As is demonstrated in Fig. 1, B, M. D. Anderson and Rochester modeled a smaller increase in mortality due to changes in background risk than the other models since they did not directly use the base case input described in chapter 4 (14). The reduction in mortality due to screening and treatment varied, but on average about half of the total decline was associated with screening and half associated with treatment.

Fig. 6. Partitioning of the observed difference between the age-adjusted mortality rate in 1975 and the rate in 2000 into the increase in mortality associated with the background trend, the decrease associated with screening, the decrease associated with adjuvant treatment, and the portion of the change that is due to other factors not modeled. A) Schematic demonstrating the concept of partitioning the observed mortality trends. B and C) Partitioning of the mortality trend from each of the CISNET models.

## DISCUSSION

The joint analysis of the CISNET consortium provided a unique opportunity to gain insight into the impact of screening and adjuvant treatment on breast cancer mortality as well as into the models themselves. There is uncertainty related to this modeling effort occurring at several levels. The first type of uncertainty is the stochastic variance associated with a single run of a simulation program for an individual model (i.e., the variability associated with a particular choice of random seed in the random number generators). Second, there is uncertainty associated with model inputs. These model parameters can be unique to each model, such as natural history parameters within a particular model, or shared across models, such as use of mammography and adjuvant treatment over the period of interest. The third type of uncertainty is associated with the model structure, for example, whether a continuous tumor growth model or a discrete series of stages was used to describe progression of disease. Careful consideration of how each type of uncertainty influences modeling results is needed to better understand the differences in the estimated mortality reductions. Stochastic variance associated with a single run of a simulation program was generally minimized by running the models for a large population and/or averaging results over several runs.
The uncertainty related to parameters unique to individual models can be addressed by each model individually either through a formal approach such as M. D. Anderson's Bayesian modeling or through other means such as sensitivity analysis of key model parameters. This type of model uncertainty is not addressed in this chapter but considered in chapters 6–12 of this monograph, which describe individual models in detail (3–9). The uncertainty associated with variables used across all the models was addressed by standardizing several of the model inputs. Common inputs, such as the background risk, dissemination of mammography, and the dissemination of treatment, eliminate uncertainty in those parameters as a cause for different mortality results. These inputs were either used directly as model inputs or as values used to calibrate model parameters. However, not all models used the base case inputs in the analyses presented here. For example, the M. D. Anderson group felt that the inclusion of the uncertainty associated with the background trend was crucial to their modeling effort and incorporated this into their model, thereby reintroducing differences in background trend as a possible explanation for different results. The final source of uncertainty, and the focus of this joint analysis, is the consequence of model structure and modeling assumptions. Usually models such as those described here are developed and presented individually with no measure of the effect of modeling choices made along the way. Looking at seven distinct approaches to the same problem gives a unique opportunity to quantify variation related to the modeling approach. The differences in results presented in this chapter are mainly a reflection of the structures assumed in the models and the approach to fitting model parameters. There is no single reason for the differences in the estimated mortality benefits. 
The differences arise from intricate relationships between many aspects of the individual models. Because of this complexity, we consider how model differences produced different estimates of mortality reduction through a series of relevant questions.

### Why do the models predict different mortality trends in the absence of screening and adjuvant treatment?

For five of the groups, mortality rates from the Background Run were determined by incidence rates produced by the age–period–cohort (APC) model (14) that produced an increase in incidence rates without screening in combination with the respective survival components of the models. As seen in Figure 1, B, M. D. Anderson and Rochester had a noticeably smaller increase in mortality in the background run. Rochester based their background trend on a combination of risk observed in the Canadian National Breast Screening Studies (CNBSS) trial and the results of the APC model. M. D. Anderson did not directly use the background trend in risk but used the APC as a basis of a prior distribution and let their Bayesian modeling approach determine the background trend that would best fit observed mortality trends as a posterior distribution. Both of these approaches led to a smaller background trend than used by the other groups.

M. D. Anderson and Rochester also allowed for improvements in breast cancer survival over time in the absence of adjuvant therapy via different mechanisms. The other five groups assumed that in the Background Run survival would remain constant from 1975 to 2000 (after controlling for characteristics at diagnosis such as patient age, stage, and size of tumor). The increased survival trend for M. D. Anderson and Rochester represented other improvements in detection and treatment of breast cancer not specifically captured in the interventions considered (mammography and adjuvant therapy). Because they have background improvements in survival, even with the same incidence of disease, M. D. Anderson and Rochester would have a smaller mortality rise in the background run than the other five groups. Including survival improvements over time produced a smaller increase in mortality from the background run and indirectly assigned some of the decline in mortality to factors not directly modeled (10).

### How do assumptions of the screening benefit affect the mortality results?

Generally screening benefit was modeled by a shift to detection of less advanced disease. A case would be placed on a survival curve of a clinically detected case with equivalent disease at diagnosis. Georgetown models an increase in survival only for cases that were diagnosed in an earlier stage through screening. Dana-Farber made survival a function of both stage and screening history. The Georgetown model assumes a stage progression model to approximate tumor growth, and Dana-Farber assigns stage for a screen-detected case from a different stage distribution than a clinically detected case. Stanford modeled a benefit based on stage and size, allowing for a benefit of being detected at an earlier point within the same stage as well as being shifted to an earlier stage. Rochester modeled survival by age, stage, and tumor size. One would expect that modeling only a between stage benefit (i.e., no benefit from detection of a smaller tumor within the same stage) would underestimate the benefit of screening. No benefit for a within stage shift might suggest why the Georgetown estimates of screening benefit are somewhat smaller than others. The overall higher benefit estimated by the Dana-Farber model is closely related to the stage distribution for screen-detected cases estimated using data from the Breast Cancer Surveillance Consortium (BCSC) (18). The screened stage distribution is influenced by the number of examinations and interval between the exams.
Stage distributions for screen-detected cases with a previous mammography within 1 year and for cases with more than 1 year since a previous mammogram were obtained by the BCSC. Since there were more screening patterns (combinations of 1-, 2-, and 5-year intervals) modeled than considered in developing these stage distributions, stage distributions applied to screening patterns other than annual might have led to a larger mortality reduction than would have been expected. Thus larger stage shifts applied to nonannual screening patterns might explain some of the overall higher benefit associated with screening reported by the Dana-Farber group. M. D. Anderson modeled a combination of stage shift for earlier stage disease and survival benefit for all screen-detected cases. In the background run, M. D. Anderson assumed that the increasing trend in background disease applied only to earlier disease (in situ, Stage I and Stage II). The trend in late disease (Stage III and Stage IV) followed the same trend with or without screening; i.e., trends in late-stage disease are unaffected by screening. Therefore screening cannot shift a case from Stage III and Stage IV to an earlier stage of disease. However late-stage screen-detected cases do receive a survival benefit through a beyond–stage shift screening benefit. This benefit is modeled as a proportional hazards survival benefit that increases survival for all screen-detected cases, including Stage III and Stage IV. When comparing trends in incidence from the screening runs to the Background Run without screening, screening results in a large increase in in situ and Stage I disease and a decrease in Stage II disease. The decrease in Stage II is primarily for node-positive cases, suggesting that there is a stage shift in earlier-stage disease from node-positive to node-negative disease within Stage II and from Stage II to either Stage I or in situ. 
The screening benefit for early-stage disease is a combination of both a possible stage shift and extended survival via the beyond–stage shift benefit. Since the survival benefit for a stage shift is generally greater than a within-stage survival benefit, the assumption that the incidence of late-stage disease is unaffected by screening limits the benefit of screening. This assumption may account for the lower benefit of screening estimated by the M. D. Anderson group and possibly explains why they estimate a large beyond–stage shift effect. Erasmus and Wisconsin considered a cure/no-cure approach, modeling whether screening moved the diagnosis time back far enough that a patient would now be cured of the disease. The other groups assumed that early detection and treatment would increase cause-specific survival. Two possibilities exist when extending survival. The first is that survival is extended until a woman dies of other causes, in which case her mortality benefit is similar to a cure. The second possibility is that extending survival only delays the time of death due to breast cancer; those cases would not ultimately receive a mortality benefit, although they may affect trends over time by delaying the time of death. Even if one carefully picks a cure fraction to replicate results from a model that extends survival at a particular follow-up time (e.g., 5-year survival), in the long run it would be expected that curing a fraction of the cases would lead to a larger mortality benefit than simply an increase in survival time. Therefore the cure approach used by Wisconsin and Erasmus may lead to a higher mortality benefit than extending survival. How do assumptions of the benefit of adjuvant treatment affect the mortality results? There were three approaches to modeling benefit of treatment. 
One was to increase survival by using a proportional hazards assumption, the second was to adjust the fraction of the population that is cured, and the third was a combination of both changing the portion of the population that was cured and extending survival for the uncured patients. As with the screening benefit, the use of a cure approach may lead to a higher mortality benefit. Wisconsin and Erasmus used a cure approach and predicted the two highest benefits from treatment (shown in Table 1). The use of a cure/no-cure approach for screening and adjuvant treatment may partially explain why Erasmus and Wisconsin fall in the upper right corner of the contour plot shown in Fig. 5, A. When looking at the other five groups, there seems to be a clear negative statistical correlation between the amount of mortality decline explained by screening and the amount explained by adjuvant therapy. This negative correlation was predicted a priori since a larger portion explained by screening would leave less to be explained by adjuvant treatment and vice versa. However, the two groups that modeled cure had generally higher benefits for both screening and adjuvant treatment simultaneously. Rochester approached modeling treatment from a different viewpoint. Instead of specifically modeling the dissemination of adjuvant chemotherapy as described in chapter 5 of this monograph (11), they directly modeled survival from Surveillance, Epidemiology, and End Results (SEER) data, controlling for age, stage, and tumor size by using a survival cure model. The cure model combined the probability of cure and the timing of death in one concept, thus allowing treatment advances to extend survival for some patients while simultaneously changing the probability of being cured of breast cancer. They noticed a change point in the survival trend around the time that adjuvant treatment was being widely disseminated and attributed the increase in the survival trend to those therapies. 
This approach cannot explicitly distinguish between the specific therapies mentioned and other factors that may have been affecting survival concurrently, such as improvements in surgical methods. Although this approach has the advantage of capturing other improvement in treatment in addition to chemotherapy and tamoxifen, it makes it more difficult to compare with the other model results and may be confounded with screening. If screen-detected cases had a survival benefit above and beyond a clinically detected tumor with equivalent characteristics (e.g., related to either length bias or overdiagnosis), this benefit would be captured as treatment rather than screening. Rochester attributes 72% of the overall mortality benefit modeled (overall benefit defined as the benefit of screening alone plus benefit of treatment alone) to treatment, the highest among the groups. Other groups specifically modeled better prognosis for screen-detected tumors as a screening benefit through mechanisms such as a beyond–stage shift screening benefit or making survival dependent on screening history. How do assumptions related to modeling the natural history of breast cancer influence the mortality results? The benefit of early detection to an individual is determined largely by how much earlier in the natural history of disease a case is screen detected. In groups that specifically modeled the natural history of breast cancer, this period is closely related to the rates of tumor growth or dwell time within a stage and the consequent preclinical detectable period. The latent time between screen detection and clinical detection (lead time) is associated with the severity of disease at the time of screen diagnosis. In several cases, natural history models were fitted by using screening trial data; therefore, the particular trial used determined the parameter values. Rochester used data from the CNBSS trial in fitting the natural history model. 
The Canadian trials showed little screening benefit, and Rochester's model produced a smaller lead time than other models and a lower benefit for screening. Erasmus fit model parameters by using the Swedish Two County trial. The Swedish Two County study showed a larger difference between the screened and unscreened groups. Consequently, this group had a larger lead time and larger benefit of screening. Dana-Farber modeled sojourn time by using data from the CNBSS, Health Insurance Plan of New York trial, Swedish Two County study, Edinburgh, and Malmo trials as estimated by Shen and Zelen (19). A detailed discussion of parameters calibrated and datasets used in the calibration can be found in chapter 13 of this monograph (10). Four of the seven models (M. D. Anderson, Erasmus, Georgetown, and Wisconsin) included in situ disease in their models. Other groups felt that available information on the natural history of in situ disease was currently inadequate to include in situ in their models. When in situ is included in the model, screening can detect tumors before they progress to malignant disease. Models that do not include in situ do not allow for that possibility, and the earliest a tumor can be diagnosed is at a small localized stage. Observed trends of in situ breast cancer clearly show that many cases are being screen detected in this early stage where cause-specific survival is near 1. Without detection in the in situ stage, models assume that some portion of these tumors would progress to be diagnosed at a later stage. Not modeling this opportunity for early detection would underestimate the benefit of screening. However, there does not seem to be a relationship between the screening benefits produced by the models and whether or not they modeled in situ disease. Perhaps this lack of association is explained by the very good prognosis also seen in early localized disease. 
For example, in one model, screening may move diagnosis to an in situ stage, and in another model, screening may move diagnosis to an early localized stage; there is not much of a mortality difference between the two since both would have very low mortality. Clearly the modeling of in situ is important in modeling morbidity and possible overdiagnosis but may have less influence on the outcome of mortality. How do different assumptions on length and lead time biases affect the mortality results? Lead-time bias is a reflection of survival being increased by early detection whether or not a patient received a mortality benefit from screening. For example, if a patient's cancer is detected 6 months earlier by screening but she still dies of breast cancer at the same time as she would have without screening, her survival is increased by 6 months even though she has received no mortality benefit from early detection. How different models deal with lead time is discussed in chapter 13 of this monograph (10), in individual model description chapters (3–9), and in the chapter on intermediate outcomes (20). Many of the models guaranteed that patients would not die of their disease during their lead time. Length bias is related to the phenomenon that slower-growing tumors (i.e., tumors with longer sojourn times) are more likely to be detected by screening. Length bias has many implications that affect the modeling of mortality. One is that if screening is detecting slower-growing, potentially less aggressive tumors, then tumors detected by screening may also have a better prognosis after detection than similar clinically detected tumors. This implication has been demonstrated in observed data that show screen-detected cases have a better prognosis than comparable clinically detected cases (21). 
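The length-bias mechanism described above can be illustrated with a small simulation. This is only a sketch under stated assumptions and does not correspond to any of the seven CISNET models: the exponential sojourn-time distribution, the 3-year mean, the 2-year screening interval, the perfectly sensitive test, and the function name `simulate_length_bias` are all hypothetical choices made for illustration.

```python
import random


def simulate_length_bias(n_tumors=100_000, screen_interval=2.0,
                         mean_sojourn=3.0, seed=0):
    """Compare mean sojourn times of screen-detected and all tumors.

    Illustrative assumptions (not taken from the chapter): exponential
    sojourn times, tumor onset uniformly distributed between two screens,
    and a screening test with perfect sensitivity.
    """
    rng = random.Random(seed)
    all_sojourn = []
    screen_sojourn = []
    clinical_sojourn = []
    for _ in range(n_tumors):
        # Length of the preclinical detectable period, in years.
        sojourn = rng.expovariate(1.0 / mean_sojourn)
        # Onset of the detectable period, measured from the previous screen.
        onset = rng.uniform(0.0, screen_interval)
        all_sojourn.append(sojourn)
        # The next screen occurs at time screen_interval. The tumor is screen
        # detected only if it has not yet surfaced clinically by then.
        if onset + sojourn > screen_interval:
            screen_sojourn.append(sojourn)
        else:
            clinical_sojourn.append(sojourn)
    return {
        "all_tumors_mean": sum(all_sojourn) / len(all_sojourn),
        "screen_detected_mean": sum(screen_sojourn) / len(screen_sojourn),
        "clinically_detected_mean": sum(clinical_sojourn) / len(clinical_sojourn),
    }


if __name__ == "__main__":
    for name, value in simulate_length_bias().items():
        print(f"{name}: {value:.2f} years")
```

Under these assumptions the screen-detected group has a longer average sojourn time than the tumor population as a whole, simply because a tumor with a long preclinical period is more likely to still be preclinical when the next screen occurs. The sketch says nothing about prognosis after detection; that link depends on whether a model ties survival to growth rate.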
When a screening schedule is overlaid on the natural history model in a simulation, an outcome is that screening will more likely detect slower-growing tumors. This outcome will automatically result in screen-detected tumors having longer sojourn times than clinically detected tumors on average. However, this result does not guarantee that screen-detected tumors will also have a better prognosis than similar clinically detected tumors unless tumor growth rates are related to survival times. Stanford, Erasmus, Rochester, and Wisconsin linked survival to tumor characteristics, such as size [more details in chapter 13 of this monograph (10)], to allow for a possible benefit from screen detection earlier within a single stage (a within-stage benefit) in addition to the possibility of shifted detection to an earlier stage (between-stage or stage-shift benefit). Thus screen-detected tumors would tend to have a better prognosis than clinically detected tumors within the same stage, since they would tend to be smaller at diagnosis. Wisconsin also introduced a further prognostic benefit in screen-detected cases above that of survival conditioned on tumor characteristics. By the inclusion of tumors with latent malignant potential, many of which are detected only by screening, the survival distributions for screen-detected tumors will consist of a larger percentage of cases that are not at risk of dying from their breast cancer. If length bias results in screen-detected tumors having a better prognosis even after controlling for observable tumor characteristics, as is suggested in the literature (21), the Stanford, Erasmus, and Rochester models would not fully capture that difference. The Georgetown model captures no survival benefit for screen-detected cases beyond a stage shift. Models that do not involve simulation on an individual level addressed length time bias through different means. M. D. 
Anderson directly modeled a beyond–stage shift benefit that gave screen-detected tumors an additional survival benefit over similar clinically detected tumors. The magnitude of the additional benefit assigned by M. D. Anderson was estimated using Bayesian methods (3). Dana-Farber addresses this issue by modeling a stage shift that is a function of screening history (8). One would expect that the modeled mortality benefits of screening would be related to how each of the models dealt with lead time and length bias; however, this relationship was not specifically studied in this analysis, and no relationship between how lead time and length biases were addressed and the estimated mortality reduction is apparent. How do assumptions of screening dissemination affect the mortality results? A screening dissemination simulator was provided to all the groups mimicking the U.S. experience [(12); http://cisnet.cancer.gov]. Six of the groups used this dissemination program directly in their model. Dana-Farber used parameters of this simulator instead of the dissemination program. The Dana-Farber screening patterns were composed of mixtures of 1-, 2-, and 5-year intervals between exams. In the original screening dissemination model, an annual screener would return for a subsequent screening exam between 1 and 2 years, a biennial screener between 2 and 5 years, and the irregular screener over a longer range of time with a mean of 5 years. Comparing the Dana-Farber schedules with the screening patterns generated by the simulation models showed that the Dana-Farber schedules were (on average) slightly shortened with regard to the times between exams. The shorter time between examinations is due primarily to the absence of screening intervals longer than 5 years, which occur in the other models. More frequent screening may have contributed to the higher benefit associated with screening reported by the Dana-Farber group. 
How can some groups explain the entire mortality decline, whereas others left a portion unexplained? Wisconsin, M. D. Anderson, and Rochester focused more on matching observed trends in the Screening and Adjuvant Treatment Run, which they achieved by incorporating more components in their models that they felt best explained what was occurring over the period. Wisconsin included nonprogressing tumors in the natural history and fit their model parameters to better reproduce incidence trends. M. D. Anderson fit their model parameters on the basis of how well they could reproduce mortality trends. Rochester fit SEER incidence and survival, thus indirectly fitting mortality. The other models were more focused on fitting model parameters from outside data sources such as clinical trials and left a portion of the incidence increase and mortality decline unexplained. This unexplained portion of the mortality decline can then be used to generate hypotheses (not yet verified by outside data sources) as to what else was affecting mortality over the period of interest. Hypotheses of what else could be affecting mortality are often similar to the component already added to the models by groups matching observed trends. Whether those factors were included in the model or left as a hypothesis for future research accounts for much of the variation in the modeling results, particularly the portion of the observed decline that was left unexplained in Fig. 6. Are the mortality results different? For the reasons listed above and others not covered in the discussion, the modeling results varied. The results were more homogeneous for treatment (range of 14.6%–20.8% for the percent decline in mortality due to treatment) and more varied for screening (range of 7.5%–22.7% for the percent decline in mortality due to screening). There were several reasons why these differences arose. The first reason was related to the model inputs. 
Even though the original intention was for all the models to use the same base case input parameters, many groups processed the inputs to better fit their model structure or their modeling philosophy. Therefore, the uncertainty in the common inputs was not completely controlled in this comparison. The second reason related to differences in modeling approach. Rochester, M. D. Anderson, and Wisconsin incorporated more components in their models to better explain the observed trends, whereas others did not. For example, M. D. Anderson changed the background trend, added a beyond–stage shift survival advantage to screen-detected cases, and improved background survival over time even without adjuvant treatment; Wisconsin included tumors with limited malignant potential; and Rochester altered the background trend and closely modeled observed survival. Model calibration to incidence, survival, and/or mortality changed the focus of the modeling exercise to developing models that could best explain what has occurred in population trends rather than taking available information on screening and adjuvant treatment from existing literature to predict the impact on mortality. These groups took the models a step further than the original base case question in trying to explain what was affecting observed trends. The results from these models depended on the mechanisms used to fit observed data and produced results that were more extreme in estimating the benefit of screening. Erasmus, Georgetown, and Stanford more closely followed the base case inputs and took a more limited approach to fitting observed trends. These three groups generally fell toward the middle of the contour plot and had similar results. Because of the similarity in approach among these three models, it is easier to delve into the impact of different modeling assumptions. 
For example, the higher benefit in treatment for Erasmus could more closely be linked to the cure/no-cure approach rather than to an extended survival approach. Similarly, the lower benefit of screening for Georgetown probably reflects the fact that Georgetown did not model a benefit from screening shifting cases to a smaller tumor within a particular stage. A comparison of the mortality benefits of screening and adjuvant treatment that took into consideration various modeling approaches and assumptions provided many insights that would not have been possible by analyzing individual results. The base case not only provided a broader view of the substantive question of explaining the impact of screening and adjuvant treatment but also gave the researchers a rare opportunity to evaluate their modeling results in a larger context. The common analysis allowed for better understanding of the sometimes subtle impact of assumptions made during the modeling process, resulting in a set of models that are better able to investigate other important questions related to current and future breast cancer trends.

### References

(1) Feuer EJ. Modeling the impact of adjuvant therapy and screening mammography on U.S. breast cancer mortality between 1975 and 2000: introduction to the problem. J Natl Cancer Inst Monogr 2006; 36: 2–6.

(2) Berry DA, Cronin KA, Plevritis SK, Fryback DG, Clarke L, Zelen M, et al., for the Cancer Intervention and Surveillance Modeling Network (CISNET) Collaborators. Effect of screening and adjuvant therapy on mortality from breast cancer. N Engl J Med 2005; 353: 1784–92.

(3) Berry DA, Inoue L, Shen Y, Venier J, Cohen D, Bondy M, et al. Modeling the impact of treatment and screening on U.S. breast cancer mortality: a Bayesian approach. J Natl Cancer Inst Monogr 2006; 36: 30–6.

(4) Fryback DG, Stout NK, Rosenberg MA, Trentham-Dietz A, Kuruchittham V, Remington PL. The Wisconsin breast cancer epidemiology simulation model. J Natl Cancer Inst Monogr 2006; 36: 37–47.

(5) Mandelblatt J, Schechter CB, Lawrence W, Yi B, Cullen J. The SPECTRUM population model of the impact of screening and treatment on U.S. breast cancer trends from 1975 to 2000: principles and practice of the model methods. J Natl Cancer Inst Monogr 2006; 36: 47–55.

(6) Tan SYGL, van Oortmarssen GJ, de Koning HJ, Boer R, Habbema JDF. The MISCAN-Fadia continuous tumor growth model for breast cancer. J Natl Cancer Inst Monogr 2006; 36: 56–65.

(7) Hanin LG, Miller A, Zorin AV, Yakovlev AY. The University of Rochester model of breast cancer detection and survival. J Natl Cancer Inst Monogr 2006; 36: 66–78.

(8) Lee S, Zelen M. A stochastic model for predicting the mortality of breast cancer. J Natl Cancer Inst Monogr 2006; 36: 79–86.

(9) Plevritis SK, Sigal BM, Salzman P, Rosenberg J, Glynn P. A stochastic simulation model of U.S. breast cancer mortality trends from 1975 to 2000. J Natl Cancer Inst Monogr 2006; 36: 86–95.

(10) Clarke LD, Plevritis SK, Boer R, Cronin KA, Feuer EJ. A comparative review of CISNET breast models used to analyze U.S. breast cancer incidence and mortality trends. J Natl Cancer Inst Monogr 2006; 36: 96–105.

(11) Cronin KA, Mariotto AB, Clarke LD, Feuer EJ. Additional common inputs for analyzing impact of adjuvant therapy and mammography on U.S. mortality. J Natl Cancer Inst Monogr 2006; 36: 26–9.

(12) Cronin KA, Yu B, Krapcho M, Miglioretti DL, Fay MP, Izmirlian G, et al. Modeling the dissemination of mammography in the United States. Cancer Causes Control 2005; 16: 701–12.

(13) Mariotto AB, Feuer EJ, Harlan LC, Abrams J. Dissemination of adjuvant multiagent chemotherapy and tamoxifen for breast cancer in the United States using estrogen receptor information: 1975–1999. J Natl Cancer Inst Monogr 2006; 36: 7–15.

(14) Holford TR, Cronin KA, Mariotto AB, Feuer EJ. Changing patterns in breast cancer incidence trends. J Natl Cancer Inst Monogr 2006; 36: 19–25.

(15) Feuer EJ, Etzioni R, Cronin KA, Mariotto A. The use of modeling to understand the impact on U.S. mortality: examples from mammography and PSA screening. Stat Methods Med Res 2004; 13: 421–42.

(16) Wand MP, Jones MC. Kernel smoothing. Boca Raton (FL): Chapman and Hall/CRC Press; 1995.

(17) Ries LA, Eisner MP, Kosary CL, Hankey BF, Miller BA, Clegg L, et al. (eds). SEER Cancer Statistics Review, 1975–2002, National Cancer Institute. Bethesda, MD. Available at: http://seer.cancer.gov/csr/1975_2002/, based on November 2004 SEER data submission, posted to the SEER Web site 2005.

(18) National Cancer Institute. Breast Cancer Surveillance Consortium: Evaluating Screening Performance in Practice. NIH Publication No. 04-5490. Bethesda, MD: National Cancer Institute, National Institutes of Health, U.S. Department of Health and Human Services, April 2004. Available at: http://breastscreening.cancer.gov/espp.pdf.

(19) Shen Y, Zelen M. Screening sensitivity and sojourn time from breast cancer early detection clinical trials: mammograms and physical examinations. J Clin Oncol 2001; 19: 3490–9.

(20) Habbema JDF, Tan SYGL, Cronin KA. Impact of mammography on U.S. breast cancer mortality, 1975–2000: are intermediate outcome measures informative? J Natl Cancer Inst Monogr 2006; 36: 105–11.

(21) Joensuu H, Lehtimaki T, Holli K, Elomaa L, Turpeenniemi-Hujanen T, Kataja V, et al. Risk of distant recurrence of breast cancer detected by mammography screening or other methods. JAMA 2004; 292: 1064–73.

© The Author 2006. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oxfordjournals.org.
JNCI Monographs, Volume 2006 (36) – Oct 1, 2006
10 pages

Publisher
Oxford University Press
ISSN
1052-6773
eISSN
1745-6614
DOI
10.1093/jncimonographs/lgj015
pmid
17032901

### Abstract

The CISNET breast cancer program is a consortium of seven research groups modeling the impact of various cancer interventions on the national trends of breast cancer incidence and mortality. Each of the modeling groups participated in a CISNET breast cancer base case analysis with the objective of assessing the impact of mammography and adjuvant therapy on breast cancer mortality between 1975 and 2000. The comparative modeling approach used to address this question allowed for a unique view into the process of modeling. Results shown here expand on those recently reported in the New England Journal of Medicine (Berry et al., N Engl J Med 2005;353:1784–92) by presenting mortality impact in several different ways to facilitate comparisons between models. Comparisons of each group's results in the context of modeling assumptions made during the process gave insight into how specific model assumptions may have affected the results. The median estimate for the percent decline in breast cancer mortality due to mammography was 15% (range of 8%–23%), and the median estimate for the percent decline in mortality due to adjuvant treatment was 19% (range of 12%–21%). A detailed discussion of the differences in modeling approaches and how those differences may have influenced the mortality results concludes the chapter.

### Introduction

The objective of the base case analysis is to model the influence of mammography screening and adjuvant treatment on the mortality rates, as well as to estimate whether other factors also influence the decline (1). To do this we partitioned the decline into the portion that was due to screening, due to adjuvant therapy, and due to other causes not explained by those two factors. Each of the seven CISNET breast cancer models undertook the same analysis, giving the opportunity to not only observe a range of results but also compare the results in the context of the modeling assumptions used by each group. 
A summary of the mortality results from the base case was recently published (2). This report expands on the results published and delves into the reasons why groups reported various results for the benefits of mammography screening and adjuvant therapy. Details of the models from the seven CISNET groups are described in chapters 6–12 of this monograph (3–9), and a comparison of the models' structure is given in chapter 13 (10). The purpose of this chapter is to compare the primary outcome of the impact of screening and adjuvant treatment on breast cancer mortality as estimated by the seven models and provide insight into how modeling assumptions influence these results. The controlled nature of the base case analysis makes possible the comparative modeling approach that provides a link between the modeling decisions and ultimate outcomes. Models differed in their general approach to the problem (10). All groups incorporated information obtained from outside data sources, such as clinical trials, to develop the model and fit model parameters. Some groups focused strictly on mammography and adjuvant treatment, allowing for a component of the decline to be attributed to other causes not directly examined in this exercise, such as improved surgical techniques or improvements within adjuvant therapies. Other groups adjusted their models, either formally through Bayesian methods or informally by calibrating model parameters, to fit observed population trends. This alternative approach led to a more complete partitioning of the decline in mortality to either screening or adjuvant treatment. The models also differed in structure, for example, in whether and how the natural history of breast cancer was modeled and in how the benefits of early detection and treatment were quantified. Each of the models contained parameters unique to that model, reflecting the underlying structure. Several common, or base case, inputs (11) were used by all the models. 
The purpose of using the same input values for key variables describing population risk, mammography screening, and adjuvant treatment was to eliminate the uncertainty related to these base case inputs as a potential cause for differences among modeling results. Common population variables allow for a more detailed understanding of how the assumed model structure influences the modeling results. This chapter presents the mortality results from the seven models in several different ways. Using a variety of metrics to measure mortality facilitates comparisons and provides insight into how the results are similar and how they differ. We discuss how differences in modeling approaches, inputs, and assumptions may translate into different mortality results, giving valuable insight into the effect of underlying assumptions on the final results.

### Methods

The breast cancer CISNET models used several common, or base case, inputs to model the effect of mammography screening and adjuvant treatment on U.S. mortality trends. Each of these base case inputs is described in detail in other chapters of this monograph (11,13,14). The base case inputs include the dissemination and use of screening mammography in the United States (12), the dissemination and use of adjuvant chemotherapy and tamoxifen by stage of diagnosis (13), and the background trend in incidence rates that would have been observed without the introduction of screening based on an age–period–cohort model (14). The seven CISNET breast cancer models were used to estimate breast cancer age-adjusted mortality rates between 1975 and 2000 for women aged 30–79 years with and without mammography screening and adjuvant therapy. All models assume an increasing background trend in breast cancer risk by birth cohort, leading to an increase in incidence and a subsequent increase in mortality without the interventions of screening and adjuvant treatment over the years modeled. 
The amount of increased incidence in the background trend varied, with five models using the base case input as provided and two models using a smaller increase. The relation between the background trend and the mortality outcome is considered in the discussion. This analysis considers six runs:

Run 1. No screening and no adjuvant therapy [Background Run]
Run 2. Base case screening dissemination but no adjuvant therapy [Screening-Only Run]
Run 3. No screening and no tamoxifen use but base case chemotherapy use [Chemotherapy-Only Run]
Run 4. No screening and no chemotherapy but base case tamoxifen use [Tamoxifen-Only Run]
Run 5. No screening but base case chemotherapy and tamoxifen use [Adjuvant Treatment–Only Run]
Run 6. Base case screening and chemotherapy and tamoxifen use [Screening and Adjuvant Treatment Run]

Model results are shown for various runs, either as generated by the individual models or adjusted to match the 1975 age-adjusted mortality level. This adjustment was made by shifting the entire curve by the difference between the modeled 1975 level and the observed 1975 level. The curves were shifted to match in 1975 in order to focus on the impact of the interventions after 1975 rather than on the challenge of recreating the pre-1975 experience (which leads to the 1975 mortality level).

A variety of measures are used to assess the mortality benefits of screening, chemotherapy, and tamoxifen either individually or in combination. All measures involve comparing Run 1 with Runs 2–6 to estimate benefit. Let M1(year)–M6(year) represent the modeled mortality rates from each of the six runs described for a particular year. The first comparison made was to compute the absolute difference between model runs, obtained by taking the difference in the estimated mortality for each year modeled. The set of equations in [15.1] shows how the absolute benefit is calculated for screening and adjuvant treatment.
\begin{eqnarray*}
\mathrm{Absolute\ benefit,\ screening\ only} &=& \mathrm{M}_{1}(\mathrm{Year}) - \mathrm{M}_{2}(\mathrm{Year})\\
\mathrm{Absolute\ benefit,\ chemotherapy\ only} &=& \mathrm{M}_{1}(\mathrm{Year}) - \mathrm{M}_{3}(\mathrm{Year})\\
\mathrm{Absolute\ benefit,\ tamoxifen\ only} &=& \mathrm{M}_{1}(\mathrm{Year}) - \mathrm{M}_{4}(\mathrm{Year})\\
\mathrm{Absolute\ benefit,\ adjuvant\ treatment\ only} &=& \mathrm{M}_{1}(\mathrm{Year}) - \mathrm{M}_{5}(\mathrm{Year})\\
\mathrm{Absolute\ benefit,\ screening\ and\ adjuvant\ treatment} &=& \mathrm{M}_{1}(\mathrm{Year}) - \mathrm{M}_{6}(\mathrm{Year})
\end{eqnarray*} [15.1]

A second approach to assessing the benefits is to compute the difference between runs with and without screening and/or treatment relative to the estimated mortality rate from the Background Run. The set of equations in [15.2] shows how relative benefits for screening and adjuvant treatment are calculated. This approach produces a percent decline from the background mortality level attributable to screening, treatment, or both.
\begin{eqnarray*}
\mathrm{Relative\ benefit,\ screening\ only} &=& [\mathrm{M}_{1}(\mathrm{Year}) - \mathrm{M}_{2}(\mathrm{Year})]/\mathrm{M}_{1}(\mathrm{Year})\\
\mathrm{Relative\ benefit,\ chemotherapy\ only} &=& [\mathrm{M}_{1}(\mathrm{Year}) - \mathrm{M}_{3}(\mathrm{Year})]/\mathrm{M}_{1}(\mathrm{Year})\\
\mathrm{Relative\ benefit,\ tamoxifen\ only} &=& [\mathrm{M}_{1}(\mathrm{Year}) - \mathrm{M}_{4}(\mathrm{Year})]/\mathrm{M}_{1}(\mathrm{Year})\\
\mathrm{Relative\ benefit,\ adjuvant\ treatment\ only} &=& [\mathrm{M}_{1}(\mathrm{Year}) - \mathrm{M}_{5}(\mathrm{Year})]/\mathrm{M}_{1}(\mathrm{Year})\\
\mathrm{Relative\ benefit,\ screening\ and\ adjuvant\ treatment} &=& [\mathrm{M}_{1}(\mathrm{Year}) - \mathrm{M}_{6}(\mathrm{Year})]/\mathrm{M}_{1}(\mathrm{Year})
\end{eqnarray*} [15.2]

Relative estimates of benefit are less sensitive to misspecification in the models. For example, a relative benefit is not influenced by the background trend, which is an important source of uncertainty in this modeling effort (15). Similarly, the relative estimates of benefit for screening programs are unaffected by the treatment assumptions. See Feuer et al. (15) for a discussion of the benefits and drawbacks of absolute versus relative presentation of the impact of modeled interventions on mortality trends.

To display the combined results, we plot the average of seven probability density functions, one per model, each representing that model's results for the percent change in mortality for 2000 due to screening and due to treatment as compared to the Background Run. Each pair of estimates is viewed as a sample from a hypothetical population of models. 
The seven pairs of data points were represented by a bivariate normal distribution centered on the point estimates for the percent decline in 2000 for screening and adjuvant treatment with a covariance matrix estimated from the set of seven observations. The bivariate normal distributions were defined by making the implicit assumption that the unobserved within-model variability (i.e., parameter uncertainty and random number variability) was equal to the observed between-model variability (i.e., model structural uncertainty). This assumption was justified by the one model where within-model variability was explicitly observed (3). The seven densities are averaged using equal weights to obtain an estimate of the posterior joint distribution for the population of models using the equation shown in [15.3] (16).

$d(x,y) = \frac{1}{7}\sum_{i=1}^{7} d_{i}(x,y)$ [15.3]

Here $x$ is the percent decline due to screening, $y$ is the percent decline due to adjuvant treatment, $d_{i}(x,y)$ is the probability density value at the point $(x,y)$ for model $i$, and $d(x,y)$ is the distribution for the population. The plot of $d(x,y)$ is meant to give a visual impression of the combined result and a measure of the uncertainty around those results.

Modeling results were also used to directly partition the observed trend in breast cancer mortality. Between 1975 and 2000 mortality for women aged between 30 and 79 years decreased 10.3 per 100 000 women (17). Comparing mortality rates in 2000 from various scenarios permitted estimating the effect that screening, adjuvant treatment, and the background trend in incidence had on mortality rates. Whereas screening and adjuvant treatment were moving the mortality rates lower, the estimated background incidence trend was pushing them higher. 
Comparing the mortality rate in the Background Run in 1975 with the rate in 2000 estimates how many additional deaths are attributable to the increased level of risk for breast cancer in the population (i.e., the background trend). The mortality benefit attributable to screening and adjuvant treatment can also be calculated as shown in the set of equations [15.4]. An estimate for the amount of the observed mortality trend that was not accounted for by the background trend in incidence, screening, and adjuvant treatment in a model (i.e., the amount of the mortality reduction attributed to “other causes” not included in the models) can be obtained by starting with the observed change in mortality of 10.3 per 100 000 women and subtracting the effects of the background trend, screening, and treatment.

\begin{eqnarray*}
\mathrm{Background} &=& \mathrm{M}_{1}(2000) - \mathrm{M}_{1}(1975)\\
\mathrm{Screening} &=& \mathrm{M}_{2}(2000) - \mathrm{M}_{1}(2000)\\
\mathrm{Adjuvant\ Treatment} &=& \mathrm{M}_{6}(2000) - \mathrm{M}_{2}(2000)\\
\mathrm{Other\ causes} &=& 10.3 - \mathrm{Background} - \mathrm{Screening} - \mathrm{Adjuvant\ Treatment}
\end{eqnarray*} [15.4]

Different modeling approaches led some groups to adjust their model assumptions and parameters until they could better explain the entire observed trend in mortality (calibrating their model to observed mortality), whereas other models left open the possibility that other factors were influencing mortality that were not captured in their models (10). The implications of calibrating to observed mortality trends are discussed in the conclusions.

In addition to estimating the benefit associated with screening and adjuvant treatment, this modeling design also allows the synergy between the two to be estimated. It was expected that the sum of the impact of Screening Only and Adjuvant Therapy Only would be greater than the impact of both together. 
By comparing the benefit of Screening Only and Adjuvant Treatment Only alone with the benefit of both combined, the synergy between the two can be estimated. Written in terms of the absolute benefits in [15.1], synergy is the combined benefit minus the sum of the two individual benefits:

$\mathrm{Synergy} = [\mathrm{M}_{1}(2000) - \mathrm{M}_{6}(2000)] - \{[\mathrm{M}_{1}(2000) - \mathrm{M}_{2}(2000)] + [\mathrm{M}_{1}(2000) - \mathrm{M}_{5}(2000)]\} = \mathrm{M}_{2}(2000) + \mathrm{M}_{5}(2000) - \mathrm{M}_{6}(2000) - \mathrm{M}_{1}(2000)$ [15.5]

RESULTS

Figure 1, A shows mortality estimates from each of the seven groups for the Background Run, M1(year), and Fig. 1, C shows mortality estimates from the Screening and Adjuvant Treatment Run, M6(year). The mortality results are also shown adjusted to align with the observed 1975 level (Fig. 1, B and D). Fig. 1, A and C illustrate how mortality level and trend differ among the models, whereas Fig. 1, B and D facilitate the comparison of mortality trends. For example, five of the groups (Dana-Farber, Erasmus, Georgetown, Stanford, Wisconsin) produce similar mortality trends in the Background Run, and two groups (M. D. Anderson and Rochester) model a smaller increase in the absence of screening and adjuvant treatment. This effect is due partly to different assumptions related to the background trend, as will be discussed later. Fig. 1, C and D include the observed U.S. mortality trend for comparison with the Screening and Adjuvant Treatment Run.

Fig. 1. Modeled age-adjusted breast cancer mortality rates for women aged 30–79 years from 1975 to 2000 from the seven CISNET models. A) Mortality estimates from the Background Run with no mammography screening or adjuvant treatment. B) Mortality estimates from the Background Run with no mammography screening or adjuvant treatment adjusted to begin at the observed rate in 1975. C) Mortality estimates from the Screening and Adjuvant Therapy Run and the observed U.S. mortality trends. D) Mortality estimates from the Screening and Adjuvant Therapy Run adjusted to begin at the observed rate in 1975 and the observed U.S. mortality trends. Panel C originally appeared in Berry et al. (2), copyright 2006 Massachusetts Medical Society.

Figure 2, A–G shows estimated mortality results for the six model runs, M1(year)–M6(year), for each of the models. The Background Run can be interpreted as an estimate of what mortality would have been had screening and adjuvant therapy never been introduced into the population. Similarly, the Screening-Only Run represents mortality without the introduction of adjuvant therapy and the Adjuvant Treatment–Only Run represents what mortality would have looked like without mammography screening. The Rochester group did not model chemotherapy and tamoxifen separately and reports only an estimate for adjuvant treatment as a whole.

Fig. 2. Modeled age-adjusted breast cancer mortality rates for women aged 30–79 years from 1975 to 2000 for each of the six scenarios considered (Background Run, Screening-Only Run, Chemotherapy-Only Run, Tamoxifen-Only Run, Adjuvant Treatment–Only Run, Screening and Adjuvant Treatment Run) for each of the CISNET models. A) Dana-Farber, B) Erasmus, C) Georgetown, D) M. D. Anderson, E) Stanford, F) Rochester, G) Wisconsin. Panel G originally appeared in Berry et al. (2), copyright 2006 Massachusetts Medical Society.

Figure 3, A–E graphs the estimated absolute mortality benefits for the five scenarios considered compared to the Background Run, calculated as shown in the set of equations [15.1]. The mortality reduction represents the deaths avoided per 100 000 women by the use of mammography and/or adjuvant treatment for 1975–2000.

Fig. 3. Absolute difference in mortality due to screening and adjuvant treatments compared to the Background Run for women aged 30–79 years from 1975 to 2000 for each of the CISNET models. A) Screening Only, B) Chemotherapy Only, C) Tamoxifen Only, D) Adjuvant Treatment Only, E) Screening and Adjuvant Treatment.

Figure 4, A–E gives the estimated percent decline in mortality for the five scenarios considered compared to the Background Run, calculated as shown in the set of equations in [15.2]. The percent decline represents a percent reduction from the mortality level without screening or adjuvant treatment. Table 1 gives similar results for 2000.
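As a minimal sketch of how the measures in [15.1] and [15.2] and the synergy defined in the footnote to Table 1 relate to one another, the Python snippet below computes them for a single year. The mortality rates are hypothetical values chosen for easy arithmetic, not output from any CISNET model, and the function names are illustrative assumptions.

```python
def absolute_benefit(m_background: float, m_run: float) -> float:
    """Deaths avoided per 100 000 women, following [15.1]: M1(year) - Mk(year)."""
    return m_background - m_run


def relative_benefit(m_background: float, m_run: float) -> float:
    """Percent decline from the Background Run, following [15.2] times 100."""
    return 100.0 * (m_background - m_run) / m_background


def synergy(both: float, screening: float, treatment: float) -> float:
    """Combined benefit minus the sum of the two individual benefits."""
    return both - (screening + treatment)


# HYPOTHETICAL mortality rates per 100 000 women for one year (assumed values).
m1 = 60.0  # Run 1: Background Run
m2 = 51.0  # Run 2: Screening-Only Run
m5 = 48.0  # Run 5: Adjuvant Treatment-Only Run
m6 = 42.0  # Run 6: Screening and Adjuvant Treatment Run

pct_screening = relative_benefit(m1, m2)
pct_treatment = relative_benefit(m1, m5)
pct_both = relative_benefit(m1, m6)
pct_synergy = synergy(pct_both, pct_screening, pct_treatment)

print(f"Screening only:          {pct_screening:.1f}%")
print(f"Adjuvant treatment only: {pct_treatment:.1f}%")
print(f"Both:                    {pct_both:.1f}%")
print(f"Synergy:                 {pct_synergy:.1f} percentage points")
```

With these assumed rates the combined decline is smaller than the sum of the two individual declines, so the synergy is negative, which is the direction the chapter describes as expected a priori.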
The estimated total mortality decline for both screening and treatment varied from 24.9% to 38.3%, with estimates of 7.5%–22.7% due to screening and 12.0%–20.9% due to adjuvant treatment. All models estimated a small negative interaction between screening and treatment, as demonstrated in the Synergy column in Table 1. A negative synergy suggests that the benefits of screening are larger without adjuvant treatment than they would be with adjuvant treatment. This is the direction of the synergy that was expected a priori. Taken to the extreme to demonstrate the concept, if treatment were completely curative there would be no additional mortality benefit associated with early detection.

Fig. 4. Percent decline in mortality due to screening and adjuvant treatment compared to the Background Run for women aged 30–79 years from 1975 to 2000 for each of the CISNET models. A) Screening Only, B) Chemotherapy Only, C) Tamoxifen Only, D) Adjuvant Treatment Only, E) Screening and Adjuvant Treatment.

Table 1. Percent decline for the year 2000 compared to the Background Run*

| Model | Tamoxifen | Chemotherapy | Both treatments | Screening | Screening and treatment | Synergy |
|---|---|---|---|---|---|---|
| Dana-Farber | 6.1 | 6.1 | 12.0 | 22.7 | 32.9 | −1.8 |
| Erasmus | 12.0 | 9.6 | 20.9 | 15.3 | 30.9 | −5.3 |
| Georgetown | 7.7 | 7.0 | 14.6 | 12.4 | 24.9 | −2.1 |
| M. D. Anderson | 10.7 | 9.5 | 19.5 | 10.6 | 27.5 | −2.6 |
| Rochester | NA | NA | 19.0 | 7.5 | 25.6 | −0.9 |
| Stanford | 8.9 | 6.9 | 14.9 | 16.9 | 29.9 | −1.9 |
| Wisconsin | 12.5 | 8.9 | 20.8 | 20.3 | 38.3 | −2.8 |

* Synergy is defined as Screening and treatment together – (Screening Alone + Treatment Alone). NA = not available.

Figure 5, A and B originally appeared in Berry et al. (2). Figure 5, A shows a contour plot of the estimated distribution of a larger population of model results from which our seven models represent a sample. Figure 5, B shows the three-dimensional rendering of the contour plot reflecting the joint results. Lines on the contour plot represent equal values of the average of the individual distributions, and the distance between adjacent contour lines represents equal differences in values as calculated using equation [15.3].

Fig. 5. A) Point estimates from the individual models and distribution contours for the combined model results derived by kernel density estimation. B) Three-dimensional rendering of the contour plot. This figure originally appeared in Berry et al. (2), copyright 2006 Massachusetts Medical Society.
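The equal-weight averaging in [15.3] that underlies Fig. 5 can be sketched in Python as follows. The point estimates are the Screening and Both treatments columns of Table 1. Everything else is an assumption made for illustration: the use of NumPy, the sample covariance with denominator n − 1, and the function name `mixture_density`. The chapter does not state how the covariance was estimated, so this sketch is not expected to reproduce the published contours exactly.

```python
import numpy as np

# (x, y) = (% decline due to screening, % decline due to adjuvant treatment)
# for 2000, taken from Table 1 in the order the models are listed there.
points = np.array([
    [22.7, 12.0],  # Dana-Farber
    [15.3, 20.9],  # Erasmus
    [12.4, 14.6],  # Georgetown
    [10.6, 19.5],  # M. D. Anderson
    [7.5, 19.0],   # Rochester
    [16.9, 14.9],  # Stanford
    [20.3, 20.8],  # Wisconsin
])

# ASSUMPTION: one shared covariance matrix, estimated as the sample
# covariance (denominator n - 1) of the seven point estimates.
cov = np.cov(points, rowvar=False)
cov_inv = np.linalg.inv(cov)
norm_const = 1.0 / (2.0 * np.pi * np.sqrt(np.linalg.det(cov)))


def mixture_density(x: float, y: float) -> float:
    """d(x, y) in [15.3]: mean of seven bivariate normal densities,
    each centered on one model's point estimate."""
    diffs = np.array([x, y]) - points                      # shape (7, 2)
    quad = np.einsum("ij,jk,ik->i", diffs, cov_inv, diffs)  # Mahalanobis terms
    return float(np.mean(norm_const * np.exp(-0.5 * quad)))


# Density evaluated at the mean of the seven point estimates.
center = points.mean(axis=0)
print(mixture_density(center[0], center[1]))
```

Evaluating `mixture_density` on a grid of (x, y) values and passing the result to a contour-plotting routine would give a picture in the spirit of Fig. 5, A.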
The final presentation of the mortality results is the partitioning of the observed decline in mortality of 10.3 deaths per 100 000 women into four categories (increase due to background trend, decrease due to screening, decrease due to adjuvant treatment, and other factors) as described in the set of equations [15.4]. Fig. 6, A demonstrates the concept of partitioning the observed difference in age-adjusted mortality into the four categories considered. Fig. 6, B and C show the partitioning for each of the seven groups. Although no group explained the entire decline in mortality with their model, several groups came quite close. Most notably, Wisconsin, Rochester, and M. D. Anderson had a very small portion of the trend that was not explained by screening and adjuvant treatment. These three models incorporated components into their models to better match observed incidence, survival, and mortality data. As is demonstrated in Fig. 1, B, M. D. Anderson and Rochester modeled a smaller increase in mortality due to changes in background risk than the other models since they did not directly use the base case input described in chapter 4 (14). The reduction in mortality due to screening and treatment varied, but on average about half of the total decline was associated with screening and half with treatment.

Fig. 6. Partitioning of the observed difference between the age-adjusted mortality rate in 1975 and the rate in 2000 into the increase in mortality associated with the background trend, the decrease associated with screening, the decrease associated with adjuvant treatment, and the portion of the change that is due to other factors not modeled. A) Schematic demonstrating the concept of partitioning the observed mortality trends. B and C) Partitioning of the mortality trend from each of the CISNET models.

DISCUSSION

The joint analysis of the CISNET consortium provided a unique opportunity to gain insight into the impact of screening and adjuvant treatment on breast cancer mortality as well as into the models themselves. There is uncertainty related to this modeling effort occurring at several levels. The first type of uncertainty is the stochastic variance associated with a single run of a simulation program for an individual model (i.e., the variability associated with a particular choice of random seed in the random number generators). Second, there is uncertainty associated with model inputs. These model parameters can be unique to each model, such as natural history parameters within a particular model, or shared across models, such as use of mammography and adjuvant treatment over the period of interest. The third type of uncertainty is associated with the model structure, for example, whether a continuous tumor growth model or a discrete series of stages was used to describe progression of disease. Careful consideration of how each type of uncertainty influences modeling results is needed to better understand the differences in the estimated mortality reductions. Stochastic variance associated with a single run of a simulation program was generally minimized by running the models for a large population and/or averaging results over several runs. 
The uncertainty related to parameters unique to individual models can be addressed by each model individually either through a formal approach such as M. D. Anderson's Bayesian modeling or through other means such as sensitivity analysis of key model parameters. This type of model uncertainty is not addressed in this chapter but considered in chapters 6–12 of this monograph, which describe individual models in detail (3–9). The uncertainty associated with variables used across all the models was addressed by standardizing several of the model inputs. Common inputs, such as the background risk, dissemination of mammography, and the dissemination of treatment, eliminate uncertainty in those parameters as a cause for different mortality results. These inputs were either used directly as model inputs or as values used to calibrate model parameters. However, not all models used the base case inputs in the analyses presented here. For example, the M. D. Anderson group felt that the inclusion of the uncertainty associated with the background trend was crucial to their modeling effort and incorporated this into their model, thereby reintroducing differences in background trend as a possible explanation for different results. The final source of uncertainty, and the focus of this joint analysis, is the consequence of model structure and modeling assumptions. Usually models such as those described here are developed and presented individually with no measure of the effect of modeling choices made along the way. Looking at seven distinct approaches to the same problem gives a unique opportunity to quantify variation related to the modeling approach. The differences in results presented in this chapter are mainly a reflection of the structures assumed in the models and the approach to fitting model parameters. There is no single reason for the differences in the estimated mortality benefits. 
The differences arise from intricate relationships between many aspects of the individual models. Because of this complexity, we consider how model differences produced different estimates of mortality reduction through a series of relevant questions. Why do the models predict different mortality trends in the absence of screening and adjuvant treatment? For five of the groups, mortality rates from the Background Run were determined by incidence rates produced by the age–period–cohort (APC) model (14) that produced an increase in incidence rates without screening in combination with the respective survival components of the models. As seen in Figure 1, B, M. D. Anderson and Rochester had a noticeably smaller increase in mortality in the background run. Rochester based their background trend on a combination of risk observed in the Canadian National Breast Screening Studies (CNBSS) trial and the results of the APC model. M. D. Anderson did not directly use the background trend in risk but used the APC as a basis of a prior distribution and let their Bayesian modeling approach determine the background trend that would best fit observed mortality trends as a posterior distribution. Both of these approaches led to a smaller background trend than used by the other groups. M. D. Anderson and Rochester also allowed for improvements in breast cancer survival over time in the absence of adjuvant therapy via different mechanisms. The other five groups assumed that in the Background Run survival would remain constant from 1975 to 2000 (after controlling for characteristics at diagnosis such as patient age, stage, and size of tumor). The increased survival trend for M. D. Anderson and Rochester represented other improvements in detection and treatment of breast cancer not specifically captured in the interventions considered (mammography and adjuvant therapy). Because they have background improvements in survival, even with the same incidence of disease, M. D. 
Anderson and Rochester would have a smaller mortality rise in the background run than the other five groups. Including survival improvements over time produced a smaller increase in mortality from the background run and indirectly assigned some of the decline in mortality to factors not directly modeled (10). How do assumptions of the screening benefit affect the mortality results? Generally screening benefit was modeled by a shift to detection of less advanced disease. A case would be placed on a survival curve of a clinically detected case with equivalent disease at diagnosis. Georgetown models an increase in survival only for cases that were diagnosed in an earlier stage through screening. Dana-Farber made survival a function of both stage and screening history. The Georgetown model assumes a stage progression model to approximate tumor growth, and Dana-Farber assigns stage for a screen-detected case from a different stage distribution than a clinically detected case. Stanford modeled a benefit based on stage and size, allowing for a benefit of being detected at an earlier point within the same stage as well as being shifted to an earlier stage. Rochester modeled survival by age, stage, and tumor size. One would expect that modeling only a between stage benefit (i.e., no benefit from detection of a smaller tumor within the same stage) would underestimate the benefit of screening. No benefit for a within stage shift might suggest why the Georgetown estimates of screening benefit are somewhat smaller than others. The overall higher benefit estimated by the Dana-Farber model is closely related to the stage distribution for screen-detected cases estimated using data from the Breast Cancer Surveillance Consortium (BCSC) (18). The screened stage distribution is influenced by the number of examinations and interval between the exams. 
Stage distributions for screen-detected cases with a previous mammography within 1 year and for cases with more than 1 year since a previous mammogram were obtained by the BCSC. Since there were more screening patterns (combinations of 1-, 2-, and 5-year intervals) modeled than considered in developing these stage distributions, stage distributions applied to screening patterns other than annual might have led to a larger mortality reduction than would have been expected. Thus larger stage shifts applied to nonannual screening patterns might explain some of the overall higher benefit associated with screening reported by the Dana-Farber group. M. D. Anderson modeled a combination of stage shift for earlier stage disease and survival benefit for all screen-detected cases. In the background run, M. D. Anderson assumed that the increasing trend in background disease applied only to earlier disease (in situ, Stage I and Stage II). The trend in late disease (Stage III and Stage IV) followed the same trend with or without screening; i.e., trends in late-stage disease are unaffected by screening. Therefore screening cannot shift a case from Stage III and Stage IV to an earlier stage of disease. However late-stage screen-detected cases do receive a survival benefit through a beyond–stage shift screening benefit. This benefit is modeled as a proportional hazards survival benefit that increases survival for all screen-detected cases, including Stage III and Stage IV. When comparing trends in incidence from the screening runs to the Background Run without screening, screening results in a large increase in in situ and Stage I disease and a decrease in Stage II disease. The decrease in Stage II is primarily for node-positive cases, suggesting that there is a stage shift in earlier-stage disease from node-positive to node-negative disease within Stage II and from Stage II to either Stage I or in situ. 
The screening benefit for early-stage disease is a combination of both a possible stage shift and extended survival via the beyond–stage shift benefit. Since the survival benefit for a stage shift is generally greater than a within-stage survival benefit, the assumption that the incidence of late-stage disease is unaffected by screening limits the benefit of screening and may account for the lower benefit of screening estimated by the M. D. Anderson group and possibly explains why they estimate a large beyond–stage shift effect. Erasmus and Wisconsin considered a cure/no-cure approach, modeling whether screening moved the diagnosis time back far enough that a patient would now be cured of the disease. The other groups assumed that early detection and treatment would increase cause-specific survival. Two possibilities exist when extending survival. The first is that survival is improved until a woman dies of other causes; then their mortality benefit was similar to a cure. The second possibility is that extending survival only delays the time of death due to breast cancer and that those cases would not ultimately receive a mortality benefit, although they may affect trends over time by delaying the time of death. Even if one carefully picks a cure fraction to replicate results from a model that extends survival at a particular follow-up time (e.g., 5-year survival), in the long run it would be expected that curing a fraction of the cases would lead to a larger mortality benefit than simply an increase in survival time. Therefore the cure approach used by Wisconsin and Erasmus may lead to a higher mortality benefit than extending survival. How do assumptions of the benefit of adjuvant treatment affect the mortality results? There were three approaches to modeling benefit of treatment. 
One was to increase survival by using a proportional hazards assumption, the second was to adjust the fraction of the population that is cured, and the third was a combination of both changing the portion of the population that was cured and extending survival for the uncured patients. As with the screening benefit, the use of a cure approach may lead to a higher mortality benefit. Wisconsin and Erasmus used a cure approach and predicted the two highest benefits from treatment (shown in Table 1). The use of a cure/no-cure approach for screening and adjuvant treatment may partially explain why Erasmus and Wisconsin fall in the upper right corner of the contour plot shown in Fig. 5, A. When looking at the other five groups, there seems to be a clear negative statistical correlation between the amount of mortality decline explained by screening and the amount explained by adjuvant therapy. This negative correlation was predicted a priori since a larger portion explained by screening would leave less to be explained by adjuvant treatment and vice versa. However, the two groups that modeled cure had generally higher benefits for both screening and adjuvant treatment simultaneously. Rochester approached modeling treatment from a different viewpoint. Instead of specifically modeling the dissemination of adjuvant chemotherapy as described in chapter 5 of this monograph (11), they directly modeled survival from Surveillance, Epidemiology, and End Results (SEER) data, controlling for age, stage, and tumor size by using a survival cure model. The cure model combined the probability of cure and the timing of death in one concept, thus allowing treatment advances to extend survival for some patients while simultaneously changing the probability of being cured of breast cancer. They noticed a change point in the survival trend around the time that adjuvant treatment was being widely disseminated and attributed the increase in the survival trend to those therapies. 
Rochester's approach cannot explicitly distinguish between the specific therapies mentioned and other factors that may have been affecting survival concurrently, such as improvements in surgical methods. Although this approach has the advantage of capturing other improvements in treatment in addition to chemotherapy and tamoxifen, it makes it more difficult to compare with the other model results and may be confounded with screening. If screen-detected cases had a survival benefit above and beyond a clinically detected tumor with equivalent characteristics (e.g., related to either length bias or overdiagnosis), this benefit would be captured as treatment rather than screening. Rochester attributes 72% of the overall mortality benefit modeled (overall benefit defined as the benefit of screening alone plus benefit of treatment alone) to treatment, the highest among the groups. Other groups specifically modeled better prognosis for screen-detected tumors as a screening benefit through mechanisms such as a beyond–stage shift screening benefit or making survival dependent on screening history.

How do assumptions related to modeling the natural history of breast cancer influence the mortality results?

The benefit of early detection to an individual is determined largely by how much earlier in the natural history of disease a case is screen detected. In groups that specifically modeled the natural history of breast cancer, this period is closely related to the rates of tumor growth or dwell time within a stage and the consequent preclinical detectable period. The latent time between screen detection and clinical detection (lead time) is associated with the severity of disease at the time of screen diagnosis. In several cases, natural history models were fitted by using screening trial data; therefore, the particular trial used determined the parameter values. Rochester used data from the CNBSS trial in fitting the natural history model.
The Canadian trials showed little screening benefit, and Rochester's model produced a smaller lead time than other models and a lower benefit for screening. Erasmus fit model parameters by using the Swedish Two County trial, which showed a larger difference between the screened and unscreened groups. Consequently, Erasmus had a larger lead time and a larger benefit of screening. Dana-Farber modeled sojourn time by using data from the CNBSS, Health Insurance Plan of New York trial, Swedish Two County study, Edinburgh, and Malmo trials as estimated by Shen and Zelen (19). A detailed discussion of parameters calibrated and datasets used in the calibration can be found in chapter 13 of this monograph (10).

Four of the seven models (M. D. Anderson, Erasmus, Georgetown, and Wisconsin) included in situ disease. The other groups felt that available information on the natural history of in situ disease was currently inadequate to include it in their models. When in situ disease is included in the model, screening can detect tumors before they progress to malignant disease. Models that do not include in situ disease do not allow for that possibility, and the earliest a tumor can be diagnosed is at a small localized stage. Observed trends of in situ breast cancer clearly show that many cases are being screen detected in this early stage, where cause-specific survival is near 1. Without detection in the in situ stage, models assume that some portion of these tumors would progress to be diagnosed at a later stage. Not modeling this opportunity for early detection would be expected to underestimate the benefit of screening. However, there does not seem to be a pattern between the screening benefits produced by the models and whether or not they modeled in situ disease. Perhaps this lack of association is explained by the very good prognosis also seen in early localized disease.
For example, in one model, screening may move diagnosis to an in situ stage, and in another model, screening may move diagnosis to an early localized stage; there is not much of a mortality difference between the two since both would have very low mortality. Clearly the modeling of in situ disease is important in modeling morbidity and possible overdiagnosis but may have less influence on the outcome of mortality.

How do different assumptions on length and lead time biases affect the mortality results?

Lead-time bias reflects the fact that survival measured from diagnosis is increased by early detection whether or not a patient received a mortality benefit from screening. For example, if a patient's cancer is detected 6 months earlier by screening but she still dies of breast cancer at the same time as she would have without screening, her survival is increased by 6 months even though she has received no mortality benefit from early detection. How different models deal with lead time is discussed in chapter 13 of this monograph (10), in individual model description chapters (3–9), and in the chapter on intermediate outcomes (20). Many of the models guaranteed that patients would not die of their disease during their lead time. Length bias is related to the phenomenon that slower-growing tumors (i.e., tumors with longer sojourn times) are more likely to be detected by screening. Length bias has many implications that affect the modeling of mortality, including the implication that if screening is detecting slower-growing, potentially less aggressive tumors, then tumors detected by screening may also have a better prognosis after detection than similar clinically detected tumors. This implication has been demonstrated in observed data showing that screen-detected cases have a better prognosis than comparable clinically detected cases (21).
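The length-bias mechanism described above can be reproduced with a very small simulation. The sketch below is an illustration only and does not correspond to any of the seven models: it assumes exponentially distributed sojourn times, a fixed screening interval, perfect exam sensitivity, and tumor onset uniformly distributed between exams. The function name and all parameter values are hypothetical.

```python
import random
from statistics import mean


def simulate_length_bias(n_tumors=20000, mean_sojourn=3.0,
                         screen_interval=2.0, seed=0):
    """Compare sojourn times of screen-detected and clinically detected tumors.

    Assumptions (illustrative only): exponential sojourn times, exams at a
    fixed interval with perfect sensitivity, onset uniform between exams.
    """
    rng = random.Random(seed)
    screen_detected, clinically_detected = [], []
    for _ in range(n_tumors):
        # Length of the preclinical detectable period for this tumor (years).
        sojourn = rng.expovariate(1.0 / mean_sojourn)
        # Time from tumor onset until the next scheduled exam (years).
        wait_for_exam = rng.uniform(0.0, screen_interval)
        if wait_for_exam < sojourn:
            # An exam falls inside the preclinical period: screen detected.
            screen_detected.append(sojourn)
        else:
            # The tumor surfaces clinically before the next exam.
            clinically_detected.append(sojourn)
    return {
        "mean_sojourn_screen": mean(screen_detected),
        "mean_sojourn_clinical": mean(clinically_detected),
        "fraction_screen_detected": len(screen_detected) / n_tumors,
    }


if __name__ == "__main__":
    for key, value in simulate_length_bias().items():
        print(f"{key}: {value:.2f}")
```

Even though every tumor is drawn from the same sojourn-time distribution, the screen-detected group ends up with the longer average sojourn time. Whether that also translates into better prognosis depends on whether survival is linked to growth rate, which this sketch does not model.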
When a screening schedule is overlaid on the natural history model in a simulation, one consequence is that screening is more likely to detect slower-growing tumors. As a result, screen-detected tumors will on average have longer sojourn times than clinically detected tumors. However, this result does not guarantee that screen-detected tumors will also have a better prognosis than similar clinically detected tumors unless tumor growth rates are related to survival times. Stanford, Erasmus, Rochester, and Wisconsin linked survival to tumor characteristics, such as size [more details in chapter 13 of this monograph (10)], to allow for a possible benefit from screen detection earlier within a single stage (a within-stage benefit) in addition to the possibility of shifted detection to an earlier stage (between-stage or stage-shift benefit). Thus screen-detected tumors would tend to have a better prognosis than clinically detected tumors within the same stage since they would tend to be smaller at diagnosis. Wisconsin also introduced a further prognostic benefit in screen-detected cases above that of survival conditioned on tumor characteristics. By the inclusion of tumors with latent malignant potential, many of which are detected only by screening, the survival distributions for screen-detected tumors will consist of a larger percentage of cases that are not at risk of dying from their breast cancer. If length bias results in screen-detected tumors having a better prognosis even after controlling for observable tumor characteristics, as is suggested in the literature (21), the Stanford, Erasmus, and Rochester models would not fully capture that difference. The Georgetown model captures no survival benefit for screen-detected cases beyond a stage shift. Models that do not involve simulation on an individual level addressed length bias through different means. M. D.
Anderson directly modeled a beyond–stage shift benefit that gave screen-detected tumors an additional survival benefit over similar clinically detected tumors. The magnitude of the additional benefit assigned by M. D. Anderson was estimated using Bayesian methods (3). Dana-Farber addresses this issue by modeling a stage shift that is a function of screening history (8). One would expect that the modeled mortality benefits of screening would be related to how each of the models dealt with lead time and length bias; however, this relationship was not specifically studied in this analysis, and no relationship between how lead time and length biases were addressed and the estimated mortality reduction is apparent.

How do assumptions of screening dissemination affect the mortality results?

A screening dissemination simulator was provided to all the groups mimicking the U.S. experience [(12); http://cisnet.cancer.gov]. Six of the groups used this dissemination program directly in their model. Dana-Farber used parameters of this simulator instead of the dissemination program. The Dana-Farber screening patterns were composed of mixtures of 1-, 2-, and 5-year intervals between exams. In the original screening dissemination model, an annual screener would return for a subsequent screening exam between 1 and 2 years, a biennial screener between 2 and 5 years, and the irregular screener over a longer range of time with a mean of 5 years. Comparing the Dana-Farber schedules with the screening patterns generated by the simulation models showed that the Dana-Farber schedules were (on average) slightly shortened with regard to the times between exams. The shorter time between examinations is due primarily to the absence of screening intervals longer than 5 years, which do occur in the other models. More frequent screening may have contributed to the higher benefit associated with screening reported by the Dana-Farber group.
How can some groups explain the entire mortality decline, whereas others left a portion unexplained?

Wisconsin, M. D. Anderson, and Rochester focused more on matching observed trends in the Screening and Adjuvant Treatment Run, which they achieved by incorporating more components in their models that they felt best explained what was occurring over the period. Wisconsin included nonprogressing tumors in the natural history and fit their model parameters to better reproduce incidence trends. M. D. Anderson fit their model parameters on the basis of how well they could reproduce mortality trends. Rochester fit SEER incidence and survival, thus indirectly fitting mortality. The other models were more focused on fitting model parameters from outside data sources such as clinical trials and left a portion of the incidence increase and mortality decline unexplained. This unexplained portion of the mortality decline can then be used to generate hypotheses (not yet verified by outside data sources) as to what else was affecting mortality over the period of interest. Hypotheses of what else could be affecting mortality are often similar to the components already added to the models by groups matching observed trends. Whether those factors were included in the model or left as a hypothesis for future research accounts for much of the variation in the modeling results, particularly the portion of the observed decline that was left unexplained in Fig. 6.

Are the mortality results different?

For the reasons listed above and others not covered in the discussion, the modeling results varied. The results were more homogeneous for treatment (range of 14.6%–20.8% for the percent decline in mortality due to treatment) and more varied for screening (range of 7.5%–22.7% for the percent decline in mortality due to screening). There were several reasons why these differences arose. The first reason was related to the model inputs.
Even though the original intention was for all the models to use the same base case input parameters, many groups processed the inputs to better fit their model structure or their modeling philosophy. Therefore, the uncertainty in the common inputs was not completely controlled in this comparison. The second reason related to differences in modeling approach. Rochester, M. D. Anderson, and Wisconsin incorporated more components in their models to better explain the observed trends, whereas others did not. For example, M. D. Anderson changed the background trend, added a beyond–stage shift survival advantage to screen-detected cases, and improved background survival over time even without adjuvant treatment; Wisconsin included tumors with limited malignant potential; and Rochester altered the background trend and closely modeled observed survival. Model calibration to incidence, survival, and/or mortality changed the focus of the modeling exercise to developing models that could best explain what has occurred in population trends rather than taking available information on screening and adjuvant treatment from existing literature to predict the impact on mortality. These groups took the models a step further than the original base case question in trying to explain what was affecting observed trends. The results from these models depended on the mechanisms used to fit observed data and produced results that were more extreme in estimating the benefit of screening. Erasmus, Georgetown, and Stanford more closely followed the base case inputs and took a more limited approach to fitting observed trends. These three groups generally fell toward the middle of the contour plot and had similar results. Because of the similarity in approach among these three models, it is easier to delve into the impact of different modeling assumptions. 
For example, the higher benefit in treatment for Erasmus can be linked more directly to its use of a cure/no-cure approach rather than an extended survival approach. Similarly, the lower benefit of screening for Georgetown probably reflects the fact that Georgetown did not model a benefit from screening shifting cases to a smaller tumor size within a particular stage. A comparison of the mortality benefits of screening and adjuvant treatment that took into consideration various modeling approaches and assumptions provided many insights that would not have been possible by analyzing individual results. The base case not only provided a broader view of the substantive question of explaining the impact of screening and adjuvant treatment but also gave the researchers a rare opportunity to evaluate their modeling results in a larger context. The common analysis allowed for better understanding of the sometimes subtle impact of assumptions made during the modeling process, resulting in a set of models that are better able to investigate other important questions related to current and future breast cancer trends.

References

(1) Feuer EJ. Modeling the impact of adjuvant therapy and screening mammography on U.S. breast cancer mortality between 1975 and 2000: introduction to the problem. J Natl Cancer Inst Monogr 2006;36:2–6.
(2) Berry DA, Cronin KA, Plevritis SK, Fryback DG, Clarke L, Zelen M, et al., for the Cancer Intervention and Surveillance Modeling Network (CISNET) Collaborators. Effect of screening and adjuvant therapy on mortality from breast cancer. N Engl J Med 2005;353:1784–92.
(3) Berry DA, Inoue L, Shen Y, Venier J, Cohen D, Bondy M, et al. Modeling the impact of treatment and screening on U.S. breast cancer mortality: a Bayesian approach. J Natl Cancer Inst Monogr 2006;36:30–6.
(4) Fryback DG, Stout NK, Rosenberg MA, Trentham-Dietz A, Kuruchittham V, Remington PL. The Wisconsin breast cancer epidemiology simulation model. J Natl Cancer Inst Monogr 2006;36:37–47.
(5) Mandelblatt J, Schechter CB, Lawrence W, Yi B, Cullen J. The SPECTRUM population model of the impact of screening and treatment on U.S. breast cancer trends from 1975 to 2000: principles and practice of the model methods. J Natl Cancer Inst Monogr 2006;36:47–55.
(6) Tan SYGL, van Oortmarssen GJ, de Koning HJ, Boer R, Habbema JDF. The MISCAN-Fadia continuous tumor growth model for breast cancer. J Natl Cancer Inst Monogr 2006;36:56–65.
(7) Hanin LG, Miller A, Zorin AV, Yakovlev AY. The University of Rochester model of breast cancer detection and survival. J Natl Cancer Inst Monogr 2006;36:66–78.
(8) Lee S, Zelen M. A stochastic model for predicting the mortality of breast cancer. J Natl Cancer Inst Monogr 2006;36:79–86.
(9) Plevritis SK, Sigal BM, Salzman P, Rosenberg J, Glynn P. A stochastic simulation model of U.S. breast cancer mortality trends from 1975 to 2000. J Natl Cancer Inst Monogr 2006;36:86–95.
(10) Clarke LD, Plevritis SK, Boer R, Cronin KA, Feuer EJ. A comparative review of CISNET breast models used to analyze U.S. breast cancer incidence and mortality trends. J Natl Cancer Inst Monogr 2006;36:96–105.
(11) Cronin KA, Mariotto AB, Clarke LD, Feuer EJ. Additional common inputs for analyzing impact of adjuvant therapy and mammography on U.S. mortality. J Natl Cancer Inst Monogr 2006;36:26–9.
(12) Cronin KA, Yu B, Krapcho M, Miglioretti DL, Fay MP, Izmirlian G, et al. Modeling the dissemination of mammography in the United States. Cancer Causes Control 2005;16:701–12.
(13) Mariotto AB, Feuer EJ, Harlan LC, Abrams J. Dissemination of adjuvant multiagent chemotherapy and tamoxifen for breast cancer in the United States using estrogen receptor information: 1975–1999. J Natl Cancer Inst Monogr 2006;36:7–15.
(14) Holford TR, Cronin KA, Mariotto AB, Feuer EJ. Changing patterns in breast cancer incidence trends. J Natl Cancer Inst Monogr 2006;36:19–25.
(15) Feuer EJ, Etzioni R, Cronin KA, Mariotto A. The use of modeling to understand the impact on U.S. mortality: examples from mammography and PSA screening. Stat Methods Med Res 2004;13:421–42.
(16) Wand MP, Jones MC. Kernel smoothing. Boca Raton (FL): Chapman and Hall/CRC Press; 1995.
(17) Ries LA, Eisner MP, Kosary CL, Hankey BF, Miller BA, Clegg L, et al. (eds). SEER Cancer Statistics Review, 1975–2002, National Cancer Institute. Bethesda, MD. Available at: http://seer.cancer.gov/csr/1975_2002/, based on November 2004 SEER data submission, posted to the SEER Web site 2005.
(18) National Cancer Institute. Breast Cancer Surveillance Consortium: Evaluating Screening Performance in Practice. NIH Publication No. 04-5490. Bethesda, MD: National Cancer Institute, National Institutes of Health, U.S. Department of Health and Human Services, April 2004. Available at: http://breastscreening.cancer.gov/espp.pdf.
(19) Shen Y, Zelen M. Screening sensitivity and sojourn time from breast cancer early detection clinical trials: mammograms and physical examinations. J Clin Oncol 2001;19:3490–9.
(20) Habbema JDF, Tan SYGL, Cronin KA. Impact of mammography on U.S. breast cancer mortality, 1975–2000: are intermediate outcome measures informative? J Natl Cancer Inst Monogr 2006;36:105–11.
(21) Joensuu H, Lehtimaki T, Holli K, Elomaa L, Turpeenniemi-Hujanen T, Kataja V, et al. Risk of distant recurrence of breast cancer detected by mammography screening or other methods. JAMA 2004;292:1064–73.

© The Author 2006. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oxfordjournals.org.

### Journal

JNCI Monographs, Oxford University Press

Published: Oct 1, 2006