Main

The European Randomized Study of Screening for Prostate Cancer (ERSPC) and the Gothenburg trial (part of the ERSPC) have shown that a reduction in prostate cancer mortality can be achieved using prostate-specific antigen (PSA) screening (Schroder et al, 2009; Hugosson et al, 2010; Schroder et al, 2012). It has been shown that after adjusting for quality of life there is still a substantial benefit (Heijnsdijk et al, 2012).

Men with screen-detected prostate cancer can choose between receiving immediate active treatment or entering into an expectant management programme. The potential benefit of actively treating the cancer immediately after diagnosis is an increase in life expectancy. The potential harm is the risk of living for many years with the side effects of treatment, years which might otherwise have been symptom free (Korfage et al, 2005). Alternatively, the potential benefit can be expressed as the reduction of prostate cancer-specific mortality, and the potential harm expressed as the percentage of overdiagnosis, that is, the proportion of men with screen-detected prostate cancer who, in the absence of screening, would die from other causes before the time of clinical diagnosis.

The aim of the present study was to quantify the potential benefits and harms of immediate versus delayed active treatment for local–regional prostate cancer according to the following factors at time of detection: clinical T-stage, Gleason score, and patient age. The results presented in this study are intended to help clinicians and patients decide whether or not treatment is favourable immediately after early detection.

MATERIALS AND METHODS

ERSPC trial: Rotterdam and Gothenburg sections

The ERSPC trial was initiated in the early 1990s in order to evaluate the effect of PSA screening on prostate cancer mortality. In the Rotterdam and Gothenburg sections, 42 376 men aged 55–74 years and 19 946 men aged 50–64 years were randomised, respectively. The time interval between the screening rounds was 4 years in Rotterdam and 2 years in Gothenburg. An overall reduction in prostate cancer mortality of 29% at a median follow-up of 11 years was observed in the ERSPC (Schroder et al, 2012).

MISCAN model

We used the MIcrosimulation SCreening ANalysis (MISCAN) prostate cancer model (Draisma et al, 2003, 2006, 2009; Wever et al, 2010a). MISCAN is a microsimulation programme that simulates the progression and screening of prostate cancer within a population. The model was validated using prostate cancer detection data from the Rotterdam (Draisma et al, 2003, 2006; Wever et al, 2010a) and Gothenburg (Hugosson et al, 2004) sections of the ERSPC, as well as the mortality reduction data from the overall ERSPC trial (Schroder et al, 2012). A summary of the assumptions in the model and the data used for calibration is presented in Table 1 and outlined below. A more detailed description of the model can be found in earlier publications (Draisma et al, 2003, 2006) and in a standardised model profile (Wever et al, 2010b).

Table 1 Modelling assumptions and data used in the present study

MISCAN simulates the progression of prostate cancer in individuals as a sequence of preclinical, clinical, and screen-detected tumour states. First, the age at death from other causes is simulated per individual using Dutch life tables (Statistics Netherlands, 2000–2007). Next, the progression of prostate cancer in the absence of screening is simulated. Prostate cancer may develop from no prostate cancer to a clinically diagnosed cancer through one or more screen-detectable preclinical stages. From each preclinical stage, a tumour may grow to the next clinical T-stage (T1, impalpable; T2, palpable, confined to the prostate; T3+, palpable, with extensions beyond the prostatic capsule); it may dedifferentiate to a higher Gleason score (well differentiated, Gleason score 2–6; moderately differentiated, Gleason score 7; poorly differentiated, Gleason score 8–10); or it may be clinically diagnosed. For these transitions, the time spent in the current stage is generated from a Weibull distribution whose parameters depend on the current stage, and the choice of next stage is determined by transition probabilities. In addition, there is a risk that a tumour in the local–regional stage (M0) will develop into distant disease (M1), which is modelled using a stage- and Gleason score-specific hazard function. Depending on the frequency and sensitivity of the screening test, preclinical cancers may be detected by screening. The PSA test and subsequent biopsy were modelled as a single test, where the sensitivity parameter was assumed to be clinical T-stage-dependent. In the model, sensitivity is defined as the probability that a preclinical tumour is detected by a screening test at the time the test is taken.
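The transition step described above can be illustrated with a minimal Python sketch. It is not the MISCAN implementation: the stage label, Weibull shape and scale, and transition probabilities below are hypothetical placeholders, because the calibrated values are not given in the text.

```python
import random

# Hypothetical parameters for illustration only; not the calibrated MISCAN values.
STAGE_PARAMS = {
    "T1G6": {
        "shape": 1.5,   # Weibull shape (assumed)
        "scale": 6.0,   # Weibull scale in years (assumed)
        "next": {"T2G6": 0.5, "T1G7": 0.2, "clinical": 0.3},  # assumed probabilities
    },
}

def simulate_transition(stage, rng):
    """Draw a dwelling time (years) and the next state for one preclinical stage."""
    params = STAGE_PARAMS[stage]
    # random.weibullvariate(alpha, beta): alpha is the scale, beta is the shape.
    dwell_years = rng.weibullvariate(params["scale"], params["shape"])
    states = list(params["next"])
    weights = [params["next"][s] for s in states]
    next_state = rng.choices(states, weights=weights, k=1)[0]
    return dwell_years, next_state

rng = random.Random(0)  # fixed seed so the sketch is reproducible
dwell, nxt = simulate_transition("T1G6", rng)
```

Repeating this draw stage by stage for each simulated individual would yield one life history in the absence of screening.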

Model parameters, including transition probabilities, mean dwelling times (the time from one preclinical state to another preclinical or clinical state), and stage-specific test sensitivities were estimated by constructing models for the ERSPC-Rotterdam and -Gothenburg, and by calibrating the model to the following data observed at these centres: baseline incidence (national cancer registry data for 1991 (Visser et al, 1994)) and stage distribution in the Netherlands (Rotterdam cancer registry data 1992–1993 (Spapen et al, 2000)); baseline incidence in Sweden (1988–1992 (Parkin et al, 1997)); incidence, Gleason and stage distributions in the control arms of ERSPC-Rotterdam and -Gothenburg; and detection rates, interval cancer rates, Gleason and stage distributions in the screen arms of ERSPC-Rotterdam (Draisma et al, 2003, 2006; Wever et al, 2010a) and Gothenburg (Hugosson et al, 2004). Number of cases diagnosed, and Gleason and stage distributions in the control arms versus those in the screen arms provide insight into disease progression through the various preclinical phases. Parameters were estimated by numerically minimising the deviance between the number of cases observed and the number of cases predicted by the models. Deviances were calculated by assuming Poisson likelihood for incidence data and by assuming multinomial likelihood for stage-distribution data.
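As an illustration of the calibration criterion, the sketch below computes deviances in their standard textbook form for Poisson counts and multinomial distributions. The text does not give the exact expressions used, so these formulas and the function names are assumptions.

```python
import math

def poisson_deviance(observed, expected):
    """Standard Poisson deviance: 2 * sum(o * ln(o / e) - (o - e))."""
    total = 0.0
    for o, e in zip(observed, expected):
        term = e - o
        if o > 0:  # the o * ln(o / e) term is taken as 0 when o == 0
            term += o * math.log(o / e)
        total += 2.0 * term
    return total

def multinomial_deviance(observed_counts, predicted_probs):
    """Standard multinomial deviance: 2 * sum(o * ln(o / (n * p)))."""
    n = sum(observed_counts)
    total = 0.0
    for o, p in zip(observed_counts, predicted_probs):
        if o > 0:
            total += 2.0 * o * math.log(o / (n * p))
    return total

def total_deviance(incidence_obs, incidence_pred, stage_obs, stage_probs):
    """Sum of the incidence (Poisson) and stage-distribution (multinomial) parts."""
    return (poisson_deviance(incidence_obs, incidence_pred)
            + multinomial_deviance(stage_obs, stage_probs))
```

A numerical optimiser would then search for the model parameters that minimise `total_deviance`; both parts are zero when the predictions match the observations exactly.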

Survival after clinical diagnosis of untreated prostate cancer was assumed to follow the Gleason score-specific survival curves of Albertsen et al (2005), but we added clinical T-stage as an explanatory factor. For this, we used the Cox proportional hazard estimates from Aus et al (2005) (relative hazards: T1 1, T2 1.51, T3 2.77), and adjusted the Albertsen model such that the weighted sum of hazards by clinical T-stage adds up to the same overall level by age and Gleason score.
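One way to read this rescaling is sketched below: the T-stage relative hazards are divided by their weighted mean, so that the stage-weighted average multiplier on the baseline hazard equals 1. The relative hazards are those quoted in the text; the stage shares are hypothetical, and the use of stage shares as weights is an assumption.

```python
# Relative hazards by clinical T-stage, as quoted in the text (Aus et al, 2005).
RELATIVE_HAZARDS = {"T1": 1.0, "T2": 1.51, "T3": 2.77}

def stage_multipliers(stage_shares):
    """Scale the relative hazards so their share-weighted mean equals 1.

    stage_shares: fraction of cases in each T-stage within one age and
    Gleason group (hypothetical weights; must sum to 1).
    """
    mean_hazard = sum(stage_shares[s] * RELATIVE_HAZARDS[s] for s in RELATIVE_HAZARDS)
    return {s: RELATIVE_HAZARDS[s] / mean_hazard for s in RELATIVE_HAZARDS}

shares = {"T1": 0.5, "T2": 0.4, "T3": 0.1}  # hypothetical stage distribution
multipliers = stage_multipliers(shares)
weighted_mean = sum(shares[s] * multipliers[s] for s in shares)
```

Multiplying the baseline hazard for an age and Gleason group by `multipliers[stage]` keeps the overall hazard of that group unchanged while preserving the ratios between T-stages.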

In the model, we assumed that all men diagnosed with prostate cancer receive radical prostatectomy. On the basis of published results (Bill-Axelson et al, 2011), we assumed that men receiving radical prostatectomy have a relative risk of 0.62 of dying from prostate cancer compared with men receiving no initial treatment. This analysis did not consider other treatments, such as radiation therapy and active surveillance, because of the limited published results on their effectiveness. If the effectiveness of radiation therapy or active surveillance differs from that of radical prostatectomy, the results for these treatments will differ from those presented. For distant prostate cancer, we assumed that treatment has no effect. Some treatments might slightly increase survival in distant disease; however, in our view this is a minor limitation of the model.

The effect of early detection through screening on survival was included by assuming that a fraction of local–regional tumours detected by screening are cured because the tumours are treated earlier. The Gleason score-dependent cure rates were estimated by calibrating the ERSPC-Rotterdam model to the observed 29% prostate cancer mortality reduction in the overall ERSPC at a median follow-up of 11 years (Schroder et al, 2012).

Analysis

For the present study, we constructed a model in which individuals were initially screened between the ages of 50 and 74 years, and then subsequently every 4 years until the age of 75 years. Two situations were simulated: one in which all men with screen-detected prostate cancer received treatment immediately after diagnosis; and another in which they received delayed treatment, that is, they received treatment at the time they would have if clinically diagnosed in the absence of screening. Comparing these two simulated populations, life-years gained, reduced probability of death from prostate cancer, lead time and probability of overdiagnosis were calculated and stratified according to the prognostic factors clinical T-stage, Gleason score, and patient age. Note that lead time is the period by which diagnosis is advanced due to screening; therefore, mean lead time shows the average potential life-years with no side effects of treatment in case treatment was delayed to the time of clinical diagnosis.
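The definitions of overdiagnosis and lead time can be made concrete for a single simulated man. The sketch below is an illustration with hypothetical ages. It also assumes that, for an overdiagnosed man, lead time is counted from screen detection until death from other causes.

```python
def screen_detected_outcome(age_screen_dx, age_clinical_dx, age_other_death):
    """Return (overdiagnosed, lead_time_years) for one screen-detected case.

    age_clinical_dx is the age at which the cancer would have been clinically
    diagnosed in the absence of screening.
    """
    # Overdiagnosed: death from other causes precedes clinical diagnosis.
    overdiagnosed = age_other_death < age_clinical_dx
    # Assumption: lead time for overdiagnosed cases ends at other-cause death.
    end_age = age_other_death if overdiagnosed else age_clinical_dx
    return overdiagnosed, end_age - age_screen_dx

# Hypothetical individuals: (screen detection, clinical diagnosis, other-cause death)
cases = [(62.0, 70.0, 80.0), (62.0, 85.0, 75.0)]
results = [screen_detected_outcome(*c) for c in cases]
```

Averaging the second element over a stratum would give the mean lead time, and averaging the first would give the probability of overdiagnosis.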

Two alternative harm–benefit ratios were subsequently calculated from the results: the ratio between the mean lead time and mean life-years gained (M, which represents the average loss of life-years free from the potential side effects of curative treatment per life-year gained), and the ratio between the percentage of overdiagnosis and the percentage of prostate cancer deaths prevented by early treatment (NNT, which represents the additional number of patients who must be treated in order to avoid one prostate cancer death). Considering that the decision to treat immediately or not depends on the variable M, we determined for which combinations of prognostic factors it would be less favourable (M>9), more favourable (M between 3 and 9), and most favourable (M below 3) to treat immediately. The ranges for these three groups were chosen arbitrarily.
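The two ratios and the three-way grouping can be sketched as follows. The thresholds 3 and 9 come from the text, but how a value exactly equal to a threshold is classified is an assumption here, as are the function names and the input values.

```python
def harm_benefit_m(mean_lead_time, mean_life_years_gained):
    """M: mean lead time per life-year gained."""
    return mean_lead_time / mean_life_years_gained

def nnt(pct_overdiagnosis, pct_deaths_prevented):
    """NNT as defined in the text: % overdiagnosis / % prostate cancer deaths prevented."""
    return pct_overdiagnosis / pct_deaths_prevented

def favourability(m):
    """Group M using the thresholds 3 and 9 (boundary handling is assumed)."""
    if m > 9:
        return "less favourable"
    if m > 3:
        return "more favourable"
    return "most favourable"

# Hypothetical inputs, not values taken from Table 2.
m_value = harm_benefit_m(10.0, 2.0)
group = favourability(m_value)
nnt_value = nnt(30.0, 10.0)
```

Lower values of either ratio indicate fewer expected harms per unit of expected benefit.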

Sensitivity analyses

To evaluate the statistical variation and uncertainty of the observed data, we conducted a number of sensitivity analyses. The sensitivity analyses compared models with various lead times, survival curves, and cure rates. Penalised optimisation was used to obtain a range of models with various lead times. Parameters for these models were estimated by minimising the sum of total deviance and lead time penalty (mean lead time × penalty). Different survivals were considered by assuming a relative risk of 0.8, 1, or 1.2 on the hazard of prostate cancer death. We also varied assumed mortality reductions due to screening: 21% (observed for the screen group in ERSPC), 29% (observed for attendees in ERSPC), or 44% (observed for the screen group in ERSPC-Gothenburg).
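The penalised objective can be illustrated with a small sketch. The candidate models and their deviances and lead times are hypothetical, and choosing among a fixed list stands in for the numerical optimisation over model parameters.

```python
def penalised_objective(total_deviance, mean_lead_time, penalty):
    """Objective from the text: total deviance + mean lead time x penalty."""
    return total_deviance + mean_lead_time * penalty

def best_model(candidates, penalty):
    """Pick the candidate with the lowest penalised objective."""
    return min(
        candidates,
        key=lambda c: penalised_objective(c["deviance"], c["mean_lead_time"], penalty),
    )

# Hypothetical candidate fits: better fit (lower deviance) comes with longer lead time.
candidates = [
    {"name": "A", "deviance": 100.0, "mean_lead_time": 12.0},
    {"name": "B", "deviance": 104.0, "mean_lead_time": 10.0},
    {"name": "C", "deviance": 112.0, "mean_lead_time": 8.0},
]
chosen = {p: best_model(candidates, p)["name"] for p in (0.0, 3.0, 5.0)}
```

Increasing the penalty shifts the selection towards models with shorter mean lead times at the cost of a somewhat worse fit, which is how a range of models with various lead times can be obtained.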

RESULTS

Values describing the potential benefits and harms of active treatment varied considerably between the different prognostic groups: mean lead time ranged from 2.9 to 12.2 years; mean life-years gained ranged from 0.1 to 4.0 years; the percentage of screen-detected cases that were overdiagnosed ranged from 2.7 to 60.1%; and the percentage of men who avoided death from prostate cancer ranged from 1.5 to 32.1% (Table 2).

Table 2 Predicted percentage of overdiagnosis, mean lead time, life expectancy, and percentage of prostate cancer (PC) death according to prognostic factors of clinical T-stage, Gleason score, and patient age

To illustrate Table 2, consider the example of men with the following prognostic factors: diagnosed with T1G7 prostate cancer at 62 years of age. These men each have a 28.4% risk of being an overdiagnosed case. The mean lead time for such overdiagnosed men is 11.3 years, that is, if these men decide to be treated at the time of screen detection they will live an average of 11.3 years with the potential side effects of curative treatment. The non-overdiagnosed men have to live an average of 8.5 years with the potential side effects if they decide to be treated immediately. The life expectancy is 16.5 years and there is an 18.7% risk of dying from prostate cancer if the cancer is treated immediately. If the cancer is treated at the expected time of clinical diagnosis then their life expectancy is 15.3 years (1.2 years less) and the risk of dying from prostate cancer is 32.3% (an increase of 13.6% in absolute terms). The negative effect of treating cancer at screen detection instead of time of clinical diagnosis is that the patient has to live on average 9.3 years (((11.3 × 28.4)+(8.5 × 71.6))/100) longer with the potential side effects of treatment. The positive effect is that the patient’s life expectancy is 1.2 years longer. The ratio between the negative effect and the positive effect is 8.0 (the ratio is not exactly 9.3/1.2 because of rounding of decimals). The negative effect can also be expressed as the percentage of overdiagnosed cases (28.4%) and the positive effect as the percentage of prostate cancer death avoided by treating early (13.6%). The ratio between these is the NNT (2.1 in this example). Lower values for these ratios imply lower expected negative effects in relation to the expected positive effects, and therefore immediate treatment is more favoured.
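The arithmetic of this example can be written out explicitly. The symbols below are introduced here for illustration only (mean lead time over all cases, difference in life expectancy, and absolute difference in the risk of prostate cancer death); all numbers are those quoted in the example.

```latex
\begin{align*}
\bar{L} &= \frac{(11.3 \times 28.4) + (8.5 \times 71.6)}{100}
         = \frac{320.92 + 608.60}{100} \approx 9.3 \text{ years} \\
\Delta LE &= 16.5 - 15.3 = 1.2 \text{ years} \\
\Delta p &= 32.3\% - 18.7\% = 13.6 \text{ percentage points} \\
\mathrm{NNT} &= \frac{28.4}{13.6} \approx 2.1
\end{align*}
```

The ratio M is reported as 8.0 because it was computed from unrounded values. The rounded figures above give approximately 7.8.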

The two harm–benefit ratios (M and NNT) also showed considerable variation between the different prognostic groups: M ranged from 1.8 to 31.2 and NNT ranged from 0.3 to 11.6 (Table 2). Within the same age group, immediate treatment was increasingly favourable with increasing Gleason score for patients at clinical stage T1 or T2. In contrast, favourability decreased with increasing Gleason score for patients in the same age group at clinical stage T3. Favourability of immediate treatment increased with increasing clinical T-stage in patients of the same age with a Gleason score <7. In patients of the same age with a Gleason score of 7, immediate treatment was most favourable in those with clinical stage T2, followed by stage T3, and then stage T1. In patients of the same age with a Gleason score >7, immediate treatment was also most favourable in patients with clinical stage T2 (although ratios were only slightly lower compared with T1) and least favourable in T3.

Considering that the decision to treat immediately depends on the lead time divided by the life-years gained (M), the relationship between these two variables was plotted in order to better illustrate how the various combinations of prognostic factors influence the favourability of immediate treatment after screen detection (Figure 1). For all age groups, the same pattern can be observed, where immediate treatment was most favourable in men diagnosed with T3G6, and least favourable in men diagnosed with T1G6 or T3G8. The figure illustrates clearly that both the benefits and harms decrease with increasing age, and that the benefits decrease relatively more than the harms. Therefore, immediate active treatment after screen detection was less favourable with increasing age. It also shows that with increasing age there is less variation in the harms and benefits between the different prognostic groups.

Figure 1

Comparison of mean life-years gained and mean lead time for different combinations of prognostic factors. The dark grey zone represents the area where treatment would be considered less favourable (M>9), the light grey zone represents the area where treatment would be considered more favourable (M between 3 and 9), and the white zone represents the area where treatment would be considered most favourable (M below 3). The variable M is the ratio of the mean lead time to the mean life-years gained, which represents the average loss of life-years free from the potential side effects of curative treatment per life-year gained.

Varying lead time, survival, and mortality reduction caused the minimum of the harm–benefit ratio M to vary from 1.2 to 2.4 and the maximum to vary from 20.3 to 44.0 (Table 3). Considering the different assumptions in the sensitivity analysis, the same pattern was preserved where immediate treatment was most favourable in men diagnosed at age 55–59 with T3G6, and least favourable in men diagnosed at age 70–74 with T1G6 or T3G8.

Table 3 Sensitivity analysis for uncertainty in the model and the data

DISCUSSION

The results from our model demonstrate that, in addition to potential life-years gained and probability of avoiding death from prostate cancer, the decision of whether a patient should receive immediate treatment or not should also depend on the potential lead time and the probability of overdiagnosis. Our model predicts that these factors vary considerably depending on the prognostic factors of clinical T-stage, Gleason score, and patient age.

The two harm–benefit ratios were lowest for men aged 55–59 years, and so immediate treatment would seem most favourable for this group. Both ratios were lowest for men with moderate-risk cancer, specifically those diagnosed with T3G6 (representing 3% of screen-detected cases at 55–59 years of age), T2G7 (7% of screen-detected cases at 55–59 years of age), T1G8 (1% of screen-detected cases at 55–59 years of age), and T2G8 (2% of screen-detected cases at 55–59 years of age). Our model suggests that patients with these prognostic factors would be likely to derive greatest benefit from immediate treatment. The majority of men aged 60–69 years at the time of detection belong to a group for which the harm–benefit ratios could be considered ‘intermediate’. Men aged 70–74 years with low-risk cancer (T1G6) or high-risk cancer (T3G8) had the highest ratios and, therefore, immediate treatment is least favourable for these patients. Indeed, the harm–benefit ratios are relatively high for all combinations of clinical T-stage and Gleason score in patients aged 70–74 years, which is due to the low probability of these patients living long enough to experience the benefit of curative treatment. The high harm–benefit ratios for immediate treatment of men aged 70–74 years also imply that the harm–benefit ratio for the screening of these men will also be high relative to younger men. Therefore, the negative impact of the screening process may be reduced by only screening younger men. Our results suggest that it is crucial to precisely determine which age groups would derive greatest benefit from screening and at what age it would be most favourable to cease.

In our models, the harm–benefit ratio is unfavourable in patients with either favourable or unfavourable prognostic factors. In the first group, this is due to the extended lead times; in the second group, it is due to the very modest benefits of immediate treatment. For the first group, patients with T1G6, T2G6, or T1G7 tumours, active surveillance might allow treatment to be delayed without impairing the chances of cure too much. For the second group, patients with T2G8, T3G7, or T3G8 tumours, watchful waiting, that is, waiting for symptoms to appear and starting palliative treatment when necessary, might be the most appropriate treatment.

In our analysis, we assumed that the favourability of immediate treatment for each combination of prognostic factors will depend on the ratio between mean lead time and mean life-years gained (M). It is important to note that the three ranges describing the favourability of the ratio M were chosen arbitrarily and do not imply that all patients in the least favourable group should not receive immediate treatment, and that all patients in the most favourable group should receive immediate treatment. The division between the groups only illustrates for which combination of prognostic factors it is relatively less favourable to treat immediately and for which prognostic factors it is relatively more favourable to treat immediately.

Our modelling study has some limitations. The predicted values are based on data from a specific population and on the assumptions underlying the model. However, using the best available data and most reasonable assumptions, we obtained the best possible estimates for the measures of interest. The results were calculated using data from the ERSPC Rotterdam and Gothenburg. Results may be different for other populations, as different incidences of and mortalities for prostate cancer have been observed in different countries (Ferlay et al, 2010). However, in the appendix of Heijnsdijk et al (2012), it is shown that the incidence of prostate cancer in the screen arm of all centres of the ERSPC is predicted well by the model. Also, in the United States, the estimated lifetime risk of prostate cancer diagnosis is 16.22% and of prostate cancer death is 2.79% (SEER, 2010), which are close to lifetime risks of this model (Wever et al, 2012). Therefore, the results can also be applied in the US population. Another limitation is that it was not possible to consider all available prognostic and predictive factors that would be available in clinical practice. For example, PSA measurements, MRI results, or the number of positive cores were not considered in our analysis. We also made no distinction between Gleason scores (3+4) and (4+3). These could be important prognostic factors. However, this is the first study to quantify the benefits and harms of immediate treatment vs delayed treatment according to the prognostic factors of clinical T-stage, Gleason score, and patient age. Judging from our results, it would seem crucial that we understand the impact of at least these three factors. Another important prognostic factor that was not included in our model is the presence of co-morbidity. We suggest that the results obtained for older men in our study could be considered likely to be similar to individuals with high levels of co-morbidity. 
A final limitation is that we used survival curves based on non-contemporary data observed in the United States for the ERSPC population that was modelled (Albertsen et al, 2005). These survival data were used because they are one of the few data sets presenting survival of untreated prostate cancer as a function of age at diagnosis and Gleason score progression. The percentage of prostate cancer death that we report may seem higher than previous reports (Albertsen et al, 2005; Aus et al, 2005; Albertsen et al, 2011). However, the percentages of prostate cancer deaths are generally presented after a follow-up time of 10 or 20 years, whereas our results represent lifetime data. For example, Albertsen et al (2011) reported that, after 10 years of follow-up, the percentage of prostate cancer death in men aged 66–75 years and with T1cG8 is 13.7–25.7%, depending on co-morbidity. In the present study, the predicted lifetime percentage of prostate cancer death for men with T1G8 and aged 65–74 is 27.6%; however, the value is 14.0% after 10-year follow-up, which is on the low side compared with the numbers reported by Albertsen et al (2011).

In conclusion, the potential benefits and harms of immediate treatment of local–regional prostate cancer detected by PSA screening depend on the prognostic factors clinical T-stage, Gleason score, and patient age. The range of these values by the different prognostic groups is wide. Therefore, it would seem important to understand and consider these factors when making decisions regarding whether or not to treat a patient. Men aged 55–59 years and with moderate-risk cancer (T3G6, T2G7, T1G8, or T2G8) display the most favourable harm–benefit ratios, and would therefore seem most likely to derive greatest benefit from immediate curative treatment. Immediate curative treatment is least favourable for men aged 70–74 years with low-risk (T1G6) or high-risk cancer (T3G8).