Main

Targeted treatment with antiangiogenic therapy has become the standard of care for most patients with metastatic clear cell renal cell carcinoma (mRCC), based upon improvements in progression-free survival (PFS) in several randomised phase III clinical trials (Escudier et al, 2007a, 2007b; Motzer et al, 2007; Sternberg et al, 2010). With more than one choice of therapy available for patients with mRCC, clinical decision making is more complex, and data on outcomes other than efficacy, such as quality of life and safety, have become increasingly important.

Most often, treatments are compared by a head-to-head analysis of each trial endpoint separately. For purposes of a primary analysis, this approach is reasonable. However, this approach may not be ideal if there are important tradeoffs between endpoints such as an increased time with side effects of treatment but longer time to progression in one arm compared with the other.

One method that allows integration of both the quality and quantity of survival time is the Time Without Symptoms of disease progression or Toxicity of treatment (TWiST) analysis, and its extension, the quality-adjusted TWiST (Q-TWiST) (Gelber and Goldhirsch, 1986; Goldhirsch et al, 1989). The primary hypothesis of these methods is that patients with no disease symptoms or treatment toxicity have better health-related quality of life than those who have disease symptoms or toxicity. Q-TWiST was first used to evaluate adjuvant therapy for breast cancer (Gelber et al, 1991), and has since been widely applied to other settings and cancers (Gelber et al, 1996; Rosendahl et al, 1999; Sherrill et al, 2008; Marcus et al, 2010; Zbrozek et al, 2010).

In this paper, we report the results of a Q-TWiST analysis from a phase III randomised clinical trial comparing the oral antiangiogenic compound sunitinib (SUTENT; Pfizer, Inc; New York, NY, USA), with interferon-α (IFN-α) as first-line treatment for patients with mRCC (Motzer et al, 2007; Motzer et al, 2009). In this trial, sunitinib showed superior PFS compared with IFN-α (median PFS 11 vs 5 months, P<0.001); in addition, median overall survival with sunitinib was more than 2 years (26.4 months). In general, more adverse events of all grades were reported in the sunitinib arm than in the IFN-α arm, although the proportion of patients experiencing grade 3 or 4 toxicities was relatively low for both treatment groups. The Q-TWiST analysis was used to simultaneously compare the two treatments in terms of PFS, overall survival, and grade 3 or 4 toxicities.

Materials and methods

Patients and study design

The design and main results of the randomised phase III clinical trial have been reported previously (Motzer et al, 2007; Motzer et al, 2009). In this trial, 750 patients with mRCC were randomised in a 1 : 1 ratio to receive either sunitinib or IFN-α. Key patient eligibility criteria included no previous systemic therapy for RCC, measurable disease, an Eastern Cooperative Oncology Group performance status of 0 or 1, as well as adequate hepatic, renal, and cardiac function. All patients provided signed informed consent. The primary endpoint was PFS. Sunitinib was administered orally at an initial dose of 50 mg per day for 4 weeks, followed by 2 weeks off treatment (Schedule 4/2). IFN-α was administered as a subcutaneous injection on three nonconsecutive days per week, starting at 3 million units (MU) for the first week, 6 MU for the second week, and 9 MU thereafter.

Statistical methods and analysis

The Q-TWiST analysis considered three health states, TOX, TWiST, and REL, and the duration of each state was calculated for every patient. The TOX state comprised the total number of days after randomisation and before progression spent with toxicity, regardless of when the toxicity started or whether there were gaps between toxicities. All grade 3 or 4 toxicities attributable to the study drugs were included in the analysis, apart from those starting after progression. The model included only the more severe toxicities because they were the events considered most likely to affect a patient’s quality of life. The type, date of onset, and date of resolution of each toxicity were recorded prospectively as a part of the standard procedure in conducting a randomised phase III trial. Time spent with toxicities unresolved by progression was capped at the date of progression. There were 44 sunitinib and 34 IFN-α patients with unresolved toxicities at the time of progression. A patient could have more than one type of toxicity during the same period, and care was taken so that these overlapping toxicity intervals were not double counted. That is, if a patient had a toxicity that lasted from day 1 to day 5 and another toxicity that lasted from day 1 to day 10, the number of days spent with toxicity was 10 days.

The TWiST state was defined as PFS time minus time with toxicities. Progression-free survival time was defined as the time from randomisation to progression or to the last date of follow-up. Patients without progression were censored at their last date of follow-up (median follow-up was similar in each treatment arm, being 853 days for sunitinib and 863 days for IFN-α). The duration of the relapse or REL state was defined as overall survival time minus PFS time, that is, the period of time from progression to death. Patients alive at the end of the study were censored for the overall survival endpoint.
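The per-patient partition described above can be illustrated with a minimal Python sketch. It is not the trial's analysis code: the function names, the representation of toxicities as inclusive (first day, last day) pairs counted from randomisation, and the choice to count the day of progression itself are all assumptions made for illustration.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (first_day, last_day) in study days, both inclusive


def toxicity_days(intervals: List[Interval], pfs_day: int) -> int:
    """Count distinct days with toxicity up to and including pfs_day (assumed convention)."""
    clipped = []
    for start, end in intervals:
        if start > pfs_day:
            continue  # toxicities starting after progression are excluded
        clipped.append((start, min(end, pfs_day)))  # cap unresolved toxicity at progression

    total = 0
    run_start = run_end = None
    for start, end in sorted(clipped):
        if run_end is None or start > run_end + 1:
            if run_end is not None:
                total += run_end - run_start + 1  # close the previous run of days
            run_start, run_end = start, end
        else:
            run_end = max(run_end, end)  # overlapping or adjacent: extend, never double count
    if run_end is not None:
        total += run_end - run_start + 1
    return total


def health_states(intervals: List[Interval], pfs_day: int, os_day: int) -> Tuple[int, int, int]:
    """Return (TOX, TWiST, REL) in days for one patient."""
    tox = toxicity_days(intervals, pfs_day)
    twist = pfs_day - tox      # TWiST = PFS time minus time with toxicity
    rel = os_day - pfs_day     # REL = overall survival time minus PFS time
    return tox, twist, rel
```

With the example from the text, `toxicity_days([(1, 5), (1, 10)], pfs_day=100)` gives 10 days, and a hypothetical patient with progression on day 100 and death on day 160 would have `health_states([(1, 5), (1, 10)], 100, 160)` equal to `(10, 90, 60)`. Censoring is not handled here; it enters at the group level through the Kaplan–Meier estimates.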

The mean time spent in each of the three health states was calculated for each treatment arm separately, and a 95% confidence interval for the difference by treatment was calculated using the nonparametric bootstrap method. Progression-free and overall survival curves were generated using Kaplan–Meier methods, which account for differential follow-up. These curves, along with a curve for time on toxicity, were overlaid on a single graph generated separately for each treatment.
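As an illustration of the nonparametric bootstrap for a between-arm difference, the following sketch resamples patients with replacement within each arm and takes percentile limits of the resampled differences. The percentile variant, the number of resamples, the seed, and the function name are assumptions; the text does not specify these details. The default statistic is a simple mean, which ignores censoring; in a censored setting one would pass a statistic based on the Kaplan–Meier estimate instead.

```python
import random
from statistics import mean
from typing import Callable, Sequence, Tuple


def bootstrap_diff_ci(
    arm_a: Sequence[float],
    arm_b: Sequence[float],
    statistic: Callable[[Sequence[float]], float] = mean,
    n_resamples: int = 2000,   # assumed; not stated in the text
    alpha: float = 0.05,       # 95% confidence interval
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap CI for statistic(arm_a) - statistic(arm_b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample patients with replacement, separately within each arm
        sample_a = rng.choices(arm_a, k=len(arm_a))
        sample_b = rng.choices(arm_b, k=len(arm_b))
        diffs.append(statistic(sample_a) - statistic(sample_b))
    diffs.sort()
    lower_idx = int(alpha / 2 * n_resamples)   # e.g. the 2.5th percentile
    upper_idx = n_resamples - 1 - lower_idx    # symmetric upper percentile
    return diffs[lower_idx], diffs[upper_idx]
```

Here `arm_a` and `arm_b` would hold one value per patient (for example, days in a health state), so that each resample preserves the patient as the unit of analysis.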

We used a threshold utility analysis to assess quality-adjusted or Q-TWiST outcomes, in which the TOX and REL health states are each weighted by utility scores or weights. These weights are represented by μTOX and μREL in the equation below:

Q-TWiST = (μTOX × TOX) + TWiST + (μREL × REL)

The utility weights reflect the relative value of time spent in the TOX and REL states and range from zero to one, with values closer to one discounting fewer days than those closer to zero for each state. A value of one represents time that patients value as equivalent to perfect health. Conversely, a value of zero represents time that is valued as akin to death. Q-TWiST scores were calculated for each combination of utility weights, with each weight increasing from zero to one in increments of 0.25.
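The weighting scheme can be sketched in a few lines of Python. The sketch assumes the standard Q-TWiST form, Q-TWiST = μTOX × TOX + TWiST + μREL × REL; the function names and the state durations in the usage note are hypothetical and are not trial data.

```python
from itertools import product
from typing import List, Tuple

WEIGHTS = [0.0, 0.25, 0.5, 0.75, 1.0]  # zero to one in increments of 0.25


def q_twist(tox: float, twist: float, rel: float, u_tox: float, u_rel: float) -> float:
    """Quality-adjusted time: TWiST counts in full, TOX and REL are discounted."""
    return u_tox * tox + twist + u_rel * rel


def utility_grid() -> List[Tuple[float, float]]:
    """All (u_tox, u_rel) combinations used in a threshold utility analysis."""
    return list(product(WEIGHTS, WEIGHTS))
```

For hypothetical mean durations of 10 days in TOX, 100 days in TWiST, and 50 days in REL, `q_twist(10, 100, 50, 1.0, 0.0)` is 110 days and `q_twist(10, 100, 50, 0.0, 1.0)` is 150 days; `utility_grid()` returns the 25 weight combinations.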

Results

There were more reported occurrences of most general adverse events of all grades in the sunitinib arm than in the IFN-α arm (Motzer et al, 2007). Figures 1A and B show survival times partitioned into the three health states over the follow-up period separately for sunitinib and IFN-α. In each graph, the overall survival curve (blue) is partitioned by the Kaplan–Meier curves for PFS (red) and time with treatment toxicity (green). The area between the Kaplan–Meier curves gives the average time spent in each of the three health states.
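The statement that areas between Kaplan–Meier curves give mean times in each state can be made concrete with a short sketch. It computes a Kaplan–Meier step function and the area beneath it up to a fixed horizon (a restricted mean). The function names and the use of a common horizon are assumptions for illustration; this is not the software used in the trial analysis.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival) steps; events[i] is 1 for an event, 0 if censored."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        while i < len(data) and data[i][0] == t:  # group tied times
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            steps.append((t, survival))
        at_risk -= n_leaving  # events and censored patients both leave the risk set
    return steps


def restricted_mean(steps: List[Tuple[float, float]], horizon: float) -> float:
    """Area under the Kaplan-Meier step function from 0 to horizon."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in steps:
        if t >= horizon:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (horizon - prev_t)
```

Under these assumptions, mean REL is the restricted mean of the overall survival curve minus that of the PFS curve, and mean TWiST is the restricted mean of the PFS curve minus that of the time-with-toxicity curve, all taken to the same horizon.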

Figure 1 (A) Kaplan–Meier curves for overall survival (blue) and PFS (red) for the sunitinib arm, with toxicity (green) for patients who experienced any treatment-related grade 3 or 4 toxicity. (B) Kaplan–Meier curves for overall survival (blue) and PFS (red) for the IFN-α arm, with toxicity (green) for patients who experienced any treatment-related grade 3 or 4 toxicity.

The mean number of days spent with grade 3 or 4 toxicity (i.e., TOX) was 27 days higher among patients in the sunitinib arm than in patients from the IFN-α arm (95% CI: 18, 37; Table 1). However, the mean time spent without symptoms of disease progression or toxicity of treatment (i.e., TWiST) was 151 days higher in the sunitinib than in the IFN-α arm (95% CI: 118, 180; Table 1), and the mean time spent in relapse (i.e., REL) was 96 days lower among patients randomised to sunitinib (95% CI: −126, −56; Table 1).

Table 1 Mean duration of each health state for patients who experienced grade 3 or 4 treatment-related toxicity

Results from a threshold utility analysis where the TOX and REL health states were weighted from 0 to 1 are provided in Table 2. The difference in Q-TWiST ranged from a maximum of 177 days (TOX weight=1, REL weight=0; 95% CI: 146, 212; Table 2) to a minimum of 56 days (TOX weight=0, REL weight=1; 95% CI: 16, 102; Table 2). The first scenario reflects the case when a patient’s quality of life before progression is unaffected by toxicity, but quality of life after progression is severely affected. The second scenario is the reverse, when a patient’s quality of life before progression is severely affected by toxicity, but quality of life after progression is unaffected.

Table 2 Threshold utility analysis

Discussion and conclusions

Results from this phase III trial of sunitinib vs IFN-α showed that sunitinib was superior to IFN-α, based on a longer duration of median PFS; overall survival was also longer with sunitinib than with IFN-α, although the difference did not reach statistical significance (Motzer et al, 2009). The results of exploratory analyses were consistent with the hypothesis that the overall survival endpoint was confounded by crossover treatment and use of alternate anticancer drugs after discontinuation. A total of 25 patients from the IFN-α group crossed over to receive sunitinib on study and one-third (117/359=33%) of the patients from the IFN-α group received post-study treatment with sunitinib.

The rate of adverse events was low in both treatment groups, although adverse events occurred in more patients treated with sunitinib than with IFN-α. This is likely to be attributable to the much longer average duration of therapy with sunitinib than with IFN-α. Quality-of-life scores as measured by the FACT-G and FKSI questionnaires were higher among patients randomised to sunitinib than among those randomised to IFN-α, with minimal regional variation (Cella et al, 2008; Cella et al, 2010).

In this Q-TWiST analysis, we integrated efficacy and safety endpoints, and found that patients on the sunitinib arm spent on average 27 more days with grade 3 or 4 treatment-related toxicity than patients on the IFN-α arm. For both treatment arms, the number of days during which a patient experienced toxicity was low compared with the time during which the average patient remained progression-free. This effect was more pronounced for the sunitinib arm; time spent without progression or toxicity was 151 days greater in the sunitinib than in the IFN-α arm. The Q-TWiST analysis provides a way by which we can compare PFS, overall survival, and time spent without toxicity in the two treatment arms in one metric.

Patients can value time spent in relapse and time spent undergoing active therapy differently. For the Q-TWiST analysis, we assigned a range of utility weights to reflect 25 such scenarios. A utility weight of one discounts zero days from a health state, while a utility weight of zero discounts all days from a health state. In our analysis, all utility combinations resulted in positive Q-TWiST treatment differences (sunitinib Q-TWiST minus IFN-α Q-TWiST). The difference in scores ranged from 56 to 177 days. Interpreted another way, sunitinib had higher quality-adjusted survival times than IFN-α across the entire range of utility combinations.

Revicki et al (2006) recommended that differences in Q-TWiST equal to at least 10% of overall survival should be considered clinically meaningful. Based on the median overall survival of 26.4 months or 803 days in the sunitinib arm, a difference of 10% or greater corresponds to 80 days or more. In Table 2, Q-TWiST differences in all but five scenarios can be deemed clinically important. Four of the five smallest differences occurred when time from progression to death or end of study was not discounted (REL weight=1), but time with toxicity before progression was discounted (TOX weight <1).
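The threshold comparison can be approximated from the rounded mean differences reported in the Results (TOX +27, TWiST +151, REL −96 days). The sketch below assumes the standard Q-TWiST form and an 80-day cutoff. Because it starts from rounded values rather than the Table 2 estimates, it yields 178 and 55 days at the extremes rather than the reported 177 and 56, so it is a plausibility check only.

```python
# Rounded between-arm differences in mean days (sunitinib minus IFN-alpha),
# taken from the Results text; the underlying Table 2 values are not reproduced here.
D_TOX, D_TWIST, D_REL = 27, 151, -96
THRESHOLD_DAYS = 80  # about 10% of the 803-day median overall survival
WEIGHTS = [0.0, 0.25, 0.5, 0.75, 1.0]

# Q-TWiST difference for every (u_tox, u_rel) combination
differences = {
    (u_tox, u_rel): u_tox * D_TOX + D_TWIST + u_rel * D_REL
    for u_tox in WEIGHTS
    for u_rel in WEIGHTS
}

# Scenarios that fall short of the clinically meaningful threshold
below = sorted(pair for pair, diff in differences.items() if diff < THRESHOLD_DAYS)

for u_tox, u_rel in below:
    print(f"u_TOX={u_tox:.2f}, u_REL={u_rel:.2f}: {differences[(u_tox, u_rel)]:.2f} days")
```

Under these assumptions, five of the 25 combinations fall below 80 days, four of them with a REL weight of one, which agrees with the pattern described in the text.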

In this Q-TWiST analysis, we used overall utility weights to reflect the average value patients place on time spent in relapse and time spent experiencing toxicity. However, it is possible that there is heterogeneity in how a specific patient would value time spent with toxicity and time spent from progression to death or end of study. Another limitation of our analysis is that patients progressing on IFN-α were given the choice to cross over to the sunitinib arm. This resulted in 25 patients switching to sunitinib during the study and could affect estimation of the REL health state for the IFN-α arm.

Q-TWiST methodology can provide useful advice on treatment choice when PFS differences are significant but overall survival differences are not, as is the case with the sunitinib vs IFN-α trial analysed herein. Applying Q-TWiST methodology to examine progression, overall survival, and toxicity as a single metric showed that sunitinib has a greater quality-adjusted survival time than IFN-α. For sunitinib patients, the greater amount of time spent with a toxicity is offset by far longer PFS. These results support the conclusion that sunitinib offers improved clinical and quality-of-life outcomes compared with IFN-α for mRCC patients.