Main

Treatment of metastatic melanoma is advancing with the advent of new targeted immune-modulating therapies (Agarwala, 2009; Julia et al, 2010). This aggressive disease was traditionally treated with dacarbazine (DTIC) chemotherapy, which has never been shown to provide a survival benefit and is associated with serious side effects. For decades, it was the only FDA-approved treatment for metastatic stage IV melanoma. In 2011, ipilimumab was approved after significantly improving survival in patients with metastatic melanoma in a phase 3 study comparing ipilimumab/DTIC vs placebo/DTIC (hazard ratio, 0.72; P<0.001) (Robert et al, 2011). Patients who received ipilimumab and DTIC for up to 48 weeks had consistently higher survival rates at each yearly time point, with a 4-year survival rate of 19% vs 10% in the group treated with DTIC only. Disease progression was slowed and durable objective responses were noted. These results mark a major advance in the treatment of metastatic melanoma; thus, ipilimumab is seen to fulfil an unmet medical need.

Treatment with ipilimumab is associated with a distinct set of adverse events (AEs) that are consistent with its immune-based mechanism of action. These immune-mediated AEs may affect any organ system; however, the most frequently affected organs are the skin (rash, pruritus) and gastrointestinal tract (diarrhoea, colitis). Most immune-mediated AEs are mild to moderate in grade and resolve when established treatment guidelines are followed (Weber et al, 2012). Occasionally, immune-mediated AEs may be severe and require discontinuation of treatment and administration of systemic glucocorticoids or other immunosuppressant agents. In some reports, immune-mediated AEs correlated with treatment response (Attia et al, 2005; Downey et al, 2007).

To assess the overall effect of ipilimumab on the patient experience, we implemented quality-adjusted time without symptoms or toxicities (Q-TWiST) methodology (Goldhirsch et al, 1989; Glasziou et al, 1990). Q-TWiST is an analytical approach that considers time with toxicities and clinical outcomes simultaneously to evaluate the trade-off between AEs and benefits of treatment during the entire survival period. The approach has been used to evaluate treatments in many different cancer indications (Jang et al, 2009; Gao et al, 2010; Marcus et al, 2010; Sherrill et al, 2011; Wang et al, 2011; Patil et al, 2012), including interferon for melanoma (Cole et al, 1996; Kilbridge et al, 2002; Sloan et al, 2002).

Materials and Methods

Study design

The data source for this analysis was a phase 3, multicentre, randomised, double-blind, two-arm study in patients with previously untreated unresectable stage III or stage IV melanoma (ClinicalTrials.gov NCT00324155). Eligible patients had unresectable stage III melanoma with N3 macroscopic lymph nodes or in-transit/satellite metastases, or stage IV melanoma, and measurable disease per modified World Health Organisation (WHO) criteria. Other key inclusion criteria were age of at least 18 years and an ECOG performance status of 0–1; patients were eligible with any lactate dehydrogenase level and regardless of BRAF mutation status or HLA type. Key exclusions were brain metastases (based on imaging) and symptomatic autoimmune disease.

Study phases are shown in Figure 1. Patients were randomised to receive either of the following treatments:

  • IPI+DTIC: ipilimumab (10 mg kg−1 every 3 weeks × 4 doses, then every 12 weeks starting week 24) +DTIC (850 mg m−2 every 3 weeks × 8 doses)

  • PLA+DTIC: placebo (every 3 weeks × 4 doses, then every 12 weeks starting week 24) +DTIC (850 mg m−2 every 3 weeks × 8 doses).

Figure 1 Study phases.

In the induction phase, patients were dosed with study drug or placebo plus DTIC for up to 12 weeks and then assessed for progression once per month until week 24, during which time DTIC dosing could continue. Patients who progressed before week 12 could continue with treatment and assessments provided that no safety concerns precluded further dosing. At week 24, patients who had not progressed entered a maintenance phase. During the maintenance phase, ipilimumab dosing continued for patients who continued to be eligible for treatment; assessments were performed every 6 weeks until week 48 and every 12 weeks thereafter. At progression or start of alternate cancer therapy or other reason for study termination, patients entered the follow-up phase in which tumour assessments were not performed. Patients were followed for survival and AEs throughout all phases and until 70 days after last dose of study medication for all patients.

Disease progression was based on assessment of a subject’s index lesions, nonindex lesions and new lesions according to modified WHO criteria. Imaging-based evaluations for progression were required at each visit for the chest, abdomen and pelvis. Nonradiographic assessments, such as measurement of visible cutaneous lesions, were also admissible as evidence of progressive disease. Progression was confirmed by an independent review committee. Progression-free survival (PFS) was defined for each patient as the time between randomisation date and the date of progression or death, whichever occurred first. Overall survival (OS) was defined for each patient as the time between randomisation date and death. If a patient had not died, the patient was censored at the time of last contact (last known date alive).

The analysis presented here uses data from the April 2012 lock date.

Statistical analyses

In this application of the standard Q-TWiST method (Goldhirsch et al, 1989; Cole et al, 1994), survival time was partitioned into three health states:

  • Toxicity (TOX): period with AEs grade ⩾3 after randomisation and before disease progression or censoring for progression

  • Time without symptoms or toxicities (TWiST): period without TOX or symptoms before disease progression

  • Relapse (REL): period following disease progression and ending with death or censoring

For the TOX state, the number of days with grade 3/4 AEs before disease progression or censor date for PFS was totalled for each patient. Nonoverlapping periods with AEs were summed; duration of overlapping periods was counted from the start date of the first period until the stop date of the last period. Patients who experienced no qualifying AEs before disease progression were censored at the day after randomisation and were assigned a duration of zero for the TOX state. Patients with censored PFS also had TOX time censored, because total duration of TOX before progression for these patients is unknown.
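The counting rule for the TOX state can be illustrated with a minimal sketch. It assumes AE periods are recorded as inclusive (start, stop) study days counted from randomisation, and that only days strictly before the progression or PFS-censoring day are counted; the function name `tox_days` and these day-counting conventions are assumptions made for illustration, not details reported for the trial.

```python
def tox_days(ae_periods, pfs_day):
    """Total days with qualifying AEs before progression or PFS censoring.

    ae_periods : list of (start_day, stop_day) pairs, both ends inclusive
    pfs_day    : study day of progression or PFS censoring (assumed exclusive)
    """
    # Keep only periods starting before progression and truncate their end.
    clipped = sorted(
        (start, min(stop, pfs_day - 1))
        for start, stop in ae_periods
        if start < pfs_day
    )
    total = 0
    cur_start = cur_stop = None
    for start, stop in clipped:
        if cur_stop is None or start > cur_stop:
            # No overlap with the running period: bank it and start a new one.
            if cur_stop is not None:
                total += cur_stop - cur_start + 1
            cur_start, cur_stop = start, stop
        else:
            # Overlap: extend the running period to the latest stop day.
            cur_stop = max(cur_stop, stop)
    if cur_stop is not None:
        total += cur_stop - cur_start + 1
    return total
```

For example, periods (10, 14) and (12, 20) overlap and are counted once as days 10 to 20, while a separate period (30, 32) is added on top. A patient with no qualifying AEs gets a duration of zero.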

Survival curves that correspond to TOX, PFS and OS were estimated by the Kaplan–Meier method and plotted on a single graph. Initially, analyses were run with follow-up time set close to the median OS among all randomised patients as is typical in Q-TWiST analyses. To portray the treatment difference over extended follow-up time and for comparability with other analyses from this trial, the survival analyses were repeated with 2-, 3- and 4-year follow-up times. The areas between the curves represent the restricted mean durations of health states such that:

TWiST duration=Mean PFS−Mean TOX

REL duration=Mean OS−Mean PFS
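As a rough illustration of how these areas can be obtained, the sketch below computes a restricted mean as the area under a Kaplan–Meier step curve up to a follow-up limit and then applies the two subtractions above. The function names and the toy inputs are hypothetical; the trial's actual software and tie-handling conventions are not described in the text.

```python
def km_restricted_mean(times, events, tau):
    """Area under the Kaplan-Meier curve between 0 and tau.

    times  : follow-up time for each patient (same unit as tau)
    events : 1 if the event was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv, area, last_t = 1.0, 0.0, 0.0
    i = 0
    while i < len(data) and data[i][0] <= tau:
        t = data[i][0]
        deaths = leaving = 0
        # Group all patients who share this time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        area += surv * (t - last_t)       # rectangle up to this time
        if deaths:
            surv *= 1 - deaths / at_risk  # Kaplan-Meier step
        at_risk -= leaving
        last_t = t
    return area + surv * (tau - last_t)   # final rectangle up to tau


def partition_states(mean_tox, mean_pfs, mean_os):
    """Split restricted mean OS into TOX, TWiST and REL durations."""
    return {
        "TOX": mean_tox,
        "TWiST": mean_pfs - mean_tox,
        "REL": mean_os - mean_pfs,
    }
```

Applying `km_restricted_mean` to the TOX, PFS and OS data with the same `tau` gives the three means that `partition_states` combines.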

The mean Q-TWiST for each treatment arm was calculated as follows:

Q-TWiST=(μTOX × TOX)+(μTWIST × TWiST)+(μREL × REL)

where TOX, TWiST and REL represent the mean health state durations; μTOX, μTWIST and μREL denote the utility scores for each respective health state. Q-TWiST was calculated for each combination of hypothetical utility scores. Note that Q-TWiST equals the mean OS when μTOX=μTWIST=μREL=1 and Q-TWiST equals the mean PFS when μTOX=μTWIST=1 and μREL=0.
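The utility-weighted sum of the three mean state durations, and the two special cases just noted, can be checked with a short sketch. The mean durations used here are hypothetical values chosen for illustration only and are not trial results.

```python
def q_twist(tox, twist, rel, u_tox, u_twist, u_rel):
    """Utility-weighted sum of the mean health state durations."""
    return u_tox * tox + u_twist * twist + u_rel * rel


# Hypothetical mean durations in months (not taken from the trial).
tox, twist, rel = 1.0, 4.0, 5.0
mean_pfs = tox + twist        # TOX and TWiST together make up PFS
mean_os = tox + twist + rel   # all three states together make up OS

# All utilities equal to 1: Q-TWiST reduces to mean OS.
as_os = q_twist(tox, twist, rel, 1, 1, 1)
# Utility of REL set to 0: Q-TWiST reduces to mean PFS.
as_pfs = q_twist(tox, twist, rel, 1, 1, 0)
```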

A threshold utility analysis was performed to show treatment comparisons of Q-TWiST for a matrix of possible utility weight combinations where μTOX and μREL are varied from 0 to 1 relative to μTWIST=1. Differences in mean Q-TWiST between treatment groups were calculated for each combination of selected utility scores.
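Because Q-TWiST is linear in the state durations, the between-group difference for any utility pair depends only on the between-group differences in mean TOX, TWiST and REL. The sketch below builds such a grid with μTWIST fixed at 1. The function name, the grid spacing and the input differences are illustrative assumptions, and the significance testing that accompanies the threshold analysis is omitted.

```python
def q_twist_diff_grid(d_tox, d_twist, d_rel, steps=4):
    """Q-TWiST difference for each (u_tox, u_rel) pair, with u_twist = 1.

    d_tox, d_twist, d_rel : between-group differences in mean state duration
    steps                 : number of equal intervals between 0 and 1
    """
    utilities = [i / steps for i in range(steps + 1)]
    return {
        (u_tox, u_rel): u_tox * d_tox + d_twist + u_rel * d_rel
        for u_tox in utilities
        for u_rel in utilities
    }


# Illustrative differences in months (hypothetical, not trial results).
grid = q_twist_diff_grid(d_tox=0.7, d_twist=0.8, d_rel=0.5)
lowest = min(grid.values())   # reached at u_tox = u_rel = 0
highest = max(grid.values())  # reached at u_tox = u_rel = 1
```

With all three differences positive, the smallest difference sits at the corner where both utilities are 0 and the largest at the corner where both are 1.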

Health-related quality of life (HRQoL) scales were collected during the clinical trial, but utility scores were not. To look specifically at the utilities and the corresponding Q-TWiST difference that would be expected in a group of melanoma patients, we reran the analyses with utility scores from two additional sources. The first set was based on the utility scores that Beusterien et al (2009) assigned to clinically defined health states using a general population valuation of advanced melanoma states. The second set was derived from EORTC-QLQ-C30 scores reported in the MDX010-20 trial by patients who had progressed on or after, and/or were intolerant of, prior therapy for advanced (unresectable stage III or IV) melanoma (Hodi et al, 2010; Batty et al, 2011).

Results

In study CA184024, 250 patients were randomised to IPI+DTIC and 252 patients to PLA+DTIC. The proportion of patients experiencing a grade 3 or higher AE before progression or censoring for progression was 64% for IPI+DTIC and 40% for PLA+DTIC. Median OS in the trial across all patients was 10 months, with follow-up continuing for up to 5 years.

Partitioned survival plots are shown by treatment group in Figure 2, with vertical lines corresponding to the analysis cutoff points. Given an overall median survival in the trial of 10 months, the follow-up time for the initial Q-TWiST analysis was set at 1 year. To assess the effects in patients with extended survival, analyses were repeated using data from 2, 3 and 4 years of follow-up. These repeated analyses included all patients, but each patient stopped contributing data after death or censoring for survival (that is, loss to follow-up). Patients remaining alive and uncensored as of each cutpoint are shown in Table 1 along with the unweighted mean duration of health states for each analysis.

Figure 2 Partitioned survival plots with 4-year follow-up (two plots): ipilimumab+dacarbazine and placebo+dacarbazine.

Table 1 Mean durations of health states (months, unweighted)

The average duration of time before progression spent with grade 3/4 AEs was 1.34 months on ipilimumab treatment, significantly longer than the 0.61 months of TOX for the placebo treatment group (P=0.0016). During the first year on study, groups were similar in the average time patients spent in TWiST (P=0.858) or REL (P=0.906).

When the utility score for TWiST is fixed at 1, the Q-TWiST difference between groups ranged from 0.01 to 0.79 months, favouring ipilimumab therapy for all possible utility combinations, but was statistically significant only for higher utility values for the TOX state. This analysis is displayed in Figure 3a, with utility scores for TOX on the x axis and for REL on the y axis. The magnitude of the Q-TWiST difference (in months) is given by the numbered lines within the plot, with positive numbers favouring ipilimumab therapy over placebo. Shaded areas represent different levels of statistical significance, with lighter grey representing smaller P-values.

Figure 3 Threshold utility analyses (four plots). Note: Numbered lines within the plots represent Q-TWiST difference (in months); positive numbers favour ipilimumab+dacarbazine over placebo+dacarbazine. Grey shading depicts P-values for testing the difference between treatments under utility score combinations; the lighter the grey, the smaller the P-value.

Analyses with extended follow-up yielded results very similar to those of the initial analysis for duration of time spent with grade 3/4 AEs before progression. However, with extended follow-up time, differences between treatment groups were increased both for TWiST (0.84 months at 2 years; 1.40 months at 3 years; and 1.40 months at 4 years) and survival after disease progression (0.48 months at 2 years; 1.08 months at 3 years; and 2.30 months at 4 years). Differences between treatments were not statistically significant for the unweighted TWiST and REL states (Table 1).

When Q-TWiST analyses were repeated with extended follow-up time, differences between groups consistently favoured IPI+DTIC over PLA+DTIC. With 2-year follow-up, the Q-TWiST difference across a matrix of utility values ranged from 0.8 to 2.0 months. The light grey shading on Figure 3b indicates that treatment differences in Q-TWiST were either significant (P⩽0.05) or marginally significant (0.05<P<0.1) for most utility combinations, with the exception of the bottom-left corner with TOX and REL utilities both <0.2. The repeated analysis for 3 years of follow-up yielded Q-TWiST differences from 1.4 to 3.1 months in favour of the ipilimumab treatment (Figure 3c). The differences were statistically significant for most of the utility combinations except when the utilities for the TOX and REL states were extremely low (TOX<0.2; REL<0.1). Finally, the 4-year follow-up shows Q-TWiST differences of 1.4–4.3 months in favour of ipilimumab. The 4-year Q-TWiST differences are statistically significant at the 0.05 level for all combinations of utility scores.

According to utilities derived from the EORTC-QLQ-C30 scores from patients in the MDX010-20 trial, melanoma patients had a utility of 0.80 before progression and 0.76 after progression (Batty et al, 2011). As that study did not collect utilities during AEs, we set the utility for TOX at 0.67 based on Beusterien et al (2009) and reran the Q-TWiST analyses. Based on this combination of trial-based and external utility scores of 0.67, 0.80 and 0.76 for TOX, TWiST and REL, respectively, the Q-TWiST difference in this trial was 0.50 months (P=0.0326) favouring ipilimumab after 1 year (Table 2). The Q-TWiST difference was 1.5 months with 2 years of follow-up (P=0.0091), 2.36 months at 3 years (P=0.005) and 3.28 months at 4 years (P=0.0074).
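The 2-year figure can be roughly reproduced from numbers reported in this section, as a consistency check rather than a reanalysis. The sketch assumes that the 1-year TOX difference (1.34 vs 0.61 months) also applies at 2 years, which the text describes only as "very similar"; the TWiST and REL differences are the 2-year values given above.

```python
# Utility scores applied in the melanoma-specific analysis.
U_TOX, U_TWIST, U_REL = 0.67, 0.80, 0.76

# Between-group differences in mean state duration (months).
d_tox = 1.34 - 0.61   # 1-year value, ASSUMED to hold at 2 years
d_twist = 0.84        # reported 2-year TWiST difference
d_rel = 0.48          # reported 2-year REL difference

# Q-TWiST is linear, so the difference is the weighted sum of differences.
q_twist_diff = U_TOX * d_tox + U_TWIST * d_twist + U_REL * d_rel
print(round(q_twist_diff, 2))
```

Under that assumption the result is about 1.5 months, in line with the reported 2-year difference.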

Table 2 Melanoma-specific utility analyses

Sensitivity analyses (not shown) with different definitions for the TOX state (for example, including AEs of all grades) and on a subgroup of response-evaluable patients support these results.

Discussion

We applied the Q-TWiST analytical method to concurrently evaluate the AEs, progression times, and OS in study CA184024. This method provides information about the safety and efficacy impact of treatment from the perspective of utility to the patient. Simply put, treatments are compared on the average time that patients spend in various health states, with devaluation for periods of time that are of less use to the patient (that is, time with AEs, time after disease progression).

Half the patients died during the first year of the study, which included the induction and maintenance dosing periods (median OS for IPI+DTIC=11.2 months vs 9.1 months for PLA+DTIC) (Robert et al, 2011). Q-TWiST results during the same period favoured ipilimumab therapy for most hypothetical utility levels, but differences between groups were small. These results are consistent with previously reported HRQoL data collected before progression. During the first 12 weeks of dosing, small to moderate declines in HRQoL scores were observed in both treatment arms, and no significant differences in EORTC-QLQ-C30 functioning scales or symptom scales were detected between treatment groups (Kotapati et al, 2011). Although the threshold utility analyses shown here rely on a hypothetical range of utility values, we also conducted the Q-TWiST analyses using utility values derived from previous studies of patients with advanced melanoma. These results are indicative of the average Q-TWiST difference we would expect among patients with this disease taking these treatments.

Repeated Q-TWiST analyses with extended follow-up time showed that the magnitude of the difference in quality-adjusted survival increased with longer follow-up time (Table 2). These results can be put into context by reviewing the treatment group differences in OS from this trial. Robert et al (2011) reported that 21% of patients in the IPI+DTIC group vs 12% in the PLA+DTIC group remained alive at the end of 3 years. Here, we show that the average quality-adjusted survival advantage for melanoma patients taking ipilimumab with DTIC is 3.3 months over a 4-year period compared with patients taking placebo with dacarbazine.

Drugs such as ipilimumab induce an immuno-oncological response that may take time to develop (Wolchok et al, 2009). Tumour burden may actually increase before an immune response develops enough to cause regression. Conventional response criteria (mWHO/RECIST) may not capture the delayed response to treatment, and tumour progression endpoints do not adequately reflect this type of nonmonotonic growth pattern. As the Q-TWiST analyses incorporate data from the progression endpoint, the quality-adjusted survival benefit may be understated. In other words, the Q-TWiST method evenly devalues all time after progression as REL, so any tumour responses that may have occurred later that were not reflected in the progression endpoint are also not reflected in these analyses. Thus, quality-adjusted survival for patients with delayed tumour response may be greater than shown here.

Interpretation of these findings is limited by the small number of subjects who remained alive after 1 year, fewer than half the patients in each group. However, the data for very long-term survivors remain favourable for ipilimumab, suggesting that the increased survival accomplished with this drug is not offset by toxicities.

For CA184024, DTIC was used as the reference treatment as it was the only globally approved chemotherapeutic agent for these patients at the time that this study was conducted. The use of DTIC is largely historical and based on tumour response rates, and DTIC has never demonstrated an OS benefit in a randomised trial. Nevertheless, DTIC was in both the reference and experimental arms of this study as it was considered the standard of care therapy for advanced melanoma patients.

During the first year of study, there was little difference between groups in quality-adjusted survival. However, with 2, 3 and 4 years of follow-up for patients with extended survival, the benefits of IPI+DTIC vs PLA+DTIC for advanced melanoma continued to accrue.