Main

The role of interferon-alpha in malignant melanoma has long been debated and researched, with over 6000 patients entered into trials (Molife and Hancock, 2002). One aspect of this has been the effectiveness of low-dose extended duration adjuvant therapy. A recent study in 674 patients with thick primary cutaneous melanoma showed no significant difference in overall survival or recurrence-free survival up to 5 years (Hancock et al, 2004). This, together with other trial evidence (Ascierto et al, 2006), points to there being no routine role for low-dose therapy within this patient group.

A definitive conclusion is not possible, however, until data on health-related quality of life (HRQoL) and costs have been considered alongside the survival data. It is possible that a HRQoL advantage exists, or that the cost differentials are such that the treatment may be considered cost-effective even in the face of nonsignificant clinical findings.

Data from within AIM-High have been reported on toxicity and change in Karnofsky Performance Status (Hancock et al, 2004), but these data offer only a partial view of HRQoL. This paper reports on the HRQoL data from AIM-High plus cost and cost-effectiveness as estimated by an incremental cost per quality-adjusted life year (QALY).

Materials and methods

Patients within the study were randomised to either interferon alpha-2a at 3 megaunits three times per week for 2 years or until recurrence, or placebo. The study protocol was approved by the relevant research ethics committee and all participating patients gave informed written consent.

HRQoL data in the form of the European Organisation for Research and Treatment of Cancer (EORTC) QLQ-C30 were originally intended to be collected at baseline, 3, 6, 12, 24, 36, 48 and 60 months for a subgroup of patients. However, HRQoL data were actually collected at a variety of time points postrandomisation from 3 days to 77 months. Data for the economic analysis, including cost information and the EQ-5D, were collected at 3, 6, 12, 24, 36, 48 and 60 months. These economic data were only collected for a subgroup of patients, selected as every fifth patient to enter the study.

The European Organisation for Research and Treatment of Cancer (EORTC) QLQ-C30 is a 30-item cancer-specific instrument designed to assess the health-related quality of life (QoL) of cancer patients participating in international clinical trials (Aaronson et al, 1993). The QLQ-C30 version 1.0 used in the AIM-High Trial incorporated five functional scales: physical (PF), role (RF), cognitive (CF), emotional (EF) and social (SF); three symptom scales: fatigue (FA), pain (PA), nausea and vomiting (NV); a global health status/QoL scale (QL) and six single items assessing additional symptoms commonly reported by cancer patients: dyspnoea (DY), loss of appetite (AP), insomnia (SL), constipation (CO), diarrhoea (DI) and a single item on the perceived financial impact of the disease (FI). All of the scales and single-item measures range in score from 0 to 100. A high scale score represents a higher response level. Thus a high score for a functional scale represents a high/healthy level of functioning; a high score for the global health status/QoL represents a high QoL; but a high score for a symptom scale/item represents a high level of symptomatology/problems.
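The 0 to 100 scoring described above can be sketched as follows. This is a minimal illustration of the linear transformation set out in the EORTC scoring manual, not the trial's own code. The function names and the example item responses are hypothetical, and the item range has to be supplied per scale because it differs between items (for instance, 3 for four-category items and 6 for the seven-category global health items).

```python
from statistics import mean


def raw_score(item_responses):
    """Raw score: the mean of the item responses that make up a scale."""
    return mean(item_responses)


def functional_score(item_responses, item_range):
    """Functional scales: a high score means a high/healthy level of functioning."""
    return (1 - (raw_score(item_responses) - 1) / item_range) * 100


def symptom_or_global_score(item_responses, item_range):
    """Symptom scales/items and global health status: a high score means a high response level."""
    return ((raw_score(item_responses) - 1) / item_range) * 100


# Hypothetical responses, for illustration only.
emotional = functional_score([1, 2, 1, 2], item_range=3)     # items scored 1-4
fatigue = symptom_or_global_score([2, 3, 2], item_range=3)   # items scored 1-4
global_qol = symptom_or_global_score([6, 5], item_range=6)   # items scored 1-7
```

The two functions differ only in direction, which is why a high functional score is good while a high symptom score is bad.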

Patient utilities were obtained from the EQ-5D questionnaire. The EQ-5D is a five-dimensional health state classification. The five dimensions are mobility, self-care, usual activities, pain/discomfort and anxiety/depression. These five dimensions are each assessed by a single question on a three point ordinal scale (no problems, some problems, extreme problems). An EQ-5D ‘health state’ is defined by selecting one level from each dimension. A total of 243 health states are thus defined. Values or preference weights for a sample of these health states were obtained from a general community sample using a Time-Trade-Off (TTO) technique. Estimates for all health states were extrapolated from this sample by statistical regression modelling. The EQ-5D preference-based measure can be regarded as a continuous outcome scored on a −0.59 to 1.00 scale, with 1.00 indicating ‘full health’ and 0 representing dead. The negative EQ-5D scores represent certain health states valued as worse than dead.
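The count of 243 health states follows from choosing one of three levels on each of five dimensions (3 to the power 5). A short sketch makes this concrete; the variable names are illustrative, and no preference weights are included because the tariff values are not reproduced in this paper.

```python
from itertools import product

DIMENSIONS = ("mobility", "self-care", "usual activities",
              "pain/discomfort", "anxiety/depression")
LEVELS = (1, 2, 3)  # 1 = no problems, 2 = some problems, 3 = extreme problems

# Every health state is one level per dimension, e.g. (1, 1, 2, 3, 1).
health_states = list(product(LEVELS, repeat=len(DIMENSIONS)))

n_states = len(health_states)      # 3 ** 5
full_health = health_states[0]     # no problems on any dimension
worst_state = health_states[-1]    # extreme problems on every dimension
```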

Data on resource use covered all key areas of care; interferon dose, inpatient and outpatient hospital care, community nurse and general practitioner care. Data on interferon were collected via the study case report form as completed by the study clinician or research nurse. The other economic data, including the EQ-5D, were collected through a patient completed questionnaire.

Analysis

Data for the EORTC QLQ-C30 were scored using the EORTC Scoring Manual (Fayers et al, 1995). EQ-5D data were scored using UK population values (Dolan, 1997), and combined with mortality data to calculate QALYs (Drummond et al, 1997). As EQ-5D values were missing at baseline, these were imputed from a regression of EORTC QLQ-C30 responses on EQ-5D values from other visits.
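As an illustration of how utilities and survival time combine into QALYs, the sketch below computes the area under a patient's utility profile using the trapezium rule. This is an assumption made for illustration: the exact computation used in the study is not described here. The function name and the example profiles are hypothetical, and a patient who dies is represented simply by a profile that ends at the time of death.

```python
def qalys(times_years, utilities):
    """Area under the utility curve (trapezium rule).

    times_years: assessment times in years, in ascending order.
    utilities:   utility value at each assessment time.
    """
    total = 0.0
    for i in range(1, len(times_years)):
        width = times_years[i] - times_years[i - 1]
        mean_height = (utilities[i] + utilities[i - 1]) / 2
        total += width * mean_height
    return total


# Hypothetical profiles, for illustration only.
survivor = qalys([0, 1, 2], [1.0, 1.0, 1.0])   # two years in full health
declining = qalys([0, 1], [0.8, 0.6])          # one year, utility falling
```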

Differences in patient characteristics between groups were tested for using independent sample t-tests, or χ2 tests, as appropriate. The Kaplan–Meier method was used to calculate the time from randomisation to death, and the log rank test to compare the survival times of both groups (Altman, 1991). HRQoL data were collected at a variety of time points postrandomisation from 3 days to 2136 days, mean 403 days. Patients had between 1 and 13 follow-up QoL assessments, with an average of 3.85 assessments postrandomisation. Given this variation in data collection we decided on a relatively straightforward approach to the analysis of the longitudinal data which involved the use of summary measures (Matthews et al, 1990). We summarised follow-up QoL responses for each individual subject by taking the simple average of their follow-up QoL responses over time as our summary measure (Matthews et al, 1990) as we were concerned with differences in overall levels of QoL rather than more subtle effects.
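The summary-measure approach reduces each patient's irregular series of follow-up assessments to a single number, their simple average. A minimal sketch is given below; the record layout, patient identifiers and scores are hypothetical and serve only to show the calculation.

```python
from statistics import mean

# Hypothetical long-format records: (patient_id, days_since_randomisation, score)
assessments = [
    ("A", 90, 66.7), ("A", 180, 75.0), ("A", 365, 83.3),
    ("B", 95, 50.0),
    ("C", 3, 100.0), ("C", 200, 91.7),
]


def summary_measures(records):
    """Return {patient_id: mean of that patient's follow-up scores}."""
    by_patient = {}
    for patient_id, _day, score in records:
        by_patient.setdefault(patient_id, []).append(score)
    return {pid: mean(scores) for pid, scores in by_patient.items()}


summary = summary_measures(assessments)
```

Because the timing of each assessment is ignored, a patient with one assessment and a patient with thirteen each contribute a single value to the between-group comparison.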

Differences in mean follow-up HRQoL between the groups were compared using a multiple linear regression model, with mean follow-up HRQoL as the outcome variable and baseline HRQoL, overall survival status (dead or censored) and treatment group as covariates. P-values of less than 0.05 were regarded as being statistically significant.
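The structure of this regression can be sketched in a few lines. The example below fits an ordinary least squares model by solving the normal equations; it is not the software used in the study. The data are synthetic and generated without noise from made-up coefficients, purely so that the fit can be checked, and the `ols` helper is a hypothetical name.

```python
def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: list of covariate lists (an intercept column is added here).
    y:    list of outcomes.
    Returns coefficients [intercept, b1, b2, ...].
    """
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic covariates: (baseline score, died 0/1, IFN group 0/1).
covariates = [(60, 0, 0), (80, 0, 1), (70, 1, 0), (90, 1, 1),
              (50, 0, 1), (75, 1, 0), (65, 0, 0), (85, 1, 1)]
# Made-up model: follow-up = 20 + 0.7*baseline - 8*died - 5*IFN (no noise).
follow_up = [20 + 0.7 * b - 8 * d - 5 * g for b, d, g in covariates]

intercept, b_baseline, b_died, b_group = ols(covariates, follow_up)
```

The coefficient on the treatment indicator (`b_group` here) is the adjusted between-group difference in mean follow-up score.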

The economic analysis followed guidelines set down by the National Institute for Clinical Excellence (2004). Costs were calculated by combining resource use data with unit costs representing national estimates (British Medical Association, 2002; Netten and Curtis, 2003). Costs beyond 1 year were discounted at 3.5% per annum. Prices are at 2003/4 levels with prices adjusted using the Hospital and Community Health Services Pay and Price Index where appropriate (Netten and Curtis, 2003). Cost differences were tested for using independent sample t-tests. An incremental cost-effectiveness ratio was calculated using mean costs and QALYs.
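Discounting at 3.5% per annum can be sketched as below. The sketch assumes costs are grouped into whole years, with year 1 left undiscounted and each later year divided by a further factor of 1.035; the annual grouping, the function name and the example costs are assumptions made for illustration.

```python
DISCOUNT_RATE = 0.035


def discounted_total(annual_costs, rate=DISCOUNT_RATE):
    """Sum annual costs, discounting each year after the first.

    annual_costs[0] is year 1 (undiscounted), annual_costs[1] is year 2, ...
    """
    return sum(cost / (1 + rate) ** year
               for year, cost in enumerate(annual_costs))


# Hypothetical costs of 1000 per year for three years.
total = discounted_total([1000.0, 1000.0, 1000.0])
```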

Economic data can be severely limited by missing data as both the costs and the QALYs are cumulative measures (i.e. totals over the entire follow-up period). Consequently, if only one value is missing from the full series of follow-up data, the total cannot be calculated. To avoid this problem, missing data imputation becomes an important part of analysing economic data. Within this study, the last observation carried forward was used to impute missing data in order to calculate total costs and QALYs (Heyting et al, 1992).
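Last observation carried forward fills each missing value with the most recent observed value for the same patient. A minimal sketch follows; the function name and the example series are hypothetical. Values missing before any observation are left as missing, since there is nothing to carry forward.

```python
def locf(series):
    """Replace each None with the most recent observed value.

    Leading missing values stay None because nothing precedes them.
    """
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Hypothetical EQ-5D utilities at the seven scheduled visits.
eq5d = [0.80, None, 0.76, None, None, 0.69, None]
filled = locf(eq5d)
```

Once the series is complete, cumulative totals such as costs and QALYs can be calculated for every patient.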

Results

Health-related QoL

Figure 1 shows that 444 patients (out of 674) or 66% had a valid baseline QoL assessment; 230/338 (68%) in the IFN group and 214/336 (64%) in the OBS group (P=0.233). Comparison of the n=398 patients with a valid baseline QoL assessment and at least one valid follow-up QoL assessment with the n=276 patients with no baseline or follow-up QoL assessments suggested that the two groups had similar age (P=0.151), gender (P=0.349), histology (P=0.078), and lengths of follow-up (P=0.528) (Table 1).

Figure 1 Enrolment, treatment and follow-up of study patients.

Table 1 Comparison of samples analysed

There was no interaction between treatment group and follow-up QoL assessment status with regard to overall survival (P=0.251) and no evidence of a difference in overall survival between the no follow-up QoL data and valid follow-up data groups (log rank P=0.84). Median survival was 4.05 years for patients with no valid follow-up QoL data vs 3.81 years for patients with valid baseline and follow-up QoL data. This implies we can assume that the QoL sample of 398 patients is a randomly selected subsample of the AIM-High trial population.

The IFN and OBS groups in the QoL sample had similar age, gender, stage and overall mortality. The IFN and OBS groups in the QoL sample had similar baseline EORTC QLQ-C30 scores, except for the PAIN dimension, where the IFN group had significantly higher levels of pain, +5.8 (95% CI: +1.2 to +10.4, P=0.013), see Table 2. There was no evidence of a difference in overall survival between the IFN and OBS groups (log rank P=0.15) in the QoL sample. Median survival for IFN was 4.29 years vs 3.21 years for the OBS group (see Figure 2).

Table 2 Baseline clinical characteristics and QoL in control and intervention (n=398)
Figure 2 Kaplan–Meier plot of overall survival by treatment group, HRQoL follow-up sample.

Patients in the observation (OBS) group had significantly better mean follow-up QoL on five dimensions of the EORTC QLQ-C30 V1 functional scales: RF, EF, CF, SF and QL (see Table 3) after adjustment for baseline QoL and overall survival status (dead or censored). Patients in the OBS group had significantly lower (better) mean follow-up QoL symptom scores on seven dimensions of the EORTC QLQ-C30 V1 symptom scales: FA, NV, DY, AP, CO, DI and FI (see Table 4) after adjustment for baseline QoL and overall survival status (dead or censored).

Table 3 Baseline and follow-up EORTC-QLQ-C30 v1 function scores by group (n=398)
Table 4 Baseline and mean follow-up EORTC-QLQ-C30 v1 Symptom scores by group (n=398)

Economic evaluation

In total, 134 patients were entered into the economic study and data were available for 111 of these patients. Costs were higher for the interferon (IFN) group in the first 2 years, then slightly lower thereafter. Overall, costs were £3066 higher in the IFN group. This is almost entirely due to the cost of therapy (Figure 3), but is not statistically significant (P=0.396). The IFN group generates 0.074 more QALYs (Table 5), which is equivalent to an extra 27 days in full health, although this is not statistically significant (P=0.752).

Figure 3 Total costs over 5 years within the two groups (n=111).

Table 5 Profile of quality-adjusted life years for interferon and control patients

The incremental cost per QALY for interferon therapy is £41 432. There is considerable statistical uncertainty around this estimate, and at a threshold of £30 000 per QALY there is only a 45% chance of interferon being cost-effective.
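The headline figures can be checked from the reported mean differences: the incremental cost-effectiveness ratio is the difference in mean costs divided by the difference in mean QALYs. The sketch below uses only the values reported in the text; the variable names are illustrative. The 45% probability is not reproduced, because it depends on patient-level data that are not given here.

```python
incremental_cost = 3066.0    # GBP, IFN minus control, as reported in the text
incremental_qalys = 0.074    # IFN minus control, as reported in the text
threshold = 30000.0          # GBP per QALY

# Incremental cost-effectiveness ratio: extra cost per extra QALY.
icer = incremental_cost / incremental_qalys

# The QALY gain expressed as days in full health.
days_full_health = incremental_qalys * 365

exceeds_threshold = icer > threshold
```

Because the QALY difference in the denominator is small, modest changes in it move the ratio substantially, which is consistent with the wide uncertainty reported around the estimate.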

Discussion

These results show that HRQoL is worse in the IFN group in terms of both functioning and symptomatology. As assessed by the EORTC QLQ-C30, statistically significant differences were found in terms of role functioning, emotional functioning, cognitive functioning, social functioning and global health status. Symptom scores in the IFN group were significantly worse for fatigue, nausea/vomiting, dyspnoea, appetite loss, constipation and diarrhoea.

Despite the great interest in interferon therapy for melanoma and its recognised toxicities (Hancock et al, 2000), there are very few large scale studies that have used validated HRQoL instruments. Paterson looked at 21 patients receiving high-dose interferon alpha-2b using the Functional Assessment of Cancer Therapy – Biological Response Modifier (FACT-BRM) scale, showing decreased QoL (Paterson et al, 2005). In an associated study, Trask looked at 16 patients in a longitudinal analysis which showed reductions in QoL (Trask et al, 2004). Bender assessed QoL as part of a trial with 16 patients, and showed a significant reduction in physical well-being associated with high-dose interferon alpha-2b therapy using the Functional Assessment of Cancer Therapy – General (FACT-G) scale (Bender et al, 2000).

The largest available study that used a validated QoL measure is by Rataj et al (2005), who reported on 110 melanoma patients receiving interferon alpha-2b following radical surgery. Using the EORTC QLQ-C30 (version 2.0), they found that treatment had an impact on physical function, social life, emotional functioning and cognitive functioning. Direct comparisons of that study with this one are not possible owing to limited reporting in their paper.

Other work has been undertaken looking at QoL in melanoma patients receiving interferon; however, this has been undertaken with a completely different approach. Kilbridge et al (2001), for instance, used the standard gamble technique to value a series of health states describing the QoL associated with interferon toxicity, melanoma recurrence and disease-free health. Their study, based on 107 patient interviews, showed that the side effects from interferon treatment reduced QoL, from 0.96 for the disease-free health state to 0.81 for severe side effects.

The Kilbridge utility estimates have been combined with mortality data from the ECOG 1684 trial (n=280) to produce a quality-adjusted survival analysis (Kilbridge et al, 2002) and a cost-utility analysis (Crott et al, 2004). Other analyses have used other utility estimates to describe treatment and post treatment QoL for interferon patients (Cole et al, 1996; Hillner et al, 1997; Lafuma et al, 2001); however, the utility figures were assigned by the researchers rather than derived from patients.

All of these utility-based studies show that a decrease in QoL during interferon treatment is more than offset by improved QoL owing to reduced recurrence and reduced mortality. Consequently, when these utility estimates are combined with the ECOG 1684 data, results tend to show that treatment with high-dose interferon is cost-effective compared to other technologies (Hillner et al, 1997; Lafuma et al, 2001; Crott et al, 2004). These results are in contrast to this study, which shows that while median survival is around 1 year longer, combining QoL with mortality shows the IFN group to be only marginally better (0.074 QALYs, P=0.752). This produces an incremental cost-effectiveness ratio of £41 432 per QALY. Using a funding threshold of £30 000 per QALY, which is at the higher end of a range used within the United Kingdom (National Institute for Clinical Excellence, 2004), these results show that low-dose extended duration interferon therapy is unlikely to be considered cost-effective.

There are several reasons for these differences. Firstly, AIM-High is a study of low-dose interferon therapy, whereas ECOG 1684 is a study of high-dose therapy. Consequently, QoL, survival and recurrence might be expected to differ. Secondly, the utility figures are derived in completely different ways. Our study used a generic preference-based outcome measure (EQ-5D) to gather data prospectively from within the trial, to which general population utility values were applied using a standard algorithm. Kilbridge et al (2001) generated utility values from melanoma patients by asking them to value health states describing various treatment scenarios. Thirdly, our study estimates cost-effectiveness at 5 years, while the modelling studies look at longer time scales; 35 years in one case (Crott et al, 2004). This is an important difference as shorter time frames generate higher incremental cost-effectiveness ratios. While extrapolation of our results is possible, the lower final year QALY estimate (Table 5) implies that even worse cost-effectiveness results may be produced if such an analysis were undertaken.

We should also consider the deficiencies associated with our study. Only 66% of patients in the trial had a baseline EORTC QLQ-C30 assessment. Despite this, there appears to be no systematic difference between patients included in our QoL analysis and those excluded. Another problem was that the number and timing of QoL assessments completed varied. This led us to undertake a simple analysis, using a summary measure of QoL based on the average scores. As assessments were more frequent during interferon treatment in order to capture the impact of side effects, the results will be weighted toward the early months of treatment. However, repeating the analysis with average follow-up over the first 2 years as the outcome, rather than the total follow-up, gave almost identical results to the longer follow-up (data not shown).

We followed the advice of Cox et al (1992) for QoL studies, which recommended simplicity of design, analysis and presentation of QoL assessments. Therefore, we decided to use a simple approach and not the simultaneous assessment of QoL and survival. There are several approaches to the simultaneous assessment of QoL and survival including: QALYs (for which we employed the EQ-5D), Q-TWiST (quality-adjusted time without symptoms of disease and toxicity of treatment) and multistate survival analysis (Billingham et al, 1999). The latter two approaches would require the definition of a finite number of health states in terms of the 15 EORTC QLQ-C30 dimension scores. We felt it was very difficult to define a set of finite, mutually exclusive and exhaustive health states that are clinically meaningful and fully describe the experiences of patients with malignant melanoma using the 15 dimensions of the EORTC QLQ-C30.

We assumed that the missing QoL data are missing at random and that dropout was noninformative. We found that the dropout rates and survival experience were similar across the treatment arms and believe that the between-treatment comparisons of QoL remain unbiased. We also included a term for overall survival status in our regression model to adjust for whether the patient was alive or dead during follow-up. This term should take into account that patients who died during follow-up may have a different average QoL at follow-up than patients who were alive or censored.

Further loss of data was present when the economic results were considered, such that data on only 111 patients were available for analysis. Even for these patients, missing data meant that imputation was required to produce a rectangular data set. While differences between this economic subsample and the full sample are not statistically significant, we are limited in our ability to detect differences between the two arms due to the smaller sample size. This problem is perhaps compounded by the possible insensitivity of the EQ-5D seen in several studies (Harper et al, 1997; Nicholl et al, 2001; Patel et al, 2004). Taken together, these limitations make the lack of a clear pattern in the QALY estimates shown in Table 5 difficult to interpret.

Looking beyond this study, it is difficult to cast light on other QoL evidence and economic evaluations, as methods are different, as too are the interferon dosing regimens. However, given that such a clear picture of QoL is produced with the EORTC QLQ-C30 we would recommend its use for further studies of interferon treatment. The much cited ECOG 1684 study did not incorporate prospective QoL assessment, and so subsequent evaluations have had to add on supplementary studies. While several improvements to future economic evaluations have been suggested (Crott, 2004), basing future evaluations on trial-based QoL and/or utility estimates would appear to be important given the differences identified here.

Few studies have assessed the impact of interferon therapy on health-related QoL using validated instruments. These results show that interferon has significant effects on QoL and symptomatology. Our associated economic analysis also showed that overall, adjuvant low-dose extended duration interferon therapy does not appear cost-effective in this patient group in the UK context.