Introduction

Despite recent advances in cancer therapies and increased availability of treatment options, metastatic breast cancer (MBC) remains incurable. The main treatment goals are to achieve disease control, preferably through prolonging overall survival (OS), and to delay or prevent debilitating disease symptoms [1, 2]. The response to chemotherapy decreases with each additional line of therapy: the response rate may be as low as 15 % in patients who have received up to two prior therapies, and treatment is associated with significant toxicity and relatively low OS [3, 4]. In this situation, the measurement of patient-reported symptom experience and health-related quality of life (HRQoL) can provide additional information to evaluate and compare the efficacy and toxicity profiles of the treatments. Further, incorporation of patient-reported outcomes into toxicity reporting in clinical trials has been recommended to overcome the potential underreporting of the severity of subjective adverse events by physicians [5–7].

Treatment side-effects (even those of lower grade) may adversely impact patient well-being [8, 9]. For example, chemotherapy-induced gastrointestinal symptoms (nausea, vomiting, and diarrhea) are not life-threatening, but are associated with worse HRQoL [10]. In contrast, uncomplicated neutropenia is often not associated with significant symptoms.

Eribulin mesylate (eribulin), a novel microtubule dynamics inhibitor, was the first single agent shown to improve survival in patients with heavily pretreated MBC in the phase 3, randomized study 305/EMBRACE trial, where patients receiving eribulin experienced 2.7 months longer median OS than those receiving treatment of physician’s choice (hazard ratio (HR) 0.81; 95 % confidence interval (CI) 0.67–0.96; P = 0.014) [11]. HRQoL was not assessed in the trial due to the variety of treatments and schedules in the control arm (treatment of physician’s choice).

HRQoL was, however, a prespecified secondary endpoint in a second phase 3, open-label, randomized trial (study 301) that evaluated eribulin versus capecitabine as first- to third-line treatment in pretreated patients with MBC [12]. The difference in OS between the eribulin and capecitabine arms (15.9 vs. 14.5 months, respectively) was not statistically significant (HR 0.88; 95 % CI 0.77–1.00; P = 0.056). Overall, the safety and tolerability profiles of the treatments were comparable: nausea was common with both treatments; in addition, eribulin more commonly led to neutropenia, alopecia, leukopenia, and global peripheral neuropathy, whereas capecitabine was more often associated with hand-foot syndrome and diarrhea. Similar improvements in patients’ HRQoL over time, measured by the global health status (GHS)/quality of life (QoL) subscale of the European Organisation for Research and Treatment of Cancer Quality-of-life Questionnaire-Core 30 questions (EORTC QLQ-C30), were observed in both treatment arms [12].

These two trials led to the approval of eribulin as a monotherapy for patients with MBC who have previously received at least one (European Union) or two (United States) chemotherapeutic regimens for advanced/metastatic disease, where prior therapy included an anthracycline and a taxane in the adjuvant or metastatic setting [13, 14].

Here we compare and further evaluate the clinical impact of eribulin and capecitabine on patients’ symptoms/side-effects, functioning, and HRQoL in study 301 [12] to better understand the quality of survival in patients with MBC. The analysis and interpretation of the results are based on a model that posits biological factors associated with a disease or its treatment lead to symptoms that influence functional status, which then impacts on overall HRQoL [15]. The specific objectives of the current post hoc analyses were to:

  (a) Compare physical symptoms, functional scores, and GHS/QoL in patients treated with eribulin versus capecitabine over time;

  (b) Estimate the proportion of patients experiencing clinically meaningful changes in HRQoL scales; and

  (c) Compare the time to meaningful deterioration of HRQoL in both treatment arms.

Subgroup analyses in patients with human epidermal growth factor receptor 2 (HER2)-negative and triple-negative disease status were also performed.

Methods

Patients

The population enrolled (E7389-G000-301; ClinicalTrials.gov identifier: NCT00337103) has been previously described [12]. In brief, women (aged ≥18 years) with histologically or cytologically confirmed breast cancer, who had received ≤3 prior chemotherapy regimens (≤2 for advanced and/or metastatic disease) including prior therapy with an anthracycline and a taxane, were eligible for study inclusion.

Study design

The study was an open-label, 2-arm, parallel, multicenter, phase 3 trial in which patients were stratified at randomization by geographic region (North America, Western Europe, Eastern Europe, Latin America, South Africa, and Asia) and HER2 status (positive, negative, or unknown). Patients were randomized (1:1) to receive 21-day cycles comprising eribulin mesylate 1.4 mg/m² (equivalent to 1.23 mg/m² of eribulin expressed as free base) intravenously over 2–5 min on days 1 and 8, or capecitabine 1.25 g/m² orally twice daily on days 1–14. Patients received study treatment until disease progression, unacceptable toxicity, or patient/investigator request to discontinue. Grade 3 and 4 toxicities, including certain grade 2 toxicities with capecitabine, were managed by dose modification including dose reduction, treatment interruption, and/or symptomatic treatment [12].

HRQoL assessment

HRQoL was a secondary endpoint in this study; the principal prespecified HRQoL outcome was overall GHS/QoL at week 6, and has been reported previously in brief [12]. The results reported here are based on additional post hoc analyses of the study data.

HRQoL was assessed using EORTC QLQ-C30 (version 3.0) [16, 17] and the breast module-23 questions (QLQ-BR23; version 1.0) [18]. The QLQ-C30 consists of 30 questions addressing five functional scales (cognitive, emotional, physical, social, and role), nine symptom scales (appetite loss, constipation, diarrhea, dyspnea, fatigue, financial difficulties, insomnia, nausea and vomiting, and pain), and one GHS/QoL scale. The EORTC QLQ-BR23 focuses on breast-cancer-specific issues and includes 23 questions addressing four functional (body image, future perspective, sexual enjoyment, and sexual functioning) and symptom scales (arm symptoms, breast symptoms, systemic therapy side-effects, and upset by hair loss) [19]. All scores for the EORTC QLQ-C30 and EORTC QLQ-BR23 were transformed to a scale from 0 to 100 [19]. Higher scores in the functional scales and GHS/QoL represent a superior level of functioning and better HRQoL, whereas higher scores in the symptom scales or items represent worse symptoms.
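The 0 to 100 transformation mentioned above can be sketched as follows. This is a minimal illustration that assumes the standard linear transformation described in the EORTC scoring manual (raw score = mean of the items in a scale; item range = 3 for the 4-point items and 6 for the 7-point GHS/QoL items). The function names are illustrative and are not taken from the study, and the handling of missing items is not covered.

```python
from statistics import mean


def raw_score(item_responses):
    """Raw scale score: the mean of the answered items in the scale."""
    return mean(item_responses)


def functional_score(item_responses, item_range=3):
    """Functional scales: higher transformed score = better functioning."""
    rs = raw_score(item_responses)
    return (1 - (rs - 1) / item_range) * 100


def symptom_score(item_responses, item_range=3):
    """Symptom scales/items: higher transformed score = worse symptoms."""
    rs = raw_score(item_responses)
    return (rs - 1) / item_range * 100


def global_health_score(item_responses):
    """GHS/QoL: items are rated 1-7 (range 6); higher score = better HRQoL."""
    rs = raw_score(item_responses)
    return (rs - 1) / 6 * 100


# Illustrative responses (not study data)
print(functional_score([1, 2, 1, 1, 2]))  # responses near 1 give a score near 100
print(symptom_score([3, 4, 3]))           # responses near 4 give a score near 100
print(global_health_score([4, 4]))        # midpoint responses give 50
```

With this convention, a functional scale in which every item is answered "not at all" (1) scores 100, whereas a symptom scale answered the same way scores 0, which matches the direction of interpretation described in the text.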

The questionnaires were administered at baseline, week 6, and months 3, 6, 12, 18, and 24, or until disease progression or initiation of other antitumor treatment (including those initiated after study termination). The baseline EORTC questionnaires were completed in clinic before randomization. Subsequent questionnaires were completed in the clinic before any study-related procedures for that visit and before tumor assessment results were communicated to the patient. Patients were asked to complete questionnaires at each clinic visit, even if they had declined previously. Compliance was assessed by counting completed questionnaires.

Statistical analyses

The HRQoL population was defined as patients with QoL assessments at each time point within the intent-to-treat (ITT) population. Data were also analyzed separately for patients with HER2-negative or triple-negative disease. Analyses of patients with HER2-positive disease were not planned because of the anticipated smaller number of patients in this subgroup.

Compliance for completing the EORTC questionnaires was evaluated descriptively for each treatment group. Pattern-mixture models were used to account for data missing-not-at-random [20]. No imputation for missing data was conducted. Mixed models on a set of covariates based on expert opinion (baseline patient demographics such as age, HER2 status, hormone receptor status, Eastern Cooperative Oncology Group status, number of prior chemotherapy regimens for advanced disease, number of organs involved, visceral involvement, and disease-free interval >1 year prior to study) were performed to estimate the effect difference on repeated responses over a selected period of time and between treatment arms. Longitudinal analysis outcomes were expressed as least squares mean and standard error. To test the difference in least squares mean change from baseline between treatment arms, a 2-sided test with P ≤ 0.05 (unadjusted for multiplicity) was considered to be nominally statistically significant.

The minimally important difference (MID) was defined as the smallest difference in scores between groups in the scales of interest, which patients perceived as beneficial. Literature-based threshold values for MID were used for scales in the EORTC QLQ-C30 [21]. Because no MIDs have been published for the QLQ-BR23, a 10-point change was used, which is consistent with previous estimates [22]. For functional scales, an increase in change score from baseline of ≥1 MID was defined as “improved,” a decrease of ≥1 MID was defined as “worsened,” and a change in either direction of <1 MID was defined as “stable.” For symptom scales, the same criteria were applied with reverse direction. Proportions of patients classified as “improved,” “stable,” or “worsened” were calculated for each scale and cycle. Tests of proportions were done using chi-squared or Fisher’s exact tests, as appropriate. Cox analysis was used to compare the MID changes for eribulin versus capecitabine (using a reference HR of 1). Adjusted values are stated for the HR.
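The classification rule described above can be illustrated with a short sketch. The function name, the `scale_type` labels, and the example scores are hypothetical; the MID is passed in as a parameter because the thresholds differ by scale. GHS/QoL is treated here like a functional scale, since higher scores are better.

```python
from collections import Counter


def classify_change(baseline, follow_up, mid, scale_type):
    """Classify a change from baseline as improved, stable, or worsened.

    scale_type: "functional" (higher is better; also used for GHS/QoL)
                or "symptom" (higher is worse, so the direction is reversed).
    """
    change = follow_up - baseline
    if scale_type == "symptom":
        change = -change  # a fall in symptom score counts as improvement
    elif scale_type != "functional":
        raise ValueError("scale_type must be 'functional' or 'symptom'")

    if change >= mid:
        return "improved"
    if change <= -mid:
        return "worsened"
    return "stable"


def category_proportions(categories):
    """Proportion of patients in each category at one assessment."""
    counts = Counter(categories)
    total = len(categories)
    return {name: counts[name] / total
            for name in ("improved", "stable", "worsened")}


# Hypothetical (baseline, follow-up) pairs on a symptom scale with MID 8
pairs = [(20, 28), (20, 10), (33, 35), (50, 67)]
labels = [classify_change(b, f, mid=8, scale_type="symptom") for b, f in pairs]
print(labels)
print(category_proportions(labels))
```

A change exactly equal to the MID is counted as a meaningful change, in line with the "≥1 MID" wording above.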

Time to symptom worsening (TSW) was defined as the time until clinically meaningful deterioration by a specified threshold for each patient-reported endpoint (such as the MID values) was observed. TSW was calculated for each HRQoL scale using Kaplan–Meier curves. A proportional hazards model (censoring on death, study drop-out, or study discontinuation) was used to estimate adjusted HR values of TSW plus each respective 95 % CI. For patients with >1 TSW event or who deteriorated without improvement, a generalized estimating equation was used to estimate the relative probabilities of observing TSW between treatment arms.
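To make the TSW definition concrete, the sketch below derives a time-to-worsening observation for one patient and computes a Kaplan–Meier median from a set of such observations. It is a simplified illustration, not the study's analysis code: it assumes that a patient without a worsening event is censored at the last available assessment, it makes no covariate adjustment, and all names and numbers are hypothetical.

```python
def time_to_worsening(baseline, visits, mid, higher_is_worse):
    """Return (time, event) for one patient and one scale.

    visits: list of (months, score) pairs ordered by time.
    event is 1 at the first visit with deterioration >= MID, else 0
    (censored at the last assessment; an assumption of this sketch).
    """
    last_time = 0.0
    for months, score in visits:
        change = score - baseline
        deterioration = change if higher_is_worse else -change
        if deterioration >= mid:
            return months, 1
        last_time = months
    return last_time, 0


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival) at event times."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0   # worsening events at time t
        n_leaving = 0  # events plus censored observations at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


def median_time(curve):
    """First time at which the survival estimate drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Hypothetical observations: months to worsening and event indicators
times = [2, 3, 3, 5, 8, 9]
events = [1, 1, 0, 1, 0, 1]
print(median_time(kaplan_meier(times, events)))
```

The median returned here is the unadjusted Kaplan–Meier median; the adjusted hazard ratios reported in the text come from a proportional hazards model, which this sketch does not attempt to reproduce.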

Results

HRQoL population

Of 1102 ITT patients randomized in study 301, 1062 (96.4 %) completed the EORTC questionnaire at baseline and thus formed the HRQoL population. The populations were broadly comparable between the treatment arms (Table 1a). The baseline scores for both questionnaires were similar (Table 1b). Across the symptom scales of the QLQ-C30, patients in both treatment arms had worse scores on fatigue, pain, insomnia, and financial difficulties (means >30). The scores on QLQ-C30 functional scales were generally good (mean values around and above 70), with the exception of the GHS/QoL scale, where mean scores around 50 suggest a significant impact of disease [23]. However, the breast-cancer-specific functional scales of the QLQ-BR23 showed an impact on all domains (mean scores 32–65), in particular on sexual functioning (mean score 14.0; Table 1b).

Table 1 Baseline (a) patient characteristics and demographics, (b) health-related quality-of-life scores

Compliance for completing the EORTC questionnaires during the study was ≥85 % until 12 months, but was lower at 18 and 24 months (73–83 %), and sample sizes decreased due to study attrition (Table 2). Due to smaller sample sizes, analyses after 6 months should be interpreted with caution.

Table 2 Proportion of patients completing questionnaires at scheduled visits

Treatment effects on symptoms

Exposure to both treatments during the study was comparable between the two arms. Patients in the eribulin arm received a median of six treatment cycles, whereas patients in the capecitabine arm received a median of five treatment cycles. Overall, 177 patients (eribulin: 32.5 %, capecitabine: 32.4 %) in either arm underwent dose reduction. The most common reasons for dose reduction were neutropenia in the eribulin arm (22.6 %), and palmar-plantar erythrodysesthesia syndrome (4.9 %) in the capecitabine arm.

During the course of the study, patients receiving capecitabine had comparatively more-severe symptoms (that is, higher symptom scores) for nausea and vomiting (P < 0.001; Fig. 1a and online resource Fig. S1; online resource Table S1) and diarrhea (P < 0.001) compared with those treated with eribulin (Fig. 1a). The differences were clinically significant, as a higher proportion of patients who received capecitabine versus eribulin experienced clinically meaningful worsening of nausea and vomiting (MID 8; HR 1.177 [95 % CI 1.013, 1.367]; P < 0.05) and diarrhea (MID 7; HR 1.189 [95 % CI 1.020, 1.385]; P < 0.05; Fig. 1b). Typically, the differences appeared to be greatest at 6 weeks, and declined thereafter.

Fig. 1 Effects of eribulin and capecitabine on physical symptom scales of the EORTC QLQ-C30 and QLQ-BR23. a differences in mean scores; b proportion of patients with worsened symptoms; c differences in median time to symptom worsening

In comparison, patients receiving eribulin had worse mean scores for the systemic therapy side-effects symptom scale (which included dry mouth, different tastes, irritated eyes, feeling ill, hot flushes, headaches, and hair loss; P < 0.001), and upset by hair loss (P < 0.05; Fig. 1a). A higher proportion of patients treated with eribulin experienced clinically meaningful worsening of systemic therapy side-effects than those treated with capecitabine (MID 10; HR 0.821 [95 % CI 0.707, 0.953]; P < 0.01; Fig. 1b).

The analysis of TSW supported the interpretation of the MID thresholds. Patients receiving capecitabine had significantly shorter TSW for nausea and vomiting (MID 8; 7.6 vs. 10.2 months; P < 0.05), and diarrhea (MID 7; 8.4 vs. 11.5 months; P < 0.05) than those treated with eribulin. Similarly, patients treated with eribulin had significantly shorter TSW for systemic therapy side-effects (MID 10; 7.6 vs. 9.7 months; P < 0.05; Fig. 1c) compared with those treated with capecitabine.

Treatment effects on patient functioning

In the longitudinal analyses, baseline HRQoL scores were significantly associated with the change in HRQoL across all EORTC scales (P < 0.001); that is, worse baseline scores were predictive of worse scores while on treatment. There were no differences between the two treatment arms in terms of impact on patients’ functioning over time, as measured by changes in EORTC QLQ-C30 scores for functional scales (Fig. 2a). However, patients receiving eribulin had comparatively worse scores on the body image (P < 0.001) and sexual functioning scales (P < 0.05), measured by QLQ-BR23, than those receiving capecitabine (Fig. 2a).

Fig. 2 Effects of eribulin and capecitabine on function scales of the EORTC QLQ-C30 and QLQ-BR23. a differences in mean scores; b proportion of patients with worsened symptoms; c differences in median time to symptom worsening

As indicated by the MID analysis, 10–35 % of patients in both treatment arms experienced a clinically significant worsening of their functioning, suggesting that the majority of patients experienced stable or improved functioning. No statistically significant differences over the course of the study were observed between the treatment groups, except that a higher proportion of patients receiving capecitabine reported a meaningful worsening on the future perspective scale than those receiving eribulin (MID 10; HR 1.173 [95 % CI 1.015, 1.356]; P < 0.05; Fig. 2b).

In the ITT population, median TSW was similar for the majority of the EORTC functional scales and the GHS/QoL scale, with only 1–2 months’ difference between the treatment arms. Patients receiving eribulin had significantly longer TSW for body image (MID 10; 8.9 vs. 6.0 months; P < 0.05) and future perspective (MID 10; 6.1 vs. 4.7 months; P < 0.05; Fig. 2c) than those treated with capecitabine.

Treatment effects in patient subgroups by receptor status

Overall, the results in the HER2-negative and triple-negative subgroups were similar to those in the overall population in all analyses (data not shown). However, in patients with triple-negative disease, significant differences were observed in the TSW analyses. Importantly, TSW in overall GHS/QoL was significantly longer in patients treated with eribulin than in those treated with capecitabine (median time 6.2 vs. 6.0 months; P < 0.01; Fig. 3). This difference in median TSW may not appear clinically meaningful; however, a separation of the Kaplan–Meier survival curves is observed beyond 6 months, which is likely to explain the statistically significant difference between the two treatments. The median TSWs were also longer in the eribulin arm compared with the capecitabine arm for fatigue (8.9 vs. 6.1 months; P < 0.01), nausea and vomiting (9.9 vs. 6.5 months; P < 0.05), pain (8.1 vs. 5.4 months; P < 0.05), and diarrhea (11.6 vs. 6.6 months; P < 0.01), as well as for the functional scales of body image (6.7 vs. 6.0 months; P < 0.05) and future perspective (6.0 vs. 4.8 months; P = 0.01). The TSW for systemic therapy side-effects appeared shorter for patients treated with eribulin than for those receiving capecitabine, but this difference was not statistically significant (4.9 vs. 7.2 months; P > 0.05).

Fig. 3 Effects of eribulin and capecitabine, in terms of time to symptom worsening, on overall global health status/quality-of-life scale of the EORTC QLQ-C30 in patients with triple-negative disease

Discussion

In this phase 3 trial comparing eribulin with capecitabine in patients with locally advanced or MBC previously treated with an anthracycline and a taxane, significant differences in physical symptoms/side-effects were observed, reflecting the different toxicity profiles of the drugs. Patients treated with capecitabine had worse scores, and more rapid TSW for gastrointestinal symptoms (nausea and vomiting, diarrhea), whereas patients treated with eribulin had worse scores for systemic therapy side-effects (dry mouth, food and drink taste, eyes painful, hair loss, feeling ill/unwell, hot flushes, headaches). These results were not only statistically significant in the longitudinal models but were also clinically meaningful, as measured by the MID analyses and TSW. Typically, the differences appeared to be greatest at 6 weeks, and declined thereafter. This is in alignment with the literature which suggests that during the course of their disease, over 50 % of patients will experience nausea and vomiting, with a large proportion experiencing these within the first week of treatment [10, 24].

Despite the above side-effects, the majority of patients (65–90 %) in both treatment groups maintained or improved their functioning relative to baseline. Based on the group-level data over time using the pattern-mixture model, patients treated with eribulin had worse body image scores than those receiving capecitabine. While this finding may seem contradictory to the TSW results, which show that patients treated with eribulin have longer TSW for body image than those treated with capecitabine, it can be explained by the nature of the two different approaches. TSW is an analytic approach that censors data on the time to meaningful decline by a predefined threshold, whereas the longitudinal evaluation of the raw score change is not censored. Therefore, while patients in the eribulin arm may have worse scores for these domains, their scores do not decline to the point of meaningful worsening more rapidly than those of patients treated with capecitabine.

Notably, in patients with triple-negative disease, eribulin also demonstrated a significant delay in time to symptom worsening of overall GHS/QoL, as well as fatigue, nausea and vomiting, diarrhea, and pain when compared with capecitabine. Although the 0.2-month improvement in median TSW of overall GHS/QoL with eribulin compared with capecitabine may not translate to an early clinical benefit, a larger separation of the curves is observed beyond 6 months, potentially due to a subset of patients responding well to treatment.

Capecitabine is widely considered to be a chemotherapy drug with manageable toxicity and a favorable risk:benefit profile in patients with MBC [25, 26]. It is often used as a first-line treatment in older and frail patients to achieve disease control without significant side-effects [25, 27]. The capecitabine dose used in this study (1.25 g/m² orally twice daily on days 1–14 per 21-day cycle) is approved by the United States Food and Drug Administration, and has been used in other clinical trials in patients with MBC [28–30]. A lower dose of capecitabine, typically 1000 mg/m² twice daily for 14 days per 21-day cycle, has also been investigated with similar efficacy but reportedly better tolerability [28]. In our study, treatment exposure to capecitabine was comparable to eribulin, with similar proportions of patients in both arms experiencing dose reduction. The observation that the impact of eribulin on patient functioning and HRQoL, despite worse systemic side-effect scores, is similar to that of capecitabine, is therefore noteworthy, especially from a clinical perspective.

HRQoL is increasingly being recognized as a valuable endpoint of cancer care by payers, regulators, prescribers, and patients. Patient-reported symptom endpoints have prognostic value in clinical trials and support the importance of assessing the patient’s views in the development of new therapies [31]. Therefore, while the development of new cancer therapies should clearly focus on improving efficacy, ideally in terms of OS, maximizing HRQoL is also an important goal, particularly in the setting of advanced disease. Where treatments have comparable tumor-related outcomes, but competing toxicity profiles or differing logistics of administration, HRQoL measurement may help patients and clinicians to guide treatment decisions [32]. The MID assessment described here was defined as the smallest difference in scores between groups in the scales of interest, which patients perceived as beneficial. Therefore, by definition, MID-based improvements are indicative of treatment benefit, and could guide physicians in making treatment decisions.

The measurement of HRQoL and interpretation of HRQoL outcomes are, however, challenging. A review of clinical trials in patients with MBC that included measurement of HRQoL concluded that this assessment does not always provide additional information that is not already clear from other clinical outcomes, such as toxicity [32]. Indeed, in the current study, HRQoL changes in individual symptom scales of the EORTC QLQ-C30 and QLQ-BR23 instruments detected differences on patient reporting corresponding to the known adverse event profile for each agent. Nevertheless, no differences between the two treatments were observed for functional and overall HRQoL outcomes. HRQoL instruments may, however, lack sufficient precision to detect differences between treatments among a relatively homogeneous clinical trial population [33]. Patients may also find it easier to rate individual symptom scales accurately than a global HRQoL scale [33].

Chemotherapy-induced gastrointestinal symptoms (nausea, vomiting, and diarrhea) are common side-effects that can be debilitating for patients [10, 11]. Importantly, diarrhea can interfere with cancer treatment by forcing dose delays or reductions [34]. Diarrhea is one of the principal dose-limiting toxicities of capecitabine and this is reflected in the study findings, with eribulin being associated with comparatively lower symptom scores for diarrhea as well as a lower proportion of patients reporting worsening diarrhea, and a longer TSW for this symptom. Patients on capecitabine reported worse nausea/vomiting than those on eribulin, whereas the scale scores for other systemic therapy side-effects were significantly worse with eribulin. Together, this information will contribute to discussions with patients during the treatment decision-making process. The observed toxicity experience of patients in this study (adverse events of any grade) was consistent with the patient-reported symptom experience: patients on capecitabine experienced more adverse events of diarrhea (29 vs. 14 % for eribulin), vomiting (17 vs. 12 % for eribulin), anorexia (15 vs. 13 % for eribulin), and nausea (24 vs. 22 % for eribulin), whereas patients on eribulin experienced more adverse events of alopecia (35 vs. 18 % for capecitabine) and fatigue (17 vs. 15 % for capecitabine) [35].

This HRQoL analysis has several strengths. It is based on data from a large-scale, randomized, clinical trial, using two well-validated instruments for data collection: the EORTC QLQ-C30 instrument and its breast-cancer-specific supplementary module, QLQ-BR23. Compliance was good throughout most of the study (≥80 % to the 12-month time point). Content validity for the symptom scales used has been established in patients with advanced breast cancer or MBC in a phase 2 study of eribulin (E7389-G000-211; ClinicalTrials.gov identifier: NCT00246090) [36]. However, the analyses may be limited by the sample size for QoL analysis, which decreased sharply from the 12-month time point due to death or discontinuation from the parent study; this is frequently observed in the metastatic setting due to the natural history of the disease. It should also be noted that the assessment instruments used for the study do not specifically capture the symptoms particularly related to capecitabine (namely, “hand-foot syndrome”) and eribulin (e.g., peripheral neuropathy).

In conclusion, the majority of patients with pretreated locally advanced or MBC who were treated with either eribulin or capecitabine did not experience an overall deterioration in functioning or GHS/QoL. Eribulin and capecitabine had similar effects on patients’ HRQoL scores, reflecting the known side-effect profiles of these two agents. Along with the efficacy and toxicity findings of study 301 [12] and study 305/EMBRACE [11], the HRQoL results from these study 301 analyses may help patients with MBC and their oncologists to make more informed treatment decisions.