Main

Hepatocellular carcinoma (HCC) is the fifth most common malignancy and the third leading cause of cancer deaths worldwide (El-Serag and Rudolph, 2007). However, its geographic distribution is varied and reflects the aetiology of the disease. Worldwide 54.4% of HCC are attributable to chronic hepatitis B and 31.1% to chronic hepatitis C (Parkin et al, 2005). The disease is endemic to the Asia-Pacific region, where 80% of HCC are found.

Surgical resection and (in carefully selected cases) liver transplantation remain the therapeutic modalities that most consistently prolong survival, but only 20% of patients are amenable (Hung, 2005) at diagnosis. Another 30% may benefit from loco-regional therapies (Llovet et al, 2003a). Before the initiation of this trial (AHCC02) (ClinicalTrials.gov number NCT00041275), phase III placebo-controlled randomised trials for inoperable HCC had failed to demonstrate survival benefit (Llovet and Bruix, 2003b; Nowak et al, 2004).

Specifically, although oestrogen is known to influence the growth of HCC, randomised controlled trials have not demonstrated benefit with the anti-oestrogen tamoxifen (Chow et al, 2002; Nowak et al, 2005), and this is attributed to the high incidence of oestrogen receptor mutations (Villa et al, 1996, 2000; Liu et al, 2000; Chow et al, 2001). Megestrol acetate (MA), a synthetic progestin with multiple drug actions and potent anti-oestrogen activity at the post-receptor level, is independent of oestrogen receptors. Consequently, MA has been widely used in the management of advanced breast carcinoma that is resistant to tamoxifen (Sedlacek, 1988).

Preclinical data have shown that human HCC xenografts respond to MA in vivo (Zhang and Chow, 2004). Initial clinical studies using MA at 160 mg day−1 in 46 and 11 patients, respectively, did not demonstrate objective tumour response (Colleoni et al, 1995; Chao et al, 1997). However, a phase III trial using MA at 320 mg day−1 in 102 patients with advanced HCC showed a significant doubling of median survival compared with placebo (Phornphutkul et al, 1996), and a study of 45 patients with inoperable HCC demonstrated a doubling of median survival using MA at 160 mg day−1 (Villa et al, 2001). The latter study suggested more pronounced results in patients with mutated oestrogen receptors. In the Asia-Pacific region, the majority of HCC patients have such mutations (Chow et al, 2001). Further, MA is frequently used in patients with advanced malignancies to improve quality of life (QOL) (Bruera et al, 1998), especially with respect to appetite (De Conno et al, 1998; Westman et al, 1999; Lesniak et al, 2008).

This paper reports on a multinational randomised double-blind placebo-controlled trial of the Asia-Pacific Hepatocellular Carcinoma (AHCC) Trials Group to assess the efficacy of MA in patients with advanced HCC in terms of overall survival (OS) and QOL. Given the positive results of the previous phase III trial using MA at 320 mg and the negative results with the two smaller phase II trials using MA at 160 mg, this trial compares MA at 320 mg vs placebo.

During the course of this trial, preliminary results were released from another placebo-controlled randomised trial (Sorafenib Hepatocellular Carcinoma Assessment Randomised Protocol (SHARP); Llovet et al, 2008) (ClinicalTrials.gov number NCT00105443) in advanced HCC patients who had good functional reserves. The results of that trial, and a similar trial of sorafenib vs placebo in the Asia-Pacific (SAP) (Cheng et al, 2009, ClinicalTrials.gov number NCT00492752) were subsequently published. The impact of this preliminary information on our AHCC02 trial is reported.

Patients and methods

Patients

AHCC02 enrolled treatment-naive patients with advanced HCC from clinicians in seven specialist clinical centres in six Asia-Pacific nations: Myanmar, New Zealand, the Philippines, Singapore, South Korea and Vietnam. Eligible patients were representative of advanced HCC patients in the region for whom there was no proven standard of care at that time. Diagnosis of HCC was defined by positive histology, demonstration of a space-occupying lesion of the liver by non-dynamic imaging (ultrasound, CT scan or MRI) in patients with either a serum α-fetoprotein level 400 μg l−1 or dense homogenous lipiodol retention, or radiological evidence of HCC by dynamic contrast-enhanced CT scan or MRI in patients with α-fetoprotein above the normal range and serology positive for viral hepatitis B or C. Patients were excluded if they had clinical encephalopathy, had received prior treatment for HCC (surgery including liver transplantation, chemo-embolisation, percutaneous ethanol injection and systemic chemotherapy) or were amenable to surgery. Other exclusion criteria were pregnancy, another malignancy within the last 5 years, and serum bilirubin >100 μmol l−1. The protocol conformed to the ethical guidelines of the 1975 Declaration of Helsinki and was approved by the Institutional Review Boards of the participating institutions. Written informed consent was obtained before enrolment.

Trial design

In addition to receiving best supportive care, patients were randomised to either placebo or MA 320 mg day−1 for 1 year. Both placebo and MA were obtained from Bristol-Myers-Squibb. The trial pharmacist at the Department of Pharmacy, Singapore General Hospital was responsible for supplying placebo and MA, packaged in a double-blind format, which were dispensed in bottles identified by patient trial number by a named clinician at each participating centre according to a randomisation code. At monthly reviews, the responsible blinded clinician checked for compliance in terms of dosage and frequency of drugs taken. After treatment was completed, patients were followed up every 3 months.

Randomisation, conducted through the Singapore Clinical Research Institute using a web-based system, was stratified by recruiting centre using blocks of nine in order to retain the study design's allocation ratio of 2 : 1 in favour of MA.
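
The stratified block scheme described above can be illustrated with a minimal sketch. It assumes simple permuted blocks of nine (six MA, three placebo) generated independently for each centre; the function names, centre labels and seed are hypothetical, and the trial's actual web-based system is not described in enough detail to reproduce.

```python
import random

def permuted_block(rng, block_size=9, ratio=(2, 1)):
    """Return one shuffled block with arms in the given MA:placebo ratio."""
    unit = sum(ratio)
    if block_size % unit != 0:
        raise ValueError("block size must be a multiple of the ratio total")
    repeats = block_size // unit
    block = ["MA"] * (ratio[0] * repeats) + ["placebo"] * (ratio[1] * repeats)
    rng.shuffle(block)
    return block

def allocation_list(centres, blocks_per_centre, seed=0):
    """Build a separate sequence of blocks for each centre (stratum)."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    return {
        centre: [arm
                 for _ in range(blocks_per_centre)
                 for arm in permuted_block(rng)]
        for centre in centres
    }

# Hypothetical centre labels, two blocks each.
schedule = allocation_list(["centre_A", "centre_B"], blocks_per_centre=2)
```

Within every completed block of nine the 2 : 1 ratio holds exactly, so each centre stays close to the design ratio even if recruitment stops partway through a block.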

End points

The primary end point was OS, defined as the time from randomisation to the time of death or, if appropriate, when the patient was last known to be alive. The secondary end point was QOL, assessed using the European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Questionnaire (Aaronson et al, 1993), which was completed at screening and at each follow-up visit by patients or named assisting personnel. The questionnaire is composed of five functional and nine symptom scales/single items, and a Global Health Status (GHS) scale. Scores (0–100) were calculated as recommended (Fayers et al, 2001). A better QOL is indicated by high scores on the GHS and functional scales, and low scores on the symptom scales. Any serious adverse events (SAEs) experienced by patients were elicited at each clinic visit.
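
As an illustration of how 0–100 scale scores are obtained, the sketch below applies the standard linear transformation, assuming the questionnaire is the QLQ-C30, in which ordinary items are answered 1–4 (range 3) and the two GHS items 1–7 (range 6). The function name and the example responses are hypothetical, and rules for missing items are not handled.

```python
def scale_score(items, item_range, functional):
    """Linear 0-100 transformation of a multi-item scale.

    items      : raw item responses belonging to one scale
    item_range : maximum minus minimum possible response (3 or 6)
    functional : True for functional scales, where a high score is good
    """
    raw = sum(items) / len(items)       # raw score = mean of the items
    fraction = (raw - 1) / item_range   # rescale to the interval 0..1
    if functional:
        return (1 - fraction) * 100     # reversed so that high = better function
    return fraction * 100               # symptom scales and GHS

# Hypothetical responses for one patient visit.
physical = scale_score([1, 2, 1, 1, 2], item_range=3, functional=True)
appetite = scale_score([3], item_range=3, functional=False)
ghs = scale_score([4, 4], item_range=6, functional=False)
```

With these conventions a high functional or GHS score and a low symptom score both indicate better QOL, which matches the interpretation given in the text.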

Statistical methods

Trial size

Previous experience in patients with advanced HCC and Eastern Cooperative Oncology Group (ECOG) performance rating of 0-2 in the Asia-Pacific suggested that OS at 6 months would be 25% (Chow et al, 2002). We anticipated that this might be improved to 45% with MA. On this basis, using an allocation ratio of 2 : 1 in favour of MA, a two-sided test of 5% and 90% power suggested that 220 patients would be required (Machin et al, 1997). However, the potential inclusion of patients with ECOG 3, comprising 12% with the poorest prognosis and thought less likely to benefit as much from MA (Chow et al, 2002), reduced the anticipated 6-month rate to 42.6%. The corresponding hazard ratio (HR)=0.62 revised the target to 280, which was increased to 300 patients to compensate for potential losses.
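
The figures in this paragraph can be reproduced with a short calculation. The sketch assumes proportional hazards, so that the HR equals the ratio of the log survival proportions at 6 months, and it assumes that the 12% of ECOG 3 patients gain no benefit and remain at 25%. That second assumption is one reading that reproduces the 42.6% figure; the calculation itself is not spelled out in the text, and the function names are illustrative.

```python
import math

def hazard_ratio(surv_control, surv_treated):
    """HR implied by two survival proportions at the same time point,
    assuming proportional hazards: S_treated = S_control ** HR."""
    return math.log(surv_treated) / math.log(surv_control)

def mixed_survival(surv_benefit, surv_no_benefit, frac_no_benefit):
    """Survival proportion in a mixture of two patient subgroups."""
    return (1 - frac_no_benefit) * surv_benefit + frac_no_benefit * surv_no_benefit

# Original design: 25% with placebo vs 45% with MA at 6 months.
hr_original = hazard_ratio(0.25, 0.45)

# Revised design: 12% ECOG 3 patients assumed to stay at 25%.
surv_revised = mixed_survival(0.45, 0.25, 0.12)
hr_revised = hazard_ratio(0.25, surv_revised)

print(f"original HR {hr_original:.3f}, revised 6-month OS {surv_revised:.3f}, "
      f"revised HR {hr_revised:.3f}")
```

A revised HR closer to 1 means a smaller anticipated effect, which is why the target sample size rose from 220 to 280.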

Overall survival

Overall survival was summarised using the Kaplan–Meier technique and comparisons were made using Cox regression to estimate the HR, 95% CI and P-value. Although subgroup analyses were not part of the original design, we conducted analyses comparing the specific patients from AHCC02 who had the same characteristics (in terms of ECOG and Child-Pugh) as patients in the SHARP and SAP trials. This led to a more detailed examination of the AHCC02 data with respect to Child-Pugh class and ECOG status.
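
For readers unfamiliar with the Kaplan–Meier technique, the following is a minimal sketch of the product-limit estimate. The follow-up times and event indicators are invented purely for illustration and are not trial data; the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one death.

    times  : follow-up time for each patient
    events : 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve

# Invented example: six patients, two of them censored.
curve = kaplan_meier([1, 2, 2, 3, 5, 6], [1, 1, 0, 1, 0, 1])
```

Patients censored at a given time are counted as at risk at that time, which is the usual convention.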

Quality of life

Graphical plots were used to explore the pattern of QOL changes over time. However, since QOL data take relatively few values, such graphs have common plotting points for many patients and thereby obscure individual patient contributions at these points. To compensate, we introduced a small amount of jittering of the plotting positions (surrounding each true position) to reveal any multiplicity of observations. As QOL was assessed in each patient over several time points, the corresponding comparison between treatment groups was made using a linear model of the form:

QOL = β0 + βTreat × Treatment + βTime × t + βInteraction × (t × Treatment)

where Treatment=0 if placebo is given and 1 if MA is given, and t represents the time post-randomisation at which the assessments were made on the particular patient concerned. The regression coefficients βTreat and βTime are obtained using the data from all patients and represent the difference in QOL between treatments and the linear change in QOL over time, respectively. Thus, the null hypothesis of no difference between treatments is expressed by βTreat=0 and that of no change in QOL over time by βTime=0. Inclusion of the interaction term, with null hypothesis βInteraction=0, enables a test of whether differences between treatments remain constant over time. The coefficient β0 represents the average value of all QOL assessments made.

The procedure xtmixed in Stata (StataCorp., 2007) was used to fit this statistical model to each of the 15 EORTC end points. This model takes into account the knowledge of which item of data is from which patient and the variable number of observations per patient. The correlation structure of observations from the same patient has to be taken into account as does the possibility that each patient will have their own trajectory of change with time. Since only 20% of patients survived beyond 6 months from randomisation, the model fit is confined to this 6-month period. Once fitted, the model was used to estimate the differences between treatment groups at 3 and 6 months post-randomisation. Although the same statistical model is not the most appropriate for each QOL end point, the above reflects the major features of the data.

Results

Recruitment and trial monitoring

From March 2002 through June 2007, AHCC02 recruited 204 patients with advanced HCC (68% of target) before the preliminary results of the SHARP trial (Llovet et al, 2008) became available (4 June 2007). SHARP had enrolled 602 HCC patients with good liver function (predominantly Child-Pugh A) and ECOG 0-2 to compare oral sorafenib against placebo in terms of OS and time to symptomatic progression. The investigators found a median OS almost 3 months longer with sorafenib than with placebo (HR=0.69, 95% CI=0.55–0.87, P<0.001). Consequently, the ongoing AHCC02 was reviewed by the Trial Steering Committee, which concluded that the use of a placebo was no longer ethical. On the advice of the Steering Committee, AHCC02 was prematurely closed on 14 June 2007, a decision taken without a formal review of the still blinded AHCC02 trial data. One year after the SHARP results, similar benefits for sorafenib were announced from a parallel trial (SAP) (Cheng et al, 2009) in an all-Asian population of 226 patients (HR=0.68, 95% CI=0.50–0.93, P=0.014).

Patient characteristics

Figure 1 summarises the flow of the 204 randomised AHCC02 patients through the trial (135 randomised to MA and 69 randomised to placebo). The majority of patients were male (85.9%) and most were recruited from centres in Singapore (29.2%), Myanmar (27.0%) and Vietnam (20.0%) (Table 1). Chronic hepatitis B infection was identified in 58.4%, C in 8.6% and co-infection in 1.6%. The majority (102 out of 185, 55.1%) presented with ECOG 1 status and 14 out of 185 (7.6%) were ECOG 3. There were 37.8% of Child-Pugh A, 37.8% B and 13.0% C. Their median EORTC GHS was 50 (range 0–100). No data were received from one centre (and the centre was subsequently withdrawn from the trial). Final analysis was thus based on 185 patients.

Figure 1 Consort diagram.

Table 1 Demographic and baseline characteristics of 185 patients included in the intention-to-treat analysis

Overall survival

A total of 180 out of 185 (97.3%) patients had died by the time of final data analysis, including 59 out of 62 (95.2%) who received placebo and 121 out of 123 (98.4%) who received MA. The Kaplan–Meier OS estimates (Figure 2A) at 6 months were 21.76% for placebo and 15.01% for MA. Corresponding median values were 2.14 and 1.88 months, respectively, with an adverse HR=1.25 (95% CI=0.92–1.71, P=0.16) for MA. After adjusting individually for Child-Pugh class and ECOG status, the corresponding HRs for MA relative to placebo were 1.37 (CI=0.98–1.91, P=0.06) and 1.01 (CI=0.73–1.40, P=0.95), respectively. The former suggests a possible adverse effect of MA on OS, while the latter suggests no benefit of MA over placebo. The median period of administration of MA was 1.57 (range 0.03–13.67) months and that of placebo was 1.35 (range 0.03–15.16) months.
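
The reported P-values can be cross-checked against the HRs and confidence intervals. The sketch below assumes 95% intervals that are symmetric on the log scale (Wald-type) and a two-sided normal test; whether the trial used Wald or likelihood-ratio tests is not stated, so small discrepancies would not be surprising. The function name is illustrative.

```python
import math

def p_from_ci(hr, lower, upper, z_crit=1.959964):
    """Approximate two-sided P-value from an HR and its 95% CI.

    Assumes the interval is exp(log(HR) +/- z_crit * SE).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail area of the normal
    return z, p

for label, hr, lo, hi in [
    ("unadjusted", 1.25, 0.92, 1.71),
    ("adjusted for Child-Pugh", 1.37, 0.98, 1.91),
    ("adjusted for ECOG", 1.01, 0.73, 1.40),
]:
    z, p = p_from_ci(hr, lo, hi)
    print(f"{label}: z={z:.2f}, P={p:.2f}")
```

Under these assumptions the recomputed values agree with the published P-values of 0.16, 0.06 and 0.95 to two decimal places.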

Figure 2 AHCC02 Kaplan–Meier estimates of OS. (A) OS by treatment. (B) OS in good and poor risk groups as defined by Child-Pugh class and ECOG performance status.

Adverse events and quality of life

The SAEs were fairly evenly distributed between the two groups. Fifty SAEs (15 in placebo and 35 in MA) were reported in 38 patients (13 in placebo and 25 in MA). The SAEs are shown in Table 2. Considerable variation in GHS was seen between patients and over time, with a suggestion of increasing values with placebo but decreasing values with MA (Figure 3). An appropriate model, fitted using the assumption of an unstructured (or no pattern) correlation between successive within-patient measurements and a random effect for the assumed linear change over time for each patient, was:

Table 2 Serious adverse events

Figure 3 Changes in EORTC GHS over time by treatment group with the fitted model for the first 12 months post-randomisation.

For this model, the P-values for change in GHS over time, Treatment and Interaction (t × Treatment) were 0.877, 0.360 and 0.253, respectively, none of which was statistically significant. This suggests a general but small increase in GHS over the 6 months post-randomisation, with those receiving MA taking lower values. This adverse difference for MA was 8.97 at 3 months and 15.12 at 6 months post-randomisation (Table 3). The difference of 2.82 reported between treatments at baseline represents a random difference.
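
Under a linear model with a treatment-by-time interaction, the estimated difference between treatments at time t is βTreat + βInteraction × t. The sketch below uses only the differences quoted above (expressed as placebo minus MA) to back out the implied monthly change and to confirm that the three reported values lie on one straight line. The variable and function names are illustrative, and the individual fitted coefficients are not taken from the paper.

```python
def treatment_difference(diff_at_baseline, change_per_month, t):
    """Difference between treatments at t months under a linear
    treatment-by-time interaction."""
    return diff_at_baseline + change_per_month * t

baseline_diff = 2.82   # reported difference at baseline
diff_3m = 8.97         # reported difference at 3 months

# Implied widening of the gap per month.
slope = (diff_3m - baseline_diff) / 3

# Extrapolate along the same line to 6 months.
diff_6m = treatment_difference(baseline_diff, slope, 6)
print(f"slope {slope:.2f} per month, 6-month difference {diff_6m:.2f}")
```

The extrapolated 6-month value matches the reported 15.12, as expected when all three estimates come from the same fitted line.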

Table 3 Model estimates of EORTC QOL mean symptom and scale values at baseline, 6 and 12 months post-randomisation and the corresponding estimated differences between treatments

Patients who received placebo generally had a more favourable QOL profile, though for most scales the estimated differences were negligible. Emotional functioning marginally favoured MA but the advantage at 6 months was smaller in magnitude than the random difference at baseline. Megestrol acetate was also associated with favourable reductions in levels of appetite loss and nausea/vomiting compared with placebo.

Discussion

When this multicentre multinational double-blind trial of patients with advanced treatment-naive HCC commenced in 2002, it was designed to reflect the actual patient population with inoperable HCC in the Asia-Pacific rather than the better subset of patients with Child-Pugh A liver function. At the time the trial started, there were no phase III data to suggest that any treatment was superior to placebo or best supportive care in patients with inoperable HCC.

This prematurely terminated trial failed to show the anticipated increase in OS with MA over placebo, and patients on placebo generally had better outcomes in terms of QOL. Interestingly, in spite of its premature termination and due to the greater-than-expected death rate with MA, the analysis presented here includes information on the exact number of deaths (n=180) that had been expected with full recruitment, hence preserving the robustness of the planned data analysis.

Patients recruited to the SHARP and SAP trials were all ECOG 0-2 and the vast majority were Child-Pugh A (581 out of 602 (96.5%) and 220 out of 226 (97.3%), respectively). In contrast, in AHCC02, 82 out of 185 (44.3%) patients were Child-Pugh A and ECOG 0, 1 or 2 (Table 1). These patients were classified as having good functional status and the remainder, including five cases in which the category was unknown, were defined as having poor functional status. The corresponding Kaplan–Meier estimates of OS (Figure 2B) indicate an adverse effect of MA in those of poor functional status (HR=1.79, CI=1.15–2.80, P=0.01), but suggest a possible benefit in those of good functional status (HR=0.82, CI=0.51–1.33, P=0.42). Interestingly, 6-month OS differed markedly between AHCC02 patients of good status who were on placebo (22%) and similar SAP trial patients who were on placebo (37%).

The estimated HRs for comparing treatments of patients with Child-Pugh score A vs B–C (Figure 4) suggest a possible advantage with MA in those of Child-Pugh A. In contrast, comparing ECOG groups 0-1 and 2-3 suggests near equivalence of the two regimens.

Figure 4 Forest plot for the treatment comparisons within each Child-Pugh and ECOG group.

In our subgroup analyses, we found little difference in OS between treatments among those who had good functional status, but evidence of an adverse outcome with MA in those who did not. Although we acknowledge the limitations and non-confirmatory nature of findings from such a subset analysis, our results are suggestive of potential differentials among the prognostic groups. Our results also suggest that outcomes of systemic therapy for HCC patients with good functional status may not necessarily apply to patients with poor functional status. Hepatocellular carcinoma is a complex cancer where OS is significantly influenced by underlying liver disease that impacts drug metabolism and toxicities. Thus, while it is tempting to extrapolate the results of positive trials in patients with good functional status to patients with poor functional status, such extrapolation must be treated with caution in the absence of suitably powered double-blind trials (Kelley and Venook, 2008).

The markedly lower OS for AHCC02 patients of good status who were on placebo compared with SAP patients on placebo is remarkable given that patients in both trials were largely similar in ethnicity and aetiology for HCC. A possible reason for this discrepancy lies in differences in eligibility criteria. AHCC02 enrolled only patients who were treatment naive and thus had advanced disease at the time of diagnosis, while SAP also enrolled patients who had previously received other loco-regional therapies such as surgery, radiotherapy, hepatic artery embolisation, chemo-embolisation, radiofrequency ablation, percutaneous therapy or cryoablation. The natural history of such patients differs from that of those who have advanced disease at diagnosis. For example, Asian patients with recurrent HCC following the initial surgery have 1-, 3- and 5-year survival rates of 77%, 49% and 26%, respectively, after tumour recurrence (Poon et al, 2002). Similarly, the largest reported case series in Asia showed that selected Child-Pugh A patients with advanced HCC who received hepatic artery embolisation had 1-, 3- and 5-year survival rates of 29.5%, 6.0% and 4.4%, respectively (Wang et al, 2008). The natural history of these patients is significantly better than that of the treatment-naive patients with advanced disease on the placebo arm of our trial, which showed a 1-year survival rate of 9.1% for those with Child-Pugh A and ECOG 0-1. Similarly, we found a 1-year survival rate for such patients of 11.3% in our previous AHCC01 trial, which recruited a sample consistent with the present trial regarding ethnic composition, treatment naivety and proportion of patients of poor functional status (Child-Pugh B, C or ECOG 3). The SAP trial did not report the proportion that had received loco-regional therapies, but the inclusion of such patients in randomised trials of systemic therapy would be expected to result in a better OS when compared with trials that only recruit patients with advanced disease at presentation.

Regarding QOL, patients on placebo experienced little change overall in their EORTC GHS from baseline to 6 months, while those on MA exhibited a decline. Of the five functional scales, all except emotional functioning (which demonstrated little difference between treatments) suggested less favourable values with MA compared with placebo. Among the nine symptom scales, appetite loss and nausea/vomiting improved with MA compared with placebo, while all other scales marginally worsened with MA.

In conclusion, the AHCC02 trial suggests that those receiving treatment for HCC who had poor functional status showed a poorer outcome with MA than with placebo, while those of good functional status showed similar outcomes with both treatments. Overall, though we found some benefit with MA with respect to improving appetite and reducing nausea/vomiting, the GHS and other aspects of QOL showed no improvement with MA. Consequently, we recommend that MA not be used for the treatment of HCC.