Introduction

Morbid obesity is associated with a range of serious health complications and reduced health-related quality of life (HRQL) [1], and patients suffering from morbid obesity also have an increased risk for symptoms of anxiety and depression [2]. The only treatment documented to provide sustained weight loss in patients with morbid obesity is bariatric surgery [3–7].

Improvement in HRQL is the main objective of surgical treatment in morbid obesity [8]. Several studies have shown that morbidly obese patients experience a dramatic improvement in HRQL after surgery [9–15]. Although surgery may lead to a significant relief from anxiety and depression [16], the improvements may wane with time [9, 17].

There are several surgical procedures for weight loss, such as gastric bypass, gastric banding, sleeve gastrectomy, and biliopancreatic diversion without (BPD) or with the duodenal switch (BPDDS) [18]. BPDDS seems to be the most effective surgical procedure for weight loss and weight loss maintenance [3, 19, 20], but there are few long-term data (≥5 years) on HRQL after BPDDS. Only one cross-sectional study has shown that severely obese patients who underwent BPDDS had better HRQL than a group of patients prior to receiving BPDDS surgery [21].

The main aim of this study was to evaluate changes in HRQL from baseline to 5 years after BPDDS. The primary outcomes were the two summary scores of the Short Form 36: the physical component score (PCS) and the mental component score (MCS). We hypothesized that significant improvements would occur in both these measurements of HRQL.

Methods

Study Design and Patients

Our bariatric surgery program was initiated in 2001, and the first 51 patients who were accepted for BPDDS were invited to participate in the present study. The inclusion criteria were body mass index (BMI) ≥ 40.0 or 35.0–39.9 with obesity-related comorbidities, age 18–60 years, no alcohol or drug problems, no active psychosis, and failure to lose weight through other methods. Patients were assessed before surgery and 1, 2, and 5 years after surgery by self-reported questionnaires.

Power calculations were performed using a two-sided paired test (predicted effect size (ES) = 0.6, providing 90 % power, P < 0.05), indicating that at least 32 paired observations would be required to detect changes in the health-related quality of life scores. To ensure that the study was robust concerning missing data, 51 patients were recruited.
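The reported sample size can be approximately reproduced with a standard normal-approximation formula for a two-sided paired test. The sketch below is only an illustration under stated assumptions: the function name is hypothetical, and the small-sample correction term (z²/2) is a common textbook adjustment that the paper does not say was used.

```python
import math
from statistics import NormalDist


def paired_sample_size(effect_size, power=0.90, alpha=0.05):
    """Approximate number of paired observations for a two-sided paired test.

    Uses the normal approximation plus a z^2/2 small-sample correction.
    Both the formula and the correction are illustrative assumptions,
    not the authors' documented procedure.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    n = ((z_alpha + z_beta) / effect_size) ** 2 + z_alpha ** 2 / 2
    return math.ceil(n)


# Inputs taken from the text: ES = 0.6, 90 % power, two-sided alpha = 0.05.
print(paired_sample_size(0.6))  # 32
```

With these inputs the approximation gives 32 pairs, in line with the figure stated above; other software or exact t-based methods may differ by a pair or two.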

The Treatment: Biliopancreatic Diversion with Duodenal Switch

BPDDS was performed by resecting the greater curvature of the stomach, leaving a narrow gastric tube of 100 to 200 ml along the lesser curvature. The pylorus was left intact, and the duodenum was divided 3 to 4 cm distal to the pylorus. The small bowel was divided 200–300 cm above the cecum, and the proximal end of the distal small bowel was anastomosed to the proximal end of the duodenum (alimentary limb). The distal end of the proximal small bowel was anastomosed to the alimentary limb 50 to 100 cm above the cecum (common limb). Patients were encouraged to have protein-rich foods after the procedure and to take prescribed daily doses of vitamins and minerals [22].

Demographic Characteristics and Clinical Data

Data were obtained using a standardized form. The patients’ age, gender, educational level, employment status, marital status, and BMI were noted. Body weight was measured in light clothing without shoes to the nearest 0.1 kg. Height was measured in standing position without shoes to the nearest 0.01 m. BMI was calculated as weight divided by height squared (kg/m²). Percent excess body mass index loss (%EBMIL) from baseline to the 5-year follow-up was calculated using the following formula: 100 − [(follow-up BMI − 25) / (baseline BMI − 25)] × 100 [23].
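The BMI and %EBMIL definitions above can be written out as a short sketch. The function names are hypothetical, and the reference BMI of 25 is taken from the formula in the text.

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    if height_m <= 0:
        raise ValueError("height_m must be positive")
    return weight_kg / height_m ** 2


def percent_ebmil(baseline_bmi, follow_up_bmi, reference_bmi=25.0):
    """Percent excess BMI loss relative to a reference BMI (25 in the text)."""
    excess_at_baseline = baseline_bmi - reference_bmi
    if excess_at_baseline <= 0:
        raise ValueError("baseline BMI must exceed the reference BMI")
    excess_at_follow_up = follow_up_bmi - reference_bmi
    return 100 - (excess_at_follow_up / excess_at_baseline) * 100


# Illustration using group-level mean BMIs (51.7 before, 32.9 at 5 years).
print(round(percent_ebmil(51.7, 32.9), 1))  # 70.4
```

Applying the formula to group means gives about 70.4, which need not equal a mean of per-patient %EBMIL values, since averaging ratios and taking the ratio of averages are different operations.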

Outcome Variables

Medical Outcomes Study Short Form 36

The Short Form 36 (SF-36; Norwegian version 1.2) is a well-established generic measure of the health burden of chronic diseases [24]. The questionnaire has demonstrated good validity and reliability [25]. SF-36 assesses eight dimensions of physical and mental health, each ranging from 100 (optimal) to 0 (poorest). The subscales of physical functioning, physical role function, and bodily pain reflect physical health, and emotional role function and mental health reflect mental health status. The subscales of general health, vitality, and social functioning reflect both physical and mental health. The eight SF-36 subscales can also be factor-analyzed and reduced to two summary scores, PCS and MCS [24]. To calculate the PCS and MCS, we used the oblique method, which allows for the correlation of physical and mental health [26]. A higher score on either scale represents better physical or mental health. PCS and MCS are each standardized so that a difference of 2–4.9 points can be interpreted as a small effect size; 5–7.9 points, a medium effect size; and 8+ points, a large effect size [26, 27]. Normative data on the SF-36 were obtained from the Norwegian survey in 2002 (N = 5,396) [28].
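The point thresholds for interpreting PCS and MCS differences can be expressed as a small lookup. This is a minimal sketch: the function name is hypothetical, the label "negligible" for differences under 2 points is an assumption (the text does not name that range), and the gaps between bands (for example 4.9 to 5) are treated as continuous.

```python
def sf36_summary_effect(difference_points):
    """Classify a PCS or MCS difference using the thresholds given in the text.

    'negligible' for differences below 2 points is an assumed label.
    """
    size = abs(difference_points)  # direction does not affect the magnitude band
    if size >= 8:
        return "large"
    if size >= 5:
        return "medium"
    if size >= 2:
        return "small"
    return "negligible"


for diff in (1.0, 3.5, 6.0, 9.2):
    print(diff, sf36_summary_effect(diff))
```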

Hospital Anxiety and Depression Scale

To evaluate symptoms of anxiety and depression, patients were administered the Hospital Anxiety and Depression Scale (HADS), a self-reported questionnaire. HADS is composed of 14 items, with 7 items assessing anxiety and 7 items assessing depression. The item scores for anxiety and depression were added separately and given subscale scores, each ranging from 0 to 21. A score >10 on either subscale indicates a probable case of mood disorder, a score of 8–10 indicates a possible case, and a score <8 is within normal range [29]. The HADS has shown good case-finding properties in primary care and hospital settings for anxiety and depression according to the Diagnostic and Statistical Manual of Mental Disorders and International Classification of Diseases [30]. The HADS is well suited for detecting mood disorders among obese people and has shown good responsiveness to change in patients operated for morbid obesity [9]. Effect sizes for the HADS scores can be interpreted according to standard deviation units from the norm population. Thus, a difference in HADS–anxiety (HADSA) of 0.7 to 1.6 points can be interpreted as a small effect size; 1.7 to 2.6 points, a medium effect size; and 2.7+ points, a large effect size [27]. Similarly, a difference in HADS–depression (HADSD) of 0.6 to 1.6 points can be interpreted as a small effect size; 1.7 to 2.5 points, a medium effect size; and 2.5+ points, a large effect size. Normative data on the HADS were obtained from the Nord-Trøndelag Health Study in Norway (1994–1996) [31, 32].
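The subscale scoring and case cut-offs described above can be sketched as follows. The function names are hypothetical, and the per-item range of 0–3 is an assumption inferred from seven items summing to a 0–21 subscale; which questionnaire items belong to which subscale is not specified here, so the caller supplies the seven item scores.

```python
def hads_subscale_score(item_scores):
    """Sum seven item scores (each assumed to be 0-3) into a 0-21 subscale score."""
    if len(item_scores) != 7:
        raise ValueError("a HADS subscale has exactly 7 items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item score is assumed to be 0, 1, 2, or 3")
    return sum(item_scores)


def hads_category(subscale_score):
    """Map a subscale score to the categories given in the text."""
    if not 0 <= subscale_score <= 21:
        raise ValueError("subscale score must be between 0 and 21")
    if subscale_score > 10:
        return "probable case"
    if subscale_score >= 8:
        return "possible case"
    return "normal range"


# Hypothetical responses for one subscale, for illustration only.
example_items = [2, 1, 2, 1, 1, 2, 0]
total = hads_subscale_score(example_items)
print(total, hads_category(total))  # 9 possible case
```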

Statistical Analysis

To estimate changes over time, we used a linear mixed model. Time was included in the model as a categorical variable, whereas the intraindividual correlation was modeled using an unstructured variance–covariance matrix. Changes from the pre-surgery state to the 5-year follow-up constitute the main analysis of this study. We also report P values for changes from the highest score at the 1- or 2-year follow-up to the 5-year follow-up. Pearson’s r was used to study the correlation between change in BMI units and change in HRQL.
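For readers unfamiliar with the correlation measure, Pearson's r between two change scores can be computed as below. This is a minimal sketch, not the SPSS procedure used in the study; the function name is hypothetical and the example values are made up purely for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical data: BMI units lost and change in an HRQL score per patient.
bmi_units_lost = [12.0, 18.5, 22.0, 15.0, 25.5]
hrql_change = [10.0, 20.0, 18.0, 12.0, 30.0]
print(round(pearson_r(bmi_units_lost, hrql_change), 2))
```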

Normative scores for the SF-36 and HADS for the 5-year follow-up were adjusted for (1) age and gender and then for (2) age, gender, and BMI to reflect the same distributions as that of our sample. Analysis of covariance (ANCOVA) was used to calculate normative scores for the SF-36. The calculation of normative HADS scores was based on published results [31, 32], and the method for this is described elsewhere [33]. ANCOVA (SF-36) and the one-sample test (HADS) were used to compare the patients’ scores after 5 years with the normative scores (both unadjusted and adjusted for BMI). Statistical analysis was performed with SPSS for Windows version 20.0. P values < 0.05 were considered statistically significant.

Ethics

The investigation conforms to the principles outlined in the Declaration of Helsinki. The study was approved by The Norwegian Social Science Data Services and by the Regional Committee of Ethics in Medicine, West Norway.

Results

All 51 patients (28 women and 23 men) who were invited gave their written informed consent. At the time of surgery, the mean age was 37.8 ± 8.1 years and the mean BMI was 51.7 ± 7.5 units. One patient died after 2 years and was excluded from the study. Forty-six patients (92 %) completed the SF-36 questionnaire and 44 patients (88 %) completed the HADS questionnaire 5 years after surgery.

We had complete data on all patients for change in BMI, and the mean BMI at 5 years was 32.9 ± 6.6 units (P < 0.001 from baseline) and was stable over time (Fig. 1). The average %EBMIL at 5 years was 70.1 ± 24.0.

Fig. 1 Body mass index before and after surgery. The body mass index before and 1, 2, and 5 years after DS. The bold lines represent mean values. The dotted line represents the upper limit for normal BMI

PCS improved from baseline to the 5-year follow-up (P < 0.001, ES = 1.58; Table 1, Fig. 2), but a significant decline was observed between 2 and 5 years (P = 0.001). The effect size for PCS compared to the norm population adjusted for age, gender, and BMI was small after 5 years (P = 0.135).

Table 1 SF-36 and HADS data in the patient group before and after BPDDS vs. general Norwegian population

Fig. 2 SF-36 summary scores before and after duodenal switch. PCS and MCS are presented as mean and 95 % CI. The dotted line represents the mean score for the normal population, adjusted for age, gender and BMI

MCS also improved significantly from baseline to the 5-year follow-up (P = 0.003, ES = 0.77; Table 1, Fig. 2), but a decline was observed between 1 and 5 years (P < 0.001). The effect size for MCS compared to the norm population adjusted for age, gender, and BMI was moderate at 5 years (P < 0.001).

The HADSA did not significantly improve from baseline to the 5-year follow-up (P = 0.139, ES = 0.29). This was due to a significant decline between the 2- and 5-year follow-up (P = 0.003; Table 1, Fig. 3). The effect size for anxiety compared to the normative population adjusted for age, gender, and BMI was moderate at 5 years (P < 0.001).

Fig. 3 HADS scores before and after duodenal switch. Anxiety and depression scores are presented as means and 95 % CIs. The dotted line represents the mean score for the normal population, adjusted for age, gender and BMI

The HADSD score improved significantly from baseline to the 5-year follow-up (P = 0.001, ES = 0.8), but it declined significantly between 2 and 5 years (P = 0.004; Table 1, Fig. 3). The effect sizes for HADSD compared to the normative population adjusted for age, gender, and BMI indicated no important difference at 5 years (P = 1).

Of all the HRQL measures, only changes in physical functioning at the 5-year follow-up were significantly correlated with weight loss (r = 0.45, P = 0.005).

Discussion

This is, to our knowledge, the first study that prospectively evaluated long-term changes in HRQL after BPDDS. Although Marceau et al. [21] also showed good long-term results for HRQL after BPDDS, their results were cross-sectional. Despite small declines in HRQL scores after the 1- and 2-year follow-ups, the scores at the 5-year follow-up remained higher than at baseline, both statistically and clinically. The HADSD score, not the HADSA score, showed large and sustained improvements after surgery.

Our findings are in accordance with the studies by Karlsson et al. [9] and Kolotkin et al. [15], who also found very large long-term improvements in HRQL after bariatric surgery (banding, vertical-banded gastroplasty, and gastric bypass), despite the small early declines in HRQL. Other studies with a range of surgical procedures have reported similar findings [12–14, 20, 34].

The Swedish Obese Subjects (SOS) intervention study [9] demonstrated that long-lasting weight reduction in the severely obese has a general long-standing positive outcome on HRQL. The SOS study found that the pattern of change in HRQL scores corresponded with weight loss, weight regain, and weight stability. Kolotkin et al. [15] found that PCS and the Impact of Weight on Quality of Life-Lite questionnaire (IWQOL-Lite) [35] correlated strongly with weight loss, while the mental aspects of HRQL, as assessed by MCS, did not. In our study, there were no significant correlations between weight loss and change in HRQL, except for the SF-36 subscore physical functioning. Thus, this score is probably especially sensitive to weight changes. It might be that in our study, many patients had a weight loss that exceeded a threshold above which little further improvement in nonphysical HRQL could be observed. Although HRQL has been measured differently in several studies, the consistency of findings across studies suggests that there is a positive relationship between weight loss and improvement of HRQL [9, 15, 36].

Our comparisons of HRQL at the 5-year follow-up with the normative Norwegian population, both unadjusted and adjusted for BMI, revealed interesting results. The PCS score was significantly different from the normative population adjusted for age and gender, but the difference disappeared when we adjusted for BMI. Thus, the patients in this study had a PCS comparable to that of persons with similar BMI in the general population. The MCS score tells a different story, as it differed from the norm population adjusted for age and gender and also after adjustment for BMI. Thus, the lower MCS was unrelated to BMI. We can only speculate on this, but the HADS scores may shed some light on this issue. The HADSD score, not the HADSA score, showed large and sustained improvements after surgery, which is in agreement with findings in other studies [37–39]. The patients may face challenges related to self-concept, social relations, and skill acquisition [40], and some of them may regain some of their weight [38]. How patients cope with these matters might have an influence on anxiety in particular.

Two of the strengths of the present study are the long-term follow-up for 5 years and a very high response rate (92 %). Another strength of this study is our use of well-validated HRQL instruments that allowed us to compare the results with population norms. However, there are also several limitations to our study, and one is the lack of a control group. Some of the change in HRQL could be due to factors other than surgery. For ethical reasons, one cannot randomize patients with morbid obesity in clinical studies. With this background, it is held that prospective cohort studies with clear treatment goals, careful monitoring of the individuals, and long follow-up (at least 5 years) are a design well suited for evaluating treatment of morbid obesity [41]. We are well aware that an obesity-specific HRQL questionnaire, such as the IWQOL-Lite, would have been very useful in this study as well, because its items deal with concerns that are specific to obese persons. Finally, questionnaires cannot describe the patient’s life situation in a deeper and broader sense. That requires other methods, for example, qualitative interviews.

Conclusions

Overall, this study shows that BPDDS was associated with great improvements in HRQL 5 years after surgery. The PCS was comparable to the population norm, while MCS was lower. This may be due to the sustained reduction in HADSD but not in HADSA. Further long-term studies should be performed to confirm these findings and other outcomes after BPDDS.