Original Article
The SF-36 summary scores and their relation to mental disorders: Physical functioning may affect performance of the summary scores
Introduction
The Medical Outcomes Study 36-item Short-Form Health Survey (SF-36) [1] is the most widely used generic instrument for measuring quality of life (QOL). It is recommended for use in health policy evaluations, general population surveys, clinical research, and clinical practice [2], [3]. Moreover, it has proven useful for comparing the relative burden of different diseases on QOL [4], [5]. The SF-36 has been found to correlate substantially with the frequency and severity of many specific symptoms and problems. For example, Leidy et al. [6] found that several of the SF-36 scales were related to measures of depression and were sensitive to changes in depression over time. Russo et al. [7] reported that the SF-36 scales were related to psychiatric symptoms as measured by the Brief Psychiatric Rating Scale. The instrument has demonstrated sound psychometric properties across diverse clinical populations.
Summary scores were developed to aggregate the most highly correlated (redundant) subscales and to simplify analyses without substantial loss of information. There are two broad classes of summary scores. In the first, the Medical Outcomes Study (MOS) approach, Ware et al. [8] developed two component summary scores for the SF-36 using principal components analysis: the mental component summary (MCS) and the physical component summary (PCS). Orthogonal factor rotation was used in the construction of these summary scales. All eight subscales were used to compute both the physical and the mental summary scores, yielding two measures that are statistically independent. As a result of this scoring approach, three of the four “mental” scales have negative standardized scoring coefficients in the PCS and the four “physical” scales have negative standardized scoring coefficients in the MCS. These two factors were found to account for more than 80% of the reliable variance of the standard eight subscales and were easily interpreted in a general population [8].
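The scoring logic described above can be written compactly. The following is a sketch of the general form only: the symbols (s_k for the score on subscale k, μ_k and σ_k for the reference-population mean and standard deviation, w_k^P and w_k^M for the factor score coefficients) are notation introduced here for illustration, and the rescaling to a mean of 50 and a standard deviation of 10 is the norm-based convention commonly used for these scores, not something stated in this text.

```latex
% Standardize each of the eight subscale scores against a reference population
z_k = \frac{s_k - \mu_k}{\sigma_k}, \qquad k = 1, \dots, 8

% Every subscale enters both summary scores; some coefficients are negative
\mathrm{PCS} = 50 + 10 \sum_{k=1}^{8} w_k^{P} \, z_k, \qquad
\mathrm{MCS} = 50 + 10 \sum_{k=1}^{8} w_k^{M} \, z_k
```

Because the four physical subscales carry negative w_k^M, a below-average physical score (negative z_k) adds a positive term to the MCS.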
In the second approach, Hays et al. [9] presented an alternative instrument to the MOS SF-36, the RAND-36, which differs slightly in item summation but not in the wording of the items or the structure of the instrument. Its scoring approach for the summary scores is based on the assumption that the physical and mental health factors are correlated. Therefore, only four scales contribute to the PCS and four to the MCS. The scoring coefficients of the scales were obtained from oblique factor analysis (with Promax rotation).
Summary scores have some methodological features that make them advantageous for clinical research. These features include narrower confidence intervals (CIs), the elimination of floor and ceiling effects, and simpler analysis: fewer statistical tests are required, which avoids the problem of multiple testing.
The MOS summary scores have been used as primary or secondary outcome measures in clinical trials. Although a number of studies have confirmed the validity of these measures [10], there is an ongoing discussion about the scaling of the summary scores [10], [11], [12], [13], [14], [15]. Several studies showed that discrepancies between the subscale profile and the component scores of the SF-36 are attributable to the way in which these summary scores are calculated. The authors of these studies argued that the main problem in the scoring algorithm derives from the use of negatively weighted subscale factor score coefficients, which sometimes leads to clinically counterintuitive study results. As pointed out by Simon et al. [11], the physical functioning subscale makes a significant negative contribution to the computed MCS. Therefore, a condition associated with severe physical limitations and only modest psychological distress may appear to have no impact on overall mental health.
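The effect described by Simon et al. can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: the weights are hypothetical values chosen only to reproduce the sign patterns described in the text (they are not the published SF-36 or RAND-36 coefficients), the respondent's z-scores are invented, and the scale abbreviations and function name are introduced here.

```python
# Illustrative sketch only. All weights and z-scores below are hypothetical;
# they mimic the sign patterns described in the text, not published coefficients.

SCALES = ["PF", "RP", "BP", "GH", "VT", "SF", "RE", "MH"]

# Orthogonal (MOS-style) MCS: all eight scales contribute,
# and the four physical scales carry negative weights.
MCS_ORTHOGONAL = {
    "PF": -0.25, "RP": -0.10, "BP": -0.10, "GH": -0.05,
    "VT": 0.25, "SF": 0.25, "RE": 0.45, "MH": 0.50,
}

# Oblique (RAND-style) MCS: only the four mental scales contribute.
MCS_OBLIQUE = {"VT": 0.25, "SF": 0.25, "RE": 0.30, "MH": 0.35}


def summary_score(z_scores, weights):
    """Weighted sum of scale z-scores, rescaled to a 50/10 metric.

    Scales missing from `weights` contribute nothing (weight 0).
    """
    raw = sum(weights.get(scale, 0.0) * z_scores[scale] for scale in SCALES)
    return 50.0 + 10.0 * raw


# Hypothetical respondent: severe physical limitations,
# modest psychological distress (all values are z-scores).
respondent = {
    "PF": -2.0, "RP": -1.5, "BP": -1.0, "GH": -0.5,
    "VT": -0.3, "SF": -0.3, "RE": -0.3, "MH": -0.3,
}

mos_mcs = summary_score(respondent, MCS_ORTHOGONAL)
rand_mcs = summary_score(respondent, MCS_OBLIQUE)

print(f"orthogonal-style MCS: {mos_mcs:.2f}")
print(f"oblique-style MCS:    {rand_mcs:.2f}")
```

With these made-up numbers, the orthogonal-style score comes out above 50 even though every mental scale is below average, because the negative physical weights turn poor physical functioning into a positive contribution. The oblique-style score, which ignores the physical scales, falls below 50.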
The purpose of this study was to compare and evaluate the two different scoring algorithms for the summary scores in a representative community sample. We compared the performance of the RAND and the MOS scoring for subjects suffering from psychological distress and physical limitations. In addition, we assessed the validity of the MOS and the RAND MCS with respect to mental disorders.
Section snippets
Data
Data for this report were taken from the German National Health Interview and Examination Survey (GHS). The methods of the survey are described in detail elsewhere [16], [17], [18] and are only summarized here.
The GHS is based on a stratified, multistage, cross-sectional, nationally representative sample of 7,124 individuals aged 18–79 years from the noninstitutionalized population of Germany. The main survey consisted of a comprehensive health status examination by a medical doctor; respondents also
Results
Of 4,181 participants examined in the mental health supplement, 129 (3.1%) subjects were excluded due to missing values for the SF-36. The remaining 4,052 subjects were included in the present analyses.
In Table 1 the patterns of scale weights are presented for each summary score [8], [9]. As can be seen from the table, the physical functioning scale makes a significant negative contribution to the MOS MCS. On the other hand, the role emotional functioning scale makes a significant positive
Discussion and conclusion
The SF-36 is a widely used generic health status measure. Nevertheless, there is an ongoing discussion about the scaling of the summary scores. The main problem in the original scoring algorithm derives from the use of negatively weighted subscale factor score coefficients. The aim of this study was to evaluate the effects of negative weighting on the assessment of mental health. We compared the validity of two mental scoring algorithms as screening measures for mental disorders in a
Acknowledgment
We thank Hans-Ulrich Wittchen, PhD, Heribert Stolzenberg, PhD, and Bärbel-Maria Kurth, PhD, for their assistance with the GHS public use databases.
References (36)
- et al. Which chronic conditions are associated with better or poorer quality of life? J Clin Epidemiol (2000)
- German translation and psychometric testing of the SF-36 health survey: preliminary results from the IQOLA project. Soc Sci Med (1995)
- et al. Prevalence of anxiety in adults with diabetes: a systematic review. J Psychosom Res (2002)
- et al. Statistical characteristics of area under the receiver operating characteristic curve for a simple prognostic model using traditional and bootstrapped approaches. J Clin Epidemiol (2002)
- et al. Evaluation of the MOS SF-36 physical functioning scale (PF-10). 2. Comparison of relative precision using Likert and Rasch scoring methods. J Clin Epidemiol (1997)
- et al. SF-36 Health Survey manual and interpretation guide (1993)
- et al. The MOS 36-item short-form health survey (SF-36), I: conceptual framework and item selection. Med Care (1992)
- et al. The MOS short-form general health survey: reliability and validity in a patient population. Med Care (1988)
- et al. Health-related quality of life in chronic disorders: a comparison across studies using the MOS SF-36. Qual Life Res (1998)
- et al. Health-related quality of life assessment in euthymic and depressed patients with bipolar disorder: psychometric performance of four self-report measures. J Affect Disord (1998)
- The MOS 36-item short form health survey: reliability, validity, and preliminary findings in schizophrenic outpatients. Med Care
- SF-36 physical and mental summary scales: a user's manual
- RAND-36 Health Status Inventory
- Interpreting SF-36 summary health measures: a response. Qual Life Res
- SF-36 summary scores: are physical and mental health truly distinct? Med Care
- The SF-36 summary scales: problems and solutions. Soz Praventivmed
- Performance of the SF-36, SF-12, and RAND-36 summary scales in a multiple sclerosis population. Med Care
- Do SF-36 summary component scores accurately summarize subscale scores? Qual Life Res