Assessment of health-related quality of life (HRQOL), that is, functioning and well-being in physical, mental, and social domains of life, has been shown to be useful in screening for disability and in improving communication between patients and clinicians [1]. Generic HRQOL profile measures use multiple items to assess each of multiple domains of health. To reduce response burden, short-form HRQOL measures such as the SF-36 health survey are widely used [3]. Although their brevity makes short-form measures practical for widespread use, even the SF-36 requires 7–10 min to complete.
The Dartmouth COOP Charts were designed to provide the briefest possible measure of HRQOL [4]. This instrument uses a single global item (a “chart”) to represent each domain of health. These items are administered using five response choices [4]. For example, one of the charts assesses overall health using the single item, “How would you rate your health in general? (Excellent, Very good)” The Charts have the advantage of ease of administration and scoring but tend to be less precise and specific than multi-item scales. The Charts are one of the original examples of the use of global health items to assess multiple HRQOL domains.
Global health items are evaluations of health in general rather than of specific elements of health. Global items allow respondents to weigh together different aspects of health to arrive at a “bottom-line” indicator of their health status. They allow an efficient assessment of self-reported health. Global health items are predictive of important future events such as health care utilization and mortality [5].
The aim of this study was to evaluate global items representing physical health, pain, fatigue, mental health, social health, and overall health. These domains reflect the health framework used by the Patient-Reported Outcomes Measurement Information System (PROMIS; see www.nihpromis.org). We examine the individual items and assess possible aggregation of them into underlying dimensions of health as measured in PROMIS. We first evaluate whether scoring the items together as a single summary scale is supported empirically. Then we examine alternatives that better reflect the data.
Item-scale correlations for the 10 global health items ranged from 0.53 (global07: rating of pain) to 0.80 (global09: satisfaction with social roles), and internal consistency reliability was 0.92. However, the single-factor confirmatory categorical factor analysis model for all 10 items was statistically rejectable (χ² = 19,619.82, df = 15, P ≤ 0.001) and did not fit the data very well (CFI = 0.927; TLI = 0.961; RMSEA = 0.249).
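The internal consistency reliability reported above can be illustrated with a minimal sketch. It assumes the statistic is coefficient (Cronbach's) alpha, which the text does not state explicitly, and it uses a small made-up response matrix (`toy`) rather than the study data.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Coefficient alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Sum of the sample variances of the individual items (columns).
    item_var_sum = sum(variance(col) for col in zip(*rows))
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical data: 5 respondents answering 4 items on a 1-5 scale.
toy = [
    [5, 4, 5, 4],
    [3, 3, 4, 3],
    [2, 2, 1, 2],
    [4, 4, 4, 5],
    [1, 2, 2, 1],
]
alpha = cronbach_alpha(toy)
print(round(alpha, 3))
```

Alpha rises when items covary strongly relative to their individual variances, which is why a high value such as 0.92 does not by itself establish that a single factor underlies the items.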
The eigenvalues from a principal components analysis of the 10 global items were 6.25, 1.20, 0.75, 0.44, 0.39, 0.30, 0.22, 0.20, 0.18, and 0.05. The scree plot and parallel analysis criteria for the number of factors suggested two underlying dimensions for the 10 items. We performed an exploratory factor analysis and found support for a physical health factor and a mental health factor (see Table 2). Satisfaction with discretionary social activities (global05) loaded on mental health whereas satisfaction with social roles (global09) loaded on both physical and mental health (as did global02: quality of life; and global08: fatigue). The estimated correlation between the physical and mental health factors was 0.63. These results were also supported by our confirmatory categorical factor analysis, but three residual correlations were added to obtain acceptable model fit; see Table 2 (global01 with global03, r = 0.14; global04 with global10, r = 0.14; and global08 with global10, r = 0.15; χ² = 5,295.66, df = 17, P < 0.0001; CFI = 0.98; TLI = 0.99; RMSEA = 0.12). The estimated correlation between the physical and mental health factors in this model was 0.69.
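As a rough check on the eigenvalues listed above, the sketch below computes the share of variance carried by each component and counts components with eigenvalues above 1. The eigenvalue-greater-than-one (Kaiser) rule is a simpler stand-in used here for illustration; it is not the scree plot or parallel analysis that the study relied on, although it points to the same number of dimensions.

```python
# Eigenvalues as reported in the text for the 10 global items.
eigenvalues = [6.25, 1.20, 0.75, 0.44, 0.39, 0.30, 0.22, 0.20, 0.18, 0.05]

total = sum(eigenvalues)  # close to 10, the number of items
proportions = [ev / total for ev in eigenvalues]

# Kaiser rule (illustrative substitute for the criteria used in the study).
n_kaiser = sum(ev > 1.0 for ev in eigenvalues)

# Variance share of the first two components taken together.
first_two_share = proportions[0] + proportions[1]

print(n_kaiser, round(first_two_share, 3))
```

The first eigenvalue is far larger than the second, which is consistent with the two factors being substantially correlated.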
Table 2. Two-factor pattern for global health items (standardized regression coefficients)
Based on the exploratory factor analysis, we evaluated a physical health scale with the 5 items loading highest on the physical health factor. Global09 (satisfaction with social roles) was excluded because it correlated about equally with physical and mental health. Item-scale correlations for the 5 physical health items ranged from 0.57 (global07: rating of pain) to 0.79 (global01: rating of general health; and global03: rating of physical health). All 5 items correlated more highly with the physical health scale than with the mental health scale. We fit a single-factor categorical confirmatory factor analytic model for the 5 physical health items and found that it was statistically rejectable (χ² = 3,060.81, P < 0.001) and showed less than adequate practical fit according to the RMSEA index (CFI = 0.991; RMSEA = 0.220). By adding a residual correlation (r = 0.29) between global01 (rating of general health) and global03 (rating of physical health) to the initial model, we found that the fit of the model improved significantly (χ² = 2,248.57, df = 1, P < 0.001) and the practical fit indices also improved (χ² = 419.56, P < 0.001; CFI = 0.999; TLI = 0.998; RMSEA = 0.081).
We also evaluated a mental health scale with 4 items. Three of these items correlated most highly with the mental health scale. The fourth item, global02 (quality of life), correlated about equally with physical and mental health, but was also included because of prior evidence that it is primarily an indicator of mental health. Item-scale correlations for the 4 hypothesized mental health items ranged from 0.64 (global10: emotional problems) to 0.78 (global04: rating of mental health). One item (global09, satisfaction with social roles) had a higher correlation with the global physical health scale than with the mental health scale; the 4 mental health items correlated most strongly with the mental health scale. The single-factor categorical confirmatory factor analytic model we fit for these 4 mental health items was statistically rejectable (χ² = 1,616.80, df = 2, P ≤ 0.001) and had mixed results in terms of practical fit (CFI = 0.983; TLI = 0.975; RMSEA = 0.196). When we added a residual correlation (r = 0.16) between global04 (rating of mental health) and global10 (bothered by emotional problems) to the initial model, the fit improved significantly (χ² = 1,114.27, df = 1, P < 0.001) and the practical fit of the model improved (χ² = 151.222, P ≤ 0.001; CFI = 0.998; TLI = 0.995; RMSEA = 0.084).
Based on these results, we formed two four-item scales by averaging together the items scored on a 1–5 possible range. Our physical health items included global03 (physical health), global06 (physical function), global07 (pain), and global08 (fatigue). Our mental health items included global02 (quality of life), global04 (mental health), global05 (satisfaction with discretionary social activities), and global10 (emotional problems). The global physical health (GPH) scale excluded global01 (general health) because of its substantial residual correlation with global03 (physical health). We retained global03 in the scale rather than global01 to emphasize the physical nature of the construct. The GPH had an internal consistency reliability of 0.81 (mean = 3.79, SD = 0.76). We excluded global09 (satisfaction with social roles) from the global mental health (GMH) scale because of its higher correlation with the GPH scale. The GMH had an internal consistency reliability of 0.86 (mean = 3.60, SD = 0.89). The two scales were substantially inter-correlated (r = 0.63). In addition, we found that the GPH correlated more strongly with the EQ-5D than did the GMH (r = 0.76 vs. 0.59). The R-square in a regression of the EQ-5D on the GPH and GMH was 0.60, indicating that the PROMIS global health composites share 60% of their variance with the EQ-5D.
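The scoring rule described above (averaging four items per scale) can be sketched as follows. The function name and the example responses are hypothetical, and the sketch assumes every item has already been recoded to the 1–5 range with higher scores meaning better health; any recoding steps are outside what this passage describes.

```python
# Item membership as stated in the text.
GPH_ITEMS = ("global03", "global06", "global07", "global08")
GMH_ITEMS = ("global02", "global04", "global05", "global10")

def score_global_health(responses):
    """Return (GPH, GMH) as the mean of the four items in each scale.

    `responses` maps item names to scores already on a 1-5 range
    (assumption: recoding, if any, was done beforehand).
    """
    for name in GPH_ITEMS + GMH_ITEMS:
        if not 1 <= responses[name] <= 5:
            raise ValueError(f"{name} must be scored 1-5, got {responses[name]}")
    gph = sum(responses[name] for name in GPH_ITEMS) / len(GPH_ITEMS)
    gmh = sum(responses[name] for name in GMH_ITEMS) / len(GMH_ITEMS)
    return gph, gmh

# Hypothetical respondent; global01 and global09 are kept as single items
# and do not enter either scale score.
example = {
    "global01": 4, "global02": 4, "global03": 4, "global04": 3, "global05": 4,
    "global06": 5, "global07": 3, "global08": 4, "global09": 4, "global10": 3,
}
gph, gmh = score_global_health(example)
print(gph, gmh)
```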
Correlations of the global health items and of the GPH and GMH with the nine PROMIS domain scores and the EQ-5D are given in Table 3. The largest correlations for global01 (rating of general health), global02 (quality of life), global03 (rating of physical health), global08 (rating of fatigue), and global09 (satisfaction with social roles) were with the fatigue domain. Global04 (rating of mental health), global05 (satisfaction with discretionary social activities), and global10 (emotional problems) correlated most strongly with the depressive symptoms domain. Global06 (carry out everyday physical activities) correlated most strongly with physical functioning whereas global07 (rating of pain) correlated most strongly with pain impact. The GPH correlated most strongly with pain impact (r = −0.75), fatigue (r = −0.73), and physical functioning (r = 0.71). The GMH correlated most strongly with depressive symptoms (r = −0.71), fatigue (r = −0.68), and anxiety (r
Table 3. Correlations of global items with PROMIS domains and EQ-5D
Correlations of the global items with the EQ-5D ranged from 0.51 to 0.77. The largest correlations with the EQ-5D were for the global ratings of pain, physical functioning, and satisfaction with social roles. Our regression of the EQ-5D on the global items revealed that all items except two (global03: rating of physical health; global05: satisfaction with discretionary social activities) had significant unique associations with the EQ-5D (R-square = 0.64).
We estimated item parameters from the graded response model for the 4 global physical health items (Table 4) and the 4 global mental health items (Table 5). The range of item threshold values indicates satisfactory coverage of the underlying latent trait, from about −4.0 to 2.0 for physical health and from −3.0 to 1.5 for mental health. Global06 (carry out everyday physical activities) had the highest slope (a parameter in Table 4) and the largest information for the physical health items whereas global04 (rating of mental health) had the largest information for the mental health items. We found the lowest item information for items phrased to elicit ratings of undesirable domains of health (pain, fatigue, emotional problems).
Table 4. Global physical health scale item parameters (graded response model) and item information
Table 5. Global mental health scale item parameters (graded response model) and item information
The results of our study provide some support for the construct validity of the global health items based on their correlations with comparable multi-item scales from PROMIS. For example, the global rating of mental health (global04) correlated most strongly with the PROMIS depressive symptoms scale, and the global rating of fatigue (global08) correlated most strongly with the PROMIS fatigue scale.
In addition, our exploratory factor analyses suggested two underlying dimensions for the global health items. One dimension is defined by indicators of primarily physical health and the other by indicators of mental health. Similar underlying factors have been found in previous research [14]. Moreover, the correlation we estimated between the GPH and GMH (r = 0.63) in this study was very similar to correlations between physical and mental health factors derived from the SF-36 (e.g., r = 0.62 in Farivar et al. [17]) and other measures of HRQOL [18] using oblique rotation. We recommend scoring the two scales using 8 items and scoring the remaining 2 items separately as single items: global01 (general health) and global09 (satisfaction with social roles).
A major advantage of the global health scales developed here is the brevity of the resulting measure for gathering summary information about health. For the two scales, each of which had 4 items, we obtained reliabilities of 0.81 and 0.86; together they require about 2 min to complete. In contrast, the SF-36 takes about 7–10 min to administer and the estimated reliabilities are about 0.88–0.93 for the SF-36 physical and mental health composites [19]. The SF-12™ [20] and SF-8™ [21] Health Surveys have completion times and reliabilities that are comparable to the current survey. Future head-to-head comparisons of the present instruments and these instruments would be beneficial.
Although the physical and mental health scales are valuable for summarizing health, drawing an overall conclusion can be difficult if a study shows improvement in one of the summary measures and decrement in the other. Moreover, attrition of study participants over time because they have died presents challenges for longitudinal comparisons based on these global scores, because dropping those who die from the analysis introduces bias. Preference-based measures are designed to derive a single summary score that links morbidity and mortality by anchoring the metric so that 0 is “as bad as being dead” and 1 represents “perfect health.” This study showed noteworthy associations of the global health scores with the EQ-5D preference-based score; 60% of the variance was shared. A separate paper derives equations estimating EQ-5D index scores from these composite scores [22].
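The 60% figure can be cross-checked against the three correlations reported in the Results (GPH with EQ-5D, 0.76; GMH with EQ-5D, 0.59; GPH with GMH, 0.63) using the standard formula for R-square with two predictors. This is only an illustrative check on rounded published values, not a reanalysis; the function name is hypothetical.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R-square for regressing y on two predictors, from pairwise correlations."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

# Correlations as reported in the text (rounded to two decimals).
r2 = r_squared_two_predictors(r_y1=0.76, r_y2=0.59, r_12=0.63)
print(round(r2, 2))
```

The result agrees with the reported R-square to two decimals. It is only slightly above 0.76 squared (about 0.58), which suggests that the GMH adds little to the prediction of the EQ-5D beyond the GPH.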
Investigators can use the 10 global health items in future studies to assess global physical and mental health. The items are available as part of the PROMIS item banks at: http://www.nih.promis.org. In addition, the items can be examined separately to provide specific information about perceptions of physical function, pain, fatigue, emotional distress, social health, and general perceptions of health. Future studies are needed to evaluate the relative validity of the global scales compared with physical and mental health composites derived from other measures such as the SF-12 and SF-36.
The Patient-Reported Outcomes Measurement Information System (PROMIS) is a U.S. National Institutes of Health (NIH) Roadmap initiative to develop a computerized system measuring patient-reported outcomes in respondents with a wide range of chronic diseases and demographic characteristics. PROMIS was funded by cooperative agreements to a Statistical Coordinating Center (Evanston Northwestern Healthcare, PI: David Cella, PhD, U01AR52177) and six Primary Research Sites (Duke University, PI: Kevin Weinfurt, PhD, U01AR52186; University of North Carolina, PI: Darren DeWalt, MD, MPH, U01AR52181; University of Pittsburgh, PI: Paul A. Pilkonis, PhD, U01AR52155; Stanford University, PI: James Fries, MD, U01AR52158; Stony Brook University, PI: Arthur Stone, PhD, U01AR52170; and University of Washington, PI: Dagmar Amtmann, PhD, U01AR52171). NIH Science Officers on this project are Deborah Ader, PhD, Susan Czajkowski, PhD, Lawrence Fine, MD, DrPH, Louis Quatrano, PhD, Bryce Reeve, PhD, William Riley, PhD, and Susana Serrate-Sztein, PhD. Ron D. Hays was also supported by the UCLA Resource Center for Minority Aging Research/Center for Health Improvement in Minority Elderly (P30AG021684), and the UCLA/DREW Project EXPORT, National Institutes of Health, National Center on Minority Health & Health Disparities (P20MD000148 and P20MD000182). This manuscript was reviewed by the PROMIS Publications Subcommittee prior to external peer review. See the web site at www.nihpromis.org for additional information on the PROMIS cooperative group.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.