INTRODUCTION

Delivery systems are increasingly focused on two complementary strategies for addressing the goals of lower cost and improved quality: 1) evidence-based interventions to reduce avoidable emergency department (ED) visits and hospitalizations, especially among high-risk patients, and 2) adoption of quality measures that provide meaningful data on improvements in patient functioning and health status. The systematic collection of self-reported patient health, functional status and symptoms, also referred to as patient-reported outcome measures (PROMs),1,2 has the potential to contribute to both of these delivery system strategies.

To identify high-risk patients, health care systems have relied primarily on administrative and clinical data in addition to the input of physicians.3–6 However, information derived from demographic, administrative, and clinical data may fail to capture important changes in health status that can alter an individual’s future patterns of care-seeking behavior.7

The potential for PROMs to contribute to better patient identification has been inferred from prior studies showing that PROMs predict outcomes such as mortality, hospitalization, hospital readmission, and total health care costs.8–16 Despite research showing that PROMs predict important outcomes, there is a lack of evidence that routinely collecting this information in non-research settings can also improve the ability of systems and providers to identify patients with higher health care needs.17–19

In this study, we investigated the association between self-reported health status collected as part of routine outpatient care and subsequent health care utilization among patients in the Partners HealthCare (Partners) Accountable Care Organization (ACO). We hypothesized that a self-reported health status measure would be an independent predictor of health care utilization after controlling for demographic and clinical factors.

METHODS

Setting and Study Population

We conducted a retrospective cohort study of all individuals in the Partners ACO (including patients in our Medicare ACO and the commercial and Medicaid risk contract populations) who completed at least one PROMIS Global Health (PGH) questionnaire between March 1, 2014, and September 30, 2015, during a primary care visit at any of 12 participating primary care practices across the system. Partners had approximately 658,314 individuals in its ACO and risk contract populations in 2015. The PGH is collected as part of a larger PROMs program at Partners that began in 2012.

PROMs are collected on iPad tablets that are routinely handed out to the patient at the time of visit check-in at participating practices or, in some cases, using the electronic patient portal in advance of the visit. Patients complete the PROMs, including the PGH, on their own (or potentially with the assistance of a family member) prior to the visit, with a minimum of 90 days between repeated completion. The PGH is offered in English and Spanish.

Individuals were included in this analysis if they had 12 months of paid claims and/or ACO membership prior to the month of survey completion and at least one month of claims or membership after completion of the survey. We excluded patients (n = 86) who were eligible for the Partners Integrated Care Management Program (iCMP). The iCMP assigns care managers to high-risk (high utilizer) patients within the Partners ACO/risk-management groups, with the goal of improving health and preventing unnecessary utilization.

Patient-Reported Outcome Measures

The PGH consists of 10 questions that assess multiple domains including physical health, pain, fatigue, mental health, social connections, and overall health and quality of life. There are four questions that generate a Global Physical Health (GPH) score and four questions that generate a Global Mental Health (GMH) score. Both the GPH and GMH scores are standardized, with a standard deviation of 10, such that a score of 50 represents the mean for the US general population.17 Higher scores represent higher functioning.

We calculated standardized scores for the GPH and GMH based on the individual raw scores and according to the PROMIS Global Health scoring guide.20 Only individuals who completed all 10 items of the instrument were included in the cohort for this analysis. There were only 19 individuals eligible for the study who did not complete all 10 questions. Additionally, for those individuals with more than one PGH completed, we used their first completed PGH during the study period.
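The cohort-selection rule described above can be illustrated with a minimal Python sketch. The record layout (`patient_id`, `completed_on`, `answers`) is hypothetical and not taken from the study's data systems, and the sketch assumes that "first completed PGH" means the earliest survey with all 10 items answered.

```python
from datetime import date


def select_first_complete_pgh(responses, n_items=10):
    """Return {patient_id: response}, keeping each patient's earliest complete survey.

    Assumption: a skipped item is stored as None in the `answers` list.
    """
    first = {}
    for r in responses:
        answers = r["answers"]
        # Complete-case rule: every one of the n_items questions must be answered.
        if len(answers) != n_items or any(a is None for a in answers):
            continue
        pid = r["patient_id"]
        # Keep only the earliest complete survey for each patient.
        if pid not in first or r["completed_on"] < first[pid]["completed_on"]:
            first[pid] = r
    return first


# Hypothetical example records (not study data).
example_responses = [
    {"patient_id": "A", "completed_on": date(2014, 6, 1), "answers": [3] * 10},
    {"patient_id": "A", "completed_on": date(2014, 3, 5), "answers": [4] * 10},
    {"patient_id": "B", "completed_on": date(2014, 4, 1), "answers": [3] * 9 + [None]},
]
selected = select_first_complete_pgh(example_responses)
```

In this example, patient A contributes the March survey and patient B is excluded because one item is missing. Converting raw item sums to standardized scores would still require the lookup tables in the PROMIS scoring guide, which are not reproduced here.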

Administrative and Clinical Variables

Patient-level variables included standard demographic (age, sex, etc.) and clinical (comorbidity) covariates available through encounter and billing data collected by the health system. We extracted clinical diagnoses and calculated an overall outpatient Charlson Comorbidity Index based on encounter codes from medical claims data.21,22 Comorbidities were determined based on all paid claims going back 12 months prior to the month of survey completion. In addition, geocoding of patient addresses was performed to assign census block group and determine area-level measures of socioeconomic status (SES), specifically percentage in poverty, based on 2008–2012 American Community Survey census data.

Utilization Variables

We extracted claims for all ED visits and acute hospitalizations at any site for a period of 12 months prior to and up to 12 months after completion of the PGH. We did not count the ED visits that occurred on the same day as a medical or surgical admission date. Prior utilization was defined as the number of ED visits or hospitalizations in the prior 12 months.
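As an illustration of the counting rule above, the following Python sketch tallies ED visits before and after the survey while skipping visits that fall on an admission date. The function name, the 365-day window, and the choice to count a visit on the survey date as subsequent utilization are assumptions made for the example; the study defined its windows relative to the month of survey completion.

```python
from datetime import date, timedelta


def count_ed_visits(ed_dates, admission_dates, survey_date, window_days=365):
    """Return (prior, subsequent) ED visit counts around the survey date.

    ED visits on the same day as an admission are not counted, mirroring
    the exclusion described in the text.
    """
    admit_days = set(admission_dates)
    window = timedelta(days=window_days)
    prior = subsequent = 0
    for d in ed_dates:
        if d in admit_days:
            continue  # same-day ED visit is treated as part of the admission
        if survey_date - window <= d < survey_date:
            prior += 1
        elif survey_date <= d <= survey_date + window:
            subsequent += 1
    return prior, subsequent


# Hypothetical example (not study data).
prior_ed, post_ed = count_ed_visits(
    ed_dates=[date(2014, 1, 10), date(2014, 8, 1), date(2014, 9, 15), date(2012, 1, 1)],
    admission_dates=[date(2014, 9, 15)],
    survey_date=date(2014, 6, 1),
)
```

Here the visit on September 15 is dropped because it coincides with an admission, and the 2012 visit falls outside the look-back window.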

Data Analysis

Continuous variables were summarized using means with standard deviations, while categorical variables were summarized using frequency and percentage. Rates of ED visits and hospitalizations were calculated as the total number of events from all individuals within each quartile divided by total follow-up time accumulated from all individuals within the quartile and expressed as rates per 100 person-years. We examined the relationship between PGH scores and subsequent utilization by comparing the utilization rates among PGH quartiles. Due to the much higher utilization rates from the group in the lowest GPH and GMH quartiles and the small number of events in the group with the highest quartile, we later combined the highest three quartiles of GPH and GMH scores.
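The rate calculation described above can be sketched in a few lines of Python. The event counts and follow-up months below are invented for illustration, and the function name is hypothetical.

```python
def rate_per_100_person_years(events, follow_up_months):
    """Pooled event rate per 100 person-years for one group.

    `events` and `follow_up_months` are parallel per-patient lists.
    """
    total_events = sum(events)
    person_years = sum(follow_up_months) / 12.0
    return 100.0 * total_events / person_years


# Hypothetical groups: lowest quartile (Q1) vs. the three higher quartiles combined.
q1_rate = rate_per_100_person_years(events=[1, 0, 2], follow_up_months=[3, 6, 3])
q2_4_rate = rate_per_100_person_years(events=[0, 1, 0, 0], follow_up_months=[6, 12, 3, 3])
unadjusted_rate_ratio = q1_rate / q2_4_rate
```

Pooling events and follow-up time within a group, rather than averaging per-patient rates, keeps patients with very short follow-up from dominating the estimate.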

We conducted multivariable regression to evaluate the independent predictive ability of PGH scores. We used zero-inflated Poisson (ZIP) models to compare utilization rates by PGH score groups to account for the fact that the majority of individuals in the cohort would have zero ED visits and zero hospitalizations. The variables included in the models were chosen a priori, as they are routinely used in past and current utilization prediction models, and included age, gender, area-level poverty, insurance type, Charlson score, and prior utilization.15,23,24 Insurance type was added to the models for ED visits, but not for inpatient admission, due to model convergence issues. We summarized the overall adjusted effects of PGH scores using the estimates obtained from the ZIP models and used the bootstrap method to estimate the 95% confidence interval of the adjusted rate ratios.25
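For readers less familiar with ZIP models, a standard textbook formulation is given below. It is offered as a sketch rather than the exact specification fitted in this study: the symbols, the use of follow-up time $t_i$ as an offset, and the covariates placed in the zero-inflation part ($\mathbf{z}_i$) are assumptions, since these details are not stated above.

```latex
\begin{align}
P(Y_i = 0) &= \pi_i + (1 - \pi_i)\, e^{-\mu_i}, \\
P(Y_i = y) &= (1 - \pi_i)\, \frac{e^{-\mu_i}\, \mu_i^{y}}{y!}, \qquad y = 1, 2, \ldots, \\
\log \mu_i &= \log t_i + \beta_0 + \beta_1\, \mathrm{Q1}_i + \mathbf{x}_i^{\top} \boldsymbol{\gamma}, \\
\operatorname{logit}(\pi_i) &= \mathbf{z}_i^{\top} \boldsymbol{\alpha}, \\
E[Y_i] &= (1 - \pi_i)\, \mu_i .
\end{align}
```

Here $Y_i$ is the count of ED visits or hospitalizations for patient $i$, $\mathrm{Q1}_i$ indicates membership in the lowest score quartile, and $\mathbf{x}_i$ holds the adjustment covariates. Under this formulation, one plausible way to summarize an overall adjusted effect is the ratio of model-implied mean rates $E[Y_i]/t_i$ between the Q1 group and all others, with a percentile bootstrap over patients supplying the confidence interval.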

We also wanted to determine whether the accuracy of the prediction model was improved by the addition of either GPH or GMH score to a more traditional risk prediction model relying on variables available in demographic and administrative data only (age, gender, area-level poverty, insurance type, outpatient Charlson score, and prior utilization). We assessed whether the addition of the GPH or GMH would more accurately identify individuals in the top 5% of highest utilizers. This cutoff was chosen based on prior knowledge that the top 5% of utilizers account for at least 50% of annual expenditures.26,27 We ranked all of the individuals in the cohort by actual ED and hospital utilization standardized over 12 months. We then calculated the sensitivity and specificity of the different models to identify the high utilizers in order to classify the performance of the various components of the PGH when added to administrative and clinical data.
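The sensitivity and specificity comparison can be illustrated with the following Python sketch. It assumes, as a simplification not stated in the text, that a model flags the same top fraction of patients by predicted risk as is used to define actual high utilizers; the function names and example values are hypothetical.

```python
def top_fraction_flags(values, fraction=0.05):
    """Flag the top `fraction` of patients by value (ties broken by input order)."""
    n_top = max(1, round(len(values) * fraction))
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    top = set(order[:n_top])
    return [i in top for i in range(len(values))]


def sensitivity_specificity(actual, predicted):
    """Sensitivity and specificity of predicted flags against actual flags."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)
    fn = sum(1 for a, p in pairs if a and not p)
    tn = sum(1 for a, p in pairs if not a and not p)
    fp = sum(1 for a, p in pairs if not a and p)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical cohort of 20 patients; a 10% cutoff is used so the example stays small.
actual_utilization = [0] * 18 + [5, 3]
predicted_risk = [0.5] + [0.1] * 17 + [0.9, 0.2]

actual_high = top_fraction_flags(actual_utilization, fraction=0.10)
predicted_high = top_fraction_flags(predicted_risk, fraction=0.10)
sens, spec = sensitivity_specificity(actual_high, predicted_high)
```

In this toy cohort the model catches one of the two true high utilizers and wrongly flags one of the 18 others. Running the same calculation on risk scores from models with and without the GPH term would give the kind of side-by-side comparison described above.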

Subgroup Analyses

We examined results for individuals who had at least 6 months and those who had at least 12 months of claims data and/or ACO membership following the month of survey completion. We also ran the analyses including the individuals eligible for the iCMP program.

RESULTS

There were 2639 eligible individuals (0.4% of overall Partners ACO/risk contract population) who completed the PGH prior to April 30, 2015, and who had 12 months of paid claims and/or ACO membership prior to the month of survey completion and at least one month of claims or membership after completion of the survey.

Compared to the total population of patients in the ACO/risk contracts, the patients in this study were more likely to be female (63.8% in our cohort vs. 55.1% in the total risk population) and were older on average (mean age 48.0 vs. 42.8 in the total risk population).

The mean (SD) GPH and GMH scores for those completing the PGH at a primary care visit were 51.7 (8.5) and 53.9 (9.2), respectively. The cohort had an average age of 48.0 (SD 14.6) years, was predominantly female, and was predominantly commercially insured (Table 1).

Table 1 Baseline Characteristics of Total Cohort and Groups with Low GPH and GMH Scores Compared to Higher Scores

Those in the lowest quartile (Q1) of GPH were significantly older, more likely to be female, and more likely to live in an area with a higher level of poverty. Patients in Q1 of the GPH were also less likely to be commercially insured and more likely to have a lower Charlson score (Table 1). The trends were similar for those in the lowest quartile of the GMH score, with the exception of mean age—those in the lowest quartile of the GMH were younger than those in the other three quartiles.

In terms of the variation in follow-up times post-PGH, 57.8% of the cohort had ≤3 months, 21.3% had >3 and ≤6 months, 10.5% had >6 and ≤9 months, and 10.4% had >9 months of claims and/or membership post-PGH completion. There was no significant difference in follow-up times between individuals in Q1 and those in higher quartiles of the GPH (p = 0.08). There were significant differences in the follow-up time between individuals in Q1 of GMH and those in higher quartiles (p = 0.04), in that there were slightly more individuals with ≤3 months in Q1 than in Q2–4 (59.4% in Q1 vs. 57.3% in Q2–4), though there were also more in Q1 with >6 months of follow-up time (22.4% in Q1 vs. 20.4% in Q2–4).

Table 2 presents the rates of ED visits and hospitalizations by quartile. Patients in the lowest quartile for both GPH and GMH had the highest ED and hospitalization utilization rates among all quartiles. When comparing patients in the lowest quartile with the three higher quartiles combined (Table 3), we found that the lowest quartile of the GPH and GMH scores had significantly higher rates of hospitalization. In contrast, the lowest quartiles of GPH and GMH were not associated with higher ED visit rates.

Table 2 Rates of ED Visits and Hospitalizations by Quartile of GPH and GMH
Table 3 Rates and Unadjusted Rate Ratios of ED Visits and Hospitalizations for Lowest Quartile of GPH and GMH Scores Compared to All Others

In multivariable analyses adjusting for age, gender, area-level poverty, insurance type (included in ED visit models only), Charlson score, and prior utilization, the lowest quartile of GPH score was associated with increased risk of hospitalization (RR 3.15, 95% CI 1.30, 7.90) but not increased risk of ED visits (RR 1.20, 95% CI 0.67, 1.77; Table 4).

Table 4 Rates and Adjusted Rate Ratios of ED Visits and Hospitalizations for Lowest Quartile of GPH and GMH Scores Compared to All Others

The sensitivity of the utilization predictive model with and without the GPH score was 44.0% and 36.0%, respectively, in identifying the top 5% of utilizers of hospital services (Table 5). However, predictive models using either GPH or GMH alone performed worse than either the administrative models or models combining administrative and GPH variables.

Table 5 Sensitivity and Specificity of Models for Predicting Patients in the Top 5% of ED Visits or Hospital Admissions

In subgroup analyses, we found similar results among the subgroup with greater than 6 months of follow-up post-survey month (n = 552), in terms of the unadjusted and adjusted RR of ED visits and hospitalizations for the groups with the lowest quartiles of GPH and GMH compared to those with higher. There were only 174 individuals with at least 12 months of paid claims and/or membership post-survey completion, and this sample size was too small for any meaningful subgroup analysis.

We also performed all of the analyses including those eligible for the iCMP program, and found similar results. In addition, we compared the group of individuals with both GPH and GMH scores in the lowest quartile and those with either score in the lowest quartile to those with higher scores (Q2–4). We found that patients who fell into the lowest quartile of either GPH or GMH, or both, had increased rates of ED visits relative to individuals who did not fall in the lowest quartile of either, and that only individuals who fell into the lowest quartile of GPH had increased rates of hospitalizations.

DISCUSSION

We investigated the association between patient-reported general health scores collected during routine primary care visits and subsequent health care utilization in an ACO. Our findings confirmed that self-reported physical and mental health collected in routine practice predicted future hospitalizations. We also found that the addition of physical health to administrative data increased the ability to identify the patients in the top 5% of hospital admissions. Our finding that patients with the lowest quartile of self-reported physical health scores had significantly higher rates of subsequent hospitalizations makes sense clinically, and is consistent with prior studies showing that functional status is an important indicator of outcomes. However, self-reported physical and mental health were not associated with the risk of ED visits, and the overall sensitivity of the models in predicting patients in the top 5% of ED utilization was fairly low.

Consistent with our finding that self-reported low physical health is associated with an increased rate of hospitalizations, we also found that adding physical health to administrative data resulted in a modest increase in the ability to identify high utilizers of hospital services. On the other hand, adding self-reported mental health to models with clinical and administrative data did not improve predictive model performance.

Although the addition of the PGH scores to traditional administrative data did not always improve the ability to correctly identify the highest utilizers, the PGH has advantages over administrative data in that it can be collected in real time and used at the time of collection to assess risk with a high degree of specificity. It also has advantages over more widely used instruments such as SF-36 or SF-12 in that it does not require a license to use, is publicly available through PROMIS (Patient-Reported Outcome Measurement Information System), and is funded by the National Institutes of Health, with the aim of providing researchers and clinicians access to precise and valid outcome data.

Overall, sensitivity for identifying the highest utilizers of hospital and ED care was low for all models. A few reasons might explain this finding. First, in general, prospective models do not perform as well as concurrent models of risk prediction.14,28,29 Second, one of the benefits of PROMs such as the PGH is that they are dynamic measures and are thus subject to greater change than more static measures such as comorbidities, insurance type, or SES; however, this dynamic quality may make them less helpful in predicting health events far into the future. Third, it may be that global measures of self-reported health predict risk better for certain diseases or subgroups, as found in prior research.11

Our findings fit into a larger body of literature on risk prediction, and both agree with and add to this literature. Previous studies have also found that lower physical health and functional status are associated with increased rates of hospitalization,9,30 and that mental health did not perform as well in predicting hospitalizations as physical health or measures of comorbidity.8 Additionally, we found that the addition of self-reported physical health to administrative data increased sensitivity for identifying the top 5% of utilizers of hospital services by 8 percentage points (from 36.0% to 44.0%) above administrative data alone. This is important because relatively few studies have looked at the additive benefit of self-reported health over administrative risk prediction models in identifying the highest utilizers.30,31 This finding is consistent, however, with prior work suggesting that adding functional status to administrative data can increase the ability to identify high-risk or high-cost patients.12,24,32 For example, a systematic review by Kansagara et al. found that the addition of functional status to administrative data improved the ability to predict readmissions.24

This study has several limitations. First, it uses data collected at a single health system and includes only a small sample of those patients in our ACO (0.4% of the ACO/risk contract population) who completed the PGH at a primary care visit. This limits generalizability both to the rest of our ACO and to other systems. Part of the reason that we captured only a small sample of the ACO/risk contract population is that, at the time of the collection, only 12 of the 231 (5%) primary care practices at Partners were collecting PROMs. These practices were not selected to be representative of the larger ACO/risk contract population. Second, this study relies on data collected in routine clinical care. It is up to the participating clinics to decide which patients should fill out PROMs, and there is likely a significant amount of variability in how frequently clinics collected the PGH and which patients were asked to complete it, possibly introducing selection bias. Third, we included individuals with variable follow-up time post-PGH completion. The majority of individuals included in the study had 3 months or less of claims and/or ACO/risk contract membership post-PGH completion. We attempted to account for the variable follow-up time by calculating standardized utilization rates per quartile, but we recognize that this is a limitation of the study. We also recognize that there are better methods for developing a new risk prediction index that might be helpful in future analyses, but our goal was simply to assess the impact on model accuracy with and without the PGH component, so we did not pursue additional statistical approaches.

Future analyses should continue to explore specific groups in which the PGH performs best as a risk predictor. Future studies should also look at the ability of the PGH to predict preventable hospitalizations, as acute hospitalizations in healthy individuals are much more difficult to predict. Future work may also examine changes over time in self-reported health scores as a potentially more accurate measure of health trajectory and future health needs.

In summary, we demonstrated that physical health is significantly associated with hospital utilization and that it has the potential to add to a risk prediction model within our ACO and risk contracts population. Ultimately, collecting PROMs, specifically global PROMs such as the PGH, during routine clinical care appears promising as a means for better understanding patients’ needs and improving physician–patient communication, while also serving as a risk assessment tool. However, gaps exist in understanding their validity and performance in different patient populations, how to make the most of them in routine care, and how best to use the results for population health management.