Quality of Life Research

Published: 1 December 2008

Patient-reported outcomes and the mandate of measurement

Author: Gary Donaldson

Published in: Quality of Life Research | Issue 10/2008

Abstract

Purpose

Coherent clinical care depends on answering a basic question: is a patient getting worse, getting better, or staying about the same? This can prove surprisingly difficult to answer confidently. Patient-reported outcomes (PROs) could potentially help by providing quantifiable evidence. But quantifiable evidence is not necessarily good evidence, as this article details.

Method

The fundamental mandate of measurement requires that errors in making an assessment be smaller than the distinctions to be measured. This mandate implies that numerical observations of patients may be poor measurements.

Results

Individual assessments require high measurement precision and reliability. Group-averaged comparisons cancel out measurement error, but individual PROs do not. Individual PROs generate numbers, to be sure, but the numbers may fall short of what we should demand of measurements. When typical errors of measurement are large, it is not possible to answer confidently even the modest question of whether a patient is getting worse or getting better.

Conclusion

This article explains some theory behind the mandate of measurement, provides several examples based on clinical research, and suggests strategies to measure and monitor individual patient outcomes more precisely. These include more frequent low-burden assessments, more realistic confidence levels, and strengthened measurement that integrates population data.

Under the “usual” assumptions of constant variance and conditional independence.

The order-of-magnitude difference in variability between individual and averaged data captured by Figs. 1 and 2 is completely representative. In 30 years of experience with patient-reported subjective ratings and standardized questionnaires collected longitudinally, I have never failed to observe it. The discrepancy still surprises because journals seldom publish individual trajectories, leaving readers with the impression that mean trend lines are representative of individuals.
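A minimal sketch of one reason the gap can reach an order of magnitude, assuming independent measurement errors with equal variance: the standard error of a group mean shrinks with the square root of the group size, while a single patient's score carries the full standard error of measurement (SEM). The SEM of 1.5 and the group size of 100 are illustrative assumptions, not figures from the article, and `se_of_group_mean` is a hypothetical helper.

```python
import math


def se_of_group_mean(sem: float, n_patients: int) -> float:
    """Standard error of the mean of n independent scores, each with error SD `sem`."""
    if n_patients < 1:
        raise ValueError("n_patients must be at least 1")
    return sem / math.sqrt(n_patients)


sem = 1.5  # assumed SEM on a 0-10 rating scale (illustrative only)
individual_se = se_of_group_mean(sem, 1)    # a single patient's assessment
group_se = se_of_group_mean(sem, 100)       # mean over 100 patients
ratio = individual_se / group_se            # how much noisier the individual score is

print(f"individual SE = {individual_se:.2f}, group SE = {group_se:.2f}, ratio = {ratio:.1f}")
```

With 100 patients the measurement-error component of the group mean is ten times smaller than that of any one patient's score. True between-patient differences add further spread to individual trajectories that this sketch does not model.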

The confidence intervals for rating scales such as these become smaller near the limits of the scales. This issue is related to floor and ceiling effects that pose additional measurement problems beyond the scope of this paper. To illustrate the ideas, I ignore restriction-of-range issues, and assume confidence intervals located in the middle ranges of the scales, where they are largely constant. Similarly, I do not address the issue of discrete (numerical ratings) versus continuous (visual analog) formats.
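Under the simplification described above (mid-range scores, a constant SEM, and no floor or ceiling effects), a confidence interval for a single score can be sketched as the score plus or minus a normal quantile times the SEM. The normal-error assumption, the example score of 5.0, and the SEM of 1.0 are illustrative choices, not values from the article.

```python
from statistics import NormalDist


def confidence_interval(score: float, sem: float, level: float = 0.95) -> tuple[float, float]:
    """Two-sided interval score +/- z * sem, assuming normally distributed error."""
    if not 0.0 < level < 1.0:
        raise ValueError("level must be strictly between 0 and 1")
    if sem <= 0:
        raise ValueError("sem must be positive")
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # e.g. about 1.96 for level = 0.95
    half_width = z * sem
    return score - half_width, score + half_width


score, sem = 5.0, 1.0  # assumed mid-scale rating and SEM (illustrative only)
for level in (0.68, 0.80, 0.95):
    low, high = confidence_interval(score, sem, level)
    print(f"{level:.0%} interval: {low:.2f} to {high:.2f}")
```

Lower confidence levels give narrower intervals for the same SEM, which is the trade-off involved in choosing a confidence level for individual assessments.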

Some scales now available are capable of very precise measurement if length and patient burden are not concerns. Dynamic adaptive testing methods work well to generate efficient measurement while minimizing burden, but would still require several questions to achieve very high levels of precision. Methods based on item response theory are in general more sophisticated and efficient than classical psychometric approaches, but for the purposes of this paper the differences are minor and not central to the main points.

The linear trend always represents the average rate-of-change, even when the data suggest nonlinearity. Subtle modeling issues notwithstanding, the linear trend is an excellent summary measure when the clinical question concerns whether patients are “getting better” or “getting worse.”
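As a rough illustration of this point, the sketch below fits an ordinary least-squares line to a deliberately curved (quadratic) trajectory with equally spaced assessments. The data are made up, and `ols_slope` is a hypothetical helper, not code from the article. In this example the fitted slope equals the overall change divided by the elapsed time, so it summarizes the average rate of change even though the trajectory is not linear.

```python
def ols_slope(times: list[float], scores: list[float]) -> float:
    """Ordinary least-squares slope of scores regressed on times."""
    n = len(times)
    if n != len(scores) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_t = sum(times) / n
    mean_y = sum(scores) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("times must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, scores))
    return sxy / sxx


times = [0, 1, 2, 3, 4]              # assumed assessment occasions
scores = [t ** 2 for t in times]     # invented nonlinear trajectory: 0, 1, 4, 9, 16

slope = ols_slope(times, scores)
overall_rate = (scores[-1] - scores[0]) / (times[-1] - times[0])

print(f"OLS slope = {slope}, overall change per unit time = {overall_rate}")
```

A positive slope here answers "getting worse or getting better" in one number, without requiring a model of the curvature.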

In general, the standard error for any weighted combination of single assessments is given by the matrix formula \( (c'\Theta c)^{1/2} \), where c is a vector of weights defining a contrast or difference, and Θ is the sampling error covariance matrix over the repeated assessments of an individual. In the typical case, the diagonal elements of Θ are squared SEMs, and the off-diagonal elements are zero, but more general scenarios are possible (e.g., autocorrelation or heterogeneity in the SEM over time).
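The formula can be sketched directly in code. The example below evaluates the square root of c'Θc for two common contrasts in the typical case of a diagonal Θ: a simple change score between two assessments, and the mean of four assessments. The SEM of 1.5 is an assumed value for illustration, and `contrast_se` is a hypothetical helper name.

```python
import math


def contrast_se(c: list[float], theta: list[list[float]]) -> float:
    """Standard error sqrt(c' * theta * c) of a weighted combination of assessments."""
    n = len(c)
    if len(theta) != n or any(len(row) != n for row in theta):
        raise ValueError("theta must be an n-by-n matrix matching the length of c")
    quad_form = sum(c[i] * theta[i][j] * c[j] for i in range(n) for j in range(n))
    return math.sqrt(quad_form)


def diagonal_theta(sem: float, n: int) -> list[list[float]]:
    """Typical case: squared SEMs on the diagonal, zeros elsewhere."""
    return [[sem ** 2 if i == j else 0.0 for j in range(n)] for i in range(n)]


sem = 1.5  # assumed SEM of a single assessment (illustrative only)

# Change score: second assessment minus first.
change_se = contrast_se([-1.0, 1.0], diagonal_theta(sem, 2))

# Mean of four repeated assessments.
mean_se = contrast_se([0.25] * 4, diagonal_theta(sem, 4))

print(f"SE of change = {change_se:.3f}, SE of 4-assessment mean = {mean_se:.3f}")
```

In this diagonal case a change score is noisier than a single assessment by a factor of the square root of 2, while averaging four assessments halves the standard error. Autocorrelation would enter as nonzero off-diagonal entries of `theta`.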

In fact, it is the maximum likelihood estimate. But in what follows I try to rely on ordinary language meaning and to minimize technical statistical vocabulary. In the same vein, I use “likely” as an intuitive term meaning “a good guess” without intending either Bayesian or frequentist subtleties, and use the noun “estimate” to mean an informed guess of a person’s true but unknown value.

This particular representation is more natural in a Bayesian than a frequentist interpretation, but the same points can be made equivalently in either framework. The curve simply shows that good guesses for the unknown true value are closer to the sample measurement, while poorer guesses are farther away, on either interpretation.

For example, when exceeding a clinical threshold would invoke aggressive and risky therapy, it may be important to be nearly certain that the true value exceeds the threshold.
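One way to put a number on "nearly certain" is sketched below, assuming normally distributed measurement error and reading the curve around the observed score as a distribution for the true value (the Bayesian-style reading mentioned in the preceding note, with no prior information). The threshold, scores, and SEM are invented for illustration, and `prob_true_value_exceeds` is a hypothetical helper.

```python
import math


def prob_true_value_exceeds(observed: float, threshold: float, sem: float) -> float:
    """Normal-approximation probability that the true value lies above the threshold."""
    if sem <= 0:
        raise ValueError("sem must be positive")
    z = (observed - threshold) / sem
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF at z


threshold = 7.0  # assumed clinical threshold on a 0-10 scale (illustrative only)
sem = 1.0        # assumed SEM of a single assessment

for observed in (7.0, 8.0, 9.0):
    p = prob_true_value_exceeds(observed, threshold, sem)
    print(f"observed {observed}: P(true value > {threshold}) = {p:.3f}")
```

An observed score exactly at the threshold leaves the question at even odds. Under these assumptions the observed score has to sit well above the threshold, relative to the SEM, before the probability approaches certainty.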

Title: Patient-reported outcomes and the mandate of measurement
Author: Gary Donaldson
Publication date: 1 December 2008
Publisher: Springer Netherlands
Published in: Quality of Life Research / Issue 10/2008
Print ISSN: 0962-9343
Electronic ISSN: 1573-2649
DOI: https://doi.org/10.1007/s11136-008-9408-4
