Recall bias did not affect perceived magnitude of change in health-related functional status

doi:10.1016/j.jclinepi.2005.08.018

Journal of Clinical Epidemiology

Volume 59, Issue 5, May 2006, Pages 503-511

https://doi.org/10.1016/j.jclinepi.2005.08.018

Abstract

Background and Objective

It was hypothesized that within an invasively treated group and within a group that improved in angina pectoris no difference in effect size would occur between prospective and retrospective measures. Furthermore, it was hypothesized that assessment of perceived change at post-test may be invalid because of recall bias and present-state bias.

Study Design and Setting

Effect sizes (as standardized response means) were used as indicators of magnitude of change. Linear structural equation analysis (with LISREL) was used to investigate the relationship between the estimates of recall accuracy and retrospectively assessed change.
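As an illustration of the effect size named above, the standardized response mean (SRM) is conventionally computed as the mean of the paired change scores divided by the standard deviation of those change scores. The following is a minimal sketch under that standard definition; the function name and the patient scores are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """Return SRM = mean(change) / SD(change) for paired scores."""
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow_up must be paired")
    # Change score per patient: follow-up minus baseline
    changes = [post - pre for pre, post in zip(baseline, follow_up)]
    sd = stdev(changes)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("SRM is undefined when all change scores are identical")
    return mean(changes) / sd


# Hypothetical scores for five patients (illustration only)
baseline = [40, 55, 50, 45, 60]
follow_up = [50, 60, 65, 50, 75]
srm = standardized_response_mean(baseline, follow_up)
print(f"SRM = {srm:.2f}")
```

Because the denominator is the variability of change itself, the SRM grows both when the average change is large and when patients change by similar amounts.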

Results

No significant differences were found between prospective and retrospective measures of change over time in health-related functional status. Recall bias was not associated with retrospective measurement of change within a 12-week interval. An expected present-state effect was found in a structural equation model.

Conclusion

Prospective and retrospective indices of magnitude of change were similar between groups receiving treatment of known efficacy. Recall bias seems to be an acceptable risk in short-term follow-up studies.

Introduction

The measurement of treatment-related change over a period of time in patients is central to both clinical research and practice. In evaluation research, investigators commonly define change as the difference between baseline and post-treatment scores, obtained from serial measurements (i.e., serial change). In clinical practice, however, change after treatment is generally assessed retrospectively by asking the patient to give an appraisal of the magnitude and direction of the change in health status or functioning as stable, improved, or deteriorated. In the interaction between clinician and patient, such a retrospective appraisal by the patient and physician concerning several domains of the health status has clinical relevance, in that it determines the decisions made in the management of the disease. This is common practice, and therefore consideration must be given to the possibility of measuring the retrospective change directly in evaluation studies of treatment efficacy with, for example, health-related functional status (HRFS) as the outcome.

In evaluation research, retrospective measurement is obviously easier and more economical than serial measurement. Despite the apparent advantages of retrospective measurement of change in HRFS, there is a suspicion that global or transition questions are biased due to recall problems or present-state effects at follow-up. It is assumed that prospective or serial change assessed by repeated measurement is superior and that the use of retrospective assessment of change in HRFS with global or transition questions is definitely not advisable [1].

There is, however, an ongoing debate about the methods for estimating clinically relevant change [2], [3], [4], [5]. One of the assumptions in this debate is that changes inferred from repeated measurement approximate the change captured by the patient's retrospective perceptions of change over a period of time [6], [7], [8]. Other researchers, however, have found that the retrospective recall of a change in health status or symptoms is not as accurate as the change found in pre–post designs because of the complexity of the question. For example, when an interviewer asks patients who have undergone a coronary artery bypass grafting (CABG) operation whether they have felt better or worse since the bypass operation, the patients have first to make a judgment of the present state of health, then make a reconstruction of the situation before the CABG, and finally do mental subtraction to estimate the perceived direction and amount of change over time.

This method has some weaknesses. First, there is often a correlation between the ‘present state’ score, the post-treatment score, and the ‘retrospectively perceived change’ score in that health status domain [9], because this post-treatment present state is the frame of reference for the comparison with the health status before the treatment—this is present-state bias. A second weakness is that when the time span is too long, people have great difficulty remembering how they were before treatment—this is recall bias. A third problem is that retrospective assessment of treatment-related change may be invalid if patients feel that they are being prevented from living as they would like to by problems not related to the disease for which they are being treated.

The fourth weakness is that patients who remained stable after an invasive operation (e.g., CABG), according to the outcome of repeated measurement, were obviously in some respects limited in functional status before this treatment, and consequently when posed a retrospective global question were likely to report improvement. Some transition items are too general (e.g., “Have you felt better or worse since your bypass-operation?”). The patient may then refer only to a few symptoms manifesting themselves at that particular point in time, such as shortness of breath, pain in the chest, or fatigue [10], [11], [12], [13]. Additionally, the single item is a relatively coarse method in comparison with the multi-item scale and is not as suitable for detecting the minor differences in health perception that may still be clinically relevant.
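The present-state bias described above can be screened for with a simple correlation check: if retrospective ratings are driven mainly by the current state, they tend to correlate strongly with the follow-up score and only weakly with the baseline score. The sketch below illustrates that check. It is not the structural equation model the authors fitted with LISREL, and the function name, the rating scale, and all scores are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired sequences with at least two values")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical data for six patients (illustration only)
pre = [50, 45, 55, 50, 45, 55]      # baseline functional status
post = [50, 60, 65, 50, 75, 55]     # follow-up functional status
retro = [1, 3, 4, 1, 5, 2]          # retrospective transition rating

r_post = pearson(retro, post)  # association with the present state
r_pre = pearson(retro, pre)    # association with the recalled baseline
print(f"r(retro, post) = {r_post:.2f}, r(retro, pre) = {r_pre:.2f}")
```

In this constructed example the retrospective rating tracks the follow-up score far more closely than the baseline score, which is the pattern one would expect under present-state bias.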

In the present study, multiple-item transition scales enable patients to rate the extent to which they have changed regarding a number of disease-specific variables, thereby allowing for the possibility that not all aspects of functioning, health status and symptoms will be given the same response. A scale constructed from the summed composite of transition items (transition scales) that belong to an HRFS domain—for example, physical, emotional, or social functioning—yields more information reflecting meaningful change in that dimension than single items would. Furthermore, the comparison of these retrospective change scales with multiple serial change scales, comprising identical items in terms of responsiveness, may contribute to the analysis of the convergent validity of prospective and retrospective measures of change in HRFS. In the present study, the importance or value that patients assign to their perceived change after treatment was used to weight the change in HRFS scores (weighted items).

In an earlier publication based on the data from the present study [8], we showed that the serial change scores of items, and likewise the identical transition items of the physical functioning scale, yielded similar factor loadings and estimated internal consistency coefficients (Cronbach's α), despite the weaknesses of retrospective global questions. The results published by Aseltine et al. [14] are consistent with our findings that no significant differences in responsiveness (standardized response mean: SRM) were observed between serial change scales and transition scales [8], [15]. We therefore hypothesize that retrospective assessment of treatment-related change over time may not be affected by recall bias in short-term evaluation of medical treatment or interventions of approximately 6 weeks before and 12 weeks after a significant event or intervention such as percutaneous transluminal coronary angioplasty (PTCA) or CABG, and may be a valid and reliable proxy for serial change assessment with an HRFS measure.
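For readers unfamiliar with the internal consistency coefficient mentioned above, Cronbach's α for a scale of k items is conventionally computed as k/(k − 1) times one minus the ratio of the summed item variances to the variance of the total score. The sketch below follows that standard formula; the function name and the item scores are hypothetical and do not come from the study.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for k items, each a list of scores from the same n respondents."""
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    n = len(items[0])
    if any(len(item) != n for item in items):
        raise ValueError("every item must have one score per respondent")
    # Sum of the sample variances of the individual items
    item_variance_sum = sum(variance(item) for item in items)
    # Sample variance of each respondent's total (summed) scale score
    totals = [sum(scores) for scores in zip(*items)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical three-item scale answered by four respondents (illustration only)
items = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [1, 3, 3, 5],
]
alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.3f}")
```

The guard on `k < 2` reflects the fact that the coefficient cannot be computed for a single global item, since there is no between-item covariance to estimate.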

The present study, however, explores the relationship between serial measurement and weighted or unweighted retrospective measurement, with identical items and scales belonging to the physical and emotional domains of health-related functional status. The patients in the present study were undergoing treatment that is known to have an impact on health status domains, and therefore the indices used should reflect meaningful change between baseline and follow-up.

The following questions were addressed in the present study:

1. Are responsiveness indices derived from serial change scores and from weighted and unweighted retrospective scores similar when patients are broken down into groups with known treatment efficacy?
2. To what extent is retrospective measurement using global questions influenced by recall bias?

Section snippets

Patient selection

To ensure that a change in health status manifested itself, we selected a group of patients undergoing treatment with a known efficacy and selected a disease-specific instrument with a known sensitivity to detect change over time [16]. The instrument had proved to be sensitive to change in a similar sample of Dutch patients with ischemic heart disease [17].

Patients participating in the present study were recruited consecutively from three hospitals in the north of the Netherlands. All the

Sample

A total of 398 candidates were screened for inclusion in the present study; 139 (35%) did not return the first questionnaire, so the final sample consisted of 259 patients. The probability of systematic differences between nonresponders and the study sample could not be tested, because no information was available without the written informed consent of the patients who did not return the first questionnaire.

Forty-two patients (16%) dropped out of the study before the follow-up assessment

Discussion

There are many published studies that make use of items or global questions measuring the perceived magnitude of change in HRFS as measurements of treatment outcome [7], [14], [20], [24]. In most of these studies, however, these questions are used as a single item to assess perceived change. One of the problems with globally assessed change is that the reliability cannot be established since Cronbach's α cannot be computed for a single item. The serial change scale (SCS), and the unweighted

References (37)

D.A. Mahler et al. The measurement of dyspnea: contents, interobserver agreement, and physiologic correlates of two new clinical indexes. Chest (1984)

G.I.J.M. Kempen et al. Relationship of domain-specific measures of health to perceived overall health among older subjects. J Clin Epidemiol (1998)

R. Leavey et al. A comparison of two health survey measures of health status. Soc Sci Med (1988)

T.S. Rector et al. Assessment of patient outcome with the Minnesota Living with Heart Failure questionnaire: reliability and validity during a randomized, double-blind, placebo-controlled trial of pimobendan. Am Heart J (1992)

J.G. Wright et al. A comparison of different indices of responsiveness. J Clin Epidemiol (1997)

G.R. Norman et al. Methodological problems in the retrospective computation of responsiveness to change: the lesson of Cronbach. J Clin Epidemiol (1997)

C.E. Schwartz et al. Methodological approaches for assessing response shift in longitudinal health-related quality-of-life research. Soc Sci Med (1999)

S.S. Coughlin. Recall bias in epidemiologic studies. J Clin Epidemiol (1990)

D.A. Redelmeier et al. Assessing the minimal important difference in symptoms: a comparison of two techniques. J Clin Epidemiol (1996)

D. Fischer et al. Capturing the patient's view of change as a clinical outcome measure. JAMA (1999)

S. Ziebland. Measuring changes in health status.

D. Osoba. Interpreting the meaningfulness of change in health-related quality of life scores: lessons from studies in adults. Int J Cancer Suppl (1999)

B. Middel et al. How to validate clinically important change in health-related functional status: Is the magnitude of the effect size consistently related to magnitude of change as indicated by a global question rating? J Eval Clin Pract (2001)

S. Ziebland et al. Comparison of two approaches to measuring change in health status in rheumatoid arthritis: the Health Assessment Questionnaire (HAQ) and modified HAQ. Ann Rheum Dis (1992)

C.F. Emery et al. Perceived change among participants in an exercise program for older adults. Gerontologist (1990)

B. Middel et al. Why don't we ask patients with coronary heart disease directly how much they have changed after treatment? J Cardiopulm Rehabil (2002)

D.L. Streiner et al. Health measurement scales: a practical guide to their development and use (1995)

J.L. Read et al. Measuring overall health: an evaluation of three important approaches. J Chron Dis (1987)
