Listening to respondents: a qualitative assessment of the Short-Form 36 Health Status Questionnaire

https://doi.org/10.1016/S0277-9536(01)00003-X

Abstract

Standardised health status questionnaires are widely used to obtain subjective assessments of health. However, little research has investigated the meaning of the data they produce. Statistical tests will highlight some problems with the structure and wording of a questionnaire, but they cannot shed any light on the way in which respondents interpret questions or their intended meaning when they select a response. Various qualitative techniques are being used within disciplines such as sociology and psychology to test both the language of survey instruments and the cognitive bases of surveys. This paper outlines some of these methods and reports findings from a qualitative research study in the UK with a widely used questionnaire, the Short-Form 36 Health Status Questionnaire. The value of including in-depth, qualitative validation techniques in the development and testing of surveys used to collect subjective assessments of health is clearly demonstrated by the findings of the study.

Introduction

During the last decade there has been considerable interest in capturing subjective views of health status and health-related quality of life, particularly within the field of health services research. Standardised health status questionnaires are the method of choice for much of this research and considerable energy has been directed towards evaluating and refining them in various settings. After initial piloting ‘post-mortems’ (when interviewers report on the problems they encounter using a questionnaire), the testing process is almost exclusively a quantitative activity. While psychometric assessments have an important role in developing health questionnaires, they shed little light on the meaning of the questions and response options to respondents and, therefore, the meaning of respondents’ answers. The validity of survey data depends upon shared understandings of questions and response options, and yet research evidence has shown that people interpret survey questions in unexpected ways (Tanur, 1992). In the field of health status assessment there have been some papers which raise questions about the validity of widely used health surveys. For example, Donovan, Frankel, and Eyles (1993) reported that people answering the Nottingham Health Profile (NHP) often misunderstood the NHP's questions and had problems with the simple yes/no response options, which undermined the validity of the data. Mallinson (1998) found that people self-completing the Short-Form 36 Health Status Questionnaire (SF-36) wrote comments on their questionnaires which suggested their interpretation of some items differed from the surveyor's intended meanings. They also had difficulty understanding the wording of some items and found some of the response options inadequate to describe their views. The effect of unexpected variations or of flawed design is to create uncertain data whose validity is questionable.
Given that health services researchers are keen to encourage the widespread use of subjective health measurement in health care, the problem of meaning has to be addressed. This paper discusses some of the methods which could be drawn into the testing process and illustrates the importance of undertaking further research with qualitative findings from an in-depth assessment of the SF-36.

An active industry within health services research in the UK and internationally is devoted to the development and application of subjective health measures. The technical skill and intellectual input which informs this research is considerable, and yet it seems that the issue of meaning is largely ignored in most of the papers and textbooks produced (Hunt, 1997). Indeed, the complex testing typologies which are being developed to evaluate health questionnaires fail to consider subjective interpretation at all (Donovan et al., 1993). Interestingly, this apparent avoidance of issues around meaning has occurred despite a wealth of sociological debate about the challenges of conducting survey research (Cicourel, 1964; Pawson, 1989) and largely without reference to work by psychologists and survey methodologists working in other fields of research.

Evidence from psychology and sociology has shown that the processes involved in interpreting a question and formulating an answer are complex. For instance, if you reword or reorder questions in a survey, people’s responses change. If you provide even slightly amended response options then people will give different answers (Clarke & Schober, 1992). This is largely due to the nature of all interviews (structured or unstructured) as interactions and the reliance of questionnaires on natural language. Although standardised questionnaires have been developed to avoid potential sources of bias that arise when questions are reworded, the standardisation of the survey text does not automatically lead to standardisation of meaning. In natural language the meaning of words does not inhere in the words themselves but is a product of the situation and the relationship between those interacting, and can be affected by a range of social and cultural factors. In addition to the problem of natural language, other factors related to the cognitive bases of surveys also have to be taken into account.

Survey methodologists have developed a number of techniques to explore the way people interpret survey questions and to check the cognitive processes which come into play. For example, ‘think-aloud’ protocols are often used to explore individual interpretations of questions in detail. People are asked to describe what they are thinking of when listening to each question in a survey and how they interpret the questioner’s intentions (Hak, 1999; Suchman & Jordan, 1992). The meaning the respondent intends to convey with their reply is also explored through in-depth probing. In some ways this process mimics, albeit in a rather formalised and extended way, the process of natural conversation where people automatically probe to establish the intended meaning of speakers or to check that a listener has heard their speech as intended. These interactional techniques to clarify meanings and avoid misunderstanding come naturally to most of us and we are often not even conscious that we use them. Natural conversational flow allows us to repair misapprehensions and establish common ground so that meaningful communication can proceed.

In the interactional strangeness of the survey interview most of the mechanisms used to check meanings are suppressed. The roles of questioner and respondent are clearly defined because the format of the interview is predetermined. The direction of the questioning is one-way and (in theory) wordings should not be altered. Indeed, the well trained interviewer should not provide any elaborations to help respondents establish the intended meanings of the surveyor. Of course, the extent to which survey interviewers reach text-book standards in practice is questionable. Nevertheless, where in an ordinary conversation a person might ask questions in response to a question they are unsure of, respondents in the structured interview cannot use the same techniques to assist their interpretation.

The problems with language interpretation and the suppression of conversational repair may present challenges to the respondent, who has to make judgements and suppositions in order to comply with the survey. The problem for the survey researcher is that routine data processing methods may not find errors of comprehension or variations in the interpretation of questions which arise. Only if people decide that they are unable to provide an answer will there be missing data, and the evidence from cognitive research suggests that many people will respond even if they do not understand a question (Clarke & Schober, 1992). Testing the internal consistency of questions aiming to tap a single construct should highlight inconsistencies in responses and therefore alert the surveyor to comprehension problems (Bowling, 1991). However, there is also evidence that people strive to be consistent when they answer a survey and will therefore choose logically consistent responses even if this does not reflect their views (Clarke & Schober, 1992). Although one could argue that some of these problems are ironed out in a large sample, at some level, especially if the measure is being used to monitor individual health, the validity of the data rests upon shared meanings or at least stable meanings within the individual across time and place. How can one know if people hear the questions in the manner they were intended without doing some kind of data check between respondents, or in different situations, or within individuals over time? These are the questions which prompted the development of a study which used qualitative data to explore the various interpretations which arose during the administration of the SF-36 Health Status Questionnaire.
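
To make the point about internal consistency testing concrete, the sketch below computes Cronbach's alpha, one commonly used internal-consistency statistic. The paper does not say which statistic is meant, so the choice of alpha is an assumption, as are the function name `cronbach_alpha` and the small response matrices, which are invented for illustration and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Illustrative helper (hypothetical name): alpha = k/(k-1) * (1 - sum of
    item variances / variance of respondents' total scores).
    """
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2 or any(len(r) != k for r in responses):
        raise ValueError("each respondent needs the same number (>= 2) of items")

    # Transpose so that each tuple holds every respondent's score on one item.
    items = list(zip(*responses))
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(r) for r in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical data: respondents who answer every item identically.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
# Hypothetical data: respondents whose answers vary across items.
mixed = [[1, 2], [2, 1], [3, 3], [1, 3]]

print(cronbach_alpha(consistent))
print(cronbach_alpha(mixed))
```

The statistic is high whenever answers move together across items. It cannot distinguish respondents who share the surveyor's meaning from respondents who simply chose logically consistent options, which is the limitation the paragraph above describes.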

Section snippets

Participants and data collection

This research was conducted in the community with older people who were being treated by community health services in two health authorities in the North West of England. People aged 65 years or more who were newly referred to two teams of community physiotherapists and one team of rehabilitation occupational therapists during a six-month period were invited to join the study. The therapists provided patients with information on the study and obtained verbal consent for an interview before

The physical functioning dimension

The questions on physical activity included in the SF-36 (Table 1) aim to represent distinct aspects of physical activity and the different severities with which limitations might be experienced. The authors define physical functioning as the ‘performance or capacity to perform a variety of physical activities normal for people in good health’ (Stewart & Kamberg, 1992, p. 86). They have attempted to separate simple physical activity from role activities so that the assessment shows how limited

Conclusions

The practice of health measurement has a long history and considerable energy has been devoted to establishing the conceptual basis for many health questionnaires. However, critics have noted that with the recent upsurge of interest in health measurement some of the more detailed, in-depth research seems to have been abandoned (Hunt (1997) calls it the ‘rush to measurement’). It is easy to fall into the trap of using questionnaires like a form of laboratory equipment (a kind of calibrated

Acknowledgements

I would like to thank NWRHA for providing the Health Services Research Training Fellowship which funded this research and Jean Siddall for her assistance. I am grateful to Jennie Popay, Rob Flynn and Sharon Bennett for their comments on drafts of this paper and to Gareth Williams for his supervision. Particular thanks go to all the staff and patients who gave up time and energy to participate in the study.

References (32)

  • Allison, P. J., et al. (1997). Quality of life as a dynamic construct. Social Science and Medicine.
  • Gibbons, F. X. (1999). Social comparison as a mediator of response shift. Social Science and Medicine.
  • Bowling, A. (1991). Measuring health.
  • Blaxter, M. (1990). Health and lifestyles.
  • Bryant, C. (1995). Practical sociology.
  • Brazier, J., et al. (1992). Validating the SF-36 health survey questionnaire: New outcome measure for primary care. British Medical Journal.
  • Clarke, H. H., & Schober, M. F. (1992). Asking questions and influencing answers. In J. M. Tanur (Ed.), Questions about...
  • Cicourel, A. V. (1964). Method and measurement.
  • Cornwell, J. (1984). Hard earned lives.
  • Donovan, J. L., et al. (1993). Assessing the need for health status measures. Journal of Epidemiology and Community Health.
  • Gerber, E. R., et al. (1997). Perspectives on pretesting: ‘cognition’ in the cognitive interview. Bulletin de Methodologie Sociologique.
  • Gouldner, A. (1967). Enter Plato: Classical Greece and the origins of social theory.
  • Hak, T. (1999). Quality of life in cancer patients. Seminar Paper, Manchester Medical Sociology Group, Spring...
  • Herzlich, C. (1973). Health and Illness: A social psychological analysis.
  • Hunt, S. M. (1997). The problem of quality of life. Quality of Life Research.
  • Hill, S., et al. (1996). Is the SF-36 suitable for routine health outcomes assessment in health care for older people? Evidence from preliminary work in community based health services. Epidemiology and Community Health.