Introduction

Clinical measures focusing on airways (such as wheeze or dyspnoea or degree of airflow limitation on spirometry) are frequently used to assess asthma severity and control in primary care practice. It is sometimes assumed that these also reflect patients’ overall well-being. However, an individual patient’s perception of airway narrowing is highly variable. A major goal of asthma management is improvement of health-related quality of life (HRQoL).1 It is important to assess HRQoL using standardised questionnaires to gather information complementary to other conventional clinical surrogates of airway inflammation.

As with other chronic disorders, HRQoL in patients with asthma has been assessed using either generic or disease-specific questionnaires in primary care settings. Practically all disease-specific HRQoL questionnaires have been developed and validated in the West. Considering the vast linguistic, ethnic and sociocultural differences in India, the suitability of these questionnaires in patients in this country is uncertain. We have previously evaluated a Hindi adaptation of the Mini Asthma Quality of Life Questionnaire in our patients and found it to be only a moderately discriminative and relatively poor evaluative instrument to assess HRQoL.2 We therefore feel that an ethnically and linguistically appropriate HRQoL measure is necessary for proper HRQoL assessment.

To the best of our knowledge, a disease-specific HRQoL measure has not been developed for Indian patients with asthma. However, a generic Hindi HRQoL questionnaire has been developed in India as part of a multi-country initiative by the World Health Organization.3 The Hindi versions of the 100-item World Health Organization Quality of Life scale (WHOQOL-100) and its 26-item abbreviated version (WHOQOL-Bref) were derived from studies conducted on 304 adult subjects in New Delhi using a bank of 236 items derived from a global question pool as well as those developed locally.3 Validation studies have suggested equivalence between the Hindi and the standard English questionnaires.4 The questionnaire thus appears well suited to HRQoL assessments in the north Indian population, even though its generic nature might result in some loss of sensitivity in detecting HRQoL across disease-specific domains. We therefore applied Rasch analysis in this exploratory study to assess the suitability of using the Hindi version of WHOQOL-Bref to assess HRQoL in asthma patients in north India, and attempted suitable modifications to this instrument to improve performance. The Rasch measurement theory, a modern psychometric approach to instrument validation, not only allows a detailed examination of scaling properties of the instrument but also provides potential solutions to underperforming measures.

Materials and methods

Participants

The participants were enrolled from patients visiting our Chest Clinic. As our institute is a tertiary referral centre, we frequently see patients with poorly controlled asthma and/or asthma-related complications, whereas patients with milder or well-controlled disease form only a small proportion of our outpatient attendance. All enrolled patients underwent detailed symptom enquiry, physical examination and spirometry at the initial evaluation. Only those persons with a good knowledge of both spoken and written Hindi were evaluated further. After initial screening, patients with any associated clinical co-morbidity that could independently alter HRQoL (e.g., cardiovascular, neuromuscular or arthritic disorders) were excluded. Patients with recent disease exacerbation requiring change in medication or use of systemic corticosteroids within the last 4 weeks, those with ongoing or recent upper or lower respiratory tract infection, current or past tobacco smokers and pregnant women were also excluded. Asthma control was categorised according to the Global Initiative for Asthma guidelines.1 Informed consent was obtained from all participants, and the study protocol was approved in advance by our Institutional Ethics Committee.

Design of study

All patients completed the Hindi version of the 26-item WHOQOL-Bref (Table 1).3 WHOQOL-Bref performance was assessed by Rasch analysis. Rasch models use a logit equation (Box 1) to attempt to estimate latent traits (HRQoL in this instance) on an interval scale from questionnaire item scores on an ordinal scale.5 Data were examined using Rasch Unidimensional Measurement Model (RUMM2020) software (Rumm Laboratory, Perth, Australia) that estimated item parameters using the pairwise conditional maximum likelihood procedure and the partial credit approach.6 The partial credit model is useful when response categories in each questionnaire item are ordered but not necessarily equidistant from each other in terms of the latent trait being described. We performed a separate analysis for each domain of the WHOQOL-Bref (physical, psychological, social relationships and environment). Since the first two items of the WHOQOL-Bref are not part of any domain, these were not further examined.
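As an illustration of the partial credit approach described above, the following minimal Python sketch computes category probabilities for a single item from a person location and a set of item thresholds (all in logits). It is not the RUMM2020 estimation procedure; the function name and the threshold values are hypothetical and chosen only to show how four thresholds give rise to five response-category probabilities.

```python
import math


def pcm_category_probabilities(theta, thresholds):
    """Partial credit model: P(X = k) for k = 0..m, given person location
    `theta` and item thresholds tau_1..tau_m (all in logits).

    Category k here corresponds to WHOQOL-Bref response option k + 1.
    """
    # Numerator exponent for category k is sum_{j<=k} (theta - tau_j);
    # the empty sum for the lowest category is 0.
    cumulative = [0.0]
    for tau in thresholds:
        cumulative.append(cumulative[-1] + (theta - tau))
    # Subtract the largest exponent before exponentiating, for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical, well-ordered thresholds for one five-category item.
example_thresholds = [-2.0, -0.5, 0.5, 2.0]
for theta in (-3.0, 0.0, 3.0):
    probs = pcm_category_probabilities(theta, example_thresholds)
    print(theta, [round(p, 3) for p in probs])
```

With ordered thresholds such as these, each response category is the most probable one over some interval of the latent trait; with a single threshold the expression reduces to the dichotomous Rasch model.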

Table 1 Individual items in the WHOQOL-Bref questionnaire

Adequacy of fit of observed data to the Rasch model was assessed through item fit residuals and item–trait interaction chi-square (Box 1). Values of standardised residuals outside the normal range of ±2.5 were considered abnormal, with large negative values signalling local dependency (i.e., various test items being significantly related to each other) and large positive values indicating violation of unidimensionality. A statistically non-significant probability value (after Bonferroni correction) of item–trait interaction chi-square fit statistic was considered indicative of construct validity. Person-item maps were constructed to assess whether questionnaire items targeted the entire range of patients. The person-separation index (Box 1) was calculated for each domain, with values of 0.7, 0.8 and 0.9 representing the capacity to distinguish two, three and four distinct statistically discernible person ability strata, respectively.7 Class interval responses and model expectations were also studied graphically through item characteristic curves.
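The relationship between the person-separation index and the number of distinguishable strata can be sketched as follows. This is an assumption about how the quoted cut-offs arise: it uses the commonly cited separation ratio G = sqrt(R / (1 - R)) and the strata formula (4G + 1) / 3, neither of which is spelled out in the text above. The Bonferroni helper likewise assumes a nominal alpha of 0.05 divided by the number of items tested in a domain, which is an illustrative choice, not a reported one.

```python
import math


def strata_from_separation_index(psi):
    """Approximate number of statistically distinct person strata for a
    reliability-type index `psi` (0 < psi < 1).

    Assumed formulas: G = sqrt(psi / (1 - psi)); strata = (4G + 1) / 3.
    """
    separation_ratio = math.sqrt(psi / (1.0 - psi))
    return (4.0 * separation_ratio + 1.0) / 3.0


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


for psi in (0.7, 0.8, 0.9):
    print(psi, round(strata_from_separation_index(psi), 2))

# Hypothetical example: nominal alpha of 0.05 spread over 7 items in a domain.
print(round(bonferroni_alpha(0.05, 7), 4))
```

Under these assumed formulas, indices of 0.7, 0.8 and 0.9 give values between two and three, approximately three, and between four and five strata respectively, consistent with the interpretation given above.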

Item scoring structure was assessed by threshold analysis (Box 1). Five response categories in WHOQOL-Bref yield four threshold parameters for each item. A distance of 1.4–5.0 logits is desirable between adjacent thresholds to express adequate separation between categories without leaving undue gaps in the measured variable.8 Differential item functioning (DIF) was assessed by noting if patient subgroups (stratified by gender, age, and level of asthma control) at comparable levels on the measured construct respond systematically differently to items.9 DIF was analysed both graphically and by two-way analysis of variance of the residuals (Box 1).10
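The threshold checks described above can be expressed as a small Python sketch. The function name, the returned keys and the example threshold values are hypothetical; only the 1.4 and 5.0 logit limits come from the text. A gap is counted as narrow here whenever it is below 1.4 logits, which includes negative gaps produced by disordered thresholds.

```python
def check_thresholds(thresholds, min_gap=1.4, max_gap=5.0):
    """Inspect the threshold estimates (in logits) of one item.

    Returns whether the thresholds are strictly increasing, plus the
    0-based positions of adjacent-threshold gaps that are narrower than
    `min_gap` or wider than `max_gap`.
    """
    gaps = [upper - lower for lower, upper in zip(thresholds, thresholds[1:])]
    return {
        "ordered": all(gap > 0 for gap in gaps),
        "narrow": [i for i, gap in enumerate(gaps) if gap < min_gap],
        "wide": [i for i, gap in enumerate(gaps) if gap > max_gap],
    }


# Hypothetical items: one ordered with uneven spacing, one disordered.
print(check_thresholds([-3.0, -1.0, 0.2, 6.0]))
print(check_thresholds([0.5, -0.5, 1.0, 0.8]))
```

An item with five response categories has four thresholds and therefore three adjacent-threshold distances to inspect.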

Based on the nature and quantum of misfit to expected models, item deletion and rescoring were applied. Questions with significantly high or low item-fit residuals were removed. Adjacent categories for items showing disordered thresholds or having zero response to a particular category were combined. This process is often referred to as collapsing adjacent categories. The modified scale was again analysed (as above) to confirm improvement in performance.
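Collapsing adjacent categories amounts to recoding responses through an order-preserving map. The sketch below uses the rescoring reported for item 12 in Figure 2 (categories 1 and 2 merged, and categories 4 and 5 merged); the function names and the sample responses are hypothetical and serve only as an illustration.

```python
def is_order_preserving(mapping):
    """True if the recoding never reverses the order of the original categories."""
    recoded = [mapping[category] for category in sorted(mapping)]
    return all(a <= b for a, b in zip(recoded, recoded[1:]))


def collapse_categories(responses, mapping):
    """Recode raw item responses using a category-collapsing map."""
    if not is_order_preserving(mapping):
        raise ValueError("collapsing must merge adjacent categories only")
    return [mapping[response] for response in responses]


# Item 12 rescoring as described in Figure 2: 1 and 2 merged, 4 and 5 merged.
ITEM12_MAP = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}

# Hypothetical raw responses from six patients.
print(collapse_categories([1, 2, 3, 4, 5, 3], ITEM12_MAP))
```

Checking that the map is non-decreasing guards against accidentally merging non-adjacent categories, which would alter the ordinal meaning of the responses.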

Results

We studied 67 consecutive patients with asthma aged 15–66 years (30 men and 37 women, median age 36 years). Asthma control was good in 16 patients (23.9%), partial in 35 (52.2%) and poor in 16 (23.9%).

The overall fit of the WHOQOL-Bref data was adequate. At the item level, item 3 (pain prevents doing work) displayed a large and significant positive fit residual value (indicating some violation of unidimensionality) and a somewhat larger chi-square value relative to other items (see Supplementary Table S1). No other items showed signs of misfit. Item–trait interactions were non-significant (except for physical domain), confirming invariance of items (Table 2). Overall mean item- and person-standardised fit residuals were satisfactory (Table 2). The person-separation index was satisfactory for all domains (Table 2), indicating the general ability of items in these domains to discriminate approximately three groups. The first response category of items 1, 6, 8, 15, 16 and 21 and the last response category of items 3, 4, 6 and 26 were not chosen by any respondent. Taking into account that responses of items 3, 4 and 26 were reversed prior to analysis, the response indicating best health status was not chosen for these nine items.

Table 2 Performance of WHOQOL-Bref before and after modification

Mean±s.d. person location estimates were satisfactory for all domains as they were not significantly different from the corresponding centralised item location means of zero logit (Table 2). This indicated that the study sample as a whole was located at neither a higher nor a lower level of HRQoL than the average of the scale. Overall, therefore, the scale appeared well targeted for this patient group. Graphical analysis (Figure 1), as well as formal statistical testing (Supplementary Table S2), did not suggest significant DIF for any item. Threshold analysis showed that threshold distances between various responses to an item varied across items. Significant anomalies in threshold patterns were observed for the WHOQOL-Bref, as 10 of the 24 items had disordered thresholds (see Supplementary Table S3 and Figure 2). These results suggest that the response scales of several items were inadequate in ordering patients with distinct levels of ability. Moreover, distances between adjacent thresholds were >5 logits in 8 instances and <1.4 logits in 28 instances (see Supplementary Table S3).

Figure 1

Graphical exploration of differential item functioning for item 18 (capacity for work) of the abbreviated World Health Organization Quality of Life. The dashed line corresponds to the item characteristic curve representing the expected probability of item endorsement as a function of person ability. Superimposed plots represent the observed responses by patients of either gender (left panel), different age groups (middle panel) and different levels of asthma control (right panel). For each analysis, patients were divided into three approximately equal groups according to their health-related quality of life. Individual plots for each analysis lie close to each other, with no obvious dissimilarities. Group differences were also statistically non-significant on formal analysis of variance testing, suggesting that item response functions were largely invariant across categories.

Figure 2

Example of category probability curves. The top panel for item 12 (money to meet needs) of the abbreviated World Health Organization Quality of Life reveals disordered and reversed thresholds. There is no point on the continuum where response categories 2 or 4 are the most likely responses. Threshold locations correspond to the points of intersection between the probability curves of two adjacent response categories. The thresholds between categories 1 and 2 and between categories 2 and 3 are reversed, as are those between categories 3 and 4 and between categories 4 and 5. The bottom panel shows the curves redrawn after rescoring category structure (collapsing categories 1 and 2, and 4 and 5). After this merger, the three response categories for this item are well ordered and distributed, with persons with higher ability (or better quality of life) having a progressively greater probability of endorsing a higher response category.

In view of the suboptimal performance of the WHOQOL-Bref (significant item–trait interaction for the physical domain, anomalous threshold patterns for several items and a few response options not being selected by any respondent), the scale was modified by dropping item 3 and collapsing two or more response categories of 16 items (see Supplementary Table S3). The rescored instrument had better construct validity as the previously significant item–trait interaction for the physical domain became non-significant (Table 2). However, the person-separation index did not substantially improve for any domain, indicating that the ability of the revised questionnaire to separate patients with different HRQoL remained largely similar. Overall fit parameters remained satisfactory (Table 2) and threshold analysis revealed ordered thresholds for all items (see Supplementary Table S3 and Figure 2). Person location estimates remained acceptable (Table 2). Person-item maps confirmed that the modified scale was well targeted to the patient population as almost all person ability estimates were well matched by one or more response thresholds (Figure 3). Formal statistical analysis did not demonstrate any DIF (Supplementary Table S2).

Figure 3

Person-item distribution maps for various domains of the abbreviated World Health Organization Quality of Life (WHOQOL-Bref) after rescoring categories. The vertical line represents the measure of the variable in linear logit units. In each panel, the right-hand column locates questionnaire item threshold difficulty measures along the variable. Each entry is indicated by its number in the original questionnaire (see Table 1), followed by the threshold after the decimal point. For instance, the location 04.3 refers to the difficulty calibration estimate of the third threshold (i.e., threshold between the third and fourth response category) of the fourth questionnaire item. The left-hand column locates the patients’ ability measure along the variable, with each plus sign representing 10 patients and each circle representing one patient. From bottom to top, measures indicate better health-related quality of life (for patients) and greater difficulty (for items).

Discussion

Main findings

Our study showed that even though the overall performance of the WHOQOL-Bref was adequate in describing HRQoL in asthmatics, the instrument performed poorly on Rasch analysis (in terms of abnormal fit values and disordered thresholds of individual items). After removing one misfitting item and rescoring the category structure of 16 items, the modified instrument had good construct validity for all domains and ordered thresholds for all items.

Interpretation of findings in relation to previously published work

Both generic and disease-specific questionnaires have been used to quantify HRQoL in patients with asthma. Disease-specific questionnaires appear more relevant and more responsive to changes in disease status and are certainly more popular in describing HRQoL in patients with asthma. However, they tend to perform relatively poorly in Indian patients.2 Of the several generic questionnaires available, Short Form-36 and Sickness Impact Profile have been most widely assessed among asthma patients. Comparative studies have shown that several generic questionnaires have good performance characteristics, often comparable to the asthma-specific questionnaires.11–13 To the best of our knowledge, WHOQOL-Bref has not previously been used to evaluate HRQoL in patients with asthma, although it has been used in patients with chronic obstructive pulmonary disease.14,15 In India, WHOQOL-Bref has already been used for patients with other pulmonary disorders such as lung cancer and tuberculosis.16,17

Even though the overall fit of the WHOQOL-Bref data was adequate, its performance in describing HRQoL in our patients was suboptimal. Item 3 from the physical domain (pain prevents doing work) had a large positive fit residual value. Since pain is not a feature of asthma, it is understandable that this item violated unidimensionality, resulting in poor construct validity of the physical domain. For nine items the response category corresponding to best health status was not marked by any patient. It is possible that no patient had very mild disease, a phenomenon not infrequent at our institute. The most disturbing trends were noticed on threshold analysis. Disordered thresholds in 10 items indicated that the logic of using successive integer scores as a basis for measurement was not satisfied, since a patient with better HRQoL could respond in the same category as another with lower HRQoL. This can occur either because of too many response choices or because of confusing labelling of options. Additionally, eight threshold distances exceeded 5 logits, implying a less informative zone between these categories. Another 28 threshold distances were <1.4 logits, suggesting that these adjacent response categories were not clearly distinct.

The second part of the study involved modification of the WHOQOL-Bref to improve model performance. A similar approach has previously been undertaken to try to adapt generic scales to assess general or disease-specific HRQoL.18–20 Attempts have also been made to shorten the Asthma Quality of Life Questionnaire using such an approach.21,22 Item 3 was removed to improve the construct validity of the physical domain and adjacent categories were combined for 16 items. Proper collapsing improves instrument performance, eliminating redundancy of underused response options and ensuring that each rating category represents a distinct ability level. Indeed, the adapted scale had better performance characteristics. Reducing the number of items/responses in the WHOQOL-Bref while improving the psychometric properties is advantageous in terms of efficiency.

Strengths and limitations of the study

Our study is not without limitations. One obvious factor is the small sample size, which may have resulted in type I errors related to fit statistics.23 However, Rasch analysis can be effectively conducted with small samples and, according to some experts, even a sample size of 50 might prove sufficient for most exploratory work as it gives 99% confidence that no item calibration is >1 logit away from its stable value.24 Based on the results of this preliminary work, we also cannot comment on whether the final scoring matrix of the revised questionnaire is identical to that of the original scale. The focus of our study was assessment of measurement properties of WHOQOL-Bref with reference to the structure and fit of the questionnaire and not the derivation of various domain scores or their clinical relevance. The analysis therefore does not provide any information regarding the clinical validity of this instrument. As our institute is a tertiary referral centre for pulmonary disorders, it is likely that study participants were not representative of the general asthma population of the region. Our findings may therefore not be generalisable to other patient populations such as patients with milder asthma being managed by primary care physicians.

Implications for future research, policy and practice

Application of Rasch modelling is only an initial step. Further analysis of the modified questionnaire structure is needed (in terms of its acceptability, validity, reliability and responsiveness), since collapsing categories or removing items statistically is not the same as presenting patients with a set of reduced response options in a primary care setting.

Conclusions

We have demonstrated that the WHOQOL-Bref is rather inadequate at describing HRQoL in patients with asthma in north India. Rasch analysis enabled us to improve the instrument to achieve better performance. This modified generic scale meets the expectations of Rasch modelling and may therefore be more suited to describe HRQoL.