Introduction
Many questionnaires have been developed to assess the different facets of cancer patients’ experiences before, during, and after their treatment. For example, there are questionnaires to assess their satisfaction with care, their health-related quality of life (HRQoL), or their preferred communication style with their oncologists. When developing scales for a general cancer population or testing differences between groups using well-established questionnaires, an important question to keep in mind is whether members of different groups assign the same meaning to questionnaire items. In other words, if two patients have the same level of overall satisfaction, will they respond to an observed item in the same way, or will specific characteristics, like gender or treatment regime, influence their response to the item? If it can be shown that these characteristics do not affect responses to observed items, then the assumption of measurement invariance has been met.
The assumption of measurement invariance requires that the relationships between the observed items and the latent construct remain constant regardless of respondents’ group membership (for example, age, race, or disease characteristics) or the measurement occasion [1, 2]. If this assumption is violated, then the results from cross-group comparisons of the construct may be incorrect. This is because mean differences should represent true differences in the construct of interest and not reflect anything else. For example, it may be that a male patient and a female patient share the same underlying level of Physical HRQoL. However, when asked a question about carrying groceries, the male who does not do the shopping may respond that he has no difficulty with this activity, whereas the female may indicate that she has great difficulty. The responses given to this grocery item are related not only to Physical HRQoL but also to gender roles. In this example, it is easy to see how gender roles and Physical HRQoL can become entangled. However, it may not always be obvious how patient characteristics might affect certain items. In a study by Reker and Fry [3], bias with respect to age was found in personal meaning measures. The authors concluded that bias in the Self-Transcendence Scales stemmed from older adults using events from the past as their frame of reference, whereas younger adults used present and future events as their frame of reference. When developing items for a scale, this type of bias will be difficult to anticipate, and success can only be evaluated after scale development and piloting. If invariance testing shows that the measurement is invariant, we can be confident that our results are not distorted by the measurement functioning differently as a result of group membership. Unfortunately, measurement invariance of self-report questionnaires is often not investigated.
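The grocery example can be made concrete with a small simulation. The sketch below is illustrative only: the loading, noise level, and size of the bias are assumed values, not estimates from any study. Two groups are drawn from the same latent distribution, so any difference in observed item means on the biased item reflects the item and not the construct.

```python
import numpy as np

def simulate_item_means(n=20000, bias=0.5, seed=0):
    """Compare observed item means for two groups with identical latent levels.

    All numeric values (loading, noise SD, bias) are hypothetical.
    """
    rng = np.random.default_rng(seed)
    loading, noise_sd = 0.8, 0.6

    # Both groups share the same latent distribution: no true difference.
    latent_a = rng.normal(0.0, 1.0, n)
    latent_b = rng.normal(0.0, 1.0, n)

    # Invariant item: the response depends on the latent variable only.
    fair_a = loading * latent_a + rng.normal(0.0, noise_sd, n)
    fair_b = loading * latent_b + rng.normal(0.0, noise_sd, n)

    # Biased item: group B receives an extra shift unrelated to the latent variable.
    biased_a = loading * latent_a + rng.normal(0.0, noise_sd, n)
    biased_b = bias + loading * latent_b + rng.normal(0.0, noise_sd, n)

    return {
        "fair_diff": float(fair_b.mean() - fair_a.mean()),
        "biased_diff": float(biased_b.mean() - biased_a.mean()),
    }

if __name__ == "__main__":
    print(simulate_item_means())
```

Under these assumptions, the group difference on the invariant item stays close to zero, while the biased item shows a difference close to the injected shift even though the groups do not differ on the construct.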
Establishing that a scale has good reliability and validity does not ensure that the scale will not violate the assumption of invariance. The European Organization for Research and Treatment of Cancer (EORTC) QLQ-C30, a measure developed to assess HRQoL, is considered to have excellent reliability and validity [4]. Specifically, it is a generic HRQoL measure for use with all cancer patients, with additional modules for specific cancer diagnoses (e.g., breast [5] and lung cancer [6]). Despite the scale's excellent reliability and validity, its factor structure has received little attention [7, 8], and most measurement invariance testing has been conducted using item response theory (IRT) [9], with the primary focus on language translation [10, 11]. Although the scale was designed for a general and therefore heterogeneous cancer population, this heterogeneity may itself lead to a violation of measurement invariance. If we are interested in, for example, differences in HRQoL based on different treatment stages or information preferences, then before differences can be investigated, we must check whether measurement bias with respect to these variables is present. For example, patients who have already received treatment for their diagnosis may have experienced an unmeasured response shift [12]. This in turn could result in a shift in internal standards when responding to HRQoL items, whereas patients who have yet to be treated will not have experienced this phenomenon. With regard to patients’ information preferences, it is conceivable that patients with high information preferences respond to HRQoL items using a different frame of reference than patients with low information preferences. This might be because patients who want more information may want it in order to inform their family and friends of their treatment and prognosis [13]; they might therefore have a different frame of reference toward social functioning. Thus, before we investigate the relationships between these variables and HRQoL, we need to be sure that differences in HRQoL mean the same thing for patients in different treatment stages or with different information preferences.
Testing the assumption of measurement invariance in different situations and with different groups of people has been greatly facilitated by the development of several analytic techniques, including IRT and structural equation modeling (SEM)/confirmatory factor analysis (CFA) [14]. Within the framework of SEM, there are three approaches available to assess whether the assumption of invariance holds. In cross-sectional research, multi-group CFA comparisons are the most frequently used method. However, to conduct such an analysis, a large sample is required, as the sample must be split by group membership. Also, if the potential violator of invariance is continuous, the variable must be transformed to a discrete variable to create multiple groups, which results in a loss of information. Restricted factor analysis (RFA) is one alternative. The RFA specification allows multiple grouping variables to be tested simultaneously (e.g., sex and race), and continuous variables can be included as originally measured (e.g., age). These additional variables are modeled as single-indicator exogenous variables in the RFA model and tested as possible violators of invariance [15, 16]. The RFA model is equivalent in overall fit and yields the same results as the third alternative, the multiple indicator, multiple cause (MIMIC) model. The difference between these two models is in how the relationships between the exogenous variables and the common factor(s) are modeled. In the MIMIC model, these relationships are causal, whereas in RFA the relationships are not necessarily causal [17]. As we do not necessarily expect causal relationships between the exogenous variables and HRQoL, and as RFA has been shown in simulation studies to be a robust method [15, 16], we will use RFA.
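The distinction between the RFA and MIMIC specifications can be written compactly for the simplest case of one common factor and one exogenous variable. The notation below is an assumption introduced for illustration and is not taken from the original analysis: $x_j$ is the $j$-th observed scale, $\xi$ the common factor, $v$ the exogenous variable (e.g., age), and $b_j$ the direct effect that represents measurement bias.

```latex
% Illustrative notation only; requires amsmath.
\begin{align}
x_j &= \tau_j + \lambda_j \xi + b_j v + \varepsilon_j, \qquad j = 1, \dots, p
  && \text{(measurement part, shared by RFA and MIMIC)} \\
\operatorname{Cov}(\xi, v) &= \phi_{\xi v}
  && \text{(RFA: free covariance, no direction implied)} \\
\xi &= \gamma v + \zeta
  && \text{(MIMIC: regression of } \xi \text{ on } v\text{)}
\end{align}
```

In this sketch, scale $j$ is invariant with respect to $v$ when $b_j = 0$, and a violation corresponds to a non-zero direct effect $b_j$. The direct effects are typically fixed at zero in the baseline model and freed selectively, since the model is not identified if every $b_j$ is estimated together with the relationship between $v$ and $\xi$.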
By using the RFA approach, we can obtain further insight into the psychometric properties of the EORTC QLQ-C30 in a heterogeneous cancer sample. To achieve this, we simultaneously include and study multiple variables that have the potential to violate the assumption of measurement invariance. The aim of this paper is therefore to investigate whether HRQoL scales are invariant with respect to age, sex, previous treatment for cancer, and patients’ information preferences. If any of the observed scales are biased with respect to the exogenous variables, group comparisons in relation to the variables investigated will be less meaningful. In doing so, we aim to better understand the construct of HRQoL as measured in a heterogeneous cancer population.
Discussion
In this study, we investigated the assumption of measurement invariance for the EORTC QLQ-C30 in a heterogeneous population of cancer patients who were about to begin their first session of radiotherapy. Applying RFA, we investigated age, sex, previous treatment for cancer, and information preferences regarding treatment as potential violators of invariance. Two violations were identified: one in Physical Functioning with regard to age and one in Emotional Functioning with regard to previous treatment.
In the first step, we were able to fit a measurement model to the EORTC QLQ-C30 that had satisfactory fit. Our final measurement model did not include any of the symptom scales of the EORTC QLQ-C30. However, Fayers and Hand have argued that these symptom scales should not be used as manifestations of underlying HRQoL but rather as manifestations of treatment [28]. This is because one would expect a different factor structure for symptoms depending on the type of treatment the patient was undergoing. While this substantive debate is beyond the scope of this paper, the patients in the current sample are in different stages of treatment, and this may explain why the inclusion of symptom scales did not lead to a satisfactory model. Only once we had identified a satisfactory measurement model were we able to investigate invariance in the EORTC QLQ-C30; it was therefore in Step 2 that we identified the two violations of invariance.
The direct effect between age and Physical Functioning suggested that if younger and older patients had the same underlying Functioning HRQoL, older patients reported their Physical Functioning to be worse than younger patients did. This result was also found in another study, in which measurement invariance of the SF-36 [29], another HRQoL measure, was investigated in a sample of cancer patients [30]. The authors suggested that because Physical Functioning is the most objective HRQoL scale, leaving little room for individualized interpretation, it is conceivable that it is the other scales that are biased because there is more room for subjective interpretation. An alternative model could be fitted that allowed direct effects between age and the other scales, excluding Physical Functioning; however, this model would include many additional parameters and as a result be less parsimonious. Therefore, we opt for parsimony and the model with the fewest instances of measurement bias. As a result of our finding, care should be taken when making age comparisons with any of the EORTC QLQ-C30 scales. Recently, the EORTC QLQ-ELD15 [31] was developed specifically for older adults, though it is ideal to have observed variables that are invariant to the effects of age.
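What a negative direct effect of age means can be illustrated with simulated data in which the latent variable is known. This is a hedged sketch and not the RFA estimation itself: in real data the latent variable is unobserved and the direct effect must be estimated within an SEM, whereas here an ordinary least-squares regression on the simulated latent score is used purely to show the interpretation. All coefficients are hypothetical.

```python
import numpy as np

def estimate_direct_effect(n=20000, direct_effect=-0.3, seed=1):
    """Recover a simulated direct effect of age on one indicator.

    The latent score is available only because the data are simulated;
    every coefficient below is an assumed, illustrative value.
    """
    rng = np.random.default_rng(seed)

    age = rng.normal(0.0, 1.0, n)  # standardized age
    # Latent functioning is allowed to correlate with age (as in RFA).
    latent = -0.2 * age + rng.normal(0.0, 1.0, n)
    # The indicator depends on the latent variable AND directly on age.
    physical = 0.8 * latent + direct_effect * age + rng.normal(0.0, 0.5, n)

    # Regress the indicator on the latent score and age (with intercept).
    design = np.column_stack([np.ones(n), latent, age])
    coef, *_ = np.linalg.lstsq(design, physical, rcond=None)
    return {"loading": float(coef[1]), "direct_effect": float(coef[2])}

if __name__ == "__main__":
    print(estimate_direct_effect())
```

Holding the latent score constant, the age coefficient remains negative: at the same underlying level of functioning, older simulated respondents report lower values on the indicator, which is the pattern described above for Physical Functioning.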
The direct effect between previous treatment and Emotional Functioning was also negative. In other words, radiotherapy patients who had undergone a previous treatment (chemotherapy/surgery) evaluated their Emotional Functioning as worse than those who had not undergone treatment before starting radiotherapy, even when their underlying Functioning HRQoL was similar. The different interpretation of Emotional Functioning might be due to resource depletion [32]. According to resource models, self-regulatory resources can be depleted or fatigued by self-regulatory demands. Muraven et al. [33] found that one route to self-regulatory failure is prior self-regulatory activity. In their laboratory studies, participants who were asked to employ a form of self-regulation (e.g., mental control or regulation of emotional expression) were less able to self-regulate afterwards (see also [34]). Previous treatment can be regarded as a prior self-regulatory effort in which emotions needed to be regulated. Undergoing more treatment might decrease Emotional Functioning because regulatory resources are depleted. This depletion may result in a different frame of reference with regard to Emotional Functioning for patients who have already undergone treatment. Interestingly, the latent construct of Functioning HRQoL was not reliant on self-regulatory efforts, as evidenced by the very small relationship between previous treatment and Functioning HRQoL. To better understand this relationship, more research with longitudinal data is needed.
Previous research has shown that the EORTC QLQ-C30 has excellent psychometric properties [35] and is used extensively to assess HRQoL [36–38]. For the aim of our study, which was to investigate invariance, we believe the model we used was a good representation of the functioning scales of the EORTC QLQ-C30. The two instances of non-invariance detected do not suggest that Physical Functioning and Emotional Functioning are not valid indicators of Functioning HRQoL, but rather that care should be taken when using the functioning scales to compare younger and older adults and patients at different stages of treatment. Although our sample was small, which limits generalization, the direct effect between age and Physical Functioning has been identified in previous research, indicating that it is worth further investigation in a longitudinal study. In addition, it would be worthwhile to consider invariance of the EORTC QLQ-C30 with respect to different cancer diagnoses, different treatment regimes, and different stages of cancer. Focusing on these specific variables would lead to greater confidence when comparing differences in HRQoL in relation to these variables.
Accounting for violations of the assumption of measurement invariance in our study led to a significant improvement in overall model fit. The inclusion of patient characteristic variables in our model initially resulted in a model where the estimates could not be confidently interpreted. However, after the inclusion of direct effects accounting for bias, our model fit was satisfactory and conclusions regarding the model could be drawn. It is important to note that a single violation of invariance may not be enough to cause unsatisfactory model fit, but could still have a substantial impact on the conclusions drawn. In other words, if the assumption of measurement invariance is ignored, the researcher cannot be sure whether observed differences reflect true differences in HRQoL or differences in how patients interpret the HRQoL items.