The effectiveness of a health care intervention or strategy can be measured in a variety of ways. A commonly used method is to measure and compare health-related quality of life (HrQoL) between groups. HrQoL is a measure of the impact of disease and treatment on an individual’s disability and daily functioning [1]. It includes factors that are part of an individual’s health, excluding non-health aspects such as economic circumstances, and is often used in cost-effectiveness studies [2]. HrQoL outcomes are gathered using questionnaires, and respondents’ answers can be converted into a single utility score, usually between 0 and 1, that reflects the personal desirability of an individual’s health state at a particular point in time [2]. The EQ-5D-5L is often recommended as the instrument to obtain utility scores [3]. To enable this conversion for EQ-5D-5L outcomes, pre-defined country-specific value sets have been developed [4].
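As a minimal sketch of the conversion described above, the following Python fragment maps an EQ-5D-5L response profile (one level from 1 to 5 for each of the five dimensions) to a utility score. It assumes a simple additive form in which the utility equals 1 minus a decrement per dimension; published value sets may use other model forms. The decrements, the dictionary `HYPOTHETICAL_VALUE_SET` and the function `utility_from_profile` are illustrative placeholders and do not reproduce any published country-specific value set.

```python
from typing import Dict, List

# The five EQ-5D-5L dimensions, in the order used for profile strings.
DIMENSIONS = ["mobility", "self_care", "usual_activities",
              "pain_discomfort", "anxiety_depression"]

# HYPOTHETICAL decrements for levels 1..5 of each dimension.
# These numbers are placeholders for illustration only; they are NOT
# taken from the Dutch or any other published value set.
HYPOTHETICAL_VALUE_SET: Dict[str, List[float]] = {
    "mobility":           [0.0, 0.04, 0.06, 0.17, 0.20],
    "self_care":          [0.0, 0.03, 0.05, 0.12, 0.15],
    "usual_activities":   [0.0, 0.03, 0.05, 0.12, 0.15],
    "pain_discomfort":    [0.0, 0.06, 0.09, 0.20, 0.25],
    "anxiety_depression": [0.0, 0.07, 0.10, 0.21, 0.25],
}


def utility_from_profile(profile: str,
                         value_set: Dict[str, List[float]] = HYPOTHETICAL_VALUE_SET) -> float:
    """Convert a five-digit profile such as '21111' into a utility score.

    Assumes an additive model: utility = 1 - sum of per-dimension decrements.
    """
    if len(profile) != 5 or any(ch not in "12345" for ch in profile):
        raise ValueError("profile must be five digits, each between 1 and 5")
    decrement = sum(value_set[dim][int(ch) - 1]
                    for dim, ch in zip(DIMENSIONS, profile))
    return round(1.0 - decrement, 3)


if __name__ == "__main__":
    for example in ("11111", "21111", "11232"):
        print(example, utility_from_profile(example))
```

With these placeholder decrements, the profile `11111` (no problems on any dimension) maps to a utility of 1, and `21111` (slight mobility problems only) maps to 1 minus the level-2 mobility decrement. In practice the dictionary would be replaced by the coefficients of the relevant country-specific value set.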
In cost-effectiveness studies, utility scores are used to calculate quality-adjusted life years (QALYs) for all relevant health states. If utility scores are not available for these health states, assumptions about such utilities have to be made. However, assumptions are sub-optimal compared to objectively measured utilities, as they influence cost-effectiveness ratios and ultimately decision-making [5]. Besides utilities for disease-specific health states, utilities for the general population are also considered relevant. These so-called ‘normative utility scores’ can be used as a comparator for health profiles of patients, based on subgroups with similar age and gender. Additionally, they can be used to compensate for a loss in HrQoL due to factors that are not caused by the disease or intervention of interest [7]. Currently, many cost-effectiveness studies assume a utility of 1 (reflecting perfect health) for the general population. However, Versteegh et al. obtained utilities in a general Dutch population, and the results suggested that utilities of the general population tend to be below 1 [8]. This means that cost-effectiveness studies may overestimate the health of the general population, and thereby overestimate the loss in utility caused by a disease or intervention. Therefore, up-to-date normative utility scores are needed for use in cost-effectiveness studies.
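The reasoning above can be illustrated with a short Python sketch. It computes QALYs as the sum of utility multiplied by time spent in each health state, and compares the utility loss attributed to a disease when the reference is perfect health (utility of 1) versus a normative utility below 1. The function names and all numerical values are hypothetical and chosen for illustration; they are not results from this study.

```python
from typing import Iterable, Tuple


def qalys(states: Iterable[Tuple[float, float]]) -> float:
    """Sum utility * duration (in years) over a sequence of health states."""
    return sum(utility * years for utility, years in states)


def utility_loss(patient_utility: float, reference_utility: float) -> float:
    """Utility loss attributed to the disease relative to a reference."""
    return reference_utility - patient_utility


# HYPOTHETICAL values, for illustration only.
patient_utility = 0.70     # utility measured in a patient group
normative_utility = 0.90   # age- and gender-matched general population

loss_vs_perfect_health = utility_loss(patient_utility, 1.0)
loss_vs_norm = utility_loss(patient_utility, normative_utility)

if __name__ == "__main__":
    # Two years at utility 0.8 followed by one year at utility 0.6.
    print("QALYs:", qalys([(0.8, 2.0), (0.6, 1.0)]))
    print("Loss vs. utility of 1:", loss_vs_perfect_health)
    print("Loss vs. normative utility:", loss_vs_norm)
```

Under these assumed numbers, the loss relative to a utility of 1 is larger than the loss relative to the normative utility, which is the overestimation described in the text.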
Other countries have calculated normative utility scores using the EQ-5D and showed differences between genders [9]. In studies on women’s health, using female-specific normative EQ-5D utility scores may be more accurate than using population norms. Janssen et al. published EQ-5D index value population norms for 20 countries in Europe, including the Netherlands [12]. Data of 2367 people, identified between 2001 and 2003, were used to calculate age-stratified normative utility scores [14]. However, these results were based on the EQ-5D-3L, and the Dutch normative data for the EQ-5D-5L that were published thereafter were not classified by gender [8]. This is a drawback for cost-effectiveness studies in exclusively male or exclusively female populations.
Therefore, the aim of this study was to obtain EQ-5D-5L normative utility scores in a female Dutch cohort, stratified by age. In addition, these normative utility scores were compared to normative utility scores of female cohorts of other countries. Furthermore, three different country-specific value sets were applied to the EQ-5D-5L answers of the Dutch cohort. This analysis was conducted to illustrate the impact of using different value sets on age-specific mean normative utility scores, and to enable their use in cost-effectiveness studies in populations for which country-specific normative utility scores for women are not available.
We obtained normative utility scores using the EQ-5D-5L in a sample of 9037 Dutch females and found relatively high utility values for Dutch females aged 18 to over 75 years. In general, the mean normative utilities were lower in the older age groups, although absolute differences were small. Applying the country-specific value sets of Germany, the UK and the US to the EQ-5D-5L answers of our Dutch sample resulted in consistently higher mean utility scores in all age groups compared with the mean utility scores calculated with the Dutch value set.
Our mean normative utility scores in the younger age groups were slightly lower than previously found in female populations of other countries [9]. This difference may be caused by the sampling method. Young people who are less healthy may spend more time on their computers, mobile phones or social media than healthy adolescents, who are possibly able to do more activities. Therefore, they might have been more likely to encounter the study invitation and more inclined to complete a questionnaire on their health. The normative utility data of female populations of other countries were collected between 2013 and 2017 [9]. The lower Dutch utilities in the younger age groups compared to those of previous studies might be explained by an increase in mental health problems in adolescents over recent years, as observed in the Netherlands [19]. The data of this study were collected at the start of the COVID-19 pandemic, which also led to more anxiety and mental health issues, particularly in females and adolescents, and may have contributed to lower utility scores [20]. In addition, the use of the Dutch value set appears to be partially responsible for the differences in utility scores in the younger age groups (up to 35 years), because the differences in utility became smaller when the German, UK and US value sets were used. In contrast, our mean normative utility scores in the older age groups were higher than those in female populations of other countries. Particularly in these age groups, the differences were enlarged by the use of the German, UK and US value sets. That is, these differences cannot be explained by the value sets themselves.
The oldest age group (> 75 years) showed a relatively high mean normative utility, as none of the participants scored level four or five across all dimensions. This might indicate that older Dutch women have a relatively good quality of life, possibly better than that of older women elsewhere. In contrast to a recently published Russian article reporting normative utility scores, Dutch women did not show many problems in the self-care dimension in any age group [21]. In the current study, the frequency of having any problems in the anxiety/depression dimension decreased with increasing age, whereas it was consistent across all age groups in the Russian population. Although the pattern of having any problems in the mobility dimension was similar in both studies, the frequency in the older age group was considerably higher in the Russian population [21]. However, the high mean normative utilities may also be related to most participants being between 75 and 80 years of age, and no one being older than 90 years. Because more health issues appear with increasing age, this may explain the differences with other studies if they included older participants [21]. In addition, the sample of older participants (n = 34) was relatively small, which reduces the generalizability. Another explanation is the use of social media as a recruitment method, which may have caused some selection bias. Older females who are able and willing to complete an online survey are potentially in better health [24]. On the other hand, the internet is easily accessible in the Netherlands and internet use is higher than in most other Western countries, also among older people [25]. Interestingly, Jiang et al. showed differences in outcome between face-to-face and online sampling, with higher EQ-5D-5L index scores in the face-to-face population for most age groups [9]. However, the index scores of the older participants (i.e. above the age of 65) were slightly higher in the online population [9].
We found statistically significant differences in mean normative utility scores between the age groups. However, based on the results of previous normative studies (in both males and females) in the Netherlands, we had expected larger age-specific absolute differences [26]. Nevertheless, we recommend using age- and gender-specific reference values, as they are important for cost-effectiveness studies and can have a substantial effect on outcomes [5]. It would be interesting to investigate to what extent our age-specific values alter the outcomes of cost-effectiveness analyses. Of note, our normative utility scores are mainly intended to answer women-specific research questions, and they might not be directly comparable to future normative utility scores of Dutch males, as these would not be generated from the same sample.
The key strengths of our study are the use of the EQ-5D-5L to obtain normative utility scores and the large sample size. The EQ-5D-5L is more sensitive than the EQ-5D-3L, which has several limitations (e.g. ceiling effects in patient populations, non-detection of small differences or changes in patients with mild conditions) [27]. Furthermore, the sample size of our cohort was substantially larger (at least three times) than the samples in previous studies and, in combination with the more sensitive 5-level version of the EQ-5D, our study may have resulted in more reliable outcomes [9]. Another strength is that we provide age-specific mean utility scores specifically for women. These could be used as an up-to-date reference point in research and Dutch health policy evaluations, such as breast and cervical cancer screening strategies, and health policies for pregnancy and childbirth. Importantly, our study did not gather demographic data, which makes it difficult to state anything about the representativeness of the population. We used a web-based survey that was disseminated through the institutes’ social media platforms, which are all accessible to the general population. To be able to complete the survey, access to the internet was required. In the Netherlands, internet use has increased over the last decade and is nowadays extremely high, as 95% of the total population has access to the internet [30]. This makes the internet-user population very similar to the general population. Even back in 2013, the internet was the main source used to search for health information (83%) in the Netherlands, and social media is frequently used for this purpose [31]. The percentage of social media use is more than 90% in the age group of 18–54 years, and between 76 and 89% in the age group of 55–64 years of the Dutch population [32]. Although we cannot assume that all female internet users have seen our survey, we believe that the survey reached a large and representative part of the Dutch female population. Despite our large sample size, the group of elderly females was relatively small. In other countries where internet availability is less developed, using this sampling method might be more of an issue because certain populations are possibly left out.
To date, it is unclear whether and to what extent utility measurements on a national level can be generalized to other countries. However, there are differences between the country-specific value sets, even between countries that were expected to have quite similar populations, socioeconomic status, health systems, or attitudes to health [13]. Therefore, using a country-specific value set is encouraged [33]. In this study, a subset of value sets from three other countries was used to calculate utility scores based on the answers to the EQ-5D-5L of our Dutch female cohort. This was done to illustrate the impact of using different value sets on age-specific mean normative utility scores, and also to provide age-specific mean normative utility scores to be used in cost-effectiveness studies in countries for which country-specific normative utility scores for women are lacking. For example, if a breast cancer study were conducted in the UK, researchers would probably prefer to use the UK value set to determine the utilities in patients. To allow for proper comparisons with the general population, they would then best use normative utilities calculated with the UK value set. If age-specific mean normative utility scores for women in the UK are not available, the normative utility scores calculated with the UK value set in this study may be a good alternative. Reporting the normative utility scores for different value sets broadens their applicability in international studies.
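The procedure described above, applying several value sets to the same EQ-5D-5L answers and averaging within age groups, can be sketched in Python as follows. This is an illustration under stated assumptions, not the analysis code of this study: the two value sets (`set_A`, `set_B`), their decrements, the additive model form, the age-group labels and the respondent records are all hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Sequence, Tuple

DIMENSIONS = ("mobility", "self_care", "usual_activities",
              "pain_discomfort", "anxiety_depression")

# HYPOTHETICAL value sets: decrements for levels 1..5 per dimension,
# listed in the order of DIMENSIONS. Placeholders only; these are not
# the Dutch, German, UK or US value sets.
HYPOTHETICAL_VALUE_SETS: Dict[str, List[List[float]]] = {
    "set_A": [[0.0, 0.04, 0.06, 0.17, 0.20],
              [0.0, 0.03, 0.05, 0.12, 0.15],
              [0.0, 0.03, 0.05, 0.12, 0.15],
              [0.0, 0.06, 0.09, 0.20, 0.25],
              [0.0, 0.07, 0.10, 0.21, 0.25]],
    "set_B": [[0.0, 0.02, 0.04, 0.15, 0.18],
              [0.0, 0.02, 0.04, 0.10, 0.13],
              [0.0, 0.02, 0.04, 0.10, 0.13],
              [0.0, 0.03, 0.06, 0.18, 0.23],
              [0.0, 0.03, 0.07, 0.19, 0.23]],
}


def utility(profile: str, decrements: List[List[float]]) -> float:
    """Additive model (an assumption): 1 minus the summed decrements."""
    return 1.0 - sum(decrements[i][int(level) - 1]
                     for i, level in enumerate(profile))


def mean_utility_by_age_group(respondents: Sequence[Tuple[str, str]],
                              decrements: List[List[float]]) -> Dict[str, float]:
    """Group (age_group, profile) records and average the utilities."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for age_group, profile in respondents:
        grouped[age_group].append(utility(profile, decrements))
    return {group: mean(values) for group, values in grouped.items()}


if __name__ == "__main__":
    # HYPOTHETICAL respondent records: (age group, EQ-5D-5L profile).
    sample = [("18-24", "11111"), ("18-24", "21111"), ("65-74", "11121")]
    for name, decrements in HYPOTHETICAL_VALUE_SETS.items():
        print(name, mean_utility_by_age_group(sample, decrements))
```

Because the answers are held fixed and only the value set changes, any difference between the resulting age-specific means reflects the value sets rather than the respondents.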