Key Points for Decision Makers

This study developed a national value set of the EQ-5D-Y-3L for Hungary, derived from the Hungarian adult general population.

The value set allows utilities to be estimated from EQ-5D-Y-3L responses in children and adolescents and can be used for cost-utility analyses of health technologies.

Hungary is the first country in Central and Eastern Europe with value sets for the EQ-5D-3L, EQ-5D-5L and EQ-5D-Y-3L.

1 Introduction

Health technology assessment (HTA) plays an important role in enhancing health policy decision making about the coverage and use of health interventions [1]. HTA provides a systematic evaluation of new health technologies (e.g. treatments, diagnostic procedures) based on both clinical and economic evidence. In most countries, cost-utility analysis, resulting in cost per quality-adjusted life-year (QALY) estimates, is a preferred form of economic evaluation in national HTA guidelines [2]. The QALY is a summary measure of health gain that takes into account health-related quality of life and the length of survival. To compute QALYs, health-related quality of life is mostly measured using generic preference-accompanied measures [3]. These instruments comprise domains that describe health states (i.e. descriptive system) and a set of preference weights (i.e. value set) that allows utilities to be assigned to health states. In Hungary, HTA guidelines developed by the Ministry of Human Capacities recommend the use of the EuroQol instrument family to obtain utilities [4,5,6]. Hungarian value sets are available for the two adult versions of the EQ-5D, the EQ-5D-3L and EQ-5D-5L, both having been developed according to the latest EuroQol Valuation Technology (EQ-VT) protocol [7].

The EQ-5D-3L and EQ-5D-5L, which have been developed for use in the adult population, are typically not suitable to measure health outcomes in young populations, such as adolescents and children. Adult and youth instruments may differ in a number of ways, including health domains and item content, age-appropriate language and the value sets used to assign utilities to health states [8, 9]. A growing number of generic preference-accompanied measures have been developed for young populations. Examples include the EQ-5D-Y-3L, EQ-5D-Y-5L (experimental version), Child Health Utility 9D (CHU9D), Assessment of Quality of Life-6D (AQoL-6D) Adolescent, Health Utilities Index (HUI) 2 and 3, 16D and 17D [10, 11]. Among these, the largest amount of positive psychometric evidence has been reported for the EQ-5D-Y-3L [12]. The EQ-5D-Y-3L is a three-level youth measure targeting populations aged 8–15 years [13, 14]. However, it requires a separate value set because both the wording of the descriptive system and the utilities elicited for children/adolescents differ from those for adults [9, 15,16,17,18,19,20,21,22].

Recently, an international protocol has been established for the valuation of the EQ-5D-Y-3L, which advocates relying on adult general population preferences for hypothetical health states [23]. Arguments supporting this approach include coherence with the population used to generate value sets for adult measures; the fact that adults bear the costs of healthcare through health insurance; adults’ capacity for decision making (both in general and about health) and legal rights to participate in certain activities (e.g. voting, signing a contract, getting married); ethical difficulties around the acceptability of valuation exercises involving the concept of death for young populations; and matters of understanding and cognitive burden associated with the task in adolescents or children [24,25,26,27]. In Hungary, no value set exists for the EQ-5D-Y-3L or any other youth-specific health status measure. This study therefore aimed to develop a national value set for the EQ-5D-Y-3L in Hungary based on preferences of the general adult population.

2 Methods

This study followed the international valuation protocol for the EQ-5D-Y-3L [23] and the Checklist for Reporting Valuation Studies of the EQ-5D (CREATE) [28]. Ethical approval was obtained from the research ethics committee of the local research team’s institution (no. KRH/31/2021).

2.1 EQ-5D-Y-3L

The official Hungarian version of the EQ-5D-Y-3L was used in this study. The EQ-5D-Y-3L consists of a descriptive system that measures health across five domains and a visual analogue scale (EQ VAS) with endpoints of 0 (the worst health you can imagine) and 100 (the best health you can imagine) [13]. The five domains include mobility (walking about), looking after myself (washing or dressing), doing usual activities (going to school, hobbies, sports, playing, doing things with family or friends), having pain or discomfort, and feeling worried, sad or unhappy. In the EQ-5D-Y-3L, each domain has three response levels (level 1 = no problems/no pain or discomfort/not worried, sad or unhappy; level 2 = some problems/some pain or discomfort/a bit worried, sad or unhappy; and level 3 = a lot of problems/a lot of pain or discomfort/very worried, sad or unhappy) defining a total of 243 different health state profiles. The numbers assigned to the level of each domain can be summarised as a five-digit string, whereby 11111 (full health) indicates no problems in any domains and 33333 (worst health state) refers to the worst level of problems in all domains.
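The five-digit profile notation can be illustrated with a short Python sketch. It enumerates all profiles and maps a profile string to domain levels. The domain abbreviations (MO, LAM, UA, PD, WSU) are borrowed from the worked example in the Results; the function names are illustrative assumptions and not part of any EuroQol software.

```python
from itertools import product

# Domain abbreviations as used in the worked example for state 12321;
# the helper names below are hypothetical.
DOMAINS = ("MO", "LAM", "UA", "PD", "WSU")

def all_health_states():
    """Return every EQ-5D-Y-3L profile as a five-digit string (3^5 profiles)."""
    return ["".join(str(level) for level in levels)
            for levels in product((1, 2, 3), repeat=len(DOMAINS))]

def parse_state(state):
    """Map a five-digit profile such as '12321' to {domain: level}."""
    if len(state) != len(DOMAINS) or any(ch not in "123" for ch in state):
        raise ValueError(f"not a valid EQ-5D-Y-3L profile: {state!r}")
    return {domain: int(ch) for domain, ch in zip(DOMAINS, state)}

states = all_health_states()
print(len(states))            # 243
print(parse_state("12321"))   # level per domain for state 12321
```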

2.2 Study Design

The study used a two-step valuation approach involving data collection through two independent surveys with different modes of administration, samples, and preference elicitation methods (online survey with discrete choice experiment [DCE] tasks and face-to-face computer-assisted personal interviews with composite time trade-off [cTTO] tasks). Both surveys comprised the following elements: introduction page, informed consent page, introductory questions, self-reported health on the EQ-5D-Y-3L to familiarise respondents with the measure, preference elicitation tasks, three debriefing questions and background questions (e.g. employment, marital and parental status, and the presence of chronic health conditions). The face-to-face survey included three introductory questions about the respondents’ age, sex and experience with severe illness, while questions about education, place of living and geographical region appeared among the background questions. By contrast, the online survey contained five introductory questions to allocate respondents to a quota group based on age, sex, education, place of living and geographical region, and an extra question about experience with severe illness.

2.3 Preference Elicitation Methods

The valuation tasks were conducted using the EQ-VT software (v2.1) [29]. Adults completed preference elicitation tasks considering the health of a 10-year-old child (exact phrasing: ‘Considering your views for a 10-year-old child’) [23]. The DCE tasks asked the participants to express their preference between two different EQ-5D-Y-3L health states (labelled as option ‘A’ and option ‘B’). The cTTO task was composed of a conventional 10-year TTO for better-than-dead states and a lead-time variant (i.e. 10 years in full health followed by 10 years in an EQ-5D-Y-3L state) for worse-than-dead states [30]. The smallest tradable unit was 6 months. Examples of the DCE and cTTO tasks are presented in Online Resource 1. At the respondent’s point of indifference (t), the utilities for the EQ-5D-Y-3L health states were calculated as follows:

  • U = t/10 for better-than-dead states

  • U = (t − 10)/10 for worse-than-dead states

The possible range of observed utilities was − 1 to 1.
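The two cTTO formulas can be written as a small function. This is a minimal sketch: it assumes that t is expressed in years, lies between 0 and 10, and moves in half-year steps (the 6-month tradable unit). The function name and the input validation are illustrative and not taken from the EQ-VT software.

```python
def ctto_utility(t, worse_than_dead):
    """Convert the indifference point t (years) into a cTTO utility.

    Assumes 0 <= t <= 10 in half-year steps (an assumption made here).
    """
    if not 0 <= t <= 10 or (t * 2) != int(t * 2):
        raise ValueError("t must be between 0 and 10 in half-year steps")
    if worse_than_dead:
        return (t - 10) / 10   # lead-time variant: utilities from -1 to 0
    return t / 10              # conventional TTO: utilities from 0 to 1

print(ctto_utility(7.5, worse_than_dead=False))  # 0.75
print(ctto_utility(5, worse_than_dead=True))     # -0.5
```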

2.4 Health State Selection

The health states valued in this study were those described in the international valuation protocol of the EQ-5D-Y-3L [23]. For the DCE design, 150 health state pairs were selected using a Bayesian efficient design allowing for the estimation of main effects and all two-way interactions. The design attempts to minimise the number of implausible health states, provides an appropriate balance across both severity levels and utilities, and uses an overlap in two domains for all pairs [18]. The 150 pairs were distributed over 10 blocks of 15 pairs. Respondents were randomly allocated to one of the 10 blocks and the order of the pairs within each block was also randomised. In addition, each respondent completed three fixed dominant pairs, whereby one health state was unambiguously better than the other (e.g. 21123 vs. 22233). The first and last pairs were set as dominant pairs for all respondents, while the third dominant pair was presented at a random point.
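The notion of a dominant pair and the ordering of the 18 DCE tasks can be sketched as follows. The sketch assumes that dominance means 'no worse in any domain and strictly better in at least one'. It also assumes that the three dominant pairs are passed in the order first, floating, last. Both function names are hypothetical.

```python
import random

def dominates(state_a, state_b):
    """True if state_a is no worse in every domain and strictly better in one."""
    levels = list(zip(state_a, state_b))
    return all(a <= b for a, b in levels) and any(a < b for a, b in levels)

def build_task_order(block_pairs, dominant_pairs, rng):
    """Arrange 15 block pairs and 3 dominant pairs into one task sequence.

    dominant_pairs is assumed to be ordered as (first, floating, last).
    """
    first, floating, last = dominant_pairs
    tasks = list(block_pairs)
    rng.shuffle(tasks)                                     # randomise order within the block
    tasks.insert(rng.randrange(len(tasks) + 1), floating)  # third dominant pair at a random point
    return [first] + tasks + [last]

print(dominates("21123", "22233"))  # True: the example pair from the text
```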

For the cTTO design, one block of 10 health states was used, containing three mild states (11112, 11121 and 21111), two moderate states (22223 and 22232), four severe states (31133, 32223, 33233 and 33323) and the worst health state (33333). The order of the health states was randomised across respondents. According to the EQ-VT protocol, each respondent valued two wheelchair examples, three practice health states (21112, 32323, 13311) and 10 ‘real’ EQ-5D-Y-3L states [29, 31]. After the valuation, a ranking of the 10 health states (‘feedback module’) based on their responses was displayed to the respondents who had the opportunity to remove one or more cTTO valuations (regardless of being inconsistent or not) [32].

2.5 Sampling, Recruitment, and Data Collection

The target sample size was approximately 1000 respondents for the DCE and 200 for the cTTO survey. Members of the general population aged 18 years or above who provided informed consent were included in the study. The language of both surveys was Hungarian. For the DCE survey, participants were recruited from members of a large Hungarian online panel. Respondents for the cTTO interviews were recruited by the interviewers using their existing contacts and volunteers. For both surveys, a quota sample was drawn using interlocking quotas for age and sex to achieve population representativeness according to census data reported by the Hungarian Central Statistical Office [33]. Furthermore, we used separate ‘soft’ quotas for education, place of living and geographical region. We attempted to fill these quota brackets so as to approach the census data as closely as possible, but we did not close the quotas when these were completed. Online panel members received survey points redeemable for gift vouchers or prize-drawing tickets as compensation, while no remuneration was provided for participation in the cTTO interviews. The online DCE survey took place in April and May, and the cTTO interviews were performed between March and September 2021. The interviews were conducted by a team comprising a PhD student (A.B.) and three MSc students, each of whom conducted 50 interviews. All but one interviewer had extensive experience with valuation studies as they had previously worked as interviewers in the adult EQ-5D-3L and EQ-5D-5L parallel valuation study in Hungary [7]. Interviewers received standardised training on valuation methods, the EQ-VT protocol and the quality control (QC) procedure. Each interviewer conducted 10 pilot interviews before the formal data collection commenced.

2.6 Quality Control

Both the DCE and cTTO surveys included a QC process developed for valuation studies of EuroQol instruments. In the DCE survey, the following QC rules were established [34, 35]:

  1. Speeding: respondents were required to spend at least 150 s on completing the 18 (15 ‘real’ and 3 ‘dominant’) DCE pairs.

  2. Dominance test: respondents were expected to answer all three dominant pairs correctly.

Respondents failing any of the above tests were blocked from proceeding to the background questions and were excluded from the data analysis on quality grounds. In the cTTO interviews, the standard QC procedure was followed to assess interviewers’ performance in terms of protocol compliance and interviewer effects [32]. Interviews were flagged if any of the following four QC criteria was met: (1) the worse-than-dead format was not explained in the wheelchair task; (2) the interviewer spent less than 3 min explaining the wheelchair tasks; (3) the 10 ‘real’ health states were completed in less than 5 min; and (4) the worst health state (33333) was valued at least 5 years better than the health state given the lowest value (‘33333 inconsistency’). Data quality was primarily monitored by one of the authors (F.R.) and discussed with the EQ-VT support team at 25, 50 and 75% of the target sample size. The QC criteria for the cTTO tasks did not serve as exclusion criteria.
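The QC rules can be expressed as simple checks. The sketch below is one possible reading, not the implementation used in the study. Converting '5 years' into a utility difference of 0.5 on the 10-year TTO scale is an assumption, and both function names and the input formats are hypothetical.

```python
def passes_dce_qc(total_seconds, dominant_correct, min_seconds=150):
    """DCE rule: no speeding and all three dominant pairs answered correctly.

    dominant_correct is an assumed input: three booleans, one per dominant pair.
    """
    not_speeding = total_seconds >= min_seconds
    passed_dominance = len(dominant_correct) == 3 and all(dominant_correct)
    return not_speeding and passed_dominance

def flag_33333_inconsistency(utilities, threshold=0.5):
    """cTTO flag: 33333 valued at least `threshold` above the lowest-valued state.

    utilities maps state -> cTTO utility for one respondent; 0.5 is taken to
    correspond to 5 years on a 10-year scale (an assumed conversion).
    """
    lowest = min(utilities.values())
    return utilities["33333"] - lowest >= threshold

print(passes_dce_qc(200, [True, True, True]))                      # True
print(flag_33333_inconsistency({"33333": 0.0, "33233": -0.5}))     # True
```

In line with the text, a failed DCE check would lead to exclusion, whereas the cTTO flag only informs the monitoring of interviewers.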

2.7 Data Analysis and Modelling

Respondents’ background characteristics were summarised using descriptive statistics. The value set was developed using both DCE and cTTO data. Following the international valuation protocol [23], statistical modelling was conducted in two steps. First, the DCE valuation data were used to estimate the relative importance of domains and levels on a latent scale. The dependent variable was coded 1 for the chosen option (e.g. ‘A’) in each choice task and 0 for the other alternative (e.g. ‘B’). The models included main effect dummy-coded variables for level 2 and level 3 problems in each domain, resulting in a total of 10 parameters. DCE responses were analysed in a probabilistic choice/random utility model framework (‘mlogit’ package in R), using both ‘conditional logit’ (fixed coefficients) and ‘mixed logit’ (correlated random coefficients) variants [36]. Mixed logit is preferred over the conditional logit model as it allows for unobserved heterogeneity in individuals’ preferences over choice option characteristics [37]. Maximum likelihood estimation was implemented using the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm. Concerning the mixed logit model, unconditional likelihood values were approximated via 5000 pseudo-random draws from the (supposedly normal) distribution of correlated random coefficients. Multiple criteria were used in selecting between DCE models. These included theoretical considerations (e.g. accounting for preference heterogeneity), goodness of fit as measured by the log-likelihood value and the Bayesian information criterion (BIC), and prediction accuracy in terms of the mean absolute error (MAE) and mean square error (MSE) of the observed versus predicted choice probabilities for the 150 pairs. 
For the latter two, in both the conditional and mixed logit models, we used the following formulas:

$$\mathrm{MAE}=\frac{1}{S}\sum_{s=1}^{S}\left|m\left(p_{s}(A)\right)-\mathrm{Pr}_{s}(A)\right|,\qquad \mathrm{MSE}=\frac{1}{S}\sum_{s=1}^{S}\left(m\left(p_{s}(A)\right)-\mathrm{Pr}_{s}(A)\right)^{2}$$

whereby $S = 150$ is the number of health state pairs involved in the DCE task, $s$ is the index for identifying health state pairs, $\mathrm{Pr}_{s}(A)$ is the proportion of individuals who were presented health state pair $s$ and chose option A, and $m(p_{s}(A))$ is the mean estimated probability of choosing option A across individuals who were presented health state pair $s$, calculated according to each specific model version.
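The MAE and MSE definitions above can be sketched in a few lines of Python. The example also shows the standard binary logit probability of choosing option A given the latent values of the two states. The function names are illustrative, and the numbers in the usage lines are made up for demonstration and are not study data.

```python
import math

def prob_choose_a(v_a, v_b):
    """Binary logit probability of choosing A given latent values v_a and v_b."""
    return 1.0 / (1.0 + math.exp(v_b - v_a))

def mae_mse(predicted, observed):
    """MAE and MSE between predicted and observed choice proportions per pair."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - o for p, o in zip(predicted, observed)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    return mae, mse

# Hypothetical proportions for two pairs (not study data).
mae, mse = mae_mse([0.5, 0.75], [0.25, 0.75])
print(mae, mse)  # 0.125 0.03125
```

In the study, S = 150 pairs would enter these sums. For the mixed logit, the predicted value per pair is the mean of the individual-level probabilities.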

In the second step of modelling, the cTTO data were used for anchoring the best-performing DCE model onto the 0 (= dead) to 1 (= full health) scale. No respondents were excluded from the analysis of cTTO data. Respondents who gave the same ‘1’ value to all health states (i.e. non-traders) were not excluded as we considered these valuations to be reflections of respondents’ true preferences. However, following previous valuation studies, responses flagged by participants on the feedback module were excluded from the data analysis [38]. Two different anchoring methods were tested to link DCE and cTTO data: worst health state anchoring and mapping [39]. For worst health state anchoring, we rescaled the regression coefficients for each level of each dimension in the DCE model by the ratio of the mean observed cTTO disutility of the worst health state (33333) to the latent scale DCE value thereof. For the mapping, the mean observed cTTO disutilities of the 10 health states were regressed on the latent scale DCE values thereof using an ordinary least squares regression. We tested the model with and without a constant and selected the preferred model based on parameter significance and model fit (R-squared). The estimated regression coefficient (β) was used to rescale each regression coefficient of the best-performing DCE model.
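The two anchoring methods can be sketched as follows. The sign convention is an assumption made for readability: disutilities (1 minus utility) and latent scale values are both entered as positive magnitudes, so the scaling factor here is positive whatever sign convention the fitted models use. The mapping function is an ordinary least squares slope without a constant. All function names and the numbers in the usage lines are hypothetical.

```python
def worst_state_factor(disutility_worst, latent_worst):
    """Ratio of the mean cTTO disutility of 33333 to its latent scale value."""
    return disutility_worst / latent_worst

def mapping_factor(latent_values, disutilities):
    """OLS slope through the origin: beta = sum(x*y) / sum(x*x)."""
    if len(latent_values) != len(disutilities) or not latent_values:
        raise ValueError("inputs must be non-empty and of equal length")
    sxy = sum(x * y for x, y in zip(latent_values, disutilities))
    sxx = sum(x * x for x in latent_values)
    return sxy / sxx

def rescale(coefficients, factor):
    """Multiply every latent scale coefficient by the anchoring factor."""
    return {name: coef * factor for name, coef in coefficients.items()}

# Hypothetical latent values and disutilities for three states (not study data).
beta = mapping_factor([1.0, 2.0, 4.0], [0.1, 0.2, 0.4])
anchored = rescale({"PD3": 4.0}, 0.5)
print(round(beta, 6))  # 0.1
```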

The two anchoring approaches were compared by computing MAE between predicted and mean observed cTTO utilities for the 10 health states and also separately for the three mild states. For all analyses, a p value < 0.05 was considered statistically significant. Statistical analyses were conducted in Stata 14 (StataCorp LLC, College Station, TX, USA) and R version 4.0.5 (R Core Team, Vienna, Austria).

3 Results

3.1 Respondent Characteristics

In total, 3273 individuals responded to the invitation to participate in the online DCE survey, 1251 (38.2%) of whom completed the survey. Of non-completers, 159 (7.9%) declined consent, 1518 (75.1%) belonged to a quota that was full, and 345 (17.1%) dropped out or lost the internet connection. Of the 1251 completers, 255 (20.4%) did not pass the QC criteria and thus were excluded. The majority of these respondents failed the dominance test(s) (n = 219, 85.9%) [Online Resource 2]. Overall, 221 people were invited for the cTTO survey, 16 (7.2%) of whom were unwilling to participate, 3 (1.4%) were not interviewed due to conflicting schedules, and 2 (0.9%) refused to engage in face-to-face interactions due to the coronavirus disease 2019 (COVID-19) pandemic. Thus, 996 and 200 respondents from the DCE and cTTO surveys, respectively, formed the final analytical samples for this study.

The sociodemographic and health-related characteristics of the samples are reported in Table 1. The DCE and cTTO samples were similar in most characteristics. The overall study population was closely representative of the general population based on age, sex, geographical region, place of living, employment, and marital status; however, we had an overrepresentation of college/university-educated respondents.

Table 1 Characteristics of the sample

3.2 Descriptive Statistics of the Composite Time Trade-Off (cTTO) Utilities

Two (1.0%) of the 200 cTTO interviews were flagged on the QC criteria (one on wheelchair time and another on 33333 inconsistency). There were two (1.0%) non-traders. A total of 40 (20.0%) individuals had at least one inconsistent cTTO response (18.0% related to moderate or severe health states, 1.5% involving 33333 and 0.5% related to the mild states). Fifty-eight respondents (29.0%) flagged a total of 68 (3.4%) cTTO responses on the feedback module, which reduced the rate of inconsistency to 8.0% (all related to the moderate or severe states). The distribution of cTTO responses (after the feedback module) is plotted in Fig. 1. Overall, 13.0% of the 1932 unflagged cTTO responses were at 1, 28.0% were worse than dead, and 2.7% were at − 1. The mean observed cTTO utilities for the 10 health states ranged between 0.948 (11112) and − 0.517 (33333) [Table 2].

Fig. 1 Distribution of observed cTTO utilities for the 10 EQ-5D-Y-3L health states. Responses flagged by respondents on the feedback module are not included. cTTO composite time trade-off

Table 2 Observed summary statistics for the 10 EQ-5D-Y-3L health states (cTTO)

3.3 Modelling Discrete Choice Experiment (DCE) Data

Both the conditional and mixed logit models resulted in larger utility decrements with more severe response levels in each domain (Table 3). In both models, the order of domains by the magnitude of level 3 coefficients was having pain or discomfort > feeling worried, sad or unhappy > mobility > doing usual activities > looking after myself. Compared with the conditional logit, the mixed logit model demonstrated considerably better goodness-of-fit statistics and prediction accuracy, including higher (i.e. less negative) log-likelihood (conditional vs. mixed logit: − 6480 vs. − 5602) and lower BIC (13056 vs. 11830), MAE (0.0474 vs. 0.0348) and MSE (0.0040 vs. 0.0022). The mixed logit was selected as the final model based on its better fit indices, prediction accuracy and ability to account for preference heterogeneity.

Table 3 Results of DCE modelling

3.4 Anchoring Onto cTTO Utilities

The ordinary least squares regression without constant was selected as the preferred mapping model to rescale the DCE model, with model parameters and fit statistics as follows: β = − 0.09509 (SE 0.00201, p < 0.001), R-squared 0.996, model F test (1,9) 2230.36 (p < 0.001). The two anchoring approaches (worst health state anchoring and mapping) produced very similar models in terms of parameter estimates and prediction accuracy (Table 4). With worst health state anchoring, the predicted utility for the worst health state (33333) was slightly lower than with mapping (− 0.517 vs. − 0.485) and the proportion of health states with negative utilities was somewhat higher (13.2 vs. 12.3%). The agreement between the predicted and observed utilities is displayed in Fig. 2. The mapping approach was selected to produce the final value set because it resulted in marginally better agreement with mean observed cTTO utilities, as indicated by the lower MAE values both for all 10 health states (worst health state anchoring vs. mapping: 0.0486 vs. 0.0485) and for the three mild states separately (0.0311 vs. 0.0298). With the selected value set, utilities for the 243 EQ-5D-Y-3L health states ranged between − 0.485 (33333) and 1 (11111), with a utility of 0.962 for the mildest impaired state (12111). The full set of the 243 predicted EQ-5D-Y-3L utilities is provided in Online Resource 3. For example, using the Hungarian value set, the predicted utility for the EQ-5D-Y-3L health state 12321 may be calculated as follows:

$$U\left(12321\right)=1+\mathrm{MO}1+\mathrm{LAM}2+\mathrm{UA}3+\mathrm{PD}2+\mathrm{WSU}1=1+0-0.038-0.252-0.133+0=0.577$$

Table 4 Anchoring DCE onto cTTO utilities

Fig. 2 Agreement between predicted and observed utilities. cTTO composite time trade-off, DCE discrete choice experiment
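The worked calculation for state 12321 can be reproduced with a short Python sketch. Only the three decrements quoted in that calculation (LAM2, UA3 and PD2) are filled in. The remaining level 2 and level 3 decrements would need to be taken from Table 4 and are deliberately left out here. The function name and the dictionary layout are illustrative assumptions.

```python
DOMAINS = ("MO", "LAM", "UA", "PD", "WSU")

# Decrements quoted in the worked example only; all other entries must be
# supplied from Table 4 and are intentionally not guessed here.
DECREMENTS = {"LAM2": -0.038, "UA3": -0.252, "PD2": -0.133}

def predict_utility(state, decrements):
    """Utility = 1 plus the decrement of each domain at level 2 or 3."""
    if len(state) != len(DOMAINS) or any(ch not in "123" for ch in state):
        raise ValueError(f"not a valid EQ-5D-Y-3L profile: {state!r}")
    utility = 1.0
    for domain, level in zip(DOMAINS, state):
        if level == "1":
            continue                      # level 1 carries no decrement
        key = f"{domain}{level}"
        if key not in decrements:
            raise KeyError(f"decrement {key} not supplied")
        utility += decrements[key]
    return round(utility, 3)

print(predict_utility("12321", DECREMENTS))  # 0.577
print(predict_utility("12111", DECREMENTS))  # 0.962
```

Raising an error for a missing decrement, rather than treating it as zero, avoids silently overstating utilities when the coefficient table is incomplete.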

4 Discussion

In this study, we developed a national value set of the EQ-5D-Y-3L for Hungary, derived from the Hungarian adult general population. This is the first value set for any generic preference-accompanied measure targeting younger populations in Hungary.

The Hungarian EQ-5D-Y-3L value set was estimated using a mixed logit model with random correlated coefficients, and mapping the resultant values onto the QALY scale. The domains with the largest utility decrements were ‘having pain or discomfort’ and ‘feeling worried, sad or unhappy’. The domain of ‘looking after myself’ was associated with the smallest utility decrements, very likely because self-care may be interpreted differently in the case of a child due to the existing adult support network even for healthy children [21]. Although a direct comparison with the Hungarian adult EQ-5D-3L value set is limited due to the differences in wording of descriptive systems, preference elicitation methods (adult: cTTO vs. youth: DCE and cTTO), valuation perspective (adult for self vs. adult considering the health of a 10-year-old child) and data collection period (adult: pre-COVID vs. youth: during the pandemic), it is notable that in the Hungarian adult value set, the domains of mobility and self-care have the greatest utility decrements. Our results are similar in range and domain importance order to those reported from other European countries with EQ-5D-Y-3L value sets. In Hungary, the utility of the worst health state was − 0.485, compared with utilities of − 0.283 in Germany [40], − 0.539 in Spain [34] and − 0.691 in Slovenia [41].

Following the EQ-5D-Y-3L international valuation protocol [23], we asked adults to complete the valuation tasks considering the health of a 10-year-old child. However, there may be systematic differences in preferences and value sets depending on who (e.g. adult vs. adolescent) and for whom (self vs. other, adult vs. child) values health states [42]. There is an increasing body of literature that suggests the feasibility of including adolescents in valuation of the EQ-5D-Y-3L [43,44,45]. This is also supported by evidence from valuation studies of other instruments, such as the CHU9D and vignette-based studies [46, 47]. In DCE studies with the EQ-5D-Y-3L in the UK, Spain, Germany and Slovenia, adolescents gave less importance to ‘doing usual activities’, ‘having pain or discomfort’ and ‘feeling worried, sad or unhappy’ than adults valuing health states for a child [18, 19]. Furthermore, the perspective is also challenging as there may be differences in the person respondents imagine (e.g. themselves as a child, their own child, another child they know, or no particular child) [42]. A recent study has shown, for example, that respondents are less willing to trade life-years on behalf of others [22]. A promising new approach is to elicit preferences from a mixed sample of adults and adolescents [24].

This study has several strengths. The data collection was among the first wave of youth EQ-5D valuation studies conducted in over 10 countries, including Belgium, China, Germany, Indonesia, Japan, The Netherlands, Slovenia and Spain [34, 40, 41, 48, 49]. Furthermore, a similar EQ-VT protocol and the same software and interviewers were used in this study as for the valuation of the adult EQ-5D-3L and EQ-5D-5L in Hungary in 2018–2019, maintaining methodological consistency [7]. However, this study also has some limitations. First, preference data were collected during the COVID-19 pandemic with varying levels of stringency and vaccine availability during the study period. It is unknown if the pandemic had an impact on how the general population valued health states, in particular whether these utilities are different from pre-COVID utilities and whether they will remain stable for the post-pandemic era. Findings from a recent study comparing EQ-5D-5L health state valuations (by using EQ VAS) before and during the pandemic in the UK suggest that COVID-19 may have affected health preferences among members of the general public [16]. Nevertheless, findings from EQ VAS values may not be generalisable to cTTO utilities. Second, the online DCE survey may be subject to selection bias because digital literacy and internet access were prerequisites to participate in the study. Furthermore, 255 respondents were excluded from the DCE survey due to speeding or providing inconsistent responses, a decision that may be considered overly strict towards fast decision makers and those susceptible to random mistakes. Third, despite the overall good representativeness and similarities in terms of sociodemographic characteristics, the DCE and cTTO samples differed in their recruitment and incentivisation strategies and this may represent a further limitation. Fourth, the cTTO task involved one block of 10 health states, which somewhat limited our options for modelling. 
Finally, in both the DCE and cTTO tasks, health states were valued considering the health of a 10-year-old child. Further research is warranted to establish whether such preferences are representative of the entire age range of the instrument (8–15 years) [50, 51].

5 Conclusions

This study provided a national value set of the EQ-5D-Y-3L for Hungary based on general adult population preferences. The value set allows utilities to be estimated from EQ-5D-Y-3L responses and can be used for cost-utility analyses of health technologies for young populations. Given the widespread use and acceptance of the adult EQ-5D questionnaires by researchers, physicians, analysts and policymakers in healthcare in Hungary, it is hoped that the EQ-5D-Y-3L value set will soon be adopted by users.