The pressure on both tax and insurance-based health care budgets has been high for many years. On the demand side, pressure is increased by an aging population and higher living standards which raise expectations regarding medical services. On the supply side, expensive new health technologies offer new treatment opportunities for patients but often at a (very) high price. Since these high prices are not necessarily a reflection of large health gains, politicians, academics and society need to answer the question of when such technologies are ‘too expensive’. When is it justified not to reimburse or use such technologies, effectively limiting supply in collectively financed health care systems? Such a question is not easily answered and draws in further questions from different disciplines, including clinical and ethical angles.

Politically, it may be very difficult to deny patients a treatment option that is (to some degree) effective, especially when these patients are in poor health and effective alternative options do not exist. This difficulty is reinforced by the fact that in many societies a considerable proportion of citizens display a clear reluctance to accept negative reimbursement decisions, as many citizens see access to health care as a right [18, 30] that should not be denied for financial reasons (Footnote 1).

Whilst being just one of the relevant academic disciplines to inform this debate, health economics has much to contribute to this important topic, for instance by highlighting the importance of displacement and opportunity costs, by demonstrating the full costs and benefits of new technologies, and by showing how decisions based on such information could be made and argued to be ‘just’ to some extent.

In this editorial, we focus on the latter issue and especially on the question of how a line should be drawn beyond which a technology is considered to be too expensive and, therefore, should not be reimbursed. We will argue that, in terms of both theoretical and empirical research, much work remains to be done. Moreover, if health economic evaluations are to have more impact on decision-making, interactions with decision makers and the public are required to bridge the gap between academic endeavours and societal and political realities.

When is it too expensive?

From an economic viewpoint, the point at which something becomes too expensive is related to the objective function of a decision maker and the constraints he or she faces. Interestingly, the resulting decision rules differ across jurisdictions, including those jurisdictions that use economic evaluations to inform their decisions. A first important distinction in this context is the objective and perspective of the relevant decision maker. Some jurisdictions, such as Wales and England, take a health care perspective in which the goal is to maximize health from a fixed budget. The decision maker thus aims to produce as much health as possible for the population (often measured as Quality Adjusted Life Years or QALYs) from the fixed budget allocated to health care. Other jurisdictions, like The Netherlands, take a broader societal perspective, in which the goal is to maximize social welfare from a more flexible budget. In the latter context, broader impacts on society that fall outside the health care sector are also included in the decision-making process. Below we describe both the societal and the health care perspective on thresholds beyond which technologies are deemed too expensive, in that order.

In the context of the broader societal perspective, the decision rule, firmly rooted in welfare economics (e.g. [19]), can be written as [16] (Footnote 2):

$${v_Q}\Delta Q- \Delta {c_{\text{t}}}>0.$$
(1)

In which vQ denotes the consumption value of health, ΔQ the incremental gain in health (measured in QALYs) and Δct the incremental total costs. Note that Δct denotes the total of both health care costs (Δch) and broader consumption costs (Δcc), so that Δct = Δch + Δcc. Equation (1) can be rewritten as:

$$\Delta {c_{\text{t}}}/\Delta Q<{v_Q}.$$
(1')

This equation shows the incremental cost-effectiveness ratio (ICER) on the left-hand side and simply states that the incremental costs per unit of health gained (QALY) should not exceed the consumption value of this unit of health. In practice, this is translated into the question of whether the ICER exceeds the monetary value of the QALY. From equation (1') it is also clear when something is ‘too expensive’. This is the case when the costs per QALY are higher than the value per QALY. It is intuitive that ‘paying’ more for something than it is worth will result in a welfare loss. Hence, in this decision context vQ is the relevant threshold determining when something becomes too expensive.
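The rule in Eqs. (1) and (1') can be sketched in a few lines of Python. This is only a minimal illustration: the function names and all input values below are hypothetical assumptions, not figures taken from any study or jurisdiction.

```python
def icer(delta_cost: float, delta_qaly: float) -> float:
    """Incremental cost per QALY gained (left-hand side of Eq. 1')."""
    if delta_qaly <= 0:
        # The ratio form of the rule assumes a positive health gain.
        raise ValueError("delta_qaly must be positive")
    return delta_cost / delta_qaly


def net_monetary_benefit(v_q: float, delta_qaly: float, delta_cost_total: float) -> float:
    """Left-hand side of Eq. (1): v_Q * dQ - dc_t."""
    return v_q * delta_qaly - delta_cost_total


# Hypothetical inputs (assumptions for illustration only).
V_Q = 75_000.0             # assumed consumption value of a QALY, in euros
DELTA_Q = 2.0              # assumed incremental QALYs
DELTA_C_TOTAL = 120_000.0  # assumed incremental total costs, in euros

ratio = icer(DELTA_C_TOTAL, DELTA_Q)
nmb = net_monetary_benefit(V_Q, DELTA_Q, DELTA_C_TOTAL)
# 'Too expensive' when the costs per QALY are higher than the value per QALY.
too_expensive = ratio > V_Q

print(f"ICER: {ratio:.0f} per QALY; net benefit: {nmb:.0f}; too expensive: {too_expensive}")
```

With these assumed inputs the costs per QALY stay below the assumed value per QALY, so the technology would not be judged too expensive under Eq. (1').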

As an aside it needs noting that this value is compared to costs per QALY. It deserves emphasis that prices are not equal to costs but may include (substantial) profits. In other words, paying up to the value (vQ) per QALY means that the full welfare surplus created by a technology is transferred from consumers to the producer of a technology. It is questionable whether such a division of surplus is optimal or fair, in particular in the health care context. Value-based pricing should, therefore, not be interpreted as implying that transferring all surplus to producers is necessary, normal, optimal or fair [15].

Using the same notation, the decision rule related to the narrower health care perspective, assuming a fixed health care budget, can be written as:

$$k\Delta Q - \Delta {c_{\text{h}}}>0$$
(2)

in which k is the marginal cost-effectiveness of current spending in the health care system [11], and only health care costs (Δch) are considered. Ideally, k represents the cost-effectiveness ratio of the interventions that get displaced (given the fixed budget) because of funding the new intervention (\(k = \Delta {c_{{\text{hd}}}}/\Delta {Q_{\text{d}}}\), with subscript d denoting the displaced care). Equation (2) can be rewritten as:

$$\Delta {c_{\text{h}}}/\Delta Q<k\;({\text{or}}\;\Delta {c_{\text{h}}}/\Delta Q<\Delta {c_{{\text{hd}}}}/\Delta {Q_{\text{d}}}).$$
(2')

This simply means that the cost-effectiveness of the new intervention should be better than the cost-effectiveness of the displaced care. This is equivalent to stating that the new technology should produce more health per invested euro than it displaces. Hence, under this decision rule k is the relevant threshold value determining when a technology becomes too expensive (Footnote 3).
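The displacement logic behind Eq. (2) may be easier to see when expressed in units of health. The sketch below divides Eq. (2) by k, so that the health forgone elsewhere in the system is made explicit. The function names and all numbers are hypothetical assumptions chosen for illustration.

```python
def health_displaced(delta_cost_health: float, k: float) -> float:
    """QALYs forgone elsewhere when delta_cost_health is drawn from a fixed budget."""
    if k <= 0:
        raise ValueError("k must be positive")
    return delta_cost_health / k


def net_health_benefit(delta_qaly: float, delta_cost_health: float, k: float) -> float:
    """Eq. (2) divided by k: dQ - dc_h / k, expressed in QALYs."""
    return delta_qaly - health_displaced(delta_cost_health, k)


# Hypothetical inputs (assumptions for illustration only).
K = 40_000.0                # assumed marginal cost per QALY of current spending, in euros
DELTA_Q = 2.0               # assumed incremental QALYs of the new technology
DELTA_C_HEALTH = 100_000.0  # assumed incremental health care costs, in euros

displaced = health_displaced(DELTA_C_HEALTH, K)
nhb = net_health_benefit(DELTA_Q, DELTA_C_HEALTH, K)

print(f"QALYs gained: {DELTA_Q}, QALYs displaced: {displaced}, net: {nhb}")
```

In this assumed case the new technology displaces more health than it produces, which is the same as saying that its ICER lies above k.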

In principle, one would expect k to be equal to vQ, as this theoretically would yield an optimal budget for health care. Whenever the health care budget is fixed and non-optimal (so that k ≠ vQ), k will be relevant next to vQ, also in the context of a broader societal perspective and maximizing welfare [10]. The relevant equation for the broader societal perspective, which can still be extra-welfarist [8], then becomes:

$${v_Q}\left[ {\Delta Q- \Delta {c_{\text{h}}}/k} \right] - \Delta {c_{\text{c}}}>0.$$
(3)

It is easy to see that Eq. (3) turns into Eq. (1) again when vQ = k and that it contains Eq. (2) between the brackets. Using this broader framework also allows the operationalisation of a two-perspective approach, as has been advocated before [7, 26]. This would increase comparability between jurisdictions and highlight situations in which both perspectives lead to different conclusions.
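Both observations can be verified directly. As a short sketch using only the symbols defined above, substituting k = vQ into the left-hand side of Eq. (3) and using Δct = Δch + Δcc gives:

$${v_Q}\left[ {\Delta Q- \Delta {c_{\text{h}}}/{v_Q}} \right] - \Delta {c_{\text{c}}} = {v_Q}\Delta Q - \Delta {c_{\text{h}}} - \Delta {c_{\text{c}}} = {v_Q}\Delta Q - \Delta {c_{\text{t}}},$$

which is the left-hand side of Eq. (1). Likewise, for k > 0, the bracketed term is positive exactly when Eq. (2) holds:

$$\Delta Q - \Delta {c_{\text{h}}}/k>0 \iff k\Delta Q - \Delta {c_{\text{h}}}>0.$$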

Equation (3) simply states that the value of the net gain in health, that is the gains of a new technology minus the lost health due to displaced care, should outweigh the consumption costs incurred. If the displaced technology is also assumed to be associated with broader societal costs or gains (e.g. productivity costs or informal care), Δcc should represent the net change of the new activity compared to the displaced activity (see e.g. [1]). Given that the optimality of health care budgets has not been established, information on k and vQ is required to determine whether or not something is too expensive. So what do we know about these quantities?
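To illustrate how the three decision rules can point in different directions, the following sketch evaluates Eqs. (1), (2) and (3) for one hypothetical technology. All names and numbers are assumptions for illustration; in particular, the negative Δcc stands for assumed savings outside the health care sector.

```python
def eq1_societal(v_q: float, d_q: float, d_c_h: float, d_c_c: float) -> float:
    """Eq. (1): v_Q * dQ - dc_t, with dc_t = dc_h + dc_c."""
    return v_q * d_q - (d_c_h + d_c_c)


def eq2_health_care(k: float, d_q: float, d_c_h: float) -> float:
    """Eq. (2): k * dQ - dc_h."""
    return k * d_q - d_c_h


def eq3_societal_fixed_budget(v_q: float, k: float, d_q: float,
                              d_c_h: float, d_c_c: float) -> float:
    """Eq. (3): v_Q * [dQ - dc_h / k] - dc_c."""
    return v_q * (d_q - d_c_h / k) - d_c_c


# Hypothetical inputs (assumptions for illustration only).
V_Q, K = 80_000.0, 40_000.0
D_Q = 2.0            # incremental QALYs
D_C_H = 100_000.0    # incremental health care costs
D_C_C = -30_000.0    # incremental consumption costs (a saving)

r1 = eq1_societal(V_Q, D_Q, D_C_H, D_C_C)
r2 = eq2_health_care(K, D_Q, D_C_H)
r3 = eq3_societal_fixed_budget(V_Q, K, D_Q, D_C_H, D_C_C)

for name, value in [("Eq. (1)", r1), ("Eq. (2)", r2), ("Eq. (3)", r3)]:
    print(f"{name}: {value:.0f} -> {'accept' if value > 0 else 'too expensive'}")
```

Under these assumptions Eq. (1) would accept the technology, while Eqs. (2) and (3) would not, because the health displaced within the fixed budget outweighs the gains once it is valued at vQ.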

Knowledge

First of all, it is striking to see how much research attention has been devoted to the development of methodologies for estimating the left-hand side of Eqs. (1’) and (2’), and how little attention the right-hand side has received. One could say that we have become better and better at producing estimates of the incremental cost-effectiveness of new technologies, but still have fairly little idea about what to compare these figures to. Of course, a comparison of ICERs between different interventions also provides information, but the final judgement of whether something is worthwhile requires knowledge of k and/or v, depending on the applied decision rule. Hence, the imbalance between the academic resources allocated to calculating ICERs and those allocated to determining the threshold to which ICERs ought to be compared is a worrying issue. In other words, as an academic field, we should be as concerned with estimating the monetary value of the QALY and the cost-effectiveness of displaced health care as we are with the estimation of ICERs. Fortunately, in recent years, the attention for the right-hand side of the equation appears to be increasing.

In terms of v, the monetary value of the QALY, several studies have produced estimates of v. Commonly, this is done through stated preference techniques, like willingness to pay studies. It needs emphasis that the methodological problems related to such studies, especially for difficult to value goods like health, are well known (e.g. [6]). In that sense, measurement of v remains highly challenging.

Ryen and Svensson [25] provide a review of the literature, which highlights substantial variation in produced estimates, with a (trimmed) mean of around 75,000 euros. Interestingly, the amount was related to the size of the gain and was higher for life extensions than for quality of life improvements. The latter issues raise questions not only about the relationship between WTP for health gains and the QALY model, but also about the appropriate perspective to take in order to find v. For example, quite some attention has been paid in the literature to ‘equity weights’ for QALY gains (e.g. [21, 22]). People have for instance been shown to prefer QALY gains in younger and more severely ill patients, although the evidence is mixed [24, 32]. Such equity weights may be seen as different social values attached to different QALY gains [5, 33].

In the Netherlands, a maximum ‘v-threshold’ of €80,000 is used in decision-making. This threshold is only used for treatments targeted at diseases that cause a very high proportional loss of remaining health [24]. For less severe diseases, the v-threshold is lower, going down to €20,000 in case of mild diseases (and even zero for very mild diseases—implying that treatments for very mild diseases should not be publicly funded). Such thresholds intend to reflect a societal willingness to pay (WTP) rather than an individual one and may, therefore, express equity concerns [4]. Most empirical studies estimating v, however, estimate individual WTP estimates for own health gains. It remains unclear how these relate to societal decision-making. The use of a threshold range in relation to disease, treatment or beneficiaries’ characteristics also shows that one unique value of a QALY may not exist [9].

If so, it would be better to write Eq. (1’) as \(\Delta {c_{\text{t}}}/\Delta {Q_i} < {v_{{Q_i}}}\), so that a specific social value \({v_{{Q_i}}}\) is attached to specific (classes of) QALY gains \({Q_i}\). Finding such values, in relation to equity weights, is an important and difficult challenge. Note that end-of-life considerations in the UK or absolute shortfall considerations in Norway indicate the international relevance of this issue [20, 23], as well as a lack of international consensus on how to operationalize equity considerations. Arguably, equity weights should be a reflection of country-specific preferences.

An increasing number of studies aims to produce estimates of k, the opportunity costs of health care spending. This is done by considering the (average) marginal gains of increased health care spending. The estimates produced in the UK undoubtedly attracted most attention. The results there showed an average ‘central’ estimate of k of almost £13,000 or ~ €17,700 [12]. More recently, estimates for Spain and Australia were presented which ranged between €22,000–€25,000 and AUD20,758 (~ €13,250)–AUD37,667 (~ €24,100) respectively [14, 31]. In the Netherlands, investigating the marginal returns of spending on cardiovascular treatments in hospitals yielded an estimate of k of about €41,000 [29]. These findings, acknowledging the uncertainty and measurement issues around estimates of v and k, suggest a discrepancy between these ‘reference ICERs’ and commonly used thresholds (with the latter being higher) as well as between v and k. Given the relevance of k in decision-making, potentially also when using a broad societal perspective, this highlights the importance of more research in this area.

In terms of the discrepancy between k and v, a number of observations must be made. First of all, if v is a range related to equity considerations rather than a single estimate, the comparison of v to k also requires knowledge on the relative distribution of different ‘equity types’ of QALY gains. If relatively many interventions are targeted at areas with an average or low equity weight (e.g. based on disease severity), the observed k may be more in line with v than otherwise implied. In the Dutch case, where a range is proposed for decision-making of zero to €80,000 (e.g. [24]), the recent Dutch estimate of k of €41,000 [29] falls within that range and may relate to the value of health gains with an ‘average’ equity weight or value. Second, the estimates for k are based on public expenditures and hence present implicit societal valuations of health gains rather than individual valuations. This hampers a direct comparison with common estimates of v, which typically take an individual perspective [5]. On the other hand, Bobinac et al. [4] did investigate societal valuations of health gains, and found that people were willing to pay up to €52,000 per QALY for gains in others (i.e. gains that would not accrue to themselves) and €83,000 per QALY when the gains were in either others or themselves. Third, k is an estimate of a mean, with health care interventions potentially being both much more and much less cost-effective than represented in the average, making k mostly relevant when it is unknown which health care is displaced. Estimates of k may differ across disease areas and patient subgroups, which is important to consider, also in the context of social decision-making and equity weights. In that sense, it may be better to investigate \({k_j}\), with subscript j indicating relevant health care sectors or disease characteristics. Estimates in the UK show clear differences in productivity of current spending across diseases [12].

The apparent discrepancy between v and k does raise interesting questions, which deserve more theoretical and empirical attention. In theory, and with perfect estimates, it would signal that health budgets should be increased (so that k would move up to v). Before recommending this, more evidence on and understanding of both quantities and their relation is needed. For example, one might argue that the size of health care budgets is the consequence of deliberate allocation decisions by a democratically elected body and hence reflects a ‘revealed societal preference’ for resource allocation, also across competing budgets for public spending. Better understanding of why society accepts different levels of efficiency in different health care sectors or disease areas is also important. We note that good comparisons may require broader outcome measures than QALYs, in that context, as well as knowledge about their value. Estimates of willingness to pay for QALY or wellbeing gains may or may not be considered superior as a source to inform about the optimality of budgets across sectors of public spending.

Two right hands!

Given the increased use of economic evaluation and the need to compare ICERs to a relevant threshold value to judge whether or not the technology helps to optimize health or welfare, better understanding and more precise estimates of both v and k, also in relation to each other and to equity considerations, are required. Although the attention is increasing, too much remains unclear about both quantities.

For k, a better understanding involves more and better estimates of opportunity costs in the health care sector. Ideally, these would be accompanied by more information about the process and the extent of disinvestment, opportunity costs in different sectors (hospital versus primary care) and scalability of current programs. Differences in marginal cost-effectiveness between health care sectors may suggest inefficiencies within the health care sector, or additional constraints [28]. Moreover, it would be interesting to see whether differences between disease areas and sectors in terms of marginal cost-effectiveness reflect societal preferences, including equity considerations. Comparing such k values to sector or disease-specific v values would also be an interesting avenue for future research.

For v, the need for better individual and societal estimates of the monetary value of health gains remains important. Current estimates vary, also due to differences in methodology. New ways of deriving monetary valuations for health gains could be explored (e.g. [2, 27]), if only for validation and better understanding of findings from more commonly applied methods. Further exploration of societal valuations would involve the inclusion of information relevant for equity considerations [33] and require the selection of appropriate equity concepts and appropriate inclusion in economic evaluation (e.g. [13]). This can also be more directly relevant for policy making and current decision-making frameworks, like the one currently used in The Netherlands.

Comparing k and v and better understanding the discrepancies between them remains important as well. Here, it is important to consider that if equity considerations play a role, the distribution of types of care and QALY gains may matter in the comparison. To illustrate this point, consider the Dutch decision framework with values ranging between €20,000 for interventions in the context of ‘low disease burden’ and €80,000 for ‘high disease burden’, with disease burden measured in terms of proportional shortfall [24] (Footnote 4). If most QALYs are gained through interventions targeting diseases with a low burden, the relevant v to compare an observed k to would be closer to €20,000 than to €80,000. Moreover, low observed ICERs in the context of primary care and prevention compared to hospital care may, to some extent, reflect similar distributional concerns.
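The distributional point can be made concrete with a QALY-weighted average of the class-specific values. The sketch below is a simplification: it uses only the two end points of the Dutch range mentioned above and omits any intermediate classes, and the QALY counts per class are purely hypothetical assumptions.

```python
def qaly_weighted_value(v_by_class: dict, qalys_by_class: dict) -> float:
    """Average value per QALY, weighted by the QALYs gained in each burden class."""
    total_qalys = sum(qalys_by_class.values())
    if total_qalys <= 0:
        raise ValueError("total QALYs must be positive")
    weighted_sum = sum(v_by_class[c] * q for c, q in qalys_by_class.items())
    return weighted_sum / total_qalys


# End points of the range mentioned in the text (euros per QALY).
V_BY_BURDEN = {"low": 20_000, "high": 80_000}

# Hypothetical distributions of QALY gains over burden classes (assumptions).
QALYS_MOSTLY_LOW = {"low": 700, "high": 300}
QALYS_MOSTLY_HIGH = {"low": 300, "high": 700}

v_mostly_low = qaly_weighted_value(V_BY_BURDEN, QALYS_MOSTLY_LOW)
v_mostly_high = qaly_weighted_value(V_BY_BURDEN, QALYS_MOSTLY_HIGH)

print(f"Mostly low burden:  {v_mostly_low:.0f} per QALY")
print(f"Mostly high burden: {v_mostly_high:.0f} per QALY")
```

The same pair of class-specific values thus yields quite different reference points for an observed k, depending only on where the QALYs are assumed to be gained.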

Link to policy and society

A final noteworthy issue is the acceptance of any threshold, be it k or v, in society. It is clear that negative reimbursement or funding decisions typically are not well received in society, and consequently, in the political arena. This is especially the case when the intervention is to some extent effective but not cost-effective. Highlighting the health opportunity costs of reimbursement of a cost-ineffective intervention may provide somewhat more support, as a health versus health trade-off may be more acceptable to some than a health versus money trade-off. Even then, it is important to recognise that the viewpoint of the general public towards rationing may be distinct from health economic reasoning or even an adopted and politically endorsed decision-making framework. A recent European study highlighted that a majority of Europeans consider health care to be ‘a right’ [18, 30]. Notions of opportunity costs and limited budgets may not be part of the discourse of many citizens when thinking about choices in health care. In that sense, improving the connection between decision-making and public opinion/support for negative decisions remains highly important.

The decision-making process and policy instruments used in that context may also matter. Depoliticising the decisions on reimbursement of individual treatments, by giving the authority for these decisions to an independent organisation rather than, for instance, a ministry of health, may help to reduce the political pressure on every single decision and may contribute to more consistent decision-making, but may also be perceived as diminishing political accountability for societal decision-making. Policy instruments like price negotiations (compared to immediate decisions based on the proposed price) may help as well. These can also alter the public perception of who is to ‘blame’ for a negative decision. An accountable process of decision-making, in which broader considerations (e.g. ethical) are weighed by appropriate persons, potentially including patients and citizens, may also improve societal acceptance of decisions. These processes may require involvement of stakeholders during the process prior to evidence generation to make sure all data relevant to the decision-making process are collected.

Policy makers need to be aware of the fact that having explicit thresholds may invoke strategic pricing behaviour by firms, resulting in prices that yield ICERs close to the threshold [17]. Price negotiations need to focus on a fair distribution of surplus. When a discrepancy between v and k exists, the lower value of the two would be the logical maximum threshold to consider as starting point in price negotiations.
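As a rough sketch of what the lower of v and k implies for a negotiation, the code below computes the price at which the ICER would equal that threshold. The decomposition of incremental costs into a price plus other costs, the function names and all numbers are hypothetical assumptions, not part of any actual negotiation framework.

```python
def negotiation_threshold(v: float, k: float) -> float:
    """Lower of the two thresholds, taken as the maximum to start negotiations from."""
    return min(v, k)


def price_ceiling(threshold: float, delta_qaly: float, other_incremental_costs: float) -> float:
    """Price at which (price + other incremental costs) / dQ equals the threshold."""
    if delta_qaly <= 0:
        raise ValueError("delta_qaly must be positive")
    return threshold * delta_qaly - other_incremental_costs


# Hypothetical inputs (assumptions for illustration only).
V, K = 80_000.0, 40_000.0
DELTA_Q = 0.5          # assumed incremental QALYs per treated patient
OTHER_COSTS = 5_000.0  # assumed incremental costs other than the price, per patient

threshold = negotiation_threshold(V, K)
ceiling = price_ceiling(threshold, DELTA_Q, OTHER_COSTS)

print(f"Threshold: {threshold:.0f} per QALY; price ceiling: {ceiling:.0f} per patient")
```

A price at this ceiling would leave no surplus on the payer's side relative to the chosen threshold, so under the argument above it is better read as an upper bound from which negotiations start than as a target.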

If ICER calculations, thresholds and decisions based on these are to have impact on actual funding and reimbursement, stronger societal support or acceptance seems required. Otherwise, assessing that something is too expensive may not be translated into actual policies.

Concluding

For deciding whether something is too expensive, thresholds are crucial. Depending on which perspective is taken, the word ‘threshold’ may either refer to the consumption value of health or the marginal cost-effectiveness of current spending. The fact that currently the same word is used for two distinct concepts is not particularly helpful in terms of clarity of discussions. It would be preferable to be precise in what we mean by using distinct terms like k-threshold and v-threshold or supply-side threshold (k) and demand-side threshold (v). It also needs recognition that in a broader societal decision-making framework, allowing non-optimal budgets, both v and k are relevant. Using a two-perspective approach has many advantages, among which the clear necessity to gather more information on v and k. Better understanding of and evidence on both quantities would also involve their context specificity, especially in relation to equity. Given current evidence, a discrepancy between k and v may well exist, which would lead to new scientific and societal questions.

For now, it seems clear that the question of when something is too expensive remains difficult to answer unambiguously. We particularly advocate more research to explore the right-hand side of the equations, that is, v and k. At the same time, more effort to include societal considerations in appraisals, as well as to improve the societal understanding and acceptance of the need for and nature of economic evaluations in health care, remains highly important. Not knowing when interventions are too expensive, or not accepting this as a reasonable argument for decision-making, is a situation that, for both our future health and wealth, will turn out to be too expensive.