Handling missing data in patient-level cost-effectiveness analysis alongside randomised clinical trials

  • Original Research Article
  • Applied Health Economics and Health Policy

Abstract

Background

Missing data are potentially an extensive problem in cost-effectiveness analyses conducted alongside randomised clinical trials, where prospective collection of both resource use and health outcome information is required. Records can be incomplete for several reasons, and the validity of an analysis of data with missing values depends on the mechanism that generated the missing data. In the past, the most commonly used methods for analysing datasets with incomplete observations were relatively ad hoc (e.g. case deletion, mean imputation) and suffered from potential limitations. More recently, several more sophisticated approaches (e.g. multiple imputation) have been proposed that attempt to correct the flaws of the simple imputation methods.

Objectives

The objectives are to provide a concise and accessible description of the quantitative methods most commonly used in trial-based cost-effectiveness analysis for handling missing data, and also to demonstrate the potential impact of these alternative approaches on the cost-effectiveness results reported in two case studies.

Methods

Data from two recently conducted, trial-based economic evaluations are used to explore the sensitivity of the study results to the technique used to deal with incomplete observations. A statistical framework for representing the uncertainty in the alternative methods is outlined using an approach based on net benefits and cost-effectiveness acceptability curves.

Results

The case studies demonstrate the potential importance of the approach used to handle missing data. Although the analytical strategy did not appear to alter the results of one of the studies, the other case study showed that the results of the cost-effectiveness analysis were sensitive both to the decision to impute and to the imputation strategy adopted.

Conclusions

Analysts should be more explicit in reporting the analytical strategies applied in the presence of missing data. The use of a multiple imputation approach is recommended in the majority of cases, so as to adequately reflect the uncertainty in the study results due to the presence of missing data.

Table I
Fig. 1
Table II
Fig. 2
Fig. 3

Notes

  1. One scenario in which the CCA estimator is clearly biased is when costs and effects are censored. Lin and colleagues[21] discussed the issue of cost estimation in the presence of censored survival times for some patients in the study. They suggested that the uncensored-cases estimator (i.e. CCA) is biased towards the costs of the patients with shorter survival times, because patients with longer survival times are more likely to be censored.

  2. For an extensive list of papers and reports on both theoretical developments and applications of MI, and for a list of available software to generate MIs, see the website http://www.multiple-imputation.com.

References

  1. Thompson SG, Barber JA. How should cost data in pragmatic randomised trials be analysed? BMJ 2000; 320(7243): 1197–200

  2. Briggs AH, Clark T, Wolstenholme J, et al. Missing… presumed at random: cost-analysis of incomplete data. Health Econ 2003; 12: 377–92

  3. Oostenbrink JB, Al MJ, Rutten-van Molken MPMH. Methods to analyse cost data of patients who withdraw in a clinical trial setting. Pharmacoeconomics 2003; 21(15): 1103–12

  4. Crawford SL, Tennstedt SL, McKinlay JB. A comparison of analytic methods for non-random missingness of outcome data. J Clin Epidemiol 1995; 48: 209–19

  5. Engels JM, Diehr P. Imputation of missing longitudinal data: a comparison of methods. J Clin Epidemiol 2003; 56: 968–76

  6. Liu G, Gould AL. Comparison of alternative strategies for analysis of longitudinal trials with dropouts. J Biopharm Stat 2002; 12(2): 207–26

  7. Musil CM, Warner CB, Yobas PK, et al. A comparison of imputation techniques for handling missing data. West J Nurs Res 2002; 24(7): 815–29

  8. Myers WR. Handling missing data in clinical trials: an overview. Drug Inf J 2000; 34: 525–33

  9. Stinnett A, Mullahy J. Net health benefits: a new framework for the analysis of uncertainty in cost-effectiveness analysis. Med Decis Making 1998; 18: S68–80

  10. Barnard J, Meng XL. Applications of multiple imputation in medical studies: from AIDS to NHANES. Stat Methods Med Res 1999; 8(1): 17–36

  11. Bernhard J, Cella DF, Coates AS, et al. Missing quality of life data in clinical trials: serious problems and challenges. Stat Med 1998; 17: 517–32

  12. Little RJA, Rubin DB. Statistical analysis with missing data. 1st ed. New York: John Wiley and Sons, 1987

  13. Rubin DB. Multiple imputation for nonresponse in surveys. New York: John Wiley and Sons, 1987

  14. Little RJA. Pattern-mixture models for multivariate incomplete data. J Am Stat Assoc 1993; 88: 125–34

  15. Michiels B, Molenberghs G, Lipsitz SR. Selection models and pattern-mixture models for incomplete data with covariates. Biometrics 1999; 55: 978–83

  16. Curran D, Molenberghs G, Aaronson NK, et al. Analysing longitudinal continuous quality of life data with dropout. Stat Methods Med Res 2002; 11(1): 5–23

  17. Schafer JL, Rubin DB. Multiple imputation for missing-data problems. Short course presented at the Joint Statistical Meeting. Co-sponsored by the Survey Research Methods Section and the Biometrics Section, American Statistical Association; 1998 Aug 12; Dallas (TX)

  18. Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B 1977; 39: 1–38

  19. Schafer JL. Analysis of incomplete multivariate data. London: Chapman and Hall, 1997

  20. Gilks WR, Richardson S, Spiegelhalter DJ. Markov chain Monte Carlo in practice. London: Chapman and Hall, 1996

  21. Lin DY, Feuer EJ, Etzioni R, et al. Estimating medical costs from incomplete follow-up data. Biometrics 1997; 53: 419–34

  22. Little RJA, Rubin DB. The analysis of social science data with missing values. In: Fox J, editor. Modern methods of data analysis. Newbury Park (CA): Sage Publications Inc., 1990

  23. Schafer JL. Multiple imputation: a primer. Stat Methods Med Res 1999; 8: 3–15

  24. Rubin DB. Multiple imputation after 18+ years. J Am Stat Assoc 1996; 91: 473–89

  25. Rubin DB, Schenker N. Multiple imputation in health care databases: an overview and some applications. Stat Med 1991; 10: 585–98

  26. Statistical Solutions. SOLAS™ for missing data analysis 2.1 [computer program]. Cork, Ireland: Statistical Solutions, 1999

  27. Tanner MA, Wong WH. The calculation of posterior distributions by data augmentation (with discussion). J Am Stat Assoc 1987; 82: 528–50

  28. van Hout BA, Al MJ, Gordon GS, et al. Costs, effects and c/e-ratios alongside a clinical trial. Health Econ 1994; 3: 309–19

  29. Fenwick E, Claxton K, Sculpher MJ. Representing uncertainty: the role of cost-effectiveness acceptability curves. Health Econ 2001; 10: 779–89

  30. Scott J, Palmer S, Paykel ES, et al. Use of cognitive therapy for relapse prevention in chronic depression: cost-effectiveness study. Br J Psychiatry 2003; 182: 221–7

  31. Paykel ES, Scott J, Teasdale J, et al. Prevention of relapse in residual depression by cognitive therapy: a controlled trial. Arch Gen Psychiatry 1999; 56: 829–35

  32. Efron B, Tibshirani R. An introduction to the bootstrap. New York: Chapman and Hall, 1993

  33. Manca A, Sculpher MJ, Ward K, et al. A cost-utility analysis of tension-free vaginal tape versus colposuspension for primary urodynamic stress incontinence. BJOG 2003; 110(3): 255–62

  34. Lambert PC, Billingham LJ, Cooper NJ, et al. Estimating the cost-effectiveness of an intervention in a clinical trial when partial cost information is available: a Bayesian approach. Leicester: Department of Health Sciences, University of Leicester, 2003. Technical Report no.: 03/03

  35. Rubin DB, Stern HS, Vehovar V. Handling “don’t know” survey responses: the case of the Slovenian plebiscite. J Am Stat Assoc 1995; 90: 822–8

  36. Little RJA. A test of missing completely at random for multivariate data with missing values. J Am Stat Assoc 1988; 83(404): 1198–202

  37. Curran D, Bacchi M, Hsu Schmitz SF, et al. Identifying the types of missingness in quality of life data from clinical trials. Stat Med 1998; 17: 547–55

  38. Willan AR, Briggs AH, Hoch JS. Regression methods for covariate adjustment and subgroup analysis for non-censored cost-effectiveness data. Health Econ 2004; 13(5): 461–75

  39. Landrum MB, Becker MP. A multiple imputation strategy for incomplete longitudinal data. Stat Med 2001; 20(17-18): 2741–60

  40. Little RJA, Yao L. Intent-to-treat analysis for longitudinal studies with drop-outs. Biometrics 1996; 52: 1324–33

  41. Best NG, Spiegelhalter DJ, Thomas A, et al. Bayesian analysis of realistically complex models. J R Stat Soc [Ser A] 1996; 159: 323–42

Acknowledgements

The present research was funded by a grant awarded to Dr Manca by the University of York. An earlier version of this manuscript was presented at the UK Health Economists’ Study Group Conference, September 12–14 2001, City University, London, UK.

We are grateful to Mark Sculpher and Susan Griffin for their comments during the preparation of this manuscript. Thanks to the anonymous referee for useful comments and suggestions for clarification. Any mistakes and omissions remain the authors’.

The authors have no conflicts of interest that are directly relevant to the content of this article.

Author information

Corresponding author

Correspondence to Andrea Manca.

Appendix

Using the notation provided in Schafer,[23] suppose the analyst is interested in analysing the variable Y (e.g. patient-level net benefit), part of which is observed (\(Y_{obs}\)) and part of which is missing (\(Y_{mis}\)).

Let \(\hat Q\) be the estimate of the statistic of interest (e.g. mean costs, mean effects, mean net benefits) and \(\hat U\) its estimated variance, and suppose the researcher has created m completed datasets (equation 1):

$$Y^{(1)}=(Y_{obs},Y^{(1)}_{mis}),\;Y^{(2)}=(Y_{obs},Y^{(2)}_{mis}),\;\ldots,\;Y^{(m)}=(Y_{obs},Y^{(m)}_{mis})$$
(1)

S/he can now calculate m plausible estimates of the statistic of interest, \(\hat Q^{(1)},\ldots,\hat Q^{(m)}\), together with their estimated variances, \(\hat U^{(1)},\hat U^{(2)},\ldots,\hat U^{(m)}\).
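As an illustration of this step, the following minimal Python sketch computes one estimate and its variance per completed dataset. It rests on several assumptions not stated in the article: the statistic is taken to be the mean patient-level net benefit in a single arm, net benefit is computed as threshold × effect − cost, the variance of the mean is estimated as the sample variance divided by n, and the data and the `threshold` value are invented for the example. The function names are hypothetical.

```python
from statistics import mean, variance


def net_benefit(effect, cost, threshold):
    """Patient-level net monetary benefit: threshold * effect - cost."""
    return threshold * effect - cost


def completed_data_estimate(values):
    """Return the sample mean and the estimated variance of that mean."""
    n = len(values)
    return mean(values), variance(values) / n


# Hypothetical completed datasets (m = 3): one (effect, cost) pair per patient.
# Only the third patient differs, standing in for an imputed record.
completed = [
    [(0.70, 1000.0), (0.80, 1500.0), (0.60, 1200.0), (0.75, 900.0)],
    [(0.70, 1000.0), (0.80, 1500.0), (0.65, 1100.0), (0.75, 900.0)],
    [(0.70, 1000.0), (0.80, 1500.0), (0.55, 1300.0), (0.75, 900.0)],
]
threshold = 20000.0  # assumed willingness to pay per unit of effect

# One (estimate, variance) pair per completed dataset.
results = []
for data in completed:
    nb = [net_benefit(e, c, threshold) for e, c in data]
    results.append(completed_data_estimate(nb))

for k, (q_hat, u_hat) in enumerate(results, start=1):
    print(f"dataset {k}: estimate = {q_hat:.1f}, variance = {u_hat:.1f}")
```

In a trial-based analysis the statistic would more usually be an incremental quantity between arms; the single-arm mean is used here only to keep the sketch short.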

In the univariate case, it is possible to combine these quantities to obtain an MI estimate of the statistic of interest, together with its variance, by first calculating the average value of \(\hat Q\) across the m datasets as (equation 2):

$${\overline Q}=m^{-1}\sum^{m}_{k=1}{\hat Q^{(k)}}$$
(2)

The total variance of this estimate can be obtained as (equation 3):

$$T=\overline U+(1+{1\over m})B$$
(3)

Equation 3 combines the ‘within-imputation variance’ (\(\overline U\)) with the ‘between-imputation variance’ (B), where \(\overline U={1\over m}\sum^{m}_{k=1}\hat U^{(k)}\) is the average of the estimated variances across the m datasets, and \(B={1\over m-1}\sum^{m}_{k=1}(\hat Q^{(k)}-\overline Q)^2\) is the sample variance among the m estimates.
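The combining rules in equations 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `pool_rubin` and the input values are hypothetical and are not taken from the case studies.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine m completed-data estimates using equations 2 and 3."""
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need m >= 2 estimates and one variance per estimate")
    q_bar = mean(estimates)        # equation 2: average of the m estimates
    u_bar = mean(variances)        # within-imputation variance
    b = variance(estimates)        # between-imputation variance (divisor m - 1)
    t = u_bar + (1 + 1 / m) * b    # equation 3: total variance
    return q_bar, u_bar, b, t


# Hypothetical estimates and variances from m = 5 completed datasets.
q_hat = [100.0, 110.0, 90.0, 105.0, 95.0]
u_hat = [16.0, 18.0, 14.0, 17.0, 15.0]

q_bar, u_bar, b, t = pool_rubin(q_hat, u_hat)
print(f"pooled estimate = {q_bar:.2f}")
print(f"within = {u_bar:.2f}, between = {b:.2f}, total = {t:.2f}")
```

With these inputs the between-imputation component dominates the total variance, which is the situation in which ignoring the imputation uncertainty would most understate the standard error.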

The analyst can then construct confidence intervals around the estimate \(\overline Q\) using the Student’s t-distribution, with a number of degrees of freedom that is a function of m, B and \(\overline U\).
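The text above does not give the degrees of freedom explicitly. For reference, the form usually attributed to Rubin[13] and reproduced in Schafer's primer[23] is shown below; the symbols \(\nu\) (degrees of freedom) and \(\alpha\) (significance level) are introduced here for illustration and do not appear in the original notation.

$$\nu=(m-1)\left[1+\frac{\overline U}{(1+m^{-1})B}\right]^2$$

$$\overline Q\pm t_{\nu,1-\alpha/2}\sqrt{T}$$

Under this form, \(\nu\) grows as the between-imputation variance B becomes small relative to \(\overline U\), so the interval approaches the usual normal-theory interval when the missing data contribute little uncertainty.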

Cite this article

Manca, A., Palmer, S. Handling missing data in patient-level cost-effectiveness analysis alongside randomised clinical trials. Appl Health Econ Health Policy 4, 65–75 (2005). https://doi.org/10.2165/00148365-200504020-00001

  • DOI: https://doi.org/10.2165/00148365-200504020-00001
