Abstract
Introduction: This study investigated the effects of imputing missing data in the WHO Quality of Life Abbreviated Questionnaire (WHOQOL-BREF). The imputation results from both the item and domain levels were compared and the impact of the missing data rate and the number of items included for imputation were examined.
Methods: An empirical analysis and a simulation study were used to examine the effects of missing data rates and the number of items used for imputation on the accuracy for imputation. In the empirical analysis, both item-level and domain-level imputations were performed, and the missing values were imputed using different amounts of data. In the simulation study, sets of 2%, 5% and 10% of the data were drawn randomly and replaced with missing values. Twenty datasets were generated for each situation. The data were imputed and the accuracy of the imputation was reported.
Results: In the empirical study, the number of items used for imputation had only a small impact on the accuracy of imputation. Furthermore, in the simulation study, the accuracy rates of imputation did not significantly change as the proportions of missing data increased. However, the number of items used in the computation did contribute to some extent to the missing values imputed. Extreme responses had the worst computations and the lowest accuracy rates.
Conclusion: It is recommended that as many items as possible be included for imputation within the same domain. However, it is not particularly helpful to use items from different domains for imputation. Researchers should exercise extra caution in interpreting the imputed values of extreme responses.
Similar content being viewed by others
References
Olschewski M, Schilgen G, Schumacher M, et al. Quality of life assessment in clinical cancer research. Br J Cancer 1994; 70: 1–5
Curran D, Fayers P, Molenberghs G, et al. Analysis of incomplete quality-of-life data in clinical trials. In: Staquet MJ, Hays RD, Fayers PM, editors. Quality of life assessment in clinical trials: methods and practice. New York: Oxford University Press, 1998
Fayers PM, Machin D. Quality of life: assessment, analysis, and interpretation. New York: Wiley, 2000
Fayers PM, Curran D, Machin D. Aspects of incomplete quality of life data in randomized trials: I. Missing items. Stat Med 1998; 17: 679–96
Curran D, Molenberghs D, Fayers P, et al. Aspects of incomplete quality of life data in randomized trials: II. Missing forms. Stat Med 1998; 17: 697–709
Little R, Rubin D. Statistical analysis with missing data. New York: Wiley, 1987
Muthén B, Kaplan D, Hollis M. On structural equation modeling with data that are not missing completely at random. Psychometrica 1987; 52: 431–462
Myrtveit I, Stensrud E, Olsson U. Analyzing data sets with missing data: an empirical evaluation of imputation methods and likelihood based methods. IEEE Transactions on Software Engineering 2001; 27: 999–1013
Arbuckle JL. Full information estimation in the presence of incomplete data. In: Marcoulides GA, Schumacker RE, editors. Advanced structural equation modeling. Mahwah (NJ): Lawrence Erlbaum Associates, 1996
Brown RL. Efficacy of the indirect approach for estimating structural equation models with missing data: a comparison of five methods. Structural Equation Modeling 1994; 1: 287–316
Wothke W. Longitudinal and multigroup modeling with missing data. In: Little TD, Schnabel KU, Baumert J, editors. Modeling longitudinal and multiple group data: practical issues, applied approaches and specific examples. Mahwah (NJ): Lawrence Erlbaum Associates, 2000
Glasser M. Linear regression analysis with missing observations among the independent variables. J Am Stat Assoc 1964; 59: 834–844
Haitovsky Y. Missing data in regression analysis. J R Stat Soc Ser B 1968; 30: 67–82
Kim JO, Curry J. The treatment of missing data in multivariate analysis. Sociol Methods Res 1997; 6: 215–240
Fayers PM, Aaronson NK, Bjordal K, et al. EORTC QLQ-C30 scoring manual. Brussels: European Organisation for the Research and Treatment of Cancer (EORTC), 1995
Anderson TW. Maximum likelihood estimates for multivariate normal distribution when some observations are missing. J Am Stat Assoc 1956; 52: 200–203
Browne CH. Asymptotic comparison of missing data procedure for estimating factor loadings. Psychometrika 1983; 48: 269–291
Little JR, Rubin D. The analysis of social science data with missing values. Sociol Methods Res 1989; 18: 292–326
Neal MC. Mx: statistical modeling. 3rd ed. Richmond (VA): Department of Psychiatry, Medical College of Virginia, Virginia Commonwealth University, 1995
Graham JW, Hofer SM, MacKinnon DP. Maximizing the usefulness of data obtained with planned missing value patterns: an application of maximum likelihood procedures. Multivariate Behav Res 1996; 31: 197–218
Graham JW, Hofer SM. Multiple imputation in multivariate research. In: Little TD, Schnabel KU, Baumert J, editors. Modeling longitudinal and multilevel data: practical issues, applied approaches, and specific examples. Mahwah (NJ): Lawrence Erlbaum Associates, 2000
Rovine MJ. Latent variable models and missing data analysis. In: von Eye A, Clogg CC, editors. Latent variable analysis: applications for developmental research. Thousand Oaks (CA): Sage Publications, 1994
Verleye G. Missing at random data problems and maximum likelihood structural equation modelling [reprint]. Interuniversity paper in demography. Gent: Universiteit Gent, 1997. Working paper no.: 1997-3
Jöreskog KG, Sörbom D. LISREL 8 user’s reference guide. Chicago (IL): Scientific Software International, 1996
Jöreskog KG, Sörbom D. PRELIS 2 user’s reference guide computer software. Chicago (IL): Scientific Software International, 1993
World Health Organization. International classification of impairments, disabilities and handicaps. Geneva: World Health Organization, 1980
Lin TH, Chang HY, Weng WS, et al. The National Health Interview Survey Information System: an overview. J Taiwan Public Health 2003; 22: 431–440
Enders CK, Bandalos DL. The relative performance of full information maximum likelihood estimation for missing data in structural equation models. Structural Equation Modeling 2001; 8: 430–457
van Buuren S, van Rijckevorsel JLA. Imputation of missing categorical data by maximizing internal consistency. Psychometrika 1992; 57: 567–580
Acknowledgements
No sources of funding were used to assist in the preparation of this article. The author has no conflicts of interest that are directly relevant to the content of this article. The author thanks the Bureau of Health Promotion, Department of Health and National Health Research Institute in Taiwan for providing the data.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Lin, T.H. Missing Data Imputation in Quality-of-Life Assessment. Pharmacoeconomics 24, 917–925 (2006). https://doi.org/10.2165/00019053-200624090-00008
Published:
Issue Date:
DOI: https://doi.org/10.2165/00019053-200624090-00008