A comparison of multiple imputation with EM algorithm and MCMC method for quality of life missing data

Quality & Quantity

Abstract

This study investigated the performance of multiple imputation with the Expectation-Maximization (EM) algorithm and the Markov chain Monte Carlo (MCMC) method for imputing missing data. We compared imputation accuracy on real data, set up two extreme scenarios, and conducted both an empirical study and a simulation study to examine the effects of the missing data rate and the number of items used for imputation. In the empirical study, the scenario was the item with the highest missing rate from the domain with the fewest items. In the simulation study, we selected the domain with the most items, and the imputed item had the lowest missing rate. In the empirical study, there was no significant difference between the EM algorithm and the MCMC method for item imputation, and the number of items used for imputation also had little impact. Compared with the actual observed values, the middle responses of 3 and 4 were over-imputed, and the extreme responses of 1, 2 and 5 were under-represented. Similar patterns occurred for domain imputation: there was no significant difference between the EM algorithm and the MCMC method, and the number of items used for imputation had little impact. In the simulation study, we chose the environmental domain to examine the effects of the imputation method (EM algorithm or MCMC method), the missing data rate, and the number of items used for imputation. Again, there was no significant difference between the EM algorithm and the MCMC method. Accuracy did not decrease significantly as the proportion of missing data increased. The number of items used for imputation contributed somewhat to imputation accuracy, but not as much as expected.
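The kind of simulation described in the abstract can be illustrated with a minimal Python sketch. It is not the article's procedure: the data are synthetic 5-point items generated from a hypothetical one-factor model, missingness is completely at random, and the imputation is a single EM-based conditional-mean fill under a multivariate normal assumption (not multiple imputation, and not MCMC). The function names `em_mvn`, `simulate_likert`, and `accuracy_of_imputation` are illustrative, and the accuracy rate is assumed to mean the share of imputed responses that match the deleted true responses.

```python
import numpy as np

def em_mvn(X, n_iter=30):
    """EM estimates of mean/covariance for data with NaNs (assumes multivariate normality)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    miss = np.isnan(X)
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0))
    filled = X.copy()
    for _ in range(n_iter):
        extra = np.zeros((p, p))  # conditional covariance of the missing parts
        for i in range(n):
            m = miss[i]
            if not m.any():
                continue
            o = ~m
            # E-step: regress the missing entries on the observed entries
            B = sigma[np.ix_(m, o)] @ np.linalg.pinv(sigma[np.ix_(o, o)])
            filled[i, m] = mu[m] + B @ (X[i, o] - mu[o])
            extra[np.ix_(m, m)] += sigma[np.ix_(m, m)] - B @ sigma[np.ix_(o, m)]
        # M-step: update the parameters from the completed data
        mu = filled.mean(axis=0)
        centered = filled - mu
        sigma = (centered.T @ centered + extra) / n
    return mu, sigma, filled

def simulate_likert(n, p, rng, loading=0.7):
    """Hypothetical one-factor model cut into ordered categories 1..5."""
    factor = rng.normal(size=(n, 1))
    latent = loading * factor + np.sqrt(1 - loading**2) * rng.normal(size=(n, p))
    return np.digitize(latent, [-1.5, -0.5, 0.5, 1.5]) + 1

def accuracy_of_imputation(data, target, rate, rng):
    """Delete a share of the target item at random, impute, and score exact matches."""
    X = data.astype(float)  # astype returns a copy, so `data` is left untouched
    n = X.shape[0]
    idx = rng.choice(n, size=int(round(rate * n)), replace=False)
    X[idx, target] = np.nan
    _, _, filled = em_mvn(X)
    imputed = np.clip(np.rint(filled[idx, target]), 1, 5)  # round to a valid response
    return float(np.mean(imputed == data[idx, target]))

rng = np.random.default_rng(0)
data = simulate_likert(300, 8, rng)
for rate in (0.05, 0.20, 0.40):      # missing data rates
    for k in (2, 4, 8):              # number of items used for imputation
        acc = accuracy_of_imputation(data[:, :k], target=0, rate=rate, rng=rng)
        print(f"rate={rate:.2f} items={k} accuracy={acc:.2f}")
```

Rounding a conditional mean tends to favor central categories, which may be one reason middle responses can be over-imputed in a setup like this; whether that explains the article's result is not established here.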

Author information

Correspondence to Ting Hsiang Lin.

About this article

Cite this article

Lin, T.H. A comparison of multiple imputation with EM algorithm and MCMC method for quality of life missing data. Qual Quant 44, 277–287 (2010). https://doi.org/10.1007/s11135-008-9196-5

  • DOI: https://doi.org/10.1007/s11135-008-9196-5
