Abstract
Existing test statistics for assessing whether incomplete data represent a missing completely at random (MCAR) sample from a single population are based on a normal likelihood rationale and effectively test for homogeneity of means and covariances across missing data patterns. The likelihood approach cannot be implemented adequately if a pattern of missing data contains very few subjects. A generalized least squares (GLS) rationale is used to develop parallel tests that are expected to be more stable in small samples. Three factors were varied in a simulation: number of variables, percentage of data missing completely at random, and sample size. One thousand data sets were simulated for each condition. The GLS test of homogeneity of means yielded Type I error rates close to the ideal rate in most conditions. The GLS test of homogeneity of covariance matrices and a combined test also performed quite well.
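To make the small-pattern problem concrete, the following is a minimal sketch, not the authors' simulation code. It generates data with values deleted completely at random and counts how many subjects fall into each missing data pattern. The function names `simulate_mcar` and `pattern_counts`, the independent standard-normal variables, the per-value deletion scheme, and the settings (100 subjects, 4 variables, 10% missing) are all illustrative assumptions rather than details taken from the article.

```python
import random
from collections import Counter


def simulate_mcar(n_subjects, n_vars, pct_missing, seed=0):
    """Hypothetical helper: draw standard-normal rows, then set each value
    to None independently with probability pct_missing (an MCAR mechanism)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_subjects):
        row = [rng.gauss(0.0, 1.0) for _ in range(n_vars)]
        # Deletion ignores the data values entirely, so missingness is MCAR.
        row = [None if rng.random() < pct_missing else x for x in row]
        data.append(row)
    return data


def pattern_counts(data):
    """Hypothetical helper: count subjects per missing data pattern,
    where a pattern is a tuple of 1 (observed) / 0 (missing) flags."""
    return Counter(tuple(0 if x is None else 1 for x in row) for row in data)


# Illustrative settings only; the article's actual conditions are not given here.
data = simulate_mcar(n_subjects=100, n_vars=4, pct_missing=0.10, seed=1)
counts = pattern_counts(data)

# Patterns with very few subjects (threshold of 3 chosen arbitrarily) are the
# ones for which pattern-specific estimates would be unstable.
small = {pattern: n for pattern, n in counts.items() if n < 3}

for pattern, n in sorted(counts.items(), key=lambda item: -item[1]):
    print(pattern, n)
```

With small samples and several variables, many of the observed patterns tend to contain only a handful of subjects, which is the situation the abstract identifies as problematic for the likelihood-based tests.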
Cite this article
Kim, K.H., Bentler, P.M. Tests of homogeneity of means and covariance matrices for multivariate incomplete data. Psychometrika 67, 609–623 (2002). https://doi.org/10.1007/BF02295134