Consequences of Misspecifying the Error Covariance Structure in Linear Mixed Models for Longitudinal Data
Abstract
Repeated measures and longitudinal data are frequently analyzed using a linear mixed model. Under this approach, rather than presuming a particular covariance structure, analysts choose the structure that best describes their data before carrying out the inferences of interest. Because the underlying covariance structure cannot be known in advance, researchers often use fit criteria to select among candidate covariance structures. SAS Institute's (2004) Proc Mixed program allows users to select a covariance structure by comparing Akaike's Information Criterion (AIC), Hurvich and Tsai's Criterion (AICC), Schwarz's Bayesian Criterion (BIC), Bozdogan's Criterion (CAIC), and Hannan and Quinn's Criterion (HQIC). Monte Carlo methods are used to examine the performance of these criteria and to investigate the effects of misspecification on the properties of the resulting inferences. The simulation results show that none of the criteria always led to selection of the correct model and that misspecification has negative consequences for the estimated standard errors of linear combinations and for tests. When data were generated from symmetric distributions, the best AIC model generally provided robust Type I error control for tests of fixed effects. However, when data were generated from distributions with moderate or severe skewness, none of the criteria provided valid tests.
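The five criteria named above are all penalized forms of minus twice the maximized log-likelihood, and the structure with the smallest value is the one selected. The following is a minimal Python sketch of such a comparison, not the Proc Mixed implementation. It uses the standard smaller-is-better formulas; the function names, the log-likelihood values, the parameter counts, and the effective sample size n are hypothetical, and Proc Mixed applies its own conventions for n and for which parameters are counted.

```python
import math


def information_criteria(loglik, d, n):
    """Return smaller-is-better criteria for one fitted covariance structure.

    loglik: maximized log-likelihood of the fitted model
    d:      number of estimated parameters counted in the penalty (assumption)
    n:      effective sample size used in the penalty (assumption)
    """
    deviance = -2.0 * loglik
    return {
        "AIC": deviance + 2 * d,
        "AICC": deviance + 2 * d * n / (n - d - 1),
        "HQIC": deviance + 2 * d * math.log(math.log(n)),
        "BIC": deviance + d * math.log(n),
        "CAIC": deviance + d * (math.log(n) + 1),
    }


def select_structure(fits, n, criterion="AIC"):
    """Pick the structure whose chosen criterion is smallest.

    fits maps a structure label to a (loglik, d) pair.
    """
    scores = {
        name: information_criteria(loglik, d, n)[criterion]
        for name, (loglik, d) in fits.items()
    }
    return min(scores, key=scores.get)


# Hypothetical fits: compound symmetry, first-order autoregressive, unstructured.
fits = {
    "CS": (-520.3, 2),
    "AR1": (-512.8, 2),
    "UN": (-504.0, 10),
}
n = 30  # hypothetical effective sample size

chosen = {c: select_structure(fits, n, c) for c in ("AIC", "AICC", "HQIC", "BIC", "CAIC")}
print(chosen)
```

With these made-up numbers, AIC favors the heavily parameterized unstructured matrix while BIC, which penalizes each parameter by log(n), favors the more parsimonious autoregressive structure. Disagreements of this kind are why the choice of criterion matters for the selected model.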
References
Akaike, H. (1974). A new look at the statistical model identification. IEEE Transaction on Automatic Control, AC-19, 716–723.
Bozdogan, H. (1987). Model selection and Akaike's Information Criterion (AIC): The general theory and its analytical extensions. Psychometrika, 52, 345–370.
Bradley, J. (1978). Robustness? British Journal of Mathematical and Statistical Psychology, 31, 144–152.
Burnham, K.P., Anderson, D.R. (2002). Model selection and multimodel inference: A practical information-theoretic approach (2nd ed.). New York: Springer-Verlag.
Cressie, N.A.C. (1991). Statistics for spatial data. New York: Wiley.
Dawson, K.S., Gennings, C., Carter, W.H. (1997). Two graphical techniques useful in detecting correlation structure in repeated measures data. American Statistician, 45, 275–283.
Ferron, J., Dailey, R., Yi, Q. (2002). Effects of misspecifying the first-level error structure in two-level models of change. Multivariate Behavioral Research, 37, 379–403.
Fitzmaurice, G.M., Laird, N.M., Ware, J.H. (2004). Applied longitudinal analysis. Hoboken, NJ: Wiley.
Gómez, V.E., Schaalje, G.B., Fellingham, G.W. (2005). Performance of the Kenward-Roger method when the covariance structure is selected using AIC and BIC. Communications in Statistics: Simulation and Computation, 34, 377–392.
Hannan, E.J., Quinn, B.G. (1979). The determination of the order of an autoregression. Journal of the Royal Statistical Society, Series B, 41, 190–195.
Henderson, C.R. (1975). The best linear unbiased estimation and prediction under a selection model. Biometrics, 31, 423–447.
Hurvich, C.M., Tsai, C.L. (1989). Regression and time series model selection in small samples. Biometrika, 76, 297–307.
Jennrich, R.I., Schluchter, M.D. (1986). Unbalanced repeated-measures models with structured covariance matrices. Biometrics, 42, 805–820.
Kenward, M.G., Roger, J.H. (1997). Small sample inference for fixed effects from restricted maximum likelihood. Biometrics, 53, 983–997.
Keselman, H.J., Algina, J., Kowalchuk, R.K., Wolfinger, R.D. (1998). A comparison of two approaches for selecting covariance structures in the analysis of repeated measurements. Communications in Statistics-Simulation and Computation, 27, 591–604.
Keselman, H.J., Algina, J., Kowalchuk, R.K., Wolfinger, R.D. (1999). The analysis of repeated measurements: A comparison of mixed-model Satterthwaite F tests and a nonpooled adjusted degrees of freedom multivariate test. Communications in Statistics-Theory and Methods, 28, 2967–2999.
Khuri, A.I., Mathew, T., Sinha, B.F. (1998). Statistical tests for mixed linear models. New York: Wiley.
Kowalchuk, R.K., Keselman, H.J., Algina, J., Wolfinger, R.D. (2004). The analysis of repeated measurements with mixed-model adjusted F tests. Educational and Psychological Measurement, 64, 224–242.
Laird, N.M., Ware, J.H. (1982). Random-effects models for longitudinal data. Biometrics, 38, 963–974.
Littell, R.C., Pendergast, J., Natarajan, R. (2000). Modelling covariance structure in the analysis of repeated measures data. Statistics in Medicine, 19, 1793–1819.
Littell, R.C., Milliken, G.A., Stroup, W.W., Wolfinger, R.D., Schabenberger, O. (2006). SAS system for mixed models (2nd ed.). Cary, NC: SAS Institute Inc.
Micceri, T. (1989). The unicorn, the normal curve, and other improbable creatures. Psychological Bulletin, 92, 778–785.
Núñez-Antón, V., Zimmerman, D.L. (2001). Modelización de datos longitudinales con estructuras de covarianza no estacionarias: Modelos de coeficientes aleatorios frente a modelos alternativos [Modeling longitudinal data with nonstationary covariance structures: Random coefficients models versus alternative models]. Qüestiió, 25, 225–262.
Raudenbush, S.W., Bryk, A.S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). Thousand Oaks, CA: Sage.
SAS Institute (2004). SAS/STAT software: Version 9.1.2. Cary, NC: Author.
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461–464.
Singer, D.J. (2002). Fitting individual growth models using SAS PROC MIXED. In D.S. Moskowitz & S.L. Hershberger (Eds.), Modeling intraindividual variability with repeated measures data: Methods and applications (pp. 135–170). Mahwah, NJ: Erlbaum.
Vallejo, G., Ato, M. (2006). Modified Brown-Forsythe procedure for testing interaction effects in split-plot designs. Multivariate Behavioral Research, 41, 549–578.
Vallejo, G., Livacic-Rojas, P. (2005). A comparison of two procedures for analyzing small sets of repeated measures data. Multivariate Behavioral Research, 40, 179–205.
West, B.T., Welch, K.B., Galecki, A.T. (2007). Linear mixed models. A practical guide using statistical software. Boca Raton, FL: Chapman & Hall.
Wolfinger, R.D. (1996). Heterogeneous variance-covariance structures for repeated measures. Journal of Agricultural, Biological, and Environmental Statistics, 1, 205–230.
Zimmerman, D.L., Núñez-Antón, V. (2001). Parametric modeling of growth curve data: An overview (with comments). Test, 10, 1–73.