The likelihood ratio test of nested models for family data plays an important role in the assessment of genetic and environmental influences on the variation in traits. The test is routinely based on the assumption that the test statistic follows a chi-square distribution under the null, with the number of restricted parameters as degrees of freedom. However, tests of variance components constrained to be non-negative correspond to tests of parameters on the boundary of the parameter space. In this situation the standard test procedure yields p-values that are too large, and the use of the Akaike Information Criterion (AIC) or the Bayesian Information Criterion (BIC) for model selection is problematic. Focusing on the classical ACE twin model for univariate traits, we adapt existing theory to show that the asymptotic distribution of the likelihood ratio statistic is a mixture of chi-square distributions, and we derive the mixing probabilities. We conclude that when testing the AE or the CE model against the ACE model, the p-values obtained using χ²(1 df) as the reference distribution should be halved. When the E model is tested against the ACE model, a mixture of χ²(0 df), χ²(1 df) and χ²(2 df) should be used as the reference distribution, and we provide a simple formula to compute the mixing probabilities. Similar results for tests of the AE, DE and E models against the ADE model are also derived. Failing to use the appropriate reference distribution can lead to invalid conclusions.
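As a minimal sketch of how these reference distributions translate into p-values, the Python functions below use only the standard library. The function names and the argument `p` are illustrative assumptions, not part of the original analysis. The weights (1/2 − p, 1/2, p) are those stated in Appendix A, and `p` has to be supplied from the mixing-probability formula derived there.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) for a chi-square variable with 0, 1 or 2 df."""
    if x <= 0:
        return 1.0
    if df == 0:
        return 0.0  # chi-square with 0 df is a point mass at zero
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only 0, 1 or 2 degrees of freedom are needed here")


def pvalue_one_boundary(lrt: float) -> float:
    """AE or CE against ACE: equal mixture of chi2(0 df) and chi2(1 df).

    For lrt > 0 this equals half of the usual chi2(1 df) p-value.
    """
    return 0.5 * chi2_sf(lrt, 0) + 0.5 * chi2_sf(lrt, 1)


def pvalue_two_boundary(lrt: float, p: float) -> float:
    """E against ACE: weights (1/2 - p, 1/2, p) on chi2 with 0, 1 and 2 df."""
    if not 0.0 <= p <= 0.5:
        raise ValueError("p must lie between 0 and 1/2")
    return (
        (0.5 - p) * chi2_sf(lrt, 0)
        + 0.5 * chi2_sf(lrt, 1)
        + p * chi2_sf(lrt, 2)
    )
```

For a positive test statistic the mixture p-value is always smaller than the one obtained from the conventional χ²(2 df) reference distribution, which is why the conventional procedure is conservative.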
REFERENCES
Akaike H. (1987). Factor analysis and AIC. Psychometrika 52:317–332
Amos C. I., de Andrade M., Zhu D. K. (2001). Comparison of multivariate tests for genetic linkage. Hum. Hered. 51:133–144
Crainiceanu C. M., Ruppert D. (2004). Likelihood ratio tests in linear mixed models with one variance component. J. Roy. Stat. Soc. B 66:165–185
Efron B., Tibshirani R. J. (1993). An Introduction to the Bootstrap. New York, Chapman & Hall
Finkel D., Pedersen N. L. (2004). Processing speed and longitudinal trajectories of change for cognitive abilities: the Swedish Adoption/Twin Study of Aging. Aging Neuropsychol. C. 11:325–345
Lehmann E. L. (1986). Testing Statistical Hypotheses. 2nd edn., New York, Wiley
Neale M. C., Boker S. M., Xie G., Maes H. H. (2003). Mx: Statistical Modeling. 6th edn., Department of Psychiatry, VCU Box 900126, Richmond, VA 23298, URL: http://www.vcu.edu/mx/
Neale M. C., Cardon L. R. (1992). Methodology for Genetic Studies of Twins and Families. Dordrecht, Kluwer Academic Publishers
Pedersen N. L., Plomin R., Nesselroade J. R., McClearn G. E. (1992). Quantitative genetic analysis of cognitive abilities during the second half of the lifespan. Psychol. Sci. 3:346–353
R Development Core Team (2003). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria, URL: http://www.R-project.org
Reynolds C. A., Finkel D., McArdle J. J., Gatz M., Berg S., Pedersen N. L. (2005). Quantitative genetic analysis of latent growth curve models of cognitive abilities in adulthood. Dev. Psychol. 41:3–16
Rijsdijk F. V., Sham P. C. (2002). Analytic approaches to twin data using structural equation models. Briefings in Bioinformatics 3:119–133
Schwarz G. (1978). Estimating the dimension of a model. Ann. Stat. 6:461–464
Searle S. R., Casella G., McCulloch C. E. (1992). Variance Components. New York, Wiley
Self S. G., Liang K.-Y. (1987). Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. J. Am. Stat. Assoc. 82:605–610
Sham P. C. (1998). Statistics in Human Genetics. London, Arnold
Skrondal A., Rabe-Hesketh S. (2004). Generalized Latent Variable Modeling: Multilevel, Longitudinal and Structural Equation Models. Boca Raton, Chapman & Hall/CRC
Stram D. O., Lee J. W. (1994). Variance components testing in the longitudinal mixed effects model. Biometrics 50:1171–1177
Stram D. O., Lee J. W. (1995). Correction to “Variance components testing in the longitudinal mixed effects model” by D. O. Stram and J. W. Lee; 50, 1171–1177, 1994. Biometrics 51:1196
van den Oord E. J. (2001). Estimating effects of latent and measured genotypes in multilevel models. Stat. Methods Med. Res. 10:393–407
ACKNOWLEDGMENTS
The authors wish to thank Ola Hössjer for valuable discussions. This work was supported by a grant from the Swedish Foundation of Strategic Research and grants from the Department of Higher Education and the National Institutes of Health (AG 04563, AG 10175).
APPENDIX A
The test of the E model against the ACE model corresponds to the test of two parameters on the boundary of the parameter space. The asymptotic LRT distribution for such a test has been shown to be a mixture of χ²(0 df), χ²(1 df) and χ²(2 df), with mixing probabilities \(({1/2}-p)\), \({1/2}\), and \(p\) (Self and Liang, 1987). The mixing probability \(p\) for testing \(H_{\rm E}\) against \(H_{\rm ACE}\) is obtained from
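A reconstruction of this relation, assuming it takes the standard form of the Self and Liang (1987) result for two parameters of interest on the boundary and carries the label (A.1) used later in this appendix:

\[
p=\frac{1}{2\pi}\arccos\!\left(\frac{I_{\rm AC}}{\sqrt{I_{\rm AA}\,I_{\rm CC}}}\right)\qquad\text{(A.1)}
\]

As a check on this form, uncorrelated estimates (\(I_{\rm AC}=0\)) give \(p={1/4}\) and hence the familiar weights \({1/4}\), \({1/2}\), \({1/4}\).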
where \(I_{\rm AA}\), \(I_{\rm CC}\) and \(I_{\rm AC}\) are the components of the inverse of the asymptotic covariance matrix for the maximum likelihood estimates of \(\lambda_{\rm A}^{2}\) and \(\lambda_{\rm C}^{2}\), evaluated under the null model \(H_{\rm E}\). The asymptotic covariance matrix for a parameter vector \(\varvec{\theta}\) is obtained as the inverse of the expected Fisher information matrix, equal to minus one times the expectation of the Hessian (matrix of second derivatives) of the log-likelihood function. Assuming that the response vector \(\mathbf{y}_{i}=(y_{i1},y_{i2})\) for each twin pair \(i\) follows a multivariate normal distribution, the log-likelihood function based on \(n_{\rm MZ}\) MZ twin pairs and \(n_{\rm DZ}\) DZ twin pairs is
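Under the stated bivariate normal assumption the log-likelihood has the standard form sketched below. The symbol \(\ell\) and the way pairs are indexed within each zygosity group are assumptions of this reconstruction.

\[
\ell(\varvec{\mu},\varvec{\theta})=
-\frac{n_{\rm MZ}}{2}\log\left|2\pi\varvec{\Sigma}_{\rm MZ}\right|
-\frac{1}{2}\sum_{i\in{\rm MZ}}(\mathbf{y}_{i}-\varvec{\mu})'\varvec{\Sigma}_{\rm MZ}^{-1}(\mathbf{y}_{i}-\varvec{\mu})
-\frac{n_{\rm DZ}}{2}\log\left|2\pi\varvec{\Sigma}_{\rm DZ}\right|
-\frac{1}{2}\sum_{i\in{\rm DZ}}(\mathbf{y}_{i}-\varvec{\mu})'\varvec{\Sigma}_{\rm DZ}^{-1}(\mathbf{y}_{i}-\varvec{\mu})
\]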
where \(\varvec{\mu}\) is the mean vector and \(\varvec{\Sigma}_{\rm MZ}\) and \(\varvec{\Sigma}_{\rm DZ}\) are the covariance matrices for MZ and DZ twins given by
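For the classical ACE model these matrices take the following standard form, written here under the usual assumption that DZ twins share half of the additive genetic variance and all of the common environmental variance:

\[
\varvec{\Sigma}_{\rm MZ}=
\begin{pmatrix}
\lambda_{\rm A}^{2}+\lambda_{\rm C}^{2}+\lambda_{\rm E}^{2} & \lambda_{\rm A}^{2}+\lambda_{\rm C}^{2}\\
\lambda_{\rm A}^{2}+\lambda_{\rm C}^{2} & \lambda_{\rm A}^{2}+\lambda_{\rm C}^{2}+\lambda_{\rm E}^{2}
\end{pmatrix},
\qquad
\varvec{\Sigma}_{\rm DZ}=
\begin{pmatrix}
\lambda_{\rm A}^{2}+\lambda_{\rm C}^{2}+\lambda_{\rm E}^{2} & \tfrac{1}{2}\lambda_{\rm A}^{2}+\lambda_{\rm C}^{2}\\
\tfrac{1}{2}\lambda_{\rm A}^{2}+\lambda_{\rm C}^{2} & \lambda_{\rm A}^{2}+\lambda_{\rm C}^{2}+\lambda_{\rm E}^{2}
\end{pmatrix}
\]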
The components of the expected Fisher information matrix for the variance components \({\varvec{\theta}}=(\lambda_{\rm A}^{2},\lambda_{\rm C}^{2},\lambda_{\rm E}^{2})\) are given by
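A reconstruction of this expression, assuming the general result for the information of covariance parameters in a multivariate normal model, with \(j\) and \(k\) indexing the elements of \(\varvec{\theta}\) and \(\ell\) denoting the log-likelihood (both notational assumptions):

\[
I_{jk}(\varvec{\theta})
=-E\!\left[\frac{\partial^{2}\ell}{\partial\theta_{j}\,\partial\theta_{k}}\right]
=\frac{1}{2}\sum_{g\in\{{\rm MZ},{\rm DZ}\}}n_{g}\,
{\rm tr}\!\left(\varvec{\Sigma}_{g}^{-1}\frac{\partial\varvec{\Sigma}_{g}}{\partial\theta_{j}}
\varvec{\Sigma}_{g}^{-1}\frac{\partial\varvec{\Sigma}_{g}}{\partial\theta_{k}}\right)
\qquad\text{(A.2)}
\]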
where the second equality is obtained from general results for differentiation of matrices (Searle et al., 1992). Using \(\varvec{\Sigma}_{\rm MZ}\) and \(\varvec{\Sigma}_{\rm DZ}\) given above, the expected Fisher information matrix obtained from expression (A.2) is evaluated under the E model, which corresponds to the parameter vector \({\varvec{\theta}_{0}}=(0,0,\lambda_{\rm E}^{2})\). The inverse of this matrix, expressed in terms of the ratio \(r={n_{\rm MZ}\over n_{\rm DZ}}\), is
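The display below is a derivation from the stated model rather than the authors' typeset result. It assumes an MZ cross-twin covariance of \(\lambda_{\rm A}^{2}+\lambda_{\rm C}^{2}\), a DZ cross-twin covariance of \(\tfrac{1}{2}\lambda_{\rm A}^{2}+\lambda_{\rm C}^{2}\), and parameters ordered as (A, C, E). Under the E model both covariance matrices reduce to \(\lambda_{\rm E}^{2}\) times the identity, so the trace formula for the information gives

\[
I(\varvec{\theta}_{0})=\frac{n_{\rm DZ}}{2\lambda_{\rm E}^{4}}
\begin{pmatrix}
4r+\tfrac{5}{2} & 4r+3 & 2r+2\\
4r+3 & 4r+4 & 2r+2\\
2r+2 & 2r+2 & 2r+2
\end{pmatrix},
\qquad
I^{-1}(\varvec{\theta}_{0})=\frac{2\lambda_{\rm E}^{4}}{r\,n_{\rm DZ}}
\begin{pmatrix}
2r+2 & -(2r+1) & -1\\
-(2r+1) & 2r+\tfrac{1}{2} & \tfrac{1}{2}\\
-1 & \tfrac{1}{2} & \dfrac{2r+1}{2(r+1)}
\end{pmatrix}
\]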
The components \(I_{\rm AA}\), \(I_{\rm CC}\) and \(I_{\rm AC}\) are obtained as the elements of the inverse of the 2×2 submatrix of \(I^{-1}(\varvec{\theta}_{0})\) corresponding to the asymptotic covariance matrix of the maximum likelihood estimates of \(\lambda_{\rm A}^{2}\) and \(\lambda_{\rm C}^{2}\),
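A reconstruction of these components, obtained by evaluating the information for the ACE covariance structure under the E model and eliminating the nuisance parameter \(\lambda_{\rm E}^{2}\) (a derivation under the stated model, with \(r=n_{\rm MZ}/n_{\rm DZ}\)):

\[
I_{\rm AA}=\frac{n_{\rm DZ}}{2\lambda_{\rm E}^{4}}\left(2r+\tfrac{1}{2}\right),\qquad
I_{\rm CC}=\frac{n_{\rm DZ}}{2\lambda_{\rm E}^{4}}\left(2r+2\right),\qquad
I_{\rm AC}=\frac{n_{\rm DZ}}{2\lambda_{\rm E}^{4}}\left(2r+1\right)
\]

The common factor \(n_{\rm DZ}/(2\lambda_{\rm E}^{4})\) cancels in the ratio \(I_{\rm AC}/\sqrt{I_{\rm AA}I_{\rm CC}}\), so the mixing probability depends on the data only through \(r\).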
Inserting these components into equation (A.1) finally gives the mixing probability
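Carried through for the ACE covariance structure, with \(r=n_{\rm MZ}/n_{\rm DZ}\), the calculation leads to the following expression. It is offered as a reconstruction consistent with the definitions in this appendix, not as the authors' typeset formula.

\[
p=\frac{1}{2\pi}\arccos\!\left(\frac{2r+1}{\sqrt{(4r+1)(r+1)}}\right)
\]

For equal numbers of MZ and DZ pairs (\(r=1\)) this gives \(p=\arccos(3/\sqrt{10})/(2\pi)\approx 0.051\), that is, weights of roughly 0.449, 0.5 and 0.051 on χ²(0 df), χ²(1 df) and χ²(2 df).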
The mixing probabilities for the likelihood ratio test of the E model against the ADE model can be derived similarly. The only difference is that the covariance matrices for MZ and DZ twins are
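In the standard ADE parameterization, assuming that DZ twins share half of the additive and one quarter of the dominance genetic variance, these matrices can be written as:

\[
\varvec{\Sigma}_{\rm MZ}=
\begin{pmatrix}
\lambda_{\rm A}^{2}+\lambda_{\rm D}^{2}+\lambda_{\rm E}^{2} & \lambda_{\rm A}^{2}+\lambda_{\rm D}^{2}\\
\lambda_{\rm A}^{2}+\lambda_{\rm D}^{2} & \lambda_{\rm A}^{2}+\lambda_{\rm D}^{2}+\lambda_{\rm E}^{2}
\end{pmatrix},
\qquad
\varvec{\Sigma}_{\rm DZ}=
\begin{pmatrix}
\lambda_{\rm A}^{2}+\lambda_{\rm D}^{2}+\lambda_{\rm E}^{2} & \tfrac{1}{2}\lambda_{\rm A}^{2}+\tfrac{1}{4}\lambda_{\rm D}^{2}\\
\tfrac{1}{2}\lambda_{\rm A}^{2}+\tfrac{1}{4}\lambda_{\rm D}^{2} & \lambda_{\rm A}^{2}+\lambda_{\rm D}^{2}+\lambda_{\rm E}^{2}
\end{pmatrix}
\]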
The components of the expected Fisher information matrix for the variance components \({\varvec{\theta}}=(\lambda_{\rm A}^{2},\lambda_{\rm D}^{2},\lambda_{\rm E}^{2})\) are given by formula (A.2), and the inverse of this matrix evaluated at \({\varvec{\theta}}_{0}=(0,0,\lambda_{\rm E}^{2})\) is
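The following is a derivation rather than the authors' typeset display. It assumes DZ cross-twin covariance \(\tfrac{1}{2}\lambda_{\rm A}^{2}+\tfrac{1}{4}\lambda_{\rm D}^{2}\), MZ cross-twin covariance \(\lambda_{\rm A}^{2}+\lambda_{\rm D}^{2}\), parameters ordered as (A, D, E), and \(r=n_{\rm MZ}/n_{\rm DZ}\).

\[
I(\varvec{\theta}_{0})=\frac{n_{\rm DZ}}{2\lambda_{\rm E}^{4}}
\begin{pmatrix}
4r+\tfrac{5}{2} & 4r+\tfrac{9}{4} & 2r+2\\
4r+\tfrac{9}{4} & 4r+\tfrac{17}{8} & 2r+2\\
2r+2 & 2r+2 & 2r+2
\end{pmatrix},
\qquad
I^{-1}(\varvec{\theta}_{0})=\frac{2\lambda_{\rm E}^{4}}{r\,n_{\rm DZ}}
\begin{pmatrix}
8r+\tfrac{1}{2} & -(8r+1) & \tfrac{1}{2}\\
-(8r+1) & 8r+2 & -1\\
\tfrac{1}{2} & -1 & \dfrac{2r+1}{2(r+1)}
\end{pmatrix}
\]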
This gives
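Under the ADE covariance structure, and with the subscript D replacing C in the notation for the information components (an assumption about the original labelling), the elements work out to:

\[
I_{\rm AA}=\frac{n_{\rm DZ}}{2\lambda_{\rm E}^{4}}\left(2r+\tfrac{1}{2}\right),\qquad
I_{\rm DD}=\frac{n_{\rm DZ}}{2\lambda_{\rm E}^{4}}\left(2r+\tfrac{1}{8}\right),\qquad
I_{\rm AD}=\frac{n_{\rm DZ}}{2\lambda_{\rm E}^{4}}\left(2r+\tfrac{1}{4}\right)
\]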
and the mixing probability for the test of the E model against the ADE model becomes
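A reconstruction of the resulting expression, derived from the ADE covariance structure with \(r=n_{\rm MZ}/n_{\rm DZ}\) and the arccosine form of the Self and Liang (1987) mixing probability:

\[
p=\frac{1}{2\pi}\arccos\!\left(\frac{8r+1}{\sqrt{(4r+1)(16r+1)}}\right)
\]

With \(r=1\) this evaluates to \(p=\arccos(9/\sqrt{85})/(2\pi)\approx 0.035\), somewhat smaller than the corresponding value for the ACE model because the additive and dominance components are more strongly confounded.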
Dominicus, A., Skrondal, A., Gjessing, H.K. et al. Likelihood Ratio Tests in Behavioral Genetics: Problems and Solutions. Behav Genet 36, 331–340 (2006). https://doi.org/10.1007/s10519-005-9034-7
DOI: https://doi.org/10.1007/s10519-005-9034-7