Likelihood Ratio Tests in Behavioral Genetics: Problems and Solutions

Dominicus, Annica; Skrondal, Anders; Gjessing, Håkon K.; Pedersen, Nancy L.; Palmgren, Juni

doi:10.1007/s10519-005-9034-7


Published: 11 February 2006

Behavior Genetics, Volume 36, pages 331–340 (2006)

Annica Dominicus^1,2,6,
Anders Skrondal^3,4,
Håkon K. Gjessing^4,
Nancy L. Pedersen^2,5 &
Juni Palmgren^1,2


The likelihood ratio test of nested models for family data plays an important role in the assessment of genetic and environmental influences on the variation in traits. The test is routinely based on the assumption that the test statistic follows a chi-square distribution under the null, with the number of restricted parameters as degrees of freedom. However, tests of variance components constrained to be non-negative correspond to tests of parameters on the boundary of the parameter space. In this situation the standard test procedure provides too large p-values and the use of the Akaike Information Criterion (AIC) or the Bayesian Information Criterion (BIC) for model selection is problematic. Focusing on the classical ACE twin model for univariate traits, we adapt existing theory to show that the asymptotic distribution for the likelihood ratio statistic is a mixture of chi-square distributions, and we derive the mixing probabilities. We conclude that when testing the AE or the CE model against the ACE model, the p-values obtained from using the χ²(1 df) as the reference distribution should be halved. When the E model is tested against the ACE model, a mixture of χ²(0 df), χ²(1 df) and χ²(2 df) should be used as the reference distribution, and we provide a simple formula to compute the mixing probabilities. Similar results for tests of the AE, DE and E models against the ADE model are also derived. Failing to use the appropriate reference distribution can lead to invalid conclusions.
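As a minimal sketch of the halving rule described above for tests of the AE or CE model against the ACE model, the following Python snippet compares the naive χ²(1 df) p-value with the p-value from the 50:50 mixture of χ²(0 df) and χ²(1 df). The function names and the example statistic are illustrative assumptions and do not come from the article.

```python
import math


def chi2_sf_1df(x: float) -> float:
    """Survival function P(X > x) of a chi-square variable with 1 df.

    For 1 df, X = Z**2 with Z standard normal, so P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


def boundary_p_value_1df(lrt: float) -> float:
    """p-value under the 50:50 mixture of chi2(0 df) and chi2(1 df)."""
    if lrt <= 0.0:
        # chi2(0 df) is a point mass at zero, so a zero statistic
        # carries no evidence against the null model.
        return 1.0
    return 0.5 * chi2_sf_1df(lrt)


# Hypothetical likelihood ratio statistic, chosen only for illustration.
lrt = 3.0
naive = chi2_sf_1df(lrt)
corrected = boundary_p_value_1df(lrt)
print(f"naive chi2(1 df) p-value: {naive:.4f}")
print(f"mixture p-value:          {corrected:.4f}")
```

With this hypothetical statistic the naive p-value lies above 0.05 while the halved p-value lies below it, which is the kind of discrepancy the abstract warns about.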

REFERENCES

Akaike H. (1987). Factor analysis and AIC. Psychometrika 52:317–332
Amos C. I., de Andrade M., Zhu D. K. (2001). Comparison of multivariate tests for genetic linkage. Hum. Hered. 51:133–144
Crainiceanu C. M., Ruppert D. (2004). Likelihood ratio tests in linear mixed models with one variance component. J. Roy. Stat. Soc. B 66:165–185
Efron B., Tibshirani R. J. (1993). An Introduction to the Bootstrap. New York, Chapman & Hall
Finkel D., Pedersen N. L. (2004). Processing speed and longitudinal trajectories of change for cognitive abilities: the Swedish Adoption/Twin Study of Aging. Aging Neuropsychol. C. 11:325–345
Lehmann E. L. (1986). Testing Statistical Hypotheses. 2nd edn., New York, Wiley
Neale M. C., Boker S. M., Xie G., Maes H. H. (2003). Mx: Statistical Modeling. 6th edn., Department of Psychiatry, VCU Box 900126, Richmond, VA 23298. URL: http://www.vcu.edu/mx/
Neale M. C., Cardon L. R. (1992). Methodology for Genetic Studies of Twins and Families. Dordrecht, Kluwer Academic Publishers
Pedersen N. L., Plomin R., Nesselroade J. R., McClearn G. E. (1992). Quantitative genetic analysis of cognitive abilities during the second half of the lifespan. Psychol. Sci. 3:346–353
R Development Core Team (2003). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL: http://www.R-project.org
Reynolds C. A., Finkel D., McArdle J. J., Gatz M., Berg S., Pedersen N. L. (2005). Quantitative genetic analysis of latent growth curve models of cognitive abilities in adulthood. Dev. Psychol. 41:3–16
Rijsdijk F. V., Sham P. C. (2002). Analytic approaches to twin data using structural equation models. Briefings in Bioinformatics 3:119–133
Schwarz G. (1978). Estimating the dimension of a model. Ann. Stat. 6:461–464
Searle S. R., Casella G., McCulloch C. E. (1992). Variance Components. New York, Wiley
Self S. G., Liang K.-Y. (1987). Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. J. Am. Stat. Assoc. 82:605–610
Sham P. C. (1998). Statistics in Human Genetics. London, Arnold
Skrondal A., Rabe-Hesketh S. (2004). Generalized Latent Variable Modeling: Multilevel, Longitudinal and Structural Equation Models. Boca Raton, Chapman & Hall/CRC
Stram D. O., Lee J. W. (1994). Variance components testing in the longitudinal mixed effects model. Biometrics 50:1171–1177
Stram D. O., Lee J. W. (1995). Correction to “Variance components testing in the longitudinal mixed effects model” by D. O. Stram and J. W. Lee; 50, 1171–1177, 1994. Biometrics 51:1196
van den Oord E. J. (2001). Estimating effects of latent and measured genotypes in multilevel models. Stat. Methods Med. Res. 10:393–407

ACKNOWLEDGMENTS

The authors wish to thank Ola Hössjer for valuable discussions. This work was supported by a grant from the Swedish Foundation of Strategic Research and grants from the Department of Higher Education and the National Institutes of Health (AG 04563, AG 10175).

Author information

Authors and Affiliations

Department of Mathematics, Stockholm University, Stockholm, Sweden
Annica Dominicus & Juni Palmgren
Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
Annica Dominicus, Nancy L. Pedersen & Juni Palmgren
Department of Statistics, London School of Economics, London, UK
Anders Skrondal
Division of Epidemiology, Norwegian Institute of Public Health, Oslo, Norway
Anders Skrondal & Håkon K. Gjessing
Department of Psychology, University of Southern California, Los Angeles, CA, USA
Nancy L. Pedersen
Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, 281, Stockholm, SE-171 77, Sweden
Annica Dominicus

Corresponding author

Correspondence to Annica Dominicus.

APPENDIX A

The test of the E model against the ACE model corresponds to the test of two parameters on the boundary of the parameter space. The asymptotic LRT distribution for such a test has been shown to be a mixture of χ²(0 df), χ²(1 df) and χ²(2 df), with mixing probabilities $({1/2}-p)$, ${1/2}$, and $p$ (Self and Liang, 1987). The mixing probability $p$ for testing $H_{\rm E}$ against $H_{\rm ACE}$ is obtained from

$$ p = {1\over 2\pi}\hbox{arccos}\left({{I^{\rm AC}}\over{\sqrt{I^{\rm AA}I^{\rm CC}}}}\right), \tag{A.1} $$

where $I^{\rm AA}$, $I^{\rm CC}$ and $I^{\rm AC}$ are the components of the inverse of the asymptotic covariance matrix for the maximum likelihood estimates of $\lambda_{\rm A}^{2}$ and $\lambda_{\rm C}^{2}$, evaluated under the null model $H_{\rm E}$. The asymptotic covariance matrix for a parameter vector $\varvec{\theta}$ is obtained as the inverse of the expected Fisher information matrix, which equals minus the expectation of the Hessian (the matrix of second derivatives) of the log-likelihood function. Assuming that the response vector $\varvec{y}_{i}=(y_{i1},y_{i2})$ for each twin pair $i$ follows a multivariate normal distribution, the log-likelihood function based on $n_{\rm MZ}$ MZ twin pairs and $n_{\rm DZ}$ DZ twin pairs is

$$\eqalign{ \ell({\varvec{\theta}}) &= -(n_{\rm MZ} + n_{\rm DZ}) \log(2\pi) -{{n_{\rm MZ}}\over{2}}\log \vert {\varvec{\Sigma}_{\rm MZ}} \vert - \sum_{\hbox{MZ pairs}}\left({1\over2}({\varvec{y}}_{i} - \varvec{\mu})^{\prime}\varvec{\Sigma}_{\rm MZ}^{-1}({\varvec{y}}_{i} - \varvec{\mu})\right) \cr & -{n_{\rm DZ}\over2}\log \vert \varvec{\Sigma}_{\rm DZ} \vert - \sum_{\hbox{DZ pairs}}\left({1\over2}({\varvec{y}}_{i} - \varvec{\mu})^{\prime}\varvec{\Sigma}_{\rm DZ}^{-1}({\varvec{y}}_{i} - \varvec{\mu})\right),} $$

where $\varvec{\mu}$ is the mean vector and $\varvec{\Sigma}_{\rm MZ}$ and $\varvec{\Sigma}_{\rm DZ}$ are the covariance matrices for MZ and DZ twins given by

$$ {\varvec{\Sigma}_{\rm MZ}} = \left( \begin{array}{ll} \lambda_{\rm A}^{2} +\lambda_{\rm C}^{2} + \lambda_{\rm E}^{2} & \lambda_{\rm A}^{2} + \lambda_{\rm C}^{2} \\ \lambda_{\rm A}^{2} + \lambda_{\rm C}^{2} &\lambda_{\rm A}^{2} + \lambda_{\rm C}^{2} + \lambda_{\rm E}^{2} \end{array} \right) \hbox{ and } \varvec{\Sigma}_{\rm DZ} = \left( \begin{array}{ll} \lambda_{\rm A}^{2} + \lambda_{\rm C}^{2} + \lambda_{\rm E}^{2} & {1\over2}\lambda_{\rm A}^{2} + \lambda_{\rm C}^{2} \\ {1\over2}\lambda_{\rm A}^{2} + \lambda_{\rm C}^{2} &\lambda_{\rm A}^{2} + \lambda_{\rm C}^{2} + \lambda_{\rm E}^{2} \end{array} \right). $$
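The log-likelihood and the two covariance matrices above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the mean vector is passed as a pair `mu`, and the closed-form 2×2 determinant and inverse are used in place of a linear algebra library.

```python
import math


def ace_covariances(var_a, var_c, var_e):
    """Return (Sigma_MZ, Sigma_DZ) as nested tuples for the ACE model."""
    total = var_a + var_c + var_e
    cov_mz = var_a + var_c          # MZ twins share all of A and C
    cov_dz = 0.5 * var_a + var_c    # DZ twins share half of A and all of C
    sigma_mz = ((total, cov_mz), (cov_mz, total))
    sigma_dz = ((total, cov_dz), (cov_dz, total))
    return sigma_mz, sigma_dz


def pair_loglik(y, mu, sigma):
    """Bivariate normal log-density of one twin pair y = (y1, y2)."""
    (a, b), (_, d) = sigma
    det = a * d - b * b
    r1, r2 = y[0] - mu[0], y[1] - mu[1]
    # Quadratic form r' Sigma^{-1} r using the closed-form 2x2 inverse.
    quad = (d * r1 * r1 - 2.0 * b * r1 * r2 + a * r2 * r2) / det
    return -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad


def ace_loglik(mz_pairs, dz_pairs, mu, var_a, var_c, var_e):
    """Sum of the MZ and DZ contributions to the log-likelihood."""
    sigma_mz, sigma_dz = ace_covariances(var_a, var_c, var_e)
    ll_mz = sum(pair_loglik(y, mu, sigma_mz) for y in mz_pairs)
    ll_dz = sum(pair_loglik(y, mu, sigma_dz) for y in dz_pairs)
    return ll_mz + ll_dz


# Hypothetical toy data: two MZ pairs and two DZ pairs.
mz = [(0.3, 0.5), (-1.1, -0.7)]
dz = [(0.2, -0.4), (1.0, 0.1)]
print(ace_loglik(mz, dz, (0.0, 0.0), 0.5, 0.25, 0.25))
```

Summing the per-pair constant $-\log(2\pi)$ and the per-pair $-{1\over2}\log\vert\varvec{\Sigma}\vert$ terms reproduces the grouped form of the log-likelihood given above.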

The components of the expected Fisher information matrix for the variance components ${\varvec{\theta}}=(\lambda_{\rm A}^{2},\lambda_{\rm C}^{2},\lambda_{\rm E}^{2})$ are given by

$$ \eqalign{ I_{hk}(\varvec{\theta}) &= - \hbox{E}\left({{\partial^2 \ell(\varvec{\theta})}\over{\partial \theta_{h} \partial \theta_{k}}}\right) \cr &= {n_{\rm MZ}\over2}\hbox{tr}\left(\varvec{\Sigma}_{\rm MZ}^{-1}{{\partial \varvec{\Sigma}_{\rm MZ}}\over{\partial \theta_{h}}}\varvec{\Sigma}_{\rm MZ}^{-1}{{\partial \varvec{\Sigma}_{\rm MZ}}\over{\partial \theta_{k}}}\right)+ {n_{\rm DZ}\over2}\hbox{tr}\left(\varvec{\Sigma}_{\rm DZ}^{-1}{{\partial \varvec{\Sigma}_{\rm DZ}}\over{\partial \theta_{h}}}\varvec{\Sigma}_{\rm DZ}^{-1} {{\partial \varvec{\Sigma}_{\rm DZ}}\over{\partial \theta_{k}}}\right),} \tag{A.2} $$

where the second equality is obtained from general results for differentiation of matrices (Searle et al., 1992). Using $\varvec{\Sigma}_{\rm MZ}$ and $\varvec{\Sigma}_{\rm DZ}$ given above, the expected Fisher information matrix obtained from expression (A.2) is evaluated under the E model, which corresponds to the parameter vector ${\varvec{\theta}_{0}}=(0,0,\lambda_{\rm E}^{2})$. The inverse of this matrix, expressed in terms of the ratio $r={n_{\rm MZ}\over n_{\rm DZ}}$, is

$$ I^{-1}({\varvec{\theta}_{0}}) = {4\lambda_{\rm E}^{4} \over n_{\rm MZ}} \left( \begin{array}{lll} r+1 & -r - 1/2 & -1/2 \\ -r - 1/2 & r + 1/4 & 1/4\\ -1/2 & 1/4 & {2r + 1 \over 4r + 4} \end{array} \right). $$

The components $I^{\rm AA}$, $I^{\rm CC}$ and $I^{\rm AC}$ are obtained as the elements of the inverse of the 2×2 submatrix of $I^{-1}(\varvec{\theta}_{0})$ corresponding to the asymptotic covariance matrix of the maximum likelihood estimates of $\lambda_{\rm A}^{2}$ and $\lambda_{\rm C}^{2}$,

$$ \eqalign{ I^{\rm AA} &= {{n_{\rm DZ}} \over {\lambda_{\rm E}^{4}}}\left(r + {1\over 4}\right) \cr I^{\rm CC} &= {{n_{\rm DZ}} \over{\lambda_{\rm E}^{4}}}(r + 1) \cr I^{\rm AC} &= {{n_{\rm DZ}} \over{\lambda_{\rm E}^{4}}}\left(r + {1 \over 2}\right).} $$
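The three components above can be checked with exact rational arithmetic. The sketch below evaluates formula (A.2) at $\varvec{\theta}_{0}$, where both covariance matrices reduce to $\lambda_{\rm E}^{2}$ times the identity, and then eliminates $\lambda_{\rm E}^{2}$ with a Schur complement, which by the standard block-inverse identity equals the inverse of the 2×2 submatrix of $I^{-1}(\varvec{\theta}_{0})$. The sample sizes, the function names and the choice $\lambda_{\rm E}^{2}=1$ are assumptions made only for this illustration.

```python
from fractions import Fraction as F

# Derivatives of the covariance matrices with respect to each variance component.
ONES = ((F(1), F(1)), (F(1), F(1)))                 # A and C for MZ pairs, C for DZ pairs
IDENTITY = ((F(1), F(0)), (F(0), F(1)))             # E for both zygosities
DZ_ADDITIVE = ((F(1), F(1, 2)), (F(1, 2), F(1)))    # A for DZ pairs


def trace_of_product(m1, m2):
    """tr(m1 m2) for 2x2 matrices given as nested tuples."""
    return sum(m1[i][j] * m2[j][i] for i in range(2) for j in range(2))


def information_under_e(n_mz, n_dz, var_e):
    """Expected Fisher information (A.2) for (A, C, E) at theta_0 = (0, 0, var_e).

    At theta_0 each Sigma^{-1} is the identity divided by var_e, so each trace
    in (A.2) reduces to tr(D_h D_k) / var_e**2.
    """
    d_mz = (ONES, ONES, IDENTITY)
    d_dz = (DZ_ADDITIVE, ONES, IDENTITY)
    return [[(F(n_mz, 2) * trace_of_product(d_mz[h], d_mz[k])
              + F(n_dz, 2) * trace_of_product(d_dz[h], d_dz[k])) / var_e ** 2
             for k in range(3)] for h in range(3)]


def eliminate_e(info):
    """Schur complement of the E entry: the information for (A, C) alone."""
    return [[info[h][k] - info[h][2] * info[2][k] / info[2][2]
             for k in range(2)] for h in range(2)]


# Hypothetical design: 200 MZ pairs, 100 DZ pairs, unit E variance.
n_mz, n_dz, var_e = 200, 100, F(1)
reduced = eliminate_e(information_under_e(n_mz, n_dz, var_e))
r = F(n_mz, n_dz)
scale = n_dz / var_e ** 2
print(reduced[0][0] == scale * (r + F(1, 4)))   # I^AA
print(reduced[1][1] == scale * (r + 1))         # I^CC
print(reduced[0][1] == scale * (r + F(1, 2)))   # I^AC
```

Because `Fraction` arithmetic is exact, the three comparisons test the closed-form expressions without any rounding tolerance.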

Inserting these components into equation (A.1) finally gives the mixing probability

$$ p = {1 \over 2\pi}\hbox{arccos}\left({{r + {1\over2}}\over{\sqrt{(r + {1 \over 4})(r + 1)}}}\right). $$
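To make the use of this mixing probability concrete, the following sketch computes $p$ from the two sample sizes and then evaluates the p-value of an observed statistic under the mixture of χ²(0 df), χ²(1 df) and χ²(2 df) with weights $({1/2}-p)$, ${1/2}$ and $p$. The function names and the example inputs are hypothetical; the survival functions for 1 and 2 df are written in closed form so that only the standard library is needed.

```python
import math


def mixing_probability_ace(n_mz: int, n_dz: int) -> float:
    """Mixing probability p for the test of E against ACE, with r = n_MZ / n_DZ."""
    r = n_mz / n_dz
    cosine = (r + 0.5) / math.sqrt((r + 0.25) * (r + 1.0))
    # Guard against rounding slightly above 1 for extreme ratios.
    return math.acos(min(1.0, cosine)) / (2.0 * math.pi)


def mixture_p_value(lrt: float, p: float) -> float:
    """P(T > lrt) for the (1/2 - p, 1/2, p) mixture of chi2 with 0, 1 and 2 df."""
    if lrt <= 0.0:
        return 1.0
    sf_1df = math.erfc(math.sqrt(lrt / 2.0))   # chi2(1 df) survival function
    sf_2df = math.exp(-lrt / 2.0)              # chi2(2 df) survival function
    # The chi2(0 df) component is a point mass at zero and contributes nothing.
    return 0.5 * sf_1df + p * sf_2df


# Hypothetical inputs: equal numbers of MZ and DZ pairs and an observed statistic of 5.
p = mixing_probability_ace(300, 300)
lrt = 5.0
print(f"p = {p:.4f}")
print(f"mixture p-value:  {mixture_p_value(lrt, p):.4f}")
print(f"chi2(2 df) value: {math.exp(-lrt / 2.0):.4f}")
```

For equal group sizes ($r=1$) the cosine equals $3/\sqrt{10}$, so $p=\arctan(1/3)/(2\pi)\approx 0.051$. Most of the weight then falls on the χ²(0 df) and χ²(1 df) components, and the mixture p-value is considerably smaller than the one taken from χ²(2 df).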

The mixture probabilities for the likelihood ratio test of the E model against the ADE model can be derived similarly. The only difference is that the covariance matrices for MZ and DZ twins are

$${\varvec{\Sigma}_{\rm MZ}} = \left( \begin{array}{ll} \lambda_{\rm A}^{2} + \lambda_{\rm D}^{2} + \lambda_{\rm E}^{2} & \lambda_{\rm A}^{2} + \lambda_{\rm D}^{2} \\ \lambda_{\rm A}^{2} + \lambda_{\rm D}^{2} &\lambda_{\rm A}^{2} + \lambda_{\rm D}^{2} + \lambda_{\rm E}^{2} \end{array} \right) \hbox{ and } {\varvec{\Sigma}_{\rm DZ}} = \left( \begin{array}{ll} \lambda_{\rm A}^{2} + \lambda_{\rm D}^{2} + \lambda_{\rm E}^{2} & {1\over2}\lambda_{\rm A}^{2} + {1\over4}\lambda_{\rm D}^{2} \\ {1\over2}\lambda_{\rm A}^{2} +{1\over4}\lambda_{\rm D}^{2} & \lambda_{\rm A}^{2} + \lambda_{\rm D}^{2} + \lambda_{\rm E}^{2} \end{array} \right). $$

The components of the expected Fisher information matrix for the variance components ${\varvec{\theta}}=(\lambda_{\rm A}^{2},\lambda_{\rm D}^{2},\lambda_{\rm E}^{2})$ are given by formula (A.2), and the inverse of this matrix evaluated at ${\varvec{\theta}}_{0}=(0,0,\lambda_{\rm E}^{2})$ is

$$ I^{-1}({\varvec{\theta}_{0}}) = {{16\lambda_{\rm E}^{4}} \over{n_{\rm MZ}}} \left( \begin{array}{lll} r+1/16 & -r - 1/8 & 1/16 \\ -r - 1/8 & r + 1/4 &-1/8 \\ 1/16 & -1/8 & {2r +1 \over 16r + 16} \end{array} \right). $$

This gives

$$\eqalign{ I^{\rm AA} &= {{n_{\rm DZ}}\over{\lambda_{\rm E}^{4}}}(r + 1/4) \cr I^{\rm DD} &= {{n_{\rm DZ}}\over{\lambda_{\rm E}^{4}}}(r + 1/16) \cr I^{\rm AD} &= {{n_{\rm DZ}}\over{\lambda_{\rm E}^{4}}}(r + 1/8),} $$

and the mixing probability for the test of the E model against the ADE model becomes

$$ p^{*} = {1\over 2\pi}\hbox{arccos}\left({{I^{\rm AD}}\over{\sqrt{I^{\rm AA}I^{\rm DD}}}}\right) = {1\over 2\pi}\hbox{arccos}\left({{r + {1\over8}}\over{\sqrt{(r + {1\over4})(r + {1\over16})}}}\right). $$
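As a small numerical illustration of the last formula, the sketch below evaluates (A.1) for generic information components and applies it to the ADE case. The common factor $n_{\rm DZ}/\lambda_{\rm E}^{4}$ cancels in the ratio, so only $r$ is needed. The function names and the chosen values of $r$ are assumptions made for the example.

```python
import math


def mixing_probability(i_11: float, i_22: float, i_12: float) -> float:
    """Formula (A.1) applied to generic information components."""
    cosine = i_12 / math.sqrt(i_11 * i_22)
    # Clamp to [-1, 1] to protect acos against rounding error.
    return math.acos(max(-1.0, min(1.0, cosine))) / (2.0 * math.pi)


def mixing_probability_ade(r: float) -> float:
    """Mixing probability p* for the test of E against ADE, with r = n_MZ / n_DZ."""
    return mixing_probability(r + 1.0 / 4.0, r + 1.0 / 16.0, r + 1.0 / 8.0)


# Hypothetical MZ:DZ ratios, chosen only to show how p* varies with the design.
for r in (0.5, 1.0, 2.0):
    print(f"r = {r:3.1f}  p* = {mixing_probability_ade(r):.4f}")
```

For $r=1$ the cosine equals $9/\sqrt{85}$, so $p^{*}=\arctan(2/9)/(2\pi)\approx 0.035$.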

About this article

Cite this article

Dominicus, A., Skrondal, A., Gjessing, H.K. et al. Likelihood Ratio Tests in Behavioral Genetics: Problems and Solutions. Behav Genet 36, 331–340 (2006). https://doi.org/10.1007/s10519-005-9034-7


Received: 25 January 2005
Accepted: 14 June 2005
Published: 11 February 2006
Issue Date: March 2006
DOI: https://doi.org/10.1007/s10519-005-9034-7
