How meaningful are heritability estimates of liability?

  • Original Investigation
  • Published in Human Genetics

Abstract

It is commonly acknowledged that estimates of heritability from classical twin studies have many potential shortcomings. Despite this, in the post-GWAS era, these heritability estimates have come to be a continual source of interest and controversy. While the heritability estimates of a quantitative trait are subject to a number of biases, in this article we will argue that the standard statistical approach to estimating the heritability of a binary trait relies on some additional untestable assumptions which, if violated, can lead to badly biased estimates. The ACE liability threshold model assumes at its heart that each individual has an underlying liability or propensity to acquire the binary trait (e.g., disease), and that this unobservable liability is multivariate normally distributed. We investigated a number of different scenarios violating this assumption such as the existence of a single causal diallelic gene and the existence of a dichotomous exposure. For each scenario, we found that substantial asymptotic biases can occur, which no increase in sample size can remove. Asymptotic biases as much as four times larger than the true value were observed, and numerous cases also showed large negative biases. Additionally, regions of low bias occurred for specific parameter combinations. Using simulations, we also investigated the situation where all of the assumptions of the ACE liability model are met. We found that commonly used sample sizes can lead to biased heritability estimates. Thus, even if we are willing to accept the meaningfulness of the liability construct, heritability estimates under the ACE liability threshold model may not accurately reflect the heritability of this construct. The points made in this paper should be kept in mind when considering the meaningfulness of a reported heritability estimate for any specific disease.

References

  • Baranzini SE, Mudge J, Van Velkinburgh JC, Khankhanian P, Khrebtukova I, Miller NA, Zhang L, Farmer AD, Bell CJ, Kim RW (2010) Genome, epigenome and RNA sequences of monozygotic twins discordant for multiple sclerosis. Nature 464:1351–1356

  • Benaglia T, Chauveau D, Hunter DR, Young DS (2009) mixtools: An R package for analyzing mixture models. J Stat Softw 32(6):1–29

  • Boardman JD, Blalock CL, Pampel FC (2010) Trends in the genetic influences on smoking. J Health Soc Behav 51:108–123

  • Boomsma D, Busjahn A, Peltonen L (2002) Classical twin studies and beyond. Nat Rev Genet 3:872–882

  • Bruder CEG, Piotrowski A, Gijsbers AACJ, Andersson R, Erickson S, Diaz de Ståhl T, Menzel U, Sandgren J, von Tell D, Poplawski A (2008) Phenotypically concordant and discordant monozygotic twins display different DNA copy-number-variation profiles. Am J Human Genet 82:763–771

  • Centers for Disease Control and Prevention (CDC) (2009–2010) National Center for Health Statistics (NCHS). National Health and Nutrition Examination Survey Data. Hyattsville, MD: U.S. Department of Health and Human Services

  • Cowley AW Jr, Nadeau JH, Baccarelli A, Berecek K, Fornage M, Gibbons GH, Harrison DG, Liang M, Nathanielsz PW, O’Connor DT (2012) Report of the National Heart, Lung, and Blood Institute Working Group on epigenetics and hypertension. Hypertension 59:899–905

  • Cui JS, Hopper JL, Harrap SB (2003) Antihypertensive treatments obscure familial contributions to blood pressure variation. Hypertension 41:207–210

  • Curnow R (1972) The multifactorial model for the inheritance of liability to disease and its implications for relatives at risk. Biometrics 28:931–946

  • Eichler EE, Flint J, Gibson G, Kong A, Leal SM, Moore JH, Nadeau JH (2010) Missing heritability and strategies for finding the underlying causes of complex disease. Nat Rev Genet 11:446–450

  • Elston RC (1977) Query—estimating heritability of a dichotomous trait. Biometrics 33:231–233

  • Falconer DS (1965) Inheritance of liability to certain diseases estimated from incidence among relatives. Ann Hum Genet 29:51–76

  • Ferguson TS (1996) A course in large sample theory. Chapman & Hall/CRC, London

  • Galton F (1869) Hereditary genius: an inquiry into its laws and consequences. Macmillan and co., London

  • Genz A, Bretz F (2002) Comparison of methods for the computation of multivariate t probabilities. J Comput Graph Stat 11:950–971

  • Goldstein DB (2009) Common genetic variation and human traits. N Engl J Med 360:1696–1698

  • Huang GH (2005) Model identifiability. Encycl Stat Behav Sci 3:1249–1251

  • Joseph J (2000) Not in their genes: a critical view of the genetics of attention-deficit hyperactivity disorder. Dev Rev 20:539–567

  • Kendler KS, Neale MC, Kessler RC, Heath AC, Eaves LJ (1993) A test of the equal-environment assumption in twin studies of psychiatric illness. Behav Genet 23:21–27

  • Kidd K, Cavalli-Sforza L (1973) An analysis of the genetics of schizophrenia. Biodemography Soc Biol 20:254–264

  • Kurtz TW, Griffin KA, Bidani AK, Davisson RL, Hall JE (2005) Recommendations for blood pressure measurement in humans and experimental animals. Part 2: blood pressure measurement in experimental animals: a statement for professionals from the Subcommittee of Professional and Public Education of the American Heart Association Council on High Blood Pressure Research. Hypertension 45:299–310

  • Lee SH, Wray NR, Goddard ME, Visscher PM (2011) Estimating missing heritability for disease from genome-wide association studies. Am J Human Genet 88:294–305

  • Lewontin RC (1974) Annotation: the analysis of variance and the analysis of causes. Am J Hum Genet 26:400

  • Maher B (2008) Personal genomes: the case of the missing heritability. Nature 456:18–21

  • Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, McCarthy MI, Ramos EM, Cardon LR, Chakravarti A et al (2009) Finding the missing heritability of complex diseases. Nature 461:747–753

  • Neale MC, Cardon LR (1992) Methodology for genetic studies of twins and families. Kluwer, Dordrecht

  • Pam A, Kemker SS, Ross CA, Golden R (1996) The “equal environments assumption” in MZ-DZ twin comparisons: an untenable premise of psychiatric genetics? Acta geneticae medicae et gemellologiae 45:349–360

  • Piessens R, Doncker-Kapenga D, Überhuber C, Kahaner D (1983) Quadpack: a subroutine package for automatic integration. Springer series in computational mathematics, vol 1. Springer-Verlag, Berlin, New York

  • Plomin R, Spinath FM (2004) Intelligence: genetics, genes, and genomics. J Pers Soc Psychol 86:112–129

  • Rijsdijk FV, Sham PC (2002) Analytic approaches to twin data using structural equation models. Brief Bioinforma 3:119–133

  • R Development Core Team (2010) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria

  • Visscher PM, Hill WG, Wray NR (2008) Heritability in the genomics era—concepts and misconceptions. Nat Rev Genet 9:255–266

  • White H (1982) Maximum-likelihood estimation of mis-specified models. Econometrica 50:1–25

  • Zuk O, Hechter E, Sunyaev SR, Lander ES (2012) The mystery of missing heritability: genetic interactions create phantom heritability. P Natl Acad Sci USA 109:1193–1198

Acknowledgments

We thank Dr. Robert Elston for his helpful comments and suggestions. This work was supported by National Cancer Institute (NCI) Grant R25 CA094186 and by National Heart Lung and Blood Institute (NHLBI) Grant T32 HL007567.

Author information

Corresponding author

Correspondence to Penny H. Benchek.

Appendix

Consider the function \( f({\hat{\mathbf{p}}}) = \arg \max_{\varvec{\uptheta} } \left\{ {\log L(\varvec{\uptheta} |{\hat{\mathbf{p}}},{\mathbf{n}})} \right\}, \) which maps the estimated proportions of twin pairs with 0, 1, or 2 affected members to the maximum likelihood parameter estimates (i.e., \( \hat {\varvec{\uptheta }}_{\text{ML}} = f({\hat{\mathbf{p}}}) \)). As indicated by the notation, \( f( \cdot ) \) is not a function of \( {\mathbf{n}} \) because the model is saturated. By the law of large numbers, the estimated proportions converge in probability to the true proportions: \( {\hat{\mathbf{p}}} \xrightarrow{P} {\mathbf{p}}^{\text{true}} \). Then, assuming \( f \) is continuous at \( {\mathbf{p}}^{\text{true}} \), by Slutsky’s Continuity Theorem (Ferguson 1996) we have \( \hat {\varvec{\uptheta }}_{\text{ML}} \xrightarrow{P} f({\mathbf{p}}^{\text{true}} ) \). Therefore, even when the model is misspecified, the maximum likelihood estimate converges in probability to a function of the true proportions: \( f({\mathbf{p}}^{\text{true}} ) \). If the model is correctly specified, then this function of the true proportions equals the true parameter values: \( f({\mathbf{p}}^{\text{true}} ) = \varvec{\uptheta}_{\text{o}} \). Otherwise it may differ from them, \( f({\mathbf{p}}^{\text{true}} ) \ne \varvec{\uptheta}_{\text{o}} \), in which case the estimator is asymptotically biased. In all of the scenarios discussed, it is easy to calculate the “true” proportions (\( {\mathbf{p}}^{\text{true}} \)) using numerical integration. It is also possible to calculate the function \( f({\mathbf{p}}^{\text{true}} ) \) using numerical optimization techniques combined with numerical integration. Thus, the large sample properties of the estimator under the ACE liability model may be investigated by statistical theory without the use of simulations.

We calculated the bivariate normal integral using the method of Genz and Bretz (2002) as implemented in the R package “mvtnorm,” and we performed numerical integration over the t distribution using the adaptive quadrature (Piessens et al. 1983) “integrate” function in R (R Team 2010). We performed numerical optimization using the “optim” function in R (R Team 2010). We also calculated the thresholds using the R function “uniroot” (R Team 2010).
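As a rough illustration of the map \( f \), the sketch below inverts a set of pair proportions into limiting ACE estimates. It is not the authors' code. Python's standard library stands in for the R functions named above: Simpson's rule replaces "mvtnorm", and bisection plays the role of "uniroot". It also assumes the usual ACE parameterization, in which the MZ liability correlation is \( \sigma_{\text{A}}^{2} + \sigma_{\text{C}}^{2} \) and the DZ correlation is \( \sigma_{\text{A}}^{2}/2 + \sigma_{\text{C}}^{2} \), and it ignores parameter-space constraints. The function names (`both_below`, `tetrachoric`, `limiting_ace`, `cell_probs`) are hypothetical.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal: pdf, cdf, inv_cdf


def both_below(tau, rho, n=1000, lower=-8.0):
    """P(Z1 < tau, Z2 < tau) for standard bivariate normal liabilities with
    correlation rho, via Simpson's rule applied to
    integral_{-inf}^{tau} phi(z) * Phi((tau - rho*z) / sqrt(1 - rho^2)) dz."""
    if tau <= lower:
        return 0.0
    scale = math.sqrt(1.0 - rho * rho)
    h = (tau - lower) / n  # n must be even for Simpson's rule
    total = 0.0
    for i in range(n + 1):
        z = lower + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * STD.pdf(z) * STD.cdf((tau - rho * z) / scale)
    return total * h / 3.0


def tetrachoric(p):
    """Threshold and liability correlation implied by the proportions
    p = (p0, p1, p2) of pairs with 0, 1, or 2 affected members."""
    p0, p1, _ = p
    tau = STD.inv_cdf(p0 + p1 / 2.0)  # P(Z < tau) = 1 - prevalence
    lo, hi = -0.999, 0.999
    for _ in range(60):  # bisection: both_below increases with rho
        mid = (lo + hi) / 2.0
        if both_below(tau, mid) < p0:
            lo = mid
        else:
            hi = mid
    return tau, (lo + hi) / 2.0


def limiting_ace(p_mz, p_dz):
    """Unconstrained (sigma_A^2, sigma_C^2) that reproduce both sets of
    proportions, assuming r_MZ = a2 + c2 and r_DZ = a2 / 2 + c2."""
    _, r_mz = tetrachoric(p_mz)
    _, r_dz = tetrachoric(p_dz)
    return 2.0 * (r_mz - r_dz), 2.0 * r_dz - r_mz


def cell_probs(tau, rho):
    """Proportions (p0, p1, p2) generated by a correctly specified model."""
    unaffected = STD.cdf(tau)
    p0 = both_below(tau, rho)
    return p0, 2.0 * (unaffected - p0), 1.0 - 2.0 * unaffected + p0


# Round trip: proportions generated under a2 = 0.5, c2 = 0.2 and tau = 1.
a2, c2 = limiting_ace(cell_probs(1.0, 0.7), cell_probs(1.0, 0.45))
print(round(a2, 4), round(c2, 4))
```

Because the model is saturated, this direct inversion should agree with the likelihood maximizer whenever the solution lies inside the parameter space. When it does not (for example, a negative variance component), a constrained optimizer such as the one the authors used would be needed instead.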

As an example of calculating \( {\mathbf{p}}^{\text{true}} \), consider the second scenario with a dichotomous common exposure where the exposure has frequency \( \gamma \). Let \( E \in \{ 0,1\} \) represent the exposure status of a pair. Consider the probability that neither of the twins is affected:

$$\begin{aligned} {\mathbf{p}}_{0j}^{\text{true}} &= \gamma P\left( {Z_{1} < \tau_{j} ,Z_{2} < \tau_{j} |E = 1} \right) \\&\quad + (1 - \gamma )P\left( {Z_{1} < \tau_{j} ,Z_{2} < \tau_{j} |E = 0} \right).\end{aligned} \tag{3}$$

Formulas similar to (3) can be written for \( {\mathbf{p}}_{1j}^{\text{true}} \) and \( {\mathbf{p}}_{2j}^{\text{true}} \), so the probabilities of 0, 1, and 2 affected members in a twin pair can all be calculated. If we let \( \beta = \sqrt {\sigma_{\text{C}}^{2} /\left[ {\gamma \left( {1 - \gamma } \right)} \right]} \) represent the effect size of the exposure, then, because \( E \) is a Bernoulli variable with \( {\text{Var}}(E) = \gamma (1 - \gamma ) \), the variance of the common environment component is \( {\text{Var}}(\beta E) = \beta^{2} \gamma (1 - \gamma ) = \sigma_{\text{C}}^{2} \). Furthermore, conditional on \( E \) we have

$$ \begin{bmatrix} Z_{1} \\ Z_{2} \end{bmatrix} \sim {\text{MVN}}\left( {\begin{bmatrix} {\beta E} \\ {\beta E} \end{bmatrix},\begin{bmatrix} 1 & {\phi \sigma_{\text{A}}^{2} } \\ {\phi \sigma_{\text{A}}^{2} } & 1 \end{bmatrix}} \right). \tag{4} $$

Thus, Eq. (4) may be used in conjunction with Eq. (3) to calculate \( {\mathbf{p}}^{\text{true}} \). Very similar equations may be created for the other scenarios discussed in the paper.
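The calculation for the dichotomous-exposure scenario can be sketched as follows. This is a minimal illustration and not the authors' implementation. It uses only the Python standard library, with Simpson's rule in place of the Genz and Bretz routine, and it treats the threshold `tau` as a given input, whereas the authors solved for thresholds numerically. The covariance matrix is taken exactly as written in Eq. (4), with unit variances and covariance \( \phi \sigma_{\text{A}}^{2} \). The appendix does not restate the definition of \( \phi \), so it is left as a plain argument. The function names are hypothetical.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal: pdf, cdf


def both_below(a, rho, n=1000, lower=-8.0):
    """P(X1 < a, X2 < a) for a standard bivariate normal with correlation rho,
    via Simpson's rule applied to
    integral_{-inf}^{a} phi(x) * Phi((a - rho*x) / sqrt(1 - rho^2)) dx."""
    if a <= lower:
        return 0.0
    scale = math.sqrt(1.0 - rho * rho)
    h = (a - lower) / n  # n must be even for Simpson's rule
    total = 0.0
    for i in range(n + 1):
        x = lower + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * STD.pdf(x) * STD.cdf((a - rho * x) / scale)
    return total * h / 3.0


def pair_probs(tau, mean, rho):
    """(P(0 affected), P(1 affected), P(2 affected)) given the exposure,
    where both liabilities have the same mean and unit variance, Eq. (4)."""
    a = tau - mean                # shift so the liabilities are centered
    unaffected = STD.cdf(a)       # marginal P(Z < tau | E)
    p0 = both_below(a, rho)       # neither twin affected
    p1 = 2.0 * (unaffected - p0)  # exactly one twin affected
    p2 = 1.0 - 2.0 * unaffected + p0
    return p0, p1, p2


def p_true(tau, gamma, sigma_c2, sigma_a2, phi):
    """Mixture over exposure status, Eq. (3), for one twin type."""
    beta = math.sqrt(sigma_c2 / (gamma * (1.0 - gamma)))  # Var(beta*E) = sigma_c2
    rho = phi * sigma_a2
    exposed = pair_probs(tau, beta, rho)   # E = 1
    unexposed = pair_probs(tau, 0.0, rho)  # E = 0
    return tuple(gamma * e + (1.0 - gamma) * u
                 for e, u in zip(exposed, unexposed))


# Illustrative parameter values only; they are not taken from the paper.
print(p_true(tau=1.0, gamma=0.2, sigma_c2=0.2, sigma_a2=0.5, phi=0.5))
```

The three returned probabilities sum to one by construction, and setting `sigma_c2` to zero reduces the mixture to a single bivariate normal, which gives a quick sanity check.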

Cite this article

Benchek, P.H., Morris, N.J. How meaningful are heritability estimates of liability? Hum Genet 132, 1351–1360 (2013). https://doi.org/10.1007/s00439-013-1334-z

  • DOI: https://doi.org/10.1007/s00439-013-1334-z
