  • Review Article

Genome-wide association studies in diverse populations

Key Points

  • Genome-wide association (GWA) studies have identified large numbers of genetic variants that contribute to disease risk.

  • Most GWA studies have been performed primarily in populations of European descent.

  • Phenotypes differ in prevalence across populations, and risk variants differ in frequency, linkage-disequilibrium patterns and effect size across populations. Diverse populations are therefore required to fully characterize risk variants.

  • For a given population, both intrinsic population-genetic properties and the properties of genomic resources affect the utility of tag SNPs and the performance of genotype-imputation methods.

  • Population-genetic modelling provides a basis for examining GWA phenomena in diverse populations and for testing the potential of new statistical methods for improving GWA in diverse populations.

  • A combination of population-genetic modelling, statistical methods targeted to diverse populations and new genomic resources will help to address challenges involved in extending GWA to diverse populations.

Abstract

Genome-wide association (GWA) studies have identified a large number of SNPs associated with disease phenotypes. As most GWA studies have been performed in populations of European descent, this Review examines the issues involved in extending the consideration of GWA studies to diverse worldwide populations. Although challenges exist with issues such as imputation, admixture and replication, investigation of a greater diversity of populations could make substantial contributions to the goal of mapping the genetic determinants of complex diseases for the human population as a whole.

Figure 1: Differences in 'mappability' of a risk variant between two populations with different linkage disequilibrium patterns.
Figure 2: Effect of frequency in Europe on the occurrence of an allele in other regions.
Figure 3: Excess SNP variability in Europeans resulting from ascertainment bias.
Figure 4: Genotype imputation accuracy in 29 populations, with and without external reference panels.
Figure 5: Imputation in admixed populations.

Acknowledgements

We thank L. Hindorff for detailed information on the National Human Genome Research Institute (NHGRI) catalogue of GWA studies, the DIAGRAM Consortium for use of prepublication data, J. Li and S. Zöllner for helpful discussions, and N. Patterson and an anonymous reviewer for comments on a draft of the manuscript. We are grateful to M. DeGiorgio, M. Jakobsson, S. Reddy and P. Scheet for assistance with Box 1 and with figure preparation. Support was provided by US National Institutes of Health grants DK062370, GM081441, HG000376 and HL090564, and by grants from the Burroughs Wellcome Fund and the Alfred P. Sloan Foundation.

Author information

Corresponding author

Correspondence to Noah A. Rosenberg.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Related links

FURTHER INFORMATION

1000 Genomes Project

Human Genome Diversity Cell Line Panel

International HapMap Project

Nature Reviews Genetics article series on Genome-wide association studies

NHGRI list of GWA studies

Glossary

Genome-wide association studies

Study designs in which many markers spread across a genome are genotyped, and tests of statistical association with a phenotype are performed locally along the genome.

Genotype imputation

Probabilistic prediction of genotypes that have not been measured experimentally.

Principal component

A composite variable that summarizes the variation across a larger number of variables, each represented by a column of a matrix.

Loading

In a principal components analysis, a quantity that represents the contribution of one of the original variables (columns of the data matrix) to one of the principal components.
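As an illustration of the two entries above, the following minimal sketch computes the first principal component of a small genotype matrix and reports its loadings. It is not taken from the article: the function name, the toy matrix and the use of power iteration on the sample covariance matrix are all assumptions made for this example.

```python
import math


def first_principal_component(matrix, iterations=200):
    """Return (loadings, scores) for the first principal component.

    `matrix` is a list of rows (individuals); each column is one variable
    (for example, one SNP coded as 0, 1 or 2 copies of an allele).
    Power iteration is used here only for illustration; it assumes the
    starting vector is not orthogonal to the leading eigenvector.
    """
    n, m = len(matrix), len(matrix[0])
    # Centre each column on its mean.
    means = [sum(row[j] for row in matrix) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in matrix]
    # Sample covariance matrix of the columns.
    cov = [[sum(r[j] * r[k] for r in centered) / (n - 1) for k in range(m)]
           for j in range(m)]
    # Power iteration: repeatedly multiply by cov and renormalize.
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):
        w = [sum(cov[j][k] * v[k] for k in range(m)) for j in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:  # no variation at all; keep the starting vector
            break
        v = [x / norm for x in w]
    # Each individual's score is its centred row projected onto the loadings.
    scores = [sum(r[j] * v[j] for j in range(m)) for r in centered]
    return v, scores


# Hypothetical genotypes: columns 0 and 1 vary together, column 2 is constant.
genotypes = [[0, 0, 1], [1, 1, 1], [2, 2, 1], [1, 1, 1]]
loadings, scores = first_principal_component(genotypes)
```

In this toy matrix the first two columns contribute equally to the component and the constant column contributes nothing, so its loading is zero.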

SNP

A nucleotide site at which two or more variants exist in a population. Most SNPs in genome-wide association studies are biallelic.

Tag SNP

A SNP chosen from a larger set of available SNPs for use in an association study. Tag SNPs are generally selected on the basis of favourable linkage disequilibrium properties.

Linkage disequilibrium

A statistical association in the occurrence of alleles at separate loci.
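One common way to quantify this association between two biallelic loci is the coefficient D and the derived r2 statistic. The sketch below is a minimal illustration under stated assumptions: haplotypes are phased and coded 0/1, and the function name and sample data are hypothetical rather than taken from the article.

```python
def ld_stats(haplotypes):
    """Return (D, r_squared) for two biallelic loci.

    `haplotypes` is a list of (a, b) pairs, one per sampled chromosome,
    with the allele at each locus coded as 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at locus B
    # Frequency of the haplotype carrying allele 1 at both loci.
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    # D is the excess of the observed haplotype frequency over the value
    # expected if the two loci were independent.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic in the sample")
    return d, d * d / denom


# Hypothetical sample of 10 phased haplotypes.
sample = [(1, 1)] * 4 + [(0, 0)] * 4 + [(1, 0)] + [(0, 1)]
d, r2 = ld_stats(sample)
```

With independent loci both D and r2 are zero; r2 reaches 1 only when each allele at one locus always occurs with the same allele at the other.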

Tag-SNP portability

The utility of SNPs chosen as tags in one population for use as tags in another population.

Minor allele frequency

The frequency of the less frequent allele at a biallelic genetic locus.

Expected heterozygosity

The probability for a locus that two alleles drawn from its allele-frequency distribution are distinct.
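The two preceding definitions can be written as short functions. This is a minimal sketch, assuming allele frequencies are supplied directly and that the two alleles are drawn independently with replacement; the function names and example values are hypothetical.

```python
def minor_allele_frequency(p):
    """Frequency of the less frequent allele at a biallelic locus,
    given the frequency `p` of either allele."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return min(p, 1.0 - p)


def expected_heterozygosity(freqs):
    """Probability that two alleles drawn independently from the
    allele-frequency distribution `freqs` are distinct: 1 - sum(p_i^2)."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


# Hypothetical biallelic locus with allele frequencies 0.8 and 0.2.
maf = minor_allele_frequency(0.8)
het = expected_heterozygosity([0.8, 0.2])
```

For a biallelic locus the second function reduces to 2p(1 - p), which is largest when both alleles have frequency 0.5.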

Ascertainment bias

A distortion in results due to the use of a subsample that, in a systematic manner, fails to properly represent a larger sample.

Admixed population

A population formed recently from the mixing of two or more groups whose ancestors had long been separated.

Microsatellite

A type of genetic marker in which individuals vary in their number of tandemly repeated copies of a short DNA unit.

Coalescent

A specific stochastic process that describes the relationship among genetic lineages sampled in a population.

Recombination hot spot

A region of the genome in which the per-generation recombination rate is substantially elevated above the genome-wide average.

Contingency table

A table of observations of two or more variables that might have a statistical relationship of interest. For each variable, a contingency table places each observation into one of a series of categories.
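In an association study, a simple contingency table cross-classifies allele counts by case or control status. The sketch below runs a Pearson chi-square test on such a 2 x 2 table. It is an illustration only: the counts are invented, the function name is hypothetical, and the article does not prescribe this particular test.

```python
import math


def allelic_chi_square(case_risk, case_other, control_risk, control_other):
    """Pearson chi-square test (1 degree of freedom, no continuity
    correction) on a 2 x 2 table of allele counts.

    Rows are cases and controls; columns are the risk allele and the
    other allele. Returns (chi_square, p_value).
    """
    a, b, c, d = case_risk, case_other, control_risk, control_other
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column total must be positive")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical allele counts: cases 60 risk / 40 other, controls 40 risk / 60 other.
chi2, p_value = allelic_chi_square(60, 40, 40, 60)
```

A genome-wide scan repeats a test of this kind at each marker, so the significance threshold would need to account for the large number of tests performed.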

About this article

Cite this article

Rosenberg, N., Huang, L., Jewett, E. et al. Genome-wide association studies in diverse populations. Nat Rev Genet 11, 356–366 (2010). https://doi.org/10.1038/nrg2760

  • DOI: https://doi.org/10.1038/nrg2760
