Genome-wide association studies in diverse populations

Rosenberg, Noah A.; Huang, Lucy; Jewett, Ethan M.; Szpiech, Zachary A.; Jankovic, Ivana; Boehnke, Michael

doi:10.1038/nrg2760

Review Article
Published: May 2010

Nature Reviews Genetics volume 11, pages 356–366 (2010)

Key Points

Genome-wide association (GWA) studies have identified large numbers of genetic variants that contribute to disease risk.
Most GWA studies have been performed in populations of European descent.
Phenotypes differ in prevalence across populations, and risk variants differ in frequency, linkage disequilibrium patterns and effect size across populations. Diverse populations are therefore required for fully characterizing risk variants.
For a given population, both intrinsic population-genetic properties and the properties of genomic resources affect the utility of tag SNPs and the performance of genotype-imputation methods.
Population-genetic modelling provides a basis for examining GWA phenomena in diverse populations and for testing the potential of new statistical methods for improving GWA in diverse populations.
A combination of population-genetic modelling, statistical methods targeted to diverse populations and new genomic resources will help to address challenges involved in extending GWA to diverse populations.

Abstract

Genome-wide association (GWA) studies have identified a large number of SNPs associated with disease phenotypes. As most GWA studies have been performed in populations of European descent, this Review examines the issues involved in extending the consideration of GWA studies to diverse worldwide populations. Although challenges exist with issues such as imputation, admixture and replication, investigation of a greater diversity of populations could make substantial contributions to the goal of mapping the genetic determinants of complex diseases for the human population as a whole.

**Figure 1: Differences in 'mappability' of a risk variant between two populations with different linkage disequilibrium patterns.**

**Figure 2: Effect of frequency in Europe on the occurrence of an allele in other regions.**

**Figure 3: Excess SNP variability in Europeans resulting from ascertainment bias.**

**Figure 4: Genotype imputation accuracy in 29 populations, with and without external reference panels.**

**Figure 5: Imputation in admixed populations.**

Acknowledgements

We thank L. Hindorff for detailed information on the National Human Genome Research Institute (NHGRI) catalogue of GWA studies, the DIAGRAM Consortium for use of prepublication data, J. Li and S. Zöllner for helpful discussions, and N. Patterson and an anonymous reviewer for comments on a draft of the manuscript. We are grateful to M. DeGiorgio, M. Jakobsson, S. Reddy and P. Scheet for assistance with Box 1 and with figure preparation. Support was provided by US National Institutes of Health grants DK062370, GM081441, HG000376 and HL090564, and by grants from the Burroughs Wellcome Fund and the Alfred P. Sloan Foundation.

Author information

Lucy Huang, Ethan M. Jewett, Zachary A. Szpiech, Ivana Jankovic and Michael Boehnke: These authors contributed equally to this work.

Authors and Affiliations

Department of Human Genetics, University of Michigan, Ann Arbor, 48109, Michigan, USA
Noah A. Rosenberg
Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, 48109, Michigan, USA
Noah A. Rosenberg, Lucy Huang, Ethan M. Jewett, Zachary A. Szpiech & Ivana Jankovic
Life Sciences Institute, University of Michigan, Ann Arbor, 48109, Michigan, USA
Noah A. Rosenberg
Department of Biostatistics, University of Michigan, Ann Arbor, 48109, Michigan, USA
Noah A. Rosenberg & Michael Boehnke
Center for Statistical Genetics, University of Michigan, Ann Arbor, 48109, Michigan, USA
Michael Boehnke

Corresponding author

Correspondence to Noah A. Rosenberg.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Glossary

Genome-wide association studies: Study designs in which many markers spread across a genome are genotyped, and tests of statistical association with a phenotype are performed locally along the genome.
Genotype imputation: Probabilistic prediction of genotypes that have not been measured experimentally.
Principal component: A composite variable that summarizes the variation across a larger number of variables, each represented by a column of a matrix.
Loading: In a principal components analysis, a quantity that represents the contribution of one of the original variables (columns of the data matrix) to one of the principal components.
SNP: A nucleotide site at which two or more variants exist in a population. Most SNPs in genome-wide association studies are biallelic.
Tag SNP: A SNP chosen from a larger set of available SNPs for use in an association study. Tag SNPs are generally selected on the basis of favourable linkage disequilibrium properties.
Linkage disequilibrium: A statistical association in the occurrence of alleles at separate loci.
Tag-SNP portability: The utility of SNPs chosen as tags in one population for use as tags in another population.
Minor allele frequency: The frequency of the less frequent allele at a biallelic genetic locus.
Expected heterozygosity: The probability for a locus that two alleles drawn from its allele-frequency distribution are distinct.
Ascertainment bias: A distortion in results due to the use of a subsample that, in a systematic manner, fails to properly represent a larger sample.
Admixed population: A population formed recently from the mixing of two or more groups whose ancestors had long been separated.
Microsatellite: A type of genetic marker in which individuals vary in their number of tandemly repeated copies of a short DNA unit.
Coalescent: A specific stochastic process that describes the relationship among genetic lineages sampled in a population.
Recombination hot spot: A region of the genome in which the per-generation recombination rate is substantially elevated above the genome-wide average.
Contingency table: A table of observations of two or more variables that might have a statistical relationship of interest. For each variable, a contingency table places each observation into one of a series of categories.
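
Several of the glossary quantities can be computed directly from allele data. The following is a minimal Python sketch, not code from the article: the function names are hypothetical, the inputs are toy data, and the r² statistic is assumed here as one common way to quantify linkage disequilibrium between two biallelic loci from phased haplotypes (the glossary itself does not prescribe a measure).

```python
from collections import Counter


def allele_frequencies(alleles):
    """Return {allele: frequency} for a list of observed allele copies."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


def minor_allele_frequency(alleles):
    """Frequency of the less frequent allele at a biallelic locus."""
    freqs = allele_frequencies(alleles)
    if len(freqs) > 2:
        raise ValueError("locus is not biallelic")
    if len(freqs) == 1:
        return 0.0  # monomorphic locus: the second allele was never observed
    return min(freqs.values())


def expected_heterozygosity(alleles):
    """Probability that two alleles drawn independently from the
    allele-frequency distribution are distinct: 1 - sum(p_i^2)."""
    freqs = allele_frequencies(alleles)
    return 1.0 - sum(p * p for p in freqs.values())


def r_squared(haplotypes):
    """r^2 between two biallelic loci (an assumed LD measure).

    `haplotypes` is a list of (a, b) pairs, one per phased chromosome,
    with each allele coded 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # deviation from independence between the loci
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom


# Toy data (hypothetical): eight allele copies at one locus,
# and eight phased haplotypes across two loci.
locus = ["A"] * 6 + ["G"] * 2
perfectly_linked = [(1, 1)] * 4 + [(0, 0)] * 4

maf = minor_allele_frequency(locus)
het = expected_heterozygosity(locus)
ld = r_squared(perfectly_linked)
```

In this toy example the two loci always carry matching alleles, so one would serve as a perfect tag for the other; in real data, r² between the same pair of SNPs can differ between populations, which is the issue that tag-SNP portability describes.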

About this article

Cite this article

Rosenberg, N., Huang, L., Jewett, E. et al. Genome-wide association studies in diverse populations. Nat Rev Genet 11, 356–366 (2010). https://doi.org/10.1038/nrg2760

Issue Date: May 2010
DOI: https://doi.org/10.1038/nrg2760
