Introduction

Dyslexia (or reading disability, RD) and specific language impairment (SLI) are two prevalent childhood learning disorders that show extensive co-morbidity. Dyslexia involves an unexpected deficit in the development of reading skills whereas SLI refers to an impairment in the acquisition of oral language (Pennington and Bishop 2009). In both disorders, a diagnosis usually depends upon normal non-verbal performance, and the problems must be unexpected. As such, neither dyslexia nor SLI is diagnosed in children who present with other co-occurring medical conditions (e.g. hearing loss for SLI) or neurological disorders (e.g. generalised learning disability) (Bishop 1997).

Like many neurodevelopmental conditions, SLI and dyslexia are considered to have a complex genetic aetiology, caused by multifaceted interactions between various genetic and environmental factors. Studies have repeatedly indicated that, for both disorders, relatives of affected individuals have a 30–50% chance of also developing the disorder (Snowling et al. 2007; Barry et al. 2007). This contrasts with a general population prevalence of approximately 5% (Snowling et al. 2007; Barry et al. 2007). Moreover, such investigations have found a significant co-morbidity between dyslexia and SLI. It is estimated that 43% of children with SLI are later diagnosed with reading disability (Snowling et al. 2000) and 55% of dyslexic children meet diagnostic criteria for SLI (McArthur et al. 2000). This observation has led to the suggestion that SLI and dyslexia may share some aetiological factors or that they may represent different manifestations of the same underlying cognitive deficit (Bishop and Snowling 2004; Catts et al. 2005; Pennington and Bishop 2009).

Genetic linkage studies of dyslexia and SLI have identified several loci which may contribute to these disorders (Williams and O’Donovan 2006; Newbury et al. 2005). These include seven putative dyslexia loci (DYX1 on chromosome 15q21, DYX2 on chromosome 6p21, DYX3 on chromosome 2p, DYX5 on chromosome 3p, DYX6 on chromosome 18, DYX8 on chromosome 1p, and DYX9 on chromosome Xq27) and three SLI linkage loci (SLI1 on chromosome 16q, SLI2 on chromosome 19q and SLI3 on chromosome 13q). More recently, association studies of these linkage loci have led to the identification of specific genetic variants. To our knowledge, no genome-wide association studies have been published for either disorder. Below, we provide a summary of the genetic studies that led to the identification of candidate genes. Further details of these and other investigations are described in comprehensive recent reviews (Scerri and Schulte-Korne 2009; Newbury et al. 2010).

DYX1C1 (dyslexia susceptibility 1 candidate 1, also known as EKN1, OMIM#608716) was the first gene to be proposed as a candidate for dyslexia susceptibility (Taipale et al. 2003). This gene was identified by the breakpoint mapping of a chromosome translocation (t(2;15)(q11;q21)) co-segregating with reading disability in a single family (Taipale et al. 2003). In addition, two coding variants in DYX1C1, one that alters a putative transcription factor binding site and another that introduces a premature stop codon, were found to be associated with dyslexia in a larger sample (Taipale et al. 2003). Initial replication attempts met with little success (Wigg et al. 2004; Scerri et al. 2004; Bellini et al. 2005; Ylisaukko-Oja et al. 2005; Marino et al. 2005; Meng et al. 2005a) and it was suggested that this may be due to the low minor allele frequency of the SNPs described or differences in linkage disequilibrium patterns between samples. However, recent studies have reported significant association between DYX1C1 variants and both working memory (Marino et al. 2007) and reading measures (Dahdouh et al. 2009; Bates et al. 2009).

The ROBO1 (Roundabout 1, also known as DUTT1, OMIM#602430) gene was also identified as a candidate for dyslexia susceptibility through the breakpoint mapping of a chromosome translocation involving a dyslexia linkage locus (DYX5, chromosome 3) in a dyslexic individual (Nopola-Hemmi et al. 2001). Sequencing of this gene in members of a large multi-generational pedigree, in whom the DYX5 linkage was originally identified, revealed a rare SNP haplotype that segregated with dyslexia (Hannula-Jouppi et al. 2005). This haplotype was not observed in other families with dyslexia, suggesting that ROBO1 may play a role only in sporadic cases (Hannula-Jouppi et al. 2005).

The DYX2 locus on chromosome 6 has produced the most consistent observations of association with dyslexia, and significant associations have been described for two genes in this region: DCDC2 (Doublecortin domain containing, OMIM#605755) (Meng et al. 2005b; Schumacher et al. 2006; Lind et al. 2010; Rice et al. 2009) and KIAA0319 (OMIM#609269) (Francks et al. 2004; Cope et al. 2005; Ludwig et al. 2008; Rice et al. 2009). Dyslexia-associated alleles in KIAA0319 fall within a putative promoter region (Couto et al. 2009b) and a single variant has been shown to specifically cause a reduced expression of this gene (Dennis et al. 2009).

Lastly, Anthoni et al. (2007) reported association to SNPs in two genes located within the DYX3 locus: MRPL19 and C2ORF3. These two genes are in high linkage disequilibrium and appear to be co-regulated. These associations have yet to be replicated.

To date, only three associations have been described for SLI and these have yet to be independently verified. The first association to be reported was with CNTNAP2 (contactin-associated protein-like 2, OMIM#604569) on chromosome 7q (Vernes et al. 2008). This gene was initially identified as a candidate for SLI as it is regulated by FOXP2 (OMIM#605317). Disruption of the FOXP2 gene causes a severe form of language impairment in some rare cases (Lai et al. 2001). Subsequent investigation of CNTNAP2 variants demonstrated a significant association to performance on a task of phonological short-term memory in a language-impaired cohort (Vernes et al. 2008).

The other two SLI associations were identified by a high density SNP screen for association in the SLI1 region of linkage on chromosome 16 (Newbury et al. 2009). Across SLI1, two clusters of variants were observed to be significantly associated with phonological short-term memory in language-impaired families. The first cluster fell within the CMIP (C-MAF Inducing Protein, OMIM#610112) gene and the second within the ATP2C2 (ATPase, Ca++ transporting, type 2C, member 2, OMIM#613082) gene. Both associations were internally replicated in a sample selected from a population cohort on the basis of low language performance.

The identification of specific genetic variants allows the direct evaluation of the question of shared genetic influences between co-morbid disorders. For example, studies have found that genes that contribute to dyslexia or SLI are also associated with related but distinct neurodevelopmental disorders. CNTNAP2 is a prime example and has been associated with autism (Alarcon et al. 2008; Arking et al. 2008; Bakkaloglu et al. 2008; Rossi et al. 2008; Jackman et al. 2009; Poot et al. 2009), ADHD (Elia et al. 2009), Gilles de la Tourette syndrome (Verkerk et al. 2003), schizophrenia and epilepsy (Strauss et al. 2006; Friedman et al. 2008) and mental retardation (Ballarati et al. 2009; Zweier et al. 2009). Similarly, variants in ATP2C2 and DCDC2 have been associated with ADHD (Lesch et al. 2008; Couto et al. 2009a) and ROBO1 has been shown to have reduced expression in autistic cases (Anitha et al. 2008). Rice et al. directly addressed the question of shared effects between SLI and dyslexia by performing association analyses of SNPs in DCDC2 and KIAA0319 in a sample of children affected by SLI (Rice et al. 2009). They found marginal association (0.05 > P > 0.01) between several SNPs across KIAA0319 and measures of reading, articulation, vocabulary and an omnibus test of language ability, indicating that this gene may have pleiotropic effects upon both reading and language measures or that neurodevelopmental pathways are shared between these traits.

In this paper, we further explore the possibility of shared genetic influences between SLI and dyslexia. We investigate variants in the DYX1C1, DCDC2, KIAA0319, MRPL19/C2ORF3, CNTNAP2, CMIP and ATP2C2 genes in groups of SLI and dyslexic subjects. We perform both case–control and quantitative association analyses using measures of oral and written language skills in both cohorts. The aims of this study were, firstly, to provide a replication set for previously identified risk variants and, secondly, to investigate the possibility of shared genetic effects across disorders and traits. We replicate association between reading-related measures and DCDC2 and KIAA0319 and provide evidence for association between spoken language and KIAA0319, indicating that this gene may have shared effects across traits and disorders. We detect association between multiple variants in CNTNAP2 and CMIP and reading measures, but these findings were limited to the SLI cohort, indicating that, in contrast to KIAA0319, the effects of these genes may be more specific.

Subjects and methods

Subjects

SLI cohort

The SLI families used in this study were provided by the SLI Consortium (SLIC) cohort, which has previously been described in detail (SLIC 2002, 2004; Falcaro et al. 2008). This family-based sample included 780 individuals from 181 two-generation families and had a child male:female ratio of 1.6:1. All families were ascertained on the basis of a single proband with receptive and/or expressive language skills, either currently or in the past, more than 1.5 SD below the normative mean for their chronological age. The samples were each assessed by one of five separate centres across the UK and were derived from both clinical and epidemiological studies: the Newcomen Centre at Guy’s Hospital, London; the Cambridge Language and Speech Project (CLASP; Burden et al. 1996); the Child Life and Health Department at the University of Edinburgh (Clark et al. 2007); the Department of Child Health at the University of Aberdeen; and the Manchester Language Study (Conti-Ramsden and Botting 1999; Conti-Ramsden et al. 1997). Ethical approval for these studies was provided by local ethics committees. Any child reported to have a non-verbal IQ of below 80 was excluded from the study. Other exclusion criteria included monozygotic twinning, chronic illness requiring multiple hospital visits or admissions, deafness, a clinical diagnosis of autism, English being a second language, known neurological disorders and being under local authority care.

Whole blood (Guy’s Hospital and Edinburgh) or buccal swab (Cambridge, Aberdeen and Manchester) samples were collected from all available family members, regardless of language ability. DNA was extracted using standard protocols and all buccal swab DNAs were pre-amplified using either a PCR pre-amplification procedure (PEP (Zhang et al. 1992)) or a rolling circle whole genome amplification protocol (Genomiphi—GE Healthcare).

Language abilities of all available SLIC children (regardless of language ability) were assessed using the expressive (ELS, n = 392) and receptive (RLS, n = 392) scales of the Clinical Evaluation of Language Fundamentals (CELF-R) battery (Semel et al. 1992) and a 28-item nonword repetition (NWR, n = 451) test (Gathercole et al. 1994). Reading aptitude was measured using the single-word reading (Read, n = 312), single-word spelling (Spell, n = 310) and reading comprehension (Comp, n = 276) tests from the Wechsler Objective Reading Dimensions (WORD) (Rust et al. 1993). Verbal and non-verbal IQ were examined, primarily for exclusion purposes, using the Wechsler Intelligence Scales for Children (WISC-III) (Wechsler 1992). Due to the age constraints of standardised tests, all phenotypic data were collected for children only. All scores were normalised for age effects. Due to logistical constraints, the reading tests were only performed for a subset of SLI individuals.

Correlations between all the measures analysed in the SLI family sample can be found in Electronic Supplementary Table S1.

All 181 families in the SLI cohort formed part of the sample previously used to identify association between language measures and variants in CNTNAP2 (Vernes et al. 2008), ATP2C2 and CMIP (Newbury et al. 2009). Although an identical cohort, this investigation included repeat genotyping data and the reported P-values may therefore differ from those previously described due to small differences in missing genotype information.

Dyslexia cohort

The collection of families used for quantitative trait association has been extensively described previously (Francks et al. 2004). Briefly, all probands and siblings from our complete Oxford set of 264 unrelated nuclear families (a total of 634 siblings with a male:female ratio of 1.5:1) were identified from the dyslexia clinic at the Royal Berkshire Hospital (Reading, UK). Families were ascertained if the proband had a British Ability Scales (BAS) single-word reading score >2 SDs below that predicted by their intelligence quotient (IQ) and if at least one other sibling had a history of reading problems. These criteria identified some probands with high IQ scores and BAS reading scores within the ‘normal’ range. Therefore, after collecting 173 UK families, the ascertainment conditions were changed such that the only required criterion was that the proband’s BAS single-word reading score had to be ≥1 SD below the population mean for their age (and not IQ), along with an IQ ≥ 90. Probands were excluded if they had been diagnosed with co-occurring developmental disorders such as SLI, autism or attention deficit-hyperactivity disorder (ADHD).

We administered a battery of psychometric tests to all probands and siblings in each family, and we age-adjusted and standardized their scores against a normative control data set, as described elsewhere (Marlow et al. 2001; Fisher et al. 2002). These included measures of single-word reading ability (READ; n = 634) and spelling ability (SPELL; n = 603) from the British Ability Scales (BAS) (Elliot et al. 1983) or Wide Range Achievement Test (WRAT-R) for children older than 14.5 (Jastak and Wilkinson 1984), phonological decoding ability (PD; n = 629) (use of letter to sound relationship rules to read pseudowords) (Castles and Coltheart 1993), phonemic awareness (PA; n = 591) (awareness of the phonemic structure of language, assessed by a test which required the individuals to orally move phonemes either within the same word or between words, also known as “spoonerism”) (Gallagher and Frederickson 1995), orthographic coding (OC-irreg; n = 625) (the ability to read real words that do not follow conventional spelling to sound rules, e.g. yacht) (Castles and Coltheart 1993) and orthographic coding assessed by a forced word choice test (OC-choice; n = 548) (identification of a correctly spelt word from two phonologically equivalent options, e.g. rane vs rain) (Olson et al. 1994). Verbal and non-verbal reasoning were assessed using the BAS similarities (SIM; n = 620) and BAS matrices (MAT; n = 588) tests, respectively (Elliot et al. 1983). The Similarities sub-scale of the Wechsler Adult Intelligence Scales (WAIS), which is analogous to the BAS similarities test, was used when age was >17.5 years (Wechsler 1981). Note that the measures of reading and spelling were derived from tests that differ from those used in the SLI cohort, although they target the same constructs.

Correlations between all the measures analysed in the dyslexia family samples can be found in Electronic Supplementary Table S2.

The 264 families have been previously used to identify the KIAA0319 locus (Francks et al. 2004) and to study the DYX1C1 locus (Scerri et al. 2004). Therefore, some of the results are a repetition of data previously published. These results are clearly identified in the tables and text.

In addition to the dyslexia family-based sample, we analysed a collection of 331 UK unrelated cases which have not been investigated in previous studies for any of the loci under study here. These samples were recruited through the Dyslexia Research Centre clinics in Oxford and Reading, and the Aston Dyslexia and Development clinic in Birmingham. The cases are between 8 and 18.5 years of age, have a BAS2 single-word reading score ≤100 (at chronological age) and >1.5 SDs below that predicted by their IQ scores. Since these individuals were collected as a case cohort, we did not investigate quantitative measures for them.

Ethical approval for this study was acquired from the Oxfordshire Psychiatric Research Ethics Committee (OPREC O01.02). Written informed consent to participate in this study was obtained from all families and individuals.

Data cleaning and data handling

SNPs were selected primarily through a literature review, with priority given to those reported as highly significant and consistently associated with dyslexia or SLI.

In total, 31 SNPs were genotyped. These included four SNPs from MRPL19/C2ORF3, four SNPs from DCDC2, six SNPs from KIAA0319, two SNPs from DYX1C1, five SNPs from CNTNAP2, five SNPs from CMIP and five SNPs from ATP2C2 (Table 1). SLI and dyslexia individuals were genotyped in separate experiments; the SNP data quality therefore varied between these cohorts. In addition, control data were not available for all SNPs, so case–control analyses were not performed for all 31 SNPs.

Table 1 SNPs analysed in the current study

SNPs were genotyped using the Sequenom iPLEX assay. Genotypes in the form of marker clusters were checked manually in the MassArray TyperAnalyser software. Any SNP with a success rate of <80%, or that showed consistent inheritance errors (>10 errors after data clean-up) within each independent cohort, was removed from the association analyses. Probabilities of Hardy–Weinberg Equilibrium (HWE) were calculated for all SNPs within PEDSTATS (Wigginton and Abecasis 2005) and any SNP with a HWE-P of <0.0001 was excluded.
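As a rough illustration of the three exclusion filters described above, the following Python sketch applies them to per-SNP summary counts. It is not the PEDSTATS or TyperAnalyser procedure: the function names, argument names and example counts are hypothetical, and the Hardy–Weinberg check is a simple 1-df chi-square goodness-of-fit approximation used here as a stand-in for whatever test PEDSTATS applies. In family data the genotype counts would normally be restricted to unrelated individuals (e.g. founders).

```python
import math


def hwe_chi2_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P-value of a 1-df chi-square goodness-of-fit test for Hardy-Weinberg
    equilibrium, given observed genotype counts (AA, AB, BB)."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no called genotypes")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic marker: no departure can be tested
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variate with 1 df
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(n_aa: int, n_ab: int, n_bb: int, n_attempted: int,
              inheritance_errors: int, min_call_rate: float = 0.80,
              max_errors: int = 10, min_hwe_p: float = 1e-4) -> bool:
    """Apply the three filters from the text: success rate, inheritance
    errors and HWE P-value. Thresholds default to the values quoted above."""
    called = n_aa + n_ab + n_bb
    if called / n_attempted < min_call_rate:
        return False  # success rate below 80%
    if inheritance_errors > max_errors:
        return False  # more than 10 inheritance errors
    return hwe_chi2_p(n_aa, n_ab, n_bb) >= min_hwe_p


# Hypothetical SNP: 100 genotypes attempted and called, 2 inheritance errors
keep = passes_qc(25, 50, 25, n_attempted=100, inheritance_errors=2)
```

The filters are applied per cohort, so in practice a SNP could pass in one cohort and fail in the other.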

Statistical analyses

The primary samples used for this study were family-based and thus the principal analysis consisted of a quantitative test of association within QTDT (Abecasis et al. 2000) using all available language and reading measures in each data set. In both cohorts, a number of unlinked SNP markers spread across the genome (n = 28 in the dyslexia cohort and n = 25 in the SLI cohort) were directly tested for population stratification effects within QTDT. These tests supported an assumption of no population stratification and we therefore applied the total association QTDT option across all analyses. This model jointly considers both within- and between-family variance and hence should provide more power to detect association provided there is no sample stratification. Identity by Descent (IBD) values (required as an input for QTDT) were calculated in MERLIN (Abecasis et al. 2002).

Since each cohort applied different measures of language and reading ability, the QTDT tests were performed independently within the SLI and dyslexia families.

This investigation included the repeat genotyping of some SNPs already assayed in these families. The reported P-values may differ from those previously reported due to small differences in missing genotype information.

Case–control analyses

In addition to the quantitative tests of association, we performed an allelic test of association within PLINK (Purcell et al. 2007). In these analyses, for the SLI cohort, the case was defined as the proband upon whom the family was ascertained (and therefore had receptive and/or expressive language skills either currently or in the past, more than 1.5 SD below the normative mean for their chronological age). This included additional individuals from singleton families providing a total of 213 cases. Note that this proband selection is different from that applied in our previous association paper, in which we specifically investigated NWR and therefore SLI cases were selected on the basis of a low NWR performance alone (Newbury et al. 2009). For the dyslexia cohort, this included 188 cases, selected on the basis of severe reading deficit. In addition, we analysed the collection of 331 unrelated UK dyslexic cases specifically recruited for case–control analyses as an independent case–control sample. Between 112 and 363 unselected European controls from the Human Random Control (HRC) panel of the European Collection of Cell Cultures (ECACC) were genotyped for each SNP and used as universal controls across all case–control analyses. Thus the control data were identical for each case–control analysis performed.
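An allelic test compares allele counts, rather than genotype counts, between cases and controls. A minimal sketch of such a test is given below, assuming a Pearson chi-square on the 2 × 2 allele-count table with 1 df and no continuity correction. The function name and the counts in the usage example are hypothetical and are not data from this study.

```python
import math


def allelic_test(case_minor: int, case_major: int,
                 ctrl_minor: int, ctrl_major: int):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Returns (chi2, p_value, odds_ratio), where the odds ratio is for the
    minor allele in cases relative to controls.
    """
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("a row or column of the table is empty")
    chi2 = n * (a * d - b * c) ** 2 / margins
    # Upper-tail probability of a chi-square variate with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (a * d) / (b * c) if b * c else math.inf
    return chi2, p_value, odds_ratio


# Hypothetical example: 100 cases and 100 controls (200 alleles each)
chi2, p_value, odds_ratio = allelic_test(60, 140, 40, 160)
```

Because each individual contributes two alleles, the table total is twice the number of genotyped individuals. Using the same control panel for every comparison, as described above, means that the separate case–control results are not independent of one another.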

Results

In this study a large number of tests were performed. The application of a Bonferroni correction would therefore yield a very stringent P-value threshold. Furthermore, since we tested a restricted and targeted selection of SNPs from each locus, these tests were not independent (due to both the proximity of the markers within each locus and the correlation between the phenotypes). Therefore we do not correct our P-values for multiple testing but rather aim to describe trends of association across markers and traits.

Association analysis of dyslexia loci (Tables 2 and 3)

The results of all quantitative analyses of dyslexia loci in the SLI and dyslexia families can be found in Table 2. The results of the case–control analyses of the dyslexia loci are in Table 3.

Table 2 Quantitative association to dyslexia loci in family samples
Table 3 Case-control analyses of dyslexia loci

Across all the dyslexia loci tested, the minimum P-value seen with measures of reading ability was 0.013, observed with rs2143340 (KIAA0319) and the single-word reading measure (Read) in the SLI families. The minimum P-value arising from the investigation of language measures was 0.004 between receptive language (RLS) and rs3212236 (KIAA0319) in the SLI families.

In the SLI families, two KIAA0319 markers (rs2143340 and rs3212236) and one MRPL19/C2ORF3 marker (rs1000585) showed trends of association with reading-related measures. No notable association was observed for either DCDC2 or DYX1C1 with any of the traits analysed in the SLI families.

In the dyslexia families we found a trend of association between orthographic and spelling measures and SNPs in KIAA0319, as previously reported (Francks et al. 2004; Harold et al. 2006; Dennis et al. 2009) and between a single SNP in DYX1C1 and orthographic choice.

In the case–control analyses of the dyslexia probands selected from the family samples, we observed a trend of association to two SNPs in DCDC2 (minimum P = 0.027). Association to one of these markers (rs807724) was also supported by case–control analyses in an independent cohort of 331 unrelated dyslexic cases (P = 0.036). In the SLI probands, we also observed marginal association to single SNPs in KIAA0319 (rs2143340, P = 0.036) and DYX1C1 (rs57809907, P = 0.012) but, in both cases, with opposite direction compared to the original reports (Francks et al. 2004; Cope et al. 2005; Harold et al. 2006; Taipale et al. 2003).

Association Analysis of SLI Loci (Table 4)

Across the SLI loci investigated, the minimum P-value achieved was 8 × 10⁻⁵. This was between rs17236239 in CNTNAP2 and NWR in the SLI families and has been previously reported (Vernes et al. 2008). Also as previously reported, we observed strong evidence for association between SNPs in CMIP, CNTNAP2 and ATP2C2 and multiple language measures in the SLI families.

Table 4 Quantitative association to SLI loci in family samples

In addition, in the SLI families we identified novel evidence of association to reading-related measures (Read, Comp and Spell) for multiple SNPs in CNTNAP2 and CMIP. In contrast, in the SLI families, SNPs in ATP2C2 only showed association to measures of language ability (ELS, RLS and NWR).

In the dyslexia families at the SLI loci, we observed only trends of association to a single SNP in CNTNAP2 (rs7794745 and OC-choice, P = 0.0425), in the opposite orientation from that previously described and to a single SNP in ATP2C2 (rs2875891 and PA, P = 0.0251).

In our case–control analyses, we did not observe any evidence for association to any SLI locus in either the SLI or the dyslexia cases (minimum P = 0.084, data not shown).

Discussion

In this study, we performed a candidate gene association study in three independent samples, one consisting of SLI families, one of dyslexic families and one of dyslexic cases. We evaluated the effects of putative SLI and dyslexia risk variants upon language and reading measures in the two family cohorts and performed case–control analyses across all three sample sets. The aims of this study were twofold: firstly, to provide a replication set for previously identified risk variants and, secondly, to investigate the possibility of shared genetic effects across disorders and traits. We found consistent evidence for a trend of association between KIAA0319 and reading measures in the SLI cohort, thus replicating previous findings from dyslexic individuals (Cope et al. 2005; Dennis et al. 2009; Francks et al. 2004; Harold et al. 2006; Paracchini et al. 2006), population cohorts (Paracchini et al. 2008; Luciano et al. 2007) and SLI families (Rice et al. 2009). In addition, we observed some association between KIAA0319 and expressive and receptive language skills, further supporting the findings of Rice et al. (2009), who suggested that this gene may have pleiotropic effects across measures. We also found association to DCDC2 across two independent case–control analyses of dyslexic individuals, providing consistent support for a role for this gene in susceptibility to reading deficits. Finally, we also detected novel and consistent associations between variants in CNTNAP2 and CMIP and reading measures in SLI families. Association to all three language loci was, at best, sporadic in the dyslexia cohort, indicating that their effects may be specific to language-impaired individuals.

MRPL19/C2ORF3

To our knowledge, this is the first attempted replication analysis of the MRPL19/C2ORF3 locus. In the dyslexia families, we did not see any compelling evidence for association to reading or language measures at this locus. The strongest association was observed in the SLI cohort, where we found a trend of association to rs1000585 (P = 0.0217). However, this result was in the opposite direction to that previously reported (Anthoni et al. 2007) and was observed at a nominal level with only a single SNP. Thus this is likely to represent a false positive.

DCDC2

Variants in DCDC2 had not previously been explored in the SLI families investigated in this manuscript but had been analysed in the dyslexia families (Harold et al. 2006). At this locus, we did not observe any evidence for association in the family samples. Nonetheless, we did observe a consistent, if weak, trend of association to rs807724 across both the dyslexic samples tested in our case–control analyses (min P = 0.027). In these analyses, the minor allele occurred at a higher frequency in both groups of dyslexic probands (24 and 22%) when compared to that of controls (18%) (Table 3). These case–control data therefore provide some support for the involvement of DCDC2 in dyslexia susceptibility.

KIAA0319

The dyslexic family sample used here includes those families in which the KIAA0319 association was initially identified (Francks et al. 2004). However, variants at this locus had not previously been investigated in the SLI families described in this manuscript. Two SNPs (rs3212236 and rs2143340) were found to show a trend of association with reading-related measures, in the same direction as previously described (see Table 1) (Francks et al. 2004; Cope et al. 2005; Harold et al. 2006). Furthermore, these two SNPs were also found to be associated with receptive language ability, again with the same allelic trend (min P = 0.0038). Conversely, two additional KIAA0319 SNPs (rs761100 and rs6935076) were associated with the expressive language scores (min P = 0.0073), but this association was in an opposite orientation from that previously described for reading. Association between KIAA0319 and language ability has previously been suggested by a recent investigation of SLI subjects (Rice et al. 2009) which also found that reduced performance in a test of spoken language was associated with two SNPs in KIAA0319 including the minor allele of rs6935076 (i.e. in an opposite direction to that observed in the present study).

In the case–control analyses of the SLI probands, we observed a trend of association to the major allele of rs2143340 (Table 3, P = 0.036) but in these analyses this effect is in an opposite direction to the original findings for this variant (Table 1) (Francks et al. 2004).

Thus, our analyses provide further evidence that variants in KIAA0319 may not only contribute to reading ability but may also be relevant to the development of other language-related skills.

DYX1C1

We previously analysed DYX1C1 variants in the dyslexic families (Scerri et al. 2004) and described nominal association at rs57809907 in an opposite trend to the discovery sample where it was labelled 1249G→T (Taipale et al. 2003). Variants in DYX1C1 had not previously been explored in the SLI family sample. No quantitative association was observed in the SLI families but in the case–control analyses, we did see an increase in the frequency of the major allele of rs57809907 in SLI probands (Table 3, P = 0.012). Once more, this is a sporadic result in an opposite direction to that originally reported and so should be treated cautiously (Taipale et al. 2003).

Language Loci

Analysis of CNTNAP2, ATP2C2 and CMIP in the SLI families investigated here has previously been reported (Vernes et al. 2008; Newbury et al. 2009) but these loci had not been studied for association in the dyslexic families. In the dyslexia cohort we only observed weak and sporadic associations to these loci. Thus, our investigations do not support pleiotropic effects for the investigated SLI loci in the dyslexic spectrum. It is perhaps surprising therefore that the reading measures collected in the SLI sample yielded strong and consistent association across multiple SNPs in CMIP and CNTNAP2 (Table 4, minimum P = 0.0002). In our previous study, we did investigate the role of CMIP and ATP2C2 across multiple measures (including reading) and did not see any evidence for association outside expressive and receptive language and nonword repetition (Newbury et al. 2009). However, that study was limited to a single SNP from each gene (rs4265801 for CMIP and rs16973771 for ATP2C2). Ironically, rs4265801 is the only CMIP SNP in the present study which did not show association to reading measures (Table 4). One may postulate that the overlaps of association between the reading and language measures at the CNTNAP2 and CMIP loci merely reflect phenotypic correlations between these traits. In the SLI families, these traits are moderately correlated with each other (Electronic Supplementary Table S1). Nonetheless, the associations to ATP2C2 in these families appear to be specific to language measures (ELS, RLS and NWR), indicating that the results observed for CNTNAP2 and CMIP are not caused directly by correlation effects. Thus, our findings support a novel role for variants in CNTNAP2 and CMIP upon reading ability in SLI individuals.
Since children with SLI were excluded from the dyslexia sample, the divergence of the results between cohorts may indicate that these genes represent modifier loci whose effects are strengthened in the presence of other variants which predispose individuals to language-impairment but not dyslexia and, as such, support the findings of Newbury et al. (2009). Alternatively, these data may just reflect sampling or other stochastic differences across the two cohorts studied here, or the use of differing phenotypic measures across samples. In particular, the use of alternative measures across samples forces the assumption that different tests of the same construct measure comparable underlying biological processes. This is a common problem in the replication of quantitative genetic investigations and a limitation of the current study. Such issues exemplify the complexity of the biological pathways underlying multifaceted phenotypes and demonstrate the danger of attempting to delineate a complex trait by mapping specific genetic effects to distinct behavioural components.

As mentioned in the results section, the proximity of the markers and the correlation between the phenotypes in replication studies can complicate the application of a suitable multiple testing correction method. It should therefore be noted that none of the results presented in this paper have been corrected for multiple testing and that such a correction would render much of our data non-significant. For example, a traditional Bonferroni correction at the 0.05 level would yield a significance threshold of P < 1 × 10⁻⁴. Even if we consider the fact that the 31 SNPs investigated in the present study could be tagged by 19 markers (r² > 0.6), the significance threshold would still be P < 2 × 10⁻⁴ and none of the novel data presented in this paper exceed this level of significance. Nonetheless, it should be noted that these correction methods rely upon both the number of tests presented in any one paper, rather than the total number of tests performed in any given sample over time, and the assumption that all phenotypes are completely independent. In the absence of an accurate multiple testing procedure for such a hypothesis-driven study, we choose to present uncorrected P-values and openly acknowledge that these may overestimate the significance. We suggest that when assessing the validity of any given data point, the reader should consider the consistency of that result across samples and phenotypes.
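
The Bonferroni arithmetic referred to above can be illustrated with a minimal sketch. The marker counts (31 SNPs, 19 tagging markers) are taken from the text; the number of phenotypes per marker is not stated in this section, so the value of 16 used below is purely a hypothetical assumption, chosen only because it gives thresholds of the same order as those quoted. The function name is likewise illustrative.

```python
def bonferroni_threshold(alpha, n_markers, n_phenotypes):
    """Per-test significance threshold under a Bonferroni correction.

    The total number of tests is taken to be markers x phenotypes,
    treating every test as independent (the assumption criticised in
    the text for correlated markers and correlated phenotypes).
    """
    n_tests = n_markers * n_phenotypes
    return alpha / n_tests


ALPHA = 0.05
# ASSUMPTION: hypothetical number of phenotypes tested per marker;
# this figure is not given in the section and is for illustration only.
N_PHENOTYPES = 16

# All 31 SNPs treated as independent tests.
threshold_all_snps = bonferroni_threshold(ALPHA, 31, N_PHENOTYPES)

# Only the 19 tagging markers (r^2 > 0.6) treated as independent tests.
threshold_tagged = bonferroni_threshold(ALPHA, 19, N_PHENOTYPES)

print(f"31 SNPs:    P < {threshold_all_snps:.1e}")
print(f"19 markers: P < {threshold_tagged:.1e}")
```

Reducing the marker count through tagging relaxes the threshold only modestly, which is consistent with the observation that neither correction would leave the novel results significant.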

Another common issue in the interpretation of association data is the presence of flip-flop associations, in which SNP-trait associations are replicated but follow an opposite trend in the replication cohort (Lin et al. 2007). In this investigation, we observed some such associations, many of which were sporadic results of nominal significance (e.g. MRPL19/C2ORF3) and thus are likely to represent Type I errors. In some cases, however, the reversed associations were found to be consistent across traits or SNPs (e.g. KIAA0319 rs761100 and rs6935076 and expressive language measures). It has been postulated that the inversion of association may be caused by interactive effects or variable patterns of linkage disequilibrium between causal variants and associated SNPs (Lin et al. 2007). These data therefore reinforce the point that even a robust genetic association is unlikely to precisely identify a causal genetic variant.

In conclusion, our investigations of dyslexia loci provide some marginal evidence for the existence of shared genetic effects across SLI and dyslexia, particularly for variants in KIAA0319. We found that the minor alleles of rs3212236 and rs2143340 in KIAA0319 were associated with reduced reading and language-related ability in samples of SLI individuals. This finding provides an independent replication for the suggested role of KIAA0319 in reading and language abilities. We were also able to replicate association with DCDC2 in our case–control analyses across two independent dyslexic samples. Our investigations of CNTNAP2 and CMIP yielded association with both reading- and language-related traits, but these were restricted to the SLI cohort, providing further support for the view that these loci are particularly relevant to this clinical group. Our data indicate that, within the SLI population, variants in CMIP and CNTNAP2 influence both reading- and language-related traits whilst those in ATP2C2 appear to be more specific to oral language skills.