Introduction

Autism spectrum disorders (ASDs) represent a group of neurodevelopmental disorders that involve deficits in three areas of functioning: reciprocal social interaction, communication, and stereotyped and restricted behaviors. When deficits are present in all three domains, with a deficit in at least one area detected before the age of 3 years, a classification of autistic disorder (AD) is warranted [1]. ASDs occur either sporadically or as familial cases, with an estimated prevalence of one in 150 children [2]. Males are affected four times more frequently than females [3]. Mental and developmental retardation has also been diagnosed in 15% to 70% of children with ASD [4]. Comparisons of monozygotic and dizygotic twins suggested a heritability as high as 90% for the narrow phenotype of AD [5–7]. Analyzing autism risk in multiplex families from the Autism Genetic Resource Exchange (AGRE), Zhao and co-workers [8] found strong evidence for dominant transmission to male offspring. They hypothesized that two types of families may exist: (1) low-risk families with sporadic autism being mainly caused by spontaneous mutations with high penetrance in males and relatively poor penetrance in females; and (2) high-risk families in which ASD probands receive a dominant mutation, most often from females, who carry this mutation but are themselves unaffected [8]. Using a Bayesian approach in a population-based sample, Nishiyama and co-workers showed that the largest proportion of ASD cases can be explained by a model allowing multiple, concomitantly inherited risk alleles [9].

ASD is increasingly emerging as a genetically heterogeneous disorder [7]. Searches for specific ASD risk genes by genetic linkage and association studies identified only a few such genes and contributed little to explaining the phenotypic variability among ASD patients. Replication of results proved to be difficult [4, 7, 10–12]. Genome-wide segmental aneuploidy profiling revealed submicroscopic structural genome alterations, named copy number variations (CNVs), that were either inherited or had emerged de novo in 7% to 27% of the ASD patients under investigation [10, 13–17]. In a recent study of the Autism Genetic Resource Exchange dataset, Itsara and co-workers found a de novo CNV rate on the order of 2% and concluded that de novo CNVs may contribute a significant risk for autism [18]. Given that CNVs frequently occur in the healthy population [19], discriminating between neutral variants and pathogenic events is becoming an increasingly challenging task. Therefore, it is pivotal to study the relationship between CNV genotype and clinical phenotype(s) both in ASD patients and in their parents and siblings [20]. Patients who, in addition to ASD, have co-morbidities (e.g., dysmorphisms, congenital anomalies) and a family history of intellectual impairment not only make up the majority of cases but are also more likely to carry CNVs [4, 16]. Based on studies of CNVs, a model of ASD etiology, assuming a single or several loci with additional effects, has been proposed [21]. In this model, a CNV by itself may be sufficient to cause ASD or may contribute additional effect(s) to other mutation(s) such that via an additive or a synergistic interaction the threshold for full-blown ASD will be crossed [21, 22].

In order to evaluate the clinical significance of inherited and de novo CNVs, we followed an unbiased approach by focusing on patients with co-morbidities in addition to ASD. These make up the majority of patients seen at our outpatient department for preschoolers. From the 210 ASD patients referred to our institution at preschool age, we selected a cohort of families in which the index patient carried a CNV containing at least one gene that is transcribed in the primary target organ of ASD (i.e., the brain). In those families in which such a CNV was found in the referred patient and which had multiple affected children (multiplex families), all patients, their parents, and all available siblings were subsequently genotyped by SNP arrays and were evaluated for behavioral phenotypes using the Social Responsiveness Scale (SRS) [23–27]. We assumed that SRS scores above the normal range in any family member may reflect sub-clinical behavioral impairments. If at least one of the parents scored above the normal range on the SRS, we assumed that at least one allele involved in ASD could conceivably have been transmitted to one or several children with ASD. Thus, following the Cook and Scherer model [21], a CNV was hypothesized to be a primary (i.e., sufficient) cause of the ASD phenotype of the proband when it was inherited from a parent with an SRS score above the normal range while the other parent scored within the normal range. If the CNV arose de novo in a family in which both parents scored within the normal range on the SRS, it was also assumed to be a primary cause of ASD. In all other cases, a CNV was hypothesized to make a secondary (i.e., possibly additional) contribution within a background of other putative susceptibility alleles. Subsequently, we used the same approach to determine the phenotypic impact of CNVs in families with a single affected child. Finally, we discuss these outcomes in terms of the inherited and sporadic forms of ASD [8, 9].

Patients and methods

Ethics statement

Written informed consent was obtained from all parents of the children included in the study. The Medical Ethics Review Board of the UMC Utrecht approved all procedures.

Psychiatric and clinical genetic evaluation and patient selection

The 210 children, who were consecutively referred with symptoms of ASD detected at or before the age of 4 years, were routinely evaluated by a specialized team of clinicians for both their psychiatric and clinical genetic phenotypes. At this age, a best estimate diagnosis of ASD was obtained by combining a clinical diagnosis of ASD (including DSM IV-TR criteria [1]) with an ASD classification based on a clinical evaluation including the Autism Diagnostic Observation Schedule-Generic (ADOS-G) [28], but not the Autism Diagnostic Interview-Revised (ADI-R) [29]. At the time of the initial evaluation, it had been reported that most prototypical autistic behavior is seen at the ages of 4–5 years and that the ADI-R may be less specific or sensitive at younger ages [29]. At this initial evaluation, a psychometric test, the Mullen Scales of Early Learning (MSEL) [30], was administered to the proband by a licensed psychologist. The MSEL was used to calculate an overall cognitive score (CS) (Table 1). If other cognitive tests were administered at a later age, these later cognitive data are displayed instead of the data from the initial evaluation (Table 1, column 3). From our initial cohort, 51 patients did not reach a best estimate diagnosis and were excluded from this study. In addition, nine children having a genetic disorder involving ASD, i.e., Angelman syndrome (OMIM:105830), neurofibromatosis 1 (OMIM:162200), tuberous sclerosis (OMIM:191100), Rett syndrome (OMIM:312750), Smith Lemli Opitz syndrome (OMIM:270400), 22q11.2 deletion syndrome (VCFS) (OMIM:192430), and fragile X mental retardation syndrome (OMIM:300624), as well as children with cytogenetic abnormalities as ascertained by routine karyotyping and molecular genetic diagnosis (see below), were also excluded at this step.

Table 1 Clinical phenotypes and SRS scores

Subsequently, probands were investigated according to standard medical genetic procedures and scored for a combination of clinical characteristics. These were family history of ASD and/or intellectual disability (one point), intrauterine growth retardation (two points), postnatal growth disorder (two points), facial dysmorphic features (two points), minor malformations and congenital anomalies (two points maximum), and neurological disorder (one point). The cut-off was set at three points in at least two domains. On the basis of this score, 50 patients were selected out of the initial cohort of preschoolers with ASD and were genotyped with SNP arrays (see below). After detection and evaluation of CNVs as described below, we retained 13 probands, five from multiplex and eight from simplex families, who carried a CNV with at least one gene transcribed in the brain. These patients were subsequently evaluated for AD with a standardized interview, the ADI-R [29]. The ADI-R provides a cut-off point; scores at or above this cut-off point indicate an ADI-R classification of AD, and a lower score indicates no classification. If the scoring on the four algorithm items of the ADI-R reached one point below cut-off, a classification of ASD was applied. At this stage, one proband did not reach a best estimate diagnosis of ASD, and three families (two multiplex and one simplex) withdrew from this study, such that nine families (three multiplex and six simplex) participated in the final phase.
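
The selection score described above can be illustrated with a minimal sketch. The point values are those listed in the text; the field names, and the reading of the cut-off as "a total of at least three points, with points scored in at least two domains", are assumptions made for illustration and not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class ClinicalFindings:
    """Hypothetical record of one proband's findings (field names are assumed)."""
    family_history: bool = False                   # ASD and/or intellectual disability: 1 point
    intrauterine_growth_retardation: bool = False  # 2 points
    postnatal_growth_disorder: bool = False        # 2 points
    facial_dysmorphism: bool = False               # 2 points
    malformation_points: int = 0                   # minor malformations/congenital anomalies: 0-2
    neurological_disorder: bool = False            # 1 point


def domain_points(f: ClinicalFindings) -> dict:
    """Return the points scored in each domain, using the values given in the text."""
    return {
        "family_history": 1 if f.family_history else 0,
        "intrauterine_growth": 2 if f.intrauterine_growth_retardation else 0,
        "postnatal_growth": 2 if f.postnatal_growth_disorder else 0,
        "facial_dysmorphism": 2 if f.facial_dysmorphism else 0,
        # Capped at two points ("two points maximum").
        "malformations": min(max(f.malformation_points, 0), 2),
        "neurological": 1 if f.neurological_disorder else 0,
    }


def meets_cutoff(f: ClinicalFindings, min_points: int = 3, min_domains: int = 2) -> bool:
    """Assumed reading of the cut-off: >= 3 points in total, from >= 2 domains."""
    points = domain_points(f)
    total = sum(points.values())
    domains_with_points = sum(1 for p in points.values() if p > 0)
    return total >= min_points and domains_with_points >= min_domains
```

Under this reading, a proband with a positive family history and facial dysmorphic features reaches three points in two domains and would be selected, whereas a proband with only a two-point malformation score would not. Since no single domain contributes more than two points, reaching three points already implies at least two domains.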

In these nine selected families, both parents, the proband, and siblings were evaluated with the SRS: the probands and siblings by parent report and the parents by spouse report. The SRS results for parents were evaluated using the “parent rating raw score means in the general population”: females, 27.6 (SD 18.1); males, 33.7 (SD 20.9) [24]. Scores of 2 SD or more above the mean in parents suggest interference with everyday social interactions and suggest that the parent is affected with features of a broader phenotype. Scores in parents between 1 SD and 2 SD above the mean indicate deficiencies in reciprocal social interaction and suggest that the parent is partly affected. For children between 4 and 18 years, the T-score was used [24]. Norms and validity of the SRS are based on an American population. Some parents and siblings of our MA families have been formally diagnosed with ASD according to the criteria of the DSM-IV-TR, and when available these diagnoses are presented. All phenotypic data of this final subset of probands and their parents are summarized in Table 1.
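
As an illustration of how the parental SRS thresholds translate into categories, the following sketch bins a raw score using the general-population means and standard deviations quoted above [24]. The function name, the 0/1/2 coding, and the treatment of a score lying exactly 1 SD above the mean as category 1 are assumptions made for this example.

```python
# General-population parent-rating raw score (mean, SD) as quoted in the text [24].
SRS_NORMS = {"female": (27.6, 18.1), "male": (33.7, 20.9)}


def srs_category(raw_score: float, sex: str) -> int:
    """Bin a parental SRS raw score by its distance from the sex-specific mean.

    0 = less than 1 SD above the mean (normal range)
    1 = at least 1 SD but less than 2 SD above the mean (partly affected)
    2 = 2 SD or more above the mean (features of a broader phenotype)
    The handling of scores exactly on a boundary is an assumption.
    """
    mean, sd = SRS_NORMS[sex]
    z = (raw_score - mean) / sd
    if z >= 2:
        return 2
    if z >= 1:
        return 1
    return 0
```

With these norms, the 1 SD and 2 SD thresholds lie at roughly 45.7 and 63.8 for females and at roughly 54.6 and 75.5 for males, so a raw score of 50 falls in category 1 for a mother but in category 0 for a father.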

One limitation of our study concerns the use of the SRS in an adult research population: individuals presumed to be substantially affected on the basis of elevated quantitative trait (SRS) scores require confirmatory clinical evaluation in large samples [24, 27]. Because autistic symptoms are continuously distributed in the general population [23], the boundary between affected and unaffected is uncertain [25]. Currently, normative data for adults are not available. Further studies in adults are needed to confirm the specificity of the SRS for measuring autistic social impairment and its relationship to diagnoses of ASD.

For children and adolescents (age 4–18 years), normative SRS data including over 2,500 subjects are available [23–27]; normative data on a Dutch and Flemish population are pending. In clinical studies, the SRS has proven specific for autistic social impairment: scores are strongly associated with diagnoses of ASD, but not with other child psychiatric disorders. The SRS has been extensively validated in both clinically ascertained and population-based samples of subjects [23–27].

Karyotyping and molecular genetic analyses

We ascertained all patients’ karyotypes at the 700 band level in cultured peripheral blood lymphocytes according to standard procedures. To confirm segmental aneuploidies detected by the SNP array (see below), BAC-based array CGH or fluorescence in situ hybridization (FISH) with region-specific probes was performed [22, 31].

Illumina Infinium HumanHap300 Genotyping BeadChip SNP array

Infinium HumanHap300 Genotyping BeadChip SNP array analyses were performed according to the protocol of the manufacturer (Illumina Inc., San Diego, CA, USA). In a first-pass analysis, CNVs were detected using the Beadstudio V2.3.41 software package (Illumina Inc.) as described before (see Supplementary data file S2 in [32]). Since CNV detection algorithms such as the package used here detect only part of the CNVs present in the data [33], all SNP profiles were curated by visual inspection. In the next step, all CNVs previously detected in the healthy population were excluded (using the Database of Genomic Variants; http://projects.tcag.ca/variation). Finally, of the remaining CNVs, only those containing at least one gene transcribed in the clinically relevant target organ [34], i.e., the brain, according to the atlases of the Allen Institute for Brain Research (http://www.brain-map.org) and the SESTAN lab (www.humanbrainatlas.org) [35], were retained.
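
The two filtering steps that follow CNV calling and visual curation can be sketched as below. This is only an illustration under stated assumptions: the `CNV` record, the function names, and the rule that any positional overlap with a catalogued variant counts as "previously detected in the healthy population" are hypothetical, since the text does not specify the overlap criterion. The catalogue of known variants and the set of brain-transcribed genes are assumed to have been prepared beforehand from the resources named above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CNV:
    """Hypothetical CNV record; coordinates are base-pair positions."""
    chrom: str
    start: int
    end: int
    kind: str                      # "gain" or "loss"
    genes: frozenset = frozenset() # gene symbols located in the CNV


def overlaps(a: CNV, b: CNV) -> bool:
    """True if the two intervals share at least one position on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def filter_cnvs(candidates, known_variants, brain_genes):
    """Keep CNVs not catalogued in healthy controls that contain a brain-transcribed gene.

    Assumption: any overlap with a catalogued variant excludes the CNV.
    """
    retained = []
    for cnv in candidates:
        if any(overlaps(cnv, known) for known in known_variants):
            continue  # previously seen in the healthy population
        if cnv.genes & brain_genes:
            retained.append(cnv)  # at least one brain-transcribed gene
    return retained
```

A stricter or looser overlap rule (for example, a minimum reciprocal overlap) would change which CNVs are excluded, so the criterion should be stated explicitly when such a filter is applied.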

Results

We performed genome-wide CNV profiling of a cohort of 50 probands, 21 from multiplex and 29 from simplex ASD families (see above). After re-evaluation of the patients’ phenotypes, nine patients (three from multiplex and six from simplex families) with a total of two genomic gains and eight losses were retained (Table 2). Whereas both gains had arisen de novo, only five out of eight losses (in four patients) were de novo. Pedigrees of the three multiplex families with potentially clinically significant CNVs are displayed in Fig. 1. Only one of the CNVs found in our multiplex ASD families (each indicated by a green rim for a gain or a red rim for a loss) appeared to co-segregate with ASD symptomatology as reflected by the SRS scores of parents and siblings, indicated as numbers below the symbols (family M2 in Fig. 1 and Tables 1 and 2). For the other multiplex families, it is conceivable that a mutant allele has been transmitted from a parent with an SRS score above the normal range. Therefore, the CNVs may at most exert an additional contribution to the behavioral phenotype of the patients in these multiplex ASD families. That means they should be considered “secondary”. Application of the same SRS-based assessment of segregation of ASD symptomatology to our simplex families (Fig. 1) shows that only the gain in 3p26 (containing CNTN6 in family S1) and the losses in 3p14.1 and in 7p22.1 (containing ATXN7, and KDELR2 and ZNF12, respectively, in family S2) may by themselves be sufficient to cause ASD. The de novo loss in 12q15 (containing KCNMB4 in family S3), the de novo gain in 2p16.1 (containing GIRDIN in family S4), and the de novo losses in 7q31.1 and in 8p23.1 (containing the 5′ part of IMMP2L and the 5′ part of MCPH1 in families S5 and S6, respectively) may at most have exerted an additional effect since transmission of a dominant mutant allele from a parent with ASD symptomatology (based on SRS scores above the normal range) can be suspected in each of these families.

Table 2 All CNVs with brain-transcribed genes in index patients and family members
Fig. 1

Pedigrees of multiplex and simplex families in which at least one ASD patient carried CNVs with at least one brain-transcribed gene. M1 through M3 represent families with multiple affected children (multiplex); S1 through S6 represent families with a single affected child (simplex). Filled dark symbols indicate a diagnosis of autistic disorder; gray symbols, autism spectrum disorder; clear symbols, subjects not diagnosed with ASD. Numbers under symbols indicate the following SRS scores: 0 = within the normal range, 1 = 1 SD above the mean, and 2 = 2 SD above the mean (for further explanation, see “Patients and methods” and “Results” sections). Colored rims indicate CNVs: a green rim represents a gain; a red rim, a hemizygous loss (see also Table 1). The gene names next to the family numbers indicate the brain-transcribed genes in the CNVs (same color code as above). The asterisks denote the index patient of each family.

In case of a de novo loss, the SNPs from the parental genome in which the loss occurred will be lost, and consequently, those from the other parent will be retained. In one of our cases, the maternal allele was retained while in another case the paternal allele was found, which is different from a report on a disruption of CNTNAP2 in a boy with speech delay and autism spectrum disorder [22]. These findings need further investigation in future studies of larger cohorts.

Discussion

Copy number variations (CNVs) are the most frequently detected type of structural genome alterations in ASD patients [14–17]. In our cohort of multiplex and simplex families, we found CNVs containing genes that are part of the phosphoinositol signaling pathway (PIK3CA, GIRDIN) [17, 36] and of the contactin-based networks of cell communication (CNTN5, CNTN6) [36–40], as well as CNVs encompassing genes such as IMMP2L, MCPH1, and HOXA. Thus, the CNVs containing at least one brain-transcribed gene identified candidate genes that are in agreement with published proposals regarding a pathogenic contribution to ASD of the phosphoinositol pathway [17, 36], of microcephalin [38, 39], and of the contactin-based networks of cell communication [40]. The pathways identified in this study contrast with those found in two studies of patients with mental retardation and multiple congenital anomalies [41, 42]. Apparently, our approach of evaluating both de novo and inherited CNVs in ASD patients allowed us to identify biological pathways that are distinct from those involved in mental retardation and congenital anomalies and are likely to be related to a specific subset of patients with ASD [32].

Yet the contribution of these CNVs to the ASD phenotype of an individual patient is not a priori clear [20]. It is frequently assumed that only de novo CNVs occurring concomitantly with ASD in sporadic patients bear a causal relationship with the ASD phenotype [14–17]. Since a significant proportion of ASD patients may have inherited ASD risk, in particular from their healthy mothers [8, 9], such inherited, yet clinically significant, mutations and CNVs may thus inadvertently be excluded. Therefore, it is pivotal to ascertain whether parents, although not diagnosed with outright ASD, may still show some aspects of ASD symptomatology. For ASD, an increased rate of less severe, but similar impairments, termed the broader autism phenotype, is found in 12.4% of the siblings and in 10–45% of parents of children with ASD [5–7]. Therefore, it is conceivable that in such families the ASD of the patient may have resulted from interaction of a de novo or inherited CNV with an inherited allele from an impaired parent.

To evaluate the potential phenotypic contribution of the thus identified CNVs, we determined whether a CNV co-segregated with ASD symptomatology in the family as reflected by the SRS score of all available family members in multiplex ASD families. We reasoned that a CNV may by itself be sufficient to cause the ASD phenotype if it was either inherited from a parent carrying the same CNV and showing an SRS score higher than 1 SD above the mean or if it arose de novo in a single affected child concomitantly with SRS scores within the normal range (i.e., below 1 SD above the mean) in both parents. Such CNVs, which may by themselves be “sufficient” to cause ASD symptomatology, fit into the leftmost column of the Cook and Scherer model [21]. In contrast, in families with multiple affected children, a CNV that arose de novo or did not co-segregate with SRS-ascertained ASD symptomatology may interact with a putative inherited allele. In these cases, the CNV may exert an additional or epistatic effect (as represented by columns 2–4 in the Cook and Scherer model) [21, 22].
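
The reasoning above can be written down as a small decision rule. The sketch below is one possible formalization, not the authors' procedure: the function name, the argument names, and the coding of parental SRS results as 0 (normal range), 1 (between 1 and 2 SD above the mean), or 2 (2 SD or more above the mean) are assumptions. The additional requirement that the non-transmitting parent scores within the normal range is taken from the rule as stated in the Introduction.

```python
def classify_cnv(origin: str, father_srs: int, mother_srs: int, multiplex: bool) -> str:
    """Label a CNV as a 'primary' (possibly sufficient) or 'secondary' contributor.

    origin      -- 'de_novo', 'paternal', or 'maternal'
    father_srs  -- SRS category of the father (0, 1, or 2; assumed coding)
    mother_srs  -- SRS category of the mother (0, 1, or 2; assumed coding)
    multiplex   -- True if the family has multiple affected children
    """
    if origin == "de_novo":
        # Sufficient only in a single affected child with both parents in the normal range.
        primary = (not multiplex) and father_srs == 0 and mother_srs == 0
    elif origin in ("paternal", "maternal"):
        if origin == "paternal":
            transmitting, other = father_srs, mother_srs
        else:
            transmitting, other = mother_srs, father_srs
        # Sufficient if the transmitting parent scores above the normal range
        # while the other parent scores within it.
        primary = transmitting >= 1 and other == 0
    else:
        raise ValueError(f"unknown origin: {origin!r}")
    return "primary" if primary else "secondary"
```

For example, `classify_cnv("de_novo", father_srs=0, mother_srs=1, multiplex=False)` yields `"secondary"`, which corresponds to the situation described for a de novo CNV in a simplex family whose mother shows an elevated SRS score.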

According to our SRS-aided segregation analysis in multiplex families, hemizygosity for PIK3CA and KCNMB3 (in family M1) or CHL1 and part of CNTN6 (in family M3) may by itself not be sufficient to cause ASD (Fig. 1, Tables 1 and 2). Rather, copy number variation of these genes may act in concert with an as yet unidentified inherited mutation located elsewhere in the genome. In simplex families, however, a gain of part of CNTN6 (in family S1) or hemizygosity for ATXN7 together with KDELR2 or ZNF12 (in family S2) may be sufficient to cause ASD in the affected proband. On the other hand, the de novo loss of KCNMB4 together with 18 other genes (in family S3) or a gain encompassing GIRDIN (in family S4) may not be sufficient to cause ASD in the patient since in both families the mother showed an elevated SRS score (Fig. 1, Tables 1 and 2). Interestingly, a loss of CHL1 and part of CNTN6 (in family M3) may exert a weaker phenotypic effect than the gain of part of the same gene (in family S1). This is similar to reported findings on the MCPH1 gene and may indicate a dominant negative effect of a gain of part of CNTN6 or MCPH1 [40]. This is analogous to the effects of causal and modifying alleles of TTC21B in ciliopathies [43].

In only one of the multiplex families studied was a CNV found that co-segregated with ASD symptomatology. In contrast, CNVs were found in the other two multiplex families in which, on the basis of the SRS scores of the parents, the existence of an inherited and transmitted mutation located elsewhere in the genome can be presumed (Fig. 1a). Similarly, a de novo gain of CNTN6 (in family S1) or a paternally transmitted loss of CNTN5 (in family M2) may by itself be sufficient to cause ASD, while a loss of CHL1 and part of CNTN6 (in family M3) may have to act in concert with a transmitted, presumably mutated, gene elsewhere in the genome. These findings emphasize that the phenotypic impact of CNVs of previously published candidate genes for ASD needs to be re-assessed [14–16, 32, 36]. Our data also add to the mounting evidence that two or even more loci may, by additive or epistatic interactions, be involved in the causation of ASD [8, 9, 16, 22].

Systematic evaluation of CNVs, taking into account data on gene content and brain transcription, their mode of inheritance, and the outcome of the SRS in probands and parents, allowed us to attribute a sufficient or an additional impact on the ASD phenotype of the proband to each CNV. Thus, our approach extends the scope of genome-wide CNV profiling beyond de novo CNVs in sporadic patients. This also may constitute a first step toward uncovering the missing heritability in genome-wide screening studies of complex disorders [44]. Considering the relatively small size of our sample and that it refers to a specific subset of patients with ASD, our study does not yet allow for exhaustive conclusions [21, 36]. Yet, our study does suggest that, in multiplex ASD families in particular, not yet discovered, dominantly acting mutations or non-genetic factors may be involved. It is conceivable that in such families several loci may, by additive or epistatic interactions, provoke the ASD phenotype in some, but not all, probands. Multiplex families, such as families M1 and M3 in this study, may represent excellent targets for the novel, genome-wide next generation sequencing approaches [45]. Future replication of our systematic, family-based approach to CNV evaluation, complemented by genome-wide re-sequencing efforts and followed by gene prioritization, may enhance our insights into the impact of rare genetic variants on the etiology of ASD and other complex psychiatric disorders with a high heritability, such as schizophrenia and idiopathic mental retardation [46, 47].