Main

Neurodevelopmental disorders (NDDs) are multifaceted conditions characterized by impairments in cognition, communication, behavior and/or motor skills resulting from abnormal brain development. Intellectual disability, communication disorders, autism spectrum disorder (ASD), attention deficit/hyperactivity disorder (ADHD) and schizophrenia fall under the umbrella of NDDs.1, 2, 3 Currently, there are no biomarkers to diagnose NDDs or to differentiate between them. Rather, these disorders are categorized into discrete disease entities based on clinical presentation.1 This is problematic, as many symptoms are not unique to a single NDD, and several NDDs have clusters of symptoms in common. For example, impaired social cognition is common to ASD and schizophrenia,4, 5, 6, 7 and psychosis is observed not only in schizophrenia but also in those with bipolar disorder or major depressive disorder.8, 9 Thus, such overlap of clinical symptoms presents a challenge for nosology and course of treatment. This is in stark contrast to other disorders, such as cardiovascular diseases, where diagnosis is rooted in biological manifestations, biomarkers and pathophysiology. The diffuse clinical boundaries among NDDs call into question the appropriateness of current disease definitions.10, 11, 12, 13 Here, we advocate reformulating current nosological categories with novel disorder definitions rooted in the biology of processes that are awry in NDDs. We predict that biological disorder definitions will change the way we use symptomology for diagnosis.

Neurodevelopmental disorders: boundary definitions from genomes

The hypothesis that NDDs are distinct nosological entities predicts that genetic factors associated with risk for or causation of a given disorder should segregate with diagnostic categories; thus, in classical terms, there should be little or no overlap among the genetic factors implicated in each NDD. That is, the genes that operate in one disorder should not be involved in another. However, genetic epidemiology reveals substantive overlap between genes conferring risk for or causing NDDs.

Genetic defects associated with risk or causation of NDDs range from large chromosomal deletions to single-nucleotide polymorphisms (SNPs). Notably, among major genomic defects, a number of chromosomal deletions are associated with intellectual disability, ASD and schizophrenia.12, 14, 15, 16 Among the most frequent are 1q21.1, 16p11.2 and 22q11.2.12, 17 Given the large number of genes affected by these deletions, it should cause little surprise that they give rise to disorders with overlapping phenotypes. However, smaller genetic modifications, specifically SNPs in non-coding regions, are also shared among diverse NDDs.18 Genetic overlap among NDDs extends to monogenic defects that affect the coding sequence and expression of a single polypeptide encoded by the gene (for example, SHANK3, NRXN1, DISC1, FMR1, MECP2, GPHN). Patients carrying these mutations are diagnosed with intellectual disability, ASD, schizophrenia or combinations thereof.19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33 Monogenic defects also affect subunits of obligate and stable protein complexes. For example, the adaptor complex AP-3 is an obligate heterotetramer that generates vesicles from early endosomes bound for lysosomes or synapses;34, 35 human mutations in a neuronal-specific AP-3 subunit (AP3B2) associate with ASD.36, 37 Thus, irrespective of the size of a genetic defect, there is a continuously expanding list of affected genes that do not respect categorical diagnostic boundaries among NDDs.

Neurodevelopmental disorders: boundary definitions from interactomes

Protein interaction networks (interactomes) to which NDD genes belong also overlap. Interactomes built from genes associated with intellectual disability, ASD, ADHD and schizophrenia converge on common molecular pathways.38 Genes associated with these NDDs intersect on one out of 700 genes catalogued as risk factors. However, the list of common proteins shared by these NDDs increases to 147 out of the 700 genes simply by expanding the gene catalog to include predicted first-degree interacting neighbors obtained from protein–protein interaction databases.38 These computational studies support the concept that the interactomes associated with NDDs overlap. However, the power of these types of studies is limited by the present quality of protein interaction databases, which are incomplete, are only moderately curated to accommodate newly published findings and are often populated by results not confirmed by alternative biochemical, genetic and/or functional approaches.39, 40, 41 Furthermore, protein interaction databases are biased by the experimental approach used in their generation; for example, most protein interaction databases poorly represent membrane proteins that are not amenable to exploration by traditional yeast two hybrid or pull-downs with recombinant proteins.42
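The effect of expanding a gene catalog to first-degree interacting neighbors can be illustrated with a minimal sketch. The gene names, disorder labels and interaction edges below are hypothetical placeholders, not data from the cited study, and the functions are illustrative rather than the method used in reference 38. The sketch only shows why the set of genes shared across disorders can grow once neighbors from a protein–protein interaction list are included.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Turn an undirected edge list into a gene -> set-of-neighbors map."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def expand_with_neighbors(genes, adjacency):
    """Return the genes plus all of their first-degree interactors."""
    expanded = set(genes)
    for gene in genes:
        expanded |= adjacency.get(gene, set())
    return expanded


def shared_across(gene_sets):
    """Genes present in every disorder-specific set."""
    return set.intersection(*gene_sets.values())


# Hypothetical risk-gene catalogs for three disorders (placeholders only).
risk_genes = {
    "disorder_1": {"G1", "G2", "G3"},
    "disorder_2": {"G3", "G4"},
    "disorder_3": {"G3", "G5"},
}

# Hypothetical protein-protein interactions; N1 and N2 are not risk genes.
ppi_edges = [("G1", "N1"), ("G4", "N1"), ("G5", "N1"),
             ("G2", "N2"), ("G4", "N2")]

adjacency = build_adjacency(ppi_edges)

# Overlap using the risk genes alone.
direct_overlap = shared_across(risk_genes)

# Overlap after adding first-degree neighbors to each catalog.
expanded_sets = {name: expand_with_neighbors(genes, adjacency)
                 for name, genes in risk_genes.items()}
expanded_overlap = shared_across(expanded_sets)

print(sorted(direct_overlap))    # genes shared before expansion
print(sorted(expanded_overlap))  # genes shared after expansion
```

In this toy case the hypothetical neighbor N1 becomes shared by all three disorders only after expansion. The result is only as reliable as the edge list supplied, which is the limitation discussed above.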

The interactome of the schizophrenia susceptibility gene DTNBP1 illustrates several of these problems well (Figure 1). DTNBP1 encodes dysbindin, a subunit of the BLOC-1 complex.43, 44, 45, 46, 47, 48, 49, 50, 51 This complex participates in membrane protein trafficking between endosomes and lysosomes, and between endosomes located in neuronal cell bodies and the synapse.35, 50, 51 Published in silico dysbindin interactomes52, 53 differ from biochemically and genetically tested protein interaction networks.37 However, discrepancies among interactomes expand beyond those published (Figures 1a and d). Three protein interaction databases report associations that differ from each other in interactor identities. Furthermore, feeding those associations into a rigorous algorithm for ‘de novo’ generation of interactomes reveals different network topologies (Figures 1a and d).54 Only one of these four dysbindin interactomes links dysbindin with the adaptor complex AP-3, despite multiple lines of biochemical, cell biological and genetic evidence that these complexes interact in vivo and in vitro (Figure 1a).55, 56, 57, 58, 59, 60, 61, 62 This deficiency in existing databases has immediate ramifications, as mutations in AP3B2 associated with ASD cannot be linked to schizophrenia through the BLOC-1 subunit dysbindin.36, 37 AP3B2 is not an isolated instance. Indeed, only the experimentally defined dysbindin interactome identifies SNAP29 and CLTCL1.37 SNAP29 has been identified as a de novo risk factor for schizophrenia, while both SNAP29 and CLTCL1 map to the chromosome interval affected in velocardiofacial (chromosome 22q11.2 deletion) syndrome.17, 63 This syndrome closely associates with schizophrenia, ASD and intellectual disability.17 Gaps in the content and quality of protein interaction databases are important, as these repositories are the foundation for molecular connectivity between genetic defects associated with a given disorder. These deficiencies are missed opportunities for establishing molecular mechanisms of disease and finding mechanistic commonalities among NDDs. Thus, we argue in favor of generating interactomes confirmed by biochemical, genetic and/or functional strategies. Epidemiological genomics offers the field a good selection of solid candidate genes with which to begin this quest.
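One simple way to quantify how much interactor lists from different sources disagree is a pairwise Jaccard index, together with a list of experimentally supported interactors that each database lacks. The sketch below assumes hypothetical source names and placeholder protein identifiers. It does not reproduce the actual content of Biogrid, Genemania or String, nor the Dapple analysis shown in Figure 1.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets: shared members divided by all members."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical interactor lists for one bait protein (placeholders only).
interactors = {
    "experimental": {"P1", "P2", "P3", "P4"},
    "database_1": {"P1", "P5"},
    "database_2": {"P2", "P5", "P6"},
    "database_3": {"P1", "P2", "P7"},
}

# Pairwise agreement between every two sources.
pairwise = {
    (x, y): jaccard(interactors[x], interactors[y])
    for x, y in combinations(sorted(interactors), 2)
}

# Experimentally supported interactors absent from each database.
missing = {
    name: interactors["experimental"] - found
    for name, found in interactors.items()
    if name != "experimental"
}

for pair, score in pairwise.items():
    print(pair, round(score, 2))
for name, absent in missing.items():
    print(name, "lacks", sorted(absent))
```

A low Jaccard index between sources flags the kind of discrepancy described above. The `missing` sets point to confirmed interactions that an analysis built on a single database would not see.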

Figure 1

DTNBP1–dysbindin interactomes differ in their constituents and topology. Interactomes were assembled with the Dapple algorithm (http://www.broadinstitute.org/mpg/dapple/dapple.php)54 using as inputs the dysbindin-associated proteins identified by affinity chromatography (a), and interactors reported in three protein–protein interaction databases: (b) Biogrid (http://thebiogrid.org/), (c) Genemania (http://www.genemania.org/) and (d) String 9.05 (http://string.embl.de/). Red boxes highlight DTNBP1. Note that the identity of interacting proteins differs among interactomes. The color code represents the Dapple-estimated probability that a protein would be as connected to other proteins (directly or indirectly) by chance as is depicted. Only interactome (a) presents a biochemically and genetically confirmed interaction between the adaptor complex AP-3 and the dysbindin-containing BLOC-1 complex.

Neurodevelopmental disorders: ‘guilty by association’ mechanisms of disease and their inclusion in interactomes

Loss of one protein function due to a genetic mutation can alter levels or activity of other proteins that interact either directly or indirectly with the mutant protein. These ‘guilty by association’ proteins can be the actual culprits of disease phenotypes. The concept is illustrated readily by Marfan syndrome, a connective tissue disorder in which morbidity and mortality are chiefly associated with aortic aneurisms.64, 65 This disease is caused by mutations in the extracellular matrix protein fibrillin-1 (FBN1), which organizes into 10 nm fibers.64, 65 An old pathogenic hypothesis considered that loss of fibrillin fibers decreased blood vessel resilience to mechanical stress.64, 65 However, there is a new conceptualization of this syndrome that pinpoints abnormal TGFβ signal transduction as the main culprit in its vascular pathology. This unexpected shift can be understood from the fact that fibrillin binds and presents TGFβ to its receptor at the optimal concentration, time and location.64, 65, 66 Thus, TGFβ is a ‘guilty by association with fibrillin’ protein.

Subunits of protein complexes are particularly susceptible to being ‘guilty by association’ proteins. Genetic defects, or even non-pathogenic allelic variation affecting a single subunit of a protein complex, frequently lead to downregulation and/or covariation of other complex subunits.67, 68, 69, 70, 71, 72 DTNBP1 null mutations abrogating dysbindin expression downregulate most subunits of the BLOC-1 complex, despite the monogenic character of the mutation.44, 50, 69 Reciprocally, genetic defects in other BLOC-1 subunits decrease dysbindin cellular content.44, 50 ‘Guilty by association’ proteins in the dysbindin interactome extend beyond intrinsic components of the BLOC-1 complex. These proteins include membrane protein cargoes such as VAMP7 (VAMP7), a synaptic vesicle fusogenic membrane protein (SNARE) implicated in spontaneous synaptic vesicle fusion, and the Menkes disease copper transporter (ATP7A), as well as the adaptor complex AP-3, RhoGEF1 (ARHGEF1) and BDNF (BDNF), a neurotrophin with a long history of association with several NDDs.57, 59, 73, 74, 75, 76 None of these proteins, whose levels are affected by mutations in DTNBP1 or other BLOC-1 complex subunits, can be identified in current protein interaction databases that focus on physical protein–protein interactions. This problem prevents their inclusion in any analysis seeking to connect genetic defects found in genome-wide association studies to relevant molecular pathology.

Creating a nosology from genome informed proteomes-interactomes

Genome-wide association studies (GWAS) search the genomes of clinically defined patient populations for genetic markers that reach a threshold of statistical significance to associate with disease risk (Figure 2a). This approach encounters the problem that these disorders are polygenic and that categorical NDD definitions are not linked to biological markers or molecular phenotypes.77, 78 Thus, it is likely that genetically heterogeneous patient cohorts in these studies gather multiple molecular mechanisms of disease. However, these studies offer powerful insight when a particular genetic marker reaches statistical significance, despite the ‘noise’ introduced by the polygenic character of these disorders and the problems intrinsic to categorical NDD definitions. Genetic defects associated with one or multiple NDDs should be seen as the tip of the iceberg from which to unravel biological mechanisms of disease. Interactomes of gene products consistently implicated in NDDs (‘tip of the iceberg genes’) are a fertile ground to search for disease mechanisms.54, 79 This prediction stems from the hypothesis that genomes of patients affected by polygenic NDDs should concentrate alleles that affect the expression or function of genes whose products belong to or modulate a relevant pathway.80 We illustrate this concept in Figure 2b, where gene-α has reached statistical significance in a population GWAS. The product encoded by gene-α is a bait to ‘fish out’ the red protein interaction network (Red interactome B1, Figure 2b). The biochemical definition of interactome 1 would occur irrespective of whether interactome 1 contains products encoded by genes carrying defects that do not cross a population statistical threshold (Figure 2b).

Figure 2

Models of cross-fertilization between genomes, proteomes and interactomes. The grid in diagrams (a) to (e) depicts a polygenic genetic landscape associated with an NDD. Circles represent defined genes within the grid that, when affected in different combinations, trigger an NDD. Bars above each gene indicate a subject in whom a gene defect was found in a GWAS. Blue bars are those subjects that have a defect in a gene below statistical threshold, which is marked by the asterisk in (a). Red bars above a gene represent subjects that have a defect in a gene above statistical threshold. (b) Depicts a ‘tip of the iceberg gene α’ and the network to which it belongs, represented by the connected red circles (interactome 1). (c) Depicts three ‘tip of the iceberg genes’ and the network to which they belong (interactome 1). The yellow interactome 2 is constituted by genes below statistical threshold as defined by gene-centric GWAS statistical analysis. (d) Represents genetic defects (blue bars) in two interactomes per patient (subjects 1–3). Note that in all patients there are no gene defects in the red interactome. (e) Depicts hypothetical results of an interactome-centric GWAS that includes subjects 1–3 in (d). The yellow interactome 2 is now above statistical threshold as defined by an interactome-centric GWAS statistical analysis. See text for details.

This genome to proteome ‘reverse’ approach is not foreign to current genomic studies, in which bioinformatics of protein–protein interaction databases are used to find connections between gene defects that associate with a disorder at a GWAS level36, 79 (Figure 2c). However, mapping GWAS results back to an interactome requires the availability of several network genes that cross a statistical threshold (Red interactome 1, Figure 2c) as well as pre-existing and reliable protein interaction databases. Genes below statistical threshold in the red network C1 would not contribute to the identification of the C1 interactome (Red interactome C1, Figure 2c). Moreover, current criteria to allocate GWAS results to an interactome would miss the yellow interactome C2 where genes encoding interactome products are all below statistical threshold (Figure 2c).

How can we obtain mechanistic insight from studying ‘omes’? We propose two non-exclusive approaches to define the biology of NDDs using protein–protein interaction networks and genomics. The first approach is through the definition of ‘tip of the iceberg gene’ protein networks, such as those depicted by the red interactomes in Figures 2b and c. In the second approach, reliable protein interactomes are used as a query matrix to explore patients’ genomes for genetic defects or variants targeting interactome-encoding loci. Different patients may carry defects in one or more genes encoding products belonging to an interactome. No single gene defect reaches statistical significance in a ‘gene-centric’ GWAS (Subjects 1–3, Figure 2d). However, collective analysis of the genomes in a cohort of patients (Subjects 1–3, Figure 2d) shows significant enrichment of genetic defects clustered on a common pathway (compare red and yellow interactomes, Figure 2e). The association of a biological mechanism, defined by an already known and reliable interactome, with the genome of affected individuals would emerge even though no gene in isolation would have risen above statistical threshold. In this case, statistical significance is assigned to a collection of genes defining an interaction network rather than a single gene (Figure 2e).
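The contrast between a gene-centric and an interactome-centric test can be made concrete with a small numerical sketch. The text does not specify a statistical procedure. The version below assumes a case–control design and a one-sided Fisher exact test on carrier counts, which is only one of several possible choices. The subjects, gene names and the 0.05 cut-off are hypothetical, and real genome-wide thresholds are far stricter.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    a, b: case carriers / case non-carriers
    c, d: control carriers / control non-carriers
    Returns the hypergeometric probability of seeing at least `a`
    carriers among the cases.
    """
    n_cases = a + b
    n_carriers = a + c
    total = a + b + c + d
    upper = min(n_cases, n_carriers)
    tail = sum(comb(n_cases, k) * comb(total - n_cases, n_carriers - k)
               for k in range(a, upper + 1))
    return tail / comb(total, n_carriers)


def count_carriers(cohort, genes):
    """Number of subjects with a defect in at least one of the genes."""
    return sum(1 for defects in cohort.values() if defects & genes)


def association_p(cases, controls, genes):
    """p-value for enrichment of carriers of `genes` among cases."""
    a = count_carriers(cases, genes)
    c = count_carriers(controls, genes)
    return fisher_exact_greater(a, len(cases) - a, c, len(controls) - c)


# Hypothetical interactome: six genes whose products interact.
interactome = {"Y1", "Y2", "Y3", "Y4", "Y5", "Y6"}

# Each hypothetical case carries a defect in a different interactome gene.
cases = {f"case_{i}": {f"Y{i}"} for i in range(1, 7)}
# Hypothetical controls carry defects only in an unrelated gene.
controls = {f"control_{i}": {"Z1"} for i in range(1, 7)}

# Gene-centric analysis: test each gene on its own.
gene_p = {gene: association_p(cases, controls, {gene})
          for gene in sorted(interactome)}

# Interactome-centric analysis: test the gene set as a single unit.
interactome_p = association_p(cases, controls, interactome)

print(gene_p)
print(interactome_p)
```

In this toy cohort every single-gene test gives p = 0.5, whereas the pooled interactome test gives p = 1/924, roughly 0.001. The gain comes entirely from grouping genes by a trusted interaction network, so an unreliable network would pool the wrong genes and dilute the signal.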

These solutions depend on reliable protein interaction networks. As mentioned above, the quality of commonly used protein–protein interaction databases is substandard, owing to a lack of thorough biochemical, functional and/or genetic confirmation of interactions. We posit that it would be possible to extract more information about disease mechanisms and disorder boundaries from current GWAS if reliable protein interaction maps existed. As these are either not available or under construction, we propose to focus efforts on defining the interactomes of (a) NDD ‘tip of the iceberg genes’ as well as (b) ‘guilty by association’ proteins detected in the proteomes of cells carrying genetic defects in ‘tip of the iceberg genes’. These and other experimentally confirmed interactomes (yellow interactome 2 in Figure 2e) would allow us to extract novel genetic information from existing and future GWAS.

Creating a genome-independent nosology from proteomes-interactomes

Human proteomes are heritable molecular phenotypes72 and as such constitute valuable, yet untapped, resources to create disorder classifications rooted in molecules and their pathways. The study of proteomes shares with the analysis of genomes its quantitative and unbiased character. However, proteomes and interactomes offer the distinctive advantage of being executors of phenotypic programs in cells and tissues. Therefore, proteomes and interactomes are causally closer to the identity of disease mechanisms than genomes. Proteomes are already beginning to shed light on complex neurological disorders such as schizophrenia.81, 82 However, we should not limit ourselves to just exploring postmortem brains of subjects grouped solely by their clinical features. Instead, we advocate for the study of proteomes from cells isolated from individuals that are genetically related. Cell proteomes from affected probands compared with their unaffected first-degree relatives offer a great prospect for the identification of heritable or de novo abnormalities in molecular phenotypes. Evidently, in the context of NDDs, human induced pluripotent stem cells are a great resource, as they can be differentiated into neurons.83 However, it is likely that the molecular mechanisms affected in NDDs are common to many, if not all, cells. This is the case, for example, in Fragile X syndrome and velocardiofacial syndrome, where multiple tissues are affected.12 Thus, fibroblasts or lymphoblasts from human pedigrees are likely to offer valuable insights into neuronal disorders. We predict that proteomes built from genetically related subjects’ cells will bridge two camps. On one hand, proteomes will help us to interpret results from genome-wide analyses. On the other hand, they will guide us to define NDD mechanisms at levels of complexity higher than the traditional single genes or proteins. These would include, for instance, subcellular compartments, such as synapses or mitochondria, and deficits in tissue organization, such as those in neural circuits. Genomes, proteomes and interactomes give us vantage points; the inevitable next step is to dive deep into the biology emerging from and converging on them.