Introduction

Autism spectrum disorder (ASD) has a reported prevalence of 1 in 68 children in the United States.1 ASD is a grouping of lifelong neurodevelopmental disorders, characterized by impairments in reciprocal social interaction and communication, and the presence of stereotypical behaviors, interests or activities. The etiology of ASD is not yet well understood. Although mutations of many genes, including NLGN3, NLGN4, NRXN1, SHANK2, SHANK3 and PTCHD1, have been associated with ASD,2, 3 metabolic, infectious, inflammatory and other environmental factors have also been implicated in the pathogenesis of ASD.4, 5, 6, 7, 8, 9 We previously determined that hypermethylation of the ENO2 gene is present in 15% of children with ASD,10 indicating that epigenetic factor(s) may contribute to the etiology of ASD.3, 11 In addition, as transcriptional and posttranscriptional regulators, both microRNAs (miRNAs) and long noncoding RNAs (lncRNAs) have been reported to be involved in ASD as well as in many other neurological disorders.12, 13, 14, 15, 16, 17, 18, 19, 20, 21 Overexpression and knockdown studies have shown that lncRNAs have important roles in regulating a variety of processes, including splicing,22 transcription,23 localization24 and the organization of subcellular compartments.25, 26, 27, 28, 29, 30, 31 Underscoring the importance of lncRNAs’ regulatory roles is their emergence as essential components in the etiology of many disorders, and of complex diseases in particular, for which genetic and environmental interactions have key roles.31, 32, 33, 34

LncRNAs are a subset of RNA molecules greater than 200 nt in length that are transcribed but not translated. They may be positioned in genomic sequences as antisense, intronic and large intergenic noncoding RNAs (ncRNAs), as well as at promoter-associated and untranslated regions, which function as cis or trans regulators. LncRNAs can function as translational and posttranslational regulators of brain development and differentiation, and are associated with various human brain disorders.16, 17, 18, 19, 20, 21, 35, 36, 37, 38, 39, 40 LncRNAs have been reported to be involved in many complex diseases, including neurodegenerative and psychiatric diseases, cardiovascular disease, immune dysfunction and auto-immunity, carcinogenesis and reproductive diseases.41, 42, 43, 44, 45, 46, 47, 48, 49, 50 Deregulation of lncRNAs is becoming recognized as a major feature of many types of diseases. Importantly, cancer-associated lncRNAs may serve as diagnostic or predictive biomarkers and provide targets for new therapeutic strategies for selective silencing.51 Among 168 human diseases that have been found to be associated with lncRNAs, and that are recorded in the lncrnadisease (http://cmbi.bjmu.edu.cn/lncrnadisease) database, neurological diseases, cardiovascular diseases and cancers account for 8.3%, 10.7% and 40.5%, respectively.52

Altered lncRNA levels have been identified in ASD brains.21, 53, 54, 55 In one study of ~33 000 annotated lncRNAs and 30 000 messenger RNA (mRNA) transcripts from the postmortem prefrontal cortex and cerebellar tissues of two ASD and two control subjects, over 200 differentially expressed lncRNAs were detected. These differentially expressed lncRNAs in the ASD subjects were enriched for genomic regions containing genes related to neurodevelopment and neuropsychiatric diseases. Comparison of mRNA expression between the prefrontal cortex and the cerebellum showed more transcriptional homogeneity within individual ASD brains than within control brains. This finding was also true of the lncRNA transcriptome.55 Abnormalities in mRNA expression in ASD have also been observed in peripheral blood mononuclear cells, which are safely and easily assayed in infants and offer the potential of a peripheral blood-based, early biomarker panel to detect risk for ASD in infants and toddlers.56 We undertook this study to determine whether lncRNAs are differentially expressed in the blood of individuals with ASD, rather than in ASD brains. Our positive findings may open a new approach to investigating potential epigenetic mechanisms underlying ASD and to identifying biomarkers for possible clinical screening and diagnosis of ASD.

Materials and methods

Ethics statement

The Hospital Ethics Committee reviewed and approved the research project. Informed consent was obtained from the parents of the participating children. All the material and data were previously de-identified and coded, and were anonymous to the investigators.

Subjects

Twenty-five pairs of gender- and age-matched Chinese ASD and control children were recruited for this discovery study at their first clinical visit, before any clinical laboratory studies, intervention or medication. The children with ASD were clinically diagnosed according to DSM-IV criteria and did not have epilepsy, any physical disabilities or a family history of ASD. The controls were phenotypically and developmentally normal children who were undergoing an annual health checkup. There were 17 pairs of boys and eight pairs of girls, 3–5 years of age, in both the ASD and control groups. Lymphocytes were isolated from 3 to 5 ml of peripheral blood specimens of the Chinese participants and stored at −70 °C until total RNA was extracted with a Qiagen Mini kit (Qiagen, Valencia, CA, USA). In addition, total RNAs isolated from 10 lymphoblast cell lines derived from Caucasian children (seven boys and three girls, aged 3 to 8 years) with ASD were subjected to the validation study. No Caucasian control samples were used for the validation.

Microarray hybridization

The Arraystar Human LncRNA Array v2.0 (www.arraystar.com), which detects genome-wide lncRNAs and mRNAs simultaneously, was used for this study. This array covers 33 045 lncRNAs and 30 218 mRNAs that were identified from authoritative data sources, including RefSeq, UCSC Knowngenes and Ensembl. RNA labeling and array hybridizations were performed according to the Agilent One-Color Microarray-Based Gene Expression Analysis protocol (Agilent Technologies, Santa Clara, CA, USA) with minor modifications. Briefly, mRNA was purified from total RNA after the removal of ribosomal RNA with the mRNA-ONLY Eukaryotic mRNA Isolation Kit (Epicentre, Omaha, NE, USA). Each sample was amplified and transcribed into fluorescent complementary RNA (cRNA) along the entire length of the transcripts without a 3′ bias, utilizing the random priming method. The labeled cRNAs were purified using an RNeasy Mini Kit (Qiagen). The concentration and specific activity of the labeled cRNAs (pmol Cy3 per μg cRNA) were measured with a NanoDrop ND-1000. One microgram of each labeled cRNA was fragmented by adding 5 μl of 10 × blocking agent and 1 μl of 25 × fragmentation buffer, and then heating the mixture to 60 °C for 30 min. Finally, 25 μl of 2 × GE hybridization buffer was added to dilute the labeled cRNA. Fifty microliters of hybridization solution was dispensed onto the gasket slide and assembled with the lncRNA expression microarray slide. The slides were incubated for 17 h at 65 °C in an Agilent hybridization oven. The hybridized arrays were washed, fixed and scanned using the Agilent DNA Microarray Scanner (Agilent Technologies). Agilent Feature Extraction software (version 11.0.1.1) was used to analyze the acquired array images. Quantile normalization and subsequent data processing were performed using the GeneSpring GX v12.1 software package (Agilent Technologies).
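The quantile normalization step can be illustrated with a minimal sketch. This is not the GeneSpring GX implementation, whose handling of ties and flagged probes may differ; the function name and the toy intensities below are hypothetical and serve only to show the principle that every array is forced onto the same intensity distribution.

```python
def quantile_normalize(columns):
    """Quantile-normalize equal-length samples.

    columns: one list of probe intensities per array (hypothetical toy input).
    Ties are broken by probe position, a simplifying assumption.
    """
    n_probes = len(columns[0])
    # Sort each sample, then average across samples at each rank.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [
        sum(col[i] for col in sorted_cols) / len(columns)
        for i in range(n_probes)
    ]
    normalized = []
    for col in columns:
        # Probe indices ordered from lowest to highest intensity in this sample.
        order = sorted(range(n_probes), key=lambda i: col[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            # Replace each value with the across-sample mean for its rank.
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Two toy arrays with three probes each (illustrative values only).
samples = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(samples)
```

After normalization, both toy arrays contain the same set of values, assigned according to each probe's rank within its own array.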
After normalization of the raw data, lncRNAs and mRNAs that had flags (‘All Targets Value’) were chosen for further data analysis. Differentially expressed lncRNAs and mRNAs between the two groups with statistical significance were identified through volcano plot filtering. Hierarchical clustering was performed using the Agilent GeneSpring GX software (Version 12.1). Both ‘GO analysis’ and ‘Pathway analysis’ were performed with the DAVID program (http://david.abcc.ncifcrf.gov), in which analysis of gene ontology (GO) and KEGG PATHWAY was conducted. The results were also analyzed using the genetic and molecular interaction software GeneMANIA,57, 58 an algorithm to determine the relationship between these mRNAs. The bio-functions and canonical pathways associated with our data were generated by using the core-analysis option in Ingenuity Pathway Analysis (Ingenuity Systems; http://www.ingenuity.com).
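Volcano plot filtering combines a fold-change threshold with a significance threshold. The sketch below shows the idea under stated assumptions: the cutoffs (two-fold change, P<0.05) are illustrative defaults and are not values reported in this section, and the transcript names and numbers are hypothetical.

```python
import math


def volcano_filter(records, fc_cutoff=2.0, p_cutoff=0.05):
    """Split transcripts into up- and downregulated lists.

    records: iterable of (transcript_id, log2_fold_change, p_value).
    fc_cutoff and p_cutoff are assumed, illustrative thresholds.
    """
    log2_cut = math.log2(fc_cutoff)
    up, down = [], []
    for transcript_id, log2_fc, p_value in records:
        if p_value >= p_cutoff:
            continue  # not statistically significant
        if log2_fc >= log2_cut:
            up.append(transcript_id)
        elif log2_fc <= -log2_cut:
            down.append(transcript_id)
    return up, down


# Hypothetical transcripts: (id, log2 fold change ASD vs control, P-value).
records = [
    ("lnc_A", 1.8, 0.001),   # large increase, significant
    ("lnc_B", -1.2, 0.020),  # large decrease, significant
    ("lnc_C", 0.4, 0.001),   # significant but small change
    ("lnc_D", 2.5, 0.300),   # large change but not significant
]
up, down = volcano_filter(records)
```

Only transcripts that pass both thresholds are retained, which corresponds to the upper-left and upper-right regions of a volcano plot.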

Quantitative real-time PCR analysis

The total RNA extracted from leukocytes or from lymphoblasts was used to synthesize cDNA. The expression levels of lncRNAs and of lncRNA-targeted mRNAs were determined by quantitative real-time PCR. Quantitative PCR reactions (the primer sequences used in quantitative PCR are listed in Supplementary Table S1) were performed on the ABI7900 system (Life Technologies, Grand Island, NY, USA) with SYBR green dye SuperArray PCR master mix (SABiosciences, Frederick, MD, USA). mRNA of glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was used as an internal control for quantitative analysis of lncRNA or mRNA. The lncRNA or mRNA values were normalized to GAPDH levels. For each lncRNA or mRNA, triplicate reactions were analyzed simultaneously, and the result was reported as the expression relative to this control. All the data were given in terms of relative expression as the mean±S.E. (n=10). The data were subjected to one-way analysis of variance followed by an unpaired, two-tailed t-test. Differences were considered significant at P<0.05.
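Normalization to GAPDH can be sketched as follows. The text states only that values were normalized to GAPDH; the 2^(−ΔΔCt) calculation used here is a common convention that we assume for illustration, and it presumes roughly 100% amplification efficiency. All function names and Ct values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def delta_ct(ct_target, ct_reference):
    """Mean Ct of the target minus mean Ct of the reference (GAPDH)."""
    return mean(ct_target) - mean(ct_reference)


def relative_expression(delta_ct_sample, delta_ct_control):
    """Fold change by the 2^(-ddCt) convention (an assumed method)."""
    return 2.0 ** -(delta_ct_sample - delta_ct_control)


def mean_and_se(values):
    """Mean and standard error of the mean for a group of values."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical triplicate Ct values for one lncRNA and for GAPDH.
asd_dct = delta_ct([24.1, 24.0, 23.9], [18.0, 18.1, 17.9])
control_dct = delta_ct([25.0, 25.1, 24.9], [18.0, 18.0, 18.0])
fold = relative_expression(asd_dct, control_dct)

# Summary of hypothetical per-subject relative expression values.
group_mean, group_se = mean_and_se([1.0, 2.0, 3.0])
```

In this toy case the target crosses threshold one cycle earlier in the ASD sample after GAPDH correction, which corresponds to about a two-fold higher relative expression.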

Results

Differential expression profiles of lncRNA and mRNAs

A total of 3929 lncRNAs were identified as differentially expressed in Chinese ASD peripheral blood cells, including 2407 that were upregulated and 1522 that were downregulated. Among these, intergenic lncRNAs were the most common (accounting for 43%), followed by natural antisense (19%), intronic antisense (12%), exon sense-overlapping (9%), bi-directional (5%) and intron sense-overlapping (4%). Five percent of the identified lncRNAs belonged to uncharacterized groups (Supplementary Table S2). In the same genome-wide analysis, 2610 mRNAs, including 1789 upregulated and 821 downregulated, were also identified as differentially expressed in ASD blood cells. The entire data set has been deposited in a public repository (DataDryad.org, doi:10.5061/dryad.d8f84).

Functional pathways derived from lncRNAs–mRNAs

The gene loci where the lncRNAs are localized were subjected to pathway and gene ontology analysis. A total of 13 pathways derived from upregulated lncRNAs and 14 from downregulated lncRNAs were identified as being significant in the ASD group (P<0.05). The 10 pathways with the highest enrichment score (Figure 1) showed that downregulated lncRNA loci were mainly involved in infection and inflammatory pathways. However, three pathways that are related to neurological regulation (the long-term depression, the synaptic vesicle cycling and the long-term potentiation pathways) were characterized from the upregulated lncRNA loci. The plot of enrichment score shows the relative level of probability that differentially expressed lncRNAs are involved in each pathway. The enrichment scores of the upregulated pathways span a lower range than those of the downregulated pathways, suggesting that the pathogenic impact of lncRNAs is heavier in the downregulated pathways and that the downregulated pathways are more likely than the upregulated pathways to be involved in ASD.

Figure 1

Metabolic pathways characterized from the lncRNAs differentially expressed in ASD: The 10 top-scoring up- and downregulated pathways were characterized with KEGG functional analysis. Three P-values, the EASE-score, Fisher P-value and Hypergeometric P-value, were integrated for the analysis. The bar plot shows the Enrichment Score [−log10(P-value)] of each significantly enriched pathway. A higher Enrichment Score indicates that more lncRNA molecules are involved in the pathway. ASD, autism spectrum disorder; lncRNA, long noncoding RNA.
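The Enrichment Score in the caption above is −log10 of a P-value. The sketch below illustrates the hypergeometric component only; it does not reproduce the EASE score or the way DAVID integrates the three P-values. The function names and the gene counts are hypothetical.

```python
from math import comb, log10


def hypergeom_p(background, in_pathway, in_list, overlap):
    """Upper-tail hypergeometric P-value, P(X >= overlap).

    background: number of genes in the background set
    in_pathway: background genes annotated to the pathway
    in_list:    genes in the differentially expressed list
    overlap:    listed genes that fall in the pathway
    """
    upper = min(in_pathway, in_list)
    total = comb(background, in_list)
    favorable = sum(
        comb(in_pathway, i) * comb(background - in_pathway, in_list - i)
        for i in range(overlap, upper + 1)
    )
    return favorable / total


def enrichment_score(p_value):
    """Enrichment Score as defined in the caption: -log10(P-value)."""
    return -log10(p_value)


# Toy example: 20 background genes, 5 in the pathway, 5 in the list, all 5 overlap.
p = hypergeom_p(20, 5, 5, 5)
score = enrichment_score(p)
```

A smaller P-value gives a larger score, so a pathway with P=0.001 scores 3 and one with P=0.05 scores about 1.3.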

Differential expression of synaptic lncRNAs and mRNAs

Thirteen synaptic lncRNAs (Table 1A), including nine upregulated and four downregulated, and 19 synaptic mRNAs (Table 1B), including 12 upregulated and seven downregulated, were identified as being differentially expressed in children with ASD. Among the upregulated lncRNAs, six were intronic antisense, two exon sense-overlapping and one natural antisense. Among the downregulated lncRNAs, three were natural antisense and one intronic antisense. To validate the differential expression of the synaptic lncRNAs and mRNAs identified in the microarray-based discovery study, the lncRNAs and mRNAs were analyzed by quantitative PCR. Before the differential expression between the ASD and the control groups was analyzed, inter-group comparisons were made between the lymphocytes in a Chinese population and the lymphoblasts in a Caucasian population (A1 vs A2, C1 vs C2). As shown in Table 2A, the expression of lncRNAs NR_037945 (STX16), ENST00000565041 (SYNGR3), ENST00000425264 (SLC18A2), ENST00000502589 (SV2C) and ENST00000453544 (SNAP25) showed no significant difference between the Chinese and the Caucasian populations in both the ASD and the control groups; expression of ENST00000527880 (SYP), NR_034115 (STXBP5), NR_033656 (STX8) and ENST00000553165 (SYT1) showed no significant difference between the Chinese and Caucasian ASD groups; and expression of uc001mff.1 (SYT9), ENST00000504206 (SYT9), ENST00000506914 (SYT15) and ENST00000433499 (STXBP5) showed no significant difference between the two control groups. The differential expression of all lncRNAs, except ENST00000453544 (locus SNAP25) in the Caucasian ASD subjects and ENST00000553165 (SYT1) in both the Chinese and Caucasian ASD subjects, was statistically significant between the ASD and control groups. However, no statistically significant difference (P>0.05) was observed between males and females, or between different age groups, in either the ASD or the control group.
No significant difference in the expression of three mRNAs (NGR4, SYNDIG1 and STX2) was evident between the Chinese and Caucasian ASD subjects. Expression of the mRNAs SYNJ1, SDCBP, SYPL1, SYNM and SYNDIG1L was not significantly different between the Chinese and Caucasian control groups. Expression of all mRNAs, except SYCE1 in the Caucasian population and STX2 and SYT3 in both populations, was significantly different between the ASD and control groups in both populations (Table 2B). By integrating the genome-wide-expressed lncRNAs with the synaptic mRNAs, we were able to draw the networks between the lncRNAs and mRNAs (Figure 2). These networks help to associate differential gene expression with gene ontology, biological pathways and the regulatory functions of the lncRNAs.

Table 1A Synaptic lncRNAs differentially expressed in ASD (discovery)
Table 1B Synaptic mRNAs differentially expressed in ASD (discovery)
Table 2A Synaptic lncRNAs differentially expressed in ASD (validation)
Table 2B Synaptic mRNAs differentially expressed in ASD (validation)
Figure 2

Validation of lncRNAs: qRT-PCR was applied to validate differentially expressed lncRNAs (right panels) and mRNAs (left panels) between the ASD and the control groups. Other than the gene symbol, each lncRNA was labeled with its name that can be matched to the symbol in Table 3. The height of each bar, measured by mean±s.d., represents the relative expression level. ASD, autism spectrum disorder; lncRNA, long noncoding RNA; qRT-PCR, quantitative real-time PCR.

Association of lncRNAs with autistic genes

Genetic and genomic studies have revealed that a substantial proportion of ASD risk resides in high-impact rare variation, ranging from chromosome abnormalities and single-nucleotide variation to copy-number variation and gene mutations.59 When we matched our lncRNA results to these gene loci, 19 lncRNAs were found to be associated with ASD genes (Table 3A). Among these lncRNAs, seven were natural antisense (AS); six, intronic antisense; three, bi-directional; two, intron sense-overlapping; and one, intergenic. Twelve of the 19 lncRNAs associated with ASD genes were lncRNAs of homeobox or homeobox-related genes, including nine that were upregulated and three that were downregulated, followed by four brain-derived neurotrophic factor (BDNF) isoforms. Interestingly, differential expression of mRNAs for HOXA and HOXB was also identified in our microarray-based discovery study of the ASD patients (Table 3B). However, no mRNAs that are targets of BDNF-AS or of the intronic AS of SHANK2 were identified. SHANK2 is a member of the SHANK gene family, and mutations in this gene have been found in ASD patients (http://autism.mindspec.org/GeneDetail/SHANK2).

Table 3A LncRNAs associated with autistic genes
Table 3B Differential expression of mRNAs encoded by autistic genes

Discussion

Differentially expressed lncRNAs represent a new potential biomarker category for the early detection of ASD

Previous studies have reported that lncRNAs were aberrantly expressed in brain tissues and associated with ASD.56 However, brain tissue cannot be used as clinical material for early screening or for diagnostic purposes. In addition, there is no specific biomarker at present that can be applied in clinical practice, owing to the genetic heterogeneity of ASD,10 although efforts have been undertaken to characterize blood mRNA profiles.60, 61, 62, 63 Use of a panel of lncRNAs that are specifically associated with ASD phenotypes and are differentially expressed in ASD peripheral blood would be valuable and practical. In this study, we presented metabolic pathways (Figure 1) that define peripheral blood lncRNAs that are differentially expressed in ASD. Among these, synaptic vesicle cycling, long-term depression and long-term potentiation are neurologically related pathways. LncRNAs that are differentially expressed in ASD and have been identified in these pathways include IGF1, mGluR1, CRFR1, IGF1R, NMDAR and VDCC, which are localized to cell membranes; and Ras, G protein, PLC, IP3R, PKG, ERK1, ERK2 and PP2A, which are in the cytoplasm and involved in signal transduction for long-term depression and long-term potentiation. The differentially expressed lncRNAs involved in the synaptic vesicle cycling pathway are Rab3A, Munc13, Syntaxin, SNAP25, Clathrin, V-ATPase and trans-SNARE complex. Expression of all these genes is found with various technology platforms (www.genecards.org). It is not clear whether the lncRNAs and mRNAs identified in the peripheral lymphocytes are identical to those expressed in neuronal cells or whether they reflect ASD brain functions.

Although lncRNAs have been determined to associate with transcriptional regulation in neuronal development and diseases,17 applying differential gene expression profiling of peripheral blood to brain disorders raises the challenge of showing that the expression profile in peripheral blood is relevant to that in brain tissue. In practice, human brain tissue cannot be obtained for routine screening or diagnostic analysis. The eQTL gene transcripts identified from brains were demonstrated to be stably expressed in peripheral blood.64, 65 Our earlier study also determined that the brain gene ENO2 showed differential methylation in peripheral blood.10 Therefore, analyzing differential expression profiles in blood may open a new approach to developing blood-based biomarkers for brain diseases.

The differential expression we found demonstrates the potential for lncRNAs to be applied as clinical biomarkers. Replication of our study with larger samples and various ethnic backgrounds will be needed. Indeed, we noted differences in the lncRNAs when comparing the Chinese and Caucasian populations (Table 2A). These differences suggest that there exist, at certain gene loci, inter-population and inter-condition differences in gene expression. A similar finding of the influence of different ethnic backgrounds was also observed in the mRNAs (Table 2B).

Synaptic lncRNAs may regulate synaptic vesicle transportation

Among the 13 synaptic lncRNAs, three resided at the genes for synaptic vesicle proteins (Table 1A). The gene SLC18A2, a member of the vesicular monoamine transporter family, encodes a transporter that moves cytosolic monoamines into synaptic vesicles, using the proton gradient maintained across the synaptic vesicular membrane. Its function is essential for the correct activity of monoaminergic systems, which have been implicated in several human neuropsychiatric disorders, including brain dopamine–serotonin vesicular transport disease and cocaine dependence.66 The gene SV2C encodes synaptic vesicle glycoprotein 2C, which has a role in the control of regulated secretion in neural and endocrine cells, selectively enhances low-frequency neurotransmission, and positively regulates vesicle fusion by maintaining the readily releasable pool of secretory vesicles.67 SYP is a gene that encodes the synaptic protein synaptophysin, an integral membrane protein of small synaptic vesicles in the brain and endocrine cells, which is a transporter and a calcium ion–binding protein. This protein may also bind cholesterol and is thought to direct targeting of vesicle-associated membrane protein 2 (synaptobrevin) to intracellular compartments.68 Mutations in this gene are associated with X-linked mental retardation (www.researchgate.net/publication/12901506_XLMR_database).69 In addition to these three synaptic vesicle proteins, the genes STX8 and STX16 are also involved in synaptic vesicle metabolism. The STX8 gene is involved in protein trafficking from early to late endosomes via vesicle fusion and exocytosis. It encodes a vesicle trafficking protein that functions in the early secretory pathway, possibly by mediating retrograde transport from cis-Golgi membranes to the endoplasmic reticulum.70, 71 The STX16 gene is a member of the t-SNARE (target-SNAP receptor) family.
Proteins in this family are found on cell membranes and serve as the targets for v-SNAREs (vesicle-SNAP receptors), permitting specific synaptic vesicle docking and fusion. Diseases associated with STX8 include visual epilepsy, and diseases associated with STX16 are pseudohypoparathyroidism type 1b and pseudohypoparathyroidism.72, 73

HOX genes are likely to be deregulated in ASD

Several studies have demonstrated that lncRNAs can function in the regulation of in vivo transcription. An lncRNA dubbed linc-HOXA1 RNA has been found to repress Hoxa1 expression. Knockdown of linc-HOXA1 increases transcription of the Hoxa1 gene, which is located some 50 kb from linc-HOXA1.74 HOXA cluster antisense RNA 2 (HOXA-AS2) is an lncRNA located between the HOXA3 and HOXA4 genes in the HOXA cluster. HOXA-AS2 is an apoptosis repressor in all-trans retinoic acid-treated NB4 promyelocytic leukemia cells.75 Its transcript is expressed in NB4 promyelocytic leukemia cells and human peripheral blood neutrophils, and expression is increased in NB4 cells treated with all-trans retinoic acid. The all-trans retinoic acid induction of HOXA-AS2 suppresses all-trans retinoic acid-induced apoptosis.75 The HOTAIR (Hox transcript antisense RNA) gene contains 6232 bp and encodes a 2.2-kb lncRNA. Its source DNA is located within a HOXC gene cluster. Recently, differential expression of HOTAIR has been determined to be associated with cancer metastasis and possibly to represent an independent prognostic factor.76 The 5′ end of HOTAIR interacts with Polycomb repressive complex 2, a polycomb-group protein complex, and as a result regulates chromatin state. HOTAIR is required for gene silencing of the HOXD locus by Polycomb repressive complex 2, and its 3′ end interacts with the histone demethylase LSD1.77 In our study, we identified several lncRNAs of the HOX genes (HOXA13, HOXB5, HOXB6 and HOXD1) and HOX-related genes (DLX6, HMBOX1 and BARX1) in leukocytes derived from ASD patients (Table 3A). These findings led us to hypothesize that the differentially expressed lncRNAs of HOX genes and HOX-related genes, referred to as lncHOXs, could represent a new set of biomarkers for ASD.

The variety of identified lncRNAs suggests that lncRNAs’ regulatory functions involved in ASD may have various epigenetic mechanisms

LncRNAs have been recognized as transcriptional and posttranscriptional regulators.28, 29, 30, 31, 32, 33 They may function to activate gene transcription by binding a transcriptional factor to the promoter region to signal or guide transcription. They may prevent miRNA from binding to the target gene, may suppress gene transcription by decoying the transcription factor away from the promoter, or may be a chromatin modifier by bringing a chromatin enzyme onto chromatin to form a complex and thereby modify histones.28, 29, 30, 31, 32, 33 Usually, if the lncRNA is the antisense of a gene, the lncRNA likely functions as the cis-suppressor to inhibit gene transcription. This could be the case for both the natural and intronic antisense lncRNAs that we identified (Tables 1A and 3A). To further understand their molecular mechanisms, transgenic models created by introducing extragenic lncRNA to generate a knockout, knockin or knockdown model at the cellular and/or animal level could be investigated. Such a transgenic model could provide phenotype(s) to mimic ASD. So far, there is little evidence that details the molecular and pathogenic mechanisms of the bi-directional and intergenic lncRNAs involved in ASD and other neurological diseases. In the present study, we identified three bi-directional and one intergenic lncRNA, and observed that all are located within HOX loci (Table 3A). A better understanding of the molecular mechanisms will clarify how these lncRNAs are involved in regulating gene expression in ASD and other neurological conditions.

In conclusion, we have profiled here the differential expression of lncRNAs and mRNAs in ASD peripheral leukocytes and have identified important clusters that may be associated with this disorder. Our findings suggest the importance of synaptic lncRNAs, which are likely involved in synaptic vesicle transportation and cycling and thus would be important for the delivery of synaptosomal protein(s) between presynaptic and postsynaptic membranes. LncRNAs that are the antisense of the HOX genes may be related to ASD. This finding may open a new approach to investigating the pathogenic mechanisms of the HOX genes in the development of ASD. Identification of the lncRNAs SHANK2-AS and BDNF-AS indicates that, in addition to gene mutation, deregulation of lncRNAs at ASD-causing gene loci may represent a new category of epigenetic mechanisms involved in ASD and may allow their exploration. Further investigation with larger sample sizes may validate the use of lncRNAs as biomarkers for early detection of ASD.