1 Introduction

Autism spectrum disorder (ASD) refers to a group of complex neurodevelopmental disorders characterized by repetitive and characteristic patterns of behavior and difficulties with social communication and interaction. The latest analysis from the Centers for Disease Control and Prevention estimated that 1 in 68 children in the U.S. is diagnosed with ASD. ASD is mainly diagnosed by the observation of core behavioral symptoms [1]. For many neurodevelopmental disorders, brain dysfunction may precede abnormal behavior by months or even years. However, according to the Interagency Autism Coordinating Committee strategic plan, due to the absence of early biomarkers to detect people either with or at risk of ASD, diagnosis must rely on behavioral observations long after birth. As a result, ASD is not typically diagnosed until around 3–4 years of age in the U.S. [2]. Consequently, intervention efforts may miss a critical developmental window. Thus, it is extremely important to detect ASD earlier in life for better intervention.

Magnetic resonance imaging (MRI) based characterization of ASD has been explored as a complement to the current behavior-based diagnosis [3]. MRI-based brain volumetric studies on children and young adults with ASD found abnormalities in the hippocampus [4], precentral gyrus [5], and anterior cingulate gyrus [6]. However, most of the autistic subjects involved in previous studies were at least 2 years old. In fact, the first year of life is the most dynamic phase of postnatal brain development, with rapid tissue growth and the emergence of a wide range of cognitive and motor functions [7]. This early period is critical in neurodevelopmental disorders such as ASD [8]. In this work, for the first time, we perform a volume-based analysis on infants at 6 months of age who are at risk of autism. To perform volume-based analysis and further identify possible imaging biomarkers, accurate segmentation of the infant brain into different types of tissue, e.g., white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF), is the most critical step. It allows reliable quantification of volumetric tissue abnormalities for ASD [9]. However, accurate segmentation of infant brain MRI is challenging, especially at a very early age such as 6 months, when tissue contrast is at its lowest because myelination is still largely immature [10]. For instance, the 1st column of Fig. 1 shows representative examples of T1-weighted (T1w) and T2-weighted (T2w) MRI scanned at 6 months of age. It can be observed that the intensities of WM and GM fall within a similar range (especially around the cortex), even though the images were acquired with an imaging protocol dedicated to maximizing tissue contrast [10]. Such isointensity poses a significant challenge for accurate tissue segmentation. To the best of our knowledge, few studies have focused on tissue segmentation of 6-month-old infants at risk of ASD.

Fig. 1.

(a) and (b) show T1w and T2w 6-month-old infant brain images with extremely low tissue contrast caused by the inherent ongoing myelination and maturation. (c) and (d) are the segmentation results by [11] and by the proposed work with anatomical guidance, respectively, with their corresponding inner surfaces shown in (e) and (f), and cortical thickness (in mm) shown in (g) and (h). Without anatomical guidance, some WM is missed by the previous work [11], as indicated by the yellow ellipse in (c), which results in abnormal cortical thickness in (g).

Recently, deep convolutional neural networks (CNNs) have demonstrated outstanding performance in a wide range of computer vision and image analysis applications. For example, fully convolutional networks (FCNs) [12, 13], as a natural extension of traditional CNNs, are now a common choice for semantic image segmentation in computer vision. FCNs train end-to-end segmentation models by directly optimizing intermediate feature layers. To compensate for the resolution loss induced by pooling layers, U-Net [12] introduces skip connections between its down-sampling and up-sampling paths, thus helping the up-sampling path recover fine-grained information from the down-sampling layers. To date, many network architectures further incorporate residual connections [14] or dense connections [15] into CNNs [11]. Although these networks can automatically learn effective feature hierarchies in a data-driven manner, most of them ignore prior anatomical knowledge during segmentation. A typical example can be seen in Fig. 1(c), in which some WM is missing, leading to abnormal cortical thickness in (g). In particular, for the task of 6-month-old infant brain tissue segmentation, there are two kinds of critical prior knowledge, i.e.,

  • (1) tissue contrast between CSF and GM is higher than that between GM and WM;

  • (2) cortical thickness is within a certain range.

Capitalizing on these two kinds of prior knowledge, we propose an Anatomy-guided and Densely-connected U-Net (ADU-Net) method for accurate segmentation of 6-month-old infant brain MRI. Specifically, we first train an initial ADU-Net to segment CSF and GM+WM, considering that CSF is relatively easy to distinguish. Then, based on the CSF segmentation and the second kind of prior knowledge, we estimate the outer cortical surface (i.e., the CSF/GM boundary) and use it as guidance to train a cascaded ADU-Net for estimation of the inner cortical surface (i.e., the GM/WM boundary). Based on the segmentation results, we parcellate each infant brain into 83 ROIs [16] for computing volumetric measures of 6-month-old autistic infants and healthy controls, and finally perform early diagnosis.

2 Method

Dataset and Preprocessing.

The T1w and T2w MR images of 18 de-identified infants were gathered from the National Database for Autism Research (NDAR). All images were acquired at around 6 months of age on a Siemens 3T scanner. All scans were acquired while the infants were naturally sleeping and fitted with ear protection, with their heads secured in a vacuum-fixation device. T1w MR images were acquired with 160 sagittal slices using parameters: TR/TE = 2400/3.16 ms and voxel resolution = 1 × 1 × 1 mm³. T2w MR images were obtained with 160 sagittal slices using parameters: TR/TE = 3200/499 ms and voxel resolution = 1 × 1 × 1 mm³. Note that the imaging protocol has been optimized to maximize tissue contrast [10]. For image preprocessing, T2w MR images were linearly aligned onto their corresponding T1w MR images. Then, in-house tools were used to perform skull stripping, intensity inhomogeneity correction, and histogram matching for each MR modality.

Accurate manual segmentation is important for learning-based segmentation methods. However, due to the low contrast and the huge number of voxels in a 3D brain image, manual segmentation is extremely time-consuming. Hence, to generate reliable manual segmentations, we first take advantage of the longitudinal follow-up 24-month-old scan of the same subject, which has high tissue contrast, to generate an initial segmentation for the 6-month-old scan by using the publicly available software iBEAT (http://www.nitrc.org/projects/ibeat/). This is based on the fact that, at term birth, the major sulci and gyri in the brain are already present, and are generally preserved and only fine-tuned during early postnatal brain development [17]. Therefore, we can utilize the late-time-point longitudinal images (e.g., 24-month-old), which can be segmented with high accuracy by using existing segmentation tools, e.g., FreeSurfer [18], to guide the segmentation of early-time-point (e.g., 6-month-old) infant images. Based on the segmentation results generated by iBEAT, manual editing was further performed by an experienced neuroradiologist. For each subject, around 200,000 voxels (24% of total brain volume) were re-labeled. In this way, the potential bias from the automatic segmentations is largely minimized and the quality of the manual segmentation is ensured.

2.1 Anatomy-Guided Densely-Connected U-Net (ADU-Net)

To derive anatomical guidance from the outer surface, we first need to classify brain images into two classes, i.e., CSF and WM+GM. Inspired by the recent success of densely-connected networks [15] and U-Net [12] in medical image segmentation, we propose an anatomy-guided densely-connected U-Net architecture (abbreviated as ADU-Net) for the segmentation of 6-month-old infant brain images. The proposed network architecture is shown in Fig. 2. It includes a down-sampling path and an up-sampling path, going through seven dense blocks. Each dense block consists of three BN-ReLU-Conv-Dropout operations, in which each Conv includes 16 kernels and the dropout rate is 0.1. In the down-sampling path, between any two contiguous dense blocks, a transition down block (i.e., Conv-BN-ReLU followed by a max pooling layer) is included to reduce the feature map resolution and increase the receptive field. In the up-sampling path, a transition up block, consisting of a transposed convolution, is included between any two contiguous dense blocks. It up-samples the feature maps from the preceding dense block. The up-sampled feature maps are then concatenated with the same-level feature maps in the down-sampling path and input to the subsequent dense block. The final layer in the network is a convolution layer, followed by a softmax non-linearity to provide the per-class probability at each voxel. For all the convolutional layers, the kernel size is 3 × 3 × 3 with stride size 1 and 0-padding.
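
To make the dense-block description concrete, the following minimal sketch tracks only the number of feature maps through one block. It assumes DenseNet-style concatenation as in [15], i.e., each of the three layers receives the block input plus all earlier layer outputs and contributes 16 new maps. The function name, and the choice to feed the raw input channels directly into the block, are illustrative assumptions and are not taken from the text above.

```python
def dense_block_channels(in_channels, num_layers=3, growth_rate=16):
    """Return (input channels seen by each layer, channels leaving the block).

    Assumption: DenseNet-style concatenation, so every layer's 16 output
    maps are appended to the running stack of feature maps.
    """
    layer_inputs = []
    channels = in_channels
    for _ in range(num_layers):
        layer_inputs.append(channels)  # this layer sees all earlier maps
        channels += growth_rate        # its new maps are concatenated on
    return layer_inputs, channels


# Hypothetical inputs: 2 channels (T1w, T2w) for the first network,
# 3 channels (T1w, T2w, anatomical guidance) for the cascaded network.
first_stage = dense_block_channels(in_channels=2)
second_stage = dense_block_channels(in_channels=3)
```

Under these assumptions a block adds 3 × 16 = 48 feature maps to whatever enters it, which is why the transition blocks between dense blocks matter for keeping the network compact.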

Fig. 2.

Diagram of our architecture for segmentation. Input 1: T1w and T2w images for (CSF, WM+GM) segmentation to construct anatomy guidance; Input 2: T1w and T2w images and anatomy guidance for (WM, GM) segmentation.

Network Implementation.

We randomly extract 32 × 32 × 32 3D patches from the training images. The loss function is cross-entropy. The kernels are initialized by the Xavier method, and the biases are initially set to 0. We use SGD as the optimization strategy. The learning rate is 0.005 and is multiplied by 0.1 after each epoch. Training and testing are performed on an NVIDIA Titan X GPU. Training an ADU-Net takes around 96 h, and in the application stage, segmenting a 3D image requires 70–80 s.
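
The two training settings that are fully specified above, the learning-rate decay and the random patch sampling, can be sketched as follows. This is a minimal illustration, not the training code: the function names are hypothetical, epochs are assumed to be counted from 0, patches are assumed to lie entirely inside the volume, and the volume shape used in the example is a placeholder.

```python
import random


def learning_rate(epoch, base_lr=0.005, decay=0.1):
    """Learning rate for a given epoch (assumed 0-indexed): 0.005 * 0.1**epoch."""
    return base_lr * decay ** epoch


def random_patch_corner(volume_shape, patch_size=32, rng=random):
    """Pick the lowest-index corner of a cubic patch that fits inside the volume."""
    # randint is inclusive on both ends, so the patch never leaves the volume.
    return tuple(rng.randint(0, dim - patch_size) for dim in volume_shape)


rng = random.Random(0)            # fixed seed only to make the sketch repeatable
shape = (160, 192, 160)           # placeholder volume shape, not from the paper
corner = random_patch_corner(shape, rng=rng)
patch_slices = tuple(slice(c, c + 32) for c in corner)
```

The resulting `patch_slices` tuple could be used to index a 3D array holding the image and, with the same corner, the corresponding label map.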

Anatomical Guidance Generation.

T1w and T2w MR images of the training subjects and their corresponding manual segmentations are employed to train the network. To generate the anatomical guidance, as mentioned, we first train an initial ADU-Net to classify the brain images into two classes (i.e., CSF and WM+GM). Figure 3 shows the final estimated CSF and WM+GM segmentation maps for a testing image (the one in Fig. 1). Based on the segmentation results, it is straightforward to construct a signed distance function (i.e., a level set function) with respect to the GM/CSF boundary, as shown in Fig. 3(c). The function value at each voxel is the shortest distance to its nearest point on the GM/CSF boundary, taking a positive value for voxels inside WM+GM and a negative value for voxels outside WM+GM. Therefore, the zero level set corresponds to the outer surface, as shown in Fig. 3(d).
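
As an illustration of the signed distance function described above, here is a minimal brute-force sketch on a small 2D binary mask. It is a discrete approximation under stated assumptions: the distance for a voxel inside WM+GM is measured to the nearest voxel outside it (and vice versa) rather than to a sub-voxel boundary, both classes must be present in the mask, and the function name is hypothetical. For full 3D volumes a Euclidean distance transform routine would normally replace the nested loops.

```python
import math


def signed_distance(mask):
    """Signed distance map for a 2D list of 0/1 values (1 = inside WM+GM).

    Positive inside WM+GM, negative outside, following the sign convention
    in the text. Assumes the mask contains both inside and outside voxels.
    """
    rows, cols = len(mask), len(mask[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    inside = [(r, c) for r, c in cells if mask[r][c]]
    outside = [(r, c) for r, c in cells if not mask[r][c]]
    sdf = [[0.0] * cols for _ in range(rows)]
    for r, c in cells:
        # Distance to the nearest voxel of the opposite class.
        others = outside if mask[r][c] else inside
        d = min(math.hypot(r - rr, c - cc) for rr, cc in others)
        sdf[r][c] = d if mask[r][c] else -d
    return sdf


toy_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
toy_sdf = signed_distance(toy_mask)
```

In this toy example the center voxel has value 2.0 and its inside neighbors on the rim have value 1.0, so the sign change between inside and outside voxels marks where the outer surface lies.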

Fig. 3.

(a) and (b) show the estimated CSF and WM + GM segmentations for a testing image in Fig. 1. (c) illustrates the signed distance function with respect to the outer surface shown in (d).

Anatomy Guided Tissue Segmentation.

We further classify WM+GM into WM and GM by training a cascaded ADU-Net with the same architecture as in Fig. 2. It is worth noting that, besides the T1w and T2w MR images, the level set function (Fig. 3(c)) is also input to the network as anatomical guidance for tissue segmentation. Recalling the inner cortical surface obtained without guidance for the testing image in Fig. 1(e), we can see that the cortical thickness is incorrect due to the missing WM areas. With anatomical guidance from the outer surface in Fig. 3(d), more voxels are expected to be labeled as WM so that the cortical thickness stays within a reasonable range. From Fig. 1(d), it can be observed that the WM in the yellow ellipse is recovered, and thus the missing inner gyrus is also recovered (Fig. 1(f)) by the anatomical guidance, resulting in a reasonable cortical thickness in Fig. 1(h). Similarly, topological errors, i.e., holes or handles, that cause abnormal cortical thickness can also be corrected.

2.2 ROI-Based Volumetric Measures as Imaging Biomarkers

To parcellate each 6-month-old infant brain image into different regions of interest (ROIs), we employ a multi-atlas strategy. In particular, a total of 33 two-year-old subjects were employed as individual atlases (www.brain-development.org) [16]. Each atlas consists of a T1w MR image and a label image of 83 ROIs. We first employ FreeSurfer [18] to segment each atlas T1w MR image into WM, GM, and CSF. Then, we register all atlases to each 6-month-old infant image space based on their segmentations using ANTs [19]. Finally, we employ majority voting to parcellate each 6-month-old infant brain into 83 ROIs.
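
The majority-voting step can be sketched as below, assuming the atlas label images have already been warped into the infant image space and flattened to equal-length lists of ROI labels. The function name and the toy labels are illustrative, and the tie-breaking rule (the label seen first among the atlases wins) is an assumption, since the text does not specify one.

```python
from collections import Counter


def majority_vote(warped_labels):
    """Fuse per-atlas label lists into one label per voxel by majority voting.

    warped_labels: list of equal-length lists, one per registered atlas.
    Ties go to the label encountered first (an assumption of this sketch).
    """
    fused = []
    for votes in zip(*warped_labels):  # votes = labels proposed for one voxel
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused


# Three toy "atlases" voting on three voxels.
toy_atlases = [
    [1, 2, 3],
    [1, 2, 4],
    [5, 2, 4],
]
toy_fused = majority_vote(toy_atlases)
```

With 33 atlases and 83 ROI labels the same loop applies unchanged; only the number of votes per voxel grows.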

We use leave-one-out cross-validation to evaluate the diagnosis performance. In each round, one subject is used for testing and the remaining subjects are used for training. On the training data, we perform an unpaired t-test on each ROI measure between autistic and normal infant subjects, and take the ROI measures with statistically significant differences as potential autism-related biomarkers. With these identified biomarkers, we perform early diagnosis of at-risk infants in the testing data. Following [20], we adopt random forest as the classifier for disease diagnosis. Instead of using just a single ratio of extra-axial fluid to the total cerebral volume as in [10], we employ the volumes of CSF, GM, and WM in each identified autism-related ROI as features to construct the diagnosis model. This process is repeated until all subjects have been used for testing.
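
A minimal sketch of the two building blocks of this procedure, the leave-one-out split and the per-ROI unpaired t statistic, is given below. It assumes Student's pooled-variance form of the test (the text does not say whether equal variances are assumed), and it stops at the t statistic: converting it to a p-value and training the random forest are left out. All names and the toy numbers are illustrative.

```python
import math
from statistics import mean, variance


def unpaired_t(group_a, group_b):
    """Two-sample t statistic with pooled variance (assumed test variant)."""
    na, nb = len(group_a), len(group_b)
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled * (1 / na + 1 / nb))


def leave_one_out(n_subjects):
    """Yield (training indices, test index) so each subject is tested once."""
    for test_idx in range(n_subjects):
        train_idx = [i for i in range(n_subjects) if i != test_idx]
        yield train_idx, test_idx


# Toy ROI volumes for two small groups (made-up numbers).
toy_t = unpaired_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
toy_splits = list(leave_one_out(3))
```

Running the t-test inside each training fold, as the text describes, keeps the held-out subject from influencing which ROIs are selected as features.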

3 Experimental Results and Conclusions

Segmentation Results on 18 6-Month-Old Infant Subjects from NDAR.

We first make comparisons with state-of-the-art methods on 18 6-month-old infant subjects from NDAR. Among these competing methods, SegNet [21] has achieved promising results in natural image segmentation, U-Net [12] has achieved the best performance on the ISBI 2012 EM challenge dataset, and Bui et al.’s method [22] won the 1st prize in iSeg-2017. For all the methods, we perform the same 6-fold cross-validation. In each fold, we use 12, 3, and 3 subjects as training, validation, and testing datasets, respectively. The results on a testing subject by different methods are shown in Fig. 4. The inner surface and cortical thickness estimated by the proposed method are much more consistent with the ground truth from manual segmentation. We further quantitatively evaluate the results by using the Dice ratio and the Modified (95th percentile) Hausdorff Distance (MHD). As shown in Table 1, our proposed method achieves significantly better performance in terms of Dice ratio on WM and GM, and MHD on WM. We then apply our trained model to 59 other infants (29 autistic and 30 normal subjects, balanced in sex and age) and further perform ROI-based volumetric measurement and diagnosis as detailed below.
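
For reference, the Dice ratio used above for one tissue label can be computed as in the following minimal sketch, where the segmentations are flattened to lists of integer labels. The function name and the toy labels are illustrative, and returning 1.0 when the label is absent from both segmentations is a convention chosen for this sketch. The MHD is not shown.

```python
def dice_ratio(pred, truth, label):
    """Dice ratio 2|A ∩ B| / (|A| + |B|) for one tissue label.

    pred, truth: equal-length sequences of integer labels.
    """
    size_pred = sum(1 for p in pred if p == label)
    size_truth = sum(1 for t in truth if t == label)
    overlap = sum(1 for p, t in zip(pred, truth) if p == label and t == label)
    if size_pred + size_truth == 0:
        return 1.0  # label absent from both: treated as perfect agreement
    return 2.0 * overlap / (size_pred + size_truth)


# Toy example with hypothetical labels 0 = CSF, 1 = GM, 2 = WM.
toy_pred = [1, 1, 2, 2, 0]
toy_truth = [1, 2, 2, 2, 0]
toy_wm_dice = dice_ratio(toy_pred, toy_truth, label=2)
```

A value of 1 means the automatic and manual segmentations agree exactly for that tissue, and 0 means they do not overlap at all.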

Fig. 4.

Comparison with state-of-the-art methods on 18 6-month-old infant subjects from NDAR. The first and second rows show the inner surface and corresponding cortical thickness, respectively, with the zoomed views shown in the third and fourth rows. From left to right: results by SegNet [21], U-Net [12], Bui et al.’s method [22], proposed method, and ground truth from manual segmentation. Color bar indicates the thickness in mm.

Table 1. Segmentation accuracy on 18 6-month-old infant images, and diagnosis accuracy on 59 other infant subjects (29 autistic subjects and 30 normal controls) from NDAR. The bold values indicate that our proposed method is significantly better than the others, with p-value < 0.005.

Diagnosis/Prediction on 6-Month-Old Infant Subjects from NDAR.

Based on the brain parcellation results, we then perform an unpaired t-test on each ROI between the 29 autistic and 30 normal subjects, whose diagnosis was established later, at 24 months of age. Between these two groups, we find statistically significant differences in many ROIs (p-value < 0.005), e.g., right precentral gyrus, left hippocampus, cingulate gyrus, cuneus, and lateral occipital cortex. Most of the autism-related regions identified by our method are consistent with previous reports on children and young adults. For example, it was found in [23] that ASD is associated with decreased structural connectivity and functional connectivity in the lateral occipital cortex. This disruption may impair the integration of visual communication cues in individuals with ASD, thereby affecting their social communication. In [4], it was found that children (7.5–12.5 years of age) with autism had a larger left hippocampus than the controls. Although the ages of the studied subjects are different, these results encourage us to explore the individualized diagnostic value of such potential biomarker regions. The last column of Table 1 provides the classification accuracy (ACC) and the area under the ROC curve (AUC) using leave-one-out cross-validation. For a rough comparison, [20] reported 0.87 ACC and 0.96 AUC on 19 autistic and 19 normal infants. In addition, we perform ROI-based volumetric measurements based on the segmentations produced by the other methods in Table 1. We found that far fewer significant regions were identified based on their segmentations. The corresponding ACC and AUC of the other methods are also provided in the last column of Table 1. Our method achieves the highest diagnosis/prediction accuracy, which also reflects the high segmentation accuracy achieved by the proposed ADU-Net.

To conclude, for the first time, this paper presents a novel volume-based analysis on 6-month-old infant subjects at risk of autism. We first propose an anatomy-guided and densely-connected convolutional neural network for accurate tissue segmentation. Based on the accurate segmentations, we then perform brain parcellation and statistical analysis to identify significantly different regions between autistic and normal infant subjects for final classification. Comparisons with state-of-the-art methods demonstrate the advantages of our proposed method in terms of both segmentation accuracy and diagnosis accuracy. Our future work will include improving the diagnosis accuracy by using surface-based features, e.g., cortical thickness and surface area, and further validation on more infant subjects.