Introduction

Attention deficit hyperactivity disorder (ADHD) is characterized by inappropriate levels of inattention, hyperactivity, and/or impulsiveness. ADHD has a great impact on affected children and their families in terms of academic, social, and behavioral dysfunction [28, 29], and is currently the most common neurodevelopmental disorder of childhood, affecting 5% of children worldwide [34]. ADHD symptoms are likely to be continuously distributed from childhood through adolescence, with ADHD at the extreme tail of the distribution [16, 21, 23, 24, 35]. Typical presentations of ADHD symptoms in childhood are premature changes of activity, restlessness when calm is expected, distraction by the environment, forgetfulness, acting out of turn, intrusions on peers, and thoughtless rule-breaking [39].

Community studies on the development of ADHD symptoms in childhood report somewhat mixed findings. A number of them show decreases in ADHD symptoms. For example, in an American longitudinal sample of 6- to 20-year-old boys, ADHD symptoms declined with increasing age, with hyperactivity symptoms declining at a higher rate than inattention symptoms [6]. In a sample of 8- to 17-year-old Swedish twins levels of inattention remained relatively constant, whereas levels of hyperactivity-impulsivity declined with increasing age [19]. Another American longitudinal study showed that levels of ADHD symptoms were generally constant until the teen years, and declined from there [30]. Similarly, mean levels of ADHD symptoms decreased after age 10 in Dutch twins [36] and singletons [7]. An Australian general population sample showed only minimal age differences in the number of ADHD symptoms in children aged 5–11 years [15]. The results of these studies may differ for various reasons, such as the use of different ADHD measures. Nonetheless, the general picture seems to be that the development of ADHD symptoms is relatively stable in childhood with a possible decrease of symptoms starting around the age of 10 years.

The development of ADHD symptoms can also be investigated by examining subgroups with distinct developmental trajectories. Only a few studies have analyzed different developmental trajectories of ADHD symptoms in school-age children. Two trajectories were identified in a high-risk sample of American children aged 7–16 years of families with parental alcoholism: one with stable low levels, and one with stable high levels, the latter containing 57% of the children [17]. In a sample of Canadian boys from low socioeconomic areas, four trajectories of hyperactivity were identified from 6 to 15 years [32]. Roughly 6% of the children in this study followed a chronic high trajectory. The other children followed low or decreasing trajectories. In a general population sample of Dutch children aged 4–18 years, four developmental trajectories of ADHD symptoms were estimated, among which was a high trajectory with increasing scores into late childhood [43]. Three trajectories of ADHD symptoms were identified in a sample of children aged 8–14 years who were selected from high-risk schools: one with minimal problems, one that showed an increase and then a decrease in symptoms, and one that showed a decrease and then a slight increase in symptoms [27]. Recently, two hyperactivity-impulsivity trajectories (low, high-decreasing) and two inattention trajectories (low, high-increasing) were found in a population-based twin study [18]. In summary, these studies identified two to four trajectories of ADHD symptoms. Subgroups with specific developmental trajectories of ADHD symptoms should be investigated more thoroughly by using large representative samples of school-age children.

ADHD symptoms in twins and singletons

In the current study, data from twins were analyzed to estimate developmental trajectories of Attention Problems. Twin data are frequently used to study the heritability of ADHD symptoms, which usually varies between 50 and 80% [11, 14, 35, 36, 40, 49]. An important assumption of twin studies is that the results can be generalized to the general population, which mainly includes singletons. The comparability of twins to singletons is however still being questioned for various reasons, such as more pre- and perinatal problems among twins that could result in a higher prevalence of behavioral problems in twins than in singletons [37]. However, many of these problems such as low birth weight and preterm birth are unlikely to have the same significance in twins as in singletons, as the etiology of these risk factors appears to be different in the two groups [33].

Despite the uncertainty about the representativeness and possibly increased vulnerability of twins, research on twin-singleton differences in ADHD symptoms is sparse and has been cross-sectional so far. To our knowledge, there are three cross-sectional studies that compared levels of ADHD symptoms between twins and singletons. An Australian study found more ADHD symptoms in twins than in singletons aged 4–12 years [20], while a study of 2- and 3-year-old Dutch twins found that twins showed slightly lower levels of ADHD symptoms than singletons [42]. In line with this study, an American study of 12- to 19-year-old twins and their non-twin siblings found some evidence for a higher prevalence of ADHD in the non-twin siblings, although this result was not consistently observed for all age groups and in both sexes [12]. Whether ADHD symptoms develop differently over time for twins versus singletons has not yet been investigated.

The present study

In the current study we extend the findings of a recent study of 7-, 10- and 12-year-old boys from the Netherlands Twin Register, which showed three mainly quantitatively different latent classes of Attention Problems at each age, i.e., high-, moderate-, and low-scoring classes [23]. Our first aim was to identify subgroups of children with specific developmental trajectories of Attention Problems from ages 6 to 12 years. We expected to find a minimum of three relatively stable trajectories (e.g., high-, moderate-, and low-scoring) with the majority of children having low levels of Attention Problems. Trajectories were expected to reflect slightly decreasing levels of Attention Problems late in childhood, as self-regulation increases with the onset of puberty [3]. The second aim of this study was to investigate whether singletons follow trajectories similar to those of twins. As most twins are in general physically healthy individuals who grow up under normal circumstances, we did not expect to find large differences between twins and singletons. Finally, it has been well established that ADHD symptoms are more prevalent in boys than in girls [5, 34]. Because we had two large samples of children, we were able to investigate the development of Attention Problems separately for boys and girls.

Methods

Subjects

Twin sample

All participating twins were volunteer members of the Netherlands Twin Register (NTR). The NTR represents a twin family sample that is largely representative of the Dutch general population [4]. For the present study, data from twins born between 1986 and 1998 were analyzed. Parents and teachers of twins received surveys by mail around the twins’ 7th, 10th and 12th birthdays. The exact ages (in years) of the twins at the time of completion of the surveys were calculated from the twins’ dates of birth and the dates of completion of the surveys. The response rate at each measurement was 61–63% for mother reports. About 50% of the parents gave written permission to approach the teacher, and the subsequent teacher response rate was 74–78%. Attrition analyses revealed that, at ages 7 and 10, the level of socio-economic status (SES) was higher in families that returned the survey than in families that did not return the survey [9]. Also, twins had higher levels of Attention Problems at ages 7 and 10 when the parents did not respond at the previous target age [9]. However, the effect sizes were small, and it is therefore unlikely that attrition in the Netherlands Twin Register strongly affected the results.

Twin pairs were excluded if they suffered from a severe handicap that interfered with daily functioning. Maternal ratings were available for 9,432 male twins and 9,718 female twins. A total of 6,219 twins were part of an opposite-sex pair and were all included in the analyses. There were 6,338 twins from same-sex male pairs, and 6,748 twins from same-sex female pairs. Since data obtained from twin pairs are not independent, one twin was randomly selected from each same-sex twin pair. It was not necessary to randomly select one twin from the opposite-sex twin pairs, since data from boys and girls were analyzed separately. We excluded 215 twins for whom information on SES was unknown. The final twin sample consisted of 6,161 boys and 6,325 girls. For 42% of this sample data were available from one assessment, for 29% from two assessments, and for 29% from three assessments. The smaller proportion of children with two or three assessments partly reflects the longitudinal design of the study, since not all children had reached ages 10 and 12 years by the time we ran our analyses.

For 3,506 boys and 3,673 girls, teacher ratings were available as well. For 71% of this sample teacher data were available from one assessment, for 26% from two assessments, and for 3% from three assessments.

Singleton sample

The data from singletons that were analyzed in this study came from the Zuid-Holland study, an ongoing longitudinal study of behavioral and emotional problems that started in 1983. The sample (N = 2,600) was randomly drawn from municipal registers that list all residents in the Dutch province of Zuid-Holland, and represents a general population [45]. Written informed consent was obtained after complete description of the study to the subjects. After the first measurement in 1983, the respondents were approached biennially. The current study uses data from Time 1 (1983) to Time 5 (1991). Response rates ranged from 80 to 85% at each measurement. All children who were between 6 and 12 years at any assessment (i.e., born between 1971 and 1979) were included (N = 662 boys; N = 684 girls). Singleton data that were fully contemporaneous to the twin data were not available. However, the first assessment of twins born in 1986 was only 2 years removed in time from the fifth assessment of the Zuid-Holland study (i.e., 1993, and 1991, respectively). Attrition analyses on all participants of the Zuid-Holland study showed that dropouts had lower SES. However, dropouts did not have higher levels of behavioral problems based on the Total Problems scale of the Child Behavior Checklist [7].

Because of the selected age range and the design of the Zuid-Holland study with assessments every two years, longitudinal data could be used from a maximum of four assessments (e.g., a child who was 6 years old at Time 1 was 12 years old at Time 4). For 39% of the sample, there were data from one assessment, for 26% from two assessments, for 23% from three assessments, and for 12% from four assessments. Most of the children for whom data were available from just one assessment were already 11 or 12 years old at Time 1. Teacher ratings were available for 580 boys and 631 girls, and were obtained at Time 1, Time 3, Time 4, and Time 5. No information from teachers was obtained at Time 2 owing to financial constraints. The teacher response rates were above 70% at each assessment. For 59% of this sample teacher data were available from one assessment, for 29% from two assessments, and for 12% from three assessments. By design, teacher data were available from just one assessment for children who were between 9 and 12 years old at Time 1.

Measures

Attention Problems

For both twins and singletons, maternal ratings were collected with the Attention Problems (AP) subscale of the Child Behavior Checklist (CBCL/4-18) [1, 46]. This scale includes 11 items such as “can’t sit still”, “daydreams”, and “can’t concentrate”. It includes features of inattention, hyperactivity, and impulsivity. All items were scored on a three-point scale, reflecting the occurrence of behavioral problems during the preceding 6 months: 0 if the item was not true, 1 if the item was somewhat or sometimes true, and 2 if the item was very true or often true. The 2-week test–retest correlation and the internal consistency of the AP scale are 0.83 and 0.67, respectively [1, 46]. Teacher ratings were collected using the Teachers’ Report Form (TRF) [2, 47]. Teachers were instructed to rate the child’s behavior over the preceding 2 months. The AP subscale of the TRF consists of 20 items with the same response categories as the CBCL. The 6-week test–retest correlation is 0.83. The internal consistency coefficients are 0.90 in boys and 0.92 in girls [2, 47]. The TRF includes extra items that capture situational-specific behaviors, such as “difficulty following directions”, and “messy work”. Ten items of the CBCL-AP scale and the TRF-AP scale overlap.

Socio-economic status

For the twin sample, SES was either obtained from a full description of the occupation of the parents and subsequently coded [8], or obtained by a nine-category classification scheme for occupations [13], combined with information on parental education. This information was recoded into three SES levels (i.e., low, middle, and high). For the singleton sample, SES was scored on a six-step scale of parental occupation [44], and was also recoded into three SES levels to allow comparison with the twin sample.

Data analysis

In order to compare growth trajectories between twins and singletons, the singleton data were reordered as a function of chronological age instead of survey year. This was done by creating age-dependent variables, equal to the ones used in the twin sample, resulting in a larger dataset with values that were missing by design [31]. To determine trajectories of mother-rated AP, growth mixture modeling (GMM) in Mplus Version 5 [31] was used, separately for twins and singletons, and separately for boys and girls. The trajectories were determined by latent growth factors, which model the intercepts and slopes of the individual growth trajectories. Models were tested with linear as well as quadratic effects. The latter represent a curvilinear development over time (e.g., first increasing, then decreasing). The trajectories were estimated using maximum likelihood with robust standard errors (MLR), which is robust to non-normality of the scores. MLR is similar to the full information maximum likelihood (FIML) method, in which missing data are not imputed, but parameters and standard errors are estimated directly using all the observed data [50].
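The reordering of the singleton data from survey wave to chronological age can be illustrated with a small sketch. The function name `reorder_by_age`, the record layout, and the example scores are hypothetical and are not taken from the study; only the biennial survey years (1983 to 1991) and the 6 to 12 year age window follow the text. Age is approximated here as survey year minus birth year, which is an assumption.

```python
# Hypothetical sketch: restructure wave-indexed scores into age-indexed
# variables, leaving ages that were never observed as missing by design.
SURVEY_YEARS = [1983, 1985, 1987, 1989, 1991]  # Time 1 .. Time 5 (from the text)
AGES = range(6, 13)                            # age window 6-12 years


def reorder_by_age(birth_year, scores_by_wave):
    """Map one child's wave-ordered scores onto age-ordered slots.

    scores_by_wave: list of five scores (or None), one per survey year.
    Age is approximated as survey year minus birth year (an assumption).
    """
    by_age = {age: None for age in AGES}
    for year, score in zip(SURVEY_YEARS, scores_by_wave):
        age = year - birth_year
        if age in by_age and score is not None:
            by_age[age] = score
    return by_age


# A child born in 1977 is 6, 8, 10, 12 and 14 years old at the five waves;
# the age-14 score falls outside the window and is dropped.
child = reorder_by_age(1977, [3, 4, 2, 5, 1])
print(child)
```

With biennial waves, each child contributes at most four values to the seven age slots, so the remaining slots are missing by design rather than through non-response.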

Models were fit with an increasing number of classes and different within-class structures (i.e., linear and quadratic growth). There is a trade-off between within-class model complexity and number of classes, in that more classes can compensate for a less complex within-class structure [25, 26]. Models with increasing numbers of classes cannot be compared with likelihood ratio tests, since in that case the test statistic does not follow a chi-squared distribution. Therefore, the optimal number of classes, and the decision between linear versus quadratic growth, was determined by the model with the smallest Bayesian information criterion (BIC). In case of small BIC differences, the more parsimonious model was chosen.
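As a minimal sketch of the BIC comparison described above, the snippet below uses the standard definition BIC = -2 log L + p ln(n). The log-likelihoods and parameter counts are invented for illustration and are not the values in Table 1; the sample size is set to the number of twin boys only to give a realistic scale.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: -2*logL + p*ln(n); smaller is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


# Hypothetical fit results: (log-likelihood, number of free parameters).
# These numbers are made up for illustration only.
N_OBS = 6161
candidates = {
    "2-class linear": (-32500.0, 9),
    "3-class linear": (-32100.0, 12),
    "3-class quadratic": (-32095.0, 16),
    "4-class linear": (-32090.0, 15),
}

bics = {name: bic(ll, p, N_OBS) for name, (ll, p) in candidates.items()}
best = min(bics, key=bics.get)
for name, value in sorted(bics.items(), key=lambda kv: kv[1]):
    print(f"{name:18s} BIC = {value:10.2f}")
print("smallest BIC:", best)
```

The sketch only picks the numerically smallest BIC. The additional rule in the text, preferring the more parsimonious model when BIC differences are small, is a judgment call that is not encoded here.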

To test for twin-singleton differences in the mean intercepts and slopes of the trajectories, we fit a mixture model with the optimal number of classes simultaneously for twins and singletons. We used a group dummy variable indicating twin versus singleton as known class membership such that, effectively, a multi-group model was fitted with a mixture model within each group. The mean intercepts, mean slopes, and intercept variances were then separately tested for equality by constraining them to be equal between twins and singletons (i.e., three tests with df = 3 per sex). These tests were evaluated with scaled chi-square tests using the log-likelihood values. Differences in class proportions between twins and singletons were tested by means of a standard chi-square test for cross-tables. To control for SES differences between the samples, the latent growth factors were regressed on SES. Also, class membership was regressed on SES, so that SES predicted the log odds of the probability of belonging to a given class compared with the probability of belonging to another class. Because the models were estimated conditional on SES, families without data on SES had to be excluded from the analysis.
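The text does not spell out how the scaled chi-square tests were computed. The sketch below assumes the log-likelihood-based scaled difference test that is commonly used with robust maximum likelihood estimation, in which the difference in log-likelihoods is divided by a correction term built from the scaling correction factors of the two models. The function name and all input values are hypothetical.

```python
def scaled_difference_test(ll_nested, c_nested, p_nested, ll_full, c_full, p_full):
    """Scaled chi-square difference test based on log-likelihoods.

    ll_*: log-likelihood of each model
    c_*:  scaling correction factor reported with the robust estimator
    p_*:  number of free parameters
    Returns (test statistic, degrees of freedom).
    """
    df = p_full - p_nested
    # Scaling correction for the difference test
    cd = (p_nested * c_nested - p_full * c_full) / (p_nested - p_full)
    trd = -2.0 * (ll_nested - ll_full) / cd
    return trd, df


# Hypothetical example: constraining three class-specific means to be equal
# between twins and singletons removes three free parameters.
stat, df = scaled_difference_test(
    ll_nested=-1000.0, c_nested=1.20, p_nested=10,
    ll_full=-998.0, c_full=1.25, p_full=13,
)
print(f"scaled chi-square({df}) = {stat:.3f}")
```

The resulting statistic is referred to a chi-square distribution with df equal to the number of constrained parameters, which is 3 for each of the tests described above.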

Because few children had multiple assessments with teacher ratings, trajectories of teacher-rated AP could not be examined. Instead, we analyzed the age-specific mean scores with SPSS 15. To test for twin-singleton differences, analysis of variance (ANOVA) was performed using two fixed factors (i.e., twin/singleton status and SES). For these analyses, a significance level of p < 0.01 was chosen.

Results

A total of 2,665 twins (21%) had low SES, 8,401 twins (67%) had middle SES and 1,420 twins (12%) had high SES. A total of 734 singletons (55%) had low SES, 392 singletons (29%) had middle SES, and 220 singletons (16%) had high SES. Twins and singletons were not evenly distributed over the three SES levels (χ²(2) = 848.26, p < 0.001).
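The test of the SES distribution can be retraced from the counts given above. The following sketch computes a standard Pearson chi-square statistic for the 2 x 3 cross-table; it assumes that this is the test that was used, and the helper name `pearson_chi_square` is introduced here for illustration.

```python
# SES counts as reported in the text (low, middle, high)
twins = [2665, 8401, 1420]
singletons = [734, 392, 220]


def pearson_chi_square(table):
    """Pearson chi-square statistic and df for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of sample and SES level
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


chi2, df = pearson_chi_square([twins, singletons])
print(f"chi-square({df}) = {chi2:.2f}")
```

The statistic obtained this way should agree closely with the reported value of 848.26 on 2 degrees of freedom. Most of it comes from the singleton cells, where low SES is over-represented relative to the pooled distribution.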

Table 1 shows the model fit statistics for the linear and the quadratic models for twins. The models were fit with within-class intercept variability, whereas all slope factor variances were fixed to zero. The intercept variances were constrained to be equal between the classes in all models. Estimating nonzero slope variances and class-specific intercept variances resulted in convergence problems for models with more than three classes, which is often an indication of overfitting (i.e., the fitted model is overly complex). For both boys and girls, a three-class linear model was the best fitting model given the Lo-Mendell-Rubin likelihood ratio test (LMR-LRT), the BIC, and model parsimony. The quadratic models did not fit convincingly better than the linear models, as indicated by minimal BIC differences. More specifically, the BIC differences between the linear and the quadratic models were smaller than the BIC differences between the models with a different number of classes.
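The model structure described above can be written out as follows. This is a sketch in generic growth-mixture notation; the symbols and the centering of age at 6 years are assumptions made here for illustration and are not taken from the study.

```latex
% Linear growth mixture model for child i at age a_t, given class c_i = k
\begin{align}
  y_{it} \mid (c_i = k) &= \eta_{0i} + \eta_{1i}\,(a_t - 6) + \varepsilon_{it}, \\
  \eta_{0i} \mid (c_i = k) &= \alpha_{0k} + \zeta_{0i}, \qquad \zeta_{0i} \sim N(0, \psi), \\
  \eta_{1i} \mid (c_i = k) &= \alpha_{1k}.
\end{align}
```

Here $\alpha_{0k}$ and $\alpha_{1k}$ are the class-specific intercept and slope means, $\psi$ is the intercept variance held equal across classes, and the absence of a residual term in the slope equation corresponds to a slope variance fixed to zero. A quadratic model would add a term $\eta_{2i}\,(a_t - 6)^2$ with class-specific mean $\alpha_{2k}$.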

Table 1 Growth mixture modeling model fit statistics for twins

Table 2 and Figs. 1 and 2 show the results for twins and singletons combined, i.e., linear three-class models, with class-specific intercept variances (BIC boys = 63,945.99; BIC girls = 60,683.19). The three classes differed with respect to the intercept and slope means. The results were very similar for boys and girls. The three classes were (1) stable low (boys: 71% twins, 64% singletons; girls: 64% twins, 62% singletons); (2) low-increasing (boys: 15% twins, 15% singletons; girls: 16% twins, 18% singletons), with children whose AP scores were initially low but increased with age; and (3) high-decreasing (boys: 14% twins, 21% singletons; girls: 20% twins, 20% singletons), with children whose AP scores were initially high and decreased with age.

Table 2 Model results for the three-class linear model for twins and singletons

Fig. 1 Trajectories of mother-rated Attention Problems for boys

Fig. 2 Trajectories of mother-rated Attention Problems for girls

The intercept means of the three classes were equal between twins and singletons (boys: χ²(3) = 0.90, p = 0.83; girls: χ²(3) = 0.70, p = 0.87). The slope means were equal between twins and singletons (boys: χ²(3) = 4.82, p = 0.19; girls: χ²(3) = 1.04, p = 0.79), and the intercept variances were also equal between twins and singletons (boys: χ²(3) = 0.59, p = 0.90; girls: χ²(3) = 1.83, p = 0.61). Finally, the class proportions were not different for twins and singletons (boys: χ²(2) = 1.76, p = 0.41; girls: χ²(2) = 0.15, p = 0.93).
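The p values above follow from the reported statistics and their degrees of freedom. As a check, the sketch below evaluates the upper tail of the chi-square distribution using the closed-form expressions that exist for 2 and 3 degrees of freedom; the function name `chi2_sf` is introduced here for illustration. Because the statistics are reported to two decimals, small rounding differences are possible.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (df = 2 or 3 only)."""
    if df == 2:
        return math.exp(-x / 2.0)
    if df == 3:
        return (math.erfc(math.sqrt(x / 2.0))
                + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))
    raise ValueError("closed form implemented for df = 2 and df = 3 only")


# (label, statistic, df, p value as reported in the text)
reported = [
    ("intercept means, boys", 0.90, 3, 0.83),
    ("intercept means, girls", 0.70, 3, 0.87),
    ("slope means, boys", 4.82, 3, 0.19),
    ("slope means, girls", 1.04, 3, 0.79),
    ("intercept variances, boys", 0.59, 3, 0.90),
    ("intercept variances, girls", 1.83, 3, 0.61),
    ("class proportions, boys", 1.76, 2, 0.41),
    ("class proportions, girls", 0.15, 2, 0.93),
]

for label, stat, df, p_reported in reported:
    p = chi2_sf(stat, df)
    print(f"{label:27s} chi2({df}) = {stat:4.2f}  p = {p:.2f}  (reported {p_reported:.2f})")
```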

Teacher ratings

Table 3 presents teacher-rated AP mean scores, which are corrected for SES differences between twins and singletons. For boys, ANOVA showed that there were no main effects of twin/singleton status on levels of AP (all p values >0.01). For girls, twins had significantly lower AP scores at age 12 than singletons (p < 0.001). There were no main effects of twin/singleton status on AP scores for ages 6–11 years.

Table 3 Estimated means for teacher-rated Attention Problems corrected for SES

Discussion

In this longitudinal study, we identified three linear trajectories of mother-rated Attention Problems in boys and girls from 6 to 12 years: stable low, low-increasing and high-decreasing symptom levels. Most of the children followed the stable low trajectory, which is what we hypothesized. Further, we expected two other stable trajectories with a possible decrease late in childhood. Instead of these stable trajectories, we found a low-increasing trajectory, and a high-decreasing trajectory.

Our findings are in agreement with the study by Malone et al. [27], which identified three trajectories that, when considering children from middle childhood until early adolescence, included increasing and decreasing classes. Van Lier et al. [43] also identified a high-increasing trajectory in children from the Zuid-Holland Study using the DSM-oriented ADHD scale of the CBCL. Two earlier studies that reported a stable high trajectory included children with already elevated risk (i.e., parental alcoholism and low socioeconomic position) [17, 32]. It is possible that a stable high trajectory is present only in high-risk populations, or in populations characterized by low use of pediatric health care. In the US, low-socioeconomic groups make less use of pediatric care, whereas there is no association between SES and help seeking in the Netherlands, where there are no major financial constraints to receiving professional help [51]. It could also be that a stable high trajectory appears only in children with both attention deficits and hyperactivity problems. In a general population study, it may be more likely that a decreasing trajectory appears, in accordance with the theory that attention deficits diminish as self-regulation increases, and in response to adequate treatment. These reasons may explain why a stable high trajectory was not identified in our study.

It might be argued that a fourth class should be included, both for the boys and the girls. However, an additional class did not provide additional information, and appeared to split the high-decreasing trajectory into two ordered classes. This seems to be an example of a so-called ‘indirect interpretation of mixture models’ where classes do not represent different types of subjects, but rather approximate different parts of the joint distribution of observed data. Not including the fourth class does not change the conceptual interpretation of the modeling results.

Mean parent-rated AP scores larger than 9 (up to age 11) or 10 (age 12) are in the subclinical range, and scores larger than 12 (up to age 11) or 13 (age 12) are considered clinical [1]. None of the trajectories exceeded these levels at any age. A post hoc analysis among the boys showed that 2.3% of the twins and 4.5% of the singletons had mean AP scores of 9 or higher on at least two assessments. About two-thirds of these children were assigned to the high-decreasing trajectory. Since DSM diagnoses of ADHD were not available, we could not investigate whether children with specific ADHD subtypes would tend to be in either the low-increasing or the high-decreasing class. However, an earlier study found that children with a low AP score obtained a negative ADHD diagnosis in 96% of the cases [10]. Furthermore, children with a high AP score obtained a positive diagnosis in 59% (boys) and 36% (girls) of the cases. Since hyperactivity tends to decrease over time [18], we hypothesize that children with the hyperactive-impulsive or the combined type of ADHD would be overrepresented in the high-decreasing trajectory. Children on the low-increasing trajectory seem at risk for having high levels of ADHD symptoms later in childhood. As this risk may arise from a combination of several genetic, biological and environmental factors [18, 38], further research is needed to identify specific predictors of the trajectories.

Linear growth provided the best description of the development of Attention Problems over the observed period. Attention deficits may increase during childhood as academic demands, such as demands on impulse control and response inhibition, increase. Linearity does not, however, mean that the regression line will go up indefinitely, but only that linear models best describe the observed period (6–12 years). With longer follow-up of these children, a quadratic model could provide support for declining levels of Attention Problems in adolescence.

The second aim of this study was to investigate whether similar trajectories could be identified in singletons. For both boys and girls, singletons followed the same three trajectories as twins, with similar class proportions. The mean intercepts and slopes of the trajectories did not differ between twins and singletons. Therefore, we conclude that twins and singletons are comparable with respect to the development of ADHD symptoms in childhood. The findings from the teacher ratings support this conclusion, as we observed no consistent differences in the mean AP scores between twins and singletons. This conclusion supports the generalizability of twin studies to singleton populations with regard to ADHD symptoms in middle and late childhood. Our findings are in agreement with a cross-sectional twin-singleton comparison, in which twins were compared to their non-twin siblings, that reported no consistent differences with respect to the prevalence of ADHD [12].

This is the first study that investigates trajectories of Attention Problems in middle childhood in the general population. Strengths of the study are the use of prospective data over a 6-year period, the representativeness of the samples, large sample sizes, and the use of advanced person-centered statistical analyses. Nevertheless, several limitations of this study must be considered. First, there was a modest association of non-response with SES, which may have led to underestimating the proportion of children in the high-decreasing and the low-increasing trajectories, especially in the twin sample. Also, the twin and singleton samples differed with respect to SES, with a higher proportion of twins from higher SES backgrounds. The singleton sample consists of families who were randomly selected from municipal registers, after which participation was strongly pursued, e.g., by means of home-visits, making participants (especially those from low SES) more likely to participate. In contrast, the twin sample depends on voluntary participation and families are encouraged to remain on the register, even when they do not take part in each survey for which they are approached. Second, the twin and singleton samples comprised different cohorts. For twins, birth cohort did not predict mean AP scores at ages 7, 10 and 12 [9]. For singletons, an earlier study did not find evidence for secular changes in parent-rated AP over a 10-year period (1983–1993), but small secular changes were reported for teacher-rated AP [48]. Similarly, small increases in Dutch children’s parent- and teacher-rated AP scores were found over a 20-year period (1983–2003) [41]. As these differences were very small (Cohen’s d < 0.2), it is unlikely that cohort effects confound our findings. Third, the twin and singleton samples were recruited from different regions of the country (data collection is nation-wide for twins, whereas for singletons a specific part of the country is included). However, an earlier study showed there were no significant differences in CBCL scale scores between children living in Zuid-Holland and children living elsewhere in The Netherlands [41].

Conclusion

In conclusion, the development of Attention Problems in boys and girls from age 6 to 12 years can be characterized by stable low, low-increasing, and high-decreasing developmental trajectories. Our findings confirm that twins are not a more vulnerable group than singletons with respect to the development of Attention Problems in childhood, and that results from twin studies regarding ADHD symptoms can be generalized to singleton populations. As our results and interpretations apply only to children in the age range 6–12 years, future research should extend our findings by describing trajectories of Attention Problems from childhood to adulthood.