Introduction

The supports and interventions necessary for children with autism spectrum disorder (ASD) are multifaceted and substantial and have considerable economic implication for families, social services systems, healthcare and education systems (Knapp et al. 2009). The need for support and intervention often continues beyond childhood, and outcomes for adults are often poor in many aspects of life, even for most high-functioning children with ASD (Barnhill 2007; Cimera and Cowan 2009). It is therefore essential that interventions for children with ASD should be highly effective, that the effects maintain, and that skills are generalized outside the intervention context.

Interventions for people with ASD are dominated by behavioral approaches with limited consideration of participant cognition. It has been suggested that a wider range of approaches, including interventions with a cognitive approach that address cognitive skills and changes in cognition, should be researched to suit the wide diversity of needs of children with ASD (Duffy and Healy 2011; Kasari and Lawton 2010; Luckett et al. 2007). Further, interventions for people with ASD should include a focus on generalization of skills, an area that has long been identified as problematic in learners with ASD (McGee et al. 1983; Rincover and Koegel 1975).

One promising alternative to behavioral approaches for children with ASD is the cognitive-behavioral approach (CBA) (Volker and Lopata 2008; Wang and Spillane 2009). CBA is considered a further development of traditional behavioral strategies integrated with cognitive therapy, having an emphasis on social cognition, and facilitating behavioral changes through cognition (Beck and Fernandez 1998; Dobson and Dozois 2010). CBA has been extensively adopted and is one of the most researched forms of psychotherapy (Butler et al. 2006). When using CBA with typically developing children, therapists focus on the interaction of the learning process, the social environment, and the centrality of information-processing factors (Braswell and Kendall 1988). CBA has been applied successfully to modify children’s impulsive behaviors, to teach academically relevant tasks, as well as to improve problem solving, role taking, and self-control abilities (Little and Kendall 1979; Meichenbaum and Asarnow 1979; Meichenbaum and Goodman 1971).

There have been a number of studies on CBA interventions for adults with ASD; all reported encouraging results and some reported maintenance and generalization of treatment gains, specifically, reduction in social anxiety and in the avoidance of social situations (Cardaciotto and Herbert 2004; Dansey and Peshawaria 2009; Hare 1997; Weiss and Lunsky 2010). In addition, CBA is considered to have the potential to facilitate the generalization of skills in children with ASD (Chalfant et al. 2007; Sofronoff et al. 2007).

People with ASD may have difficulties with cognitive processing such as poor executive functioning, cognitive inflexibility, weak central coherence and impaired theory of mind (Baron-Cohen et al. 1985; Happé and Frith 2006; Hill 2004). Since these cognitive impairments are suggested to be intricately interlinked, interventions that directly address cognition as well as behavior (e.g., CBA interventions) may offer the opportunity for generalized improvement in adaptive functioning (Greig and MacKay 2005; Klinger and Williams 2009).

A number of literature reviews have included CBA interventions for children with ASD. Generally, reviewers have concluded that although there is extremely limited high-quality empirical evidence, CBA appeared to be a feasible and effective treatment for specific mental health problems of children with high-functioning ASD. To date, some reviews provided narrative analyses (Anderson and Morris 2006; Donoghue et al. 2011; Moree and Davis 2010; Rotheram-Fuller and MacMullen 2011), some included non-CBA interventions (Cappadocia and Weiss 2011; White et al. 2009), and some were limited to specific issues such as interventions for anxiety (Lang et al. 2010). Only two reviews of CBA interventions for children/adolescents with ASD did not restrict the included intervention studies to specific treatment targets (Kincade 2009; White 2004). These two reviews have limitations. Primarily, they were narrative and did not provide a meta-analytic synthesis of the research. Kincade (2009) reviewed 15 published journal articles but did not address important aspects of research design quality. The older White (2004) review included only two randomized controlled trials (RCTs). Neither of these two reviews included unpublished reports, with the attendant risk of publication bias. Since the review by Kincade (2009), more relevant CBA research studies have been published, and many of them are group design studies, which can provide stronger evidence regarding effectiveness (e.g., Drahota et al. 2011; Koning 2010; McNally Keehn 2010; Reaven et al. 2012; Scarpa and Reyes 2011; Sung et al. 2011; Wood et al. 2009a, b).

The present study provides a meta-analytic review of CBA interventions for children and adolescents with ASD. This review considers generalization to other settings and behaviors, as well as long-term maintenance of the intervention effects. It includes all relevant RCTs, either in published journal articles or unpublished theses, and includes an analysis of study quality.

In this review, the following research questions are addressed:

  1. What are the types and qualities of CBA intervention studies with children having a reported diagnosis of ASD?

  2. What are the characteristics of participants and intervention programs?

  3. What are the common strategies used in these interventions?

  4. What measures are used?

  5. Did the treatment effect generalize and maintain?

  6. How effective are CBA interventions?

Methodology

A search was conducted of four databases: ERIC, CINAHL, ProQuest, and PsycINFO, in December 2012 using the descriptors “autism,” “autistic,” “Asperger*,” “pervasive developmental disorders-not otherwise specified,” and “PDD-NOS” to describe the population; “cognitive-behavio*,” and “cognitive behavio*” to describe the approach; and “treatment,” “intervention,” “program,” and “training” to describe the nature of the study. A study was included in the initial listing if it met the following criteria:

  1. It was a journal article, or a dissertation / thesis at or above master’s degree, or an original study reported in a published book.

  2. It was written in English.

  3. It was an intervention study that reported outcomes.

  4. The author(s) explicitly described the intervention as cognitive-behavioral.

  5. The participants in the study were described as adolescents or children with a reported diagnosis of ASD, i.e., autistic disorder, Asperger syndrome, or pervasive developmental disorder-not otherwise specified (PDD-NOS).

  6. In the case of studies including participants with and without ASD, outcomes for participants with ASD were reported separately.

The initial search located 612 papers: 170 from ERIC, 44 from CINAHL, 191 from ProQuest, and 207 from PsycINFO. When duplicates were removed, there were 530 unique reports. All reports were screened by the first and second authors, who viewed the report titles and abstracts. A report was excluded when information in its title or abstract unequivocally indicated that any of the inclusion criteria were not met. Sixty-three reports were reserved after the first-stage screening. Overall agreement for this first-stage screening process was 93.0 %. The third author reviewed reports where the first and second authors disagreed, and all disagreements were resolved through discussion between all the authors.

In the second stage, the first and second authors independently examined the full text of all 63 reserved reports to confirm whether they fully met the inclusion criteria. Twenty-six reports were excluded, leaving 37 reports to be included. Overall agreement for this second-stage screening process was 98.4 %. All disagreements were resolved through discussion between all authors. Papers were excluded because the participants were not children (Cardaciotto and Herbert 2004; Hare 1997; Kruzynski 1998; Perry 2008; Weiss and Lunsky 2010; Zelazo 1997), they were not intervention studies reporting on outcomes (Attwood 2004; Barsky 2001; Gratton 2010), the author did not explicitly describe the intervention as cognitive-behavioral (Lerner 2013; Snell 2012), the participants did not have a reported diagnosis of ASD (Fitzgerald and Werner 1996; Grosso 2012), or because they included participants with and without ASD and outcomes for participants with ASD were not reported separately (Epp 2008; Gooding 2010; Kellner and Tutin 1995; Lim et al. 2007). The other excluded reports were either not written in English, did not report on intervention outcomes, or did not use a CBA.

To complete the search, the names of authors of the reports selected were searched in the ERIC and PsycINFO databases. Ancestral searches of the reference lists of all intervention studies and previous reviews in this area were carried out (i.e., Anderson and Morris 2006; Cappadocia and Weiss 2011; Lang et al. 2010; Rotheram-Fuller and MacMullen 2011; White 2004). Two further studies meeting the inclusion criteria were located (Ooi et al. 2008; Vickers 2002). Again, these were independently reviewed by the first and second authors, and it was agreed that both should be included. The final list included 27 articles, 6 theses, and 6 book chapters.
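The counts reported in the search and screening paragraphs above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis. All variable names are illustrative, and the count of 62 agreed decisions out of 63 is an assumption chosen only because it is consistent with the reported second-stage agreement; the source does not report the raw agreement counts.

```python
# Reports located per database, as stated in the text.
located = {"ERIC": 170, "CINAHL": 44, "ProQuest": 191, "PsycINFO": 207}
total_located = sum(located.values())

unique_reports = 530                      # after duplicate removal (reported)
duplicates_removed = total_located - unique_reports

after_first_stage = 63                    # reserved after title/abstract screening
excluded_second_stage = 26                # excluded after full-text screening
after_second_stage = after_first_stage - excluded_second_stage

added_by_further_search = 2               # author-name and ancestral searches
final_count = after_second_stage + added_by_further_search

# Final list by report type, as stated in the text.
final_by_type = {"articles": 27, "theses": 6, "book chapters": 6}


def percent_agreement(agreements, decisions):
    """Simple percent agreement between two independent screeners."""
    if decisions <= 0:
        raise ValueError("decisions must be positive")
    return 100.0 * agreements / decisions


print(total_located, duplicates_removed, after_second_stage, final_count)
print(final_count == sum(final_by_type.values()))
# ASSUMPTION: 62 of 63 agreed decisions is inferred, not reported.
print(round(percent_agreement(62, 63), 1))
```

Percent agreement does not correct for chance agreement; a chance-corrected index such as Cohen's kappa would need the full cross-tabulation of screening decisions, which the text does not provide.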

Then, both the first and second authors independently reviewed the methodologies of all the reports shortlisted and classified the research design. Twelve reports, based on ten studies, were identified as RCTs. One thesis and two journal articles reported on the same study (i.e., Drahota 2008; Drahota et al. 2011; Wood et al. 2009a), and were treated as a single study. The agreement for identification of the RCTs was 100 %.

Both the first and second authors independently extracted all the quantitative data (including study quality indicators and intervention components reported by authors for generalization or maintenance), scored the study quality, and coded data of intervention targets. The scoring of study quality indicators was based on criteria as documented in the Scoring Guidelines (see Appendix) and was derived from Gersten et al. (2005), with the maximum possible quality score being 18. For the data on intervention targets, both the first and second authors independently coded 30 % of the studies (n = 3 studies). The authors’ agreement was 88.5 % for extracting quantitative data, 78.0 % for scoring study quality, and 87.5 % for coding the intervention targets. All disagreements were resolved through discussion between all the authors.

Following the data extraction, a meta-analysis was performed. Given the variation in the interventions and outcome measures, an a priori decision was made to employ a random effects analysis as recommended by Borenstein et al. (2009). Data were analyzed using the Comprehensive Meta-Analysis program (Borenstein et al. 2005). Where multiple CBA intervention groups were compared with a no-treatment control (i.e., Sofronoff et al. 2005), each comparison was treated as a separate study for the purpose of analysis. Therefore, based on nine journal articles and three theses, 11 comparisons were conducted involving 128 effect size (ES) calculations. In calculating ES, the expected direction of change on the measure was first determined, and positive values were allocated to ESs that reflected better performance in the CBA group. Hedges’ g was calculated using pre- and post-test means and pooled post-test standard deviation data where available, or from dichotomous outcomes where reported (events or non-events). In the absence of such data, ES estimates were extracted from t values, means and p values, or other available data. In one study (Sung et al. 2011), scores were assigned to the categorical data of severity of anxiety based on the Clinical Global Impression Scale (i.e., normal = 1, borderline = 2, mildly ill = 3, moderately ill = 4, and markedly ill = 5), consistent with the original instrument scoring system. The means and standard deviations of these scores were used to calculate the effect sizes. In the same study, the three categories of data reporting participant improvement status (i.e., deteriorated, no change, and improved) were resolved into two categories to generate dichotomous outcomes for calculating the effect sizes (i.e., the deteriorated category was combined with the no-change category to form one category of not improved, and the improved category was maintained). ESs were calculated for each outcome variable and time point within a study. Mean ES was calculated across outcome variables and time points in each study using a mixed effect analysis model; thus, confidence intervals are likely to be conservative.
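To make the effect size procedure above concrete, the following minimal sketch computes a bias-corrected standardized mean difference (Hedges' g) for two independent groups and pools several comparisons under a random effects model. It is an illustration under stated assumptions, not the computation performed by the software used in this review: the function names and input values are hypothetical, the textbook two-group formula is used rather than the pre/post variant described above, and the DerSimonian-Laird estimator of between-study variance is assumed because the text does not state which estimator was applied.

```python
import math


def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c, higher_is_better=True):
    """Return (g, variance of g) for treatment vs. control groups.

    Positive g means better performance in the treatment group, so the sign
    is flipped for measures where lower scores are better (e.g., anxiety).
    """
    df = n_t + n_c - 2
    s_pooled = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / s_pooled
    if not higher_is_better:
        d = -d
    j = 1.0 - 3.0 / (4.0 * df - 1.0)          # small-sample correction factor
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2.0 * (n_t + n_c))
    return j * d, j ** 2 * var_d


def dersimonian_laird(effects, variances):
    """Return (pooled effect, standard error, Q, tau^2) for a random effects model."""
    w = [1.0 / v for v in variances]           # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]   # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, q, tau2


# HYPOTHETICAL data for three comparisons on an anxiety scale (lower is better).
studies = [
    (20.0, 26.0, 8.0, 9.0, 15, 15),
    (18.0, 21.0, 7.0, 7.5, 20, 18),
    (22.0, 30.0, 9.0, 8.0, 12, 13),
]
results = [hedges_g(*s, higher_is_better=False) for s in studies]
pooled, se, q, tau2 = dersimonian_laird([g for g, _ in results],
                                        [v for _, v in results])
print(pooled, pooled - 1.96 * se, pooled + 1.96 * se, q, tau2)
```

The `higher_is_better` flag mirrors the rule stated above that positive values were allocated to ESs reflecting better performance in the CBA group.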

Results

Participants

The reviewed studies enrolled 402 participants (M = 40.2 per study, 199 completed intervention, 173 completed control conditions, 30 dropped out before completion). Participant details are provided in Table 1. All studies specified their participant selection criteria with eight reporting a replicable selection process. The most frequently specified selection criterion was a lower limit on IQ scores.

Table 1 Demographic details of participants

The selection criteria were reflected in the characteristics of the selected participants. All reported mean IQ scores of study cohorts were above 100, except in one study (mean verbal IQ score = 96.8 in Sung et al. 2011). The overall mean full-scale IQ score was 106.0 and individual IQs ranged from 70 to 139 across the 203 participants with data reported. Most participants were either diagnosed with Asperger syndrome (n = 196, 48.8 %), or described as high functioning with autistic disorder (n = 71, 17.7 %). Others were diagnosed with PDD-NOS (n = 20, 5.0 %) or without sub-type diagnoses (n = 40, 10.0 %). Although nine studies reported the instruments or protocols for diagnosing ASD, only four provided scores on the level of autistic symptomology. Across all the reviewed studies, participant mean age was 10.5 years (range 4.5 to 16 years) and nine studies included participants aged between 10 and 11 years old. The most frequently reported comorbidity was anxiety disorder. The gender ratio of participants was approximately seven males to one female. Participants in two studies (n = 56, 13.9 %) were using medication.

Study Characteristics and Intervention Features

Study characteristics and intervention features of the ten RCTs are summarized in Table 2. All the reviewed reports were dated between 2005 and 2012. All interventions used a manual or program documentation, and the Coping Cat CBA Program by Kendall and Hedtke (2006a, b) for treating anxiety in typically developing children was used or adopted in three interventions. All interventions were administered by psychologists or therapists in clinical settings and included the involvement of parents and/or peers. In addition to attending intervention sessions, parents also supported their child’s practice of learnt skills outside intervention sessions (six studies), and/or acted as social coaches (four studies). Most interventions were delivered in small groups (seven studies) and a few were delivered individually (three studies).

Table 2 Study characteristics and intervention features

The time span over which the interventions took place varied from 6 weeks to 6 months (M = 13.7 weeks). All interventions had weekly sessions. While the length of individual sessions varied between 1 and 2 h (M = 97.5 min), the total training hours varied from 8 to 30 h (M = 19.7). The actual hours a therapist spent with each participant during the interventions ranged from 6.7 to 17.5 (M = 10.8 h). There were insufficient data on the therapist hours each parent received as well as hours the therapists/researchers spent in preparing the interventions. No studies reported post intervention follow-up training.

A large number of intervention components were identified and each intervention had a unique combination of intervention components. The mean number of components implemented in the interventions was 10.6. Some components were common across most studies such as practice of skills (ten studies), affective education or training regarding emotion (nine studies), homework assignments (eight studies), relaxation (seven studies), and visual support (seven studies). The three commonly discussed components in CBA literature for the general population and adults with intellectual disabilities were used in some of the reviewed studies (i.e., cognitive restructuring in six studies, problem solving in three studies, and self-instruction in one study) and three studies did not refer to these components at all.

Intervention Outcome Measurement

There were 91 outcome variables measured across studies (M = 9.1 per study). These variables were used to indicate improvement in eight categories of intervention targets (i.e., anxiety reduction in eight studies, changes in cognition in seven studies, problematic emotions other than anxiety in four studies, problematic behavior other than anxiety-related in three studies, social skills improvement in three studies, parental or teachers’ issues in two studies, other constructs in two studies, and other specific problems in two studies). The most frequently measured target was anxiety (seven studies). While four reports included measures of all stated intervention targets, some intervention targets were seldom measured (i.e., specific problems not measured in the two targeting studies, parental or teachers’ issues measured in one targeting study, and other constructs measured in one targeting study). On the other hand, some researchers measured variables not specified as intervention targets (problematic behavior other than anxiety-related in one study, problematic emotions other than anxiety in one study, social skills improvement in one study, parental or teachers’ issues in one study, other constructs in one study).

There were five main categories of outcome measures used: standard rating scales or inventory checklists completed by parents, participants, therapists, or teachers measured 59 outcome variables; diagnostic interviews with parents or with both participants and parents measured 16 variables; knowledge or ability tests measured six variables; rating scales developed by authors using parent reports measured six variables; and behavior observations measured four variables. The most frequently used measures were standard rating scales or inventory checklists completed by parents (n = 37 outcome variables in 8 studies) and by participants (n = 17 outcome variables in 4 studies). The most frequently used standard measures were the Spence Children’s Anxiety Scale (SCAS) (four studies), the Anxiety Disorders Interview Schedule (ADIS) (four studies), and the Clinical Global Impression (CGI) (three studies). Only one study did not include parent/participant reports and checked interrater reliability/interobserver agreement on all outcome measures. Three other studies included parent/participant reports in some measures and checked interrater reliability/interobserver agreement on all other measures not using parent/participant reports. All reported reliability was above 80 % except one measure for child behavior observation (interobserver agreement = 76 % in Koning 2010).

Effectiveness of Intervention

There was considerable variation in intervention features and outcome variables across the studies and the appropriateness of reporting an overall summary ES metric was questioned. After consideration, it was decided to report a summary metric along with examination of possible moderators. Hedges’ g was calculated for the ten RCTs based on 11 intervention group and control group comparisons. A forest plot of the meta-analysis is presented in Fig. 1. The relative weighting of studies in the overall estimate is indicated by the size of the squares in the figure. The resulting overall ES of 0.89 was statistically significant (95 % CI [0.50, 1.29]; p < 0.0001). Heterogeneity was significant (Q = 27.77, p = 0.002) and a moderate to high amount of variance was accounted for by between-study heterogeneity (I² = 63.99 %).
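The heterogeneity and significance figures above are related by standard formulas, which the following minimal sketch illustrates. It assumes that Q was computed over the 11 comparisons (10 degrees of freedom) and that the confidence interval is a symmetric normal-theory interval; the function names are illustrative and the code is not taken from the software used in this review.

```python
import math


def i_squared_percent(q, k):
    """I^2 in percent from Cochran's Q and the number of comparisons k."""
    df = k - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def z_and_p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Recover the standard error, z value, and two-sided p from a 95 % CI."""
    se = (upper - lower) / (2.0 * z_crit)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal p value
    return se, z, p


# Reported values: Q = 27.77 over 11 comparisons.
print(round(i_squared_percent(27.77, 11), 2))   # -> 63.99

# Reported values: ES = 0.89, 95 % CI [0.50, 1.29].
se, z, p = z_and_p_from_ci(0.89, 0.50, 1.29)
print(se, z, p < 0.0001)
```

Under these assumptions the reported Q reproduces the reported I², and the reported interval is consistent with p < 0.0001.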

Fig. 1 Forest plot using random effects model

Fig. 2 Funnel plot of standard error by Hedges’s g. The reviewed studies were shown as open circles and the four imputed studies shown as filled circles. The calculated ES was shown as an open diamond and the adjusted ES was shown as filled diamond

Given the level of heterogeneity, six possible moderators (i.e., intervention time span, total training hours, group or individual treatment, parent involvement, research design quality score, and total study quality score) were examined by dichotomizing the studies on each moderator. The detailed ES comparisons of studies dichotomized with these possible moderators are shown in Table 3. All ES comparison results were nonsignificant, indicating that these moderators were not significantly related to ES. This lack of statistical significance was possibly due to the small number of studies and related low statistical power. The comparisons with moderators of intervention time span, total training hours, and group/individual treatment resulted in small ES differences. The comparison with moderators of parent involvement, research design quality score and total study quality score, however, resulted in large ES differences, indicating their potential as moderators.
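One common way to compare two subgroups formed by dichotomizing studies on a moderator is a z test on the difference between the subgroup summary effects. The sketch below illustrates that approach only; the text does not state which test the software applied, the function name is hypothetical, and the input values are invented for illustration rather than taken from Table 3.

```python
import math


def subgroup_difference_test(g_a, se_a, g_b, se_b):
    """z test for the difference between two independent subgroup summary ESs.

    Returns (difference, z, two-sided p). With two subgroups, z squared equals
    the between-groups Q statistic on one degree of freedom.
    """
    diff = g_a - g_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return diff, z, p


# HYPOTHETICAL subgroup estimates: a large ES difference that is still
# nonsignificant because each subgroup contains few studies.
diff, z, p = subgroup_difference_test(1.0, 0.3, 0.4, 0.4)
print(diff, z, p)
```

The example shows how a sizeable difference between subgroup effects can coexist with a nonsignificant test when standard errors are large, which is the low-power situation described above.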

Table 3 ES comparisons of studies dichotomized with possible moderators

In addition, three types of outcomes were examined. The overall ES of all outcomes measuring anxiety and anxiety-related issues was 1.07 and was statistically significant (95 % CI [0.48, 1.66]; p < 0.001) (in seven studies, with 93 individual ESs). The overall ES of all outcomes measuring social skills was 0.98 and was statistically significant (95 % CI [0.47, 1.49]; p < 0.001) (in three studies, with 11 individual ESs). The overall ES of all other outcomes was 0.81 and was statistically significant (95 % CI [0.25, 1.37]; p = 0.005) (in six studies, with 24 individual ESs).

Interventionists using a CBA appear to have assumed that intervention effects would be generalized and maintained by strategies such as parent/teacher involvement and practice, and thus incorporated them. Only some of the reviewed studies specified their intervention components for generalization purposes (seven studies) and none did for maintenance. The specified strategies for generalization were home practice (four studies), parent training and participation in sessions (three studies), school involvement (two studies), and video modeling (one study). Participant behavior changes were measured at home and in contrived social settings in three studies, providing some limited evidence of generalization of skills outside intervention settings. Follow-up measurements on outcome variables, primarily based on parent reports, were taken at six weeks to six months post-intervention in seven studies and mostly indicated that effects were maintained. The social validity of the intervention effects was checked in four studies. The measures used were mainly questionnaires, and all results were positive with high ratings.

In order to evaluate possible publication bias, a funnel plot was constructed with missing studies imputed using the Duval and Tweedie trim-and-fill procedure. The resulting plot is presented in Fig. 2. The plot was asymmetrical and four missing studies were imputed. The adjusted overall random effects ES was 0.58 (95 % CI [0.18, 0.99]). In addition, the classic fail-safe N was 142, indicating that an additional 142 studies with a mean ES of 0 were needed to reduce the overall calculated ES to a non-significant level (α = 0.05). The Orwin fail-safe N was 77, indicating that an additional 77 studies with mean ES of 0 would be needed to reduce overall ES to below a trivial value, set at 0.1.
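For readers unfamiliar with the Orwin fail-safe N, the following minimal sketch shows the usual formula: the number of additional studies with a given mean ES needed to bring the combined mean ES down to a chosen trivial value. The function name and input values are hypothetical. The text does not report the exact inputs the software used (for example, which summary ES it started from), so this sketch is not expected to reproduce the reported value of 77.

```python
def orwin_fail_safe_n(k, mean_es, criterion_es, missing_es=0.0):
    """Orwin's fail-safe N.

    k            -- number of studies (comparisons) in the meta-analysis
    mean_es      -- observed mean effect size
    criterion_es -- the 'trivial' effect size threshold
    missing_es   -- assumed mean effect size of the missing studies
    """
    if criterion_es <= missing_es:
        raise ValueError("criterion_es must exceed missing_es")
    return k * (mean_es - criterion_es) / (criterion_es - missing_es)


# HYPOTHETICAL inputs: 10 studies, mean ES 0.6, trivial threshold 0.1.
print(orwin_fail_safe_n(10, 0.6, 0.1))
```

Unlike the classic fail-safe N, which targets statistical significance, this index targets the magnitude of the combined effect.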

Study Quality

An acceptable and appropriately documented randomization method was reported in six of the reviewed studies. A participant attrition rate below 10 % in both intervention and control groups was reported in only three studies. Preintervention equivalence of control and comparison groups or adjustments made in data analysis was reported in five studies on all outcome measures and in three studies on some outcomes. Raters were blind to treatment conditions for all measures in one study, and blind for measures not using parent or self-reports in four studies. The remaining five studies used either measures by raters not blind to treatment conditions or reported by parents or participants. Many studies provided clear descriptions of the intervention setting (six studies) and procedural fidelity checking (nine studies), but only a few provided data on procedural fidelity (three studies).

The study quality scores of individual studies are displayed in Table 2. The quality of the reviewed studies varied widely with research design quality scores varying from 25.0 to 75.0 % of the maximum possible score (M = 56.3 %), and total study quality scores from 25.0 to 83.3 % of the maximum possible score (M = 60.6 %). The average study quality score was 58.3 % for published journal articles and 70.4 % for the three doctoral dissertations.

Discussion

The present review examined RCTs on cognitive behavioral interventions for children with ASD. The earliest report located was dated 2005, indicating that development in this research area is relatively recent. Overall, the research studies focused on high-functioning children with ASD aged around 10 years old, were mostly interventions for anxiety, and were restricted to clinical settings with therapists.

Participant Characteristics

It has been suggested that symptoms of ASD might moderate the efficacy of CBA interventions for anxiety of typically developing youth (Puleo and Kendall 2011). However, with limited reports on participant level of autistic symptomology, this review could not examine the moderating effects of the degree of autistic symptomology on intervention outcome. In addition, the focus of the reviewed interventions was on participants within normal range of intellectual ability and around the age of 10 years. This narrow focus did not allow examination of the effects of CBA on children with ASD having intellectual disabilities and/or beyond this narrow age range.

Intervention Features

Other than all interventions being conducted by therapists in clinical settings and nearly all intervention targets being anxiety reduction, the reviewed interventions were heterogeneous in many other important features such as programs used, intervention modalities, time span over which intervention took place, numbers of sessions, total intervention hours, and intervention components. Each reviewed intervention implemented a unique package of multiple components. Based on the currently available information, it is impossible to identify the key components that contributed to the efficacy of the CBA. This difficulty in concluding which components contribute to efficacy has also been noted for multiple component interventions in other fields, such as nursing (Edwards et al. 2006).

All reviewed interventions used program manuals and a few manuals allowed flexibility or adopted a modular approach, by which individual components in the manuals were selected and implemented to meet the specific needs of the participants (Cool Kids in Chalfant et al. 2007; Building Confidence FCBT in Wood et al. 2009a). This flexibility was useful to address individual needs (Kendall et al. 1998), but introduces further variation within individual studies.

Across the interventions, the terminology and descriptions of interventions were often inconsistent. Although both reported on the same study, Drahota (2008) reported the use of behavioral experimentation but Drahota et al. (2011) did not mention behavioral experimentation and reported the use of exposure, which was not mentioned in Drahota (2008). Some studies explicitly reported the use of cognitive restructuring; some used a simplified version of cognitive restructuring (Chalfant et al. 2007); others did not report the use of cognitive restructuring, but reported the use of “thinking tools,” which appeared to be similar (Sofronoff et al. 2005, 2007).

Many intervention components used in the reviewed studies (e.g., practice of skills, homework, visual support, relaxation, and affective education or training regarding emotion) are typical components of interventions for children with ASD, regardless of whether a cognitive framework is used (Attwood 2003; Beaumont and Sofronoff 2008; Groden and LeVasseur 1995; Howlin et al. 1999; National Research Council 2001). On the other hand, some often-discussed CBA components such as cognitive restructuring, self-instruction, and problem solving were not commonly documented/implemented in the reviewed studies (Crawley et al. 2010; Graham 2005; Meichenbaum and Goodman 1971; Scott 2009). The considerable heterogeneity of the interventions, together with the possible absence of consensus in component terminology and description of interventions, does not facilitate the investigation of the underlying mechanisms.

Measurement of Outcome Variables

The outcome variables in the reviewed studies were mainly severity of anxiety or related issues and were measured with standardized measures. Although behavioral changes were measured in five studies, only three studies provided observational measures of the actual changes in natural or contrived settings.

The majority of the standardized measures (e.g., rating scales and inventory checklists) were completed by either participants’ parents or the participants themselves. In the measurement process, fewer than half of the reviewed studies reported that assessors were blind to participant treatment conditions. Only one study had all outcomes measured by assessors blind to participant condition. With the informants or assessors being aware of participant treatment condition, the outcome data would possibly be at higher risk of bias (Gersten et al. 2005). These issues are of concern when interpreting the improvement indicated for most of the outcome variables measured.

There were additional issues regarding the major informants being the parents and the participants. Discrepancies between their reports were raised in two reviewed studies (i.e., McNally Keehn 2010; Wood et al. 2009a). Some studies examined these discrepancies and found that compared to their parents, high-functioning adolescents with ASD reported more psychiatric symptoms and empathetic features but fewer autistic traits (Hurtig et al. 2009; Johnson et al. 2009). Parents’ reports may be limited when reporting on perceptions of their child’s feelings and emotions. Self-reports on emotions and related cognition might also be unreliable and invalid (Mazefsky et al. 2011). Sofronoff et al. (2005) discarded self-reported data due to the participants’ difficulty with accessing and reporting their own emotions. McNally Keehn (2010) argued that the accuracy of self-reporting on emotions and thoughts by individuals with ASD is questionable because they appear to lack insight into these mental states. Although intervention training might have increased participant awareness of symptoms or understanding of themselves, such measures might be more useful as indicators of change in awareness, rather than actual improvement, in symptoms such as anxiety disorders (Drahota 2008; Reaven et al. 2009). The very few direct observations of behavioral changes, the dependence on standard rating scales completed by parents or participants, and the informants or raters being aware of treatment conditions are all problematic, as is the reliability of self-reports about emotion and cognition.

Efficacy of Interventions

The significant and large ES (0.89) obtained in the current meta-analysis is suggestive of the overall efficacy of CBA interventions for children with ASD. This finding is primarily applicable to anxiety interventions for children with ASD within the normal range of intellectual abilities and around the age of 10 years. Given the diversity in intervention components and measured outcomes, the overall summary effect size should be viewed with caution. Further, the interpretation of this result has to be made in the context of the problems with the outcome measures used, the small number of studies reviewed, as well as the quality of the studies. Attempts were made to examine possible moderator variables but the findings were not statistically significant, although the low power of tests and the small number of studies need to be considered in interpreting the findings. Parent involvement, research design quality, and total study quality appeared to be potential moderators, with large ES differences between studies dichotomized on these features. Although evidence of publication bias was found, the estimated effect size was only reduced to a medium level after adjustment for imputed missing studies. In addition, given the limited size of the corpus of published research, an implausibly large number of missing studies would be required to reduce the calculated ES to either a trivial or non-significant level.
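The reasoning about missing studies can be illustrated with a fail-safe N calculation. The sketch below uses Orwin's formula, which may differ from the procedure actually used in this review. The function name, the number of studies (k = 10), and the "trivial" criterion of 0.20 are hypothetical values chosen for illustration; only the summary ES of 0.89 comes from the text above.

```python
def orwin_failsafe_n(k, mean_es, criterion_es, missing_es=0.0):
    """Orwin's fail-safe N.

    Returns the number of unretrieved studies, with an average effect
    size of `missing_es`, that would have to exist to pull the mean
    effect size of `k` observed studies down to `criterion_es`.
    """
    if k <= 0:
        raise ValueError("k must be a positive number of studies")
    if criterion_es <= missing_es:
        raise ValueError("criterion_es must exceed missing_es")
    if mean_es <= criterion_es:
        # The observed mean is already at or below the criterion.
        return 0.0
    return k * (mean_es - criterion_es) / (criterion_es - missing_es)


if __name__ == "__main__":
    hypothetical_k = 10      # assumption: NOT the number of studies in this review
    summary_es = 0.89        # summary ES reported in the text
    trivial_es = 0.20        # assumption: threshold treated as "trivial"
    n_missing = orwin_failsafe_n(hypothetical_k, summary_es, trivial_es)
    print(f"Null-effect studies needed: {n_missing:.1f}")
```

Under these assumed inputs, the number of null-result studies required is several times the number of observed studies, which is the sense in which such a count can be judged implausible for a small body of published research.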

There appears to be a lack of attention paid to generalizing or maintaining the intervention effects gained. The current data are thus inadequate to demonstrate which strategies are effective for generalization or maintenance. Nonetheless, the few generalization strategies used in the reviewed studies were those suggested in the wider literature for generalization and maintenance, being mainly parental involvement and repeated practice (Eveleigh 2010; Reaven and Hepburn 2006).

The importance of assessing intervention effectiveness in real life, i.e., social validity, is widely acknowledged but seldom practiced in behavioral intervention research for individuals with disabilities (Brosnan and Healy 2011; Finn and Sladeczek 2001; Kroeger and Sorensen-Burnworth 2009). Given the relative complexity of the reviewed interventions and the considerable therapist hours invested, it is reasonable to check the interventions’ social validity in order to justify their possibly higher cost. Not all reviewed studies reported on social validity, and the measures used were rating scales and questionnaires completed by parents, which are potentially subjective.

Based on this review, no evidence was found regarding the efficacy of CBA for lower-functioning children with ASD or for interventions carried out in school settings. Evidence is extremely limited for its efficacy with difficulties other than anxiety. There is also a need to measure the social validity of these interventions with more comprehensive and objective methods.

Study Quality

The superiority of randomized controlled trials depends on group equivalence being achieved through the random allocation of participants to treatment or control conditions, adjustment being made for pretest and posttest covariance differences, raters being blind to group allocation, and procedural fidelity being ensured. Only half of the reviewed studies satisfactorily met at least half of the research design quality criteria. The average study quality score was just over half of the possible maximum, and study quality appeared to be a possible moderator of the ESs, in that lower quality studies produced higher ESs.

Recommendations for Further Research

Although it is argued that CBA might be particularly suitable for children with ASD with higher cognitive abilities (Klinger and Williams 2009; Ozonoff 1999; Whyte 2009), the current narrow focus restricts the development of a thorough understanding of the application of CBA to children with ASD and lower cognitive abilities (Rotheram-Fuller and MacMullen 2011). One of the very few research studies intervening with participants with ASD and lower cognitive abilities incorporated CBA as a major component of its treatment protocol, and positive effects were reported (Pardini et al. 2012). Positive results were also reported in studies on CBA interventions for anger management for adults with intellectual disabilities (Taylor and Novaco 2005). As 50 to 70 % of individuals with ASD have intellectual disabilities (Matson and Shoemaker 2009), there may be potential benefits in extending CBA interventions to children with ASD and intellectual disabilities. To deepen the understanding of the feasibility of CBA for children with ASD, it may be important to quantify participant verbal IQ as well as level of autistic symptomatology in future studies.

To expand the application of CBA for children with ASD, the potential of implementing CBA in school settings by classroom teachers should be investigated. The possible efficacy of CBA carried out by teachers in school settings and the potential of facilitating the application of learnt skills in daily natural contexts were demonstrated in some studies (Bauminger 2002, 2007a, b; Bolton et al. 2012). To extend the application of learnt skills to the children’s daily lives, given the possible moderating effects of parent involvement found in this review, future research should include active parent involvement in the child’s home environment. Given the current success of CBA interventions with anxiety, exploring the feasibility of CBA for impulsive behaviors, academic tasks, problem solving ability, role taking, and self-control would be reasonable directions to take. CBA has been used with success to intervene on these targets in typically developing children (e.g., Kendall and Braswell 1985; Little and Kendall 1979; Meichenbaum and Asarnow 1979; Meichenbaum and Goodman 1971).

Due to the high treatment cost and high service needs in the population with ASD, an effective and acceptable intervention should include only the effective components or necessary features (Foster et al. 2007; Jacobson and Mulick 2000). Future research should include more comparison studies to clarify the relative effectiveness of individual components and features. Parental involvement and the teaching of problem solving and self-instructions are two such components to be considered. Some indicative support for parental involvement is found in this review, and the teaching of problem solving and self-instructions is suggested in the literature for programming generalization and maintenance (Kendall and Finch 1979; Little and Kendall 1979; Meichenbaum and Asarnow 1979).

One problem in interpreting the intervention results is the heterogeneity of CBA interventions. A question to ask is whether the RCT is the best research method to investigate the relative efficacy of elements in complex multiple-component interventions. A reasonable alternative research method may be small-n designs, which are recommended for investigating multiple-component interventions, comparing alternative interventions in special education, and for children with ASD (Horner et al. 2005; Odom et al. 2005; Odom et al. 2010; Sheffield and Waller 2010).

In order to objectively measure the intervention outcomes and mitigate the issues with standardized measures, observations in natural settings of the behaviors concerned, as well as measures of actual improvement in participant quality of life, should be more frequently used in future studies. Observation of actual behavior changes and actual improvement in participant quality of life could also serve to measure social validity, which in the reviewed studies is measured only with parent questionnaires.

Conclusion

The present review suggested that CBA to manage anxiety, delivered in clinical settings, could be highly effective for high-functioning children with ASD. This conclusion is qualified by the heterogeneity of the reviewed interventions, the heavy reliance on standard rating scales completed by participants and parents, and the lack of satisfactory documentation of randomization of participants in some studies. Future research should investigate the feasibility of CBA in school settings and its applicability for lower-functioning children with ASD, and should also address difficulties other than anxiety. Small-n designs are recommended for detailed comparisons of the various intervention components, in particular for determining which common cognitive components (i.e., cognitive restructuring, self-instruction and problem solving) are appropriate and necessary for children with ASD. The observation of targeted behaviors in natural settings by observers blind to intervention conditions should be used to measure the intervention outcomes.