Autism spectrum disorder (ASD) is a lifelong neurodevelopmental disorder characterized by marked deficits in social communication along with restricted and repetitive behaviors and interests (American Psychiatric Association, 2013). Social communication deficits are strongly associated with negative long-term outcomes (Gillberg & Steffenburg, 1987; Howlin et al., 2004). Early intervention, starting as early as the second year of life, has been shown to have positive long-term outcomes (e.g., Pickles et al., 2016; Wetherby & Woods, 2006) and has been associated with greater developmental gains and reduction in ASD symptoms than intervention delivered later in life (Koegel et al., 2006). Interventions targeting social communication skills have demonstrated collateral benefits in other areas (Ledbetter-Cho et al., 2017). For instance, teaching children with ASD to make requests has resulted in concomitant decreases in problem behavior (Charlop-Christy et al., 2002; Gianoumis et al., 2012).

The recommended intervention for individuals with ASD is Applied Behavior Analysis (ABA) (National Autism Centre, 2009, 2015). Recommended practice guidelines in early behavioral intervention for children with ASD emphasize building parents’ capacity to support their child’s behavior within the context of daily life (Division for Early Childhood, 2014; National Research Council, 2001). Parent training is an established intervention for increasing interpersonal and play skills and decreasing problem behavior in children with ASD (National Autism Centre, 2015). Parent training has also been shown to have benefits for the parent. Parents of children with ASD report higher levels of stress and affective symptoms than parents of typically developing children and parents of children with other disabilities and chronic illnesses (e.g., Abbeduto et al., 2004; Dumas et al., 1991; Smith et al., 2010). Parents who engage in parent training programs have reported decreases in stress (Keen et al., 2010), improvements in mental health symptoms such as anxiety and depression (Tonge et al., 2006), increases in the amount of parental leisure and recreation time (Koegel et al., 1982), and increased levels of parental self-efficacy (McConachie & Diggle, 2007) and optimism (Koegel et al., 1982). Participating in parent training has also been shown to have positive effects on parent-child interactions (Koegel et al., 1996).

Despite growing interest in parent training, few studies have examined the role of other family members (i.e., siblings) in intervention. A conceptually similar area of research is peer-mediated intervention (PMI), which involves typically developing peers in teaching a variety of skills to children with ASD. PMI is an established intervention for increasing communication and interpersonal skills in children with ASD (National Autism Centre, 2009, 2015). Benefits of PMI for individuals with ASD include increased opportunities to interact with social partners, improved social competence, and independence (Sperry et al., 2010). PMI has also been shown to have a positive impact on peer implementers, who demonstrated academic gains, increased sensitivity to others, higher self-confidence, and expanded peer networks after participating as peer mediators for individuals with ASD (Carter et al., 2008). Criteria for successful peer candidates include attending the same school as the child with ASD, having developmentally appropriate cognitive and language abilities, a history of compliance, and social competence and enthusiasm (Gunning et al., 2019a).

Sibling training potentially combines the benefits of parent training and PMI, as siblings could support skill generalization for the child with ASD along with parents, and increase fun, reciprocal play, and learning opportunities comparable to peers. Indeed, a systematic review of sibling-mediated interventions found that results from these interventions were similar to results from PMIs (Shivers & Plavnick, 2015). Target skills included play skills, social skills, academic or functional skills, and physical fitness. Overall, the siblings learned the intervention procedures, and the children with ASD showed increases in skill acquisition and/or decreases in problematic behavior.

Given the high prevalence rates of ASD (1 in 59 children; Baio et al., 2018) and the importance of early intervention, attention has been called to innovating service delivery models that would allow clinicians to maximize their productive time and reach. Additionally, families who live in rural or remote areas often experience barriers in accessing evidence-based intervention (Kogan et al., 2008; Liptak et al., 2008; Mandell et al., 2010). Telehealth is a promising service delivery model in which clinicians consult and deliver treatment over a distance using communication technologies such as videoconferencing or interactive websites (Dudding, 2009). Telehealth has been used to deliver evidence-based intervention to individuals with ASD, including parent training, and has demonstrated positive results, although insufficient methodological rigor is a concern (Boisvert et al., 2010; Knutsen et al., 2016). Continued evaluation of remote parent training, employing stronger research methodologies, is ongoing (e.g., Ingersoll et al., 2016; Vismara et al., 2018).

A key advantage of involving family members in intervention delivery is the potential for improved generalization outcomes. Children with ASD often present with severe challenges in applying skills learned in treatment to everyday use, which is a major barrier to effective intervention (Vismara & Rogers, 2010). Generalization outcomes are improved when intervention addresses functional behaviors within the natural environment; natural consequences are used; training occurs across different settings, people, and stimuli; and mediation strategies such as problem-solving are taught (Chandler et al., 1992; Gunning et al., 2019b; Stokes & Baer, 1977; Stokes & Osnes, 1989). As family-mediated interventions are delivered by people closest to the child, it could be expected that increased learning opportunities would occur across contexts and interactions and continue to develop as the child grows. A review of generalization and maintenance of social communication skills in parent-mediated interventions found improved social communication across situations and over time, although not to the extent achieved during acute intervention (Hong et al., 2018a). Similarly, a meta-analysis of early intervention effects on social communication found the largest effect sizes in context-bound measures (i.e., outcomes measured with the same person and context as the treatment context), compared to semi-generalized and generalized measures (i.e., outcomes measured with a different person and/or in a different context) (Fuller & Kaiser, 2019).

Several recent systematic reviews have synthesized the literature on parent training as it relates to language and communication interventions (Akamoglu & Meadan, 2018), toddlers (Beaudoin et al., 2014), school-age children (Black & Therrien, 2018), and functional communication training (Gerow et al., 2018). Overall, parents were able to implement the intervention, and positive outcomes were noted for the parents and children. However, firm conclusions could not be reached given the paucity of studies with strong methodological rigor. A meta-analysis of 19 randomized clinical trials (RCTs) of parent-mediated interventions for children with ASD found small improvements in symptom severity, socialization, and cognition, and trivial improvements in communication and language (Nevill et al., 2018). Although small effects are concerning, the reviewers noted that both the quantity and quality of research on parent-delivered interventions have been increasing. Additionally, more research is needed on parent and child variables that may explain the inconsistency in treatment outcomes. A systematic review of potentially influential factors found mixed results on the impact of broad child factors (e.g., age, verbal ability), fine-grained child factors (e.g., vocal initiation, joint attention, imitation), and contextual child factors (e.g., service access, intervention hours), along with parent factors including demographics, intervention factors (e.g., adherence, involvement, fidelity), and contextual factors (e.g., therapist support, accessibility) (Trembath et al., 2019).

The present study aims to extend these reviews and provide practical guidance for clinicians by synthesizing the treatment effectiveness and research strength of family-mediated social communication interventions for children with ASD under 6 years of age in terms of intervention characteristics such as intervention agents, modality, setting, dosage, focused intervention practices (FIPs), and intervention packages. This is also the first review to combine and compare in situ and telehealth interventions along with parent- and sibling-mediated interventions.

Method

Search Procedures

A systematic search of the literature was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Moher et al., 2009). The following databases were used: EBSCO (Psychology and Behavioral Sciences Collection), Education Resources Information Center (ERIC), PsycINFO, Scopus, and Web of Science. Combinations of the following terms were inputted into each database: (1) “autis*,” (2) “parent train*” or “parent mediat*” or “parent implement*” or “sibling train*” or “sibling mediat*” or “sibling implement*,” and (3) “social communication” or “social interaction” or “social skills” or “communication skills” or “play.” Initial online searches yielded 3682 results. A hand search of the reference sections of studies that met the criteria was also conducted to identify additional relevant studies. A total of 2574 records remained after duplicates were removed.

Inclusion Criteria

Studies were included if (a) at least one child had a confirmed diagnosis of ASD (in studies employing a single-subject research design; SSRD) or all focal children had a confirmed diagnosis of ASD (in studies employing group designs); (b) at least one child with ASD was between 0 and 6 years old (in studies employing SSRD) or all focal children were between 0 and 6 years old (in studies employing group designs); (c) a parent or sibling implemented 100% of the intervention; (d) the article presented an evaluation of an intervention to improve one or more social communication skills as defined in the Social Communication Checklist (SCC; Ingersoll & Dvortcsak, 2010); (e) the study employed an experimental design, i.e., either an SSRD or a group comparison design; (f) the intervention employed at least one evidence-based FIP as outlined by Wong et al. (2015); and (g) the study was published in a peer-reviewed journal. Only studies available electronically and published in English between 1980 and 2019 were included.

Data Extraction and Coding

Each article in the review was summarized in terms of participant and intervention agent characteristics, intervention characteristics, target skills, and collateral outcomes. Treatment effectiveness and research strength were then evaluated and synthesized by intervention agent, intervention characteristics, target skills, FIPs, and intervention packages. Generalization promotion strategies and outcomes were also coded.

Participant and Intervention Agent Characteristics

Information regarding the total number of participants, number of participants that met inclusion criteria, and age was noted. Only data for participants that met the criteria were extracted. For intervention agents, relationship to participant (i.e., mother, father, brother, sister) was recorded, along with age of sibling (if sibling was mediating the intervention). If additional characteristics such as sibling skills were reported, these variables were noted.

Intervention Characteristics

Intervention strategies used to teach target skills to the child with ASD were coded by FIP. Intervention packages that used combinations of these FIPs were also recorded. Protocols used to train intervention agents were coded as instructions, modelling, video modelling, role play, rehearsal, and/or feedback. Treatment modality (telehealth or in situ) and treatment setting (home, clinic, or mixed) were noted. Treatment dosage (i.e., duration and frequency of sessions and total hours of intervention) was also recorded. Total hours of treatment were coded into low-dose interventions (≤10 h), medium-dose interventions (11–20 h), high-dose interventions (21–30 h), and extended interventions (over 30 h).
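The dosage coding described above can be illustrated with a minimal sketch. The function name `code_dosage` is hypothetical, and the handling of fractional hours (e.g., 10.5 h is coded as medium dose) is an assumption, since the categories are only stated for whole hours.

```python
def code_dosage(total_hours: float) -> str:
    """Map total intervention hours to a dosage category.

    Cut-offs follow the stated bins: <=10 h low, 11-20 h medium,
    21-30 h high, over 30 h extended. Treating values between whole
    hours (e.g., 10.5) as belonging to the next bin is an assumption.
    """
    if total_hours < 0:
        raise ValueError("total_hours must be non-negative")
    if total_hours <= 10:
        return "low"
    if total_hours <= 20:
        return "medium"
    if total_hours <= 30:
        return "high"
    return "extended"
```

For example, a hypothetical program of twelve 1-h sessions would be coded as `code_dosage(12)`, i.e., a medium-dose intervention.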

Target Skills and Collateral Outcomes

Target skills were recorded as described by the study authors and then coded into one of the three social communication domains based on the SCC: social engagement, language and communication, and imitation and play. Collateral outcomes for the participant and intervention agents were recorded. If standardized measures were used, those were noted.

Treatment Effectiveness

In articles using SSRD, percentage of non-overlapping data (PND; Scruggs et al., 1987) was used to measure treatment effectiveness for each dependent variable. Advantages of PND include ease of calculation from graphical rather than raw data, high degree of inter-rater reliability, applicability to any SSRD type, and ease of interpretation (Campbell, 2013; Parker et al., 2007). PND has also been shown to be a more conservative measure of treatment effects compared to other nonoverlap methods (e.g., nonoverlap of all pairs (NAP) and percentage of all non-overlapping data (PAND)) and is widely applicable to data gathered for participants with ASD (Carr, 2015). For behaviors targeted for increase, PND is calculated by dividing the number of treatment data points that exceed the highest baseline data point by the total number of data points in the treatment phase and multiplying by 100; for behaviors targeted for decrease, treatment data points that fall below the lowest baseline data point are counted instead (Scruggs et al., 1987).

PND scores range from 0 to 100%. An intervention with a PND score of 90% or higher is considered to be a very effective treatment, a PND score between 70 and 89% indicates effective treatment, a PND score between 51 and 69% is considered questionable treatment, and an intervention with a PND score of 50% or lower is classified as not effective treatment (Scruggs & Mastropieri, 1998). Where all baseline data points were at zero, this was noted. Graphs with fewer than three baseline data points and graphs that showed ceiling or floor effects at baseline were excluded from calculation (Scruggs et al., 1987; Schlosser et al., 2008).
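The PND calculation and its interpretation bands can be sketched as follows. This is an illustrative sketch rather than the procedure used by the raters, who worked from published graphs. It assumes the conventional PND rule: for behaviors expected to increase, treatment points above the highest baseline point are counted, and for behaviors expected to decrease, treatment points below the lowest baseline point are counted. The function names and the `increase_expected` flag are hypothetical, and the treatment of non-integer scores between bands (e.g., 89.5%) is an assumption.

```python
from typing import Sequence


def pnd(baseline: Sequence[float], treatment: Sequence[float],
        increase_expected: bool = True) -> float:
    """Percentage of treatment points that do not overlap with baseline."""
    if len(baseline) < 3:
        # Graphs with fewer than three baseline points were excluded.
        raise ValueError("at least three baseline data points are required")
    if len(treatment) == 0:
        raise ValueError("treatment phase has no data points")
    if increase_expected:
        threshold = max(baseline)
        non_overlapping = sum(1 for x in treatment if x > threshold)
    else:
        threshold = min(baseline)
        non_overlapping = sum(1 for x in treatment if x < threshold)
    return 100.0 * non_overlapping / len(treatment)


def classify_pnd(score: float) -> str:
    """Apply the Scruggs and Mastropieri (1998) bands quoted in the text."""
    if score >= 90:
        return "very effective"
    if score >= 70:
        return "effective"
    if score > 50:
        return "questionable"
    return "not effective"
```

With hypothetical baseline values of 2, 3, and 1 and treatment values of 3, 5, 6, 7, and 8, four of the five treatment points exceed the highest baseline point, giving a PND of 80%, which falls in the effective band.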

In articles utilizing group designs, Cohen’s d was calculated for all primary dependent variables using means, standard deviations, and sample sizes of the treatment and control groups (Cohen, 1988). Cohen’s (1988) criteria were then used to determine effect sizes: trivial (<0.2), small (0.2–0.49), medium (0.5–0.79), or large (≥0.8). Cohen’s d is commonly used to synthesize studies using group designs (Warner, 2013). Treatment effects were then synthesized by target skills, intervention agents and characteristics, FIPs, and intervention packages.
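As an illustration of the effect size calculation, the sketch below computes Cohen's d from group means, standard deviations, and sample sizes. It assumes the standard pooled standard deviation form of d, which the text does not state explicitly. The function names are hypothetical, and classifying by the absolute value of d is also an assumption.

```python
import math


def cohens_d(mean_t: float, sd_t: float, n_t: int,
             mean_c: float, sd_c: float, n_c: int) -> float:
    """Cohen's d using the pooled standard deviation (an assumption)."""
    if n_t < 2 or n_c < 2:
        raise ValueError("each group needs at least two participants")
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    if pooled_var <= 0:
        raise ValueError("pooled variance must be positive")
    return (mean_t - mean_c) / math.sqrt(pooled_var)


def classify_effect(d: float) -> str:
    """Apply the Cohen (1988) bands quoted in the text to |d|."""
    size = abs(d)
    if size < 0.2:
        return "trivial"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"
```

For hypothetical groups of 20 participants each, with a treatment mean of 12, a control mean of 10, and a standard deviation of 4 in both groups, d is 0.5, which falls in the medium band.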

Research Strength and Determining Evidence-Based Practice

Studies were assessed for research quality using primary and secondary quality indicators and rated as strong, adequate, or weak using an operationalized evaluative method outlined by Reichow et al. (2008). Primary quality indicators for both SSRDs and group designs included information on participant characteristics and independent and dependent variables. Additional primary quality indicators for SSRDs included demonstration of a stable baseline condition, visual analysis, and experimental control, while additional primary quality indicators for group designs included use of a comparison condition, demonstration of the link between the research question and data analysis, and accurate statistical analysis. Primary quality indicators were rated as high, acceptable, or unacceptable quality. Secondary quality indicators for both SSRDs and group designs included evidence of interobserver agreement, blind raters, measurement of treatment fidelity and generalization and maintenance outcomes, and demonstration of social validity. Additional secondary quality indicators for SSRDs included calculation of the Kappa statistic, while additional secondary quality indicators for group designs included random assignment, details of participant attrition, and reporting of treatment effect sizes. Secondary quality indicators were rated as being present or absent.

To receive a strong rating, studies needed to receive high quality grades on all primary quality indicators and demonstrate evidence of at least three (for SSRDs) or four (for group designs) secondary quality indicators, while to receive an adequate rating, studies needed to receive high quality grades on at least four primary quality indicators and no unacceptable ratings on any primary quality indicators, along with demonstrating evidence of at least two secondary quality indicators. Studies that did not meet these criteria received a weak rating.
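The rating rules above can be expressed as a short decision sketch. The function name `rate_study`, the string labels, and the input format (one rating per primary quality indicator plus a count of secondary indicators present) are hypothetical conveniences, not part of the Reichow et al. (2008) method itself.

```python
from typing import Sequence

ALLOWED = {"high", "acceptable", "unacceptable"}


def rate_study(primary: Sequence[str], secondary_present: int, design: str) -> str:
    """Return 'strong', 'adequate', or 'weak' for one study.

    primary: rating of each primary quality indicator.
    secondary_present: number of secondary quality indicators evidenced.
    design: 'ssrd' or 'group'.
    """
    if design not in ("ssrd", "group"):
        raise ValueError("design must be 'ssrd' or 'group'")
    if len(primary) == 0 or any(r not in ALLOWED for r in primary):
        raise ValueError("primary ratings must be high/acceptable/unacceptable")
    high = sum(1 for r in primary if r == "high")
    has_unacceptable = "unacceptable" in primary
    # Strong requires three secondary indicators for SSRDs, four for group designs.
    strong_secondary = 3 if design == "ssrd" else 4
    if high == len(primary) and secondary_present >= strong_secondary:
        return "strong"
    if high >= 4 and not has_unacceptable and secondary_present >= 2:
        return "adequate"
    return "weak"
```

Under these rules, a hypothetical group design study with high ratings on all six primary indicators but only three secondary indicators would be rated adequate rather than strong.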

From this, Z scores were calculated to determine whether an intervention package could be rated as an evidence-based practice (EBP). The following formula was applied to calculate the Z score (“GroupS” equals the total number of group design studies with an overall “strong” rating, “GroupA” equals the total number of group design studies with an overall “adequate” rating, “SSEDS” equals the total number of participants in single-subject studies with a “strong” rating, and “SSEDA” equals the total number of participants in single-subject studies with an “adequate” rating):

$$ \left(\mathrm{Group}_{\mathrm{S}}\times 30\right)+\left(\mathrm{Group}_{\mathrm{A}}\times 15\right)+\left(\mathrm{SSED}_{\mathrm{S}}\times 4\right)+\left(\mathrm{SSED}_{\mathrm{A}}\times 2\right)=Z $$

Z scores of ≥60 points indicate “established EBP” and >30 indicate “probable EBP.” Research strength and EBP status were then synthesized by target skills, intervention agents, intervention characteristics, FIPs, and intervention packages.
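A worked sketch of the Z score formula and the EBP thresholds is given below. The function and parameter names are hypothetical, as is the label returned for scores of 30 or below, which the text does not name.

```python
def ebp_z_score(group_strong: int, group_adequate: int,
                ssed_strong_participants: int,
                ssed_adequate_participants: int) -> int:
    """Weighted sum: 30 and 15 points per strong and adequate group study,
    4 and 2 points per participant in strong and adequate single-subject studies."""
    return (group_strong * 30
            + group_adequate * 15
            + ssed_strong_participants * 4
            + ssed_adequate_participants * 2)


def ebp_status(z: int) -> str:
    """Apply the thresholds quoted in the text."""
    if z >= 60:
        return "established EBP"
    if z > 30:
        return "probable EBP"
    # Label below is an assumption; the text names only the two EBP tiers.
    return "below EBP thresholds"
```

For a hypothetical package supported by one strong group study, five participants in strong single-subject studies, and four participants in adequate single-subject studies, Z = 30 + 0 + 20 + 8 = 58, which would indicate a probable EBP.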

Generalization

The studies were assessed for generalization promotion strategies, generalization dimension, latency to maintenance probe, and generalization outcome. Generalization promotion strategies were divided into four categories: exploiting current functional contingencies, training diversely, incorporating functional mediators, and using sequential modification (Chandler et al., 1992; Gunning et al., 2019b; Neely et al., 2015; Stokes & Osnes, 1989; Swan et al., 2016). Generalization dimension was coded into settings, materials, people, and maintenance (i.e., generalization over time). Generalization outcome was coded as “complete” if all participants’ generalization scores were above baseline or intervention data, “partial” if some participants’ generalization scores were above baseline or intervention data, and “failure” if none of the participants’ generalization scores were above baseline or intervention data (Chandler et al., 1992; Gunning et al., 2019b).
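The three-level generalization outcome code can be sketched as below. The function name and the input format, one Boolean per participant indicating whether that participant's generalization scores were above the reference (baseline or intervention) data, are hypothetical.

```python
from typing import Sequence


def code_generalization(above_reference: Sequence[bool]) -> str:
    """Return 'complete', 'partial', or 'failure' for one study."""
    if len(above_reference) == 0:
        raise ValueError("at least one participant is required")
    if all(above_reference):
        return "complete"
    if any(above_reference):
        return "partial"
    return "failure"
```

A hypothetical study in which two of three participants scored above the reference data at the generalization probe would therefore be coded as partial.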

Inter-rater Reliability

Inter-rater reliability (IRR) was calculated on at least 20% of the literature search, title and abstract screening, full-text screening, and data extraction procedures. All IRR scores in this review were calculated by dividing agreements between two independent raters by agreements plus disagreements and multiplying by 100. IRR was 100% for the literature search, 97% for title and abstract screening, 87.8% for full-text screening, and 82% for data extraction. If there was disagreement between the raters, the discrepancy was discussed until the raters came to a consensus.

IRR for PND and Cohen’s d was calculated on 20% of the included studies. An agreement was defined as both raters recording the same percentage of non-overlapping data per behavior. Overall agreement was determined by the following formula:

$$ \frac{\#\ \mathrm{of}\ \mathrm{agreements}}{\#\ \mathrm{of}\ \left[\mathrm{agreements}+\mathrm{disagreements}\right]}\times 100=\%\ \mathrm{agreement} $$

IRR for the calculation of PND and Cohen’s d was 90%. When calculations were considered inaccurate, the co-authors reached agreement through discussion. This process was repeated until 100% agreement was achieved.
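The percentage agreement calculation used throughout this review can be sketched as follows. The function name and the representation of each rater's decisions as parallel sequences are hypothetical. Exact equality is used as the agreement criterion, which is an assumption for continuous values such as Cohen's d.

```python
from typing import Hashable, Sequence


def percent_agreement(rater_a: Sequence[Hashable],
                      rater_b: Sequence[Hashable]) -> float:
    """Agreements divided by agreements plus disagreements, times 100."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must code the same items")
    if len(rater_a) == 0:
        raise ValueError("no items were coded")
    agreements = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return 100.0 * agreements / len(rater_a)
```

If two raters independently recorded PND for ten hypothetical behaviors and matched on nine, the function would return 90.0.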

Results

Fifty-four studies met inclusion criteria. None of the studies was published between 1980 and 1999. Fourteen studies (25.9%) were published between 2000 and 2009, while 40 studies (74.1%) were published between 2010 and 2019. Two studies reported intervention outcomes for the same group of participants (Gengoux et al., 2015; Hardan et al., 2015). Data for these studies were consolidated where appropriate. Participant, intervention agent, and intervention characteristics are presented in Table 1, while dependent variables, treatment effects, and research design and strength are presented in Table 2.

Table 1 Descriptive summary of included studies
Table 2 Treatment effects and research strength

Participant Characteristics

The 54 included studies had a total of 653 participants. Twenty-three participants were included in two studies, a randomized controlled trial (Hardan et al., 2015) and its 3-month follow-up study (Gengoux et al., 2015). A total of 444 participants met inclusion criteria (not including participants who only participated in control conditions). A total of 316 (71.2%) participants were male, and 84 participants (18.9%) were female. Gender was not reported for 44 participants (9.9%). Participant age ranged from 19 to 69 months.

Intervention Agent Characteristics

Parents were trained to implement the intervention in 50 studies (92.6%), while siblings were trained in four studies (7.4%). None of the studies trained both parents and siblings together; however, two studies (3.7%) trained parents directly and involved siblings indirectly (Jull & Mirenda, 2011; Madzharova & Sturmey, 2015). None of the telehealth interventions involved siblings. A total of 193 mothers acted as intervention agents (43.8%), along with 24 fathers (5.4%), 8 brothers (1.8%), and 6 sisters (1.4%). A total of 220 intervention agents were labelled as “parent” and not specified as mother or father (49.9%).

Siblings ranged from 4 to 13 years old (M=7.86 years). One study reported that the siblings were verbal, exhibited strong interaction skills, and said that they would like to increase their interactions with their sibling with ASD (Oppenheim-Leaf et al., 2012). These siblings were reported to exhibit problem behavior. One sibling had undiagnosed sensory concerns as reported by his mother (Walton & Ingersoll, 2012). None of the siblings had cognitive or developmental concerns. No other sibling characteristics were noted in the articles.

Intervention Characteristics

Modality, Setting, and Dosage

Forty articles evaluated in situ interventions (74.1%), and 14 articles evaluated telehealth or self-directed interventions (25.9%). Of the in situ studies, 15 studies took place in a clinic (37.5%), while 18 studies took place in participants’ homes (45%). Six studies (15%) took place in multiple settings. Twenty-one studies (38.9%) were categorized as low-dose interventions, 12 (22.2%) as medium-dose interventions, seven (13%) as high-dose interventions, and three (5.6%) as extended interventions. Total hours were unclear in ten studies (18.5%). Session duration ranged from 5 min to 5 h, and session frequency ranged from three per day to one per week. Intervention duration ranged from 5 days to 1 year.

Intervention Agent Training Methods

All studies (100%) used instructions to train the intervention agent, while 27 (50%) used modelling, 23 (42.6%) used video modelling, 15 (27.8%) used role play, and 34 studies (67%) used rehearsal and feedback. One study (1.9%) read a story with the sibling, and three studies (5.6%; 75% of sibling studies) used a reward system for participating. Of the 14 telehealth interventions, 4 (28.6%) involved partial in situ training, 10 (71.4%) involved live videoconferencing, and 11 (78.6%) involved self-directed materials (e.g., self-directed websites, apps, videos).

Focused Intervention Practices

Given the selection criteria for this review, all studies used naturalistic intervention (NI; 100%) along with either parent-implemented intervention (PII; n=50, 92.6%) or peer-mediated instruction and intervention (PMII; n=4, 7.4%). PMII was coded for both sibling and non-sibling peer involvement. In addition, 48 studies used prompting (88.9%), 47 studies used reinforcement (87%), 43 used time delay (79.6%), 32 used modelling (59.3%), 17 used PRT (31.5%), 7 used functional behavior assessment (FBA; 13%), 3 used discrete trial teaching (DTT; 5.6%), 2 used video modelling (3.7%), 2 used functional communication training (FCT; 3.7%), 1 used the Picture Exchange Communication System (PECS; 1.9%), and 1 used scripting (1.9%). Prompting, reinforcement, modelling, and time delay were coded where they were outlined within the intervention method, as well as when they were inherent within other FIPs used in the study (e.g., PRT, FCT, PECS), even if they were not explicitly described in the procedure.

Intervention Packages

The most commonly studied intervention was Pivotal Response Treatment (PRT; Koegel et al., 1987). Ten studies (18.5%) used PRT as a standalone intervention package, including a follow-up study and a self-directed study. Seven additional studies (13%) combined PRT with other FIPs, four of which evaluated the Early Start Denver Model (ESDM; Vismara & Rogers, 2008) and three of which did not specify an intervention package. All PRT studies were delivered by parents. Six studies (11.1%) used Reciprocal Imitation Training (RIT; Ingersoll, 2008), including two telehealth studies and one sibling-mediated study. Four studies (7.4%) used ESDM, including two telehealth studies, and three studies (5.6%) used Project ImPACT, including two telehealth studies. Two studies (3.7%) used Milieu Teaching, including enhanced milieu teaching (EMT) and modified milieu teaching, and two studies (3.7%) used Joint Attention, Symbolic Play, Engagement & Regulation (JASPER), all delivered by parents. One study each (1.9%) used Natural Language Paradigm (NLP), a modified Parent-Child Interaction Therapy (PCIT), Pathways, Picture Exchange Communication System (PECS), Video Modeling Imitation Training (VMIT), and a combination of Stay-Play-Talk, Play Time/Social Time, and Getting Along with Others. Other telehealth intervention packages include two studies (3.7%) that used Sunny Starts/Decide Arrange Now Count Enjoy (DANCE) and one study (1.9%) that used i-PiCS. All these studies were parent mediated. Seventeen studies (31.5%) used a combination of FIPs without labelling the intervention, including 3 sibling-mediated studies.

Dependent Variables and Collateral Outcomes

Dependent Variables

A total of 113 dependent variables were targeted as primary outcomes, and 6 dependent variables were measured as part of a 3-month follow-up (Gengoux et al., 2015). Primary outcomes were coded into social engagement skills (n=32; 28.3%), language and communication skills (n=55; 48.7%), and imitation and play skills (n=18; 15.9%). Two (1.8%) primary dependent variables were measured using only comprehensive assessments that included social communication skills across multiple domains. Six dependent variables (5.3%) not directly related to social communication were also measured, including adaptive skills, compliance with instructions, and increasing appropriate behaviors or decreasing problem behaviors.

Collateral Outcomes for Child with ASD

Seventeen studies (31.5%) measured collateral outcomes for the child with ASD. Nine studies (16.7%) evaluated collateral outcomes using standardized pre-post measures that assessed multiple developmental domains (see Table 2 for specific measures). Seven studies (13%) measured specific untargeted behaviors, including joint attention (n=6), play (n=3), imitation (n=1), and affect (n=1).

Family Outcomes

Across the 50 parent-mediated studies, parent outcomes included treatment integrity (n=49; 86%); social validity (n=30; 60%); program engagement, e.g., DVD or website use (n=5; 10%); adherence and/or competence (n=3; 6%); knowledge of intervention and/or ASD (n=3; 6%); self-efficacy (n=2; 4%); quality of involvement (n=2; 4%); parent affect (n=2; 4%); stress and/or coping (n=2; 4%); family impact (n=1; 2%); engagement style (n=1; 2%); subjective distress (n=1; 2%); and observed confidence (n=1; 2%). Across the 4 sibling-mediated studies, sibling outcomes included fidelity (n=4; 100%) and social validity (n=4; 100%), including 2 studies (50%) that assessed social validity through video ratings of sibling interactions such as interaction quality and fun. The most frequently measured outcome for family members was treatment integrity, although a consistent method for assessing treatment integrity was not found. The next most frequently measured outcome for family members was social validity. To assess social validity, 29 studies (54.2%) administered questionnaires, 4 studies (6.3%) conducted open-ended interviews, and 3 studies (6.3%) used video ratings. All participants (100%) who completed questionnaires and open-ended interviews reported being satisfied with the intervention. Nine out of 10 implementers (90%) who were part of social validity video ratings were rated as meeting the authors’ criteria for social validity. Twenty studies (37%) did not explicitly measure social validity.

Treatment Effects and Research Strength

Treatment Effects

Treatment effects were calculated for each primary dependent variable. Where treatment effects could not be calculated, those studies were excluded from the analyses below. In studies that utilized SSRDs, treatment effectiveness could not be calculated for 3 studies. One study used an AB within-subject design (Hansen & Shillingsburg, 2016), one study did not provide visual representation of child outcomes (McDuffie et al., 2013), and one study displayed 8 data paths per graph, which made it impossible to calculate PND (Vismara et al., 2009). In studies that utilized group designs, effect sizes could not be calculated using Cohen’s d for four studies that did not utilize true control groups. One study compared a self-directed and therapist-assisted model of Project ImPACT (Ingersoll et al., 2016); one study compared a parent education group with a parent education plus parent support group, with both groups being trained in PRT methods (Stahmer & Gist, 2001); one study compared a basic and enhanced model of ESDM (Rogers et al., 2019); and one study used pre-post measures (McGarry et al., 2019). Additionally, in four studies, treatment effects could be calculated for some, but not all, dependent variables (e.g., PND could be calculated for one dependent variable but not for a second dependent variable that was only measured pre- and post-intervention).

Research Strength

Among all 54 studies, 17 studies (31.5%) received a “strong” rating, 18 studies (33.3%) received an “adequate” rating, and 19 studies (35.2%) received a “weak” rating (see Table 2). Among the 42 studies that utilized SSRD, 7 studies (16.7%) met criteria for a “strong” rating, 17 studies (40.5%) met criteria for an “adequate” rating, and 18 studies (42.9%) were given a “weak” rating. Among the 12 studies that utilized a group design, 10 studies (83.3%) met the criteria for a “strong” rating and two studies (16.7%) were given a “weak” rating. One group study that received a “weak” rating was given this rating only because it did not specify the gender of the participants.

Target Skills, Intervention Agents, and Intervention Characteristics

Treatment effects and research strength were synthesized by target skills, intervention agents, and intervention characteristics, i.e., modality, setting, and dosage (see Table 3). Target skills that were categorized into the social engagement domain showed a higher percentage of treatment effectiveness (60% very effective or effective; 75% large or medium effect size), as compared to target skills in the language and communication domain (37.2% very effective or effective; 50% large or medium effect size) and imitation and play domain (43.8% very effective or effective; 100% large or medium effect size, n=1). Additionally, social engagement skills were targeted in a larger proportion of studies rated as strong or adequate (70%), compared to language and communication skills (59.3%) and imitation and play skills (58.3%). In two studies (Manohar et al., 2019; Rogers et al., 2019), primary dependent variables were only measured using comprehensive assessments (Childhood Autism Rating Scale (CARS; Schopler et al., 1988); PATH Curriculum Checklist (PATH CC; Rogers et al., 2013)), which made it difficult to extract treatment effects for specific social communication categories. The comprehensive assessments measured skills from multiple domains but did not report specific data for each variable. Data from these studies were therefore not included in the calculations for treatment effects by social communication domains.

Table 3 Treatment effectiveness, effect sizes, and research strength by intervention characteristics

Parent-mediated interventions were coded as PII and sibling-mediated interventions were coded as PMII (see Table 4). Both types of intervention agents had similar proportions of very effective or effective treatment (46.3% for parents and 50% for siblings). However, parent-mediated studies (n=50) greatly outnumbered sibling-mediated studies (n=4), and parent-mediated studies had a higher proportion of studies rated as strong or adequate (68% compared to 25%).

Table 4 Treatment effectiveness, effect sizes, and research strength by focused intervention practice (FIP)

With respect to intervention modality, treatment effectiveness was similar in telehealth studies (44.4% very effective or effective) and in situ studies (47.5% very effective or effective). A larger proportion of effect sizes were in the large or medium range for telehealth studies (100%) than for in situ studies (50%), although more target skills were measured in in situ group studies (n=10) than in telehealth group studies (n=2). Telehealth had a smaller proportion of studies rated as strong or adequate (57.2% for telehealth and 66.7% for in situ). Among in situ studies, there was no difference in treatment effectiveness for intervention delivered in home (51.2% very effective or effective) or clinic (50% very effective or effective). The proportion of large or medium effect sizes was higher in home (100%) compared to clinic (50%), but fewer skills were targeted in home (n=1) than clinic (n=8). Research strength was similar across home (66.6% strong or adequate) and clinic studies (73.4% strong or adequate). Interventions in mixed settings had the lowest treatment effectiveness (16.7% very effective or effective) and research strength (50% strong or adequate), although the single outcome measure in a group study showed a large effect (100%).

With respect to dosage, the relationship between treatment effects and intervention hours is uncertain. Extended-dose intervention had the highest proportion of skills improved with very effective or effective intervention (66.7%), followed by medium-dose (61.6%), high-dose (50%), and low-dose (39.1%). However, only three studies evaluated an extended intervention (Guðmundsdóttir et al., 2017; Kashinath et al., 2006; Rogers et al., 2019). In terms of effect sizes, low- and high-dose interventions showed 100% large or medium effect sizes, compared to 50% for medium-dose and 0% for extended-dose.

FIPs and Intervention Packages

Treatment effects and research strength were synthesized for each FIP (see Table 4) and intervention package (see Table 5). PRT is separately conceptualized here as an FIP (combined with other FIPs into intervention packages such as ESDM) and as a standalone intervention package (not combined with other FIPs). PRT as an FIP was very effective or effective for 57.2% of target skills and demonstrated large or medium treatment effects for 44.4% of target skills, while PRT as a standalone intervention package was very effective or effective for 66.7% of target skills and demonstrated large or medium effects for 42.9% of target skills.

Table 5 Treatment effectiveness, effect sizes, and research strength by intervention package

Most interventions combined at least two FIPs, making it difficult to attribute treatment effects to specific FIPs. All studies utilized naturalistic interventions. Given the inclusion criteria for this review, all studies utilized either parent-mediated intervention or peer-mediated intervention (i.e., sibling-mediated intervention). Unsurprisingly, prompting, reinforcement, modelling, and time delay were commonly utilized, as these practices are inherent to most behavioral interventions. Among the less utilized FIPs, FBA was very effective or effective for 71.5% of target skills and demonstrated large or medium effects for 50% of target skills, DTT was very effective or effective for 62.5% of target skills, and video modelling was very effective or effective for 33.3% of target skills. Three FIPs (FCT, PECS, and scripting) demonstrated 100% very effective or effective treatment; however, these FIPs were only used to target one (PECS and scripting) or two (FCT) skills.

Based on Z score calculations for intervention packages, PRT (Z=98), JASPER (Z=60), and ESDM (Z=60) qualified as established EBP, while Project ImPACT (Z=50) qualified as probable EBP. However, while treatment effects were promising for PRT (see above) and JASPER (100% large or medium effects), treatment effects were low for ESDM (0% very effective or effective; 50% large or medium effects) and Project ImPACT (0% very effective or effective; effect size could not be calculated due to lack of true control group). These four intervention packages were the only ones to be evaluated in group studies, which may have contributed to higher Z scores, as group studies are weighted more heavily in the calculation. PRT, Project ImPACT, and ESDM were all evaluated in both in situ and telehealth modalities, while JASPER was only evaluated in situ. None of these studies involved siblings.
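The influence of group studies on the Z score can be seen in a short Python sketch. It assumes the weighting formula and cut-offs usually attributed to Reichow and colleagues’ criteria (30 points per strong group study, 15 per adequate group study, 4 per successful participant in strong SSRD studies, 2 per successful participant in adequate SSRD studies; 60 or more points for established EBP, 31 to 59 for probable EBP). These weights and cut-offs are not restated in this section and should be checked against the original rubric. The function names and all counts below are hypothetical and do not correspond to any intervention package in this review.

```python
def ebp_z_score(group_strong, group_adequate, ssrd_strong, ssrd_adequate):
    """Weighted evidence score (assumed weights, see lead-in).

    group_strong / group_adequate: number of group studies with that rating.
    ssrd_strong / ssrd_adequate: number of participants for whom the intervention
    was successful in SSRD studies with that rating.
    """
    return 30 * group_strong + 15 * group_adequate + 4 * ssrd_strong + 2 * ssrd_adequate


def ebp_status(z):
    """Classify a Z score using the assumed cut-offs."""
    if z >= 60:
        return "established"
    if z > 30:
        return "probable"
    return "not established"


# Hypothetical package A: two strong group studies and no SSRD evidence.
z_a = ebp_z_score(group_strong=2, group_adequate=0, ssrd_strong=0, ssrd_adequate=0)

# Hypothetical package B: no group studies, 7 successful participants in strong
# SSRD studies and 10 in adequate SSRD studies.
z_b = ebp_z_score(group_strong=0, group_adequate=0, ssrd_strong=7, ssrd_adequate=10)

print(z_a, ebp_status(z_a))  # 60 established
print(z_b, ebp_status(z_b))  # 48 probable
```

Under these assumed weights, two strong group studies alone reach the established threshold, whereas a package evaluated only in SSRDs needs 15 successful participants in strong studies to reach the same score.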

Generalization

Generalization Promotion Strategies

Generalization promotion strategies are outlined in Table 6. In the category of exploiting current functional contingencies, 54 studies (100%) recruited natural contingencies and targeted functional behaviors, 40 studies (74.1%) involved contacting natural reinforcement, 14 studies (25.9%) involved modifying maladaptive consequences, and 12 studies (22.2%) specified reinforcement of occurrences of generalization. In the category of training diversely, 42 studies (77.8%) used sufficient stimulus exemplars, 34 studies (63%) used sufficient response exemplars, 9 studies (16.7%) made antecedents or consequences less discriminable, 37 studies (68.5%) occurred across settings or in natural settings, and 6 studies (11.1%) involved skills teaching across people. In the category of incorporating functional mediators, 54 studies (100%) programmed common stimuli and 5 studies (9.3%) incorporated self-mediated stimuli. Two studies (3.7%) used sequential modification.

Table 6 Generalization measures and outcomes

Generalization Dimension

Sixteen studies (29.6%) evaluated generalization across settings or situations, 13 studies (24.1%) evaluated generalization across materials, and 8 studies (14.8%) evaluated generalization across people. Thirty-three studies (61.1%) measured generalization across time (i.e., maintenance). Fourteen studies (25.9%) did not measure any generalization dimension.

Generalization Outcomes

Of the 40 studies that measured generalization, one study (2.5%) demonstrated failure to generalize, 26 studies (65%) demonstrated partial generalization, and 11 studies (27.5%) demonstrated complete generalization. Latency to maintenance probe ranged from 1 week to 12 months. Of the 33 studies that measured maintenance, four studies (12.1%) had unclear latency. Twenty-three studies (69.7%) measured maintenance once after treatment, 3 studies (9.1%) measured twice, and 5 studies (15.2%) measured 3 or more times. The maximum number of maintenance checks was 6.

Discussion

The purpose of this review was to synthesize and evaluate the literature on family-mediated interventions in order to identify the characteristics that impact treatment effectiveness along with generalization strategies and outcomes. Fifty-four studies were included in this review. Seventy-four percent of the included studies were published in or after 2010, corroborating the recent interest in family-mediated interventions and the relative infancy of the literature on this topic. The current review focused on interventions aiming to improve social communication skills in children with ASD. Social communication was categorized into social engagement, language and communication, and imitation and play. Treatment was evaluated as slightly more effective for social engagement skills when compared to language and communication skills and imitation and play skills, although not by a wide margin.

Consistent with the parent training literature, mothers were found to participate most often in family-mediated interventions. Only four studies investigated sibling-mediated interventions, with 75% of these studies rated as weak. Treatment was found to be very effective or effective for half of the target skills. The study that received an adequate rating (Oppenheim-Leaf et al., 2012) was found to be very effective at improving sharing behaviors and effective at improving play with others. The “weak” studies were shown to be very effective at increasing verbal initiation (Ferraioli & Harris, 2011) and very effective or effective at improving joint attention skills (Ferraioli & Harris, 2011; Tsao & Odom, 2006). Qualitatively, parents reported increased cooperative play, shared enjoyment, amount of time spent together, and improved interaction quality (Ferraioli & Harris, 2011; Oppenheim-Leaf et al., 2012). Taken together, the results tentatively point towards social engagement as a target skill area for researchers and clinicians interested in involving siblings in intervention. The current evidence for sibling-mediated intervention is still weak. However, given the small number of studies, the recent emergence of research interest, and the potential for positive outcomes for children with ASD and their siblings, more high-quality research in this area is recommended.

None of the included studies involved training both parents and siblings, although two studies trained parents directly and involved siblings indirectly. In one study, parents were trained to facilitate play sessions involving mutually reinforcing activities and cooperative arrangements; one of the parent participants arranged for the child’s older sister to act as his play partner (Jull & Mirenda, 2011). This intervention was very effective at increasing synchronous reciprocal interactions between siblings. The other study trained parents to teach peer-to-peer manding; one of the parent participants chose to involve the child’s younger sister as his peer partner (Madzharova & Sturmey, 2015). In this case, the intervention was not effective at improving the child’s independent requests to the sibling. Future research could examine child and family characteristics (e.g., developmental level, communication status, and areas of strength, challenge, and interest) that determine the most beneficial model of sibling-mediated intervention for each family unit (Wright & Benigno, 2019). For instance, researchers could compare horizontal, vertical, and pyramidal approaches, in which the practitioner trains the child and sibling on the same skills together, the practitioner trains the sibling to teach skills to the child, or the practitioner trains parents to teach one or both of their children.

Telehealth is a promising intervention modality because it can increase access to intervention for families living in areas where evidence-based intervention is scarce. These interventions could also increase access for families who experience barriers related to logistical challenges such as scheduling or childcare issues. Fourteen telehealth studies were evaluated in this review. Telehealth parent training methods included live videoconferencing, self-paced videos, websites or apps, and training manuals. Training methods typically utilized in situ, including providing instructions, modelling (in the form of video modelling), rehearsal, and feedback, were also utilized in most telehealth interventions. Treatment effects were similar for telehealth and in situ interventions, suggesting that children were able to make comparable gains regardless of whether parents were trained remotely or face-to-face. This represents a promising opportunity for expanding the reach of evidence-based interventions, with minimal sacrifice in terms of outcome.

Surprisingly, of the in situ interventions, no difference was found in treatment effects between home and clinic settings. We hypothesized that interventions set in the family’s home would facilitate both family and child learning; however, this was not demonstrated in the current review. Additionally, it would be reasonable to assume that home-based interventions would increase access and adherence by reducing time and resource barriers for families. Given that clinic-based interventions allow clinicians to maximize their clinical time by eliminating commutes between families, the decision to provide home or clinic-based services may be best made on a case-by-case basis.

Greater intervention intensity has historically been associated with greater child gains (e.g., Eldevik et al., 2010; Magiati et al., 2007; Virues-Ortega et al., 2013). Most of the studies in this review provided low to medium hours of intervention (≤20 h), potentially because of the inherent dynamics of involving family members. Treatment effects demonstrated mixed results based on dose, with no apparent benefit to delivering additional hours of intervention. This is consistent with recent meta-analyses that did not find an effect of dosage on treatment outcomes in parent-mediated interventions (Nevill et al., 2018) and social-communication interventions (Fuller & Kaiser, 2019). There may be advantages to providing lower-dose interventions, such as increasing parents’ attendance and adherence (Carr et al., 2016). When involving family members in intervention, it is important to consider the potential effects of the intervention on the family as a whole and to weigh the costs and benefits of increased expectations on the family member delivering the intervention. For instance, in one of the included studies, three parents reported difficulty implementing the intervention at home due to comorbid attention-deficit hyperactivity disorder (ADHD), and two parents reported difficulty due to personal distress and lack of social support (Manohar et al., 2019). Few studies measured family outcomes other than treatment integrity and social validity. Future studies should consider the collateral effects of intervention on the child and family unit.

Three intervention packages, PRT, ESDM, and JASPER, qualified as established EBPs, while Project ImPACT qualified as a probable EBP. However, Project ImPACT and ESDM did not show strong treatment effects. All these interventions were only evaluated with parents delivering the intervention; adapting these interventions for sibling involvement may provide further opportunities to improve outcomes for the child and family. Several other intervention packages were shown to be very effective or effective for at least two-thirds of the target skills, even though their Z scores were not yet high enough to qualify as established or probable EBP. These packages include milieu training, PECS, VMIT, Pathways, and Sunny Starts/DANCE. More research evaluating these intervention packages is recommended. Similarly, focused intervention practices such as FBA, DTT, FCT, PECS, and scripting showed promising treatment effects but were only utilized in a small number of studies. Incorporating these FIPs into other intervention packages could potentially improve outcomes, depending on child characteristics and target skills.

A major consideration of family-mediated intervention is the potential for generalization, which is often challenging for children with ASD (Vismara & Rogers, 2010). While it is discouraging that a quarter of studies did not evaluate any generalization dimension, it is reassuring that among the studies that did evaluate at least one generalization dimension, over 90% demonstrated complete or partial generalization outcomes. Several strategies for promoting generalization are intrinsic to family-mediated intervention, such as recruiting natural contingencies and programming common stimuli. Potentially due to the primary role of family members in family-mediated interventions, addressing functional behaviors, contacting natural reinforcement, and conducting intervention in natural or multiple settings were also prevalent. Less utilized strategies included modifying maladaptive consequences, reinforcing occurrences of generalization, making antecedents or consequences less discriminable, and teaching skills across different people. These strategies are less inherent in family-mediated intervention and would require conscious effort on the part of the clinician to incorporate. The least utilized generalization promotion strategy was sequential modification, in which further strategies are employed to promote generalization if generalization results are not satisfactory. This could be due to the limitations of conducting research studies (e.g., time constraints) and may not reflect what occurs in clinical intervention. The lack of studies that take place in the clinical “real world” has been previously noted (Beaudoin et al., 2014).

Limitations

The articles included in this review were evaluated for quality using Reichow and colleagues’ (2008) operationalized rubric. It should be noted that this evaluative method has different criteria for SSRD and group comparison designs. Primary quality indicators for SSRD rely on strong research design as well as positive results (i.e., baseline, visual analysis, experimental control), while primary quality indicators for group comparison designs rely on strong research design alone, with no indicators related to results. Additionally, the stringent requirements for primary quality indicators lead to an “unacceptable” rating for baseline if even one out of several baselines is unstable, and an “unacceptable” rating for experimental control if even one out of several data paths did not demonstrate change in the dependent variable with introduction of the independent variable. This may explain the large proportion of “weak” ratings for studies that employed SSRD. Likewise, studies that employed a group design could obtain a “strong” rating even if results were negative or inconclusive. Previous reviews (e.g., Ferguson et al., 2018; Tomlinson et al., 2018) have noted similar issues using this evaluative method. In the current review, these potential discrepancies were partly accounted for by including additional evaluation of treatment effects using PND and Cohen’s d. Future reviews could evaluate methodological rigor using additional or alternative operationalized methods (e.g., What Works Clearinghouse, 2010).

Treatment effects were only calculated for primary outcome variables, i.e., target skills. Treatment effects were not calculated for secondary outcomes. Future studies could evaluate the relationship between target skills and collateral effects, as this could provide valuable information on which target skills would have the widest-reaching impact on the child and family unit. Collateral benefits have been shown when social communication skills are improved (Ledbetter-Cho et al., 2017); however, this has not been systematically evaluated in family-mediated interventions.

Summary

In the current review, parent-mediated PRT, ESDM, and JASPER were the only intervention packages that qualified as established EBP. Treatment was more effective for social engagement skills compared to language and communication skills and imitation and play skills. No major differences in treatment effects were found based on intervention agent, treatment modality, setting, or dosage. Generalization outcomes were encouraging, although more focus should be placed on incorporating additional generalization promotion strategies into family-mediated interventions. Sibling-mediated intervention, while still in its infancy in the literature, has the potential for meaningful impact when incorporated into clinical practice. A further avenue of research could be to explore interventions that involve the complete family unit by addressing motivational variables and intervention outcomes of parents, siblings, and children with ASD to promote integrated learning and benefits within the family. Telehealth is also a promising area of further research, not only during the current climate of social restrictions due to the COVID-19 pandemic, but also when moving forward to reach families who previously had limited access to evidence-based intervention.