Repetitive and stereotyped behaviors are one of the core features of autism spectrum disorders (ASD; American Psychiatric Association 2000). In line with current diagnostic criteria for the condition, individuals must display restricted, repetitive, and stereotyped patterns of behavior, interests, and activities as manifested by at least one of the following: (1) encompassing preoccupation with one or more stereotyped and restricted patterns of interest that is abnormal either in intensity or focus; (2) apparently inflexible adherence to specific, nonfunctional routines or rituals; (3) stereotyped and repetitive motor mannerisms (e.g., hand or finger flapping or twisting, or complex whole-body movements); and (4) persistent preoccupation with parts of objects (American Psychiatric Association 2000).

These ritualistic, repetitive, stereotyped, and obsessive-compulsive behaviors are displayed by most individuals with ASD to some degree. A recent study by Murphy et al. (2009) showed that 72 % of children with autism engaged in some form of stereotypy or repetitive behavior. While stereotypy is not unique to autism, evidence suggests that the stereotypy displayed by those with autism differs from that displayed by those with other disorders. Elevated levels of stereotypy have been observed in individuals with autism in comparison to those with intellectual disability (Bodfish et al. 2000), highlighting the importance of using evidence-based treatments which are unique to the disorder.

Many different topographies of such behavior are reported in the literature, including vocal stereotypy (Ahearn et al. 2007), repetitive face rubbing (Britton et al. 2002), hand flapping (Conroy et al. 2005), perseverative speech (Rehfeldt and Chambers 2003), body rocking and head weaving (Ahearn et al. 2005), lining up objects (Sigafoos et al. 2009), and mouthing (Tarbox et al. 2002), as well as stereotyped self-injurious behaviors, dyskinesia, akathisia, obsessions, and compulsions (Healy and Leader 2011).

Multiple theories have been proposed regarding the purpose of these behaviors for individuals with ASD, with a focus mainly on operant accounts or neurological interpretations. There is some evidence to suggest that social deprivation, impoverished environments, pharmacological agents, and arousal levels may be implicated in the development of stereotypy. The review by Rapp and Vollmer (2005a) on neurobiological interpretations of stereotypy suggests that some evidence exists for the hypothesis that social deprivation and being raised in impoverished environments increase engagement in stereotypy. Animals reared in social isolation or restricted environments have been demonstrated to engage in higher levels of repetitive behavior than those that have not been subjected to such conditions. Ridley (1994) hypothesized that increased rates of stereotypy observed under these conditions are a result of restricted opportunities to engage in any other behavior. Indeed, Langen et al. (2011) argue that similar effects have been observed in children raised in impoverished environments such as orphanages.

A second theory outlined by Rapp and Vollmer (2005a) involves the role of pharmacological agents in engagement in stereotypy. In animal models, physiological over-arousal resulting from manipulations of corticosterone (an index of physiological arousal) has been shown to induce higher rates of stereotypy, with engagement in stereotypy subsequently reducing arousal (see Rapp and Vollmer (2005a) and Langen et al. (2011) for a detailed review of these mechanisms). Dopamine and serotonin systems have both been implicated in the development of stereotypy in animals. Animals injected with dopamine agonists demonstrate increased levels of stereotypy, while injection with dopamine antagonists results in decreases in stereotypy. This suggests that dopaminergic drugs may attenuate engagement in stereotypy (Langen et al. 2011). Similarly, stimulation of the postsynaptic serotonin receptors has been demonstrated to increase engagement in stereotypy, suggesting that serotonin reuptake inhibitors may decrease instances of the behavior (Langen et al. 2011).

Physiological stress has been demonstrated to increase levels of stereotypy in both animals and humans. Rapp and Vollmer (2005a) highlighted research which indicates that increased physiological arousal affects the probability of engagement in repetitive behavior in animals and humans. More recently, Lydon et al. (2012) presented evidence indicating that stereotypy in individuals with ASD did not modulate arousal levels. In contrast to previous research, heart rate was not found to decrease following engagement in stereotypy. Stereotypy occurred during times of both high and low physiological arousal, and subsequent decreases in arousal were not found following engagement in stereotypy. The authors concluded that, while replication of such results is needed, stereotypy may produce reinforcement in the form of elevated heart rate.

Some researchers have argued that stereotypy is a result of an underlying problem with neurological processing and organization, and that engagement in the behavior regulates one’s ability to attend and be sensitive to external environmental stimuli (Smith et al. 2005). Sensory integration is based on the provision of sensory stimulation to address such neurological processing. Since Ayres (1972) first described the approach, various techniques have been developed to provide sensory stimulation including, for example, deep pressure, brushing, massage, and weighted vests. Vollmer et al. (2014) believe that this kind of stimulation could be conceived of as environmental enrichment or differential reinforcement if it is highly preferred by the individual.

One further theory is the operant view of repetitive, ritualistic, and stereotyped behavior. Specifically, this theory suggests that repetitive behavior may be maintained by reinforcing consequences automatically produced by engaging in the behavior (Lovaas et al. 1987). While most repetitive, ritualistic, and stereotyped behaviors may be maintained by automatic reinforcement, it is also possible that these behaviors could be maintained by external social consequences (Wilke et al. 2012). Wilke et al. (2012) evaluated the function of stereotyped behavior for 53 individuals with ASD using indirect functional analysis. While the majority of stereotyped behaviors were found to be maintained by automatic reinforcement, 10 % of the participants demonstrated stereotyped behavior which was maintained by social consequences. Furthermore, Rehfeldt and Chambers (2003) found that the perseverative speech of a girl with autism was maintained by socially mediated reinforcement in the form of attention. This highlights the importance of determining the function of behavior prior to treatment. Indeed, DiGennaro Reed et al. (2012), in their literature review, found that the majority of published studies which used treatments to decrease stereotypy did not use an experimental functional analysis prior to implementing a treatment. The authors stress the importance of identifying the function of these behaviors prior to treatment to ensure evidence-based practice. Boyd et al. (2012) also stress the importance of identifying the function of stereotypy and the role of functional analysis in designing treatments for stereotypy.

While numerous reviews of treatments of stereotypy for people with autism have been published (e.g., Boyd et al. 2012; DiGennaro Reed et al. 2012; Rapp and Vollmer 2005b), to date, no research has systematically evaluated the literature and established which treatments may be regarded as evidence based. Furthermore, no research has been conducted which compares treatments across disciplines. First, the current review aims to determine which treatments may be deemed evidence based in the treatment of stereotypy. Second, this review aims to compare the efficacy of function- and nonfunction-based treatments for stereotypy. While previous reviews have highlighted the utility of functional analysis in the treatment of stereotypy (e.g., Boyd et al. 2012; DiGennaro Reed et al. 2012; Rapp and Vollmer 2005b), a direct comparison of function- and nonfunction-based treatments has not been conducted.

Method

Search Procedures

Searches were conducted using the following databases: (1) Scopus, (2) PsycINFO, (3) Medline, (4) Web of Science, and (5) Psychological and Behavioral Sciences. Searches were carried out in each database using the term “autis*” in combination with each of the following terms: Stereotypy, Repetitive Behavior, Compulsive Behavior, Self stimulatory, Obsessive Behavior, Behavior Modification, Applied Behavior Analysis, Behavioral Intervention, Pharmaco*, Psychopharmaco*, Antidepressant, Psychostimulants, Anticonvulsants, Antipsychotic, Sensory Integration, Diet, and Auditory Integration. Abstracts of records returned were reviewed in order to determine inclusion in the review.
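The pairing of search terms described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `build_queries` and the use of a Boolean `AND` operator to join the terms are assumptions, since the exact query syntax used in each database is not reported.

```python
# Secondary search terms, copied from the search procedure described above.
SECONDARY_TERMS = [
    "Stereotypy", "Repetitive Behavior", "Compulsive Behavior",
    "Self stimulatory", "Obsessive Behavior", "Behavior Modification",
    "Applied Behavior Analysis", "Behavioral Intervention", "Pharmaco*",
    "Psychopharmaco*", "Antidepressant", "Psychostimulants",
    "Anticonvulsants", "Antipsychotic", "Sensory Integration", "Diet",
    "Auditory Integration",
]


def build_queries(primary="autis*", secondary=SECONDARY_TERMS):
    """Pair the primary term with each secondary term.

    Joining with AND is an assumption made for illustration; the source
    only states that the terms were used "in combination".
    """
    return [f"{primary} AND {term}" for term in secondary]


queries = build_queries()
print(len(queries))  # one query per secondary term, to be run in each database
print(queries[0])
```

Under these assumptions, each of the five databases would receive the same list of paired queries.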

Inclusion and Exclusion Criteria

To be included in this review, the article had to meet five inclusion criteria. First, the study was published in English and in a peer-reviewed journal to ensure that all studies had been subjected to quality control via peer review. Second, the study reported an evaluation of one or more treatments for stereotypy, repetitive behavior, and obsessive or compulsive behavior. Treatment was defined as implementing one or more therapeutic treatments with the main aim of reducing the frequency or severity of stereotypy. Third, the study included objective data, based on either direct observation or use of standardized rating scales, on the frequency and/or severity of stereotypy in at least one person with ASD. Fourth, participants in included studies must have had a main diagnosis of ASD. Fifth, the study must have been published after 1990.

Repetitive self-injurious behavior was excluded from this review, as differentiating between repetitive and nonrepetitive self-injurious behavior was beyond the scope of this article.

Selection of Articles

Seventy-one articles were identified for inclusion. Articles were categorized as (1) function-based behavioral treatments, defined as treatment approaches drawn from the results of a functional assessment or functional analysis of the problem behavior prior to treatment; (2) nonfunction-based behavioral treatments, defined as any treatment which utilized the principles of applied behavior analysis but had not been based on a previous functional analysis or functional assessment of the target behavior; (3) pharmacological, defined as any treatment which utilized psychotropic medication with the aim of decreasing stereotypy; (4) sensory integration-based treatments, defined as treatments which encompassed all or some aspects of the sensory integration therapy described by Ayres (1972); and (5) other, defined as any treatment which did not fit within the above categories.

Function- and nonfunction-based treatments were further divided into (a) antecedent-based treatments, (b) reinforcement or skills-based treatments, (c) consequence-based treatments, and (d) mixed treatments. Where one study evaluated two or more different treatments, each treatment is presented in the relevant category.

Determining Treatment Efficacy

Percentage Reduction of Behavior

Treatment efficacy of each study was determined by calculating a percentage reduction of the target behavior from baseline to treatment phases. Percentage reduction was calculated using the method outlined by Kahng et al. (2012). The values of the last five data points in the baseline and treatment phases were first determined. Where fewer than five data points were available in either phase, the values were determined for the maximum number of data points available in both phases. Where a reversal was used, values were extracted from the last treatment and baseline phases.

Mean condition values were calculated for both the baseline and treatment phases. Treatment effectiveness was determined by subtracting the mean value of the treatment phase from the mean value of the baseline phase. This was then divided by the mean baseline value and multiplied by 100 to obtain a percentage decrease or increase in stereotyped and repetitive behaviors. A negative percentage indicates an increase in behavior.

Where more than one treatment was implemented within a study, each treatment was evaluated independently with percentage reduction of behavior (PRB) calculated for each individual treatment. Treatments were categorized as “effective” where a minimum of 50 % reduction was observed. If a reduction of less than 50 % was observed, treatments were categorized as “ineffective”.
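The PRB calculation and the 50 % cut-off described in this section can be illustrated with a short Python sketch. The function names and the example data are hypothetical and are not taken from Kahng et al. (2012) or from any included study; the sketch assumes that the same number of trailing data points is taken from each phase, which is one reading of "the maximum number of data points available in both phases".

```python
from statistics import mean


def percentage_reduction(baseline, treatment, n_points=5):
    """Percentage reduction of behavior (PRB) from baseline to treatment.

    Uses the last `n_points` values of each phase, or fewer when either
    phase has fewer data points (assumption: same count for both phases).
    """
    n = min(n_points, len(baseline), len(treatment))
    if n == 0:
        raise ValueError("both phases need at least one data point")
    baseline_mean = mean(baseline[-n:])
    treatment_mean = mean(treatment[-n:])
    if baseline_mean == 0:
        raise ValueError("PRB is undefined when the baseline mean is zero")
    return (baseline_mean - treatment_mean) / baseline_mean * 100


def classify_treatment(prb, threshold=50.0):
    """Label a treatment using the review's minimum 50 % reduction rule."""
    return "effective" if prb >= threshold else "ineffective"


# Hypothetical session data (e.g., percentage of intervals with stereotypy).
baseline_phase = [20, 20, 20, 20, 20]
treatment_phase = [5, 5, 5, 5, 5]
prb = percentage_reduction(baseline_phase, treatment_phase)
print(prb, classify_treatment(prb))  # 75.0 effective
```

A negative return value corresponds to an increase in behavior relative to baseline, and is therefore labeled "ineffective" under the same rule.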

Criterion for Evidence-Based Treatments

The criterion for determining empirically supported therapies outlined by Chambless and Hollon (1998) was applied in order to determine whether each treatment could be considered “efficacious”. In accordance with this method, a treatment is deemed “efficacious” if at least two well-designed between-group experiments have demonstrated that the treatment is either superior to an alternative treatment or equivalent to an already established treatment. For single-case research, a treatment is shown to be evidence based if three or more independent small N studies demonstrate positive results with at least nine participants. A treatment is deemed “lacking in sufficient evidence” or “promising” if initial results are positive but the treatment as yet lacks the required number of studies or participants (Chambless and Hollon 1998). Treatment approaches were deemed “ineffective” if three or more studies within the existing literature demonstrated that the treatment was ineffective in reducing stereotypy or repetitive behaviors.
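The decision rules above can be expressed as a minimal Python sketch. The function and parameter names are hypothetical, and the order in which the rules are checked is an assumption, because the text does not state how a treatment with both supporting and non-supporting studies should be resolved. The final "no evidence" label is likewise introduced only to make the function total.

```python
def classify_evidence(group_experiments=0, single_case_studies=0,
                      single_case_participants=0, ineffective_studies=0):
    """Apply the review's reading of the Chambless and Hollon (1998) criteria.

    group_experiments: well-designed between-group experiments with
        positive results
    single_case_studies: independent small N studies with positive results
    single_case_participants: participants across those small N studies
    ineffective_studies: studies showing the treatment to be ineffective
    """
    meets_group_rule = group_experiments >= 2
    meets_single_case_rule = (single_case_studies >= 3
                              and single_case_participants >= 9)
    # Assumption: efficacy is checked first, then the ineffectiveness rule.
    if meets_group_rule or meets_single_case_rule:
        return "efficacious"
    if ineffective_studies >= 3:
        return "ineffective"
    if group_experiments > 0 or single_case_studies > 0:
        return "promising"
    return "no evidence"  # hypothetical label, not used in the review


# Hypothetical counts, not taken from the review's tables.
print(classify_evidence(single_case_studies=4, single_case_participants=6))
```

With the hypothetical counts shown, the treatment has enough studies but too few participants, so it falls into the "promising" category.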

Interrater Agreement

Interrater agreement for PRB was calculated on 38.36 % of the studies identified through the literature search. Two raters independently calculated treatment efficacy for each study. Agreement was defined as obtaining the exact same percentage for each study. Interrater agreement was calculated by dividing the total number of agreements by the total number of agreements plus disagreements and multiplying by 100. Interrater agreement was determined to be 96.43 %.
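The agreement formula described above can be sketched as follows. The function name and the example PRB values are hypothetical; the sketch assumes both raters report PRB rounded to the same number of decimal places, so that "the exact same percentage" can be tested with simple equality.

```python
def percent_agreement(rater_a, rater_b):
    """Agreements / (agreements + disagreements) * 100.

    Each list holds one PRB value per study, in the same study order.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must score the same non-empty set of studies")
    agreements = sum(a == b for a, b in zip(rater_a, rater_b))
    disagreements = len(rater_a) - agreements
    return agreements / (agreements + disagreements) * 100


# Hypothetical PRB values from two raters for four studies.
rater_1 = [50.0, 75.5, 10.0, 20.0]
rater_2 = [50.0, 75.5, 10.0, 25.0]
print(percent_agreement(rater_1, rater_2))  # 75.0
```

Because agreement requires identical values, a single mismatch on any study counts as a full disagreement regardless of how small the difference is.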

Results

Of the 71 articles included in this review, 37 were categorized as function-based treatments, 22 were categorized as nonfunction-based treatments, 5 evaluated pharmacological treatments, 5 used sensory integration techniques, and 2 were identified as using other treatments.

Function-Based Behavioral Treatments

Thirty-seven studies were identified which evaluated 45 treatments derived from a previous functional analysis or functional assessment. Function-based treatments were further categorized as (1) antecedent treatments, (2) reinforcement or skills-based treatments, (3) consequence-based treatments, or (4) mixed treatments.

Antecedent Treatments

Antecedent treatments were defined as treatments which altered antecedent variables such as the environment or instructional context. Six studies were identified which evaluated the effects of an antecedent treatment derived from previous functional analysis (see Table 2). For all seven participants, stereotypy was found to be automatically reinforced and each study evaluated the effects of environmental manipulations as an antecedent treatment on occurrence of stereotypy. Environmental manipulations included environmental enrichment and free access to items of matched or unmatched stimulation (see Table 1 for a description of these interventions). PRB was calculated for six of these studies and is summarized in Table 2.

Table 1 Description of behavioral interventions
Table 2 Summary of efficacy of function-based treatments

Ahearn et al. (2005) compared the effects of providing continuous access to items of matched stimulation and unmatched stimulation with two participants aged 11–13 years (m = 12 years) whose stereotypy was maintained by automatic reinforcement using an alternating treatments design. Both matched and unmatched stimulation (see Table 1) effectively reduced stereotypy. However, unmatched stimulation was more effective (mean PRB = 78.04 %; range = 69.33–86.76 %) than matched stimulation (mean PRB = 62.89 %; range = 54.7–71.08 %).

Similarly, Hagopian and O’Toole (2009) made stimuli which competed with automatically reinforced repetitive body tensing freely available to one participant aged 10 years. The availability of competing stimuli was demonstrated to effectively decrease stereotypy using a reversal design (mean PRB = 67.43 %; range = 67.43–67.43 %).

A third study evaluated the effect of continuous access to matched stimulation (see Table 1) on the vocal stereotypy of two males aged 8–9 years (m = 8.5 years). Love et al. (2012) demonstrated that this treatment was ineffective in decreasing stereotypy (mean PRB = 49.23 %; range = 45.37–60.79 %).

Luiselli et al. (2008) evaluated two items which were thought to compete with the automatically reinforced saliva play of a 6-year-old male. Using an alternating treatments design, continuous access to matched stimulation (see Table 1), either chewing gum or a chew toy, was provided to the participant. While the toy was effective in reducing stereotypy to zero levels (mean PRB = 100 %; range = 100–100 %), an increase in behavior was observed when the chewing gum condition was in effect (PRB = −44.68 %; range = −44.68 to −44.68 %).

The effect of continuous access to toys on mouthing was examined by Tarbox et al. (2002) with one participant aged 4 years using a reversal design. Continuous access to toys was shown to be ineffective in decreasing stereotypy (PRB = 18.22 %; range = 18.22–18.22 %).

Sidener et al. (2005) evaluated the effect of environmental enrichment (see Table 1) on the automatically reinforced repetitive surface scratching behavior of two girls aged 6 years using a multiple baseline across participants design. This treatment was effective in decreasing engagement in stereotypy (mean PRB = 67.45 %; range = 65.53–67.45 %).

Antecedent treatments which were based on a prior functional analysis were effective in decreasing stereotypy across six participants, with one study demonstrating that antecedent-based treatments were ineffective with one participant. According to the criteria for evidence-based treatments by Chambless and Hollon (1998), antecedent treatments which are designed based on prior identification of the function of stereotypy may be considered “efficacious” (see Table 2).

Reinforcement or Skills-Based Treatments

Eleven studies were identified which evaluated the effects of reinforcement or skills-based treatments which were implemented following identification of behavioral function (see Table 2). In this category, the majority of the participants’ stereotypy was automatically reinforced (n = 15). One study identified the stereotypy of one participant to be multiply controlled by attention, escape from task demand, and an unidentified source (Kennedy 1994). A variety of reinforcement or skills-based treatments were utilized including noncontingent reinforcement, teaching functional alternative skills, increasing on task behavior, and self-management (see Table 1 for a description of these interventions). PRB was calculated for 10 of these studies and is summarized in Table 2.

Appropriate alternative verbal behavior was taught to three participants aged 8–10 years (m = 9.33 years) in order to determine its effect on vocal stereotypy (Colón et al. 2012). A multiple baseline across participants design was used to assess the effect of verbal operant training on the target behavior. Both mands and tacts were taught to one participant, while only tacts were taught to the second participant. Neither condition resulted in effective decreases in vocal stereotypy (mean PRB = −2.38 % (range = −2.38 to −2.38 %) and mean PRB = −63.29 % (range = 54.76 to −118.05 %), respectively).

Functional communication training (FCT) was assessed by Kennedy (1994) as a treatment for multiply controlled motor stereotypy of one participant aged 10 years. The participant was taught to mand for attention, escape from task demand, and for no attention. A multiple baseline across behavioral functions demonstrated that FCT effectively decreased stereotypy (mean PRB = 78.25 %; range = 62.25–100 %).

Lang et al. (2010) evaluated the effect of teaching appropriate play skills to four children aged 5–11 years (m = 5 years) who engaged in repetitive tacting, counting, and object manipulation. An alternating treatments design was used to evaluate the effect of increasing appropriate play skills on stereotypy in comparison to a treatment which added an abolishing operation component. Teaching appropriate play was effective in decreasing stereotypy (mean PRB = 78.39 %; range = 78.39–78.39 %).

Mancina et al. (2000) implemented self-management (see Table 1) by teaching a 12-year-old child to self-monitor her own vocal stereotypy. Initially, the participant was taught to self-monitor, self-record, and self-reinforce her own behavior. This treatment was evaluated using a multiple baseline across settings design. An initial mean PRB of 70.61 % (range = 70.61–70.61 %) was observed when a professional service provider implemented the treatment. Effective outcomes were also demonstrated when a class teacher implemented the intervention (mean PRB = 80.64 %; range = 53.09–95.01 %).

Roane et al. (2003) evaluated the effect of providing noncontingent access to food, as a form of competing stimulus, on the mouthing of an 8-year-old boy. Using a multiple baseline across settings design, a mean PRB of 88.87 % (range = 71.77–100 %) was observed.

Similarly, Groskreutz et al. (2011) compared the effect of noncontingent access to high-competition and high-preference items (see Table 1) on vocal stereotypy with a 4-year-old boy. A reversal design revealed that neither item effectively decreased stereotypy (mean PRB = 49.71 %; range = 48.71–48.71 % and mean PRB = −25.75 %; range = −25.75 to −25.75 %, respectively). However, high-competition items were more effective than high-preference items in reducing vocal stereotypy.

Taylor et al. (2005) implemented a treatment whereby reinforcement in the form of auditory toys was provided to a 4-year-old child on a fixed time schedule. A reversal design demonstrated that the treatment was ineffective in reducing vocal stereotypy (mean PRB = 18.14 %; range = 18.14–18.14 %).

Noncontingent access to matched stimulation (see Table 1) was also evaluated by Lanovaz and Argumedes (2010) with one participant, aged 3 years, who engaged in repetitive mouthing. A three-component multiple schedule was implemented to examine the immediate and subsequent effects of noncontingent access to matched stimulation, which effectively decreased immediate engagement in stereotypy (PRB = 90.95 %; range = 90.95–90.95 %). When noncontingent access to matched stimulation was removed, an increase in behavior was observed (PRB = −4.27 %; range = −4.27 to −4.27 %).

Differential reinforcement (DR) procedures were used across five studies in this category. The studies presented here used DR procedures without extinction (see Table 1); DR procedures which incorporated an extinction component were categorized as “mixed treatments” and are presented below.

Anderson and Le (2011) assessed the effects of a DRO and DRA (see Table 1) on vocal stereotypy of one participant aged 7 years using a series of reversals. As no baseline data was available, PRB could not be calculated; however, neither the DRO nor the DRA contingencies effectively decreased stereotypy.

A DRO contingency was also evaluated by Lanovaz and Argumedes (2010) with one participant aged 3 years who engaged in repetitive mouthing. A three-component multiple schedule was implemented to examine the immediate and subsequent effects of the DRO contingency. The DRO contingency effectively decreased immediate engagement in stereotypy (mean PRB = 50.67 %; range = 50.67–50.67 %), and, when it was removed, an increase in behavior was observed (mean PRB = −9.94 %; range = −9.94 to −9.94 %).

Nuernberger et al. (2013) implemented three treatments which used a DRO contingency (see Table 1) to treat repetitive hair manipulation of a 19-year-old female. A DRO was implemented using items which competed with engagement in the target behavior as a reinforcer, which produced a mean PRB of 98.8 % (range = 98.8–98.8 %). Subsequently, a delay in access to reinforcement was implemented which increased the mean PRB to 100 % (range = 100–100 %). A DRO with a self-monitoring component was also implemented and resulted in a mean PRB of 100 % (range = 100–100 %).

Patel et al. (2000) used a similar procedure, implementing a DRO using high-preference, high-competition stimuli as reinforcers for the absence of repetitive tongue clicking. A reversal design was employed to examine the effect of this treatment with one participant, aged 10 years. The PRB calculated for this study shows that the treatment effectively decreased repetitive behavior (mean PRB = 95.31 %; range = 95.31–95.31 %).

Nine reinforcement or skills-based treatments were found to be effective in decreasing stereotypy with 11 participants, and 4 studies demonstrated ineffective treatments with 5 participants. According to the Chambless and Hollon (1998) criteria for evidence-based treatments, reinforcement and skills-based treatments which are designed based on prior identification of the function of stereotypy may be considered “promising but lacking in sufficient evidence” (see Table 2).

Consequence-Based Treatments

Ten studies were identified which evaluated the effects of consequence-based treatments which were developed following a functional analysis (see Table 2). Functional analysis and/or assessment revealed that all participants’ (n = 19) stereotypy was automatically reinforced. Six different treatments were implemented including response interruption and redirection (RIRD), response cost, response blocking, redirection, overcorrection, and extinction (see Table 1). PRB was calculated for nine of these studies and is summarized in Table 2.

RIRD was evaluated as a treatment for stereotypy in seven studies across fifteen participants. Ahearn et al. (2007), for example, interrupted the vocal stereotypy of four participants aged 3–11 years (m = 7 years) and redirected participants to engage in other vocalizations such as answering questions. Using a reversal design, Ahearn et al. (2007) effectively decreased stereotypy (mean PRB = 81.57 %; range = 78.67–85.71 %). Colón et al. (2012) evaluated the effect of implementing RIRD using a multiple baseline across three participants aged 8–10 years (m = 9.33 years). A mean PRB of 76.07 % (range = 69.44–82.70 %) was observed. Cassella et al. (2011) used a reversal design with two participants aged 4.9–7.17 years (m = 6.04 years) to assess the effect of RIRD on vocal stereotypy. A mean PRB of 79.68 % (range = 79.49–82.87 %) was calculated, suggesting an effective treatment. Liu-Gitz and Banda (2010) used a reversal design to evaluate the effects of RIRD on vocal stereotypy with a 10-year-old male. RIRD effectively decreased vocal stereotypy (mean PRB = 96.79 %; range = 96.79–96.79 %).

Ahrens et al. (2011) investigated the effects of using RIRD in topographically similar and dissimilar stereotypic behavior. A reversal design with an alternating treatments component was used in the first phase of this study. Two participants, aged 4–6 years (m = 5 years), who engaged in vocal stereotypy, were redirected to vocal tasks or motor tasks in an alternating fashion. Both forms of redirection were effective in decreasing vocal stereotypy with comparable PRBs, though motor RIRD resulted in a slightly greater reduction than vocal RIRD (mean PRB = 73.15 % (range = 50.9–95.04 %); mean PRB = 71.44 % (range = 47.65–95.23 %)). A further analysis of the effects of matched and unmatched topographies of RIRD was conducted with two participants aged 4–5 years (m = 4.5 years) who engaged in both motor and vocal stereotypy. Motor RIRD was more effective in decreasing vocal stereotypy than motor stereotypy (mean PRB = 94.11 % (range = 93.79–94.43 %); mean PRB = 80.47 % (range = 79.46–81.48 %)). Vocal RIRD was almost equally effective in decreasing both motor and vocal stereotypy (mean PRB = 86.28 % (range = 78.77–93.79 %); mean PRB = 86.45 % (range = 83.96–89.94 %)).

In contrast to other studies, Dickman et al. (2012) used a reversal design to demonstrate that RIRD was ineffective in decreasing vocal stereotypy with one participant (PRB = 35.8 %; range = 35.8–35.8 %). Similarly, Love et al. (2012), using a reversal design with two participants aged 8–9 years (m = 8.5 years), found that RIRD was ineffective in decreasing vocal stereotypy (mean PRB = 15.52 %; range = 11.54–96.72 %).

Giles et al. (2012) investigated the separate effects of response blocking and RIRD on repetitive motor movements, hand mouthing, and string play displayed by three participants aged 6–10 years (m = 8 years). A reversal design with an embedded alternating treatments design was used. Response blocking was marginally more effective in decreasing stereotypy than RIRD (mean PRB = 93.19 % (range = 83.33–99.63 %); mean PRB = 90.59 % (range = 90.54–90.63 %)). Furthermore, all participants demonstrated preference for RIRD over response blocking as identified through a concurrent chains assessment.

Both response cost and overcorrection (see Table 1) were found to effectively decrease vocal stereotypy displayed by a 7-year-old male with autism (Anderson and Le 2011). A reversal design was used to demonstrate the effect of these treatments separately. Response cost using music did not effectively decrease stereotypy. However, stereotypy occurred during only 5–20 % of intervals during the response cost using a DVD phase, suggesting that this was effective in decreasing stereotypy. The authors also prompted the participant to raise a finger to his lips and repeat “shh” 100 times contingent on vocal stereotypy. This procedure decreased stereotypy to zero or near-zero levels.

Wolff et al. (2013) were the only researchers in the current review to evaluate the effects of extinction on repetitive behavior with three participants aged 3.5–4.5 years (m = 3.94 years). The effects of extinction on decreasing obsessive door checking and closing, screaming, and rubbing head on others were evaluated using a reversal design. Extinction was found to be effective in decreasing repetitive behavior (mean PRB = 85.33 %; range = 56–100 %).

Positive results were observed in 8 studies across 16 participants, while two studies demonstrated treatments which were ineffective across three participants. According to the criteria for evidence-based interventions by Chambless and Hollon (1998), consequence-based treatments which are designed based on prior identification of the function of stereotypy may be considered “efficacious” (see Table 2).

Mixed Treatments

Eighteen studies were identified which were based on a previously identified function and used more than one treatment to decrease stereotypy or repetitive behavior (see Table 2). Stereotypy across all participants in this category was found to be automatically reinforced (n = 29). PRB was calculated for 16 of these studies and is summarized in Table 2.

RIRD was used in combination with other treatments across four studies (see Table 2). Brusa and Richman (2005) and O’Connor et al. (2011) used a discriminative stimulus (Sd) to signal that RIRD would be implemented contingent upon stereotypy. A second Sd was used to signal the absence of consequences for engagement in stereotypy. This combination was shown to be effective in decreasing engagement in repetitive object manipulation for one boy aged 8 years (mean PRB = 100 %; range = 100–100 %; Brusa and Richman 2005) and increasing latency to engaging in motor and vocal stereotypy for one boy aged 11 years (O’Connor et al. 2011).

Dickman et al. (2012) implemented RIRD with a DRI, when RIRD alone was ineffective, in an effort to decrease vocal stereotypy displayed by one participant aged 5.5 years. A reversal design demonstrated that this combination was effective in decreasing stereotypy (mean PRB = 83.94 %; range = 83.94–83.94 %).

A combination of noncontingent access to matched stimulation and RIRD was evaluated by Love et al. (2012) with two participants aged 8–9 years (m = 8.5) when each treatment was ineffective alone. This combination of treatments was also ineffective in decreasing stereotypy (mean PRB = 47.28 %; range = 23.57–60.79 %).

Anderson and Le (2011) assessed the effects of combining DRA with overcorrection procedures on vocal stereotypy of one participant aged 7 years using a series of reversals. As no baseline data was available, PRB could not be calculated; however, when a DRA contingency was combined with overcorrection, vocal stereotypy reduced to near-zero levels.

Fisher et al. (2013) used an ABC design and implemented differential reinforcement of “on topic speech” with extinction to decrease the perseverative speech of a 14-year-old teen with Asperger syndrome and neurofibromatosis syringomyelia. This treatment was ineffective in decreasing perseverative speech (mean PRB = 30.48 %; range = 30.48–30.48 %), and the target behavior had returned to near baseline levels at follow-up (mean PRB = 2.02 %; range = 2.02–2.02 %).

As well as delivering reinforcement on a fixed time schedule, Taylor et al. (2005) evaluated a DRO procedure, during which a correction procedure was used contingent upon the occurrence of vocal stereotypy. The correction procedure involved the therapist telling the participant that she had engaged in vocal stereotypy and resetting the timer to start a new interval. The DRO procedure was effective in decreasing vocal stereotypy (mean PRB = 96.28 %, range = 96.28–96.28 %).

Fritz et al. (2012) used a combination of discrimination training, self-monitoring, and differential reinforcement to decrease motor and vocal stereotypy in three participants aged 12–40 years (m = 33.67 years). Differential reinforcement of accurate recording of the absence of stereotypy, DRO, and differential reinforcement of accurate recording of the presence of stereotypy were evaluated using a component analysis to elucidate which elements of the treatment package were effective. Discrimination training and differential reinforcement of accurate recording of stereotypy were effective in decreasing stereotypy (mean PRB = 74.8 %; range = 24.39–100 %). Accurate self-recording when used in combination with a DRO for stereotypy effectively reduced the target behavior for three participants (mean PRB = 82.43 %; range = 54.52–100 %). Differential reinforcement of accurate recording of the presence of stereotypy was implemented with one participant and was effective in decreasing stereotypy (mean PRB = 99.39 %; range = 99.39–99.39 %). A DRO alone, implemented with two participants, was also effective in reducing stereotypy (mean PRB = 98.5 %; range = 97.59–99.4 %). For one participant, reductions in stereotypy were hypothesized to be attributable to engagement in an activity rather than self-monitoring, i.e., it was thought that stereotypy would reduce irrespective of the activity which was implemented. A control activity (transcribing words) was implemented to test this hypothesis. Stereotypy was effectively reduced in this condition (mean PRB = 98.71 %; range = 98.71–98.71 %), suggesting that for this participant, self-management procedures may not have been the cause of decrease in behavior. The authors conclude that self-monitoring may be an unnecessary component as DRO contingencies, recent exposure to reinforcement for accurate self-monitoring, instructional control, and access to an alternative activity sufficiently decreased stereotypy.

Rehfeldt and Chambers (2003) used a combination of DRA and extinction to decrease the perseverative speech of a 23-year-old man. A reversal design demonstrated that this treatment was effective in decreasing the target behavior (PRB = 81.58 %; range = 81.58–81.58 %).

Shabani et al. (2001) evaluated the effect of a treatment package which included a DRO, discrimination training, and self-monitoring with a 12-year-old male. A multiple baseline across settings design was employed. Body rocking was effectively decreased across three settings using this procedure (mean PRB = 97.56 %; range = 96.69–98.44 %).

Shillingsburg et al. (2012) implemented a combination of NCR, response cost (see Table 1), and a DRO contingency with one participant, aged 12 years. NCR, when used with response cost, was effective in reducing vocal stereotypy (PRB = 100 %; range = 100–100 %). However, once demands were introduced, NCR with response cost failed to decrease vocal stereotypy, which returned to baseline levels (PRB = 0 %; range = 0–0 %). Using a reversal design, a 95.65 % mean PRB (range = 95.65–95.65 %) was observed.

Discrimination training in combination with a stimulus control procedure was evaluated by Haley et al. (2010) with an 8-year-old boy who engaged in vocal stereotypy. A red card was used to signal that the absence of stereotypy was expected and a green card was used to signal times when vocal stereotypy was acceptable. Discrimination training was used to bring vocal stereotypy under the antecedent control of each stimulus. Following training, a card was placed on the participant’s desk, and engagement in stereotypy resulted in correction. An alternating treatments design demonstrated a mean PRB of 59.1 % (range = 59.1–59.1 %), suggesting that this combination of treatments was effective in decreasing stereotypy.

A similar procedure was implemented by Rapp et al. (2009) with two participants aged 8 years. However, unlike the study described by Haley et al. (2010), verbal reprimands were delivered on a continuous schedule, contingent upon the occurrence of vocal stereotypy in the presence of a red card, and no consequence was delivered for engagement in vocal stereotypy in the presence of a green card. This effectively reduced engagement in stereotypy for both participants (mean PRB = 75.07 %; range = 51.98–98.15 %). As the procedure was less effective for one participant (mean PRB = 51.98 %; range = 51.98–51.98 %), the authors investigated the effect of bringing vocal stereotypy under stimulus control of a range of punishment procedures. The red card was used to signal that, contingent on vocal stereotypy, a mild reprimand, a more aversive reprimand, a reprimand with response cost, or response cost with a faded reprimand would be delivered. A reversal design was used to evaluate the effect of each condition. Each punishment procedure or combination was effective in decreasing stereotypy. A 52.47 % mean PRB (range = 52.47–52.47 %) was observed when a mild reprimand was delivered in the presence of the red card. A 71.45 % mean PRB (range = 71.45–71.45 %) occurred when a more aversive reprimand was delivered. A reprimand delivered with response cost resulted in a 100 % mean PRB (range = 100–100 %). Following this condition, reprimands were faded and behavior remained low (mean PRB = 96.45 %; range = 96.45–96.45 %).
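The relationship between per-participant values and the summary statistics reported in this review can be illustrated with a short Python sketch. It assumes that the mean PRB is the arithmetic mean of the participants' individual PRB values and that the range gives the lowest and highest individual values; the helper name `summarize_prb` is hypothetical. The inputs are the endpoints of the range reported for the initial procedure in Rapp et al. (2009), which, with two participants, are taken here to be the two individual values.

```python
from statistics import mean

def summarize_prb(values):
    """Return (mean, minimum, maximum) of per-participant PRB values."""
    return mean(values), min(values), max(values)

# Range endpoints reported for the two participants (assumed individual values)
avg, low, high = summarize_prb([51.98, 98.15])
print(f"mean PRB = {avg:.3f} %; range = {low} to {high} %")

# With a single participant, the range collapses to a single value,
# which is why entries such as "51.98-51.98 %" appear in the text
avg_one, low_one, high_one = summarize_prb([51.98])
print(f"mean PRB = {avg_one:.2f} %; range = {low_one} to {high_one} %")
```

Under these assumptions the mean of the two values is 75.065 %, which rounds to the reported 75.07 %.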

Similarly, Langone et al. (2013) assessed the utility of using an Sd to signal that a punishment procedure was in effect with a 16-year-old male. When a tennis bracelet was worn, response blocking (see Table 1) was implemented contingent upon repetitive hand movements. A reversal design was implemented which demonstrated that the presence of the Sd in combination with response blocking was effective in decreasing stereotypy. Furthermore, when response blocking was no longer in effect and the Sd was worn by the participant, behavior remained low (mean PRB = 61.65 %; range = 61.65–61.65 %) and maintained at follow-up (mean PRB = 68.46 %; range = 68.46–68.46 %).

Noncontingent access to toys was used as part of a multicomponent treatment evaluated by Tarbox et al. (2002). Noncontingent access to toys alone was ineffective in reducing the repetitive mouthing of a 4-year-old male, as was noncontingent access to toys with prompts to engage in toy play (mean PRB = 41.79 %; range = 41.79–41.79 %). Adding response blocking to the treatment package effectively decreased stereotypy (93.33 % mean PRB; range = 93.33–93.33 %).

Reid et al. (2010) used a mix of treatments to decrease the gross motor stereotypy, fine motor stereotypy, and repetitive eye gaze of three adults aged 33–45 years (individual ages not reported) in supported work placement. A combination of antecedent- and consequence-based treatments was used to decrease stereotypy during work periods. For one participant, simply providing more work once work was completed decreased stereotypy (mean PRB = 96.51 %; range = 96.51–96.51 %). The same treatment was implemented for the second participant, with the addition of prompts to return to work and praise for on-task behavior, and a 59.48 % reduction in stereotypy was observed. For the third participant, praise for on-task behavior and prompts to return to work were effective in decreasing stereotypy (mean PRB = 54.17 %; range = 54.17–54.17 %).

Lang et al. (2009, 2010) combined treatments which manipulated motivating operations and increased appropriate play across a total of five participants. Using an alternating treatments design, participants were given free access to engage in motor and vocal stereotypy and repetitive object manipulation prior to a condition in which they were taught appropriate play skills. Lang et al. (2009) did not effectively decrease the stereotypy of one participant, aged 8 years (mean PRB = 38.67 %; range = 38.67–38.67 %); however, Lang et al. (2010) found this treatment to be moderately more effective with four participants, aged 4–7 years (m = 5 years; mean PRB = 57.63 %; range = 15.97–59.87 %).

Sixteen studies were found to have used treatments which effectively decreased stereotypy across 27 participants, while 2 studies reported using a mixture of treatments which were ineffective across 2 participants. According to the criteria for evidence-based interventions by Chambless and Hollon (1998), mixed treatments which are designed based on prior identification of the function of stereotypy may be considered “efficacious” (see Table 2).

Nonfunction-Based Behavioral Treatments

Thirty-four treatments were identified across 22 studies where treatments were not based on an identified function. Of these studies, five evaluated antecedent-based treatments, eight evaluated reinforcement- or skills-based treatments, four evaluated consequence-based strategies, and a further five evaluated mixed treatments for stereotypy.

Antecedent Treatments

Five studies evaluated antecedent-based treatments across seven participants (see Table 3). Antecedent exercise, providing choice of activity, increasing tutor accuracy when delivering discrete trials, and giving advance notice of transitions are detailed in this section (see Table 1). PRB was calculated for three of these studies and is summarized in Table 3.

Table 3 Summary of efficacy of nonfunction-based treatments

Celiberti et al. (1997) compared antecedent exercise in the form of walking and jogging with one participant aged 5.75 years using a reversal design. Neither antecedent walking (mean PRB = 36.98 %; range = 36.98–36.98 %) nor antecedent jogging (mean PRB = −1 %; range = −1 to −1 %) was found to effectively decrease stereotypy.

Changes in instructional conditions have been documented to decrease stereotypy, including changes in delivery of trials, providing choice of activity and the use of schedules to signal transitions. Dib and Sturmey (2007) increased tutor accuracy when delivering discrete trials in an attempt to decrease stereotypy with three participants aged 9–12 years (m = 11 years) using a multiple baseline across participants design. This treatment was found to be effective in decreasing inappropriate vocalizations and repetitive body movements (mean PRB = 74.64 %; range = 61.9–81.6 %).

Modifications to the environment such as environmental enrichment have been suggested to be essential components in the treatment of stereotypy (Rapp and Vollmer 2005b). In line with this, Lanovaz et al. (2009) evaluated the effect of providing free access to items which were hypothesized to match the stimulation provided by vocal stereotypy. Three children aged 2.08–2.42 years (m = 2.22 years) participated in this study. A three-component multiple schedule was used to examine the effects of continuous access to (1) matched stimulation, (2) nonmatched preferred items, and (3) music on the vocal stereotypy of participants. Continuous access to matched stimulation more effectively decreased stereotypy than unmatched stimulation for two out of three participants. However, exposure to unmatched stimuli did decrease the target behavior for two participants during subsequent conditions.

Providing choice of activity has previously been demonstrated to decrease challenging behavior (Shogren et al. 2004). Sigafoos et al. (2009) provided one participant, aged 15 years, with a choice of two activities in order to evaluate the effect of choice on repetitive lining up/re-arranging of objects. A reversal design failed to demonstrate choice as an effective treatment (mean PRB = 40.42 %; range = 40.42–40.42 %).

Tustin (1995) found that stereotypy frequently occurred during transitions. The treatment evaluated in this study involved providing the participant (aged 28 years) with advance notice of activity transitions. Advance notice was compared with no advance notice and was evaluated using a reversal (BCB) design. As no baseline was reported, PRB could not be calculated; however, providing advance notice of activity transitions resulted in lower levels of stereotypy than transitioning with no notice.

While increasing tutor accuracy and providing continuous access to matched stimulation were effective in decreasing stereotypy for the six participants in these studies, the study described by Tustin (1995) did not report a baseline, and antecedent exercise and choice were both ineffective in decreasing stereotypy. According to the criteria for evidence-based interventions by Chambless and Hollon (1998), antecedent treatments which are not designed based on prior identification of the function of stereotypy may be considered “promising but lacking in sufficient evidence” (see Table 3).

Reinforcement or Skills-Based Treatments

Eight studies were identified that utilized reinforcement- or skills-based methods which were not based on a prior functional assessment (see Table 3). These treatments were evaluated across a total of 21 participants. Reinforcement- and skills-based strategies identified in this category included teaching appropriate alternative behaviors and noncontingent reinforcement using matched stimulation and differential reinforcement (see Table 1). PRB was calculated for six of these studies and is summarized in Table 3.

Three studies were identified which taught appropriate alternative behaviors in order to decrease stereotypy across 11 participants. Frea (1997) taught two participants aged 15–23 years (m = 19 years) to orient to environmental stimuli in order to decrease repetitive eye gaze movements, vocal stereotypy, and motor stereotypy. A multiple baseline across participants design was used to evaluate the effect of the treatment on stereotypy. This treatment effectively decreased stereotypy (mean PRB = 76.12 %; range = 55.56–86.09 %).

Peer training and social initiation training were evaluated by Loftin et al. (2008) across three participants aged 9–10 years (m = 9.67 years). A multiple baseline across participants design demonstrated a 97.13 % (range = 58.99–97.14 %) mean PRB. Participants were subsequently taught to self-monitor their own stereotypy which was demonstrated to maintain a 72.18 % mean PRB (range = 58.99–81.25 %).

Conditioning toy play as a reinforcer was examined by Nuzzolo-Gomez et al. (2002) by pairing self-initiated toy play with praise and edible reinforcers within a multiple baseline across participants design. The repetitive object mouthing, finger licking, vocal stereotypy, and motor stereotypy of three children with autism aged 4–7 years (m = 6 years) were effectively reduced (mean PRB = 78.42 %; range = 68.35–83.66 %).

Noncontingent access to items hypothesized to provide matched stimulation has been documented across numerous studies. Rapp (2006) evaluated the effect of providing noncontingent access to items of matched stimulation (NMS) using a three-component multiple schedule with one participant aged 9 years. NMS resulted in a 100 % mean PRB (range = 100–100 %) in repetitive object tapping. Repetitive object tapping was also lower in the post-treatment component of the multiple schedule than in the pretreatment component of the multiple schedule.

Rapp (2007) compared the effects of noncontingent access to toys or music which provided similar stimulation to the vocal stereotypy of two participants, both aged 9 years (m = 9 years). Noncontingent access to music was more effective than noncontingent access to toys in decreasing challenging behavior. Furthermore, Rapp (2007) demonstrated that stereotypy remained low in conditions following the implementation of noncontingent access to matched stimulation, suggesting that matched stimulation may function as an abolishing operation for stereotypy.

Saylor et al. (2012) evaluated the effect of three forms of matched auditory stimulation on the vocal stereotypy of three participants aged 5.5–6.58 years (m = 6.04 years). An alternating treatments design was used to evaluate the separate effects of noncontingent white noise, noncontingent music, and noncontingent access to participants’ own recorded voices. While music and the participants’ voices were effective in decreasing vocal stereotypy (mean PRB = 100 %; range = 100–100 % and mean PRB = 95.05 %; range = 92.2–97.8 %, respectively), white noise increased levels of vocal stereotypy (mean PRB = −8.5 %; range = −18.75 to 1.77 %).

Lanovaz et al. (2012) also evaluated the effect of auditory stimulation on stereotypy. Four participants aged 4–9 years (m = 6.25 years) were exposed to alternating conditions of noncontingent access to high- or low-preference music to evaluate their effects on vocal stereotypy. While both treatments decreased stereotypy, high-preference music reduced stereotypy to near-zero levels and was more effective in decreasing stereotypy than low-preference music.

Rozenblat et al. (2009) investigated effective methods for schedule thinning when using differential reinforcement of other behaviors (DRO) to decrease vocal stereotypy. Three children aged 9–10 years (m = 9.33 years) participated in this study. When the DRO interval was set to the 25th percentile of the previously mastered interval, a mean PRB of 88.6 % (range = 83.32–93.55 %) was observed. However, when the DRO interval was set to the 95th percentile, a lower reduction was observed (mean PRB = 47.99 %; range = 32.29–56.46 %).

Eight treatments which were implemented with 21 participants were demonstrated to be effective treatments for stereotypy, and 1 study implemented a treatment which was ineffective with four participants. According to the criteria for evidence-based treatments by Chambless and Hollon (1998), reinforcement- or skills-based treatments which are not designed based on prior identification of the function of stereotypy may be considered “efficacious” (see Table 3).

Consequence-Based Strategies

Four studies which did not base their treatment on the results of a prior functional analysis evaluated the effects of consequence-based strategies across 10 participants (see Table 3). Three of these studies examined the effect of response interruption and redirection (RIRD), one evaluated the effects of differential reinforcement, and one examined the use of punishment (see Table 1). PRB was calculated for two of these studies and is summarized in Table 3.

RIRD was evaluated by Boyd et al. (2011), Schumacher and Rapp (2011), and Pastrana et al. (2013). Boyd et al. (2011) used a multiple baseline across participants design to compare the effect of RIRD when implemented by parents and therapists with five children aged 3.08–5.42 years (m = 4 years) who engaged in a variety of higher-order repetitive behaviors. Parent-implemented RIRD reduced stereotypy by a mean PRB of 77.8 % (range = 61.51–100 %) while a mean PRB of 80.35 % (range = 62.57–97.82 %) was observed when the treatment was implemented by therapists.

Schumacher and Rapp (2011) examined the immediate and subsequent effects of RIRD with two participants aged 5–8 years (m = 6.5 years) using an alternating treatments design with an embedded three-component multiple schedule. In each case, RIRD effectively decreased stereotypy. Unlike other consequence-based treatments which report subsequent increases in stereotypy, no increase in stereotypy was observed when RIRD was removed relative to the condition prior to the implementation of RIRD.

Pastrana et al. (2013) also investigated the immediate and subsequent effects of RIRD on vocal and motor stereotypy using a three-component multiple schedule. Gross motor stereotypy was targeted using RIRD, and the effect of this on vocal stereotypy was also evaluated. Two children aged 6.5–9.75 years (m = 8.18 years) participated in this study. RIRD decreased immediate but not subsequent engagement in the targeted topography of motor stereotypy. An immediate increase was observed in untargeted stereotypy for one participant, and a decrease was observed in both topographies of stereotypy for the second participant.

Response blocking was implemented by Rapp (2006) in order to evaluate the immediate and subsequent effects of the treatment on repetitive object tapping with one participant, aged 9 years, using a three-component multiple schedule. While response blocking effectively decreased stereotypy (mean PRB = 92.12 %; range = 92.12–92.12 %), stereotypy increased above pretreatment levels once the treatment was removed.

Each study demonstrated an effective treatment for stereotypy across a total of eight participants. Based on the criteria outlined by Chambless and Hollon (1998), consequence-based treatments which are not based on a pre-identified behavioral function may be categorized as “promising but lacking in sufficient evidence” (see Table 3).

Mixed Treatments

Five studies evaluated mixed treatments which were not based on a previous functional analysis across 14 participants (see Table 3). PRB was calculated for four of these studies and is summarized in Table 3.

Boyd et al. (2013) examined the feasibility and effects of exposure and response prevention across five participants aged 5–11 years (m = 8.6 years) who engaged in repetitive preoccupation with objects using a pre-test/post-test design. This treatment involved alternating trials in which the participants had free access to objects evoking preoccupations with trials in which the participants were to engage in academic tasks. The results demonstrated an increase in latency to engage in preoccupations, a decrease in problem behavior, and an increase in on-task behavior. However, due to the low number of participants and lack of an experimental control or comparison group, further research is needed before drawing conclusions in relation to this treatment.

Mason and Newsom (1990) evaluated the use of reinforcement for on-task behavior while participants were wearing rings which were hypothesized to mask sensory stimulation with three participants aged 12–16 years (m = 14.33 years). However, only one participant had been diagnosed with ASD and so only data from this participant are included here. A mean PRB of 100 % (range = 100–100 %) was observed using this combination of treatments.

Sigafoos et al. (2009) implemented a treatment which combined providing choice of activity with social attention. Choice alone had been ineffective in decreasing the repetitive object manipulation of one participant aged 15 years. When social attention was provided in combination with choice, a 78.72 % mean PRB (range = 78.72–78.72 %) was observed.

Stahmer and Schreibman (1992) used a combination of discrimination training, self-monitoring, and differential reinforcement of appropriate behavior to decrease the repetitive behaviors of three children aged 7–13 years (m = 10.6 years) using a multiple baseline across participants design. Reinforcement was provided for appropriate play in the absence of stereotypy, and participants were taught to self-monitor their own behavior throughout increasing intervals. Once behavior had decreased to near-zero levels, the self-monitoring materials and the therapist were faded. This treatment was effective in decreasing repetitive behaviors (mean PRB = 71.51 %; range = 38.83–75.7 %).

The effect of environmental enrichment in combination with response cost was evaluated by Watkins et al. (2011) with two children aged 7–11 years (m = 9 years) using a multiple baseline across tasks design for one participant and a reversal design for the second participant. When environmental enrichment was implemented, preferred items were removed contingent upon vocal stereotypy. This combination was effective in decreasing the stereotypy of both participants (mean PRB = 87.41 %; range = 65.93–100 %). A PRB of 94.3 % (range = 94.3–94.3 %) was demonstrated for one participant at follow-up; no follow-up data were provided for the second participant.

In line with the criteria outlined by Chambless and Hollon (1998), mixed treatments which are not based on a prior functional analysis are deemed “promising but lacking in sufficient evidence” (see Table 3).

Pharmacological Treatments

Of the five studies (see Table 4) which evaluated the effects of pharmacological treatments on stereotypy and repetitive behavior, three studies evaluated the use of antidepressants with a total of 87 participants, one evaluated the use of anticonvulsants with 13 participants, and one evaluated the use of selective serotonin reuptake inhibitors (SSRIs) with 149 participants. It was not possible to calculate PRB for these studies.

Table 4 Summary of efficacy of pharmacological treatments

Antidepressant Medication

Gordon et al. (1993) compared clomipramine to a placebo in a single-blind washout phase, followed by a double-blind crossover comparison of clomipramine and desipramine, in the treatment of obsessive compulsive behaviors and motor stereotypy with children aged 6–18 years (m = 9.42 years). Clomipramine was found to be superior to both the placebo and desipramine and resulted in significant decreases in the target behaviors. Side effects were reported by 24 participants when taking clomipramine; 12 participants reported side effects when taking desipramine, and 12 participants reported side effects when taking the placebo. The authors concluded that these side effects were minor and did not differ significantly between groups.

Hollander et al. (2005, 2012) evaluated the use of fluoxetine in the treatment of stereotypy and repetitive behavior across 65 participants. A double-blind placebo-controlled crossover trial was implemented by Hollander et al. (2005) with children aged 5–15 years (m = 8.18 years) while a randomized placebo-controlled trial was used by Hollander et al. (2012) with adults aged 18–60 years. Significant decreases in repetitive behavior and stereotypy were observed in both studies with no significant side effects.

While significant reductions in stereotypy and repetitive behavior were observed across all three studies which used antidepressant medication, the same author was involved in two of the studies. According to the criteria for evidence-based treatments by Chambless and Hollon (1998), antidepressant medication as a treatment for stereotypy may be deemed “promising but lacking in sufficient evidence” (see Table 4).

Anticonvulsant Medication

Divalproex sodium was the only anticonvulsant medication used to treat stereotypy within the studies included for review. Hollander et al. (2006), using a randomized control trial, compared the effect of divalproex sodium on stereotypy and repetitive behavior to a placebo. Thirteen participants aged 5–17 years (m = 9.5 years) were included in the study. A significant decrease in stereotypy was observed in 79 % of the participants in the treatment group in comparison to 0 % of the control group, with no significant differences between the side effects reported by either the treatment group or control group.

According to the criteria for evidence-based treatments by Chambless and Hollon (1998), the use of anticonvulsant medication in the treatment of stereotypy may be considered “promising but lacking in sufficient evidence” (see Table 4).

Selective Serotonin Reuptake Inhibitors

King et al. (2009) compared citalopram against a placebo control group with 149 participants aged 5–17 years (m = 9.4 years) in a single-blind randomized control trial. No significant difference was observed in stereotypy, as measured by the Clinical Global Impressions Improvements subscale, nor was a reduction observed for either group on the Children’s Yale-Brown Obsessive Compulsive Scale (CY-BOCS). Furthermore, citalopram was significantly more likely than the placebo to be associated with adverse events such as increased energy levels, impulsiveness, decreased concentration, and hyperactivity.

According to the criteria for evidence-based treatments by Chambless and Hollon (1998), the use of SSRIs in the treatment of stereotypy may be considered “lacking in sufficient evidence” (see Table 4).

Sensory Integration-Based Treatments

Of the five studies which evaluated the use of sensory integration-based treatments, one study evaluated sensory integration therapy, three evaluated the use of weighted vests, and one evaluated a brushing treatment. PRB was calculated for five of these studies and is summarized in Table 5.

Table 5 Summary of efficacy of sensory integration-based treatments

Watling and Dietz (2007) evaluated the effect of sensory integration therapy (SIT) on a range of repetitive behaviors with four children with ASD aged 3–4.33 years (m = 3.7 years). Using an ABAB reversal design, a 56.1 % (range = 45.1–66.65 %) reduction in behavior was observed. The impact of SIT on engagement was also assessed; however, no improvement in engagement was found.

Weighted vests as a treatment for stereotypy were assessed by Fertel-Daly and Bedell (1992), Hodgetts et al. (2011), and Kane et al. (2004) with a total of 12 children with ASD and pervasive developmental disorder not otherwise specified (PDD-NOS).

Fertel-Daly et al. (1992) assessed the effect of a weighted vest with five participants; however, data for three participants were excluded from this review as these participants engaged in repetitive self-injurious behaviors. Two participants aged 2.75–2.83 years (m = 2.79 years) with PDD-NOS who engaged in repetitive object manipulation, gross and fine motor stereotypy, and vocal stereotypy were included in this review. A reversal design revealed that weighted vests were ineffective in decreasing stereotypy (mean PRB = 25.62 %; range = −22.97 to 74.54 %).

Hodgetts et al. (2011) compared the use of weighted vests which were calibrated at either 5 or 10 % of the child’s body weight to decrease a variety of stereotypy including fine and gross motor stereotypy, repetitive object manipulation, and vocal stereotypy. Six children aged 4–10 years (m = 6.7 years) participated in this research. A reversal design was used to evaluate the effect of each condition on stereotypy; however, neither treatment condition was effective in decreasing stereotypy. For participants wearing a weighted vest calibrated at 5 % body weight, a mean PRB of 11.28 % (range = −32 to 85.61 %) was reported, while wearing a vest calibrated at 10 % body weight resulted in a mean increase in stereotypy (mean PRB = −60 %; range = −60 to −60 %).

Kane et al. (2004) employed an ABC design to assess the impact of wearing a weighted vest on a range of stereotyped and repetitive behaviors including gross and fine motor stereotypy and repetitive object manipulation. Four children aged 8–11 years (m = 9.3 years) wore a vest with no weight during the first treatment phase and a vest with weights during the second treatment condition. Wearing a weighted vest resulted in an increase in stereotyped and repetitive behavior (mean PRB = −11.39 %; range = −75 to 4.44 %). As with Watling and Dietz (2007), no increase in attention to task was observed.

Davis et al. (2011) used a reversal design to evaluate the effect of the Wilbarger Protocol, a brushing technique, with one 4-year-old participant. Brushing was used in an attempt to decrease repetitive gross and fine motor stereotypy. Results showed a mean PRB of −35.57 % (range = −35.57 to −37.57 %), indicating that the target behavior increased rather than decreased when the brushing technique was employed.
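
The sign convention behind the PRB values reported above can be illustrated with a short sketch. It assumes that PRB is the change in mean behavior from baseline to treatment, expressed as a percentage of the baseline mean, so that a positive value indicates a decrease in stereotypy and a negative value indicates an increase. The function name `prb` and the session data are hypothetical and are not taken from any of the reviewed studies.

```python
from statistics import mean


def prb(baseline, treatment):
    """Percentage reduction in behavior (PRB) from baseline to treatment.

    Assumed definition: (baseline mean - treatment mean) / baseline mean * 100.
    Positive values mean the behavior decreased; negative values mean it increased.
    """
    base = mean(baseline)
    if base == 0:
        raise ValueError("baseline mean must be non-zero")
    return (base - mean(treatment)) * 100 / base


# Hypothetical per-session rates of stereotypy (illustrative only).
participants = {
    "P1": ([20, 22, 18], [10, 9, 11]),   # behavior halves under treatment
    "P2": ([10, 10, 10], [12, 13, 11]),  # behavior increases under treatment
}

# PRB per participant, then the mean and range across participants.
scores = {name: prb(b, t) for name, (b, t) in participants.items()}
values = list(scores.values())
summary = (mean(values), min(values), max(values))

print(scores)
print("mean PRB = %.2f %%; range = %.2f to %.2f %%" % summary)
```

With these illustrative data, the first participant's PRB is 50 % and the second's is −20 %, giving a mean PRB of 15 % (range = −20 to 50 %), which is the summary format used for each study in this section.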

Of the studies that used sensory integration-based treatments, only one effectively decreased stereotypy and repetitive behavior (four participants). Four studies were demonstrated to be ineffective across 13 participants. According to the criteria for evidence-based treatments by Chambless and Hollon (1998), this classifies sensory integration-based treatments for stereotypy as “ineffective” (see Table 5).

Other Treatments

Two “other” treatments were identified in the review and are summarized in Table 6. PRB could not be calculated for either study.

Table 6 Summary of efficacy of other interventions

Bahrami et al. (2012) evaluated the effect of Kata techniques training, an exercise-based treatment, with 30 children with ASD aged 5–16 years (m = 9.13 years). In a randomized controlled trial, a statistically significant decrease in stereotypy was observed for participants in the experimental group but not the control group.

In line with the criteria set out by Chambless and Hollon (1998), this treatment may be considered “promising but lacks sufficient evidence” due to the small sample size and lack of replication (see Table 6).

The second study evaluated the effects of oxytocin infusion. Hollander et al. (2003) hypothesized that oxytocin may be effective in decreasing repetitive behaviors in children with ASD. This hypothesis was based on findings from animal studies demonstrating that oxytocin may be implicated in the development of repetitive behavior, and on findings of increased oxytocin levels in children who respond to treatment with clomipramine. Hollander and colleagues tested the hypothesis in a within-subjects, double-blind, randomized controlled trial with 15 participants aged 19.4–55.6 years (m = 32.9 years). Results showed a significantly greater reduction in repetitive behaviors over time following oxytocin infusion in comparison to a placebo infusion. Side effects reported by participants were mild.

In line with the criteria set out by Chambless and Hollon (1998), this treatment may be considered “promising but lacks sufficient evidence” due to the small sample size and lack of replication.

Discussion

Function-Based Treatments

Both antecedent-based and reinforcement- or skills-based treatments derived from the results of a functional analysis were categorized as promising but lacking in sufficient evidence according to the criteria outlined by Chambless and Hollon (1998). While the majority of antecedent-based treatments were demonstrated to effectively decrease stereotypy by more than 50 % using PRB, studies in this category lacked sufficient replication across participants to be deemed evidence based. Similarly, the majority of reinforcement- or skills-based treatments effectively decreased stereotypy; however, conflicting results were observed. Four treatments across five participants were ineffective, and, while this does not suffice to determine this category of treatments as ineffective, it does suggest that there are parameters to the efficacy of reinforcement- and skills-based interventions in the treatment of stereotypy. Further research is needed to determine the variables associated with these treatment outcomes.

Consequence-based and mixed treatments derived from a previous functional analysis or assessment were categorized as efficacious and thus meet the criteria for evidence-based treatment (Chambless and Hollon 1998). This suggests that consequence-based strategies and multicomponent treatments are more effective in decreasing stereotypy than antecedent-based and reinforcement- or skills-based treatments alone. Notably, 18 of the 37 studies included in this category implemented an intervention which comprised two or more treatments. The majority of these studies also examined the effects of individual treatments and concluded that multicomponent treatments were more effective than individual treatments alone. Such findings suggest that it may be more effective to decrease stereotypy using multiple treatments which include antecedent, reinforcement, skills, and consequence-based strategies.

Nonfunction-Based Treatments

Antecedent-based treatments and mixed treatments which were not derived from a previous functional assessment were shown to lack sufficient evidence. Of the antecedent-based treatments evaluated in this category, one study (Tustin 1995) failed to demonstrate experimental control, and two studies (Celiberti et al. 1997; Sigafoos et al. 2009) did not demonstrate a sufficient decrease in stereotypy. Although this approach was not categorized as ineffective, it does highlight the lack of supporting evidence for antecedent-based treatments which have not been derived from a previous functional analysis.

In contrast to function-based treatments, nonfunction-based treatments which evaluated mixed treatments were lacking in sufficient evidence. Although Boyd et al. (2013) failed to demonstrate experimental control, positive results were observed in four studies across seven participants, suggesting that these treatments may be effective but lack sufficient evidence. Mixed treatments based on a prior functional analysis have been deemed evidence based, suggesting that determining the function of stereotypy prior to the implementation of mixed treatments may increase their efficacy.

Both reinforcement- or skills-based and consequence-based treatments were determined to be “efficacious” treatments. The majority of reinforcement- and skills-based treatments in this category evaluated the use of noncontingent access to matched stimuli. Given that 90 % of stereotypy in individuals with ASD is automatically reinforced (Wilke et al. 2012), it is possible that the stereotypy of the participants in these studies was maintained by automatic reinforcement, which would account for the decrease in stereotypy reported. However, these treatments should not be implemented arbitrarily, as Wilke et al. (2012) also reported that the stereotypy of 10 % of participants was maintained by social consequences.

As with function-based treatments, consequence-based treatments which were not derived from previous functional analysis were determined to be evidence-based interventions. This suggests that, for stereotypy at least, consequence-based treatments may be effective irrespective of behavioral function.

Pharmacological Treatments

Positive results were demonstrated across each category of pharmacological treatments. Antidepressants were effective in treating stereotypy with 100 participants across three studies; however, two of these studies were conducted by the same author. Anticonvulsant medications also show promise; however, given the low number of participants and lack of replication, further research is needed to determine the efficacy of anticonvulsant medication in the treatment of stereotypy in autism. Selective serotonin reuptake inhibitors were not effective in treating stereotypy; however, without further research, these cannot be considered to be ineffective. Overall, pharmacological treatments demonstrate promising results but as yet lack sufficient evidence to meet the criteria for evidence-based treatment.

Sensory Integration-Based Treatments

Of the studies identified that used sensory integration-based treatments, only one effectively decreased stereotypy, for four participants. These treatments were found to be ineffective across four studies with 13 participants, and thus the conclusion must be drawn that sensory integration-based treatments are ineffective in decreasing stereotypy.

Other Treatments

Both treatment approaches which used “other” treatments reported positive results. Both Kata techniques training and oxytocin were effective in decreasing stereotypy across 45 participants. While these results are promising, at present, they fail to meet the criteria for evidence-based treatment. Further research and replication are needed before any conclusions can be drawn regarding the efficacy of these treatments for individuals with autism.

Conclusion

A variety of treatments were identified which effectively decreased stereotypy; however, many are in need of further replication before they may be determined as evidence-based approaches. Sensory integration-based treatments were found to be ineffective and therefore cannot be considered evidence-based treatments for stereotypy. More research is needed to determine the efficacy of pharmacological treatments, and caution should be exercised in their use for the treatment of stereotypy in persons with autism.

While function- and nonfunction-based treatments appear to be comparable in efficacy, the function of stereotypy cannot be ignored given that stereotypy was found to be maintained by social consequences across two studies (Fisher et al. 2013; Kennedy 1994). Furthermore, although they did not conduct a functional analysis, Sigafoos et al. (2009) found that choice alone was ineffective in decreasing stereotypy but that choice combined with noncontingent social attention produced a significant decrease in behavior, suggesting that social contingencies may have been maintaining the behavior. As with all challenging behavior, conducting a functional analysis prior to the implementation of an intervention for stereotypy may result in more effective treatments being implemented and, consequently, a more rapid decrease in behavior.

Consequence-based treatments were deemed efficacious irrespective of category, suggesting that analysis of the function of stereotypy may not be as important when considering the use of consequence-based treatments. However, while these treatments are effective, it is important that ethical considerations are taken into account and a least restrictive model is utilized. Furthermore, acquisition of alternative and more appropriate replacement behaviors may be necessary through reinforcement and skills-based teaching in order to eliminate the problem behavior and produce long-term positive outcomes.

Mixed treatments which were based on a pre-identified function met the criteria for evidence-based treatments, while mixed treatments which were not based on an identified function showed promise but lacked sufficient evidence. This suggests that a predetermined behavioral function may be useful in determining which treatments to use and in what combination. It is therefore recommended that, when treating stereotypy, a prior functional analysis or assessment be conducted and a mixture of effective treatments be used to decrease stereotypy.