Introduction
Anxiety disorders are the most common type of mental disorder in Western societies (see Grillon et al.,
2019), with a prevalence of up to 25% in the adult population (Baxter et al.,
2013; Remes et al.,
2016). These disorders do not only constitute a major health problem for patients, but also come at enormous economic and societal cost. The burden is even greater considering that anxiety disorders are comorbid with other health problems and increase the risk for different mental disorders, such as substance addiction or depression (Grillon et al.,
2019). Fortunately, exposure-based therapies, a form of cognitive-behavioural therapy (CBT), have empirically demonstrated their effectiveness for most patients suffering from anxiety disorders (Craske & Mystkowski,
2006). Exposure-based therapies are a set of techniques in which the patient is repeatedly confronted with an anxiogenic stimulus in the absence of aversive consequences. The aim of this exposure is to reduce the fear response associated with the anxiogenic stimulus and to improve clinical anxiety.
Experimental extinction in the laboratory has been widely used as a model for exposure therapies of anxiety disorders (Graham & Milad,
2011; Urcelay,
2012) and to understand the origin of different forms of relapse (Vervliet et al.,
2013). According to this model, fear extinction depends on the development of inhibitory learning (Bouton,
2004; Craske et al.,
2014). Experimental laboratory models are therefore a relevant tool for testing new techniques that may potentiate inhibitory learning and eventually translate into improvements in exposure-based therapies (Craske et al.,
2014; Sewart & Craske,
2020; Vervliet et al.,
2013). In fact, there is ample consensus in the literature that relapse prevention depends crucially on the optimization of inhibitory learning (Craske et al.,
2008,
2014; Jacoby & Abramowitz,
2016; McGuire & Storch,
2019; Weisman & Rodebaugh,
2018).
Nevertheless, and although exposure-based therapies are successful in reducing anxiety in the short term, they do not always maintain their effects in the long term, with relapse estimates ranging from 19 to 62% (Craske & Mystkowski,
2006).
Several factors have been shown to be implicated in the recovery of the initial problematic response in the laboratory, both in non-human and human conditioning studies. For example, the mere passage of time after the extinction phase may lead to a relapse of the initial anxious response (i.e., the spontaneous recovery effect; Pavlov,
1927). Another factor that has been related to relapse is the experience of a stressful situation after extinction, even if that situation is unrelated to the initially anxiogenic stimulus (i.e., the reinstatement effect; Rescorla & Heth,
1975). A change in the context in which extinction was initially provided has also been shown to promote the renewal of the initial anxious response (i.e., the renewal effect; Bouton & Bolles,
1979; see Vervliet et al.,
2013 for a review of renewal research). Furthermore, after extinction, later encounters with the initial anxiogenic experience lead to a very rapid relearning, faster than the original learning experience (i.e., the rapid reacquisition effect; Bouton,
2002). Importantly, all these different forms of recovery have not only been observed in conditioning studies, but also in the clinical setting after exposure therapy (Boschen et al.,
2009; Craske et al.,
2012). Thus, the main current challenge for exposure-based therapies is not so much to achieve anxiety reduction but to prevent the relapse of the pathological anxiety response. In other words, there is ample room for improvement in the efficacy of this successful evidence-based therapy, with relapse prevention becoming a top priority in this regard (Dunsmoor et al.,
2015; Vervliet et al.,
2013).
It is important to note that, although here we refer to the conditioning of fear, these ideas can also be applied to appetitive contexts, in which a neutral cue becomes associated with an event of appetitive or positive significance (for instance, food or drugs). Additionally, although this type of conditioning is less studied in animals and, especially, in humans, it is highly relevant for advancing the study of certain pathologies, such as substance addiction, gambling, or obesity (Andreatta & Pauli,
2015; Quintero et al.,
2020; Ramnerö et al.,
2019; Schyns et al.,
2020). Moreover, equivalent relapse phenomena have been described in appetitive conditioning and, in fact, the relapse prevention strategy studied in this review was first proposed to prevent the relapse of appetitive responses (see Bouton et al.,
2004).
In recent years, several techniques aimed at potentiating inhibitory learning have been developed from laboratory extinction studies (see Craske et al.,
2014,
2018,
2022; Tolin,
2019; Vervliet et al.,
2013). One of these techniques is the occasional inclusion of reinforced trials during extinction (i.e., occasional reinforced extinction; ORE hereafter). This effect was initially described by Bouton et al. (
2004) and Woods and Bouton (
2007). In a series of animal classical and instrumental conditioning experiments, they found that including some pairings between the conditioned stimulus (CS) and the unconditioned stimulus (US) as part of the extinction procedure could slow down the rate of reacquisition of a previously extinguished response. Following these initial studies, several authors, like Craske et al. (
2014,
2018), argued that ORE may be a viable and general strategy to enhance inhibitory learning and its retrieval, with a potentially translational value in the clinical domain. Apparently, experiencing the US during extinction may provide some form of resilience to the individual which can be therapeutically beneficial (Krompinger et al.,
2019).
Different explanations have been proposed for the potential effectiveness of ORE on the mitigation of relapse. According to Bouton et al.’s account (
2004), the initial excitatory association acquired will stay unchanged after extinction. A new inhibitory association between the original CS and the US will be created during extinction, and this second association will be context dependent. This means that it will be engaged only when features of the extinction context are present. Therefore, at a later test phase, the conditioned response will be reduced to the extent that the test context resembles the extinction context. The reasoning is that the inhibitory memory will be retrieved and will reduce the expression of the original association. According to Bouton et al.’s studies, the preventive effects of ORE should be specific to rapid reacquisition and should not extend to other forms of relapse (Bouton et al.,
2004; Woods & Bouton,
2007). This should occur because reinforced trials in ORE, unlike standard extinction, become part of both the acquisition and the extinction contexts. Reacquisition will be slowed down as reinforced trials introduced after extinction will be able to promote the retrieval of the inhibitory learning developed during extinction. In the case of other recovery phenomena (e.g., spontaneous recovery), the test phase is conducted including only extinction trials. As has been explained, after ORE, the extinction memory will only be retrieved if both reinforced and non-reinforced trials are presented during test.
Gershman et al. (
2013) offer a different explanation for the preventive effects of ORE based on the concept of prediction error. According to their account, relapse prevention depends crucially on the specific distribution of reinforced trials during extinction, so that only a gradual decrease of reinforced trials after acquisition will have a preventive effect. This account assumes that the onset of a standard extinction training produces large prediction errors. The CS strongly predicts the presentation of the US after the initial training, but suddenly this is not the case anymore. They propose that these persistently large prediction errors may serve as a segmentation signal (i.e., a novel state in the environment), demanding the formation of a new inhibitory memory and thus, the original memory remains mostly unmodified. The newly formed inhibitory memory becomes context dependent (see also Bouton,
1993,
2002). However, should these prediction errors be small or infrequent, but still large enough to drive learning, as in ORE, no segmentation would occur, and the original acquisition memory will be weakened. Thus, according to Gershman et al. (
2013), a gradual ORE should have a general preventive effect on all forms of relapse (see Culver et al.,
2018, for other theoretical accounts of ORE general preventive effects).
In recent years, evidence has started to be gathered regarding ORE effects, including experiments with non-human and human participants, using appetitive as well as aversive procedures, and evaluating its effectiveness on different relapse phenomena such as spontaneous recovery, reinstatement, renewal, or rapid reacquisition. Given the rapid accumulation of evidence and the different, even contradictory, patterns of results obtained so far, a comprehensive and critical review of this evidence is necessary. Our objective was to conduct a systematic review of ORE studies to answer the following questions: Is there consistent evidence showing that ORE is effective in reducing the relapse of the conditioned response? Is this relapse prevention effect homogeneous across the different relapse phenomena tested? Under what specific circumstances have these effects been studied (for instance, type of sample, outcome measure, etc.)? What methodological criteria should be taken into account when testing the effectiveness of ORE (for example, distribution of reinforced trials, critical prerequisites to test the effectiveness of ORE, etc.)?
Clinical Studies: Effects of ORE in Therapy
A total of three clinical studies have applied an occasional reinforced intervention in a clinical setting. One was a case study with OCD patients (Krompinger et al., 2019), and the other two were clinical studies with snake-fearful adults (Jessup & Olatunji, 2022) and overweight women (Schyns et al.,
2020). As can be seen in Table
2, a great variety of measures were assessed, from symptom-related questionnaires (e.g., Yale-Brown Obsessive Compulsive Scale, Y-BOCS, or Eating Disorder Examination Questionnaire, EDE-Q) to expectancy ratings or behavioural tests (i.e., behavioural approach task, BAT).
In these studies, the intervention consisted of exposure experiences where the participants had to occasionally encounter the relevant stimulus or situation. For Krompinger et al. (
2019), this meant that two OCD patients underwent a treatment where they had to confront evidence “confirming their fears” (for instance, one of the patients, who was fearful of causing harm while driving, had to complete a driving exposure where she accidentally knocked over some warning signs that were on the road), taking this experience as an opportunity to learn and recover more easily (i.e., by realizing they can manage the situation despite the unpleasant occurrence). The patients engaged in daily sessions of extinction with response prevention treatment for several weeks, in addition to attending CBT groups. Symptom progression was assessed weekly and showed a reduction in OCD symptomatology (see Table
2 for more details).
Schyns et al. (
2020) were interested in the effects of cue exposure therapy aimed at strengthening inhibitory learning by violating the CS-US (i.e., food → eating) expectancy. They conducted eight exposure sessions in which participants were exposed to palatable food and instructed to eat a small amount of it once per session and at a variable point. Participants’ expectancies were then measured throughout the session and the researchers evaluated how those exposures affected different relapse-related measures. They compared their results to those from a control condition in which participants received eight sessions (four in person and four via telephone) of psychoeducation on body image, mindfulness, and lifestyle advice. Participants in both groups were evaluated before and after the intervention, as well as three months later. The authors found that the exposure intervention was more effective than the control condition in reducing snacking and binge eating behaviours (see Table
2 for more information).
Finally, Jessup and Olatunji (
2022) exposed participants to four videos of snakes that could be presented alone for 5 min or followed by another video of a snake biting a person (for 20 s) before returning to the initial video. Measuring expectancy ratings and behavioural approach tendencies before and after the intervention, as well as one week later, they found that occasionally reinforced trials during exposure diminished both measures in comparison to the standard exposure group.
As can be seen, the idea underlying these different procedures was to promote the violation of CS-US expectancies in order to strengthen inhibitory learning in these participants. Results from these three studies support a beneficial effect of ORE, with a significant reduction in the problematic symptomatology displayed by the individuals, even in the long term (for instance, Krompinger et al., 2019, report results from a 6-month follow-up in which reduced symptom levels were maintained).
Statistical and Methodological Considerations
Although we have focused our review on discussing procedural differences within the ORE literature that may explain the sometimes-contradictory results, statistical and methodological aspects should also be taken into account, as they could explain part of the variability observed when investigating the effectiveness of ORE.
The laboratory work discussed in this review varies substantially concerning the number of participants included in each study (see Table
1). In human studies, sample sizes tended to be relatively small, with some experiments including 17 participants (see Dunsmoor et al.,
2018). However, larger sample sizes have also been used, as in Thompson et al. (
2018), Lipp et al. (
2021), or Quintero et al. (
2022), with some of them including up to 157 participants after applying exclusion criteria. Importantly, all of them used a between-subjects design, which has reduced statistical power and limited precision compared to within-participants designs.
Another aspect worth mentioning is the statistical power of the experiments. On average, these studies included 25 subjects per experimental condition. With this sample size, studies are well powered only to detect very large effects (e.g., 80% power to detect a Cohen’s
d = 0.8). Moreover, out of the eleven studies conducted in the laboratory, only two reported power analyses: whereas Quintero et al. (
2022) performed a
post-hoc power calculation, Lipp et al. (
2021) used an a priori power analysis to establish the appropriate sample size.
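The power figure mentioned above can be checked with a rough calculation. The following is a minimal sketch, assuming a two-sided comparison of two independent groups at alpha = .05 and using a normal approximation to the t-test (which is slightly optimistic for small samples); the function name and its defaults are illustrative and are not taken from any of the reviewed studies.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_groups(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison.

    Uses a normal approximation (an assumption of this sketch), so values
    are slightly higher than those from an exact t-test calculation.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)   # critical value for the two-sided test
    ncp = d * sqrt(n_per_group / 2)     # expected test statistic under effect d
    # Probability of exceeding the critical value in either tail
    return (1 - z.cdf(z_crit - ncp)) + z.cdf(-z_crit - ncp)


# 25 participants per condition, for small, medium, and large effects
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: approximate power = {approx_power_two_groups(d, 25):.2f}")
```

Under these assumptions, 25 participants per condition give a power of roughly 0.8 for d = 0.8, whereas power falls well below that level for medium or small effects.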
Additionally, in some of those studies, especially the ones measuring physiological variables (namely SCR and startle), the data from some participants were not included in certain analyses (see Fig. 4, in Shiban et al.,
2015). For example, for the reinstatement test in Shiban et al. (
2015), the contingency and SCR data from 13 participants from the ORE group and 15 from the standard extinction condition were considered, whereas the startle analysis included the data from only 11 participants in the occasional reinforced group and 12 in the standard group. These authors point out that their small sample size should be taken as a limitation. This is especially important considering that data from small samples can lead to more variable results.
As for the type of contrasts used in the included studies, we found a wide variety of tests. For instance, while some studies calculated recovery as the difference between response levels at the end of extinction vs. at the beginning of the test, others compared acquisition and test response levels or solely compared the performance of the different groups at test. These differences in the way the ORE effect is calculated, in combination with the large variety of outcome measures (from expectancy ratings to SCR), hamper any formal comparison across studies and the synthesis of the results in a meta-analysis.
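To make the difference between these contrasts concrete, the sketch below computes the three quantities for a single participant. It is only an illustration: the function name, the choice of averaging the last or first trials of each phase, and the example ratings are all assumptions and do not reproduce the analysis of any particular study.

```python
from statistics import mean


def recovery_indices(acquisition_end, extinction_end, test_start):
    """Return three hypothetical recovery indices for one participant.

    Each argument is a list of responses (e.g., expectancy ratings) from
    the last trials of acquisition, the last trials of extinction, and
    the first trials of the test, respectively.
    """
    return {
        # Recovery relative to the end of extinction
        "test_minus_extinction": mean(test_start) - mean(extinction_end),
        # Responding at test relative to the end of acquisition
        "test_minus_acquisition": mean(test_start) - mean(acquisition_end),
        # Raw test level, the value entering a between-group comparison at test
        "test_level": mean(test_start),
    }


# Made-up expectancy ratings (0-100) for a single hypothetical participant
example = recovery_indices(
    acquisition_end=[80, 90],
    extinction_end=[10, 20],
    test_start=[40, 50],
)
print(example)
```

Because the three indices are on different baselines, the same data can suggest substantial recovery under one index and incomplete recovery under another, which illustrates why results computed with different contrasts are hard to pool.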
Out of the twelve laboratory studies, only Morís et al. (
2017) and Quintero et al. (
2022) made the data and scripts publicly available. However, none of the protocols was pre-registered. Out of the three clinical works, only Schyns et al.’s (
2020) study proposal had been previously published (see van den Akker et al.,
2016, for a detailed description of the protocol as well as a brief section including the proposed statistical analyses).
Discussion
Extinction has been proposed as the experimental model of exposure therapy, allowing researchers to investigate potential ways to improve the latter with results derived from the laboratory. In fact, several studies have found a correlation between the laboratory and the clinical outcomes (Ball et al.,
2017; Forcadell et al.,
2017; Hahn et al.,
2015; Waters & Pine,
2016; see Scheveneels et al.,
2021, for a review on this topic). In recent years, the number of studies investigating potential ways to improve extinction has increased rapidly, with the aim of maintaining low levels of the anxiety response in the long term (Craske et al.,
2014; Vervliet et al.,
2013).
The sparse inclusion of reinforced trials during extinction has been suggested as an effective strategy to achieve relapse prevention via the enhancement of inhibitory learning (Craske et al.,
2014). Initially described by Bouton et al. (
2004), ORE has been explored in several laboratory and clinical studies. In this review, we aimed at collecting and performing a critical analysis of the divergent existing literature about this extinction intervention, trying to answer various questions regarding the effect of ORE and the potential conditions that may account for its effectiveness. In the following sections, we will try to answer each of our research questions considering the results of our review.
Is there Consistent Evidence Showing that ORE is Effective in Reducing the Relapse of the Conditioned Response? Is this Relapse Prevention Effect Homogeneous Across the Different Relapse Phenomena Tested?
After conducting the systematic search and applying the inclusion and exclusion criteria, we selected a total of 15 reports, including 12 laboratory and three clinical studies published between 2004 and 2022. By and large, the effects of ORE in the laboratory (see Fig.
2) are not homogeneous across the different response recovery phenomena tested in the reviewed studies. The most consistent result seems to be the slowing down of the rate of reacquisition, both in animal and human experiments, although there are some negative results as well. Evidence tends to be less clear when it comes to other less studied recovery phenomena, yielding mixed results regarding the preventive effects of ORE (see Fig.
2). Therefore, it is still difficult to draw a clear conclusion on whether ORE is effective in reducing recovery.
Under What Specific Circumstances Have These Effects Been Studied?
The benefits of ORE are not homogeneous across all relapse phenomena tested. So, which characteristics of these studies may help understand the conditions under which those effects can be obtained?
First, not only are the results mixed, but the type of outcome measures assessed in the different studies also tends to differ. It should be noted that, although the inclusion of different measures is not uncommon and can even be advisable in the field (see Lonsdorf et al., 2017), the ORE literature offers a picture that is difficult to interpret. Not only is ORE not consistently effective in reducing specific recovery phenomena, but the response systems it affects tend to vary across studies (see the Primary outcome measures section for more details). In general, the evidence is mixed, and ORE has not been shown to be systematically effective at tackling specific response systems. Unfortunately, it cannot be confirmed whether these differences are telling us something about the dimension of the fear response that ORE could be modifying or whether the different results could be explained solely by procedural or methodological aspects.
It should be noted that it is not unusual to find divergences among the different components of the conditioned response (i.e., verbal, physiological, and behavioural indices). However, even if some components of the response are positively affected by ORE (i.e., preventive effects are observed), the fact that ORE does not influence all response components may eventually cause a more generalised response recovery (Boddez et al.,
2013). Moreover, it is not clear why ORE should affect certain response systems and not others. A more detailed examination of these discrepancies needs to be performed in the future if the field aims to generate solid and guiding evidence that could be applied to therapy.
Regarding the type of sample, as can be seen in Fig.
2, animal studies offer consistent evidence supporting the benefit of ORE. However, human studies provide less consistent results (i.e., six experiments with positive ORE effects, six experiments with negative effects, and one experiment with inconclusive results), making it difficult to judge whether ORE is really effective in reducing response recovery.
A wide variety of stimuli has also been used in the different empirical studies, which may have an impact on the learning processes and on the potential comparison among studies. For instance, whereas some stimuli may promote a stronger conditioned response, others may not be adequate to elicit such an intense emotional response (either negative or positive). In this case, learning could be hindered, as well as the interpretation of the results, obscuring any benefit of ORE.
What Methodological Criteria Should be Taken into Account When Testing the Effectiveness of ORE?
The heterogeneity among the studies can also be observed in methodological and statistical features. We found considerable heterogeneity in the procedures used across different studies, especially in terms of the number of trials per phase and, more importantly, the type of occasional reinforcement rate applied during extinction. Again, these differences complicate the comparison between studies and might have important implications considering that one of the theoretical explanations of ORE suggests that the original acquisition memory can only be modified by the gradual reduction of the CS-US pairings. Based on associative learning theories (e.g., Rescorla & Wagner,
1972), we may expect that the longer the conditioning, the stronger the association between stimuli (other things being equal). Hence, this may lead to differential effects, as it would not be the same to conduct extinction on memories established after an acquisition phase of variable duration. For instance, we would expect the acquisition memory to be stronger and more difficult to modify after twelve than after merely three acquisition trials, and this could potentially explain part of the discrepancies observed in the ORE literature.
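The expectation that longer acquisition yields a stronger association can be illustrated with the Rescorla-Wagner update rule, in which associative strength V changes on each reinforced trial by a fraction of the difference between the asymptote and the current V. The sketch below is a minimal illustration only: the learning rate of 0.3 and the asymptote of 1.0 are arbitrary assumptions, not parameters estimated from any of the reviewed studies.

```python
def rescorla_wagner(n_trials, learning_rate=0.3, asymptote=1.0, v0=0.0):
    """Return associative strength V after each of n_trials reinforced trials.

    On every trial, V moves toward the asymptote by a fixed fraction
    (learning_rate) of the remaining prediction error.
    """
    v = v0
    history = []
    for _ in range(n_trials):
        prediction_error = asymptote - v
        v += learning_rate * prediction_error
        history.append(v)
    return history


# Compare a short (3-trial) and a long (12-trial) acquisition phase
v_after_3 = rescorla_wagner(3)[-1]
v_after_12 = rescorla_wagner(12)[-1]
print(f"V after 3 trials:  {v_after_3:.3f}")
print(f"V after 12 trials: {v_after_12:.3f}")
```

With these illustrative parameters, the association after twelve trials is close to asymptote, whereas after three trials it is clearly weaker, so an identical extinction procedure would start from different associative strengths in the two cases.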
Importantly, some of the positive results observed in the ORE literature should be taken with caution. A low number of extinction trials, some of which are CS-US presentations (see Fig. 3 for a summary of the different reinforcement schedules that have been applied), could potentially hinder asymptotic extinction, especially when reinforced trials were still presented in the last trials of this phase. In fact, Culver et al. (2018) and Shiban et al. (2015) found a difference between the conditioned response to CS+ and CS− even at the end of extinction training. Morís et al. (2017) noted this in their first two experiments and opted for a more gradual decrease in their Experiment 3, in which complete extinction was observed in both ORE and standard extinction groups. Conceptually, an important prerequisite is that extinction must be effectively established before assessing any form of response recovery, especially in order to rule out any differences in conditioned response levels between groups before test that are not due to the experimental manipulation.
Although some studies evaluated various recovery phenomena in different experiments (for example, Gershman et al.,
2013, or Quintero et al.,
2022), others did not test them independently. We found that those studies evaluated the different phenomena in a sequential way, that is, one test after the other, which might have obscured any preventive benefit of ORE (Culver et al.,
2018; Lipp et al.,
2021; Thompson et al.,
2018) due to a potential carryover effect. For instance, evaluating spontaneous recovery could affect a later evaluation of reacquisition, and this latter test would not be a sensitive measure of the preventive effects of ORE. In this regard, the experiments that failed to offer support for the slower reacquisition effect after an occasional reinforced training evaluated different response recovery phenomena sequentially. So, even though this manipulation could have been able to slow down the rate of reacquisition, the cumulative effect of previous tests might have undermined the sensitivity to detect it.
Taken together, the differences in various relevant methodological and statistical aspects involved in the study of ORE might have a cumulative detrimental impact on the field, obscuring the potential effect this intervention could have. Moreover, some studies did not ensure minimal critical prerequisites to assess the effectiveness of extinction (i.e., asymptotic response levels prior to the test) or conducted experiments and/or analyses with small sample sizes (see the section on Statistical and methodological considerations of the ORE literature). These issues hinder a proper comparison across studies, making it more difficult to ascertain the effect ORE could have. Because of this, it can be concluded that, at this time, there is a dearth of clear and systematic laboratory evidence supporting the effectiveness an ORE treatment may have on the reduction of the recovery of the conditioned response.
Recommendations for Future Studies
Although we excluded theoretical articles from the final sample, it should be noted that throughout the literature search we found 18 reports of this kind, that is, articles that mention ORE as a potential strategy to enhance extinction learning and prevent or reduce relapse in the laboratory or within clinical settings (Bautista & Teng,
2022; Craske et al.,
2014,
2018,
2022; Dunsmoor et al.,
2015; Elsey & Kindt,
2017; Jansen et al.,
2016; Keller et al.,
2020; Kummar et al.,
2019; Lipp et al.,
2020; McGuire et al.,
2016; McGuire & Storch,
2019; Monfils & Holmes,
2018; Pittig et al.,
2016; Sewart & Craske,
2020; Tolin,
2019; van den Akker et al.,
2018; Weisman & Rodebaugh,
2018). Their discussion of the ORE effects varied, ranging from a simple description of promising results to recommendations on how to apply it in a clinical setting. From the detailed numbers, it can be concluded that there are more articles highlighting the potential effectiveness of an ORE intervention than actual empirical tests providing evidence of the suggested benefits. This is remarkable considering that the field lacks a standard protocol that could be widely implemented in laboratory or clinical settings and that this type of intervention has already been applied to clinical cases (see Jessup & Olatunji,
2022; Krompinger et al.,
2019; Schyns et al.,
2020). It is even more remarkable given the lack of clear and consistent evidence that emerges when the actual effectiveness of ORE in reducing relapse is examined closely.
Comparing the results from laboratory and clinical studies, one important factor that could be neglected in the lab would be the suitability of the procedures (for instance, the type of stimuli, the strength of the learning…). It is possible that conditioning and extinction, as studied in laboratory settings, do not really embody the experience that takes place within the clinical context, making it difficult to find strong and clear evidence. In contrast, clinical studies might facilitate the expression of any ORE benefit by conducting research in a more adequate and meaningful environment for the individuals. It should also be noted that the clinical application of ORE may entail several changes from the laboratory procedure, such as including additional intervention components (e.g., psychoeducation, expectancy violation intervention, multiple contexts exposure, etc.) or a different procedure than the one used in the lab (e.g., including only one reinforced presentation per exposure session, as in Schyns et al.,
2020, or only reinforced trials, as in Jessup & Olatunji,
2022). Those additional intervention components cast doubt on the idea that ORE is the key element in those positive results, therefore hindering a real interpretation of the effectiveness of this treatment. Moreover, if we consider the translational framework timeline (see Vervliet et al., 2013), more systematic evidence is desirable at early stages before advancing to the implementation of ORE with clinical samples. Additionally, individual differences are not being considered when evaluating the potential impact of ORE in the lab, despite their importance for anxiety and fear (see Lonsdorf & Merz,
2017), as well as in addictive behaviours (see Brunault & Ballon,
2021).
The already mentioned lack of clear and systematic evidence, as well as the great heterogeneity within the ORE literature, calls for the development of unified protocols (e.g., equivalent number of acquisition and extinction trials, similar reinforcement schedules, ensuring asymptotic extinction, independent study of different relapse phenomena, etc.), consideration of statistical aspects (for instance, including larger samples, an a priori calculation of statistical power, or establishing common tests for the effectiveness of ORE to allow comparison across studies), as well as for the adoption of Open Science practices (for instance, pre-registrations or registered reports, making data available, etc.), so that replication is facilitated in the future.
Some limitations of our review should also be noted. First, although we followed the PRISMA 2020 guidelines (Page et al.,
2021), we did not pre-register the systematic review (for instance, using the OSF or PROSPERO’s servers), nor did we perform several of the recommended practices (the PRISMA checklist may be found at https://osf.io/6nvta/). Additionally, although a second researcher was consulted when necessary, data search and entry were performed by one researcher. The small sample of the laboratory studies included, the different indices used in the studies to calculate response recovery, and the great variety of protocols did not allow us to conduct a meta-analysis, which could have provided additional information about the effects of ORE. Lastly, we only included published articles (see
Eligibility criteria), but there may be laboratory and clinical studies that have not been published yet due to publication bias (Dwan et al.,
2013; Franco et al.,
2014) and, therefore, were not included in this review. Future studies could search for unpublished results and develop the statistical approach needed to conduct a meta-analysis.