Associative learning allows organisms to predict events in their environments by enabling them to identify stimuli with informational value. Once such stimuli are identified, future uncertainty is reduced, and effective anticipatory responses can be made. The capacity for effective anticipation confers considerable advantages, but, despite its importance, many details of how this is accomplished are not well understood. The development of adequate theoretical models of associative learning has been difficult, not least because the environment is complex, with the meaning of predictive stimuli often being ambiguous and liable to change (Bouton, 1994; Bouton & Bolles, 1985). Bouton and his colleagues have focused on one particular type of ambiguity, that which arises when a conditioned stimulus (CS) first acquires a predictive value for an unconditioned stimulus (US) by being paired with the US. After this, the CS undergoes extinction by presentation in the absence of the US. Thus, the CS’s meaning is now ambiguous; it has served as both a predictor and a nonpredictor of the US. The crucial questions relate to the mechanisms that determine the selection of responses to such ambiguous stimuli.

In the present study, we investigated the role of context in determining response selection. Previous work in this area has used two principal experimental designs, both of which use three stages, with the context varying from stage to stage. In an ABA design, the first stage (acquisition) occurs in context A. The second stage (extinction) takes place in a different context, B, and the third stage (recovery test) occurs in context A. In contrast, in an ABC design, the recovery test is carried out in a novel context C. Many studies, using both types of designs, have reported recovery effects. Responding to the CS rises during acquisition in context A, declines during extinction in context B, and then recovers during the test. The effect has been reported in animals (e.g., using rats with ABC designs, Bouton & Bolles, 1979; Harris, Jones, Bailey, & Westbrook, 2000; and with ABA designs, Nakajima, Tanaka, Urushihara, & Imada, 2000) and in humans (e.g., with ABC and ABA designs: Pineño & Miller, 2004; Üngör & Lachnit, 2008). In the experiments reported below, we used ABC designs, but we will mention the ABA design again in the General Discussion because of its clinical relevance. For now, we confine attention to ABC recovery effects and the underlying mechanism. Although the ABC recovery effect appears straightforward enough, at least two different mechanisms can explain it (Bouton & Nelson, 1998; see Lovibond, Preston, & Mackintosh, 1984, for additional possibilities, as well as Nelson, Sanjuan, Vadillo-Ruiz, Pérez, & León, 2011). The protection-from-extinction and occasion-setting explanations for response recovery both apply to ABC designs, and a brief introductory account of each is provided below.

First, on the basis of the predictions of the Rescorla–Wagner theory of conditioning (Rescorla & Wagner, 1972), the context might acquire inhibitory strength during the extinction phase. This would be expected to have the effect of protecting the CS from extinction, so that response recovery would occur as soon as the CS was presented in the absence of that inhibitory extinction context. Several experiments have shown that extinction carried out in the presence of a conditioned inhibitor can indeed protect from extinction (e.g., Rescorla, 2003 [rats and pigeons]; Soltysik, Wolfe, Nicholas, Wilson, & Garcia-Sanchez, 1983 [cats]), thus confirming the principle. However, recovery studies do not usually carry out extinction with a putative inhibitor present. Instead, it is usually carried out in a context that is associatively neutral at the start of the extinction phase. The protection-from-extinction hypothesis of response recovery suggests that the acquisition of inhibition to the context during extinction would be sufficient to produce a protection-from-extinction effect on the target stimulus. Early evidence for this point has only been suggestive. Using rats, Calton, Mitchell, and Schachtman (1996) found that a neutral stimulus present during extinction offered protection‐from‐extinction, but did not determine whether or not that neutral stimulus actually became inhibitory. Using human subjects, Lovibond, Davis, and O’Flaherty (2000) found that an excitatory stimulus present during extinction also protected from extinction. The latter result is puzzling because, taking a Rescorla–Wagner analysis of the extinction process, the presence of an excitatory cue during extinction should have produced more, rather than less, extinction. Furthermore, in both of the studies cited above, the stimulus in the protective role was a discrete local cue rather than a global context. 
Although nonreinforced exposure to a discrete excitatory cue in the presence of a diffuse cue has been shown in pigeons to result in the development of inhibition to the diffuse cue (Rescorla, 1999), other studies with rats have failed to show inhibition developing to contexts when excitatory cues are presented nonreinforced in those contexts (Bouton & King, 1983; Bouton & Swartzentruber, 1986; Grahame, Hallam, Geier, & Miller, 1990; see Nelson et al., 2011, for a discussion of human work). Some evidence from rat studies has also shown that diffuse internal contexts produced by alcohol can acquire inhibitory properties during extinction (Cunningham, 1979). More recently, two experiments by Polack, Laborda, and Miller (2012) have shown that, at least within specific parameters, a physical context such as a Skinner box can indeed acquire inhibitory properties. Using rats, these authors reported that an extinction context passed both summation and retardation tests for inhibition (Rescorla, 1969).
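As an illustration of the protection-from-extinction argument, the following minimal Python sketch applies the Rescorla–Wagner update, ΔV = αβ(λ − ΣV), to a CS conditioned in context A and then extinguished in an initially neutral context B. The salience values, trial numbers, and function names are illustrative assumptions rather than parameters taken from any of the cited studies, and exposure to the context between trials is ignored.

```python
def rw_update(v, present, lam, alpha, beta=1.0):
    """Apply one Rescorla-Wagner trial to every cue that is present."""
    # Prediction error is computed from the summed strength of present cues.
    error = lam - sum(v[cue] for cue in present)
    for cue in present:
        v[cue] += alpha[cue] * beta * error
    return v


def simulate(n_acq=50, n_ext=50):
    """Acquisition in context A, then extinction in context B."""
    # Assumed saliences: the discrete CS is more salient than the contexts.
    alpha = {"CS": 0.3, "ctxA": 0.1, "ctxB": 0.1}
    v = {"CS": 0.0, "ctxA": 0.0, "ctxB": 0.0}
    for _ in range(n_acq):
        rw_update(v, ["CS", "ctxA"], lam=1.0, alpha=alpha)  # CS -> US
    for _ in range(n_ext):
        rw_update(v, ["CS", "ctxB"], lam=0.0, alpha=alpha)  # CS alone
    return v


if __name__ == "__main__":
    strengths = simulate()
    for cue, value in strengths.items():
        print(f"{cue}: {value:+.3f}")
```

Under these assumed parameters, the extinction context ends with negative (inhibitory) strength while the CS retains positive strength, so the model predicts renewed responding when the CS is tested in a neutral context.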

Second, the response suppression in context B that is lost on test in an ABC design could be produced by occasion-setting (Bouton, 1994; Holland, 1992; Swartzentruber, 1995). An occasion-setting stimulus controls the association between a CS and a US by acting on the association rather than having a direct connection with the US. This effect is demonstrated in experiments that have established that an occasion-setter can simultaneously control both excitatory and inhibitory associations conditioned to different stimuli. For example, cue Y is reinforced in the presence of X and nonreinforced when presented alone, while cue Z is nonreinforced in the presence of X and reinforced when presented alone (e.g., using rats; Holland & Reeve, 1991). The occasion-setting performed by X under these conditions seems unlikely to be based on an association between X and the US, because although such an association would facilitate responding to Y, it could not simultaneously suppress responding to Z. Thus, occasion-setters can control excitatory and inhibitory associations, and in the case of response recovery effects, the context might act as an occasion-setter controlling an inhibitory link between the CS and the US. On this view, the CS generates an expectation of the US, but only in the absence of the occasion-setting context. The fact that occasion-setting functions are more easily established with serially rather than simultaneously presented occasion-setter-plus-target pairs suggests that contexts may be in a good position to act as occasion-setting stimuli. Like serially presented occasion-setters, contexts are present before the arrival of the CS–US pairings involved in conditioning. The context is also present during and after the CS–US pairings, but that may not be crucial, because occasion-setters do not readily lose their power by exposure alone (Holland, 1992; Murphy & Skinner, 2005 [rats]; Rescorla, 1986 [pigeons]).

Given these different mechanisms for response recovery, it is vital for any study demonstrating recovery to explore the mechanisms before any claims are made that rely on one mechanism or the other. However, few human studies of recovery effects have fully examined the potential mechanisms involved. For example, response recovery observed in an ABA design could be produced by either of the mechanisms outlined above, in addition to summation between the residual associative strength of the cue and the test context (Vansteenwegen et al., 2005). The ABC design, on the other hand, involves testing in a novel, and hence associatively neutral, context C, effectively ruling out test context–outcome associations as explanations of the recovery. But, unless tests are made for the inhibitory strength of the extinction context, it is not possible to distinguish recovery due to a protection-from-extinction mechanism from recovery due to occasion-setting or some yet-to-be specified mechanism in such experiments.

An exception to this lack of investigation into mechanism is a recent study using humans by Nelson et al. (2011). In this experiment, Nelson et al. also used ABA and ABC designs to rule out summation for the observed ABA recovery effect; recovery effects were clear in both designs. This study also directly tested for inhibition conditioned to the extinction context by presenting an excitatory test stimulus in that context in order to form a summation test for inhibition (Rescorla, 1969). Presentation of this test stimulus yielded no evidence of inhibition; responses were the same in the extinction context and in an associatively neutral control context. Thus, because other explanations were effectively ruled out, the recovery effects observed in that study are perhaps best understood as being based on an occasion-setting type of extinction, controlled by the extinction context. It should be pointed out, however, that appropriate tests to assess the occasion-setting properties of contexts (i.e., selective transfer of control and control following counterconditioning; see Swartzentruber, 1995) following a simple extinction treatment have not been conducted in any study. This single demonstration with humans that the context of extinction does not appear inhibitory does not mean that all recovery effects are based on the same mechanism. Instead, it provides a clear demonstration of a recovery effect in the absence of direct inhibitory context–outcome associations, and suggests the need to provide further analysis of the basis of recovery effects in other studies, and to establish the conditions under which one effect or another might be observed.

The present research was designed to further determine the mechanism(s) of recovery and to assess ways to reduce response recovery, because of the clinical implications. A growing number of studies using rats and humans have shown reduced recovery after multiple-context extinction (e.g., Chelonis, Calton, Hart, & Schachtman, 1999; Gunther, Denniston, & Miller, 1998 [rats]; Glautier & Elgueta, 2009; Pineño & Miller, 2004 [humans]). Notably, one of the straightforward predictions of the protection-from-extinction account of response recovery is that recovery should be reduced following extinction carried out in multiple contexts. According to this hypothesis, a multiple-context extinction procedure should cause the buildup of inhibition to be distributed across several extinction contexts, resulting in the target cue being extinguished against a less inhibitory background with each context change. Thus, there would be less protection from extinction across the course of extinction. However, in none of the experiments in which multiple-context extinction has been shown to reduce response recovery has the inhibitory role of the context been assessed to either support or deny the protection-from-extinction account of response recovery. In the present work, we explored the role of context inhibition in response recovery while assessing whether multiple-context extinction can reduce recovery.
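The multiple-context prediction can be made concrete with a small Rescorla–Wagner sketch in Python. It extinguishes a CS either in one context or in two alternating contexts and returns the associative strength that the CS retains. The saliences, the starting strength of 1.0, the eight extinction trials, and the switch after every two trials are all assumptions chosen for illustration, and the function name is hypothetical.

```python
def residual_cs_strength(n_contexts, n_trials=8, switch_every=2,
                         alpha_cs=0.3, alpha_ctx=0.1, v_cs0=1.0):
    """Extinguish a CS across n_contexts alternating contexts.

    Returns the final CS strength and the list of context strengths.
    All contexts start associatively neutral (strength 0).
    """
    v_cs = v_cs0
    v_ctx = [0.0] * n_contexts
    for trial in range(n_trials):
        # Index of the context that is active on this trial.
        k = (trial // switch_every) % n_contexts
        # Nonreinforced trial: lambda = 0, so error = -(summed strength).
        error = 0.0 - (v_cs + v_ctx[k])
        v_cs += alpha_cs * error
        v_ctx[k] += alpha_ctx * error
    return v_cs, v_ctx


if __name__ == "__main__":
    single_cs, single_ctx = residual_cs_strength(n_contexts=1)
    multi_cs, multi_ctx = residual_cs_strength(n_contexts=2)
    print("single context:", round(single_cs, 3), single_ctx)
    print("two contexts:  ", round(multi_cs, 3), multi_ctx)
```

With these assumed values, the CS retains less associative strength after extinction in two alternating contexts than after extinction in one, which is the direction predicted by the protection-from-extinction account.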

Experiment 1

In Experiment 1, participants learned a response to a stimulus that was then extinguished either in one or in multiple (two) different contexts. As the extinction context should come to control extinction performance, either by an occasion-setting mechanism (Bouton, 1993) or by the context acquiring inhibition, we expected extinction to be more rapid in the single-context group. In the multiple-context group, changing the context should remove a source of control of the extinction performance that could produce recoveries of responding, and therefore slower extinction. In addition, and of crucial interest, the multiple-extinction-context group should show less recovery of responding on test. As each change in context removes any control of extinction performance acquired by that context, any resulting extinction performance should be more likely to be controlled by learning about the stimulus rather than by the context, enabling that learning to be expressed independently of the context. Also, to the extent that the contexts acquire some control of extinction performance, whether by occasion-setting or direct inhibitory associations with the outcome, extinction in multiple contexts should enhance the ability of that control to generalize to new contexts. However, in the case of the protection-from-extinction account of response recovery, any differences between the single- and multiple-context conditions would be mediated by an inhibitory association developing between the extinction context and the extinction outcome. In order to distinguish these accounts, we included a summation test as a measure of any inhibitory potential accruing to the extinction context.

Method

Participants

A group of 69 participants were recruited via posted advertisement and word of mouth from the University of Southampton, Highfield campus. They were paid £4 or received course credit for participating. Their average age was 20 years, and they included six males.

Apparatus

Three personal computers were used, with screens measuring 41 cm × 26 cm (W × H). The screens were run in 32-bit color mode with pixel resolutions of 1,440 × 900. The display was controlled by a computer program written in Microsoft Visual Studio 2008 C# language, and we used XNA Game Studio Version 3.1 for 3-D rendering of the experimental scenario. XNA sound facilities were also used to present auditory stimuli via small speakers mounted to both sides of the screens.

Design and procedure

Participants were tested individually in one of three small experimental cubicles, each housing one PC, a desk, and a chair. Participants were assigned to one of three groups—no extinction, single extinction context, or multiple extinction contexts—as described below. Assignment to groups was done using a group number balancing algorithm. On each PC, if the numbers of participants in the groups were equal, group assignment was done at random. In the case of unequal group numbers, a participant would be allocated to the smaller group. In this way, we tested 23 participants in each group. To begin, they were given a brief verbal description of the procedure before reading and signing a consent form. Next, a more detailed description of the procedure was presented onscreen for participants to read, along with a voiceover of the text, played through the PC speakers. The text is reproduced below:

In this experiment you will watch tests of various objects passing a special sensor. Your job is to learn how the sensor responds to the different test objects. The sensor may show red or green and you have to try to predict the sensor response while the objects are passing through a prediction window. Make your predictions by pressing the key R or the key G while the objects are in the prediction window. Key presses made while the objects are outside the window will not count. You should aim to make as many correct predictions as possible, and minimise incorrect predictions. Tests may be carried out in one of four test containers, each of which might hold a different gas. Before the experiment starts for real, we will have some practice trials. In the practice trials you have to predict which objects turn the sensor blue. Make your predictions by pressing the key B while the practice objects are in the prediction window. Review these instructions on the screen. When you are sure that you understand what is required, press the key C to continue. You will be told when the practice trials have finished and when the experiment begins running for real. Remember, during the practice trials you have to predict when the sensor will turn blue. When the experiment begins running for real you have to predict red or green. Ask the experimenter if you have any questions or press the key C to begin.
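The group-balancing rule described in the procedure (random assignment when group sizes are equal, otherwise assignment to the smaller group) might be sketched in Python as follows. This is not the authors' C# code. The group labels and function name are hypothetical, a single counter is used rather than one per PC, and ties among the smallest groups are assumed to be broken at random.

```python
import random

GROUPS = ("no_extinction", "single_extinction", "multiple_extinction")


def assign_group(counts, rng=random):
    """Assign the next participant and update the running counts in place."""
    smallest = min(counts[g] for g in GROUPS)
    # If all groups are equal, every group is a candidate (pure random draw);
    # otherwise only the currently smallest group(s) are eligible.
    candidates = [g for g in GROUPS if counts[g] == smallest]
    group = rng.choice(candidates)
    counts[group] += 1
    return group


if __name__ == "__main__":
    counts = {g: 0 for g in GROUPS}
    for _ in range(69):
        assign_group(counts)
    print(counts)
```

Because the rule never lets group sizes differ by more than one, 69 participants necessarily end up as three groups of 23 under this sketch.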

After participants pressed key C, the text was removed from the screen, and a context-change animation followed. The context-change animation began with an external view of four boxes, arranged in a 2 × 2 grid and seen from a distance. After a delay of 2 s, the screen showed a first-person camera view of a flight between the distant viewpoint and the inside of the box that was programmed as the context for that phase of the experiment. Thus, the simulated flight from a distant viewpoint into one of the boxes ended in a view of the inside of one of four boxes, as is illustrated in Fig. 1. The boxes differed in terms of visual texture and color. Replay of the Pink Floyd audio track “On the Run” began, and this track was played in a continuous loop as background music throughout the remainder of the experiment. Within the onscreen box, participants saw a pyramid shaped “sensor.” Directly above the sensor was a translucent band forming the prediction window. From time to time, as dictated by the experimental protocol, a 3-D shape entered the box from the top of the screen and fell down through the box at a constant speed. The fall through the box was in a straight line with a randomly selected screen x-axis offset for each trial within the left/right boundaries of the prediction window. Entry to the box marked the start of a trial, which lasted approximately 4.8 s in total. During the shape’s transit through the box, it rotated slowly around its x-axis, parallel to the plane of the screen, to give a good overall view of its structure. The shape was always visible as it passed behind the prediction window, and then in front of the sensor, before disappearing from the bottom of the screen. 
The approximate times for the different parts of screen crossing were 2.1 s from entering at the top of the screen to entering the prediction window, 1.2 s behind the prediction window, and 1.5 s for the passage between the bottom of the prediction window and exit at the bottom of the screen. If an outcome was scheduled to occur, the sensor would flash with the scheduled outcome color, accompanied by the sound of a buzzer. Outcomes occurred during the passage of the shape between the bottom of the prediction window and its exit at the bottom of the screen. Exit of the shape from the screen ended the trial and began an intertrial interval of 5.2 s.
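The trial timing above can be summarized in a short Python sketch that checks that the three segments sum to the reported trial duration and decides whether a keypress falls inside the prediction window. The constant and function names are hypothetical, and treating the window as closed at its start and open at its end is an assumption.

```python
# Approximate durations (s) reported for each part of a trial.
PRE_WINDOW_S = 2.1   # top of screen to prediction window
IN_WINDOW_S = 1.2    # behind the prediction window
POST_WINDOW_S = 1.5  # window exit to bottom of screen (outcome period)
ITI_S = 5.2          # intertrial interval


def trial_duration():
    """Total time from shape entry to shape exit."""
    return PRE_WINDOW_S + IN_WINDOW_S + POST_WINDOW_S


def response_counts(t_since_entry):
    """True if a keypress at this time falls inside the prediction window."""
    window_start = PRE_WINDOW_S
    window_end = PRE_WINDOW_S + IN_WINDOW_S
    return window_start <= t_since_entry < window_end


if __name__ == "__main__":
    print(f"trial duration: {trial_duration():.1f} s")
    print("press at 2.5 s counts:", response_counts(2.5))
    print("press at 4.0 s counts:", response_counts(4.0))
```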

Fig. 1 Screen shots showing the general features of the experimental screen display, examples of different predictive shape stimuli, and the red outcome event

A total of 16 different shapes were used. These were made from all combinations of four binary-valued features. All of the shapes were based on a cube that could vary in terms of size (small [side ~1 cm] or large [side ~2 cm]), distortion (shear or twist of the top face), color (yellow or turquoise), and the pattern decorating each face (zigzag lines or a jug [Wingdings symbols 98 and 104]). Figure 1 provides some illustration. In the top panel, a large turquoise cube, with a twist made to its top face and the jug pattern, is shown behind the prediction window. Responses made by the participant while the stimulus was behind the prediction window were counted. In the bottom panel, a small yellow cube, with a shear applied to its top face and with the zigzag pattern is shown after leaving the prediction window. The red flashing outcome is shown in progress. Responses made during this part of the screen transition were not counted, as they would no longer be predictions. The accumulated number of correct responses was displayed at the bottom left-hand side of the screen, alongside a note giving the status of the experiment and a reminder of the task requirement.
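The stimulus set can be enumerated as the Cartesian product of the four binary features. The Python sketch below does this with feature labels taken from the description above; the dictionary layout and function name are illustrative assumptions.

```python
from itertools import product

# Four binary-valued features, as described in the text.
FEATURES = {
    "size": ("small", "large"),
    "distortion": ("shear", "twist"),
    "color": ("yellow", "turquoise"),
    "pattern": ("zigzag", "jug"),
}


def all_shapes():
    """Return every combination of feature values as a list of dicts."""
    names = list(FEATURES)
    return [dict(zip(names, combo)) for combo in product(*FEATURES.values())]


if __name__ == "__main__":
    shapes = all_shapes()
    print(len(shapes), "shapes")
    for shape in shapes:
        print(shape)
```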

To begin, participants were presented with a series of eight practice trials that resembled the experimental trials to come, except that different predictive stimuli (chosen from the same set) and a different outcome (blue) were used. When the practice trials were over, an auditory and onscreen notice informed participants that the practice was over and that the experiment proper had begun. A context-change animation then took place to begin the experiment proper. The experiment proper consisted of 92 trials with the design shown in Table 1. During Stage 1, ten trials each were presented for cues A and G, and 20 trials each for cues B and C. The additional trials with B and C were used to make all outcomes equiprobable during this stage. Stage 2 included 24 trials, eight of each type. Summation Stage 2a consisted of two trials with cue G, and the test stage involved two trials of each type in a novel context, making a total of 92 trials over the four stages. On each trial, participants could press either the R or the G key while the shape was in the prediction window, in order to make a prediction of the red or the green outcome, or they could predict no outcome by withholding these keypresses. The order of the trials was randomized for each participant, subject to the constraint that no more than four of each type could occur in sequence. The actual shapes serving in the stimulus roles A–C were selected at random for each participant from the 16 possible shapes. Outcomes X and Y were selected to be a red and a green flash, or vice versa, on a random basis for each participant; in the design, Z signifies no outcome. Trials took place in one of three distinctive boxes serving in the roles of contexts A–C, with the actual box serving in each of the context roles being randomly selected for each participant from the four possibilities. Transitions between contexts were preceded by the context-change animation sequence described earlier.
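One simple way to produce a trial order that respects the run-length constraint is rejection sampling: shuffle the trial list and reshuffle if any cue occurs more than four times in a row. The Python sketch below applies this to the Stage 1 trial counts. It is an assumed approach, not the procedure used by the original program, and it assumes that the constraint applies within a stage.

```python
import random

# Stage 1 trial counts per cue, as given in the text (60 trials in total).
STAGE1_COUNTS = {"A": 10, "G": 10, "B": 20, "C": 20}


def max_run(seq):
    """Length of the longest run of identical consecutive items."""
    longest = current = 0
    previous = None
    for item in seq:
        current = current + 1 if item == previous else 1
        longest = max(longest, current)
        previous = item
    return longest


def constrained_order(counts, max_repeat=4, rng=random, max_tries=10_000):
    """Shuffle trials until no cue appears more than max_repeat times in a row."""
    trials = [cue for cue, n in counts.items() for _ in range(n)]
    for _ in range(max_tries):
        rng.shuffle(trials)
        if max_run(trials) <= max_repeat:
            return list(trials)
    raise RuntimeError("no valid order found within max_tries")


if __name__ == "__main__":
    order = constrained_order(STAGE1_COUNTS)
    print("".join(order))
```

Rejection sampling is adequate here because most random shuffles of these counts already satisfy the constraint; a constructive algorithm would only be needed for much tighter limits.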

Table 1 Design of Experiments 1 and 2

In summary, all participants experienced an acquisition phase (Stage 1) in which target responses to cues A and G were acquired on the basis of pairings with outcome X. Stage 2 took place in a different context, where the response to cue A was extinguished for two groups. For half of the extinction participants (single-extinction group), extinction took place in a single context, whereas for the other half (multiple-extinction group), two extinction contexts were used, with contexts alternating after six trials of each type. For the third, no-extinction, group, presentations of cue A also occurred during Stage 2, but the outcome X pairing established during Stage 1 continued. In Stage 2a, nonreinforced trials with cue G were presented in the extinction context as a summation test for the inhibitory strength of the extinction context. In the Test Stage, a novel context was introduced, and cue A was presented for a recovery test. Cues B and C were included as fillers to provide an appropriately complex learning task to properly engage the attention of participants. Responses to these cues were not of central interest, so the analysis below focuses on responses to cue A, in particular on the outcome X responses that were conditioned with cue A during Stage 1.

Results

The left panel of Fig. 2 shows responses to cue A during Stages 1 and 2 and during the response recovery test. As can be seen, all groups learned to respond appropriately to cue A during Stage 1. Responses to filler cues also developed as expected during Stage 1, further confirming that the learning procedure was effective. At the end of Stage 1, the proportions of correct responses (defined as the Stage 1 appropriate responses) were significantly higher than would be expected from chance for all cues [one-sample ts(68) > 15.7, ps < .001].

Fig. 2 Mean proportions (±1 SE) of Stage 1 appropriate responses to cues A (left panel, responses for each of ten blocks of trials) and G (right panel, responses for each of six blocks of trials) for the three experimental groups (single extinction, multiple extinction, and no extinction) in Experiment 1

On entry to Stage 2, responses to cue A declined under the extinction contingency in both of the extinction groups. The left panel of Fig. 2 shows that this was specific to the extinction treatments, as judged by the differences between the extinction groups and the no-extinction group during Stage 2. In this and subsequent analyses of variance (ANOVAs), Greenhouse–Geisser corrected degrees of freedom were used whenever violations of sphericity were detected in the SPSS Mauchley sphericity test (Greenhouse & Geisser, 1959; IBM Corp, 2010; Mauchley, 1940). A 3 (group: no, single, or multiple extinction) × 4 (block) ANOVA on the Stage 2 data for cue A produced a significant effect of block, a Block × Group interaction, and an effect of group [F(2.59, 171.6) = 28.8, F(5.17, 171.6) = 10.8, and F(2, 66) = 76.2, respectively; all ps < .001]. As can be surmised from Fig. 2, the Block × Group interaction was produced by the difference in responding between the no-extinction group and the other two groups. We found no block effect in the no-extinction group for cue A across Stage 2 [F(3, 66) < 1]. Considering the single- and multiple-extinction groups alone, a 2 (group: single or multiple extinction) × 4 (block) ANOVA on the Stage 2 data for cue A produced a significant block effect [F(2.49, 109.7) = 40.0, p < .001] and provided no evidence that these two groups differed; neither the group effect nor the Block × Group interaction was significant in this analysis (Fs < 1). At the end of Stage 2, the no-extinction group was producing 96 % Stage 1 responses to cue A, whereas the single- and multiple-extinction groups were producing 9 % and 22 % Stage 1 responses to cue A, respectively. These proportions all differed significantly from chance expectation [one-sample t tests, ts(22) > 1.87, ps < .05]. 
Although in the last block the multiple-extinction group was producing more Stage 1 responses to A than was the single-extinction group, this difference was not significant [t(44) = 1.77, p < .1].
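For readers checking the reported degrees of freedom, the following Python sketch computes the degrees of freedom for a Group × Block mixed ANOVA with equal weighting of participants, and shows how a Greenhouse–Geisser ε scales the within-subjects terms. The function name is hypothetical, and no ε value is taken from the analyses reported here.

```python
def mixed_anova_df(n_participants, n_groups, n_blocks, epsilon=1.0):
    """Degrees of freedom (effect, error) for a Group x Block mixed ANOVA.

    epsilon is the Greenhouse-Geisser correction factor; it multiplies the
    within-subjects (block and interaction) terms and leaves the
    between-subjects group effect unchanged. epsilon = 1.0 means no correction.
    """
    error_between = n_participants - n_groups
    block = n_blocks - 1
    error_within = block * error_between
    return {
        "group": (n_groups - 1, error_between),
        "block": (epsilon * block, epsilon * error_within),
        "interaction": (epsilon * (n_groups - 1) * block,
                        epsilon * error_within),
    }


if __name__ == "__main__":
    # Three groups of 23 participants and four Stage 2 blocks.
    for effect, df in mixed_anova_df(69, 3, 4).items():
        print(effect, df)
```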

The introduction of the novel context during the Test Stage reduced X responses in the no-extinction group but increased responses for the single- and multiple-extinction groups (left panel, Fig. 2). The reduction in responding in the no-extinction group was due to the nonreinforcement of cue A on the two test trials, but the increase in responding in the single and multiple groups represents response recovery. The issue of interest was whether or not differences would emerge between these two groups. A 2 (group: single or multiple) × 2 (block) ANOVA was carried out to examine this hypothesis. The two levels of block used were the last block of Stage 2 and the test block. The ANOVA produced a significant block effect and a significant Block × Group interaction [Fs(1, 44) > 6.15, ps < .05], but the group effect was not significant [F(1, 44) < 1]. Responses to cue A did not increase significantly between the end of Stage 2 and the test block for the multiple-extinction group, whereas the difference was significant for the single-extinction group [t(22) = 0.70 and t(22) = 4.22, p < .001, respectively].

A feature of the present experimental design was the inclusion of a summation test to determine whether or not the extinction context became inhibitory for the extinguished outcome during Stage 2. The results of this test are shown in the right panel of Fig. 2. There, it can be seen that responses to cue G were learned equivalently during Stage 1 for all three groups. During Stage 2, cue G was not presented again until the last two trials. The average response to cue G during these trials is shown in Fig. 2, with the label Stage 2a. Responses to cue G were ordered by group, with the no-extinction group producing the most and the single-extinction group the least Stage 1 appropriate responses. A one-way ANOVA with Group as the single factor on the Stage 2a responses to G confirmed that these differences were statistically reliable [F(2, 66) = 11.2, p < .001]. Follow-up t tests showed that the no-extinction group produced more Stage 1 appropriate responses to G in the extinction context than did either of the other two groups [ts(44) > 2.7, ps < .05], but the difference between the single- and multiple-extinction groups failed to achieve significance with α = .05 [t(44) = 1.95, p = .06]. The direction of the effect, however, was consistent with the a priori hypothesis that responses to G would be lower in the single- than in the multiple-extinction group, reflecting higher context inhibition in the single-extinction group.

Discussion

Acquisition, extinction, and response recovery were observed according to expectations. Because we used an ABC design, in which the recovery test is carried out in an associatively neutral context, the recovery cannot have been due to summation between the associative strength of the test context and that of the cue under test. Instead, the observed recovery must be based directly on the associative strength of the target cues.

The experiment confirmed that measurable inhibition had accrued to the extinction context, and that the single- and multiple-extinction groups produced different levels of recovery: Recovery was observed in the single-extinction group, but not in the multiple-extinction group. However, the evidence that the differential recovery could be explained by differing levels of inhibition was more suggestive than firm. The summation test produced evidence that the context was more inhibitory in the single-extinction than in the multiple-extinction group, but the effect was small and failed to reach significance. During extinction, the observed rates matched our expectations. In these samples, extinction occurred more rapidly in the single-extinction than in the multiple-extinction condition. Nonetheless, those differences were not confirmed statistically, so we are unable to reach firm conclusions about this effect.

Experiment 2

Experiment 1 provided evidence for less response recovery after extinction in multiple contexts as compared to extinction in a single context. On its own this is not theoretically decisive, but it does carry important clinical implications, so a replication would be of interest. We also observed inhibition developing to the extinction context. This result is theoretically important, because it provides specific support for the protection-from-extinction account of response recovery. The main source of evidence for context inhibition came from summation tests involving presentation of target cue G in context B and comparing responses to G in the extinction and no-extinction groups. We observed a clear inhibition effect overall, but the protection-from-extinction hypothesis also predicts differences in the amounts of inhibition between the single- and multiple-extinction groups. We obtained such an effect in the predicted direction, but it was not strong, so once again replication is called for.

As well as replicating key results, Experiment 2 was designed to determine the viability of an alternative explanation for the “inhibition” effect observed in Experiment 1. We assumed that lowered X responses to G in the extinction groups were caused by an inhibitory association between context B and outcome X. However, in the no-extinction condition, cue A continued to be paired with outcome X in context B during the extinction phase. Thus, greater associative strength between context B and outcome X in the no-extinction than in the extinction conditions could have been a source of group differences. Therefore, in the present experiment, we repeated Experiment 1 but added a further control group, “no-extinction no-A → X,” against which to assess the effect of the cue A–outcome X pairings in the no-extinction group.

Method

The experimental procedures were the same as those used in Experiment 1, with the exceptions noted below.

Participants

A group of 99 participants took part. Their average age was 17 years, and they included 45 males. They were recruited from three different sixth-form (age 16–18) colleges in Hampshire, UK, and were tested during a site visit. Participation was voluntary.

Apparatus

Tests were carried out at three computer workstations, set up together in a mobile research laboratory in the load compartment of a specially equipped Citroen Relay van instead of individual cubicles. To minimize interference between participants, the auditory stimuli were presented over headphones instead of speakers.

Design and procedure

The design is given in Table 1. Four groups were used, with 25 participants being allocated to the no-extinction, no-extinction no-A → X, and multiple-extinction groups, and 24 participants allocated to the single-extinction group. Group no-extinction no-A → X was treated exactly like group no-extinction, except that the A → X trials during extinction were omitted. Participants were tested in groups of up to three.

Results

Figure 3, left panel, shows responses to cue A during Stages 1 and 2 and during the response recovery test. Once again, participants learned differential responses to all cues in Stage 1; at the end of the stage, the proportions of correct responses to all cues exceeded chance levels [one-sample ts(98) > 13.1, ps < .001].

Fig. 3

Mean proportions (±1 SE) of Stage 1 appropriate responses to cues A (left panel, responses for each of ten blocks of trials) and G (right panel, responses for each of six blocks of trials) for the four experimental groups (single extinction, multiple extinction, no extinction, and no extinction no A → X) in Experiment 2. Note that there were no presentations of cue A during Stage 2 for the no-extinction no-A → X group.

During Stage 2, responses to cue A declined for the extinction groups but remained at Stage 1 levels, or increased slightly, for the no-extinction group (group no-extinction no-A → X did not have A trials during Stage 2). As in Experiment 1, the proportions of X responses were lower during Stage 2 for the single- than for the multiple-extinction group. However, in contrast to Experiment 1, we also found some statistical support for more rapid extinction in the single-extinction than in the multiple-extinction group. A 3 (group: no, single, or multiple extinction) × 4 (block) ANOVA on the Stage 2 data for cue A produced significant effects of block and group and a Block × Group interaction [F(3, 213) = 37.1; F(2, 71) = 77.0; and F(2, 71) = 18.3, respectively; all ps < .001]. No block effect was apparent for the no-extinction group [F(3, 72) = 2.21], and a Group × Block ANOVA for the single- and multiple-extinction groups produced a block effect, a marginal group effect, but no Block × Group interaction [F(3, 141) = 75.8, p < .001; F(1, 47) = 4.03, p = .05; and F(3, 141) = 1.28, respectively]. At the end of Stage 2, the no-, single-, and multiple-extinction groups were producing 86 %, 4 %, and 22 % Stage 1 responses, respectively. Responding in the no-extinction and single-extinction groups differed from chance levels [t(24) = 9.72 and t(23) = 10.1, ps < .001], but responding in the multiple-extinction group did not differ from chance levels [t(24) = 1.94, p < .1]. The multiple-extinction group was producing more Stage 1 responses than the single-extinction group at the end of Stage 2 [t(47) = 2.7, p < .01].

Presentation of cue A in the novel context C (Fig. 3, left panel) resulted in an increase in responding, that is, response recovery, for the extinction groups and a reduction in responding for both no-extinction groups, replicating observations from Experiment 1 (the symbol for the no-extinction no-A → X group partially obscures the symbol for the no-extinction group in Fig. 3, left panel). A 2 (group: single or multiple) × 2 (block) ANOVA was carried out on cue A to reexamine our previous observation of more recovery in the single- than in the multiple-extinction group. The two levels of block used were the last block of Stage 2 and the test block. The ANOVA produced a significant block effect and a significant Block × Group interaction [Fs(1, 47) > 12.0, ps < .001], but the group effect was not significant (F < 1). Responses to cue A did not increase significantly between the end of Stage 2 and the test block for the multiple-extinction group, whereas the difference was significant for the single-extinction group [t(24) = 0.70 and t(23) = 4.45, p < .001, respectively].

The right panel of Fig. 3 shows responses to cue G during Stage 1 and in the summation test. As previously, we were interested in whether or not the extinction context would become inhibitory, as suggested by the protection-from-extinction hypothesis, and whether or not this effect would be greater in the single-extinction than in the multiple-extinction group. We included a no-extinction no-A → X group in this experiment to control for the possibility that previously observed differences between the no-extinction and extinction conditions could have been due to more context excitation for outcome X in the no-extinction condition, resulting from the continuation of A → X trials during Stage 2. We analyzed the Stage 2a summation data using a 2 (extinction: no extinction [comprising the no-extinction and no-extinction no-A → X groups] or extinction [comprising the single- and multiple-extinction groups]) × 2 (treatment: control [comprising the no- and multiple-extinction groups] or treatment [comprising the no-extinction no-A → X and single-extinction groups]) between-subjects ANOVA. This produced a significant effect of extinction, but no effect of treatment and no interaction [F(1, 95) = 10.6, p < .01; F(1, 95) = 1.04; and F(1, 95) = 1.26, respectively]. That is, the no-extinction conditions combined differed from the extinction conditions combined. The extinction context was inhibitory relative to both no-extinction conditions, but we found no difference between the single- and multiple-extinction conditions, nor between the no-extinction conditions.
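The factor coding used for the summation-test ANOVA can be made explicit with a short sketch. The group sizes and the assignment of groups to factor levels are taken from the text; the function name and data layout are illustrative assumptions, and the sketch only reproduces the degrees-of-freedom bookkeeping, not the analysis itself.

```python
# Group sizes as reported under "Design and procedure".
GROUP_SIZES = {
    "no-extinction": 25,
    "no-extinction no-A->X": 25,
    "single-extinction": 24,
    "multiple-extinction": 25,
}

# (extinction level, treatment level) for each group, as described in the text.
FACTOR_CODING = {
    "no-extinction": ("no extinction", "control"),
    "no-extinction no-A->X": ("no extinction", "treatment"),
    "single-extinction": ("extinction", "treatment"),
    "multiple-extinction": ("extinction", "control"),
}


def between_subjects_df(group_sizes, coding):
    """Degrees of freedom for a two-factor between-subjects ANOVA.

    Hypothetical helper for illustration: it counts factor levels and
    design cells from the coding table and derives the usual df terms.
    """
    n_total = sum(group_sizes.values())
    levels_a = {a for a, _ in coding.values()}
    levels_b = {b for _, b in coding.values()}
    n_cells = len(set(coding.values()))
    df_a = len(levels_a) - 1
    df_b = len(levels_b) - 1
    return {
        "extinction": df_a,
        "treatment": df_b,
        "interaction": df_a * df_b,
        "error": n_total - n_cells,
    }


if __name__ == "__main__":
    print(between_subjects_df(GROUP_SIZES, FACTOR_CODING))
```

With 99 participants in four cells, the error term has 95 degrees of freedom and each effect has 1, which agrees with the F(1, 95) values reported above.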

Discussion

Again, we observed the expected patterns of acquisition, extinction, and response recovery. Importantly, we also replicated our previously reported recovery differences between single- and multiple-extinction groups and our context inhibition effect. Since the inhibition effect was still present after controlling for differences in context excitation by including the additional no-extinction no-A → X control group, we have ruled out an alternative explanation for the group differences in this summation test that could have been applied to Experiment 1. The fact that responses to G were actually higher in the no-extinction no-A → X group than in the no-extinction group shows that the additional outcome-X trials in the no-extinction group during Stage 2 were not the source of our original effect. Unfortunately for the protection-from-extinction hypothesis, the marginal difference in inhibition that we observed between the single- and multiple-extinction groups in Experiment 1 was not reproduced, meaning that the protection-from-extinction hypothesis can only claim partial support from these experiments. Although the hypothesis uniquely predicts the development of inhibition during extinction, which was confirmed, it also predicts that the level of inhibition will be correlated with the size of the recovery effect. The protection-from-extinction hypothesis also predicts that extinction would be more rapid in the single- than in the multiple-extinction condition, specifically due to group differences in inhibition. In Experiment 1, the group difference in the rates of extinction was not statistically reliable, and the group difference in inhibition was marginal. In Experiment 2, this pattern was reversed: We observed marginally significant differences in the rates of extinction, but there was no sign of a difference in inhibition. Thus, the size of recovery effects, the rate of extinction, and the strength of context inhibition are not strongly coupled, as would be expected according to the protection-from-extinction hypothesis.
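The coupling that the protection-from-extinction hypothesis predicts among these three measures can be illustrated with a minimal simulation. This sketch assumes a generic shared-error-correction rule of the Rescorla–Wagner type, which is our illustrative choice and not a model fitted in these experiments. The learning rates, the number of trials, the use of three extinction contexts, and the trial-by-trial rotation of contexts are all hypothetical values chosen only to make the qualitative pattern visible.

```python
def extinguish(n_trials, n_contexts, v_cue=1.0, alpha_cue=0.2, alpha_ctx=0.1):
    """Simulate nonreinforced cue trials spread over n_contexts.

    Illustrative assumptions: the cue starts with full excitatory
    strength, every context starts neutral, and contexts rotate from
    trial to trial. Cue and context share one prediction error.
    """
    v_ctx = [0.0] * n_contexts
    for trial in range(n_trials):
        c = trial % n_contexts            # context present on this trial
        error = 0.0 - (v_cue + v_ctx[c])  # outcome absent, so the target is 0
        v_cue += alpha_cue * error        # the cue loses strength
        v_ctx[c] += alpha_ctx * error     # the context becomes inhibitory
    return v_cue, v_ctx


if __name__ == "__main__":
    v_single, ctx_single = extinguish(n_trials=12, n_contexts=1)
    v_multi, ctx_multi = extinguish(n_trials=12, n_contexts=3)
    # Responding in a novel, associatively neutral test context depends on
    # the cue alone, so the cue's remaining strength indexes recovery.
    print("single: cue =", round(v_single, 3), "context =", ctx_single)
    print("multiple: cue =", round(v_multi, 3), "contexts =", ctx_multi)
```

Under these assumptions, the single extinction context absorbs more inhibition than any one of the multiple contexts, the cue retains more strength (more recovery in a novel context), and combined responding in the extinction context falls faster. The three measures therefore move together in the simulation, which is the tight coupling that the data from Experiments 1 and 2 did not show.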

General discussion

Two major findings emerged from the present experiments: (1) Extinction in multiple contexts can reduce response recovery when compared to a single-context extinction procedure, and (2) contexts can become inhibitory during extinction. The reduction of response recovery after multiple-context extinction has important clinical implications that we will consider below. But first, we will review the results in the light of the protection-from-extinction (Rescorla, 2003; Soltysik et al., 1983) and occasion-setting (Bouton, 1994; Holland, 1992) explanations that were described in the introduction.

While protection from extinction is frequently cited as a potential mechanism for recovery effects, as yet, no fully established body of work has convincingly demonstrated the link between recovery with context change and protection from extinction. It is accepted that extinction can be reduced when it is carried out in the presence of an already-established inhibitor (e.g., Rescorla, 2003; Soltysik et al., 1983), but as we explained in the introduction, recovery designs typically begin the extinction phase in an associatively neutral context. Furthermore, until recently (Polack et al., 2012), no data had shown that a context could become inhibitory during extinction. Previous work using both discrete and contextual cues yielded a rather mixed picture (e.g., Bouton & King, 1983; Calton et al., 1996; Lovibond et al., 2000). Until recently, the strongest evidence for a protection-from-extinction mechanism operating in renewal designs came from studies that had shown reduced response recovery following extinction in multiple contexts, a result that follows directly from, but is not unique to, the protection-from-extinction hypothesis. However, none of the studies reporting reduced recovery following multiple-context extinction had directly addressed the issue of contextual inhibition (Chelonis et al., 1999; Glautier & Elgueta, 2009; Gunther et al., 1998; Pineño & Miller, 2004). Thus, the present experiments bring context-inhibition and multiple-context-extinction effects together for the first time. Our two main findings were predicted by the protection-from-extinction logic, but further predictions following the same logic did not gain such strong support.

We found some evidence of more rapid extinction and more inhibition in single- than in multiple-context extinction conditions, consistent with both of these further protection-from-extinction predictions, but we have some doubts as to whether or not these results confirm the protection-from-extinction model of extinction and response recovery. The main doubt concerns the failure to observe a clear difference in inhibition between the single- and multiple-extinction groups. In Experiment 1, we found evidence for more inhibition in the single- than in the multiple-extinction group (Fig. 2, right panel). This sits well alongside more recovery in the single- than in the multiple-extinction group. However, this inhibition effect failed to reach significance, and no sign of replication emerged in Experiment 2. Furthermore, according to the protection-from-extinction model, the development of context inhibition should also drive the rate of extinction, and therefore group differences in the rates of extinction should have been stronger in Experiment 1 than in Experiment 2. Yet, the opposite pattern was found. In Experiment 1, differences in rates of extinction were not reliable, whereas they were reliable in Experiment 2, albeit marginally so. Thus, we observed a degree of dissociation between differential recovery, levels of inhibition, and rates of extinction. Our findings show that extinction contexts can, and do, become inhibitory, yet the differential levels of recovery are not fully explained by differential levels of inhibition.

A second alternative, an occasion-setting view of extinction and response recovery, explains the recovery effect and the differences in recovery between the single- and multiple-extinction conditions without reference to context inhibition. During extinction, it is proposed that an inhibitory link between the cue and the outcome develops, alongside an existing excitatory link between the cue and the outcome. Context specificity of extinction comes about because this inhibitory link is only active in the extinction context (e.g., Bouton, 1994; Nelson, 2002). As a consequence, inhibition is only available in the extinction context, and once that context changes, response recovery is seen. This account can explain recovery differences between single- and multiple-extinction conditions because an increase in the number of extinction contexts could increase the sample of stimulus elements that are able to exert this occasion-setting control, effectively increasing generalization from the extinction context, a factor that has been shown to affect recovery in humans (Bandarian Balooch & Neumann, 2011). It could be argued that the finding of a direct inhibitory link between the extinction context and outcome X should be counted against occasion-setting, but this argument relies on a view of occasion-setting and inhibitory mechanisms being mutually exclusive. Such a view lacks support, as Bouton and Nelson (1994) have shown evidence suggesting that occasion-setting and inhibition can coexist as properties of a single stimulus. Moreover, counterconditioning, in which a direct association between an occasion-setter and the US is established, does not abolish the stimulus’s ability to occasion-set (see Swartzentruber, 1995, for a review). In any case, the unreliable link between recovery, level of inhibition, and rate of extinction clearly warrants further investigation. For example, if recovery were still to be observed after removal of inhibition from the extinction context (perhaps by including other trials with outcome X during Stage 2), this would strongly suggest that contextual control is not dependent on inhibition, as was observed recently (Nelson et al., 2011). Ideally, this manipulation would take place in a single experiment, so that recovery could be compared directly with and without context inhibition.

To conclude, we would stress the clinical implications of reduced recovery after extinction in multiple contexts. Some studies have failed to show reductions of response recovery after multiple-context extinction (Bouton, García-Gutiérrez, Zilski, & Moody, 2006 [in rats]; Lang & Craske, 2000 [in humans]; Neumann, Lipp, & Cory, 2007 [in humans]), but a growing list of studies, to which the present work can be added, have reported positive findings in both animals (Chelonis et al., 1999; Gunther et al., 1998) and humans (Glautier & Elgueta, 2009; Pineño & Miller, 2004). These studies have used ABA as well as ABC designs, and this adds to their clinical relevance, because the ABA design is more closely aligned to the home–clinic–home pattern followed in many treatment settings. It does, however, add a level of complexity with respect to the mechanism, because ABA recovery might involve a summation-based mechanism, as well as the protection-from-extinction and occasion-setting mechanisms discussed above. One pragmatic view is that the only thing that matters for clinical application is whether or not outcomes are likely to be improved after multiple-context extinction. However, there is reason to believe that some additional clarity on the underlying mechanism could have further clinical benefits, as different underlying mechanisms might be linked to different therapeutic strategies. For example, in vivo cue exposure might be used to minimize summation-based recovery; multiple-context extinction, to eliminate protection from extinction or occasion-setting; and reminder-based treatments, to enhance occasion-setting control of the extinction context (Brooks & Bouton, 1994; Dawe et al., 1993; Glautier & Elgueta, 2009; Gunther et al., 1998; Mystkowski, Craske, Echiverri, & Labus, 2006).