Introduction

Forgetting is often attributed to passive processes involved in the decay of memories, interference from the storage of similar memories, and/or uncontrollable fluctuations in temporal context (Atkinson & Shiffrin, 1968; Estes, 1955; Howard & Kahana, 2002; McGeoch, 1942; Murdock, 1982; Raaijmakers & Shiffrin, 1981). However, sometimes one has control over what is to be forgotten (Bjork, 1972; MacLeod, 1998). The phenomenon is referred to as intentional forgetting, and it is often explored in the laboratory using list-method directed forgetting (Bjork, LaBerge, & Legrand, 1968); subjects study a list of words (L 1 ), and then they are instructed to forget that list or they are instructed to remember that list. Subjects are then given another list of words to study (L 2 ).

The list-method procedure captures the essence of many scenarios in which buffering task performance from intrusions of memories for recent events is critical. For instance, the person who witnesses a crime on the way to work must put out of mind the memory for the crime event so as not to be distracted when performing work activities. Like our hypothetical person, who will be later interviewed about what occurred at the crime scene, the subject’s memory is tested for L 1 , and memory for L 1 is worse for participants in the “forget” condition compared to participants in the “remember” condition; this is referred to as the costs of directed forgetting (Bjork et al., 1968; Geiselman, Bjork, & Fishman, 1983; Lehman & Malmberg, 2009; MacLeod, Dodd, Sheard, Wilson, & Bibi, 2003; Roediger & Crowder, 1972; Sahakyan & Kelley, 2002; Sheard & MacLeod, 2005; Weiner, 1968). Additionally, memory for L 2 is greater in the forget condition than in the remember condition; this is referred to as the benefits of directed forgetting.

Some have compared intentional forgetting to the processes by which one updates memory after receiving a new telephone number, moving to a new address, or changing a last name (e.g., Bjork, 1989; Payne & Corrigan, 2007). These analogies are only very loosely related to what occurs in the lab. List-method directed forgetting tests memory for events, whereas these examples involve updating of general knowledge. In addition, our “real world” scenario suggests that, at times, it is important for memories, once intentionally forgotten, to be retrieved accurately later. We hypothesize that many intentional forgetting phenomena are closely linked to compartmentalization, whereby the goal is to temporarily put out of mind recent events or items that would interfere with performing the task that is at hand. If so, it is important to determine under which conditions the costs of intentional forgetting can be overcome.

Transitory costs and the cognitive mechanisms that produce them

There are conflicting results concerning the permanence of the forgetting induced by the list method, and the question of whether intentionally forgotten memories are lost permanently brings to the foreground strengths and weaknesses of different accounts of directed forgetting. Inhibition models assume that the instruction to forget decreases the activation of L 1 traces (Bjork, 1989; Geiselman et al., 1983). While inhibition models do not necessarily preclude changes in the level of inhibition over time, extant models do not provide a mechanism that describes how traces are selected for inhibition or how their activations are reduced, and therefore they also do not provide satisfactory explanations for how or under what conditions traces are selected for reactivation. Despite this ambivalence, there are a large number of findings showing that memories previously thought to be forgotten may be retrieved (see Capaldi & Neath, 1995, for a review; also Bahrick, 1983; Bahrick, Bahrick, & Wittlinger, 1975; Eich & Birnbaum, 1982; Keppel & Underwood, 1962).

The problem for inhibition theory is that nothing is assumed about the nature of memory that dictates whether items once “forgotten” may be retrieved in the future, and therefore it also does not generate predictions based on concrete assumptions about the conditions under which intentional forgetting may be overcome. This lack of specificity is highlighted by the interaction between the intention to forget and whether the memory task involves recall or recognition. In many early studies, for instance, testing memory via recognition produced null effects (Basden, Basden, & Gargano, 1993; Block, 1971; Elmes, Adams & Roediger, 1970; Geiselman et al., 1983). This finding was touted as confirmation of the inhibition hypothesis because the instruction to forget did not permanently eliminate or alter the to-be-forgotten memories (e.g., Elmes et al., 1970). On this account, traces that were forgotten during free recall are suddenly “released” from inhibition on demand when memory is tested by recognition (e.g., Bjork, 1989; Bjork & Bjork, 1996). It is important to note that the inhibition hypothesis did not in fact predict the “release from inhibition” associated with recognition testing. Had recognition testing been affected by the instruction to forget, the inhibition models could just as easily have accounted for that finding on the assumption that the to-be-forgotten items “were not released from inhibition”. The “release from inhibition” is only a circular description of the data, and the nontrivial problems of transitory effects for inhibition theory remain: How and under what testing conditions are once inhibited traces suddenly reconstituted and revitalized?

In contrast to the aforementioned models, the contextual differentiation models predict transitory effects of the intention to forget under specific conditions by emphasizing the availability of effective retrieval cues (Sahakyan & Kelley, 2002; also see Mensink & Raaijmakers, 1989, for a comprehensive treatment of other findings associated with spontaneous recovery). Many models of memory propose that an effective retrieval cue for a given trace is one that matches the stored information (Dennis & Humphreys, 2001; Lehman & Malmberg, 2009; Mensink & Raaijmakers, 1989; Raaijmakers & Shiffrin, 1981; Tulving & Pearlstone, 1967; Morris, Bransford & Franks, 1977). For a free recall task, in the absence of any other cues, one must rely on context as a retrieval cue (Howard & Kahana, 2002; Humphreys, Bain, & Pike, 1989; Malmberg & Shiffrin, 2005). Thus, retrieval is more effective when the context during retrieval matches the context during encoding (Godden & Baddeley, 1975). According to contextual differentiation models, subjects engage a process by which they “think about something else” when attempting to “forget”, and this produces an accelerated change in mental context that makes the to-be-forgotten L 1 context information less similar to the context cues available at test (Lehman & Malmberg, 2009; Sahakyan & Kelley, 2002). On the aforementioned assumption that context is used as part of an episodic retrieval cue, the accelerated change in mental context makes it more difficult to reinstate the L 1 context cues, producing the costs of directed forgetting.

On these assumptions, the effects of directed forgetting are predicted only under certain circumstances. Contextual differentiation models predict costs and benefits for recognition when a context cue plays an important role in achieving high levels of accuracy. This prediction was recently confirmed (Lehman & Malmberg, 2009; see also Sahakyan, Waldum, Benjamin, & Bickett, 2009). Other findings fit nicely within the framework of the contextual differentiation model, and are troublesome for inhibition models. There is a large body of literature on context-dependent memory; memory is harmed when it is tested in a context that is different from the one in which learning occurred (Godden & Baddeley, 1975). The deficit is usually attributed to difficulty reinstating the learning context as part of the retrieval cue used to probe memory. Indeed, facilitating the reinstatement of context cues reduces the costs associated with changes in context (e.g., Smith, 1979). Within the contextual differentiation framework, the costs of intentional forgetting should be observed only when context alone is used as a retrieval cue. Importantly, the costs of directed forgetting are greatly reduced when test instructions involve the reinstatement of L 1 context at test (Sahakyan & Kelley, 2002).

Thus, the context differentiation model makes specific predictions about when the costs of directed forgetting should be eliminated. While the findings of transitory effects of directed forgetting are not necessarily inconsistent with inhibition models, it is unclear exactly what predictions the inhibition models would make. For example, Bjork and Bjork (1996) explain the lack of costs of directed forgetting in recognition by release from inhibition. The inhibition account may explain these findings, but it does not make any predictions about how or under what conditions this release from inhibition will occur. Indeed, the inhibition models are a better description of the phenomena than an explanation for them.

The transitory nature of the transitory effects of intentional forgetting

Transitory costs and benefits of intentional forgetting are consistent with contextual differentiation models, but there are reports that some stimuli are resistant to intentional forgetting. For instance, no costs were observed when L 1 consisted of highly emotional pictures, suggesting that participants are unable to compartmentalize emotional events (Payne & Corrigan, 2007; but see Wessel & Merckelbach, 2006). On the other hand, the encoding conditions used in such experiments may allow extra-context cues to be induced. Perhaps, for instance, the absence of costs observed for valenced materials was due to the induction of effective category-level cues.

If a category-level cue is used to probe memory, free recall should be improved compared to when a context cue must be reinstated (Nickerson, 1980; Tulving & Pearlstone, 1967). Moreover, if the costs of intentional forgetting are due to a context change, then the costs should be reduced or eliminated when category cues are used to probe memory (cf. Raaijmakers & Shiffrin, 1980). The results on this front are mixed. Wilson, Kipp and Chapman (2003) reported that lists consisting of category exemplars are resistant to intentional forgetting, but Sahakyan (2004) reported that categorized lists are susceptible to effects of intentional forgetting. Again, however, it is possible that subjects in these experiments utilized different retrieval cues. For instance, Wilson et al. alerted the subjects prior to study about the structure of the lists, which increases the likelihood that subjects would use extra-context category cues as a means of effectively probing memory for specific lists. Sahakyan, on the other hand, did not alert subjects to the categorical structure of the study lists, and she further instructed subjects to use a temporal context cue in order to remember specific lists (e.g., recall the items from L 2 ). While these findings seem inconsistent, and therefore they might be dismissed as a concrete test of the relevant models, they actually fall right out of the contextual differentiation models of intentional forgetting.

A model of directed forgetting

To make concrete the aforementioned assumptions underlying contextual differentiation models and how the effects of directed forgetting interact with the cues used to probe memory, we utilize the framework of a Retrieving Effectively from Memory model (REM; Lehman & Malmberg, 2009; Malmberg & Shiffrin, 2005; Shiffrin & Steyvers, 1997). We assume that memory consists of two types of traces (Shiffrin & Steyvers, 1997). Lexical/semantic traces are vectors of geometrically distributed feature values, which vary in frequency, representing the items and all the contexts in which they have been encountered. Episodic traces also consist of vectors of item and context features, but episodic traces are associated with a single context. In addition, episodic traces represent new associations between list items. For each item on the list, a separate episodic trace is stored.

Encoding

The content of a trace is determined by the operations of a limited capacity buffer (Raaijmakers & Shiffrin, 1981; also Atkinson & Shiffrin, 1968; Malmberg & Shiffrin, 2005). The capacity of the buffer is not known, but we will assume for simplicity that it is two items (see also Atkinson & Shiffrin, 1968; Lehman & Malmberg, 2009). While study items are attended to, they reside in the buffer, and information is encoded about them in one or more episodic traces. Thus, upon the presentation of the first list item, it enters the buffer, and an episodic trace is stored. Assuming that no items repeat, each lexical/semantic feature associated with the first list item and each context feature is copied to an episodic trace with the probability,

$$ c\left[ {1 - {{\left( {1 - u_x^{*}} \right)}^t}} \right], $$

where \( u_x^{*} \) is the probability of storing a feature on each of t attempts to do so and c is the probability of copying that feature correctly; \( u_i^{*} \) is the probability of storing an item feature, and \( u_c^{*} \) is the probability of storing a context feature. If a feature is stored but copied incorrectly from a lexical/semantic trace or context, a feature value is drawn randomly from the geometric distribution defined by the g parameter. If a feature is not encoded, it takes the value 0.
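The storage rule can be illustrated with a minimal sketch. The function names and the convention that geometric feature values start at 1 are assumptions made for illustration only; they are not taken from the original simulations, and any parameter values passed in are arbitrary rather than those of Table 1.

```python
import random


def storage_probability(u_star, t, c):
    """Probability that a feature is stored and copied correctly: c * [1 - (1 - u*)^t]."""
    return c * (1.0 - (1.0 - u_star) ** t)


def geometric_feature(g, rng):
    """Draw a feature value v >= 1 with P(v) = g * (1 - g)^(v - 1) (assumed convention)."""
    v = 1
    while rng.random() >= g:
        v += 1
    return v


def encode_trace(source, u_star, t, c, g, rng):
    """Return a noisy, incomplete episodic copy of `source` (a list of positive ints)."""
    p_stored = 1.0 - (1.0 - u_star) ** t  # stored at least once in t attempts
    trace = []
    for value in source:
        if rng.random() >= p_stored:
            trace.append(0)  # feature not encoded
        elif rng.random() < c:
            trace.append(value)  # stored and copied correctly
        else:
            trace.append(geometric_feature(g, rng))  # stored, but copied incorrectly
    return trace
```

Under this sketch, the reduced encoding of context for later list items corresponds simply to calling `encode_trace` on the context vector with a smaller `u_star`.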

Upon the presentation of the second list item, it enters the rehearsal buffer, and a new episodic trace is stored. The trace consists of item information associated with the second list item and context information stored according to the equation above. We further assume that a result of the capacity limitation is that encoding is split between the storage of item, context, and associative information (Lehman & Malmberg, 2009). In this example, the two buffered items compete for encoding resources. Some of the resources are spent encoding the second list item, and we assume that the resources spent encoding it are similar to those spent encoding the first list item when it was initially presented. The remainder of the encoding resources is divided between the storage of associative information and context. This is accomplished in the model by reducing the \( u_x^{*} \) parameter for context features such that \( u_c^{*} < u_{{c1}}^{*} \), where the latter term is the probability of encoding a context feature for the first list item, and the former is the probability of encoding a context feature for all other list items. In addition, some of the buffer capacity is spent encoding associative information representing the fact that the first and second items were co-rehearsed. This is represented by appending to the trace representing the second list item a relatively weak encoding of the first item’s lexical/semantic features. Again, this is implemented by reducing the \( u_x^{*} \) value for associative information, \( u_a^{*} \). With the presentation of the third list item, the oldest item in the buffer is knocked out with probability δ, and the encoding cycle begins anew.

We assume that context features change between lists, but not within lists, with a probability of β. Thus, for each list a single context vector is generated to represent the current context, and all items within that list are associated with the same context information, which is stored according to the rules for item storage outlined above. Between lists, each context feature is either copied from the previous list or, with a probability of β, a new feature is generated from the geometric distribution. We further assume that context features change after the final study list, in the same manner as they change between lists (Lehman & Malmberg, 2009).
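The between-list context change can be sketched as follows. This is only an illustration of the rule described above; the function name is hypothetical, and the geometric sampler assumes feature values start at 1.

```python
import random


def drift_context(context, beta, g, rng):
    """Copy each context feature, or with probability beta redraw it from the geometric distribution."""
    new_context = []
    for value in context:
        if rng.random() < beta:
            v = 1  # redraw: P(v) = g * (1 - g)^(v - 1)
            while rng.random() >= g:
                v += 1
            new_context.append(v)
        else:
            new_context.append(value)  # feature carried over unchanged
    return new_context
```

Note that a redrawn feature can take its old value by chance, so the expected overlap between successive contexts is slightly greater than 1 − β. Applying `drift_context` once between lists and once more after the final list yields the test context.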

Retrieval

Retrieval is conceived of as a series of sampling and recovery operations in REM (Lehman & Malmberg, 2009; Malmberg & Shiffrin, 2005; Raaijmakers & Shiffrin, 1981; Shiffrin & Steyvers, 1997). Sampling is governed by a Luce choice rule, which assumes that the probability of sampling a given trace, j, is a positive function of the match of trace j to the retrieval cue and a negative function of the match of the other N − 1 traces to the retrieval cue,

$$ P\left( {j|Q} \right) = \frac{{{\lambda_j}}}{{\sum\limits_{{k = 1}}^N {{\lambda_k}} }}, $$

where \( \lambda_j \) is a likelihood ratio computed for each trace,

$$ {\lambda_j} = {\left( {1 - c} \right)^{{{n_{{jq}}}}}}{\prod\limits_{{i = 1}}^{\infty } {\left[ {\frac{{c + (1 - c)g{{(1 - g)}^{{i - 1}}}}}{{g{{(1 - g)}^{{i - 1}}}}}} \right]}^{{{n_{{ijm}}}}}} $$

and where \( n_{jq} \) is the number of mismatching features in the \( j \)th concatenated trace and \( n_{ijm} \) is the number of features with value \( i \) in the \( j \)th concatenated trace that match the features in the retrieval cue.
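The sampling rule can be made concrete with a short sketch. It assumes, as a convention not stated in the text above, that features with value 0 (not encoded, or absent from the cue) contribute no evidence; the function names are illustrative.

```python
def likelihood_ratio(cue, trace, c, g):
    """Likelihood ratio for one trace given a cue (both lists of non-negative ints)."""
    lam = 1.0
    for q, s in zip(cue, trace):
        if q == 0 or s == 0:
            continue  # assumed: missing features are ignored
        if q == s:
            base = g * (1.0 - g) ** (s - 1)  # chance probability of value s
            lam *= (c + (1.0 - c) * base) / base  # matching feature with value s
        else:
            lam *= 1.0 - c  # mismatching feature
    return lam


def sampling_probabilities(cue, traces, c, g):
    """Luce choice rule: each trace's likelihood ratio divided by the sum over all traces."""
    lams = [likelihood_ratio(cue, trace, c, g) for trace in traces]
    total = sum(lams)
    return [lam / total for lam in lams]
```

Because the chance probability of a feature value decreases with the value, matches on rare (high) values increase the likelihood ratio more than matches on common values.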

Once a trace is sampled, recovery of its contents is attempted. Since the contents are only a noisy incomplete representation of a study event, the contents of some traces are more likely to be recovered than others. The recovery probability is a positive function of the number of features in the sampled trace that match the retrieval cue, x,

$$ \frac{1}{{1 + {e^{{ - x + b}}}}}, $$

where b is a scaling parameter.
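As a brief illustration, the recovery function is a logistic curve in the number of matching features. The function name below is hypothetical.

```python
import math


def recovery_probability(x, b):
    """Probability of recovering a sampled trace with x matching features: 1 / (1 + e^(-x + b))."""
    return 1.0 / (1.0 + math.exp(-x + b))
```

The probability equals .5 when x = b and approaches 1 as the number of matching features grows, so b can be read as the number of matches at which recovery succeeds half of the time.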

Here we wish to account for the advantages imparted by studying categorized lists. The traditional assumption made by these models is that recovery is more likely to be successful for traces stored on categorized lists (Raaijmakers, 1979; also see Raaijmakers & Shiffrin, 1980, for a discussion of retrieval from categorized lists). In this case, the categorized list advantage falls right out of the model; it is due to the additional matches obtained from the use of readily available category features in the retrieval cue. Lehman and Malmberg (2009) assumed that participants initially probe memory with a cue consisting of current context features, and create a subset of items (ρ of the total number of items) that best match the current context from which to sample items. After creating the subset, participants then use a combination of the current context and some proportion, γ, of reinstated features from the relevant list. For example, if a participant was attempting to recall from L 1 , the cue would consist of current context features and context features from L 1 . After an item is output, the next cue used to probe memory consists of this same context cue along with recovered item information from the last item recalled. A co-rehearsed item will be most likely to be sampled, as it will share some of this item information. If no item is output, then the original context cue is used for the next probe of memory. The sample-and-recovery process repeats κ times.
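The construction of the context cue after the subset has been formed can be sketched as follows. The sketch assumes that each context feature is independently replaced by its value from the target list with probability γ; this is one reading of "some proportion, γ, of reinstated features", and the function name is illustrative.

```python
import random


def build_context_cue(current_context, list_context, gamma, rng):
    """Mix the current context with reinstated features from the target list's context."""
    cue = []
    for current, reinstated in zip(current_context, list_context):
        if rng.random() < gamma:
            cue.append(reinstated)  # feature reinstated from the target list
        else:
            cue.append(current)  # feature taken from the current (test) context
    return cue
```

With γ = 0 the cue is the current context alone, and with γ = 1 the study context of the target list is fully reinstated.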

Directed forgetting

The context differentiation model assumes that directed forgetting instructions lead to increased context change between lists and better encoding for L 2 in the forget condition (Lehman & Malmberg, 2009; Sahakyan & Delaney, 2003). As such, the directed forgetting instructions have effects on both encoding and retrieval operations in the model. The context differentiation occurs by an increased rate of context change between lists after the forget instruction, represented by an increased β parameter. Additionally, the encoding of context associated with the first item on a list is increased for the first item on L 2 , represented by an increased \( u_{{c1}}^{*} \), under the assumption that all other items have been dropped from the buffer. Finally, the forget instruction decreases γ, the probability of reinstating context features used in the cue to probe memory for L 1 .
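One consequence of these assumptions can be written out explicitly. As a rough sketch, let \( \beta_R \) and \( \beta_F \) (labels introduced here for illustration, not symbols of the model) denote the rate of context change between L 1 and L 2 in the remember and forget conditions, with \( \beta_F > \beta_R \), and assume that the change after the final list occurs at the rate \( \beta_R \) in both conditions. Ignoring the small chance that a regenerated feature takes its old value, the probability that an L 1 context feature is still part of the context at test is approximately

$$ {\left( {1 - {\beta_R}} \right)^2}\;{\hbox{(remember)}},\quad \left( {1 - {\beta_F}} \right)\left( {1 - {\beta_R}} \right)\;{\hbox{(forget)}}. $$

Because \( \beta_F > \beta_R \), fewer L 1 context features match the test context after an instruction to forget, and the lower value of γ further reduces the number of L 1 features that are reinstated in the cue.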

Categorized lists

The predictions of the model are consistent with directed forgetting data when study lists consist of randomly related items (Lehman & Malmberg, 2009). An assumption is implemented to take into account the nature of categorized lists. Prior models of retrieval from categorized lists have assumed that category-to-item associations are stored (Raaijmakers & Shiffrin, 1980). Here we assume that for items that belong to a categorized list, w additional category features are appended to the item vector. These features are shared by all members of a category, thus within a list where all items are members of the same category, these features will overlap for all items. These features are encoded in the same way as item features, and the likelihood of storing these features is represented by the \( u_{{cat}}^{*} \) parameter.

If a list is categorized, and a temporal cue is used to probe memory, we assume that the same initial test cue will be used, consisting of current context features. If a category cue is used to probe memory, however, a different initial cue is used, which consists of not only the same current context features, but also of the additional category features appended to the cue. Additionally, when an item is recalled, the next cue used to probe memory will consist of context features, item features, and category features that are retrieved from the last recalled item, giving a recall advantage to items from categorized lists. Thus, a recovery advantage will lead to higher recall rates for any categorized list over uncategorized lists. Additionally, due to the use of category features in the initial cue, lists probed with a category cue will incur additional recall advantages over lists probed with a temporal cue alone (the additional advantage will be driven primarily by more successful initial recall attempts; see Lehman & Malmberg, 2009).
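The difference between the two initial cues can be sketched in a few lines. The layout (context features followed by w category slots) and the use of zeros for the unfilled category slots of a temporal cue are assumptions made for illustration.

```python
def initial_cue(context, category=None, w=0):
    """Context features followed by w category slots (zeros when no category cue is given)."""
    if category is None:
        return list(context) + [0] * w  # temporal cue: category slots carry no information
    return list(context) + list(category)  # category cue: category features appended


def count_matches(cue, trace):
    """Number of non-zero cue features that match the trace at the same position."""
    return sum(1 for q, s in zip(cue, trace) if q != 0 and q == s)
```

For a trace that stored all of its context and category features, the category cue yields w more matches than the temporal cue, which is the source of the additional sampling and recovery advantage described above.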

Predictions

With these additional assumptions, the model makes various predictions about what should occur in a directed forgetting task when lists are categorized and different cues are used to probe memory. The model predictions are shown in the top row of Fig. 1. Panel A shows the model predictions for L 1 performance. There are costs of directed forgetting in a control condition, where randomly constructed lists are used. When L 1 is categorized, and a temporal cue is used to probe memory (the L 1 -temp condition), the model again predicts costs of directed forgetting. When L 1 is categorized and a category cue is used to probe memory (the L 1 -cat condition), an effective category cue is available at test, and the model predicts that the costs of directed forgetting should be disrupted (see Footnote 1). Panel B shows the model predictions for L 2 performance. In all conditions, L 2 is randomly constructed and a temporal cue is used, thus the model predicts the benefits of directed forgetting in each condition.

Fig. 1 Model predictions and data from Experiment 1. The top two graphs show the model predictions for List 1 (Panel A) and List 2 (Panel B) in each cue condition: a control condition (Control), and two conditions where L 1 is categorized and either a temporal cue is given (L1-temp) or a category cue is given (L1-cat) at test. The bottom row shows the data from the experiment. Panel C shows List 1 performance (costs) and Panel D shows List 2 performance (benefits). P(Recall) is probability of recall. Error bars represent standard error.

In comparing L 1 performance to L 2 performance in each of these conditions, recency of L 2 is predicted in the control condition (in that performance on L 2 , the most recent list, is greater than performance on L 1 , the less recent list; see Lehman & Malmberg, 2009). A recovery advantage for categorized lists leads to better recall for L 1 in all of the categorized list conditions when compared to L 1 in the control condition.

The predictions shown are based on the parameter values shown in Table 1. Note that the vast majority of parameter values are based on those used in prior applications of REM (Malmberg & Shiffrin, 2005; Shiffrin & Steyvers, 1997), and the predicted effects of directed forgetting and list content are robust and not dependent on them. Only three parameters change as the result of the instruction to forget, and these values are those used by Lehman and Malmberg (2009) to account for the results of several experiments. In this sense, they are not free parameters, and no parameters vary between the categorized and uncategorized list conditions.

Table 1 REM parameter values to generate Lehman-Malmberg model predictions

Experiment

We tested these predictions with an experiment in which the lists consisted of either unrelated words or categorical exemplars. Our assumption was that the structured list (consisting of categorical exemplars) would provide additional category cues with which to probe memory. In addition, we varied the instructions given to the subjects at test. In two conditions, subjects were provided a temporal cue at test (Recall as many words from L 1 as you can): the control condition, in which L 1 consisted of randomly related items, and the L 1 -temp condition, in which L 1 items were exemplars drawn from a common category (e.g., clothing). In the L 1 -cat condition, L 1 was categorized and subjects were provided a category cue at test: Recall as many items from the clothing list as you can. The prediction was that the category cue would reduce or eliminate the costs of directed forgetting.

Method

Participants, design, and materials

Participants were 520 undergraduate psychology students at the University of South Florida who participated in exchange for course credit. For each participant, three 16-word lists were created. Lehman and Malmberg (2009) used an additional study list consisting of unrelated words before L 1 (referred to as L 0 ) in order to ensure that both L 1 and L 2 experienced some degree of proactive interference, in an attempt to equate L 1 and L 2 as much as possible. The costs and benefits of directed forgetting were not affected by the inclusion of the additional list. As the first list was used as a control to equate conditions, it will not be discussed after this point. For this reason, the second list will henceforth be referred to as L 1 and the third list as L 2 . One hundred sixty subjects were assigned to the control condition, where all lists consisted of randomly related concrete nouns (within each condition, test list and remember/forget instruction were manipulated between subjects). The remaining subjects were randomly divided into two experimental conditions where L 1 was categorized, consisting of concrete nouns from the category “clothing”, and L 2 was uncategorized. Half of these participants were assigned to the L 1 -temp condition and half to the L 1 -cat condition. The categorized and uncategorized lists were equated for frequency (between 20 and 50 occurrences per million; Francis & Kucera, 1982) and word length. Latent semantic analysis was conducted to ensure that words on the categorized list were highly related and words on the uncategorized lists were not (Landauer & Dumais, 1997).

The experiment used a 3 (cue) × 2 (instruction) × 2 (list) between-subjects design. The three cue conditions were Control (uncategorized lists, temporal cue), L 1 -temp (categorized L 1 , temporal cue), and L 1 -cat (categorized L 1 , category cue). Within each of these conditions, subjects saw either the remember or the forget instruction, and were tested on either L 1 or L 2 .
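The cell sizes implied by this design can be worked out with a few lines of arithmetic. The sketch assumes that subjects were divided equally among the four Instruction × List cells within each cue condition, which is not stated explicitly but agrees with the degrees of freedom reported in the Results.

```python
n_total = 520
n_control = 160
n_per_categorized = (n_total - n_control) // 2  # L1-temp and L1-cat conditions: 180 each

cells = 3 * 2 * 2  # cue x instruction x list
df_error_overall = n_total - cells  # error df for the three-way ANOVA: 508

# Assumed equal split over the 2 (instruction) x 2 (list) cells within a cue condition
cell_control = n_control // 4  # 40 subjects per cell
cell_categorized = n_per_categorized // 4  # 45 subjects per cell

# Remember vs. forget comparison on a single list (two cells)
df_t_control = 2 * cell_control - 2  # 78
df_t_categorized = 2 * cell_categorized - 2  # 88

# List x Instruction interaction within one cue condition (four cells)
df_f_control = n_control - 4  # 156
df_f_categorized = n_per_categorized - 4  # 176
```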

Procedure

At the beginning of the experiment, participants were told that the experimenters wanted to see how well people could not only remember information but also remember where that information came from. Participants were informed that they would be presented three lists of words, and that they would be tested on only one of the lists, but they would not be told which list until later in the experiment, so they needed to remember all of the lists:

At the beginning of this experiment, you will study three lists of words. The words will appear on the screen one at a time for a few seconds each.

Your task is to remember these words for a later memory test. Importantly, I will only ask you to remember the words from one of the lists, which will be chosen randomly, but you will not be told which list until later in the experiment.

In between each list, there will be a short math task. This involves adding digits in your head and entering the total into the computer. Once you have done so, the next list of words will be presented.

Participants were not informed about the categorized lists prior to study. The lists were shown one word at a time, with each word appearing for 8 s (see Footnote 2). Participants in the remember condition were shown each list followed by a 30 s math distractor task, where they completed two-digit addition problems. Participants were instructed to complete as many math problems as they could in this amount of time. In the forget condition, the procedure was the same, except that participants were given the following instruction to forget just prior to the presentation of L 2 : "Next you are going to receive the third study list. This is the list that you will be asked to recall, so you do not need to worry about the first two lists."

At test, participants were given a free recall test lasting 90 s. Half of participants in each condition were tested on L 1 and half were tested on L 2 . Participants who were given the forget instruction and who were tested on L 1 (the “forget” list) were told that we wanted them to recall from this list even though we had previously told them that they would not need to remember it. Participants in the L 1 -cat condition were instructed to use the appropriate category cue (they were given the word “clothing”) to help them to recall from this list. In all other conditions, participants were given a temporal cue; they were asked to recall the items from the specified list (for example, “Please enter all of the words you remember from the second list.”).

Results and discussion

An alpha level of .05 was adopted as the standard for significance for all analyses. The results are shown in Fig. 1. When considered separately, there were significant main effects for both the forget instruction and the category condition (all F > 2). Analyses comparing L 1 performance to L 2 performance revealed a significant recency effect in the control condition insofar as L 2 was better recalled than L 1 , t(158) = 7.24, SE = .031. In the L 1 -cat condition, the available category cue produced better recall for L 1 than for the uncategorized L 2 , t(178) = 6.010, SE = .03. Thus, as the model predicted, the category cue made less recent events easier to recall than more recent events not associated with an effective cue.

In addition, an ANOVA indicated a significant three-way List x Instruction x Cue interaction, F(2,508) = 3.85, MSE = .037. Further analyses also revealed significant List x Instruction interactions in the Control condition, F(1,156) = 28.139, MSE = .03, and in the L 1 -temp condition, F(1,176) = 9.41, MSE = .046, but not in the L 1 -cat condition, F = 1.131, MSE = .03, p = .29. To understand these interactions better we separately analyzed the costs and benefits in the three Cue conditions. Thus, for each Cue condition, the effect of Instruction was examined separately for L 1 and L 2 . The different patterns of costs and benefits that were observed are presented in the bottom two panels of Fig. 1. Overall, they support the contextual differentiation hypothesis.

Panel C shows the results for L 1 performance. For the control condition, recall of L 1 was better for the remember condition than for the forget condition, t(78) = 2.78, SE = .025, replicating the costs observed many times before. According to the contextual differentiation hypothesis, the accelerated change in mental context after the instruction to forget made it more difficult to reinstate an effective context cue for L 1 . This assumption is supported by the results of the L 1 -temp condition, in which L 1 consisted of category exemplars, and subjects were given only temporal cues at test; the costs were again observed, t(88) = 1.99, SE = .051. This result establishes that the costs of directed forgetting occur even when L 1 is highly structured. These costs, however, were not observed in the L 1 -cat condition, where L 1 was also categorized but subjects were given a category cue at test, t(88) = .68, SE = .037, p = .50.

Panel D shows the results for L 2 performance. For the control condition, recall was better for L 2 in the forget condition than in the remember condition, t(78) = 4.54, SE = .049, replicating the benefits observed many times before. The benefits were also replicated in the L 1 -temp condition, t(88) = 2.43, SE = .039, and in the L 1 -cat condition, t(88) = 2.002, SE = .042. These findings are consistent with the model’s predictions in that a highly structured L 1 did not eliminate the benefits on L 2 . The model, however, underestimates the benefits of the instruction to forget in the control condition, and fails to predict diminishing benefits when a categorized L 1 is studied. The benefits are unaffected by the cuing condition, and the categorical status of L 1 has no effect on recall of L 2 in the forget condition. At this point, we cannot offer an explanation for this complex interaction in which much confidence should be placed. The benefits under conditions similar to the control condition are often larger than the costs for free recall, and the model can predict this, as demonstrated by Lehman and Malmberg (2009). However, the model fits in this simulation are constrained by all the data (as the parameter values are the same as those used by Lehman & Malmberg, 2009), and in order to predict the interaction, an additional factor must be specified. One might speculate that between-subject variance caused differential levels of benefits. Although possible, this explanation is somewhat unsatisfactory pending attempts to replicate the finding. On the other hand, it by no means diminishes the fact that the costs of directed forgetting are not permanent and depend on the availability and use of appropriate retrieval cues.

Intrusion analyses were conducted for items that were output during recall but not present on any list in the experiment. All intrusion rates are shown in Table 2. In the control condition, there were no significant main effects of List or Instruction, and no significant interaction effect in extra-list intrusions (all p > .10). In the L 1 -temp condition, there was no significant effect of Instruction and no significant interaction (all p > .10); however, there was a significant main effect of List, F = 8.79, MSE = 4.13, indicating that there were more extra-list intrusions on L 1 (the categorized list) than on L 2 . The same pattern was repeated for the L 1 -cat condition, i.e., no significant effect of Instruction and no significant interaction (all p > .10), and a significant main effect of List, F = 13.24, MSE = 2.89, indicating that there were more extra-list intrusions on L 1 (the categorized list) than on L 2 . One concern is that subjects in the L 1 -cat condition simply dumped all members of the category when they were provided with the category cue, accounting for the improved performance in the forget condition. However, the intrusion rates in the forget condition were not higher than intrusion rates in the remember condition, nor were the intrusion rates for the forget condition higher in the L 1 -cat condition than in the L 1 -temp condition, suggesting that this is not the case.

Table 2 Number of extra-list intrusions for Control, L1-temp, and L1-cat conditions

Here, we have shown that categorized lists may produce the costs associated with intentional forgetting, but only when memory is cued with temporal context. When category cues are used to probe memory, the costs of intentional forgetting are eliminated. The model correctly predicted the observed pattern, and thus is a viable explanation for how intentional forgetting is accomplished and the conditions under which it will and will not occur. While these findings are qualitatively consistent with the predictions of the model, the quantitative fit could be improved for the benefits in the control condition. Additional assumptions, such as a slower context change that occurs within lists, or more complex buffer operations (as discussed by Lehman & Malmberg, 2009), may lead to better fits; however, at this time our main goal is to show that the model makes accurate predictions about categorized lists in directed forgetting, and this was accomplished by a few simple changes to the model.

General discussion

Perhaps the key issue for memory research is to explain forgetting. In this article, we have examined intentional forgetting. Some models of intentional forgetting assume that forgetting is the result of the inhibition of previously stored traces, whereas other models assume that forgetting occurs as the result of interference among traces that are competing for retrieval. For instance, when one attempts to intentionally forget an event, inhibition models assume that the trace corresponding to that event is inhibited from retrieval. A shortcoming of the inhibition approach is that it does not specify how inhibition takes place. For instance, how is the relevant to-be-inhibited trace identified? Where does the inhibition come from? And the key question for present purposes is, how do traces once inhibited return to their original state?

Indeed, a large number of findings indicate that some memories, once thought to be forgotten, can be retrieved when proper cues are provided. Thus, questions concerning the permanence of intentional forgetting should not be framed in terms of whether previously forgotten traces can be retrieved in the future because the answer to that question is almost certainly yes. Rather, the relevant questions concern how best to explain these phenomena, under what conditions we should expect to observe them, and what the fate is of memories that were once intentionally forgotten.

Such findings are naturally explained by interference models that assume that retrieval success is a positive function of the similarity between the contents of the retrieval cue and the contents of a memory trace. Here we addressed the issue of whether transitory costs of directed forgetting are due to the nature of the retrieval cues used to probe memory. We began by generating predictions from a recently developed model of forgetting (Lehman & Malmberg, 2009). Like the aforementioned models of interference, the present model assumes that intentional forgetting is dependent on the nature of the retrieval cue. Temporal context is notoriously difficult to reinstate (Dennis & Humphreys, 2001; Winograd, 1968). Nevertheless, tasks like free recall require the use of temporal context to probe memory, especially when memory is tested using multiple-list designs, as is the case for the list method for examining intentional forgetting. When temporal context is difficult to reinstate but nevertheless used to probe memory, the costs of intentional forgetting should be observed according to this model because the instruction to forget causes a decrease in the similarity between the temporal context stored during the study of L 1 and the context available at test. In contrast, the costs should be eliminated when a list-specific retrieval cue is used. In our experiment, such cues were categorical in nature, and when these cues were specified at test the costs of intentional forgetting were eliminated.
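The logic of this prediction can be illustrated with a toy simulation. It is a minimal sketch and not the model of Lehman and Malmberg (2009): the feature-vector representation, the change rates of .2 and .6, and the function names `drift`, `overlap`, and `match_to_l1` are all assumptions introduced here for illustration.

```python
import random


def drift(context, p_change, rng, n_values=10):
    """Copy a context vector, resampling each feature with probability p_change."""
    return [rng.randrange(n_values) if rng.random() < p_change else f for f in context]


def overlap(a, b):
    """Proportion of feature positions on which two vectors agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def match_to_l1(cue_type, p_change, rng, n_features=1000):
    """Similarity between the retrieval cue used at test and what was stored for L1."""
    stored_context = [rng.randrange(10) for _ in range(n_features)]
    stored_category = [rng.randrange(10) for _ in range(n_features)]
    if cue_type == "temporal":
        # The subject must reinstate L1 context from a context that has drifted.
        probe = drift(stored_context, p_change, rng)
        return overlap(stored_context, probe)
    # Assumption: a category cue supplied by the experimenter is unaffected by drift.
    probe = list(stored_category)
    return overlap(stored_category, probe)


rng = random.Random(0)
# Hypothetical change rates: "forget" is assumed to accelerate context change.
temporal_remember = match_to_l1("temporal", 0.2, rng)
temporal_forget = match_to_l1("temporal", 0.6, rng)
category_remember = match_to_l1("category", 0.2, rng)
category_forget = match_to_l1("category", 0.6, rng)
```

Under these assumptions the temporal cue matches L 1 less well after the forget instruction, which corresponds to the costs, whereas the category cue matches equally well after either instruction, which corresponds to their elimination.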

Thus, the results that we report are consistent with these a priori predictions of the model. Directed forgetting was achieved in this study when participants were given a forget instruction and a temporal retrieval cue with which to probe memory. This was true regardless of whether the to-be-forgotten list was categorized or not. When the lists were categorized and category cues were provided, however, the costs of the forget instruction were eliminated. These findings support the contextual differentiation models of intentional forgetting, and extant models of inhibition fail to predict our findings. However, the failure of inhibition models has much to do with their lack of precise development. In fact, contextual differentiation and inhibition are not necessarily mutually exclusive. Indeed, it may be more useful to consider the former as a mechanism and the latter as an effect (much like global-matching is a mechanism supporting recognition and mirror patterns are effects). Because the inhibition models are not well specified, it may be the case that contextual differentiation is a mechanism that produces many or even all of the phenomena that exhibit the characteristics of inhibition.

Anderson (2005) discusses the possibility that inhibition occurs at the context level: an entire list context is inhibited after the forget instruction. However, this proposal predicts that all traces associated with the to-be-forgotten list should be inhibited directly after the instruction to forget is received. One problem for this proposal is that when serial position curves are plotted, the primacy portion of the to-be-forgotten list is attenuated whereas the recency portion is not (Lehman & Malmberg, 2009; Sheard & MacLeod, 2005). The model presented in this study assumes that under typical delayed free-recall conditions, where random items comprise the study lists, a temporal cue is used on the first attempt to sample a trace from memory. Because context is stored most strongly for the first item on the list, it tends to be sampled first. This sampling advantage is attenuated after the instruction to forget due to use of a relatively ineffective context cue. Thus, the costs of directed forgetting are large for the item in the first serial position. Subsequent retrieval attempts are made by using the information recovered on the previous trial, which includes both item and context features. Use of the item features to access subsequent traces naturally attenuates the effect of the instruction to forget. Again, the key assumption of our model is that the effect of the instruction to forget is accounted for by an interaction between retrieval cues and the contents of memory at test, whereas in the inhibition model the effect of the instruction to forget occurs directly after the instruction is received and presumably affects the entire contents of these traces.
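The attenuation of the first-item sampling advantage can be illustrated with a simple ratio rule. This sketch is a loose illustration rather than the full model: the Luce-style sampling rule, the assumption that the first item's context is stored twice as strongly, the cue-match values, and the function name `sampling_probabilities` are all hypothetical.

```python
def sampling_probabilities(strengths_l1, strengths_l2, match_l1, match_l2):
    """Ratio-rule sampling: each trace is sampled in proportion to its activation.

    Activation is taken to be stored context strength times the cue-to-list match.
    Returns probabilities for the L1 traces followed by the L2 traces.
    """
    activations = [s * match_l1 for s in strengths_l1]
    activations += [s * match_l2 for s in strengths_l2]
    total = sum(activations)
    return [a / total for a in activations]


# Hypothetical strengths: context is stored most strongly for the first item.
l1_strengths = [2.0, 1.0, 1.0, 1.0]
l2_strengths = [2.0, 1.0, 1.0, 1.0]

# Assumed match of an L1-directed context cue to each list. The match to L1
# is lower after "forget" because the cue is relatively ineffective.
p_remember = sampling_probabilities(l1_strengths, l2_strengths, match_l1=0.8, match_l2=0.3)
p_forget = sampling_probabilities(l1_strengths, l2_strengths, match_l1=0.4, match_l2=0.3)

# Advantage of the first L1 item over the second L1 item on the first sample.
first_item_advantage_remember = p_remember[0] - p_remember[1]
first_item_advantage_forget = p_forget[0] - p_forget[1]
```

With these placeholder values, the first L 1 item is sampled less often on the initial attempt after the forget instruction, and its advantage over the other L 1 items shrinks. The sketch covers only the first sampling attempt; later attempts, which the text describes as also using recovered item features, are not modeled here.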

It may be that directed forgetting exhibits the characteristics of inhibition, while context change describes the mechanism behind the task. On the other hand, our findings can serve as a test of the context differentiation and inhibition accounts of directed forgetting. Based on the model presented here, the context differentiation account makes exact predictions about when the costs and benefits of directed forgetting will be present. The inhibition account, however, fails to make any predictions about the conditions under which the costs and benefits will occur. Further, when these findings are viewed in light of findings from another paradigm also believed to be linked to inhibition, retrieval-induced forgetting (RIF), the inhibition account fails.

According to inhibition theorists, in a directed forgetting task, a release from inhibition occurs when an item is re-presented, as in a recognition test (Bjork & Bjork, 1996). Additionally, in studies of RIF, a cue (such as a category name) is paired with multiple targets (exemplars of that category), and practicing retrieval of one target in response to the cue impairs memory for the unpracticed target. Inhibition has been proposed as a mechanism for RIF: practice of one target causes inhibition of the other (competing) target (Anderson, Bjork & Bjork, 1994). As evidence that RIF is not simply due to the interference of practiced items on unpracticed items, inhibition theorists point to findings that independent cues do not eliminate RIF; on this view, RIF must be due to inhibition of the unpracticed targets and not interference from the practiced targets (Anderson & Spellman, 1995; but see Camp, Pecher, Schmidt & Zeelenberg, 2009). Thus, if inhibition is released by re-presentation of the studied items themselves, but not by independent cues, then inhibition theory would predict that in the current experiments we would not see a release from inhibition, because category cues were used and not the items themselves. This is inconsistent with our results: the costs were eliminated when the independent category cues were used. Findings from our study add to a growing area of research suggesting that phenomena thought by some to be due to inhibitory processes may not be so (Camp et al., 2009; Sahakyan & Goodmon, 2010; Tomlinson, Huber, Rieth & Davelaar, 2009; see also Bulevich, Roediger, Balota & Butler, 2006).

Our research provides another demonstration that forgetting is not necessarily the result of a permanent change to the contents of memory. Humphreys, Tehan, O’Shea and Bolland (2000) demonstrated that classical interference, commonly believed to cause the destruction of memory traces (e.g., Loftus & Palmer, 1974; Wixted, 2005), can occur without unlearning of the memory traces (see also McCloskey & Zaragoza, 1985). Humphreys et al. showed that memory impairment in interference paradigms is due to passive interference arising from a decrease in the accessibility of effective retrieval cues. Similarly, our results suggest that the detriment to L 1 memory that occurs with directed forgetting is also due to a decrease in the accessibility of effective retrieval cues, but in this case the decrease is due to an active, intentional process.

While our studies were conducted within the frameworks of existing models of intentional forgetting, it may be useful to draw a conceptual distinction between intentional forgetting and compartmentalization. According to the former conceptualization, the goal is to rid the mind of useless memory traces. According to the compartmentalization conceptualization, the goals are more temporary and related to improving performance of the task at hand. Our results support the view that intentional forgetting should lead to only temporary costs. “To-be-forgotten” material is temporarily rendered inaccessible, and compartmentalized material can be retrieved when an effective cue is used to probe memory.

The distinction between compartmentalization and intentional forgetting may seem nuanced. However, the distinction has implications in more specific areas of memory study, such as research on eyewitness memory. Eyewitness testimony is widely accepted as evidence and is often the only evidence provided in criminal cases; however, both laboratory and field studies have shown that eyewitness memory can be extremely inaccurate (Goldstein, Chance & Schneller, 1989). If an individual witnesses a crime and then compartmentalizes the information in memory in order to perform other tasks, it may be hard to retrieve that information later, as it is for participants who are given a forget instruction. As we have shown here, however, it may be possible to retrieve the compartmentalized information if an effective retrieval cue can be created, as is attempted in the Cognitive Interview technique for eyewitness testimony (Geiselman, Fisher, MacKinnon & Holland, 1985; see Golding & Long, 1998, for a review of intentional forgetting in legal settings).

Up to this point, we have discussed the ability to overcome the effects of compartmentalization. Compartmentalization may, however, serve as a coping strategy, and as with any cognitive function, there may be individual differences in the ability to compartmentalize. The study of these individual differences may be a new avenue in which to extend this research. For example, what are the consequences for those who are unable to compartmentalize?

One relevant area of research is the study of cognitive functioning in depression. Depressed individuals have a tendency to ruminate, or to focus thoughts on their depressive symptoms, and the causes and consequences of those symptoms (Nolen-Hoeksema, Morrow & Fredrickson, 1993). Rumination is especially problematic because depressed individuals who ruminate tend to stay depressed for longer periods of time (Nolen-Hoeksema et al., 1993). It may be that depressed individuals are unable to compartmentalize thoughts about their depressive symptoms, and, if so, further research may be directed at determining whether this is a cause or a symptom of depression. An inability to compartmentalize information that is not relevant at the current point in time may also be a characteristic of cognitive functioning in other populations (such as individuals with ADHD or OCD, or elderly individuals).

Conclusions

We investigated the persistence of the intent to forget or compartmentalize material on free recall. We hypothesized that the costs and benefits of intentional forgetting are associated with an accelerated change of mental context and the subsequent difficulties associated with reinstating it at the time of test for use as a retrieval cue. Thus, we predicted that when a cue is provided at test, the costs of intentional forgetting should be minimal. The results of our experiments confirm this prediction, and they challenge alternative models of intentional forgetting to provide a mechanism that predicts the transitory effects of intentional forgetting.