One of the oldest and most well-established laws of human memory is that memory improves with repetition (Ebbinghaus, 1913; see Crowder, 1976, for a review). The benefits of repetition are evident in nearly all memory paradigms, with the degree of benefit often characterized by a negatively accelerated function of the number of repetitions. In implicit memory, repetition enhances accessibility, as measured by priming in speeded response tasks (see, e.g., Grant & Logan, 1993; Lewis & Ellis, 2000), threshold identification (e.g., Salasoo, Shiffrin, & Feustel, 1985), and word fragment completion, though benefits of added repetition often diminish after only a few repetitions (Chen & Squire, 1990). In explicit memory, prolonged maintenance rehearsal enhances stem cued recall (Greene, 1986) and recognition memory (Glenberg, Smith, & Green, 1977; Woodward, Bjork, & Jongeward, 1973), but is generally of less benefit to free recall (see Greene, 1987, for a review) and of little or no benefit to other forms of explicit memory (e.g., Hintzman, Curran, & Oppy, 1992; Nairne, 1983). It is rarely considered, however, that prolonging rehearsal time may actually reverse the benefits to long-term memory that are typically associated with brief rehearsal. In this article, we highlight one case in which more repetition is not better. The case of interest is that of massed, rote rehearsal and its relationship with long-term semantic accessibility.

This investigation was motivated by the old but ill-understood phenomenon of semantic satiation: the subjective experience that a word briefly loses its meaning after prolonged exposure to it, akin to sensory adaptation. Although experimental investigation of semantic satiation has a long history (Severance & Washburn, 1907), much of the early work on this phenomenon suffered from methodological shortcomings (see Esposito & Pelton, 1971, for a thorough critique). However, evidence for semantic satiation eventually came from a paradigm developed by Smith (1984), in which subjects repeated semantic category names (e.g., “fruit”) aloud for either 3 or 30 consecutive repetitions. Immediately afterward, a target appeared that was either a member (e.g., “apple”) or a nonmember (e.g., “robin”) of the category, and subjects made a category membership decision as quickly as possible. Critically, member decisions were actually slower following 30 repetitions of the category name than after only 3 (see also Balota & Black, 1997, and Smith & Klein, 1990). Using similar methods, satiation effects have been observed with faces as the repeated stimuli (Lewis & Ellis, 2000), suggesting that this phenomenon may be a general consequence of massive repetition, at least for measures of accessibility taken immediately after repetition.

The durability of satiation effects remains unclear, however. Because most research has assumed that semantic satiation effects are very short-lived, akin to sensory adaptation, no study has examined whether the effects of massed repetition might actually affect long-term memory for the repeated words. If such a persisting effect could be shown, it would constitute a striking exception to the general principle that repetition improves long-term memory for repeated materials: Repetitions would actually lead to a downturn in performance rather than a simple reduction in further benefit. In the present studies, we addressed this possibility using a novel paradigm that consisted of a repetition phase followed by a semantic generation (test) phase that, like studies of semantic satiation, probed semantic as opposed to episodic memory. During the repetition phase, different words (e.g., “sheep”) were repeated aloud for varying durations: 0 (nonexposed baseline), 5, 10, 20, or 40 s. Upon completion of the repetition phase, the test phase presented a cued semantic generation task that measured how likely subjects were to free associate previously repeated words (hereinafter repeated words; e.g., “herd s____” for “sheep”), as well as critical, never-presented semantic associates of the repeated words (hereinafter, associate words; e.g., “fabric w___” for “wool”). Of interest was how often participants would generate repeated and associate words as a function of repetition duration of the repeated words.

Clearly, repeating a word for a short duration (e.g., 5–10 s) ought to increase the likelihood of generating it in a later free association test (as compared to words not exposed in the experiment). This priming effect might also extend to a word’s semantic associates, to the degree that the repeated word’s underlying meaning is activated. Of particular interest to our hypothesis, however, is whether repetition would continue to yield further benefits (though perhaps with diminishing returns) when extended further in time. Is more necessarily better? Or would added repetition, paradoxically, begin to reverse initial priming effects? Inspired by the curious phenomenon of semantic satiation, we predicted that prolonged rehearsal of a word (20 or 40 s) would not only fail to increment priming any further, but would actually elicit a reversal or even elimination of priming, yielding a nonmonotonic relation between repetition and delayed semantic generation performance. Importantly, by separating the repetition and generation phases of our procedure by several minutes, we were able to assess whether these effects of repetition are evident in long-term memory. Finally, by separately measuring delayed semantic generation performance for associate (semantically related) words, we were able to determine whether priming—and a potential reversal of priming—occurred at the level of semantic representations. That is, whereas delayed generation performance for repeated words might be influenced by effects of repetition on phonological or lexical representations, such influences ought to be less relevant to associate words, which are never repeated or seen in the repetition phase.

Experiment 1

Subjects

A group of 40 undergraduate students participated in exchange for credit toward a psychology course requirement.

Design

Two factors were manipulated within subjects: Repetition Duration and Probe Type. Repetition duration had five levels: 0 (nonrepeated baseline words), 5, 10, 20, or 40 s. The uneven spacing of these repetition intervals reflected the assumption that initial repetitions would yield greater changes in memory than would later repetitions. Thus, for all trend analyses, repetition duration was treated as an ordinal variable. Probe type consisted of two levels: probes testing repeated words and probes testing associates of repeated words. The two counterbalancing factors (five assignments of repeated words to repetition durations; two assignments of repeated words to test probe type) were treated as between-subjects factors.

Procedure

The experiment consisted of a repetition phase and a test phase. During the repetition phase, subjects were presented with words, one at a time, on a computer screen and were asked to repeat each word aloud, at a moderate pace, until the word disappeared. Words were separated by a 2-s intertrial interval. Subjects first practiced this procedure with a few filler words. During this practice, the experimenter encouraged subjects to repeat words at a slightly faster pace if the rate of repetition fell below one repetition per second. The experimenter recorded (by hand) the number of repetitions of each word.

The test phase immediately followed and was described as a separate experiment measuring word associations. Subjects viewed words, one at a time, paired with a single letter. Subjects were given up to 4 s to free associate to each word–letter stem combination. It was emphasized that there were no correct answers, and that subjects should simply say the first word that came to mind that was related to the word and that began with the letter provided. The experimenter recorded (by hand) whether the subject generated the intended repeated or associate word on each trial.

Materials

Repeated words and their associates

A total of 40 repeated–associate pairs were constructed (note: only the repeated words appeared in the repetition phase). Word pairs were selected using word association norms (Nelson, McEvoy, & Schreiber, 1998), such that the associate word was the closest associate of the repeated word and the associate did not have a backward association to the repeated word. The elements of each repeated–associate pair were unrelated to the elements of any other repeated–associate pair. The repeated and associate words ranged in length from 3 to 9 letters (Ms = 5.6 and 4.8 for the repeated and associate words, respectively). These pairs were divided into five sets of eight words, corresponding to the five repetition conditions, and were counterbalanced across subjects. Within the repetition phase, the serial positions of words from each repetition condition were equated using blocked randomization. Several filler words were presented at the beginning and end of the repetition phase.
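The counterbalancing of word sets across repetition conditions can be illustrated with a short sketch. It assumes a simple Latin-square rotation, which is one common way to obtain five assignments of five word sets to five durations; the text does not state that this exact scheme was used, and the names `DURATIONS_S`, `rotate_assignment`, and `all_versions` are hypothetical.

```python
# Repetition durations in seconds, as listed in the Design section.
DURATIONS_S = [0, 5, 10, 20, 40]


def rotate_assignment(version):
    """Map each of five word sets (0-4) to a duration for one counterbalancing version.

    Assumption: a Latin-square rotation; the actual assignment scheme is not
    described in the text.
    """
    n = len(DURATIONS_S)
    return {word_set: DURATIONS_S[(word_set + version) % n] for word_set in range(n)}


def all_versions():
    """Return the five rotated assignments, one per counterbalancing version."""
    return [rotate_assignment(version) for version in range(len(DURATIONS_S))]


if __name__ == "__main__":
    for version, assignment in enumerate(all_versions()):
        print(version, assignment)
```

Under this rotation, each word set is paired with each duration exactly once across the five versions, so item effects are not confounded with repetition duration.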

Repeated and associate word probes

For each repeated and associate member, a probe word was selected for use during the test phase. The probe words were chosen so that they would be likely to cue the target word (repeated or associate) without cuing any other word in the experiment. For example, for the repeated–associate pair “sheep–wool,” the repeated word probe was “herd,” and the associate word probe was “fabric.” Thus, in the test phase, subjects would see “herd s___” or “fabric w___” (see Fig. 1). Repeated word probes were selected such that they were unrelated to the associate word for that pair; likewise, probes for associate words were unrelated to the corresponding repeated word. Thus, the accessibility of repeated and associate words was independently assessed (Anderson & Spellman, 1995). It is important to note that repeated word probes were selected such that the repeated words did not strongly elicit the probe word, whereas the association from the probe to the repeated word was strong. This relatively unidirectional relationship from repeated word probe to repeated word made it less likely for the repeated word to prime or satiate the repeated word probes themselves, allowing for a cleaner measure of a repeated word’s accessibility at test. The relationship between associate word probes and associate words was also asymmetric.

Fig. 1 Sample stimulus structure of our experiments

For half of the words in each repetition condition, the repeated word was probed during the test; for the remaining half, the associate word was probed. Thus, for each repeated–associate pair, either the repeated word or its associate word was probed. Because half of the probes tested associate words, and associate words were never presented in the experiment, associates helped to create the impression that the free association test was not a test of the earlier words (along with our instructions to freely associate any related response to any cue). Two versions of the test were constructed such that, across subjects, for each repeated–associate pair, the repeated and associate members were equally likely to be probed in each of the repetition conditions. Repeated and associate trials were intermixed in the test, as were the different repetition conditions. Blocked randomization ensured that the average test positions were equated across conditions. Several filler trials were inserted at the beginning of the test phase to allow subjects to acclimate to the task.
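Blocked randomization of the test order might be sketched as follows. The sketch assumes that the 40 test trials fall into 10 cells (5 repetition durations by 2 probe types) of 4 items each, and that each block contains one trial from every cell in shuffled order; the block structure, the seed, and the function name `blocked_order` are assumptions for illustration, not details given in the text.

```python
import random
from itertools import product

DURATIONS_S = [0, 5, 10, 20, 40]
PROBE_TYPES = ["repeated", "associate"]


def blocked_order(items_per_cell=4, seed=0):
    """Return a test order in which every block holds one trial from each cell.

    Each trial is a (duration, probe_type, item_index) tuple. Shuffling only
    within blocks keeps the average test position similar across cells.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    cells = list(product(DURATIONS_S, PROBE_TYPES))
    order = []
    for item_index in range(items_per_cell):
        block = [(duration, probe, item_index) for duration, probe in cells]
        rng.shuffle(block)
        order.extend(block)
    return order


if __name__ == "__main__":
    for trial in blocked_order():
        print(trial)
```

Because every cell appears once per block, no condition can cluster at the start or end of the test, which is the property the blocked randomization is meant to guarantee.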

Results and discussion

Responses on the free association test were scored as correct if they matched the preselected target (i.e., the repeated or associate word, as appropriate). Incorrect trials included cases in which subjects did not generate any response (M = 25.5%) or instead generated a response that did not match the preselected target (M = 74.5%).

Figure 2 displays test performance for the repeated and associate words. A significant main effect of repetition was observed [F(4, 120) = 2.47, p < .05; no interaction with item type: F(4, 120) = 1.24, p = .30], indicating that repetition duration influenced semantic generation. To more precisely characterize how repetition influenced semantic accessibility, we tested for linear and quadratic trends across repetition, using orthogonal polynomial contrasts. Critically, if repetition elicits a monotonic increase in semantic accessibility—as most models of memory would assume—this would be reflected in a positive linear trend; if, however, brief repetition elicits an initial increase in semantic accessibility and prolonged repetition actually reverses this effect, this would be reflected in a quadratic trend.
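The logic of the trend analysis can be made concrete with a minimal sketch. It uses the standard orthogonal polynomial coefficients for five equally spaced levels (consistent with treating repetition duration as ordinal) and tests each contrast with a squared one-sample t statistic on per-subject contrast scores. This is a simplification: it ignores the between-subjects counterbalancing factors included in the reported analyses, so its degrees of freedom would not match those reported here. The data in `HYPOTHETICAL_ROWS` are invented for illustration only.

```python
from math import sqrt
from statistics import mean, stdev

# Orthogonal polynomial coefficients for five equally spaced (ordinal) levels.
LINEAR = [-2, -1, 0, 1, 2]
QUADRATIC = [2, -1, -2, -1, 2]


def contrast_score(condition_means, weights):
    """Weighted sum of one subject's five condition means."""
    return sum(w * m for w, m in zip(weights, condition_means))


def trend_f(subject_rows, weights):
    """F(1, n - 1) for a within-subjects trend: squared one-sample t on contrast scores."""
    scores = [contrast_score(row, weights) for row in subject_rows]
    n = len(scores)
    t = mean(scores) / (stdev(scores) / sqrt(n))
    return t * t


# Hypothetical (made-up) proportions of targets generated at 0, 5, 10, 20, 40 s.
HYPOTHETICAL_ROWS = [
    [0.25, 0.50, 0.50, 0.25, 0.25],
    [0.00, 0.25, 0.50, 0.25, 0.00],
    [0.25, 0.25, 0.75, 0.50, 0.25],
    [0.50, 0.75, 0.50, 0.50, 0.25],
]

if __name__ == "__main__":
    print("linear F:", trend_f(HYPOTHETICAL_ROWS, LINEAR))
    print("quadratic F:", trend_f(HYPOTHETICAL_ROWS, QUADRATIC))
```

An inverted-U pattern across conditions yields a negative quadratic contrast score and a linear score near zero, which is the signature that distinguishes a reversal of priming from a monotonic increase.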

Fig. 2 Test phase performance for Experiment 1. Error bars represent within-subjects standard errors of the item type by repetition duration interaction.

Strikingly, across item types, there was no evidence for a linear trend [F < 1; no interaction with item type: F(1, 30) = 1.82, p = .19]. In contrast, there was a significant quadratic trend [F(1, 30) = 5.9, p < .05; no interaction with item type: F < 1]. More specifically, whereas 10 s of repetition elicited priming, 40 s of repetition abolished this effect (see Table 1). To more strictly test for a reversal of priming with increasing repetition, we performed linear trend analyses for repeated and associate words across the 5-, 10-, 20-, and 40-s repetition conditions. If increasing repetition actually reverses initial priming, this should be reflected in a negative linear trend. Indeed, a robust negative linear trend was observed [F(1, 70) = 5.78, p < .05], indicating—completely counter to what would typically be expected—a clear negative consequence of simply increasing from 5 to 40 s of repetition. While this effect did not interact with item type [F(1, 70) = 1.35, p = .26], when considered separately, the linear effect was significant for associate words [F(1, 30) = 7.93, p < .01], but not for repeated words (F < 1).

Table 1 Statistical analysis of test phase performance, relative to the 0-s baseline condition, in each repetition condition in Experiment 1

The present results provide novel evidence that massed repetition of a word reduces the level of priming that is associated with brief repetition. Critically, the presence of this effect for the associate words indicates an effect at the level of a word’s semantic representation. Notably, these counterintuitive effects of prolonged repetition were observed despite an average lag of 5 min between the repetition of a word and its later testing. Thus, consistent with our hypothesis, these data suggest that prolonged rehearsal of a word can diminish or even eliminate the accessibility benefits to semantic memory that arise with brief repetition.

Experiment 2

Experiment 1 provided novel evidence for a highly counterintuitive phenomenon: Prolonged rehearsal of a word can actually undo some of the accessibility benefits normally associated with brief rehearsal. More rehearsal, apparently, is not always better. Although we did not observe an interaction between this striking pattern and item type, it is clear from Fig. 2 that the effect was numerically larger for associate words. The reason for this difference is unclear, but the semantic generation phase differed for repeated and associate trials in an important way: Subjects could augment their free association performance for repeated words by explicit recall from the earlier repetition phase. In other words, a semantic deficit may have been partially compensated for by an intact or even facilitated episodic memory trace. Although the paradigm made it impossible to eliminate the potential for explicit recall of repeated words, in Experiment 2 we sought to replicate Experiment 1 while minimizing the contribution of explicit recall through two changes. First, we added a block of 10 filler words (semantically unrelated to all other words in the experiment and not probed during the test phase) to the end of the repetition phase. Thus, any bias toward explicit recall of late serial position words would be less likely to influence performance in the semantic generation task. Second, we inserted a 10-min distractor task between the repetition and semantic generation phases to further disguise the relationship between these tasks.

Method

Subjects and design

A group of 40 students participated in exchange for credit toward a course requirement. The design was identical to that of Experiment 1.

Procedure

The procedure matched that of Experiment 1, except that an unrelated visual pattern classification task was performed as a distractor task for 10 min between the repetition and test phases.

Materials

All critical words were identical to those in Experiment 1. Ten new filler words were also generated, semantically unrelated to any of the critical repeated words, associate words, or test probes.

Results and discussion

As in Experiment 1, responses on the free association test were scored as correct if they matched the preselected target. Incorrect trials represented cases in which subjects failed to respond (M = 28.7%) or generated a nontarget word (M = 71.3%).

Figure 3 displays the test performance for repeated and associate words. As in Experiment 1, a significant main effect of repetition was observed [F(4, 120) = 4.05, p < .005; no interaction with item type: F < 1]. There was again no evidence for a linear trend relating repetition time to free association performance [F < 1; no interaction with item type: F(1, 30) = 1.65, p = .21]. In contrast, consistent with our predictions and with Experiment 1, the quadratic trend was highly significant [F(1, 30) = 15.66, p < .001] and did not interact with item type (F < 1).

Fig. 3 Test phase performance for Experiment 2. Error bars represent within-subjects standard errors of the item type by repetition duration interaction.

As in Experiment 1, there was priming at 10 s (and at 5 s), but again, after a full 40 s of repetition, repeated and associate words were no more likely to be generated than were baseline words never exposed in the experiment (see Table 2). Again, as in Experiment 1, a linear trend analysis considering performance from 5 to 40 s, collapsing across item types, revealed a significant decrease in performance with added repetitions [F(1, 30) = 7.30, p < .05]. This effect did not interact with item type [F(1, 30) = 1.21, p = .28], but as in Experiment 1, when considered separately, the effect was significant for associate words [F(1, 30) = 10.12, p < .005], but not for repeated words (F < 1).

Table 2 Statistical analysis of test phase performance, relative to the 0-s baseline condition, in each repetition condition in Experiment 2

Given the nearly identical procedures and materials of Experiments 1 and 2, we performed several additional analyses using the aggregated data across both experiments (treating Experiment as a between-subjects factor). Collapsing across item types and considering all repetition conditions, the quadratic trend was highly significant [F(1, 60) = 20.43, p < .0001; Fig. 4] and did not interact with experiment (F < 1). The linear trend, in contrast, was again not significant (F < 1). Thus, the nonmonotonic relationship between repetition and performance was highly robust and consistent across experiments. Similarly, the negative linear trend observed from 5 to 40 s of repetition was also highly significant [F(1, 60) = 13.07, p < .001, collapsed across item types], providing striking evidence for a reversal of priming with added repetitions. Thus, our data clearly violate the seemingly obvious prediction of monotonic—even if asymptotic—increases in performance with increasing repetition. Rather, these data clearly indicate that simply adding repetition time can progressively eliminate priming effects that would otherwise be evident with low levels of repetition.

Fig. 4 Test phase performance aggregating across Experiments 1 and 2. Error bars represent within-subjects standard errors of the item type by repetition duration interaction.

While we did not observe significant interactions by item type (repeated vs. associate words) in either of the experiments, the reversal of priming was most evident for associate words. Considering the combined data, the quadratic trend did not interact with item type (F < 1), but there was a marginal interaction in the linear trend [F(1, 60) = 3.46, p = .07]. Considering repeated words separately, the quadratic trend was significant [F(1, 60) = 8.29, p < .01], but the linear trend was not [F(1, 60) = 1.26, p = .27], consistent with a nonmonotonic function. Likewise, for associate words the quadratic trend was significant [F(1, 60) = 12.29, p < .001], but the linear trend was not [F(1, 60) = 1.92, p = .17]. Similarly, for both item types, significant priming was observed at 5 and 10 s, but following 40 s of repetition, this effect was not significant for repeated words (even with the increased power afforded by combining data across experiments) or for associate words, where the elimination of priming was complete, because performance was numerically below baseline (see Table 3). The sharp reduction in priming for associate words was reflected in a very robust linear trend from 5 to 40 s of repetition [F(1, 60) = 18.01, p < .0001], whereas this effect was not significant for repeated words [F(1, 60) = 1.35, p = .25]. The interaction in this linear trend, however, was not significant [F(1, 60) = 2.55, p = .12]. Thus, the overall patterns of data were quite similar for repeated and associate words (see Fig. 4), and there were no significant interactions by item type—either within or across experiments. However, the present data nonetheless suggest a somewhat weaker reversal of priming for repeated words.

Table 3 Statistical analysis of test phase performance, relative to the 0-s baseline condition, in each repetition condition in Experiments 1 and 2, combined

Finally, to further characterize how repetition affected priming, we considered whether the types of “errors” that subjects produced (i.e., nontarget response vs. no response) differed as a function of repetition. For nontarget-response trials, which made up the majority (M = 72.8%) of error trials, there was a significant quadratic trend across repetition conditions [F(1, 60) = 17.54, p < .001; linear trend: F < 1]. This quadratic trend reflected an initial decrease in nontarget production, followed by an increase in nontarget production at longer repetition durations (see Table 4). In other words, these data were largely the inverse of those for target production. No-response trials were, overall, a less common form of “error” (M = 27.2%), and there were no quadratic or linear trends for these data (Fs < 1; Table 4).

Table 4 Mean proportions of nontarget and no-response trials as a function of item type and repetition condition for Experiments 1 and 2, combined

General discussion

Both Experiments 1 and 2 demonstrated a striking and clear violation of the memory benefits typically associated with repetition. Specifically, increasing the rehearsal time of a word did not yield a straightforward monotonic increase in performance on a later free association test; rather, it led to a nonmonotonic effect, with performance initially increasing, but then declining with longer repetition durations. Importantly, this counterintuitive reversal of priming was particularly evident for semantically associated words never presented in the experiment, indicating that this effect arises, at least in part, at the semantic level. Both the priming and its reversal were measured on a test that took place long after repetition of the words had been completed (a lag of 15 min in Experiment 2), indicating that the reversal of priming was not limited to the immediate aftermath of repetition but was, surprisingly, reflected in long-term accessibility measures.

Although the present results provide compelling evidence that massive repetition reverses the benefits of short repetition, as reflected in delayed semantic accessibility, it is not obvious what underlies this effect. One mechanism that can be clearly ruled out is passive decay. According to a decay account, semantic representations are activated upon initial repetitions of a word but decay across subsequent repetitions, as participants’ attention to the word they are repeating lapses. Although prior studies of semantic satiation may be subject to this account, it is not a tenable account of the present data. In particular, if semantic activation dissipates quickly, we should never have observed any priming effects for any level of repetition, because all words were tested at extensive delays. Moreover, the retention intervals between the repetition and semantic generation phases were carefully matched across repetition duration conditions by blocked randomization, making differential decay across these conditions an unsatisfactory explanation. Rather, the reversal of priming clearly reflects an active effect of massive repetition. To capture this idea in a theory-neutral fashion, we will refer to this as a massed repetition decrement.

If not passive decay, what might cause massed repetition decrements? One possibility is the mechanism long proposed to be at work in more traditional, semantic satiation paradigms: semantic adaptation. That is, prolonged attention to a word may lead the underlying semantic representation to become less responsive to additional input. Although this account builds an intriguing analogy to mechanisms of sensory adaptation, it is at present not well motivated by existing models of semantic memory. Perhaps more problematic, however, is the time scale of the massed repetition decrement observed here. Namely, whereas sensory adaptation dissipates very quickly after the adapted stimulus is removed, here we observed performance decrements 5–15 min after the words had been repeated. Thus, the adaptation theory may be a better account of phenomena observed in typical short-term semantic satiation studies than it is of the present, more enduring effects. Nevertheless, this account would remain tenable, given an account of how persisting adaptation might arise.

A second reason to question the relevance of the concept of semantic adaptation to the present studies is that the findings do not really fit the notion that the underlying semantic concept is “satiated.” In particular, prolonged repetition never drove free association performance below baseline levels. It is worth noting, however, that evidence for this conceptualization of the satiation phenomenon has never been clear cut. Indeed, evidence for satiation has been subtle, consisting of moderately slower semantic processing after prolonged repetition (e.g., 30 s; Smith, 1984; Smith & Klein, 1990) as compared to brief repetition (e.g., several seconds). Because satiation paradigms have not typically included a nonexposed baseline (cf. Balota & Black, 1997), it has been unclear whether prolonged repetition truly slows semantic processing time for the repeated word relative to an unprimed state. In the present study, we found that increasing repetition duration from 5 to 40 s was associated with a negative influence on semantic accessibility—but it is clear that this downward trend reflected a reduction in performance relative to a primed state and not reductions relative to preexperimental levels. Thus, while the present results parallel semantic satiation effects, the lack of a reduction to below-baseline performance suggests that the term “satiation” may be misleading or overly strong—at least for the present results, and possibly for prior studies of semantic satiation as well. Of course, it remains possible that repetition durations longer than those used here would drive performance below baseline levels, which would be more in line with true adaptation of the semantic representation.

Massed repetition decrements might also reflect the inhibition of part or all of a word’s underlying semantic representation. By this view, semantic representations of the repeated word are actively inhibited to the extent that they prove interfering. At least two variants of this account seem plausible. First, semantic representations may directly interfere with word repetition, resulting in competition between a word’s phonological and semantic representations. This competition may increase with repetition, owing to distortions in the perception of word sounds that typically increase with prolonged repetition (Warren & Gregory, 1958). By a phonological focusing view, the need to focus attention sharply on phonological representations in the service of word articulation could have the secondary consequence of inhibiting the word’s underlying meaning. It is interesting to note that this view could account for the somewhat weaker repetition decrement for repeated words than for associate words, as production of repeated words could reflect the joint contributions of primed phonological representations and inhibited semantic representations.

Second, with massed repetition, activation of semantic associates over time may proliferate to such a degree that attention is occasionally drawn away from the repeated word’s meaning; to rein in activation, attention may be refocused on the word’s meaning, thereby inhibiting noncentral semantic features and associates. Thus, rather than inhibiting a word’s entire semantic representation to focus on phonology, inhibition may be levied against all nonfocal semantic features in order to retain semantic focus on the repeated word (Shivde & Anderson, 2011). According to this view, production of target words at test may have critically depended on whether the probes tapped into dominant/primed features and associations of a repeated word or more peripheral features and associations, as might be true for associates. Critically, each of the inhibition-based accounts suggests that semantic representations are weakened not because words are massively repeated, per se (see Lewis & Ellis, 2000), but because the activated semantic representations interfere with task performance. Thus, the present results may represent a more general case of inhibition via sustained attention.

While the adaptation and inhibition accounts are similar in some respects, they differ in a fundamental way. Namely, the adaptation account explains the massed-repetition decrement as the result of sustained attention ultimately resulting in fatigue for a word’s semantic representation. In contrast, the inhibition view suggests that attention is eventually shifted away from a word’s semantic representation (or at least away from noncentral features), thereby eliciting inhibition of the semantic representation. Thus, by this inhibition account, elements of a word’s semantic representation may be inhibited while other representations (e.g., phonological, lexical, or attended elements of the semantic representation) are primed. In this respect, the inhibition account suggests that the semantic weakening observed with prolonged repetition may be the flip side of priming that occurs at other levels of representation. This interpretation is potentially consistent with the curious phenomenon of antipriming (Marsolek, 2008), whereby exposure to an item can simultaneously result in facilitated subsequent processing of that item (priming) and impaired processing of similar or competing items (antipriming).

Concluding remarks

The beneficial effect of repetition is one of the oldest and most unquestioned effects in memory. Repetition almost always improves long-term retention, or, at worst, leaves it unaffected. Yet, in the present experiments we found evidence that massive, continuous repetition of the sort employed in studies of semantic satiation not only fails to further improve memory, but actually reverses and eliminates the benefits that brief periods of repetition impart to long-term semantic memory. Quite simply, more was not better, a finding that is in striking contrast to the near-ubiquitous benefits that repetition confers. Importantly, this massed repetition decrement was evident at a delay of 15 min, indicating that the effect represented a change in long-term memory and not a fleeting lapse in semantic accessibility. We believe that understanding the mechanistic basis of this massed repetition decrement may provide an intriguing and informative window into the interaction between attention and memory.