Life is filled with emotional events, some joyful, others stressful, such as a wedding or a divorce. Consequently, the role of emotion in the memory of past events has been a major focus of interest for studies on autobiographical and episodic memory. However, few studies have examined how we remember the duration of emotional events, even though this dimension is fundamental (D’Argembeau & Van der Linden, 2005). The purpose of the present study was thus to begin to investigate the memory for the duration of emotional events.

There is abundant evidence that memories of emotional events are more accurate than those of neutral events (Christianson, 1992). Since the pioneering research conducted by Kleinsmith and Kaplan (1963), several studies have demonstrated that arousing emotions enhance long-term memory for events. For example, after an interval of 24 h, individuals remember arousing words better than neutral words (LaBar & Phelps, 1998; Sharot & Phelps, 2004). Indeed, some evidence suggests that emotions facilitate the consolidation of traces in memory (LaBar & Cabeza, 2006). During a period after encoding, called consolidation, newly learned information is fragile and susceptible to disruption before becoming fixed in memory. Emotional reactions are believed to produce a release of adrenal stress hormones that increases the significance of events and enhances their hippocampus-dependent memory consolidation (McGaugh, 2000). This is consistent with studies that have shown that the administration of stimulant drugs within minutes or hours after training facilitates retention in long-term memory (McGaugh, 1973; McGaugh & Roozendaal, 2002). As a result of changes in the consolidation processes, memories of emotional events are thus more persistent and vivid than those of neutral events (Phelps, 2004).

However, in contrast to the finding that emotional events are remembered better, a small number of studies of memory for temporal aspects of emotional events have suggested a distortion rather than an improvement in memory (for a review, see Droit-Volet, in press). For example, individuals who have experienced a traumatic event (e.g., a car accident) report that time appeared to run more slowly than normal during the event (e.g., Anderson, Reis-Costa, & Misanin, 2007; Loftus, Schooler, Boone, & Kline, 1987). Similarly, novice skydivers overestimated the duration of their first jump, and the degree of their overestimation increased with their fear level (Campbell & Bryant, 2007). This raises the question of whether this distortion of the memory for duration of emotional events is specific to time or results from methodological artifacts in the studies that have been conducted to date.

A major problem in the studies on the temporal memory of emotional events is the difficulty of identifying whether the observed time distortions were due to processes of consolidation in memory per se or simply mirrored what was initially encoded. There is now ample evidence that emotions affect the perception of time (e.g., Angrilli, Cherubini, Pavese, & Manfredini, 1997; Droit-Volet, Brunot, & Niedenthal, 2004; Droit-Volet, Fayolle, & Gil, 2011; Droit-Volet, Mermillod, Cocenas-Silva, & Gil, 2010; Falk & Bindra, 1954; Gil & Droit-Volet, in press; Grommet et al., 2010; Stetson, Fiesta, & Eagleman, 2007; Watts & Sharrock, 1984), and it has been demonstrated that durations experienced in a high-arousal emotional context are judged as longer than those experienced in a neutral context. According to internal-clock models of time perception (Gibbon, Church, & Meck, 1984; Treisman, 1963), the increase of arousal occasioned by the emotional event speeds up the internal clock that provides the raw material for the representation of an event’s duration. When the pacemaker of the internal clock runs faster, more temporal units occur during the interval timed, and thus its duration is judged to be longer (for reviews, see Droit-Volet & Meck, 2007; Grondin, 2010; Meck, Penney, & Pouthas, 2008). Consequently, the finding of distorted time judgments in long-term memory for emotional events may result from the encoding of time under emotion-provoking conditions rather than from a specific problem of consolidation in memory. To be able to experimentally examine whether time distortions take place due to consolidation processes, it is thus necessary to verify that the duration has been correctly encoded and stored in long-term memory in the first place, irrespective of different emotional contexts. 
Therefore, in the present study, we used an original procedure in which the participants learned a standard duration in different emotional contexts and then were tested either immediately or after a long-term retention interval.
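The internal-clock account summarized above can be illustrated with a minimal pacemaker-accumulator sketch. It is only a deterministic toy version of the idea: the function name and the pacemaker rates (10 pulses per second at baseline, 12 under arousal) are hypothetical values chosen for illustration and are not taken from the models cited.

```python
def judged_duration(physical_s, pacemaker_hz, baseline_hz=10.0):
    """Toy pacemaker-accumulator readout.

    physical_s   : physical duration of the interval, in seconds
    pacemaker_hz : pulse rate while the interval is timed (hypothetical)
    baseline_hz  : pulse rate assumed when the count is read out (hypothetical)
    """
    # Pulses accumulated while the interval is being timed.
    pulses = pacemaker_hz * physical_s
    # The count is interpreted against the baseline rate, so a faster
    # pacemaker yields a longer judged duration.
    return pulses / baseline_hz


neutral = judged_duration(4.0, pacemaker_hz=10.0)   # clock at baseline rate
aroused = judged_duration(4.0, pacemaker_hz=12.0)   # clock sped up by arousal
print(neutral, aroused)
```

Under these assumed rates, a 4-s interval timed with the faster pacemaker is read out as longer than the same interval timed at baseline, which is the lengthening effect that an encoding-based explanation would predict.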

In addition, most previous studies of temporal memory have used the “retrospective judgment of time” paradigm (Hicks, Miller, & Kinsbourne, 1976). In this paradigm, participants are instructed that they have to estimate the duration of an event only after having experienced it. According to theories that seek to explain retrospective time judgments, there is no guarantee in this situation that participants have paid attention to time or have encoded it. Consequently, temporal judgment is considered to be reconstructed using nontemporal information stored in memory (Block, 1992; Hicks et al., 1976). To avoid such a reconstructive process, we used a “prospective timing” paradigm, in which people were explicitly instructed to pay attention to the stimulus duration that they would experience, and later to estimate it.

The aim of the present study was therefore to try to reconcile the apparent discrepancy between results of enhanced memory for emotional events and those of time judgments, which have suggested distortions of temporal memory under emotion-provoking conditions. We used the temporal generalization task (Church & Gibbon, 1982; Wearden, 1992), which has been used to test memory for duration in a number of previous studies (Jones & Wearden, 2004; Ogden, Wearden, & Jones, 2008; Rattat & Droit-Volet, 2010). In this task, participants initially learn a standard duration (learning phase). They are then presented with comparison durations that are equal to, shorter than, or longer than the standard (test phase), and they must judge whether each comparison has the same duration as the standard by making a “same” or “different” response. In our study, the test phase was administered either immediately after the learning phase or after a retention period of 24 h. Furthermore, three emotional contexts were used during the learning phase: threatening, nonthreatening, and neutral. In the threatening context, the participants expected an aversive sound (a 50-ms burst of 95-dB white noise) at the end of the stimulus to be timed. The expectation of this forthcoming event has been demonstrated to increase a person’s level of arousal and to induce the emotion of fear because it produces slight pain in the ears (e.g., Droit-Volet et al., 2010; Hillman, Hsiao-Wecksler, & Rosengren, 2005; Mermillod, Droit-Volet, Devaux, Schaefer, & Vermeulen, 2010). In the nonthreatening context, the participants expected a nonaversive sound (50 ms of a 50-dB beep) judged to be low-arousing and pleasant (Droit-Volet et al., 2010). In the neutral control condition, no sound was expected.
Our hypothesis was that the temporal comparison judgment between the comparison durations and the standard duration would be more accurate when the standard duration had been previously experienced in an emotional rather than a neutral context, and that this would occur to a greater extent for the aversive than for the nonaversive emotional condition.

Method

Participants

A total of 120 students from Blaise Pascal University, Clermont-Ferrand, France, participated in the experiment. All gave informed consent and were paid €10 for their participation.

Materials

The participants sat in a quiet room in the laboratory in front of a PC, which controlled the events and recorded the responses via E-Prime (version 1.2; Psychology Software Tools, Pittsburgh, PA). The stimulus to be timed (i.e., for the standard duration and comparison durations) was always a blue circle, 2.5 cm in diameter, presented in the center of the computer screen. The participants gave their responses by pressing two keys (“k” or “d”) on the computer keyboard. The emotion-provoking stimuli were two acoustic signals: one aversive and one nonaversive. The aversive signal consisted of a 50-ms burst of 95-dB white noise with an instantaneous rise time that produces a startle reflex (Hillman et al., 2005). The nonaversive signal was a 50-dB beep lasting 50 ms. The emotion-provoking nature of these acoustic signals was recently tested in a temporal task by Droit-Volet et al. (2010) using physiological indexes (skin conductance responses) and self-assessment reports of arousal, of valence, and of the emotions that the stimuli induced. In that study, the aversive signal was rated as highly arousing (i.e., M = 7.25, SD = 1.25, on a 9-point scale), as of negative valence, and as being fear-inducing. In contrast, the nonaversive signal was judged to be less arousing (M = 3.5, SD = 1.5) and of positive valence, and to produce neutral emotions and emotions of happiness. The acoustic signals were delivered binaurally using calibrated headphones.

Procedure

The participants performed a temporal generalization task that consisted of two phases: learning and test. In the learning phase, the participants were initially instructed to memorize the standard duration (4 s) and were presented with this standard duration five times. They were then given two training blocks of eight trials each: four for the standard duration and four for comparison durations of 0.5 and 7.5 s. The standard duration and the comparison durations were always presented in the form of the blue circle. The participants were told to press one key (e.g., “d”) if the comparison had the same duration as the standard (“same” response), and another key (e.g., “k”) if it was different (“different” response). Buttonpress assignments to the “same” and “different” responses were counterbalanced across participants. Each response was immediately followed by informative feedback (“correct” or “wrong”). None of the participants required more than these two training blocks (16 trials in total) to learn the standard duration, that is, to produce 100% correct responses.

The procedure in the test phase was similar to that used in the learning phase, except for the use of different comparison durations (1, 2, 3, 4, 5, 6, and 7 s) that were presented in the absence of feedback. The participants were told “It’s the same game that you played just now/yesterday, but now you won’t receive any feedback.” After each comparison duration, they responded by making a “same” or “different” response as in the training phase. The participants completed eight blocks of 9 trials (72 trials in total); each block contained 3 trials for the standard duration (4 s) and 1 trial for each of the six other comparison durations. The durations were presented in random order within each block. Each trial started when the participant pressed the space bar after the word “ready!” To prevent the participants from using a counting strategy, they were explicitly told not to count, and the experimenter added that if they did count, the results would be distorted (for this method, see Rattat & Droit-Volet, 2011).
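The composition of the test phase described above can be sketched as a short trial-list generator. This is an illustrative reconstruction, not the E-Prime script used in the experiment; the function name, the constant names, and the seeding scheme are assumptions.

```python
import random

STANDARD_S = 4                          # standard duration, in seconds
COMPARISONS_S = [1, 2, 3, 4, 5, 6, 7]   # comparison durations, in seconds


def build_test_trials(n_blocks=8, seed=0):
    """Return the durations for the whole test phase, block by block.

    Each block holds 3 presentations of the standard and 1 presentation
    of every other comparison duration, shuffled within the block.
    """
    rng = random.Random(seed)  # hypothetical seeding, for reproducibility only
    trials = []
    for _ in range(n_blocks):
        block = [STANDARD_S] * 3 + [d for d in COMPARISONS_S if d != STANDARD_S]
        rng.shuffle(block)     # random order within the block
        trials.extend(block)
    return trials


trials = build_test_trials()
print(len(trials), trials.count(STANDARD_S))
```

With eight blocks, the list contains 72 trials, 24 of which present the 4-s standard and 8 of which present each of the other durations.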

A between-subjects design was used in which the participants were randomly assigned to six experimental conditions (20 participants per group) as a function of the retention period between the learning and the test phase (immediate and 24 h) and the emotional condition experienced during the learning phase (aversive, nonaversive, and control). However, 1 participant in the nonaversive/24-h-delayed testing condition was excluded from the statistical analyses because he produced a totally flat generalization gradient—that is, he pressed the same button for all responses. In the aversive and nonaversive conditions, the aversive and nonaversive sounds were presented 50 ms after the stimulus duration on every trial of the learning phase (i.e., the 5 trials of standard duration presentation and the 16 trials of the training blocks). The acoustic signal was not delivered during the test phase for these two “emotional” conditions. In the control condition, no sound was presented after the stimulus duration in either the learning or the test phase.
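As a quick arithmetic check on this design, the sketch below tabulates the six cells after the exclusion and derives the error degrees of freedom that a between-subjects ANOVA would have (total sample size minus the number of cells). The dictionary layout and function names are illustrative assumptions; only the cell sizes come from the text.

```python
# Cell sizes after excluding one participant from the nonaversive/24-h cell.
CELL_SIZES = {
    ("immediate", "aversive"): 20,
    ("immediate", "nonaversive"): 20,
    ("immediate", "control"): 20,
    ("delayed", "aversive"): 20,
    ("delayed", "nonaversive"): 19,
    ("delayed", "control"): 20,
}


def error_df(cell_sizes):
    """Error df of a between-subjects ANOVA: N minus the number of cells."""
    sizes = list(cell_sizes)
    return sum(sizes) - len(sizes)


def cells_for(test_time):
    """Sizes of the three emotion groups tested at the given time."""
    return [n for (t, _), n in CELL_SIZES.items() if t == test_time]


print(sum(CELL_SIZES.values()))          # participants analyzed
print(error_df(CELL_SIZES.values()))     # full 2 x 3 design
print(error_df(cells_for("immediate")))  # one-way analysis, immediate test
print(error_df(cells_for("delayed")))    # one-way analysis, delayed test
```

The values obtained (113 for the full design, 57 and 56 for the immediate and delayed tests) agree with the denominator degrees of freedom of the F tests reported in the Results.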

Results

Figure 1 shows the temporal generalization gradients as the mean proportions of “same” responses (comparison duration judged to be the same as the standard duration) plotted against the comparison durations for the immediate test and 24-h-delayed test conditions. The upper, middle, and bottom panels show data for the aversive, nonaversive, and control conditions, respectively. To describe the generalization gradients, two measures were calculated: the peak time (i.e., the stimulus duration that gave rise to the highest proportion of “same” responses) and the width of the temporal generalization gradient at half of its maximum height (full width at half maximum, or FWHM; as used by Hinton & Rao, 2004; Penney, Holder, & Meck, 1996). These two measures were obtained by fitting each participant’s temporal gradient with the logarithmic curve-fitting algorithms from the PeakFit program. The logarithmic function produced the best fit of temporal gradients for the participants (mean R² = .91, SD = .08).
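To make the two measures concrete, the following sketch computes a peak time and an FWHM directly from a discrete gradient, using linear interpolation between adjacent comparison durations. This is a simplification and not the PeakFit procedure described above, which fits a curve first; the function names and the example gradient are hypothetical.

```python
def _half_max_crossing(xs, ys, start, step, level):
    """Walk from the peak in one direction and interpolate where the
    gradient falls below `level`; return the edge value if it never does."""
    i = start
    while 0 <= i + step < len(xs) and ys[i + step] >= level:
        i += step
    j = i + step
    if not 0 <= j < len(xs):
        return xs[i]
    # Here ys[i] >= level > ys[j], so the denominator is strictly positive.
    frac = (ys[i] - level) / (ys[i] - ys[j])
    return xs[i] + frac * (xs[j] - xs[i])


def peak_and_fwhm(xs, ys):
    """Return (peak time, full width at half maximum) of a gradient.

    xs : comparison durations in seconds, in increasing order
    ys : proportion of "same" responses at each duration
    """
    peak_idx = max(range(len(ys)), key=lambda k: ys[k])
    half = ys[peak_idx] / 2.0
    left = _half_max_crossing(xs, ys, peak_idx, -1, half)
    right = _half_max_crossing(xs, ys, peak_idx, +1, half)
    return xs[peak_idx], right - left


# Hypothetical gradient for one participant (not data from the study).
durations = [1, 2, 3, 4, 5, 6, 7]
p_same = [0.0, 0.25, 0.75, 1.0, 0.75, 0.25, 0.0]
print(peak_and_fwhm(durations, p_same))
```

For this made-up symmetric gradient the peak lies at the 4-s standard; a narrower gradient would give a smaller FWHM, that is, less variable temporal discrimination.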

Fig. 1. Temporal generalization gradients (proportions of responses judged as being of the same duration as the standard duration) for the immediate and the 24-h-delayed tests in the aversive condition (upper panel), the nonaversive condition (center panel), and the neutral condition (lower panel)

Figure 2 shows the peak times obtained in this way. An ANOVA was conducted on the peak time measure, with test time (immediate vs. delayed) and emotion condition at training (aversive, nonaversive, and control) as between-subjects factors. There was a significant interaction between the test time and the emotion condition at training, F(2, 113) = 3.52, p = .03, but no main effect of test time, F(1, 113) = 2.50, p = .12, nor of emotion, F(2, 113) = 1.76, p = .18. In the immediate test, the peak time seemed to be longer for the aversive than for the other emotional conditions; however, the effect of emotion was not significant, F(2, 57) = 0.71, p = .50. This suggests that, in the present study, temporal discrimination was accurate in the temporal generalization task when the standard duration was learned across several trials in different emotional conditions. While there was no effect of emotion at the immediate test, the effect of emotion was significant for the delayed test, F(2, 56) = 4.15, p = .02. This was due to the fact that, in the control condition, the peak of the generalization gradient was shifted toward a longer duration value in the delayed than in the immediate test, t(38) = 2.49, p = .02. The ratio between the peak time and the standard duration, (Peak Time − Standard Duration) / Standard Duration, was indeed greater than zero in the delayed test, t(19) = 3.50, p = .002, whereas it did not significantly differ from zero in the immediate test, t(19) = 1.06, p = .30. In sum, durations learned in a neutral condition were distorted (i.e., overestimated) after the long retention interval. In contrast, the difference in the peak time between the delayed and the immediate tests decreased for the durations learned in an emotional context, with a nonsignificant difference in the aversive condition [3.91 vs. 4.19; t(38) = 1.08, p = .29] and in the nonaversive condition [4.36 vs. 3.95; F(1, 113) = 2.50, p = .12]. In the delayed test, the ratio between the peak time and the standard duration did not differ from zero in the aversive condition, t(19) = 0.51, p = .62, whereas it just reached significance in the nonaversive condition [t(19) = 2.14, p = .047]; in the immediate test, it always remained close to zero [t(19) = 1.06 and t(19) = 0.38, respectively; all ps > .05]. Overall, these results indicate that the distortion of time following long-term retention of durations was reduced for emotional conditions as compared to the neutral condition, and especially in the case of the aversive condition.
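The ratio used in these comparisons, and the one-sample t statistic that tests it against zero, can be sketched as follows. The function names and the example peak times are hypothetical and serve only to show the arithmetic; they are not values from the study.

```python
import math
import statistics

STANDARD_S = 4.0  # standard duration, in seconds


def relative_peak_shift(peak_time_s, standard_s=STANDARD_S):
    """(peak time - standard) / standard; positive values mean the
    gradient peaks at a duration longer than the standard."""
    return (peak_time_s - standard_s) / standard_s


def one_sample_t(values, mu=0.0):
    """Return (t, df) for a one-sample t test of the mean against mu."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    return (mean - mu) / (sd / math.sqrt(n)), n - 1


# Hypothetical peak times for three participants (illustration only).
shifts = [relative_peak_shift(p) for p in (4.4, 4.8, 5.2)]
print(shifts)
print(one_sample_t(shifts))
```

With 20 participants per group, such a test has 19 degrees of freedom, as in the t(19) values reported above.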

Fig. 2. Mean peak values of the generalization gradient for the immediate and the 24-h-delayed tests in the nonaversive, aversive, and neutral conditions

An ANOVA was also run on the measure of variability in time discrimination (i.e., FWHM), using the same between-subjects factors as for the peak time (see Fig. 3). This revealed neither a main effect of test time, F(1, 113) = 2.81, p = .59, nor any test time × emotion interaction, F(2, 113) = 2.23, p = .11. There was only a marginally significant effect of emotion, F(2, 113) = 2.79, p = .06. However, further statistical analyses revealed that a significant effect of emotion appeared in the 24-h-delayed test condition, F(2, 56) = 4.24, p = .02, while this effect was not significant in the immediate test condition, F(2, 57) = 0.91, p = .41. For the 24-h-delayed test, this significant effect was due to an FWHM value that was smaller for the aversive than for the control condition (1.97 vs. 2.52; Scheffé test, p < .05). The FWHM value in the nonaversive condition (2.18) was intermediate between the values for the aversive and control conditions and did not differ significantly from either of them (all ps > .05). After a long retention interval, the variability in time discrimination was thus lower in the aversive than in the neutral condition. This result indicates that durations experienced in emotional conditions were recalled better than those experienced in nonemotional contexts.

Fig. 3. Mean full widths at half maximum (FWHM) of the generalization gradient for the immediate and the 24-h-delayed tests in the nonaversive, aversive, and neutral conditions

Discussion

Our results showed that after a 24-h retention interval, temporal comparison judgments between the standard duration and other durations were more accurate and less variable when the standard duration had been previously experienced in a context known to evoke emotional responses than when it was experienced in a neutral context. In addition, temporal discrimination was better when the learning of the standard duration was experienced in an aversive rather than a nonaversive condition.

The improvement of the temporal judgment when the standard duration was learned in the emotional context as compared to the neutral context suggests that the emotional responses enhanced the long-term memory of the standard duration. This finding is entirely consistent with the results of studies of long-term memory for emotional events, which have shown that emotions improve the memories of events and of some of their properties (e.g., location) (e.g., D’Argembeau & Van der Linden, 2004; Dunbar & Lishman, 1984). Therefore, contrary to findings in the literature on time distortions in long-term memory for emotional events, our results suggest that durations are remembered better when they are initially experienced in an emotional rather than a neutral context. However, in our study we tested a relative judgment between a current duration and a previously experienced standard duration rather than an absolute judgment of the standard. It will thus be important to further investigate the impact of different types of temporal judgment on the sense of time in long-term memory.

Our results also showed that the accuracy of temporal memories was increased when the standard duration was learned in an aversive rather than a nonaversive context. In the aversive context, the participants expected a threatening sound that is known to induce the emotion of fear and to increase the arousal level during the processing of the standard duration (Droit-Volet et al., 2011). Studies that have used a fear-conditioning paradigm have shown that the expectation of this type of threatening stimulus not only produces arousal responses but also activates the amygdala (Phelps et al., 2001). According to neurobiological accounts of emotion and memory, the amygdala modulates the hippocampus-dependent memory consolidation of emotional events through the production of stress hormones (for reviews, see McGaugh, 2000; Phelps, 2004). Consequently, we can assume that the association between emotional stress reactions induced by the expectation of an aversive sound and the standard duration to be processed would have facilitated the consolidation of the standard in memory and its long-term retrieval.

The processes of consolidation and encoding involve different mechanisms (McGaugh, 2000). However, one can also suppose that the emotional context might have improved the processing of the standard duration, because it has been shown that threatening stimuli automatically attract focused attention for basic survival reasons (Öhman & Soares, 1993; Sharot & Phelps, 2004). In addition, there is a positive correlation between the degree of activation of the amygdala (for the rapid detection of threatening stimuli) and long-term memory (Cahill et al., 1999). It is thus possible that the amount of attention allocated to the processing of time was higher in the aversive than in the nonaversive or the neutral condition, with the result that the standard duration was encoded better and recalled better 24 h later. However, our results showed no significant difference in temporal discrimination as a function of the emotional conditions, either in the immediate test phase or during the learning phase. This pattern supports a consolidation-related rather than an encoding-related hypothesis.

In the presence of emotional arousal, we would also have expected to observe a lengthening of duration in the aversive as compared to the neutral condition. As we stated in the introduction, numerous studies have shown that time is judged as longer in high-arousal than in low-arousal conditions, because the internal clock speeds up with the increase of arousal (Droit-Volet & Gil, 2009). As far as memory is concerned, Millar, Styles, and Wastell (1980) proposed that participants reactivate in memory not only the content of past events, but also the emotional state associated with these events. It is thus possible that, in our study, the emotional state experienced during the learning phase was reactivated during the testing phase. In this case, the standard duration and the comparison durations would have been processed with the same internal clock speed, and no lengthening effect would have been observed in the aversive condition. However, Gil and Droit-Volet (2011) have recently observed that the emotion-based lengthening effect is not obtained in the temporal generalization task, but only in tasks in which the participants can compare different emotional stimuli within the same session. In addition, in the present study, the participants learned the standard duration over several training trials, which probably improved timing accuracy in the temporal generalization task. Consequently, methodological factors (the generalization task and learning phase) likely contributed to the absence of a subjective lengthening effect in the aversive condition.

No previous study has experimentally investigated the long-term memory of durations learned in different emotional contexts, and several questions remain unanswered. Nevertheless, our study is the first to provide results that suggest that arousing emotions enhance the long-term memory of stimulus durations, just as they enhance the memory of events and of their nontemporal characteristics.