Major depressive disorder (MDD) has been characterized by high levels of rumination—uncontrollable negative thoughts about the depressed individuals’ symptoms and situation that interfere with their ability to concentrate and carry out daily activities (Nolen-Hoeksema, 2000). Rumination is not merely a symptom of depression; it maintains and exacerbates depressive symptoms, and it has been found to predict the likelihood of recurrence of depressive episodes (Nolen-Hoeksema, 2000; Nolen-Hoeksema, Wisco, & Lyubomirsky, 2008). Therefore, identifying the cognitive and neural mechanisms of rumination may help us gain a better understanding of the etiology and maintenance of depression.

Cognitive processes of rumination

It is likely that rumination involves difficulties in controlling the contents of short-term memory (STM; Joormann & Gotlib, 2008; Joormann, Nee, Berman, Jonides, & Gotlib, 2010); thus, we propose that depressed individuals have a specific deficit in removing negative self-relevant information from STM. Specifically, we hypothesize that this failure to expel negative information from STM leads to increased interference, which, in turn, results in the difficulties in concentration and memory that have been associated with rumination and depression (Lyubomirsky, Kasri, & Zehm, 2003). As such, individuals with MDD (MDDs, for short) may also show smaller STM capacity than do healthy control participants (HCs) because rumination may consume cognitive resources.

Considerable research has examined the neural and cognitive mechanisms involved in resolving interference in STM (Berman, Jonides, & Lewis, 2009; D’Esposito, Postle, Jonides, & Smith, 1999; Jonides et al., 1998; Nee, Jonides, & Berman, 2007; Nelson, Reuter-Lorenz, Sylvester, Jonides, & Smith, 2003; Oztekin & McElree, 2007; Thompson-Schill, D’Esposito, Aguirre, & Farah, 1997). Although some investigators have demonstrated that depression is associated with an impaired ability to remove negative information from STM once it enters (Joormann & Gotlib, 2008; Joormann et al., 2010; Whitmer & Banich, 2007; Banich et al., 2009), little research has examined the neural underpinnings of the relationship between depression and the ability to control information in STM. In this study, therefore, we used an STM interference task to elucidate the neural and behavioral mechanisms of rumination and depression.

Overview of the present research

We used a directed-forgetting task to assess interference resolution of affectively valenced stimuli in STM (Nee et al., 2007; Zhang, Leung, & Johnson, 2003). The task required participants to attempt to remove from STM previously encoded positive and negative words (see also Joormann & Gotlib, 2008; Joormann et al., 2010). We hypothesized that MDD and HC participants would differ with respect to both behavioral and neural functioning when trying to forget negative, but not when trying to forget positive, information. Behaviorally, we hypothesized that MDD participants would exhibit longer reaction times (RTs) and poorer accuracy than would HCs when trying to forget negative words. We also hypothesized that higher levels of rumination would be associated with greater difficulty forgetting negative items.

Neurally, we expected to find differences between the MDD and HC participants when forgetting negative versus positive items, specifically in the left inferior frontal gyrus (LiFG), a region that has been implicated in memory selection and interference resolution (D’Esposito et al., 1999; Jonides et al., 1998; Nee et al., 2007; Nelson et al., 2003). In addition to exploring activation magnitude differences between groups, we also explored differences in the variances of activation between groups. While exploring variance in activations is a relatively recent development in the analysis of brain-imaging data (researchers typically control for, rather than measure, this variable), variability has proven to be an important measure in other subfields of psychology and behavioral science (Mischel & Shoda, 1995, 1998; Riley & Turvey, 2002; Segerstrom & Nes, 2007). Recent fMRI research has also explored variance of the fMRI signal as a dependent variable, and researchers have found this to be a sensitive measure (e.g., Musso, Konrad, Vucurevic, Schaffner, Friedrich, & Frech, 2006; Winterer, Musso, Beckmann, Mattay, Egan, Jones, et al., 2006a). Garrett, Kovacevic, McIntosh, and Grady (2010), for example, found that variance in the BOLD signal during fixation periods differentiated older and younger adults with five times the predictive power of an amplitude-based analysis. Other research has shown that variance in the fMRI signal predicted psychotic symptoms for people with schizophrenia (Winterer et al., 2006a). In addition, in a recent review, Bush (2010) called for more research examining variability in fMRI signals as they pertain to ADHD because such “noise” could be related to decreased dopamine levels and could help to evaluate hypoactivation in ADHD groups. All of this research suggests that using variance of activation as a dependent measure could be important when comparing and contrasting fMRI results from different populations.

We hypothesized that neural differences between MDDs and HCs could manifest themselves in any of three ways: (1) There could be differences in the magnitude of activation in LiFG between the two groups (e.g., MDDs may activate LiFG to a lesser degree than do HCs). (2) There could be differences in the temporal variance of activations in LiFG (e.g., MDDs may activate LiFG more variably over time than do HCs). (3) There could be differences in the spatial variance of activations in LiFG (e.g., MDDs may activate LiFG more diffusely than do HCs). Therefore, exploring both spatial and temporal variance in depressed and healthy participants may elucidate neural aspects of rumination in conjunction with more standard magnitude analyses.

Method

Participants and measures

Thirty-two right-handed adults (21 female, mean age 24.4 years) participated in this study. Sixteen participants met criteria for MDD as assessed by the Structured Clinical Interview (SCID),Footnote 1 and 16 had no current or past Axis I pathology. All SCID diagnoses were confirmed by a second, independent interviewer. Participants also completed the Ruminative Response Styles (RRS) questionnaire (Nolen-Hoeksema & Morrow, 1991; Treynor, Gonzalez, & Nolen-Hoeksema, 2003), which assesses the degree to which participants engage in rumination with depressive content, and the Beck Depression Inventory II (BDI; Beck, Steer, & Brown, 1996), which assesses the severity of depressive symptomatology. Three participants were excluded from all analyses (1 had poor fMRI normalization, 1 did not complete the experiment, and 1 did not perform the word-rating task at the end of the study), leaving 15 MDDs and 14 HCs for the final analyses.Footnote 2 However, the participant with poor normalization was included in the behavioral analyses, yielding 15 MDDs and 15 HCs for those analyses.

Materials and procedures

Participants saw a display of four words on a computer screen. Two of the words were presented in blue and two in red. Participants were instructed to encode and remember all four words. After a 4-s delay, participants saw either a blue or a red color patch indicating the color of the words they were now to remember; the words in the other color were to be forgotten. Following a jittered cue-to-stimulus interval (CSI) of 4, 6, 8, or 10 s (average CSI = 7 s), participants saw a single probe word and pressed a “yes” key if that word was one of the two words they were to remember, or a “no” key if it was not one of the two words they were to remember. The intertrial interval (ITI) was jittered to be 4, 6, 8, or 10 s (average ITI = 7 s).

There were two types of “no” trials, which were the trials of main interest here: those with “control” probes (words that were not seen in over 100 trials on average) and those with “lure” probes (words that were drawn from the to-be-forgotten set of the current trial). Previous research demonstrated that people are both delayed and less accurate in responding to lure trials than to control trials (Nee et al., 2007; Zhang et al., 2003). The difference in performance between lure and control trials indexes how well participants are able to control the contents of STM and to resolve interference. The task used in this study required participants to forget and remember both positively and negatively valenced items. Because of the high level of rumination by MDDs about negative material (Beck, 1967; Nolen-Hoeksema, 1991; Nolen-Hoeksema et al., 2008), we hypothesized that MDD, but not HC, participants would have more difficulty saying “no” to negatively valenced lures than they would to positively valenced lures, indexed by increased RT and decreased accuracy.

Participants first practiced 32 trials of the directed-forgetting task with words that they would not see again in the study. All words were selected from the Affective Norms for English Words (ANEW; Bradley & Lang, 1999). Specifically, words were selected that were positively or negatively valenced according to ANEW norms in order to increase the likelihood that participants would perceive the words as differentially valenced. The mean ANEW valence was 3.15 for the negative words and 7.21 for the positive words. The positive and negative words were equated for arousal (negative, M = 5.46, SD = 1.12; positive, M = 5.48, SD = 1.12) and frequency (negative, M = 24.9, SD = 36.1; positive, M = 27.6, SD = 27.3). As described below, participants were asked to rate the valence of each word at the end of the experiment, and these idiosyncratic ratings were used in our behavioral and neural analyses so we could examine the effect of affective valence as perceived individually by each participant. The experimental task contained 192 trials: 24 lure negative trials, 24 lure positive trials, 24 control negative trials, 24 control positive trials, 48 “yes” negative trials, and 48 “yes” positive trials.Footnote 3 Participants completed 12 runs of the experiment (in two sessions of six runs each; see Fig. 1).

Fig. 1

Schematic diagram of the valenced directed forgetting task. Each run consisted of 16 trials that were balanced for the different trial-type combinations, lasted 7 min 38 s, and included 16 s of fixation at the beginning and end of each run

After completing Session 1 (approximately 1.25 h), participants completed the BDI and RRS questionnaires. They then performed a short automated operation span task in which they had to remember words while simultaneously solving math problems (Unsworth, Heitz, Schrock, & Engle, 2005). The automated operation span task measures STM capacity; it was administered to determine whether STM capacity differed between groups and to provide a possible covariate for our behavioral analyses. Following these tasks, participants returned to the scanner to complete Session 2 (also approximately 1.25 h). The experiment was divided into two sessions to avoid the fatigue that would have been caused by 2.5 h of continuous scanning time.Footnote 4 After completing Session 2, participants had 3 min to recall as many words as they could from the experiment. Following this, participants rated each word on a scale from 1 to 7 indicating how negative or positive that word was to them (1 = this word is very negative to you; 7 = this word is very positive to you).Footnote 5

fMRI analysis parameters

Details regarding our fMRI acquisition and preprocessing parameters are presented in the fMRI Methods section of the online supplemental materials. Functional images were entered into a general linear model in which fixation (at the beginning and end of each run), the stimulus display (all negative items, all positive items, and half negative and half positive items separately), the remember cue, and the probe were modeled. There were three types of probes: lure, control, and yes trial probe words that were modeled separately based on participant ratings. Words with ratings of 1 or 2 were categorized as negative, ratings of 3–5 as neutral, and ratings of 6 or 7 as positive. MDD and HC participants did not differ significantly in the number of words they rated as positive and negative, but both groups rated more words as negative than as positive, t(28) = 7.8, p < .001. This fMRI analysis matched many of our behavioral analyses in which similar-valence trials were aggregated. Incorrect trials were modeled separately, as were probe words that had missing word ratings. Furthermore, 24 motion regressors were added into our model—including the linear, squared, derivative, and squared derivative of the six rigid-body movement parameters (Lund, Norgaard, Rostrup, Rowe, & Paulson, 2005)—resulting in a total of 40 regressors (some participants did not have incorrect trials or unrated words, resulting in 38 regressors for those participants).
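The expansion of the six rigid-body movement parameters into 24 motion regressors can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function name `expand_motion_parameters`, the use of a backward difference for the derivative, and the choice to set the derivative to zero at the first time point are all assumptions made for the example.

```python
def expand_motion_parameters(params):
    """Expand 6 rigid-body parameters per time point into 24 regressors.

    params: list of time points, each a sequence of 6 values
            (3 translations, 3 rotations).
    Returns a list of time points, each with 24 values ordered as:
    linear (6), squared (6), derivative (6), squared derivative (6).
    """
    expanded = []
    for t, row in enumerate(params):
        # Assumption: backward difference, defined as 0 at the first time point.
        previous = params[t - 1] if t > 0 else row
        derivative = [cur - prev for cur, prev in zip(row, previous)]
        expanded.append(
            list(row)
            + [value ** 2 for value in row]
            + derivative
            + [d ** 2 for d in derivative]
        )
    return expanded


# Hypothetical two-volume example.
motion = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0, 0.0, 0.5],
]
regressors = expand_motion_parameters(motion)
```

Each time point yields 6 x 4 = 24 columns, which matches the count of motion regressors reported above.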

For all fMRI analyses, a threshold of p < .005 (uncorrected) at the voxel level was used, which was then corrected using a cluster-size threshold of 20 contiguous voxels, thus reducing Type I error probability (Forman et al., 1995). To assess brain regions showing overlap for MDDs and HCs when attempting to remove information from STM, a conjunction analysis was performed on the lure – control contrast for both groups. The conjunction analysis was thresholded at p < .01 for each group contrast, producing a conjoint p < .001 threshold, and was restricted to five contiguous voxels. This conjunction analysis followed the same procedure as in Nee, Jonides, and Berman (2007).
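The text does not state how the conjoint threshold was derived from the two per-group thresholds. One common way to combine independent p values is Fisher's method, and under that assumption two thresholds of .01 give a combined value of roughly .001. The sketch below shows only this arithmetic; the function name `fisher_combined_p` is hypothetical, and the choice of Fisher's method is an assumption rather than something the source confirms.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p values with Fisher's method.

    The statistic X = -2 * sum(ln p) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis. For even
    degrees of freedom the upper tail has the closed form
    exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!.
    """
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)
    return math.exp(-half_x) * sum(
        half_x ** i / math.factorial(i) for i in range(k)
    )


# Two group contrasts, each thresholded at p = .01 (assumed independent).
conjoint = fisher_combined_p([0.01, 0.01])
```

Note that the simple product of the two thresholds would be .0001, so the reported .001 is closer to what a Fisher-style combination produces.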

fMRI spatial variance calculation

Spatial variance in regions of interest (ROIs) was calculated by taking the activation magnitudes of all voxels in a specified ROI for a given contrast and calculating the variance score of those activations across all voxels in the ROI for each participant. No temporal information was used in the calculation of spatial variance. These variances were all calculated within subjects.
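As an illustration of this calculation, the sketch below computes one spatial variance score per participant from the contrast values of the voxels in an ROI. The participant labels and voxel values are made up for the example, and the use of the sample variance (dividing by n - 1) is an assumption, since the text does not say which estimator was used.

```python
from statistics import variance


def spatial_variance(contrast_values):
    """Variance of contrast values across all voxels in one ROI.

    contrast_values: one activation magnitude per voxel for a single
    participant and a single contrast. No temporal information is used.
    Assumption: sample variance (n - 1 denominator).
    """
    return variance(contrast_values)


# Hypothetical per-participant voxel values within the ROI.
roi_contrasts = {
    "participant_a": [-1.5, 0.2, 1.4, -0.8, 1.1],   # diffuse pattern
    "participant_b": [0.10, 0.20, 0.15, 0.05, 0.10],  # focal pattern
}
scores = {pid: spatial_variance(v) for pid, v in roi_contrasts.items()}
```

The resulting per-participant scores can then be compared between groups with an ordinary two-sample test.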

Behavioral analysis

Behavioral analyses were focused on the “no” response trials, given our interest in examining participants’ ability to expel information from STM. To correct for outliers, trials for which RTs were either greater than three standard deviations from each participant’s mean (calculated separately for each trial type) or less than 400 ms were excluded (a standard procedure used to trim RT data). On average, this trimming procedure removed only 1.35% of trials. The resulting means for the correct trials were used in our analyses. We hypothesized that, compared with HCs, MDDs would exhibit a larger lure – control difference for negative words than for positive words (i.e., slower RTs, more errors). To test this hypothesis, two 2 (group: MDD vs. HC) X 2 (trial type: lure vs. control) X 2 (valence: positive vs. negative) repeated measures ANOVAs were conducted (one on RTs and one on accuracy), with group as a between-subjects factor and trial type and valence as within-subjects factors.
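The trimming rule can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `trim_rts` and the example data are hypothetical, the three-standard-deviation criterion is applied only above the mean (the 400-ms floor handles fast responses), and the mean and standard deviation are computed on the untrimmed trials of each trial type.

```python
from statistics import mean, stdev


def trim_rts(rts_by_trial_type, sd_cutoff=3.0, floor_ms=400.0):
    """Trim one participant's RTs, separately for each trial type.

    Drops RTs more than sd_cutoff standard deviations above that trial
    type's mean, and RTs faster than floor_ms.
    """
    trimmed = {}
    for trial_type, rts in rts_by_trial_type.items():
        center = mean(rts)
        spread = stdev(rts) if len(rts) > 1 else 0.0
        upper = center + sd_cutoff * spread
        trimmed[trial_type] = [rt for rt in rts if floor_ms <= rt <= upper]
    return trimmed


# Hypothetical RTs (ms) for one participant.
raw = {
    "lure_neg": [580, 600, 620] * 6 + [600, 3000],
    "control_neg": [350, 610, 590, 600],
}
clean = trim_rts(raw)
```

In this made-up data set the 3000-ms response and the 350-ms response are removed, and all other trials are kept.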

Results

Behavioral results

MDDs had more difficulty removing negative information from STM than did HCs, but there were no group differences when removing positively valenced information from STM. A 2 (group: MDD vs. HC) X 2 (trial type: lure vs. control) X 2 (valence: positive vs. negative) ANOVA conducted on our behavioral data yielded a highly reliable main effect of trial type: Responses to lure trials were less accurate, F(1, 28) = 11.352, p < .005, and slower, F(1, 28) = 81.85, p < .001. There was not, however, a reliable Group X Trial Type interaction for either RT, F(1, 28) = 1.473, n.s., or accuracy, F(1, 28) = 1.759, n.s., nor was the interaction of Trial Type X Valence significant for RT, F(1, 28) = 2.275, n.s., or for accuracy, F(1, 28) = 1.611, n.s. Critically, the RT ANOVA yielded a significant three-way interaction of Group X Trial Type X Valence: MDDs and HCs differed in the lure – control contrast by valence, F(1, 28) = 5.12, p < .05. These results show that MDDs have more difficulty than do HCs in resolving interference from negatively valenced items. These results do not appear to be driven by a more generic effect of group or valence, as evidenced by the two null two-way interactions. We explored these effects in greater detail with planned comparison t tests on RT. MDDs had a greater lure – control difference than did HCs for negatively valenced words, t(28) = 2.05, p < .05, but this was not the case for positively valenced words, t(28) = 0.04, n.s. In addition, MDDs had a reliably larger lure – control difference for negatively valenced words than for positively valenced words, t(14) = 2.23, p < .05. These results are presented in Fig. 2 and Table S1 online. There were no differences by valence for HCs. None of these valence effects was reliable with accuracy as the dependent variable.

Fig. 2

Mean correct RTs for the lure – control contrast for the most negatively (ratings 1 and 2) and positively (ratings 6 and 7) rated words for both individuals with major depressive disorder (MDDs) and controls. Here, valence was determined by each individual participant. MDDs exhibit more difficulty in removing negative information from STM than positive. Controls do not show this pattern

In sum, in contrast to HCs, MDDs are less effective at removing negatively valenced information from STM than they are at removing positively valenced information. We confirmed this finding in a separate analysis using all of the word ratings, not just the extremes. The slope of the function relating the lure – control contrast to ratings of valence was reliably different for the two groups, t(28) = 2.40, p < .05. For MDDs, as the valence of the words became more positive, the lure – control difference became smaller. In contrast, HCs displayed a mild trend in the opposite direction. These data suggest that although the majority of the effect lies in the extremes of valence, the effect is also evident at intermediate levels of valence. These plots are presented in Fig. S1 online.

Additional behavioral effects

To assess whether rumination was related to the ability to remove negative and positive information from STM, RRS scores were correlated with the difference in RTs for the lure – control contrast for negative valence minus positive valence. As presented in Fig. 3, the more participants ruminated, the more difficulty they had removing negatively relative to positively valenced information from STM, r = .43, t(29) = 2.5, p < .05. This result was nearly identical when lure negative – lure positive RTs were correlated with RRS, r = .42, t(29) = 2.43, p < .05, indicating that the lure trials were driving these results. These same correlations were explored substituting the full-RRS scores with the Brooding and Reflection subscales. There was a trend for the Brooding scores to correlate positively with RT differences for lure – control negative versus lure – control positive, r = .31, t(29) = 1.73, p < .1, but not for the Reflection scores to correlate in this way, r = .16, t(29) = 0.85, n.s. These correlations were not reliably different from one another, Z(29) = 0.58, n.s., but we may not have had the power to detect differences with our sample size. Even so, these results, though modest, suggest that expelling negative relative to positive information may be related to more negative forms of rumination (e.g., brooding) and not to general pondering (e.g., reflection).

Fig. 3

The difference in the lure – control contrast for the most negatively and positively valenced words as rated by individual participants, plotted against the Rumination score (RRS) of each participant. The linear equation is shown in the upper left of the figure

If rumination consumes cognitive resources, then MDDs may have lower STM capacities than do HCs. Operation span scoresFootnote 6 were found to be reliably lower for MDDs than for HCs, t(27) = 2.23, p < .05,Footnote 7 and there was a trend for spans to be negatively correlated with rumination scores (r = –.31, p = .10), all of which suggests that depressive rumination consumes cognitive resources. Although this analysis implies that rumination reduces cognitive resources, the reverse could also be true: reduced cognitive resources may predispose people to rumination. We cannot determine the direction of causality in these data, but this question warrants further consideration.

To check whether our directed-forgetting effects may be driven by these differences in operation span scores, we conducted two stepwise regressions. For the first, we used RTs for the lure – control contrast for negatively valenced words as the dependent measure, with depression status (MDD vs. HC) and operation span scores as potential regressors. Our criterion for inclusion was having a p value for the regressor of < .05, and our criterion for exclusion was having a p value of > .10. For this stepwise regression, only depression status was entered, and operation span scores were excluded. We performed a second stepwise regression with the same regressors but changed the dependent variable to the difference in RTs between the lure – control contrast for negatively valenced words and the lure – control contrast for positively valenced words. Again, only depression status was included, and operation span scores were excluded. Based on these analyses, the differences in the directed-forgetting task do not appear to be driven by differences in operation span scores.

While exploring differences in STM was of most interest in this experiment, there may also be some differences in long-term recall for MDDs versus HCs, since MDDs may remember more negatively valenced information, while HCs may remember more positively valenced information. At the conclusion of the experiment, participants were asked to recall as many words as possible, regardless of whether the words were to be remembered or to be forgotten (because words were repeated four times throughout the course of the experiment, a word could appear as a to-be-remembered item on one trial and a to-be-forgotten item on a later trial, or vice versa). MDDs recalled more negatively than positively valenced words (8.2 negative vs. 5.7 positive), and HCs recalled more positively than negatively valenced words (8.1 positive vs. 5.7 negative), producing a reliable Group X Valence interaction, F(1, 28) = 7.8, p < .01. These results suggest that MDDs latch onto negatively valenced information, while HCs latch onto positively valenced information. More work needs to be done to explore whether these recall effects are driven by current mood state, STM differences, or some combination of the two.

Finally, all of these behavioral effects were explored separately for our nonmedicated MDDs (9 participants) versus HCs. All of these behavioral effects were still found to be reliable, but the directed-forgetting three-way interaction was reduced to a trend, perhaps because the medicated MDDs had more severe depression or because of the reduced power due to the smaller sample size for this comparison.

fMRI results

The behavioral results indicate that MDD participants had more difficulty removing negatively valenced information from STM than did HC participants and that this inability may be related to rumination. The fMRI results were analyzed to uncover the neural mechanisms that accompany these differences, to gain a better understanding of why MDDs are deficient in removing negatively valenced information from the mind.

Similarities in magnitude of activation

For the lure – control contrast at the onset of the probe, both groups robustly activated a network that has been repeatedly implicated in interference resolution for verbal material, including LiFG, right inferior frontal gyrus (RiFG), dorsal anterior cingulate cortex (ACC), and left and right parietal cortex (see Nee et al., 2007; Zhang et al., 2003). A conjunction analysis was performed to quantify this overlap. The group-averaged t-statistical maps for the lure – control contrast were calculated separately for MDDsFootnote 8 and HCs and were conjoined (see fMRI Analysis Parameters in the Method section). The result of this conjunction analysis showed that MDDs and HCs both activated LiFG, RiFG, dorsal ACC, and left and right parietal cortex, all of which are involved in interference resolution for verbal material. The results of our conjunction analysis can be seen in Fig. 4. This conjunction verified that both groups activate highly overlapping neural networks in the service of interference resolution in short-term memory.

Fig. 4

Results of the conjunction analysis performed on the lure neg – control neg contrast for individuals with major depressive disorder and controls. This figure shows the overlap in LiFG and RiFG (the leftmost axial slice, z = 4, and the coronal slice, y = 28), the dorsal ACC and precuneus (the sagittal slice, x = –5), and the left and right parietal cortex with an additional overlap in bilateral middle frontal gyrus (rightmost axial slice, z = 51)

Of course, the behavioral data underscore the importance of valence. Therefore, the lure neg – control neg contrast was examined in the anatomically defined LiFG (from the WFU PickAtlas) ROI. There was no group difference when comparing the magnitudes of activation. Whole-brain analyses were also performed comparing the two groups for the lure neg – control neg and lure pos – control pos contrasts, but few group differences were found at conservative statistical thresholds. All of these results suggest that the two groups activate similar networks to the same magnitude when resolving interference from positive and negative information.

Differences in spatial variance

As noted above, few group differences were found when exploring activation magnitude. However, differences between groups may be present in the variance of activation, not just its magnitude. Therefore, spatial variance was compared between groups, specifically in LiFG. When the spatial variance of activations was examined within this ROI for each participant for the lure neg – control neg contrast, MDDs showed more spatial variance than did HCs, t(27) = 2.33, p < .05, and this group difference in spatial variance was still reliable when we excluded the medicated MDD participants, t(21) = 3.0, p < .01. This group difference in variance is presented in Fig. 5, which displays a three-dimensional rendering of the LiFG for both groups with activations superimposed in a wire-mesh plot. A marching cubes algorithm (Lorensen & Cline, 1987) was used to compute these three-dimensional wire-mesh representations of the contour curves, as implemented in the misc3d package in the statistical program R.

Fig. 5

Three-dimensional renderings of mean activations in LiFG, for MDDs on the left and HCs on the right, for the lure neg – control neg contrast. Deactivations are in blue, and activations are in yellow/red. From the figure, one can see that MDDs activate LiFG more diffusely, while HCs activate this region more focally, with activation clusters centering in lower to middle portions of the LiFG. Mean contrast values range from –1.5 (blue) to +1.5 (red)

Examining Fig. 5, one can see that HCs activate more focally in LiFG and MDDs activate more diffusely. Spatial variance in the ROI also correlated significantly with the behavioral difference for lure neg – control neg in RT (r = .60, p < .001).Footnote 9 Because of concerns that the correlation might be driven by two outlier points, we ran a robust regression, which down-weights outlying points, using the robustfit function in MATLAB (version R2008b) with the Huber weight function; this analysis still yielded a significant correlation (r = .35, p < .01). In addition, there was a trend for the spatial variance in the lure neg – control neg contrast to correlate positively with RRS scores, r = .34, t(29) = 1.9, p < .1.
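To illustrate how a Huber-weighted robust regression reduces the influence of outlying points, the sketch below fits a line by iteratively reweighted least squares. It is a simplified stand-in, not a reimplementation of MATLAB's robustfit: the function names and data are hypothetical, the residual scale is estimated as the median absolute residual divided by 0.6745, no leverage adjustment is applied, and the tuning constant of 1.345 is a conventional choice for the Huber function that is assumed here.

```python
from statistics import median


def weighted_line_fit(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y)
              for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def huber_line_fit(x, y, tuning=1.345, iterations=50):
    """Iteratively reweighted least squares with Huber weights."""
    weights = [1.0] * len(x)
    intercept, slope = weighted_line_fit(x, y, weights)
    for _ in range(iterations):
        residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        # Assumed scale estimate: median absolute residual / 0.6745.
        scale = median(abs(r) for r in residuals) / 0.6745
        if scale == 0:
            break
        cutoff = tuning * scale
        # Huber weights: 1 inside the cutoff, cutoff / |r| outside it.
        weights = [1.0 if abs(r) <= cutoff else cutoff / abs(r)
                   for r in residuals]
        intercept, slope = weighted_line_fit(x, y, weights)
    return intercept, slope


# Hypothetical data: points on y = 2x + 1 with one gross outlier.
xs = list(range(10))
ys = [2 * xi + 1 for xi in xs]
ys[-1] = 60
ols_intercept, ols_slope = weighted_line_fit(xs, ys, [1.0] * len(xs))
robust_intercept, robust_slope = huber_line_fit(xs, ys)
```

In this made-up example the ordinary least-squares slope is pulled well above 2 by the single outlier, whereas the Huber-weighted slope stays close to 2.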

The group difference in spatial variance was not observed throughout the brain; for example, the ACC (defined anatomically from the WFU PickAtlas) did not show group differences in spatial variance, t(27) = 1.8, n.s. In addition, the spatial variance difference does not appear for all contrasts: For the lure pos – control pos contrast, there were no group differences in spatial variance in LiFG, t(27) = 1.5, n.s. It appears, therefore, that there is some selectivity to the group difference in spatial variance, since it was observed more for the negatively valenced contrasts in LiFG, a region that is critical for this and for other interference resolution tasks.

To explore in greater detail how these group differences in variances within LiFG were manifested for the lure neg – control neg contrasts, a clustering analysis was conducted on the activation contrast values in LiFG. An unsupervised clustering algorithmFootnote 10 (Kaufman & Rousseeuw, 1990; run on Mathematica version 7.0, Wolfram Research Inc.) was performed on the group activation maps in LiFG and yielded more clusters for MDDs (seven) than for HCs (three). This analysis supports the claim of a wider distribution of activation for the MDDs, as Fig. 5 suggests visually. This analysis, in conjunction with the visual display of LiFG from Fig. 5, suggests that MDDs activate this region more diffusely than HCs and that the degree of variance in this region may affect the ability of MDDs to remove negative information from STM.

There is, however, an alternative explanation for the variance result: MDDs may not be uniformly more spatially varied than HCs but, rather, may activate a separate region (or separate regions) within LiFG that HCs do not activate. Notably, in Fig. 5, it appears that MDDs activate a superior portion of the LiFG that HCs do not activate. To test this possibility, we built two separate functional ROIs based on the group data for the lure – control contrast across all participants, one centered in the inferior portion of LiFG (x = –31, y = 31, z = 0) and one centered on the superior portion (x = –41, y = 6, z = 33). Group differences in spatial variance were tested in these two regions, as were group differences in mean activation; no reliable group differences were found for either ROI. In addition to the analysis of these functionally defined ROIs, a similar analysis was conducted using three anatomically defined ROIs for subregions within the LiFG (according to the WFU PickAtlas) that are listed from the most dorsal to the most ventral: pars opercularis, pars triangularis, and pars orbitalis. No group differences were found in any of these ROIs for mean activation differences or spatial variance, except for a reliable difference in spatial variance for the pars orbitalis, t(27) = 2.40, p < .05. For all of these analyses, there was a trend for MDDs to show more spatial variance than did HCs. In sum, these combined analyses suggest that MDDs do not activate a separate subarea within the LiFG.

Figure 6 provides an alternative representation of the data, which shows that MDDs are more uniformly varied across the entire LiFG ROI. This chromosome-style plot shows activation for both groups for the lure neg – control neg contrast at each slice through z (from 1 to 22), as shown from left to right across the x-axis of Fig. 6. Activation is shown for every x, y pair (the data are vectorized over x and y, but each x and y pair is represented separately) at each z-slice level, and each participant is represented by a different color. From Fig. 6, it is apparent that the spatial variance effect is present continuously through nearly all slices of z: MDDs show a larger range in contrast values (extending beyond +4 and –4), while HCs show a more restricted range.

Fig. 6

Chromosome-style plot of LiFG for the lure neg – control neg contrast. The y-axis shows the contrast score for each voxel in LiFG for each participant. The x-axis shows the different slices through z, from 1 to 22 (ventral to dorsal). MDDs are displayed in the top panel, and HCs are displayed in the bottom panel. The data are vectorized over x and y, but each x, y pair is represented separately, and each participant is represented by a different color, as shown in the legend. HCs’ contrast values, for the most part, stay between +4 and –4, while MDDs’ contrast values also extend above +4 and below –4. From the figure, it is apparent that MDDs show greater variance at nearly every slice through z, indicating that the variance effect appears continuous throughout the region of interest.

Differences in behavioral variance

These differences in neural spatial variance prompted us to explore group differences in behavioral variance. A 2 (group: MDD vs. HC) × 2 (trial type: lure vs. control) × 2 (valence: positive vs. negative) ANOVA was conducted with variance in RT as the dependent measure. Variance was calculated for each of the four trial types separately for each individual participant (i.e., within subjects). We found a reliable three-way interaction, in which variance differed by group, trial type, and valence, F(1, 28) = 5.082, p < .05. Follow-up analyses of this interaction yielded a significant group difference for lure neg variance (see Footnote 11), t(28) = 2.80, p < .01, but not for the other trial types. However, variance tends to increase with increasing means, so these variance differences for lure neg trials could reflect differences in mean RT rather than in variance per se. To explore this more fully, a coefficient of variation was calculated for lure neg trials for each participant: the standard deviation of that participant’s lure neg RTs divided by the mean of those RTs, which controls for mean differences. The coefficients of variation were then compared between the two groups, and a highly reliable difference was found in which MDDs showed a higher coefficient of variation for lure neg trials relative to HCs, t(28) = 3.5, p < .005. It appears, therefore, that the lure neg trials drive the group differences in RT variance. In addition, a reliable positive relation was found between variance for the lure neg – control neg contrast in RTs and the amount of spatial variance for the lure neg – control neg contrast, r(27) = .389, p < .05, indicating that the noisier the fMRI signal, the noisier the behavior, which reflects a tight coupling between neural and behavioral variance.
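The coefficient-of-variation step can be illustrated with a minimal Python sketch. It is offered only as an illustration, not as the analysis code used in the study. The function names, the participant labels, and the RT values are all hypothetical. The group comparison assumes a Student's two-sample t statistic with pooled variance, which matches degrees of freedom of the form n1 + n2 – 2 but is not confirmed by the text.

```python
from statistics import mean, stdev


def coefficient_of_variation(rts):
    """Return SD / mean for one participant's RTs in one condition.

    Uses the sample standard deviation (n - 1 denominator); this choice
    is an assumption, since the text does not state which was used.
    """
    if len(rts) < 2:
        raise ValueError("need at least two RTs to estimate a standard deviation")
    m = mean(rts)
    if m == 0:
        raise ValueError("mean RT is zero; coefficient of variation is undefined")
    return stdev(rts) / m


def pooled_t(group_a, group_b):
    """Student's two-sample t statistic (equal variances assumed) and its df."""
    na, nb = len(group_a), len(group_b)
    var_a, var_b = stdev(group_a) ** 2, stdev(group_b) ** 2
    pooled_var = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    standard_error = (pooled_var * (1 / na + 1 / nb)) ** 0.5
    return (mean(group_a) - mean(group_b)) / standard_error, na + nb - 2


# Hypothetical lure neg RTs in ms, one list per participant (not study data).
mdd_rts = [[700, 950, 1200, 650], [800, 1100, 600, 900], [750, 1300, 820, 640]]
hc_rts = [[700, 720, 760, 690], [810, 790, 850, 800], [650, 700, 680, 660]]

# One coefficient of variation per participant, then a between-group comparison.
mdd_cv = [coefficient_of_variation(rts) for rts in mdd_rts]
hc_cv = [coefficient_of_variation(rts) for rts in hc_rts]
t_stat, df = pooled_t(mdd_cv, hc_cv)
print(f"t({df}) = {t_stat:.2f}")
```

Because the standard deviation is divided by the mean, multiplying every RT for a participant by a constant leaves the coefficient unchanged, which is why the measure separates variability from overall slowing.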

Temporal variance

Group differences in temporal variance were also examined. MDDs may be more varied in their activations over time, which may lead to an overall inability to resolve interference for negatively valenced items. This analysis was restricted to the lure neg condition at the onset of the lure neg probe word, and data were taken from two repetition times (TRs) from each trial, beginning at the onset of the probe word. In addition, motion-related activity from the functional data was covaried out. These TR data were then concatenated, and the variance in activation for each voxel across time was calculated. Since LiFG was of most interest, the temporal variance of all voxels in LiFG was averaged and compared between the two groups. This analysis yielded no significant group differences in temporal variance in LiFG, t(27) = 0.66, suggesting that the groups did not differ in temporal variance for these lure neg trials of interest.
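The per-voxel computation described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the pipeline used in the study: the function name and the numbers are hypothetical, the input is assumed to be motion-corrected already, and the sample variance (n – 1 denominator) is an assumed choice.

```python
from statistics import mean, variance


def roi_temporal_variance(voxel_timeseries):
    """Average temporal variance across the voxels of one ROI.

    voxel_timeseries: one list per voxel, each holding that voxel's
    activation values concatenated over the selected TRs of all trials.
    """
    # Variance over time within each voxel, then the mean over voxels.
    per_voxel = [variance(series) for series in voxel_timeseries]
    return mean(per_voxel)


# Hypothetical participant: 3 voxels, 6 concatenated TR values each
# (two TRs from each of three lure neg trials).
participant_roi = [
    [0.2, 0.5, 0.1, 0.4, 0.3, 0.6],
    [1.0, 1.4, 0.9, 1.2, 1.1, 1.3],
    [-0.3, 0.0, -0.2, 0.1, -0.1, 0.2],
]
print(roi_temporal_variance(participant_roi))
```

Each participant then contributes a single value, and these values can be compared between groups with a two-sample t test.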

Summary

The two groups did not differ in activation magnitude, even when the analysis was restricted to the LiFG. The more pronounced effects were related to differences in the spatial variance of activation, where MDDs activated LiFG more diffusely than did HCs; this was related to their greater difficulty removing negative information from STM and was associated with greater behavioral variance in RTs. Finally, no group differences were found in temporal variance.

Discussion

In this study, MDDs experienced more difficulty than did HCs in removing negative, but not positive, information from STM. We extended the results of Joormann and Gotlib (2008) and Joormann et al. (2010) by relating behavioral findings directly to rumination and by examining the neural mediators of the directed-forgetting effects. MDDs also had lower operation span scores than did HCs, which corroborates hypotheses that depressive rumination consumes cognitive resources. Other investigators have shown that lower memory span scores are related to more task-unrelated thoughts (Kane et al., 2007) and less ability to suppress unwanted thoughts (Brewin & Smart, 2005). These results could indicate a harsh self-sustaining cycle for MDDs, in which ruminative thoughts lower STM spans by increasing the number of negative task-unrelated thoughts, which subsequently diminishes the ability to suppress such thoughts. While the MDD and HC participants differed in operation span scores, these differences did not drive our directed-forgetting results. Additionally, while rumination might reduce cognitive resources, the reverse relationship, in which reduced cognitive resources predispose people to depression, may also hold. Uncovering the direction of these relationships will be an important topic for future research.

Importantly, both MDD and HC participants activated similar neural networks to resolve interference. MDDs, however, had greater spatial variance in activation of LiFG than did HCs, even though both groups activated this region to similar magnitudes. These differences in spatial variance may indicate that MDDs are not as effective at using the mechanisms of this region to resolve the interference caused by negatively valenced conflicting probes, leading to their larger behavioral interference effect for this valence. In support of this noisy interference resolution process is the correlation that we found between spatial variance and behavioral variance for the lure neg – control neg contrast in RTs.

The physiological significance of our spatial variance result may be related to dopamine. As Bush (2010) mentions, increased “noise” in the fMRI signal may be attributable to decreased dopamine levels, which serve to dampen background neural firing noise. In fact, Winterer et al. (2006b) found that Val polymorphic carriers of the COMT gene (who have less available synaptic dopamine) show less mean activation (i.e., smaller magnitude) and more varied activation in prefrontal areas in a visual oddball task. These results suggest that dopamine helps to sharpen fMRI signals and suppress surrounding noise (Winterer et al., 2006b). A similar argument could be made and applied to depression. Investigators have posited that people suffering from MDD have lower dopamine neurotransmission levels than do HCs (e.g., Hasler et al., 2008). Therefore, the increased noise or variance that was found for MDDs relative to HCs may be a reflection of decreased dopamine levels, which may hinder the ability of MDDs to dampen background neural noise. Our effects were specific to negatively valenced stimuli, which may indicate that MDDs have more noise to suppress when negative information has entered STM, but they may not be able to suppress this noise due to a reduced level of dopamine. Decreased dopamine levels may not be as problematic when suppressing positive information because positively valenced information may not produce rumination/background noise of the sort that negative information does for MDDs. While these conjectures are admittedly speculative, rumination may be implicated as a causal factor in the increased noise/background neural firing for negatively valenced trials.

We should caution, however, that we are not claiming that spatial variance is the variable that distinguishes MDDs from HCs. Different tasks and designs may produce group differences in activation magnitude as well, as several investigators have shown (Elliott, Rubinsztein, Sahakian, & Dolan, 2002; Engels et al., 2010; Sheline et al., 2001; Siegle, Steinhauer, Thase, Stenger, & Carter, 2002; Matsuo et al., 2007). Adding spatial variance as a potential dependent measure, however, may have probative value, as it has had with our data.

We should note that there has been some controversy surrounding the directed-forgetting procedure and, more generally, whether memories can be suppressed at all (Anderson & Green, 2001). The most common directed-forgetting paradigms use a list method procedure in which participants are instructed to remember a list of words and are then given an instruction to forget that list and learn a new one. Participants are then tested by trying to recall as many words as they can from either list. There are unanswered questions concerning this task: whether directed forgetting affects recall (Bjork, LaBerge, & Legrand, 1968) and not recognition (Block, 1971; Elmes, Adams, & Roediger, 1970), or whether it affects both (Benjamin, 2006). More specifically, the debate centers on whether the processes of directed forgetting involve retrieval inhibition or simply selective rehearsal (Benjamin, 2006). While the time scales of these effects are outside the realm of our paradigm, one could still ask whether our group effects are due to impaired inhibitory mechanisms or impaired selective rehearsal. Unfortunately, our data cannot speak to this issue directly, and future research will be needed to determine whether the impaired process in MDD involves insufficient levels of inhibition, rehearsal, or both. What our data do suggest is that such an exploration should focus on negatively valenced stimuli.

In sum, although MDDs have the ability to resolve interference in STM, they have more difficulty in resolving interference for negative than for positive stimuli, and this difficulty is related to their propensity to ruminate. As such, the inability to remove negative information from STM may be a mechanism or mediator of rumination. Moreover, MDDs do not activate LiFG as efficiently/focally as do HCs in resolving interference for negative stimuli, which appears to be related to their behavioral difficulties and increased behavioral variance for negative material. In turn, depression may be perpetuated by difficulty in suppressing negative stimuli, leading to continued depressive rumination.