Introduction

Many studies investigating human emotion rely on tasks that require verbal stimulus material. Prominent examples are the emotional Stroop (Dresler, Mériau, Heekeren, & van der Meer, 2009; Phaf & Kan, 2007; Thomas, Johnstone, & Gonsalvez, 2007), the recognition memory test (Grider & Malmberg, 2008; Võ et al., 2008; Zimmermann & Kelley, 2010), the lexical decision task (LDT, Kuchinke, Võ, Hofmann, & Jacobs, 2007; Schacht & Sommer, 2009; Scott, O’Donnell, Leuthold, & Sereno, 2009), naming (Estes & Adelman, 2008; Simpson, Snyder, Gusnard, & Raichle, 2001), verb generation (e.g., Simpson et al., 2001), or word-stem completion (Danion, Kauffmann-Muller, Grangé, Zimmermann, & Greth, 1995). However, because numerous variables are known to influence visual word processing (Graf, Nagler, & Jacobs, 2005), well-controlled and reliable emotion-inducing stimulus material is necessary in order to produce interpretable effects. In most cases, researchers depend on published norm lists, providing reproducible stimulus characteristics.

Studies using English, for example, most often use the Affective Norms for English Words (ANEW; Bradley & Lang, 1999) list. ANEW provides norms for 1,034 nouns, verbs, and adjectives, characterized along the three affective dimensions of valence (indicating the positivity or negativity of a stimulus), arousal (indicating the excitement), and dominance (indicating the feeling of being in control versus being controlled) (Osgood, Suci, & Tannenbaum, 1957). All three dimensions have been shown to strongly influence human behavior (e.g., Hess, Adams, & Kleck, 2005; Larsen, Mercer, Balota, & Strube, 2008; Thomas & Hasher, 2006), and their neural correlates have been examined in several imaging studies (e.g., Anders, Eippert, Weiskopf, & Veit, 2008; Lewis, Critchley, Rotshtein, & Dolan, 2007; Nielen et al., 2009; Steinmetz & Kensinger, 2009). In most recent research, this three-dimensional affective space model has been reduced to a two-dimensional model that relies solely on the valence and arousal dimensions (Bradley & Lang, 2000; Russell, 2003).

Affective dimensions are only one way to conceptualize emotion, however. To allow for a more complete view of the matter, supplemental material for the ANEW was recently published. Stevenson, Mikels, and James (2007) implemented discrete emotion norms for happiness, anger, fear, disgust, and sadness into the ANEW database based on classic discrete emotion models, as originally suggested by Charles Darwin (1872). Discrete emotions are a second major approach to conceptualize the affective space. Recent brain stimulation studies in animals have demonstrated emotion-specific behavioral responses to stimulation of predefined brain regions, thereby supporting the discrete emotion approach (e.g., Panksepp, 1998, 2006). Discrete emotion effects have mainly been shown in emotion recognition from facial expressions (Campbell & Burke, 2009; Elfenbein, Beaupré, Lévesque, & Hess, 2007; Seidel, Habel, Kirschner, Gur, & Derntl, 2010) or static pictures (see Mikels et al., 2005). Some word processing studies have documented response time (RT) effects as well (Armstrong et al., 2009; Parrott, Zeichner, & Evces, 2005). These, however, did not use published discrete emotion norms and concentrated on contamination-phobic participants (Armstrong et al., 2009) and on participants with high trait anger (Parrott et al., 2005). Thus, further experimental investigations are needed using combined approaches, given that both discrete emotion and dimensional theories share a great overlap in explanational value (Reisenzein, 1994).

Because of ANEW’s success, norm lists for emotional words have been collected in non-English-speaking countries as well. Prominent examples are the Spanish adaptation of ANEW (Redondo, Fraga, Padrón, & Comesaña, 2007), the Finnish and British English word list (Eilola & Havelka, 2010), and BAWL (see Võ et al., 2006), which was recently revised (BAWL-R, see Võ et al., 2009) and now contains norms on the dimensions valence and arousal for more than 2,900 German words, including nouns (2,107), verbs (504), and adjectives (291). Norms for discrete emotions, however, which would allow for a broader focus on emotional word processing in non-English-speaking populations, are not yet available in any other language.

Following Stevenson et al. (2007), the present study aimed to provide researchers with a list of reliable discrete emotion norms for German nouns, hereafter referred to as the DENN–BAWL. In a first step, ratings on the exact same discrete emotion categories that were used to supplement the ANEW (namely happiness, anger, fear, disgust, and sadness) were collected to supplement the BAWL.

Brain stimulation studies, which have identified the neurobiological systems eliciting these emotions (namely the PLAY system, the RAGE system, the FEAR system, the DISGUST system and the PANIC system; see Panksepp, 1998, 2006; Toronchuk & Ellis, 2007a, 2007b), provide strong evidence that at least some discrete emotions are not culture specific (Ekman & Friesen, 1971; Wierzbicka, 1986), but are found universally, even in different mammalian species. Thus, it is believed that the DENN–BAWL would not only allow investigations with German-speaking participants, but may also trigger broader, cross-cultural comparisons, given that norms are available in two languages, and given the likely universal neurobiological basis of the discrete emotion effects.

In a second step, the present study aimed at demonstrating that the discrete emotion ratings collected with a German-speaking population actually could account for substantial variance in behavioral measures of single-word processing. The LDT was chosen for this purpose, since it is the only verbal-stimuli-based task that has been shown to be affected by both emotional dimensions (e.g., Kuchinke et al., 2007; Schacht & Sommer, 2009; Scott et al., 2009) and discrete emotions (Armstrong et al., 2009; Parrott et al., 2005).

When participants are asked to indicate via button press whether a presented letter string is a correct word (e.g., taxi) or a nonword (e.g., tafi), positive and highly arousing negative stimuli are known to facilitate processing, whereas low-arousing negative stimuli are processed more slowly than matched neutral words (Hofmann, Kuchinke, Tamm, Võ, & Jacobs, 2009). Concerning discrete emotions, happiness (representing positive valence) and fear (representing negative valence) were chosen for the LDT.

When comparing RTs and error rates (ERR) for words rated as “highly happiness related” (high hap condition) with those for words not rated as “highly happiness related” (low hap condition), words in the high hap condition were expected to be processed faster than words in the low hap condition, on the basis of previous findings for dimensional emotion studies (Kuchinke et al., 2007; Schacht & Sommer, 2009; Scott et al., 2009). Similarly, the processing of words rated as “highly related to fear” (high fear condition) was compared with the processing of words rated as “not highly related to fear” (low fear condition). Concerning this manipulation, a precise prediction seems difficult, considering that investigations of discrete emotion intensities are rare (Reisenzein, 1994) and inconclusive.

Finding any effect of either fear or happiness intensities on either RT or ERR when the stimulus material is controlled for mean valence and arousal norms would document that the collected discrete emotion norms could capture variance that is not captured by the standard scales of valence and arousal alone. Moreover, it would encourage further investigation of discrete emotion effects using the DENN–BAWL.

Rating methods

Participants

A total of 79 native German participants (53 female; mean age = 24.3, SD = 6.2, range = 18 to 61) recruited via email lists, a notice posted on campus, and in experimental psychology classes at the Free University Berlin participated in this study. They were offered either course credit or 5 Euros for participation. Some participants took part voluntarily, without recompense.

Material and procedure

In order to collect discrete emotion norms, all 1,958 nouns that were 4–8 letters in length from the BAWL-R (Võ et al., 2009) were selected and subdivided into nine lists containing 200 items and into one list containing 158 items. Ratings were collected via an Internet-based html script running on a public server provided by the Free University Berlin (for a discussion on Internet experiments, see Birnbaum & Reips, 2005).

Participants were first instructed to carefully read the presented word and then indicate on five independent 5-point Likert scales the intensity of the elicited feelings of happiness, anger, fear, sadness, and disgust (1 = low intensity, 5 = strong intensity). Each word was presented individually in black uppercase letters (font type Times New Roman, font size 18 pt) on a white background. Participants were able to individually decide when to advance to the next trial by clicking on a button. Word order was randomized for each subject. Participants were explicitly allowed to rate more than one of the 10 different stimulus lists, resulting in an average of 2.7 lists rated per subject (SD = 2.7, range 1–10). Online ratings were then averaged offline per item and per discrete emotion category using JMP software (Version 7, SAS Institute Inc., Cary, NC). Each word received ratings from at least 20 different participants.

Rating results

The stimulus list resulting from the rating procedure, including the averaged ratings, the respective standard deviations, and an assignment of single words to specific discrete emotion categories, can be downloaded at www.fu-berlin.de/allgpsy/DENN-BAWL. Altogether, 1,104 words received a higher rating in happiness than in any other discrete emotion variable, and thus were labeled as happiness-related words in the list (e.g., Sonne [“sun”]; see column “BasicEmoCat liberal” in the supplementing materials). Using the same logic, 384 words were labeled as anger-related words (e.g., Zorn [“rage”]), 261 words were fear related (e.g., Lawine [“avalanche”]), 125 words received the highest rating in disgust (e.g., Schleim [“slime”]), and 43 words were classified as sadness related (e.g., Abschied [“parting”]). Additionally, a more conservative classification criterion was applied, assigning words to a specific discrete emotion category only in cases in which the averaged rating in one discrete emotion was more than one standard deviation higher than in any other discrete emotion (see column “BasicEmoCat conservative” in the supplementing materials).
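The two classification rules can be illustrated with a minimal sketch. It rests on assumptions the text leaves open: the "one standard deviation" margin is taken to be the SD of the top-rated emotion's own ratings, ties are left unlabeled, and the function names and example values are hypothetical (they are not taken from the DENN–BAWL file).

```python
from typing import Dict, Optional

EMOTIONS = ("happiness", "anger", "fear", "disgust", "sadness")


def liberal_category(means: Dict[str, float]) -> Optional[str]:
    """Return the emotion with the strictly highest mean rating.

    Returning None on a tie is an assumption; the source does not say
    how ties were handled.
    """
    ranked = sorted(EMOTIONS, key=lambda e: means[e], reverse=True)
    top, runner_up = ranked[0], ranked[1]
    return top if means[top] > means[runner_up] else None


def conservative_category(means: Dict[str, float],
                          sds: Dict[str, float]) -> Optional[str]:
    """Label a word only if the top mean exceeds every other mean by more
    than one SD. Assumption: the SD used is that of the top-rated emotion.
    """
    top = liberal_category(means)
    if top is None:
        return None
    runner_up = max(means[e] for e in EMOTIONS if e != top)
    return top if means[top] - runner_up > sds[top] else None


# Invented example ratings on the 1-5 scale (illustration only).
example_means = {"happiness": 4.1, "anger": 1.1, "fear": 1.2,
                 "disgust": 1.0, "sadness": 1.3}
example_sds = {"happiness": 0.9, "anger": 0.3, "fear": 0.4,
               "disgust": 0.1, "sadness": 0.5}
print(liberal_category(example_means))
print(conservative_category(example_means, example_sds))
```

Under these assumptions, a word can carry a liberal label without a conservative one, which is why the two columns in the supplementing materials may differ.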

Correlational analyses with the discrete emotion ratings and the valence and arousal scores taken from the BAWL-R revealed a highly significant positive relationship between happiness ratings and valence, as well as between the four negative discrete emotions and arousal. A negative correlation was found between happiness and arousal, between happiness and the other discrete emotion variables, and between the negative discrete emotions and valence. Correlations and some descriptive statistics describing the rating data can be found in Table 1.

Table 1 Correlational analyses for discrete emotions, valence (Val) and arousal (Arou), including descriptive statistics

Lexical decision task methods

Participants

An additional 20 native German participants (14 female; 18 right-handed; mean age = 24.4, SD = 4.0, range = 19 to 36) recruited at the Free University Berlin participated in this study. Some of them received course credit for participation; others participated without recompense.

Materials

Stimulus material consisted of 175 nouns taken from the ratings described previously and of an equal number of nonwords, as described below. Within the word set, five conditions (high hap, low hap, neutral, high fear, and low fear) were constructed, each containing 35 items that were four to six letters in length. Neutral words had valence ratings lying between −0.5 and +0.5, according to the BAWL-R. Negative words (high- and low-fear conditions) had a valence rating below −1, and positive words (high and low hap conditions) had a valence rating above 1. All three valence conditions were matched on number of letters, syllables, phonemes, orthographical neighbors, frequency, and averaged bigram frequency using an ANOVA (p > .1). Estimates were taken from the BAWL-R.

Positive and negative categories were split into nonoverlapping halves. High-fear condition stimuli had a fear score above 2.6 (mean fear = 2.92) and were matched to low-fear stimuli (fear score below 2.6, mean fear = 2.16) on valence, arousal, happiness, sadness, anger, disgust, frequency, imageability, bigram frequency, number of letters, syllables, phonemes, and orthographical neighbors using a t test (all ts < 1, all ps > .3). Both conditions significantly differed in fear, t(63.6) = −12.588, p < .001. High hap stimuli had happiness scores above 2.6 (mean happiness = 3.24) and were matched to low hap stimuli (happiness score below 2.6, mean happiness = 2.19) on valence, arousal, fear, sadness, anger, disgust, frequency, imageability, bigram frequency, number of letters, syllables, phonemes, and orthographical neighbors using a t test (all ts < 1, all ps > .3). Both conditions significantly differed in happiness, t(59.7) = −4.315, p < .001. Discrete emotion ratings were taken from the online rating described previously; all other estimates were taken from the BAWL-R. An overview of the stimulus characteristics is given in Table 2.
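The fractional degrees of freedom reported for the manipulation checks suggest an unequal-variance (Welch) t test, although the text does not name the variant; this is an inference. As a minimal sketch of that computation, the function below returns the Welch t statistic and the Welch–Satterthwaite degrees of freedom. The function name and the example data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for an unequal-variance t test on two samples."""
    # Squared standard errors of the two sample means.
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Invented ratings for two small groups of words (illustration only).
low_group = [1.0, 2.0, 3.0, 4.0]
high_group = [2.0, 4.0, 6.0, 8.0]
t_value, df_value = welch_t(low_group, high_group)
print(round(t_value, 3), round(df_value, 2))
```

With two groups of 35 items, the Welch degrees of freedom cannot exceed 68, so the reported values of 63.6 and 59.7 are in the plausible range for this test.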

Table 2 Mean stimulus characteristics

Nonwords were created by taking an additional 175 words that were four to six letters in length from the BAWL-R and replacing one or two letters (vowels with vowels, and consonants with consonants), thus creating pronounceable but meaningless letter strings. They did not differ from words in length and number of syllables in a t test (all ts < 1, all ps > .3).
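The letter-replacement rule can be sketched as follows. This is an illustration only: the source does not describe how replacement positions or letters were chosen, so random selection is an assumption, as are the function name and the letter inventories. The sketch assumes words made of the letters A to Z plus umlauts (no ß), and its output is only a candidate that would still need to be checked for pronounceability and for not being an existing word.

```python
import random

VOWELS = "AEIOUÄÖÜ"
CONSONANTS = "BCDFGHJKLMNPQRSTVWXYZ"


def make_nonword_candidate(word, n_replacements=1, rng=None):
    """Replace n letters of `word`, keeping vowels vowels and consonants consonants."""
    rng = rng or random.Random()
    letters = list(word.upper())
    # Pick distinct positions to alter (random choice is an assumption).
    positions = rng.sample(range(len(letters)), n_replacements)
    for pos in positions:
        pool = VOWELS if letters[pos] in VOWELS else CONSONANTS
        # Exclude the original letter so the string always changes.
        letters[pos] = rng.choice([c for c in pool if c != letters[pos]])
    return "".join(letters)


# One-letter replacement, in the spirit of the taxi/tafi example.
print(make_nonword_candidate("TAXI", n_replacements=1, rng=random.Random(1)))
```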

Procedure

Participants sat in a quiet room in front of a 15-in. laptop screen. They were instructed to decide as fast and as accurately as possible whether they were presented a correct German word or a nonword. Decisions were made using left and right index fingers, lying on the respective SHIFT buttons. The button-to-response assignment was counterbalanced across participants. After nine practice trials (not belonging to the stimulus set and therefore excluded from any analysis), the experimenter left the room provided that participants did not have further questions.

Stimuli were presented by Presentation 9.9 software (Neurobehavioral Systems Inc., Canada) in randomized trial order in the center of the screen, using black uppercase letters (font type Arial, size 24 pt, ~ 0.57° vertical visual angle) on a blank white screen. Each trial began with a fixation cross (+) presented for 500 ms in the center of the screen, followed by the stimulus (500 ms) at the exact same position and another fixation cross, which was presented until the button press. Then, the next trial began.

Data preparation

Error-free mean RTs were calculated for each condition and for each participant. Trials with responses that were faster or slower than the individual mean RT ±2 SDs were excluded as outliers (5.5%). For error analyses, behavioral errors were summed up per participant and condition. One participant was excluded from all analyses, having committed 38% behavioral errors. The remaining participants committed 6.5% errors, on average. All analyses were computed using SPSS software (Version 13.0, SPSS Inc., USA) at an a priori significance level of .05.
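The outlier criterion can be illustrated with a short sketch. It assumes that the mean and the sample SD are computed over all correct trials of one participant; whether the original analysis trimmed per condition or across conditions is not stated. The function name and the example RTs are hypothetical.

```python
from statistics import mean, stdev


def trim_rts(rts, n_sd=2.0):
    """Keep RTs within mean +/- n_sd sample SDs of one participant's correct trials."""
    m, s = mean(rts), stdev(rts)
    lower, upper = m - n_sd * s, m + n_sd * s
    return [rt for rt in rts if lower <= rt <= upper]


# Invented RTs in ms for one participant; the last trial is an extreme value.
example_rts = [600, 620, 640, 660, 680, 700, 1500]
print(trim_rts(example_rts))
```

In this invented example the 1,500-ms trial falls above the upper bound and is dropped, while the remaining six trials are retained.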

Lexical decision task results

The results are summarized in Fig. 1. A repeated measures ANOVA over all five conditions (high fear, low fear, neutral, high hap, low hap) revealed a significant main effect in RTs, F(4, 72) = 3.766, p = .008, \( \eta_p^2 \) = 0.173. Planned pairwise comparisons using matched pairs t tests revealed faster responses to words in the high hap condition (M = 681 ms, SD = 142 ms) when compared with those in the low hap condition (M = 699 ms, SD = 145 ms), t(18) = −2.272, p = .036; with neutral words (M = 707 ms, SD = 137 ms), t(18) = −3.248, p = .004; with those in the high-fear condition (M = 702 ms, SD = 141 ms), t(18) = −3.989, p = .001; and with those in the low-fear condition (M = 699 ms, SD = 132 ms), t(18) = −3.015, p = .007. High- and low-fear conditions, however, did not differ from each other or from neutral words in RT (p > .05).
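As a consistency check on the reported effect size, partial eta squared can be recovered from an F value and its degrees of freedom via the standard identity \( \eta_p^2 = F \cdot df_{effect} / (F \cdot df_{effect} + df_{error}) \). The sketch below applies it to the RT main effect; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# RT main effect reported above: F(4, 72) = 3.766.
print(round(partial_eta_squared(3.766, 4, 72), 3))  # 0.173
```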

Fig. 1 Response times (in milliseconds) and mean sum of errors for the lexical decision task. Error bars indicate the respective standard errors

For the ERR, a repeated measures ANOVA over all five conditions (high fear, low fear, neutral, high hap, low hap) also revealed a significant main effect, F(4, 72) = 7.444, p < .001, \( \eta_p^2 \) = 0.293. Planned pairwise comparisons using matched pairs t tests revealed more errors in the low-fear condition (mean ERR = 3.6, SD = 2.1) than in the high-fear condition (mean ERR = 2.3, SD = 1.7), t(18) = 4.301, p < .001; the low hap condition (mean ERR = 2.2, SD = 2.2), t(18) = 3.369, p = .003; and the high hap condition (mean ERR = 1.5, SD = 1.4), t(18) = 5.128, p < .001. Additionally, neutral words (mean ERR = 3.2, SD = 1.6) were processed with more errors than were high fear, t(18) = 2.178, p = .043; high hap, t(18) = 3.580, p = .002; and low hap, t(18) = 2.109, p = .049, words.

Discussion

Much research on emotions has been done using lexical stimuli in the past, relying on the two- or three-dimensional affective space model (e.g., Bradley & Lang, 2000; Russell, 2003) and the norms provided by the ANEW (Bradley & Lang, 1999) or the BAWL (Võ et al., 2006). In order to investigate discrete emotion effects on single-word processing, however, researchers had to collect stimulus data on their own, since discrete emotion norms were not available (Armstrong et al., 2009; Parrott et al., 2005). This changed with the publication of supplementing norms for ANEW (Stevenson et al., 2007). Unlike dimensional norms, which are also available in Spanish (Redondo et al., 2007), Finnish, British English (Eilola & Havelka, 2010), and German (Võ et al., 2006, 2009), discrete emotion norms are currently available only in English. The purpose of the present study was to amend this by providing discrete emotion norms for German nouns. Moreover, an LDT was used to document the usefulness of the collected norms and to experimentally investigate the influence of different fear and happiness intensities on RT and ERR.

The complete DENN–BAWL, containing almost 2,000 German nouns that are four to eight letters in length, can be downloaded from www.fu-berlin.de/allgpsy/DENN-BAWL. Descriptive statistics and bivariate correlations between the discrete emotion norms for happiness, anger, fear, disgust, and sadness are presented in Table 1. The bivariate correlations between the discrete emotion norms and the valence and arousal norms taken from the BAWL-R replicate previous findings by Stevenson et al. (2007). Interestingly, the present data further reveal a negative correlation between German happiness and arousal norms, which was not observed for English norms. Whether this finding is related to increased statistical power in the present study due to a larger stimulus set or whether it reflects cross-cultural differences in emotional language processing (e.g., Redondo et al., 2007) cannot be answered from the present results, but it raises an interesting question that should be addressed in future studies.

In addition to providing discrete emotion norms, the present study also demonstrates that discrete emotion variables account for significant variance in human LDT performance. Both RT and ERR were affected by the manipulations of happiness and fear intensity (see Fig. 1), despite the fact that the stimulus material was controlled for emotional valence and arousal as given by the BAWL-R norms (see Table 2). Specifically, high hap stimuli were correctly recognized significantly faster than words in any other condition, including low hap words, which differed from high hap stimuli only in their mean happiness score. Such an acceleration in lexical processing is typically observed when stimuli are manipulated on positive valence (Kuchinke et al., 2007; Schacht & Sommer, 2009; Scott et al., 2009), which, in this study, was controlled between high hap and low hap stimuli. Thus, facilitated processing is related to happiness even beyond the normative measures of positive valence.

The second manipulation, concerning fear intensity, did not affect RT but was found to affect ERR, in contrast with the initial hypotheses. Negative valence per se has been reported either to facilitate lexical decisions (e.g., Nakic, Smith, Busis, Vythilingam, & Blair, 2006) or, when controlled for arousal measures, to slow down RTs. In the present study, high- and low-fear stimuli were controlled for both valence and arousal measures, which may explain the absence of an effect in the RTs. However, an effect in the errors was still observed.

The rather moderate manipulation of fear intensity may also have contributed to this result. High-fear and low-fear conditions, although nonoverlapping, differed in mean fear intensity by only 0.76 points (2.92 vs. 2.16). Future studies are needed to investigate whether the reported relationship between fear intensity and RT is mediated by arousal, as was indicated by the present study.

Despite the missing RT effect, fear intensity variation significantly influenced lexical decision accuracy. Participants committed significantly fewer errors when presented with words in the high-fear condition than when presented with words in the low-fear condition. Just as with the happiness intensity effect, this difference in ERR indicates that discrete emotion intensities influence single-word processing beyond the previously discussed effects of the dimensional affective space accounts. Previous studies proposed that emotion-related effects in single-word processing are caused by automatic evaluation (Murphy & Zajonc, 1993; Pratto & John, 1991), interfering with the actual task. Additionally, language is thought to be of special importance, since it is supposed to serve as a context for emotion perception (Barrett, Lindquist, & Gendron, 2007). The ERR effect for fear intensity manipulations and especially the RT effect for happiness intensity manipulations reported in the present study are in line with the automatic evaluation approaches (Murphy & Zajonc, 1993; Pratto & John, 1991), documenting that discrete emotions, just like affective space dimensions, affect lexical processing even when the affective information is irrelevant for the processing of the task. Accordingly, contextual learning proposed by Barrett et al. (2007) seems to be more emotion specific than previously considered in the word processing literature, in which dimensional theories dominate. Finally, since these results were achieved despite the control for valence and arousal variables, the present study documents the additional predictive power of discrete emotions, and in particular the DENN–BAWL norms, over and above emotional dimensions, as was suggested by Stevenson et al. (2007).

Future uses

The DENN–BAWL was collected to allow for a broader perspective when investigating emotion effects with verbal stimuli in the German language. Thanks to the BAWL and the BAWL-R, main effects of valence and arousal on word processing, as well as their interactions, are well documented (Hofmann et al., 2009; Kuchinke et al., 2007), and some of their associated electrophysiological and neuroanatomical correlates have been investigated (Hofmann et al., 2009; Kuchinke et al., 2005). As can be seen from the present study, the two-dimensional approach may be challenged when investigating discrete emotion categories.

In providing the DENN–BAWL to other researchers in the field, it is hoped that discrete emotions will be investigated systematically to increase knowledge of discrete emotion effects in single-word processing. Still, several questions remain unanswered. Do fear-related responses, which, from an evolutionary perspective, should lead to withdrawal behavior, behaviorally differ from anger-related responses? Are happiness and sadness indeed antagonistic emotions, as folk theory suggests? The supplements for ANEW and the DENN–BAWL enable researchers to investigate such questions, and they allow the transfer of knowledge to other cognitive domains, such as recognition memory.

In addition to questions focusing on discrete emotions alone, combined studies are possible, investigating the potential interdependencies of both dimensional and discrete emotion approaches. How do discrete emotion intensities affect LDT performance when valence and arousal are controlled? The present study provides the first evidence in favor of discrete emotion effects for happiness and fear, which leads to speculation about similar emotion-specific effects for anger, disgust, or sadness, as well as interactions between them.

The DENN–BAWL hopefully helps to answer at least some of these questions and to successfully stimulate further research on emotion.