Brain Research

Volume 1151, 2 June 2007, Pages 107-118

Research Report
When emotional prosody and semantics dance cheek to cheek: ERP evidence

https://doi.org/10.1016/j.brainres.2007.03.015

Abstract

To communicate emotionally, a listener must understand not only a verbal message but also the emotional prosody that accompanies it. So far, the time course and interaction of these emotional ‘channels’ are still poorly understood. The current set of event-related brain potential (ERP) experiments investigated the interactive time course of emotional prosody with emotional semantics, and of emotional prosody independent of emotional semantics, using a cross-splicing method. In a probe verification task (Experiment 1), prosodic expectancy violations elicited a positivity, while combined prosodic–semantic expectancy violations elicited a negativity. Comparable ERP results were obtained in an emotional prosodic categorization task (Experiment 2). The present data thus reveal ERP responses with distinct time courses and topographies as a function of prosodic expectancy and of combined prosodic–semantic expectancy. These differences suggest that the interaction of more than one emotional channel facilitates subtle transitions in an emotional sentence context.

Introduction

Human communication entails many facets. One of them is to convey the emotional state of a speaker. This requires the listener to integrate a number of emotional information sources such as semantics, facial, postural, gestural and vocal expressions within a short time frame to derive a proper interpretation. Thus, emotional comprehension depends on how successfully one integrates and evaluates verbal and non-verbal emotional cues. This is particularly relevant when these cues are ambiguous. To say, “I am really happy” with an angry tone of voice, for example, signals that emotional prosody and semantics do not have to “dance cheek to cheek”. Therefore, it is of special interest to investigate the relative contribution of each of these channels to understand their interaction as an utterance unfolds in time.

Here, we focus on two emotional channels, prosody and semantics. Emotional prosody is the non-verbal vocal expression of emotion and carries salient acoustic–phonetic cues (i.e., F0, duration, and intensity). Emotional semantics convey verbal information that allows the listener to derive meaning from an utterance and may or may not differ from semantics in general (Hermans et al., 1994, Fazio et al., 1986). While there is ample and controversial discussion on how and when these emotional channels interact, there is general agreement that emotional processing may be highly automatic due to its evolutionary significance (Schupp et al., 2004a). In order to understand the time course and processing nature of these channels individually and in an interactive manner, it is crucial to study the processes as they unfold in time. ERPs are an excellent tool to do so as they provide high temporal resolution.

Several ERP components have been identified as correlates of non-emotional verbal and non-verbal information processing. For example, the integration of meaning in a sentence is linked to the well-known N400 component across a variety of domains (for recent reviews, see Van Petten and Luka, 2006, Kutas and Federmeier, 2000), while prosodic information processing has been linked to a number of positivities such as the closure-positive shift (CPS; Steinhauer et al., 1999) and the P800 (Astésano et al., 2004). Overall, effortful meaning integration elicits a negativity (N400), while prosodic reanalysis or reprocessing has yielded a number of positivities (CPS, P800). As these ERP components differ morphologically, it has been suggested that components related to these processes are functionally distinct.

In the context of emotion processing, both positivities and negativities are reported for different modalities (e.g., Schirmer et al., 2005, Wambacq and Jerger, 2004, Bostanov and Kotchoubey, 2004, Schirmer and Kotz, 2003, Schupp et al., 2004a, Schupp et al., 2000, Carretié et al., 1996). For example, investigations of auditory emotional word processing (Wambacq and Jerger, 2004) and of emotional picture processing (e.g., Schupp et al., 2004a, Schupp et al., 2004b) reported positivities with varying latencies as a function of task. In particular, Schupp and colleagues (2004a) related the positivity to attention-regulated motivation, not unlike the P300 elicited to rare, unexpected stimuli (e.g., Donchin and Coles, 1988). However, Schirmer and Kotz (2003) reported an N400 to incongruent words in an auditory emotional word Stroop task. Furthermore, using prosody as context, Schirmer et al., 2002, Schirmer et al., 2005 found an N400 elicited by visual targets in a cross-modal priming paradigm. Participants listened to emotionally intoned sentences with neutral semantics followed by a matching or mismatching emotional visual target word. Mismatching visual targets elicited a larger N400 than matching targets. This effect was qualified by the listeners' sex as a function of stimulus onset asynchrony (SOA) in a lexical decision task (LDT; Schirmer et al., 2002) but not in a combined LDT/prosodic–semantic matching task (Schirmer et al., 2005). Crucially, when emotional processing is attended, both an N400 and positivity are reported (see Schirmer et al., 2005). Thus, it appears that task demands influence the respective processing pattern as well as the interaction of emotional prosody with semantics. However, while cross-modal priming is a valuable and controlled approach to study the interaction of processes, it does not reflect the temporal dynamics of interactive processes as they unfold in time. 
Therefore, it is of special interest to investigate the online contribution of respective emotional information in order to understand their interaction in an utterance.

Adapting an approach previously used by Steinhauer et al. (1999) and Astésano et al. (2004), we extended the method of cross-splicing auditory signals to investigate the temporal unfolding of emotional prosodic processing and of combined emotional prosodic/semantic processing after expectancy violations. One major challenge in investigating any interaction between prosody and language-specific functions is that they are not synchronized (e.g., Marslen-Wilson et al., 1992). While prosody consists of suprasegmental parameters that extend over time, syntax and semantics can be processed as local phenomena (e.g., Eckstein and Friederici, 2006). To specify whether transitions from neutral prosody into emotional prosody in a neutral semantic context differ from transitions from neutral prosody into emotional prosody in an emotional semantic context, we applied cross-splicing to induce comparable transitions in a temporally and acoustically controlled manner. Given the sensitivity of the ERP measure to latency jitter, this procedure also allows us to synchronize the interaction of emotional prosody and emotional semantics comparably to the interaction of emotional prosody with neutral semantics. As the temporal unfolding of sentence prosody depends on the continuous integration of primary acoustic parameters (e.g., perceived pitch, duration, and intensity), prosodic expectancy builds up over the course of an utterance. To this end, the cross-splicing approach allows us to violate prosodic expectancy, which should elicit a positivity comparable to previously reported positivities after prosodic expectancy violations (Steinhauer et al., 1999, Astésano et al., 2004). What remains to be shown is whether transitions into emotional prosody elicit a morphologically different positivity given the emotional quality of the signal, and whether the predicted positivity varies by valence (angry, happy).
Furthermore, parallel transitions into combined emotional prosody and emotional semantics may be similarly constrained by context (i.e., acoustic correlates in concert with semantics) as semantic expectancy in a sentence (e.g., Van Petten et al., 1999). Thus, a combined expectancy violation should elicit a negativity that may be comparable to the well-known N400 and has been reported in emotional cross-modal paradigms (Schirmer et al., 2002, Schirmer et al., 2005). This effect could be biphasic, consisting of a negativity and a positivity as both prosodic and semantic expectancy are violated, but may be influenced by task demands (see Schirmer et al., 2005).
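At the signal level, the cross-splice described above amounts to joining the pre-boundary portion of one recording to the post-boundary portion of another at a common sample point. A minimal sketch in Python/NumPy (an illustration only: the function name and the optional cross-fade are our assumptions, since the study's stimuli were prepared in Praat and the exact splicing parameters are not reproduced here):

```python
import numpy as np

def cross_splice(carrier, continuation, splice_sample, ramp_samples=0):
    """Join the pre-splice portion of `carrier` to the post-splice
    portion of `continuation` at a shared boundary sample.

    A short linear cross-fade (ramp_samples > 0) can be applied at the
    boundary to avoid an audible click at the splice point.
    """
    head = carrier[:splice_sample].astype(float)
    tail = continuation[splice_sample:].astype(float).copy()
    if ramp_samples > 0:
        # Blend the first `ramp_samples` of the tail with the carrier's
        # continuation over the same span.
        fade = np.linspace(0.0, 1.0, ramp_samples)
        overlap = carrier[splice_sample:splice_sample + ramp_samples].astype(float)
        tail[:ramp_samples] = (1.0 - fade) * overlap + fade * tail[:ramp_samples]
    return np.concatenate([head, tail])
```

Because both recordings share the same boundary sample, the spliced transition is time-locked across conditions, which is what makes the resulting ERPs comparable despite the jitter-sensitivity of the averaging procedure.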

To summarize, the current experiments aimed at specifying the relative contribution of emotional prosody and semantics during the temporal unfolding of an emotional utterance. To answer the primary research question, that is, what is the brain response to prosodic transitions into an emotional prosodic context and into a combined emotional prosodic/semantic context, two types of prosodic expectancy violations were created through cross-splicing. Given the exploratory nature of the current investigation, we first opted for effect-unspecific hypotheses in Experiment 1 (see Handy, 2004). However, based on sparse previous evidence (e.g., Schirmer et al., 2002, Schirmer et al., 2005, Astésano et al., 2004), these effect-unspecific hypotheses were constrained in two ways: (1) quantitative procedures (i.e., mean amplitude) were adapted in accord with previous evidence in the literature, and (2) a follow-up experiment was conducted to test the theoretical implications of the first experiment. If the respective ERP effects are replicated independent of task (probe verification in Experiment 1 and emotional prosodic categorization in Experiment 2), effect-specific hypotheses can be formulated and further explored.


Behavioral analyses

Accuracy scores were calculated for each participant and corrected by 2.5 SD of the mean. Accuracy data for the two conditions were analyzed in separate repeated-measures ANOVAs. The analysis of the prosodic condition included only the within-subjects factor PROSODY (happy, angry, and neutral), while the analysis of the combined condition included the factors MATCH (match/mismatch) and PROSODY (happy, angry). In both conditions, SEX (female/male) was treated as a between-subjects factor.
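For concreteness, one common reading of a "2.5 SD" accuracy correction can be sketched in Python/NumPy. This is a hypothetical illustration: the function name and the exact trimming rule (two-sided exclusion around the group mean) are our assumptions, as the paper does not spell out the procedure in detail.

```python
import numpy as np

def trim_by_sd(scores, criterion=2.5):
    """Exclude scores lying more than `criterion` standard deviations
    from the mean (one possible reading of a '2.5 SD' correction)."""
    scores = np.asarray(scores, dtype=float)
    mean = scores.mean()
    sd = scores.std(ddof=1)  # sample SD
    keep = np.abs(scores - mean) <= criterion * sd
    return scores[keep]
```

The trimmed scores would then feed into the repeated-measures ANOVAs with the within- and between-subjects factors described above.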

Discussion

The current set of experiments investigated the temporal unfolding of emotional prosodic processing and that of combined emotional prosodic/semantic processing as a function of expectancy violations realized by cross-splicing. Based on sparse previous evidence, we hypothesized that prosodic expectancy violations should elicit a positivity while combined prosodic/semantic expectancy violations should elicit a negativity. Indeed, as evidenced by the between-task analysis, two morphologically distinct ERP effects emerged.

Participants

Thirty-four volunteers (18 female) participated in the first experiment. The mean age of female participants was 24.7 years (SD 2.6) and that of male participants 25.6 years (SD 2.1). Thirty-two right-handed volunteers (16 female, mean age 26.1 years (SD 3.1); 16 male, mean age 25.7 years (SD 3.0)) who had not participated in Experiment 1 were paid to participate in Experiment 2. All listeners were native speakers of German, students attending the local university, right-handed, and had no hearing impairments.

Acknowledgments

We would like to thank Angela D. Friederici for very helpful comments on an earlier version of the manuscript and Kristiane Werrmann for help during data acquisition. This work was supported by the German Research Foundation (DFG FOR-499). Reprint requests should be sent to Sonja A. Kotz, Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstrasse 1a, 04103 Leipzig, Germany, or via e-mail: [email protected].

References (52)

  • Schirmer, A., et al. (2002). Sex differentiates the role of emotional prosody during word processing. Cogn. Brain Res.
  • Schirmer, A., et al. (2005). On the role of attention for the processing of emotions in speech: sex differences revisited. Cogn. Brain Res.
  • Van Petten, C., et al. (2006). Neural localization of semantic context effects in electromagnetic and hemodynamic studies. Brain Lang.
  • Vuilleumier, P. (2005). How brains beware: neural mechanisms of emotional attention. Trends Cogn. Sci.
  • Wambacq, I.J., et al. (2004). Processing of affective prosody and lexical–semantics in spoken utterances as differentiated by event-related potentials. Cogn. Brain Res.
  • American Electroencephalographic Society (1991). Guidelines for standard electrode position nomenclature. J. Clin. Neurophysiol.
  • Baayen, R.H., et al. (1995). The CELEX Database.
  • Beringer, J. (1993). Experimental Run Time...
  • Boersma, P., Weenink, D. (2003). Praat: doing phonetics by computer (Version 4.1.13) [Computer program]. Retrieved July...
  • Bostanov, V., et al. (2004). Recognition of affective prosody: continuous wavelet measures of event-related brain potentials to emotional exclamations. Psychophysiology.
  • Buck, R., et al. (2002). Verbal and nonverbal communication: distinguishing symbolic, spontaneous, and pseudo-spontaneous nonverbal behavior. J. Commun.
  • Carretié, L., et al. (1996). N300, P300 and the emotional processing of stimuli. Electroencephalogr. Clin. Neurophysiol.
  • Cuthbert, B.N., et al. (2000). Brain potentials in affective picture processing: covariation with autonomic arousal and affective report. Biol. Psychiatry.
  • Dien, J., et al. Application of repeated measures ANOVA to high-density ERP datasets: a review and tutorial.
  • Donchin, E., et al. (1988). Is the P300 component a manifestation of context updating? Behav. Brain Sci.
  • Eckstein, K., et al. (2006). It's early: event-related potential evidence for initial interaction of syntax and prosody in speech comprehension. J. Cogn. Neurosci.