Elsevier

Acta Psychologica

Volume 198, July 2019, 102862

Gaze patterns in viewing static and dynamic body expressions

https://doi.org/10.1016/j.actpsy.2019.05.014

Highlights

  • Stronger face bias in viewing of dynamic compared to static whole body expressions for emotion categorisation.

  • Observers view all upper body regions in static displays and in dynamic displays when faces are not visible.

  • Viewing behaviour is subtly influenced by emotion-specific gestures when viewing whole body expressions.

  • Viewing responds flexibly to the informativeness of facial and movement cues to optimise information seeking.

Abstract

Evidence for the importance of bodily cues in emotion recognition has grown over the last two decades. Despite this growing literature, it remains underspecified how observers view whole bodies for body expression recognition. Here we investigate to what extent body viewing is face- and context-specific when participants categorise whole body expressions in static (Experiment 1) and dynamic displays (Experiment 2). Eye-movement recordings showed that observers viewed the face exclusively when it was visible in dynamic displays, whereas viewing was distributed over the head, torso and arms in static displays and in dynamic displays with faces not visible. The strong face bias for dynamic face-visible expressions suggests that viewing of the body responds flexibly to the informativeness of facial cues for emotion categorisation. When facial expressions are static or not visible, however, observers adopt a viewing strategy that includes all upper body regions. This strategy is further shaped by subtle viewing biases directed towards emotion-specific body postures and movements, which optimise the recruitment of diagnostic information for emotion categorisation.

Introduction

The ability to recognise emotional states in others is essential for social interaction and has been shown to predict better social adjustment, mental health and workplace performance (Carton, Kessler, & Pape, 1999; Izard et al., 2001; Nowicki & Duke, 1994). While facial expressions are considered the strongest predictors of emotional state within social interaction distance (Adolphs, 2002; Ekman, 1992), in daily life we also receive other emotional cues from a person, such as posture, gestures, vocalisations and emotional tone in speech. These cues can be aligned, thereby facilitating recognition of an emotion, or they can be in conflict and lead to confusion in emotion perception. For instance, a person expressing fear with the body but happiness with the face is categorised as frightened more often than when both facial and bodily cues indicate a happy emotional state (Conty, Dezecache, Hugueville, & Grèzes, 2012; Jessen & Kotz, 2011; Kokinous, Tavano, Kotz, & Schröger, 2017; Kreifelts, Ethofer, Shiozawa, Grodd, & Wildgruber, 2009; Meeren, Hadjikhani, Ahlfors, Hämäläinen, & de Gelder, 2008; Müller et al., 2011; Nelson & Mondloch, 2017; Stekelenburg & Vroomen, 2007; Yeh, Geangu, & Reid, 2016). To date, emotion research has focused predominantly on facial and vocal cues for investigating unimodal and multimodal emotion perception, whereas the role of bodily cues has received relatively little attention. Yet body cues provide critical information when the face is not clearly visible (de Gelder, 2009; de Gelder, de Borst, & Watson, 2015) or when facial expressions are ambiguous (Aviezer, Trope, & Todorov, 2012), and emotional postures have been shown to modulate judgements from facial or vocal cues perceived at the same time (Jessen & Kotz, 2011; Yeh et al., 2016).
Moreover, body expressions have been found to activate action-related neural structures, suggesting that they are critical for judging action intentions and for response preparation (de Gelder, 2013; Engelen, Zhan, Sack, & de Gelder, 2018; Grezes, Pichon, & de Gelder, 2007; Meeren, Sinke, Kret, & Tamietto, 2010).

The salience of the body for human observers in social cognition is further reflected in the way viewers divide their attention between the face and the body. When viewing social scenes to identify a person, people spend approximately 40% of the time looking at the body (Bindemann, Scheepers, Ferguson, & Burton, 2010). In addition, body movement (e.g. gait) has been shown to attract observers' attention when they view people for person identification (Rice, Phillips, Natu, An, & O'Toole, 2013). However, viewing of whole bodies for emotion recognition has not been investigated in great detail. A few results highlight the importance of bodily cues for the perception of threat. Using still images of people alone or within a social context, Kret, Stekelenburg, Roelofs, and de Gelder (2013) found that observers tend to view both the face and the body when both are visible in the images. Attention to the body increased when the emotions signalled by the body (threat) and the face (happy) were incongruent, suggesting that bodily cues become more important when the emotional state is ambiguous (Kret et al., 2013). A similar effect of face-cue ambiguity has been demonstrated in person identification, reflected in increased viewing of the body when the face is less informative for identity recognition (Rice et al., 2013). Several findings further suggest that the body is even more informative when movement is added in dynamic displays (Grezes et al., 2007; O'Toole et al., 2011; Stoesz & Jakobson, 2014). For instance, compared to static images, dynamic displays of bodies with faces not visible improve person identification performance more than adding motion to face stimuli alone (O'Toole et al., 2011). Moreover, relative to the face, the body is attended more often during free viewing of dynamic social scenes than during viewing of static images of those scenes (Stoesz & Jakobson, 2014).
It is not yet known whether a similar increase in attention to the body emerges when viewing dynamic compared to static body expressions for emotion categorisation. To address this question, the present study will investigate gaze behaviour during viewing of still images (Experiment 1) or videos (Experiment 2) of whole body expressions at varying orientations, with faces visible or not visible. Based on the findings discussed so far, the body is expected to attract more attention in dynamic than in static displays, and when the face is not visible compared to when it is visible.

Emotional body expressions have been described in terms of unique combinations of postures, gestures and muscle movements (Atkinson, Dittrich, Gemmell, & Young, 2004; Dael et al., 2012a, Dael et al., 2012b; Gunes & Piccardi, 2005; Gunes & Piccardi, 2006; Huis in 't Veld et al., 2014a, Huis in 't Veld et al., 2014b). Analogous to the Facial Action Coding System (FACS) for facial expressions, a few studies have developed coding systems for body expressions in which each emotion is associated with a set of body action units (BAUs) (Dael et al., 2012a, Dael et al., 2012b; Gunes & Piccardi, 2006). The descriptions of these BAUs show considerable overlap across studies. For instance, in expressions of anger, people tend to lean forward and shake their fists or point at the camera, whereas expressions of fear involve leaning backward and raising the arms in front of the body. For sadness, the head is often dropped and the hands or arms are brought close to the body, and for happiness the posture is upright, with arms raised (Atkinson et al., 2004; Atkinson, Tunstall, & Dittrich, 2007; Dael et al., 2012a, Dael et al., 2012b; Gunes & Piccardi, 2006). In addition to these postural cues, velocity is an important feature for discriminating between emotions (Atkinson et al., 2007; Gunes & Piccardi, 2006; Gunes, Shan, Chen, & Tian, 2015). For instance, movements for anger and happiness tend to be fast and jerky, whereas movement tends to be minimal and slow for sad expressions. It is not yet known whether viewing behaviour is influenced by emotion-specific bodily cues in static and dynamic displays of whole body expressions. To investigate this question, the present study will record viewing measures separately for different body regions (head, arms, torso and legs) and for different emotions (happy, sad, fear, anger and neutral).
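The region-based viewing measures described above can be illustrated with a small sketch. This is not the authors' analysis code; the fixation durations and region labels below are hypothetical, and a real eye-tracking pipeline would first map raw gaze coordinates onto region-of-interest boundaries.

```python
# Hypothetical sketch: aggregating fixation durations into proportion
# viewing time per body region (ROI). All data below are invented.
from collections import defaultdict

# Each fixation is (region, duration in ms); regions match the study's ROIs.
fixations = [
    ("head", 420), ("torso", 180), ("arms", 250),
    ("head", 300), ("legs", 90), ("arms", 160),
]

# Sum dwell time per region.
dwell = defaultdict(float)
for roi, dur in fixations:
    dwell[roi] += dur

# Normalise by total dwell time to obtain proportion viewing time per region.
total = sum(dwell.values())
proportions = {roi: t / total for roi, t in dwell.items()}
```

Dividing each region's summed dwell time by the grand total yields a proportion viewing time per region, the kind of measure that can then be compared across emotions and display types.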

Emotion-specific viewing patterns have previously been observed in viewing of facial expressions (e.g. enhanced viewing of the smiling mouth for happy expressions) and are consistent with the idea that attention is drawn to the facial features that are most diagnostic for different emotions (Calvo & Nummenmaa, 2008; Smith, Cottrell, Gosselin, & Schyns, 2005). A few other findings suggest, however, that viewing is relatively unaffected by facial expressions of varying intensity (Guo, 2012). Guo argues that configural information (structural information about the spatial relations between local facial features) may be more informative for disambiguating subtle expressions; under these circumstances, a uniform viewing strategy that includes all facial regions may be more optimal (Guo, 2012). The importance of configural information in body expression recognition has previously been demonstrated in both static (Stekelenburg & de Gelder, 2004) and dynamic displays (Atkinson et al., 2007). Atkinson et al. (2007) found, for example, that accuracy for fully lit dynamic body expressions (with faces covered) was reduced by inversion and reversal of the video clips, suggesting that the processing of structural relations between body regions and of temporal sequencing in body movements was disrupted. However, accuracy remained above chance in reversed and inverted videos, suggesting that other information, such as emotion-specific body signals, remains important for body expression recognition (Atkinson et al., 2007). The results of the present study will reveal whether viewers focus more on bodily cues uniquely associated with different emotion categories, which would be reflected in emotion-specific viewing patterns, or whether viewers adopt a more uniform viewing strategy to optimise information seeking for emotion categorisation.
Body postures, such as a forward lean for anger or a backward lean for fear, are expected to provide relevant diagnostic information in still images and may therefore draw attention to these body regions when faces are not visible (Dael et al., 2012a, Dael et al., 2012b; Gunes & Piccardi, 2006). In contrast, in dynamic displays attention may be drawn more to the arms, given the diagnostic information their movements provide (i.e. pronounced movements in happy and angry expressions, minimal movement in sad and neutral displays) (Atkinson et al., 2004; Atkinson et al., 2007; Dael et al., 2012a, Dael et al., 2012b; Gunes & Piccardi, 2006).

In summary, it is not yet known how observers view dynamic whole body expressions for emotion categorisation. The aim of the present study is to fill this gap by presenting observers with static and dynamic stimuli of whole body expressions of emotion, with faces either visible or not visible. Based on previous findings, it is expected that the body will attract more attention in dynamic displays (Rice et al., 2013; Stoesz & Jakobson, 2014) and that attention will be drawn to emotion-specific bodily cues (Dael et al., 2012a, Dael et al., 2012b; Gunes & Piccardi, 2006).

Section snippets

Participants

Forty-one young adults (17 males, age = 20.1 ± 0.24 years (mean ± SEM); 24 females, age = 19.9 ± 0.22 years) were recruited via the Subject Pool of the School of Psychology at the University of Lincoln. This sample size was based on previous research in the same field and was comparable to those of published reports (e.g., Guo, 2012; Pollux, Hall, & Guo, 2014; Guo & Shaw, 2015). The suitability of the sample size was confirmed by a power analysis using G*Power software (Faul, Erdfelder, Lang, &

Participants

Twenty-four participants (5 men, age = 22.1 ± 0.17 years; 19 women, age = 21.6 ± 1.15 years) were recruited via the Subject Pool of the School of Psychology at the University of Lincoln. This sample size was determined from the effect sizes reported in previous comparable studies. For instance, Nelson and Mondloch (2017) report an effect size of 0.4 in proportion viewing time (dynamic displays) for the interaction between Emotion and ROI. With a more conservative effect size of 0.3, a sample
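The effect-size-based sample-size reasoning described in these snippets can be sketched as a power calculation. The paper reports using G*Power; the sketch below instead uses statsmodels, and it assumes (an assumption not stated in the snippet) that the 0.3 value is Cohen's f for a one-way ANOVA across the five emotion categories.

```python
# Hedged sketch of a power analysis, not the authors' G*Power procedure.
# Assumptions: effect size 0.3 is Cohen's f; five emotion categories;
# alpha = .05; target power = .80.
import math

from statsmodels.stats.power import FTestAnovaPower

analysis = FTestAnovaPower()
# Solve for the total number of participants needed.
n_total = analysis.solve_power(effect_size=0.3, alpha=0.05,
                               power=0.8, k_groups=5)
n_total = math.ceil(n_total)  # round up to whole participants
print(n_total)
```

The same numbers entered into G*Power's ANOVA module should give a comparable required sample size; the exact value depends on the design assumed (between- vs within-subjects), which the snippet does not specify.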

General discussion

The aim of the present study was to investigate gaze strategies in viewing of static and dynamic body expressions, with faces either visible or not visible, for expression categorisation. Categorisation accuracy and confidence ratings were higher for dynamic compared to still displays, whereas intensity ratings were not significantly affected by body movement. Face pixelation reduced accuracy more in still displays (affecting all expressions) than in dynamic displays (affecting only sad

Conclusion

The aim of the present study was to investigate gaze strategies in viewing of dynamic and static whole body expressions of emotion. The findings revealed a stronger face bias in dynamic compared to static displays when faces were visible. When faces were not visible, or when observers viewed static images, viewing was distributed over the head, torso and arms. Viewing was also subtly influenced by emotion-specific gestures. Together, these findings suggest that viewers adopt a uniform gaze

Acknowledgements

We would like to express our thanks to Ferenc Igali and Georgia Sapsford for their contribution to the recording of the videos and the processing of the stimuli. We would also like to thank Emma Grange and Grace Severn for their contribution to data collection.

References

  • Z. Ambadar et al.

    Deciphering the enigmatic face: The importance of facial dynamics in interpreting subtle facial expressions

    Psychological Science

    (2004)
  • A.P. Atkinson et al.

    Emotion perception from dynamic and static body expressions in point-light and full-light displays

    Perception

    (2004)
  • H. Aviezer et al.

    Bodily cues, not facial expressions, discriminate between intense positive and negative emotions

    Science

    (2012)
  • M. Bindemann et al.

    Viewpoint and center of gravity affect eye movements to human faces

    Journal of Vision

    (2009)
  • M. Bindemann et al.

    Face, body, and center of gravity mediate person detection in natural scenes

    Journal of Experimental Psychology: Human Perception and Performance

    (2010)
  • A.J. Calder et al.

    Configural coding of facial expressions: The impact of inversion and photographic negative

    Visual Cognition

    (2005)
  • A.J. Calder et al.

    Configural information in facial expression perception

    Journal of Experimental Psychology: Human Perception and Performance

    (2000)
  • M.G. Calvo et al.

    Detection of emotional faces: Salient physical features guide effective visual search

    Journal of Experimental Psychology: General

    (2008)
  • R. Carmi et al.

    Visual causes versus correlates of attentional selection in dynamic scenes

    Vision Research

    (2006)
  • J.S. Carton et al.

    Nonverbal decoding skills and relationship well-being in adults

    Journal of Nonverbal Behavior

    (1999)
  • L. Conty et al.

    Early binding of gaze, gesture and emotion: Neural time course and correlates

    The Journal of Neuroscience

    (2012)
  • N. Dael et al.

    The Body Action and Posture coding system (BAP): Development and reliability

    Journal of Nonverbal Behavior

    (2012)
  • N. Dael et al.

    Emotion expression in body action and posture

    Emotion

    (2012)
  • B. de Gelder

    Why bodies? Twelve reasons for including bodily expressions in affective neuroscience

    Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences

    (2009)
  • B. de Gelder

    From body perception to action preparation: A distributed neural system for viewing bodily expressions of emotion

  • B. de Gelder et al.

    The perception of emotion in body expressions

    WIREs Cognitive Science

    (2015)
  • M. Dorr et al.

    Variability of eye movements when viewing dynamic natural scenes

    Journal of Vision

    (2010)