Empirical investigation of the temporal relations between speech and facial expressions of emotion

  • Original Paper
  • Journal on Multimodal User Interfaces

Abstract

Behavior models implemented within Embodied Conversational Agents (ECAs) require nonverbal communication to be tightly coordinated with speech. In this paper we present an empirical study exploring the influence of the temporal coordination between speech and facial expressions of emotion on users' perception of these emotions (measuring their performance in this task, the perceived realism of behavior, and user preferences). We generated five different conditions of temporal coordination between facial expression and speech: the facial expression displayed before a speech utterance, at the beginning of the utterance, throughout it, at its end, or after it. Twenty-three subjects participated in the experiment and saw these five conditions applied to the display of six emotions (fear, joy, anger, disgust, surprise, and sadness). Subjects recognized emotions most efficiently when facial expressions were displayed at the end of the spoken sentence. However, the combination that users judged most realistic, and preferred over the others, was the facial expression displayed throughout the speech utterance. We review existing literature to position our work and discuss the relationship between realism and communication performance. We also provide animation guidelines and outline avenues for future work.
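As a rough illustration (not the authors' implementation), the five conditions can be read as a simple scheduling rule: given the duration of the utterance and a nominal duration for the facial expression, each condition fixes the expression's onset and offset relative to the start of speech. The sketch below is a minimal Python rendering of that idea; the function name, the 1-second default expression duration, and the condition labels are all illustrative assumptions.

```python
# Minimal sketch of the five temporal-coordination conditions described in
# the abstract. All names and timing values are illustrative assumptions,
# not taken from the paper's experimental software.

from dataclasses import dataclass


@dataclass
class Schedule:
    """Onset and offset (seconds) of a facial expression, relative to the
    start of the speech utterance at t = 0."""
    onset: float
    offset: float


def schedule_expression(condition: str,
                        utterance_duration: float,
                        expression_duration: float = 1.0) -> Schedule:
    """Place a facial expression relative to an utterance according to one
    of the five coordination conditions from the study."""
    if condition == "before":      # expression ends as speech begins
        return Schedule(-expression_duration, 0.0)
    if condition == "beginning":   # expression starts with speech
        return Schedule(0.0, expression_duration)
    if condition == "throughout":  # expression spans the whole utterance
        return Schedule(0.0, utterance_duration)
    if condition == "end":         # expression ends with speech
        return Schedule(utterance_duration - expression_duration,
                        utterance_duration)
    if condition == "after":       # expression follows speech
        return Schedule(utterance_duration,
                        utterance_duration + expression_duration)
    raise ValueError(f"unknown condition: {condition}")


# Example: a 2.5 s utterance under each of the five conditions.
for cond in ("before", "beginning", "throughout", "end", "after"):
    s = schedule_expression(cond, utterance_duration=2.5)
    print(f"{cond:>10}: onset={s.onset:+.1f}s  offset={s.offset:+.1f}s")
```

Under this reading, the study's two headline findings correspond to the "end" schedule (best recognition performance) and the "throughout" schedule (highest perceived realism and preference).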



Author information

Correspondence to Stéphanie Buisine.


Cite this article

Buisine, S., Wang, Y. & Grynszpan, O. Empirical investigation of the temporal relations between speech and facial expressions of emotion. J Multimodal User Interfaces 3, 263–270 (2009). https://doi.org/10.1007/s12193-010-0050-4
