Recognising Expression in Speech for Human Computer Interaction

Conference paper in: Designing a More Inclusive World

Abstract

Human-computer interaction and human-human communication via computer interfaces have become a major part of our lives, but they still lack the basic means of recognising and responding to the non-verbal cues of attitudes, emotions and mental states that we take for granted in human communication. Such interfaces fail to appreciate users’ reactions and intentions. The problem is more acute in speech interfaces, which are used by the general population and specifically by people with degraded motor abilities. In these systems speech is used only to convey commands and data, whereas natural behaviour also uses speech for thinking out loud and for expressing frustration, misunderstanding, discomfort, and more. Most of these functions depend on nuances of expression, and some of them are apparent only in speech.

Copyright information

© 2004 Springer-Verlag London

About this paper

Cite this paper

Shikler, T.S., Robinson, P. (2004). Recognising Expression in Speech for Human Computer Interaction. In: Keates, S., Clarkson, J., Langdon, P., Robinson, P. (eds) Designing a More Inclusive World. Springer, London. https://doi.org/10.1007/978-0-85729-372-5_16

  • DOI: https://doi.org/10.1007/978-0-85729-372-5_16

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-1046-0

  • Online ISBN: 978-0-85729-372-5

  • eBook Packages: Springer Book Archive
