Abstract
Human-computer interaction and human-human communication via computer interfaces have become a major part of our lives, yet they still lack the basic means of recognising and responding to the non-verbal cues of attitude, emotion and mental state that we take for granted in human communication. They fail to appreciate users' reactions and intentions. This problem is more acute in speech interfaces, which are used by the general population and, in particular, by people with degraded motor abilities. In these systems speech conveys commands and data, whereas natural behaviour also uses speech for thinking out loud and for expressing frustration, misunderstanding, discomfort and more. Most of these functions depend on nuances of expression, and some are apparent only in speech.
Copyright information
© 2004 Springer-Verlag London
About this paper
Cite this paper
Shikler, T.S., Robinson, P. (2004). Recognising Expression in Speech for Human Computer Interaction. In: Keates, S., Clarkson, J., Langdon, P., Robinson, P. (eds) Designing a More Inclusive World. Springer, London. https://doi.org/10.1007/978-0-85729-372-5_16
DOI: https://doi.org/10.1007/978-0-85729-372-5_16
Publisher Name: Springer, London
Print ISBN: 978-1-4471-1046-0
Online ISBN: 978-0-85729-372-5