Abstract
During conversations, speakers employ a number of verbal and nonverbal mechanisms to establish who participates in the conversation, when, and in what capacity. Gaze cues and mechanisms are particularly instrumental in establishing the participant roles of interlocutors, managing speaker turns, and signaling discourse structure. If humanlike robots are to have fluent conversations with people, they will need to use these gaze mechanisms effectively. The current work investigates people's use of key conversational gaze mechanisms, how they might be designed for and implemented in humanlike robots, and whether these signals effectively shape human-robot conversations. We focus particularly on whether humanlike gaze mechanisms might help robots signal different participant roles, manage turn exchanges, and shape how interlocutors perceive the robot and the conversation. The evaluation of these mechanisms involved 36 trials of three-party human-robot conversations. In these trials, the robot used gaze mechanisms to signal to its two conversational partners one of three role configurations: two addressees, an addressee and a bystander, or an addressee and a nonparticipant. Results showed that participants conformed to these intended roles 97% of the time. Their conversational roles affected their rapport with the robot, feelings of groupness with their conversational partners, and attention to the task.
Index Terms
- Conversational gaze mechanisms for humanlike robots