Abstract
Future interactive virtual environments will be “attention-aware,” capable of predicting, reacting to, and ultimately influencing the visual attention of their human operators. Before such environments can be realized, we must operationalize our understanding of the relevant aspects of visual perception, in the form of fully automated computational heuristics that can efficiently identify locations that would attract human gaze in complex dynamic environments. One promising approach to designing such heuristics draws on ideas from computational neuroscience. We compared several neurobiologically inspired heuristics with eye-movement recordings from five observers playing video games, and found that human gaze was better predicted by heuristics that detect outliers from the global distribution of visual features than by purely local heuristics. Heuristics sensitive to dynamic events performed best overall. Further, heuristic prediction power differed more between games than between human observers. While other factors clearly also influence eye position, our findings suggest that simple neurally inspired algorithmic methods can account for a significant portion of human gaze behavior in a naturalistic, interactive setting. These algorithms may be useful in the implementation of interactive virtual environments, both to predict the cognitive state of human operators and to endow virtual agents in the system with humanlike visual behavior.
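The "global outlier" idea mentioned above can be sketched in a few lines. This is purely an illustration, not the paper's implementation: the actual heuristics operate on multiscale color, intensity, orientation, flicker, and motion channels, whereas the toy version below scores each pixel of a single grayscale frame by how far its intensity deviates from the frame-wide distribution, then predicts gaze at the most deviant location.

```python
import numpy as np

def global_outlier_saliency(frame):
    """Score each pixel by its deviation (in standard deviations)
    from the global intensity distribution of the frame.
    A crude stand-in for feature-outlier saliency heuristics."""
    frame = np.asarray(frame, dtype=float)
    mu, sigma = frame.mean(), frame.std() + 1e-8  # avoid divide-by-zero
    return np.abs(frame - mu) / sigma

def predicted_gaze(frame):
    """Return the (row, col) of the most salient pixel."""
    sal = global_outlier_saliency(frame)
    return np.unravel_index(np.argmax(sal), sal.shape)

# A dark frame with one bright spot: the spot is the global outlier,
# so the heuristic predicts gaze lands there.
frame = np.zeros((8, 8))
frame[3, 5] = 1.0
print(predicted_gaze(frame))  # (3, 5)
```

A realistic pipeline would apply this outlier scoring per feature channel and per pyramid scale, then combine the resulting maps into a single saliency map, as in the Itti-Koch family of models.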
Applying computational tools to predict gaze direction in interactive visual environments