Abstract
Future interactive virtual environments will be “attention-aware,” capable of predicting, reacting to, and ultimately influencing the visual attention of their human operators. Before such environments can be realized, we must operationalize our understanding of the relevant aspects of visual perception, in the form of fully automated computational heuristics that can efficiently identify locations that would attract human gaze in complex dynamic environments. One promising approach to designing such heuristics draws on ideas from computational neuroscience. We compared several neurobiologically inspired heuristics with eye-movement recordings from five observers playing video games, and found that human gaze was better predicted by heuristics that detect outliers from the global distribution of visual features than by purely local heuristics. Heuristics sensitive to dynamic events performed best overall. Further, heuristic prediction power differed more between games than between human observers. While other factors clearly also influence eye position, our findings suggest that simple neurally inspired algorithmic methods can account for a significant portion of human gaze behavior in a naturalistic, interactive setting. These algorithms may be useful in the implementation of interactive virtual environments, both to predict the cognitive state of human operators and to endow virtual agents in the system with humanlike visual behavior.
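The "global outlier" idea mentioned above can be sketched in a few lines. This is purely an illustration, not the paper's implementation: the actual heuristics operate on multiscale color, intensity, orientation, flicker, and motion channels, whereas the toy version below scores each pixel of a single grayscale frame by how far its intensity deviates from the frame-wide distribution, then predicts gaze at the most deviant location.

```python
import numpy as np

def global_outlier_saliency(frame):
    """Score each pixel by its deviation (in standard deviations)
    from the global intensity distribution of the frame.
    A crude stand-in for feature-outlier saliency heuristics."""
    frame = np.asarray(frame, dtype=float)
    mu, sigma = frame.mean(), frame.std() + 1e-8  # avoid divide-by-zero
    return np.abs(frame - mu) / sigma

def predicted_gaze(frame):
    """Return the (row, col) of the most salient pixel."""
    sal = global_outlier_saliency(frame)
    return np.unravel_index(np.argmax(sal), sal.shape)

# A dark frame with one bright spot: the spot is the global outlier,
# so the heuristic predicts gaze lands there.
frame = np.zeros((8, 8))
frame[3, 5] = 1.0
print(predicted_gaze(frame))  # (3, 5)
```

A realistic pipeline would apply this outlier scoring per feature channel and per pyramid scale, then combine the resulting maps into a single saliency map, as in the Itti-Koch family of models.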
Applying computational tools to predict gaze direction in interactive visual environments