Research report
Auditory capture of vision: examining temporal ventriloquism

https://doi.org/10.1016/S0926-6410(03)00089-2

Abstract

Four experiments investigated whether irrelevant sounds can influence the perception of lights in a visual temporal order judgment task, in which participants judged which of two lights appeared first. In Experiment 1, presenting a sound before the first light and another after the second light improved performance relative to baseline (sounds presented simultaneously with the lights), as if the sounds pulled the perceived onsets of the lights further apart in time. Experiment 2 ruled out an alerting explanation for this effect and indicated that the performance improvement resulted from the second sound trailing the second light. Experiment 3 excluded the possibility that leading or simultaneous sounds were interfering with performance and revealed that only the second sound had an effect within the temporal window known to support multisensory integration. Experiment 4 demonstrated that sounds intervening between the two lights led to a decline in performance, as if the sounds pulled the lights closer together in time. The results suggest a ‘temporal ventriloquism’ phenomenon analogous to spatial ventriloquism.

Introduction

Perception of the world is inherently a multisensory experience. The brain integrates information from the different sensory modalities to interact more effectively with the environment. Multisensory integration is frequently studied using intermodal conflict, as in the ventriloquist effect, whereby the perceived location of a sound shifts towards a visual stimulus presented at a different position [1], [10]. Much of the multisensory literature has focused on spatial interactions of this type and on identity interactions such as the McGurk effect [15], in which what is heard is influenced by what is seen (for example, hearing /ba/ while seeing the speaker say /ga/ may yield the percept /da/). However, the senses can also interact in time; the conflict then concerns not where or what is being perceived, but when. In the present study we investigated how the perceived time between two visual events is affected by presenting two sounds at conflicting times relative to the lights.

The temporal relationships between inputs from the different senses play an important role in multisensory integration. Indeed, a window of synchrony between auditory and visual events is crucial to spatial ventriloquism, as the effect disappears when the audio-visual asynchrony exceeds approximately 300 ms [28]. This is also the case in the McGurk effect, which fails to occur when the audio-visual asynchrony exceeds 200–300 ms [13], [18].

In the spatial domain, vision can bias the perceived location of sounds, whereas sounds rarely influence visual localization. One key reason for this asymmetry seems to be that vision provides more accurate location information than sound [32], [33]. Thus, when auditory and visual stimuli co-occur at different positions, the perceived location of the sound is shifted towards the visual event as the best perceptual solution. Outside the spatial domain, however, audition can bias vision [24], [25]. For example, multiple beeps accompanying a single flash can induce the perception of multiple flashes [25].

In the temporal domain, evidence suggests that audition provides more accurate temporal information than vision [32], [9], [26], [30]. In keeping with the notion of greater auditory temporal acuity, audition has been found to capture visual temporal perception in a phenomenon called auditory ‘driving’ [26], [7], [19]. Here, changes in the rate of a repetitive clicking sound induce corresponding changes in the perceived rate of a repetitive flashing light [7], [31]. This phenomenon has been questioned, however, because it relies on subjective estimation and may reflect a response bias rather than a perceptual effect [6]. In a recent investigation, Fendrich and Corballis [6] attempted to overcome the limitations of auditory driving studies by asking participants to report the location of a rotating visual marker on a clock-face at the time of a flash. When irrelevant repetitive clicks preceded the flash, the reported position was earlier; when the clicks followed the flash, the reported position was later. Unfortunately, this finding could reflect post-perceptual processes in the form of a response or spatial bias rather than a bias in temporal perception, as the time estimate was based on the perceived location of the moving marker on the clock-face at the moment of the flash (see Ref. [29] for similar concerns). Consequently, the shortcomings inherent to previous auditory driving research have not yet been fully addressed. Moreover, it is unclear whether audition can influence vision when non-repetitive stimuli are used, as repetitive stimuli carry supplementary rhythmic information.

Taken together, however, the results are suggestive and compatible with the hypothesis tested in the current study: that auditory events can alter the perceived timing of target lights. This would demonstrate a temporal analogue of the spatial ventriloquist effect, in which visual events alter the perceived location of target sounds.

Section snippets

Experiment 1

Participants determined which of two lights appeared first in a visual temporal order judgment (TOJ) task. Performance in a TOJ task is sensitive to the perception of the temporal interval between the lights. Our working hypothesis was that presenting task-irrelevant sounds before the first light and after the second light might attract the visual onsets so that they would seem to occur further apart in time, thereby improving performance (see Fig. 1). Sounds simultaneous with the onset of each of the lights served as the baseline condition.
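The analysis used by the authors is not described in this snippet, but TOJ performance of this kind is commonly summarised by fitting a psychometric function to the order judgments collected across stimulus onset asynchronies (SOAs). The sketch below is illustrative only: the SOAs, response proportions, and the cumulative-Gaussian fit are assumptions, used here to show how a point of subjective simultaneity (PSE) and a just-noticeable difference (JND) can be estimated from such data.

    # Illustrative sketch only (not the authors' analysis): fit a cumulative
    # Gaussian psychometric function to the proportion of "right light first"
    # responses across SOAs, then derive a PSE and JND.
    import numpy as np
    from scipy.optimize import curve_fit
    from scipy.stats import norm

    # Hypothetical SOAs in ms (negative = left light presented first)
    soas = np.array([-75.0, -50.0, -25.0, 0.0, 25.0, 50.0, 75.0])
    # Hypothetical proportion of "right light first" responses at each SOA
    p_right_first = np.array([0.05, 0.12, 0.30, 0.52, 0.71, 0.88, 0.96])

    def psychometric(soa, pse, sigma):
        """Cumulative Gaussian: P(judge 'right light first') as a function of SOA."""
        return norm.cdf(soa, loc=pse, scale=sigma)

    (pse, sigma), _ = curve_fit(psychometric, soas, p_right_first, p0=[0.0, 30.0])

    # One common convention: JND = SOA shift needed to move from 50% to 75% correct
    jnd = sigma * norm.ppf(0.75)
    print(f"PSE = {pse:.1f} ms, JND = {jnd:.1f} ms")

On this convention, a smaller JND corresponds to better temporal resolution, so an improvement in TOJ performance would show up as a reduction in the fitted JND.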

Experiment 2

In Experiment 2, the temporal relationships between the first sound and the first light, and between the second sound and the second light, were varied independently. Specifically, the first sound could appear at the same time as the first light or lead it by 100 ms. Similarly, the second sound could appear at the same time as the second light or trail it by 100 ms. If the effect in Experiment 1 is due to alerting, then performance should improve when the first sound precedes the first light [22].
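To make the factorial timing manipulation concrete, the following sketch (assumed bookkeeping, not the authors' code) enumerates event onsets for the four resulting conditions. The 100-ms lead and lag values are those stated above; the 75-ms light-light SOA is a hypothetical example value.

    # Illustrative only: onset times (ms, relative to the first light) of the
    # two lights and two sounds in the four Experiment 2 conditions.
    from itertools import product

    def trial_events(soa_ms, first_sound_lead_ms, second_sound_lag_ms):
        """Return onset times (ms) of the two lights and the two sounds."""
        return {
            "light1": 0,
            "light2": soa_ms,
            "sound1": -first_sound_lead_ms,           # leads light1 when > 0
            "sound2": soa_ms + second_sound_lag_ms,   # trails light2 when > 0
        }

    for lead, lag in product([0, 100], [0, 100]):
        print(f"first-sound lead {lead:3d} ms, second-sound lag {lag:3d} ms:",
              trial_events(75, lead, lag))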

Experiment 3

Experiment 3 was similar to Experiment 2 but involved a wider range of intervals. The interference account predicts that the detrimental effect of the first sound on the first light should decline, and performance should improve, as the temporal interval between them increases. Regarding the trailing sound, because performance already shows a benefit when the sound trails the second light, further increases in the interval may not improve performance further. In contrast, if improved performance reflects temporal ventriloquism, the benefit of the trailing sound should be confined to intervals within the temporal window known to support multisensory integration.

Experiment 4

A strong prediction of the temporal ventriloquism explanation is that sounds intervening between the lights (i.e. after the first light and before the second light) should lead to a decline in performance. In other words, intervening sounds should now draw the lights closer together in time making the TOJ harder. We set out to test this prediction in the current experiment. We tested two different gaps between the sounds, one of 16 ms and one of 40 ms. Additionally, a condition with a single

General discussion

The present findings demonstrate that sounds can alter performance on a visual TOJ task. In Experiment 1, sounds presented before the first light and after the second light improved performance relative to baseline (two sounds presented simultaneously with the lights). This effect was robust and extended up to sound-light intervals of 225 ms. Experiment 2 ruled out an alerting explanation, as only the second sound trailing the second light improved performance. Experiment 3 excluded an interference account, showing that only the second sound influenced performance, and only within the temporal window known to support multisensory integration.

Acknowledgments

This work was funded by grants to Alan Kingstone from the Human Frontiers Science Program, the Natural Sciences and Engineering Research Council of Canada, and the Michael Smith Foundation for Health Research. The authors thank Dr Lawrence Ward and Dr Jim Enns for their help and advice, as well as Charles Spence and an anonymous reviewer for their insightful comments.

References (33)

  • I.J. Hirsh et al., Perceived order in different sense modalities, J. Exp. Psychol. (1961)
  • I.P. Howard et al., Human Spatial Orientation (1966)
  • L. Lewkowicz, The development of temporal and spatial intermodal perception
  • D.W. Massaro et al., Perception of asynchronous and conflicting visual and auditory speech, J. Acoust. Soc. Am. (1996)
  • M. McGrath et al., Intermodal timing relations and audio-visual speech recognition by normal-hearing adults, J. Acoust. Soc. Am. (1985)
  • H. McGurk et al., Hearing lips and seeing voices, Nature (1976)