Neural correlates of audiovisual temporal processing – Comparison of temporal order and simultaneity judgments
Introduction
Perceptual processing cannot be understood without considering the brain mechanisms of multisensory integration. An awake organism constantly receives a plethora of sensory signals from separate modalities, delivering information about different aspects of its environment. For an organism to behave adaptively, this variety must be transformed into a consistent yet dynamic representation of the surrounding world. The spatial distribution of the sources of stimulation is an important cue for successful integration (Spence et al., 2003, Zampini et al., 2003a, Zampini et al., 2005), but another crucial factor is the temporal relation between multisensory events. The majority of the effects of multisensory integration involve temporal coincidence of its components (Keetels and Vroomen, 2012). However, these effects are not restricted to cases in which there is an objective, physical coincidence of two (or more) stimuli from separate sensory channels. There is compelling evidence that multisensory integration should not be viewed as passive coincidence detection of signals arriving through separate sensory channels. Subjects often perceive as simultaneous pairs of stimuli that are not physically synchronous (Stevenson and Wallace, 2013). The hypothesis of a ‘temporal window of integration’ provides a conceptual account of this phenomenon (Van Wassenhove et al., 2007, Lewkowicz and Ghazanfar, 2009, Vroomen and Keetels, 2010, Colonius and Diederich, 2012). This notion denotes the temporal interval between multisensory stimuli within which, and only within which, multisensory integration may occur. In many cases the brain can dynamically adjust the perceived temporal relations between stimuli arriving at different times, for example using the mechanism of temporal perceptual recalibration (Fujisaki et al., 2004, Vroomen et al., 2004).
Temporal order judgments (TOJs) and simultaneity judgments (SJs) are the two most widely used paradigms for the assessment of temporal perception, including in the field of multisensory integration (Keetels and Vroomen, 2012). Two parameters are usually derived from these measures. The point of subjective simultaneity (PSS) provides an estimate of the interval between stimuli at which the probability of perceiving simultaneity is highest. The ‘just noticeable difference’ (JND) reflects the subject’s sensitivity to changes in the intervals between the stimuli. The JND value (in milliseconds) denotes the minimal temporal interval at which a change in the perceived temporal relation between the stimuli can be observed.
During the TOJ procedure subjects are presented with pairs of stimuli with variable stimulus onset asynchronies (SOAs), and after each presentation they are asked to make an explicit judgment about which stimulus came first. In the case of audiovisual pairs the subject has to select from two alternatives: ‘sound-first’ or ‘flash-first’. The obtained psychometric function has a characteristic sigmoid profile and is usually modeled by a cumulative Gaussian or logistic function (Keetels and Vroomen, 2012). The PSS value for the TOJ task is taken at the cross-over point of the psychometric function, where ‘sound-first’ and ‘flash-first’ judgments are equally probable and subjects are maximally unsure about the temporal relation between the members of the audiovisual pair. The measure of sensitivity, JND, is calculated as half of the SOA difference between the 25% and 75% points of the psychometric function. An alternative estimate of sensitivity is the slope of the psychometric function at the PSS value.
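The TOJ definitions above can be illustrated with a minimal sketch. It assumes a cumulative Gaussian psychometric function whose parameters (`mu_ms`, `sigma_ms`) have already been fitted, and it assumes the curve gives the probability of a ‘flash-first’ response with positive SOAs meaning that the flash leads. The function name and the example values are hypothetical and are not taken from the study.

```python
from statistics import NormalDist


def toj_pss_jnd(mu_ms, sigma_ms):
    """Derive PSS, JND and slope from a cumulative Gaussian TOJ curve.

    Assumption: the curve is P('flash-first' | SOA) with mean mu_ms and
    standard deviation sigma_ms, both in milliseconds.
    """
    curve = NormalDist(mu=mu_ms, sigma=sigma_ms)
    # PSS: cross-over point where both responses are equally probable.
    pss = curve.inv_cdf(0.50)
    # JND: half of the SOA difference between the 25% and 75% points.
    soa_25 = curve.inv_cdf(0.25)
    soa_75 = curve.inv_cdf(0.75)
    jnd = (soa_75 - soa_25) / 2.0
    # Alternative sensitivity estimate: slope of the curve at the PSS,
    # which for a cumulative Gaussian is the density at its mean.
    slope_at_pss = curve.pdf(pss)
    return pss, jnd, slope_at_pss


# Hypothetical fitted parameters, for illustration only.
pss, jnd, slope = toj_pss_jnd(mu_ms=40.0, sigma_ms=80.0)
print(f"PSS = {pss:.1f} ms, JND = {jnd:.1f} ms, slope = {slope:.5f} per ms")
```

For a cumulative Gaussian the JND defined this way is a fixed multiple of the standard deviation (about 0.674 times `sigma_ms`), so the JND and the slope at the PSS carry the same information under this particular model.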
During the SJ procedure subjects are presented with the same kind of stimuli as during the TOJ procedure, but this time they are asked to judge whether the stimuli were perceived as simultaneous or not. In this case the psychometric function is usually modeled by the Gaussian function. There is however one important observation related to audiovisual stimuli: the resulting psychometric function may be asymmetric, being steeper for pairs with the leading auditory stimulus and shallower for pairs with the leading visual stimulus. This phenomenon suggests that in both cases subjects display different sensitivity to the temporal structure of stimulation (Van Eijk et al., 2008, Alcalá-Quintana and García-Pérez, 2013). The PSS estimate for SJs is taken from the point of the psychometric function with the maximum probability of ‘synchronous’ response and JND is calculated as a mean SOA for 75% point of the psychometric curve (both for ‘sound-first’ and ‘flash-first’ pairs).
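The asymmetric SJ curve described above can be sketched as a two-sided Gaussian with separate widths for sound-leading and flash-leading pairs. This is only one possible parameterization, and all names and values here are hypothetical. The sketch also assumes one reading of the JND definition: the distance from the PSS at which the curve falls to 75% of its peak, averaged over both sides.

```python
import math


def sj_curve(soa_ms, peak, pss_ms, sigma_audio_ms, sigma_visual_ms):
    """P('synchronous' | SOA) as a two-sided Gaussian.

    Assumption: negative SOAs mean the sound leads. A smaller
    sigma_audio_ms gives the steeper flank for sound-leading pairs.
    """
    sigma = sigma_audio_ms if soa_ms < pss_ms else sigma_visual_ms
    return peak * math.exp(-0.5 * ((soa_ms - pss_ms) / sigma) ** 2)


def sj_jnd(sigma_audio_ms, sigma_visual_ms, level=0.75):
    """Mean distance from the PSS at which the curve reaches level * peak."""
    # Solve exp(-0.5 * (d / sigma) ** 2) = level for the distance d.
    k = math.sqrt(-2.0 * math.log(level))
    return 0.5 * (k * sigma_audio_ms + k * sigma_visual_ms)


# Hypothetical parameters, for illustration only.
print(sj_curve(50.0, peak=0.9, pss_ms=50.0,
               sigma_audio_ms=60.0, sigma_visual_ms=120.0))
print(round(sj_jnd(60.0, 120.0), 2))
```

The PSS is simply the location of the peak (`pss_ms`), which is why it does not need a separate computation in this parameterization.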
According to most observations, the PSS values for audiovisual pairs obtained with TOJ and SJ procedures are usually positive, i.e. they do not correspond to the point of objective simultaneity (SOA = 0 ms). A positive value means that the stimuli are perceived as simultaneous when the visual stimulus leads the auditory stimulus (usually by tens of milliseconds). This is probably caused by the different sensitivity of the auditory and visual systems to temporal cues (such as the temporal dynamics of intensity changes). So far this phenomenon has not been the subject of extensive research (but see Van Eijk et al., 2010, Stevenson and Wallace, 2013).
Although both procedures are used to investigate processes of temporal integration, they often give inconsistent results. Estimates of the PSS obtained with TOJ and SJ tasks do not correlate with each other (reviewed by van Eijk et al., 2008). Moreover, whereas PSS values for audiovisual SJs are usually positive, some studies report negative PSS values for TOJs (i.e. stimuli are perceived as simultaneous when the auditory stimulus leads the visual stimulus). Van Eijk et al. (2008) directly compared PSS estimates for two types of audiovisual stimuli (flash-click pairs and a bouncing ball with an impact sound) and three types of procedures: two-alternative SJ (‘synchronous’, ‘asynchronous’), three-alternative SJ (‘sound-first’, ‘synchronous’, and ‘flash-first’), and TOJ (‘sound-first’, ‘flash-first’). PSS values for the two SJ tasks were indeed correlated, but the authors did not observe any correlation between the TOJ task and either of the SJ tasks. More recently, a similar result was obtained by Love et al. (2013) in a study involving five types of audiovisual pairs. As in Van Eijk et al. (2008), Love et al. also observed negative PSS values for the TOJ tasks and consistently positive PSS values for the SJ tasks.
This result suggests that there could be essential differences in the composition of cognitive processes engaged in the two tasks. According to Hirsh and Sherrick (1961) and Jaśkowski (1991), perceiving temporal asynchrony is a necessary, though insufficient, condition for achieving an accurate judgment of temporal order. They suggest a two-stage architecture for temporal judgments. For example, Jaśkowski (1991) proposed a two-stage model consisting of two separate processing centers. The first stage, labeled ‘the simultaneity center’, works as a ‘moment-gating’ mechanism. Depending on the relative signal delays and the applied threshold it can generate two possible ‘perceptual states’: synchronous or asynchronous. At the second stage, ‘the order center’ decides on the temporal order of the stimuli, taking into account their relative latency differences and the perceptual state of the simultaneity center. In effect it can generate three possible states (for a pair consisting of A and B stimuli): ‘order AB’, ‘order BA’ or ‘uncertainty’ – in this latter case the emitted response is random. Thus this model allows an outcome where stimuli are perceived as non-simultaneous but an adequate decision concerning their order cannot be made.
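The two-stage architecture can be made concrete with a toy sketch. It is a loose reading of the verbal description above, not the published model. The two thresholds, their default values and the rule that the order center needs a larger latency difference than the simultaneity center are all assumptions introduced for illustration.

```python
import random


def simultaneity_center(delta_ms, sync_threshold_ms):
    """Stage 1: gate the pair as 'synchronous' or 'asynchronous'.

    delta_ms is the arrival time of B minus the arrival time of A,
    so a positive value means that A arrived first.
    """
    return "synchronous" if abs(delta_ms) < sync_threshold_ms else "asynchronous"


def order_center(delta_ms, state, order_threshold_ms):
    """Stage 2: decide on order using the latency difference and the stage 1 state."""
    if state == "asynchronous" and abs(delta_ms) >= order_threshold_ms:
        return "order AB" if delta_ms > 0 else "order BA"
    return "uncertainty"


def toj_response(delta_ms, sync_threshold_ms=30.0, order_threshold_ms=60.0, rng=None):
    """Return (perceptual state, order-center state, emitted TOJ response)."""
    rng = rng or random.Random()
    state = simultaneity_center(delta_ms, sync_threshold_ms)
    decision = order_center(delta_ms, state, order_threshold_ms)
    # In the 'uncertainty' state the emitted response is random.
    if decision == "uncertainty":
        response = rng.choice(["order AB", "order BA"])
    else:
        response = decision
    return state, decision, response


# A 45 ms difference is perceived as asynchronous, yet order stays undecided.
print(toj_response(45.0, rng=random.Random(0)))
```

With these hypothetical thresholds, latency differences between 30 and 60 ms reproduce the outcome highlighted in the text: the pair is perceived as non-simultaneous, but the order response is a guess.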
However, other authors (e.g. Sternberg and Knoll, 1973, Allan, 1975, García-Pérez and Alcalá-Quintana, 2012 for review) maintain that perception of asynchrony is both the necessary and the sufficient condition for an adequate TOJ. This is achieved by a ternary decision system operating on a ternary decision rule applied to the arrival-time difference between the two signals. Thus the dedicated decision system may generate three types of responses: ‘order AB’, ‘order BA’, ‘synchronous’.
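A ternary decision rule of this kind can be sketched as follows. The symmetric criterion and the Gaussian noise on the arrival-time difference are simplifying assumptions made for this example only. The cited models may use different latency distributions, and every name below is hypothetical.

```python
from statistics import NormalDist


def ternary_decision(arrival_diff_ms, criterion_ms):
    """Map an arrival-time difference (B minus A) onto one of three responses."""
    if arrival_diff_ms > criterion_ms:
        return "order AB"
    if arrival_diff_ms < -criterion_ms:
        return "order BA"
    return "synchronous"


def response_probabilities(soa_ms, noise_sd_ms, criterion_ms):
    """Response probabilities when the internal arrival-time difference is
    the physical SOA plus zero-mean Gaussian noise (an assumption)."""
    diff = NormalDist(mu=soa_ms, sigma=noise_sd_ms)
    p_ba = diff.cdf(-criterion_ms)
    p_ab = 1.0 - diff.cdf(criterion_ms)
    p_sync = 1.0 - p_ab - p_ba
    return {"order AB": p_ab, "order BA": p_ba, "synchronous": p_sync}


# Hypothetical parameters, for illustration only.
print(ternary_decision(55.0, criterion_ms=40.0))
print(response_probabilities(soa_ms=0.0, noise_sd_ms=50.0, criterion_ms=40.0))
```

In this one-stage account no separate order stage exists: once the difference exceeds the criterion, its sign alone determines the reported order.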
More recently, Zampini et al. (2003a) and Shore et al. (2005) emphasized the different character of both tasks, while not proposing the specific theoretical accounts explaining those differences. For example, Zampini et al. (2003a) suggested that essentially the SJ task requires multisensory binding, while the TOJ task is related to temporal discrimination. Elaborating on Zampini’s proposal one can point at one important difference between both tasks specified in such a manner. In the case of TOJ task, to make an adequate decision about the temporal order of stimuli the brain has to be able to create a time-ordered representation of them. During TOJ task the detected sensory signals must then be aligned to this internal representation of the temporal order of events. Such operation is not required for the SJs – the decision system only has to make a decision concerning the synchrony of the signals, whether they can be treated as one unified multisensory event, or two separate events. This does not necessitate any representation of the temporal order of the sensory signals. Thus the TOJ task requires an additional stage of forming a time-ordered representation in addition to detection of asynchrony. The presence of this representation is a necessary part of the decisional process in the TOJ task. Such interpretation of the TOJ task is consistent with the two-stage models of TOJs.
Research on brain processes underlying multisensory synchrony initially focused on the function of the superior colliculus (SC) in animals, one of the phylogenetically oldest structures with multisensory properties (for review see Stein and Meredith, 1993). The animal studies yielded several important findings which may be generalized to other multisensory regions. One of the most important is the spatial principle of multisensory integration. According to this rule, multisensory integration in the SC (and probably in neocortical regions) is based on the spatial consistency between overlapping topographic maps formed by afferent connections belonging to separate sensory channels. Another important neurophysiological mechanism observed in the SC is the ‘inverse effectiveness principle’. It is based on non-linear, synergistic amplification (or depression) of responses of SC neurons to multisensory inputs in comparison to unisensory inputs. This effect has also been observed in humans, both in the SC and in a number of cortical multisensory regions (see Calvert and Thesen, 2004 for review).
The advent of neuroimaging methods has made it possible to explore the neural mechanisms of multisensory integration in humans as well. This field of research can be divided into paradigms without and with an explicit requirement to judge simultaneity or temporal order. The first group of paradigms involves manipulation of stimulus asynchrony treated as an independent variable. The dependent variable is the change in brain activity observed in response to changes in stimulus asynchrony. The second group of paradigms includes experimental conditions in which brain activity is also treated as the dependent variable, but its variability is analyzed in the context of reported stimulus asynchrony or stimulus order judgments.
Research investigating effects of audiovisual integration without an explicit requirement to judge simultaneity or temporal order revealed a number of cortical and subcortical regions involved in this perceptual effect. For example, Calvert et al. (2001), using a paradigm derived from animal studies involving superadditive or subadditive interactions (i.e. nonlinear effects associated with multisensory stimulation), found that such effects occur during audiovisual stimulation in the insula (Ins), SC, superior temporal sulcus (STS), intraparietal sulcus (IPS) and regions of the ventral and dorsal frontal cortex. Partially overlapping results were reported in the Bushara et al. (2003) study on illusory collision invoked by audiovisual stimuli. More recent studies using the same paradigm revealed the important role of the posterior STS and primary sensory cortices in multisensory integration, and also attempted to define its possible function and regional functional differences (Macaluso et al., 2004, Bischoff et al., 2007, Degerman et al., 2007, Noesselt et al., 2007, Noppeney et al., 2010, Marchant et al., 2012, Powers et al., 2012, Beer et al., 2013, Marchant and Driver, 2013).
Neuroimaging studies involving explicit judgment of audiovisual simultaneity have revealed a similar pattern of structures engaged in perception of audiovisual simultaneity to the ones described above. Bushara et al. (2001) asked participants to judge simultaneity of simple nonverbal stimuli (flashes and tones presented for 100 ms) and observed a number of right hemisphere activations including insula, ventrolateral prefrontal cortex, right inferior parietal lobule and left cerebellum. Activity in the right insula also correlated with increasing task demands (i.e. decreasing SOA between auditory and visual stimulus). Dhamala et al. (2007) using a variation of three-alternative SJ paradigm (ternary response with additional option ‘can’t tell’) and simple non-verbal, rhythmic stimuli also observed activation of the predominantly right-hemisphere regions: inferior frontal gyrus (IFG), superior temporal gyrus (STG), middle occipital gyrus and inferior parietal lobule. More specifically, activation associated with perception of asynchrony activated right primary sensory, visual, prefrontal and inferior parietal cortices, while perception of synchrony activated the right inferior parietal cortex and posterior midbrain. Using connectivity analysis, the authors were able to detect different recruitment of the neural nodes involved in the perceptual states associated with the task: for the perception of asynchrony they observed significant correlations between frontal, parietal and primary and auditory regions. This pattern changed considerably during perception of synchrony, where interregional correlations excluded parietal regions and involved posterior midbrain region, possibly encompassing SC. In paradigms involving tasks with verbal stimuli, the most prominent structure involved in audiovisual simultaneity detection is the posterior STS (Van Atteveldt et al., 2004, Van Atteveldt et al., 2007, Stevenson and James, 2009, Stevenson et al., 2010, Noesselt et al., 2012). 
Lux et al. (2003) compared effects of tasks requiring paying attention to spatial orientation vs. temporal synchrony using nonverbal visual stimuli and observed activations covering left anterior STG, left inferior parietal cortex, left medial frontal gyrus, and right operculum in the latter task.
Davis et al. (2009) employing an explicit TOJ on visual stimuli observed specific activation of left temporo-parietal junction (TPJ) when the TOJ judgment was compared with the task requiring shape comparison. Lewandowska et al. (2010), using pairs of nonverbal auditory stimuli (two white-noise bursts of different durations) observed an increasing bilateral activation of the inferior parietal lobule and the inferior frontal cortex associated with decreasing inter-stimulus interval between stimuli. Adhikari et al. (2013) compared responses for perceptions of temporally ordered audiovisual stimuli in a variant of SJ3 procedure (published originally in Dhamala et al., 2007). The contrast between AV asynchrony perception and rest revealed activation in the right STG, right supramarginal gyrus (TPJ), right middle frontal gyrus (MFG) and in the left hemisphere: medial frontal gyrus and left inferior parietal lobule (IPL). Results of connectivity analysis suggested that for trials with perception of temporal order of AV stimuli right prefrontal regions coordinated activity of the right TPJ and the left IPL.
Due to the substantial differences between the SJ and TOJ paradigms used in the studies cited above, it is hardly possible to draw firm conclusions about similarities and differences in the neural correlates of the two tasks. Thus the main objective of the present study is to assess the differences between the activation patterns evoked by the audiovisual SJ and TOJ tasks. To allow any obtained differences to be interpreted in task-related terms only, the SJ and TOJ tasks used in this study share their temporal stimulus characteristics, and in both cases subjects were required to focus their attention on the same aspect of the stimuli, namely their onsets.
The presence of overlapping activations in the SJ and TOJ tasks would in principle support the one-stage models of temporal order estimation and suggest that both tasks rely on the same neural substrate. On the other hand, observing non-overlapping regions for either task (especially the TOJ task) would favor the hypotheses assuming a two-stage process of decision making in temporal order estimation.
Subjects
Fifteen right-handed subjects (mean age 25.2, range 19–40, six males and nine females) participated in the study. All subjects were in good health and declared no past history of psychiatric or neurological disorders. All subjects gave informed consent to the protocol that has been approved by the local Research Ethics Committee. Subjects had normal or corrected-to-normal (with contact lenses) visual acuity. Three subjects (two females, one male) were discarded from further analysis due to
Behavioral results
The means and standard errors of the mean (in brackets) of PSS values for the three conditions were as follows: staircase 98 (±20) ms, SJ – 110 (±27) ms, TOJ – 73 (±50) ms. Positive values indicate that the sound stimulus lagged the visual stimulus onset.
The relative PSS differences between conditions, estimated using an ANOVA (with Greenhouse-Geisser correction), were not significant, F(1.235, 13.585) < 1. Thus there were no significant PSS shifts between conditions. The averaged psychometric curves for all
Discussion
The SJ task and the TOJ task are the two most widely used procedures for studying the role of temporal processing in multisensory integration. However, results of behavioral studies suggest that the two tasks involve disparate sets of underlying basic cognitive operations. The objective of this study was to explore similarities and differences in brain activity evoked by both tasks within one group of subjects.
The behavioral results did not reveal any significant differences in the
Conclusion
This study explored differences between neural activities associated with simultaneity and temporal order tasks using audiovisual stimuli. The common activity evoked by both conditions overlapped significantly with regions usually associated with selective attention based on spatial information. This suggests that focusing on temporal relations between audiovisual stimuli involves the same regions that are active during tasks based on spatial information. While the SJ condition did not recruit any
Acknowledgments
This work was supported by a grant funded by the Polish Ministry of Higher Education (N N106 039538). The author wishes to thank Michal Kuniecki and Marcin Szwed for their insightful comments and suggestions. This project would not have come into existence without the active involvement of the late Professor Piotr Jaśkowski. Unfortunately, he could not participate in its development and completion.
References (129)
- et al.
Difficulty of perceptual spatiotemporal integration modulates the neural activity of left inferior parietal cortex
Neuroscience
(2005)
- et al.
Left inferior parietal cortex integrates time and space during collision judgments
Neuroimage
(2003)
- et al.
Dissociating working memory from task difficulty in human prefrontal cortex
Neuropsychologia
(1997)
- et al.
The “when” pathway of the right parietal lobe
Trends Cogn Sci
(2007)
- et al.
Visual extinction with double simultaneous stimulation: what is simultaneous?
Neuropsychologia
(2002)
- et al.
General multilevel linear modeling for group analysis in FMRI
Neuroimage
(2003)
- et al.
Utilizing the ventriloquism-effect to investigate audio-visual binding
Neuropsychologia
(2007)
- et al.
Conflict monitoring and anterior cingulate cortex: an update
Trends Cogn Sci
(2004)
- et al.
Parietal connectivity mediates multisensory facilitation
Neuroimage
(2013)
- et al.
Detection of audio-visual integration sites in humans by application of electrophysiological criteria to the BOLD effect
Neuroimage
(2001)
Multisensory integration: methodological approaches and emerging principles in the human brain
J Physiol Paris
Functionally dissociating temporal and motor components of response preparation in left intraparietal sulcus
Neuroimage
Human brain activity associated with audiovisual perception and attention
Neuroimage
Selective attention to sound location or pitch studied with fMRI
Brain Res
An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest
Neuroimage
Multisensory integration for timing engages different brain networks
Neuroimage
The anatomy of fronto-occipital connections from early blunt dissections to contemporary tractography
Cortex
Is neocortex essentially multisensory?
Trends Cogn Sci
Visuospatial attention: how to measure effects of infrequent, unattended events in a blocked stimulus design
Neuroimage
The modulatory effects of nicotine on parietal cortex activity in a cued target detection task depend on cue reliability
Neuroscience
Hypothalamic abnormalities in schizophrenia: sex effects and genetic vulnerability
Biol Psychiatry
Improved optimization for the robust and accurate linear registration and motion correction of brain images
Neuroimage
FSL
Neuroimage
A global optimisation method for robust affine registration of brain images
Med Image Anal
Attentional modulation of perceptual comparison for feature binding
Brain Cogn
Changes in fMRI BOLD response to increasing and decreasing task difficulty during auditory perception of temporal order
Neurobiol Learn Mem
Distinct systems for automatic and cognitively controlled time measurement: evidence from neuroimaging
Curr Opin Neurobiol
The emergence of multisensory systems through perceptual narrowing
Trends Cogn Sci
Neural mechanisms associated with attention to temporal synchrony versus spatial orientation: an fMRI study
Neuroimage
Orienting of spatial attention and the interplay between the senses
Cortex
Spatial and temporal factors during processing of audiovisual speech: a PET study
Neuroimage
The left intraparietal sulcus and verbal short-term memory: focus of attention or serial order?
Neuroimage
Decreased volume of left and total anterior insular lobule in schizophrenia
Schizophr Res
Working memory for order and the parietal cortex: an event-related functional magnetic resonance imaging study
Neuroscience
Comparing event-related and epoch analysis in blocked design fMRI
Neuroimage
Evidence for a close relationship between conscious effort and anterior cingulate cortex activity
Int J Psychophysiol
Single-trial coupling of EEG and fMRI reveals the involvement of early anterior cingulate cortex activation in effortful decision making
Neuroimage
Valid conjunction inference with the minimum statistic
Neuroimage
The neural basis of temporal auditory discrimination
Neuroimage
Visual extinction and prior entry: impaired perception of temporal order with intact motion perception after unilateral parietal damage
Neuropsychologia
Functional dissociation of pre-SMA and SMA-proper in temporal processing
Neuroimage
Multisensory maps in parietal cortex
Curr Opin Neurobiol
Neural processing of asynchronous audiovisual speech perception
Neuroimage
Temporal-order judgment of audiovisual events involves network activity between parietal and prefrontal cortices
Brain Connect
Fitting model-based psychometric functions to simultaneity and temporal-order judgment data: MATLAB and R routines
Behav Res Methods
The relationship between judgments of successiveness
Percept Psychophys
Multimodal integration for the representation of space in the posterior parietal cortex
Philos Trans R Soc London Ser B Biol Sci
Common neural substrates for ordinal representation in short-term memory, numerical and alphabetical cognition
PLoS One
Prefrontal modulation of visual processing in humans
Nat Neurosci
Combined diffusion-weighted and functional magnetic resonance imaging reveals a temporal-occipital network involved in auditory-visual object processing
Front Integr Neurosci