Temporal binding as multisensory integration: Manipulating perceptual certainty of actions and their effects

Klaffehn, Annika L.; Sellmann, Florian B.; Kirsch, Wladimir; Kunde, Wilfried; Pfister, Roland

doi:10.3758/s13414-021-02314-0

Open access
Published: 01 June 2021

Volume 83, pages 3135–3145, (2021)

Attention, Perception, & Psychophysics

Abstract

It has been proposed that statistical integration of multisensory cues may be a suitable framework to explain temporal binding, that is, the finding that causally related events such as an action and its effect are perceived to be shifted towards each other in time. A multisensory approach to temporal binding construes actions and effects as individual sensory signals, which are each perceived with a specific temporal precision. When they are integrated into one multimodal event, like an action-effect chain, the extent to which they affect this event’s perception depends on their relative reliability. We test whether this assumption holds true in a temporal binding task by manipulating certainty of actions and effects. Two experiments suggest that a relatively uncertain sensory signal in such action-effect sequences is shifted more towards its counterpart than a relatively certain one. This was especially pronounced for temporal binding of the action towards its effect but could also be shown for effect binding. Other conceptual approaches to temporal binding cannot easily explain these results, and the study therefore adds to the growing body of evidence endorsing a multisensory approach to temporal binding.

Introduction

In their everyday life, most humans are bombarded with perceptual information, most of it redundant and irrelevant. When faced with such abundance, one way to cope lies in multisensory integration, which links related sensory signals and thereby helps us perceive coherent multimodal events rather than a host of independent sensory signals. This process forms a cornerstone of everyday perception, and its prevalence is documented by striking multisensory illusions such as the ventriloquist effect (Alais & Burr, 2004). Here, concurrent visual and auditory signals are merged into an integrated percept by fusing their perceived location. However, integration is not limited to the spatial dimension, but also affects other attributes such as the intensity (Stein et al., 1996) and the perceived timing of stimuli (Fendrich & Corballis, 2001; Shams, Ma, & Beierholm, 2005). Despite these long-known insights, theories of multisensory integration have only recently been applied to the phenomenon of temporal binding, a perceptual illusion in the temporal domain (Cao et al., 2020; Kirsch et al., 2019; Wolpe et al., 2013).

Temporal binding – or intentional binding as it was termed when first described (Haggard et al., 2002) – occurs when two causally related events are perceived as shifted toward each other in time. Initially, temporal binding was proposed to arise from predictive mechanisms in intentional, voluntary motor actions (motor approach). Therefore, the illusion has received particular interest in research on how human agents perceive the consequences of their own actions, and intentional binding has often been used as a proxy for an agent’s implicit sense of agency over the effects of an action (Haggard & Tsakiris, 2009). While such applications of the motor approach are still quite common, accumulating evidence has shown that two events can be perceived as temporally shifted towards each other when they are solely causally linked without one of them being an action and the other being its effect (e.g., Borhani et al., 2017; Buehner, 2012, 2015; Kirsch et al., 2019; Ruess et al., 2020). These findings led to the suggestion that perceived causality rather than intentional motor action is the root of temporal binding (mere causality approach). The motor approach is further called into question by reports on an absent correlation between temporal binding and explicit agency ratings in action contexts (Dewey & Knoblich, 2014; Obhi & Hall, 2011; Schwarz et al., 2019). Thus, the mechanisms underlying temporal binding have to be subjected to new interpretations beyond being an implicit measure or proxy for agency (Hoerl et al., 2020). The causality approach itself offers an intriguing alternative to the motor approach, but until recently lacked a clear theoretical foundation. It construes intentional binding as one instance of a causal event chain but, on a theoretical level, merely replaces the term of agency with causality.

Considering temporal binding as an outcome of a multisensory cue-integration process appears to be particularly promising for integrating this phenomenon into a wider conceptual setting. From the perspective of cue integration, two events must be perceived to “belong together” or to be part of one meta-event, at least to a certain degree, in order to have an influence on how other parts of the meta-event are perceived. The integration of two sensory signals in a meta-event is aided by the temporal proximity and the perceived cross-correlation between two events in time. The magnitude of this relation determines the general strength of signal coupling (i.e., of binding). Importantly, when asked to judge the timing of elements of the meta-event, both temporal cues included in this event are combined, and weighted according to their relative precision (Ernst, 2006; Holmes, 2009; Rohde et al., 2016). The strength of coupling is assumed to vary on a continuum from complete fusion of the signals into a single percept to partial integration and complete segregation. In temporal binding settings, participants usually do not fuse both events (action and effect), but obviously apply partial integration expressed in a subjective temporal attraction between the two events that does not completely cover the physical delay between them. In the case of complete fusion, such an integration of distinct multimodal events can be very well explained by a maximum-likelihood estimation model, which results in a more robust multisensory percept compared to each component’s individual qualities. The same is not necessarily true in partial integration, where the time or space in between the individual cues is also integrated into the multisensory event (e.g., Debats, Ernst, & Heuer, 2017). Nevertheless, we expect that predictions based on relative certainty hold true, even in partial integration.
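The reliability weighting described above for the case of complete fusion can be sketched in a few lines of Python. This is only an illustration of the standard maximum-likelihood rule for two independent Gaussian cues, not the authors' analysis code; the function name `fuse_timing` and all numeric values are assumptions chosen for the example.

```python
def fuse_timing(t_action, sd_action, t_effect, sd_effect):
    """Maximum-likelihood fusion of two independent Gaussian timing cues.

    Each cue is weighted by its reliability (inverse variance).
    All inputs are in milliseconds; the values used below are hypothetical.
    """
    r_action = 1.0 / sd_action ** 2          # reliability of the action cue
    r_effect = 1.0 / sd_effect ** 2          # reliability of the effect cue
    w_action = r_action / (r_action + r_effect)
    w_effect = 1.0 - w_action
    fused_time = w_action * t_action + w_effect * t_effect
    fused_sd = (1.0 / (r_action + r_effect)) ** 0.5
    return fused_time, fused_sd, w_action, w_effect


# Hypothetical example: a precise action cue (SD 30 ms) at 0 ms and
# a less precise effect cue (SD 60 ms) at 500 ms.
fused_time, fused_sd, w_a, w_e = fuse_timing(0.0, 30.0, 500.0, 60.0)
print(fused_time, fused_sd, w_a, w_e)
```

In this example the more reliable action cue receives the larger weight, so the fused estimate lies closer to the action, and the fused standard deviation is smaller than that of either single cue.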

The multisensory approach to temporal binding has the potential to explain previous findings in a more comprehensive context than the motor approach. Furthermore, it expands the mere causality approach, allowing for clear, quantitative predictions. One critical aspect of the multisensory approach in the current context is that it predicts a different relationship of action and effect binding compared to previous accounts. That is, based on the motor or the mere causality approach both measures may be used interchangeably or in conjunction without changing their conceptual meaning. Therefore, any manipulation of the event chain should influence overall binding, which in turn should be reflected similarly in action as well as in effect binding. The multisensory approach does not preclude the possibility of changes in overall binding capacity, which might result from changes in perceptual precision or perceived causality. However, it also predicts a trade-off between action and effect binding in many situations based on the relative precision (or reliability) of action and effect cues (referred to as cue certainty hereafter). For example, when the certainty of the action cue is reduced, while the effect cue remains constant, this should lead to stronger action binding and weaker effect binding and vice versa. These predictions are in line with the common finding that an effect is shifted more strongly towards its cause (effect binding) than the cause is shifted towards its effect (action binding). According to the multisensory approach, this outcome could be due to a higher certainty about the timing of own actions as compared to the timing of external events. Even more strikingly, the magnitude of temporal binding can be manipulated by relatively minor changes in the design, such as delay (Haggard et al., 2002), or the force of a key-press (Cao et al., 2020), which alter neither the action intention nor the causal chain between the action and its effect. Changes in relative cue precision might be responsible for these effects.
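To make the predicted trade-off concrete, the following sketch uses a deliberately simplified toy model of partial integration. It is an assumption made for illustration and not a model taken from the article: each event is shifted towards the other by the action-effect interval, scaled by a hypothetical `coupling` strength between 0 (segregation) and 1 (complete fusion) and by the relative reliability of the other cue. All standard deviations are invented.

```python
def predicted_binding(sd_action, sd_effect, interval_ms, coupling):
    """Toy prediction of action and effect binding (ms).

    Assumption for illustration only: the shift of each event equals
    coupling * (weight of the other cue) * interval.
    Positive values mean 'judged later', negative values 'judged earlier'.
    """
    r_action = 1.0 / sd_action ** 2
    r_effect = 1.0 / sd_effect ** 2
    w_action = r_action / (r_action + r_effect)
    w_effect = 1.0 - w_action
    action_binding = coupling * w_effect * interval_ms    # action shifts later
    effect_binding = -coupling * w_action * interval_ms   # effect shifts earlier
    return action_binding, effect_binding


# Hypothetical timing SDs: 60 ms for an uncertain cue, 30 ms for a certain cue.
u_c = predicted_binding(sd_action=60.0, sd_effect=30.0, interval_ms=500.0, coupling=0.2)
c_u = predicted_binding(sd_action=30.0, sd_effect=60.0, interval_ms=500.0, coupling=0.2)
print("uncertain action, certain effect:", u_c)
print("certain action, uncertain effect:", c_u)
```

Under these assumptions, lowering the certainty of the action increases action binding and decreases the magnitude of effect binding, while the combined attraction stays constant, which is the qualitative pattern the trade-off prediction describes.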

Evidence in favor of the multisensory approach comes from experimental designs that actively manipulated the reliability of effect-related signals (Wolpe et al., 2013), or incidentally influenced the reliability of action-related signals. For example, Cao et al. (2020) showed that a light key-press with relatively weak somatosensory feedback is biased more strongly towards an ensuing effect tone than a forceful key-press. The present experiments intended to provide converging evidence from a design that employs a direct manipulation of perceptual precision on both ends of the action-effect episode. This logic was implemented in two experiments, in which we manipulated the temporal certainty of an action as well as the temporal certainty of the ensuing effect. In particular, participants used their index finger either to press a key on a keyboard (certain action) or to press against a force sensor placed on a table (uncertain action). In the uncertain action condition, this action was followed by a short beep tone (certain effect). In the certain action condition, either a longer lasting white noise with slow rise and fall (Exp. 1) or a quiet beep tone (Exp. 2) was presented following the action (uncertain effects). Experiment 1 additionally featured a control condition, where a certain action generated a certain effect. We reasoned that exerting pressure on a force sensor provides less reliable cues for the perception of action timing than a keyboard key-press with tactile on- and offset. In a similar vein, the white noise and the quiet beep tone were assumed to decrease the certainty in the perception of the effect as compared with a well audible beep tone with a clearly defined beginning and end. We tested the validity of these manipulations by comparing variance scores of the certain and uncertain actions, and the certain and uncertain effects in baseline conditions, that is, when their timing was judged in isolation. Reduced perceptual precision of an event is expected to come with higher variances in temporal judgments (see, e.g., Ernst, 2006).^{Footnote 1}

If the different actions and effects are indeed perceived with varying certainty, as intended, the multisensory approach predicts a trade-off between action and effect binding for such a situation. Specifically, when certainty about the timing of the action decreases and certainty about the timing of the effect increases, the temporal perception of action should be biased strongly towards the effect and the perception of the effect should be less biased toward the action. That is, stronger action binding and weaker effect binding are expected for the “uncertain action – certain effect” condition as compared to the “certain action – uncertain effect” condition. Note that the original motor approach and the mere causality approach predict no changes in binding for these critical conditions because action intention as well as the causal chain are constant. Alternatively, if the strength of the causal link is impacted by the current manipulation, they predict similar changes in action and effect binding. Thus, finding evidence for the trade-off between action and effect binding would strongly support the multisensory approach. For both experiments, the design and hypotheses as well as the data analysis plan were preregistered prior to data collection (Exp. 1: osf.io/vxn93; Exp. 2: osf.io/29j7p). All statistical analyses of directed hypotheses specified in these documents are reported as one-tailed tests. Raw data and analyses are available online (https://osf.io/spjqh/).

Experiment 1

Methods

Participants

We collected data from 30 participants at the University of Würzburg and reimbursed them with monetary compensation or partial course credit. The sample size grants a power of 1-β > .99 to detect the effect of tone certainty on action binding in Wolpe et al. (2013). Three participants were excluded (for reasons, see the Data preprocessing section). The remaining sample reported a mean age of 31.9 (±12.7) years; six self-identified as male and 21 as female, and one participant reported being left-handed.

Apparatus and stimuli

The experiment was programmed with Matlab Version 2016a and the Psychtoolbox plugin. Following the classic temporal binding paradigm, we assessed the subjective timing of actions and following auditory effects. That is, in operant blocks participants performed actions and thereby generated auditory effects, whereas they performed key-presses without auditory effects and encountered isolated auditory stimuli in baseline blocks. Temporal binding should be evident in later estimates of the key-press in operant blocks as compared to baseline blocks (action binding) and in earlier estimates of the auditory stimulus in operant blocks as compared to baseline blocks (effect binding).

Actions were performed with the left index finger either via key-press on a keyboard (certain action) or on a force sensor fixed on the table (uncertain action). The keyboard was a standard computer keyboard and thus came with clearly defined onsets and offsets for each key-press (3.5-mm travel distance to bottom out key). Presses on the force sensor were accepted if pressure remained within a predefined force range for 50 ms. Participants were asked not to lift their finger between presses. How to perform a successful action via force sensor was explained and briefly trained before the experiment. Effect sounds were played via headphones and were either a 200-ms, 600-Hz beep (certain tone) or 827 ms of white noise that slowly rose and fell (uncertain tone; see Fig. 1A).

During every block, participants saw a Libet clock with ticks at every quarter hour on which they were instructed to estimate the timing of either actions or auditory events. The clock hand began to rotate at the beginning of the trial (taking a full turn every 2 s) and continued to do so for 1.2–1.5 s after the event in question had occurred. Then it stopped moving and jumped to a random position on the clock. Participants were then asked to judge the timing of one element of the trial by moving the clock hand with the arrow keys on the keyboard to the position it had been in at the time of the event, using their right hand. The Libet clock and all written instructions were presented on a 24-in. monitor with a refresh rate of 60 Hz.
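The clock parameters given above imply a fixed mapping between hand position and time. The sketch below works through that arithmetic; the helper names are hypothetical, and the per-frame step assumes that the hand position is redrawn on every monitor refresh, which the text does not state explicitly.

```python
ROTATION_PERIOD_MS = 2000.0   # one full turn of the clock hand every 2 s
REFRESH_RATE_HZ = 60.0        # monitor refresh rate


def angle_to_ms(angle_deg):
    """Convert a clock-hand angle (degrees) to time within one rotation (ms)."""
    return (angle_deg % 360.0) / 360.0 * ROTATION_PERIOD_MS


def ms_to_angle(t_ms):
    """Convert a time within the trial (ms) to the clock-hand angle (degrees)."""
    return (t_ms % ROTATION_PERIOD_MS) / ROTATION_PERIOD_MS * 360.0


# Assumption: the hand is redrawn once per frame.
frames_per_rotation = ROTATION_PERIOD_MS / 1000.0 * REFRESH_RATE_HZ
degrees_per_frame = 360.0 / frames_per_rotation
ms_per_frame = 1000.0 / REFRESH_RATE_HZ

print(angle_to_ms(90.0))      # time spanned by one quarter-hour tick
print(degrees_per_frame, ms_per_frame)
```

One quarter-hour tick therefore corresponds to 500 ms, and under the per-frame assumption the hand advances by 3 degrees, or roughly 16.7 ms, per displayed frame.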

Design

We implemented two kinds of actions and two kinds of effects that differed in how precisely their timing could be perceived (see Fig. 1A). As is standard in temporal binding experiments, both actions and both effects were once probed in isolation to generate a baseline measure of temporal judgments. Additionally, these actions and effects were combined in three operant conditions (see Fig. 1B).

The “certain action – certain effect” (c-c) condition served as a control condition by replicating typical setups in the literature. Here, a key-press on the keyboard triggered a 200-ms beep tone with a constant delay of 500 ms. In the “uncertain action – certain effect” (u-c) condition, a force sensor press triggered a 200-ms beep tone at a constant delay of 500 ms, whereas in the “certain action – uncertain effect” (c-u) condition, a key-press on the keyboard triggered 827 ms of white noise with a slow rise and fall. The white noise began to rise after a 173-ms delay. All three operant conditions were either presented as action blocks, that is, participants only had to judge the timing of the action in this block, or as effect blocks, in which they only had to judge the timing of the effect. In effect baseline blocks, tones were presented at a random interval of 2–3 s after trial start. In all other blocks, participants were asked to wait at least 1 s before performing their action. Overall, the experiment had ten block types: four baseline blocks, three operant action blocks, and three operant effect blocks. Each block was once presented in a practice phase, which was not entered into data analysis. During the main experiment, every block was presented three times with 15 trials each in an unconstrained randomized order.

Data preprocessing

We excluded trials in which participants did not wait for at least a full turn before initiating their actions (4.4%), did not move the Libet clock hand during judgment (2.2% of all trials), and trials in which the temporal judgments deviated by more than 2.5 standard deviations (SDs) from the participant’s cell mean (2.1%). Additionally, three participants were excluded: one consistently failed to move the clock hand, one had too many errors (failure to respect the inter-trial interval), and one had too high a variance in their judgments (2.5 SDs above the mean of the full sample). These participant exclusions were not preregistered, but we deemed them preferable to avoid a biased assessment of the results. A re-analysis of the whole sample, including these participants, is available in Appendix A. Furthermore, we computed the estimation error for each block type (judged time – actual time). If the judgment was in the clock half after the actual timing, we assumed a shift forward in time, and if it was in the clock half before the actual timing, we assumed a shift backward in time.^{Footnote 2}
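The half-clock rule for the estimation error and the 2.5-SD trial criterion can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' preprocessing script: the function names are hypothetical, the 2,000-ms period comes from the clock description, the outlier screen is assumed to be a single pass within one participant-by-condition cell, and the example errors are invented.

```python
from statistics import mean, stdev

ROTATION_PERIOD_MS = 2000.0   # one full clock rotation


def estimation_error_ms(judged_ms, actual_ms, period_ms=ROTATION_PERIOD_MS):
    """Signed judgment error (judged - actual), wrapped to +/- half a rotation.

    Judgments in the clock half after the event give positive errors
    (shift forward); judgments in the half before it give negative errors.
    """
    half = period_ms / 2.0
    return (judged_ms - actual_ms + half) % period_ms - half


def drop_outliers(errors, criterion_sd=2.5):
    """Keep trials within criterion_sd SDs of the cell mean (single pass)."""
    m, s = mean(errors), stdev(errors)
    return [e for e in errors if abs(e - m) <= criterion_sd * s]


# A judgment just after the 12 o'clock position for an event just before it
# counts as a forward shift, not as a backward shift of almost a full turn.
print(estimation_error_ms(100.0, 1900.0))

# Hypothetical errors (ms) for one cell of 15 trials, with one extreme value.
cell = [10, -20, 30, 0, 15, -10, 5, 20, -5, 25, -15, 10, 0, 5, 900]
kept = drop_outliers(cell)
print(len(cell), len(kept))
```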

Results

Manipulation check

As a manipulation check, we computed the variance of the estimation errors in baseline blocks, which is assumed to be the inverse of the respective events’ certainty. Baseline blocks of uncertain actions as well as baseline blocks of uncertain effects should thus come with higher variances than certain baseline blocks (see Fig. 2A and Table 1). Indeed, one-tailed paired t-tests showed higher variances in uncertain than in certain baseline blocks for actions, t(26) = 4.67, p < .001, d = 0.90, whereas the differences of variance for effects conformed to our hypothesis numerically, but did not reach significance, t(26) = 1.70, p = .051, d = 0.33.
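A paired comparison of this kind can be sketched with the Python standard library. The per-participant variances below are invented, and `paired_t` is a hypothetical helper that returns the t statistic, the degrees of freedom, and the standardized mean difference of the paired differences. Obtaining the one-tailed p-value would additionally require the t distribution from a statistics package. As a plausibility check, the effect sizes reported above are numerically consistent with d = t / √n for n = 27.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y):
    """Paired t-test statistic for x - y.

    Returns (t, df, d_z), where d_z is the mean difference divided by the
    standard deviation of the differences, so that t = d_z * sqrt(n).
    """
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    d_z = mean(diffs) / stdev(diffs)
    return d_z * sqrt(n), n - 1, d_z


# Hypothetical per-participant variances of baseline estimation errors (ms^2).
var_uncertain = [900.0, 1200.0, 800.0, 1500.0]
var_certain = [400.0, 700.0, 600.0, 500.0]
t, df, d_z = paired_t(var_uncertain, var_certain)
print(t, df, d_z)

# Consistency check against the reported statistics (n = 27 participants).
print(round(4.67 / sqrt(27), 2), round(1.70 / sqrt(27), 2))
```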

Table 1 Mean binding values (ms) (operant – baseline judgment errors) for Experiment 1

Main analysis

For the main analysis, we contrasted the judgment error in operant blocks with the judgment error in baseline blocks with paired t-tests (one-tailed) to test for the existence of temporal binding. There was significant action and effect binding for all conditions, as shown in Table 1.

Action and effect binding were computed for each type of operant block by subtracting the respective baseline judgment error. Bigger action binding is thus shown by more positive values, whereas effect binding is shown by more negative values. The binding values were entered into a repeated-measures analysis of variance (ANOVA) with the factor certainty-relation (c-u vs. c-c vs. u-c) separately for action judgments and effect judgments. Sphericity could not be assumed for either, and reported p-values are based on Greenhouse-Geisser corrected degrees of freedom. Differences between conditions were tested by planned contrasts (see Fig. 2). The ANOVA for action binding showed a significant impact of certainty-relation, F(2,52) = 4.25, p = .044, η_p² = 0.14, ε = 0.56. Action binding was strongest when the action was uncertain and the effect certain (u-c), and was significantly smaller when the certainty relation was reversed (Action_u-c vs. Action_c-u), t(26) = 2.18, p = .039, d = 0.42 (two-tailed), but also when only the action certainty increased (Action_u-c vs. Action_c-c), t(26) = 1.98, p = .029, d = 0.38 (one-tailed). The ANOVA for effect binding did not show a significant impact of certainty relation, F(2,52) = 1.01, p = .357, ε = 0.79, and neither did the planned contrasts (all |t|s < 1.13, all ps > .135).
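The binding scores entered into these analyses are simple difference scores. A minimal sketch for one participant follows, with invented judgment errors and a hypothetical helper name:

```python
from statistics import mean


def binding_ms(operant_errors, baseline_errors):
    """Binding score: mean operant judgment error minus mean baseline error."""
    return mean(operant_errors) - mean(baseline_errors)


# Hypothetical judgment errors (ms) for one participant.
action_binding = binding_ms(operant_errors=[40, 60, 50], baseline_errors=[10, 0, 20])
effect_binding = binding_ms(operant_errors=[-80, -60, -70], baseline_errors=[-10, 0, 10])

# Positive action binding: the action is judged later than at baseline.
# Negative effect binding: the effect is judged earlier than at baseline.
print(action_binding, effect_binding)
```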

Follow-up analyses

Based on the marked differences in variances, especially in action blocks, we followed up on the above pre-registered analyses and performed a non-parametric confirmation of the main analysis. That is, we compared action and effect binding in c-u and u-c blocks in a two-tailed paired Wilcoxon signed-rank test (action binding: Z = -1.87, p = .061; effect binding: Z = -1.35, p = .178), which did not reach significance.

Nevertheless, the observed pattern of results corroborates the predicted trade-off between action and effect binding, and it may therefore not be appropriate to analyze the two binding scores only in separation. Following this finding, we computed the sum of action and effect binding in all three conditions as a measure of an action-effect binding trade-off (with action binding coming with a positive sign and effect binding coming with a negative sign). If the trade-off account is true, this action-effect sum should be smallest in the “certain action – uncertain effect” (c-u) condition, because the absolute value of action binding is small relative to effect binding, while it should be biggest in the “uncertain action – certain effect” (u-c) condition. On the other hand, if the manipulation influenced action and effect binding in a similar way, as would be predicted by motor or causality accounts, the action-effect sum should not change between conditions. Two-tailed paired t-tests show that the sum of both binding scores was bigger in the u-c than in the c-u condition, t(26) = 2.75, p = .011, d = 0.53 (c-u vs. c-c: t(26) = 1.26, p = .221; c-c vs. u-c: t(26) = 2.23, p = .034, d = 0.43), supporting a trade-off account.
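The sign logic of the action-effect sum can be illustrated with invented binding scores for three participants. The numbers are hypothetical and only show how the sum orders the conditions when a trade-off is present; they are not values from Table 1.

```python
from statistics import mean

# Hypothetical (action binding, effect binding) pairs in ms, one pair per
# participant. Action binding is positive, effect binding is negative.
binding = {
    "c-u": [(15, -90), (20, -85), (10, -95)],
    "c-c": [(25, -80), (30, -75), (20, -85)],
    "u-c": [(60, -70), (55, -65), (65, -75)],
}

# Per-participant sum of the signed scores in each condition.
sums = {cond: [a + e for a, e in pairs] for cond, pairs in binding.items()}
mean_sums = {cond: mean(values) for cond, values in sums.items()}
print(mean_sums)

# Per-participant differences that a paired test on u-c vs. c-u would use.
diff_uc_vs_cu = [u - c for u, c in zip(sums["u-c"], sums["c-u"])]
print(diff_uc_vs_cu)
```

With these invented values the sum is smallest (most negative) in the c-u condition and largest in the u-c condition, which is the ordering the trade-off account predicts. If both binding scores changed by the same amount in the same direction of attraction, the sum would stay constant across conditions.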

Discussion

Significant action and effect binding was present in all conditions of Experiment 1. Furthermore, the relationship between action and effect binding strikingly resembled the trade-off predicted by the multisensory approach. Action binding was stronger, and effect binding descriptively weaker, when the action was comparatively difficult and the effect rather easy to pinpoint in time (i.e., in the u-c condition) than when the certainty-relation was reversed (i.e., in the c-u condition). Moreover, the results suggest an effect of action certainty on action binding independently of effect certainty, as actions were bound more strongly to the same effect when they were uncertain than when they were certain.

On the other hand, effect bindings did not differ significantly between conditions, and variances between certain and uncertain effects were not significantly affected by the manipulation either. In addition, participants judged the timing of the uncertain tone very close to its onset, rather than its peak (see Fig. 2 for an illustration of the problem). These observations might indicate an inapt effect manipulation. We thus conducted a second experiment, where we retained our action manipulation, but replaced the effect manipulation with one that was modelled more closely on the manipulation applied in previous work (Wolpe et al., 2013).