Although we often bob our heads and tap our feet when listening to music, the reason for this behavior remains elusive. One possible explanation is that this movement might improve our perception of the music via interactions between the sensory and motor systems. This proposed explanation is consistent with observations that event perception and action planning capitalize on the same internal mechanism (Prinz, 1997). It is similarly consistent with the unified theories of event processing that describe perception and action planning within a single framework (Hommel, Müsseler, Aschersleben, & Prinz, 2001)—a framework crucial for the production and perception of complex auditory information such as music.

Neurological research provides evidence for interactions between the auditory and motor systems (Fujioka, Trainor, Large, & Ross, 2012; Grahn & McAuley, 2009), as well as the sensory and motor regions (D’Ausilio, Altenmüller, Belardinelli, & Lotze, 2006; Grahn & Brett, 2007; Grahn & Rowe, 2009). These findings support the idea of a cross-talk between perceptual systems (Goodale & Westwood, 2004). Sensorimotor integration research offers insight into the complex relationship between perception and action, and music represents a rich window for this exploration (Zatorre, Chen, & Penhune, 2007).

Not only is movement central to music production, it also occurs naturally and automatically during music listening among both musicians and nonmusicians. Although the ability to synchronize movements to an auditory beat does not require training or practice, musical experience is helpful in perceiving (Ehrlé & Samson, 2005; Jones & Yee, 1997; Madison & Merker, 2002; Yee, Holleran, & Jones, 1994) and producing (Repp, 2010; Repp & Doggett, 2007) isochronous sequences. Despite our tolerance for timing irregularities in musical sequences (Madison & Merker, 2002), changes in isochronous sequences are quite salient. Here, we complement previous work examining the perceptual effects of music listening on body movement by exploring movement’s effect on our ability to detect timing changes in isochronous sequences.

Subjective measures of movement and auditory perception

Movement’s effect on listeners’ perception of music raises questions about its effects on those performing it. Studies investigating whether movement alters one’s own perception have typically used subjective tasks. For example, body movement can affect the perception of metric structure—a subjective “grouping” of beats. Participants moving on every second or third beat while listening to an ambiguous auditory rhythm later report that the motion-consistent meter sounds more familiar (Phillips-Silver & Trainor, 2007, 2008). Extensive musical exposure is not a requirement, since infants bounced in this manner exhibit similar effects (Phillips-Silver & Trainor, 2005). Vestibular stimulation is crucial, since artificial vestibular input independent of physical head movement is sufficient to trigger the phenomenon (Trainor, Gao, Lei, Lehtovaara, & Harris, 2009).

Moving to an auditory sequence also facilitates subjective pulse extraction (i.e., identifying “the beat”), particularly for tempi within comfortable movement frequencies. Movement while listening also influences the amount of synchrony between this extracted pulse and the auditory sequence (Su & Pöppel, 2011). Although this indicates that movement can facilitate pulse extraction, its effect on perceived timing remains an open question. This relationship can be (and frequently has been) explored through tapping, a paradigm useful for studying synchronization as well as timing acuity (see Repp, 2005, for an extensive review).

In contrast to tapping itself, tapping’s effect on perception is less well researched. Nonetheless, Repp (2002) explored the role of musical context in sensorimotor synchronization and in timing perception. After hearing a piano solo with either consistent timing or small perturbations, trained musicians listened to a separate rhythmic sequence in one of three conditions: perception only, where they identified perturbations; synchronization only, where they tapped along to the sequence; and perception and synchronization, where they both tapped along and identified perturbations. Detection of perturbations in the perception-only and the perception-and-synchronization conditions was significantly impaired by timing perturbations, yet performance in the synchronization-only condition was not affected. Here, perception was more sensitive to preceding context than was movement (tapping). Since it was beyond the study’s scope, Repp (2002) did not explicitly address whether synchronization influenced the ability to detect deviations within the rhythm.

Present study

Here, we build upon previous work by introducing an objective task that does not require musical training to explore whether “moving to the beat” affects timing perception. While complementary explorations of action’s effect on perception use subjective ratings of pitch direction (Repp & Goehrke, 2011; Repp & Knoblich, 2007, 2009), metric structure (Phillips-Silver & Trainor, 2005, 2007, 2008), and beat extraction (Su & Pöppel, 2011), here we use an objective task focused on timing. In order to obtain results that generalize broadly, we did not select participants based on musical training (as done in related research; Krause, Pollok, & Schnitzler, 2010; Repp, 2002; Repp & Knoblich, 2007, 2009).

We hypothesized that participants would more accurately discriminate timing deviations when moving (i.e., tapping) along with a sequence than when listening alone. This is based on the assumption that tapping initiates an additional timing mechanism activated by motor networks. A consistency between motor and auditory timing loops may create a stronger reference signal with which to compare the target sounds. This finding would provide insight useful for musicians by demonstrating that body movement can improve their perception of timing. In addition, it may shed light on one of the reasons listeners often move automatically to the beat: that it aids in our ability to “understand” music’s temporal structure, thereby contributing to our knowledge of links between perception and action.

Experiment 1

Method

Materials and apparatus

We conducted the experiment using customized software developed by the MAPLE Lab playing MIDI “woodblock” sounds (gmBank = 115) through Sennheiser HDA200 headphones. An Alesis Trigger i/O–Trigger-to-MIDI USB Interface converted signals from an electronic drum pad (Roland PDX-8) into MIDI messages sent to an iMac computer (see Footnote 1). Each trial consisted of 16 tones divided into groups of 4 (i.e., four measures with four beats each), followed by a probe tone (see Fig. 1). The first tone of each group used a higher relative pitch (C5; 523 Hz) than did the others (G4; 392 Hz) to induce a sense of meter (grouping the tones may help participants follow the sequence and guide attention toward the probe tone, which falls on a downbeat). In the last group, the second, third, and fourth “tones” were silent (see Footnote 2). On half of the trials, the probe tone was consistent with the pattern, and on the other half, it was inconsistent. We used two different interonset intervals (IOIs): 400 ms (150 beats per minute [bpm]) and 600 ms (100 bpm), both falling within an ideal range for perception and production (Drake & Botte, 1993).
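The trial structure described above can be illustrated with a short sketch. This is not the MAPLE Lab software; the function names and event labels are hypothetical, and the only values taken from the text are the two IOIs, the 16-beat layout with three silent beats, and the probe on the following downbeat.

```python
def bpm_from_ioi(ioi_ms):
    """Convert an interonset interval in milliseconds to beats per minute."""
    return 60_000 / ioi_ms


def trial_onsets(ioi_ms, offset_fraction=0.0):
    """Return (onset_ms, label) pairs for one trial.

    Beats 0-12 are sounded (every fourth beat accented), beats 13-15 are
    silent, and the probe falls on the downbeat after the silence, shifted
    by offset_fraction * ioi_ms (negative = early, positive = late).
    """
    events = []
    for beat in range(16):
        if beat >= 13:
            label = "silent"           # timekeeping segment
        elif beat % 4 == 0:
            label = "accented (C5)"    # first tone of each group
        else:
            label = "unaccented (G4)"
        events.append((beat * ioi_ms, label))
    probe_onset = 16 * ioi_ms + offset_fraction * ioi_ms
    events.append((probe_onset, "probe"))
    return events


if __name__ == "__main__":
    for ioi in (400, 600):
        print(ioi, "ms IOI =", bpm_from_ioi(ioi), "bpm")
    # A probe that is 15 % of the IOI late at the 400-ms tempo
    print(trial_onsets(400, 0.15)[-1])
```

Under these assumptions, a 15 % offset corresponds to 60 ms at the 400-ms IOI and 90 ms at the 600-ms IOI.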

Fig. 1

Trial structure with initialization, synchronization, and timekeeping segments labeled. Filled circles represent the accented tones, squares unaccented tones, lines silent beats, and empty circles possible probe tone locations. The gray boxes beneath the segment labels summarize the movement trial tapping instructions for each experiment, with filled boxes indicating tapping and empty boxes indicating beats without motion

Design and procedure

Participants completed 64 trials grouped into eight blocks. We asked participants to tap along to half of the blocks (movement condition) and to remain still during the other blocks (no-movement condition). Four of the 8 trials within each block included an “on-time” (i.e., at an offset of 0 ms) probe tone, with the others at one of four offsets: either 30 % or 15 % of the IOI (both early and late). Participants experienced each of the four IOI (400 ms, 600 ms) × movement condition (movement, no movement) combinations twice after completing 5 warm-up trials. We randomized both the order of the experimental blocks and the order of the trials within each block for each participant.
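As an illustration of this design, the following sketch builds one participant's session of 8 blocks by 8 trials. It assumes that each of the four nonzero offsets occurs exactly once per block, which is consistent with the two repetitions per IOI × offset × movement cell reported later; the names `build_session`, `IOIS_MS`, and `OFFSETS` are hypothetical, and warm-up trials are omitted.

```python
import random

IOIS_MS = (400, 600)
MOVEMENT = ("movement", "no-movement")
OFFSETS = (-0.30, -0.15, 0.15, 0.30)  # fraction of the IOI; negative = early


def build_session(seed=None):
    """Return a randomized list of 8 blocks, each holding 8 probe offsets."""
    rng = random.Random(seed)
    blocks = []
    for ioi in IOIS_MS:
        for condition in MOVEMENT:
            for _ in range(2):  # each IOI x movement combination occurs twice
                # Four on-time probes plus one probe at each nonzero offset
                offsets = [0.0] * 4 + list(OFFSETS)
                rng.shuffle(offsets)  # randomize trial order within the block
                blocks.append(
                    {"ioi_ms": ioi, "condition": condition, "offsets": offsets}
                )
    rng.shuffle(blocks)  # randomize block order for this participant
    return blocks


if __name__ == "__main__":
    for block in build_session(seed=1):
        print(block["ioi_ms"], block["condition"], block["offsets"])
```

Passing a different seed per participant gives each one an independent block and trial order.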

During movement blocks, participants tapped on each beat of the stimulus (all three segments, including the probe tone) on a drum pad using an Innovative Percussion (IP-1) drumstick. We asked participants to remain as still as possible during the no-movement blocks (i.e., no foot-tapping, head-bobbing, etc.). Using a two-alternative forced choice task, participants judged whether the final probe tone on each trial sounded “on-time” (i.e., consistent with the repeated sequence) and indicated their confidence on a scale from 1 (not at all confident) through 5 (very confident). To help retain attention, participants received feedback on the correctness of these judgments.

Participants

Forty-eight undergraduates from the McMaster University Psychology participant pool participated in exchange for course credit. We excluded 8 participants who failed to follow instructions (and therefore did not accomplish the task). This group included those who tapped during more than 25 % of the no-movement trials (n = 2), failed to tap during more than 25 % of the movement trials (n = 2), or failed to tap on at least 50 % of the beats within the timekeeping segment (n = 4). The remaining 40 participants (28 females, 12 males) ranged in age from 17 to 35 years (M = 18.4, SD = 2.8) and reported normal hearing and normal or corrected-to-normal vision. Participants had 0–12 years of music lessons (M = 4.4, SD = 3.5) and tapped with their dominant hand. The experiment met ethics standards according to the McMaster University Research Ethics Board.

Results and discussion

We assessed the percentage of “on-time” responses in all conditions using a 2 (IOI) × 5 (offset) × 2 (movement condition) repeated measures ANOVA. The most important finding was the main effect of movement, F(1, 39) = 33.80, p < .0001, reflecting a difference in task performance between the movement and no-movement conditions (shown in Fig. 2a). We also observed a main effect of offset, F(4, 156) = 43.35, p < .0001, reflecting a distinction in performance based on the probe tone offset. Additionally, we observed a two-way interaction between movement condition and offset, F(4, 156) = 12.65, p < .0001, indicating that the effect of movement was not uniform across the five probe tone offsets. We found a large difference in task performance (proportion of correct identification of the probe tone position derived from the “on-time” judgments) between the movement (M = .75, SD = .21) and no-movement (M = .43, SD = .24) conditions for the late offsets (15 % and 30 %) combined, t(78) = 7.43, p < .0001, d = 1.46, indicating better performance when tapping (see Fig. 2a). In the movement condition, late probe tones were easier to detect than early ones, t(317) = 4.09, p < .0001. This is consistent with similar research demonstrating that late deviations are slightly easier to detect than early deviations (Large & Jones, 1999; McAuley, 1995), as well as previously noted asymmetries in timekeeping with a changing tempo (Loehr, Large, & Palmer, 2011). We observed a two-way interaction between IOI and offset, F(4, 156) = 9.05, p < .0001, indicating that the pattern of performance across probe tone offsets differed between the two IOIs. However, there was no main effect of IOI or two-way interaction between IOI and movement condition, so we collapsed across IOI in Fig. 2 for the sake of clarity.
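The step from “on-time” judgments to proportion correct can be sketched as follows. The scoring rule assumed here is that an “on-time” response is correct only when the offset is 0, and a “not on-time” response is correct only when the offset is nonzero; the function names and the example responses are hypothetical, not data from the experiment.

```python
def proportion_correct(trials):
    """Score (offset_fraction, said_on_time) pairs and return the hit rate.

    A response is correct when "on-time" is reported for a 0 offset or
    "not on-time" is reported for any early or late offset.
    """
    trials = list(trials)
    correct = sum((offset == 0) == said_on_time for offset, said_on_time in trials)
    return correct / len(trials)


def late_trials(trials):
    """Keep only trials whose probe tone was late (positive offset)."""
    return [(offset, resp) for offset, resp in trials if offset > 0]


if __name__ == "__main__":
    # Hypothetical responses: (offset as fraction of IOI, judged "on-time"?)
    responses = [(0.0, True), (0.15, False), (0.30, True), (-0.15, False)]
    print(proportion_correct(responses))               # 0.75
    print(proportion_correct(late_trials(responses)))  # 0.5
```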

Fig. 2

Proportion of “on-time” responses for five offset conditions. Participants perform significantly better on the movement trials for the “late” (15 % and 30 % offset) conditions when moving during the timekeeping segment (Experiments 1 and 3). However, movement had no effect in Experiment 2, when there was no movement throughout the timekeeping segment. Error bars represent the 95 % confidence intervals

We recorded the timing of each tap and analyzed tapping variability and its relationship with task performance. We calculated the standard deviation of the timing of taps within each movement trial during the synchronization and timekeeping segments to obtain a measure of tapping variability and compared the log of this variability with performance during these movement trials (squared to normalize the distribution). We found a negative correlation between tapping variability and the proportion of correct responses, r = −.355, p = .025, indicating that “better” tappers performed better on the detection task overall, consistent with studies reporting this type of relationship between movement timing and timing perception (Pashler, 2001; Repp, 1999). Our design did not lend itself well to more sophisticated analyses of this relationship, since the large number of IOI × offset × movement conditions allowed for only two repetitions of each trial. In the future we will further explore this issue using variations on this design with greater numbers of trial repetitions (permitting analyses that treat performance as a continuous variable).
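A rough sketch of this analysis is given below. The text does not specify exactly how the standard deviation of tap timing was computed, so the sketch assumes the standard deviation of inter-tap intervals within a trial; the function names and the sample values are hypothetical and serve only to show the log and square transformations followed by a Pearson correlation.

```python
import math
import statistics


def tapping_variability(tap_times_ms):
    """SD of inter-tap intervals in one trial (an assumed operationalization)."""
    intervals = [b - a for a, b in zip(tap_times_ms, tap_times_ms[1:])]
    return statistics.stdev(intervals)


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def variability_performance_r(variabilities, proportions_correct):
    """Correlate log tapping variability with squared proportion correct."""
    log_variability = [math.log(v) for v in variabilities]
    squared_performance = [p ** 2 for p in proportions_correct]
    return pearson_r(log_variability, squared_performance)


if __name__ == "__main__":
    # Hypothetical per-participant values, not data from the experiment
    variability_ms = [12.0, 18.0, 25.0, 40.0, 55.0]
    performance = [0.85, 0.80, 0.70, 0.65, 0.55]
    print(variability_performance_r(variability_ms, performance))
```

A negative coefficient from this function corresponds to the pattern reported above, in which less variable tappers detect deviations more accurately.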

Additionally, we found a negative correlation between years of musical training and tapping variability, r = −.366, p = .020, indicating that participants with more musical training tapped more consistently. This parallels research demonstrating a relationship between synchronizing accuracy and musical training (Krause et al., 2010; Repp, 2010). Interestingly, there was no correlation between years of musical training and performance on the task, r = −.048, p = .769, contrary to other work reporting associations between musical training and the detection of timing changes (Ehrlé & Samson, 2005) and sequence regularity (Madison & Merker, 2002). However, future studies can investigate this relationship further by selecting musicians and nonmusicians for explicit comparison.

We assessed confidence ratings using a 2 (IOI) × 5 (offset) × 2 (movement condition) repeated measures ANOVA. This revealed a main effect of movement, F(1, 39) = 63.54, p < .0001, demonstrating greater confidence in the movement (M = 4.17, SD = 0.50) versus the no-movement (M = 3.90, SD = 0.54) condition. This increased confidence (which did not correspond to performance improvements at all offsets) may offer additional evidence as to why listeners often move to the beat while listening.

The difference between movement and no-movement trials in Experiment 1 raises an important question: Is the effect attributable to (1) movement itself or (2) an alternative strategy afforded by movement (comparing the timing of the final tap with that of the probe tone)? To distinguish between these possibilities, we conducted two additional experiments.

Experiment 2

Experiment 2 explored the importance of moving while timekeeping by eliminating movement during the three silent beats of the timekeeping segment (but retaining it for the final beat). If the effect of movement in the first experiment stemmed solely from calculating the difference in timing between the final tap and the probe tone (rather than from improvements in timekeeping during the silent segment), we would expect to once again see superior performance in the movement versus no-movement condition.

Method

This experiment was identical to the first, except that here we asked participants to tap only on the sounded beats (i.e., the initialization and synchronization segments, as well as the probe tone), excluding the three silent “beats” in the timekeeping segment (see Fig. 1). Participants included 49 undergraduate students, and we excluded 2 participants who tapped during more than 50 % of beats within the timekeeping segment. The remaining 47 (34 females, 13 males) ranged in age from 18 to 24 years (M = 18.8, SD = 1.2), reported normal hearing and normal or corrected-to-normal vision, and tapped with their dominant hand. Musical training ranged from 0 to 17 years of lessons (M = 5.5, SD = 4.7).

Results and discussion

The most important finding was that, in contrast to Experiment 1, we did not observe a main effect of movement, F(1, 46) = 1.22, p = .275, indicating that tapping had no effect on performance when participants did not move during the timekeeping segment (see Fig. 2b). As in Experiment 1, we found a main effect of offset, F(4, 184) = 16.32, p < .0001, and a two-way interaction between IOI and offset, F(4, 184) = 17.19, p < .0001, reflecting a difference in performance based on IOI as a function of the probe tone offset. We did not find a main effect of movement on confidence ratings, F(1, 46) = 0.18, p = .678, indicating that participants were no more confident in their responses when moving (M = 3.93, SD = 0.58) than when listening alone (M = 3.90, SD = 0.58). These results are inconsistent with the explanation that the effect of movement in Experiment 1 originated only from participants comparing the timing of their final tap with the position of the probe tone. Instead, they suggest that the effect of movement is the result of improved timekeeping during the silent (timekeeping) segment, an idea we tested explicitly in the final experiment.

Experiment 3

Method

We tested the role of movement while timekeeping in a different manner by asking participants to tap on all beats (including those throughout the timekeeping segment), with the exception of the probe tone. Here, it was not possible to compare the position of the probe tone with that of the final tap, and therefore any effect of movement can be attributed to movement during the timekeeping segment. Using the same criteria as in Experiment 1, we excluded 8 participants. The remaining 40 (29 females, 10 males, 1 transgender) ranged in age from 17 to 24 years (M = 19.4, SD = 1.6) and reported normal hearing and normal or corrected-to-normal vision. Participants had 0–15 years of musical training (M = 5.7, SD = 4.3) and tapped with their dominant hand.

Results and discussion

Similar to the first experiment, the most important finding was a main effect of movement on task performance, F(1, 39) = 11.03, p = .002, indicating a difference in performance between the movement and no-movement trials. Again, we observed a main effect of offset, F(4, 156) = 39.27, p < .0001, and a two-way interaction between movement and offset, F(4, 156) = 10.92, p < .0001. As in Experiment 1, performance on the movement trials (M = .66, SD = .22) was significantly better than performance on the no-movement trials (M = .41, SD = .21) when the probe tone occurred later than expected, t(78) = 5.20, p < .0001, d = 1.16, even without movement on the final beat. These findings complement Experiment 2 by suggesting that the benefits of movement cannot be explained solely through a strategy of comparing the timing of the probe tone with the timing of participants’ final tap. Nonetheless, the effect of movement was slightly smaller than in Experiment 1 (see Fig. 2c), perhaps stemming from less movement during the final beats of these trials (previously, participants continued timekeeping through movement until the final beat, whereas here this was not required beyond the penultimate beat). Here, we also found a main effect of IOI, F(1, 39) = 9.44, p = .004, not previously observed, as well as an interaction between IOI and offset, F(4, 156) = 10.04, p < .0001. We found a main effect of movement, overall, on confidence ratings, F(1, 39) = 8.35, p = .006, indicating that participants were more confident in their responses when moving (M = 4.12, SD = .48) than when listening alone (M = 3.98, SD = .48). Overall, these findings further support the notion that moving to the beat improves timekeeping abilities and increases confidence in them.
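As a worked check on the effect size for late offsets, the sketch below computes Cohen's d from the reported means and standard deviations. It assumes the equal-n pooled standard deviation formula, which the text does not state explicitly; because the inputs are rounded summary values, a recomputed d can differ slightly from a value calculated on the raw data. The function name is hypothetical.

```python
import math


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d using the pooled SD for two equally sized samples."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


if __name__ == "__main__":
    # Experiment 3, late offsets: movement vs. no-movement trials
    print(round(cohens_d(0.66, 0.22, 0.41, 0.21), 2))  # 1.16
```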

General discussion

The effect of movement in Experiments 1 and 3 demonstrates that moving to the beat can objectively improve a listener’s timing acuity, while Experiments 2 and 3 illustrate the importance of movement for timekeeping. Together, these results complement previous work demonstrating that body movement can influence the perception of subjective properties of temporal information such as metric structure (Phillips-Silver & Trainor, 2005, 2007, 2008) and pulse (Su & Pöppel, 2011). We note that although vestibular information is known to play a role in sensorimotor meter perception (Trainor et al., 2009), it does not appear to be a driving force in this timing paradigm. Additionally, we extend this literature by documenting that movement not only alters but also objectively improves timing perception, facilitating more accurate detection of timing deviations.

Consistent with previous work demonstrating better accuracy in detecting late versus early events/changes in timing (Large & Jones, 1999; McAuley, 1995), we observed an asymmetry in performance for late versus early probe tone offsets. We tested this formally using the data from Experiment 1 (as this is the experiment with the greatest contrast between the movement and no-movement conditions). Detection of late offsets was better than that of early offsets in the movement trials, t(317) = 4.09, p < .0001; however, it was worse in the no-movement trials, t(317) = 2.55, p = .011. Curiously, our results for the movement trials in Experiment 1 parallel results from earlier studies in tasks that do not involve movement. One explanation for this puzzling outcome is that here the deviations always occurred on the final tone after a silence, whereas many previous studies use deviations embedded within a sequence (Ehrlé & Samson, 2005; Jones & Yee, 1997; Keele, Nicoletti, Ivry, & Pokorny, 1989; Madison & Merker, 2002). Mid-sequence deviations change the width of two adjacent beats (shortening one and lengthening the other), whereas our manipulation of the final tone affects only one. Although further research is needed to explore whether this accounts for our different results, previous work illustrates that the context in which timing deviations occur can influence their salience (Repp, 2002).

We observed that movement during silence was critical within our paradigm and suspect that it may help maintain timing when auditory information is absent. This interpretation may account for why the asymmetry in our movement condition mirrors previous findings (which did not explicitly involve participant movement); those paradigms did not contain a silent segment requiring timekeeping. We note two possible explanations for this superior performance for late versus early probe tones in the movement condition. First, a narrowing of focus around anticipated events (Large & Jones, 1999) may increase attention as time progresses, because the to-be-expected event has not yet occurred (McAuley & Kidd, 1998). Alternatively, the perceptual asymmetry may stem from our familiarity with phrases slowing near completion to convey expression (Repp, 1998). Consequently, while musicians readily adapt synchronized movements to a changing tempo (Repp & Keller, 2004), they are better at coordinating with decreasing, rather than increasing, tempi (Loehr et al., 2011).

This asymmetry differs slightly from expectancy profiles reported in relative timing tasks (i.e., comparing the durations of two intervals), where correct detection of timing changes at the end of a sequence follows an inverted-U-shaped pattern (i.e., more accurate detection of “on-time,” as compared with early or late offsets) (Barnes & Jones, 2000; McAuley & Jones, 2003). Our findings suggest a different pattern of responses when judging the timing of a single beat with respect to a context sequence (as opposed to an interval), where moving enhances detection of late offsets but listening alone enhances detection of early offsets at the end of a sequence. Conveniently, the asymmetry is useful in clarifying that the benefits of movement are not solely attributable to increased attention/arousal in the movement versus no-movement condition. Because we randomized the order of trials within each block, there is no reason to believe attention systematically varied as a function of offset direction.

Together, our three experiments demonstrate that movement can objectively improve the perception of timing, a finding that contributes to the rapidly growing literature on perception–action links (Hommel et al., 2001; Prinz, 1997) and sensorimotor integration (Zatorre et al., 2007). Although our participants’ level of musical experience correlated with tapping variability, it did not correlate with the magnitude of the movement’s effect. This suggests that movement’s effect on perception is not dependent on training, consistent with previous work on cross-talk between the two systems (Goodale & Westwood, 2004). These data also suggest that isochronous movement may act as a mechanism for timekeeping, particularly during musical silences. Given the growing interest in rhythm and timing—fueled in part by the finding that nonhuman animals can also “move to the beat” (Patel, Iversen, Bregman, & Schulz, 2009)—we believe that these findings on the perception of timing are relevant to a wide community.

As a result of these experiments, we conclude that movement can objectively improve our sensitivity to timing, suggesting that one reason we “move to the beat” while listening to music is to help us understand its structure. We suspect that more consistent movement enables a greater improvement in performance, a question we plan to address in future studies. Together, these data contribute to our knowledge of action influencing (and improving) our perception of timing, in addition to informing our understanding of the perceptual abilities of music performers and listeners alike.