During the production of complex sequences (such as speech or music), the producer concurrently executes actions while perceiving the results of the actions as perceptual feedback. We here focus on the production of musical sequences on a piano keyboard and the associated role of auditory feedback. Many studies have shown that alterations of auditory feedback (AAF), such as delaying feedback onsets or altering feedback contents (pitch in music production), can disrupt sequence production (for reviews, see Pfordresher, 2006; Yates, 1963). One interpretation of such results is that AAF interferes with action planning, because perception and action may share an underlying cognitive representation for sequence structure (Pfordresher, 2006; cf. Hommel, Müsseler, Aschersleben, & Prinz, 2001; MacKay, 1987). However, others have pointed out that such effects may simply reflect disruptions of execution as opposed to planning (e.g., Howell, 2001). Thus, we here introduce a new paradigm in which to examine a possible link between auditory feedback and the planning of action sequences.

In the experiment reported here, we expanded on paradigms used previously to examine response–effect compatibility. Results from such paradigms suggest that the mapping of actions to sounds can influence the latency with which performers plan action sequences. For instance, Keller and Koch (2008) demonstrated that inconsistent perception/action mapping (similar to manipulation of feedback contents in AAF) increases the latencies associated with response preparation. If, as these results suggest, perceptual input influences action planning, the perception of a sequence of events in auditory feedback (referred to here as a feedback melody) may activate an associated learned action sequence that has been stored in memory.

We addressed this hypothesis in an experiment in which participants were required to switch between two previously memorized action sequences (melodies on a keyboard). On each trial, participants performed one melody repeatedly, and during the trial an auditory instruction cue (a single tone) signaled the participant either to continue performing the same melody (half of the trials) or to switch to the other melody (half of the trials). In addition, while participants performed, they heard as a feedback melody either (1) the melody that they performed at the start of the trial (termed performed feedback) or (2) a feedback melody with auditory events that matched the alternate melody (the melody to which they might switch; termed alternate feedback). We were interested in how auditory feedback influences participants’ abilities to switch between learned action sequences and to inhibit the tendency to switch, in response to the instruction cue. We focused on the response latencies associated with switching, as well as on pauses in production suggesting an inhibition of the switch response during continue trials. We examined nonpianists’ behavior because they represent the population majority who do not have well-learned mappings of sound to action for sequences of finger movements. By contrast, the results of pianists may reflect the effects of associations specific to training.

Method

Participants

Ten students (2 female) from the University at Buffalo participated in exchange for course credit. All but 1 participant reported being right-handed, and their mean age was 19 years (range 18–21). None of the participants were considered pianists; 9 reported no formal training or informal experience with playing the piano, and the remaining participant reported only 3 years of piano lessons.

Apparatus

Participants used an M-AUDIO Keystation 49e unweighted piano keyboard to produce melodies. The software program FTAP (Finney, 2001) was used to manipulate auditory feedback, acquire MIDI data, and control a Roland RD-700 digital piano that produced the auditory output. Participants heard auditory feedback and metronome pulses over Sony MDR-7500 professional headphones at a comfortable listening level. The piano timbre originated from Program 1 (Standard Concert Piano 1), and the metronome timbre from Program 126 (standard set, MIDI Key 56 = cowbell) of the RD-700. Two distinct percussive timbres (from Program 126) were used for instruction cues, each of which could function as a switch cue or a continue cue. One was a triangle sound (MIDI Key 81), and the other was a wood block (MIDI Key 76).

Materials

Four melodies were divided into two melody sets; each participant performed one set. All melodies comprised five pitch classes associated with white keys on the piano from C to G (in Octave 4), chosen because this construction allows fixed key–finger mappings. Melodies within each set were designed to be distinctive on the basis of the melodic contour, which could be smooth or alternating, and starting pitch, which could be C or G. Melodies in Set 1 included the sequences [C D E G F E D E] and [G E F D C E D F]; melodies in Set 2 included [C G E F G F D E] and [G F E D F E D C]. Melodies were designed to be played repeatedly throughout the trial without pauses between repetitions.

The melodies were displayed symbolically as a row of numbers beneath images of the right hand, with the relevant finger highlighted. For instance, the notation for the first melody in Set 1 was represented as [1 2 3 5 4 3 2 3], where 1 indicates the thumb and 5 indicates the pinky. On the keyboard, the numbers 1–5 were arranged in a row above the corresponding piano keys, with arrows pointing to each key. Nonpianists typically learn this notation system with little effort (see, e.g., Pfordresher, 2005).
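The finger notation follows directly from the fixed key–finger mapping. The short Python sketch below is an illustration only, not the software used in the study; the names `FINGER_FOR_PITCH`, `MELODY_SETS`, and `to_finger_notation` are hypothetical. It converts the pitch sequences listed above into finger numbers.

```python
# Illustrative sketch: fixed key-to-finger mapping for the white keys C-G
# (right hand, 1 = thumb, 5 = pinky). Names are hypothetical.
FINGER_FOR_PITCH = {"C": 1, "D": 2, "E": 3, "F": 4, "G": 5}

# Melody sets as listed in the Materials section.
MELODY_SETS = {
    1: (["C", "D", "E", "G", "F", "E", "D", "E"],
        ["G", "E", "F", "D", "C", "E", "D", "F"]),
    2: (["C", "G", "E", "F", "G", "F", "D", "E"],
        ["G", "F", "E", "D", "F", "E", "D", "C"]),
}


def to_finger_notation(melody):
    """Translate a list of pitch names into finger numbers 1-5."""
    return [FINGER_FOR_PITCH[pitch] for pitch in melody]


if __name__ == "__main__":
    for set_id, melodies in MELODY_SETS.items():
        for melody in melodies:
            print(set_id, melody, to_finger_notation(melody))
```

Applied to the first melody in Set 1, the function reproduces the notation [1 2 3 5 4 3 2 3] given in the text.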

Procedure

At the beginning of the session, participants were introduced to the music notation and practiced the two sequences of their melody set until both were memorized. The order of memorization was counterbalanced across all participants who experienced a given melody set, and participants’ memory for both sequences was tested after they had learned the second sequence. Participants were allowed to practice each action sequence until they thought that it was fully memorized (this typically took less than 2 min per melody), at which time the music notation was removed and the participants attempted to perform the sequence without the notation. The criterion for memorization was three successive error-free repetitions without the use of the notation. After the memorization phase, the notation was removed for the rest of the session.

Following the memorization phase, participants were introduced to the auditory instruction cues. The assignment of cue type to cue timbre was counterbalanced across participants. Participants were told to switch immediately upon hearing the switch cue, starting at the beginning of the alternate sequence, but to ignore the continue cue. The participants’ memory for each sequence was then tested again, and they went on to complete a practice trial, followed by two blocks of experimental trials.

Blocks of experimental trials were organized around which action sequence participants performed at the beginning of each trial (the order was counterbalanced across participants). For the first block, every trial started with one of the action sequences in the set, and every trial in the second block began with the other action sequence. Thus, within a block, one feedback melody from the set matched the action sequence and was called the performed feedback melody, whereas the other feedback melody from the set matched the melody that the participant might switch to (given the appropriate cue) and was called the alternate feedback melody.

Trials were structured as follows: At the beginning of a trial, four metronome events sounded (with a 500-ms period between the events) to establish the target tempo. Then the participants began playing the designated action sequence. In a randomly determined half of the trials, participants heard performed feedback (i.e., feedback that matched their actions), whereas in the other half of the trials participants heard alternate feedback (i.e., feedback that matched the melody to which the participant might switch). The feedback melody did not change during the trial. Participants were told to continue playing at a consistent tempo, without pausing between repetitions. When switching melodies, they were told to execute the switch as quickly as possible, starting the alternate sequence at the first serial position, even if that meant interrupting the production of an ongoing action sequence.

In each trial, an auditory instruction cue occurred in synchrony with a feedback tone onset during the second or third repetition of the sequence; the position of this onset was chosen from the set of serial positions [10 11 13 14 17 20 23 24], relative to the beginning of the trial (Position 1), and was chosen to sample equivalently from sequence repetition (second or third) and sequence position (1–8 within the sequence) while maintaining a reasonable total time for the experiment. Furthermore, half of the cue positions were associated with strong metrical accents and half with weak accents, as indicated by the metrical grid shown in Fig. 1. The trial continued until auditory feedback stopped, which occurred after the 33rd produced event; this constituted four repetitions of the sequence if the participant did not switch (and did not commit any omission or insertion errors) throughout the trial. Figure 1 illustrates the relationship between the action sequence (shown as musical notes) and the feedback melody for a hypothetical trial on which the participant hears the alternate feedback melody and switches two events after the switch cue.
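The sampling properties of the cue positions can be checked with a few lines of Python. This is a sketch under stated assumptions: the set of strong positions (11, 13, 17, and 23) is taken from the Results section rather than derived from the metrical grid, and the names `describe_cue` and `CUE_TABLE` are hypothetical.

```python
# Illustrative sketch: decompose each cue position into sequence repetition,
# within-sequence position, and metrical accent. Names are hypothetical.
CUE_POSITIONS = [10, 11, 13, 14, 17, 20, 23, 24]
STRONG_POSITIONS = {11, 13, 17, 23}  # as stated in the Results section
SEQUENCE_LENGTH = 8


def describe_cue(position):
    """Return (repetition, within-sequence position, accent) for a cue.

    Positions are 1-based relative to the first event of the trial.
    """
    repetition = (position - 1) // SEQUENCE_LENGTH + 1
    within = (position - 1) % SEQUENCE_LENGTH + 1
    accent = "strong" if position in STRONG_POSITIONS else "weak"
    return repetition, within, accent


CUE_TABLE = [describe_cue(p) for p in CUE_POSITIONS]

if __name__ == "__main__":
    for position, row in zip(CUE_POSITIONS, CUE_TABLE):
        print(position, row)
```

Under this decomposition, four cues fall in the second repetition and four in the third, each within-sequence position from 1 to 8 is sampled exactly once, and the cues split evenly between strong and weak accents.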

Fig. 1. Example of a planned action sequence and the resulting feedback melody in an alternate feedback trial with a switch instruction cue at Position 17. Metrical accents are symbolized by the number of x’s below sequence positions. The positions at which instruction cues could occur are underlined.

Each keystroke triggered successive pitches from a list that represented what someone would play in an error-free performance. This fixed mapping between pitch and serial position of the keypress was necessary for trials with alternate feedback and to maintain parity across feedback conditions.
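The logic of this fixed mapping can be sketched as follows. The sketch is illustrative and does not reproduce the FTAP configuration used in the experiment; the function name `feedback_pitch` is hypothetical. The key point is that the pitch heard depends only on the serial position of the keypress, not on which key was pressed.

```python
# Illustrative sketch of a fixed serial-position feedback mapping.
# The function name is hypothetical; this is not the FTAP configuration.

def feedback_pitch(keypress_index, feedback_melody):
    """Return the pitch sounded for the n-th keypress of a trial (1-based).

    The pitch is read from the feedback melody, which repeats cyclically,
    regardless of which key the participant actually pressed.
    """
    if keypress_index < 1:
        raise ValueError("keypress_index is 1-based and must be >= 1")
    return feedback_melody[(keypress_index - 1) % len(feedback_melody)]


if __name__ == "__main__":
    # Alternate feedback for a participant who starts on the first melody
    # of Set 1: the feedback list is the second melody of Set 1.
    alternate = ["G", "E", "F", "D", "C", "E", "D", "F"]
    print([feedback_pitch(n, alternate) for n in range(1, 10)])
```

Because the list is indexed by keypress count, an omitted or inserted keypress shifts every later feedback event relative to the intended action sequence, which is why serial alignment matters under this mapping.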

Design and analysis

The full experiment was defined by a 2 (feedback type: performed or alternate) × 2 (instruction cue: switch or continue) × 8 (cue position) within-subjects design. These factors were nested in the additional within-subjects variable action sequence (Sequence 1 or 2 within a set), such that half of the cue positions for each combination of feedback type and instruction cue were associated with each action sequence. This design yielded 32 trials per participant. These factors were crossed with eight between-subjects order conditions that resulted from counterbalancing melody set (1 or 2), assignment of timbre to instruction cue (2 levels), and two different random orders of trials (see Footnote 1). Positions at which participants switched were determined by a pattern-matching algorithm in MATLAB and checked visually afterward for accuracy (no corrections were necessary). Trials associated with the presence of omission or insertion errors before the instruction cue were discarded (14% of all trials). Such errors could disrupt sequential relationships between the action sequence and feedback melodies for the fixed feedback mapping used (see Pfordresher & Palmer, 2006, for further discussion). Timing was determined by measuring the interonset intervals (IOIs) between successive keypresses.
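As a rough illustration, the within-subjects cells and the IOI computation can be expressed in Python. This sketch is not the analysis code used in the study (which relied on MATLAB); the names `design_cells` and `interonset_intervals` are hypothetical, and keypress onsets are assumed to be given in milliseconds.

```python
# Illustrative sketch: enumerate the within-subjects design cells and
# compute interonset intervals (IOIs). Names are hypothetical.
from itertools import product

FEEDBACK_TYPES = ("performed", "alternate")
CUE_TYPES = ("switch", "continue")
CUE_POSITIONS = (10, 11, 13, 14, 17, 20, 23, 24)


def design_cells():
    """Return every (feedback type, cue type, cue position) combination."""
    return list(product(FEEDBACK_TYPES, CUE_TYPES, CUE_POSITIONS))


def interonset_intervals(onsets_ms):
    """Return the intervals between successive keypress onsets."""
    return [later - earlier for earlier, later in zip(onsets_ms, onsets_ms[1:])]


if __name__ == "__main__":
    print(len(design_cells()))                  # 2 x 2 x 8 cells
    print(interonset_intervals([0, 500, 930]))  # toy onsets, not real data
```

Enumerating the cells yields 2 × 2 × 8 = 32 combinations, matching the 32 trials per participant.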

Results

Participants tended to switch within 2–3 events following a switch cue (mean latency = 2.5 events, mode = 2 events). Switch positions were related to metrical position and were more common at the first beat of a four-beat measure (51% of all switches) than at other positions (Beat 2 = 15%, Beat 3 = 18%, Beat 4 = 15%), as has been documented elsewhere (Palmer & Baldwin, 2004). These tendencies were not influenced by auditory feedback (p > .10, related-samples t test), the primary variable of interest.

Our primary focus was on IOIs associated with auditory instruction cues (switch and continue) as a function of auditory feedback. While analyzing the data, it became apparent that participants typically lengthened a single IOI substantially within five events of the instruction cue in a way that disrupted the rhythmic flow. Based on these observations, we defined as pauses the longest IOI that followed the instruction cue within a five-event window. During switch trials, pauses (so defined) were most often located just prior to the switch (50%, as opposed to 14% or fewer trials for all other distances from the switch). Based on this relationship between the location of pauses and the location of switches within switch trials, we assumed that pauses that followed continue cues were the result of an inhibited tendency to switch. The duration of these IOIs was contrasted with the mean IOI for the entire trial in order to represent pause duration relative to the prevailing tempo of the trial. (Participants performed faster than the prescribed tempo of 500-ms IOIs; mean IOI = 430 ms, SE = 20, across participants and trials.) This computation yielded pause difference scores.
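The pause difference score can be sketched in Python as follows. This is a minimal illustration, not the original analysis code. It assumes that the five-event window consists of the five IOIs beginning at the cued keypress, which is one possible reading of the definition above, and it uses hypothetical names (`pause_difference_score`, `iois_ms`).

```python
# Illustrative sketch of the pause difference score. Assumption: the window
# covers the five IOIs that begin at the cued keypress. Names are hypothetical.
from statistics import mean


def pause_difference_score(iois_ms, cue_position, window=5):
    """Longest IOI in the post-cue window minus the mean IOI of the trial.

    iois_ms[k] is the interval from keypress k + 1 to keypress k + 2,
    with keypresses numbered from 1. cue_position is the 1-based keypress
    at which the instruction cue sounded.
    """
    start = cue_position - 1  # index of the IOI that begins at the cue
    window_iois = iois_ms[start:start + window]
    if not window_iois:
        raise ValueError("no IOIs available after the cue position")
    return max(window_iois) - mean(iois_ms)


if __name__ == "__main__":
    # Toy trial (not real data): steady 400-ms IOIs with one 900-ms pause
    # on the IOI that begins one event after a cue at Position 17.
    iois = [400.0] * 32
    iois[17] = 900.0
    print(pause_difference_score(iois, cue_position=17))
```

Note that the mean IOI here includes the pause itself, following the description of the baseline as the mean IOI for the entire trial.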

Figure 2a shows means for pause difference scores by cue type (continue or switch) and feedback (performed or alternate). A two-way repeated measures ANOVA on these factors yielded a significant Cue Type × Feedback Type interaction, F(1, 9) = 13.28, p < .01, but no main effect of either factor (p > .10 for each main effect). As shown in Fig. 2a, alternate feedback led to pauses after continue cues, similar to pauses after switch cues, whereas significantly shorter pauses followed continue cues when performed feedback was present. In other words, the presentation of alternate feedback caused participants to pause following a continue cue as if they were preparing to switch. This interpretation was verified with Tukey’s HSD post-hoc tests (α = .05). The same results emerged when analyzing differences relative to mean IOIs (as in Weber’s law), which follows from the fact that mean IOIs did not differ significantly across conditions (p > .10 for all effects); moreover, the longest mean IOI was associated with the condition exhibiting the shortest pauses (performed feedback + continue cue, mean IOI = 472 ms; across other three conditions, mean IOI = 459 ms).

Fig. 2. The effects of instruction cue and feedback type on pause difference scores (a) and on error rates for events following the instruction cue (b). Error bars represent ±1 SE.

Figure 2b shows mean error rates for produced events following the instruction cue. This analysis yielded a pattern of results that was directly comparable to the pattern shown by pauses, although no effects in the ANOVA were reliable (p > .10 for each effect). Also, error rates before the instruction cue in a trial were low overall and, more importantly, did not differ significantly across feedback conditions (error rate for performed feedback, M = 2%, SE = 0.5; for alternate feedback, M = 3%, SE = 0.6). Thus, alternate feedback was not “disruptive” of performance, as other alterations of feedback pitch are (cf. Pfordresher, 2006). Further analysis suggested that the error rates shown in Fig. 2b do not generally represent misinterpretation of the instruction cue; the instruction cue was correctly followed on 96% of all trials.

Finally, we assessed whether the metrical accent associated with the position of the instruction cue influenced the effect of feedback. Analyses of switch locations, mentioned earlier, suggested that participants were more likely to initiate switches on strong as opposed to weak metrical accents (cf. Palmer & Baldwin, 2004). Other research has likewise suggested that strongly accented metrical positions attract attention (Large & Jones, 1999) and act as salient points in memory (Palmer & Krumhansl, 1990; Palmer & Pfordresher, 2003). If, as these results suggest, strong metrical accents function as stable points in music, performers may be less sensitive to the influence of auditory feedback when evaluating instruction cues that are positioned on strong beats. According to the metrical grid notation shown in Fig. 1, cues at Positions 11, 13, 17, and 23 are all metrically strong positions. Figure 3 shows the effects of instruction cue and feedback type on pause difference scores (see Footnote 2) for cues at weak positions (a) and strong positions (b). A three-way ANOVA with the factors Metrical Accent of Cue Position (weak, strong), Cue Type, and Feedback Type yielded a significant three-way interaction, F(1, 9) = 5.09, p = .05, in addition to a two-way interaction of cue type and feedback type, F(1, 9) = 9.43, p < .05. No other effects reached significance (p > .10 for each). The pattern of the two-way interaction was the same in both beat conditions, but it was much more pronounced with weak beats. Tukey’s HSD post-hoc tests suggested that the results shown in Fig. 2a are primarily specific to trials on which the cue was located on a weak beat. Thus, strong metrical accents may have enhanced the salience of event cues, leading to a weaker effect of feedback.

Fig. 3. Relationship between instruction cue and feedback type when the instruction cue occurred on a metrically weak position (a) or a metrically strong position (b).

Discussion

We have reported evidence that a sequence of auditory feedback events can activate an action sequence associated with that perceptual sequence. Interestingly, the ability of feedback events to activate an action sequence was observed in trials on which such activations were undesirable—that is, during trials on which the auditory instruction cue directed participants to continue with the current action sequence rather than switch to the alternate sequence. In other words, participants may have had difficulty inhibiting an inappropriate shift from one sequence to the other during these trials, even though the error data suggest that participants were ultimately able to avoid such errors. This effect was particularly pronounced when cues occurred at weak metrical positions, where the salience of the cue may have been weaker. Thus, the effect of auditory feedback seen here involved interfering with the participant’s interpretation of the instruction cue. It is important to note that the “interference” effect observed here is distinct from the kinds of disruptive effects found for AAF in past studies (see Pfordresher, 2006). Whereas AAF typically disrupts online execution of an action sequence, the effect seen here involved the activation of two competing action sequences.

We initially expected to find shorter pauses on switch than on continue trials during alternate-feedback trials. The data were qualitatively consistent with this prediction, but the difference was not reliable. Why? One possibility is that pauses reflect conscious evaluation of the cue meaning based on the accessibility of planned and alternate sequences (which may occasionally be initiated after an actual switch). The presentation of a switch cue initiates such an evaluation regardless of auditory feedback, based on the implications of the cue itself. However, when a continue cue is presented, evaluation only commences when the alternate sequence has been made more accessible via the presentation of feedback events associated with those actions, with longer evaluation occurring when cues are presented during less salient metrical events.

We reported findings from a piano keyboard task for a sample of nonpianists, since they are more representative of the population. Nonpianists and pianists often respond similarly to AAF (see, e.g., Pfordresher, 2005); however, it has also been shown that musical training can enhance instrument-specific associations between actions and sound (e.g., Drost, Rieger, & Prinz, 2007). It is thus natural to wonder what effects one might find if the present paradigm were presented to pianists. In fact, we originally had collected data from 5 additional participants, all of whom had at least 8 years of formal training on the piano. These pianists showed no effect of either instruction cue or feedback type on timing, and in general exhibited much shorter pauses than did nonpianists (for pianists, M = 69 ms, SE = 13 ms; for nonpianists, M = 314 ms, SE = 37 ms). This finding makes intuitive sense, given that musicians must learn to adapt flexibly to challenging performance situations (Palmer, 1997) and that experienced pianists typically maintain steady rhythms even at the expense of pitch accuracy (Drake & Palmer, 2000). It is possible that a process similar to the one observed in nonpianists also occurs in pianists but is not detectable using the present behavioral methodology, including the use of a fixed-pitch instrument.

The melody-switching paradigm used here was inspired by the task-switching procedures used to measure cognitive control processes (for reviews, see, e.g., Kiesel et al., 2010; Koch, Gade, Schuch, & Philipp, 2010). Indeed, the presence of pauses following switch cues is analogous to the switch costs observed in such paradigms. However, in many important respects, the task used here differed from conventional task-switching paradigms. First, and most obviously, the present procedure examines the production of complex event sequences, whereas typical task-switching paradigms focus on discrete, perceptually based decisions in response to a single event. Second, it is not clear that switching between two action sequences is directly comparable to switching between two tasks (such as assessing the shape vs. the color of a visual object). Whereas different tasks are thought to recruit separate cognitive representations, the sequences we used involved highly similar representations (see Fig. 1). Thus, it is not clear that the behaviors measured here tap into precisely the same control mechanisms that are thought to underlie task switching.

In conclusion, alterations to the contents of auditory feedback do not always lead to disruption of the produced sequences, as might be predicted by an account that relates feedback strictly to the execution of a motor plan (e.g., Howell, 2001). Hearing a feedback melody associated with a melody to which one may or may not switch during a trial leads to difficulty in inhibiting one’s tendency to switch. The fact that we found effects of alternate feedback in trials on which participants did not actually switch further suggests that feedback affects the planning of actions, not merely the execution of a sequence, once selected.