Executing a response (e.g., pressing the “green” key) to a stimulus (e.g., the word small printed in green font) will result in the creation of a transient stimulus-response (S-R) binding in episodic memory. If the same stimulus is repeated later on, the associated response is retrieved. When the retrieved response is appropriate on the later occasion (e.g., when small is again printed in green in the subsequent trial), performance is typically facilitated. However, when the retrieved response is inappropriate (e.g., if small is now printed in red font), it interferes with the execution of the proper response, yielding performance costs (Rothermund, Wentura, & De Houwer, 2005). S-R binding and retrieval processes play a dominant role for the emergence of automatic and stimulus-based action control and are documented for a variety of stimuli, modalities, and responses (for an overview, see Henson, Eckstein, Waszak, Frings, & Horner, 2014; Hommel, 1998; Logan, 1988; Rothermund et al., 2005).

Notably, executing the response by oneself is not a necessary condition for the formation of S–R bindings: Giesen, Herrmann, and Rothermund (2014) recently showed that stimuli also become associated with responses that are merely observed in another person: Two participants sat opposite each other at a table and worked together on a shared color categorization task, taking the role of actor and observer in turns. Red or green words served as stimuli in a sequential prime-probe trial design. Notably, only one participant (the “actor”) saw a colored word and had to perform a color categorization response during the prime trial. The other participant saw the word stimulus in white font and was instructed to observe the response performed by the actor. If S–R bindings can be acquired solely by observing the response to a respective stimulus, subsequent stimulus repetition should retrieve the observed response from memory. To test this assumption, Giesen et al. instructed observers from the preceding prime trial to perform a color categorization response to the red or green stimuli in a subsequent probe trial. Indeed, probe performance patterns of former prime observers were consistent with “standard” S-R retrieval effects, indicating that prime stimuli had become associated with and later retrieved observed prime responses: Compared to baseline probe trials in which a different word was presented, stimulus repetition (a) facilitated performance if to-be-executed probe responses were compatible with observed prime responses, but (b) impeded performance if to-be-executed probe responses were incompatible with observed prime responses. However, retrieval effects were influenced by the social relationship between both co-actors: Pairs of participants either cooperated or competed with each other to obtain an extra reward. 
In a third condition, distribution of the extra reward was based on the individual performance of each participant (and did not depend on the performance of the co-actor). Stimulus-based retrieval of observed responses was affected by this social relationship manipulation: Retrieval effects were significantly stronger in cooperative and competitive pairs, but were absent in pairs that worked independently of each other. Findings by Giesen et al. thus indicate that (a) stimuli can become associated with observed responses and that (b) people rely on such observational S-R binding and retrieval to regulate their own actions. However, (c) retrieval effects were obtained only if the observed responses were performed by socially relevant others (i.e., by people with whom one interacts in a cooperative or competitive way). Together, these findings suggest that basic processes of S-R binding and retrieval are involved in social learning from observation.

This study sought to substantiate and extend this claim by examining whether known moderators of social learning phenomena also affect transient effects of S-R binding by observation. Indeed, a potent factor of social learning is vicarious feedback: In a seminal study, Bandura (1965) demonstrated that children were less likely to imitate an observed action if the model was previously punished for demonstrating the very same action (compared to conditions in which the model was either rewarded or received no feedback). Against this background, our study investigates whether vicarious feedback exerts a moderating influence on transient bindings between stimuli and observed responses.

To investigate this issue, we adopted the design by Giesen et al. (2014) and asked two participants to work through a shared color categorization task (Fig. 1a), taking the role of actor and observer in turns. We independently manipulated stimulus relation (repetition/change) and response compatibility of observed prime and to-be-performed probe responses across a prime-probe sequence (see Table 1). Observational S-R binding and retrieval are indicated by a specific pattern of stimulus repetition effects: repeating the prime stimulus in the probe should lead to facilitation (interference) if the probe response is compatible (incompatible) with the observed prime response. Statistically, this is reflected in a Stimulus Relation × Response Compatibility (S × R) interaction.

Fig. 1

a Schematic illustration of experimental setup for participants A and B. R = red button; G = green button. Participant A is “actor” during the prime trial (performing the color categorization response). In turn, participant B is “observer” during the prime trial, but actor during the probe trial, which allows testing for retrieval of observational S-R bindings. b Trial sequence from the perspective of each co-actor, depicting a trial with stimulus repetition (same word in prime and probe display), incompatible responses (prime response: green; probe response: red), and negative prime feedback valence. Stimuli are not drawn to scale. (Color figure online)

Table 1 Factorial design and sample stimuli for prime-probe sequences in the experiment (vicarious prime feedback was manipulated independently and orthogonally as a third factor)

To investigate whether stimulus-based retrieval of observed responses is affected by vicarious feedback, we added prime feedback valence (positive/negative) as an orthogonal factor to the design. Although the feedback referred to the action that was executed by the prime actor, prime observers also were informed of the feedback that was given to the other participant. Because retrieval effects of observational S-R bindings were most pronounced in the Giesen et al. study when participants cooperated with each other (positive interdependency; cf. Iani, Anelli, Nicoletti, Arcuri, & Rubichi, 2011; Ruys & Aarts, 2010), we chose to realize only the cooperative condition in this experiment. Hence, all participant pairs were instructed that they worked together as a team, and that they could gain an extra reward jointly if their team performance was good.

In what way could vicarious feedback affect observational S-R binding and retrieval processes? In our view, two possibilities are plausible: The first hypothesis assumes that positive vicarious feedback encourages observers to rely on the observational S-R binding, whereas negative vicarious feedback discourages observers from relying on it on later occasions. Indeed, a number of studies suggest a similar relationship between feedback and bindings between stimuli and self-performed responses: Experiencing negative feedback after responding to a stimulus prevents S-R retrieval in the subsequent trial, which precludes the establishment of wrong action routines (Colzato, van Wouwe, & Hommel, 2007; Rothermund, Eder, & Frings, 2015; Waszak & Pholulamdeth, 2009). According to this “global feedback” hypothesis, participants should be less likely to retrieve S-R bindings when observed responses received negative (compared to positive) feedback. Statistically speaking, the S × R interaction should be present after positive, but not after negative, vicarious prime feedback. Moreover, the S × R interaction within the positive feedback condition should reflect performance benefits for stimulus repetition probes with compatible responses and performance costs for stimulus repetition probes with incompatible responses. Such a finding would imply that the establishment of “wrong” action routines is globally prevented after negative prime feedback.

However, self-performed and observed actions are not the same thing: Whereas the former are based on an actively represented action goal one seeks to achieve (i.e., giving a “green” categorization response), action goals of observed actions are covert and therefore not always obvious to observers. Indeed, evidence reported by Bekkering and colleagues shows that action imitation does not stem from a blunt “copying” of observed movements, but rather results from an active interpretation of observed movements as goal-directed actions. Hence, observers represent and imitate inferred action goals, rather than observed and overt behaviors (Bekkering, Wohlschläger, & Gattis, 2000; Wohlschläger, Gattis, & Bekkering, 2003). With only two possible action goals (i.e., hitting the red/green button) available in the present setup, observers might make more specific use of the feedback information to infer goals from observed actions, which will affect how the observed response is represented in the S-R episode: Specifically, positive feedback supports the initial goal inference (“She hit the green key and received positive feedback, so she should have pressed green!”). In turn, negative feedback contradicts the initially derived goal inference and provokes a reinterpretation into its opposite (“He hit the green key and received negative feedback, so he should have pressed the red key instead!”). On a more general level, this would imply that observed responses become represented in an abstract format in the observationally acquired S-R episode, representing action goals rather than physical movements (for a detailed analysis of response representations in S-R episodes, see Giesen & Rothermund, 2016). 
According to this “flexible goal imitation” hypothesis, observers will actively employ feedback information to reconstruct (normative) action goals from observed actions: After positive vicarious feedback, the represented action goal should correspond to the observed action (yielding the “standard” S × R interaction pattern, as already described). After negative feedback, the represented action goal should be opposite to the observed action. Statistically, this should yield a reversed S × R interaction within the negative feedback condition, reflecting performance costs for stimulus repetition probes if the (observed) prime response and the (to-be-executed) probe response are compatible, and performance benefits for stimulus repetition probes with incompatible prime and probe responses.

Note that the “global feedback” and the “flexible goal imitation” hypotheses predict the same S × R interaction pattern after positive vicarious prime feedback. However, they make different predictions after negative vicarious prime feedback. To differentiate between the two accounts, it is therefore of central interest (a) whether the S × R interaction is obtained after negative vicarious prime feedback and (b) which pattern of effects emerges from that interaction.
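The diverging predictions of the two accounts can be summarized compactly. The following is a minimal sketch, not code from the study: the function names, the string labels, and the coding of effects as +1 (benefit), -1 (cost), and 0 (no retrieval effect) are illustrative assumptions that encode only the direction of the predicted stimulus repetition effects.

```python
# Hypothetical helper encoding the directional predictions described in the text.
# Effect coding is an assumption: +1 = benefit, -1 = cost, 0 = no retrieval effect
# of stimulus repetition relative to stimulus change.
VALID_HYPOTHESES = ("global_feedback", "flexible_goal_imitation")


def predicted_effect(hypothesis, feedback, compatibility):
    """Return the predicted sign of the stimulus repetition effect."""
    if hypothesis not in VALID_HYPOTHESES:
        raise ValueError(f"unknown hypothesis: {hypothesis}")
    if feedback not in ("positive", "negative"):
        raise ValueError(f"unknown feedback valence: {feedback}")
    if compatibility not in ("compatible", "incompatible"):
        raise ValueError(f"unknown compatibility: {compatibility}")

    # "Standard" S-R retrieval pattern: benefit if compatible, cost if incompatible.
    standard = 1 if compatibility == "compatible" else -1
    if feedback == "positive":
        return standard  # both accounts agree after positive feedback
    if hypothesis == "global_feedback":
        return 0  # retrieval is prevented after negative feedback
    return -standard  # inferred goal is reversed after negative feedback


def predicted_interaction(hypothesis, feedback):
    """S x R contrast: effect for compatible minus effect for incompatible probes."""
    return (predicted_effect(hypothesis, feedback, "compatible")
            - predicted_effect(hypothesis, feedback, "incompatible"))


for h in VALID_HYPOTHESES:
    for fb in ("positive", "negative"):
        print(h, fb, predicted_interaction(h, fb))
```

Under this coding, both accounts yield a positive contrast after positive feedback; after negative feedback the contrast is zero under the global feedback account and negative under the flexible goal imitation account.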

Method

Participants

Eighty-two native German-speaking psychology students of the Friedrich Schiller University Jena participated in the experiment. Written informed consent was obtained from all participants at the start of the study. Based on the written protocol of the experimenter, 12 participants were excluded because they did not adhere to instructions (n = 6) or because they second-guessed the feedback manipulation (n = 6; this was based on verbal comments toward the experimenter). Data of 70 (eight male) participants (M age = 21.3; 19–30 years) were analyzed. Participants were tested in pairs and received partial course credit and muffins as an extra reward. Sessions lasted 40 minutes.

Apparatus and stimuli

The experiment was programmed with E-Prime 2.0. Two participants sat opposite each other, each one in front of a 19-in. flat-screen monitor. Flat screens were purposefully positioned to prevent eye contact between participants. However, we ensured that both participants had a clear peripheral view of both push-buttons to observe responses. Two response pads were used to collect responses (see Fig. 1a): Participants continuously pressed two rest-state keys with their left and right hands and gave their responses by releasing one of these keys in order to hit the red or green push-button. Stimuli were 25 neutral, mono-/disyllabic German adjectives, presented centrally in Times New Roman font (16 pt) on each participant’s black screen (cf. Rothermund et al., 2005). Two computer-generated sounds were used for positive/negative auditory feedback, presented via headphones: Starting at 440 Hz, sounds either increased (positive feedback) or decreased (negative feedback) in steps of 20 Hz (each step lasting for 15 ms) until a frequency of 840 Hz or 40 Hz was reached. Total sound duration was 300 ms and resulted in sounds that are intrinsically positive (increasing frequencies) or negative (decreasing frequencies; cf. Rothermund, 2003).
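The arithmetic of the feedback sounds can be checked with a short sketch. It assumes, as one reading of the description, that each sound consists of 20 segments of 15 ms, with the first segment one step away from 440 Hz and the last segment at the terminal frequency; the function name and parameters are hypothetical, and the actual sounds were generated by other means.

```python
def feedback_tone_steps(valence, start_hz=440, step_hz=20, n_steps=20, step_ms=15):
    """Return (segment frequencies in Hz, total duration in ms) for one sound.

    Assumption: 20 segments, so that 20 x 15 ms = 300 ms and the final
    segment lands on 840 Hz (positive) or 40 Hz (negative).
    """
    if valence not in ("positive", "negative"):
        raise ValueError(f"unknown valence: {valence}")
    direction = 1 if valence == "positive" else -1
    freqs = [start_hz + direction * step_hz * k for k in range(1, n_steps + 1)]
    return freqs, n_steps * step_ms


rising, rising_ms = feedback_tone_steps("positive")
falling, falling_ms = feedback_tone_steps("negative")
print(rising[-1], falling[-1], rising_ms, falling_ms)
```

A change of 400 Hz in 20-Hz steps requires 20 steps, which at 15 ms per step matches the reported total duration of 300 ms.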

Procedure

Written instructions were given on-screen. Participants worked in pairs and performed a color categorization task that was shared between both participants (cf. Giesen et al., 2014, Experiment 1, which used a similar procedure): In each trial, a word appeared centrally on each participant’s screen. Only one participant (the actor in the respective trial) saw the word in red or green and performed the color categorization task. The other participant (the observer) saw the word only in white font (see Table 1, Fig. 1). Hence, word color was the task-relevant feature, whereas word content was irrelevant for the task and served as a distractor (Rothermund et al., 2005). During the first 160 prime-probe sequences, Participant A of each pair was “prime actor” and gave color categorization responses to red/green prime stimuli by pressing the corresponding push-button. The prime response was observed by Participant B (the “prime observer”), who saw prime stimuli only in white font (and hence was prevented from executing or simulating the prime response). Conforming to the findings by Giesen et al. (2014), prime observers should acquire observational S-R bindings, that is, a binding between the (task-irrelevant) word and the observed prime response. By implication, repeating the prime stimulus on a subsequent probe trial should retrieve the observational S-R episode from memory and should affect probe performance accordingly. To test this, the prime observer (Participant B) became “actor” in the subsequent probe trial and had to categorize the color of red/green probe stimuli (see Table 1, Fig. 1). The interaction partner (Participant A) now became “probe observer” and saw probe stimuli only in white font (this was done to keep procedures maximally comparable among participants; however, probe observers are not of interest for these purposes). 
To investigate retrieval effects of observational S-R bindings in both participants, the role pattern was switched after 160 prime-probe sequences: In the first half of the experiment (prime-probe sequences 1–160), Participant A was prime actor/probe observer, and Participant B was prime observer/probe actor. In the second half of the experiment (prime-probe sequences 161–320), Participant A was prime observer/probe actor, and Participant B was prime actor/probe observer. Initial analyses revealed that role order (observation first vs. action first) did not interact with the effects of interest (in particular, no four-way interaction was obtained for errors or RT, all Fs < 1). We therefore collapsed across the order factor in the final analyses.

Each prime-probe sequence (see Fig. 1b) started with a ready signal (“!!!”; 500 ms), followed by a fixation cross (250 ms), after which the prime word appeared (printed in red or green for actor; printed in white for observer) until a response was initiated by the prime actor or until a maximum of 1,500 ms elapsed (if the prime actor failed to respond during this period, the trial was counted as erroneous; however, that never happened in the experiment). Then, feedback for the prime actor’s response was presented to both participants (500 ms): Correct and fast responses were followed by a schematic smiley face and the positive feedback sound. Incorrect and/or too-slow correct responses led to a schematic grumpy face and elicited the negative feedback sound. Then, another fixation cross with variable duration (150–350 ms) appeared, followed by the probe word (printed in red/green for actor, white for observer) until a response was initiated by the probe actor or until a maximum of 1,500 ms elapsed. Although not of theoretical interest, both participants received feedback for the probe actor’s response (500 ms), presented in the same way as the prime feedback. To guarantee that prime observers attended to and consequently encoded prime responses, a memory test appeared after 25 % of all probe trials: Prime observers were asked to repeat the observed prime actor’s response by pressing the corresponding push-button (until response). After a blank black screen (1,250 ms), the next prime-probe sequence started.

Participants performed 32 practice trials before the first experimental block and 16 additional practice trials before the second experimental block to get accustomed to the role change. To boost action corepresentation in both co-actors, we adopted the “positive interdependency” manipulation used by Giesen et al. (2014): Participants were told that they would work together in the experiment and could gain an extra reward (a muffin) if both of them performed well in terms of speed and accuracy (i.e., no more than 25 % of all release RT above 700 ms; no more than 10 % errors in the color categorization task; no more than 20 % errors in the memory test). They were further told that if only one of them, but not the other, fulfilled the speed and accuracy criteria, neither of them would gain the extra reward. In the past, these instructions proved successful in inducing a cooperative attitude among participants of a given pair (Giesen et al., 2014; Iani et al., 2011). Participants then worked through two experimental blocks of 160 prime-probe sequences each that were constructed according to the experimental design (see below). Prime-probe sequences were randomly presented. After every 40 prime-probe sequences, participants received interim summaries with respect to their own and their co-actor’s actual performance in the color categorization (% erroneous responses) and memory test performance (% false responses) for 10 s, together with the slogan “Remember: you work together!” to refresh the interdependency manipulation. After the experiment, participants answered a brief paper questionnaire as a manipulation check to rate the experimental situation and their impression of their co-actor (see Table 2): Using 7-point bipolar scales, participants judged whether they experienced the task as cooperative (1) or competitive (7; one item) and whether they experienced the situation as comfortable vs. uncomfortable (averaged across three items: 1 = easy/pleasant/positive; 7 = difficult/unpleasant/negative; Cronbach’s α = .49). Four additional items were used to assess whether participants perceived their co-actor as agreeable or not (averaged across four items: 1 = agreeable/confident/friendly/competent; 7 = disagreeable/insecure/unfriendly/incompetent; Cronbach’s α = .91). Participants then were thanked, rewarded, and debriefed.

Table 2 Means (SD) for manipulation checks, probe errors (%), and probe release RT (ms)

Design

The experiment comprised three within-subject factors: stimulus relation, response compatibility, and vicarious prime feedback (see Table 1). Stimulus relation was manipulated by repeating or changing the word from prime to probe (50 % stimulus repetitions, 50 % stimulus changes). Response compatibility was manipulated by requiring a probe response that was compatible or incompatible to the required prime response (50 % compatible, 50 % incompatible). For instance, if a green response was required during the prime, a green response would also be required on a compatible probe trial, but a red response would be required on an incompatible probe trial (see Table 1). Feedback for the prime actor’s response was audiovisually presented to both co-actors (see Fig. 1b) and was either positive (i.e., for correct and fast responses) or negative (for responses that were either incorrect or correct but too slow). Specifically, participants received authentic negative prime feedback for incorrect responses and also for very slow correct responses (i.e., RTs within the fourth quartile of the individual RT distribution sampled from the 20 preceding trials). In turn, participants received authentic positive prime feedback for very fast correct responses (i.e., RTs within the first quartile of the individual RT distribution). This was done to secure face validity of both feedback levels. Participants typically have accurate insight whether (a) a given response was correct or false and (b) whether a given correct response was really fast (leading to positive feedback) or too slow (justifying negative feedback). However, prime feedback was manipulated for correct responses that yielded RTs in the middle range (i.e., the second and third quartile of the individual RT distribution sampled from the 20 preceding trials). 
Because the color categorization task was fairly easy, one would otherwise run the risk of obtaining mainly positive feedback while the negative feedback conditions would be underpowered (cf. Rothermund, 2003). For these RTs, feedback valence was randomly determined (50 % positive, 50 % negative) to yield roughly equal rates per feedback level. Overall, this manipulation was successful and resulted in rates of 54 % positive (46 % negative) prime feedback. Both participants received explicit written instructions on-screen that audiovisual feedback was given for response speed and accuracy. They were informed that high rates (e.g., 50 %) of negative feedback were not uncommon in this experiment, due to strict speed criteria, and that they should try to work as fast and accurately through the task as possible. Because the feedback manipulation put a strong emphasis on speed that encouraged participants to trade accuracy for speed, systematic variance should be channeled from RT to error rates (Draine & Greenwald, 1998). Probe errors (i.e., releases of the wrong rest-state key, followed by a press of the wrong push-button) therefore served as the main dependent variable of interest, but probe release RT of rest-state keys was analyzed as well. For technical purposes, the following factors were also manipulated: First, prime word color was counterbalanced (50 % red, 50 % green). Likewise, 50 % of probe words were red, 50 % were green (probe word color results from the factorial combination of prime word color and response compatibility). To keep both blocks of the experiment as similar as possible from the participants’ viewpoint, feedback was also given after probe trials and audiovisually presented to both co-actors. Probe feedback was manipulated in a similar way as prime feedback, but is not of theoretical interest and thus not discussed any further.
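The feedback assignment rule described above can be sketched as follows. This is an illustration under stated assumptions, not the E-Prime implementation used in the experiment: the function name, the handling of RTs that fall exactly on a quartile boundary, and the quantile method (the default of Python's `statistics.quantiles`) are all hypothetical choices.

```python
import random
import statistics


def prime_feedback(rt_ms, correct, recent_rts, rng=random):
    """Return "positive" or "negative" feedback for one prime response.

    recent_rts: RTs of the preceding trials (20 in the experiment) that define
    the individual RT distribution. Boundary handling is an assumption.
    """
    if not correct:
        return "negative"  # authentic negative feedback for errors
    q1, _, q3 = statistics.quantiles(recent_rts, n=4)  # quartile cut points
    if rt_ms <= q1:
        return "positive"  # authentic: very fast correct response
    if rt_ms >= q3:
        return "negative"  # authentic: very slow correct response
    # Middle range (second and third quartile): valence determined at random.
    return rng.choice(["positive", "negative"])


window = list(range(400, 600, 10))  # hypothetical RTs of 20 preceding trials
print(prime_feedback(410, True, window))   # fast and correct
print(prime_feedback(580, True, window))   # slow but correct
print(prime_feedback(500, False, window))  # incorrect
```

With random assignment restricted to the middle half of correct responses, roughly equal rates of positive and negative feedback are expected, which is in line with the observed rates of 54 % and 46 %.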

Results

All statistical analyses were conducted with IBM SPSS Statistics, Version 21.

Manipulation check

Mean ratings are presented in Table 2. T tests against the scale midpoint (4) showed that participants experienced the situation as being highly cooperative (M = 2.2), t(69) = 12.68, p < .001, indicating that positive interdependency was successfully induced, as well as being neither comfortable nor uncomfortable (M = 3.9), |t| < 1.2, p = .25. Furthermore, they judged their co-actor as being very agreeable (M = 1.8), t(69) = 21.55, p < .001.

Probe performance

Trials with probe release RT outlier values (7.9 %) and trials with errors in the memory test for observed responses (4.2 %; 1.0 % of all probe trials) were discarded. For probe RT analyses, trials with erroneous probe responses (3.0 %) were excluded. Table 2 provides means for probe errors and release RTs.

Mean error rates were entered into a 2 (stimulus relation across prime and probe: stimulus repetition vs. change) × 2 (response compatibility across prime and probe: compatible vs. incompatible) × 2 (vicarious prime feedback: positive vs. negative) analysis of variance (ANOVA). Table 3 provides global ANOVA results. Analyses revealed a significant main effect of vicarious prime feedback, showing that participants made fewer errors after observing positive (M = 2.5 %) compared with negative (M = 3.3 %) vicarious prime feedback. Furthermore, participants had a general tendency to switch responses when it was their turn to respond (i.e., to press the response button in the probe that had not been pressed by their partner during the prime) because they made fewer errors in probe trials with incompatible responses (M = 2.5 %) compared with compatible (M = 3.2 %) probe trials, although this main effect of response compatibility was only marginally significant. Most important, the three-way interaction was also significant, indicating that vicarious prime feedback exerted a stimulus-specific influence and moderated retrieval effects of observational S-R bindings (see Fig. 2). Two separate 2 (stimulus relation) × 2 (response compatibility) follow-up ANOVAs for each vicarious prime feedback level showed that after positive vicarious prime feedback, a significant S × R interaction emerged, F(1, 69) = 4.35, p = .041, ηp² = .06. To test whether retrieval effects of observational S-R bindings conform to the prototypical S-R retrieval pattern of performance, we ran planned comparisons within each compatibility condition. In particular, compared to probe trials with stimulus changes, stimulus repetition should yield (a) performance benefits on response-compatible probe trials due to retrieval of the appropriate response, but (b) performance costs on response-incompatible probes due to retrieval of the inappropriate response. 
We employed one-tailed tests for these follow-up t tests because we tested directional hypotheses. For the positive vicarious feedback condition, stimulus repetitions (compared to stimulus changes) descriptively produced fewer errors for compatible probe responses (Δ = -0.8 %), t(69) = 1.37, p = .088, one-tailed, dz = 0.16, but led to a descriptive increase of errors for incompatible probe responses (Δ = +1.0 %), t(69) = -1.62, p = .055, one-tailed, dz = 0.20. After negative vicarious prime feedback, the S × R interaction was also significant, F(1, 69) = 5.01, p = .028, ηp² = .07, but reversed: Compared to stimulus changes in the probe, stimulus repetitions led to a descriptive increase of errors for compatible probe responses (Δ = +0.9 %), t(69) = -1.64, p = .053, one-tailed, dz = 0.20, but produced descriptively fewer errors for incompatible probe responses (Δ = -1.0 %), t(69) = 1.58, p = .059, one-tailed, dz = 0.19.
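The reported effect sizes can be related to the test statistics through standard conversions. The sketch below assumes the common formulas for a within-subject design, namely partial eta squared as F · df_effect / (F · df_effect + df_error) and Cohen's dz as |t| / √n; whether the effect sizes in this study were computed in exactly this way is an assumption, and small deviations can arise because the reported t and F values are rounded.

```python
import math


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def cohens_dz(t_value, n):
    """Cohen's dz for a paired (within-subject) t test with n participants."""
    return abs(t_value) / math.sqrt(n)


# Values taken from the text: n = 70 participants, df_error = 69.
print(round(partial_eta_squared(4.35, 1, 69), 2))  # S x R after positive feedback
print(round(partial_eta_squared(5.01, 1, 69), 2))  # S x R after negative feedback
print(round(cohens_dz(1.37, 70), 2))               # compatible probes, positive feedback
print(round(cohens_dz(-1.64, 70), 2))              # compatible probes, negative feedback
```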

Table 3 Summary table for ANOVA results on probe actors’ mean error rates and mean release RT
Fig. 2

Retrieval effects of observationally acquired stimulus-response (S-R) bindings for probe actors as a function of vicarious prime feedback valence and compatibility of observed prime responses and executed probe responses. Bars reflect S-R retrieval effects, computed as the difference of stimulus change minus stimulus repetition (Δ = SC - SR) probe trials: Positive values indicate performance benefits (i.e., reduced error rates); negative values indicate performance costs (i.e., increased error rates) due to stimulus-based response retrieval. Error bars reflect standard errors of the means.
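The retrieval effects plotted in Fig. 2 and the S × R contrast they imply can be computed as in the following sketch. The function names are hypothetical; the example values are the descriptive differences in error rates reported in the text, expressed in the sign convention of the figure (stimulus change minus stimulus repetition, so that positive values indicate benefits).

```python
def retrieval_effect(change_error_pct, repetition_error_pct):
    """Delta = SC - SR: positive values indicate fewer errors after stimulus repetition."""
    return change_error_pct - repetition_error_pct


def sxr_contrast(effect_compatible, effect_incompatible):
    """S x R contrast: retrieval effect for compatible minus incompatible probes."""
    return effect_compatible - effect_incompatible


# Descriptive retrieval effects (percentage points) taken from the text,
# converted to the SC - SR convention used in the figure.
reported_effects = {
    "positive": {"compatible": 0.8, "incompatible": -1.0},
    "negative": {"compatible": -0.9, "incompatible": 1.0},
}

for valence, effects in reported_effects.items():
    contrast = sxr_contrast(effects["compatible"], effects["incompatible"])
    print(valence, round(contrast, 1))
```

A positive contrast corresponds to the standard S-R retrieval pattern, and a negative contrast corresponds to the reversed pattern.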

The same ANOVA on probe release RT (see Table 3) revealed significant main effects of stimulus relation and vicarious prime feedback, reflecting faster probe responses for stimulus repetition (M = 426 ms) compared with stimulus changes (M = 430 ms), and faster probe responses after positive (M = 425 ms) compared with negative (M = 431 ms) vicarious prime feedback. Furthermore, a significant interaction of response compatibility and vicarious prime feedback emerged: After positive vicarious prime feedback, compatible probe responses were descriptively executed faster than incompatible ones (M = 423 ms vs. M = 427 ms), t(69) = 1.35, p = .181, two-tailed, dz = 0.16, whereas after negative vicarious prime feedback, incompatible responses were executed significantly faster than compatible ones (M = 428 ms vs. M = 434 ms), t(69) = 2.04, p = .046, two-tailed, dz = 0.25. However, none of the effects of interest were significant (i.e., neither the three-way interaction nor the two-way S × R interaction was reliable; see Table 3), indicating that in contrast to probe errors, a feedback-dependent modulation of stimulus-based retrieval effects was absent for probe release RTs.

Discussion

We examined whether vicarious prime feedback influences transient bindings between stimuli and observed responses. Indeed, such moderation by the valence of vicarious prime feedback was apparent in the data: After positive vicarious prime feedback, retrieval effects of observational S-R bindings corresponded to “standard” S-R retrieval effects (i.e., better performance in stimulus repetition probes with compatible probe responses, but worse performance in stimulus repetition probes with incompatible probe responses, compared to probes with stimulus change). Crucially, however, retrieval effects of observational S-R bindings were reversed after negative vicarious prime feedback, reflecting worse performance in stimulus repetition probes if the to-be-executed probe response was compatible with the observed prime response, and better performance in stimulus repetition probes with incompatible responses, compared to probes with stimulus change. Findings are therefore consistent with the “flexible goal imitation” hypothesis and support the view that observers actively employ feedback information to infer action goals from observed actions.

A more detailed description of the data pattern revealed that participants showed a slight overall tendency to switch responses when it was their turn to respond. On top of that tendency, positive vicarious feedback activated stimulus-based retrieval processes in the probe that corresponded to an “imitation” of the actor’s behavior in the prime, which counteracted and neutralized the switching tendency. After negative vicarious prime feedback, stimulus repetitions elicited an additional tendency to respond in opposition to the prime actor that further boosted the tendency to switch responses. Notably, both of these S × R interaction terms were significant, although each interaction came along with a unique and diametrically opposed pattern of retrieval effects.

Limitations of the present study

These findings were visible only in the error data and were not paralleled in response latencies. We attribute this discrepancy to the strong emphasis on speed that was fueled by the feedback manipulation. Whereas participants could judge for themselves whether their own response was correct or wrong, the RT feedback remained ambiguous. This encouraged participants to trade accuracy for speed (i.e., to reduce the possibility of being penalized for too-slow responses) and channeled systematic variance from RT to error rates (Draine & Greenwald, 1998). It is therefore likely that the absence of stimulus-based retrieval effects for RT is due to a floor effect. A comparison of average probe RT between this study and Giesen et al. (2014) supports this view: Probe release responses were more than 70 ms faster in this study compared to the cooperative condition of the previous study using a similar experimental setup without trial-by-trial feedback (M = 428 ms vs. M = 502 ms, respectively).

On a related note, although stimulus-based retrieval effects corresponded to the pattern that was predicted by the flexible imitation account, the simple retrieval effects within the compatible and incompatible response conditions failed to reach statistical significance. Our findings thus do not allow a clear statement with regard to whether retrieval effects are driven mainly by response facilitation or response interference. It should be noted, however, that it is a fairly common finding in the response retrieval literature that these simple effects fail to reach significance, probably due to a lack of power (only a small percentage of all trials enters into these comparisons) or due to a confounding of the specific response retrieval effects with a main effect of distractor repetition (e.g., Frings, Rothermund, & Wentura, 2007; Giesen, Frings, & Rothermund, 2012; Horner, 2015; Rothermund et al., 2005). Notably, the crucial test for the effects of interest always concerns the interaction terms, that is, the net effect of (a) both performance benefits and costs within each feedback level (corresponding to the two separate S × R interactions) and (b) the difference in these net interaction effects between positive and negative vicarious prime feedback (i.e., the three-way interaction). The three-way interaction and the two-way interactions at each prime feedback level were all significant in this study, yielding clear and unambiguous support for the flexible imitation account.

Theoretical implications

These findings are noteworthy in several respects. First, they suggest that observers had a strong tendency to regard “negative vicarious feedback” as indicating an erroneous (rather than a slow) response; otherwise, it would be unclear why stimulus-based retrieval effects were reversed after negative vicarious prime feedback. This is striking because (a) the task was fairly easy, (b) both participants were explicitly informed and therefore knew that the vicarious feedback concerned both response speed and accuracy, and (c) participants knew from their own experience that negative feedback in most cases was due to slow responding rather than indicating an erroneous response. We believe this reflects an overlearned and automatic bias: In the real world, negative feedback most often implies that an error occurred or that something wrong was done. It would be interesting to investigate whether these findings would replicate when different negative feedback signals are used that unambiguously refer to low speed and errors, respectively. Second, these data suggest that (observed) responses are represented in an abstract/semantic format in S-R episodes, representing (normative) action goals rather than concrete behaviors (Giesen & Rothermund, 2016). This allows not only for flexible goal implementation but also for a quick and flexible change of goals (see also Giesen & Rothermund, 2014). However, we concede that the present evidence for a flexible change of inferred action goals may depend on the scarcity of action goal alternatives in the present setup. We used a binary response task, which means that an initially inferred action goal can be easily reversed into its opposite. Whenever more than one alternative to the observed action is available, such a reinterpretation strategy may no longer be possible or useful. In this respect, one might speculate that the reversal of observational S-R bindings after negative vicarious feedback varies as a function of the available action alternatives, a possibility that requires further research.

On a more general level, these findings provide a conceptual replication of Giesen et al. (2014), who showed that social (and in particular, positive) interdependency was a strong moderator of observational S-R binding and retrieval effects. The present results build on this earlier work but also extend it in a meaningful way by showing that vicarious feedback affects observational S-R binding and retrieval processes. Together, these findings strengthen the claim that basic processes of S-R binding/retrieval are at the core of social learning phenomena.