Breaking with the tradition of investigating human cognition in isolated individuals, cognitive science increasingly considers social influences on cognitive processes, especially in situations in which people jointly perform the same task (Knoblich, Butterfill, & Sebanz, 2011; Wenke et al., 2011). A common pattern emerging from such joint-action studies is that the presence of another person alters the cognitive representation of one’s task and/or actions. Particularly popular in joint-action studies is the joint Simon task.

In the standard Simon task (Simon, 1969), single participants carry out spatially defined responses (e.g., left and right keypresses) to nonspatial features (e.g., shape) of stimuli that appear randomly to the left or right. Although the stimulus location is entirely task-irrelevant, stimuli more or less automatically activate spatially corresponding responses, which speeds up trials with spatial stimulus–response correspondence, and slows down trials with noncorrespondence (Kornblum, Hasbroucq, & Osman, 1990). This so-called Simon effect (for a review, see Hommel, 2011) typically vanishes if the task is turned into a go–no-go task by having the participant respond to only one stimulus feature by operating a single response key (Hommel, 1996). Interestingly, however, the effect reappears if the participant is joined by another participant who carries out a go–no-go task on the other stimulus feature by operating the other response key—a phenomenon known as the social or joint Simon effect (JSE; Sebanz, Knoblich, & Prinz, 2003).

The JSE has been taken to suggest that, under joint-action conditions, people automatically (co-)represent the action or task of their co-actor, which reintroduces a kind of response competition similar to that in the standard Simon task (Sebanz et al., 2003). However, recent studies have raised considerable doubts about this social account. In particular, Dolk, Hommel, Prinz, and Liepelt (2013) demonstrated that the JSE can occur in the absence of another person. In fact, reliable JSEs have been obtained with various nonhuman “co-actors,” such as a Japanese waving cat, a clock, and a metronome. This suggests that what introduces response competition in a go–no-go task is not necessarily the presence of another person or task-related action, but the presence of another attention-grabbing event, which apparently induces the tendency to discriminate its cognitive representation from the representation of the participant’s own action (Dolk et al., 2013; Liepelt, Wenke, Fischer, & Prinz, 2011; Pfister, Dolk, Prinz, & Kunde, 2013). It certainly makes sense to assume that humans and their actions are particularly salient (Langton, Watt, & Bruce, 2000), in particular if these actions are intentional (Müller et al., 2011; Stenzel et al., 2012) and if the co-actor is perceived to be similar to oneself (Hommel, Colzato, & van den Wildenberg, 2009). Hence, social events may be particularly attention-grabbing, but that need not imply that “joint action relies on socially shared action representations and involves modelling of others’ performance in relation to one’s own” (Knoblich & Sebanz, 2006, p. 99).

Nonsocial explanations of JSE-type effects are particularly plausible because of the spatial characteristics of the situation. Pressing a single key over and over in a solo go–no-go Simon task does not suggest coding one’s action in terms of left or right, since there is simply no alternative. The presence of another person, key, or action provides such an alternative, since these events all occupy a particular location that invites the relative spatial coding of one’s own action. That is, facing a right-hand key that is repeatedly pressed by another person during a task draws attention to the fact that one’s own key is located to the left of that other key, person, or action, so that coding one’s own action as “left” makes more sense than in the solo condition. Accordingly, the “left” action code is now compatible with stimuli that are also coded as “left,” but incompatible with stimuli coded as “right,” which is a necessary condition for stimulus–response compatibility effects to emerge (Dittrich, Dolk, Rothe-Wulf, Klauer, & Prinz, 2013; Kornblum et al., 1990). The same logic may hold for nonsocial events, such as waving cats or ticking metronomes, which may also invite coding one’s own action relative to the location of these events (Dolk et al., 2011; Dolk et al., 2013; Guagnano, Rusconi, & Umiltà, 2010).

Interestingly, however, joint-action effects are by no means restricted to tasks involving spatial stimulus–response relations. Notably, Atmaca, Sebanz, and Knoblich (2011) were able to show that joint action increases the size of the go–no-go flanker effect (the joint flanker effect: JFE). In the standard (two-choice) flanker task, participants carry out spatially defined responses to nonspatial targets (e.g., the letters H and K) appearing at the center of a screen. The central target is surrounded by to-be-ignored flankers that are compatible (HHHHH), neutral (UUHUU), or incompatible (KKHKK) with the actual target. Although the flanker letters are nominally task-irrelevant, responses are typically faster if flankers and target are compatible than if they are neutral or incompatible. In the study of Atmaca et al. (2011; Exps. 1 and 2), participants carried out go–no-go versions of the flanker task either alone or in the presence of an intentionally acting human co-actor. As expected, the flanker compatibility effect (i.e., mean reaction times [RTs] for incompatible trials > baseline [compatible + neutral] trials) was more pronounced when participants shared the flanker task with an intentional (but not with a nonintentional) co-actor than when they performed the same go–no-go flanker task in isolation (i.e., single condition), an effect that the authors attributed to task co-representation. However, given that the assumption of automatic co-representation is unnecessary and unlikely to account for JSEs, in the present study we tested whether a nonsocial account of the JFE is feasible.

Comparing the typical outcomes of social Simon and social flanker tasks reveals an interesting difference: Whereas the presence of a co-actor often turns a nonsignificant go–no-go Simon effect in solo conditions into a significant go–no-go effect in the joint condition, co-actors only increase the size of an already significant effect in the solo go–no-go flanker task. In other words, co-actors create an effect in the go–no-go Simon task, but only modulate an existing effect in the go–no-go flanker task. Flanker effects in the standard (i.e., two-choice) condition are commonly attributed to two sources of interference: crosstalk between the features of nonidentical letters (which presumably impairs stimulus identification) and response competition induced by flankers that are assigned to different responses (Eriksen & Eriksen, 1974; Rösler & Finger, 1993). In keeping with findings from the standard (i.e., two-choice) flanker task (Eriksen & Eriksen, 1974), the results of Atmaca et al. (2011) suggest that the response competition component is what is mainly affected by the presence of the co-actor.

According to Atmaca et al. (2011), co-actors increase the go–no-go flanker effect by making the participant represent “their partner’s task rules in addition to their own.” However, if that were really the explanation, it would be difficult to understand why any response competition would occur in the solo go–no-go condition. In fact, any effect that goes beyond feature crosstalk produced by visual noise must reflect the automatic translation of stimuli into responses, and there is indeed strong evidence that the irrelevant flanker stimuli activate the responses that they are assigned to (Gratton, Coles, Sirevaag, Eriksen, & Donchin, 1988; Heil, Osman, Wiegelmann, Rolke, & Hennighausen, 2000). Hence, if response-incompatible flankers produce interference in a go–no-go task under solo conditions, this seems to suggest that participants represent and make active use of the alternative stimulus–response rule even in the absence of another person. What the presence of another person thus seems to do is not to induce the representation of a rule (as Atmaca et al., 2011, claimed) but merely to increase the impact of that rule on performance, presumably by drawing attention to it.

We tested this nonsocial interpretation by applying the logic of Dolk et al. (2013) to the social flanker task. If a human co-actor does not induce the representation of flanker-related stimulus–response rules but merely draws attention to them (even though humans are suspected to attract attention in particularly efficient ways: Langton et al., 2000), any sufficiently salient event might do the same job (Dolk et al., 2013). If so, it should be possible to replace the human co-actor in a social flanker task with a nonhuman “co-actor” and yet still find a JFE (i.e., an increase of the flanker effect in the joint as compared to the solo go–no-go condition).

To test this interpretation, we used the same go–no-go flanker task that was employed by Atmaca et al. (2011). In one group of participants, we compared performance in a solo go–no-go condition with performance in a joint go–no-go condition, in which participants were accompanied by a human co-actor ('Human Co-Actor' group). Here we aimed to replicate Atmaca et al.’s finding of a more pronounced go–no-go flanker effect in the joint than in the solo condition. In another group of participants, we replaced the human co-actor by a Japanese waving cat placed close to the participant ('Nonhuman Co-Actor' group). Even though this object might be expected to draw less attention than a human, it should draw some and, according to our nonsocial account, produce a JFE.

Method

Participants

A group of 48 healthy undergraduate students (26 female; 22 male; 21–29 years of age) were randomly assigned to either the 'Human Co-Actor' or 'Nonhuman Co-Actor' group. All participants were right-handed as assessed by the Edinburgh Inventory scale (Oldfield, 1971), had normal or corrected-to-normal vision, were naive with regard to the hypothesis of the experiment, and were paid for their participation.

Apparatus and stimuli

Two letters (H and K) served as go and no-go stimuli. These target letters were presented at a viewing distance of approximately 60 cm and flanked by the letters H, K, and U—two on each side. Combining target and flanker letters resulted in three different stimulus types: compatible (flanker and target signaling the same response: HHHHH, KKKKK), neutral (flanker signaling no response: UUHUU, UUKUU), and incompatible (flanker and target signaling different responses: HHKHH, KKHKK), each covering a visual angle of 2.9º × 0.5º.
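The stimulus set described above can be enumerated in a few lines. The following is only an illustrative sketch, not the software used in the experiment; the names `TARGETS`, `FLANKERS`, `classify`, and `build_stimuli` are hypothetical and introduced here for illustration.

```python
from itertools import product

# Letters taken from the description above; all names are illustrative.
TARGETS = ("H", "K")        # go / no-go target letters
FLANKERS = ("H", "K", "U")  # U is assigned to no response


def classify(target, flanker):
    """Label a target/flanker pairing as compatible, neutral, or incompatible."""
    if flanker == "U":
        return "neutral"
    return "compatible" if flanker == target else "incompatible"


def build_stimuli():
    """Return a dict mapping each five-letter array to its stimulus type."""
    stimuli = {}
    for target, flanker in product(TARGETS, FLANKERS):
        # Two flankers on each side of the central target.
        array = flanker * 2 + target + flanker * 2
        stimuli[array] = classify(target, flanker)
    return stimuli
```

Calling `build_stimuli()` yields the six arrays listed in the text, two per stimulus type.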

Upon arrival at the laboratory, (unacquainted) pairs of participants in the Human Co-Actor group were informed that they would perform the same task under two different conditions—that is, alone in one condition and together with the other person in the other condition (see Fig. 1).

Fig. 1. Experimental setup in the 'Human Co-Actor' group. The participant (gray-shaded) is responding to target stimulus K in the (A) solo condition (compatible trial) and (B) joint condition (incompatible trial).

In the human co-actor/joint condition (Fig. 1B), both participants were seated next to each other. They operated a response button with their right index finger (25 cm in front and 25 cm from the midline of a 17-in. computer monitor) and were asked to place their left hand underneath the table on their left thigh. Prior to the experiment, participants received written instructions (e.g., “Person on RIGHT, press response key if central letter is H, and person on LEFT, press response key if central letter is K”) and were encouraged to respond as quickly and accurately as possible to their assigned stimulus. The target letters (H/K), response side (left/right), and order of conditions (solo/joint) were counterbalanced across participants.

In the human co-actor/solo condition (Fig. 1A), everything was the same (assigned stimulus and response side), except that the left or right chair remained empty (Instruction: “Press response key if central letter is K, and do not respond if it is H”).

In the Nonhuman Co-Actor group, the procedure and treatment of the participants were the same as in the Human Co-Actor group, except that in the joint condition the human co-actor was replaced by a golden Japanese waving cat. It was placed 50 cm to the left of the participant’s own (right) response button (Fig. 2B), which was the only response button present. The cat kept waving its left arm at a frequency of 0.4 Hz and an angle of 50º in the vertical plane throughout the session.

Fig. 2. Experimental setting in the 'Nonhuman Co-Actor' group. The participant is responding to the assigned target stimulus K in the (A) solo condition (compatible trial) and (B) joint condition (incompatible trial). The Japanese waving cat used in the joint condition is shown in panel C.

Participants were able to see the cat in their peripheral visual field and to hear the (unpredictable) sound produced by the waving. The solo condition (Fig. 2A) was as in the Human Co-Actor group, except that the Japanese waving cat was removed, leaving the table on the participant’s left empty. In both groups, the target letter and condition were counterbalanced across participants.

Procedure

In each condition (solo, joint), three blocks were presented: a 12-trial training block and two experimental blocks of 192 trials (i.e., 32 trials for each of the six stimuli per block, presented in random order). Short breaks separated the blocks to allow participants to maintain vigilance.
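The composition of one experimental block (6 stimuli × 32 repetitions = 192 trials, in random order) can be sketched as follows. This is an assumption-laden illustration rather than the original experiment code: `make_block`, the trial dictionary layout, and the use of a seeded `random.Random` are all hypothetical choices.

```python
import random

# The six stimulus arrays named in the Apparatus and stimuli section.
STIMULI = ("HHHHH", "KKKKK", "UUHUU", "UUKUU", "HHKHH", "KKHKK")
REPS_PER_STIMULUS = 32  # 6 * 32 = 192 trials per experimental block


def make_block(go_letter, rng):
    """Build one randomized block; a trial is 'go' if the central letter matches."""
    trials = []
    for array in STIMULI:
        target = array[2]  # central letter of the five-letter array
        for _ in range(REPS_PER_STIMULUS):
            trials.append({"array": array, "go": target == go_letter})
    rng.shuffle(trials)  # random presentation order
    return trials
```

With this layout, half of the trials in a block are go trials for either assigned letter, because three of the six arrays have H at the center and three have K.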

Each trial began with the 500-ms presentation of a fixation cross (0.5º × 0.5º), followed by a blank screen for another 500 ms. After 1,000 ms, the 2.9º × 0.5º stimulus array was presented at screen center until a response was given or until 1,500 ms had passed. Following a response, feedback about the accuracy was provided for 500 ms: Correct responses were followed by the fixation cross, incorrect responses by the word Fehler (“error”), and too-slow responses by zu langsam (“too slow”). In all cases, trials were separated by an intertrial interval of 1,000 ms.

Results

For the statistical analysis, we excluded errors (1.0 %) and trials in which the reaction time (RT) deviated from the corresponding cell mean by more than 2.0 standard deviations (SD; 3.8 %). Difference scores of the RTs were calculated by subtracting the average RT of compatible trials from the average RT of incompatible trials. Two participants were excluded due to error rates or difference scores more than 2.0 SDs above the mean (Müller et al., 2011). The RTs for correct responses of the remaining 46 participants were submitted to a repeated measures ANOVA with the within-subjects factors Compatibility (baseline [averaged RTs for compatible and neutral trials], incompatible) and Condition (solo, joint) and the between-subjects factor Group (human co-actor, nonhuman co-actor).
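The trimming rule and the difference score can be illustrated with a short sketch. Several details are assumptions not stated in the text: that a "cell" is one participant × condition × compatibility combination, that the sample SD (n − 1 denominator) is used, and that trimming is applied once rather than iteratively. The function names are hypothetical.

```python
from statistics import mean, stdev


def trim_cell(rts, criterion=2.0):
    """Keep RTs within `criterion` sample SDs of the cell mean (single pass)."""
    if len(rts) < 2:
        return list(rts)  # SD is undefined for fewer than two values
    m, sd = mean(rts), stdev(rts)
    return [rt for rt in rts if abs(rt - m) <= criterion * sd]


def compatibility_effect(compatible_rts, incompatible_rts):
    """Difference score: mean incompatible RT minus mean compatible RT."""
    return mean(trim_cell(incompatible_rts)) - mean(trim_cell(compatible_rts))
```

A positive value of `compatibility_effect` indicates slower responses on incompatible than on compatible trials.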

The RT ANOVA revealed three main effects, indicating that responses were faster in the Human than in the Nonhuman Co-Actor group (343 vs. 366 ms), F(1, 44) = 6.61, p < .05, η2 = .13, faster in the joint than in the solo condition (351 vs. 358 ms), F(1, 44) = 5.22, p < .05, η2 = .11, and faster in baseline than in incompatible trials (347 vs. 362 ms), F(1, 44) = 136.63, p < .001, η2 = .76 (see Table 1). More importantly, the compatibility effect was modified by a significant compatibility-by-condition interaction, F(1, 44) = 17.01, p < .001, η2 = .28: Although the compatibility effect was reliable in both the solo condition, F(1, 44) = 67.65, p < .001, η2 = .60, and the joint condition, F(1, 44) = 157.65, p < .001, η2 = .78, it was more pronounced in the joint condition (Fig. 3). This interaction was not modified by group, F < 1, and no further significant interactions emerged, either, ps > .05.
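The text does not say which variant of η2 is reported. As a reader's check, the values appear consistent (to within about .01) with partial eta squared recovered from F and its degrees of freedom, F·df_effect / (F·df_effect + df_error). The sketch below rests on that assumption; `partial_eta_squared` is a hypothetical helper name.

```python
def partial_eta_squared(f, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f * df_effect) / (f * df_effect + df_error)


# Example: main effect of compatibility on RTs, F(1, 44) = 136.63.
effect_size = round(partial_eta_squared(136.63, 1, 44), 2)
```

The same call can be applied to the other F values reported in the Results section.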

Table 1. Mean reaction times (in milliseconds), mean error rates (as percentages), and standard deviations (SDs) as a function of group, condition, and compatibility. Reaction time compatibility effect (CE) sizes (incompatible minus compatible) are in the rightmost column.

Fig. 3. Averaged reaction times for baseline (i.e., compatible + neutral; dark gray) and incompatible (light gray) trials in the solo and joint condition. Error bars depict the standard errors of the paired difference scores (SE PD; Pfister & Janczyk, 2013), calculated for each condition.

The error ANOVA showed main effects of compatibility, F(1, 44) = 31.94, p < .001, η2 = .42—due to a lower error rate for baseline than for incompatible trials (0.4 % vs. 1.6 %)—condition, F(1, 44) = 14.28, p < .001, η2 = .25—revealing more errors in the joint than in the solo condition (1.4 % vs. 0.6 %)—and group, F(1, 44) = 15.95, p < .001, η2 = .27, documenting fewer errors in the nonhuman than in the human co-actor group (0.5 % vs. 1.5 %). The main effect of compatibility was modified by two two-way interactions, with condition, F(1, 44) = 6.53, p < .05, η2 = .13, and group, F(1, 44) = 9.42, p < .01, η2 = .18, and by the three-way interaction, F(1, 44) = 9.70, p < .01, η2 = .18.

Discussion

In the present study, we investigated whether the JFE is due to co-actors automatically triggering the co-representation of stimulus–response rules (Atmaca et al., 2011). Contradicting this assumption, we were able to demonstrate that the impact of a human co-actor is comparable to that of a Japanese waving cat that is entirely unrelated to the present task. This is the first demonstration of a JFE in a nonsocial setting, and we take it to challenge the assumption of task co-representation (Sebanz, Knoblich, & Prinz, 2005). Apparently, the presence of another human is not necessary to induce the representation of task-unrelated stimulus–response rules (as indicated by reliable flanker effects in the solo condition), nor is it necessary to further increase this effect in the joint condition.

Dolk et al. (2013) accounted for the JSE by assuming that participants represent their own responses to task-relevant stimuli just like any other event—namely, as distributed feature codes representing the response’s characteristics (Hommel, Müsseler, Aschersleben & Prinz, 2001). The presence of other events, such as another person, a Japanese waving cat, or a metronome, introduces a discrimination problem: Participants need to distinguish between the event that they control themselves and the other events that they don’t. Discrimination problems are commonly resolved by emphasizing discriminating features—that is, by increasing the weight of feature codes that refer to event (action) characteristics that differ most from the other events. With buttonpresses, probably the most obvious discriminating feature is location, suggesting that the presence of another event (or person or action) leads to an increase of the weight of location information in the internal coding of responses—intentional weighting, in the sense of Memelink and Hommel (2013). With left and right keypress responses, this would make participants code their actions in terms of left and right (referential coding; Dolk et al., 2013), which would be unnecessary in the absence of alternative events (i.e., in the solo condition). Accordingly, the presence of such events would lead to an increase in feature overlap between the stimuli and the responses (Kornblum et al., 1990), and thus to a larger Simon effect.

In the flanker task, the same response discrimination problem exists in principle, which suggests that participants are also more likely to increase the weight of features that discriminate their response from other events (joint condition), on top of the required stimulus discrimination (joint and solo condition). Hence, participants presumably coded their right-hand keypress more as “right” in the joint than in the solo condition—just as in a social Simon task. Given that no-go actions seem to be explicitly represented (Kühn & Brass, 2010), it is possible that even the coding of the no-go response was modified. In general, however, making alternative event representations more discriminable by emphasizing the coding of discriminable features increases the competition between these representations. If two given events (e.g., stimulus and action/response events) are coded in terms of features defined on n different feature dimensions (e.g., identity and space), this induces n different competitions between feature codes (Duncan, 1996; Dutzi & Hommel, 2009). If we assume that the response decision needs to await the resolution of the slowest competition, this means that the RT should increase with every extra feature dimension that event-coding processes consider. Flankers exert their effect on response competition by activating either the same response that the central target is activating (speeding up RTs in compatible trials) or by activating a competing response (slowing RTs in incompatible trials). This effect on subsequent behavior (as expressed in the compatibility effect) must be more pronounced, the more different the activated response representations are. If we thus assume that introducing a salient event (be it another person or a waving cat) induces greater response discrimination, it follows that the flanker effect should increase accordingly.
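The step from "the response decision awaits the slowest competition" to "RT increases with every extra feature dimension" can be written out as follows. This is only a sketch under stated assumptions: the symbols $T_i$ (resolution time of the competition on dimension $i$) and $t_0$ (all remaining processing time) are introduced here for illustration and do not appear in the original account.

```latex
\mathrm{RT}_n = t_0 + \max(T_1, \dots, T_n),
\qquad
\max(T_1, \dots, T_n, T_{n+1}) \;\ge\; \max(T_1, \dots, T_n)
\;\Longrightarrow\;
\mathbb{E}[\mathrm{RT}_{n+1}] \;\ge\; \mathbb{E}[\mathrm{RT}_n].
```

The inequality is strict whenever the added competition has some nonzero probability of being the slowest one, so under these assumptions the expected RT cannot decrease, and will typically increase, as further feature dimensions are considered.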

To summarize, the JFE does not require a truly social situation to occur—the presence of another salient event is apparently sufficient to increase flanker-induced response competition. We suggest that the presence of such an event induces the need to make the internal representation of one’s own response more distinctive, which in turn increases the impact of all flanker-induced competition. If so, our present findings provide converging evidence that no specialized “social” processes may be necessary to explain joint-action effects (Dolk et al., 2013).