Rarely, if ever, do humans perform bodily movements just for the sake of moving. Instead, bodily movements are usually made to achieve certain goal states, and behavioral expressions performed to produce such desired goal states are called actions. Goals, however, differ regarding their abstractness and their temporal distance. For example, you might have the goal of becoming a famous scientist—but this goal describes a rather abstract scenario, and likely will only be achieved in the far future. On the way thereto, intermediate goals will have to be achieved first, such as publishing a sufficient number of high-quality papers, gathering third-party funding, and so forth. In contrast—at a temporally very proximal distance—an intermediate goal might require moving a finger to press the “a” key, which will lead to the letter “a” appearing on the screen, as a step toward producing the word “action” in this manuscript. These movement-contingent and perceivable changes in the environment can be understood as a subclass of goals and are usually referred to as action effects.

Ideomotor theory and effect-based action control

It is common sense that acting involves the anticipation of desired goal states, because otherwise it would be impossible to perform an action to achieve this goal. Ideomotor theory, dating back to the 19th-century philosophers (e.g., Harleß, 1861; Herbart, 1825; James, 1890; see Pfister & Janczyk, 2012, or Stock & Stock, 2004, for historical notes), in fact ascribes to action effects a generative role: Their anticipation is what selects the corresponding bodily movement suited to produce the actual effect (see also Hommel, Müsseler, Aschersleben, & Prinz, 2001). Evidence for this has been provided with the response–effect compatibility (REC) paradigm (Kunde, 2001). In a standard experiment, participants respond to (nonspatial) stimuli with a left or right keypress in two blocked conditions: In a response–effect (R–E) compatible condition, this keypress triggers a visual action effect at a spatially corresponding location (e.g., pressing the left key produces a visual effect on the left side of the screen). In an incompatible R–E condition, the action effect occurs at the opposite location. The crucial finding is that response times (RTs) are longer in the incompatible than in the compatible condition, the REC effect. Note that both conditions employ the same stimuli and the same responses, and only differ regarding the action effect—which occurs only after the RT has already been measured. 
Similar findings have been obtained for other dimensions, such as response and effect intensity (Kunde, 2001; Paelecke & Kunde, 2007), rotations (Janczyk, Yamaguchi, Proctor, & Pfister, 2015), continuous lever movements (Janczyk, Pfister, & Kunde, 2012; Kunde, Pfister, & Janczyk, 2012), facial expressions (Kunde, Lozo, & Neumann, 2011), and semantic overlaps between color words and actual colors (Koch & Kunde, 2002) or number words and numbers (Badets, Koch, & Toussaint, 2013). Action effects also play a role in dual-task performance (Janczyk, 2016; Janczyk, Pfister, Hommel, & Kunde, 2014).

Although the REC effect seems rather universal, whether all types of tasks involve the use of action effects is still a subject of ongoing discussion. In particular, it has been suggested that action effects and anticipatory mechanisms may not play a role at all in so-called forced-choice tasks, that is, tasks in which certain stimuli determine the one and only correct response (e.g., Herwig, Prinz, & Waszak, 2007). In contrast, a free-choice task, that is, one in which a certain stimulus asks the participant to choose between two equally appropriate responses (Berlyne, 1957), is thought to induce an “intention-based action control mode” wherein effect anticipations do play a role. Notably, though, many of the above-mentioned studies showing REC effects have used forced-choice tasks, and when forced- and free-choice tasks were combined and administered in a random sequence, similar REC effects were reported for both (Pfister & Kunde, 2013). It seems, then, that the two tasks have more commonalities than differences (see Janczyk, Dambacher, Bieleke, & Gollwitzer, 2015; Janczyk, Nolden, & Jolicœur, 2015), and we put this to a further test in our experiment.

The temporal distance of action effects

A limiting characteristic of all of the aforementioned studies is that only a single action effect occurred immediately following the response, and thus was as temporally proximal as possible (see Footnote 1). The role of temporally subsequent action effects has so far been neglected in research on REC and the anticipation of goals for action selection. Although we were concerned here with temporal distance, some indirect evidence that more distal action effects may be crucial has come from the motor-learning literature. In general, performance and training in many sports can be improved when the instructions induce an external focus of attention (i.e., one directed at the action effects) rather than an internal focus of attention (i.e., one directed at the bodily movements themselves; see, e.g., Wulf, Höß, & Prinz, 1998; see Wulf & Prinz, 2001, for a review). Varying the distance of external action effects from one’s own body, however, has an additional effect: Performance is improved when action effects are more spatially, and thus also more temporally, distal from the body (e.g., McNevin, Shea, & Wulf, 2003), even though there seems to be an optimal “distality” for effects to be efficient. According to Wulf and Prinz (2001), temporally distal effects that still can be related to and predicted from the movement should be emphasized in order to optimize motor learning.

The present experiment and predictions

We adopted a typical spatial REC paradigm with manual left/right keypress responses and left-/right-occurring visual action effects (e.g., Kunde, 2001; Pfister & Kunde, 2013). To direct attention to the visual action effects, on a small proportion of all trials the usually white action effects turned yellow. On these catch trials, participants were to press both response keys simultaneously. Furthermore, we used forced- and free-choice tasks randomly intermingled to examine whether the REC effect was modulated by these tasks. The crucial extension related to the fact that the immediately occurring action effect E1 (a white-filled circle on either the left or the right side) further changed its position to either the left or the right several hundred milliseconds after its first onset (E2). Because both E1 and E2 could be spatially compatible or incompatible with the response R, four conditions resulted, and several hypothetical outcomes could be predicted. The first possibility was that both E1 and E2 features would be activated simultaneously and integrated into one unified event file, for example, in the way that the theory of event coding (TEC; Hommel et al., 2001) suggests. If this were true, an interaction of R–E1 and R–E2 compatibility should emerge. In this case, RTs either might decrease the more effects were compatible, or might only decrease if all effects were compatible. A second possibility was that both E1 and E2 might be activated independently of each other, and thus E1 and E2 would not interact. In this case, three scenarios were viable. (1) Only E1 would play a role, and subsequent effects would be irrelevant, leading to only a main effect of R–E1 compatibility. This resembles the situation in the typical REC studies so far. (2) Only the most distal, but still predictable, effect would be anticipated, and thus only a main effect of R–E2 compatibility would emerge. (3) Both effects would be anticipated, but only in succession. In this case, main effects of both R–E1 and R–E2 compatibility would emerge.

Method

Participants

Thirty-two students (24 female, eight male) from the University of Tübingen (age: M = 25.0 years, SD = 5.1) participated for either monetary compensation (€8) or course credit. Twenty-six of the participants were right-handed. Because no prior work with multiple action effects was available, the sample size was based on related prior work, in which smaller samples seem more common (e.g., Kunde, 2001; Wirth, Pfister, Janczyk, & Kunde, 2015).

Apparatus and stimuli

Stimulus presentation and response recording were done by a standard PC connected to a 17-in. CRT monitor. The stimuli were the letters “H” and “S,” as well as “!,” presented in white in the center of an otherwise black screen. The visual action effects were white-filled circles, and the location of their initial onset (E1) was visualized by white outlines to the left and right of the screen center. One single response key was located on the left and another one on the right side of the participant. When a response was given, the inside of one of the circles was filled white (E1). After the circle filled, it jumped to the left or to the right of its current location (E2). The general trial structure and the possible effects resulting from crossing R–E1 and R–E2 compatibility are illustrated in Fig. 1.

Fig. 1 Trial structure. In this example the imperative stimulus “S” is associated with a left-hand response, and “H” is associated with a right-hand response. After the response, either the left or the right circle was filled white and then jumped to either the left or the right. The current R–E1 and R–E2 mappings determined which circle would be filled and in which direction it would jump.

Procedure

Each trial began with the presentation of the two white outlines (250 ms). Subsequently, a white fixation cross was presented for 250 ms in the center of the screen, and then disappeared for 250 ms. Thereafter, the stimulus (“H,” “S,” or “!”) was presented and remained visible for up to 2,000 ms, or until a response was executed. The letters prompted a forced choice, whereas the “!” called for a free-choice response. If participants took longer than 2,000 ms to respond, the message “Bitte schneller reagieren!” (German for “Please respond faster!”) was presented for 1,000 ms. In the case of an error, the message “Falsche Taste!” (German for “Wrong key!”) was presented for 1,000 ms. In the case that a participant pressed a key before stimulus onset, the message “Bitte Tasten loslassen!” (German for “Please let go of the keys!”) was presented. The response immediately filled one of the circles to the left or the right of the stimulus. In the case of a compatible R–E1 mapping, the circle matching the side of the response key was filled, whereas in the case of an incompatible R–E1 mapping, the circle on the opposite side of the screen was filled. After an interval of 500 ms, the filled circle jumped to the left or to the right of its current location, with the two outlines still present on the screen. In the case of a compatible R–E2 mapping, the filled circle jumped in the direction that corresponded to the side of the response, whereas in the case of an incompatible R–E2 mapping, the filled circle jumped to the response-opposite side. The filled circle remained at the final location for an additional 500 ms. Usually, the next trial followed a blank intertrial interval of 1,000 ms. In the case of a catch trial, however, E2 was displayed in yellow, and participants were to press both keys simultaneously and as quickly as possible. If this response was not registered within 4,000 ms, a corresponding error message was fed back to the participants.
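The mapping logic of a single trial can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the experiment software: the function names `opposite` and `effect_sides`, the constant names, and the encoding of sides as the strings "left" and "right" are hypothetical, whereas the timing values are the ones given in the Procedure.

```python
# Timing constants taken from the Procedure (all in milliseconds).
OUTLINES_MS = 250          # empty outlines at trial start
FIXATION_MS = 250          # fixation cross
BLANK_MS = 250             # blank interval before the stimulus
RESPONSE_WINDOW_MS = 2000  # maximum stimulus duration
E1_TO_E2_MS = 500          # delay between filling (E1) and jump (E2)
E2_DURATION_MS = 500       # time the circle stays at its final location
ITI_MS = 1000              # blank intertrial interval


def opposite(side):
    """Return the side opposite to 'left' or 'right'."""
    return "right" if side == "left" else "left"


def effect_sides(response, e1_compatible, e2_compatible):
    """Return (side of the filled circle E1, jump direction E2).

    Both effects are defined relative to the response side, as in the text:
    a compatible mapping matches the response, an incompatible one is opposite.
    """
    if response not in ("left", "right"):
        raise ValueError("response must be 'left' or 'right'")
    e1 = response if e1_compatible else opposite(response)
    e2 = response if e2_compatible else opposite(response)
    return e1, e2


# Left keypress, incompatible R-E1 mapping, compatible R-E2 mapping:
print(effect_sides("left", False, True))  # ('right', 'left')
```

In this example the right circle is filled and then jumps to the left, that is, toward the side of the response.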

An experimental session started with one practice block, consisting of 30 trials in which only the imperative stimuli were presented, to familiarize participants with the stimulus–response (S–R) mapping and the task. Then, four practice blocks with filling and jumping circles followed, each consisting of six trials of one of the four possible R–E1/E2 mappings. Subsequently, three blocks of each of these four R–E mappings were presented, in the same order as in the practice blocks. In each condition, the first of the blocks (12 trials, including two catch trials) was considered practice, and each of the two subsequent experimental blocks comprised 60 trials (including 20 free-choice trials and six catch trials). In total, participants completed eight experimental blocks, amounting to 480 trials, of which 160 were free-choice trials and 48 were catch trials. The S–R mapping remained constant throughout the experiment and was counterbalanced across participants. The order of the R–E mappings was pseudorandomized by creating two cyclic groups of order four of the Latin square, resulting in eight R–E mapping orders. Participants received written instructions on screen between blocks. An experimental session lasted about 45 min. Participants were instructed to respond as quickly as possible to the letter by pressing a key, while maintaining errors at a low rate.
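The trial counts and the counterbalancing scheme can be checked with a short Python sketch. It rests on one reading of the text, namely that each "cyclic group" consists of the four cyclic rotations of a base order, so that two base orders yield eight mapping orders. The two base orders used here (one order and its reverse) and all identifier names are hypothetical; the text does not report which orders were actually used.

```python
from itertools import product

# The four R-E mappings obtained by crossing R-E1 and R-E2 compatibility.
MAPPINGS = [f"E1 {a} / E2 {b}"
            for a, b in product(("compatible", "incompatible"), repeat=2)]


def cyclic_square(base):
    """Return all cyclic rotations of `base`, i.e., a cyclic Latin square.

    Every mapping occurs exactly once in each row and in each position.
    """
    return [base[i:] + base[:i] for i in range(len(base))]


# Hypothetical base orders: the text does not say which two were used.
ORDERS = cyclic_square(MAPPINGS) + cyclic_square(MAPPINGS[::-1])

# Trial counts as stated in the Procedure.
EXPERIMENTAL_BLOCKS = len(MAPPINGS) * 2        # two experimental blocks per mapping
TOTAL_TRIALS = EXPERIMENTAL_BLOCKS * 60        # 60 trials per block
FREE_CHOICE_TRIALS = EXPERIMENTAL_BLOCKS * 20  # 20 free-choice trials per block
CATCH_TRIALS = EXPERIMENTAL_BLOCKS * 6         # 6 catch trials per block

print(len(ORDERS), TOTAL_TRIALS, FREE_CHOICE_TRIALS, CATCH_TRIALS)  # 8 480 160 48
```

The totals reproduce the figures given above (eight blocks, 480 trials, 160 free-choice trials, 48 catch trials).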

Design and analysis

Data from the practice blocks were not included in the analyses. Mean correct RTs and error rates were calculated for each participant as a function of R–E1 mapping (compatible vs. incompatible), R–E2 mapping (compatible vs. incompatible), and choice (forced vs. free), and subsequently averaged across participants. For the analysis of RTs, all error trials were excluded from the data analyses. Furthermore, trials with RTs deviating more than 2.5 standard deviations from the individual cell mean were considered outliers and also excluded from the data analysis (2.4 %). For error rates, only forced-choice trials were considered, because no response errors can be made in free-choice trials. A 2 × 2 × 2 repeated measures analysis of variance (ANOVA), with the factors R–E1 Mapping (compatible vs. incompatible), R–E2 Mapping (compatible vs. incompatible), and Choice (forced vs. free choice), was computed on RTs. A 2 × 2 repeated measures ANOVA with the factors R–E1 Mapping (compatible vs. incompatible) and R–E2 Mapping (compatible vs. incompatible) was computed on error rates. After error trials and outliers were excluded, the response proportions (left vs. right keypresses) in free-choice trials were calculated. As was indicated by a pairwise t test, the mean response ratios (46.7 % left vs. 53.3 % right) did not differ significantly, t(31) = 1.27, p = .213.
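The outlier criterion can be illustrated with a minimal Python sketch. It assumes a single, non-iterative pass per design cell and the sample standard deviation (n − 1 in the denominator); the text specifies neither detail, and the function name `exclude_outliers` as well as the example RTs are made up for illustration.

```python
from statistics import mean, stdev


def exclude_outliers(rts, criterion=2.5):
    """Drop RTs deviating more than `criterion` SDs from the cell mean.

    `rts` holds the correct RTs (in ms) of one participant in one design cell.
    Assumption: one pass, sample standard deviation.
    """
    if len(rts) < 2:
        return list(rts)  # an SD cannot be computed from fewer than two values
    m, sd = mean(rts), stdev(rts)
    return [rt for rt in rts if abs(rt - m) <= criterion * sd]


# Made-up RTs for one cell; the 1500-ms trial lies more than 2.5 SDs above the mean.
cell = [380, 390, 400, 410, 420, 395, 405, 385, 415, 400, 400, 1500]
trimmed = exclude_outliers(cell)
print(len(cell), len(trimmed))  # 12 11
```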

Results

Mean correct RTs are visualized in Fig. 2 (see also Table 1). As can be seen, RTs were much shorter for the forced-choice than for the free-choice trials, F(1, 31) = 102.91, p < .001, ηp² = .77, a standard finding when comparing these types of tasks (see, e.g., Berlyne, 1957; Janczyk, Nolden, et al., 2015). RTs were not much affected by R–E1 compatibility, F(1, 31) = 1.07, p = .309, ηp² = .03, but they were overall shorter in the R–E2 compatible than in the R–E2 incompatible conditions, F(1, 31) = 5.96, p = .021, ηp² = .16; thus, there was an REC effect for E2. None of the two-way interactions was significant: R–E1 × R–E2 compatibility, F(1, 31) = 0.09, p = .762, ηp² < .01; R–E1 × Choice, F(1, 31) = 2.91, p = .098, ηp² = .09; R–E2 × Choice, F(1, 31) = 0.28, p = .599, ηp² = .01. The three-way interaction was also not significant, F(1, 31) = 0.05, p = .809, ηp² < .01.
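The reported effect sizes can be recovered from the F values and their degrees of freedom through the standard identity for partial eta squared, F · df_effect / (F · df_effect + df_error). The short Python sketch below applies this identity; the function name is hypothetical, and the computation is a plausibility check rather than the analysis code behind the reported values.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effects reported above, each with F(1, 31).
for label, f_value in [("Choice", 102.91), ("R-E1", 1.07), ("R-E2", 5.96)]:
    print(label, round(partial_eta_squared(f_value, 1, 31), 2))
# Choice 0.77
# R-E1 0.03
# R-E2 0.16
```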

Fig. 2 Mean response times (RTs) as a function of R–E1 compatibility, R–E2 compatibility, and choice. Error bars represent within-subjects standard errors, based on the R–E2 compatibility variable.

Table 1 Mean reaction times (RTs, in milliseconds) and error rates (ERs; only for forced-choice trials), as a function of R–E1 mapping, R–E2 mapping, and choice (only for RTs)

Error rates are summarized in Table 1, and no effect approached significance, all Fs ≤ 1.51, all ps ≥ .229.

Discussion

According to ideomotor theory, action effects are anticipated prior to movement initiation, and this anticipation is action selection. Starting with Kunde (2001), much research has provided evidence for this assertion, by showing that even events occurring after a response has been given (and thus after the RT has been measured) can influence RTs. In particular, the REC paradigm makes use of the dimensional overlap between the response features and features of the immediately occurring action effects, and the REC effect describes longer RTs when both are incompatible than when both are compatible. A shortcoming of this approach is that so far, only one effect was employed, which—most often—occurred directly following the response. However, rarely does a given movement produce only one effect; subsequent consequences are more likely to follow a first effect.

Summary of the results and relation to other studies

The present experiment therefore extends the literature by investigating whether temporally more distal action effects are also anticipated during action selection. To this end, the first-occurring action effect changed its position after some time in either a response-compatible or a response-incompatible direction. The results were straightforward: An REC effect was only observed for the temporally more distal effect, and the compatibility of the immediate action effect and the response did not play a role. It seems, then, that temporally more distal action effects (or, in broader terms, terminal goal states) are indeed anticipated prior to movement initiation—a finding that certainly extends the scope of ideomotor theory. This finding was also true for both forced- and free-choice tasks to the same degree (see also, e.g., Pfister & Kunde, 2013). This substantiates the increasing evidence that effect anticipation is a ubiquitous feature of both tasks, and is not restricted to free-choice tasks or intention-based action modes, as envisaged by, for example, Herwig et al. (2007).

Why did we not observe an effect of R–E1 compatibility, since others have often documented that immediate action effects are anticipated (e.g., Kunde, 2001; Pfister & Kunde, 2013)? One possibility is that E1 was not anticipated at all in our setup; another possibility is that E2 anticipation overwrote E1 anticipation. We cannot distinguish between these two alternatives on the basis of the present data. However, if E1 had been anticipated, an impact of its anticipation appears likely to us. This was not the case according to the present results, though, and tentatively we suggest that only E2—the most distal and ultimate action effect—was anticipated for action selection. However, whether multiple action effects (including body-related, proprioceptive ones) are anticipated for action selection will be important to address in future research.

Some researchers have investigated the impact of a temporal delay between response and the occurrence of action effects (Dignath, Pfister, Eder, Kiesel, & Kunde, 2014; Wirth et al., 2015). These studies showed that RTs are longer when the effect occurs after a long interval than when it occurs after a short interval. Although its results pointed in the same direction—that action effects do not have to occur immediately following a response in order to affect action selection—the present study addressed a different question, in that we did not vary the intervals between responses and their effects, but presented two subsequent action effects with fixed intervals. Whether the impact of the temporally more distal action effect depends on whether it occurs soon or late should be a subject for future research. At present, however, it is not clear whether the time intervals between a movement and its associated effects are considered a single feature or are integrated with the effect that terminates the interval (Dignath & Janczyk, 2016). Another line of research has also employed multiple action effects, but within a series of sequential actions. For example, musicians initiate a sequence of three actions (pressing three vertically aligned response buttons) faster when each keypress results in an auditory effect and the pitch of these effects is compatibly ordered with the key locations (i.e., the lowest key produces a low-pitch tone, the medium key produces a medium-pitch tone, and the highest key produces a high-pitch tone), as compared to neutral or incompatible key- and pitch-height relations (Keller & Koch, 2008). This line of research also shows that temporally more distal action effects can be included in action planning, but an important difference is that series of actions were planned, rather than the single action in the present study.

Limitations and alternative account

Several other points deserve some mention, because they indicate limitations and/or offer alternative accounts. First, a closer look at the design reveals a question that we cannot answer at this point: We defined the compatibility of E2 with respect to R, the location of the response. However, the (in)compatibility of R and E2 always went together with the (in)compatibility of E1 and E2 (see Fig. 1): For example, if one presses a left response key that is followed by a right E1, a compatible E2 would move to the left because the left response key had been pressed, but at the same time would also be to the left of E1. Tentatively, though, we suggest that the relation of R and E2 (and not of E1 and E2) is crucial, for two reasons: (1) When an action is performed to achieve a specific goal state, it certainly makes sense to focus on this state (and this is E2 in our case), and not on the intermediate states (such as E1). (2) Our results did not show any effect of R–E1 compatibility, and thus it might be possible that E1 was not anticipated at all. In this case, it is hard to see how the compatibility relation with E1 would count. The latter reasoning fits with observations that a standard REC effect depends on the location of the response key, and not on the anatomical features of the effectors that produce the keypress (Pfister & Kunde, 2013). In these experiments, the REC effects were of the same size in both a standard and a crossed-hand condition. In any case, even an interpretation in terms of E1–E2 compatibility suggests that not only the immediate, but also subsequent, action effects are taken into account and anticipated prior to response execution.

Second, although we interpreted the findings in terms of the effects’ temporal distality, E1 and E2 differed with regard to other characteristics. For example, on average E2 was presented more eccentrically than E1, and thus perhaps E2 was “more (in)compatible” with the response. However, this explanation should result in an interaction of R–E1 and R–E2 compatibility, with a stronger influence of R–E2 compatibility when it matches R–E1 compatibility. Furthermore, the events following the response may have been perceived as a single event—that is, a stimulus jumping to the left or right by apparent movement. In this sense, the first part of the event (E1) may have been subjectively less relevant than its second part (E2), causing some imbalance in the effects’ salience. Future experiments could address this issue by making both effects more similar and/or manipulating the relative saliences of both effects, to scrutinize whether salience or temporal distality is the crucial factor.

Conclusion

In summary, the present experiment showed that not only immediate action effects can be anticipated prior to movement initiation, but that temporally more distal effects also contribute to action selection. According to our interpretation of the present results, we tentatively suggest that only the most distal, but still predictable, action effect is considered at all.