Human actions are goal-directed, and people often must choose from several possible action goals. For example, when cooking in the kitchen, one may need to open the freezer or refrigerator to retrieve a food item, operate one among several knobs to turn on a targeted burner on a stovetop, and reach to a desired dish among a stack of dishes. If the handles for the freezer and refrigerator doors are close together, as is the case for side-by-side refrigerators, the wrong handle may be grabbed accidentally; if the knobs for the stove are ambiguously mapped to their respective burners, the wrong burner may be operated; if the dishes all have the same decorative pattern, they cannot easily be distinguished. Better separation of the alternative possible goals can help people select the intended one faster and make fewer mistakes, and this separation can be in terms of location, size, color, or how people mentally represent the goals.

Participants of experiments in a psychological laboratory also need to distinguish among alternative goals (e.g., when selecting between two possible responses in reacting to one of two stimulus colors) to perform an experimental task. The Stroop (1935) color-identification task, for example, has been widely used to reveal underlying mechanisms in the processing of task-relevant and -irrelevant information (e.g., MacLeod, 1992). In a Stroop task, participants are to identify the physical color (task-relevant) of a letter string (usually a color word), while ignoring the meaning conveyed by the string (task-irrelevant). Performance is better when the word meaning and physical color are congruent (e.g., the word “red” written in red) than when they are incongruent (e.g., the word “red” written in blue). This Stroop effect reflects interference from the word meaning on incongruent trials, with a larger Stroop effect indicating more interference.

Although the Stroop effect is most often studied with vocal color-naming responses, it also occurs when stimulus colors are identified with keypresses (e.g., Melara & Mounts, 1993). In a two-choice version of the task, the action goals are to classify the color of the displayed stimulus (e.g., red or blue) by pressing a left or right response key. The color-word meanings overlap conceptually with the relevant stimulus dimension, in that both dimensions involve the same colors. Consequently, the word meaning, though irrelevant to the task, also activates the response corresponding to its category, to some degree.

Lakens, Schneider, Jostmann, and Schubert (2011) conducted a two-choice Stroop task of this type in which the colors red and blue were mapped to keypresses made with the left and right index fingers. The Stroop effect was larger when the response hands were physically close on adjacent keys of the keyboard (“K” and “L” keys on the center row) than when they were farther apart on nonadjacent keys (“S” key and the “5” key on the number pad located on the right side). In the concluding sentence of their article, Lakens et al. attributed this response-distance effect to anatomical discriminability: “The distance between your hands can help you tell things apart in your mind” (p. 889). However, their method did not dissociate anatomical discriminability (the distance between responding hands) from response-key discriminability (the separation between response keys). This distinction is critical, because a basis in anatomical discriminability would be more consistent with an embodied-cognition account (e.g., Hostetter & Alibali, 2008), in which the state of a person’s body influences cognitive coding, whereas a basis in response-key discriminability would implicate an action-goal account (e.g., Buhlmann, Umiltà, & Wascher, 2007), in which coding is in terms of the locations at which goal actions take place.

To disentangle anatomical discriminability from response-key discriminability, Proctor and Chen (2012) had participants respond with two sticks, each of which had one end affixed to a response key and the other end operated by the responding hand. The sticks were held by the corresponding hands such that when the response keys were close, the hands were far apart, whereas when the keys were far apart, the hands were close together. Their results showed that the response-distance effect was determined by the separation between the keys rather than the distance between the hands. This evidence is in agreement with an action-goal account, according to which better discriminability of the action goals in the far-key condition leads to less interference from the alternative response, and thus to a smaller Stroop effect, than is found in the close-key condition. However, because the responses were made on a standard QWERTY computer keyboard in Lakens et al.’s (2011) and Proctor and Chen’s studies, the physical key separation was confounded with other factors, including the labels specifying the response keys and whether other keys intervened between the two response keys.

First, the response keys were labeled with letters for the close-key condition, but with a letter and a digit for the far-key condition. Thus, there was a categorical, conceptual distinction in the far-key condition, but not in the close-key condition. Because this conceptual distance was confounded with the physical distance, the lesser Stroop effect for the far-key condition could have been due to greater conceptual distance between the response keys rather than to their physical separation. In addition, although Proctor and Chen’s (2012) results provided evidence against an anatomical basis for the effect, one could still argue that the response-distance effect could be accommodated by an embodied account, according to which the sticks are viewed as tools that are functional extensions of the hands (see, e.g., Iriki, Tanaka, & Iwamura, 1996): The distance between the tips on the keys could represent the hand distance, and thus the response-distance effect could still be defined by the functional hand distance. A conceptual response-distance effect would be consistent with the action-goal account, but not with the tool-tip account.

Second, pure physical distance between the two response keys was also confounded with the absence (close condition) or presence (far condition) of additional keys between them. This factor could also be crucial, since several studies have made a distinction between categorical and coordinate spatial representations (Kosslyn, 1994; van der Ham & Borst, 2011). The former representations classify locations into categories such as left and right, whereas the latter ones code locations with respect to spatial coordinates. It is plausible that when only two keys are being used, participants might code the responses categorically as left and right, regardless of the physical separation. When the keys that are far apart have other keys intervening between them, though, the responses may be coded in terms of the spatial coordinates corresponding to the ends of the row of keys, enhancing the discriminability of the far response keys.

The experiment

The present experiment was designed to dissociate the factors Conceptual Distance, Physical Distance, and Intervening Keys, to determine which of those contribute to the response-distance effect. One purpose was to test whether a manipulation of the conceptual distance between response keys would yield a response-distance effect. Nett and Frings (2013) recently replicated the response-distance effect when the response-key labels were from the same category (both numbers or both letters) for the far keys as well as for the close keys, but the effect tended to be smaller than when the far-key labels were from different categories. Also, additional keys intervened between the far response keys in all cases. In the present study, we adopted the numerical distance between digits as the conceptual distance and varied both conceptual and physical distance. Numerical distance between numbers has been found to influence the time to classify digits. This effect was first found by Moyer and Landauer (1967), who showed that the time to judge which of two digits was larger decreased as the difference (distance) between the digits became larger. The numerical distance effect has also been found in same–different number matching and visual search tasks (e.g., Schwarz & Eiselt, 2012), and is due to the need to distinguish between the mental representations (e.g., Goldfarb, Henik, Rubinsten, Bloch-David, & Gertner, 2011) and/or the physical similarity (e.g., Schwarz & Eiselt, 2012) of the digits. Regardless of the mechanism, this effect of numerical distance between the digits presented is in terms of the stimuli. According to a common-coding view, stimuli and responses share common cognitive representations and interact with each other (Hommel, Müsseler, Aschersleben, & Prinz, 2001; Prinz, 1997). On this basis, we predicted that if digits were used to designate responses, a response-distance effect would occur as a function of the numerical distance between the digits. That is, if digits with a far distance (e.g., 1 and 9) are used as labels on the response keys, the distance between these keys could be viewed as conceptually far; in contrast, keys labeled with two digits of close numerical distance (e.g., 5 and 6) could be viewed as conceptually close.

The present experiment also dissociated the pure physical distance between two response keys from the presence of intervening keys. To test whether the presence of intervening keys is a critical factor, we placed ten intervening keys between the far response keys, similar to the previous studies, and compared this intervening-key condition with a no-intervening-key condition. A response-distance effect based on key distance in both conditions would implicate pure physical separation as an important variable, whereas an effect only in the intervening-key condition would indicate an influence of other objects on the distinctiveness of the action goals.

To summarize, we used a new method to separate conceptual distance and intervening keys from pure physical distance in the present experiment. Besides the pure, physically close and far distances between the two response keys, the response keys were also labeled differently with “5” and “6” or “1” and “9,” and the presence of intervening keys was also manipulated orthogonally with these two factors. Thus, given a constant physical distance, the responses could be conceptually close or far; similarly, given a constant conceptual distance, the response keys could be physically close or far; and for the physically far distance there could be either no or multiple keys between the response keys. In this way, we were able to test whether a conceptual response-distance effect was present and whether pure physical distance or the presence of intervening keys was a critical factor for the physical response-distance effect.

Method

Participants

A total of 288 undergraduate students enrolled in introductory psychology courses during the Fall 2012 and Spring 2013 semesters participated for experimental credits. All reported having normal or corrected-to-normal vision.

Apparatus and stimuli

The letter strings could be “red,” “blue,” or “XXXX,” displayed in red or blue color on a white background. Responses were made via an Ergodex DX1 Input System (www.ergodex.com), which allows keys to be placed at any locations on the DX1 tray. Only two, horizontally aligned keys were used as response keys; these were marked as “5” and “6” or “1” and “9,” with the distance between the inner edges of the keys being 0 or 19 cm (physically close or far; see Fig. 1). We created two physically-far conditions, with the difference being whether no or ten unlabeled keys were placed between the two response keys. In all three physical-distance conditions, the response key with the smaller number was located to the left of the one with the larger number.

Fig. 1 Response device used in the present experiment. The left column shows the conceptually-close (5 and 6), and the right column the conceptually-far (1 and 9), condition. The top row shows the physically-close, the middle row the physically-far-with-no-intervening-keys, and the bottom row the physically-far-with-intervening-keys condition.

Procedure

Participants were instructed to respond to the physical color of the letter strings while ignoring the word meaning. All instructions were given in terms of the numbers marked on the keys, and the response criterion (e.g., “5 = red, 6 = blue”) was always displayed on the bottom right of the screen in black during the experiment. Participants were told to put their left and right index fingers on the respective response keys, and to press one key for red and the other for blue. This color–response mapping was counterbalanced across participants.

Each trial started with a 200-ms blank screen, followed by a 500-ms fixation cross, and then the stimulus appeared and remained visible until a response was made. Participants were told to respond as quickly and accurately as possible. A 500-ms visual feedback—“Correct” or “Incorrect”—was presented after each response, with an auditory warning for the incorrect trials.

Three independent variables were used in the analysis: congruency between physical color and word meaning (congruent, incongruent, and neutral), physical response distance (close, far-with-no-keys, and far-with-keys), and conceptual response distance [close (“5” and “6”) and far (“1” and “9”)]. Congruency was a within-subjects variable, whereas physical and conceptual distances were between-subjects variables. Accordingly, six between-subjects conditions were created: “physically-close + conceptually-close,” “physically-close + conceptually-far,” “physically-far-with-no-keys + conceptually-close,” “physically-far-with-no-keys + conceptually-far,” “physically-far-with-keys + conceptually-close,” and “physically-far-with-keys + conceptually-far” conditions. We tested 48 participants in each condition, approximating the number tested by Lakens et al. (2011; 41 participants). Each person performed ten practice trials, followed by four blocks of 180 trials, which allowed for an assessment of whether the data pattern changed with practice. Each block included equal numbers of congruent, incongruent, and neutral trials, and successive blocks were separated by a self-paced 1- to 2-min break.
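The design described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the software used in the experiment: the names `make_block`, `congruency`, and `cells` are hypothetical, and the assumption that each of the six word–color pairings appears equally often within a block goes beyond what the text states (the text specifies only equal numbers of congruent, incongruent, and neutral trials).

```python
import random
from collections import Counter
from itertools import product

# Between-subjects factors as described in the text.
PHYSICAL = ["close", "far-with-no-keys", "far-with-keys"]
CONCEPTUAL = {"close": ("5", "6"), "far": ("1", "9")}
N_PER_CELL = 48

# Stimulus dimensions as described in the text.
WORDS = ["red", "blue", "XXXX"]
COLORS = ["red", "blue"]


def congruency(word, color):
    """Classify a word-color pairing as congruent, incongruent, or neutral."""
    if word == "XXXX":
        return "neutral"
    return "congruent" if word == color else "incongruent"


def make_block(n_trials=180, rng=None):
    """Build one shuffled block with equal numbers of each pairing (assumed)."""
    rng = rng or random.Random()
    pairings = list(product(WORDS, COLORS))  # 6 word-color pairings
    reps = n_trials // len(pairings)         # 30 repetitions of each pairing
    trials = [
        {"word": w, "color": c, "congruency": congruency(w, c)}
        for w, c in pairings
        for _ in range(reps)
    ]
    rng.shuffle(trials)
    return trials


# Six between-subjects cells: 3 physical distances x 2 conceptual distances.
cells = list(product(PHYSICAL, CONCEPTUAL))
total_participants = len(cells) * N_PER_CELL  # 6 x 48

block = make_block(rng=random.Random(0))
counts = Counter(trial["congruency"] for trial in block)
```

Under these assumptions, each block contains 60 trials per congruency level, and four such blocks give the 720 experimental trials per participant.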

Results

In all, 17 participants (5.9 % of total) were replaced: two because their overall percentages of errors (PEs) were higher than 20 %, one because the PE in the first 30 trials was extremely high (77.8 %), and 14 because their mean response times (RTs) were 2.5 SDs longer than the mean RT of the remaining participants in the same condition. Incorrect trials (4.3 %) were excluded, as were trials with RTs 2.5 SDs shorter (0.02 %) or longer (2.63 %) than the participant’s own mean RT in each congruency condition. Analyses of variance (ANOVAs) were conducted for RTs and PEs, with Congruency and Trial Block as within-subjects factors and Physical Distance and Conceptual Distance as between-subjects factors.
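The trial-level RT trimming rule can be illustrated with a short Python sketch. The function name `trim_rts` and the example RTs are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, because the text does not say which estimator was used.

```python
from statistics import mean, stdev


def trim_rts(rts, criterion=2.5):
    """Trim one participant's correct-trial RTs in one congruency condition.

    Returns the retained RTs plus the counts of RTs that were too fast or
    too slow relative to mean +/- criterion * SD.
    """
    m = mean(rts)
    sd = stdev(rts)  # sample SD; an assumption, not stated in the text
    lower, upper = m - criterion * sd, m + criterion * sd
    kept = [rt for rt in rts if lower <= rt <= upper]
    n_fast = sum(rt < lower for rt in rts)
    n_slow = sum(rt > upper for rt in rts)
    return kept, n_fast, n_slow


# Made-up RTs (ms) for illustration: nineteen typical trials and one slow one.
example_rts = [400] * 19 + [1500]
kept, n_fast, n_slow = trim_rts(example_rts)
```

In this made-up example the single 1,500-ms trial exceeds the upper cutoff and is dropped, while no trial falls below the lower cutoff. This mirrors the asymmetry reported above, in which far more trials were excluded as too slow than as too fast.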

For RTs, two main effects were significant: physical distance (Ms = 437, 442, and 400 ms for the physically-close, far-with-no-keys, and far-with-keys conditions, respectively), F(2, 282) = 9.29, p < .001, ηp² = .06, and congruency (Ms = 417, 439, and 422 ms for the congruent, incongruent, and neutral conditions, respectively), F(2, 564) = 105.31, p < .001, ηp² = .27 (see Fig. 2). The former effect implies that the responses were more discriminable for the far keys, but only when other keys were located between them, whereas the latter indicates an overall Stroop effect of 22 ms that was largely due to interference on incongruent trials.
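The arithmetic behind the overall Stroop effect can be made explicit using the condition means reported above. Splitting the effect into interference and facilitation relative to the neutral condition is a common convention that is assumed here for illustration. Because the reported means are rounded, the components are approximate.

```python
# Mean RTs (ms) by congruency condition, as reported in the text.
mean_rt = {"congruent": 417, "incongruent": 439, "neutral": 422}

# Overall Stroop effect: incongruent minus congruent.
stroop = mean_rt["incongruent"] - mean_rt["congruent"]

# Assumed decomposition relative to the neutral baseline.
interference = mean_rt["incongruent"] - mean_rt["neutral"]
facilitation = mean_rt["neutral"] - mean_rt["congruent"]

print(f"Stroop effect: {stroop} ms")
print(f"Interference:  {interference} ms")
print(f"Facilitation:  {facilitation} ms")
```

With these means, 17 ms of the 22-ms Stroop effect comes from slowing on incongruent trials and 5 ms from speeding on congruent trials.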

Fig. 2 Mean response times (RTs) as a function of congruency and conceptual response distance in the physically-close (top panel), far-with-no-keys (middle panel), and far-with-keys (bottom panel) conditions. Error bars represent 95 % confidence intervals, computed using the method for within-subjects designs (Cousineau, 2005).

Congruency interacted separately with conceptual distance, F(2, 564) = 6.34, p = .002, ηp² = .02, and physical distance, F(4, 564) = 3.22, p = .013, ηp² = .02. Orthogonal contrasts comparing the Stroop effects (incongruent–congruent) showed effects of conceptual distance and intervening keys, but not of pure physical distance. The Stroop effect was larger for the conceptually-close condition than for the conceptually-far condition (Ms = 26 and 16 ms, respectively), F(1, 282) = 9.25, p = .003, ηp² = .03, but not for the physically-close condition than for the physically-far-with-no-keys condition (Ms = 25 and 23 ms, respectively), F < 1. The Stroop effect was larger, though, for the latter condition than for the physically-far-with-keys condition (Ms = 23 and 15 ms, respectively), F(1, 282) = 27.30, p < .001, ηp² = .09. The only significant term including block was its interaction with congruency, F(6, 1692) = 2.42, p = .025, ηp² = .01: The Stroop effect was larger in the first block than in the other three (Ms = 26, 21, 19, and 20 ms, respectively), F(1, 1148) = 5.47, p = .020, ηp² = .01, across which it did not differ, Fs < 1.

No other effect was significant, including the three-way interaction between congruency, conceptual distance, and physical distance, F(4, 564) = 2.13, p = .076, ηp² = .02, but because this interaction approached the .05 level, we performed separate analyses for the three physical-distance conditions. The interaction between congruency and conceptual distance was significant in the physically-close and physically-far-with-no-keys conditions, Fs(2, 188) = 3.63 and 4.10, ps = .029 and .018, ηp²s = .04, but not in the physically-far-with-keys condition, F(2, 188) = 1.37, p = .257, ηp² = .01, suggesting little effect of conceptual distance when the intervening keys had already increased the discriminability of the responses.

The ANOVA of PEs showed main effects of congruency (Ms = 3.8 %, 5.2 %, and 4.3 % for the congruent, incongruent, and neutral conditions, respectively), F(2, 564) = 57.27, p < .001, ηp² = .17, and block (Ms = 4.3 %, 4.3 %, 4.3 %, and 4.8 % for Blocks 1–4), F(3, 846) = 3.66, p = .012, ηp² = .01. No other effects were significant, ps > .067.

Lakens et al. (2011) had used only 30 trials per participant, and Proctor and Chen (2012) found that a response-distance effect (of 32 ms) was evident for the first 30 of 720 trials, but not over all trials. Consequently, we calculated the Stroop effects for the first 30 trials and for all trials for each participant, and performed an ANOVA with Number of Trials as a within-subjects factor, and Physical Distance and Conceptual Distance as between-subjects factors. Number of trials interacted with conceptual distance, F(1, 282) = 4.16, p = .042, ηp² = .02, but not with physical distance or its interaction with conceptual distance, Fs < 1. The conceptual response-distance effect was 30 ms (49-ms Stroop effect for conceptually close minus 19 ms for conceptually far) for the first 30 trials, as opposed to 10 ms (i.e., 26 – 16 ms) for all 720 trials. This outcome suggests that the conceptual component is mainly what changes early in practice.
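The conceptual response-distance effect is a difference between two Stroop effects, which the following minimal sketch spells out using the values reported above. The dictionary layout and the name `distance_effect` are illustrative only.

```python
# Stroop effects (ms) for each conceptual-distance condition, as reported.
stroop_ms = {
    "first 30 trials": {"close": 49, "far": 19},
    "all 720 trials": {"close": 26, "far": 16},
}


def distance_effect(stroop_by_distance):
    """Response-distance effect: Stroop effect for close minus far labels."""
    return stroop_by_distance["close"] - stroop_by_distance["far"]


effects = {window: distance_effect(s) for window, s in stroop_ms.items()}

for window, effect in effects.items():
    print(f"{window}: conceptual response-distance effect = {effect} ms")
```

The effect shrinks from 30 ms in the first 30 trials to 10 ms over all trials, largely because the Stroop effect in the conceptually-close condition drops from 49 to 26 ms, whereas the conceptually-far condition changes by only 3 ms.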

Discussion

Lakens et al. (2011) provided evidence that when the responses in a two-choice Stroop task are less distinguishable, choice between the alternatives is more difficult, and the Stroop effect is larger. In their study, the Stroop effect was larger when the index fingers of the respective hands were on adjacent keys than when the keys were separated farther. However, because the manipulation of spacing was done on a standard keyboard, several factors in addition to physical separation differentiated the conditions. We had previously shown that distance between the responding hands is not crucial by using a procedure in which the keys were operated by sticks pressed by the hands at opposite locations (Proctor & Chen, 2012). The Stroop effect was larger when adjacent keys were operated by separated hands than when separated keys were operated by adjacent hands. In the prior studies, the far-key condition differed from the close-key condition not only in physical separation, but also in being labeled by letter and digit categories, rather than only by letters, and in having intervening keys. The present study addressed the question as to which of these factors that differentiated the close- and far-key conditions influenced the discriminability of the action goals.

Our results showed that the Stroop effect is influenced by the conceptual distance and the intervening keys between response keys, but not by the physical key separation. These results, along with our previous ones, are difficult to reconcile with an account in terms of hand separation (e.g., Lakens et al., 2011), even one that allows for a tool tip to be considered an extension of the hand. They are consistent with the action-goal account (e.g., Buhlmann et al., 2007; Hoffmann, Lenhard, Sebald, & Pfister, 2009; Hommel et al., 2001), according to which the response keys are coded as action goals, and farther distance between the goals leads to less interference, resulting in a smaller Stroop effect. In the two-choice Stroop color-identification task, the action goals are the classifications of the physical colors of the words as red or blue by pressing the appropriate response key. The word meanings are also processed, however, and activate the responses assigned to the physical colors. When the action goals are less distinguishable, performance will be worse, particularly for incongruent Stroop trials, on which both action goals are receiving activation.

In the present study, the response keys were labeled with digits, and conceptual distance was operationally defined as numerical distance. Our results provided evidence that the responses were more discriminable when the numerical difference was large than when it was small. One issue in terms of generalizability is whether similar findings would occur for other ordered dimensions, such as letters of the alphabet. Also, because the smaller digit was always placed on the left key and the larger one on the right key, the key assignments were consistent with the order on a left-to-right mental number line (Dehaene, Bossini, & Giraux, 1993). Thus, the demonstrated conceptual response-distance effect might itself be mediated by spatial coding of the digits. Both of these possibilities need to be evaluated in further experiments.

By noting in the Introduction that, in the studies of Lakens et al. (2011) and Proctor and Chen (2012), the close keys were labeled with letters and the far keys with a letter and a number, we implied that a difference between categories also increases response discriminability. Nett and Frings’s (2013) recent results could be taken to suggest the contrary. They obtained a response-distance effect when response-key labels from the same category (letters or digits) were used for the far-key condition as well as the close-key condition. However, their response-distance effects tended to be smaller than those found when the labels for the far keys are from different categories. More importantly, because the responses were made on a standard computer keyboard, the far keys were always separated by additional keys, which would tend to yield little or no benefit of conceptual distance. That is, in the present study, the Stroop effect was reduced at the far physical distance when additional keys were inserted between the two response keys, with the conceptual response distance having little additional influence on performance.

Together, the present results and those of Proctor and Chen (2012) provide evidence that neither the distance between the hands nor the physical distance between the keys has much impact on discrimination between the action goals in the Stroop task. Our finding that the conceptual distance contributes to distinguishing the action goals is important, because key labels have tended to be neglected and considered only when they overlap with the relevant stimulus dimension (e.g., color stimuli and labels; see Hommel, 2004, and McClain, 1983). Also, the evidence from the conceptual response-distance effect argues against the supposition that “keeping your hands apart might actually help to keep things apart in your mind” (Lakens et al., 2011, p. 889). It is not physical distance between the hands or the response keys, but conceptual distance and intervening keys, that increases the discriminability of action goals.