When interacting with the world, people encounter objects, states, or events together with the words used to refer to these entities. As a result, the words get associated with experiential traces related to the referents of these labels. When people later hear or read words referring to the respective objects, states, and events, the corresponding experiential traces get reactivated (Zwaan & Madden, 2005). Importantly, the reactivated experiential traces presumably consist not only of multimodal attributes of the entities themselves, but also of contextual information, such as the typical actions performed with the entities or the typical situations in which the entities are encountered (Barsalou, 1999). In the present article, we are concerned with the activation of a certain type of information during word processing—namely, information concerning an entity’s location in vertical space (up vs. down). For some words, this information may be part of the meaning of the word, be it in terms of an absolute or a relative up or down feature (e.g., summit vs. roof). For others, it may constitute more of a contextual feature resulting from the fact that the entity referred to is often encountered in upper or lower locations (e.g., eagle vs. worm). In both cases, if the experiential-trace hypothesis is correct, this information should get activated during word processing. Specifically, a word whose referent is associated with an up or down location should reactivate a set of experiential traces that share the corresponding spatial attribute (up or down). As a result, an up or down feature should become activated. Furthermore, since the underlying mechanism is an association resulting from frequent co-occurrence of a word and experiencing its referent, the activation should be relatively automatic, in the sense that it occurs without intention, consumes few conscious resources, and is not open to awareness or introspection (Posner & Snyder, 1975).

The literature provides evidence that the processing of individual words gives rise to the activation of location information. However, the exact conditions for obtaining these effects, as well as the underlying mechanisms, are still unclear. The studies suggesting that location information is activated during word processing typically have provided participants with contextual setting information (e.g., Borghi, Glenberg, & Kaschak, 2004) or, at least, have employed tasks that required lexical access (e.g., Zwaan & Yaxley, 2003). Likewise, the study by Estes, Verges, and Barsalou (2008) cannot be directly interpreted as evidence for an automatic activation of location information. Participants were presented with cue words referring to entities associated with an up or down location (e.g., hat = up, boot = down) and were subsequently asked to identify a visual target that appeared in the upper or lower part of the visual screen. Participants responded to this visual task approximately 800 ms after the onset of the cue words. Responses in the visual task were affected by the meaning of the cue words, but responses were longer in the compatible than in the incompatible condition. Thus, this study showed interference, not facilitation. Together with the fact that responses occurred rather late after the onset of the relevant stimulus, this seems to suggest that the effect does not reflect an automatic activation of spatial features (cf. Hommel, Musseler, Aschersleben, & Prinz, 2001). Thus, it remains uncertain whether location information is automatically activated.

Indeed, recently, some researchers have started to question the view that word processing automatically results in a reactivation of memory traces and thereby activates information stemming from experiences with the words' referents. One reason for this skepticism is the extreme task and context dependency of experiential effects during linguistic processing. For instance, van Dam, Rueschemeyer, Lindemann, and Bekkering (2010) found that words denoting objects for which the functional use is associated with a movement (e.g., telephone, hammer) facilitated compatible responses, but only when the words were presented in a context emphasizing the action feature (e.g., conversation–telephone vs. plug–telephone). Additionally, several studies have shown that lexical processing of action words is highly context and task dependent. For example, Costantini, Ambrosini, Scorolli, and Borghi (2011) showed that responses are faster if action words follow a picture of an object on which the action could be performed and if that object is presented within reachable distance (e.g., to plug up preceded by a picture of a bottle within reachable distance). The strong task dependency of those affordance effects was also observed in a study by Bub, Masson, and Cree (2008) that focused on gestural knowledge, that is, knowledge about how one typically interacts with a particular object. Participants were presented with words denoting objects, and their task was to respond to the words with hand postural gestures. Response times (RTs) were shorter in compatible conditions, but only when participants were required to read the words in a lexical decision experiment. No compatibility effect emerged when participants responded to the color the words were written in. Whether these results generalize to the activation of location information is unclear. The response setting adopted in Bub et al. required complex gestural responses to be prepared. This may have limited the influence of automatically activated knowledge.

Taken together, the literature provides evidence that experiential traces are activated during word processing and do affect subsequent sensory–motor processing. However, whether those traces are activated automatically in a bottom-up manner during word processing or, rather, become available as a result of more strategic simulation processes is still unclear. In the present series of experiments, therefore, we investigated whether experiential traces (specifically, location information) are activated automatically during word processing. We tested the hypothesis that presenting a word will activate experiential traces stemming from perceiving the entity or interacting with the entity in the past (Zwaan & Madden, 2005). These experiential traces comprise attributes of the referents themselves, as well as attributes of the situations and actions they were involved in. Thus, words whose referents are associated with an up or down location should reactivate a set of experiential traces sharing the corresponding spatial attribute and, as a result, should affect processes in perception and action that also involve this attribute. In particular, responding to the words should be facilitated if the required response is compatible with the activated location information (e.g., an up response for a word such as roof) and hindered if it is incompatible (e.g., an up response for a word such as root). This interaction effect should occur independently of whether or not word reading is required by the experimental task. As a starting point, we conducted an experiment with a task that required word reading.

Experiment 1

Participants performed a lexical decision task with words denoting objects that are associated with an up or down location (e.g., roof vs. root, respectively). Correctly responding to the words required either an upward or a downward movement. If reading an object noun activates location information, a compatibility effect should be observed.

Method

Participants

Thirty-six right-handed German native speakers participated (6 of them male; M age = 27.4 years, SD = 8.8). Two participants were excluded because of a low accuracy rate in at least one condition (<90%).

Materials and apparatus

Seventy-eight German nouns and 78 pseudowords were presented in black, centered on a white background. Nouns were controlled for frequency with the "Wortschatz Portal" of the University of Leipzig (http://wortschatz.uni-leipzig.de), for length, and for the typical vertical location of their referent. A group of 49 volunteers who did not participate in the actual experiment rated 104 nouns with respect to the referents' typical location, using a 5-point Likert scale ranging from down to up. Word length and frequency were matched across the two categories of vertical position (down vs. up), resulting in 39 up words (letters: M = 6.07, SD = 1.78) and 39 down words (letters: M = 6.07, SD = 1.78). Up and down words did not differ significantly with regard to their frequency, t(76) = 0.37, p = .71, or length, t(76) = 0.00, p = 1, but did differ significantly for the rated position (M up = 4.70, SD = 0.28; M down = 1.54, SD = 0.37), t(76) = 42.27, p < .001. The pseudowords were rated as neutral (M = 2.91, SD = 0.89) and were similar in length to the words.
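The matching statistics above can be approximated from the reported means and standard deviations alone. The following is a minimal sketch, assuming a standard pooled-variance independent-samples t test; the function name `pooled_t` is hypothetical, and because the inputs are rounded summary values, only approximate agreement with the reported statistics should be expected.

```python
import math


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Independent-samples t statistic (pooled variance) from summary statistics.

    Assumption: a classical Student t test; the source does not state
    which variant was used.
    """
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Rated vertical position of the 39 up words and 39 down words (values from the text).
t_position, df = pooled_t(4.70, 0.28, 39, 1.54, 0.37, 39)

# Word length: identical means and SDs in both word sets, so t is zero.
t_length, _ = pooled_t(6.07, 1.78, 39, 6.07, 1.78, 39)

print(f"position: t({df}) = {t_position:.2f}; length: t({df}) = {t_length:.2f}")
```

With the rounded values from the text, the position statistic comes out near 42.5 on 76 degrees of freedom, close to the reported 42.27; the small gap is what one would expect from rounding in the summary values.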

Responses were recorded using a PS/2 computer keyboard adapted with a locally constructed overlay (Fig. 1a).

Fig. 1

a Experimental setup. The keyboard is mounted on a vertical plane in front of the participant. b The locally constructed overlay for a German keyboard. Buttons 1–4 of the overlay are connected to the keys “tab,” “u,” “o,” and “end” below. For a response with movement, the responding hand releases its middle button (2 or 3), presses the button above or below (1 or 4), and then returns to the released middle button, while the other hand rests on its own middle button. c A stationary response without movement. The hands stay rested on the respective buttons

Procedure and design

Participants were presented with a list of words and pseudowords and performed a lexical decision task. For half of the participants, the response mapping was “yes is up” for the first half of the experiment and “yes is down” for the second half. The remaining participants had the reverse order. When the trial started, participants simultaneously held down the two middle keys with their left and right hands (see Fig. 1b). After a centered fixation cross (800 ms), the stimulus appeared and stayed until response. RTs were measured as the time to release a middle button after stimulus onset. Each stimulus was presented 4 times, resulting in a total of 624 experimental trials, subdivided into eight blocks, separated by a self-paced break with error information.

The design was a 2 (referent location) × 2 (response direction) design with repeated measurement on both variables in the by-participants analysis (F 1) and repeated measurement on response direction in the by-items analysis (F 2).
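In a 2 × 2 design of this kind, the interaction contrast is equivalent to comparing compatible trials (referent location matches response direction) with incompatible trials. The sketch below illustrates that coding; the field names (`referent`, `response`, `rt`) and the RT values are made up for illustration and are not taken from the study.

```python
from statistics import mean


def is_compatible(referent_location, response_direction):
    """A trial is compatible when the referent's location matches the response."""
    return referent_location == response_direction


def compatibility_effect(trials):
    """Mean incompatible RT minus mean compatible RT (positive = facilitation)."""
    compatible = [t["rt"] for t in trials
                  if is_compatible(t["referent"], t["response"])]
    incompatible = [t["rt"] for t in trials
                    if not is_compatible(t["referent"], t["response"])]
    return mean(incompatible) - mean(compatible)


# Hypothetical cell means in milliseconds, one per design cell (illustration only).
example_trials = [
    {"referent": "up", "response": "up", "rt": 600},
    {"referent": "up", "response": "down", "rt": 630},
    {"referent": "down", "response": "down", "rt": 620},
    {"referent": "down", "response": "up", "rt": 610},
]

effect = compatibility_effect(example_trials)
print(f"compatibility effect: {effect} ms")
```

Collapsing the four cells into two in this way leaves the main effects of referent location and response direction out of the contrast, which is why the compatibility effect is reported as the interaction term.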

Results and discussion

Responses to pseudowords, responses faster than 100 ms, and errors were excluded from further analyses. Responses deviating by more than 2 SDs from the mean for that participant in that condition were excluded. This reduced the data set by less than 5%. Mean RTs are displayed in Fig. 2a.
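The exclusion steps described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis script: the trial fields (`participant`, `condition`, `rt`, `correct`, `is_word`) are hypothetical, the sample standard deviation is used, and the 2-SD criterion is applied after the other exclusions; the source does not specify these details.

```python
from collections import defaultdict
from statistics import mean, stdev


def trim_rts(trials, min_rt=100, n_sd=2.0):
    """Apply the exclusion rules to a list of trial dicts and return the kept trials."""
    # Step 1: drop pseudoword trials, errors, and responses faster than min_rt.
    kept = [t for t in trials
            if t["is_word"] and t["correct"] and t["rt"] >= min_rt]

    # Step 2: group the remaining trials by participant and condition.
    cells = defaultdict(list)
    for t in kept:
        cells[(t["participant"], t["condition"])].append(t)

    # Step 3: within each cell, drop RTs more than n_sd SDs from the cell mean.
    result = []
    for cell in cells.values():
        rts = [t["rt"] for t in cell]
        if len(rts) < 2:  # an SD cannot be computed from a single trial
            result.extend(cell)
            continue
        cell_mean, cell_sd = mean(rts), stdev(rts)
        result.extend(t for t in cell
                      if abs(t["rt"] - cell_mean) <= n_sd * cell_sd)
    return result
```

The proportion of data removed by outlier trimming can then be computed as one minus the ratio of kept trials to the trials that survived the first step.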

Fig. 2

Results of lexical decision task (Experiment 1). a Mean response times (RTs, in milliseconds) of correct responses as a function of response direction and referent location. Error bars represent the 95% confidence interval for within-subjects designs (Masson & Loftus, 2003). b Mean RTs and movement times (MTs) of compatible and incompatible conditions according to decile (1st to 10th) of the RT distribution

Responses were significantly faster for up responses (614 ms) than for down responses (636 ms), F 1(1, 33) = 7.26, p < .05; F 2(1, 75) = 98.97, p < .001, which probably reflects the fact that up responses were performed with the dominant right hand. There was no effect of referent location (both Fs < 1). Importantly, there was a significant interaction of referent location and response direction, with responses being significantly faster on compatible trials (617 ms) than on incompatible trials (626 ms), F 1(1, 33) = 8.32, p < .01; F 2(1, 76) = 11.03, p < .01. Some of the words employed in the present experiment were compounds with the morpheme “hoch” (high) or “Höhe” (height) and “unter” (under).

In order to rule out the possibility that our effect is driven mainly by these words, we conducted post hoc analyses in which we omitted these words. The compatibility effect was still significant, F 1(1, 33) = 10.85, p < .005; F 2(1, 64) = 12.35, p < .001. To rule out an explanation attributing the compatibility effect solely to an association between referent location and responses with the right versus left hand, we conducted an additional experiment. Thirty-six participants performed a lexical decision task in which responses (yes-is-up vs. yes-is-down) were given on a vertically mounted three-button keyboard with the right hand only. Nevertheless, the compatibility effect was significant, F 1(1, 34) = 16.70, p < .001; F 2(1, 56) = 96.24, p < .0001.

In order to analyze the temporal characteristics of the observed compatibility effect, we performed additional analyses. First, RTs in the compatible and incompatible conditions were grouped into deciles separately for each participant (see Fig. 2b). An ANOVA with the factor decile confirmed the compatibility effect, F(1, 20) = 7.38, p < .05, and showed no compatibility × decile interaction, F(9, 180) = 1.54, p = .13. Second, we conducted an ANOVA with movement times (MTs) as the dependent variable, which did not show any effects (all Fs < 1). Thus, in this experiment, compatibility affected processing in the time range of 500–800 ms, but only in information processing prior to response movement. The MT was not affected, not even for very short RTs, where responses were completed less than 800 ms after stimulus onset (see Fig. 2b, lower deciles).
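The decile analysis can be illustrated with a short sketch. It assumes that each participant's RTs in a condition are rank-ordered and split into ten equal-sized bins whose means are then compared across conditions; the function names are hypothetical and the exact binning rule used in the study is not given in the text.

```python
from statistics import mean


def decile_means(rts, n_bins=10):
    """Rank-order the RTs and return the mean RT of each of n_bins bins."""
    ordered = sorted(rts)
    n = len(ordered)
    if n < n_bins:
        raise ValueError("need at least one RT per bin")
    # Integer bin boundaries; every bin holds at least one RT when n >= n_bins.
    bounds = [(i * n) // n_bins for i in range(n_bins + 1)]
    return [mean(ordered[bounds[i]:bounds[i + 1]]) for i in range(n_bins)]


def effect_by_decile(compatible_rts, incompatible_rts):
    """Compatibility effect (incompatible minus compatible) for each decile."""
    compatible = decile_means(compatible_rts)
    incompatible = decile_means(incompatible_rts)
    return [i - c for c, i in zip(compatible, incompatible)]


# Made-up RTs for one participant (illustration only).
demo_compatible = list(range(400, 600, 10))
demo_incompatible = [rt + 10 for rt in demo_compatible]
print(effect_by_decile(demo_compatible, demo_incompatible))
```

Computing the bins per participant and per condition, as described in the text, keeps between-participant speed differences from being mistaken for a change of the effect across the RT distribution.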

In summary, responses were faster when the response direction matched the referent’s typical location, even though no contextual information was provided prior to word processing in this experiment. If the activation of location information occurs fully automatically, a compatibility effect should also be found when the task does not require lexical access to the words. This was tested in Experiment 2.

Experiment 2

Participants were presented with the same words as in Experiment 1, but they responded with an upward or downward movement based on the font color. If location information is activated automatically when a word is presented, this experiment should yield compatibility effects comparable to those found in Experiment 1.

Method

Participants

Twenty-four right-handed German native speakers participated (4 of them male; M age = 22.79 years, SD = 4.88). One participant was excluded because of low accuracy in at least one condition (<90%).

Materials

The words used in Experiment 1 were presented in one of four colors: blue (RGB 0, 0, 255), orange (RGB 255, 128, 0), lilac (RGB 150, 0, 255), and brown (RGB 140, 80, 20).

Procedure and design

Participants were instructed to respond to the color of the word as quickly and accurately as possible. The mapping of colors to response direction was balanced across participants: All possible color pairs occurred equally often, and each color was paired with each response direction equally often. Each noun was presented 8 times, resulting in a total of 624 experimental trials, subdivided into eight blocks.
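One way to realize such a balanced mapping is sketched below. It rests on an assumption that the text does not state explicitly: each participant responds upward to two of the four colors and downward to the other two. Under that reading there are six possible mappings, and rotating through them gives every color pair and every color-direction pairing equally often. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

COLORS = ("blue", "orange", "lilac", "brown")


def color_mappings(colors=COLORS):
    """Enumerate every split of the colors into an 'up' pair and a 'down' pair."""
    mappings = []
    for up_pair in combinations(colors, 2):
        down_pair = tuple(c for c in colors if c not in up_pair)
        mappings.append({"up": up_pair, "down": down_pair})
    return mappings


def assign_mappings(n_participants, mappings):
    """Rotate through the mappings so each is used equally often."""
    return [mappings[i % len(mappings)] for i in range(n_participants)]


mappings = color_mappings()
assignment = assign_mappings(24, mappings)

# Each color should be an 'up' color for half of the participants.
up_counts = Counter(color for m in assignment for color in m["up"])
print(len(mappings), dict(up_counts))
```

With 24 participants, each of the six mappings is used four times and each color signals an upward response for 12 participants. If participants instead saw only two colors each, the enumeration would differ, so this sketch should be read as one possible scheme only.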

Results and discussion

Data were analyzed as in Experiment 1. Outlier elimination reduced the data set by less than 4.5%. Mean RTs are displayed in Fig. 3a. There was a main effect neither of response direction nor of referent location (all Fs < 1), but there was a significant interaction between response direction and referent location. Responses were significantly faster on compatible than on incompatible trials, F 1(1, 22) = 6.96, p < .05; F 2(1, 76) = 13.57, p < .001. As in Experiment 1, the ANOVA with the variable decile confirmed the compatibility effect, F(1, 20) = 7.66, p < .05, but this time there was a significant interaction with decile, F(9, 180) = 13.3, p < .001, due to an increase of the compatibility effect with increasing RTs, with the compatibility effect being significant from the third decile onward (all ps < .05). Again, there were no effects in the MTs (Fs < 1). This suggests that the compatibility effect does require some time after stimulus onset to develop. The fact that we observed facilitation from about 500 ms on is in line with a study by Chersi, Thill, Ziemke, and Borghi (2010). There, interference was predicted in a time range of 160–500 ms, followed by facilitation from about 550 ms on. Our results, however, do not provide evidence for interference in early time ranges, but the null effects in the early deciles may, of course, indicate the beginning of a turnaround in lower processing times.

Fig. 3

Results of color response task (Experiment 2). a Mean response times (RTs, in milliseconds) of correct responses as a function of response direction and referent location. Error bars represent the 95% confidence interval for within-subjects designs (Masson & Loftus, 2003). b Mean RTs and movement times (MTs) of compatible and incompatible conditions according to decile (1st to 10th) of the RT distribution

The results of the present experiment suggest that a task requiring word reading is not a prerequisite for the activation of location information during word processing. These results strongly suggest that location information is automatically activated when a word denoting an object with a typical location is seen. Of course, in principle, it is also possible that participants cannot help but read the words, but once word meaning becomes available, they strategically activate location information. But why would they do so, if the task does not even require word reading and certainly does not require location information? One reason may be that, over the course of the experiment, they somehow notice the regularity in the material, namely, that the words refer to entities that are associated with an up or down location.

Experiment 3

To reduce the probability that participants would notice the regularity in terms of typical referent location, we augmented the stimulus materials by neutral filler words denoting objects without a typical location. In addition, participants completed a survey, subsequent to the experiment, in which they were asked about the regularities in the materials that they had noticed. This allows analyzing the data for a subgroup of participants who were naïve with respect to the relevant manipulations.

Method

Participants

Twenty-four right-handed German native speakers participated (3 of them male; M age = 25.25 years, SD = 3.85).

Materials

Thirty-nine additional words were included. Filler words referred to entities without a typical up or down location, as indicated by ratings in the range of 2.8–3.33 on the Likert scale (1 = down to 5 = up).

Procedure and design

The design and procedure were identical to those in Experiment 2, except that the 39 filler words were included, resulting in 936 trials overall. After the experiment, participants were asked four questions: (1) “What did you notice in this experiment?” (2) “Were there any regularities with respect to the content of the words?” (3) “Did you realize that the words referred to entities with typical up or down locations?” and (4) “Were there particular trials that seemed more difficult than others?”

Results and discussion

Data were analyzed as in Experiments 1 and 2. Outlier elimination reduced the data set by less than 5%. Mean RTs and distributions are displayed in Fig. 4. There was a significant main effect neither of referent location, F 1 < 1, nor of response direction, F 1(1, 22) = 1.03, p > .30; F 2(1, 76) = 3.06, p = .08. Importantly, there was an interaction between response direction and referent location, with responses being significantly faster in the compatible than in the incompatible condition, F 1(1, 22) = 65.10, p < .001; F 2(1, 76) = 5.93, p < .05. As in Experiment 2, an ANOVA with the additional factor decile produced a compatibility effect, F(1, 20) = 49.53, p < .001, and a significant interaction with decile, F(9, 180) = 5.87, p < .001, this time reflecting significant compatibility effects from the fourth decile onward (all ps < .05). MTs again showed no effects (both Fs < 1).

Fig. 4

Results of color response task with filler (Experiment 3). a Mean response times (RTs, in milliseconds) of correct responses as a function of response direction and referent location. Error bars represent the 95% confidence interval for within-subject designs (Masson & Loftus, 2003). b Mean RTs and movement times (MTs) of the compatible and incompatible conditions according to decile (1st to 10th) of the RT distribution

To obtain more information with regard to whether this effect depends on participants’ noticing the relevant manipulations, we conducted post hoc analyses. Participants were subdivided into two groups. Group 1 included all participants who had not at all noticed the relevant manipulations (“negative” answers to all four questions; n = 10). Group 2 included the remaining participants. For both groups, the interaction was significant in the by-participants analysis, although for Group 1 it was not significant in the by-items analysis [Group 1, F 1(1, 9) = 18.90, p < .01, and F 2(1, 76) = 1.55, p = .21; Group 2, F 1(1, 12) = 44.33, p < .001, and F 2(1, 76) = 5.20, p < .05]. A combined analysis with group as a factor did not yield a significant three-way interaction (both Fs < 1). Thus, the observed compatibility effect appears to be independent of whether participants notice that there are compatible and incompatible trials. This provides further evidence for the automaticity of the activation of the relevant knowledge during word processing.

Experiment 4

In Experiments 1–3, responses involved an upward or downward movement. Responses were faster when the response direction was compatible with the typical location of the referent entity. We interpreted this effect as suggesting that location information is automatically activated during word processing. If so, a compatibility effect should also be observed with a stationary up/down response that does not involve a movement. In the present experiment, participants kept their fingers stationary throughout the experiment.

Method

Participants

Twenty-four right-handed German native speakers participated (10 of them male; M age = 23.4 years, SD = 5.13). One participant was excluded due to low accuracy in at least one condition (<90%).

Materials, procedure, and design

The materials, procedure, and design were identical to those in Experiment 2, except that participants answered with stationary responses (see Fig. 1c).

Results and discussion

Data were analyzed as in the previous experiments. Outlier elimination reduced the data set by less than 5%. Mean RTs are presented in Fig. 5a.

Fig. 5

Results of color response task with stationary response (Experiment 4). a Mean response times (RTs, in milliseconds) of correct responses as a function of response location and referent location. Error bars represent the 95% confidence interval for within-subjects designs (Masson & Loftus, 2003). b Mean RTs of the compatible and incompatible conditions according to decile (1st to 10th) of the RT distribution

We found a main effect of response location, with faster responses for the upper key (right hand), F 1(1, 21) = 15.74, p < .001; F 2(1, 76) = 113.87, p < .001. There was no significant main effect of word location (both Fs < 1), but there was an interaction between response and word location, with responses being significantly faster in the compatible than in the incompatible condition, F 1(1, 21) = 9.89, p < .01; F 2(1, 76) = 14.15, p < .001. The ANOVA including the factor decile confirmed this effect, F(1, 19) = 6.85, p < .05, and showed a marginally significant interaction with decile, F(9, 171) = 1.87, p = .058.

Although participants did not perform an upward or downward movement, we again observed a compatibility effect. This suggests that the effect is driven by the compatibility between the referent’s location and the location of the key that is going to be pressed.

General discussion

In four experiments, participants were presented with words referring to entities that are associated with an up or down location. Across experiments, we manipulated whether the experimental task required word reading or not, as well as whether the response involved a movement or was stationary. In all the experiments, participants’ responses were significantly faster if the responses were compatible with the typical location of the words' referents. This strongly suggests that information concerning a referent’s typical location is automatically activated when participants process object nouns. The additional analyses of RT distributions, as well as those with MTs, indicate that the compatibility effect takes some time to develop: It consistently shows up from about 500 ms after stimulus onset and persists until the response movement has been initiated.

To our knowledge, this is the first demonstration of an automatic activation of location information during the processing of nouns in a situation where word reading is not required and no contextual information is provided. To clarify in what sense the activation of location information is automatic, we compare our results more closely with those of previous studies that examined similar questions.

First, Estes et al. (2008) found an interference effect when readers processed words such as hat (up) or boot (down) in a Perky-like task. Thus, in contrast to our results, compatible conditions led to particularly long RTs. A possible explanation is related to the difference in the experimental tasks employed. Estes et al. measured RTs in a visual task subsequent to the reading of the relevant cue words. Responses in this task occurred approximately 750–800 ms after the onset of the words. In contrast, our RTs were measured starting with word onset and ranged between 400 and 800 ms in absolute terms. Thus, it seems possible that Estes et al. measured processes that occur at a later stage in word processing, whereas we measured earlier processes. In principle, it is conceivable that location information at an earlier stage in processing (i.e., until about 800 ms after stimulus onset) leads to the (bottom-up) activation of an up or down feature, whereas at a later stage, which starts around 800 ms after stimulus onset, a (top-down) perceptual simulation of the named object in its typical location takes place. This simulation may, for instance, occupy visual-spatial working memory, thus leading to interference with a visual task that also employs this part of working memory (see Baddeley, 1997). Indeed, Estes et al. themselves interpreted their effect along these lines. Supporting this interpretation, interference in their study was diminished when the cue word was followed by a visual mask in both potential target locations, hindering participants from mentally simulating the respective object in its typical location.

Second, the study by Borghi et al. (2004) found compatibility effects when readers decided whether a word such as head (up) or foot (down) belonged to a particular object named in a context sentence (e.g., “There is a doll standing in front of you”). In contrast to our results, a compatibility effect was observed only when the required response involved an upward or downward movement, not with a stationary up versus down reaction. However, Borghi et al. employed words for which up or down location was not an absolute but, rather, a relative feature. For instance, feather is not associated with down in an absolute sense, but only when compared with crest in the context of a rooster standing in front of you. Thus, in such cases, the words themselves cannot activate an up or down location, and accordingly, stationary responses purely involving the up or down location did not show a compatibility effect. Furthermore, in contrast to our study, Borghi et al.'s participants had to perform a matching task that required deep semantic processing (they had to judge whether the word was part of the object described in the preceding sentence). Borghi et al. suggested that such a task might result in an internal pointing gesture to the upper or lower part of the previously described object; hence, responses involving a response movement were affected by compatibility.

Finally, there is the question of how our results relate to the findings mentioned in the introduction showing a strong context and task dependency of experiential effects during language processing. Initially, these results seem to speak against an automatic activation of information stemming from previous experience with the words’ referents. For example, in the study by van Dam et al. (2010), a word such as telephone, implying an action toward the body, facilitated responses only when presented in the context of a word that strengthened this aspect of the word’s meaning (e.g., conversation). However, since the study did not include a condition in which the target words were presented without a context, it remains unclear whether the words alone would trigger the activation of the corresponding information. Thus, the association between the words and the proposed movement may simply not have been strong enough to result in solid experiential effects in the absence of strengthening contexts. Similarly, in other studies investigating responses to single words, compatibility effects were limited to tasks where participants had to lexically access the words’ meanings (e.g., Bub et al., 2008) and can be modified by the reachability of the context objects (Costantini et al., 2011). However, in both cases, a complex experiential trace has to become active (e.g., reactivating a grasping gesture), which potentially involves the top-down integration of several features (e.g., effectors, location, etc.). This results in longer RTs, and thus bottom-up activations might be suppressed or overwritten by top-down processes (see Raposo, Moss, Stamatakis, & Tyler, 2009).

In the present study, we obtained compatibility effects even in rather unfavorable conditions—namely, when the experimental task did not require word reading. This may be somewhat surprising, considering that for many of the words employed in our study, we cannot even be sure that location information is part of word meaning; it may, rather, constitute an attribute of the situations in which the respective object is often encountered.

There is still the question of how far controlled processing can be reduced before this compatibility effect disappears. With words such as above, below, upward, and downward, for which location information constitutes an integral part of word meaning, Ansorge, Kiefer, Khalid, Grassl, and König (2010) found response activation even if the words were presented subliminally. It would be interesting to see whether a similar compatibility effect can be obtained with the nouns employed in the present study.