Many of our daily eye movements can be considered purposeful. By moving the eyes, we actively gather the information we need to pursue our goals. This task-dependent nature of eye movements was first demonstrated in scene perception studies (Buswell, 1935; Yarbus, 1967) and, later on, during natural action execution. In natural tasks, such as sandwich or tea making, as well as in ball games, saccades are often made to a location in a scene before an expected event has even occurred (Land & Tatler, 2009). For example, objects that are the target of a hand movement are often fixated before the hand arrives at the target (Land & Hayhoe, 2001; Land, Mennie, & Rusted, 1999). Likewise, cricket players fixate the bounce point of the ball just ahead of its impact (Land & McLeod, 2000).

Previous studies have examined how anticipation guides where to look next in the inanimate environment. Yet our environments also have a social part, and many actions, including eye movements, are directed at our fellow human beings (Argyle, 1988). For example, eye contact frequently solicits social responses, such as empathic facial expressions (e.g., Bavelas, Black, Lemery, & Mullett, 1986) or the eyebrow flash (Eibl-Eibesfeldt, 1972). In fact, gaze has a unique characteristic in social interactions: It can be an action that causes a change in the outer world, not merely a change in sensation, as in the inanimate environment. This purposeful, goal-directed aspect of eye movements, namely that they “cause” effects in the environment, has so far been addressed neither theoretically nor empirically.

An important theoretical perspective on the functional underpinnings of goal-directed actions has been the ideomotor approach (e.g., Elsner & Hommel, 2001; Kunde, 2001; Nattkemper, Ziessler, & Frensch, 2010; Prinz, 1997; Waszak & Herwig, 2007; Waszak et al., 2005), which holds that goal-directed actions are selected, initiated, and executed by activating anticipatory codes of a movement’s sensory effects. Goal-directed action, in turn, implies that knowledge about which movement leads to which sensory effect (action–effect associations) has been acquired. Thus, at its core, the ideomotor approach raises two questions: (1) How are action–effect associations acquired, and (2) how are these associations applied to select a goal-directed action?

Usually, the acquisition and application questions have been addressed with different experimental paradigms. Studies investigating acquisition have pursued a strategy proposed by Greenwald (1970). For example, Elsner and Hommel (2001) confronted their participants with novel, action-contingent events in an acquisition phase (e.g., left keypress → high tone, right keypress → low tone), with the expectation that participants would acquire bidirectional action–effect associations (i.e., ideomotor learning). To diagnose these associations, the same tones were presented in a second test phase as imperative stimuli in a reaction time task using the same keypress responses. If the same tone–key combination was used in the acquisition and test phases (e.g., if the high tone followed the left keypress in acquisition and had to be responded to with a left keypress in the test), response times were shorter than when the tone–key combination was changed. This result indicates that the perception (i.e., exogenous activation) of a stimulus similar to the learned sensory effect activates the action with which it is associated.

Studies addressing application have followed a different strategy. Here, effects are always presented after, instead of prior to, action execution to ensure that any observed influences on action execution are actually due to effect anticipation (i.e., endogenous activation). For example, Kunde (2001) demonstrated that spatial keypresses are initiated faster when they reliably produce spatially compatible visual effects, as compared with incompatible ones. Comparable compatibility effects between actions and their effects have also been reported for intensity and velocity (Kunde, 2003; Kunde, Koch, & Hoffmann, 2004). Moreover, in a recent study, Kunde, Lozo, and Neumann (2011) extended the action–effect compatibility paradigm to facial expressions and demonstrated that smiling and frowning can be generated more quickly if they are followed by the predictable visual presentation of the same expression, as compared with a different expression.

In the present experiments, we studied saccadic eye movements that contingently led to specific changes in facial expression. For the first time, the acquisition and the application of action–effect associations were tested within a single experimental paradigm. Specifically, we presented participants with two different neutral faces in an acquisition phase, one to the left and one to the right of fixation. Shortly after participants directed their gaze at one of the two faces, it changed into a happy or an angry expression, depending on its position. To diagnose whether participants acquired bidirectional associations between their actions (i.e., eye movements) and their actions’ effects (i.e., changes in facial expression), we presented happy and angry faces in a second test phase as imperative stimuli requiring a saccade to a novel left or right target. Moreover, to address the question of whether, and under which circumstances, action effects were used for action selection, we took a closer look at spatial saccadic parameters during the acquisition phase. We show that in this kind of situation, humans acquire bidirectional saccade–effect associations. Furthermore, participants anticipate the specific change in facial expression and direct their first saccade more often to the mouth region of a neutral face if it is about to change into a happy expression and to the eyebrows region if it is about to change into an angry expression. However, replicating and extending previous results (Herwig, Prinz, & Waszak, 2007; Herwig & Waszak, 2009), the acquisition and use of saccade–effect associations are restricted to situations in which participants freely choose between left and right saccades during acquisition (Experiment 1); they are absent if saccades are triggered by an external stimulus (Experiment 2).

Experiment 1

Experiment 1 addressed two questions: (1) whether saccades and their effects in the environment become associated and (2) whether these saccade–effect associations are actually used in the reverse direction to voluntarily select a saccade by anticipating its effect.

Method

Twenty participants, between 20 and 32 years of age, took part in Experiment 1. Eleven of the participants were female. All reported normal or corrected-to-normal vision, and all were naïve with respect to the aim of the study. Stimuli were presented on a 19-in. display monitor (100-Hz refresh rate, resolution of 1,024 × 768 pixels) at a distance of 71 cm. A video-based tower-mounted eyetracker (EyeLink 1000, SR Research, Ontario, Canada) with a sampling rate of 1000 Hz was used for recording eye movements. The participants’ head was stabilized by a chin and forehead rest, and the right eye was monitored in all participants. Saccade onsets were detected using a velocity criterion of 30°/s. Color photographs of four male and four female faces (NimStim face stimulus set; Tottenham et al., 2009) with neutral, angry, and happy expressions served as stimuli. Faces were presented in vertical ellipses (7.7° × 4.9° in the acquisition phase and 3.85° × 2.45° in the test phase) on a black background. Saccade latency was defined as the interval between the onset of the facial stimuli and the initiation of a saccadic eye movement.
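For illustration, a velocity criterion of this kind can be applied to 1000-Hz gaze samples roughly as follows. This is a minimal sketch, not the authors’ analysis code: the function names, the absence of any smoothing, and the assumption that gaze positions are already expressed in degrees are our simplifications.

```python
# Minimal sketch (not the authors' code) of velocity-based saccade onset
# detection and latency computation for 1000-Hz gaze samples in degrees.
import numpy as np

def saccade_onset_index(x_deg, y_deg, sampling_rate=1000.0, criterion=30.0):
    """Return the index of the first sample whose gaze velocity exceeds
    `criterion` (deg/s), or None if no sample does."""
    dt = 1.0 / sampling_rate
    # Sample-to-sample 2-D velocity in deg/s
    vel = np.hypot(np.diff(x_deg), np.diff(y_deg)) / dt
    above = np.flatnonzero(vel > criterion)
    return int(above[0]) if above.size else None

def saccade_latency_ms(x_deg, y_deg, stim_onset_index, sampling_rate=1000.0):
    """Latency = interval between stimulus onset and saccade initiation."""
    onset = saccade_onset_index(x_deg[stim_onset_index:],
                                y_deg[stim_onset_index:], sampling_rate)
    return None if onset is None else onset * 1000.0 / sampling_rate
```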

Experiment 1 comprised an acquisition and a test phase (see Fig. 1a). Each trial of the acquisition phase started (following a variable fixation interval of 1,000–1,500 ms) with the presentation of two neutral faces at 8° to the left and right of the screen’s center. Participants were instructed to saccade to the left or the right face at random, according to their own choice. Feedback regarding the number of saccades to the left and right was provided every 56 trials to ensure that each saccade–effect combination was experienced about equally often. Each saccade triggered a particular change of the facial expression (to happy or angry) of the fixated neutral face. This change occurred 100 ms after the saccade arrived at one of the faces and lasted 500 ms. Importantly, the facial expression depended on the saccade’s direction: For one half of the participants, a saccade to the left triggered a change from a neutral to an angry face, and a saccade to the right a change from a neutral to a happy face; for the other half, this saccade–effect mapping was reversed. Participants were not informed about the saccade–effect mapping. A posttest survey revealed that all the participants noticed the pairing of facial expression and location. The acquisition phase consisted of 224 trials.
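The acquisition contingency can be stated compactly in code. The sketch below is hypothetical (the constant and function names are our invention) and merely restates the mapping and timing described above:

```python
# Hypothetical sketch of the acquisition-phase contingency described above.
# Timing constants follow the text; names and structure are ours.

CHANGE_DELAY_MS = 100     # expression changes 100 ms after the saccade lands
EFFECT_DURATION_MS = 500  # changed expression remains visible for 500 ms

# One half of the participants: left -> angry, right -> happy;
# for the other half, the mapping was reversed.
MAPPINGS = {
    "group_A": {"left": "angry", "right": "happy"},
    "group_B": {"left": "happy", "right": "angry"},
}

def effect_expression(group: str, saccade_direction: str) -> str:
    """Expression into which the fixated neutral face changes."""
    return MAPPINGS[group][saccade_direction]
```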

Fig. 1 Experimental paradigm of (a) Experiment 1 and (b) Experiment 2

After completing the acquisition phase, participants received instructions about the required stimulus–response (S–R) mapping for the test phase. On each test trial, one of the eight happy or angry faces was presented at the center of the screen, flanked by two solid gray ellipses (the potential saccade targets; 3.85° × 2.45°) at an eccentricity of 8°. There were two subgroups of participants. In the acquisition-compatible subgroup, participants had to respond to the facial expression with a saccade in the direction that had triggered that expression in the acquisition phase. In the acquisition-incompatible subgroup, participants had to respond with a saccade in the direction that had triggered the other expression. The next trial started 1,000 ms after the saccade. Participants worked through 96 test trials.
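Expressed as code, the two S–R mappings simply invert the acquisition contingency. This is a hypothetical sketch (names are ours), shown for one example acquisition mapping:

```python
# Hypothetical sketch of the test-phase S-R mapping for one acquisition group
# (left -> angry, right -> happy); the other group's mapping is reversed.

ACQ_MAPPING = {"left": "angry", "right": "happy"}

def required_direction(subgroup: str, expression: str) -> str:
    """Saccade direction required in response to a happy or angry face."""
    caused_by = {v: k for k, v in ACQ_MAPPING.items()}  # expression -> direction
    direction = caused_by[expression]
    if subgroup == "incompatible":  # respond with the opposite direction
        direction = {"left": "right", "right": "left"}[direction]
    return direction
```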

Results and discussion

We will first report the results of the test phase before turning to the acquisition phase. As can be seen in Fig. 2a, mean saccadic latencies during the test phase were 70 ms shorter in the acquisition-compatible subgroup (398 ms) than in the acquisition-incompatible subgroup (468 ms), t(18) = 3.44, p < .01. The analysis of error rates did not reveal a significant difference between the compatible and incompatible conditions, t(18) = −1.48, p = .16. Hence, this result shows that the perception of a learned sensory effect (i.e., the specific facial expression) activated the action (i.e., the saccade) with which it was associated. This can be interpreted as evidence for the acquisition of bidirectional saccade–effect associations.
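For readers wishing to reproduce this kind of comparison, an independent-samples t test over one mean latency per participant (10 per subgroup, hence df = 18) would look as follows. The numbers below are simulated placeholders, not the study’s data:

```python
# Sketch of the between-subgroup latency comparison (simulated placeholder data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
compatible = rng.normal(398, 40, size=10)    # mean latencies (ms), one per participant
incompatible = rng.normal(468, 40, size=10)  # mean latencies (ms), one per participant

t, p = stats.ttest_ind(compatible, incompatible)
print(f"t(18) = {t:.2f}, p = {p:.3f}")
```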

Fig. 2 Results from Experiment 1. (a) Mean saccadic latencies in the test phase for the acquisition-compatible and -incompatible subgroups. (b) Mean vertical landing position of the first saccade toward the neutral face in the acquisition phase, as a function of block (56 trials each) and the forthcoming change of facial expression (happy vs. angry). Error bars represent standard errors. Asterisks denote statistically significant (p < .05) contrasts between the conditions

To address the question of whether these saccade–effect associations were actually used to voluntarily select a saccade, we took a closer look at saccadic parameters during the acquisition phase. Figure 2b depicts the mean vertical landing position on the neutral faces, separately for faces changing into a happy versus an angry expression and for the four blocks of the acquisition phase (56 trials each). Indeed, saccades hit the neutral faces at a lower position if these were about to change into a happy, rather than an angry, expression. A 2 (effect condition: happy vs. angry) × 4 (block: 1–4) repeated measures analysis of variance revealed a significant interaction of effect condition and block, F(3, 57) = 3.00, p < .05, η² = .13. Furthermore, paired t tests showed differences between effect conditions only for the last block of acquisition, t(19) = 2.22, p < .05, but not for blocks 1, 2, and 3, ts(19) = 0.89, 1.37, and 1.75, respectively. Moreover, we compared the saccades’ vertical landing position distributions for the different effect conditions to get a better impression of the specific parts of the neutral face where landing probabilities differed. As illustrated in Fig. 3, the differences were related to the mouth and eyebrows regions, with a higher probability of directing gaze to the mouth region of a neutral face if it was about to change into a happy expression and to the eyebrows region if it was about to change into an angry expression. On the basis of the findings shown in Fig. 3, we defined regions of interest for the mouth (−2.5° to −1°) and the eyebrows (0° to 1.5°). Paired t tests showed significant differences between the effect conditions during the last block of acquisition in the mouth region, t(19) = −1.90, p < .05, and marginally significant differences in the eyebrows region, t(19) = 1.67, p = .06 (one-tailed). This result is in line with a recent study showing the mouth region to be of outstanding importance for the recognition of happiness, whereas the recognition of disgust (anger was not investigated in that study) proved to rely on the eyes and eyebrows region, as well as the mouth region (Nusseck, Cunningham, Wallraven, & Bülthoff, 2008). To summarize, saccade landing positions were biased toward the facial region where the facial change was expected. This is consistent with the assumption that the saccade was selected on the basis of its anticipated effects.
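As a sketch of this region-of-interest analysis (not the authors’ code; the data structures and placeholder numbers are ours), each participant’s proportion of first-saccade landing positions inside an ROI can be computed and compared across effect conditions with paired t tests:

```python
# Sketch of the region-of-interest analysis of vertical landing positions.
# ROI bounds follow the text (degrees relative to the face center);
# the per-participant proportions below are simulated placeholders.
import numpy as np
from scipy import stats

MOUTH_ROI = (-2.5, -1.0)   # lower face region
EYEBROWS_ROI = (0.0, 1.5)  # upper face region

def roi_proportion(landing_y_deg, roi):
    """Proportion of landing positions falling inside the ROI."""
    lo, hi = roi
    y = np.asarray(landing_y_deg)
    return np.mean((y >= lo) & (y <= hi))

rng = np.random.default_rng(1)
p_mouth_happy = rng.uniform(0.2, 0.5, size=20)  # one value per participant
p_mouth_angry = rng.uniform(0.1, 0.4, size=20)

t, p = stats.ttest_rel(p_mouth_happy, p_mouth_angry)
print(f"mouth ROI: t(19) = {t:.2f}, p = {p:.3f}")
```

The 2 × 4 repeated measures ANOVA on mean landing positions could analogously be run on a long-format per-participant table, for instance with a package such as pingouin (rm_anova).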

Fig. 3 Differences in the saccades’ vertical landing distributions on the neutral face for the two effect conditions in the acquisition phase of Experiment 1, collapsed across all blocks

Mean saccadic latency in the acquisition phase was 220 ms and did not differ between saccades triggering different effects. Moreover, no significant difference was observed in the number of saccades directed to faces changing into a happy or an angry expression.

Experiment 2

Prior studies (Herwig et al., 2007; Herwig & Waszak, 2009) have shown that the acquisition of action–effect associations crucially depends on the mode of movement in which actions are performed. In these studies, action–effect associations were diagnosed in a forced choice test phase only if participants had freely chosen between left and right keypresses during acquisition (intention-based acquisition). In contrast, if the actions that participants performed were triggered by external stimulus events (stimulus-based acquisition), no indication of the acquisition of action–effect associations was observed. To explain this result, Herwig et al. suggested that actions are controlled by their anticipated sensory consequences only if the agent acts in the intention-based mode of movement. In contrast, when acting in the stimulus-based mode, participants pass control to the stimulus (prepared reflex; Hommel, 2000), and actions are selected with respect to their antecedents. However, these studies did not directly test whether the two modes of movement differ with respect to the actual use of action effects for action planning. Experiment 2 was conducted to fill this gap by repeating Experiment 1 with a stimulus-based acquisition phase.

Method

Twenty new participants, between 19 and 32 years of age, took part in Experiment 2. Experiment 2 replicated Experiment 1, with the one exception that during acquisition one of the two neutral faces was framed for 50 ms by a white ellipse (stroke width: 4 pixels) indicating the side to which participants should direct their gaze (Fig. 1b). Thus, the only crucial difference between the experiments concerned the mode of movement: intention-based saccades in Experiment 1 and stimulus-based saccades in Experiment 2.

Results and discussion

Data were analyzed as in Experiment 1. However, in contrast to Experiment 1, saccadic latencies in the test phase did not differ between the acquisition-compatible and acquisition-incompatible subgroups (Fig. 4a), t(18) = 0.05, p = .93. This observation is in line with recent studies investigating the acquisition of action–effect associations with keypress actions and tones (Herwig et al., 2007; Herwig & Waszak, 2009; but see Pfister, Kiesel, & Hoffmann, 2011, for different results with a free choice test phase). Importantly, there was also no indication of effect anticipation during the acquisition phase (Fig. 4b) [interaction of effect condition and block, F(3, 57) = 0.35, p = .72, Huynh–Feldt ε = .68, η² = .02]. Thus, Experiment 2 provided clear evidence that saccades are not controlled by their anticipated sensory consequences if the agent acts in the stimulus-based mode of movement.

Fig. 4 Results from Experiment 2. (a) Mean saccadic latencies in the test phase for the acquisition-compatible and -incompatible subgroups. (b) Mean vertical landing position of the first saccade toward the neutral face in the acquisition phase, as a function of block (56 trials each) and the forthcoming change of facial expression (happy vs. angry). Error bars represent standard errors

General discussion

Our results clearly show that humans easily acquire the link between their own eye movements to faces and the change in facial expression that is triggered by their gaze. Moreover, in the course of experiencing the saccade–effect relationship, changes in facial expression come to be anticipated and directly affect the first saccade’s destination on a neutral face. This study thus demonstrates for the first time action–effect associations in the oculomotor system. Furthermore, the acquisition and actual application of action–effect associations were addressed within a single experimental paradigm.

Replicating previous results obtained with keypress actions and auditory effects (Herwig et al., 2007; Herwig & Waszak, 2009), the acquisition of saccade–effect associations crucially depends on the mode of movement. The presentation of a facial expression in the test phase activated its associated saccade only if participants in the acquisition phase decided on their own where to look next (Experiment 1). In contrast, if this decision was specified by an external stimulus during acquisition (Experiment 2), no action–effect learning was observed. Extending previous results, the present study further shows that this dissociation holds for the application of anticipated effects during action planning. Thus, the results confirm Herwig and colleagues’ assumption that action control is governed by anticipated sensory consequences only if the agent acts in the intention-based mode of movement.

The similarity between the present saccade–effect associations and the findings of previous studies on keypress–tone associations (Elsner & Hommel, 2001; Herwig et al., 2007) raises the question of whether the same results would ensue with arbitrary saccade effects, such as changes in the color or shape of inanimate objects. Indeed, the social nature of the action effects presented here might be just one (efficient) way to investigate saccade–effect associations. However, there are also reasons to assume a special relationship between the oculomotor system and social responses. In contrast to actions with other body parts, which can directly manipulate inanimate objects, eye movements do not naturally change inanimate objects. Under normal circumstances, only living creatures are able to detect mutual gaze and, accordingly, to change their behavior. Past experience with this frequent saccade–effect combination might thus facilitate learning of the very same combination in the laboratory. However, clarifying these speculations is beyond the scope of the present study and should be the subject of future research.

The notion that the anticipated consequences of an eye movement are important to how we allocate gaze and how we perceive scenes plays no prominent role in saliency-based models of eye movement control (e.g., Itti & Baldi, 2005; Itti & Koch, 2001). However, the prediction notion has recently been addressed in considerations of eye movement control in scene perception (O’Regan & Noë, 2001), natural settings (Ballard & Hayhoe, 2009), shape perception (Renninger, Verghese, & Coughlan, 2007), and space perception (Wolff, 2004). By focusing on the effects of eye movements in the environment, our results show that the learning and control mechanisms involved in goal-directed hand actions in the inanimate environment also apply to intended eye movements in the animate, social world, in which our counterparts are responsive to our gaze behavior. This suggests that eye movements can be construed, within the ideomotor framework, as goal-directed actions.