Folk psychology suggests that the thought of a goal precedes the decision to perform the action that leads to it. For example, the thought of an ice cream may inspire a trip to the ice cream van. The idea that action planning starts with imagining the consequences of the action has been captured in the ideomotor account of human intentional action, which was developed mainly in the nineteenth century (Carpenter, 1852; Herbart, 1825; James, 1890; Lotze, 1852; Stock & Stock, 2004) and is still endorsed by many researchers in the field of human cognition (Hommel, 2003; Hommel, Müsseler, Aschersleben, & Prinz, 2001). According to James (
1890), experiencing a contingency between an action and its outcome (or effect) produces an association between the motor programme for the action and a representation of the outcome, so that on future occasions activating the outcome representation directly excites the action programme. Although ideomotorists have mainly studied associations between movements and their immediate
sensory consequences, similar associative accounts have been developed in the field of animal learning to account for goal-directed action (Asratyan, 1974; Gormezano & Tait, 1976; Pavlov, 1932). We will refer to the ideomotor account as well as related associative accounts as outcome-response (O-R) theory, because the basic premise is that goal-directed actions are mediated by O→R associations. In the following section, we will focus on the experimental paradigms that have provided support for the prediction of O-R theory that actions can be triggered by experiencing the consequences of these actions directly or even by mere anticipation of the consequences.
O-R theory and the belief criterion
If instrumental learning establishes O→R associations, which can then be used to guide action selection, presenting the outcome should immediately prime the associated action. Addictive drugs appear to be particularly potent (O→R) primers of drug craving and drug seeking. For example, Chutuape, Mitchell, and de Wit (1994) found that consumption of an alcoholic drink enhanced not only the craving for alcohol but also the propensity to seek the drug, and response priming by such outcomes has received extensive empirical investigation in animals (for a review, see Stewart & de Wit,
1987). However, purely sensory outcomes are also capable of priming their associated responses. In the R-O pre-training stage of Meck’s experiment (
1985), rats were presented with two levers. While a common food reward could be obtained after pressing both levers, each lever was also paired with a specific sensory post-reinforcement cue, either a noise or a light signal. After substantial training on this schedule, the rats were transferred to a procedure in which the noise and the light preceded the opportunity to lever press. For half of the rats (congruent group), each stimulus signalled that the response that had previously produced this stimulus as its outcome cue during the R-O pre-training would be rewarded. For instance, when a tone was presented, the animals had to press the lever previously associated with the tone outcome, whereas when a light was presented, they had to press the lever associated with the light outcome. These contingencies were reversed for the remaining animals (incongruent group) in such a way that the animals had to learn to make the opposite response. Here, a tone signalled that the lever that was previously associated with the light outcome would lead to a reward and vice versa. If R-O training had established O→R connections between the outcome cues (i.e. the light and the tone) and their respective responses, the congruent group should acquire the discrimination more rapidly than the incongruent group. This prediction of O-R theory was confirmed.
More recently, similar results were obtained in humans. For example, Elsner and Hommel (2001) demonstrated outcome-mediated response priming. In the R-O pre-training stage, participants were instructed to press either a right or a left key as quickly as possible on appearance of a white rectangle. Each key press triggered either a high or a low tone; for example, right key presses ($R_R$) were followed by a high tone ($T_h$) and left key presses ($R_L$) by a low tone ($T_l$). According to O-R theory, this pre-training should have established $T_h \rightarrow R_R$ and $T_l \rightarrow R_L$ associations. In the test phase, trials started with the presentation of one of the tones and the participants were instructed to press either the left or the right key as quickly and as spontaneously as possible. Right key presses following the high tone and left key presses following the low tone were counted as congruent choices, whereas the opposite choices were counted as incongruent. The prediction of O-R theory that the participants should make more congruent than incongruent choices was again borne out by the results.
The response priming effect has been shown not only with tones as outcomes (Elsner & Hommel, 2001, 2004) but also with a variety of other sensory outcomes (Ansorge, 2002; Hommel, 1993, 1996; Ziessler, 1998; Ziessler & Nattkemper, 2001, 2002). Moreover, Beckers, De Houwer, and Eelen (2002) showed that the emotional valence of outcomes can prime responses through O→R associations. In their study, responses associated with shock outcomes were executed faster than neutral responses following negative target words, and relatively slowly following positive target words. Finally, Elsner and Hommel (2004) demonstrated that the two major parameters controlling animal instrumental learning, R-O contiguity and R-O contingency, also influence the priming effect, as would be predicted by associative theory.
In summary, there is a wealth of evidence for outcome-mediated response priming in adult humans. Moreover, O-R learning has been demonstrated early in infancy. If babies are given the opportunity to learn that moving their legs activates a mobile, then later presentations of the moving mobile will cause them to move their legs (Fagen & Rovee,
1976; Fagen, Rovee, & Kaplan,
1976), suggesting that the ability to form O→R associations develops early in life.
However, although response priming by outcomes provides evidence for the formation of O→R associations, this effect by itself does not support the contention of O-R theory that the mere thought of a goal can prime actions through these associations. In daily life, most goal-directed actions do not, of course, result from direct perception of the goal. For example, the sound of the bell of the ice cream van on a sunny afternoon makes us think of an ice cream, which in turn makes us think of the action that has yielded ice cream in the past, namely to approach the van and buy an ice cream. In associative terms, the bell activates the ice cream representation through a (Pavlovian) S→O association, which in turn activates the response representation of buying the ice cream through an (instrumental) O→R association.
Of course, in daily life, S-O and O-R learning as well as S-R learning occur concurrently in complex, dynamic interactions so that the contribution of each learning process to performance cannot be isolated. However, many years ago animal learning researchers developed the so-called Pavlovian-to-instrumental transfer (PIT) paradigm to demonstrate the separate contributions of S-O and O-R learning to the control of instrumental performance (Estes,
1943,
1948). With this task it has been shown that a purely Pavlovian stimulus can prime performance of an instrumental response that was separately paired with the same outcome as the Pavlovian stimulus (Baxter & Zamble,
1982; Blundell, Hall, & Killcross,
2001; Colwill & Motzkin,
1994; Colwill & Rescorla,
1988; Corbit, Janak, & Balleine,
2007; Holland,
2004; Kruse, Overmier, Konz, & Rokke,
1983; Rescorla,
1994). As the Pavlovian stimulus and instrumental response were never trained together, this effect has to be mediated by an intervening representation of the common outcome rather than being mediated by direct S→R associations. The PIT procedure is illustrated by an experiment of Colwill and Motzkin (
1994) in which rats first received Pavlovian S-O training during which, for example, a light signalled the delivery of food pellets and a noise predicted access to sucrose solution. In a separate instrumental training phase, the same rats were trained to lever press for food pellets and chain pull for sucrose solution in the absence of the light and the noise. Finally, in an extinction test the rats were given access to both response manipulanda. This test yielded outcome-specific PIT in that the stimuli caused the rats to make the response associated with the signalled outcome. During the noise the rats chain-pulled more than they pressed the lever, and vice versa during the light. These results suggest that presentation of the light (or the noise) triggered the representation of the food pellets (or the sucrose solution), which in turn led the rats to perform the corresponding response, thereby demonstrating the role of an S→O→R associative chain in the control of responding.
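The logic of the S→O→R chain in this design can be made concrete with a toy calculation. The Python sketch below is purely illustrative and is not a model used in any of the cited studies: the association strengths, the function name `pit_response_activation`, and the multiplicative chaining rule are all assumptions chosen for simplicity.

```python
# Toy S→O→R chain for the PIT design described above.
# All weights are hypothetical; 1.0 means "trained", 0.0 means "never paired".

# Pavlovian phase: stimulus -> outcome associations
S_O = {
    "light": {"pellets": 1.0, "sucrose": 0.0},
    "noise": {"pellets": 0.0, "sucrose": 1.0},
}

# Instrumental phase: outcome -> response associations
O_R = {
    "pellets": {"lever_press": 1.0, "chain_pull": 0.0},
    "sucrose": {"lever_press": 0.0, "chain_pull": 1.0},
}


def pit_response_activation(stimulus):
    """Sum, over outcomes, the activation a stimulus passes to each response."""
    activation = {}
    for outcome, s_o in S_O[stimulus].items():
        for response, o_r in O_R[outcome].items():
            activation[response] = activation.get(response, 0.0) + s_o * o_r
    return activation


noise_activation = pit_response_activation("noise")
light_activation = pit_response_activation("light")
```

Under these assumed weights, the noise favours chain pulling and the light favours lever pressing, even though no stimulus was ever paired directly with a response.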
It seems intuitively appealing that humans, like rats, select actions via S→O→R associative chains, and in fact PIT effects have been argued to play a crucial role in drug-seeking behaviour (Ludwig, Wikler, Stark, & Lexington, 1974). Moreover, the logic behind marketing is that advertisements remind consumers of the product in question, which in turn should prime the action of purchasing it. Over the last few years, the PIT and associated paradigms have been increasingly used to demonstrate the effect of this route to action selection in humans (Bray, Rangel, Shimojo, Balleine, & O’Doherty, 2008; Hogarth, Dickinson, Wright, Kouvaraki, & Duka, 2007; Talmi, Seymour, Dayan, & Dolan, 2008). For example, Hogarth et al. (2007) showed that, in the presence of a stimulus associated with cigarettes, addicted smokers were more likely to perform actions trained with cigarettes than those trained with money, with the reverse pattern of performance evident when the stimulus associated with money was presented. Similarly, Bray et al. (2008) reported that Pavlovian cues for natural rewards (chocolate milk, a soft drink and orange juice) preferentially enhanced performance of responses that had previously yielded these rewards during a separate instrumental training phase.
In summary, a prerequisite for goal-directed action is that the agent possesses knowledge of the instrumental R-O contingency. Evidence that animal and human behaviour is sensitive to both direct response priming by outcomes and indirect priming by cues for outcomes suggests that O-R theory provides a viable account of instrumental learning. In the next section, we will address the question whether this account can also capture the desire criterion of goal-directed action, namely, that actions are sensitive to the current motivational value of their outcomes.
O-R theory and the desire criterion
Ideomotor theory and associative O-R theory provide a viable account of the acquisition and deployment of instrumental knowledge in a way that fulfils the belief criterion of goal-directed action. The question remains, however, whether O-R accounts can explain behaviour that is sensitive to the current motivational value of an outcome—in accordance with the desire criterion—and can thus be considered a sufficient account of goal-directed actions in the strict sense. In fact, as noted earlier, most ideomotor research has focused on response priming by sensory events that did not serve as goals for the participants. Indeed, some authors have even interpreted ideomotor theory as strictly referring to associations between motor movements of the body and the perceivable motor effects (Kunde, Koch, & Hoffmann,
2004; but see Hommel,
2003). Accordingly, to date O-R theory does not provide a mechanism through which the motivational value of the outcome influences action planning.
The O-R approach could, however, easily be modified to account for such a sensitivity to affective value by incorporating the additional assumption that the extent to which the outcome mediates response activation through the O→R association depends on the current value of the outcome. In other words, ‘a fiat that the action’s consequences become actual intervenes’ when actions are selected via the ideomotor mechanism (James, 1890). The sensitivity of the outcome representation to input excitation, and hence its ability to activate the associated response, could be modulated by the relevance of the outcome to the agent’s current motivational state. For example, the ability of choices on a menu to excite strong or vivid thoughts of the actual meal may well depend upon how hungry one is, and this may determine to what extent the response of ordering an option is activated through the S→O→R associative chain. From a purely theoretical perspective, therefore, it is possible for O-R theory to provide an account of the goal-directed nature of instrumental actions. However, there is empirical evidence from animal research that speaks against such a possibility.
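The proposed modification amounts to a gating rule, which can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a formalisation found in James (1890) or in any cited account: the function name `gated_response_activation`, the multiplicative form of the gate, and the 0 to 1 value scale are all hypothetical.

```python
def gated_response_activation(s_o, o_r, outcome_value):
    """Activation reaching the response via S→O→R.

    s_o           -- assumed strength of the stimulus -> outcome association
    o_r           -- assumed strength of the outcome -> response association
    outcome_value -- assumed current motivational value of the outcome (0 to 1)
    """
    # The outcome representation passes on excitation in proportion
    # to how relevant the outcome is to the current motivational state.
    return s_o * outcome_value * o_r


# Same menu item and same learned associations, different hunger states.
hungry = gated_response_activation(s_o=1.0, o_r=1.0, outcome_value=1.0)
sated = gated_response_activation(s_o=1.0, o_r=1.0, outcome_value=0.2)
```

With identical associations, the response of ordering is activated less strongly when the outcome is currently of low value, which is the sensitivity that the desire criterion requires.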
A potential problem for O-R accounts of goal-directed action is that outcome-specific PIT appears to be insensitive to the current value of the outcome. Earlier we argued that PIT provides evidence for an O-R account because the separate establishment of S→O and O→R associations enables the Pavlovian stimulus to activate the response associated with the common outcome even though the stimulus and response have never been trained together. Instrumental responses can therefore be triggered through an S→O→R associative chain. The finding that speaks against O-R theory as a complete account of goal-directed action is that, at least in rats, the magnitude of PIT is unaffected by outcome devaluation. For example, following separate S-O and R-O training phases, Rescorla (
1994) conditioned an aversion to the food outcome before testing the ability of the stimulus to enhance the corresponding response. Importantly, he found that the ability of a stimulus to facilitate responding was unaffected by devaluation of the associated food (see also Holland,
2004). In other words, PIT occurs even when the anticipated outcome is not currently a goal for the animal, and therefore, although O→R associations allow outcomes to prime associated responses, a different mechanism may mediate the pursuit of goals.
Further evidence that O-R theory may not be a sufficient account of goal-directed behaviour comes from neural dissociations between PIT and outcome devaluation effects in rats. Lesions of the nucleus accumbens core abolish the immediate sensitivity of instrumental performance to devaluation of the outcome, but do not affect outcome-specific PIT (Corbit, Muir, & Balleine,
2001). Conversely, lesions of the nucleus accumbens shell abolish this form of PIT while not influencing performance in an outcome devaluation test (Corbit et al.,
2001; de Borchgrave, Rawlins, Dickinson, & Balleine,
2002). Furthermore, related research has shown that extensive amounts of training abolish the goal-directed nature of instrumental performance but leave outcome-specific PIT intact (Holland,
2004). In summary, several rodent PIT studies provide evidence that response priming via O→R associations and outcome devaluation are mediated by different mechanisms, suggesting that O-R theory may not provide a complete account of goal-directed behaviour. Note, however, that these studies do not demonstrate that O→R associations cannot mediate goal-directed behaviour. It certainly remains possible that the lack of sensitivity of response priming via O→R associations to the current incentive value of an outcome and to extensive training merely arises as an artefact of the PIT procedure. In any case, it appears wise to be cautious in accepting an O-R account of goal-directed behaviour. In the following section we will, therefore, discuss an alternative candidate theory of goal-directed action.