The two-visual-pathways model

Creating a conscious percept of the visual input and guiding our actions through the surrounding environment are the major tasks of the human visual system. In an influential model, these two functions have been mapped onto two cortical streams (Goodale & Milner, 1992; Milner & Goodale, 2006): A ventral stream to inferior temporal areas should provide the conscious percept (the perceptual system), while a dorsal stream to posterior parietal areas should program and control visually guided actions (the visuomotor system). Although this neuroanatomical model was initially derived from neuropsychological cases such as the famous visual agnostic patient D.F. (Milner et al., 1991), it has stimulated many behavioral studies with healthy participants. On the basis of this model, several dissociations of performance in ventral/perceptual and dorsal/visuomotor tasks have been proposed. For example, it has been suggested that only perceptual tasks, but not visuomotor tasks (such as grasping or pointing), are susceptible to visual illusions (Aglioti, DeSouza, & Goodale, 1995). Thus, finding an effect of visual illusions on a specific action type has been used to argue that this particular action is under control of the ventral perceptual system (e.g., Gonzalez, Ganel, Whitwell, Morrissey, & Goodale, 2008). To date, however, great controversy still centers on whether the dissociating effect of visual illusions really exists or is due to methodological and/or statistical artifacts (Franz, Gegenfurtner, Bülthoff, & Fahle, 2000; for reviews of this controversy, see Bruno & Franz, 2009; Carey, 2001; Franz & Gegenfurtner, 2008; Goodale, 2008; Schenk, Franz, & Bruno, 2011; Smeets & Brenner, 2006).

Analytic versus holistic processing

Ganel and Goodale (2003) argued that the visuomotor system should be able to efficiently ignore irrelevant visual input that is not needed for (and is possibly harmful to) the control of object-oriented actions. In other words, this system should be able to process visual input strictly analytically and to access individual features of objects (e.g., width, length, and height) separately. Consequently, variations of task-irrelevant (stimulus) dimensions should not deteriorate visuomotor performance. In contrast, the perceptual system, which is assumed to process its input holistically (i.e., to spontaneously combine such features to a composite object), should suffer interference from such variations. Differences between both systems regarding shape representation have also been reported on the neural level (e.g., Janssen, Srivastava, Ombelet, & Orban, 2008; Srivastava, Orban, De Mazière, & Janssen, 2009), suggesting that the visuomotor representation strictly matches the specific demands imposed by an object-oriented action. Ganel and Goodale tested this notion using a variant of Garner’s (1978) speeded classification task, in which participants were either to judge the width of a stimulus block (a ventral/perceptual task) or to grasp a stimulus block across its width (a dorsal/visuomotor task). In the baseline condition, only two stimulus blocks of the same (irrelevant) length were used; in the filtering condition, four stimulus blocks (that consequently varied on the length dimension) were used. As predicted, in the visuomotor task, response latencies were comparable in both conditions but were longer in the filtering condition of the perceptual task (i.e., Garner interference was found). Subsequently, this dissociation has been replicated in dual-task contexts (Janczyk & Kunde, 2010; Kunde, Landgraf, Paelecke, & Kiesel, 2007) and has been used to identify the task-underlying processing mode of various actions (Janczyk, Franz, & Kunde, 2010).

The comparison of different blocked conditions brings about obvious constraints, which in turn render the available empirical evidence inconclusive. First, and most importantly, the baseline and filtering conditions vary in the number of stimuli employed (two vs. four). Besides the differences regarding the variation of the irrelevant dimension, both conditions thus differ in the number of stimulus templates that must be held in memory for comparison with the actual visual input. Dyson and Quinlan (2010) showed that both aspects (the irrelevant variation and the number of objects employed) potentially contribute to longer response times in the filtering conditions. Although these authors used tasks different from ours (and from those of Ganel & Goodale, 2003), it is likely that one or both factors apply to the tasks in the present study as well. This reasoning is corroborated by the observation that three studies using the Psychological Refractory Period paradigm found that Garner interference combined additively with the dual-task factor Stimulus Onset Asynchrony (Janczyk et al., 2010; Janczyk & Kunde, 2010; Kunde et al., 2007), pointing to the involvement of central (instead of precentral perceptual) mechanisms (Pashler, 1994). It thus might be that the increased memory workload in the filtering condition relative to the baseline condition is the true reason for the response time differences with the perceptual task. Note that the visuomotor system has been described as only relying on “real-time” information, without access to memory (e.g., Hu & Goodale, 2000). Therefore, the differences in the number of stimuli should not affect its functioning. Second, block-wise comparisons of filtering and baseline conditions make it difficult to investigate the development of Garner interference throughout an experiment. Moreover, block-wise manipulations invite strategic preparation, due to the predictability of the experimental situation. If such preparation applies to the baseline condition of the perceptual judgment but not of the visuomotor grasping task, the response time dissociation reported by Ganel and Goodale (2003) could readily be explained.

The present experiments

Despite these limitations that originate from the methodological technique of blocking conditions, we suggest that the holistic-versus-analytic distinction (Ganel & Goodale, 2003) is valid, and in fact allows for a fairly easy alternative dissociation in response times. The present research is meant as a methodological improvement that overcomes the problems of the Garner interference paradigm (see the preceding section), thereby providing new evidence for the analytic-versus-holistic distinction of visuomotor and perceptual processing. The rationale underlying our research is as follows.

Conceivably, if the perceptual system works holistically, segregating the relevant and the irrelevant dimensions should be more difficult, the more similar these dimensions are. Consider, for example, participants who are asked to judge the width of an object with an almost identical width and length (hence, almost a square). Such similarity of the relevant and irrelevant object dimensions should pose a problem for the perceptual system, which tends to process objects holistically, and thus does not distinguish task-relevant from task-irrelevant features. In contrast, such a manipulation should not pose a major problem (and possibly no problem at all, in terms of performance differences) for a strictly analytically operating visuomotor system. In Experiment 1, we made objects’ task-irrelevant lengths either more similar or more dissimilar to the relevant object width. This experiment included two “long” objects, with a large difference between task-irrelevant length and task-relevant width, and two “short” objects, with a relatively smaller difference between length and width. In fact, we used—for better comparability—the same four stimulus blocks that have been used in other studies, and we expected to find a strong influence of this manipulation for a perceptual judgment task (speeded judgment of the objects’ widths), presumably driven by the ventral stream, but no influence for a visuomotor task (grasping the objects across their widths), presumably driven by the dorsal stream. Experiment 2 was run as a control experiment to rule out an alternative account for the results obtained in Experiment 1.

Experiment 1

In Experiment 1, our participants were either to grasp a stimulus block across its width (visuomotor task) or to give a speeded judgment of the stimulus width (perceptual task). Both tasks were conceptually similar to those employed by Ganel and Goodale (2003), except that we presented all four stimuli during the experiment rather than distinguishing baseline and filtering conditions (as would have been necessary in order to classically assess Garner interference).

Materials and method

Participants

A group of 32 students (26 female, 6 male; age 21–33 years) from Dortmund University of Technology participated for course credit. All of the participants reported normal or corrected-to-normal vision.

Design, stimuli, and procedure

Each participant performed in one single session consisting of two parts. The written instructions focused on speed, while maintaining errors at a low rate. In one part of the experiment (the visuomotor grasping task), participants were to grasp wooden stimulus blocks across their width using a precision grip (the grasping movement would cover a distance of approximately 50 cm). Participants were instructed to lift the grasped stimulus and to hand it to the experimenter; a correct precision grip was demonstrated by the experimenter beforehand. In the other part of the experiment (the perceptual judgment task), they were to judge the width of the wooden stimulus blocks and to respond accordingly with a keypress using either the index or the middle finger of the right hand. After this response, they were to grasp the stimulus and hand it to the experimenter at leisure. The participants were made familiar with the employed stimuli prior to the experiment, and thus were aware of the differing widths and lengths, and the required response in the perceptual judgment task was explained. In particular, the stimuli were four white-colored blocks, constructed according to a factorial combination of two lengths (63 vs. 75 mm) and two widths (30 vs. 35.7 mm), that had been used in earlier studies based on the Garner interference paradigm (Ganel & Goodale, 2003; Janczyk et al., 2010; Janczyk & Kunde, 2010; Kunde et al., 2007). With these stimuli, the task-relevant and task-irrelevant dimensions are more similar for the two short (63 mm) than for the two long (75 mm) stimulus blocks.
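The similarity manipulation follows directly from the reported stimulus dimensions. As a minimal sketch (the names are illustrative and not part of the original materials), the factorial stimulus set and the length-minus-width difference of each block can be computed as follows. Using the simple difference as a similarity index is an assumption made here for illustration; the text does not commit to a particular metric.

```python
from itertools import product

# Stimulus dimensions reported in the text (in mm).
LENGTHS_MM = (63.0, 75.0)   # task-irrelevant dimension
WIDTHS_MM = (30.0, 35.7)    # task-relevant dimension


def build_stimuli():
    """Return the four blocks of the 2 (length) x 2 (width) factorial set."""
    stimuli = []
    for length, width in product(LENGTHS_MM, WIDTHS_MM):
        stimuli.append({
            "length_mm": length,
            "width_mm": width,
            "category": "short" if length == min(LENGTHS_MM) else "long",
            # Assumed similarity index: a smaller difference means
            # that length and width are more similar.
            "length_minus_width_mm": round(length - width, 1),
        })
    return stimuli


STIMULI = build_stimuli()

if __name__ == "__main__":
    for stimulus in STIMULI:
        print(stimulus)
```

Under this index, both short blocks have a smaller length-minus-width difference than either long block, which is the sense in which the dimensions of the short blocks are more similar.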

The participants wore computer-controlled PLATO shutter glasses (Translucent Technologies, Toronto, ON) to control stimulus visibility. Response times (RTs) were measured from the glasses’ opening until the right-hand keypress (perceptual judgment task) or until the right index finger left a home button (grasping task). For grasping, movement times (MTs) were measured from leaving the home button until the stimulus was picked up (which released a small hidden microswitch). Errors in the perceptual task were registered automatically, and errors in the grasping task were recorded by the experimenter (errors included, e.g., not using a precision grip or dropping the stimulus block after grasping it).

Each participant performed in three experimental blocks of 80 trials each for both the grasping and the perceptual judgment task (each task was preceded by a short training block of 10 trials). The order of the two tasks and the mapping of stimulus width and the required keypress for the perceptual judgment task were counterbalanced across participants. Each trial began with a short warning click, and after 1,000 ms the shutter glasses opened to provide a view of the stimulus. After a trial, the experimenter provided feedback, the shutter glasses became opaque, and the experimenter prepared and started the next trial.

Results

Statistical analyses were done by means of a 2 × 2 ANOVA with repeated measures on stimulus length (short vs. long) and task type (grasping vs. perceptual judgment). For the RT/MT analyses, only correct trials were considered. RTs below 150 ms or exceeding an individual’s mean by more than 2.5 standard deviations (calculated separately for each participant and design cell) were excluded as outliers (2.9 % of trials).
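The exclusion rule can be illustrated with a short sketch. The trial format (dictionaries with the keys participant, cell, rt_ms, and correct) is hypothetical, and the sketch assumes that the mean and standard deviation of a cell are computed once over all of its correct trials before any exclusion; the text does not specify this detail.

```python
from collections import defaultdict
from statistics import mean, stdev

MIN_RT_MS = 150.0   # lower cutoff reported in the text
SD_CUTOFF = 2.5     # upper cutoff in standard deviations, as reported


def trim_outliers(trials):
    """Return the correct trials that pass both RT criteria.

    `trials` is a list of dicts with the hypothetical keys
    'participant', 'cell', 'rt_ms', and 'correct'.
    """
    # Group the correct trials by participant and design cell.
    groups = defaultdict(list)
    for trial in trials:
        if trial["correct"]:
            groups[(trial["participant"], trial["cell"])].append(trial)

    kept = []
    for cell_trials in groups.values():
        rts = [t["rt_ms"] for t in cell_trials]
        # Assumption: mean and SD are based on all correct trials of the cell.
        cell_mean = mean(rts)
        cell_sd = stdev(rts) if len(rts) > 1 else 0.0
        upper = cell_mean + SD_CUTOFF * cell_sd
        kept.extend(t for t in cell_trials if MIN_RT_MS <= t["rt_ms"] <= upper)
    return kept
```

Because the criterion is applied per participant and design cell, a slow participant's typical RTs are not flagged merely for being slower than the group average.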

The results of the RT analyses are illustrated in Fig. 1. RTs were longer in the perceptual judgment task than in the grasping task, F(1, 31) = 293.80, p < .001, ηp² = .91, and were longer for the short than for the long stimulus blocks, F(1, 31) = 8.15, p = .008, ηp² = .21. Most importantly, however, the interaction was significant, as well: F(1, 31) = 10.89, p = .002, ηp² = .26. The difference between the short and long stimulus blocks was only significant in the perceptual judgment task, t(31) = 3.18, p = .003, whereas no such difference was observed in the grasping task, t(31) = 0.81, p = .425. The mean MTs were 596 ms for both the short and long stimulus blocks, t(31) = 0.22, p = .829, and the mean bivariate correlations of RTs and MTs were r = –.01, |t|(31) = 0.32, p = .750, and r = –.03, |t|(31) = 1.09, p = .283, for the short and long stimuli, respectively.

Fig. 1. Mean response times (RTs) and movement times (MTs) in milliseconds as a function of task type (grasping vs. perceptual judgment) and length of the stimulus blocks (short [63 mm] vs. long [75 mm]). Error bars represent 95 % within-subjects confidence intervals. n.s., nonsignificant. ** p < .01

The mean error percentages were small (2.1 vs. 0.7 for the short and long stimulus blocks, respectively) for the grasping task and were higher for the perceptual judgment task (7.3 vs. 6.8, respectively), F(1, 31) = 57.91, p < .001, ηp² = .65. Overall, the percentages were slightly higher for the short than for the long stimulus blocks, F(1, 31) = 6.13, p = .019, ηp² = .17, but the interaction was nonsignificant, F(1, 31) = 0.74, p = .397, ηp² = .02.
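For readers who wish to relate the effect sizes to the test statistics, partial eta squared can be recovered from an F ratio and its degrees of freedom through the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The following sketch applies this identity to the F ratios of the error analysis. It illustrates the identity only and is not a description of how the reported values were computed; values recovered from rounded F ratios can differ from reported ones in the last digit.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F ratios of the error analysis as reported in the text, all with df = (1, 31).
ERROR_ANALYSIS_F = {
    "task type": 57.91,
    "stimulus length": 6.13,
    "interaction": 0.74,
}

if __name__ == "__main__":
    for effect, f_value in ERROR_ANALYSIS_F.items():
        print(effect, round(partial_eta_squared(f_value, 1, 31), 2))
```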

Discussion

In general, the results from Experiment 1 support the predictions put forward in the introduction. While similarity of the task-relevant and task-irrelevant dimensions did not affect RTs in the visuomotor task, perceptual judgments were slower when this similarity was increased. Before addressing a potential alternative explanation, we need to comment on two other aspects of the data. First, the observation that MTs were unaffected by the similarity of the objects suggests that no adjustments were made in flight (which would probably not be reflected in RTs). Furthermore, it is conceivable that the more adjustment occurs during movement execution, the less planning would happen in the RT interval. This trade-off would thus give rise to a negative correlation of RTs and MTs, yet we observed no such correlation. These considerations suggest that no adjustments were made in flight, although we of course cannot exclude such an account entirely. Second, for error percentages we found a main effect of similarity, while the critical interaction with task type was not significant. On the one hand, it is fair to acknowledge that grasping was indeed somehow affected by similarity. On the other hand, we suspect that the higher error percentage for the stimuli with higher-dimension similarity was due to the fact that these stimuli also had less grip surface. Since trials were declared erroneous when the participant, for instance, lost the stimulus after grasping, the reduced grip surface likely contributes to the higher error percentages for the short stimuli in the grasping task.

The biggest caveat to our interpretation, however, is the fact that the stimuli causing the faster perceptual judgments had, on average, a larger surface area as well. Larger stimuli can be construed as being more intense, which under appropriate conditions would facilitate responding (Kohlfeld, 1971), although it remains unclear why grasping RTs would then be unaffected. Nevertheless, in Experiment 2 we tested whether any differences in the mere detection of stimuli were linked to surface area, as such an account would suggest.
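The size confound can be made concrete with the reported dimensions. The sketch below assumes that "surface area" refers to the visible top face of a block (length × width); the height of the blocks is not reported, so other faces are ignored. The labels and function names are illustrative.

```python
# Top-face dimensions (length, width) in mm, as reported for Experiment 1.
BLOCKS_MM = {
    "short/narrow": (63.0, 30.0),
    "short/wide": (63.0, 35.7),
    "long/narrow": (75.0, 30.0),
    "long/wide": (75.0, 35.7),
}


def top_face_area(length_mm, width_mm):
    """Area of the top face in square millimeters (assumed definition)."""
    return length_mm * width_mm


def mean_area(category):
    """Mean top-face area of the blocks whose label starts with `category`."""
    areas = [
        top_face_area(length, width)
        for label, (length, width) in BLOCKS_MM.items()
        if label.startswith(category)
    ]
    return sum(areas) / len(areas)


if __name__ == "__main__":
    for category in ("short", "long"):
        print(category, round(mean_area(category), 2), "mm^2")
```

Under this assumption, the long blocks have the larger mean top-face area, which is the confound that Experiment 2 addresses.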

Experiment 2

To explicitly address the viability of the size account described above, we ran Experiment 2. Here, we tested whether any differences in detecting our stimuli would emerge as a function of their surface area. To this end, participants were asked to give a speeded response based on whether a stimulus was or was not present. To the extent that surface area matters, RTs for positive (“present”) responses should differ for the four stimuli we employed. Finding no difference, however, would support the logic laid out in the introduction as the reason for the results of Experiment 1.

Materials and method

Participants

A group of 20 persons from the Würzburg community (17 female, 3 male; age 20–45 years) participated for monetary compensation. All of the participants reported normal or corrected-to-normal vision.

Design, stimuli, and procedure

The task was similar to the perceptual judgment task of Experiment 1. However, the participants’ responses were based on whether a stimulus block was present or absent. Each participant performed in three experimental blocks of 80 trials each (preceded by a short training block of 20 randomly drawn trials). Half of these trials required an “absent” (or negative) response, and the other half required a “present” (or positive) response. Each of the four stimuli occurred equally often per block (10 times), and trials were presented in a random order. The mapping of the required responses to the two response keys was counterbalanced across participants. The glasses opened 500 ms after a short warning click to initiate a trial. After a participant’s response, vocal feedback in the case of an error was provided by the experimenter, the glasses became opaque, and the experimenter prepared and started the next trial.
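The composition of an experimental block can be sketched as follows. The stimulus labels, the function name, and the use of an unconstrained shuffle are assumptions for illustration; the text states only the trial counts and that the order was random.

```python
import random

STIMULI = ("Stimulus 1", "Stimulus 2", "Stimulus 3", "Stimulus 4")
REPETITIONS_PER_STIMULUS = 10   # per block, as reported
TRIALS_PER_BLOCK = 80           # as reported


def make_block(rng):
    """Return one randomly ordered block; None marks a stimulus-absent trial."""
    present = [s for s in STIMULI for _ in range(REPETITIONS_PER_STIMULUS)]
    # The remaining trials of the block are stimulus-absent trials.
    absent = [None] * (TRIALS_PER_BLOCK - len(present))
    trials = present + absent
    # Assumption: a simple shuffle without sequence constraints.
    rng.shuffle(trials)
    return trials


if __name__ == "__main__":
    block = make_block(random.Random())
    print(block[:8])
```

With four stimuli shown 10 times each, 40 of the 80 trials are "present" trials and the remaining 40 are "absent" trials, which matches the stated half-and-half split.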

Results and discussion

The analysis was done by means of an ANOVA with stimulus (no stimulus vs. Stimulus 1 vs. . . . vs. Stimulus 4; Stimuli 1 and 2 were short, and Stimuli 3 and 4 long) as a repeated measure (2.1 % of the trials were excluded as outliers according to the same criteria as in Exp. 1). The mean RTs were 400, 390, 393, 386, and 391 ms for the five conditions, and the corresponding effect was not significant, F(4, 76) = 1.30, p = .276, ηp² = .06. The mean error percentages were 1.8, 2.8, 1.7, 2.5, and 2.8, and the corresponding effect was also not significant, F(4, 76) = 0.77, p = .490, ηp² = .04.

According to these results, differences in object surface area do not yield different detection times. As such, they do not provide evidence for the size account of the results of Experiment 1, but they do speak in favor of the reasoning laid out in the introduction.

General discussion

Earlier behavioral studies related to the two-visual-systems hypothesis (Goodale & Milner, 1992; Milner & Goodale, 2006) reported performance dissociations that either have been disputed heavily (as is the case for visual illusions; Franz & Gegenfurtner, 2008; Goodale, 2008) or have exhibited methodological drawbacks linked to the nature of block-wise comparisons, such as in the Garner interference paradigm (e.g., Dyson & Quinlan, 2010; see also the Analytic versus holistic processing section of the introduction). In line with the idea of an analytical visuomotor system and a holistically working perceptual system (Ganel & Goodale, 2003), we here showed a novel and remarkably easy dissociation between the two systems: A perceptual judgment task became more difficult, the more similar the relevant and irrelevant dimensions were, whereas a visuomotor grasping task was mostly unaffected by this similarity (see the Results section of Exp. 1 for a possible qualification). Briefly, visual processing for perception has more trouble resisting similarity of the relevant and irrelevant stimulus features than does visuomotor processing for action control. Importantly, our experimental approach overcomes the problems associated with the block-wise comparison involved when measuring Garner interference (Ganel & Goodale, 2003).

To the extent that both of these tasks map onto the two proposed visual streams, our data can be interpreted as a novel behavioral dissociation supporting the distinctive purposes and functioning of the ventral and dorsal streams (Goodale & Milner, 1992; Milner & Goodale, 2006). Still, given the behavioral nature of this study, such an ascription to neural systems is hypothetical: Although processing in the two tasks is affected differently by the similarity factor, our data do not allow for a firm statement of whether or not both tasks are exclusively processed in different neural streams. For example, the data are equally consistent with Schenk’s (2010) integration account, which assumes that both ventral and dorsal areas contribute to visuomotor processing, providing some redundancy in the available information that is not available for nonvisuomotor tasks.

Both visual illusions and Garner interference have been used as arguments to ascribe specific actions to the visuomotor/dorsal or the perceptual/ventral system (Gonzalez et al., 2008; Janczyk et al., 2010), sometimes with opposing conclusions. The dissociation we have reported in this article can serve as an easy alternative indicator of different underlying processing modes. This indicator has quite practical advantages, such as the possibility of studying practice-related variations of interference throughout an experiment, which is barely possible with the block-wise manipulations inherent to the study of Garner interference. This will allow, for example, a new look at the currently debated issue of whether unskilled actions are controlled by the dorsal visuomotor (Janczyk et al., 2010) or by the ventral perceptual system (Gonzalez et al., 2008).