Introduction

Holistic processing, the decoding of the stimulus as a unified whole, is a key feature of human perceptual organization (Farah, Wilson, Drain, & Tanaka, 1998; Hochstein & Ahissar, 2002; Oliva & Torralba, 2006). It has been suggested to be obligatory in nature, such that participants cannot ignore the global characteristics of a given stimulus, even when these characteristics are irrelevant to the task at hand (e.g., Garner & Felfoldy, 1970; Navon, 1977). Despite its central role in visual perception, holistic processing does not appear to affect the visuomotor control of actions. Such control is subserved by the dorsal visual stream and is suggested to be based on a fundamentally different, analytic representation of object shape (Ganel & Goodale, 2003, 2014; for reviews see Goodale, 2011, 2014).

Ganel and Goodale (2003) suggested that vision-for-action operates in an analytical fashion. In particular, they argued that visuomotor interactions with objects rely on the most immediate and absolute representation of the target object, without being influenced by a relative, holistic processing style. Such holistic processing, however, is required by the perceptual system in order to efficiently recognize objects across different viewpoints, sizes, and contexts (Goodale, 2011). To test this idea, they utilized the Garner speeded-classification task (Garner & Felfoldy, 1970), which provides a reliable measure of how efficiently people can process one dimension of an object while ignoring its other dimensions. Participants were asked to classify rectangular objects on the basis of a given dimension (e.g., width) while ignoring an irrelevant dimension of the same object (e.g., length) under two experimental conditions. In the baseline blocks, only the relevant dimension varied while the values of the irrelevant dimension were kept constant. In the filtering blocks, the values of both the irrelevant and the relevant dimensions varied between trials. Ganel and Goodale (2003) found longer reaction times (RTs) in the filtering blocks compared to the baseline blocks (i.e., a Garner interference effect) in tasks that involved perceptual estimations. On the other hand, when participants were asked to simply grasp the objects (vision-for-action), no Garner interference effect was found, implying that object shape was processed holistically for perception but analytically for grasping.

The finding of a Garner-induced holistic-analytic dissociation between action and perception has been replicated in several more recent studies in which RTs served as the main dependent variable (e.g., Janczyk & Kunde, 2012; Kunde, Landgraf, Paelecke, & Kiesel, 2007). Recently, Ganel and Goodale (2014) extended these findings and showed that for perceptual estimations Garner interference is not limited to RTs, but is also expressed in the within-subject variability of the response, which has been shown to reflect response precision (Ganel, Chajut, & Algom, 2008). In particular, the final apertures of perceptual estimations were more variable in the filtering compared to the baseline condition. On the other hand, for grasping, the precision of the Maximum Grip Apertures (MGAs) was unaffected by the object's irrelevant dimension, reinforcing the notion that visually guided actions rely on an analytical representation of object shape.

Additional evidence for the dissociation in the processing styles that subserve vision-for-action and vision-for-perception comes from studies demonstrating that grasping evades the effect of Weber’s law (Ganel et al., 2008; Ganel, Freud, & Meiran, 2014). According to this fundamental psychophysical law, the smallest detectable change along a given physical dimension increases linearly with the magnitude of that dimension (Baird & Noma, 1978). Ganel and colleagues used the within-subject variability as a measure of the Just Noticeable Difference (JND) of object size and showed that for perceptual estimation, JNDs were linearly scaled to size, in accordance with Weber’s law. On the other hand, during grasping, the JNDs computed for MGAs remained constant across object sizes, suggesting that object size was processed analytically, unaffected by Weber’s law (Ganel et al., 2008, 2014).

Importantly, the absence of Weber’s law and the absence of Garner interference (the marker of holistic processing in the Garner task) for visually guided actions reflect complementary yet different aspects of relative visual processing. In particular, whereas the violation of Weber’s law indicates analytic processing of a single object dimension, the lack of Garner interference indicates analytic processing of a single dimension relative to the other dimensions of the same object.

Recently, Holmes and Heath (2013) utilized Weber’s law to test the nature of the computations mediating grasping trajectories toward 2D object drawings. On the one hand, these 2D objects lack basic properties of real 3D objects which could be essential for the generation of absolute representations of object size and shape. These properties include coherent monocular and binocular depth cues and haptic feedback from the object at the end of each grasp. On the other hand, actions toward 2D objects are directed toward distinct and recognizable visual targets, and full visual feedback of the target object and of the grasping hand is available during grasping, very similar to how actions are performed toward real 3D objects. Moreover, such actions are becoming increasingly frequent in everyday life, as actions are more commonly directed toward 2D targets displayed on smartphones and tablets. It could therefore be expected that 2D grasping would be performed in a normal analytic fashion, just as actions are performed toward real objects. Interestingly, Holmes and Heath (2013) found that unlike grasping trajectories toward real 3D objects, which evaded the effect of Weber’s law, grasping movements directed toward 2D objects of the same dimensions were subject to Weber’s law, showing a linear increase in JND with object size. The adherence of actions toward 2D objects to Weber’s law suggests that actions toward such objects might be performed in a fundamentally different manner from normal grasping and might be subserved by different cognitive and neural mechanisms. In line with this idea, a recent fMRI adaptation study showed differential fMRI adaptation patterns for real objects compared to 2D images presented on a screen (Snow et al., 2011).

The current study was designed to further explore the nature of the representations mediating 2D grasping. To this purpose, we tested whether such actions are subserved by an analytic or by a holistic representation of object shape. We utilized an experimental design similar to the one used by Ganel and Goodale (2003), but now, instead of using real 3D objects, participants were asked to direct their grasping movements toward 2D rectangles presented on a computer screen.

Methods

Participants

Nineteen right-handed, healthy undergraduate students from Ben-Gurion University of the Negev (six males, mean age: 24.3 years) received course credit for their participation in the experiment. The results of one participant were removed from the analysis because she failed to follow the experimental instructions.

Apparatus and stimuli

Participants sat in front of a black tabletop on which a 19-in screen was placed horizontally at a viewing distance of about 50 cm from the participant’s head (Fig. 1a–b). Grip scaling was recorded by an Optotrak Certus device (Northern Digital, Waterloo, ON, Canada). The apparatus tracked the 3D position of three active infra-red light-emitting diodes attached separately to the participant’s index finger, thumb, and wrist. This allowed for complete movement freedom of the hand and fingers. A 200-Hz sampling rate was used for the Optotrak, which provides a 0.1-mm positional accuracy under the specified experimental conditions. The dimensions of the rectangular target objects were similar to those of the stimuli presented in the original study by Ganel and Goodale (2003); however, instead of real objects, the stimuli were simple white 2D rectangles that were presented on the computer monitor. Four stimuli were used in the experiment, created from a factorial combination of two different widths (35.7 and 30 mm) and two different lengths (75 and 63 mm) (Fig. 1c).

Fig. 1

An illustration of experimental design and stimuli. Participants were required to initially place their fingers at the starting point (a), and then, when vision was allowed, to grasp the target object by its width (b). Four stimuli were used in the experiment, created from a factorial combination of two different widths and two different lengths (c)

Procedure

Participants placed their index finger and thumb of their right hand on a start button (Fig. 1a), and were asked to reach out and make a grasping movement toward an object presented on the screen across its width as quickly as possible immediately after the stimulus was presented (Fig. 1b). They were then asked to keep their finger and thumb on the target intercept location for a few seconds and then place their fingers back on the starting point prior to the next trial.

In the baseline blocks, the relevant dimension (width) varied between trials while the irrelevant dimension (length) remained constant. In the filtering blocks, both the relevant and irrelevant dimensions randomly varied between trials, and all possible combinations were used. The order of the conditions (baseline/filtering) was counterbalanced across participants.

All blocks began with four practice trials, which were excluded from the analysis. In each experimental block, all four versions of the stimuli were presented eight times in a random fashion, resulting in a total of 32 presentations in each baseline block. Given that 64 trials were presented in the filtering condition, the filtering blocks were divided into two equal parts (as in Ganel & Goodale, 2014) of 32 stimuli each, to match the number of trials presented in the baseline blocks.
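The block structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the software used in the experiment: the function names, the `repetitions` parameter, and the seeded shuffling are hypothetical, and only the stimulus dimensions are taken from the text.

```python
import random
from itertools import product

# Stimulus dimensions reported in "Apparatus and stimuli" (mm).
WIDTHS_MM = (30.0, 35.7)   # relevant dimension (grasped)
LENGTHS_MM = (63.0, 75.0)  # irrelevant dimension

def block_stimuli(block_type, fixed_length=None):
    """Return the (width, length) pairs that may appear in a block.

    Filtering: both dimensions vary, so all four combinations are used.
    Baseline: only width varies; length is held at `fixed_length`.
    """
    if block_type == "filtering":
        return list(product(WIDTHS_MM, LENGTHS_MM))
    if block_type == "baseline":
        if fixed_length not in LENGTHS_MM:
            raise ValueError("baseline blocks need one of the two lengths")
        return [(width, fixed_length) for width in WIDTHS_MM]
    raise ValueError(f"unknown block type: {block_type!r}")

def trial_sequence(stimuli, repetitions, seed=None):
    """Repeat each stimulus `repetitions` times and shuffle the order.

    `repetitions` and `seed` are illustrative parameters (assumptions).
    """
    rng = random.Random(seed)
    trials = list(stimuli) * repetitions
    rng.shuffle(trials)
    return trials

if __name__ == "__main__":
    filtering = trial_sequence(block_stimuli("filtering"), repetitions=8, seed=1)
    print(len(filtering), "filtering trials, first three:", filtering[:3])
```

With four stimuli and eight repetitions, a filtering part contains 32 trials, matching the count given in the text.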

Data analysis

On each trial, we recorded the 3D trajectories of the fingers during grasp. Movement onset (RT) was determined at the point in time where the aperture between index finger and thumb increased by more than 0.1 mm for at least ten successive frames (50 ms). Movement offset was determined at the point in time where the aperture between index finger and thumb changed by less than 0.2 mm for at least 20 successive frames (100 ms).
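The onset and offset criteria can be illustrated with a short Python sketch. It assumes that the thresholds apply to the frame-to-frame change in aperture (one possible reading of the criteria above) and that the aperture has already been computed as the distance between the thumb and index-finger markers. All function names are hypothetical.

```python
SAMPLE_RATE_HZ = 200  # Optotrak sampling rate reported in the text

def first_run(flags, n_frames):
    """Index where the first run of `n_frames` consecutive True values starts."""
    run = 0
    for i, flag in enumerate(flags):
        run = run + 1 if flag else 0
        if run == n_frames:
            return i - n_frames + 1
    return None

def movement_onset(aperture, rise_mm=0.1, n_frames=10):
    """First frame from which aperture grows by > rise_mm on n_frames successive steps."""
    steps = [b - a for a, b in zip(aperture, aperture[1:])]
    return first_run([s > rise_mm for s in steps], n_frames)

def movement_offset(aperture, onset, still_mm=0.2, n_frames=20):
    """First frame after onset from which aperture changes by < still_mm for n_frames steps."""
    steps = [abs(b - a) for a, b in zip(aperture[onset:], aperture[onset + 1:])]
    start = first_run([s < still_mm for s in steps], n_frames)
    return None if start is None else onset + start

def frames_to_ms(frame_index):
    """Convert a frame index to milliseconds after the first recorded sample."""
    return frame_index * 1000.0 / SAMPLE_RATE_HZ

if __name__ == "__main__":
    # Synthetic trace (made-up values): rest, opening, partial closing, contact.
    trace = ([20.0] * 5 + [20.0 + k for k in range(1, 31)]
             + [50.0 - 0.5 * k for k in range(1, 11)] + [45.0] * 25)
    onset = movement_onset(trace)
    print(onset, movement_offset(trace, onset), frames_to_ms(onset))
```

If recording starts at stimulus presentation, `frames_to_ms(onset)` corresponds to the RT; otherwise the stimulus-onset frame would have to be subtracted first.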

Similarly to Ganel and Goodale (2014), both RT and response precision were analyzed. Reaction time was defined as the time interval between the stimulus presentation and the onset of the movement. To analyze the precision of the response, we computed the within-subject standard deviation of the response in the baseline and filtering blocks at the time of the maximum grip aperture (see Ganel et al., 2008). To this end, the standard deviations for each of the four objects were computed and then averaged for each experimental condition.
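As a minimal sketch of the precision measure, the following Python function computes, for a single participant, the standard deviation of the MGA for each object within each condition and then averages these values per condition. The input layout, the function name, and the use of the sample standard deviation (n - 1 denominator) are assumptions made for illustration.

```python
from statistics import mean, stdev

def precision_by_condition(trials):
    """Average within-subject SD of MGA per condition for one participant.

    `trials` is a list of (condition, object_id, mga_mm) tuples, a
    hypothetical layout chosen for this example.
    """
    # Group MGA values by (condition, object).
    groups = {}
    for condition, object_id, mga_mm in trials:
        groups.setdefault((condition, object_id), []).append(mga_mm)

    # SD per object first, then the mean of those SDs within each condition.
    sds_per_condition = {}
    for (condition, _object_id), values in groups.items():
        sds_per_condition.setdefault(condition, []).append(stdev(values))
    return {cond: mean(sds) for cond, sds in sds_per_condition.items()}

if __name__ == "__main__":
    # Made-up MGA values (mm), for illustration only.
    example = [
        ("baseline", "A", 50.0), ("baseline", "A", 52.0), ("baseline", "A", 54.0),
        ("baseline", "B", 60.0), ("baseline", "B", 64.0), ("baseline", "B", 68.0),
        ("filtering", "A", 50.0), ("filtering", "A", 56.0), ("filtering", "A", 62.0),
    ]
    print(precision_by_condition(example))
```

Computing the SD separately for each object before averaging keeps the systematic scaling of MGA to object size out of the variability estimate.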

In addition to the MGA, we also analyzed movement trajectories. In this analysis, movements were segmented into 11 normalized time points (from movement initiation at 0 % to final grasping of the object at 100 %, in steps of 10 %), and grip aperture was calculated for each of the 11 time points. A similar normalization procedure was also applied in several earlier studies (e.g., Ganel et al., 2012; Króliczak, Westwood & Goodale, 2006). Movement trajectories were calculated separately for the narrow and the wide stimuli, and were collapsed across blocks (filtering/baseline).
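The time normalization can be sketched as follows. The example assumes that the aperture trace has already been cropped from movement onset to movement offset and uses linear interpolation between neighboring samples. The interpolation method and the function name are assumptions, since the text does not specify them.

```python
def normalize_trajectory(aperture, n_points=11):
    """Resample an onset-to-offset aperture trace at evenly spaced time fractions.

    With n_points=11 the samples fall at 0 %, 10 %, ..., 100 % of movement
    time. Values between recorded frames are linearly interpolated.
    """
    if len(aperture) < 2 or n_points < 2:
        raise ValueError("need at least two samples and two time points")
    last = len(aperture) - 1
    resampled = []
    for k in range(n_points):
        position = k * last / (n_points - 1)  # fractional frame index
        lower = int(position)
        upper = min(lower + 1, last)
        fraction = position - lower
        value = aperture[lower] + fraction * (aperture[upper] - aperture[lower])
        resampled.append(value)
    return resampled

if __name__ == "__main__":
    # Made-up trace of 21 frames rising linearly from 0 to 20 mm.
    trace = [float(v) for v in range(21)]
    print(normalize_trajectory(trace))
```

Because every movement is mapped onto the same 11 time points, trajectories of different durations can be averaged point by point across trials and participants.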

Results

RTs were averaged for each participant for each condition. Longer RTs were observed in the filtering blocks (272 ms) compared to the baseline blocks (260 ms). Nevertheless, this difference was not significant, pointing to the lack of a significant RT-based Garner interference [t(17)=1.38, p = .18; Fig. 2a].

Fig. 2

Reaction times (a) and average within-subject standard deviations (b) at the baseline and filtering blocks. While reaction times (RTs) to initiate grasping did not differ statistically between baseline and filtering, a robust accuracy-based Garner interference effect was found, with more variable performance in the filtering compared to the baseline blocks. Error bars represent confidence intervals of the main effect of block as calculated for repeated-measures ANOVAs (Jarmasz and Hollands, 2009)

The variability analysis, which reflects the precision of the response, revealed a robust Garner interference. Specifically, higher precision (i.e., smaller variability) was found in the baseline blocks (3.25 mm) compared to the filtering blocks (4.24 mm) [t(17)=4.02, p < .001, Cohen’s dz = .94; Fig. 2b]. This finding provides novel evidence that actions directed toward 2D stimuli are not analytic in nature and are subserved by a holistic representation of object shape.
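For a paired comparison, Cohen’s dz is the mean of the within-participant differences divided by their standard deviation, and it relates to the paired t statistic as dz = t / sqrt(n). The Python sketch below shows both computations. The function names are hypothetical, and the only values taken from the text are t = 4.02 and n = 18.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_and_dz(baseline, filtering):
    """Paired t statistic, Cohen's dz, and degrees of freedom.

    Each list holds one value per participant (e.g., the average
    within-subject SD in that condition), in the same participant order.
    """
    diffs = [f - b for b, f in zip(baseline, filtering)]
    n = len(diffs)
    dz = mean(diffs) / stdev(diffs)  # standardized mean difference
    t = dz * sqrt(n)                 # equivalent to mean / (sd / sqrt(n))
    return t, dz, n - 1

def dz_from_t(t, n):
    """Recover Cohen's dz from a reported paired t value and sample size."""
    return t / sqrt(n)

if __name__ == "__main__":
    # Reported values: t(17) = 4.02 with 18 participants in the analysis.
    print(round(dz_from_t(4.02, 18), 2))
```

Applied to the reported statistic, this gives a dz of about 0.95, which agrees with the reported value of .94 to within rounding of the t statistic.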

Grasping directed toward 2D objects lacks some of the properties inherent to real grasping, and therefore cannot be considered strictly natural. Nevertheless, the movement trajectory analysis (Fig. 3) suggests that grasping toward 2D objects cannot be considered “double pointing.” In particular, “double pointing” toward a flat surface is not expected to yield a grasping trajectory that includes a point of MGA prior to contact, but rather to produce a monotonic increase in aperture up to the point of final contact. As the trajectory analysis shows, grasping movements directed toward 2D objects were similar, in many respects, to typical grasping toward 3D objects (Fig. 3). In particular, grip aperture was scaled to object size after about 20 % of the movement time and remained sensitive to this parameter until the end of the movement. Most importantly, MGAs were reached at around 70 % of the movement, similar to grasping trajectories toward 3D objects (e.g., Ganel, Freud, Chajut, & Algom, 2012; Jakobson & Goodale, 1991; Jeannerod, 1984, 1986). On the other hand, there were also some differences between movement trajectories in the current study and typical trajectories toward 3D objects. In particular, the difference in aperture between the point at which MGA was achieved and the point of the final grip was smaller in the present study. This observation is in line with previous findings showing that grasping movements directed toward 2D objects are usually accompanied by smaller peak apertures, but preserve some of the basic attributes of natural grasping (Westwood, Danckert, Servos, & Goodale, 2002).

Fig. 3

Average grip aperture data throughout the movement trajectory. Grasping apertures were calculated separately for the narrow and wide objects and were collapsed across blocks (filtering/baseline). Grasping apertures began to be scaled to object size at about 20 % of movement time, and reached a peak (Maximum Grip Aperture) at about 70 % of movement time. This trajectory pattern resembles typical trajectories toward real objects (see Westwood et al., 2002). Error bars denote the confidence intervals of the main effect of size as calculated for repeated-measures ANOVAs (Jarmasz and Hollands, 2009)

Discussion

The present study aimed to investigate the nature of the representation that mediates goal-directed actions toward 2D stimuli. More specifically, we utilized the Garner task to show that, unlike in actions toward real objects, actions directed toward 2D stimuli are subserved by a holistic representation of object shape.

Actions performed toward 2D stimuli are becoming increasingly frequent with recent technological advances. Specifically, people more commonly interact with 2D objects presented on their smartphones or tablets. Nevertheless, little is known about the underlying cognitive and neural mechanisms that subserve this type of movement. Early studies suggested that actions toward 2D targets are not fundamentally different from normal grasping of 3D targets. This conclusion was mainly based on findings of basic sensitivity to object size for 2D targets. For example, Westwood et al. (2002) tested a patient with visual-form agnosia (D.F.) who exhibited similar sensitivity to the size of both 2D and 3D targets in her average MGAs during grasping, but not in her average perceptual estimations of object size. On the other hand, a more recent study that measured the within-subject variability of the response, rather than looking only at the average response, found an adherence to Weber’s law for 2D but not for 3D objects. In particular, JNDs at the time of MGA increased linearly with object size when actions were directed toward 2D objects (Holmes & Heath, 2013), but not when actions were directed toward real objects (Ganel et al., 2008, 2014; Heath, Holmes, Mulla, & Binsted, 2012; Heath, Mulla, Holmes, & Smuskowitz, 2011; Holmes, Mulla, Binsted, & Heath, 2011).

Note that grasping was used here as a model task because of the extensive previous research on grasping of real objects and on the comparison between 2D and real grasping. This choice enabled us to directly compare the pattern of trajectories in the current study with the results of previous studies that used the same task with 3D objects. However, we are aware that typical interactions with virtual objects displayed on touch screens and tablets do not necessarily entail grasping movements; such devices are also approached with pointing movements toward objects, or with movements meant to increase or decrease the size of the “window” presented on the screen. We plan to directly address this issue in future studies in our laboratory.

The utilization of the precision grasping task for 2D objects provides an additional challenge. In particular, this task could potentially be performed using a “double pointing” strategy rather than natural grasping. Nevertheless, the movement trajectory analysis presented in Fig. 3 revealed that grip aperture reached a peak at about 70 % of the movement time. This well-established finding is associated with grasping (e.g., Jeannerod, 1984, 1986) and is not usually found during pointing movements. Moreover, according to prominent models of “double pointing” (Smeets & Brenner, 2001), such movements should be characterized by independent control of the two digits to different locations, and should not be affected by factors such as object size, irrelevant size, or any other type of relative or holistic-based processing (Smeets & Brenner, 2008), which is measured by Garner’s paradigm. The finding of Garner interference during 2D grasping indicates holistic processing of object shape and therefore cannot be easily accommodated by a “double pointing” underlying mechanism.

The Garner speeded-classification task has been used over the past decade to explore potential differences between the processes underlying action and perception. Garner interference effects were consistently found for perceptually driven tasks, implying holistic, Gestalt-based processing of object shape. In contrast, when goal-directed actions were performed toward real 3D objects, no Garner interference effects were found, which is usually interpreted as evidence for an analytic processing style (Ganel & Goodale, 2003, 2014; Janczyk & Kunde, 2012; Kunde et al., 2007; but see Hesse & Schenk, 2013). This dissociation was interpreted as support for the perception-action model (Goodale & Milner, 1992).

A significant effect of Garner interference was found for accuracy, or precision (indicated by more variable performance in the filtering compared to the baseline blocks), but not for RTs. The lack of a significant Garner interference effect for RTs may be accounted for by the idea that grasping directed at 2D objects is hybrid in nature, mediated by both dorsal and ventral stream processing in an interactive manner. This is also evident in the movement trajectory analysis, which shows that 2D grasping movements resemble 3D grasping (i.e., MGA at 70 % of movement time), but at the same time show a smaller difference between the MGA and the final grip aperture (see also Westwood et al., 2002), which could be associated with perceptually based effects on grasps. Future research is needed to test this interpretation.

The present study, as well as previous studies (e.g., Holmes & Heath, 2013), investigated potential differences between grasping directed toward 3D objects and grasping directed toward 2D objects, which were represented by simple line drawings. Hence, it is not yet clear which properties could have mediated the observed differences in performance for these two stimulus types. In particular, simple 2D drawings do not allow tactile feedback at the end of the trial and lack multiple depth cues which are available when grasping is directed toward real 3D objects. Future studies should therefore carefully manipulate these different properties to unfold the exact role of each cue in the planning and execution of visually guided actions.

Beyond exploring the mechanisms that underlie visuomotor control toward 2D objects, the present line of investigation could also contribute to an improved theoretical conceptualization of the mechanisms mediating grasping directed at real 3D objects. For example, according to Glover and Dixon’s planning-control account, which has been put forward as an alternative to Goodale and Milner’s action-perception model, the planning of a visuomotor grasping movement is initially affected by relative, perceptual organization rules. Only later in the action trajectory, when the fingers approach the target object, is feedback from the fingers and from the target object effectively used by the visuomotor system to counteract the initial effects of visual perception (Glover & Dixon, 2001). Note that the results of the current experiment cannot be easily accommodated by the planning-control account. In particular, 2D grasping movements are made toward distinct, visible targets that provide complete visual feedback from the fingers and from the target object throughout the entire movement. Therefore, according to Glover and Dixon’s model, relative effects of visual perception are predicted to diminish during 2D grasping when the fingers approach the object. Contrary to this prediction, the current findings show that grasping trajectories toward 2D objects were affected by a perceptual, relative-based processing style even at late stages of the movement trajectory, which include the point in time at which MGAs are achieved (Jakobson & Goodale, 1991; Jeannerod, 1984, 1986).

In conclusion, the current results provide new evidence to support the idea that grasping directed toward 2D objects is fundamentally different from grasping of 3D objects. In particular, while grasping of 3D objects relies on an analytic representation of object shape, grasping toward 2D objects relies on a holistic representation. Hence, our findings suggest that 2D objects may not be a valid proxy for real objects, particularly in the case of visually guided actions.