Introduction

Human visual perception and attentional selection near the hands are substantially altered compared with these processes far from the hands (Abrams, Davoli, Du, Knapp III, & Paull, 2008; Brockmole, Davoli, Abrams, & Witt, 2013; Di Pellegrino & Frassinetti, 2000; Reed, Grubb, & Steele, 2006). The initial observation came from a neuropsychological study of visual extinction (Di Pellegrino & Frassinetti, 2000). This study revealed that visual extinction in a patient with a right parieto-temporal lesion was remarkably reduced when visual stimuli were presented near the contralesional hand compared with when the stimuli were presented far from it. Subsequently, Reed et al. (2006) examined whether visual attention was modulated by the position of the hands in neurologically intact humans. They had subjects hold one of their hands near the stimuli and found that subjects were faster to detect targets on the side close to the hand. Evidence has accumulated that hand-stimulus proximity modulates a variety of visual processes, including perception, attention, executive control, and emotional processing (Abrams et al., 2008; Cosman & Vecera, 2010; Davoli, Brockmole, Du, & Abrams, 2012; Du, Wang, Abrams, & Zhang, 2017; Wang, Du, He, & Zhang, 2014; Weidler & Abrams, 2014).

While the visual hand-proximity effect has been well documented by a number of studies, only one study, to the best of our knowledge, has investigated an auditory hand-proximity effect and observed a much weaker hand-proximity effect in audition than in vision (Tseng, Yu, Tzeng, Hung, & Juan, 2014). Tseng et al. asked participants to perform a binary auditory-spatial discrimination task while resting their hands near or far from the loudspeakers. They showed that participants responded faster when their left hands were near the stimuli but not when their right hands were near the stimuli. Moreover, this facilitation effect disappeared in a pitch discrimination or spatial-plus-pitch discrimination task. The authors suggested that the effects of hand proximity with auditory stimuli might be weaker than those with visual stimuli. However, it is premature to reach this conclusion because the visual hand-proximity effect has never been explored in a visual task similar to the task used in Tseng et al. This study therefore aimed to further explore the auditory hand-proximity effect.

Two theoretical accounts have been put forward to explain the mechanisms underlying the hand-proximity effect. The initial account drew on multimodal neurons as an explanation (e.g., Reed et al., 2006). The multimodal-neuronal hypothesis postulates that holding the hands close to stimuli can activate multimodal neurons. Thus, stimuli near the hands enjoy a stronger neuronal representation and win the competition for attentional resources (Abrams et al., 2008; Reed et al., 2006). Multimodal neurons respond to both visual and auditory stimuli that are close to the body (Graziano, Reiss, & Gross, 1999) and encode space on the basis of hand-centered coordinate systems (Graziano & Cooke, 2006; Serino, Bassolino, Farnè, & Làdavas, 2007). Therefore, the multimodal-neuronal hypothesis predicts that both visual and auditory processing can be modulated by the proximity of the hands to the stimuli.

A more recent account proposed that having the hand close to stimuli facilitates magnocellular neuron (M-cell) functions at the relative expense of parvocellular neuron (P-cell) functions (Goodhew, Gozli, Ferber, & Pratt, 2013; Gozli, West, & Pratt, 2012). The M-cell enhancement account received support from follow-up studies (e.g., Abrams & Weidler, 2014; Gozli, Ardron, & Pratt, 2014; Kelly & Brockmole, 2014; Thomas, 2015; for a review, see Goodhew, Edwards, Ferber, & Pratt, 2015). M cells dominate the forward projections to the dorsal stream, and P cells dominate the ventral cortical processing stream (Merigan & Maunsell, 1993; see Milner & Goodale, 2006, for a review). As in the visual system, a dorsal-ventral partitioning has also been proposed in the auditory system (Rauschecker & Scott, 2009; Rauschecker & Tian, 2000). The dorsal auditory stream is involved in spatial auditory and audiomotor processing, and the ventral auditory stream is responsible for auditory object identification and recognition (Hickok & Poeppel, 2004; Rauschecker & Scott, 2009; Rauschecker & Tian, 2000). If M cells dominate the dorsal auditory stream as they do the dorsal visual stream, auditory processing can also be modulated by hand-stimulus proximity.

To further explore the auditory hand-proximity effect, we asked participants to perform an auditory Simon task with their hands either next to or far from the loudspeakers. Participants were required to respond to a nonspatial feature of the target (high or low pitch of the sound) while ignoring its spatial location (left or right speaker). Responses have been found to be faster and/or more accurate when the target location spatially matches the appropriate response (compatible condition) than when it does not (incompatible condition) (Simon & Rudell, 1967; Wascher, Schatz, & Kuder, 2001; Xiong & Proctor, 2016). The Simon effect is calculated as the mean difference in response time (RT) and in error rate between the compatible and incompatible conditions. Recent studies found that the visuomotor Simon effect in RT was enhanced when the hands were near the stimuli compared with when the hands were far from the stimuli (Liepelt & Fischer, 2016; Wang et al., 2014; Wang, Du, Hopfinger, & Zhang, 2018). Moreover, the RT distribution analysis showed that the visuomotor Simon effect near the hands was consistently greater than that far from the hands across four RT bins. Thus, we expected to observe (1) an enhanced auditory Simon effect in RT near the hands compared with far from the hands and (2) a consistent increase in the auditory Simon effect near the hands across all RT bins.

Experiment 1

Methods

Participants

This study was carried out in accordance with the recommendations of the Moral and Ethics Committee of the School of Psychology, Jiangxi Normal University. Based on previously reported effect sizes (ηp2 = 0.25; Liepelt & Fischer, 2016; Wang et al., 2018), a power analysis indicated that 26 participants were needed to achieve 80% power (α = 0.05) to show an effect of hand proximity on the Simon effect. Twenty-eight right-handed undergraduates from Jiangxi Normal University (26 females; age: 18–21 years) participated in the experiment for payment. One participant was replaced due to a program error. All participants reported normal or corrected-to-normal vision and normal audition. All of them were naïve to the purpose of the study.

Apparatus, stimuli, and procedure

The participants sat in front of a 21-in. LCD monitor (1,024 × 768) at a viewing distance of 70 cm in a dimly lit room. Two loudspeakers were placed in front of the monitor either 16 cm to the left or right of the middle of the monitor and 45 cm in front of the participants. The participants steadied their head on a chinrest and rested their hands on a lightweight board. Two computer mouse devices were mounted 32 cm apart on the left and right ends of a board. In the hand-proximal condition (see Fig. 1, upper-left panel), the board was placed in front of the speakers with the mouse devices aligned with the middle of the speakers. The mouse devices were approximately 2 cm away from the speakers. In the hand-distal condition (see Fig. 1, upper-right panel), the board was put on the participants’ laps. The stimuli were “high” (1,050 Hz) or “low” (650 Hz) tones with a loudness of approximately 60 dBA. Each tone was delivered by one of the two speakers. The stimulus presentation and response collection were controlled by the E-Prime software system.

Fig. 1

Experimental setup in Experiment 1 (upper panel) and Experiment 2 (lower panel). Upper-left panel: The hand-proximal condition in Experiment 1. Upper-right panel: The hand-distal condition in Experiment 1. Lower-left panel: The hand-proximal condition in Experiment 2. Lower-right panel: The hand-distal condition in Experiment 2

Each trial began with a black fixation cross at the center of the screen on a gray background. The fixation duration varied randomly from 800 ms to 1,200 ms. After fixation, a tone was presented for 200 ms via the left or right speaker, followed by a blank screen that remained on until a response was made. Immediately after a response, visual feedback (correct or incorrect) was presented for 1,000 ms at the center of the screen. Participants were instructed to press one of the mouse buttons as quickly and as accurately as possible according to the pitch of the tone while ignoring the location of the sound. Half of the participants responded to the “low” tone with the left hand and to the “high” tone with the right hand, while the other half of the participants had the mapping reversed.

Design and data analysis

The participants familiarized themselves with the task by first completing 24 practice trials that were not subjected to analysis. The following 320 experimental trials were separated into four blocks of 80 trials each. Each block included an equal number of compatible and incompatible trials. There were two blocks in each of the two hand-stimulus proximity conditions. The order of the two hand-stimulus proximity conditions was counterbalanced across participants. For compatible trials, the stimulus was presented to the same side as the correct response, whereas for incompatible trials, the stimulus was presented from the opposite side of the correct response. The experiment was a 2 (hands-proximal, hands-distal) × 2 (compatible, incompatible) factorial design.
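The block structure described above can be pictured with the following minimal sketch. It is only an illustration: the experiment itself was run in E-Prime, the function names (`build_block`, `build_session`) are hypothetical, the rule that derives the response mapping and the order of proximity conditions from a participant number is an assumption, and so is the full crossing of pitch and speaker side in equal numbers, which the text does not state explicitly.

```python
import random


def build_block(mapping, n_trials=80, rng=None):
    """Build one shuffled block with equal numbers of compatible and incompatible trials.

    `mapping` maps pitch to response side, e.g. {"low": "left", "high": "right"}.
    Assumption: pitch and speaker side are fully crossed in equal numbers.
    """
    rng = rng or random.Random()
    cells = [(pitch, side) for pitch in ("low", "high") for side in ("left", "right")]
    reps = n_trials // len(cells)
    trials = []
    for pitch, side in cells * reps:
        trials.append({
            "pitch": pitch,
            "speaker": side,
            "response": mapping[pitch],
            # A trial is compatible when the tone comes from the side of the correct response.
            "compatible": mapping[pitch] == side,
        })
    rng.shuffle(trials)
    return trials


def build_session(participant_number, seed=None):
    """Build the 320 experimental trials: two blocks per hand-stimulus proximity condition."""
    rng = random.Random(seed)
    # Hypothetical counterbalancing rule based on the participant number.
    if participant_number % 2 == 0:
        mapping = {"low": "left", "high": "right"}
    else:
        mapping = {"low": "right", "high": "left"}
    if (participant_number // 2) % 2 == 0:
        order = ["proximal", "distal"]
    else:
        order = ["distal", "proximal"]
    session = []
    for proximity in order:
        for _ in range(2):  # two blocks of 80 trials in each proximity condition
            for trial in build_block(mapping, rng=rng):
                session.append({**trial, "proximity": proximity})
    return session


session = build_session(participant_number=0, seed=1)
```

Under these assumptions, each session contains 320 trials, half of them compatible, with the two proximity conditions run in consecutive pairs of blocks.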

In both experiments, responses slower than 3,000 ms were discarded. Of the remaining trials, those with an error or with a response time (RT) more than 2.5 standard deviations (SDs) below or above the mean RT of the respective condition were excluded from the RT analysis (2.5% in Experiment 1; 2.3% in Experiment 2). Error rates and mean RTs for correct trials were submitted to a 2 (hand-stimulus proximity: hands-proximal and hands-distal conditions) × 2 (S-R compatibility: compatible and incompatible conditions) repeated-measures ANOVA.
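The RT exclusion rule can be sketched as follows. This is a minimal illustration rather than the analysis script actually used: the function name `trim_rts` is hypothetical, and it assumes that the mean and SD are computed once (not recursively) for each participant and condition, over the correct trials that survive the 3,000-ms cutoff.

```python
from statistics import mean, stdev


def trim_rts(rts, correct, cutoff_ms=3000, n_sd=2.5):
    """Return the correct-trial RTs of one condition that survive trimming.

    Step 1: drop error trials and responses slower than `cutoff_ms`.
    Step 2: drop RTs more than `n_sd` SDs away from the mean of the remaining RTs.
    """
    kept = [rt for rt, ok in zip(rts, correct) if ok and rt <= cutoff_ms]
    if len(kept) < 2:
        return kept  # an SD cannot be computed from fewer than two values
    m, sd = mean(kept), stdev(kept)
    return [rt for rt in kept if abs(rt - m) <= n_sd * sd]


# Toy data for one participant in one condition (values in ms).
rts = [480, 490, 500, 510, 520, 480, 490, 500, 510, 520, 500, 1500, 3500, 505]
correct = [True] * 13 + [False]  # the last trial is an error
trimmed = trim_rts(rts, correct)
```

In this toy example, the 3,500-ms response is removed by the absolute cutoff, the error trial is removed, and the 1,500-ms response falls outside the 2.5-SD window, leaving 11 RTs.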

Results

The RTs for the correct trials in Experiment 1 are illustrated in Fig. 2a as a function of hand-stimulus proximity and S-R compatibility. The main effect of S-R compatibility was significant, F (1, 27) = 164.99, p < 0.001, ηp2 = 0.859, with longer RTs in the incompatible condition (M = 551 ms) than in the compatible condition (M = 486 ms), indicating a Simon effect of 65 ms overall. The main effect of hand-stimulus proximity was not significant, F (1, 27) = 1.41, p = 0.245, ηp2 = 0.050. The interaction between S-R compatibility and hand-stimulus proximity was significant, F (1, 27) = 5.48, p = 0.027, ηp2 = 0.169, demonstrating that the Simon effect was larger in the hand-proximal condition (73 ms) than it was in the hand-distal condition (59 ms; mean difference = 13.90, SE = 5.94, 95% CI = [1.71, 26.08]).
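The reported effect sizes and confidence interval can be related to the test statistics with two standard identities: ηp2 = (F × df_effect) / (F × df_effect + df_error), and a 95% CI of mean difference ± t_crit × SE. The sketch below is a rough consistency check, not part of the original analysis; the function names are hypothetical, and the critical value of t for 27 degrees of freedom is taken from a standard table rather than computed.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def confidence_interval(mean_diff, se, t_crit):
    """Two-sided confidence interval for a mean difference."""
    half_width = t_crit * se
    return mean_diff - half_width, mean_diff + half_width


# Two-tailed 95% critical value of t with 27 degrees of freedom (standard table value).
T_CRIT_DF27 = 2.0518

eta_compatibility = partial_eta_squared(164.99, 1, 27)  # main effect of S-R compatibility
eta_interaction = partial_eta_squared(5.48, 1, 27)      # compatibility x proximity interaction
ci_low, ci_high = confidence_interval(13.90, 5.94, T_CRIT_DF27)

print(round(eta_compatibility, 3), round(eta_interaction, 3))
print(round(ci_low, 2), round(ci_high, 2))
```

Because the inputs are themselves rounded, the recomputed interval bounds can differ from the reported ones in the second decimal place.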

Fig. 2

Results from Experiment 1. a The mean response time (RT) for each condition. Error bars represent the within-subject standard errors. b The Simon effect (RT incompatible – RT compatible) is plotted against the mean RTs in each bin

To compute the time course of the Simon effects, we applied the Vincentizing procedure (De Jong, Liang, & Lauber, 1994; Ratcliff, 1979) in the two experiments. Each participant’s RTs were ranked from shortest to longest for compatible and incompatible trials in each condition and were then divided into four equally sized bins. The Simon effect was then calculated by subtracting the mean correct RTs of compatible trials from those of incompatible trials in each bin. A repeated-measures ANOVA was performed on the Simon effects with the factors of RT bin (four bins) and hand-stimulus proximity (hands-proximal and hands-distal conditions).
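The binning step can be sketched as follows for a single participant in a single hand-stimulus proximity condition. This is a minimal illustration with hypothetical function names; it assumes that each list already contains only correct, trimmed RTs, that each list holds at least as many RTs as there are bins, and that bin boundaries are rounded to the nearest trial when the number of trials is not divisible by four.

```python
from statistics import mean


def bin_means(rts, n_bins=4):
    """Rank RTs from shortest to longest and return the mean RT of each bin."""
    ordered = sorted(rts)
    n = len(ordered)
    # Bin boundaries; assumes n >= n_bins so that no bin is empty.
    edges = [round(i * n / n_bins) for i in range(n_bins + 1)]
    return [mean(ordered[edges[i]:edges[i + 1]]) for i in range(n_bins)]


def binned_simon_effect(compatible_rts, incompatible_rts, n_bins=4):
    """Simon effect per bin: incompatible bin mean minus compatible bin mean."""
    comp = bin_means(compatible_rts, n_bins)
    incomp = bin_means(incompatible_rts, n_bins)
    return [i - c for c, i in zip(comp, incomp)]


# Toy RTs (ms) for one participant in one proximity condition.
compatible = [400, 420, 450, 470, 500, 520, 560, 600]
incompatible = [450, 470, 510, 530, 570, 590, 640, 680]
effects = binned_simon_effect(compatible, incompatible)
```

For these toy values the effect grows from 50 ms in the fastest bin to 80 ms in the slowest bin. The per-bin effects of all participants would then enter the bin × proximity ANOVA described above.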

The bin analysis of the Simon effect (as shown in Fig. 2b) revealed a significant main effect of bin, F (3, 81) = 15.77, p < 0.001, ηp2 = 0.369, indicating a generally increasing Simon effect across the four bins. The main effect of hand-stimulus proximity was marginally significant, F (1, 27) = 4.17, p = 0.051, ηp2 = 0.134, indicating a greater Simon effect in the hand-proximal condition than in the hand-distal condition across the four bins. The interaction between hand-stimulus proximity and bin was not significant, F (3, 81) = 1.49, p = 0.224, ηp2 = 0.052.

The analysis of the error rate (as shown in Table 1) showed a main effect of S-R compatibility, F (1, 27) = 24.47, p < 0.001, ηp2 = 0.475. The error rate for compatible trials (1.0%) was lower than that for incompatible trials (5.0%). The main effect of hand-stimulus proximity was not significant, F (1, 27) = 0.15, p = 0.704, ηp2 = 0.01, nor was the interaction, F (1, 27) = 3.14, p = 0.088, ηp2 = 0.10.

Table 1 Error rates for compatibility and hand-stimulus proximity in Experiments 1 and 2

Consistent with the previous findings of an enhanced visuomotor Simon effect (Liepelt & Fischer, 2016; Wang et al., 2014), the data from Experiment 1 showed that the auditory Simon effect in the hand-proximal condition was larger than that in the hand-distal condition. Moreover, the RT distribution analysis demonstrated that the auditory Simon effect near the hands was consistently greater than that far from the hands across the four RT bins. The present results indicate that hand proximity also modulates the auditory Simon effect.

Experiment 2

Similar to the study of Tseng et al. (2014), we used a computer display in Experiment 1 to present the fixation cross at the onset of each trial and the feedback after the responses. The loudspeakers were placed slightly in front of the display. Participants might have allocated more visual attention to near-hand space when the hands were close to the loudspeakers. Moreover, the hands were visible to the participants in the hand-proximal condition, but not in the hand-distal condition. It was unclear whether increased visual attention to near-hand space and the visibility of the hands contributed to the enhanced auditory Simon effect near the hands in Experiment 1. To rule out these possibilities, the display in Experiment 2 was placed between the participant and the loudspeakers, with the distance from the display to the participant the same as that to the loudspeakers (see Fig. 1, lower panel). The view of the loudspeakers and the hands was obscured by the display.

Method

Participants

We conducted this experiment with a new group of 28 subjects (18 females; age: 18–23 years). One participant was replaced due to a low accuracy in the hand-distal condition (79%, more than 3 standard deviations from the group mean), and one participant was replaced because he failed to follow the experimental instructions. All participants reported being right-handed with normal or corrected-to-normal visual acuity. All were paid for their participation.

Apparatus, stimuli, and procedure

The apparatus, stimuli, and procedure were identical to those in Experiment 1, except that a laptop with a 12.5 in. screen was used to display the fixation and the feedback. The laptop was placed between the participant and the loudspeakers, with the distance from the display to the participant the same as that to the loudspeakers (see Fig. 1, lower panel). The view of the loudspeakers and the hands was obscured by the display.

Results

The RTs for the correct trials are illustrated in Fig. 3a as a function of hand-stimulus proximity and S-R compatibility. The main effect of S-R compatibility was significant, F (1, 27) = 96.89, p < 0.001, ηp2 = 0.782, with longer RTs in the incompatible condition (M = 560 ms) than in the compatible condition (M = 512 ms), demonstrating a Simon effect of 48 ms overall. The main effect of hand-stimulus proximity was not significant, F (1, 27) = 2.46, p = 0.128, ηp2 = 0.084. The interaction between S-R compatibility and hand-stimulus proximity was significant, F (1, 27) = 7.13, p = 0.013, ηp2 = 0.209, suggesting that the Simon effect was larger in the hand-proximal condition (55 ms) than in the hand-distal condition (41 ms; mean difference = 13.71, SE = 5.14, 95% CI = [3.17, 24.25]).

Fig. 3

Results from Experiment 2. a The mean response time (RT) for each condition. Error bars represent the within-subject standard errors. b The Simon effect (RT incompatible – RT compatible) is plotted against the mean RTs in each bin

The bin analysis of the Simon effect (as shown in Fig. 3b) revealed that the main effect of bin was not significant, F (3, 81) = 0.22, p = 0.883, ηp2 = 0.008, indicating a constant time course across the four bins. The main effect of hand-stimulus proximity was significant, F (1, 27) = 6.42, p = 0.017, ηp2 = 0.192, suggesting a greater Simon effect in the hand-proximal condition than in the hand-distal condition across the four bins. The interaction between hand-stimulus proximity and bin was not significant, F (3, 81) = 1.51, p = 0.218, ηp2 = 0.053.

The analysis of the error rate (as shown in Table 1) showed a main effect of S-R compatibility, F (1, 27) = 37.23, p < 0.001, ηp2 = 0.580. The error rate for compatible trials (2.4%) was lower than that for incompatible trials (6.7%). There was no main effect of hand-stimulus proximity, F (1, 27) = 0.31, p = 0.585, ηp2 = 0.01, nor was there an interaction, F (1, 27) = 2.41, p = 0.132, ηp2 = 0.08.

Consistent with the findings of Experiment 1, Experiment 2 showed that the auditory Simon effect was reliably enhanced in the hand-proximal condition. Thus, our two experiments consistently showed that the auditory Simon effect was enhanced when the hands were close to the speakers compared with far from the speakers.

However, the pattern of results differed in some respects between Experiment 1 and Experiment 2. Firstly, the Simon effect was larger in Experiment 1 (65 ms) than in Experiment 2 (48 ms), p = 0.017. A critical manipulation in Experiment 2 was that the display obscured the view of the loudspeakers and of the hands when they were placed near the loudspeakers. It is possible that the view of the loudspeakers helped participants to localize the sound in Experiment 1. As a consequence, the location of the sound was more salient, and hence the Simon effect was larger, in Experiment 1 than in Experiment 2.

Secondly, in Experiment 1 there was an RT difference between the hand-distal trials and the hand-proximal trials in the compatible condition (494 ms vs. 478 ms, p = 0.028) but not in the incompatible condition (552 ms vs. 550 ms, p = 0.831). In Experiment 2, however, the RT difference was present in the incompatible condition (551 ms vs. 570 ms, p = 0.026) but not in the compatible condition (510 ms vs. 515 ms, p = 0.532). The view of the hands in the hand-proximal condition might have facilitated responding, resulting in slightly shorter RTs for the hand-proximal trials than for the hand-distal trials in Experiment 1. There was, however, no such facilitation when the view of the hands was obscured in Experiment 2, in which the overall RT was slightly longer for the hand-proximal trials than for the hand-distal trials. It is likely that the RT difference appears in the compatible trials when participants respond faster in the hand-proximal trials than in the hand-distal trials, and in the incompatible trials when participants respond more slowly in the hand-proximal trials than in the hand-distal trials. Note that prior studies of the visuomotor Simon effect also showed that the RT difference could occur in the compatible condition, in the incompatible condition, or in both (Wang et al., 2014, 2018). The relative response speed in the hand-proximal condition might therefore determine where the RT difference occurs.

Lastly, the bin analysis showed a stable Simon effect across time bins in Experiment 2 instead of the increasing Simon effect observed in Experiment 1. The previous literature reported varied patterns of the auditory Simon effect across time bins. Some studies showed an increasing or level trend across time bins (Proctor & Shao, 2010; Wascher et al., 2001), while others reported a decreasing trend (Xiong & Proctor, 2016). The reason for these discrepancies is not entirely clear, but differences in RTs might be important. According to the diffusion model for conflict (DMC) tasks (Ulrich, Schröter, Leuthold, & Birngruber, 2015), RT distribution functions depend on the relative speeds of the automatic and controlled processes. An increasing or level trend across time bins means that the automatic process peaks relatively later in time when responses are fast. In contrast, a decreasing trend across time bins indicates that the automatic process peaks relatively earlier when responses are slow (Ulrich et al., 2015). In the present study, responses were slightly faster in Experiment 1 (518 ms) than in Experiment 2 (536 ms). Consistent with the suggestion of the DMC, the distribution function was more positive in Experiment 1 than in Experiment 2.

Comparison between visual and auditory hand-proximity effect

Wang et al. (2014) revealed that the visuomotor Simon effect was larger in magnitude when the hands were near the stimuli than when the hands were far from the stimuli. To examine whether the effect of hand proximity on the visuomotor Simon effect differed from that for the auditory Simon effect, a between-experiment ANOVA was performed. The RT data from Experiments 1 and 2 in the study of Wang et al. (2014) were pooled and compared with the pooled data from Experiments 1 and 2 in the present study. The RTs were submitted to a 2 (hand-stimulus proximity: proximal and distal) × 2 (S-R compatibility: compatible and incompatible) × 2 (modality: visual and auditory) mixed ANOVA.

The results showed a main effect of S-R compatibility, F (1, 106) = 322.20, p < 0.001, ηp2 = 0.752, with longer RTs in the incompatible condition (M = 500 ms) than in the compatible condition (M = 460 ms), indicating a Simon effect of 40 ms overall. The main effect of modality was significant, F (1, 106) = 53.71, p < 0.001, ηp2 = 0.336. The RTs were shorter for the visual stimuli (M = 433 ms) than for the auditory stimuli (M = 527 ms). The two-way interaction between hand-stimulus proximity and S-R compatibility was significant, F (1, 106) = 23.45, p < 0.001, ηp2 = 0.181, indicating that the Simon effect was larger in the hand-proximal condition (47 ms) than in the hand-distal condition (33 ms). The two-way interaction between S-R compatibility and modality was significant, F (1, 106) = 50.11, p < 0.001, ηp2 = 0.321, demonstrating that the auditory Simon effect was larger (57 ms) than the visual Simon effect (25 ms). Most importantly, the three-way interaction was not significant, F (1, 106) = 0.19, p = 0.664, ηp2 = 0.002, suggesting that the increase in the Simon effect near the hands compared with far from the hands was essentially the same for the visual and the auditory stimuli. No other effects reached significance, Fs < 0.35, ps > 0.558.

Consistent with previous findings, the between-experiment comparisons revealed that the auditory Simon effect was larger than the visual Simon effect (Xiong & Proctor, 2016). However, the increased visual and auditory Simon effects were essentially the same when the hands were close to the stimuli compared to when the hands were far from the stimuli.

General discussion

This study aimed to examine the effect of hand proximity on auditory processing. Participants performed an auditory Simon task either with their hands close to the loudspeakers or far from the loudspeakers. Consistent with the findings in vision (Liepelt & Fischer, 2016; Wang et al., 2014), the present results showed that the auditory Simon effect was enhanced near the hands compared with far from the hands (Experiment 1). This effect remained robust even after ruling out the possible influence of visual attention (Experiment 2). Furthermore, between-experiment comparisons showed no difference between the increases in the visual and auditory Simon effects near the hands, suggesting that the magnitude of the auditory hand-proximity effect is the same as that of the visual effect. The present study is the first to reveal a robust auditory hand-proximity effect, and it clearly shows that the effects of hand proximity in audition are not weaker than those in vision.

Our findings are consistent with the multimodal-neuronal hypothesis. The multimodal-neuronal hypothesis postulates that holding the hands close to stimuli can activate multimodal neurons (Abrams et al., 2008; Reed et al., 2006). Multimodal representations provide a stronger neuronal representation of the stimuli near the hands, leading them to win the competition for attentional resources (Abrams et al., 2008; Reed et al., 2006). Multimodal neurons respond to both visual and auditory stimuli that are close to the body (Graziano, Reiss, & Gross, 1999) and encode space on the basis of hand-centered coordinate systems (Graziano & Cooke, 2006; Serino et al., 2007). Thus, the spatial S-R mapping may become stronger when the stimuli come closer to the hands, resulting in an enhanced Simon effect near the hands (Liepelt & Fischer, 2016; Wang et al., 2014). Therefore, the present findings of an enhanced auditory Simon effect are consistent with the multimodal-neuronal hypothesis.

Moreover, if M-cell enhancement also holds for the processing of auditory stimuli near the hands, our findings are compatible with the M-cell enhancement account as well. The M-cell enhancement account suggests that objects near the hands bias visual processing toward the action-oriented magnocellular visual pathway and away from the perception-oriented parvocellular visual pathway (Goodhew et al., 2013; Gozli et al., 2012; see Goodhew et al., 2015, for a review). M cells dominate the forward projections to the dorsal stream, which specializes in spatial perception and visuomotor integration, and P cells dominate the ventral cortical processing stream, which specializes in object recognition and identification (Merigan & Maunsell, 1993; see Milner & Goodale, 2006, for a review). Thus, M-cell enhancement can account for the enhanced visuomotor Simon effect (Wang et al., 2014). As in the visual system, a dorsal-ventral partitioning has also been proposed in the auditory system (Rauschecker & Scott, 2009; Rauschecker & Tian, 2000). The dorsal auditory stream is involved in spatial auditory and audiomotor processing, and the ventral auditory stream is responsible for auditory object identification and recognition (Hickok & Poeppel, 2004; Rauschecker & Scott, 2009; Rauschecker & Tian, 2000). If M cells dominate the dorsal auditory stream as they do the dorsal visual stream, the auditory Simon effect near the hands should be enhanced compared with that far from the hands.

The findings of the current study can also be explained by the referential coding account (Hommel, 1993; Murchison & Proctor, 2015, 2016). The referential coding account suggests that the stimulus is spatially coded with reference to intentionally defined objects (Hommel, 1993). Murchison and Proctor (2015, 2016) used this account to explain a reduced flanker effect near the hands. Replicating the finding of Davoli and Brockmole (2012), Murchison and Proctor (2015, 2016) observed reduced interference in a flanker task when the hands were placed around the target location rather than below the screen. They suggested that the stimuli's locations are coded relative to the hands when the hands are placed around the target location, allowing more efficient allocation of visual attention to the target location. Accordingly, in the present study, the stimuli's locations could be coded relative to their corresponding hands when the hands were placed near the stimuli, resulting in a stronger binding between the stimulus and its corresponding effector in the hand-proximal condition. As a consequence, the Simon effect was enhanced near the hands.

The present study demonstrated that the magnitude of the auditory hand-proximity effect was the same as, rather than weaker than, that of the visual hand-proximity effect. This result speaks against the suggestion of a weaker hand-proximity effect in audition than in vision (Tseng et al., 2014). Tseng et al. found only a left-hand advantage in a binary-spatial discrimination task when both hands were near the loudspeakers. No such facilitation was found in a nonspatial pitch discrimination task or in a combined spatial-pitch discrimination task. The authors suggested that the effect of hand proximity was weaker in audition than in vision. However, this suggestion might be inconclusive. To the best of our knowledge, although the visual hand-proximity effect has been found in a variety of tasks, no study has investigated the visual hand-proximity effect in a visual task similar to the task used in Tseng et al. In Tseng et al.’s study, the responding hands were ipsilateral to the stimuli; that is, the left and right hands were always assigned to respond to the left and right stimuli, respectively. This task is similar to the congruent condition of the Simon task. Previous studies that used visual Simon tasks failed to find a reliably faster response near the hands in the congruent condition. It is therefore likely that the visual hand-proximity effect is also weak in a visual task similar to the task used in Tseng et al. In addition, the two experiments in the current study also showed that the effect of hand proximity was not reliable in audition during the congruent trials. Thus, the weak effect of hand proximity in Tseng et al.’s study might be due to the compatible spatial task rather than to the auditory stimuli they used.

Replicating previous findings in vision (Wang et al., 2014, 2018), the present study found an enhanced auditory Simon effect near the hands. The enhanced visuomotor/auditory Simon effect indicates impaired executive control when the hands are close to the stimuli. This is in sharp contrast to the proposal that executive control is improved near the hands (Englert & Wentura, 2016; Weidler & Abrams, 2014). Those studies found a reduced conflict effect in the Stroop and Flanker tasks (Davoli, Du, Montana, Garverick, & Abrams, 2010; Englert & Wentura, 2016; Weidler & Abrams, 2014). As Wang and colleagues suggested, the discrepancy may be attributed to the different sources of conflict in those tasks (Wang et al., 2014, 2018). A stimulus-stimulus (S-S) conflict underlies the Flanker or the Stroop effect, while a stimulus-response (S-R) conflict underlies the Simon effect (Kornblum, Hasbroucq, & Osman, 1990). It is possible that holding the hands near the stimuli facilitates the resolution of an S-S conflict but impairs the resolution of a spatial S-R conflict (Wang et al., 2014, 2018).

The literature shows that the hand-proximity effect is highly context-dependent (e.g., Bush & Vecera, 2014). For example, Reed et al. (2006) suggested that stimuli in near-hand space enjoy attentional prioritization when a single hand is pointed at the target, while Abrams et al. (2008) revealed delayed attentional disengagement when both hands are placed near the target. In addition, Gozli et al. (2012) proposed that near-hand space facilitates M-cell functions at the expense of P-cell functions. However, the pattern of findings is reversed if one hand is placed near the stimulus (Bush & Vecera, 2014), if a large number of items are in the display (high attentional demand; Goodhew & Clarke, 2016), or if the hand is positioned to afford a precision grasp (Thomas, 2015). Thus, as Gozli and Deng (2018) suggested, the hand-proximity effects may not form a unitary set of phenomena, and no single theory can explain all of them.

In conclusion, this research provides the first evidence that having the hands close to the stimuli enhanced the Simon effect in audition, suggesting the existence of an auditory hand-proximity effect. The auditory hand-proximity effect was reliable and not weaker than the visual hand-proximity effect. Our findings are consistent with the multimodal-neuronal hypothesis and the theory of event coding. If the M-cell enhancement also holds for processing auditory stimuli near the hands, our findings can be accommodated by the M-cell enhancement account too.

Open Practices Statement

None of the data or materials for the experiments reported here are available, and none of the experiments were preregistered.