In everyday life, our brains receive and efficiently process visual signals not only when we are stationary, but also when we are walking or running. In the latter case, a critical problem for the visual system is to dissociate two sources of retinal input: signals representing motion in the world and signals induced by our own movements (for a review, see Greenlee et al., 2016).

To date, a plethora of research has shown that the neural system can predict sensations and suppress responses to self-motion (e.g., head movements), so that we can efficiently process motion signals in the real world (Haarmeier, Thier, Repnow, & Petersen, 1997; Miall & Wolpert, 1996; Troncoso et al., 2015; Wallach, 1987; Wolpert, Ghahramani, & Jordan, 1995; Wurtz, 2008). To deepen our knowledge of such visual–vestibular integration or interaction, which occurs frequently in everyday life, a thorough investigation of the head movements of mobile observers is needed. Yet an obstacle arises: Most previous and contemporary methods have been based on mechanical devices (Harris, Morgan, & Still, 1981; Jaekl, Jenkin, & Harris, 2005; Kaliuzhna, Prsa, Gale, Lee, & Blanke, 2015; Shirai & Ichihara, 2012; Wallach & Flaherty, 1975). These mechanical methods are useful for studying the passive movements of immobile observers, but they are not suitable for studying the locomotion of mobile observers. For example, a trolley (Harris et al., 1981) and an optical mouse (Shirai & Ichihara, 2012) have been used to track subjects' passive fore–aft movements while the subjects stood on the trolley or sat in a wheelchair. In other work, subjects sat in a chair and their voluntary head rotations were measured by a mechanical tracker (Jaekl et al., 2005), or a mechanical chair was used to deliver passive whole-body rotational stimuli (Kaliuzhna et al., 2015). Mechanical complexity and lack of portability restrict the generality of all these methods. For instance, a mechanical device designed for studying passive on-axis rotation (i.e., rotation about the head-to-seat body axis) is usually not capable of investigating passive centrifugal rotation (i.e., rotation about an earth-vertical axis different from, or even distant from, the head-to-seat body axis). In such cases, researchers may have to build a new mechanical device every time they want to study a different type of self-motion. Lack of portability also hampers the investigation of vision and multisensory interactions during locomotion (e.g., when people are walking or running). Moreover, mechanical devices are hard to build and calibrate. All these disadvantages impede the discovery of visual–vestibular interactions, especially under more natural conditions.

The present study introduces a virtual reality (VR) device that combines a three-space sensor with a head-mounted display to quantitatively control the causal relationship between retinal motion and head movements. The device can be assembled easily and cheaply, and it is small and lightweight. Because the three-space sensor records the rotational spatiotemporal information of the head in real time, the device can in theory track any head rotation, which should suffice for investigating various research questions about visual–vestibular and/or visual–proprioceptive integration and interaction. Note that even when an observer sits (relatively) still on a rotating swivel chair, proprioceptive signals, in addition to vestibular signals, may contribute to the perception of self-motion. For simplicity, we therefore use the term vestibular in the rest of this article to denote both vestibular and proprioceptive signals (see also the Results section for Exp. 5).

As is shown in Fig. 1a, a three-space sensor (Yost Labs) was installed on top of the helmet of a pair of head-mounted goggles. Our customized Matlab code recorded the head orientation in real time, from which the velocity and acceleration of head movements could be calculated, and used this information to present visual motion stimuli in real time. Data acquisition from the sensor and presentation of the visual stimuli were executed in each pass of the main program loop. We could therefore estimate the delay from the start of data acquisition to the completion of the stimulus update within the current loop by using the “GetSecs” function in the Psychophysics Toolbox (Brainard, 1997), although this estimate could only return a multiple of the duration of one vertical retrace of the goggles. In our tests, the temporal lag averaged 16.4 ms, approximately the duration of a single vertical retrace. Because the estimate is quantized in this way, the true lag was below 17 ms and could be reduced further by a display with a higher refresh rate.
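For illustration, the core of one pass of the loop can be sketched as follows. This is a minimal sketch, simplified to a single grating, with our own expository variable names; readSensorYaw is a hypothetical stand-in for the customized serial-port routine (not shown), and a Psychophysics Toolbox window win with flip interval ifi is assumed to be open:

    tStart  = GetSecs;                            % start of data acquisition
    yawNow  = readSensorYaw(sensorPort);          % current head yaw (deg); hypothetical helper
    yawVel  = (yawNow - yawPrev) / ifi;           % yaw velocity (deg/s) over one frame
    xOffset = xOffset + yawVel * ifi * pixPerDeg; % shifting the source window with the head
                                                  % makes the grating drift opposite to the turn
    x0      = mod(xOffset, period);               % gratingTex is one grating period wider than
                                                  % stimW, so the source rect stays in bounds
    Screen('DrawTexture', win, gratingTex, [x0, 0, x0 + stimW, stimH], dstRect);
    Screen('Flip', win);                          % the update appears at the next vertical retrace
    lag     = GetSecs - tStart;                   % end-to-end delay, quantized to one retrace
    yawPrev = yawNow;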

Fig. 1

Experimental design for the first three experiments. (a) The subject receives visual input through a head-mounted display. A three-space sensor attached to the top of the helmet records the subject's head movements in real time. (b) Schematic of the stimuli in the head movement condition of Experiment 1. During the adaptation stage (left), when the head rotated to the right, a vertical grating drifted leftward in the upper visual field; when the head rotated to the left, a vertical grating drifted rightward in the lower visual field. Immediately after the end of adaptation came the test stage (right), in which a static vertical grating was presented centrally, covering the two adapting locations. Subjects were required to click the mouse when the MAE vanished. (c) Stimuli in the head movement condition of Experiment 2. Here the retinal motion was perpendicular to the head rotations. (d) Stimuli in one of the head movement conditions of Experiment 3, in which the direction of the retinal motion was the same as that of the head rotation. The other head movement condition was the same as in Experiment 1

Similar to our method, some commercial virtual reality devices (e.g., the Oculus Rift or HTC Vive) also include an accelerometer or motion-tracking system. A recent study (Kim, Chung, Nakamura, Palmisano, & Khuu, 2015) used the Oculus Rift to study vection (the illusory perception of self-motion created by watching optic-flow stimuli). However, the commercial devices are meant mainly for entertainment, so support for research is still being established and improved (e.g., the Psychophysics Toolbox has recently released a new toolbox for VR hardware). According to Kim and colleagues' report (Kim et al., 2015), the temporal lag in their setup was up to 196.7 ms. Our method relies on customized Matlab code and the Psychophysics Toolbox (Brainard, 1997) to trigger the visual stimulation with head movement and to pair the two motion profiles, thus greatly reducing the end-to-end lag.

To validate our method efficiently, the present study focused on one particular aspect of visual–vestibular interaction: the influence of head rotations on adaptation to retinal motion. In light of the hypothesis that the neural system suppresses responses to self-motion (Miall & Wolpert, 1996; Wallach, 1987; Wolpert et al., 1995), if retinal motion resulting from self-motion is suppressed, adaptation to that retinal motion should also be weakened. One would therefore expect a diminished motion aftereffect (MAE) for retinal motion signals induced by self-motion. The MAE is an illusion, formed after viewing a moving stimulus for a period of time, in which a stationary test pattern appears to move in the direction opposite to the original stimulus (Harris et al., 1981; Huk, Ress, & Heeger, 2001). In support of this hypothesis, previous studies have found a substantial reduction in MAE following prolonged exposure to expanding motion during the subjects' forward movements (Harris et al., 1981; Wallach & Flaherty, 1975). Furthermore, no MAE is observed for horizontal image displacements over the retina due to either eye (Mack et al., 1987; Morgan, Ward, & Brussell, 1976; Swanston & Wade, 1992) or head (yaw) movements (Swanston & Wade, 1992).

Using our VR device, we first attempted to replicate a previous finding that the MAE is suppressed following adaptation to horizontal retinal motion resulting from head (yaw) rotation (Swanston & Wade, 1992). The magnitude of the MAE was estimated by its duration (Keck, Palella, & Pantle, 1976; McGovern, Roach, & Webb, 2012; Petrov & Van Horn, 2012; Swanston & Wade, 1992; Verstraten, Fredericksen, & van de Grind, 1994), a longer duration corresponding to a stronger MAE. In Swanston and Wade's work, a leftward-adapting motion signal on the retina was produced by rightward head movements tracking a rightward-moving (but retinally stationary) central grating relative to stationary (but retinally leftward-moving) flanking gratings. The relationship between retinal motion and head movement rendered by their manipulation was similar to that in our Experiment 1, though our new method may produce a more immersive experience, with the velocities of head rotation and retinal motion accurately matched in real time.

Considering that both the head movement and retinal motion signals comprise features such as speed, direction, and acceleration, it remains largely unknown which feature(s) caused the suppression of MAE found in previous work (Swanston & Wade, 1992). This question could not readily be addressed with Swanston and Wade's traditional psychophysical approach, but it was tested systematically in the present study, given the freedom and convenience of stimulus presentation afforded by a head-mounted display. Our results in Experiments 1–4 indicated that the MAE is reduced following prolonged exposure to retinal motion causally induced by head rotation. However, whether or not the direction of retinal motion was opposite to the head rotation (as in everyday life or in Swanston and Wade's work) was not critical for producing the suppression. During voluntary head movements, in addition to vestibular signals, both efference copy signals (Sperry, 1950; von Holst & Mittelstaedt, 1950) and neck proprioceptive signals (Pettorossi & Schieppati, 2014) may influence perception. Therefore, to better control for the contributions of efference copy and proprioceptive signals, in Experiment 5 we examined the role of head rotation on the MAE in both voluntary and passive conditions (using a swivel chair). The results suggested that passive and voluntary head rotations produced comparable magnitudes of MAE inhibition.

Method

Experimental procedures in all the experiments of the present study were approved by the Institutional Review Board of the Institute of Psychology, Chinese Academy of Sciences, and informed consent was obtained from all subjects. All subjects had normal or corrected-to-normal vision. In all five experiments, one subject was the first author, and all the other subjects were naive to the experimental hypothesis. For all of the experiments, we sought to collect data from about ten participants, since previous studies using similar paradigms with five (Harris et al., 1981) or 12 (Swanston & Wade, 1992) subjects indicated that this sample size would yield ample power.

Stimuli were presented on Sony HMZ-T3 head-mounted goggles (50° × 28° of visual angle, 1,280 × 720 pixel resolution at 60 Hz), connected to a Dell XPS 8700 computer and programmed in Matlab with the Psychophysics Toolbox. A three-space sensor (TSS-WL sensor, YEI Technology, USA) was attached to the top of the helmet of the goggles and was used to record the subject's head movement data in real time. Communication with the three-space sensor was handled by a customized computer program we developed.

Experiment 1

Eleven normal adults (eight females, three males; age range = 18–32 years) participated in Experiment 1 (one of them was the author J.B.). There were two experimental conditions: head movement and head still. The MAEs produced in the head-still sessions were compared with those in the head movement sessions, to examine the influence of head movements on motion adaptation. Each session included an adaptation stage and a test stage. In a head movement session, subjects started the experiment by pressing the space bar and then immediately rotated their heads horizontally back and forth for 4 min at a rhythm of 0.276 ± 0.148 Hz. Subjects were required to turn their heads from the left (or right) side to the right (or left) side during the head rotations. The angular range of head rotations, calculated by multiplying the average velocity of head rotation by the average duration of a head turn, was 110° ± 14°. The subject's head was not supported during the experiments; however, we instructed the subjects to make yaw rotations rather than pitch or roll rotations, and our program read and processed only the data along the yaw dimension. A red fixation point (0.46°, or 12 pixels, in diameter) was always presented at the center of the display in both the adaptation and test stages, and subjects were required to maintain fixation throughout the experiment. When the head was rotating to the left, a vertical grating drifted rightward in the lower visual field on a midgray background (see Fig. 1b); when the head was rotating to the right, a vertical grating drifted leftward in the upper visual field. Thus, during the adaptation stage the drifting adapter always moved rightward in the lower visual field and leftward in the upper visual field, so that the MAE in either visual field could accumulate over time without the two canceling each other. This is one advantage of using two drifting adapters (though only one at a time, in either the upper or the lower visual field) over using only one. The drifting gratings were rendered to move at the same speed as the head turns at all times, on the basis of the head movement data recorded by the three-space sensor in real time. Once the adaptation stage finished, a static vertical grating was presented centrally, covering the two adapting locations. Most subjects could perceive an MAE in which the upper part of the test grating appeared to move rightward, whereas the lower part appeared to move leftward. In our experience from a previous study (Mesik, Bao, & Engel, 2013), this direction contrast across the upper and lower visual fields facilitates the detection of a weak MAE; this is another advantage of using two such drifting adapters. Subjects were told to click the left mouse button once the MAE stopped, or to click the right mouse button if they did not experience any MAE. All the adapting and test gratings were at full contrast, with a spatial frequency of 0.13 cpd. Each adapting grating subtended 24.7° × 6.79°, centered 3.55° away from the fixation point. The test grating subtended 24.7° × 13.89°. In the head movement sessions, whenever a head turn finished, the average drifting velocity of the adapting grating during that turn and the time relative to the start of the session were saved. These data were used to generate the adapting gratings in the head-still sessions, which simulated the visual inputs of the preceding head movement sessions.
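As an illustration of this last step, a finished head turn might be detected and logged as follows (a minimal sketch with hypothetical variable names; velBuffer and turnLog are assumed to be initialized before the loop, and a turn ends when the yaw velocity changes sign):

    velBuffer(end+1) = yawVel;                 % velocity samples within the current turn
    if numel(velBuffer) > 1 && sign(yawVel) ~= sign(velBuffer(end-1))
        % the turn that just ended: store its end time and mean drift velocity
        turnLog(end+1, :) = [GetSecs - tSession, mean(velBuffer(1:end-1))];
        velBuffer = yawVel;                    % begin accumulating the next turn
    end

In a head-still session, each row of turnLog would then specify when the simulated adapter should reverse and at what velocity it should drift.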
To avoid an unwanted influence of testing order on the measured MAE, each subject completed three head movement sessions (HM) and two head-still sessions (HS). Subjects took a break of 9.7 min, on average (SD = 7.4 min), between successive sessions, to avoid any carryover effects from the preceding session. The session order for each subject was either HM–HS–(HM)–HS–HM or (HM)–HS–HM–HM–HS; the results of the head movement sessions within parentheses were not entered into the analysis. The visual stimuli during the adaptation stage of each head-still session were derived from the preceding head movement sessions.

Experiment 2

The same 11 subjects participated in Experiment 2. The stimuli and procedure resembled those in Experiment 1, except for the following changes: A horizontal grating drifted downward in the left visual field when the head rotated to the left, and a horizontal grating drifted upward in the right visual field when the head rotated to the right (see Fig. 1c). The drifting speed of the grating again depended on the speed of head rotation. Both adapting gratings subtended 6.79° × 24.7°, at 3.55° eccentricity. The test grating was also horizontally oriented and centrally presented, subtending 13.89° × 24.7°. The expected MAE was that the left part of the test grating would appear to move upward while the right part appeared to move downward. The visual stimuli in the head-still sessions were generated using the same method as in Experiment 1. Subjects took a break of 5.4 min, on average (SD = 2.7 min), between successive sessions.

Experiment 3

Eleven subjects (six females, five males; age range = 19–32 years) participated in Experiment 3, four of whom had also participated in the first two experiments. Unlike in Experiment 1, there were two different head movement sessions: The adapting grating could drift in either the same direction as, or the opposite direction to, the head rotation (see Fig. 1d). Since no order effect had been observed in Experiment 1, each subject completed four sessions, with the two head-still sessions following the two head movement sessions. The visual stimuli in the head-still sessions were generated using the same method as in Experiment 1.

Experiment 4

Ten subjects (six females, four males; age range = 18–32 years) participated in Experiment 4. Eight of them (including the author J.B.) had also participated in Experiments 1 and 2. Each subject completed three sessions. The stimuli and procedure in the first session were identical to those in the head movement session of Experiment 1; this session was used only to compute the average duration τ and average velocity ω of a head turn for the subject. The other two sessions comprised a head movement session and a head-still session. In this head movement session, the adapting gratings drifted independently of the head rotations. Figure 2 shows the time series of head rotation and retinal motion for one subject (S5). More specifically, the relationship between the direction of head movement and the drifting direction of the grating varied randomly throughout the session, so that these directions were uncorrelated. In the head-still session, the visual inputs were exactly the same as in the head movement session.

Fig. 2

Time series for head rotation and retinal motion in one subject (S5) in Experiment 4. The time series for head rotation (y1) was recorded during the head rotation session, and the uncorrelated retinal motion (y2) was calculated on the basis of a time series (y0) of head rotation recorded in a separate head rotation session before the formal experiment. As is shown in this figure, the time series y1 and y2 were uncorrelated with each other, since the correlation coefficient (r) between the two time series was close to zero

Using the temporal information for head turns recorded in the first session, we could construct a time series y0 depicting the change of head rotation direction over time. The series was sampled at 100 Hz; a time point was assigned 0 if the head was rotating to the left, and 1 if it was rotating to the right. The visual stimulation in the first session was the same as in Experiment 1, so a time point assigned 0 meant that a grating drifted rightward in the lower visual field, and a time point assigned 1 meant that a grating drifted leftward in the upper visual field. We could likewise construct a time series y1 for the head movement session of the formal experiment. However, the time series for the retinal motion in that session was not y1 but another series, y2, designed to be independent of y1.

By creating an independent time series y2, we aimed to make the direction reversals of the retinal motion independent of those of the head movements. If there were a strong causal relationship between the direction reversals of the two time series, y1 and y2 would be either highly anticorrelated (i.e., a correlation coefficient of r ≈ – 1, as in Exp. 1) or highly correlated (i.e., r ≈ 1, as in the “same-direction” condition of Exp. 3). If there were no causal relationship between the direction reversals (as we intended in this experiment), the two time series should be uncorrelated, with a correlation coefficient approaching zero. Technically, we had to know y2 before generating the visual stimuli in the head movement session of the formal experiment (i.e., the second head movement session), but y1 was not recorded until the completion of that session; we therefore relied on y0 (i.e., the time series of the first head movement session) to produce y2 in advance.

To create y2, we concatenated multiple pairs of zeros and ones of random length. Each pair consisted of a run of zeros followed by a run of ones, with the number of zeros equal to the number of ones. The duration represented by the zeros (or ones) in each pair was randomly selected from seven possible values, ranging from 0.1τ to 1.9τ in steps of 0.3τ. The total length of the time series was constrained to equal 4 min. We created 500 such time series and randomly selected half of them to be subtracted from 1, so that the resulting series would start with a one rather than a zero. We then computed the correlation between y0 and each of the 500 series, and the least correlated series was selected as y2. Since y0 and y1 were derived from the same subject, we assumed that y2 would also be uncorrelated with y1. After the subjects had completed the head movement session, we performed an additional correlation analysis between y2 and y1; the results for every subject confirmed this assumption (see the example results for one subject in Fig. 2). To define the velocity of the drifting grating, a random value within ω ± 1 SD was chosen for each pair of periods in y2.
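For illustration, this construction can be sketched in Matlab as follows. This is a minimal sketch under our assumptions: tau is the subject's mean turn duration in seconds, y0 is the 100-Hz direction series from the first session, and the variable names are ours, not those of the actual code:

    fs   = 100;                             % sampling rate (Hz)
    nTot = 4 * 60 * fs;                     % a 4-min series
    durs = (0.1:0.3:1.9) * tau;             % seven possible durations per half pair
    cand = cell(1, 500);
    for k = 1:500
        s = [];
        while numel(s) < nTot
            n = round(durs(randi(7)) * fs); % random duration for this pair
            s = [s, zeros(1, n), ones(1, n)];  % a run of 0s (left) then 1s (right)
        end
        cand{k} = s(1:nTot);                % constrain total length to 4 min
    end
    for k = randperm(500, 250)              % half the candidates start with 1 instead of 0
        cand{k} = 1 - cand{k};
    end
    r = zeros(1, 500);
    for k = 1:500
        c    = corrcoef(y0(:), cand{k}(:)); % correlation with the first-session series
        r(k) = c(1, 2);
    end
    [~, best] = min(abs(r));                % the least-correlated candidate becomes y2
    y2 = cand{best};
    % The drift speed for each pair is then drawn uniformly from omega +/- 1 SD (omitted).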

Experiment 5

Ten subjects (six females, four males; age range = 19–26 years) participated in Experiment 5. There were two self-motion conditions. In the voluntary condition, subjects sat in a swivel chair and used their feet and legs to rotate the chair back and forth (see Fig. 3a); in this way, their heads rotated in space but remained still relative to their bodies. The angular range of head rotations in space was 71.3° on average (SD = 14.2°), and the rhythm of head rotation was 0.237 ± 0.055 Hz. In the passive condition, the subjects sat still in the swivel chair while an experimenter rotated the chair back and forth (see Fig. 3b); in other words, the subjects' heads rotated in space passively (i.e., without voluntary motor actions). The angular range of head rotations in space was 49.0° on average (SD = 11.5°). To prevent anticipation in the passive condition, the experimenter randomly varied the magnitude of rotation in each rotating phase, which also explains why the angular range of head rotations was smaller in the passive than in the voluntary condition.

Fig. 3

Experimental design for Experiment 5. (a) The voluntary condition: Subjects rotated the swivel chair back and forth using their feet and legs, so that their heads rotated in space but remained still relative to their bodies. (b) The passive condition: The experimenter rotated the chair back and forth while the subject sat still relative to the chair. In both conditions, the three-space sensor recorded the head rotation in space

There was a head-still control session for each self-motion session. In these control sessions, subjects adapted to the simulated visual inputs (produced in the same way as in Exp. 1) while remaining stationary. Each subject completed four sessions, with each self-motion session followed by its control session.

Results

Experiment 1

We first investigated whether head rotation could affect the processing of the retinal motion signals that resulted from the head movement itself when the environment was static. During the adaptation periods, whenever the head rotated to the left, a vertical grating drifted rightward in the lower visual field on a midgray background (see Fig. 1b); conversely, whenever the head rotated to the right, a vertical grating drifted leftward in the upper visual field. Because the drifting gratings were rendered to move at the same speed as the head turns and in the opposite direction at all times, this mimicked the relationship between retinal motion and head movements in everyday life (though only in half of the visual field for each head rotation).

This visual presentation repeated as subjects rotated their heads back and forth for 4 min (over 110° ± 14°, at a rhythm of 0.276 ± 0.148 Hz), during which time subjects were told to maintain gaze on the central fixation point. Immediately after the end of adaptation, a static vertical grating was presented centrally, covering the two adapting locations. Most subjects experienced a vivid MAE in which the upper grating appeared to move to the right while the lower one moved to the left. They were told to click the left mouse button once the MAE had stopped, or to click the right mouse button if they did not experience any MAE. As a control, MAE duration was also measured after subjects had watched simulated replays of the visual stimuli recorded in the head rotation sessions.

The testing sequences for the head rotation and replay sessions were counterbalanced (see the Method section for details). One-sample Kolmogorov–Smirnov tests suggested that the data in all five experiments of the present study followed normal distributions (all ps > .31). The MAE was significantly shorter-lived [t(10) = 4.71, p < .001, Cohen's d = 1.42] in the head rotation condition (11.36 s) than in the head-still condition (19.93 s; see Fig. 4). This head-movement-induced inhibition of MAE was also quantified with an inhibition index, obtained by dividing the MAE duration in the head movement condition by that in the head-still condition; an inhibition index smaller than 1 indicates a reduction of MAE in the head movement condition. A one-sample t test on the inhibition index (against 1) also revealed a significant inhibition effect of head rotation [t(10) = 6.67, p < .001, d = 2.01, mean = 0.54; see Fig. 5]. These results are in line with the hypothesis that the MAE is diminished for retinal motion signals induced by self-motion, extending it to head yaw rotations.
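Formally, in our notation, the index for each subject is

    inhibition index = T_HM / T_HS,

where T_HM and T_HS denote that subject's MAE durations in the head movement and head-still conditions, respectively.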

Fig. 4

Individual and grand average MAE durations for the head rotation and head-still conditions in Experiment 1. Error bars represent ± 1 SEM

Fig. 5

The suppression of MAE in all five experiments (abbreviated as “Exp.”) was evaluated with an inhibition index, which was calculated by dividing the MAE duration for the head movement condition by that for the head-still condition. Each symbol shows the inhibition index for a subject, while each bar denotes the grand average inhibition index. Opposite, Perpendicular, Same, and Uncorrelated denote that during adaptation, the directions of retinal motion were opposite to, perpendicular to, the same as, or uncorrelated with the direction of head rotation, respectively. Voluntary and Passive represent the voluntary and passive conditions in Experiment 5. Error bars represent ± 1 SEM

Experiment 2

To test whether the inhibition of MAE relied on the natural relationship between retinal motion and head movements, we first adopted horizontal gratings drifting in the left or right visual field. Specifically, a grating drifted downward in the left visual field when the head rotated to the left, and a grating drifted upward in the right visual field when the head rotated to the right (see Fig. 1c). With this manipulation, the effects observed in Experiment 1 were replicated [MAE duration: t(10) = 2.26, p = .048, d = 0.68; head rotation, 15.45 s; head still, 26.83 s; inhibition index: t(10) = 3.66, p = .004, d = 1.10, mean = 0.70; see Fig. 5]. Because the 11 subjects in this experiment had also participated in Experiment 1, we then performed a 2 (Head Status: head rotation vs. head still) × 2 (Experiment: Experiment 1 vs. Experiment 2) repeated measures analysis of variance (ANOVA) on the MAE durations. We found a significant main effect of head status [F(1, 10) = 13.26, p = .005]. However, neither the main effect of experiment [F(1, 10) = 0.93, p = .358] nor the interaction [F(1, 10) = 0.29, p = .603] was significant, suggesting that the head-rotation-induced inhibition of MAE was similar across the two experiments.

Experiment 3

We then compared the degree of inhibition when the direction of retinal motion was either opposite to (as shown in Fig. 1b) or the same as (Fig. 1d) the direction of head movement. A 2 (Head Status: head rotation vs. head still) × 2 (Direction: opposite vs. same) repeated measures ANOVA revealed a significant main effect of head status [F(1, 10) = 9.95, p = .010], yet neither the main effect of direction [F(1, 10) = 0.04, p = .841] nor the interaction [F(1, 10) = 4.07, p = .071] was significant. This suggests that inhibition of MAE was present and comparable in both conditions.

Paired t tests also indicated that in both conditions, MAE lasted a shorter time [“Opposite,” t(10) = 3.35, p = .007, d = 1.01; “Same,” t(10) = 2.26, p = .048, d = 0.68] in the head rotation condition (“Opposite,” 14.31 s; “Same,” 16.01 s) than in the head-still condition (“Opposite,” 20.27 s; “Same,” 19.19 s; see Fig. 5). The inhibition indices were significantly less than 1 in the “Opposite” condition [t(10) = 4.77, p < .001, d = 1.44, mean = 0.69], whereas in the “Same” condition the effect was marginal [t(10) = 2.15, p = .057, d = 0.65, mean = 0.83].

Experiment 4

To verify that the inhibition of MAE depended on the causal pairing between retinal motion and head movements (e.g., in velocity and in the timing of direction reversals), the drifts of the gratings were rendered independent of head rotation. If the inhibition of MAE strongly depended on the pairing between the retinal motion and head movement profiles, independent retinal motion should eliminate the previously observed effects.

Each subject first completed one head rotation session. The head movement data from this session were then used to create a time series defining the direction and speed of the retinal motion in the formal experiment. We then tested the correlation between the actual time series of retinal motion direction and head movement direction in the head movement session of the formal experiment. The correlation coefficients were nearly zero (r = .0004, with a range from – .0034 to .0075), indicating that the retinal motion and head movements were essentially uncorrelated.

As expected, the MAE durations for the head rotation and head-still conditions did not differ statistically [MAE duration: t(9) = 0.11, p = .914, d = 0.04; head rotation, 23.13 s; head still, 23.30 s; inhibition index: t(9) = 0.83, p = .428, d = 0.26, mean = 0.95; see Fig. 5], confirming the critical contribution of the pairing of the head and retinal motion profiles to the inhibition effects.

Experiment 5

As shown in the experiments above, the effects of adaptation to retinal translational motion were attenuated by co-occurring head rotations, as long as the two signals were synchronized so as to convey a visuo-motor causal relationship. These results are consistent with the prediction of the efference copy theory. The term efference copy was coined in 1950 (von Holst & Mittelstaedt, 1950): an efference copy signal is generated before motor actions, resulting in a corollary discharge (Sperry, 1950) of the expected sensory consequences in the brain. Through these mechanisms, the neural system can predict sensations and suppress responses to self-generated sensations, allowing animals to efficiently process motion signals in the real world (Bridgeman, 2007; Leube et al., 2003; Miall & Wolpert, 1996; Wolpert et al., 1995; Wurtz, 2008). In our experiments, the synchronized retinal motion was likely identified as a consequence of the head movements, and thus became suppressed. Moreover, spindles in the neck muscles provide proprioceptive cues for head rotation in the yaw plane (Chan, Kasper, & Wilson, 1987), which might also have contributed to the inhibition of MAE.

However, to examine strictly whether efference copy or proprioceptive signals indeed played a key role in our observations, we conducted a fifth experiment. In the voluntary condition, the subjects sat in a swivel chair and used their feet and legs to rotate the chair back and forth, so that their heads rotated in space but remained still relative to their bodies. In the passive condition, the subjects sat still in the chair while an experimenter rotated it back and forth, so that their heads rotated in space passively (i.e., without voluntary motor actions). To prevent anticipation, the experimenter randomly varied the magnitude of rotation during each rotating phase. As a control, MAE duration was also measured after subjects had watched simulated replays of the visual stimuli recorded in both the voluntary and passive conditions.

A 2 (Head Status: head rotation vs. head still) × 2 (Rotation Type: voluntary vs. passive) repeated measures ANOVA revealed a significant main effect of head status [F(1, 9) = 11.07, p = .009], yet the main effect of rotation type [F(1, 9) = 0.02, p = .888] and the interaction [F(1, 9) = 0.149, p = .708] were not significant. Paired t tests suggested that MAE duration was significantly shorter [t(9) = 2.64, p = .027, d = 0.83] in the voluntary condition (23.14 s) than in the control condition (29.24 s). Similarly, MAE duration was also significantly shorter [t(9) = 2.66, p = .026, d = 0.84] in the passive condition (22.01 s) than in the control condition (29.31 s). The analysis of inhibition indices showed a significant inhibition effect in the passive condition [t(9) = 4.12, p = .003, d = 1.30, mean = 0.73; see Fig. 5] and a marginal effect in the voluntary condition [t(9) = 2.12, p = .063, d = 0.67, mean = 0.79; see Fig. 5]. Comparison of the inhibition indices between the voluntary and passive conditions did not show a significant difference [t(9) = 0.31, p = .764, d = 0.10].

Because the reduction of MAE duration was found in both the voluntary and passive conditions with similar effect sizes (or even a larger effect size in the passive condition, based on the inhibition index analysis), we propose that the present findings are accounted for mainly by the vestibular system rather than by the motor planning system. Because the subject's head was not fixed in the passive condition, small involuntary movements of the head with respect to the body might occasionally have occurred in some subjects; thus, the results cannot completely exclude a contribution from proprioceptive signals. However, since the subjects were instructed to remain stationary relative to the chair, we presume that any contribution from proprioceptive signals was minor compared to that from vestibular signals. For this reason, the sum of vestibular and proprioceptive signals is referred to simply as “vestibular” signals in this article.

Discussion

Previous studies have mostly relied on mechanical devices to study the interaction between vision and self-movement (Harris et al., 1981; Jaekl et al., 2005; Kaliuzhna et al., 2015; Shirai & Ichihara, 2012; Wallach & Flaherty, 1975). Mechanical devices, however, are neither easy to install nor very portable. The 21st century has witnessed the rise of wearable video technology, such as Google Glass, HoloLens, and Oculus Rift. The present study introduced a new method based on this technology, incorporating a three-space sensor to track head movements.

In the present study, we developed a researcher-friendly VR system that can track head movement and deliver visual stimuli in experimenter-specified ways with precise timing (lag < 17 ms). For instance, the retinal and head motion profiles can be paired in both speed and direction, paired in speed but perpendicular in direction, or completely decorrelated. Most important, both the motion tracking and the stimulus presentation are based on Matlab and the Psychophysics Toolbox, two widely used research software packages in the perception sciences. This software compatibility gives our system sufficient adaptability to satisfy a wide range of research goals in the field. Furthermore, the device can be assembled easily and is small and lightweight. Given all these advantages, the present study demonstrates that our method can serve as a powerful tool for studying visual–vestibular interaction.

Using our new method, Experiment 1 replicated Swanston and Wade's (1992) finding that the MAE is suppressed following adaptation to head-rotation-induced retinal motion. This inhibition of MAE disappeared in Experiment 4, in which the directional association between retinal motion and head rotation was made inconsistent and unreliable. Together, this evidence shows that the head-rotation-induced inhibition of MAE rested on the pairing of the retinal motion and head movement profiles, underscoring the important role of the correlation between retinal motion and head rotation in this phenomenon. The results of Experiment 5 further suggested that the inhibition of MAE arises from vestibular rather than from efference copy signals. Beyond these neural insights, the new and extended findings demonstrate the efficacy and validity of using our method to investigate visual–vestibular interaction. Unlike the previous methods, our approach is electronic, and thus more portable.

Notably, our method not only makes such research portable but also opens a window onto different kinds of electronic manipulations of visual–vestibular associations, since both the visual stimulation and the recording of head movements are launched and freely controlled by customized computer programs. These properties give our method additional value, and Experiments 2 and 3 provide two interesting examples. Both experiments established an unusual directional association between retinal motion and head movement over short-term adaptation. Such abnormal visual–vestibular associations are rarely experienced in everyday life; nevertheless, the neural system could still quickly learn to identify the strong correlation between the two motion profiles and suppress the corresponding visual processing, just as in Experiment 1, in which the visual–vestibular association was more natural.

These results can be explained by the notion that the neural system predicts sensations and suppresses responses to self-motion in order to efficiently process motion signals in the real world (Haarmeier et al., 1997; Miall & Wolpert, 1996; Wallach, 1987; Wolpert et al., 1995; Wurtz, 2008). The literature has reported that adaptation to expanding optic flow is suppressed during the observer's forward movement (Harris et al., 1981; Wallach & Flaherty, 1975). Our work demonstrates a similar visual–vestibular interaction: Head rotation in the yaw plane can also suppress adaptation to the associated retinal motion. Surprisingly, this interaction is more flexible than was previously thought. The suppression of adaptation we observed presumably reflects suppressed neural responses to the associated retinal motion. Previous work on continuous flash suppression of visual adapters has shown that when an adapter is suppressed, both the effects of adaptation and the early visual responses to the adapter are weakened (Blake, Tadin, Sobel, Raissian, & Chong, 2006; Mei, Dong, Dong, & Bao, 2015; Yuval-Greenberg & Heeger, 2013). In addition, repeated visual adaptation over multiple daily sessions has been found to result in progressively weaker aftereffects (Dong, Gao, Lv, & Bao, 2016), and comparison of the decay time courses of the aftereffects suggests that the effective strength of the adapter weakens over training (Dong, Engel, & Bao, 2014; Dong et al., 2016; Greenlee, Georgeson, Magnussen, & Harris, 1991). Therefore, as an indirect behavioral marker, suppressed effects of adaptation may indicate suppressed neural responses to visual adapters. Nevertheless, future work should use neurophysiological approaches to obtain more direct evidence for this account.

One possible explanation for the flexibility of visual–vestibular interaction is Hebbian synaptic learning (Hebb, 1949), which has also been used to explain the existence of mirror neurons (Gallese, Fadiga, Fogassi, & Rizzolatti, 1996). For example, during the adaptation in Experiment 2, a downward-drifting grating was presented whenever the subject made a leftward head turn. As a reafferent sensory stimulus, the downward-drifting grating triggered activity in the visual neurons encoding downward motion. Because the activity of these visual neurons consistently overlapped in time with that of the vestibular neurons responding to leftward head turns, Hebbian learning predicts that the synapses connecting the two types of neurons should be potentiated. Accordingly, over time the downward retinal motion would come to be identified by the neural system as a sensory consequence of leftward head turns, and would thus be suppressed. Another possible explanation is that visual–vestibular interaction occurs at relatively late decision stages (Kovacs, Raabe, & Greenlee, 2008; Ventre-Dominey, 2014), in which neurons are not direction-specific; for such stages, whether or not the directional association between the two signals is natural would matter little.
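In its simplest rate-based textbook form (given here purely as an illustration, not as a model fitted to our data), the Hebbian prediction can be written as

    Δw = η · a_vis · a_vest,

where a_vis is the activity of a visual neuron encoding downward motion, a_vest is the activity of a vestibular neuron responding to leftward head turns, η is a learning rate, and Δw is the resulting change in the connecting synaptic weight: the weight grows precisely when the two activities overlap in time.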

Multisensory interactions have recently been classified into multisensory convergence, multisensory transformation, and multisensory modulation (Haggard, Iannetti, & Longo, 2013). Multisensory convergence often manifests as improved sensitivity or performance due to the convergence of unimodal information from two or more modalities (Gu, Angelaki, & Deangelis, 2008; Gu, Watkins, Angelaki, & DeAngelis, 2006). Multisensory transformation refers to the transformation of information from one modality into the spatial reference frame of another; this form of interaction may explain findings of hand-centered MAEs (Matsumiya & Shioiri, 2014) and other crossmodal aftereffects (Cuturi & MacNeilage, 2014; Konkle, Wang, Hayward, & Moore, 2009). The findings of the present study reflect the third form, multisensory modulation (Haggard et al., 2013). The flexibility of such modulation indicates that the multisensory networks (probably including the visual, vestibular, proprioceptive, and motor systems) can rapidly rewire to adapt to novel causal associations between retinal motion and head movement. The underlying neural mechanisms may involve brain areas such as MSTd, VIP, and the cerebellum (Billington & Smith, 2015; Brooks, Carriot, & Cullen, 2015; DeAngelis & Angelaki, 2012; Gu et al., 2008; Gu et al., 2006; Zhang, Heuer, & Britten, 2004), which future neurophysiological investigations will need to identify.

Moreover, although the present study focused on head (yaw) rotation, the method can be used to track rotation along other dimensions (e.g., pitch, roll) and more complex head or body movements. It is also feasible to apply the method to experiments in which observers move naturally in the environment while viewing virtual stimuli.

Limitations

An important prerequisite of our experiments was that subjects maintained good central fixation while rotating their heads. However, we did not use eye tracking, so no data were available to verify fixation, although subjects were instructed to maintain central fixation at all times during each session. Furthermore, recent work has shown that fixation influences visual–vestibular conflict detection and integration (Garzorz & MacNeilage, 2017): Fixating a head-fixed target (as in our experiments) optimizes visual–vestibular integration, whereas fixating a scene-fixed target impairs integration but improves the detection of visual–vestibular conflict. Eye tracking will be incorporated in future studies so that the role of fixation can be explored further. In addition, although we believe that the subjects in Experiment 5 made no observable voluntary head movements, on the basis of the instructions and the experimenter's monitoring (author J.B. stayed with each subject throughout the experiments), we lacked objective supporting data. This limitation could be resolved in future work by attaching a second three-space sensor to the swivel chair, such that the difference between the signal from the sensor on the helmet and that from the sensor on the chair would reveal any measurable voluntary head movement during the passive rotations.
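Such a check might be sketched as follows (illustrative Matlab code with hypothetical variable names; both signals are yaw angles in degrees, recorded over a passive-rotation session):

    relYaw = yawHelmet - yawChair;            % head yaw relative to the chair
    if any(abs(relYaw - relYaw(1)) > thresh)  % thresh: tolerance for involuntary jitter
        warning('Voluntary head movement detected during passive rotation.');
    end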

It should be noted that the three-space sensor used in our present system can accurately track only rotational motion. To record translational information, new sensor(s) would need to be added in a future version of the system, so that a single device could apply to all types of head movements. Nevertheless, the key methodological conception of such a future system would be identical to the present one: a head-mounted display plus one or more sensors controlled by customized programs written in popular, professional research software.

Conclusion

The present study has introduced a VR approach for investigating visual–vestibular integration and interaction. We observed that head rotations suppressed adaptation to the retinal motion they induced, and that this effect was independent of the direction of the inducing retinal motion. This new tool provides a means to test various types of visual–vestibular integration and interaction in future work, and may thus accelerate discovery in the field.

Author note

This research was supported by the National Natural Science Foundation of China (31571112, 31371030, 31271175, and 31525011) and by the Key Research Program of the Chinese Academy of Sciences (XDB02010003 and QYZDB-SSW-SMC030). Authors’ contributions: Y.J. and M.B. conceived the VR approach; J.B. and M.B. engineered the VR system; all authors designed the experiments; J.B. performed the experiments; J.B. and M.B. analyzed the data; and M.B., T.Z., and Y.J. wrote the article.