Reference frames in spatial updating when body-based cues are absent

He, Qiliang; McNamara, Timothy P.; Kelly, Jonathan W.

doi:10.3758/s13421-017-0743-y

Published: 28 July 2017

Volume 46, pages 32–42, (2018)
Memory & Cognition

Qiliang He¹,
Timothy P. McNamara¹ &
Jonathan W. Kelly²

Abstract

The current study investigated the reference frame used in spatial updating when idiothetic cues to self-motion were minimized (desktop virtual reality). In Experiment 1, participants learned a layout of eight objects from a single perspective (learning heading) in a virtual environment. After learning, they were placed in the same virtual environment and used a keyboard to navigate to two of the learned objects (visible) before pointing to a third object (invisible). We manipulated participants’ starting orientation (initial heading) and final orientation (final heading) before pointing, to examine the reference frame used in this task. We found that participants used the initial heading and the learning heading to establish reference directions. In Experiment 2, the procedure was almost the same as in Experiment 1 except that participants pointed to objects relative to an imagined heading that differed from their final heading in the virtual environment. In this case, pointing performance was only affected by alignment with the learning heading. We concluded that the initial heading played an important role in spatial updating without idiothetic cues, but the representation established at this heading was transient and affected by the interruption of spatial updating; the learning heading, on the other hand, corresponded to an enduring representation which was used consistently.

To navigate effectively, animals must update the spatial relations between their body and objects in the environment as they move. This process is referred to as spatial updating (Amorim, Glasauer, Corpinot, & Berthoz, 1997; Amorim & Stucchi, 1997; Farrell & Robertson, 1998; Rieser, 1989). Spatial updating can be achieved via external self-motion cues (vision, audition) or internal self-motion cues (vestibular, kinesthetic, efferent information; see Waller & Hodgson, 2013, for a review). Internal self-motion cues are often referred to as idiothetic cues. Regardless of the source of information used in spatial updating, people need a reference frame to represent the spatial relation between themselves and the target. A number of studies have investigated the nature of this reference frame when the full set of idiothetic cues was available during spatial updating (Hodgson & Waller, 2006; Kelly, Avraamides, & Loomis, 2007; Mou, McNamara, Valiquette, & Rump, 2004; Wang et al., 2006). In daily life, people often commute by car or train and therefore have a limited set of idiothetic cues during navigation, but few studies have examined the reference frame used in spatial updating when idiothetic cues are limited or unavailable. In the current study, we investigated the reference system used during spatial updating when idiothetic cues were not available, in a virtual environment navigated by keyboard. Filling this gap in the literature can help us better understand the mechanisms of spatial updating and how people adapt their spatial representations to the availability of idiothetic cues.

Spatial updating can be an egocentric or an allocentric process. Egocentric spatial updating refers to the process whereby the navigator updates each object’s location with respect to the body using a reference system centered on the body (and typically defined by the reference directions of front, back, right, or left; Wang, 2016). In contrast, allocentric spatial updating refers to the process whereby the navigator updates his or her position in the environment using a reference system external to the body and anchored in the environment (e.g., using canonical directions of north, south, east, or west; Klatzky, 1998).

There is evidence of the use of both egocentric and allocentric reference systems in spatial updating when idiothetic cues are available. In relatively featureless environments, people may rely primarily on egocentric reference systems (Wang et al., 2006), but in more natural, feature-rich environments, both reference systems seem to be employed (e.g., Amorim et al., 1997; Hodgson & Waller, 2006; Holmes & Sholl, 2005; Kelly et al., 2007; Mou et al., 2004; Waller & Hodgson, 2006; Xiao, Mou, & McNamara, 2009). For example, Kelly et al. (2007) had participants learn a layout of objects from a fixed perspective and later had them recall the learned objects by pointing from several imagined perspectives. When recall occurred in the same room as did learning, recall was facilitated when the imagined perspective was aligned with (parallel to) the participant’s facing direction during recall, considered evidence for an egocentric reference frame. Whether recall occurred in the learning room or an adjacent room, recall was facilitated when the imagined perspective was aligned with the learning view; this effect is considered evidence for an allocentric reference frame in long-term memory (e.g., Shelton & McNamara, 2001). Moreover, Mou et al. (2004) found that if the imagined perspective was aligned with the allocentric as well as the egocentric reference frame, performance was better than when the imagined perspective was aligned with only one of these reference frames. Taken together, these results indicate that egocentric and allocentric reference frames may be used simultaneously during spatial updating.

In the absence of idiothetic cues, past research shows that spatial updating is still possible with visual cues (Riecke, Heyde, & Bülthoff, 2005; Riecke, Veen, & Bülthoff, 2002; Ruddle, Volkova, & Bülthoff, 2011; Waller, Loomis, & Haun, 2004; but see Klatzky, Loomis, Beall, Chance, & Golledge, 1998). He, McNamara, and Kelly (2017) investigated the nature of the reference frame in a path integration task when idiothetic cues were limited. Participants navigated to three waypoints in a desktop virtual environment using the computer keyboard and then pointed to the first waypoint using a joystick (there was no learning or familiarization phase and only one waypoint was visible at any point in time). The results indicated that under these circumstances, participants used an allocentric reference frame in which the principal reference direction was defined by their initial perspective in the environment (the initial heading).

Spatial updating often occurs, however, in familiar environments (e.g., walking to one’s bathroom at night in the dark). Besides the initial heading, another heading that people may use to establish the reference direction during spatial updating is the perspective from which they learn a layout of objects (the learning heading). Studies have shown that people organize their spatial memories in terms of a small number (1–2) of reference directions even when environments are experienced from multiple points of view (Kelly & McNamara, 2008; McNamara, 2003; Mou & McNamara, 2002; Shelton & McNamara, 2001; Valiquette, McNamara, & Smith, 2003; Waller, Montello, Richardson, & Hegarty, 2002), and generally prefer to use the learning heading to establish a reference direction to represent the object-to-object spatial relations and self-to-object spatial relations (Kelly et al., 2007; Mou et al., 2004).

Combining the aforementioned findings, we conjectured that both the initial heading and the learning heading could be used to establish reference directions in spatial updating in a familiar environment without idiothetic cues. In the current study, we manipulated the alignment among the initial heading, the learning heading, and the imagined heading (the heading participants needed to imagine they were facing in the virtual environment before responding) to examine the reference system.

Figure 1 outlines the experimental design in Experiment 1. Participants learned a layout of objects from a heading of 0° (the learning heading) in a virtual environment. After learning, they were placed in the same virtual environment and used a keyboard to navigate sequentially to two of the learned object locations. The starting orientation (initial heading) and location varied across experimental conditions. After navigating to the second object, participants occupied a position and an orientation (the final heading) which also varied across experimental conditions. Participants used a joystick to point to a third object from this final position and heading, and hence the final heading is referred to as the imagined heading (in Experiment 2, the final heading and the imagined heading differed). As a result of these manipulations, the initial heading could be aligned or 90° misaligned with the imagined heading, and the learning heading could be aligned or 90° misaligned with the imagined heading.

For brevity, the condition in which both the initial and learning headings were aligned with the imagined heading was named IL; the condition in which only the learning heading was aligned with the imagined heading was named L; the condition in which only the initial heading was aligned with the imagined heading was named I; and the condition in which both the initial and the learning headings were misaligned with the imagined heading was named M.
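The condition labels can be illustrated with a minimal sketch. It assumes that "aligned" means the two headings are identical (0° apart), which the text implies but does not state as a rule; the function names are hypothetical and are not part of the experimental software.

```python
def angular_difference(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two headings, in degrees (0 to 180)."""
    diff = (a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def classify_condition(initial: float, learning: float, imagined: float) -> str:
    """Return 'IL', 'L', 'I', or 'M' from three headings given in degrees."""
    # Assumption: a heading counts as aligned only if it equals the imagined heading.
    tolerance = 1e-9
    initial_aligned = angular_difference(initial, imagined) < tolerance
    learning_aligned = angular_difference(learning, imagined) < tolerance
    if initial_aligned and learning_aligned:
        return "IL"
    if learning_aligned:
        return "L"
    if initial_aligned:
        return "I"
    return "M"
```

For example, with a learning heading of 0°, an initial heading of 90°, and an imagined heading of 0°, the sketch returns "L". The heading values used here are illustrative and are not taken from Table 1.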

We assumed that spatial relations that are encoded in memory can be retrieved whereas those that are not encoded must be computed or inferred, which introduces errors and time to the decision processes (Klatzky, 1998; Shelton & McNamara, 2001). Better performance for an imagined heading relative to others indicates that the spatial relations are represented in memory with respect to a reference direction parallel to that imagined heading (Mou et al., 2004; Rump & McNamara, 2013). Therefore, comparison of performance across experimental conditions can determine whether participants used the initial heading, the learning heading, or a combination of the two to establish the reference direction(s): For example, if performance in the I condition or in the L condition was better than performance in the M condition, it would suggest that participants used the initial heading or the learning heading, respectively, to establish reference directions; if performance in the IL condition was also better than performance in the I condition and the L condition, it would suggest that misalignment with one of two reference directions conferred a cost to pointing accuracy (e.g., Mou et al., 2004) or that the availability of two aligned reference directions enabled participants to construct a more accurate representation at the time of responding (perhaps in working memory).^{Footnote 1} On the other hand, if performance was equivalent across conditions, it would suggest that participants did not use the initial heading or the learning heading to establish a reference direction.

We considered that the reference directions established by the initial heading and the learning heading were components of allocentric reference systems because neither of these headings changed as participants changed their orientation in the virtual environment (an alternative interpretation is discussed in the General Discussion). However, we conjectured that the reference systems defined by the initial and learning headings differed in terms of stability (Allen & Haun, 2004): The reference system defined by the initial heading was assumed to be transient because this heading was not constant, and participants never viewed the layout of objects from this heading; the reference system defined by the learning heading, on the other hand, was assumed to be enduring because its orientation was constant and participants learned the layout from this perspective intensively. In Experiment 2, we tested this hypothesis by following almost the same procedure as in Experiment 1, but asking participants to imagine a heading that was 90° misaligned with their final heading before responding (e.g., if their final heading was 0°, participants needed to imagine they were facing 90° or -90°). The same four conditions were used as in Experiment 1, but the mental rotation meant that the imagined heading no longer matched the final heading. We hypothesized that this mental rotation would interrupt spatial updating and force people to switch to an enduring representation system (Waller & Hodgson, 2006), and, as a result, only the reference direction defined by the learning heading would be used during retrieval of spatial relations in Experiment 2. To anticipate our results, Experiment 1 showed that the reference directions defined by both the initial and the learning headings were used during retrieval, whereas only the reference direction defined by the learning heading was used in Experiment 2.

The results of a preliminary experiment with 16 participants were used to establish sample sizes for the current project. This experiment included only the I, L, and M conditions, but otherwise the materials, procedure, and results were similar to Experiment 1 (we decided not to report this experiment fully because it was completely subsumed by Experiment 1). The observed power was .81 in pointing error. To ensure adequate power in the present project, sample sizes of 24 were used.

Experiment 1

Method

Participants

Twenty-four students (12 women) from Vanderbilt University and the Nashville community participated in this experiment in return for extra credit in psychology courses or monetary compensation.

Materials and design

The experiment was conducted on a 21.5-inch Apple iMac desktop computer. The virtual environment (see Fig. 2) consisted of eight virtual objects (dog, ball, mug, fish, car, lamp, plant, and shoe) placed on identical 60-cm-tall blue pillars. Objects were arranged in five columns, as shown in Fig. 2b. In addition, a square (7 m × 7 m × 3 m) virtual room surrounded the scene. The room floor was textured with a brick pattern. The four walls of the virtual room were textured with different colors and materials so that participants could use the texture of the wall to determine their initial heading at the beginning of a trial. The sky was textured with clouds in the learning phase, but was rendered uniformly blue in the test phase, so that participants could not use the sky to determine their position and orientation in the test phase. All participants learned the object locations from a fixed location and perspective (defined as 0°), which was 2 m away from the layout (see Fig. 2a). This viewing perspective ensured that participants could see all objects simultaneously.

We manipulated two headings in the current experiment: The initial heading, which was the heading participants faced at the beginning of a test trial in the virtual environment; and the imagined heading, which was the heading that participants were required to imagine they were facing before responding, and this imagined heading was always the same as the final heading participants occupied at the end of a test trial in the virtual environment. The learning heading was the fixed heading from which participants learned the layout in the learning phase (see Figs. 1 and 2).

To investigate the adopted reference frame, we used a 2 × 2 factorial design, manipulating the alignment between the initial heading and the imagined heading and the alignment between the learning heading and the imagined heading, as shown in Fig. 1. The orientations of the headings in each condition are listed in Table 1. Ten trials were constructed for each experimental condition, resulting in 40 total trials. These 40 trials were divided into 10 blocks of four trials each; each block contained one trial from each condition, presented in random order.
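The blocking scheme can be sketched as follows. This is a minimal illustration and not the authors' software: the function name, the seed parameter, and the random assignment of trials to blocks are assumptions, since the text specifies only that each block held one trial per condition in random order.

```python
import random
from typing import List, Optional, Tuple

CONDITIONS = ("IL", "L", "I", "M")


def build_blocks(trials_per_condition: int = 10,
                 seed: Optional[int] = None) -> List[List[Tuple[str, int]]]:
    """Return blocks of (condition, trial_index) pairs, one trial per condition per block."""
    rng = random.Random(seed)
    # Assumption: which trial of a condition lands in which block is also random.
    per_condition = {}
    for condition in CONDITIONS:
        indices = list(range(trials_per_condition))
        rng.shuffle(indices)
        per_condition[condition] = indices
    blocks = []
    for block_number in range(trials_per_condition):
        block = [(c, per_condition[c][block_number]) for c in CONDITIONS]
        rng.shuffle(block)  # random presentation order within the block
        blocks.append(block)
    return blocks
```

Calling `build_blocks(seed=1)` yields 10 blocks of four trials, 40 trials in total, with every condition appearing exactly once per block.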

Table 1 Participants’ headings (degrees) in the virtual environment across conditions by experiments

As stated previously, if participants used the learning heading or the initial heading to establish the reference direction, then performance in the L or the I condition, respectively, should be better than performance in the M condition. In addition, if participants were able to use aligned initial and learning headings to construct a more accurate representation at the time of responding or if misalignment with learning or initial headings produced processing costs, performance in the IL condition should be better than performance in the L or I condition.

To ensure that any significant differences observed between the aforementioned experimental conditions were not due to path complexity differences across conditions (Wan, Wang, & Crowell, 2013), we controlled the outbound path length (the shortest distance from the starting location to the first object, plus the shortest distance from the first to the second objects), outbound path turning angle (the shortest turning angle from the starting location to the first object, plus the shortest turning angle from the first to the second objects) and the correct pointing angle (the shortest angle from the second to the third object) across conditions.^{Footnote 2} These metrics are presented in Table 2.

Table 2 The means and standard deviations (in parentheses) of the outbound path length, outbound path turning angle, and correct pointing angle across conditions by experiments
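The two outbound path metrics can be computed as in the sketch below. It rests on assumptions not stated in the text: positions are (x, y) coordinates in meters, headings are measured counterclockwise from the +x axis, and the first turn runs from the initial heading to the direction of the first object. All function names are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def bearing_deg(src: Point, dst: Point) -> float:
    """Direction of travel from src to dst, in degrees counterclockwise from +x."""
    return math.degrees(math.atan2(dst[1] - src[1], dst[0] - src[0]))


def shortest_turn_deg(from_heading: float, to_heading: float) -> float:
    """Smallest rotation (0 to 180 degrees) that takes one heading to the other."""
    diff = (to_heading - from_heading) % 360.0
    return min(diff, 360.0 - diff)


def outbound_metrics(start: Point, initial_heading: float,
                     first: Point, second: Point) -> Tuple[float, float]:
    """Return (outbound path length, outbound path turning angle)."""
    length = math.dist(start, first) + math.dist(first, second)
    heading_leg1 = bearing_deg(start, first)
    heading_leg2 = bearing_deg(first, second)
    # Assumption: the first turn is from the initial heading to the first leg.
    turning = (shortest_turn_deg(initial_heading, heading_leg1)
               + shortest_turn_deg(heading_leg1, heading_leg2))
    return length, turning
```

As a made-up example, a participant who starts at (0, 0) facing 90°, walks to (0, 2), and then walks to (2, 2) has an outbound path length of 4 m and a total turning angle of 90°.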

Procedure

Learning phase

The layout of eight objects was displayed on a computer monitor (Fig. 2b), and the experimenter named each of the objects for the participant. After all of the objects had been named, participants were instructed to study the layout for 2 minutes. During learning, participants were told not to move from the study location. After learning, the objects and pillars were hidden, and a randomly chosen pillar then reappeared without its object. Participants named the object that belonged on that pillar. This learning sequence was repeated until the participant had successfully named all the objects twice.

Test phase

After learning the layout, participants performed the test trials at the same computer using the keyboard and a joystick. Participants started at the location corresponding to the trial condition (I, L, IL, or M). All objects and pillars were hidden, but the room walls and the floor were present at the beginning so that participants could use the wall textures to identify their orientation in the virtual environment (see Fig. 2a). Participants could not change their orientation or position before they pulled the trigger on the joystick. After participants pulled the trigger, the room walls were removed, and one of the learned objects and the pillar beneath it appeared. Participants used the arrow keys on the keyboard to navigate to that object. Participants were instructed to first rotate the viewing perspective to face the object, and then use the forward key to reach the object. The object disappeared upon arrival, and the second object appeared. Participants were instructed to release the forward key upon arrival and use the left or right key to look for the second object. Participants reached the second object in the same way. Upon arrival at the second object, everything disappeared, and a text message appeared at the center of the screen, displaying the name of the third object to point to (e.g., “Please point to the shoe”).

When participants saw the text message, they were told to imagine the environment from their final location (i.e., standing at the position and facing the direction they had occupied in the virtual environment just before the screen was blanked), and to use the joystick to point to the third object from that perspective. A pointing response was chosen over a navigation response because the final heading was a key manipulation and we wanted to ensure that participants adopted and maintained their final heading during the response. In addition, participants were told not to rotate their bodies during the test phase. Once the joystick was deflected vertically or horizontally by more than 1 cm, the response was recorded, and participants were teleported to the position and orientation corresponding to the experimental condition of the next trial.
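The text does not spell out how a joystick deflection was converted into a pointing direction or how pointing error was scored. A common approach, sketched here under that caveat, is to take the angle of the deflection vector relative to straight ahead and to score error as the absolute angular difference from the correct direction. The axis convention and function names are assumptions.

```python
import math


def joystick_direction_deg(x: float, y: float) -> float:
    """Pointing direction from a deflection (x = rightward, y = forward).

    0 degrees is straight ahead; positive angles are to the right.
    """
    return math.degrees(math.atan2(x, y))


def absolute_pointing_error(pointed_deg: float, correct_deg: float) -> float:
    """Absolute angular error in degrees, wrapped to the range 0 to 180."""
    diff = (pointed_deg - correct_deg) % 360.0
    return min(diff, 360.0 - diff)
```

Wrapping the difference keeps a response of 350° to a target at 10° from being scored as a 340° error; under this definition the error is 20°.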

Before the test trials, participants performed three practice trials that were identical to the test trials, except that the objects in practice trials were randomly selected from the remembered layout.

Results and discussion

Previous research suggested that gender differences may exist in path integration (Kelly, McNamara, Bodenheimer, Carr, & Rieser, 2009), so we included gender in the following analysis. Pointing error and latency were analyzed in 2 (gender) × 2 (alignment between the learning and imagined headings, referred to as learning-imagined) × 2 (alignment between the initial and imagined headings, referred to as initial-imagined) mixed ANOVAs (see Fig. 3), with gender as the between-subjects factor and learning-imagined and initial-imagined as within-subjects factors. For pointing error (see Fig. 3a), the main effect of gender was not significant, F(1, 22) = 3.92, MSE = 970.46, p = .06, η² = .15, but the main effects of learning-imagined and initial-imagined were significant, F(1, 22) = 9.44, MSE = 444.71, p = .006, η² = .30; F(1, 22) = 23.79, MSE = 60.62, p < .001, η² = .52. In addition, all of the two-way interactions were significant: Learning-Imagined × Initial-Imagined, F(1, 22) = 9.83, MSE = 69.12, p = .005, η² = .30; Learning-Imagined × Gender, F(1, 22) = 5.90, MSE = 444.71, p = .024, η² = .21; Initial-Imagined × Gender, F(1, 22) = 6.61, MSE = 60.27, p = .017, η² = .23. The significance levels of the following t tests were Bonferroni adjusted (the p value must be less than or equal to .025 to be deemed significant).
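The reported effect sizes can be checked against the F ratios. The sketch below assumes the η² values are partial eta squared, which the text does not state explicitly, and that the .025 threshold is a .05 alpha divided across two comparisons. The function names are hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold under a Bonferroni adjustment."""
    return alpha / n_comparisons
```

For the main effect of initial-imagined, F(1, 22) = 23.79, the first function gives 23.79 / (23.79 + 22) ≈ .52, which matches the reported value.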

Collapsing across gender, pairwise comparisons showed that pointing error was higher in the M condition than in the I and L conditions, ts(23) > 3.52, ps < .002, suggesting that participants used both the learning and the initial headings to establish reference directions in the current task. The IL condition did not differ from the I or the L condition, t(23) = 1.32, p = .20; t(23) = 1.92, p = .07, respectively. The significant interaction between learning-imagined and initial-imagined and the pattern of pointing error suggested that when the imagined heading was aligned with the learning heading, the alignment between the imagined heading and the initial heading did not play a role. On the other hand, when the imagined heading was misaligned with the learning heading, the alignment between the imagined heading and the initial heading affected the pointing error significantly. This interaction might be due to a floor effect such that the lowest average pointing error that could be achieved with the current pointing device is the pointing error of the L condition (~30°), and therefore the performance in the IL condition could not be better than the L condition. We discounted this possibility because two previous studies from our lab (He & McNamara, 2017; He et al., 2017) showed that the average pointing error could be as low as 20° with the current pointing device. Combined with the comparable performance among the I, L, and IL conditions, we concluded that participants used only one reference direction when both reference directions (defined by the initial and learning headings) could be utilized simultaneously.

Within gender, when the imagined heading was misaligned with the initial heading, women’s learning heading effect (difference between L and M conditions: Diff = -31.89, SE = 9.15), t(11) = 3.33, p = .007, was larger than men’s (Diff = -4.75, SE = 4.30), t(11) = 1.05, p = .32. When the imagined heading was misaligned with the learning heading, women’s initial heading effect (difference between I and M conditions: Diff = -20.05, SE = 3.04), t(11) = 6.31, p < .001, was larger than men’s (Diff = -5.64, SE = 4.31), t(11) = 1.25, p = .23. When the imagined heading was aligned with both headings (IL condition), performance was not significantly different from the I or L condition for women, ts(11) < 2.35, ps > .04, or men, ts(11) < 0.08, ps > .94.

For pointing latency (see Fig. 3b), only the main effect of learning-imagined was significant, F(1, 22) = 8.04, MSE = 1.28, p = .01, η² = .26, suggesting that participants responded faster when the imagined heading was aligned with the learning heading.

In sum, the results from Experiment 1 showed that during spatial updating without idiothetic cues, participants used both the learning heading and the initial heading to establish reference directions but could not use them simultaneously. Within gender, we found that women relied on these two headings to establish reference directions, but this effect was not significant for men. Another interpretation of this gender difference is that men did rely on these two headings to establish reference directions, but when the imagined heading was not aligned with these headings (M condition), men were able to mentally rotate the layout of objects from the learning heading efficiently. We discuss the gender difference in more detail in the General Discussion.

Experiment 2

We assumed that the allocentric reference system established at the initial heading was transient, whereas the allocentric reference system established at the learning heading was enduring. To test this hypothesis, we used the same paradigm as in Experiment 1, but when participants reached the second object, they were required to imagine that they were facing 90° to the left or right of their final heading in the virtual environment and to point to the target object relative to this imagined heading. This mental rotation could interrupt spatial updating and encourage people to switch to an enduring representation (Waller & Hodgson, 2006). If the reference system established at the initial heading is a transient representation and the one established at the learning heading is an enduring representation, the significant difference between the L and M conditions should remain, whereas the difference between the I and M conditions should decrease or become nonsignificant.