Playing shooter and driving videogames improves top-down guidance in visual search

Wu, Sijing; Spence, Ian

doi:10.3758/s13414-013-0440-2

Published: 05 March 2013

Volume 75, pages 673–686, (2013)

Attention, Perception, & Psychophysics


Abstract

Playing action videogames is known to improve visual spatial attention and related skills. Here, we showed that playing action videogames also improves classic visual search, as well as the ability to locate targets in a dual search that mimics certain aspects of an action videogame. In Experiment 1A, first-person shooter (FPS) videogame players were faster than nonplayers in both feature search and conjunction search, and in Experiment 1B, they were faster and more accurate in a peripheral search and identification task while simultaneously performing a central search. In Experiment 2, we showed that 10 h of play could improve the performance of nonplayers on each of these tasks. Three different genres of videogames were used for training: two action games and a 3-D puzzle game. Participants who played an action game (either an FPS or a driving game) achieved greater gains on all search tasks than did those who trained using the puzzle game. Feature searches were faster after playing an action videogame, suggesting that players developed a better target template to guide search in a top-down manner. The results of the dual search suggest that, in addition to enhancing the ability to divide attention, playing an action game improves the top-down guidance of attention to possible target locations. The results have practical implications for the development of training tools to improve perceptual and cognitive skills.

Playing a first-person shooter (FPS) videogame improves performance on tasks that require spatial attention (Feng, Spence, & Pratt, 2007; Green & Bavelier, 2003, 2006b, 2007; Spence, Yu, Feng, & Marshman, 2009) and also alters the event-related potential waveform in ways that generally index top-down modulation of spatial selective attention via the inhibition of distractors (Wu, Cheng, Feng, D’Angelo, Alain, & Spence, 2012). Moreover, practiced FPS players show less activation in the frontoparietal network, suggesting more efficient top-down allocation of attention and better filtering of distracting information (Bavelier, Achtman, Mani, & Föcker, 2011). FPS videogame players also possess enhanced task-switching skills (Colzato, van Leeuwen, van den Wildenberg, & Hommel, 2010; Green, Sugarman, Medford, Klobusicky, & Bavelier, 2012; Strobach, Frensch, & Schubert, 2012), and Karle, Watter, and Shedden (2010) suggested that this is due to superior top-down selective attentional control. FPS players also do better when two or more tasks must be performed simultaneously (Chiappe, Conger, Liao, Caldwell, & Vu, 2013; Green & Bavelier, 2006a; Strobach et al., 2012). Indeed, the ability to deploy and guide attention plays a central role in most of the cognitive skills that have been shown to improve after playing FPS videogames (for reviews, see Green, Li, & Bavelier, 2010; Hubert-Wallander, Green, & Bavelier, 2011a; Spence & Feng, 2010).

Classic visual search

FPS videogames often require the player to search for a target against a distracting background, such as an enemy sniper hiding behind bushes or the rubble of a building; this has much in common with classic visual search (Treisman & Gelade, 1980). Notably, FPS players are quicker in both easy and difficult conjunction visual search (Castel, Pratt, & Drummond, 2005), with players spending less time per item (Hubert-Wallander, Green, Sugarman, & Bavelier, 2011b), consistent with increased efficiency in visual selective attention. However, it is not known how players perform in feature search, the so-called “pop-out” search (Treisman & Gelade, 1980).

During search, attention is required in order to select a target while filtering out distractors. According to the Guided Search model (Wolfe, 1994, 2007), top-down and bottom-up forms of information are used to construct an activation map that indicates how likely each element is to be the target (parallel stage). Attention is then guided to the item with the highest activation. Since dynamic noise or interference is present in any neural process, some distractors might be identified as being more promising than the target. If the first item is not the target, attention is guided to the next item with the highest activation (serial stage; Cave & Wolfe, 1990; Wolfe, 1994). The efficiency of selection may be measured by the slope of the function relating the time to find the target versus the number of distractors. In feature search, the target differs from the distractors in a single feature (e.g., searching for a blue bar among red bars), and the search time is relatively unaffected by the number of distractors. This highly efficient search (indicated by a flat search function) is thought to be the result of efficient guidance of attention to the target, since the bottom-up component of the feature activation at the target location is strong enough that the target will likely have the highest activation. In conjunction search, in which the target differs from other items in two or more features, search functions with positive slopes are usually found, with reaction times (RTs) increasing with the number of distractors. In a more complex display, the target may not possess the highest activation, and with more distractors the probability that the highest activation will belong to the target decreases, resulting in a positive search slope (Cave & Wolfe, 1990; Wolfe, 1994).
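The two-stage account above can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the Guided Search model itself: the function name, the signal strengths, and the Gaussian noise level are all hypothetical choices made for illustration. Each item receives a noisy activation, items are attended in descending order of activation, and the number of items attended before the target is reached is counted.

```python
import random


def mean_items_inspected(set_size, target_signal, noise_sd=1.0,
                         trials=2000, seed=0):
    """Average number of items attended before the target is reached.

    Assumptions (illustrative only): every distractor has a mean
    activation of 0, the target has a mean activation of
    `target_signal`, and independent Gaussian noise is added to each.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        target = target_signal + rng.gauss(0.0, noise_sd)
        distractors = [rng.gauss(0.0, noise_sd) for _ in range(set_size - 1)]
        # The target's rank is 1 plus the number of distractors whose
        # noisy activation exceeds it; those are attended first.
        total += 1 + sum(d > target for d in distractors)
    return total / trials


for n in (9, 16, 25):
    strong = mean_items_inspected(n, target_signal=6.0)  # feature-like
    weak = mean_items_inspected(n, target_signal=1.0)    # conjunction-like
    print(f"set size {n:2d}: strong signal {strong:.2f}, weak signal {weak:.2f}")
```

With a strong target signal the target is almost always attended first regardless of set size (a flat function), whereas with a weak signal the number of items attended grows with set size (a positive slope).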

Practice can improve feature search, but only after a very large number of trials (Schoups & Orban, 1996), and the improvement does not generally transfer to other dimensions, such as stimulus orientation, size, or location (Ahissar & Hochstein, 1993, 1996). On the other hand, improvements in conjunction search can be achieved in a few hundred trials, and the learning is far less specific than in feature search (Sireteanu & Rettenbach, 1995, 2000). FPS players perform better in conjunction visual search than do nonplayers (Castel et al., 2005), with faster search rates and flatter slopes (Hubert-Wallander et al., 2011b). However, we do not know whether players’ faster search rates are due to a superiority in the parallel stage, such as better attentional guidance, or faster processing in the serial stage, via enhanced item-processing speed, faster reallocation of attention to new items, or better inhibition of previously searched items (Hubert-Wallander et al., 2011b). Furthermore, it is not known whether FPS players also excel in feature search, in which the search is efficient and the slope is flat, even for nonplayers. Indeed, pure feature search is not encountered in FPS videogames, and although players sometimes search for a salient target, such as a first-aid kit or a weapon, and the target usually attracts the player’s attention immediately, this is not feature search; it is more akin to conjunction search.

If training nonplayers, by having them play an FPS videogame, were to result in faster speeds in feature search, an explanation that did not rely solely on an improvement in search rate would be needed, since feature search is already efficient, with a search slope close to zero. If feature search were quicker after playing an FPS videogame, presumably this would be because the videogame had exercised and enhanced some high-level cognitive mechanism that is useful in feature search, since pure feature search is not part of the typical FPS videogame. Improvement would be the result of enhancement of a more general capacity (Green & Bavelier, 2012), such as learning a target template (Green et al., 2010a), for better top-down guidance in feature search (Wolfe, Butcher, Lee, & Hyle, 2003).

Dual search

Action videogames—especially first-person shooters—often require the player to perform more than one task simultaneously. Players navigate the environment and search for hostages or materiel, while simultaneously searching for threats that suddenly appear in the periphery. Simultaneous multiple visual searches are often required. Players possess superior task-switching skills (Cain, Landau, & Shimamura, 2012; Colzato et al., 2010; Strobach et al., 2012), and they also do better when performing two or more tasks at the same time (Chiappe et al., 2013; Green & Bavelier, 2006a; Strobach et al., 2012), possibly because of enhanced attentional capacity (Karle et al., 2010) or improved executive functioning (Cain et al., 2012). However, some evidence has also indicated that dual-task costs do not differ between players and nonplayers when performing two tasks simultaneously (Donohue, James, Eslick, & Mitroff, 2012). Thus, it is still too early to draw the conclusion that the ability to share or divide attention during multitasking benefits from playing FPS videogames.

Genres of games

Most previous videogame training studies have focused on FPS games, which apparently exercise the cognitive skills found to improve after playing these games (Achtman, Green, & Bavelier, 2008). It is not known whether other types of action games might produce similar training effects. Driving and racing games call on many of the same kinds of perceptual and cognitive skills as FPS games do, and thus might also improve these capacities. Three types of videogames were used in Experiment 2: an FPS game, a driving-racing game, and a nonaction, control game.

Experiment 1A: Classic visual search

In classic visual search (Treisman & Gelade, 1980), participants see an array of bars and report whether or not a target bar is present. The target and distractor bars may differ in color or orientation only (feature search), or they may differ in both color and orientation (conjunction search). During feature search, the target usually has the highest activation in the simple search array and “pops out,” with attention being efficiently guided to the target. On the other hand, during conjunction search, the target may not be the item with the highest activation. According to the Guided Search model (Wolfe, 1994, 2007), sequential examination of items in order of their attentional priority in the activation map occurs, and the average search time increases with more items in the display. If FPS videogame experience only benefits item-processing speed in the serial stage, players should be better than nonplayers only in conjunction search, since feature search is already efficient, regardless of the number of distractors.

Method

Participants

The participants were undergraduates at the University of Toronto and received either course credit or $10/h compensation. On the basis of a questionnaire given before the experiment, 36 male participants with normal or corrected-to-normal vision were classified as FPS players (n = 19; mean age = 21.4 years, range = 19–23) or nonplayers (n = 17; mean age = 21.7 years, range = 17–25). Only males were tested because of the relative scarcity of females with sufficient FPS experience (a minimum of 4 h per week of FPS playing during the previous six months). The qualifying games included titles such as Call of Duty, Counter Strike, Halo, Half-Life, Medal of Honor, and Rainbow Six. Nonplayers reported no FPS play of any kind in the preceding 3 years. The participants were divided at random into six groups, and each group was randomly assigned to one of the six possible orders of the three tasks.

Stimuli and procedure

Participants were seated at a 20-in. CRT monitor in a dimly lit room. They viewed the display (colored stimuli on a white background) binocularly with the head positioned in a chinrest 25 cm from the screen. Each trial began with a black fixation cross (1.6° × 1.6°) in the center of the screen.

The array contained 9, 16, or 25 items. The items were blue or red bars measuring 7° × 2° in a vertical or horizontal orientation. The density and mean distance from fixation were equated for all arrays. The visual angles subtended by the arrays were 53° × 47°, 42° × 37°, and 29° × 27° for the 5 × 5, 4 × 4, and 3 × 3 arrays, respectively. Half of the trials contained a randomly located target, and the remaining trials did not. Each participant performed two varieties of feature search (color or orientation) and one conjunction search (see Fig. 1).
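For readers who want to relate the visual angles above to on-screen sizes, the standard relation is extent = 2d·tan(θ/2), where d is the viewing distance. The following is a minimal sketch; the function names are hypothetical, and it assumes a flat screen with the stimulus centered on the line of sight, which is only an approximation for large arrays.

```python
import math


def extent_cm(angle_deg, distance_cm):
    """Physical extent (cm) that subtends `angle_deg` at `distance_cm`."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def visual_angle_deg(extent, distance_cm):
    """Visual angle (degrees) subtended by `extent` cm at `distance_cm`."""
    return math.degrees(2.0 * math.atan(extent / (2.0 * distance_cm)))


VIEWING_DISTANCE_CM = 25.0  # chinrest distance reported in the text

stimuli = {
    "fixation cross": (1.6, 1.6),
    "bar": (7.0, 2.0),
    "3 x 3 array": (29.0, 27.0),
    "4 x 4 array": (42.0, 37.0),
    "5 x 5 array": (53.0, 47.0),
}
for name, (w_deg, h_deg) in stimuli.items():
    w = extent_cm(w_deg, VIEWING_DISTANCE_CM)
    h = extent_cm(h_deg, VIEWING_DISTANCE_CM)
    print(f"{name}: {w:.1f} cm x {h:.1f} cm")
```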

In the color condition, participants searched for a blue bar, and in the orientation condition, they searched for a horizontal bar. In the conjunction condition, participants searched for a vertical blue bar. On each trial, the central fixation cross appeared for 500 ms, followed by the search array. Participants responded as quickly as possible while minimizing errors, by pressing either the “1” key at the top of the keyboard, if the target was detected, or the “9” key, if it was not detected. The trial ended after a response, or after 6,000 ms if no response had been registered. Twenty practice trials were also presented.

The three search tasks were conducted in three separate blocks, and all six possible orders were presented. Each block consisted of 100 trials, with a target being present in a randomly selected 50 of those trials, and absent in the remainder. The position of the target in the array was random. Participants could take a break of up to 1 min between blocks, and they pressed a key to continue.
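The block structure described above can be sketched as a trial-list generator. This is an illustrative reconstruction, not the authors' experiment code: the names are hypothetical, and the text does not say how set sizes were distributed within a block, so drawing the set size uniformly at random on each trial is an assumption.

```python
import itertools
import random

SET_SIZES = (9, 16, 25)
TASKS = ("color", "orientation", "conjunction")

# All six possible orders of the three search blocks.
ORDERS = list(itertools.permutations(TASKS))


def make_block(task, n_trials=100, seed=None):
    """Build one block: half target-present, half target-absent trials."""
    rng = random.Random(seed)
    n_present = n_trials // 2
    present = [True] * n_present + [False] * (n_trials - n_present)
    rng.shuffle(present)
    trials = []
    for has_target in present:
        # Assumption: set size is sampled uniformly per trial.
        set_size = rng.choice(SET_SIZES)
        # Target position is random within the array (index into the grid).
        position = rng.randrange(set_size) if has_target else None
        trials.append({
            "task": task,
            "set_size": set_size,
            "target_present": has_target,
            "target_position": position,
        })
    return trials


session = [make_block(task, seed=i) for i, task in enumerate(ORDERS[0])]
print(len(ORDERS), "orders;", sum(len(b) for b in session), "trials per session")
```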

Results

The factors in the 2 × 3 × 3 design were FPS Experience (players or nonplayers), Search Task (color, orientation, and conjunction), and Set Size (9, 16, and 25). The first factor (Experience) was manipulated between participants, and the remaining factors (Search Task and Set Size) were within participants. Since the variance of an RT increases with the mean, and since the variance of a proportion becomes smaller as the proportion approaches either 0 or 1, variance-stabilizing transformations were routinely employed: Speed of responding (1,000/RT) was calculated, and accuracy (proportion correct, p) was transformed to $2\sin^{-1}\sqrt{p}$ before performing the analysis of variance (Kirk, 1982, pp. 105–106; see Footnote 1). We analyzed both target-present and target-absent trials. Although participants were slower in target-absent trials, we did not observe any other difference between the two types of trials, especially for the effects related to FPS experience. Thus, we will present the analyses of target-present trials only.
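The two variance-stabilizing transformations named above are simple to state in code. The following is a minimal sketch with hypothetical function names; it shows only the transformations themselves, not the analysis of variance, and the example inputs are the group means reported in the Results rather than raw data.

```python
import math


def speed(rt_ms):
    """Speed of responding: 1,000 / RT, with RT in milliseconds."""
    if rt_ms <= 0:
        raise ValueError("RT must be positive")
    return 1000.0 / rt_ms


def arcsine_transform(p):
    """Arcsine square-root transform of a proportion: 2 * asin(sqrt(p))."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return 2.0 * math.asin(math.sqrt(p))


# Illustrative inputs taken from the reported group means.
for rt in (560.0, 625.0):
    print(f"RT {rt:.0f} ms -> speed {speed(rt):.3f}")
for p in (0.90, 0.94, 0.96):
    print(f"p = {p:.2f} -> {arcsine_transform(p):.3f} rad")
```

Note that the transforms would be applied to each observation (or each participant's cell value) before averaging, because the mean of transformed values generally differs from the transform of the mean.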

Accuracy

Players (94%) did not differ from nonplayers (94%), F(1, 34) < 1, n.s. The accuracies of the three types of visual searches did differ, F(2, 68) = 15.1, p < .001, in that feature searches (color, 96%; orientation, 96%) were more accurate than conjunction search (90%), F(1, 68) = 30.3, p < .001 (contrast). The overall accuracy was higher with fewer distractors, F(2, 68) = 18.1, p < .001, but this interacted with search task, F(4, 136) = 8.2, p < .001: Accuracy was higher with fewer distractors in conjunction search [96%, 89%, and 87%, respectively, for the 9-, 16-, and 25-item arrays; simple main effect, F(2, 68) = 27.0, p < .001], but not in the feature searches (for color, 96%, 96%, and 97%, respectively; for orientation, 95%, 97%, and 95%). No speed–accuracy trade-off was observed in any cell of the experimental design.

Speed

Players (560 ms) were faster than nonplayers (625 ms), F(1, 34) = 6.5, p < .05, and the advantage was about the same in the three search conditions, F(2, 68) < 1, n.s. (see Fig. 2). The three types of search differed, F(2, 68) = 304.9, p < .001, with the feature searches (color, 445 ms; orientation, 516 ms) being faster than conjunction search (816 ms), F(1, 68) = 558.6, p < .001 (contrast). Overall, participants were faster with fewer distractors, F(2, 68) = 95.5, p < .001, but the effect interacted with search task, F(4, 136) = 21.9, p < .001: Responses were faster with fewer distractors in conjunction search [686, 785, and 969 ms, respectively, for the 9-, 16-, and 25-item arrays; simple main effect, F(2, 68) = 120.9, p < .001], but not in the feature searches (for color, 437, 440, and 455 ms, respectively; for orientation, 508, 511, and 525 ms). Players (15.2 ms/item) had shallower search slopes than did nonplayers (20.4 ms/item) in conjunction search, F(1, 34) = 4.3, p < .05; the slopes did not differ between players and nonplayers in the feature searches.
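A search slope is the least-squares slope of RT regressed on set size. As a rough illustration, the sketch below fits a line to the three pooled cell means reported above for each task. The function name is hypothetical, and this is not the authors' analysis: the reported group slopes were presumably estimated per participant, so a fit to pooled means is only an approximation.

```python
def search_slope(set_sizes, rts_ms):
    """Ordinary least-squares slope (ms/item) and intercept (ms)."""
    n = len(set_sizes)
    mean_x = sum(set_sizes) / n
    mean_y = sum(rts_ms) / n
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(set_sizes, rts_ms))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


SET_SIZES = (9, 16, 25)
POOLED_MEANS_MS = {  # cell means pooled over players and nonplayers
    "color": (437, 440, 455),
    "orientation": (508, 511, 525),
    "conjunction": (686, 785, 969),
}
for task, rts in POOLED_MEANS_MS.items():
    slope, intercept = search_slope(SET_SIZES, rts)
    print(f"{task}: slope {slope:.1f} ms/item, intercept {intercept:.0f} ms")
```

On these pooled means the conjunction slope comes out at roughly 17.8 ms/item, between the two reported group slopes, while both feature-search slopes are close to 1 ms/item.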

Discussion

With more distractors, accuracy and speed dropped more in conjunction search than in feature searches. This suggests that the feature searches were guided more efficiently by bottom-up information than was conjunction search (Cave & Wolfe, 1990; Wolfe, 1994). Players and nonplayers did not differ in overall accuracy; however, the players were quicker in conjunction search (cf. Castel et al., 2005), with a faster search rate (cf. Hubert-Wallander et al., 2011b). Interestingly, the players were also quicker in feature search, which is already efficient (flat search slopes), even among nonplayers. Indeed, the feature search slopes did not differ between players and nonplayers, and so the difference in intercepts cannot be attributed to a faster search rate per item in the serial stage of item processing (Hubert-Wallander et al., 2011b). However, in general, feature search is influenced by top-down guidance: Knowing the feature in advance results in better search performance (Soto, Humphreys, & Heinke, 2006; Wolfe et al., 2003). Kristjánsson, Wang, and Nakayama (2002) and Wolfe et al. (2003) have interpreted decreases in search intercepts as being evidence of top-down guidance. The FPS players may be capable of more efficient guidance in the parallel stage of search, possibly as a result of superior top-down executive control (Chisholm, Hickey, Theeuwes, & Kingstone, 2010; Chisholm & Kingstone, 2012), such as filtering unwanted distractor information, or top-down guidance, with a better target template to prioritize targets (Chisholm & Kingstone, 2012; Leonard & Egeth, 2008).

Experiment 1B: Dual search

FPS players often detect and identify threats in the periphery, while being simultaneously engaged in central search tasks. In our laboratory analogue, participants performed two simultaneous search tasks that required discrimination and identification in central and peripheral areas of the visual field (VanRullen, Reddy, & Koch, 2004). For the central search, participants were required to say whether five randomly rotated letters were identical. Simultaneously, the participants had to locate and identify a briefly presented target that could appear anywhere on a circular locus in the periphery. When performing both searches simultaneously, identifying a letter (L or T) is more difficult than identifying a bar (horizontal or vertical) in the periphery, since more attentional resources are required to identify a letter under divided processing (VanRullen et al., 2004). If playing an FPS videogame enhances the ability to identify a more difficult peripheral target after it is located (presumably by allocating more attentional resources), players should enjoy an advantage over nonplayers when the peripheral stimulus is a letter. If, on the other hand, the players’ advantage did not differ by stimulus type, the difference might be due to the players’ superior performance in locating (as distinct from identifying) the target. Participants also performed the peripheral task alone, in order to provide baseline performance data (see Footnote 2). Accuracy and RTs were recorded.