Playing a first-person shooter (FPS) videogame improves performance on tasks that require spatial attention (Feng, Spence, & Pratt, 2007; Green & Bavelier, 2003, 2006b, 2007; Spence, Yu, Feng, & Marshman, 2009) and also alters the event-related potential waveform in ways that generally index top-down modulation of spatial selective attention via the inhibition of distractors (Wu, Cheng, Feng, D’Angelo, Alain, & Spence, 2012). Moreover, practiced FPS players show less activation in the frontoparietal network, suggesting more efficient top-down allocation of attention and better filtering of distracting information (Bavelier, Achtman, Mani, & Föcker, 2011). FPS videogame players also possess enhanced task-switching skills (Colzato, van Leeuwen, van den Wildenberg, & Hommel, 2010; Green, Sugarman, Medford, Klobusicky, & Bavelier, 2012; Strobach, Frensch, & Schubert, 2012), and Karle, Watter, and Shedden (2010) suggested that this is due to superior top-down selective attentional control. FPS players also do better when two or more tasks must be performed simultaneously (Chiappe, Conger, Liao, Caldwell, & Vu, 2013; Green & Bavelier, 2006a; Strobach et al., 2012). Indeed, the ability to deploy and guide attention plays a central role in most of the cognitive skills that have been shown to improve after playing FPS videogames (for reviews, see Green, Li, & Bavelier, 2010; Hubert-Wallander, Green, & Bavelier, 2011; Spence & Feng, 2010).

Classic visual search

FPS videogames often require the player to search for a target against a distracting background, such as an enemy sniper hiding behind bushes or the rubble of a building; this has much in common with classic visual search (Treisman & Gelade, 1980). Notably, FPS players are quicker in both easy and difficult conjunction visual search (Castel, Pratt, & Drummond, 2005), with players spending less time per item (Hubert-Wallander, Green, Sugarman, & Bavelier, 2011), consistent with increased efficiency in visual selective attention. However, it is not known how players perform in feature search—the so-called “pop-out” search (Treisman & Gelade, 1980).

During search, attention is required in order to select a target while filtering out distractors. According to the Guided Search model (Wolfe, 1994, 2007), top-down and bottom-up forms of information are used to construct an activation map that indicates how likely each element is to be the target (parallel stage). Attention is then guided to the item with the highest activation. Since dynamic noise or interference is present in any neural process, some distractors might be identified as being more promising than the target. If the first item is not the target, attention is guided to the next item with the highest activation (serial stage; Cave & Wolfe, 1990; Wolfe, 1994). The efficiency of selection may be measured by the slope of the function relating the time to find the target versus the number of distractors. In feature search, the target differs from the distractors in a single feature (e.g., searching for a blue bar among red bars), and the search time is relatively unaffected by the number of distractors. This highly efficient search (indicated by a flat search function) is thought to be the result of efficient guidance of attention to the target, since the bottom-up component of the feature activation at the target location is strong enough that the target will likely have the highest activation. In conjunction search, in which the target differs from other items in two or more features, search functions with positive slopes are usually found, with reaction times (RTs) increasing with the number of distractors. In a more complex display, the target may not possess the highest activation, and with more distractors the probability that the highest activation will belong to the target decreases, resulting in a positive search slope (Cave & Wolfe, 1990; Wolfe, 1994).
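The relation between noise, target activation, and set size described above can be illustrated with a small simulation. This is a minimal sketch, not an implementation of the Guided Search model: it assumes independent Gaussian noise on every item, a distractor signal of zero, and arbitrary target signal values (8.0 for a feature-like search, 2.0 for a conjunction-like search). The function name and all parameters are hypothetical.

```python
import random


def p_target_first(n_distractors, target_signal, noise_sd=1.0, trials=5000, seed=0):
    """Estimate how often the target holds the highest activation.

    Each activation is a signal plus Gaussian noise. Distractors have a
    signal of 0 and the target has `target_signal` (arbitrary units,
    chosen for illustration only).
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        target = target_signal + rng.gauss(0.0, noise_sd)
        best_distractor = max(rng.gauss(0.0, noise_sd) for _ in range(n_distractors))
        if target > best_distractor:
            wins += 1
    return wins / trials


# Compare a strong (feature-like) and a weak (conjunction-like) target signal
# at the three set sizes used in Experiment 1A.
for label, signal in (("feature-like", 8.0), ("conjunction-like", 2.0)):
    for set_size in (9, 16, 25):
        p = p_target_first(set_size - 1, signal)
        print(f"{label:17s} set size {set_size:2d}: P(target first) = {p:.3f}")
```

Under these assumptions the strong signal almost always wins regardless of set size, whereas the weak signal wins less often as distractors are added, which is the pattern that yields flat and positive search slopes, respectively.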

Practice can improve feature search, but only after a very large number of trials (Schoups & Orban, 1996), and the improvement does not generally transfer to other dimensions, such as stimulus orientation, size, or location (Ahissar & Hochstein, 1993, 1996). On the other hand, improvements in conjunction search can be achieved in a few hundred trials, and the learning is far less specific than in feature search (Sireteanu & Rettenbach, 1995, 2000). FPS players perform better in conjunction visual search than do nonplayers (Castel et al., 2005), with faster search rates and flatter slopes (Hubert-Wallander et al., 2011b). However, we do not know whether players’ faster search rates are due to a superiority in the parallel stage, such as better attentional guidance, or faster processing in the serial stage, via enhanced item-processing speed, faster reallocation of attention to new items, or better inhibition of previously searched items (Hubert-Wallander et al., 2011b). Furthermore, it is not known whether FPS players also excel in feature search, in which the search is efficient and the slope is flat, even for nonplayers. Indeed, pure feature search is not encountered in FPS videogames, and although players sometimes search for a salient target, such as a first-aid kit or a weapon, and the target usually attracts the player’s attention immediately, this is not feature search; it is more akin to conjunction search.

If training nonplayers, by having them play an FPS videogame, were to result in faster speeds in feature search, an explanation that did not rely solely on an improvement in search rate would be needed, since feature search is already efficient, with a search slope close to zero. If feature search were quicker after playing an FPS videogame, presumably this would be because the videogame had exercised and enhanced some high-level cognitive mechanism that is useful in feature search, since pure feature search is not part of the typical FPS videogame. Improvement would be the result of enhancement of a more general capacity (Green & Bavelier, 2012), such as learning a target template (Green et al., 2010a), for better top-down guidance in feature search (Wolfe, Butcher, Lee, & Hyle, 2003).

Dual search

Action videogames—especially first-person shooters—often require the player to perform more than one task simultaneously. Players navigate the environment and search for hostages or materiel, while simultaneously searching for threats that suddenly appear in the periphery. Simultaneous multiple visual searches are often required. Players possess superior task-switching skills (Cain, Landau, & Shimamura, 2012; Colzato et al., 2010; Strobach et al., 2012), and they also do better when performing two or more tasks at the same time (Chiappe et al., 2013; Green & Bavelier, 2006a; Strobach et al., 2012), possibly because of enhanced attentional capacity (Karle et al., 2010) or improved executive functioning (Cain et al., 2012). However, some evidence has also indicated that dual-task costs do not differ between players and nonplayers when performing two tasks simultaneously (Donohue, James, Eslick, & Mitroff, 2012). Thus, it is still too early to draw the conclusion that the ability to share or divide attention during multitasking benefits from playing FPS videogames.

Genres of games

Most previous videogame training studies have focused on FPS games, which apparently exercise the cognitive skills found to improve after playing these games (Achtman, Green, & Bavelier, 2008). It is not known whether other types of action games might produce similar training effects. Driving and racing games call on many of the same kinds of perceptual and cognitive skills as FPS games do, and thus might also improve these capacities. Three types of videogames were used in Experiment 2: an FPS game, a driving-racing game, and a nonaction, control game.

Experiment 1A: Classic visual search

In classic visual search (Treisman & Gelade, 1980), participants see an array of bars and report whether or not a target bar is present. The target and distractor bars may differ in color or orientation only (feature search), or they may differ in both color and orientation (conjunction search). During feature search, the target usually has the highest activation in the simple search array and “pops out,” with attention being efficiently guided to the target. On the other hand, during conjunction search, the target may not be the item with the highest activation. According to the Guided Search model (Wolfe, 1994, 2007), sequential examination of items in order of their attentional priority in the activation map occurs, and the average search time increases with more items in the display. If FPS videogame experience only benefits item-processing speed in the serial stage, players should be better than nonplayers only in conjunction search, since feature search is already efficient, regardless of the number of distractors.

Method

Participants

The participants were undergraduates at the University of Toronto and received either course credit or $10/h compensation. On the basis of a questionnaire given before the experiment, 36 male participants with normal or corrected-to-normal vision were classified as 19 FPS players (mean age = 21.4 years, from 19 to 23) or 17 nonplayers (mean age = 21.7 years, from 17 to 25). Only males were tested because of the relative scarcity of females with sufficient FPS experience (a minimum of 4 h per week of FPS playing during the previous six months). The qualifying games included titles like Call of Duty, Counter Strike, Halo, Half-Life, Medal of Honor, and Rainbow Six. Nonplayers reported no FPS play of any kind in the preceding 3 years. The participants were divided at random into six groups that were assigned randomly to the six possible orders of the three tasks.

Stimuli and procedure

Participants were seated at a 20-in. CRT monitor in a dimly lit room. They viewed the display (colored stimuli on a white background) binocularly with the head positioned in a chinrest 25 cm from the screen. Each trial began with a black fixation cross (1.6º × 1.6º) in the center of the screen.

The array contained 9, 16, or 25 items. The items were blue or red bars measuring 7º × 2º in a vertical or horizontal orientation. The density and mean distance from fixation were equated for all arrays. The visual angles subtended by the arrays were 53º × 47º, 42º × 37º, and 29º × 27º for the 5 × 5, 4 × 4, and 3 × 3 arrays, respectively. Half of the trials contained a randomly located target, and the remaining trials did not. Each participant performed two varieties of feature search (color or orientation) and one conjunction search (see Fig. 1).

Fig. 1

Sample search array in Experiment 1A. A total of 25 bars, either blue or red in color, were oriented either vertically or horizontally (red is displayed as white and blue as black here). The task was to indicate whether a target was present. Three search conditions were presented, in which the target differed from the distractors in color, in orientation, or in both color and orientation

In the color condition, participants searched for a blue bar, and in the orientation condition, they searched for a horizontal bar. In the conjunction condition, participants searched for a vertical blue bar. On each trial, the central fixation cross appeared for 500 ms, followed by the search array. Participants responded as quickly as possible while minimizing errors, by pressing either the “1” key at the top of the keyboard, if the target was detected, or the “9” key, if it was not detected. The trial ended after a response, or after 6,000 ms if no response had been registered. Twenty practice trials were also presented.

The three search tasks were conducted in three separate blocks, and all six possible orders were presented. Each block consisted of 100 trials, with a target being present in a randomly selected 50 of those trials, and absent in the remainder. The position of the target in the array was random. Participants could take a break of up to 1 min between blocks, and they pressed a key to continue.
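The six task orders are the permutations of the three search tasks. The sketch below enumerates them and assigns participants at random; it assumes equal-sized order groups, which the text does not state, and the function name and participant identifiers are hypothetical.

```python
import itertools
import random

TASKS = ("color", "orientation", "conjunction")


def assign_orders(participant_ids, seed=None):
    """Shuffle participants, then deal them across the six task orders.

    Dealing in rotation gives equal group sizes when the number of
    participants is a multiple of six (an assumption of this sketch).
    """
    orders = list(itertools.permutations(TASKS))  # 3! = 6 orders
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    return {pid: orders[i % len(orders)] for i, pid in enumerate(ids)}


assignment = assign_orders(range(1, 37), seed=1)
for pid in sorted(assignment)[:3]:
    print(pid, assignment[pid])
```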

Results

The factors in the 2 × 3 × 3 design were FPS Experience (players or nonplayers), Search Task (color, orientation, and conjunction), and Set Size (9, 16, and 25). The first factor (Experience) was manipulated between participants, and the remaining factors (Search Task and Set Size) were within participants. Since the variance of an RT increases with the mean, and since the variance of a proportion becomes smaller as the proportion approaches either 0 or 1, variance-stabilizing transformations were routinely employed: Speed of responding (1,000/RT) was calculated, and accuracy (proportion correct, p) was transformed to \( 2\sin^{-1}\sqrt{p} \) before performing the analysis of variance (Kirk, 1982, pp. 105–106; see Footnote 1). We analyzed both target-present and target-absent trials. Although participants were slower in target-absent trials, we did not observe any other difference between the two types of trials, especially for the effects related to FPS experience. Thus, we will present the analyses of target-present trials only.
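The two variance-stabilizing transformations can be written out directly. This is a minimal sketch of the formulas given above; the function names are hypothetical, and RTs are assumed to be in milliseconds.

```python
import math


def speed_from_rt(rt_ms):
    """Reciprocal transform: 1,000/RT, i.e., responses per second."""
    return 1000.0 / rt_ms


def arcsine_accuracy(p):
    """Arcsine square-root transform of a proportion correct: 2 * asin(sqrt(p))."""
    return 2.0 * math.asin(math.sqrt(p))


# Illustrative values only, not data from the experiment.
for rt in (400, 800):
    print(f"RT = {rt} ms -> speed = {speed_from_rt(rt):.2f} per s")
for p in (0.50, 0.90, 0.99):
    print(f"p = {p:.2f} -> transformed accuracy = {arcsine_accuracy(p):.3f} rad")
```

The transformed accuracy ranges from 0 (p = 0) to π (p = 1), and equal steps in p near the ceiling map to larger steps on the transformed scale.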

Accuracy

Players (94%) did not differ from nonplayers (94%), F(1, 34) < 1, n.s. The accuracies of the three types of visual searches did differ, F(2, 68) = 15.1, p < .001, in that feature searches (color, 96%; orientation, 96%) were more accurate than conjunction search (90%), F(1, 68) = 30.3, p < .001 (contrast). The overall accuracy was higher with fewer distractors, F(2, 68) = 18.1, p < .001, but this interacted with search task, F(4, 136) = 8.2, p < .001: Accuracy was higher with fewer distractors in conjunction search [96%, 89%, and 87%, respectively, for the 9-, 16-, and 25-item arrays; simple main effect, F(2, 68) = 27.0, p < .001], but not in the feature searches (for color, 96%, 96%, and 97%, respectively; for orientation, 95%, 97%, and 95%). No speed–accuracy trade-off was observed in any cell of the experimental design.

Speed

Players (560 ms) were faster than nonplayers (625 ms), F(1, 34) = 6.5, p < .05, and the advantage was about the same in the three search conditions, F(2, 68) < 1, n.s. (see Fig. 2). The three types of search differed, F(2, 68) = 304.9, p < .001, with the feature searches (color, 445 ms; orientation, 516 ms) being faster than conjunction search (816 ms), F(1, 68) = 558.6, p < .001 (contrast). Overall, participants were faster with fewer distractors, F(2, 68) = 95.5, p < .001, but the effect interacted with search task, F(4, 136) = 21.9, p < .001: Responses were faster with fewer distractors in conjunction search [686, 785, and 969 ms, respectively, for the 9-, 16-, and 25-item arrays; simple main effect, F(2, 68) = 120.9, p < .001], but not in the feature searches (for color, 437, 440, and 455 ms, respectively; for orientation, 508, 511, and 525 ms). Players (15.2 ms/item) had shallower search slopes than did nonplayers (20.4 ms/item) in conjunction search, F(1, 34) = 4.3, p < .05; the slopes did not differ between players and nonplayers in the feature searches.
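A search slope is the least-squares slope of RT against set size. The sketch below fits a line to the mean RTs reported above, which are pooled over players and nonplayers; it is an illustration of the computation only, since the per-group slopes in the text were presumably estimated from individual participants' data. The function name is hypothetical.

```python
def fit_search_slope(set_sizes, mean_rts):
    """Return (slope in ms/item, intercept in ms) of an ordinary least-squares line."""
    n = len(set_sizes)
    mean_x = sum(set_sizes) / n
    mean_y = sum(mean_rts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(set_sizes, mean_rts))
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


SET_SIZES = (9, 16, 25)
# Mean RTs (ms) for target-present trials, pooled over groups, as reported above.
POOLED_RTS = {
    "color": (437, 440, 455),
    "orientation": (508, 511, 525),
    "conjunction": (686, 785, 969),
}

for task, rts in POOLED_RTS.items():
    slope, intercept = fit_search_slope(SET_SIZES, rts)
    print(f"{task:12s} slope = {slope:5.1f} ms/item, intercept = {intercept:6.1f} ms")
```

With these pooled means the conjunction slope comes out close to 18 ms/item, between the two reported group slopes, whereas the feature-search slopes are about 1 ms/item.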

Fig. 2

Reaction times (RTs) and average speeds (1,000/RT) in three search conditions for players and nonplayers in Experiment 1A for the target-present trials. Note the nonlinearity (reciprocal transformation) of the speed scale relative to the RT scale. Error bars represent ±1 SE. Players were faster than nonplayers in the two feature searches and the conjunction search. The search slopes (in ms/item) are shown for players and nonplayers

Discussion

With more distractors, accuracy and speed dropped more in conjunction search than in feature searches. This suggests that the feature searches were guided more efficiently by bottom-up information than was conjunction search (Cave & Wolfe, 1990; Wolfe, 1994). Players and nonplayers did not differ in overall accuracy; however, the players were quicker in conjunction search (cf. Castel et al., 2005), with a faster search rate (cf. Hubert-Wallander et al., 2011b). Interestingly, the players were also quicker in feature search, which is already efficient (flat search slopes), even among nonplayers. Indeed, the feature search slopes did not differ between players and nonplayers, and so the difference in intercepts cannot be attributed to a faster search rate per item in the serial stage of item processing (Hubert-Wallander et al., 2011b). However, in general, feature search is influenced by top-down guidance: Knowing the feature in advance results in better search performance (Soto, Humphreys, & Heinke, 2006; Wolfe et al., 2003). Kristjánsson, Wang, and Nakayama (2002) and Wolfe et al. (2003) have interpreted decreases in search intercepts as being evidence of top-down guidance. The FPS players may be capable of more efficient guidance in the parallel stage of search, possibly as a result of superior top-down executive control (Chisholm, Hickey, Theeuwes, & Kingstone, 2010; Chisholm & Kingstone, 2012), such as filtering unwanted distractor information, or top-down guidance, with a better target template to prioritize targets (Chisholm & Kingstone, 2012; Leonard & Egeth, 2008).

Experiment 1B: Dual search

FPS players often detect and identify threats in the periphery, while being simultaneously engaged in central search tasks. In our laboratory analogue, participants performed two simultaneous search tasks that required discrimination and identification in central and peripheral areas of the visual field (VanRullen, Reddy, & Koch, 2004). For the central search, participants were required to say whether five randomly rotated letters were identical. Simultaneously, the participants had to locate and identify a briefly presented target that could appear anywhere on a circular locus in the periphery. When performing both searches simultaneously, identifying a letter (L or T) is more difficult than identifying a bar (horizontal or vertical) in the periphery, since more attentional resources are required to identify a letter under divided processing (VanRullen et al., 2004). If playing an FPS videogame enhances the ability to identify a more difficult peripheral target after it is located (presumably by allocating more attentional resources), players should enjoy an advantage over nonplayers when the peripheral stimulus is a letter. If, on the other hand, the players’ advantage did not differ by stimulus type, the difference might be due to the players’ superior performance in locating (as distinct from identifying) the target. Participants also performed the peripheral task alone, in order to provide baseline performance data (see Footnote 2). Accuracy and RTs were recorded.

Method

Participants

The same 36 participants as in Experiment 1A completed Experiment 1B in the same experimental session. The order of the two experiments was counterbalanced over participants.

Stimulus display

The same apparatus was used as in Experiment 1A. Black stimuli were displayed on a white background.

Participants fixated a central cross (1.6º × 1.6º), which was replaced after 500 ms by five randomly rotated letters, each 3.9º × 3.9º, separately centered at randomly chosen positions within the circumference (12.5º eccentricity) of an invisible circle subtending 25º at the center; the nonoverlapping letters could be five Ls, five Ts, four Ls and one T, or four Ts and one L (equally distributed across trials).

After 100 ms, a stimulus appeared (for 30 ms) in the periphery at a random location centered on the circumference of an invisible circle subtending 53º (26.5º eccentricity; see Fig. 3). The peripheral stimulus was either a bar or a letter, depending on the task for that block. When the target was a letter, the stimulus was either an L or a T, with equal probabilities; both letters measured approximately 7.7º × 7.7º, and appeared with a randomly chosen rotation. When the target was a bar, it appeared with either a vertical or a horizontal orientation, with equal probabilities; the bars measured approximately 7.7º × 1.6º.

Fig. 3

Sample stimulus displays in Experiment 1B. The display consisted of five central letters displayed for 230 ms, and one stimulus flashed for 30 ms in the periphery, 100 ms after the onset of the central stimulus. The peripheral stimulus was either a bar (vertical or horizontal) or a letter (L or T) and was larger than the central stimuli in order to compensate for reduced resolution at the periphery. The central task was to indicate whether the five letters were the same. The peripheral task was to identify the stimulus (L or T, or horizontal or vertical). The outer circle is the locus of the randomly located peripheral stimulus, and the inner circle outlines the area within which the central group of letters could appear. Neither circle appeared in the actual stimulus display

The central letters remained for a further 100 ms after the peripheral stimulus had disappeared. Then, a response window with a verbal cue appeared. The trial ended after a response or after 6,000 ms, if no response had been registered. The next trial started 1,000 ms later.
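The stimulus timing within a trial follows from the durations given above. The sketch below lays the events on a common time axis, with time 0 taken as fixation onset (an assumption of this illustration); the class and constant names are hypothetical, and the response window is omitted because its exact onset is not specified beyond following the central letters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str
    onset_ms: int
    offset_ms: int

    @property
    def duration_ms(self):
        return self.offset_ms - self.onset_ms


FIXATION_MS = 500          # fixation cross shown before the central letters
PERIPHERAL_DELAY_MS = 100  # central onset to peripheral onset
PERIPHERAL_MS = 30         # peripheral stimulus duration
CENTRAL_TAIL_MS = 100      # central letters remain after peripheral offset


def trial_timeline():
    """Return the stimulus events of one trial, relative to fixation onset."""
    central_on = FIXATION_MS
    peripheral_on = central_on + PERIPHERAL_DELAY_MS
    peripheral_off = peripheral_on + PERIPHERAL_MS
    central_off = peripheral_off + CENTRAL_TAIL_MS
    return [
        Event("fixation cross", 0, FIXATION_MS),
        Event("central letters", central_on, central_off),
        Event("peripheral stimulus", peripheral_on, peripheral_off),
    ]


for event in trial_timeline():
    print(f"{event.name:20s} {event.onset_ms:4d}-{event.offset_ms:4d} ms "
          f"({event.duration_ms} ms)")
```

The central letters are thus visible for 100 + 30 + 100 = 230 ms, in agreement with the caption of Fig. 3.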

Procedure

In four separate blocks of trials, the participants completed four tasks. These included two single searches—(1) peripheral stimulus (horizontal or vertical bar?) and (2) peripheral stimulus (which letter?)—and two dual searches—(3) central search (same–different letters?)/peripheral stimulus (horizontal or vertical bar?), and (4) central search (same–different letters?)/peripheral stimulus (which letter?). The single-search blocks each consisted of 64 trials, and the dual-search blocks consisted of 128 trials, with the five central-task letters being the same on half of the trials and different on the remaining trials. The order of the individual trials was randomized, and the order of the four blocks was counterbalanced across participants. Before each block, participants read instructions regarding the task for that block. Participants took a break of up to 1 min between blocks, and pressed a key to continue.

Each of the tasks was carried out as follows.

Peripheral search alone

Participants were required to fixate the center of the screen and to locate and identify the peripheral stimulus while ignoring the stimuli at the center. Because the position of the peripheral stimulus was chosen randomly, participants could not improve their performance by fixating a location other than the center. They did, however, know that the target would appear somewhere on the circumference of an imaginary circle in the periphery. Participants responded as quickly as possible while minimizing errors, pressing one of two keys (“1” or “9”), depending on the identification of the target.

Dual search

Both searches were performed simultaneously. After the stimuli disappeared, a cue instructed participants to produce a response for either the central search or the peripheral search, with equal probabilities (Fig. 4). Participants were told to maintain performance in the central search, while performing both tasks simultaneously.

Fig. 4

Sample trial sequence in Experiment 1B

For the central search, the participants pressed the “1” key at the top of the keyboard if the central five letters were identical, and the “9” key if one letter was different.

For the peripheral search, the participants responded as they did in peripheral search alone.

Results

Three separate analyses of variance were performed for the three tasks: (1) peripheral search alone, (2) the central component of the dual search, and (3) the peripheral component of the dual search. FPS Experience (players, nonplayers) was the between-participants factor, and Type of Peripheral Stimulus (bar, letter) was the within-participants factor in each of the three 2 × 2 analyses of variance. Arcsine-transformed proportions of correct responses (accuracy) and reciprocal-transformed RTs (speed) were analyzed.

Accuracy

Players (83%) were more accurate than nonplayers (79%) in the peripheral component of the dual search (Fig. 5), F(1, 34) = 4.6, p < .05; this difference was about the same for the bars and the letters, F(1, 34) < 1, n.s. Players (77%) were also more accurate than nonplayers (70%) in the central component of the dual search, F(1, 34) = 4.2, p < .05, but were not more accurate than nonplayers (94% vs. 92%) in the peripheral search alone, F(1, 34) = 2.0, n.s. Accuracy was higher when the peripheral stimulus was a bar rather than a letter in the peripheral search alone (95% vs. 91%), F(1, 34) = 48.4, p < .001, and in the peripheral component of the dual search (88% vs. 74%), F(1, 34) = 67.9, p < .001.

Fig. 5

Proportions correct (p) and average accuracies (\( 2\sin^{-1}\sqrt{p} \)) in the single and dual searches of Experiment 1B. Note the nonlinearity (arcsine transformation) of the accuracy scale relative to the proportion correct scale. Error bars represent 1 SE. Players were more accurate in both the central component and the peripheral component of the dual search, but not in the peripheral search alone. The players’ advantages did not differ between bars and letters

Speed

In the peripheral component of the dual search, players (1,209.3 ms) were faster than nonplayers (1,413.1 ms), F(1, 34) = 6.4, p < .05. Players (1,237.8 ms) and nonplayers (1,412.5 ms) did not differ in the central component of the dual search, F(1, 34) = 2.9, n.s. In the peripheral search alone, players (400 ms) were faster than nonplayers (470 ms), F(1, 34) = 6.8, p < .05, and the speed was faster with bars (398 ms) than with letters (468 ms), F(1, 34) = 16.2, p < .001.

Discussion

FPS players were not more accurate in the peripheral search alone. This may seem to contradict studies that have used attentional visual field tasks (Feng et al., 2007; Green & Bavelier, 2003), in which players were more accurate than nonplayers at different eccentricities (10º, 20º, and 30º). However, our task differed in two important respects: (1) Participants were required to locate and identify a stimulus that appeared at a fixed eccentricity of 26.5º, rather than simply to locate a target that appeared unpredictably at one of three different eccentricities, and (2) the task required the location and identification of a single item rather than the detection of a target among multiple distractors distributed across the visual field. Thus, the peripheral search alone was not as demanding for the nonplayers, as was indicated by their accuracy (92%). In the dual search, however, the players were more accurate in both the central and peripheral components and responded faster in the peripheral component, suggesting that they possessed a superior capacity to allocate spatial attention over a wide field of view while simultaneously dividing attention between the two searches.

The players were generally better than nonplayers in the peripheral component of the dual search, independent of whether the peripheral stimulus was a bar or a letter. Thus, the superiority of the players cannot be attributed solely to a superior ability in identification; it more likely reflects better executive control (Cain et al., 2012) or improved higher-level cognitive processes that control and regulate resource management in order to guide attention (Green & Bavelier, 2012). In the dual search, the participants knew that the peripheral stimulus would always appear at a random location on an imaginary circle. The players likely used this information more effectively to guide attention to the periphery where the target would appear, regardless of whether it was a bar or a letter.

Experiment 2: Videogame training

While the differences between players and nonplayers in Experiments 1A and 1B were suggestive, they are not evidence of causality (Green & Bavelier, 2003, p. 537; Spence & Feng, 2010, pp. 94–95): Individuals with superior attentional, perceptual, cognitive, and motor skills may choose to play FPS videogames, while those naturally less skilled may not. A training study would be needed to establish a causal link between playing FPS games and improvements in cognition. Experiment 2 was designed to determine whether nonplayers could improve their performance in both classic visual search and dual search. Another important goal of Experiment 2 was to explore the possibility that an action videogame other than FPS, such as a driving-racing game, could produce comparable gains on cognitive tasks.

Three groups of nonplayers were tested on the search tasks before and after playing an action videogame or a control game, for an accumulated total of 10 h. Three games were used: an FPS game (Medal of Honor: Pacific Assault; Electronic Arts, Austin, TX), a driving-racing game (Need for Speed: Most Wanted; Electronic Arts, Austin, TX), and a 3-D puzzle game (Ballance; Atari, New York, NY). Since the driving game shares many characteristics with the FPS game, such as rapidly moving objects and the need to locate and identify targets, we expected that participants who played the driving game would realize gains comparable to those in the FPS group. We did not expect the 3-D puzzle game to produce cognitive changes as large as those in the action game groups (Feng et al., 2007).

Method

Participants

A group of 30 males and 30 females (18–25 years old, with normal or corrected-to-normal vision) were recruited by flyers posted on campus. None had participated in Experiment 1A or 1B. They were randomly assigned to an FPS game group, a driving game group, and a 3-D puzzle game control group, with ten males and ten females in each group. The participants reported (via a preexperiment questionnaire) no action videogame experience during the previous 3 years (see Footnote 3). The disqualifying action videogames included FPS games, fighting games, driving-racing games, and sports games. Participants were told that they would be paid $50 on completion of the study, and that this was not contingent on performance in the videogame or on the cognitive tests. Participants were not aware that other participants might be playing a different videogame. Expectations of the outcomes were not communicated to the participants; they were told only that the purpose of the study was to see whether playing a videogame would have any effect on the performance of some cognitive tasks.

Stimuli and procedure

Participants completed the classic search and dual search tasks, followed by 10 h of videogame training, and then the same search tasks after training. The training was conducted in several sessions of 1 or 2 h duration, under experimenter supervision, and the accumulated total of 10 h was completed within a maximum period of three weeks. The FPS group played their game, Medal of Honor: Pacific Assault, on a computer with a 21-in. monitor, keyboard, and mouse. The context was World War II in the South Pacific; players navigated a complex virtual environment, completing missions in which they had to kill enemies and avoid being killed in the game (Feng et al., 2007; Green & Bavelier, 2003; Spence et al., 2009). The driving group played the game Need for Speed: Most Wanted using an Xbox 360 and a 30-in. LCD monitor, with a driving wheel, brake, and accelerator pedals. Since the players in the driving group sat farther back from the monitor (because of the driving wheel), the visual angle was not much different from that with the 21-in. monitors used for the other two games. Players raced against a time limit or other racers. The control group played a 3-D puzzle game, Ballance, using a computer with a 21-in. monitor, keyboard, and mouse. The players controlled a ball that they had to move from one point to another, along a complex path in 3-D space, without falling off the path. The Ballance game involves puzzle elements requiring problem solving (cf. Feng et al., 2007).

All three games become more difficult as the game progresses. At the end of each 1- or 2-h session the participant’s progress was saved, and participants continued from that point in the next session.

Results

In the FPS game group (Medal of Honor), the number of enemies killed in the first scenario of the game was higher after training (16) than before training (11), t(19) = 4.6, p < .001, two-tailed. In the driving game group (Need for Speed), the average speed in the first scenario of the game was faster after training (91 km/h) than before training (42 km/h), t(19) = 17.8, p < .001, two-tailed. In the control game group (Ballance), the total scores in the first scenario of the game improved from 2,801 to 3,552 (higher scores represent faster speeds and fewer mistakes), t(19) = 5.9, p < .001, two-tailed.

Classic visual search

The between-participants factors in the 3 × 2 × 2 × 3 × 3 design were Training Group (FPS, driving, and control) and Gender (male, female), and the within-participants factors were Training (pretraining test, posttraining test), Set Size (9, 16, and 25), and Search Task (color, orientation, and conjunction). Analyses of variance of the transformed RTs and proportions correct were computed. We present the results for target-present trials.

Accuracy

The accuracies of the three types of search differed, F(2, 108) = 45.7, p < .001; the feature searches (color, 96%; orientation, 96%) were more accurate than conjunction search (92%), F(1, 108) = 100.3, p < .001 (contrast). Participants were not more accurate after training (95%) than before training (94%), F(1, 54) = 2.9, n.s. Overall accuracy was higher with fewer distractors, F(2, 108) = 12.6, p < .001, but the effect interacted with search task, F(4, 216) = 3.0, p < .05, with accuracy being higher with fewer distractors in conjunction search [93%, 92%, 89%, respectively, for the 9-, 16-, and 25-item arrays; simple main effect, F(2, 108) = 11.6, p < .001], but not in the feature searches (for color, 96%, 97%, and 96%, respectively; for orientation, 96%, 96%, and 96%). Accuracy did not differ among the training groups, F(2, 54) = 2.4, n.s., and we found no gender-related training effects. No speed–accuracy trade-off was observed in any cell of the experimental design.

Speed

Overall, participants responded faster after playing the games (from 589 to 531 ms), F(1, 54) = 23.4, p < .001, but the FPS (from 601 to 535 ms) and driving (from 598 to 521 ms) groups improved more than the control group (from 569 to 538 ms), F(1, 54) = 4.2, p < .05 (contrast) (Fig. 6), and the improvements in speed did not differ between the FPS and driving groups, F(1, 54) < 1, n.s. (contrast). Feature searches were faster (color, 431 ms; orientation, 479 ms) than conjunction search (771 ms), F(1, 108) = 1,374.5, p < .001 (contrast). Participants were faster with fewer distractors, F(2, 108) = 98.9, p < .001, but this effect interacted with search task, F(4, 216) = 33.7, p < .001: Speeds were faster with fewer distractors in conjunction search [686, 755, and 871 ms, respectively, for the 9-, 16-, and 25-item arrays; simple main effect, F(2, 108) = 205.8, p < .001], but not in the feature searches (for color, 434, 427, and 433 ms, respectively; for orientation, 475, 477, and 484 ms). We found no difference in speed among the training groups, F(2, 54) < 1, n.s., and the improvements in speed did not differ between feature searches (color and orientation) and conjunction searches, F(1, 108) < 1, n.s. (contrast). No gender-related training effects emerged.
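The two planned contrasts can be illustrated with the group means reported above. This is a rough sketch in raw milliseconds: the reported F tests were computed on transformed RTs, and the contrast weights below are the conventional choice for such comparisons, assumed here rather than taken from the article.

```python
# Mean RTs (ms) before and after training, as reported in the text.
mean_rt = {
    "fps": (601, 535),
    "driving": (598, 521),
    "control": (569, 538),
}

# Gain = pretraining RT minus posttraining RT (positive means faster).
gain = {group: pre - post for group, (pre, post) in mean_rt.items()}

# Assumed contrast weights (each set sums to zero).
action_vs_control = {"fps": 0.5, "driving": 0.5, "control": -1.0}
fps_vs_driving = {"fps": 1.0, "driving": -1.0, "control": 0.0}

def contrast(weights, values):
    """Weighted sum of group values for one contrast."""
    return sum(weights[g] * values[g] for g in weights)

print(gain)
print(contrast(action_vs_control, gain))
print(contrast(fps_vs_driving, gain))
```

On these means, the two action games yield an average gain of about 72 ms against 31 ms for the control game, whereas the FPS and driving gains differ by only 11 ms.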

Fig. 6 Reaction times (RTs) and average speeds (1,000/RT) for the feature search (average of color and orientation searches) and the conjunction search in Experiment 2 for the target-present trials. Note the nonlinearity (reciprocal transformation) of the speed scale relative to the RT scale. Error bars represent ±1 SE. Participants in the FPS and driving game groups were faster in both feature and conjunction searches, after 10 h of game playing. Participants in the control game group did not achieve a similar improvement
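The nonlinearity of the speed scale follows directly from the reciprocal transformation, speed = 1,000/RT. The sketch below applies it to the overall mean RTs reported in the text; because the mean of transformed values is not the transform of the mean, the printed numbers only approximate the plotted speeds. The function name is ours.

```python
def rt_to_speed(rt_ms: float) -> float:
    """Reciprocal transformation: responses per second for an RT in ms."""
    return 1000.0 / rt_ms

# Overall mean RTs before and after training, as reported in the text.
pre_ms, post_ms = 589.0, 531.0
print(rt_to_speed(pre_ms), rt_to_speed(post_ms))

# The same 50-ms saving is a larger change in speed at short RTs.
fast_gain = rt_to_speed(400.0) - rt_to_speed(450.0)
slow_gain = rt_to_speed(800.0) - rt_to_speed(850.0)
print(fast_gain, slow_gain)
```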

The slopes did not differ significantly in the feature and conjunction searches after training for any of the three groups (Table 1).
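Search slopes and intercepts of this kind are conventionally obtained by regressing RT on set size. As an illustration, the sketch below fits an ordinary least-squares line to the mean RTs reported above for the 9-, 16-, and 25-item arrays. The exact fitting procedure used for Table 1 is not described in this section, so fitting the pooled means is an assumption, and `fit_line` is a helper of ours.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

set_sizes = [9, 16, 25]

# Mean RTs (ms) by set size, pooled over groups and sessions, from the text.
mean_rt = {
    "color": [434, 427, 433],
    "orientation": [475, 477, 484],
    "conjunction": [686, 755, 871],
}

fits = {task: fit_line(set_sizes, rts) for task, rts in mean_rt.items()}
for task, (slope, intercept) in fits.items():
    print(f"{task}: slope = {slope:.1f} ms/item, intercept = {intercept:.0f} ms")
```

On these pooled means, the conjunction slope comes out near 12 ms/item, whereas both feature-search slopes are below 1 ms/item.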

Table 1 Mean slope ± one standard error (ms/item) for target-present trials in Experiment 2 as a function of search type and videogame training group, before and after training

Dual search

Analyses of variance were performed for the peripheral searches alone, the central component of the dual search, and the peripheral component of the dual search. The between-participants factors in the 3 × 2 × 2 × 2 design were Game Training Group (FPS, driving, and control) and Gender (male and female), and the within-participants factors were Training (pretraining and posttraining test) and Type of Peripheral Stimulus (bars and letters). Transformed RTs and proportions correct were analyzed.

Accuracy

We found no improvement in the peripheral searches alone, F(1, 54) < 1, n.s., nor in the central component of the dual search, F(1, 54) < 1, n.s. (Fig. 7). Participants were more accurate after training (from 78% to 82%) in the peripheral component of the dual search, F(1, 54) = 12.8, p < .005. The FPS and driving groups improved more (from 79% to 84% and from 77% to 81%, respectively) than did the control group (from 79% to 80%), F(1, 54) = 13.4, p < .001 (contrast; see Fig. 7), and the improvements did not differ between the FPS and driving groups, F(1, 54) < 1, n.s. (contrast). The improvements in the peripheral component of the dual search did not differ between bars and letters, F(1, 54) < 1, n.s. Accuracy was higher with bars than with letters in the peripheral searches alone (93% vs. 89%), F(1, 54) = 63.4, p < .001, in the central component of the dual search (73% vs. 70%), F(1, 54) = 15.7, p < .001, and in the peripheral component of the dual search (85% vs. 75%), F(1, 54) = 54.5, p < .001. No gender-related training effects were apparent.

Fig. 7 Proportions correct (p) and average accuracies (\( 2\sin^{-1}\sqrt{p} \)) in the single and dual searches of Experiment 2. Note the nonlinearity (arcsine transformation) of the accuracy scale relative to the proportion correct scale. Error bars represent ±1 SE. Participants in the FPS and driving game groups were more accurate in the peripheral component of the dual search, but not in the peripheral searches alone. Participants in the control game group did not achieve a similar improvement
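The arcsine transformation named in the caption maps a proportion p to 2 sin⁻¹ √p, which stretches the scale near 0 and 1. The sketch below applies it to the overall accuracies reported for the peripheral component of the dual search; the function name is ours, and transforming group means only approximates the plotted values, which average transformed scores.

```python
import math

def arcsine_transform(p: float) -> float:
    """Variance-stabilizing transform 2*asin(sqrt(p)) for a proportion p."""
    return 2.0 * math.asin(math.sqrt(p))

# Reported overall accuracy in the peripheral component of the dual search.
pre, post = 0.78, 0.82
print(arcsine_transform(pre), arcsine_transform(post))

# Equal steps in p are stretched near the ends of the scale.
mid_step = arcsine_transform(0.55) - arcsine_transform(0.50)
high_step = arcsine_transform(0.95) - arcsine_transform(0.90)
print(mid_step, high_step)
```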

Speed

After training, participants were faster in the peripheral searches alone (from 407 to 374 ms), F(1, 54) = 15.0, p < .001, and in the central component of the dual search (from 1,201 to 1,039 ms), F(1, 54) = 16.4, p < .001, and the peripheral component of the dual search (from 1,227 to 1,037 ms), F(1, 54) = 21.0, p < .001. However, none of the improvements differed among the three game groups, or between bars and letters. Participants were faster with bars (360 ms) than with letters (421 ms) in the peripheral searches alone, F(1, 54) = 63.4, p < .001, and no gender-related training effects emerged.

Discussion

Experiment 2 supports the causal hypothesis that playing action videogames improves classic visual search and performance in a dual search conducted simultaneously in the center and the periphery.

Feature search and conjunction search were improved by playing action videogames for as little as 10 h, indicating that action videogames possess specific characteristics not found in the slower 3-D puzzle game. Since feature search is efficient, with search time usually being independent of the number of distractors, these improvements in speed are unlikely to be the result of a faster search rate alone (Hubert-Wallander et al., 2011b). Indeed, the faster speeds in both feature and conjunction search after playing action videogames were the result of decreases in the intercepts rather than of faster search slopes. Playing an action game may modify higher-level mechanisms that modulate the relative saliency of targets and distractors, and thus provide better top-down guidance of attention during the parallel stage of search.
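The distinction between intercept and slope effects can be made explicit with the usual linear description of search times, in which \( N \) is the set size, \( a \) the intercept, and \( b \) the slope. The notation is ours and is offered only as a sketch of the argument.

\[
RT(N) = a + bN, \qquad \Delta RT(N) = \Delta a + \Delta b \, N
\]

If training changes only the intercept (\( \Delta b = 0 \)), the gain \( \Delta RT(N) = \Delta a \) is the same at every set size; a change in search rate (\( \Delta b \neq 0 \)) would instead produce gains that grow with \( N \).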

The results of the dual search were consistent with those of Experiment 1B. Participants became more accurate in the peripheral component of the dual search after playing an action game, and the improvements were greater in the FPS and driving groups (5%) than in the control group (1%). Responses were faster in all three searches, possibly as a result of more efficient motor execution and improved perceptual–response mappings (Castel et al., 2005). However, we believe that the overall faster speed is more likely to be a consequence of higher-level learning that results in more efficient use of sensory information, since RT improvements after playing an FPS videogame have been observed with a variety of seemingly unrelated tasks (Dye, Green, & Bavelier, 2009; Green, Pouget, & Bavelier, 2010; Hubert-Wallander et al., 2011a). The improvements in accuracy and RTs in the peripheral component of the dual search after playing an action videogame did not depend on stimulus type, suggesting that the gains were not the result of an improved ability to identify (as opposed to locate) the stimulus: Enhancement of top-down guidance of attention to locations on a circular locus in the periphery seems more likely.

We found no change in accuracy in the central component of the dual search (Footnote 4). The central component was a conjunction visual search in which participants searched for the odd letter in a same–different task (e.g., a T among Ls or an L among Ts). As in the classic search tasks, we might have expected to see improvement after playing an action game, due to better top-down guidance. However, the central component differed in one crucial respect from classic visual search: the target varied (either an L or a T) from trial to trial, whereas the target does not vary in classic visual search. Thus top-down guidance was compromised without specific knowledge of the target, in contrast to classic visual search. This ambiguity regarding the target’s identity nullifies the potential boost from top-down guidance.

The small gains in the control group indicated that playing the puzzle game produced some improvement, which may have been the result of practice, since participants performed the visual searches and the dual search before and after training. The statistical contrasts (average action game gains vs. control game gains, and FPS game gains vs. driving game gains; see the Results section) showed that the improvements in classic visual search and in the dual search were greater after playing either of the action games.

Experiment 2 provided the first evidence that playing a driving-racing game produces improvements comparable to those from an FPS game. Although there are commonalities between the two types of action game, driving-racing videogames possess different characteristics, demands, and features. A comparative evaluation of action games may reveal correlations between common facets of the games and the skills that are seen to benefit from playing them (see Spence & Feng, 2010, Table 1).

General discussion

In Experiments 1A and 1B, FPS videogame players were better at feature and conjunction visual search, and they were better in the peripheral component of a dual search. In Experiment 2, nonplayers improved on the same cognitive tasks after only 10 h of playing an FPS or a driving-racing game. There were no commensurate gains in a control group, showing that the improvements were specific to the action videogames (cf. Feng et al., 2007; Green & Bavelier, 2003, 2006b, 2007; Li, Polat, Makous, & Bavelier, 2009; Li, Polat, Scalzo, & Bavelier, 2010; Spence et al., 2009).

Visual search

The players in Experiment 1A were faster than nonplayers in both feature and conjunction search, and nonplayers in Experiment 2 were faster in both feature and conjunction search after 10 h of playing an action videogame. More importantly, the gains in speed involved changes in the search intercepts and not the slope. Thus, the improvements in speed cannot be explained by an appeal to a faster (serial) search rate due to faster item processing (cf. Hubert-Wallander et al., 2011b). The intercept changes suggest that the gains are more likely to be the result of improved top-down processing during visual search. Whether top-down processing is involved in feature search has been extensively debated (Theeuwes, Reimann, & Mortier, 2006; Wolfe et al., 2003). Studies that have focused on the N2pc component, an event-related potential used to track the allocation of attention (Eimer, 1996; Woodman & Luck, 1999, 2003), have suggested that feature singletons are not salient enough to engage attention in a purely bottom-up fashion, and that attentional capture is strongly determined by top-down task set (Eimer & Kiss, 2008; Holguin, Doallo, Vizoso, & Cadaveira, 2009; Kiss, Jolicœur, Dell’Acqua, & Eimer, 2008). Additionally, feature search can be influenced by top-down guidance—if participants know the target beforehand, the search speed is faster than when the target is not known (Leonard & Egeth, 2008; Wolfe et al., 2003).

In our classic search experiments, participants knew the target before each block of search tasks. Thus, improvements after videogame training could have been the result of an improved target template or of better guidance from the target template under top-down modulation, thus contributing to the creation of a more accurate activation map. Since both color and orientation feature search were enhanced after playing an action game, the improvement in top-down guidance was not restricted to specific features, suggesting that the neural sites responsible are located at a relatively high level, where sensory information is integrated and actions are selected (Green et al., 2010b). The lack of differences in search slopes, except for the FPS players’ faster search slope in conjunction search in Experiment 1A, was a little unexpected (see note 4); however, the effect of top-down guidance may be constant, affecting search speeds overall, but not necessarily the search slopes (Kristjánsson et al., 2002; Wolfe et al., 2003).

In Hubert-Wallander et al. (2011b), who showed that players had a faster search rate than nonplayers, the search slopes ranged from 35 to 50 ms/item. In our study, the search slopes for the conjunction search ranged from 10 to 15 ms/item. Thus, the conjunction search in our study was much more efficient. It is possible that a less efficient search task leaves room for an improvement in speed, resulting in a change in search slopes for those who played the action games. It is also possible that, in a less efficient search, improved top-down guidance could significantly increase the probability that the target item would be selected, by increasing its activation in the map.
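The difference in efficiency between the two studies is easier to see when the slopes are converted to nominal processing rates. Under a strictly serial reading of the slope (an assumption; slopes can also reflect the quality of guidance), the rate in items per second is 1,000 divided by the slope in ms/item. The function name is ours.

```python
def items_per_second(slope_ms_per_item: float) -> float:
    """Convert a search slope (ms/item) to a nominal rate (items/s)."""
    return 1000.0 / slope_ms_per_item

# Slope ranges (ms/item) quoted in the text.
slope_ranges = {
    "Hubert-Wallander et al. (2011b)": (35.0, 50.0),
    "present study, conjunction search": (10.0, 15.0),
}

for label, (low, high) in slope_ranges.items():
    # A steeper slope means a slower rate, so the bounds swap.
    print(f"{label}: {items_per_second(high):.0f} to {items_per_second(low):.0f} items/s")
```

The quoted ranges correspond to roughly 20 to 29 items/s in the earlier study and 67 to 100 items/s in the present one.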

Dual search

After action videogame training, participants were more accurate and faster in the peripheral component of the dual search. It is well known that dual-task interference is reduced by extensive repetition of the task (Ruthruff, Johnston, Van Selst, Whitsell, & Remington, 2003; Van Selst, Ruthruff, & Johnston, 1999) and that extensive repetition improves performance of the peripheral task in a dual-task useful-field-of-view paradigm. However, our results showed that performance in the peripheral component of a dual search task can be improved without extensive repetition of the dual task itself: Playing an action videogame alone was sufficient to produce a significant improvement in the peripheral search.

The improvements in the peripheral component of the dual search after playing an action game were equally large for bars and letters, suggesting that the differential attentional demands of identifying the stimulus are not responsible for the improvements. The gains in speed and accuracy are likely to be the result of locating the stimulus more efficiently, due to better executive control (Cain et al., 2012) or improved top-down modulation (Colzato et al., 2010; Karle et al., 2010). Participants knew that the target would always appear at a random location on an imaginary circle in the periphery. This circular locus could be highlighted in the activation map, and thus, when the target appeared, the sensory information would be processed more efficiently. Action videogames provide abundant practice in guiding the player’s attention in a top-down manner (e.g., action videogame players are sometimes instructed that enemies may appear at certain locations). Thus, it seems highly likely that players improve their general skills in guiding attention in a top-down manner to specific locations in multitasking situations (Green et al., 2010a). This learning is more general than traditional perceptual learning and does not require extensive repetition with a specific class of stimuli.

Driving-racing videogames

Since playing a driving game improves performance on certain cognitive tasks, one might wonder why people have not already achieved these gains via actual driving. However, driving a car and “driving” in a racing game are different experiences. In real-world driving, drivers generally maintain moderate speeds without intentional risky maneuvers, and they experience very few—if any—incidents or crashes. In contrast, the typical driving-racing game offers fast-paced action with many more incidents, higher speeds, unexpected obstacles, and much higher rates of crashes than would ever be encountered in real life. Thus, normal on-road driving is not comparable to driving in a racing videogame, which places much higher demands on perceptual and cognitive processing skills.

Driving videogames may be preferable for cognitive training in applied contexts. For example, older adults have difficulty performing two tasks concurrently, especially when one task is in the periphery (Sekuler, Bennett, & Mamelak, 2000), and this deficiency can contribute to older drivers being at greater risk for crashes (Ball & Owsley, 1991; Ball, Owsley, Sloane, Roenker, & Bruni, 1993). Effective training methods to counter the normal age-related deterioration in spatial selective attentional capacities would be extremely valuable. Playing a driving game could help older adults maintain their existing skills, or perhaps even reverse decline, especially in tasks in which divided attentional costs are involved (Bherer, Kramer, & Peterson, 2008; Cassavaugh & Kramer, 2009; Richards, Bennett, & Sekuler, 2006).

Among the games rated by the Pan European Game Information Index (PEGI), the racing games were rated as less violent than the FPS games. This may be important when training cognitive skills in aging drivers (Ball et al., 1993). Simple training exercises to improve attentional skills have been proposed by commercial “brain training” enterprises; however, rote practice on simple tasks quickly becomes boring, and compliance is difficult to achieve (Spence & Feng, 2010). Videogames that have survived in the marketplace are more likely to be played for longer periods of time, because of their proven entertainment value. However, the audience for FPS games is mainly confined to a demographic that does not include younger children or elderly people. Driving games may have wider appeal and utility as training tools with these populations.

Conclusion

FPS videogame players were superior to nonplayers in both feature and conjunction visual search, as well as in the peripheral component of a dual search, suggesting that they have developed better top-down guidance for locating search targets. This superiority cannot be explained solely by an appeal to self-selection—after playing only 10 h of either an FPS or a driving game, nonplayers were able to achieve gains that approached the performance of experienced action videogame players on each of these search tasks.