Multiple-object tracking (MOT) involves monitoring positions of multiple items as they move among other identical items in a visual display (Pylyshyn & Storm, 1988). There are marked individual differences in tracking performance (e.g., Allen, McGeorge, Pearson, & Milne, 2004), and the goal of this study is to determine whether measures of working memory predict these differences. This is relevant given the discussion about the role of working memory in MOT (e.g., Allen, McGeorge, Pearson, & Milne, 2006; Fougnie & Marois, 2006, 2009; Oksama & Hyönä, 2004; Zhang, Xuan, Fu, & Pylyshyn, 2010). This also has ramifications for recent work looking at a variant of the MOT task called multiple-identity tracking.

When Pylyshyn and Storm (1988) originally discussed MOT, they proposed that it relies on a spatial-indexing mechanism that operates by assigning reference tokens to a small number of items in the visual scene at once. These reference tokens individuate the selected items from distractors and from one another and monitor item locations, thus allowing these items to maintain their identities despite changes in their properties or positions. This mechanism operates before the one-area-at-a-time workings of focal spatial attention, and, in fact, it was argued that these reference tokens are a necessary prerequisite if the attentional focus is to be moved accurately from one item to the next, given that there may be many items in a scene and items move and change (see also Pylyshyn, 2001). Similarly, it was suggested that the ability to individuate and track selected items is a necessary precondition for making eye movements or touching specific items among others. Because these reference tokens are assigned shortly after edge detection and grouping (early vision) and before the operation of the attentional focus, it was assumed that they do not require higher-level processes such as working memory.

However, perhaps because MOT is a noticeably effortful and demanding task, one that requires focused concentration over a period of time, a number of researchers have questioned the idea that it is carried out by such a primitive preattentive mechanism (e.g., Scholl, 2009), and it has been suggested that limitations in the number that can be tracked at once may reflect limitations in short-term/working memory (e.g., Cowan, 2000). To resolve this issue, several studies have been carried out using the dual-task paradigm (e.g., Allen, McGeorge, et al., 2006; Fougnie & Marois, 2006, 2009). The reasoning behind the dual-task paradigm is that if two tasks both use the same limited resource (working memory), task performance should be worse when participants perform two tasks at once than when they carry out the tasks independently (baseline performance).

There are complications to using this research strategy, though, insofar as there may be several types of working memory. For example, according to Baddeley and Logie (1999), one type of working memory is specialized for storing and manipulating verbal information (the phonological loop), another for storing and manipulating visuospatial information (the visuospatial sketchpad), and yet another for switching attention and coordinating the activity of various memory stores (the executive). By this account, executive working memory is considered to be amodal, and it is thought to be involved in multitasking, switching attention, and coordinating the activities of the modal stores: phonological working memory and the visuospatial sketchpad. Finally, there is the episodic buffer, a temporary multimodal store that is limited in terms of the number of episodes that can be stored simultaneously (Allen, Baddeley, & Hitch, 2006). All of these systems are thought to have processing or capacity limitations that restrict the amount of information that can be dealt with at once. Consequently, dual-task interference may originate from a variety of different sources.

For one, given the recent expansion of the role of the executive in theories of working memory, it seems plausible that dual-task interference in tracking might occur due to competition for executive working memory. Specifically, some contend that the executive is important for controlled processing, maintaining selection in the face of distraction (e.g., Kane, Poole, Tuholski, & Engle, 2006), while others stress the role of the executive in inhibition (e.g., Hasher & Zacks, 1999). The executive is also thought to be involved in operations such as updating and monitoring, multitasking, task switching, and maintaining information in short-term memory while carrying out controlled searches through long-term memory (see Miyake, Friedman, Emerson, Witzki, & Howerter, 2000; Unsworth & Engle, 2007), and some of the pronounced individual differences in executive function observed as a function of age have been attributed to differences in processing speed (e.g., Salthouse, 1991). Given that the executive is thought to be involved in so many important cognitive operations, it follows that MOT should require at least some aspect of the executive. MOT is a deliberate, effortful process that requires maintaining item selection in the face of distraction, and there is evidence that tracking involves distractor inhibition (e.g., Pylyshyn, Haladjian, King, & Reilly, 2008). Item positions have to be updated as they move, and processing speed may be a factor, and although Pylyshyn and Storm (1988) initially argued against it, it is possible that the tracking task might involve selecting each item one at a time, updating its position, and then switching attention to the next (the serial attention-switching model described in Oksama & Hyönä, 2004).

Allen, McGeorge, et al. (2006) used a dual-task study to demonstrate the role of the executive in MOT. This study involved comparing baseline tracking with tracking when participants were required to carry out a variety of different tasks, each at a rate of one response per second. The tasks were categorizing visual digits as high or low, categorizing auditory tones as high or low, tapping the four corners of a 3 × 3 matrix with a finger, or repeating the word “the” (articulatory suppression). Baseline tracking performance was better than dual-task performance, and the difference was especially pronounced for the two most difficult tasks, the ones predicted to put more demands on the executive: the visual and auditory categorization tasks. In that article, the authors concluded that this interference occurred because updating the positions of moving items requires executive processes to compare successive object files and inhibit the creation of new object files, where object files are episodic representations for individual objects that contain information about object properties and positions (see Kahneman, Treisman, & Gibbs, 1992).

In contrast, Fougnie and Marois (2006) found interference between MOT and tests of visuospatial working memory. They introduced MOT into the retention interval for a visual working memory task where participants had to determine whether a probed item had the same position and color as one of the previously presented items. Memory performance decreased as the number of items to be tracked increased, although tracking did not interfere as much as did a second visual working memory task. They concluded that tracking and visual working memory tasks share some resources and not others. More recently, Zhang et al. (2010) found that there was little interference between tracking and visual working memory unless the memory task required binding specific locations to specific object features. They found that increases in the tracking load reduced memory performance when the task was to recall whether an item of a specific color occupied a specific location. This tracking load manipulation had little effect when the task was simply to recall whether an item with a specific conjunction of features had been present, although Fougnie and Marois (2009) conducted a similar study and found reductions in memory for conjunctions as well. Given this controversy, there may be advantages to avoiding the issue and using memory tests that are more purely spatial, rather than ones that require binding specific object properties to specific locations.

Although dual-task studies are informative, they have shortcomings. No task is a measure of a single ability. When two tasks interfere with one another, the interference may originate from more than one source. Imaging studies of MOT produce different results depending on the methodology used, but they implicate a wide variety of brain areas, including the anterior cingulate, frontal eye fields, inferior precentral sulcus, anterior intraparietal sulcus, posterior intraparietal sulcus, transverse parietal lobule, superior parietal lobule, human motion area (MT+), lateral occipital cortex, and cerebellum (Culham, Cavanagh, & Kanwisher, 2001; Howe, Horowitz, Morocz, Wolfe, & Livingstone, 2009; Jovicich et al., 2001). In an event-related potentials study, Drew and Vogel (2008) found that for individuals with better tracking performance, there was higher amplitude contralateral delay activity, which is associated with activity in the lateral intraparietal sulcus and parietal cortex, as well as an enhanced N2pc component, which has been associated with the extrastriate cortex, including the V4 and inferotemporal cortices. A variety of brain areas may be involved in multiple target tracking, and these may be important for different aspects of the task. Consequently, it is possible that executive and visuospatial working memory each play an independent role in MOT. Alternatively, because spatial tests sometimes correlate with measures of the executive (e.g., Miyake, Friedman, Rettinger, Shah, & Hegarty, 2001), perhaps the tests are accounting for the same variance. It is difficult to determine the relative contribution of different types of working memory in a dual-task study. To further complicate the situation, there is a great deal of controversy about what the executive does (e.g., Colom, Rubio, Shih, & Santacreu, 2006; Colom, Shih, Flores-Mendoza, & Quiroga, 2006; Hudjetz & Oberauer, 2007) and whether it is truly unitary and amodal rather than fractionated (see, e.g., Miyake et al., 2000). Dual-task studies show that there is interference, but discovering the origin of this interference requires different research strategies.

Thus, Oksama and Hyönä (2004, Experiment 1) employed an individual-differences approach to study the role of the executive and visual working memory in tracking. In this study, nonspatial tests of the executive (e.g., the operation span [OSPAN]) and tests of visuospatial working memory (e.g., the Corsi blocks task) were used to predict performance in a classic tracking task. The results were disappointing, with measures of executive and visuospatial memory predicting little of the variance in tracking performance. No individual measure predicted more than 4.8% of the variance. However, in this particular study, the correlations may have been smaller than they would otherwise be because the participants were unusual. They were applicants for jobs at a civilian aviation agency, and all scored in the top 11% of the population in standardized testing.

The present study involved a slightly more diverse sample (first-year university students), but it also included some checks to facilitate interpretation of the correlations. Spurious correlations can emerge due to factors unrelated to the constructs under investigation. For example, during an experiment, some individuals are more motivated to perform than others. These individuals could be expected to put more effort into the memory and tracking tasks, and this may explain some of the common variance between predictors and criterion. Thus, it is often helpful to include variables unrelated to the criterion in order to establish a context for interpretation. For this reason, we incorporated the digit span test into the study. There is no reason to expect digit span, a measure of passive verbal short-term memory, to play a role in MOT. However, given that individuals who put more effort into one task would put more effort into others, there is reason to expect that there might be a small correlation between tracking performance and digit span scores. The magnitude of this correlation should be minimal, though, as compared with the correlations between tracking and measures of abilities that are inherent in tracking. Moreover, if regression is used to remove the common variance between tests, the correlation between digit span and tracking should disappear, whereas the other tests should continue to predict tracking performance.
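
The partialling logic described above can be illustrated with the standard two-predictor formula for a semipartial correlation. The following is a minimal sketch only: the three correlations are hypothetical values chosen for illustration, not data from this study, and the function name is an assumption.

```python
import math

def semipartial(r_y1, r_y2, r_12):
    """Semipartial correlation of predictor 1 with y, after removing
    from predictor 1 the variance it shares with predictor 2."""
    return (r_y1 - r_y2 * r_12) / math.sqrt(1.0 - r_12 ** 2)

# Hypothetical correlations (illustration only):
# y = tracking, 1 = digit span, 2 = a spatial memory test.
r_track_digit = 0.20
r_track_spatial = 0.45
r_digit_spatial = 0.40

sr_digit = semipartial(r_track_digit, r_track_spatial, r_digit_spatial)
sr_spatial = semipartial(r_track_spatial, r_track_digit, r_digit_spatial)
print(f"digit span sr = {sr_digit:.3f}, spatial sr = {sr_spatial:.3f}")
```

With these illustrative inputs, the contribution of digit span falls to near zero once the shared variance is removed, whereas the spatial test remains a substantial predictor, which is the pattern the control variable is meant to reveal.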

There are a variety of different tests that purport to measure executive working memory, some of which make greater demands on spatial processing than do others. When trying to determine the role of visuospatial working memory in tracking as compared with that of the more nonspatial functions attributed to the executive, it makes sense to begin with the executive measures that make the fewest spatial demands. This study used two relatively nonspatial measures of the executive that have been extensively studied over a period of years: the OSPAN (Turner & Engle, 1989) and reading span (RSPAN: Daneman & Carpenter, 1980). They were chosen because there is a substantial literature devoted to the psychometric properties of these tests and how they relate to other variables (see Conway et al., 2005, for a review), and there is little controversy about whether these tests truly measure the executive, at least as compared with measures such as the n-back (Kane, Conway, Miura, & Colflesh, 2007). More important, neither of these tests requires switching attention back and forth between different visuospatial locations, as do tests such as the SYNWORKI (Elsmore, 1994; used in Oksama & Hyönä, 2004). Specifically, the SYNWORKI requires dividing attention between tasks in four areas on a computer screen, which requires switching the attentional focus between locations, a process that would require Pylyshyn’s spatial-indexing operation to individuate item locations (the same mechanism proposed to explain MOT). If correlations between tracking and the SYNWORKI test emerged, it would be unclear whether it was multitasking per se or the need to switch attention between different locations that caused the relationship. Similarly, measures such as the Towers of Hanoi would also require rapidly moving the attentional focus between different spatial locations as part of the task. If those tasks were used, any correlations might originate from the common need for spatial indexing, rather than from abilities that are associated with more nonspatial aspects of the executive.

In contrast, the OSPAN measures the ability to carry out arithmetic calculations while holding numbers in memory, which requires switching between mental calculation and rehearsal, but it does not make as great a demand on indexing or spatial memory. This test predicts measures of fluid intelligence and abstract reasoning (e.g., Conway, Cowan, Bunting, Therriault, & Minkoff, 2002; see Colom, Rubio, et al., 2006, though), but it also predicts performance in a number of attentional tasks. For example, the OSPAN predicts the magnitude of the attentional blink (Arnell, Stokes, & MacLean, 2010), performance in tasks that involve top-down constraint of the focus of visual attention (e.g., Heitz & Engle, 2007), selective enumeration of targets in distractors, and, in general, enumeration beyond the subitizing range (Tuholski, Engle, & Baylis, 2001). Thus, although there is debate about what complex span tests such as the OSPAN measure (e.g., Colom, Rubio, et al., 2006; Colom, Shih, et al., 2006; Conway et al., 2002; Conway et al., 2005; Miyake et al., 2000; Miyake et al., 2001; Unsworth & Engle, 2007), there is evidence to suggest that the OSPAN predicts some aspects of attentional function.

The RSPAN was included to ensure that results for the executive measures were not specific to mental calculation. This test requires making decisions about whether a series of phrases are sensible or not while holding words in working memory. Although the OSPAN and RSPAN use different materials and are typically scored using different techniques, there are robust correlations between these two tests even in relatively homogeneous samples, such as samples of university students (e.g., Engle, Tuholski, Laughlin, & Conway, 1999). Given that the magnitudes of correlations vary depending on the diversity of the sample, it is useful to include strongly correlated variables for purposes of comparison. In the present sample, the observed correlation between the two measures of the executive sets an upper limit to what kinds of correlation might be expected between measures of the executive and other variables.

As well, two extensively studied measures of visuospatial memory were employed: the Corsi blocks task (Milner, 1971) and the Visual Patterns Test (Della Sala, Gray, Baddeley, & Wilson, 1997). Neither of these tests required binding visual properties to locations, and consequently, there was no danger of confusing spatial deficits with deficits in linking varying object properties to locations. The Corsi blocks task is frequently used as an index of spatial skills in neuropsychological test batteries. In this task, a demonstrator taps a matrix of identical blocks in a specific order, and the participant is required to tap the blocks in the same order. There are a number of reasons to expect a relationship between the Corsi blocks and standard MOT tasks. Corsi blocks is the visual working memory test most closely related to action (Logie, 1995, p. 2), and it has been argued that MOT is fundamental to visuo-motor control in activities such as touching, catching, and pointing (Pylyshyn, 2001). Furthermore, there is evidence that the same types of secondary tasks interfere with Corsi blocks and MOT tasks. When participants are required to tap their fingers while viewing the presentation of items, finger tapping interferes with Corsi blocks performance (Della Sala, Gray, Baddeley, Allamano, & Wilson, 1999). Similarly, pattern finger tapping interferes with MOT (Trick, Guindon, & Vallis, 2006; finger tapping occurred during item movement, before the report phase). An equally difficult articulation task did not. The Corsi blocks task was chosen over other visuospatial measures because it had a longer history of research behind it and because, as compared with other spatial tests, it is not as strongly correlated with nonspatial measures of the executive such as the OSPAN (cf. Oksama & Hyönä, 2004, p. 661, for their hybrid task that combined Corsi blocks with the mental rotation task).

Although the Visual Patterns Test is not as ubiquitous as the Corsi blocks task, it seems to be more closely related to the strictly visual components of spatial memory. Consequently, for the Visual Patterns Test, irrelevant visual stimulation interferes more than finger tapping, whereas the opposite is true for the Corsi blocks task (Della Sala et al., 1999). This finding has led individuals to conclude that the Corsi blocks and Visual Patterns Tests measure two separate aspects of spatial working memory. As spatial tests, they might be expected to correlate with each other to some extent, but each should make independent contributions to predictions of tracking performance if different types of spatial memory are involved.

Imaging studies implicate a number of brain regions in the Corsi blocks task, including the right hippocampus, the parietal regions associated with spatial memory, and the frontal areas associated with some aspects of the executive (Toepper et al., 2010). Moreover, a variety of behavioral studies also suggest that the Corsi blocks task may be partially contaminated by the executive (Miyake et al., 2001; Quinn, 2008; Rudkin, Pearson, & Logie, 2007). Consequently, correlations between the Corsi blocks, OSPAN, and RSPAN tests are to be expected. The underlying cause for this relationship is unclear, but Ridgeway (2006) suggests that individual differences in strategy may play a role. Individuals who excelled at the Corsi blocks task reported using a strategy where they grouped items into chunks as the items were being presented one by one. Although this technique clearly involves a spatial component, it also requires the executive, because grouping while learning a new location would require multitasking. If there are roles for both executive and spatial memories in predicting tracking performance, both types of test should continue to predict significant amounts of variance in tracking performance when regression is used to partial out the variance the tests have in common.

Method

Participants

One hundred thirty-four students were recruited from the psychology participant pool (94 women; mean age  =  19.15 years, SD  =  2.14). On average, they reported 2.48 h a week playing videogames (SD  =  4.26), but this variable did not predict performance on the tracking task or on any of the memory tests in this sample (p  >  .05 for all). All 134 participants did the OSPAN, Corsi blocks, digit span, and MOT tasks, and 92 did the RSPAN and the Visual Patterns Test as well. (The study was conducted over 2 years, and extra measures were added in the second year.)

Apparatus and stimuli

Macintosh and PC computers were used to present the MOT, OSPAN, RSPAN, Corsi blocks, and digit span tests. For the computer tasks, participants were seated 50 cm from the viewing screen. The Visual Patterns Test (Form A) was also used, but it is a paper-and-pencil measure (Della Sala et al., 1997).

For the tracking task, the tracking field (the area in which items moved) was a black central rectangle on the computer screen that occupied 22.19°  ×  16.73° of visual angle. The remainder of the screen was gray. Items were small cartoon figures: regular civilians (1.35° happy faces) and sinister-looking “spies” (1.48° square). Each was enclosed in a white bounding contour, but happy faces were blue and spies were black. We used a version of the task where the total number of items in the display was 10 (four targets and six distractors). When items moved, they moved randomly and independently of one another, never occluding (they repelled). For each item, speed and direction of motion changed randomly from frame to frame (frame  =  11.7 ms), with speeds ranging from 0° to 10.79° per second, and with a 1/100 probability that the item would change its vertical or horizontal direction (or both) every frame. This version of the task was chosen because it was not impossibly difficult (6 of the 134 participants managed to correctly identify 100% of the targets), yet the task was difficult enough to produce substantial individual variation (range: 45% – 100%; median error rate  =  18.33%).
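
To make the motion parameters concrete, the following is a rough simulation sketch and not the software used in the study. The frame duration, speed range, direction-change probability, and field size come from the description above; the independent per-axis reversal, the uniform speed and heading distributions, and the bouncing at field edges are assumptions, and the repulsion between items is not modeled.

```python
import math
import random

FRAME_S = 0.0117                  # 11.7-ms frame, from the text
MAX_SPEED = 10.79                 # upper speed bound in deg/s, from the text
P_FLIP = 1 / 100                  # per-frame direction-change probability, from the text
FIELD_W, FIELD_H = 22.19, 16.73   # tracking field in degrees, from the text

def step(item, rng):
    """Move one item by one frame. `item` holds x, y and axis signs sx, sy (+1 or -1)."""
    # Assumption: each axis reverses independently with probability P_FLIP.
    if rng.random() < P_FLIP:
        item["sx"] = -item["sx"]
    if rng.random() < P_FLIP:
        item["sy"] = -item["sy"]
    # Assumption: speed is uniform on [0, MAX_SPEED]; heading is uniform
    # within the quadrant defined by the current axis signs.
    dist = rng.uniform(0.0, MAX_SPEED) * FRAME_S
    angle = rng.uniform(0.0, math.pi / 2)
    x = item["x"] + item["sx"] * dist * math.cos(angle)
    y = item["y"] + item["sy"] * dist * math.sin(angle)
    # Assumption: items bounce off the edges of the tracking field.
    if not 0.0 <= x <= FIELD_W:
        item["sx"] = -item["sx"]
        x = min(max(x, 0.0), FIELD_W)
    if not 0.0 <= y <= FIELD_H:
        item["sy"] = -item["sy"]
        y = min(max(y, 0.0), FIELD_H)
    item["x"], item["y"] = x, y

rng = random.Random(0)
items = [{"x": rng.uniform(0, FIELD_W), "y": rng.uniform(0, FIELD_H),
          "sx": rng.choice((-1, 1)), "sy": rng.choice((-1, 1))}
         for _ in range(10)]      # four targets and six distractors
n_frames = round(8.0 / FRAME_S)   # roughly 8 s of motion
for _ in range(n_frames):
    for item in items:
        step(item, rng)
```

Under these assumptions, an item moves at most about 0.13° per frame, so changes in position between successive frames are small relative to the item sizes.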

Procedure

Procedures for the OSPAN and RSPAN tests were taken from Kane et al. (2006) and Daneman and Carpenter (1980), respectively. In both, participants were presented with a series of unrelated statements on a computer screen, each with a different word at the end. Participants would have to recite the statement and the word, say whether the statement was true/sensible, memorize the word, and then, after a series of statements, recite the words in order. For the OSPAN, the statements involved mathematical equations [e.g., (2 + 1)/3 = 1? HOUSE], and the total number of words reported in the correct order was measured. For the RSPAN, the statements were sentences (e.g., The young girl wandered down the winding PATH), and the maximum number of final words reported in the correct order was measured. Although there are different ways of scoring the RSPAN (Friedman & Miyake, 2005), we used Daneman and Carpenter’s original technique to avoid confounding the scoring system with the type of memory measured. Specifically, the primary measures of visuospatial and verbal memory (Corsi blocks and digit span) were both scored using the maximum number of items reported (the standard technique). Using Daneman and Carpenter’s procedure for the RSPAN at least ensured that one of the measures of the executive was scored in the same way as the tests of visuospatial and verbal memory.

For the Corsi blocks task, nine identical yellow squares appeared on a black background. After a sequence of individual squares flashed (1/s), participants were required to point to the squares in the order that they had flashed. The maximum number of squares reported in the correct order was measured. Similarly, for the digit span task, participants were shown a series of visual digits (1/s) and then were asked to recall the digits in order after a question mark appeared. The maximum number of digits reported in the correct order was measured. For the Visual Patterns Test, participants were shown a series of matrices made up of black and white squares. Participants were given 3 s to view each matrix and then were required to fill in the darkened squares on a blank matrix. Matrices varied in complexity from 2  ×  2 (with 2 black squares) to 5  ×  6 (with 15 black squares), and scores could thus range between 2 and 15. The score represented the maximum number of black squares correctly filled in on a matrix. Before each memory test, participants were given two trials of practice.
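
The maximum-span scoring used for the Corsi blocks and digit span tasks can be sketched as follows. This is one plausible reading of the scoring rule (the longest sequence reproduced entirely in the presented order); the function name and the data layout are hypothetical and are not taken from the study materials.

```python
def span_score(trials):
    """Return the length of the longest sequence recalled in exactly the presented order.

    `trials` is a list of (presented, recalled) pairs; this layout is a
    hypothetical convenience, not the format used in the study.
    """
    correct_lengths = [len(presented) for presented, recalled in trials
                       if list(recalled) == list(presented)]
    return max(correct_lengths, default=0)

# Hypothetical participant: sequences of 3 and 4 items are correct,
# and the 5-item sequence contains an order error.
trials = [
    ([3, 7, 1], [3, 7, 1]),
    ([2, 9, 4, 6], [2, 9, 4, 6]),
    ([5, 1, 8, 3, 7], [5, 1, 3, 8, 7]),
]
score = span_score(trials)
```

In this example the score is 4, because the five-item sequence was not reproduced in order.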

A variant of the MOT task called “Catch the Spies” was used (Trick et al., 2006). Participants were given the task of monitoring the locations of four identical spies (targets) that had “disguised themselves” as civilians (distractors: happy face figures). Targets were randomly positioned among the distractors on the computer screen, and during the 936-ms encoding phase, the targets switched back and forth from spy to happy face form to signal that they were targets (117 ms as spy, 117 ms as happy face for the duration). Then all 10 items (both targets and distractors) returned to happy face form. After a 351-ms delay where all items remained static, all 10 moved randomly and independently for 8 s. When they stopped, participants used the computer mouse to indicate the items that were targets. The percentage of correctly identified targets was measured. Six practice trials preceded the 15 experimental trials.
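
The trial timings above are whole multiples of the 11.7-ms frame duration reported in the apparatus section, which the following sketch makes explicit. It assumes that the same frame duration applies to the encoding and delay phases; the helper names and the example participant are hypothetical.

```python
FRAME_MS = 11.7   # frame duration given in the apparatus description

def to_frames(duration_ms):
    """Convert a duration in ms to a whole number of display frames."""
    return round(duration_ms / FRAME_MS)

encoding_frames = to_frames(936)      # encoding phase
flash_frames = to_frames(117)         # one spy or happy-face interval
delay_frames = to_frames(351)         # static delay before motion
flash_cycles = 936 // (117 + 117)     # spy/happy-face alternations during encoding

def percent_correct(hits_per_trial, targets_per_trial=4):
    """Percentage of targets correctly identified across trials."""
    total = targets_per_trial * len(hits_per_trial)
    return 100.0 * sum(hits_per_trial) / total

# Hypothetical participant: 15 experimental trials, hits out of 4 targets on each.
hits = [4, 3, 4, 2, 4, 4, 3, 4, 3, 4, 4, 2, 3, 4, 3]
score = percent_correct(hits)
```

With 4 targets on each of 15 experimental trials, each participant's score is based on 60 target judgments.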

Results

Because outliers can produce spurious relationships in multivariate analyses, data were screened for univariate and multivariate outliers, using techniques recommended by Tabachnick and Fidell (2001). Specifically, univariate outliers (data points with standard scores in excess of 3.29, p  <  .001) were replaced with values corresponding to standard scores of 3.29. There was only one such outlier. Similarly, Mahalanobis distances were used to check for multivariate outliers, cases where scores were extreme on two or more variables (Tabachnick & Fidell, 2001). This analysis revealed no multivariate outliers.
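
The univariate screening step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis script: the scores are hypothetical, the function name is invented, and the use of the sample standard deviation (n - 1) is an assumption. The multivariate (Mahalanobis) check is not shown.

```python
import statistics

Z_CUTOFF = 3.29   # criterion cited in the text (p < .001)

def clamp_univariate_outliers(values, z_cutoff=Z_CUTOFF):
    """Replace values whose |z| exceeds the cutoff with the value at the cutoff."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # assumption: sample SD (n - 1)
    low, high = mean - z_cutoff * sd, mean + z_cutoff * sd
    return [min(max(v, low), high) for v in values]

# Hypothetical scores with one extreme case at the end.
scores = [10, 11, 9, 10, 12, 8, 10, 11, 9, 10] * 3 + [60]
cleaned = clamp_univariate_outliers(scores)
```

Only the extreme case is altered; every score inside the cutoff is returned unchanged, so the ordering of participants is preserved while the influence of the outlier is reduced.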

Descriptive statistics for the various measures are reported in Table 1. There was no evidence of restriction of range in any of the measures. To optimize correlations between the OSPAN and the other variables, a square root transformation was applied to the OSPAN data to correct for skew and kurtosis, although correlations with the untransformed variables were also included in the table for purposes of comparison. A range of transformations was tried with the RSPAN, but they improved skew at the expense of kurtosis, so the data were left untransformed.

Table 1 Descriptive statistics for measures

First, it was important to establish that the measures of working memory correlated with each other in ways that might be expected from previous research. To accomplish this, for the 92 participants who received all of the memory tests, the two tests of the executive (OSPAN and RSPAN) and the two tests of visual memory (Corsi blocks and Visual Patterns Test) were correlated. The correlation between the two measures of the executive was significant and comparable to that in other studies with university students (e.g., r  =  .51, Engle et al., 1999, Table 2). Similarly, the magnitude of the correlation between the two measures of visual memory was comparable to that observed in Della Sala et al. (1999).

Table 2 Correlations among measures of working memory and between working memory and multiple-object tracking performance

Next, relationships between the predictors and MOT were assessed. MOT correlated significantly with every memory test except the RSPAN, as is shown in Table 2. An initial regression was carried out with only the Corsi blocks, digit span, and transformed OSPAN tests as predictors, as is shown in Table 3. Together, they explained significant amounts of variance in tracking performance, F(3, 130) = 12.38, p < .001, R² = .22. Standard regression revealed that the Corsi blocks test was the only significant predictor of tracking performance when the other memory tasks were statistically controlled. When the Corsi blocks task was dropped from the analysis, digit span and OSPAN together predicted a significant but small amount of the variance, F(2, 131) = 4.25, p < .05, R² = .06. Semipartial correlations involving the digit span and OSPAN tasks indicate that the OSPAN’s unique contribution to this variance was only sr² = .0289. Thus, although the Corsi blocks task was a strong predictor of tracking performance, the OSPAN was not.
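
The relation between the reported R² values and F statistics can be checked with the standard formula F = (R²/k) / ((1 - R²)/(n - k - 1)), where k is the number of predictors and n is the number of participants. The sketch below is only an illustrative check; because the R² values in the text are rounded to two decimals, the recomputed F values are approximate.

```python
def f_from_r2(r2, n_predictors, n_cases):
    """F statistic and degrees of freedom for a regression with the given R-squared."""
    df1 = n_predictors
    df2 = n_cases - n_predictors - 1
    f_value = (r2 / df1) / ((1.0 - r2) / df2)
    return f_value, df1, df2

# Values reported in the text; R-squared is rounded there, so the
# recomputed F will not match the reported F exactly.
f_three, df1_three, df2_three = f_from_r2(0.22, 3, 134)   # reported F(3, 130) = 12.38
f_two, df1_two, df2_two = f_from_r2(0.06, 2, 134)         # reported F(2, 131) = 4.25
print(f"F({df1_three}, {df2_three}) = {f_three:.2f}; F({df1_two}, {df2_two}) = {f_two:.2f}")
```

The recomputed values fall close to the reported ones, and the degrees of freedom match the sample of 134 participants.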

Table 3 Standard regressions and semipartial correlations with various tests as predictors of multiple-object tracking performance

By comparing results when the Corsi blocks task was and was not included in the analyses, it became possible to make a rough estimate of the amount of tracking performance predicted by variance that the OSPAN and Corsi blocks share (Table 3, n = 134 analysis). Specifically, when only digit span and OSPAN were included in the analyses, OSPAN predicted 2.89% of the unique variance in tracking scores (sr = .17). When the Corsi blocks task was included in the analysis as well, OSPAN predicted 0.81% of unique variance in tracking scores. Thus, the shared variance predicted by the OSPAN and Corsi blocks together (not including the variance predicted by each test separately) is 2.89% − 0.81% ≈ 2%. This amount of variance is minimal when compared with the 16% variance predicted by the Corsi blocks task on its own (sr = .40).
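
The arithmetic behind this rough estimate is reproduced below. The semipartial correlations and percentages are those reported in the text; the variable names are illustrative.

```python
# Semipartial correlations (sr) and percentages taken from the text.
ospan_unique_without_corsi = 0.17 ** 2   # sr = .17, i.e., 2.89% of variance
ospan_unique_with_corsi = 0.0081         # 0.81% of variance once Corsi blocks is entered
corsi_unique = 0.40 ** 2                 # sr = .40, i.e., 16% of variance

# Rough estimate of tracking variance predicted jointly by OSPAN and Corsi blocks.
shared = ospan_unique_without_corsi - ospan_unique_with_corsi
ratio = corsi_unique / shared

print(f"shared = {shared:.4f}, Corsi unique = {corsi_unique:.2f}, ratio = {ratio:.1f}")
```

By this estimate, the unique contribution of the Corsi blocks task is between seven and eight times the variance it shares with the OSPAN.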

Regressions were then carried out on data from participants who did all the memory tests to see whether the Visual Patterns Test made a significant contribution once variance from the OSPAN, Corsi blocks, and digit span were controlled. Together, these variables predicted tracking scores, F(4, 87) = 10.21, p < .001, R² = .32, and as can be seen from Table 3, the Visual Patterns Test and Corsi blocks test were each significant predictors, whereas the OSPAN and digit span were not. The further addition of RSPAN increased the variance accounted for by 1%, F(5, 86) = 8.33, p < .001, R² = .33. Together, the two measures of the executive did not account for a significant amount of variance on their own, F(2, 89) = 1.53, p = .22, R² = .03, but with digit span, their contribution approached significance, F(3, 88) = 2.19, p = .10, R² = .07. In contrast, when the two visuospatial measures were entered on their own in regression, they accounted for 30% of the variance in tracking scores, F(2, 89) = 19.61, p < .001, R² = .30.

Discussion

This study showed that tests of working memory predict significant amounts of variance in tracking performance (e.g., up to R² = .33 in this study). The predicted correlations between MOT and the two spatial memory tests emerged. Each test made an independent contribution in explaining tracking performance, as might be expected if there were different types of spatial working memory (Darling, Della Sala, & Logie, 2007, 2009). The Visual Patterns Test explained 11.5% of the variance in tracking performance in bivariate correlation and accounted for a small but significant amount of variance on its own, with all other types of memory partialled out in standard regression (sr = .19, p = .03). By far the best predictor, though, was the Corsi blocks task. It explained over 20% of the variance in tracking scores in simple correlation and continued to be a strong predictor when variance associated with OSPAN, digit span, and the Visual Patterns Test was statistically controlled (sr = .42, n = 92, p < .001).

However, contrary to prediction, the tests of executive working memory (OSPAN and RSPAN) predicted little of the variance in tracking scores. There is no reason to expect digit span, a test of passive verbal short-term memory, to correlate with MOT, and yet it correlated significantly and almost as well as the OSPAN (r = .18 for digit span and tracking; r = .19 for OSPAN and tracking, or at best r = .21 for transformed OSPAN). The correlation between RSPAN and tracking was not even statistically significant (r = .11). When regressions were performed using the two measures of spatial memory, the OSPAN, and digit span as predictors, the OSPAN was not a significant predictor of tracking performance once common variance was partialled out (sr = .002). Although no psychological test is perfect, the correlation should have been larger if individual differences in this type of executive working memory played an independent role in accounting for tracking performance.

Thus, these measures of the executive do not predict much of the variance in tracking performance. This is not to say that the executive is not involved in tracking but, rather, that in young adults, individual differences in this component of the executive do not seem to predict individual differences in tracking. It is possible that the executive could play a larger role in explaining individual differences if normal tracking mechanisms were challenged by having items cluster or move extremely quickly (Tombu & Seiffert, 2008). That is because, if one or more items were lost during tracking, performing error recovery strategies (extrapolating from the last known positions) while continuing to track the remaining items would probably require multitasking—and thus, the executive. Alternatively, individual differences in the executive may play a greater role in explaining individual variation in performance where there is need for especially high spatial or temporal resolution analysis, and this may explain the advantages associated with expertise, as was proposed by Barker, Allen, and McGeorge (2010). However, it is also possible that expertise is associated with improvements in a variety of different abilities. The research on the expertise engendered by action videogames suggests that a wide range of capacities improve with practice, from contrast sensitivity to visual–motor coordination and speed (see Spence & Feng, 2010, for a review). Nonetheless, in the present study, although the participants performed a reasonably difficult tracking task, one that produced substantial individual variation in performance, the measures of the executive used here did not predict much of this variance. It could be that under normal circumstances, MOT is more similar to other perceptual/attentional tasks where the executive span plays a minimal role in explaining the individual variation among young adults. 
For example, the OSPAN does not predict individual differences in visual search slopes (Kane et al., 2006) or differences in enumeration latencies or response time slopes in the subitizing range (Tuholski et al., 2001).

The spatial memory tests were much better predictors of tracking performance. Although the Visual Patterns Test predicted a small amount of the variance, by far the strongest predictor was the Corsi blocks task, a test that is considered more spatial than purely visual. There are two ways of understanding the association between tracking and spatial memory tests. One is that MOT requires spatial working memory. Another is that tests of spatial working memory and the tests of reasoning that correlate with them all make use of more basic perceptual and motor operations—operations that are also used in MOT. That would explain why finger tapping, deliberate eye movements, pointing, spatial judgments (left and right, up and down), and moving the attentional focus from location to location also interfere with Corsi block task performance (Pearson & Sahraie, 2003; Smyth & Scholey, 1994; Zimmer, Speiser, & Seidler, 2003). MOT, the Corsi blocks task, spatial finger tapping, deliberate eye movements, and spatial judgments all require spatial indexing, the ability to select out an object location among others. Individual differences in these basic perceptual/motor abilities may explain some of the individual variation in performance observed in higher order tasks.

One possible explanation for the relationship between the visual spatial tests and MOT is that all require the ability to “fix” the locations of targets among distractors. If this is true, individual differences in the ability to encode the positions of targets might be evident even without item movement. It is undeniable that in the tracking task, participants must first encode the positions of the targets. In fact, it would be surprising if there were not some sort of relationship between target acquisition performance and overall tracking performance, given the observed relationship between tracking and measures of static spatial memory (Corsi blocks, pattern span) in this study. However, at present, the behavioral evidence does not support the claim that individual differences in the target acquisition phase of the tracking task predict individual differences in MOT. That is because the ability to fix target locations (report the positions of targets immediately after presentation but before item movement) in displays such as the ones used in this study does not predict MOT performance, either in young adults or in groups of participants ranging in age from 7 to 75 years (Trick, Hollinsworth, & Brodeur, 2009). That study used the same stimuli as the present study (10 items, 4 targets), and there were substantial individual differences in tracking performance across this age range. In fact, even when recall was delayed by 10 s, the ability to report the positions of the targets did not predict variations in tracking performance in adults between 26 and 75 years of age.
Similarly, when individuals with Williams syndrome or Down syndrome were tested (O’Hearn, Landau, & Hoffman, 2005, and Brodeur, Trick, Flores, Marr, & Burack, in press, respectively), the tracking deficits that they exhibited, as compared with controls matched in mental age, were not predicted by individual differences in the ability to report the positions of static targets either immediately or after a delay of 8–10 s. Given that the Corsi blocks task seems to be a strong predictor of tracking performance in young adults, whereas individual differences in immediate report of the positions of static targets do not predict tracking in this age group, the relationship must have something to do with processes that occur after initial target acquisition.

The results of the present study are also relevant in light of the longstanding controversy about the relationship between Pylyshyn’s spatial indices and Treisman’s object files (for a discussion, see Kahneman et al., 1992), particularly when these results are taken in combination with studies that investigate a variant of the tracking task called multiple-identity tracking. In the classic MOT task, all the items are identical, and consequently, item properties cannot be used to distinguish targets from distractors during tracking. Pylyshyn’s spatial indices (formerly FINSTs) are simply pointer variables that provide information about the location of items, not their properties. Location is all that is required in a classic tracking task because objects do not vary in their features. In contrast, multiple-identity tracking requires monitoring item positions when items differ from one another during item movement. Multiple-identity tracking is a more natural task insofar as we are rarely faced with large numbers of identical items, some of which must be tracked, but multiple-identity tracking may require a representation that merges item positions and properties, a representation more similar to an object file. Horowitz et al. (2007) compared multiple-identity tracking performance with standard MOT and found that multiple-identity tracking was always significantly worse than MOT, although, with practice with specific items, multiple-identity tracking improves (Pinto, Howe, Cohen, & Horowitz, 2010). The authors suggested that there may be two systems involved: one for location information and the other for identity. Botterill, Allen, and McGeorge (2011) found a similar dissociation between location and identity performance.

MOT and multiple-identity tracking are related tasks, but when it comes to working memory, they may vary in interesting ways. Oksama and Hyönä (2004, Experiment 2) carried out an individual-differences study on multiple-identity tracking. Participants were required to monitor the positions of a set of diverse objects as they moved. After a mask, participants were required to report the identity of the object at a probed location. In that study, the Corsi block task predicted individual differences in tracking performance, as in the present study (r = .40, Oksama & Hyönä, 2004, Table 8, and r = .45, respectively; p < .001 for both). Thus, tests of spatial memory predict performance in both multiple-identity tracking and MOT. Moreover, in the multiple-identity tracking study, the correlation between Corsi blocks and the OSPAN was almost identical to that observed in the present study (r = .25, Oksama & Hyönä, 2004, and r = .24, n = 134, respectively). Nonetheless, the correlations between measures of the executive and tracking performance were more notable in multiple-identity tracking than in MOT (e.g., raw correlations with the OSPAN: Oksama & Hyönä, 2004, r = .28, p < .001, as compared with r = .19 for the present study).
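One conventional way to compare correlations obtained in two independent samples is Fisher's r-to-z test. The sketch below is illustrative only: no such test is reported in the text, the function name is hypothetical, and the sample sizes in the usage lines are placeholders, because the relevant sample sizes for the two OSPAN correlations are not given in this section.

```python
import math


def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher r-to-z test for the difference between two independent correlations.

    Returns the z statistic and a two-tailed p value under the normal approximation.
    """
    z1 = math.atanh(r1)  # Fisher transformation of each correlation
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_tailed


# HYPOTHETICAL sample sizes, chosen only to show the call; substitute the
# actual sample sizes from each study before interpreting the result.
hypothetical_n_identity_study = 200
hypothetical_n_present_study = 100

z, p = compare_independent_correlations(
    0.28, hypothetical_n_identity_study,
    0.19, hypothetical_n_present_study,
)
print(f"z = {z:.2f}, two-tailed p = {p:.3f}")
```

Because the standard error depends on both sample sizes, whether a difference such as .28 versus .19 is statistically reliable cannot be judged from the coefficients alone.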

The discrepancy between these two studies suggests that multiple-identity tracking may impose some additional demands that require the resources of executive working memory as measured by tests such as the OSPAN. The need to pair an object identity (the object’s properties) with an item while tracking its movements may have costs. It has been noted that although MOT performance improves when targets and distractors have different colors during item motion, this advantage is nullified by a task where participants are required to hold item colors in working memory while tracking (Makovski & Jiang, 2009b). The authors concluded that the benefits of heterogeneity may be produced by visual working memory processes operating at the same time as tracking processes. If that is true, coordinating the demands of these two operations might require the executive. There may be even more demands when participants have to combine features to define object identity. Makovski and Jiang (2009a) found that although there was a benefit if the targets and distractors in a tracking task differed in their features, there was none when they varied in terms of a combination of features, and in fact, performance was slightly worse than it was when the items were homogeneous. Thus, keeping track of the distinct properties of individual items while tracking may require additional operations, the performance of which may be predicted by tests that measure individual differences in executive function (see also Oksama & Hyönä, 2008). Moreover, when the multiple-identity tracking task was made more difficult by increasing the speed at which objects moved to 13.7°/s or more, there were trade-offs between location and identity tracking performance, suggesting that the two tasks were sharing a common resource (see Cohen, Pinto, Howe, & Horowitz, 2011), and this resource may be related to the executive.
At the very least, the discrepancy in the predictors of individual performance in multiple-identity tracking and MOT studies suggests that researchers should be cautious about drawing conclusions about classic MOT on the basis of multiple-identity tracking performance.

In conclusion, the present study suggests that both visuospatial and spatial working memory tests predict some of the individual differences in performance in the classic MOT task, even when the tests do not require linking object features to locations. In contrast, although nonspatial measures of the executive, such as the OSPAN, predict the magnitude of the attentional blink, top-down constraint of the attentional focus in the Eriksen flanker task, multiple-identity tracking, and both selective enumeration and enumeration of large numbers of items (Arnell et al., 2010; Heitz & Engle, 2007; Oksama & Hyönä, 2004, Experiment 2; Tuholski et al., 2001, respectively), they do not seem to be as important for predicting performance in the MOT task. It is interesting to note that although the OSPAN is not related to the rate at which visual displays can be searched (Kane et al., 2006), it is related to attentional tasks that require responses based on the item properties that define the targets’ identities. Specifically, OSPAN is related to attentional performance when participants are required to report the identity of a second target in a sequence in the attentional blink, report the identity of the central target in the Eriksen flanker task, and track the positions of items with specific identities in multiple-identity tracking. Property-based identities can easily be associated with verbal labels. Selective enumeration requires remembering the properties of the items to be enumerated, and in general, enumeration beyond the subitizing range involves holding verbal number names in memory for purposes of mental addition (Trick, 2005), and the OSPAN predicts these two aspects of enumeration (Tuholski et al., 2001). In contrast, during MOT, items cannot be distinguished by their properties, because all of the items are the same, and so this task is more strictly spatial.
The present study showed that there are substantial individual differences in MOT performance among young adults even when the items are moving relatively slowly (0–10.79°/s, with speeds changing randomly every 11.7 ms), but these differences do not seem to be related to executive function as measured by tests such as the OSPAN and RSPAN. If this type of executive working memory plays a role in predicting tracking performance, it is a small one (about as large as that of passive verbal short-term memory).