Category-specific learned attentional bias to object parts

Chua, Kao-Wei; Gauthier, Isabel

doi:10.3758/s13414-015-1040-0

Published: 29 December 2015

Volume 78, pages 44–51, (2016)
Attention, Perception, & Psychophysics

Abstract

Humans can selectively attend to information in visual scenes. Learning from previous experiences plays a role in how visual attention is subsequently deployed. For example, visual search times are faster in areas that are statistically more likely to contain a target (Jiang and Swallow in Cognition, 126(3), 378–390, 2013). Here, we examined whether similar attentional biases can be created for different locations on complex objects as a function of their category, based on a history of these locations containing a target. Subjects performed a visual search task in the context of novel objects called Greebles. The target appeared in one half (e.g., top) of the Greebles 89 % of the time and in the other half (e.g., bottom) 11 % of the time. We found a reaction time advantage when the target was located in a “target-rich” region, even after target location probabilities were equated. This indicates that attentional biases can be associated not only with regions of space but also with specific object features, or at least with locations in an object-based frame of reference.

The world contains a great deal of visual information that must be selectively filtered for further processing. Theories of attention have often presented a dichotomy between top-down goals (Folk et al., 1992) and bottom-up perceptual salience (Theeuwes, 1991, 1994). However, the deployment of attention can also be affected by previous experience and response histories (Awh, Belopolsky, & Theeuwes, 2012), such that we may learn to attend to bottom-up information that consistently facilitates top-down goals. For instance, implicit learning of regularities in the structure of scenes guides spatial attention during visual search (Chun & Jiang, 1998). This ability to abstract regularities from the environment, or statistical learning, can influence how attention is deployed (Saffran, Aslin, & Newport, 1996; Fiser & Aslin, 2001; Zhao, Al-Aidroos, & Turk-Browne, 2013).

One form of statistical learning is probabilistic cuing, wherein attention is implicitly drawn to areas of the visual field that have a higher probability of containing behaviorally relevant information. Geng and Behrmann (2005) used probability cuing in a task in which a target object could appear in one of four locations. The target appeared in one location on 75 % of trials and in the remaining three locations on a combined 25 % of trials. Subjects were faster and more accurate at detecting targets in the high-probability location than in the low-probability locations. In addition, interference from distractors was reduced in the high-probability location. Other studies using probabilistic cuing have demonstrated that these attentional biases persist for several days and remain for several hundred trials after the probabilities are equalized (Jiang, Swallow, Rosenbaum, & Herzig, 2013b). In these studies, the spatial bias was acquired rapidly in a short training session, indicating that probabilistic cuing is a powerful way to direct spatial attention to frequently selected locations.

One question is whether these spatial attentional biases are framed relative to the viewer or the external environment. Viewer-centered frames are low in computational demands but are relatively unstable because they are sensitive to changes in movement and viewpoint (Marr & Nishihara, 1978). In contrast, environment-centered frames are more robust to movement but are more computationally expensive. Jiang, Swallow, and Rosenbaum (2013a) found that after acquiring a bias to attend to one quadrant of space, subjects who were reseated so that they saw the screen from another viewpoint switched their bias to a previously sparse quadrant, demonstrating a viewer-centered frame of reference. This is consistent with other work showing that contextual cuing is also viewer-centered (K. P. Chua & Chun, 2003).

Since attentional biases acquired during probabilistic cuing are long lasting and robust to statistical changes, a spatial bias acquired in one task could generalize to another task. However, recent results suggest that such transfer may not occur: A bias to attend to a region in space induced by probabilistic cuing did not transfer to a foraging task (Jiang, Swallow, Won, Cistera, & Rosenbaum, 2015). It is possible that spatial biases do not transfer because space must be shared for all manner of tasks (e.g., attending to the bottom right is relevant in typing, cooking, and opening doors). Therefore, generalized spatial biases may be counterproductive.

Although spatial aspects of different tasks may be uncorrelated, different tasks that use the same objects could depend on a similar set of features. Attention may be drawn to certain object features (e.g., the eyes of a face) to discriminate them from other objects in that category but also to get information about eye gaze or emotional expression. To the extent that a category is associated with several tasks for which the same spatial biases are helpful, or at least not incompatible, a category-specific but task-general attentional bias could develop.

One open question is whether learned attentional biases can occur within objects. When categorizing complex objects, information and features that are diagnostic can be prioritized. For example, when learning to categorize different types of fish that varied in the shape of the tail or mouth, features useful for categorization are selectively attended (Sigala & Logothetis, 2002), resulting in a “stretching” of the relevant dimension in a multidimensional category space that increases perceptual discrimination along that dimension. A recent study using similar stimuli cued attention near different parts of the fish and found a reaction time advantage when the cue was spatially closer to features crucial for identification (Baruch, Kimchi, & Goldsmith, 2014), indicating that spatial attention can be drawn toward diagnostic object features. Likewise, in previous work, we found that attentional biases could develop to specific parts of faces (K. W. Chua, Richler, & Gauthier, 2014) and novel objects (K. W. Chua, Richler, & Gauthier, 2015) due to their history of being useful for individuation. In these studies, subjects were trained to individuate faces or Greebles wherein one half contained most of the information diagnostic for identification. When later asked to selectively attend to just part of those objects, subjects could not ignore parts that were previously diagnostic.

Here, we ask if spatial attention can be learned in a category-specific manner (e.g., learning to attend to the top of an object) without requiring object individuation. In the fish experiments mentioned previously (Baruch et al., 2014), attention was drawn to features crucial for object recognition (see Rehder & Hoffman, 2005a, b). In the Greeble experiments, a history of finding information relevant to individuation in an object part made it harder to ignore (K. W. Chua et al., 2015). Here, we ask if learned attentional biases to object parts can occur when the object is not relevant for the task, whether these biases generalize to other objects of the same category, and whether they persist once probabilities are equated, as in viewer-centered probability cuing.

To investigate these questions, we used probability cuing with two Greeble categories. Subjects had to detect a valid “T” among distractors and were asked to indicate what direction the head of the T was pointing. Critically, the target appeared in the top half of one Greeble category 89 % of the time and the bottom half of the other Greeble category 89 % of the time. In the second half of the experiment, we equated target location probabilities for all object halves and examined whether target detection remained faster in object regions with a history of high target probability.

Experiment 1

Method

Subjects

Twenty-one subjects participated in Experiment 1 (8 male, 13 female, mean age = 20.1 years). Sample size was determined based on a power analysis using the effect size from a previous probabilistic cuing study (Cohen’s d = 1.6; Jiang et al., 2013a), aiming for power greater than 0.90 with alpha = 0.05 (two-tailed). Subjects received class credit. The study was approved by the Vanderbilt University IRB.

Stimuli

Stimuli were objects from two categories of asymmetrical Greebles (Gauthier & Tarr, 1997; K. W. Chua et al., 2015) called Ploks and Glips. Ploks and Glips have distinct body shapes, textures, and parts that point in different directions (up vs. down; see Fig. 1). Twenty unique Greebles from each category were used. All Greebles were presented in grayscale and tilted 40° clockwise. Greeble images were 400 × 400 pixels.

Procedure

On each trial, subjects saw a single Greeble. A valid sideways “T” and a slightly offset “T” were superimposed on the top and bottom halves of the Greeble after 0.5 seconds. This 0.5-second latency period gave subjects time to scan the features of the Greeble before the target appeared. Both “T” shapes were displayed in a darker gray than the Greeble. The task was to press the left or right arrow key to indicate the direction the head of the valid “T” was pointing. A beeping noise was played for incorrect answers. There were 1,152 trials with four blocks of 288 trials.

Critically, the valid “T” appeared in one half of one Greeble category 89 % of the time (e.g., the top of Glips) and in the other half of the other Greeble category (e.g., the bottom of Ploks) 89 % of the time (part assignment counterbalanced). For the first half of the experiment (576 trials; blocks 1 and 2), subjects saw Greebles with this probability asymmetry. In the second half of the experiment (blocks 3 and 4), the target probabilities were equated to assess if the attentional bias would persist. Target-rich locations were defined as the areas in each Greeble where the target was most likely to appear (89 %), and sparse locations were defined as the areas where the target appeared less often (11 %). Importantly, richness is defined through a combination of category membership and object-specific location (e.g., the tops of Ploks and the bottom half of Glips, wherever they appear on screen), so it is unlikely that there was a bias based on screen position.
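The probability manipulation described above can be sketched as a trial schedule. This is a minimal illustration, not the authors' experiment code: the function name `build_schedule`, the equal split of trials between the two categories, and the use of an exact 8:1 ratio (8/9, about 89 %) are assumptions made so that the counts come out as whole trials.

```python
import random

def build_schedule(trials_per_block=288, n_blocks=4, biased_blocks=2,
                   p_rich=8 / 9, rich_half=None, seed=0):
    """Return a list of trial dicts for the whole session.

    Assumptions (not stated in the source): categories are split evenly
    within a block, and "89 %" is implemented as an exact 8:1 ratio.
    """
    # Example counterbalancing condition: tops of Glips, bottoms of Ploks.
    rich_half = rich_half or {"Glip": "top", "Plok": "bottom"}
    other = {"top": "bottom", "bottom": "top"}
    rng = random.Random(seed)
    per_category = trials_per_block // len(rich_half)
    schedule = []
    for block in range(1, n_blocks + 1):
        # Biased probabilities in the first blocks, equated (50/50) afterwards.
        p = p_rich if block <= biased_blocks else 0.5
        n_rich = round(per_category * p)
        trials = []
        for category, half in rich_half.items():
            halves = [half] * n_rich + [other[half]] * (per_category - n_rich)
            # "rich" marks the half that is (or previously was) target-rich.
            trials += [{"block": block, "category": category,
                        "target_half": h, "rich": h == half} for h in halves]
        rng.shuffle(trials)
        schedule += trials
    return schedule
```

Under these assumptions, each biased block contains 256 target-rich and 32 sparse trials, and each equated block contains 144 of each, which allows the "rich" label in Blocks 3 and 4 to track prior history rather than current probability.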

The Greeble could appear in one of nine locations on a 3 × 3 grid that spanned 1,200 × 1,200 pixels in the center of the screen. Positions were randomized on each trial to minimize any attentional bias due to screen position. Note that there is an overall screen-based bias because targets were on average higher (or lower) on the screen for one category. However, the target distributions for the two categories overlapped greatly, and no location had greater probability when category was not taken into consideration.
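As a rough illustration of the screen layout, the nine possible Greeble positions can be computed as the cell centers of a 3 × 3 grid. The screen resolution used here is hypothetical (the source does not report it), and `grid_centers` is an illustrative name rather than part of the original experiment code.

```python
def grid_centers(span=1200, n=3, screen=(1920, 1200)):
    """Centers (x, y) in pixels of an n x n grid centered on the screen.

    `screen` is an assumed resolution; only the 1,200-pixel span and the
    3 x 3 layout come from the text.
    """
    cell = span / n                      # 400 px per cell for the default values
    cx, cy = screen[0] / 2, screen[1] / 2
    offsets = [(i - (n - 1) / 2) * cell for i in range(n)]
    # Row-major order: top-left first, bottom-right last.
    return [(cx + dx, cy + dy) for dy in offsets for dx in offsets]
```

With the default values each cell is 400 pixels wide, matching the 400 × 400 pixel Greeble images, so one position per trial could be drawn with `random.choice(grid_centers())`.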

Results

Subjects were as accurate when the target was in the rich half (94.4 %) as when it was in the sparse half (93.3 %), p = .33, η_p² = 0.04. Our analyses therefore focus on mean correct response times. Trials with RTs faster than 200 ms or slower than 2,000 ms were excluded (0.006 % of trials).
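The trimming rule can be expressed in a few lines. This is a sketch under assumptions: the trial format (dicts with `rt` in milliseconds and a boolean `correct`) and the function name `mean_correct_rt` are hypothetical, and RTs exactly at the 200 ms or 2,000 ms bounds are kept because the text excludes only faster or slower responses.

```python
from statistics import mean

def mean_correct_rt(trials, lo=200, hi=2000):
    """Mean RT (ms) over correct trials within [lo, hi]; None if none remain."""
    kept = [t["rt"] for t in trials
            if t["correct"] and lo <= t["rt"] <= hi]
    return mean(kept) if kept else None

# Hypothetical data: one anticipation, one error, one slow outlier.
example = [
    {"rt": 150, "correct": True},    # excluded: faster than 200 ms
    {"rt": 500, "correct": True},
    {"rt": 700, "correct": True},
    {"rt": 600, "correct": False},   # excluded: incorrect response
    {"rt": 2500, "correct": True},   # excluded: slower than 2,000 ms
]
```

Calling `mean_correct_rt(example)` averages only the two retained trials.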

We were first interested in whether there was an interaction between target richness (or prior history of target richness) and block number. To that end, we ran an ANOVA on reaction time with target richness and block as factors, but there was no interaction between block and target richness, F(3, 60) = 0.188, p = .90, η_p² = 0.009. Thus, we decided to look at the effects of the probability manipulation in each individual block. We ran one-way ANOVAs on reaction time in each of the four blocks with target richness (or prior history of target richness) as a factor (see Fig. 2). There was no probability cuing advantage in Block 1, F(1, 21) = 2.52, p = .13, η_p² = 0.11, but subjects became faster in the target-rich half starting in Block 2, F(1, 21) = 15.81, p < .001, η_p² = 0.44. Most critically, this bias was still significant in Block 3, F(1, 21) = 7.53, p = .01, η_p² = 0.27, but extinguished in Block 4, F(1, 21) = 1.22, p = .28, η_p² = 0.06.
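For readers who want to relate the reported F ratios to the effect sizes, partial eta squared can be recovered from an F statistic and its degrees of freedom using the standard identity η_p² = (F · df_effect) / (F · df_effect + df_error). The helper below is an illustrative sketch, not part of the original analysis, and `partial_eta_squared` is a hypothetical name.

```python
def partial_eta_squared(f, df_effect, df_error):
    """Partial eta squared from an F ratio and its degrees of freedom."""
    return (f * df_effect) / (f * df_effect + df_error)

# Block x richness interaction reported above: F(3, 60) = 0.188
interaction_effect = partial_eta_squared(0.188, 3, 60)
```

Rounded to three decimals, `interaction_effect` reproduces the reported value of 0.009 for the interaction.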

Experiment 1 Discussion

Probabilistic cuing is a powerful means of directing attention to areas of space (Jiang et al., 2013a). Most studies to date have focused on these spatial biases in an environment-based frame of reference. Here, attention was drawn to target-rich parts of complex objects. This bias started in the second block and persisted into the third block, providing evidence that the attentional bias lasted several hundred trials after the probabilities were equated. However, by the fourth block, there was no evidence of any attentional bias. These results are similar to Jiang et al. (2013b), who found that a bias to attend to rich quadrants of space lasted for a few hundred trials before being extinguished.

Because we used 20 Greebles from each category, it seems reasonable to assume that the effect was associated with the categories and not specific objects. To provide a more direct test of this interpretation, we conducted Experiment 2, which differed from Experiment 1 in three ways. First, we used different sets of objects during the first two blocks where probabilities were asymmetric and the last two blocks where probabilities were equated. If the advantage for the target-rich half persists even after exemplars are changed, we will have evidence that the learned attentional bias is associated with features that define the two Greeble categories. Second, there was sufficient variability among subjects in Experiment 1 that we wondered if this was due to variability in subjects noticing that there were two discrete categories of objects. Could we maximize learning by ensuring that subjects knew there were two Greeble categories? Would this produce biases that persist until Block 4? To this end, we included a short categorization task before the visual search task. Finally, to encourage learning to begin as early as possible, we encouraged accuracy in the visual search task using an aversive timeout following incorrect answers.