Introduction

The visual world is immensely complex—there are far too many objects and features of objects at any moment for an observer to perceive everything impinging on the retina. It is therefore unavoidable that, at any given moment, the observer must select a subset of the visual world for processing. What is needed, in a word, is attention. The visual system must rapidly and efficiently direct attention to a location or feature of the highly complex environment to determine which stimulus is to be further processed, and which should be ignored. Moreover, this selection must vary from moment to moment. It would clearly be problematic if selection did not change over time, because this could lead to the system repeatedly selecting the same focus, rather than sampling the visual array in a more informative way. Thus, theorists assume that there must be some mechanism or inhibitory component to discourage attention from re-orienting back to a previously attended location or feature.

In their seminal paper, Posner and Cohen (1984) demonstrated just such a mechanism, subsequently named "Inhibition of Return" (IOR) by Posner, Rafal, Choate, and Vaughan (1985). In these experiments, subjects did a simple detection task, hitting a response button upon detecting the appearance of a small target shape that could appear in one of three boxes on a screen in front of them. The critical manipulation was a brief brightening of one of the boxes. This "cue" could appear at various times before the target, and the target could either appear in the box that had been cued or in a different box. Posner and Cohen found that targets appearing in a cued box were detected more quickly than targets in an uncued box if the cue−target interval was less than 300 ms. However, at longer delays, the pattern reversed, with cued locations actually now yielding slower target detection than uncued ones. Posner and Cohen suggested that the slower responses at the cued location were due to a mechanism that maximizes the efficiency of visual search of novel locations by ensuring that attention is inhibited from returning to previously examined places.

Klein (1988) provided an elegant demonstration that IOR is naturally linked to visual search, taking advantage of Treisman and Gelade's (1980) distinction between attention-demanding searches for targets defined by a conjunction of features and more automatic searches for targets defined by a single feature. Klein presented probes at locations that had recently either been part of a conjunction-based search or a feature-based search. Probe detection was slowed at the locations that had been searched in the conjunction tasks (i.e., locations that had been attended), but there was no slowing at locations that had been searched in the feature task. Klein concluded that IOR serves as a "foraging facilitator" to maximize search efficiency (see also Müller & Von Mühlenen, 2000; Takeda & Yagi, 2000; for a very recent review, see Wang & Klein, 2010).

In the studies discussed so far, the assumption has been that the push toward novelty is defined in spatial terms, with a particular location being subject to inhibition because it has already been the focus of attention. Given that various levels of the visual system are organized in spatial coordinates, it is plausible that attention and inhibitory effects might be defined spatially. There has been considerable debate over the relative importance of location-based versus feature-based visual attention. For example, Bundesen (1990; see also Laarni, 1999; Laarni, Koski, & Nyman, 1996) proposed that location is just one type of feature, similar to color, shape and orientation.

However, this view has not gone unchallenged. There is some evidence to suggest that stimulus selection via spatial location is primary (Laarni, 1999; Laarni, et al., 1996; Schneider, 1995). In fact, Kubovy (1981) has called location the "indispensable attribute" for vision, and location clearly plays a primary role in classic models of search (e.g., Treisman & Gelade, 1980; Wolfe, Cave, & Franzel, 1989), in which objects are specified by collections of features that co-occur in internal “maps” that are spatially organized. Nonetheless, there is evidence that the bias toward novelty is not limited to disfavoring a recently attended location. Tipper and his colleagues (e.g., Tipper, Driver, & Weaver, 1991; Tipper, Weaver, Jerreat, & Burak, 1994) have shown that, if an attended object moves, some of the inhibition attaches to the object in its new position. Thus, the system favors both new locations and new objects, consistent with the "foraging facilitation" conception.

Given that both locations and objects can be subject to inhibition, it is reasonable to ask whether the preference for novelty is quite general: does the system inhibit processing of any stimulus aspect that is repeated? This question is based on the idea that an object can be thought of as having a number of visual features, one of which is its location. The question is whether a target that repeats a non-spatial feature is processed more poorly than a target without such feature repetition. Empirically, the answer to this question has been surprisingly elusive. Hu, Samuel, and Chan (2010) recently reviewed the relevant literature and concluded that, although there were hints of such detection costs, the reported effects (Fox & de Fockert, 2001; Kwak & Egeth, 1992; Law, Pratt, & Abrams, 1995; Monder & Amirault, 1998; Pratt & Castel, 2001; Riggio, Patteri, & Umiltà, 2004; Tanaka & Shimojo, 1996; Taylor & Klein, 1998) were much smaller than location-based IOR effects, and they did not seem to follow the typical time course for "classic" IOR.

In contrast to these weak effects, Hu et al. demonstrated robust detection time costs for targets that repeated a non-spatial feature. The key to finding these effects seems to be the use of visual displays that are richer than those typically used in prior studies. Using a version of Samuel and Weiner's (2001) cuing paradigm, which has more objects in more locations than in most other studies, Hu et al. found that, for both color and shape, attribute repetition produced robust inhibitory effects that followed a time course similar to that for location-based IOR. An important constraint was that the inhibitory effect only occurred when the target shared both the feature (i.e., color or shape) and location with the cue; repetition of either non-spatial attribute with a change in location did not produce inhibition. Kwak and Egeth (1992) had observed a similar constraint, though with much smaller effects. This pattern of repetition costs being applied to non-spatial features, but with an overarching spatial constraint, is difficult to accommodate within the dominant view of IOR as a special mechanism to enhance search by biasing the system against resampling a recently attended location.

Hu et al. (2010) suggested that these results are more consistent with two recent alternative hypotheses about the mechanism driving inhibition of return: Dukewich (2009) has argued that the effect is better thought of as being a consequence of the very general mechanism of habituation, and Lupiáñez (2010) has outlined a three-factor model (see below) that has the flexibility needed to account for the pattern. The current study includes two experiments that provide additional empirical constraints that help to contrast these two models with the more traditional view, and importantly, also help to distinguish between the two new theories. The new experiments use the same relatively complex displays, but rather than simply detecting targets, the participants in the current study were required to discriminate whether the target was red or blue (Experiment 1), or a square or a circle (Experiment 2). The empirical questions are (1) will location-based inhibitory effects be found; (2) will attribute repetition (color; shape) between cue and target produce inhibitory costs, as it did for detection; and (3) if so, will those costs be limited to targets that are presented in the same location as the cue, as was the case for detection?

To the extent that IOR serves the purpose that it has usually been assumed to serve, it seems natural to expect similar effects in discrimination tasks; if the system is biased against returning attention to a recently attended location, then making a discrimination judgment in that location should be impeded. However, while IOR is found readily in simple detection tasks, it is much less robust for discrimination. In fact, for the first decade after Posner and Cohen's (1984) initial report, the prevailing view was that IOR did not occur for discrimination. Over the years, there have been a substantial number of studies that looked for inhibition of return using various types of discrimination tasks (see Taylor & Donnelly, 2002, for a very thoughtful analysis of the different task types), and it eventually became clear that IOR does occur under some circumstances (e.g., Pratt, 1995). The effect is much less robust than for detection, and an analysis by Lupiáñez, Milán, Tornay, Madrid and Tiudela (1997) suggests that the inhibitory effect only emerges with longer SOAs, in perceptually difficult discrimination tasks (see Lupiáñez, Ruz, Funes, & Milliken, 2007, for a recent discussion of the necessary conditions). This fragility raises questions about the traditional way of interpreting IOR, because, as noted above, this view seems to predict similar costs for detection and for discrimination.

The examinations of IOR in discrimination tasks have almost exclusively focused on whether discrimination judgments are impaired when the target stimulus appears in a recently cued location: location-based IOR. However, there have been a handful of studies that looked for such costs when the repetition was of a non-spatial feature rather than of location. For example, Tanaka and Shimojo (1996), Pratt and Castel (2001), and Francis and Milliken (2003) each examined discrimination performance when the target stimulus did or did not involve repetition of color. Given the general weakness of inhibitory effects in discrimination tests, and the weakness of non-spatial feature repetition even in detection studies (prior to that of Hu et al., 2010), it should not be surprising that repetition effects in discrimination tests for non-spatial features have been quite inconsistent. For example, Francis and Milliken (2003) did find a repetition cost (inhibition), but with a time course that did not really match typical IOR; Pratt and Castel (2001) found evidence for both facilitation and inhibition; and Tanaka and Shimojo (1996) observed facilitation. Tests based on repetition of other non-spatial features have produced similarly scattered results (e.g., Pratt, Kingstone, & Khoe, 1997, for shape: inhibition; Danziger & Kingstone, 1999, for orientation/shape: facilitation; Francis & Milliken, 2003, for line length: inhibition). A further complication in this literature is the absence of corresponding detection experiments that are matched to the discrimination tests. Without knowing what the time course is of any facilitation and/or inhibition for detection, there is no basis for comparison of the discrimination data. The current study has two critical advantages. First, we have established very robust attribute repetition detection costs for exactly the same stimuli as those used here (Hu et al., 2010). Second, because the displays are identical to those used in the detection experiments (only the instructions are changed), this study provides a direct comparison between detection and discrimination, allowing us to see any differences in how any facilitation or inhibition is distributed over time and space.

As noted above, our theoretical goal is to test among three different accounts of IOR. The mainstream view has been that a salient visual event (an “exogenous cue”) attracts attention to the event’s location. Initially, this enhances perceptual processing at this location, consistent with the typical (though not universal) observation of facilitation for a brief time. Then, attention is withdrawn and inhibited from returning, leading to impaired performance at that location (IOR). Actually, there are differing variants of the standard view, with one version requiring this successive facilitation-then-inhibition, and another version instead assuming that both processes begin immediately but with different strengths over time (leading to the observation of initial facilitation followed by inhibition). Regardless of which version of the two-factor model is considered, without further modification, it seems to predict similar costs for detection and discrimination tasks. Given the more restricted conditions under which IOR has been found in discrimination tasks, this traditional model has already been questioned by some researchers. The current study provides a direct comparison of detection to discrimination with identical stimuli, and to the extent that results differ, this would be further evidence against the “standard” model.

In contrast to the traditional view, Dukewich (2009) has reconceptualized inhibition of return as the habituation of the orienting response. Thus, rather than invoking a special process specifically designed to facilitate search, Dukewich argues that the quite general process of habituation (e.g., Sokolov, 1963) can produce the observed results. The presence of a similar preceding event (the cue) leads to a weakened orienting response to the target. In general, habituation is strongest for the exact repetition of a stimulus, with weaker habituation as the stimulus differs across presentations. To account for the facilitation that is often seen initially, Dukewich assumes a typical priming idea, with cue and target activation combining if they are similar and occur close together spatially and temporally. This model is consistent with most of the IOR literature, and it also provides a reasonable account of Hu et al.’s (2010) finding that repetition of non-spatial features can produce detection costs (as such repetition should presumably lead to habituation as well). It shares with the traditional view the notion of two relatively simple processes, one of which produces facilitation (priming) and the second of which produces a reduction in system responsiveness (habituation of orientation).

An apparent advantage of this model over the traditional view is the natural role that similarity plays in it: the strength of both priming and habituation should depend on the similarity of the cue and target; Hu et al.’s (2010) detection data reflected such an effect of similarity. The habituation-based model does potentially suffer from the same lack of a clear account of the detection−discrimination difference as the traditional model, perhaps because it seems to be intended as an attempt to bring IOR into a more general and more biological perspective, rather than as a detailed explanation of time-dependent cognitive processing. More generally, to the extent that performance turns out to be task-dependent (e.g., detection vs discrimination) and stimulus property-dependent (e.g., location vs non-spatial feature), the habituation model does not appear to offer a basis for such dependencies; it certainly does not predict them. To the extent that performance is relatively consistent across testing variables (other than similarity), the habituation model would be supported.

A second recent alternative model has been offered by Lupiáñez (2010). As with the habituation approach, this model assumes no domain-specific representations and processes. Both cues and targets are assumed to be represented in object files, representations that Kahneman and his colleagues have suggested play a fundamental role in perception and attention. In this view (e.g., Kahneman & Treisman, 1984; Kahneman, Treisman, & Gibbs, 1992), an object file is a midlevel visual representation that “sticks” to an object over time on the basis of spatiotemporal properties; the object file stores (and updates) information about the object’s properties. Given object files for the cue and for the target, Lupiáñez suggests that three different factors determine the observed performance in a given experimental situation. Two of these factors, spatial selection and spatial orienting, facilitate performance, but a third factor (a detection cost) impairs it. The two spatial factors reflect the ability to focus attention on the correct location (and thus object file). The detection cost is driven by the likelihood that the target may be absorbed into the object file of the cue. If the target is the same as (or very similar to) the cue, in the same location, this absorption is relatively likely, and to the extent that this occurs, it is difficult for the observer to detect the target itself. Thus, as with the model proposed by Dukewich (2009), similarity plays an important role, with similarity between cue and target increasing the likelihood of a detection cost.

Critically, each of the three factors has a different hypothesized time course. For example, as one would expect, the likelihood of a target getting absorbed into the cue’s object file is relatively high when the two are close in time, and drops off with greater temporal separation. Similarly, the spatial cueing benefits are also strongest at short cue–target delays. However, based on the literature, Lupiáñez posits different shapes to the three curves, and it is possible to choose shapes that generally do a very good job of matching the results in the literature. A key strength of the model is that it allows differential predictions for detection versus discrimination tests: Detection experiments should be most affected by the detection cost factor’s time course, whereas discrimination tasks that presumably benefit more from properly focused attention should be driven more by the time course for the spatial factors. This feature of the model, while coming with a cost in parsimony, gives it an advantage over the other two models in explaining why discrimination tasks produce different results than those for detection. Lupiáñez et al. (2007) used this analysis to argue that discrimination tasks will generally show a different balance of facilitation and inhibition than detection tasks, with inhibitory effects generally being weaker and appearing later in discrimination tasks than in detection tasks. In the current study, the flexibility of this model would be supported to the extent that feature repetition and location repetition have differing effects in different testing environments; such differences can be generated when different tasks are affected to different degrees by the time courses of the three processes in the model. In this respect, the Lupiáñez (2010) model differs from the simpler two-process nature of the standard view and of the Dukewich (2009) alternative; it is a model that is more purposefully intended to account for situation-specific and stimulus-specific performance.

Present study

Our previous study (Hu et al., 2010) provided the first clear demonstration that repetition of non-spatial attributes (color and shape) can produce detection delays, with the observed time course for this effect closely matching the pattern previously found for location-based inhibition of return. Critically, although the results supported a role for such non-spatial features, location remained a dominant feature: non-spatial feature repetition only produced inhibitory effects when the cue and target shared location. We noted in that study that these results are not easily accommodated by the traditional account of IOR, but are generally consistent with the recent suggestions of both Dukewich (2009) and Lupiáñez (2010). Both of the more recent theories naturally accommodate results that depend on the similarity of the cue and target, which was the central finding of Hu et al.'s detection experiments. The current study is intended to provide further evidence to help select among the three theory types.

As we noted above, the basic empirical question is whether the bias against orienting toward a previously attended non-spatial attribute obtains in a discrimination task. In Hu et al.'s (2010) detection experiments, the repetition cost was limited to targets that shared location with their cues. The results for the discrimination experiments in the current study, together with the recent detection data, speak to the issue of how dominant location is for attention and perception: is location the “indispensable attribute” that Kubovy (1981) has suggested? More specifically, the current experiments continue our examination of the relationship between location-based inhibition and feature-based repetition costs. Are these fundamentally the same phenomenon, driven by the same underlying mechanisms, or are these different effects with different generators? To address this question, in the General discussion, we will discuss whether the two effects covary, or are instead independent.

Experiment 1

As noted, a key aspect of the current work is its use of exactly the same stimulus displays that produced robust feature (color; shape) repetition costs in a detection task (Hu et al., 2010). These displays were based on those used by Samuel and Weiner (2001; Samuel & Kat, 2003). Figure 1 illustrates the sequence of events that comprises a trial in this paradigm. After a fixation cross is presented, the initial display consists of eight medium-sized white circles arranged in an imaginary circle around a central fixation cross. In half of the white circles, there are two smaller objects; empty circles alternate with circles that contain the small objects. The cue then appears in one of the four empty circles (in Fig. 1, the cue is a small red circle near the top of the third frame). Finally, the target object appears (in Fig. 1, a second small red circle), either within the same circle as the cue (a "same" location trial) (for related discussion, see Wang & Klein, 2010), or within one of the other originally empty circles (a "different" location trial). Location-based IOR (detection) effects using these somewhat complex displays are about twice as big as those found with simpler displays (Hu et al., 2010; Samuel & Kat, 2003; Samuel & Weiner, 2001), and these displays generated significant non-spatial attribute repetition effects of a similar size (Hu et al., 2010). The motivation for using more complicated displays, including the presence of “cluttered” filler circles, was that the “classic” IOR displays are much simpler than any normal scene, and that such reductionist stimuli might represent a degenerate case for the perceptual system. Although the displays used here are still much simpler than normal scenes, the more robust IOR effects that these displays have produced in the three studies that used them do suggest that the exceptionally sparse displays in most IOR studies may not present the perceptual system with as rich an array as it is intended to work on.

Fig. 1

Stimulus sequences for trials in Experiments 1 and 2. For Experiment 1: the cue is red and the target is the same color. For Experiment 2: the cue is a circle and the target is the same shape. (Note: not drawn exactly to scale; in the actual displays, each frame was a 480 × 640 pixel display)

As shown in Fig. 1 (top), the critical non-spatial attribute in Experiment 1 is color: the target either matches the cue in color (red–red, or blue–blue), or mismatches (red–blue, or blue–red). The color repetition/non-repetition was manipulated orthogonally with location repetition—match or mismatch cue–target colors could occur on either a same-location trial, or a different-location trial. The task is a simple two-alternative forced choice: subjects pushed one response button if the target was red, and another button if the target was blue. Thus, they had to discriminate the color of the target, and the central question is whether such a discrimination is impaired when the target's color is a repetition of the cue's; if there is such a cost, we also can map its temporal and spatial properties.

Method

Participants

Nineteen subjects participated in Experiment 1. All were undergraduate or graduate students from Peking University. By self-report, all had normal or corrected-to-normal (color) vision, and were naïve to the purpose of the experiment. All were right-handed. After the experiment, each was paid 15 RMB for participating.

Apparatus and procedure

The stimuli and the general procedure matched those of Hu et al. (2010). The experiment was conducted on a Pentium IV computer running E-Prime software (Schneider, Eschman, & Zuccolotto, 2002), with subjects viewing the screen from a distance of approximately 63 cm. Each trial consisted of four sequential displays: First, a frame with a white fixation cross (1°) appeared on a dark background for 250 ms. Then, a second frame which included eight white circles (diameter: 3.7°) appeared for 750 ms. These white circles were arranged in an imaginary circle around the fixation cross (radius: 6.8°). Four empty circles alternated with four filled circles, each of which contained two small (1°) objects. The cue (red or blue circle) then appeared in one of the four empty circles in the third frame. Manipulating the duration of this frame provides the desired range of cue-target SOAs; the duration of Frame 3, and thus the cue-target SOA, was pseudo-randomly selected to be 200, 350, 700, 1,500, 2,500, or 3,500 ms. Finally, the target object was presented in the fourth frame. On one-third of the trials, the target was presented in the same circle as the cue (“Same” condition), on one-third of the trials the target was displayed in a circle 90° away from the cue ("Diff1" condition; half of the trials were clockwise, and half counter-clockwise, from the cue), and on one-third of the trials the target was presented in the circle 180° away from the cue ("Diff2" condition; see Footnote 1). Note that, on a “Same” trial, the cue and the target appeared within the same circle, but were always in slightly different positions, as shown in Fig. 1.
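The timing and geometry of a trial can be summarized in a short sketch. This is only an illustration under stated assumptions, not the E-Prime script used in the experiment: the function names, the angular coordinates of the four empty circles, and the convention that angles increase clockwise are all hypothetical. The frame durations and the 0°, 90°, and 180° cue-target separations are taken from the description above.

```python
# Frame durations reported in the procedure (ms).
FIXATION_MS = 250
PLACEHOLDER_MS = 750

# Assumed angular positions (degrees, increasing clockwise) of the four
# empty circles; the actual screen coordinates are not given in the text.
EMPTY_CIRCLE_ANGLES = [0, 90, 180, 270]

# Angular separation between cue and target for each location condition.
OFFSETS = {"Same": 0, "Diff1": 90, "Diff2": 180}


def target_angle(cue_angle, condition, clockwise=True):
    """Return the angle of the circle that holds the target."""
    offset = OFFSETS[condition]
    if not clockwise:
        offset = -offset
    return (cue_angle + offset) % 360


def trial_timeline(soa_ms):
    """Return frame onsets in ms from trial start.

    The cue frame lasts exactly one SOA, so the target appears
    soa_ms after cue onset.
    """
    cue_onset = FIXATION_MS + PLACEHOLDER_MS
    return {
        "fixation": 0,
        "placeholders": FIXATION_MS,
        "cue": cue_onset,
        "target": cue_onset + soa_ms,
    }


print(trial_timeline(200))
print(target_angle(0, "Diff1", clockwise=False))
```

With these assumptions, a Diff1 target for a cue at 0° falls at 90° or 270°, and a Diff2 target always falls in the opposite circle regardless of direction.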

The participants made a two-alternative forced choice (2AFC) on each trial. In this color discrimination experiment, the (left) “N” key on the keyboard was pressed in response to a red target regardless of the target location, and the (right) “K” key was to be hit if the target was blue, regardless of its location. Subjects were tested individually in a darkened, sound-attenuated room.

Each participant was presented with three blocks of trials. Each block included 144 trials [6 SOAs × 4 possible cue locations × 3 possible target location conditions (Same, Diff1 and Diff2) × 2 possible feature repetition conditions (repetition or non-repetition)]. Thus, for each subject, there were 12 observations for each combination of SOA, target location, and repetition/non-repetition case. Blocks were identical to blocks in Hu et al.'s (2010) study, except that the 16 catch trials per block (i.e., trials in which a cue was not followed by a target) used in that study were not used here. We divided blocks into two passes, offering a rest period after each pass. The subject was instructed to fixate on the central cross throughout the experiment.
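The trial counts above follow directly from the factorial structure of the design, which the following sketch enumerates. It is an illustrative reconstruction rather than the original experiment script: the variable names and the integer labels for the cue locations are hypothetical, and the sketch models neither the pseudo-random trial order nor the clockwise/counter-clockwise split within the Diff1 condition.

```python
from collections import Counter
from itertools import product

SOAS_MS = [200, 350, 700, 1500, 2500, 3500]
CUE_LOCATIONS = [0, 1, 2, 3]  # hypothetical labels for the four empty circles
TARGET_LOCATIONS = ["Same", "Diff1", "Diff2"]
FEATURE_CONDITIONS = ["repetition", "non-repetition"]
N_BLOCKS = 3


def build_block():
    """Return one block: each combination of the four factors exactly once."""
    return [
        {"soa": soa, "cue_loc": cue, "target_loc": loc, "feature": feat}
        for soa, cue, loc, feat in product(
            SOAS_MS, CUE_LOCATIONS, TARGET_LOCATIONS, FEATURE_CONDITIONS
        )
    ]


block = build_block()
session = block * N_BLOCKS

# Count observations per analysis cell (cue location is not an analysis factor).
cell_counts = Counter(
    (trial["soa"], trial["target_loc"], trial["feature"]) for trial in session
)

print(len(block))                  # 144 trials per block
print(len(session))                # 432 trials per subject
print(set(cell_counts.values()))   # {12}: 12 observations in every cell
```

The 12 observations per cell arise because each of the 36 analysis cells (6 SOAs × 3 target locations × 2 repetition conditions) occurs once for each of the four cue locations in each of the three blocks.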

Each participant was given a practice block of 30 trials that were not analyzed. Both speed and accuracy were emphasized. If a subject answered incorrectly, or failed to respond, a tone was played as feedback.

Results and discussion

The data from three subjects were removed because the subjects did not correctly follow the instructions. For the remaining 16 participants, average accuracy was 98%. Trials with incorrect responses were removed from further analysis. Response times less than 100 ms or greater than 1,500 ms were also removed as outliers prior to analysis (less than 1%). The mean reaction time was 674 ms. In preliminary analyses, there were no systematic differences between targets in the Diff1 and Diff2 location conditions. Therefore, to simplify the data presentation, these two conditions were collapsed into a single "Different" location condition.
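The trial-level screening just described can be expressed as a simple filter. The sketch below is a minimal illustration, assuming a hypothetical trial record with `rt`, `correct`, and `target_loc` fields; the example trials are invented and are not data from the experiment. The cutoffs (100 ms and 1,500 ms) and the collapsing of Diff1 and Diff2 are taken from the text.

```python
def preprocess(trials, rt_min=100, rt_max=1500):
    """Drop error trials and RT outliers, then collapse Diff1/Diff2.

    Trials with rt < rt_min or rt > rt_max are treated as outliers.
    """
    kept = []
    for trial in trials:
        if not trial["correct"]:
            continue  # incorrect responses are excluded
        if trial["rt"] < rt_min or trial["rt"] > rt_max:
            continue  # anticipations and very slow responses are excluded
        location = "Same" if trial["target_loc"] == "Same" else "Different"
        kept.append({**trial, "target_loc": location})
    return kept


# Invented example trials, for illustration only.
example_trials = [
    {"rt": 650, "correct": True, "target_loc": "Same"},
    {"rt": 90, "correct": True, "target_loc": "Diff1"},     # too fast
    {"rt": 1600, "correct": True, "target_loc": "Diff2"},   # too slow
    {"rt": 700, "correct": False, "target_loc": "Diff1"},   # error
    {"rt": 720, "correct": True, "target_loc": "Diff2"},
]

clean = preprocess(example_trials)
print([(t["rt"], t["target_loc"]) for t in clean])
```

In this invented example, only the first and last trials survive, and the surviving Diff2 trial is relabeled "Different".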

The mean RTs were submitted to a 2 (Color repetition: repeated vs non-repeated) × 6 (SOA: 200, 350, 700, 1,500, 2,500 and 3,500 ms) × 2 (Location conditions: Same vs Different) analysis of variance (ANOVA), with degrees of freedom corrected for violations of the sphericity assumption. Figure 2 shows the reaction time data, broken down by the SOA, location, and color repetition conditions. We first consider whether this discrimination task produced location-based inhibition (i.e., classic IOR), and then we consider whether the color repetition costs found in detection (Hu et al., 2010) also appear in discrimination.

Fig. 2

Experiment 1 results. Target detection times, broken down by Color (repetition vs nonrepetition), Location (Same vs Diff) and stimulus onset asynchrony (SOA)

Location repetition

Classic IOR is indicated when performance is slower for targets that are in the Same location as cues than for targets in Different locations, with this difference typically emerging at longer SOAs. Thus, IOR can be assessed by looking at the main effect of Location repetition, and especially at the interaction of Location repetition with SOA. In both cases, there is clear evidence of IOR: The main effect of Location repetition was reliable, F(1, 15) = 12.89, p < .01, and more critically, so was the interaction of Location with SOA, F(5, 75) = 4.93, p < .01. In Fig. 2, these effects can be seen by looking at the pair of solid curves, and at the pair of dashed curves: Within each pair, the elevated curve reflects the slower responses when a target was in the Same location as the cue. And, within each pair, this difference emerges at the 700 ms SOA; there is no difference at the short SOAs. The difference is marginally significant at the 700 ms SOA [t(15) = 1.78, p = .09], and robust for the longer SOAs [1,500 ms: t(15) = 4.67, p < .01; 2,500 ms: t(15) = 2.12, p = .05; 3,500 ms: t(15) = 4.21, p < .01]. The relatively late emergence of IOR here is consistent with Lupiáñez et al.'s (2007) view that IOR appears later in discrimination tasks than in detection tasks. These late inhibitory effects produced a main effect of SOA, with slower responses at longer SOAs, F(5, 75) = 3.02, p < .05. Given that the pattern is quite similar for the pair of solid curves and the pair of dashed curves, it is not surprising that the three-way interaction was not significant, F(5, 75) = 0.10, n.s. Similarly, there was no interaction of Location repetition with Color repetition, F(1, 15) = 0.14, n.s.
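The per-SOA comparisons reported above amount to paired t-tests on each subject's Same minus Different difference in mean RT. A minimal sketch of that computation follows. The function names are hypothetical, the reaction times are invented for illustration (they are not the data behind Fig. 2), and only the standard paired t statistic, t = mean(d) / (sd(d) / √n) with n − 1 degrees of freedom, is assumed.

```python
from math import sqrt
from statistics import mean, stdev


def ior_scores(same_rts, different_rts):
    """Per-subject IOR score: mean RT (Same) minus mean RT (Different).

    Positive scores indicate slower responses at the cued location.
    """
    return [same - diff for same, diff in zip(same_rts, different_rts)]


def paired_t(differences):
    """Return (t, df) for a paired t-test on per-subject differences."""
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1


# Invented per-subject mean RTs (ms) at a single SOA, for illustration only.
same = [700, 690, 720, 710]
different = [680, 685, 690, 700]

scores = ior_scores(same, different)
t_value, df = paired_t(scores)
print(scores, round(t_value, 2), df)
```

With 16 subjects, as in Experiment 1, the same computation gives the 15 degrees of freedom reported for each t-test.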

Feature (color) repetition

Using these same displays, Hu et al. (2010) found that cue–target color repetition produced detection time costs that were quite similar in size and time course to those for location repetition, but only when the cue and target were in the Same location. The central question is whether this pattern also holds for discrimination. As Fig. 2 clearly shows, this did not occur. In fact, color repetition produced facilitation, not inhibition; the discrimination judgments were faster for targets that shared their color with cues, F(1, 15) = 26.34, p < .01. Moreover, the non-spatial feature repetition advantage was most pronounced at short SOAs, reflected in the significant interaction of Color repetition with SOA, F(5, 75) = 5.00, p < .01. The facilitation was significant at both the “Same” and “Different” locations at the two shortest SOAs [smallest t(15) = 2.80, p < .01]. The effect gradually declined across SOA at both location conditions. At SOAs of 700 and 1,500 ms, the facilitation only reached significance for the 1,500-ms SOA at the “Different” location, t(15) = 2.63, p < .05. At the longest SOAs (2,500 ms and 3,500 ms), no significant facilitation remained at either location [largest t(15) = 0.63, n.s.].

The results of Experiment 1 provide a very clear dissociation of location-based inhibition of return and non-spatial feature repetition. We observed the typical cost associated with a target appearing in a recently cued location, with the relatively long delay found for discrimination tests (Lupiáñez et al., 2007). Using these same displays in a detection task, Hu et al. (2010) found similar location-based effects, with a slightly earlier time course. In contrast, repetition of a non-spatial feature (color) produced significant facilitation effects, not inhibition, and the facilitation was strongest at short SOAs. This result is strikingly different from the delayed inhibition (for targets that were in the Same location as the cue) that was found in the detection experiments.

Experiment 2

Before we consider the implications of the dissociations found in Experiment 1, we will test the generality of the results by manipulating a quite different non-spatial contrast, one between filled circles and open squares, rather than one between red and blue circles. Hu et al. (2010) tested the filled circle–open square case in a detection task, whereas here we use the same type of discrimination task used in Experiment 1. As Hu et al. noted, neither the color-based test nor this shape-based test is intended as a psychophysical probe of a pure feature case; no attempt was made to make the red and blue stimuli isoluminant, and the shape differences in Experiment 2 are also not single-feature (e.g., they are also not isoluminant). In both experiments, the goal is to present two stimuli that have salient perceptual differences, and ask subjects to discriminate those differences in the presence or absence of stimulus repetition. Critically, as in Experiment 1, exactly the same stimulus displays are used here in a discrimination task as were used by Hu et al. in a detection task. Thus, any differences reflect task factors, given the identical stimuli.

Method

Participants

Sixteen new subjects participated in Experiment 2. All were undergraduate or graduate students from Peking University. By self-report, all had normal or corrected-to-normal (color) vision, and were naïve to the purpose of the experiment. All were right-handed. After the experiment, each was paid 15 RMB for participating.

Apparatus and procedure

With one exception, the stimuli and procedure matched those of Experiment 1 (see the bottom part of Fig. 1). The critical change is in the two small objects that could appear in the larger circles. In Experiment 2, these were a filled circle and an open square; both were black.

The participants pushed the “N” key in response to a target that was a filled circle, and the “K” key for an open square.

Results and discussion

The data were analyzed as in Experiment 1. Average accuracy for the 16 subjects was 96.3%. Response times less than 100 ms or greater than 1,500 ms comprised less than 1% of the trials. Trials with incorrect, slow (>1,500 ms), and anticipatory (<100 ms) responses were excluded from further analysis. The mean reaction time was 645 ms, very similar to response times in the color discrimination task. Figure 3 shows the reaction time data.
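The exclusion rule described above (drop incorrect responses, anticipations under 100 ms, and slow responses over 1,500 ms) can be expressed as a simple filter. The Python sketch below is a hedged illustration only: the trial format, the names `keep_trial` and `mean_rt`, and the sample trials are all hypothetical and are not taken from the study's data or analysis scripts.

```python
from statistics import mean

RT_MIN_MS = 100    # responses faster than this count as anticipatory
RT_MAX_MS = 1500   # responses slower than this count as too slow


def keep_trial(trial):
    """True if the trial is correct and its RT lies within the window."""
    return trial["correct"] and RT_MIN_MS <= trial["rt_ms"] <= RT_MAX_MS


def mean_rt(trials):
    """Mean RT (ms) over retained trials, or None if none are retained."""
    kept = [t["rt_ms"] for t in trials if keep_trial(t)]
    return mean(kept) if kept else None


# Hypothetical trials for one subject in one condition; NOT study data.
trials = [
    {"rt_ms": 620, "correct": True},
    {"rt_ms": 85, "correct": True},     # anticipatory: excluded
    {"rt_ms": 1720, "correct": True},   # too slow: excluded
    {"rt_ms": 700, "correct": False},   # incorrect: excluded
    {"rt_ms": 660, "correct": True},
]
print(mean_rt(trials))
```

Applying the filter per subject and per condition before averaging gives the cell means that enter the analyses of variance.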

Fig. 3

Experiment 2 results. Target discrimination times, broken down by Shape (repetition vs. nonrepetition), Location (Same vs. Diff), and stimulus onset asynchrony (SOA)

As with color discrimination, there was no hint of a three-way interaction, F(5, 75) = 0.49, n.s. Similarly, there was also no interaction of Location repetition with repetition of the non-spatial factor, F(1, 15) = 0.04, n.s. In fact, the statistical pattern in Experiment 2 was essentially identical to that in Experiment 1:

Location repetition

For the shape discrimination task, the same evidence for location-based IOR was found: targets appearing in the Same location as their cues were responded to more slowly than targets in a Different location, F(1, 15) = 12.94, p < .01. Again, this main effect was driven by the interaction of Location and SOA, the classic signature of IOR, F(5, 75) = 4.80, p < .01. As in Experiment 1, the reaction time cost was focused on the long SOA conditions: There were no significant differences at 200, 350, or 700 ms, but all three of the long SOA conditions produced a significant effect [1,500 ms: t(15) = 4.58, p < .05; 2,500 ms: t(15) = 2.51, p < .05; 3,500 ms: t(15) = 4.46, p < .05]. As before, these long-SOA differences drove a main effect of SOA, F(5, 75) = 3.67, p < .01.

Feature (shape) repetition

As Fig. 3 shows, repetition of a non-spatial attribute (shape) again produced facilitation rather than inhibition. And again, the facilitation was most pronounced at short delays. The facilitation generated by shape repetition was significant overall, F(1, 15) = 16.25, p < .01, as was its interaction with SOA, F(5, 75) = 3.91, p < .01. The facilitation was significant at both the Same and Different locations at the 200-ms SOA and at the Different location at 350 ms [smallest t(15) = 3.01, p < .01]. It did not reach significance in the Same location at the 350-ms SOA, and as the SOA increased to 700 ms and beyond, the facilitation gradually diminished at both location conditions [largest t(15) = 1.64, n.s.].

As noted, the pattern of results in Experiment 2 was virtually identical to the pattern in Experiment 1. As such, we can increase statistical power by combining the data across experiments, to evaluate two trends in the location repetition data that were not significant in the individual experiments: the trend toward facilitation at the shortest SOA (200 ms), and the trend toward inhibitory effects beginning at the 700-ms SOA (see Figs. 2 and 3). When the data from the two experiments were combined, the facilitation at 200 ms was marginal, t(31) = 1.97, p = .06, and the inhibition at 700 ms was reliable, t(31) = 2.22, p < .05.

Looking at the results of the two experiments more broadly, we observed strong, late costs when targets appeared in the same location as cues, coupled with early facilitation when targets shared a non-spatial feature with cues, regardless of whether the cues and targets shared location. The clear dissociation of location repetition (yielding late inhibition) and non-spatial feature repetition (yielding early facilitation) is reminiscent of the results from a study by Milliken, Tipper, Houghton, and Lupiáñez (2000). Using rather different procedures, those authors also observed an inhibitory effect of location repetition, and a facilitatory effect of non-spatial feature (color) repetition. Their study included an interesting additional manipulation that allowed them to control whether their subjects actively attended to the location or color of the cue. Their results suggest that the location-based inhibitory effect occurred regardless of the attentional properties of the cue, but that the feature-based facilitation only occurred when subjects had attended to the cue’s color. In the current study, the subjects did not have any task at all for the cue, but they were required to attend to the relevant property (color or shape) of the target. Thus, it is an open question whether the facilitation that we observed for feature repetition would occur if the repetition was unrelated to the judgment that the subject was required to make (cf. Taylor & Donnelly, 2002).

General discussion

Across the two experiments, there are two clear and important empirical dissociations. First, repetition of location produces very different consequences than repetition of a non-spatial feature. Second, the pattern observed here, when subjects made a discrimination judgment, is quite different from what Hu et al. (2010) found when observers made detection judgments for the same displays. By bringing together the results of the current study with those of Hu et al. (2010), we can easily see the two dissociations. Figure 4 presents the relevant data. The left side illustrates location-based repetition effects in discrimination (top part of panel) and in detection (bottom). The right side shows the corresponding results for non-spatial attribute repetition (collapsed across color and shape, as these two cases produced essentially identical results). Again, the top part of the panel presents the discrimination data from the current study, while the bottom part shows the detection results from Hu et al. (2010).

Fig. 4

Different repetition effects in Discrimination tasks (top panels) and Detection tasks (bottom panels; adapted from Hu et al., 2010): a summary. Left side: the mean target detection/discrimination times, broken down by cue–target location relationship (same, different) and stimulus onset asynchrony (SOA). Right side: the mean target detection/discrimination times, broken down by cue–target location relationship (same, different), feature (repetition, nonrepetition), and stimulus onset asynchrony (SOA)

The results in the left panel are exceptionally simple to summarize: for both detection (bottom) and discrimination (top), we observe classic IOR. When a target appeared in the same location as a cue, responses were slower than if the location was not repeated. The time course in these experiments is completely consistent with previous findings: For detection, the inhibitory effect is already apparent at the 350-ms SOA, whereas for discrimination, it only begins to emerge at the 700-ms SOA.

What is new here is the set of results examining detection and discrimination performance when a non-spatial attribute repeats. This is where the new dissociations appear. Using a detection task, Hu et al. (2010) showed that non-spatial attribute repetition can produce a pattern very much like that for classic location-based IOR, but only when the repeated target feature was presented at the same location as the cue had been. The two patterns can be seen in the bottom section of the right panel: the two highest curves show the late inhibitory effect of attribute repetition (when the cue and target shared a location), while the bottom two curves show no hint of this effect (when the cue and target appeared in different locations). In the current study, using the same displays but now with a discrimination task, the results are quite different from both the corresponding detection data (bottom-right panel), and from the location-repetition case (top left). Unlike both of those situations, discrimination tests with non-spatial attribute repetition produce facilitation, rather than inhibition, and the facilitation effect occurs at short SOAs. This kind of early facilitation for non-spatial attribute repetition at a cued location is consistent with studies by Kwak and Egeth (1992) and by Tanaka and Shimojo (1996), among others.

As we noted in the Introduction, there are currently at least three theoretical perspectives that have been offered to account for inhibition of return. The most widely held view has been that the occurrence of a salient visual event (the cue) triggers two competing processes. One process brings attention to bear on the cue, facilitating performance at that location for a short period of time. A second process is inhibitory, and the standard interpretation is that this process functions to enhance “foraging” in the environment by moving attention away from recently focused locations. Depending on the particular account, either the two processes both begin when the cue occurs, with facilitation initially dominating, or facilitation precedes inhibition. Either way, this view accounts for the early facilitation that is typically but not always observed, and for the later inhibitory effect. Although this approach is intuitively appealing, our results are problematic for it: the dissociation of detection and discrimination, despite identical stimulus displays, cannot be accommodated by the standard IOR view (see Lupiáñez, 2010 for a review).

Two recent alternative theoretical accounts have been offered by Dukewich (2009) and by Lupiáñez (2010). Our assessment of these two alternatives, along with the more standard account, focuses on two factors: similarity and complexity. The similarity at issue is between the cue and target: what properties do they share (e.g., spatial, temporal, visual), and how does this similarity affect the observed pattern of response times? By complexity, we mean the extent to which the observed pattern differs as a function of test parameters: if facilitation and inhibition appear in different ways under different conditions, the pattern to be accounted for is complex.

The recent models both improve on the standard view with respect to the similarity criterion, because both include components that are sensitive to similarity in ways that are not the case in the usual approach. Dukewich (2009) argues that inhibition of return is just one case of the more general phenomenon of habituation. The notion is that through habituation, presentation of a cue should make observers less responsive to a following target; the more the target resembles the cue, the greater the habituation should be. Clearly, similarity is a central factor in this model, which allows it to naturally accommodate Hu et al.’s (2010) finding that there is a processing cost when cue and target share visual features (such as color or shape), as well as location. To account for the facilitation effect that is often observed initially, Dukewich assumes that processing of the target is enhanced by sharing processing with the cue if the two are close enough in time and space. Thus, similarity of cue and target can initially enhance processing, but such similarity also eventually increases habituation.

The model offered by Lupiáñez (2010) can also naturally account for similarity effects, though in a slightly less direct way. Lupiáñez frames his model within the object file theory of visual perception (e.g., Kahneman & Treisman, 1984; Kahneman et al., 1992). Kahneman and his colleagues have argued that visual objects are represented in object files that are indexed by an object’s spatiotemporal properties, with only a small number of such representations being available at a given moment. Object files store and update information about each object’s properties. Lupiáñez assumes that the presentation of a cue generates an object file, and that the observed response to a following target will depend on the relationship of the target to the cue. In particular, if the target is very similar to the cue (spatially, temporally, visually) then the target may get treated as an update to the cue’s object file, rather than as a separate object with its own object file. In this model, the effect of similarity is thus a function of how the cue and target map onto object files. As with Dukewich’s (2009) habituation-based model, similarity of cue and target can potentially play a significant role in the observed pattern, consistent with the detection data of Hu et al. (2010). Thus, both recent models offer a more natural account of similarity effects than is available in the more traditional view.

With respect to the issue of complexity, the habituation model (Dukewich, 2009) is more similar to the traditional approach. In both cases, there is a relatively simple account, which of course has the virtue of parsimony. However, as Fig. 4 illustrates, and as analyses such as those by Lupiáñez et al. (2007) and by Taylor and Donnelly (2002) have shown, the pattern of inhibition and facilitation that emerges in any particular experiment is a complex function of multiple factors. Thus, a successful theoretical account will probably require more complexity.

As described in the Introduction, Lupiáñez (2010) lays out a theory that does have the potential to fit the observed complexity. Given object files for the cue and for the target, Lupiáñez suggests that three different factors determine the observed performance in a given experimental situation. In this model, the inhibitory effect that has been studied in the IOR literature derives from one of these three factors: there is a detection cost for a target if it is similar enough to the cue for it to be treated as an update to the cue's object file. When this occurs, it should be difficult for the observer to detect the target's onset because there is no new object file created. In the model, this is most likely to occur when the cue and target are close in time and space, and Hu et al.'s (2010) detection results suggest that feature overlap also increases this probability, which certainly makes sense. Even though this cost is largest for short cue–target SOAs, it also remains substantial for longer SOAs. Note that this factor specifically refers to detection; it plays an important role in detection experiments, and a reduced role in discrimination tasks.

Lupiáñez posits a complementary factor, the spatial selection benefit, with a fairly similar time course. This benefit appears when integrating information from the target and the cue should help performance. For example, in many discrimination tasks, having selected the cue's object file should enhance the ability to compare the target to the cue. The discrimination task in the current study is such a case, and should show a large benefit from this factor when the cue and target match (e.g., when both are red). Under these conditions, the target's critical property can essentially reinforce the existing feature in the cue's object file, leading to a rapid recognition of the feature. Because the spatial selection benefit is strongest at short SOAs, this should lead to early facilitation effects on the task, exactly what we found. The third factor in the model, a spatial orienting benefit, is essentially the same as the initial facilitation in the traditional model of IOR, or the priming effect that Dukewich (2009) includes to account for early facilitation. In all of the models, this factor reflects the processing advantage that comes with bringing attention to bear on the relevant location.

Thus, in considering the traditional two-factor (early facilitation, later inhibition) model, the habituation model (Dukewich, 2009), and the three-factor model (Lupiáñez, 2010), there are different strengths and weaknesses. The first two models are simpler, and thus more parsimonious. The two new models provide a more natural account than the traditional model for the cue–target similarity effects reported by Hu et al. (2010). When the complexity of the data pattern is considered, with the task situation yielding either facilitation or inhibition at various SOAs, the simplicity of the first two models seems to be a liability. Thus, the model that seems best able to account for both the observed similarity effects and the complexity of the data pattern is the one offered by Lupiáñez.

In addition to being best able to account for the observed similarity and complexity patterns, this model has the additional virtue of being grounded in a well-established and broader view of perception—object file theory. Moreover, the two facilitatory processes in the model seem to map well onto two varieties of attention that have been widely studied. As Shalev and Algom (2000) noted, there is a useful logical and empirical distinction that can be drawn between selecting a particular location or object for attention, and focusing attention on a particular stimulus dimension of that location or object. Shalev and Algom demonstrated that these two varieties of attention produce additive effects, consistent with the notion that there really are two separate forms of attention at work. The spatial orienting factor in the Lupiáñez model (and in the other two models discussed above) corresponds to the first form. This type of attention is what Posner and Cohen (1984) originally set out to investigate by using exogenous cues to attract attention to a particular location. The spatial selection benefit, in contrast, serves to improve processing of cue and target information within the selected location, and in that respect it is more akin to the type of attention that enhances within-object processing (cf. Garner, 1974). The results of a very recent study (Luo, Lupiáñez, Funes, & Fu, 2010), based on the ability of cues to reduce the spatial Stroop effect, suggest that exogenous cueing effects are mediated by object-based representations. This finding is consistent with the object-file interpretation of stimulus repetition effects that is a critical component of the Lupiáñez model.

Non-spatial attribute-based repetition costs versus location-based IOR

The focus of our study has been whether repetition of a non-spatial attribute generates a processing cost akin to the cost found for targets presented at recently cued locations (i.e., IOR). The results speak to the question of the extent to which location is a special property—in Kubovy's (1981) terms, an "indispensable attribute". Our previous experiments (Hu et al., 2010) produced robust costs for feature (color and shape) repetition, with a similar time course to IOR for location. Such results provide some support for the view that location is just another type of feature (e.g., Bundesen, 1990; Laarni, 1999; Laarni et al., 1996). However, this pattern only occurred when the cue and target were presented at the same location (see the bottom-right panel of Fig. 4), consistent with location having a unique status. The results of the discrimination experiments of the current study reinforce this distinction: Typical location-based IOR was observed (Fig. 4, top-left panel), whereas feature repetition under these conditions (in fact, during the same trials) produced a quite different consequence—early facilitation (Fig. 4, top-right panel). The dual nature of location—sometimes acting like other features, sometimes acting differently—may derive from the fact that it is an attribute of both the stimulus and any orienting or motor response (Barry, 2006; Dukewich, 2009). Spatial location serves as the organizing principle in classic theories of visual search, including Feature Integration Theory (Treisman & Gelade, 1980) and Guided Search (Wolfe et al., 1989), with both theories relying on feature “maps” that are spatially represented. Consistent with this view, Tsal and Lavie (1988, 1993) pointed out that non-spatial attributes of a stimulus (e.g., color) are more likely to be attended if attention is directed to the location of the stimulus.

Given the complex status of spatial information (e.g., Barry, 2006; Dukewich, 2009), the existence of multiple forms of attention (e.g., Shalev & Algom, 2000), and the likelihood that multiple factors will influence processing speed (e.g., Lupiáñez, 2010), it is hardly surprising that many different data patterns have been observed in the literature. Although such variance may seem daunting, to the extent that the patterns of variation group along theoretically predictable lines, the differences actually provide an opportunity to test different theories' predictions. For example, by comparing the results from detection to those from discrimination, it is possible to isolate aspects of the system that are more closely tied to one process or the other. In the current study, we found that the discrimination results for non-spatial attributes produce clear evidence for both location-based IOR and for attribute-based facilitation. This dissociation (and the accompanying dissociation from the detection results with identical stimuli; Hu et al., 2010) is an example of the type of diagnostic difference that supports selection among competing theories. For example, the ability to account for feature similarity effects provides support for the habituation model of Dukewich (2009) and the three-factor model of Lupiáñez (2010) over the more traditional facilitation + inhibition models; the ability to account for complex dissociations of the kind we report here supports the three-factor model over the other two. Overall, we believe the model that Lupiáñez has described merits continued exploration. The three factors in this model each have substantial evidence from a range of approaches, and using object files as the underlying representation positions this model well within current thinking about visual perception more generally.