At any given moment, the visual system is bombarded with more information than it can process to the level of awareness. This is why visual attention is so important; it is the mechanism that selects certain aspects of the visual scene for privileged processing. This selection process protects the visual system’s limited-capacity processing resources from being overwhelmed (Broadbent, 1982; Kastner & Pinsk, 2004; Moran & Desimone, 1985). Historically, the metaphor of an “attentional spotlight” has been used to conceptualise attentional selection: only features, objects, or locations that fall within the spotlight are attended and thus receive preferential processing (Posner, 1980). While the spotlight is a convenient metaphor for referring to the attended region, it does not capture all of the nuances of how attentional resources can be applied, such as that there can be changes in the size of the attended region (Eriksen & St. James, 1986; LaBerge, 1983), a gradient in terms of magnitude of enhancement within the attended region (Downing, 1988; LaBerge & Brown, 1989), and the fact that the attended-region can be formed into an annulus (Jefferies & Di Lollo, 2015), or split into multiple locations (M. Muller, Malinowski, Gruber, & Hillyard, 2003). Therefore, we will use the general term “attended region” to refer to the spatial area over which attentional resources are applied. The focus of the present paper is the effect of different attended-region sizes on different aspects of visual perception. It is important to note that when conceptualising the effect of attended-region size, the impact of a narrow attended-region size is always compared with that of a wide attended-region size (i.e., comparing perception at a location that is always attended, and simply encompassed by either a narrow versus wide region). That is, we are not comparing conditions in which attention is versus is not applied to a stimulus but instead conditions in which attentional resources are applied differently.

One seminal model across cognition, perception, and neuroscience for how the size of the attended-region impacts visual perception is the zoom-lens model (Eriksen & St. James, 1986; Eriksen & Yeh, 1985; N. G. Muller, Bartelt, Donner, Villringer, & Brandt, 2003). According to this model, there is an inverse relationship between the size of the attended region and the magnitude of perceptual enhancement. That is, a narrower-diameter attended-region size facilitates greater perceptual enhancement within the attended region, whereas when attentional resources are spread more broadly into a wider-diameter size, the magnitude of perceptual enhancement at any given location is weaker. It is important to note that the zoom-lens model does not draw any distinctions between different aspects of perception. This means that one possible inference is that the model predicts that a narrow versus wide attended-region size enhances all aspects of visual perception. We call this possibility the notion of “universal perceptual enhancement.” An alternative interpretation, however, is that the zoom-lens model was only intended to apply to the spatial domain in which it was proposed and tested, even though this was not specified in the model at the time. Whichever interpretation one favours, the outcome is the same: the zoom-lens model has only been tested in one perceptual domain, as described below, and therefore, our goal was to establish how attended-region size impacts different aspects of visual perception.

The zoom-lens model was proposed more than 30 years ago. In the intervening time, evidence has been found that is consistent with its core tenet that a narrow attended-region size is beneficial to perception. For example, with a narrower relative to a wider attended region, behavioural responses to stimuli are faster (Eriksen & St. James, 1986; Eriksen & Yeh, 1985), the perception of target contrast changes is facilitated (Balz & Hock, 1997), spatial resolution is improved (Balz & Hock, 1997; Goodhew, Shen, & Edwards, 2016), and physiological responses in primary visual cortex (V1) are enhanced (N. G. Muller et al., 2003). However, there are reasons to doubt the generality of the zoom-lens model to all perceptual domains. One of the main reasons arises from consideration of the physiological architecture of the brain: visual processing is subserved by two core visual pathways (magnocellular and parvocellular) that operate independently in early visual processing and encode contrasting visual information. Specifically, parvocellular neurons (P cells) have small receptive field sizes, superior spatial acuity, and inferior temporal acuity. This means that each P cell encodes information from only a small chunk of a visual scene, and within this region, they process fine spatial detail, but their responses are sluggish across time. Magnocellular neurons (M cells), in contrast, have large receptive field sizes, inferior spatial acuity, and superior temporal acuity. In other words, each M cell encodes information from a relatively large region of a visual scene, and across this region they process coarse spatial “gist,” and their responses are sharp and precise across time. In technical terms, P cells are especially sensitive to rapid changes in luminance across space (i.e., high spatial frequencies) and slow changes in luminance across time (i.e., low temporal frequencies), whereas M cells are suited to processing slow changes in luminance across space (i.e., low spatial frequencies) and are sensitive to rapid changes in luminance across time (i.e., high temporal frequencies) (Denison, Vu, Yacoub, Feinberg, & Silver, 2014; Derrington & Lennie, 1984; Livingstone & Hubel, 1988; Schiller & Logothetis, 1990).

In virtually all real-world visual tasks (e.g., recognising an object, reading), neither M nor P cells hold an absolute monopoly, but instead they collaborate to provide a complete and coherent perceptual representation (Bar et al., 2006). However, certain tasks favour one class of cell over the other. A useful way of conceptualising this is as balance-point or ratio between M and P cells, and their relative contribution to visual perception at a given point in time. Particular tasks and stimuli bias this balance in favour of M versus P cells, without completely eliminating the contribution of the other type of cell. For intermediate spatial and temporal frequencies, there is considerable overlap in cell responses. In the laboratory, however, we can more selectively isolate the contribution of M versus P cell with specially designed psychophysical tasks. In the extremes, very fine spatial acuity demands P-cell input, whereas very coarse spatial acuity and fine temporal acuity is subserved by M cells (Schiller & Logothetis, 1990). Although Skottun and Skoyles (2007) have written multiple commentaries suggesting that it is not possible to selectively gauge the effect of M versus P cells, claiming they are essentially identical in their properties, their claim has been repeatedly, explicitly, and convincingly refuted (Butler et al., 2007; Lalor & Foxe, 2010; Öğmen, Purushothaman, & Breitmeyer, 2008; Schulte-Körne, Remschmidt, Scheuerpflug, & Warnke, 2004). It is unclear what alternative mechanism Skottun and Skoyles posit is driving the plethora of evidence for selective sensitivity to differential spatial and temporal frequencies, which bear all the hallmarks of M and P cells. Instead, they focus their critiques on the claim that such selectivity is impossible. Furthermore, while it should be acknowledged that due care is required in the design and interpretation of psychophysical tasks for the purpose of selectively gauging the contribution of M versus P cells, it is a critical misrepresentation of the physiological evidence to assert that it is not possible to do so.

Given the presence of two major visual pathways that demonstrably differ in anatomy and function, this raises the possibility that visual attention may have selective effects on the different pathways, therefore having differential effects on different aspects of visual perception. Indeed, the existing neuroscientific literature provides a mechanism via which this could occur. That is, the LGN receives greater feedback connections from the cortex than it does feedforward input from the retina (Erişir, Van Horn, & Sherman, 1997), and the evidence suggests that attention can have M and P pathway-specific modulatory effects on LGN activity (Vanduffel, Tootell, & Orban, 2000). This means that it is biologically plausible that attended-region size could selectively regulate the contribution M cells versus P cells to visual perception. In contrast to the zoom-lens model notion, we have proposed two potential theoretical models that predict visual pathway-specific effects of the size of the attended region on visual perception. The first theory is the Spatiotemporal Trade-Off (STO) account. This model proposes that by virtue of their different receptive field sizes, M versus P cells are preferentially implicated in instantiating narrow versus wide attended-region sizes. That is, to create a narrow attended-region size, the visual system would preferentially pool and enhance responses from P cells with their smaller average receptive field sizes, whereas to create a wide region size, it would preferentially pool responses from M cells with their larger average receptive field sizes. Given the clear demarcation in the sensitivity of M versus P cells to temporal versus spatial information, STO predicts that a narrow attended-region size will enhance spatial acuity and high spatial frequency perception but impair temporal acuity and low spatial frequency perception, whereas a wide attended-region size will enhance temporal acuity and the perception of low spatial frequencies but impair fine spatial acuity and high spatial frequency perception (Goodhew et al., 2016). The second theory is the Selective Spatial Enhancement (SSE) account, which espouses that the function of a focussed, narrow attended-region size is to enhance P-cell-mediated aspects of perception, such as spatial acuity, which decline most rapidly as eccentricity increases, whereas M-cell-mediated aspects of perception, such as temporal acuity remain intact. In this sense, attentional mechanisms are optimised for when covert and overt attention diverge (for a similar notion of visual attention providing an island of enhanced spatial acuity in the periphery in the covert shift literature, see Carrasco, 2011). Therefore, according to SSE, a narrow versus wide attended-region size will enhance P- cell-mediated aspects of visual attention such as spatial acuity, whereas it will have no impact on M-mediated aspects of visual attention such as temporal acuity.

To summarise, the zoom-lens, STO, and SSE all converge on the prediction that P-cell-mediated aspects of visual perception will improve with a narrow relative to a wide size of the attended region. Where the models diverge is in what they predict will be the impact of attended-region size on M-cell-mediated visual perception for a narrow versus wide attended-region size. If we favour the interpretation that the zoom-lens model applies across perceptual domains, then it would stipulate that it should improve. In contrast, SSE espouses no effect of attended-region size on M-cell-mediated visual perception, and STO predicts that it should be impaired for a narrow versus wide attended-region size. Historically, because the relevance of designing tasks to gauge the contribution of the two visual pathways was not considered, the vast majority of the research on the size of the attended region has inadvertently used stimuli that are preferentially P-cell-mediated. For example, much of the early work examined response efficiency to small alphanumeric characters (Eriksen & St. James, 1986; Eriksen & Yeh, 1985; LaBerge, 1983), which are high spatial frequency and thus would be preferentially processed by P cells. Similarly, neuroimaging evidence in favour of the zoom-lens arose from examining activations in V1—the one region that receives input from both of the M and P channels before they predominately diverge into cortical pathways (Muller et al., 2003). Because such evidence did not include measures of M cell visual perception, it can neither assess the generality of the zoom-lens model, nor adjudicate between the accounts that predict pathway-selective effects (STO and SSE). However, we have marshalled some indirect psychophysical evidence from cognate literatures, which we will now review, and furthermore highlight the first piece of direct evidence challenges the universal perceptual enhancement assumption and suggests pathway-specificity in the effects of attended-region size.

One of the psychophysical results that question the generality of the zoom-lens model comes from the literature on covert shifts of attention. Of course, shifts of attention are a distinct mechanism from changes in attended-region size, which are most purely gauged in the absence of shift of the central location. These findings, however, illustrate that at least one mechanism of visual attention does appear to have selective effects on different aspects of visual perception, highlighting the plausibility of theoretical accounts which make similar predictions for changes in the size of the attended region. Yeshurun and colleagues have shown that executing a covert exogenous shift of (a presumably narrow) attended-region to a given location improves spatial acuity but impairs temporal acuity at that location. This is consistent with the notion that exogenous shifts enhance P-cell while inhibiting M-cell-mediated visual perception (Yeshurun, 2004; Yeshurun & Hein, 2011; Yeshurun & Levy, 2003; Yeshurun & Marom, 2008; Yeshurun & Rashal, 2010; Yeshurun & Sabo, 2012).

Further indirect support for the pathway-selective effects of attended-region size on visual perception emerges from some very recent literature on altered visual perception near the hands (near-hand space). That is, when observers place their hands in a power grasp (palm pointing inwards, fingers uncurled) on either side of the computer monitor, visual perception is changed compared with identical stimuli presented when the hands are held away from the screen (for a review, see Goodhew, Edwards, Ferber, & Pratt, 2015). While initial explanations for this focussed on non-pathway-specific attentional explanations, which predicted nonspecific perceptual enhancement (Reed, Grubb, & Steele, 2006) or nonspecific disadvantage due to difficulty with attentional disengagement (Abrams, Davoli, Du, Knapp, & Paull, 2008), more recent evidence implies that is the nature of the stimulus, in particular whether it is optimally suited to be processed by M or P cells, that determines whether hand proximity improves or impairs perception. That is, when participants hold their hands in a power grasp near visual stimuli, they demonstrate enhanced M-cell-mediated visual processing at the expense of P-cell-mediated perception (Abrams & Weidler, 2013; Goodhew et al., 2015; Goodhew, Fogel, & Pratt, 2014; Goodhew, Gozli, Ferber, & Pratt, 2013; Gozli, West, & Pratt, 2012; Kelly & Brockmole, 2014; Medina, Drebing, Hamilton, & Coslett, 2016). However, several key factors can qualitatively shift this to a pattern of P-cell enhancement near the hands, such as when participants adopt a precision grip (conducive to fine motor interactions with objects), instead of a power grasp (Thomas, 2015). Similarly, Goodhew and Clarke (2016) found that a P-cell bias was evident in near-hand space when the search array was high density. Density was manipulated by increasing the number of items in a fixed-size annulus array of Gabors of prescribed spatial frequencies. Only when the density was high did the P-cell bias emerge for the stimuli near versus far from the hands, as evidenced by improved visual search among high spatial frequency items and impaired search among low spatial frequency items (Goodhew & Clarke, 2016). If we assume that the high-density array necessitated participants adopting a narrow attended-region to resolve the identities of the individual items, and hand proximity facilitated this adoption, then this implicates pathway-selective effects as a function of the size of the attended region. Specifically, it would appear to support the predictions from the STO model, which espouses trade-offs in P-cell- and M-cell-mediated perception across different attended-region sizes. However, in that study there were no independent measures or manipulations of attended-region size to verify this interpretation.

The single piece of direct evidence for the impact of attended-region size on M-cell mediated visual perception does not support any notion of universal perceptual enhancement but instead points to selective effects on the different pathways. However, the nature of these selective effects appears to differ from the indirect evidence. Specifically, Goodhew, Shen, and Edwards (2016) manipulated participants’ attended-region size by means of shape discrimination task, in which participants’ task was to identify whether a circle or slightly deformed circle (ellipse) was presented on a given trial. Participants performed this task on these stimuli for 80% of trials (randomly intermixed) in a given block. In one block of trials, these circle and ellipse shapes were all large in diameter (in order to induce a sustained wide attended-region), and in the other block, they were all small (to induce a sustained narrow attended-region). On the other 20% of trials in each of these blocks, the impact of the induced attended-region size was measured by means of gap detection tasks. In one condition, participants performed a spatial gap detection task in order to gauge their spatial acuity, and in the other participants performed a temporal gap detection task in order to measure their temporal acuity. We found that the narrow relative to the wide attended-region size improved spatial acuity, while it had no effect on temporal acuity (Goodhew et al., 2016). These results are consistent with the notion of SSE, according to which a narrow size of the attended region will improve P-cell-mediated aspects of visual perception, while leaving M-cell-mediated aspects unchanged.

To summarise, both the indirect and direct evidence calls into question whether the zoom-lens model will generalise across perceptual domains. Instead, the evidence suggests that the size of the attended region has selective effects on the different pathways and therefore different aspects of visual perception. The indirect evidence appears to suggest that STO may be the correct model, whereas the first piece of direct evidence instead appears to support SSE. The purpose of the present work, therefore, was to a) more comprehensively establish the effect of attended-region size on temporal acuity, and b) assuming a pattern of SSE or STO is found, test whether this reflects M-cell versus P-cell mediated effects more generally (including the processing of different spatial frequencies), rather than just being limited to spatial versus temporal acuity per se, and c) to determine whether the pattern of SSE was specific to centrally presented stimuli. In experiment 1, we examined the effect of a narrow versus wide attended-region size on spatial gap identification and temporal order judgement performance. In experiment 2, we tested the effect of a narrow versus wide attended-region size on the perception of low-spatial frequencies (LSFs) versus high-spatial frequencies (HSFs). Finally, in experiment 3, we tested whether central presentation was a necessary requirement to observe SSE.

Experiment 1

The purpose of experiment 1 was to assess the effect of attended-region size on spatial and temporal acuity, as measures of P-cell- and M-cell-mediated visual perception respectively. To manipulate participants’ attended-region size to be narrow and wide, we used the shape-discrimination inducer method validated in our previous work (Goodhew et al., 2016), which was modelled on previous methods of sustained changes in the size of the attended region (LaBerge, 1983). We used a spatial gap identification task to gauge spatial acuity under these different attended-region sizes, in which participants’ task was to identify on which side (left vs. right) there is a small break (spatial gap) in an outline circle shape. We used a temporal order judgement task to measure temporal acuity under narrow versus wide attended-region size conditions, where participants’ task was to identify which of two stimuli presented in rapid succession appears first (stimulus on the left vs. right; i.e., identify the temporal order).

As discussed earlier, perceptual sensitivity to rapid changes in luminance across time is subserved by M cells. The temporal order task is designed to gauge sensitivity to such high temporal frequencies, due to its rapid presentation parameters in conjunction with the requirement for participants to discern the relative timing of the two stimuli. In contrast, sensitivity to rapid changes in luminance across space is subserved by P cells, and the spatial gap identification task is optimised to measure sensitivity to such high spatial frequencies, because it requires participants to resolve the location of a tiny spatial gap in the stimulus, and the stimulus duration does not preclude P-cell processing. Furthermore, both the spatial gap task (Gozli et al., 2012; Yeshurun & Levy, 2003) and temporal order judgement task (Hein, Rolke, & Ulrich, 2006) have been demonstrated as useful selective measures of P-cell- and M-cell-mediated visual perception and are sensitive to attentional and cognitive manipulations. For example, performance on this temporal order judgement task suffers as a consequence of transient attentional shifts to the stimuli, whereas it improves after sustained attentional shifts (Hein et al., 2006).

The zoom-lens model and both of the pathway-selective models (SSE and STO) predict that spatial acuity (as measured with the spatial gap discrimination task) should improve with a narrow versus wide attended-region size. This pattern therefore is expected to be obtained and should provide convergent evidence that attended-region size was successfully manipulated. For temporal acuity (as measured with the temporal order judgement), the universal perceptual enhancement interpretation of the zoom-lens model predicts that a narrow (versus wide) region size will facilitate performance, STO predicts reduced performance, whereas SSE predicts no effect of attended-region size.

Method

Participants

Forty participants (23 females, 17 males) completed the experiment and were compensated with payment (AUD$15) for their time. Participants’ mean age was 24.2 years (SD = 6.8). Two participants reported being left-handed, one ambidextrous, and the remainder right-handed. In this and all subsequent experiments, participants provided written, informed consent prior to participation.

Stimuli and apparatus

Stimuli were presented on gamma-corrected CRT monitor running at a refresh rate of 75 Hz. Stimuli were generated using the Psychophysics Toolbox (Brainard, 1997) within MATLAB. Throughout the experiment, the background was set to mid grey. Viewing distance was fixed at 44 cm with a chinrest.

An illustration of the experimental stimuli can be seen in Fig. 1. Inducer stimuli were darker grey than the background (30% contrast). The narrow inducer stimulus was either an outline circle of 1° of visual angle in diameter or an outline ellipse that subtended 1° on one dimension (horizontal or vertical) and either 0.9° or 0.8° on the other. The wide inducer stimulus was either an outline circle of 20° in diameter or an outline ellipse that subtended 20° on one dimension (horizontal or vertical) and either 16° or 18° on the other. This variability in the way the ellipse deformed from a circle was introduced so that there was not a constant location that participants could examine to determine the shape correctly, but instead it required a more global analysis (and thus the corresponding attended-region size). The width of the stroke for all inducers was .14°. Inducers were always presented such that their centre aligned with the centre of the screen, where participants were instructed to fixate on a small white dot throughout the trial. This is because the cleanest measure of the effect of the size of the attended region is when the size is changed without any change in the central location (i.e., in the absence of a shift of attention).

Fig. 1
figure 1

Illustration (not to scale) of: (a) the shape inducer stimuli and (b) the two target condition stimuli and judgements. In the experiment, on 80% of trials in each block participants were presented with one of the shapes (a) and asked to indicate whether a circle or ellipse shape was shown on that trial. The size of the shapes was blocked, such that the large stimuli were designed to induce a wide attended-region size, whereas the small stimuli were designed to induce a narrow attended-region size. Whether a circle or ellipse was shown was randomly determined on each trial. On the other 20% of trials in each block, participants were presented with the stimuli in (b), that is, a broken ring, and asked to identify whether the gap was on the left or right of the object (spatial gap identification), or presented with two dots that flashed in rapid succession, and asked to identify whether the dot on the left or the right appeared first (temporal order judgement). Target task was blocked. This design allows for spatial and temporal acuity to be examined as a function of the size of the attended region (narrow versus wide)

In the spatial gap identification condition, the target stimulus was an outline circle whose total diameter was 0.4° and the stroke width was 0.07°. It was presented in dark grey (30% contrast). The spatial gap subtended 0.07° vertically, and made a complete horizontal break (on either the left or right side of the circle). This target stimulus was always presented in the centre of the screen.

In the temporal order judgement condition, target stimuli consisted of two small black circles (0.14° diameter), which appeared to the 0.2° to the left and right of centre. After their presentation time (see Procedure), a mask (‘XXXX’ in black) appeared centrally, such that it covered the locations that the two target dots had been presented). This was done to mitigate the possibility that participants could adopt a strategy, such as using perceived contrast in iconic memory as a proxy to determine duration and therefore temporal order, and to be consistent with the procedure that has been successfully used in attention studies previously (Hein et al., 2006). Note that all target stimuli always fitted with the bounds of the narrow attended-region size.

Procedure

The experiment consisted of two different types of trials: inducer trials, and target identification trials. The inducer trials were designed to manipulate attended-region size in a given block, whereas the target identification trials were designed to measure the effect of a given attended-region size on perception. Both the attended-region size and target task variables were blocked and fully crossed, such that participants completed both attended-region size conditions of a given target task before progressing to the second target task (i.e., attended-region size blocks were sub-blocks within the larger target type blocks). Target task was blocked so that participants knew what task they were completing in addition to the inducer task in a given block. Order of block completion was counterbalanced across participants to mitigate order effects. All other variable assignments were randomly intermixed within blocks. Within each block, 80% of trials were inducer trials, and 20% were target identification trials. Ensuring that target trials were always interleaved among inducer trials was designed to maximise the effectiveness of the attended-region size induction (see Fig. 2).

Fig. 2
figure 2

Illustration (not to scale) of a series of trials in each condition. The image inside each frame depicts the key response-relevant stimuli on a given trial. The purpose of this figure is to illustrate the nature of the manipulations within and across blocks. Specifically, it illustrates how attended-region size was blocked (by manipulating the size of the stimuli across blocks), while the shape of the stimulus (circle vs. ellipse) was randomly intermixed within blocks. It also shows how it was randomly varied on each trial whether a target or inducer was presented, but according to different probabilities (80% of trials were shape-discrimination within a block, whereas 20% of trial was target trials (temporal order or spatial gap identification, depending on block. To simplify the illustration, we show targets occurring on 33% of trials). This figure also illustrates that while the target stimuli would always fit within the bounds of the narrow region and the shape discrimination task was designed to manipulate attended-region size, these stimuli were never presented simultaneously (a misconception one might have from examining Fig. 1 only). Note that for the temporal order judgement stimuli, one of the dots would appear briefly before the other also appeared such that both would be visible as they are here before they would be masked. For clarity we show them here unmasked, whereas in the actual experiment the two dots were backwardly masked

Inducer identities (circle vs. ellipse) were equi-probable and randomly assigned. When the inducer was an ellipse, it was equally likely to be shorter along the horizontal versus vertical dimension, and equally likely to be each of the two shorter distances outlined in the Stimuli and apparatus section. The left versus right gap/side appearance targets for the spatial gap identification and temporal order judgement targets were randomly-assigned and equi-probable.

The experiment began with a set of practice trials for whichever target task participants were assigned to first (spatial gap identification versus temporal order judgement). Participants’ performance on this practice block was used to determine the difficulty of the target identification task in the experiment proper. The practice block consisted of 24 trials where target exposure time started long for the first few trials progressively sped up to 80 ms (spatial gap stimulus) or 40 ms for the first circle and 107 ms for the second circle (temporal order stimuli) after six trials, where it remained at these values for the rest of the practice block. This was done to give participants a clearer initial idea of the nature of the stimuli and task requirements before the presentation time was reduced to a more difficult level. Feedback (“CORRECT” appearing in green or “INCORRECT” appearing in red in the centre of the screen) also was given during all practice trials to help participants to understand the different tasks and responses. In this practice block, there was a 50/50 split between the inducer trials and the target trials (rather than 80/20 in the experiment itself). This is because it is typically the target task that participants find most difficult and therefore we sought to ensure adequate exposure to these during practice and were not seeking to induce and maintain a particular attended-region size yet as we were in the experiment. Participants were exposed to both narrow and wide inducers in the practice blocks. Participants completed the relevant practice block, followed by the corresponding experimental block of trials (e.g., spatial gap practice block, then spatial gap experimental blocks), and then completed the practice block for the other task (e.g., temporal order stimulus, if they began with the spatial gap stimulus) before commencing their second target type block.

Participants’ performance in the practice block was then used to determine the precise timing parameters of the experimental block. This is because previously we have found considerable individual variability in the level of performance participants produce in response to the identical stimulus parameters when stimuli are presented briefly. To maximise our ability to get meaningful performance (i.e., off floor and ceiling) from all participants, we adjusted the presentation times based on their performance in the practice block for that target task. We also used performance on the inducer trials as a criterion to ensure that performance on these was not compromised in order to maximise performance on the target task. For both target tasks, if participants scored greater than 85% on the inducer trials and greater than 85% on the target trials, then they were assigned to the most difficult target timing parameters, if they scored greater than 75% on the inducer task and target task but less than or equal to 85%, then they were assigned to the standard, intermediate level of difficulty condition, and if they scored greater than 75% on the inducer task and greater than 65% on the target task, they were assigned to the slowest and therefore easiest target display parameters. If they did not meet any of these criteria, they were deemed unsuccessful and required to repeat the practice block. For the experiment proper, in the spatial gap identification task, target exposure time was varied across difficulty conditions such that in the most difficult case it was 67 ms, the intermediate target exposure time was 80 ms, and the longest was 93 ms. In the temporal order judgement task, the second target was always presented for 107 ms. We varied the exposure time of the first target alone before the onset of the second target, such that fastest (and most difficult) was 26 ms of first-target only time, intermediate was 40 ms, and the slowest was 67 ms. Target exposure time was the only factor that differed between the different difficulty parameter settings, the settings were identical in all other respects.

For both target types, on each trial, a white fixation dot was presented in the centre of the screen for 253 ms. Then on inducer trials, the inducer was presented for 53 ms before the screen went blank until a response was registered. On target trials in the spatial identification task block, after the fixation dot the target circle was then shown for 67 ms, 80 ms, or 93 ms (depending on to which difficulty condition participants were assigned), and then the screen went blank (grey) until a response was registered. On target trials in the temporal order judgement task block, after fixation, the first target dot was shown for 26 ms, 40 ms, or 67 ms (depending on to which difficulty condition participants were assigned), and then the second target dot appeared and was shown for 107 ms while the first dot was still present. Then they were both masked, and the mask was shown until a response was made. For all trial types, after response, the screen remained blank for a 1,000-ms intertrial interval before the start of the next trial.

On the inducer trials, participants’ task was to identify whether a circle or an ellipse was presented, whereas on the target identification trials, participants’ task was to identify whether the gap was on the left or right of the circle (spatial gap task) or to identify whether the dot on the left or the right appeared first (temporal order judgement). Participants used the “z” versus “x” keys on a standard keyboard to respond to on both inducer and target trials, and accuracy rather than speed of response was emphasised. The experiment itself consisted of a total of 640 trials: 320 per target type (spatial identification vs. temporal order judgement), and within this 160 trials per attended-region size (narrow vs. large). A self-paced rest break was provided every 80 trials).

Methodological note about the necessary absence of a baseline condition

Note that there is no meaningful baseline condition when it comes to attended-region size. It is not possible to have a neutral “no attended-region size condition,” because such a state does not exist. In the absence of a direct experimental manipulation of attended-region size, one would simply be measuring participants’ default or current attended-region size and comparing the effect of an induced region size to this state. There indeed are demonstrable differences in default or preferred attended-region size across different groups, such as for people of different ethnicities, personality, and age groups (Elahipanah, Christensen, & Reingold, 2011; Greenwood & Parasuraman, 2004; Hutchinson, Ledgeway, & Allen, 2014; Kramer, Ellenberg, Leonard, & Share, 1996; McKone et al., 2010; Roalf, Lowery, & Turetsky, 2006; Wilson, Lowe, Ruppel, Pratt, & Ferber, 2016). Such default differences are not an issue here, because all of our key manipulations are manipulated within-subjects rather than between-subjects. But it is for this reason that we do not have a neutral condition in which attended-region size is not manipulated, and thus the effect of different sizes on visual perception is always measured in comparison to another attended-region size. This means that while we express the pattern in a given way for simplicity, e.g., narrow impairing performance relative to a wide size in STO, it can always equally be expressed in the alternative way, e.g., wide improving performance relative to narrow region.

Results and discussion

Inducer task accuracy

We examined participants’ performance on the inducer task (circle vs. ellipse discrimination). It is critically important that participants sufficiently engage in this task, so that their attended-region size reflects the size of the inducer. To ensure that participants were engaged in the inducer task, we excluded from further analysis any participants whose performance fell below 60% in either the focal or diffuse condition. Two participants were excluded on this basis. Performance on the inducer identification task for the remaining participants was high (93.6% for narrow and 95.9% for wide).

Target identification accuracy

As described in the Procedure section, participants were assigned to different difficulty conditions where target exposure time was varied across participants to adjust optimally the level of difficulty required to see variability unconstrained by floor or ceiling effects and thus sensitive to attentional manipulations for different individual participants. The settings were identical in all other respects and were applied to all conditions (i.e., a constant). In the spatial gap identification condition, 27 participants completed the most difficult target timing presentation parameters, 9 participants were assigned to the intermediate settings, and 4 participants completed the easiest settings. In the temporal order judgement condition, 19 participants were given the most difficult target timing parameters, 12 participants intermediate, and 9 participants were assigned to the slowest (and therefore easiest) target timing parameters. Three participants, however, still performed below chance on the target identification task in the temporal order judgement block and were excluded from further analysis.

Target identification accuracy for the remaining 35 participants is shown in Fig. 3. These values were submitted to a 2 (attended-region size: narrow versus wide) by 2 (target task: spatial gap versus temporal order) repeated-measures ANOVA on-target identification accuracy. This revealed a significant main effect of target task, F(1, 34) = 80.09, p < 0.001, ηp 2 = 0.702, and no main effect of attended-region size (F<1), but a significant interaction between target task and attended-region size, F(1, 34) = 4.33, p = 0.045, ηp 2 = 0.113. Repeated measures t tests revealed that this interaction was driven by the fact that in the spatial gap identification task, performance was reliably higher in the narrow region size condition than in the wide condition (p = 0.001), whereas there was no difference between the size conditions for temporal order judgement accuracy (p = 0.484). The fact that spatial acuity (as measured by the spatial gap identification task) was modulated by attended-region size highlights that the induction of different attended-region sizes was effective, as this pattern of results is consistent with previous research (Goodhew et al., 2016) as well as the predictions from all of the theoretical models. The fact that temporal acuity (as measured by the temporal order judgement task) was unaffected by attended-region size but is inconsistent with the notion of universal perceptual enhancement. Instead, it is consistent with the SSE model, which predicts that only P-cell-mediated visual processes are modulated by the size of the attended region. In other words, the results provide support for pathway-specific effects of attended-region size on visual perception.

Fig. 3
figure 3

Shows accuracy (% correct) as a function of the size of the attended area (narrow versus wide) for spatial versus temporal acuity tasks for Experiment 1 (left hand side of the graph) and the HSF and LSF orientation identification tasks for Experiment 2 (right hand side of the graph). Across both experiments, P-cell-mediated visual perception (spatial acuity and HSF perception) was improved by a narrow versus wide attended-region, whereas M-cell-mediated visual perception (temporal acuity and LSF perception) was unchanged. HSF, high spatial frequency; LSF, low spatial frequency. Errors bars depict standard error of the mean corrected for within-subjects designs (Cousineau, 2005)

Checking for floor effects

Before we accept this conclusion, we wanted to ensure that the absence of attended-region size modulation for temporal acuity was not a consequence of lower overall accuracy relative to the spatial gap identification condition. To check this, we removed all datasets from the temporal order task whose average target accuracy was below 60%. This took average accuracy up to 76%—around the most optimally sensitive part of the psychometric function (N = 24). Yet performance was still unaffected by attended-region size (76% and 75% for narrow and wide respectively, p = 0.874). This indicates that the absence of modulation for temporal order judgement accuracy due to region size was not an artefact of floor effects.

Checking for speed-accuracy trade-offs

Although accuracy rather than speed of responses was emphasised, we sought to ensure that the improvement in spatial gap identification accuracy with a narrow attended-region did not arise at the cost of response speed and that the absence of accuracy modulation on the temporal order judgement was not obscuring a change in response speed between the narrow and wide size conditions. To check this, we examined response times to the target for the two tasks as a function of attended-region size and found that spatial gap identification responses were significantly quicker with a narrow (622 ms) versus the wide (710 ms) attended-region size (p = 0.009). This means that the response speed was also facilitated (in addition to accuracy) with a narrow attended-region, thus countering the possibility of a speed/accuracy trade-off. In contrast, there was no difference in temporal order judgement response times between the narrow (1,100 ms) and wide (1,115 ms) attended-region size conditions (p = 0.823) mirroring the accuracy data. In other words, on both accuracy and response time, a narrow (versus wide) attended-region size facilitated judgements requiring spatial acuity, whereas both accuracy and response time revealed no effect of attended-region size on temporal order judgements.

In summary, in Experiment 1 spatial gap identification was affected by attended-region size, demonstrating the important methodological points that attended-region size was successfully manipulated and that the paradigm was sensitive to detecting changes in perceptual acuity as a function of attended-region size. Experiment 1 was the second-ever direct test of the zoom-lens model on non-spatial measures. In light of this, it is highly informative that it revealed that temporal order judgements were unaffected by the size of the attended region. Because temporal order judgements are one measure of temporal acuity, this is consistent with the results of Goodhew et al. (2016), who found that temporal gap detection, a different measure of temporal acuity, was similarly unaffected, but it is an important novel extension in showing converging evidence via a distinct method. At the end of Experiment 1, we can be confident that temporal acuity is indeed insensitive to attended-region size.

Experiment 2

Experiment 1 revealed a pattern of results demonstrating pathway-specific effects of attended-region size on visual perception. That is, the results were consistent with the SSE model: a narrow (vs. wide) attended-region improves spatial acuity but not temporal acuity. In Experiment 2, we sought to investigate further this finding in a new domain and, in doing so, to test two alternative explanations for the pattern of results observed in Experiment 1. That is, there are two equally viable explanations for the SSE pattern of results observed. It could be (a) as we hypothesised, that P-cell-mediated visual perception is impacted by changes in attended-region size while M-cell-mediated visual perception is not. However, (b) it could equally be that spatial aspects of visual perception are influenced by changes in attended-region size, while temporal aspects are not. This is a critical distinction, because these two hypotheses make different predictions about what should happen with respect to stimuli that are coarse or low spatial frequency (LSF), which are spatial and not temporal in nature, but processed predominately by M-cells. According to (a), SSE should still occur with LSF stimuli, such that the perception of HSF stimuli is impacted by attended-region size whereas the perception of LSF stimuli is not, whereas according to (b), LSF stimuli should be impacted by changes in attended-region size as well as HSF stimuli. Indeed, this appears to be the case when it comes to covert shifts of attention, where temporal acuity and LSF perception are differentially impacted (Carrasco, Talgar, & Cameron, 2002; Yeshurun & Levy, 2003).

In Experiment 2, we used Gabors (sine waves of a given spatial frequency in a Gaussian envelope) as targets. Unlike natural objects that typically contain a broad range of spatial frequencies, Gabors are specially designed stimuli that contain only a narrow range of spatial frequency content. We manipulated the Gabors to contain different spatial frequency content—finely detailed (high spatial frequency; HSF) and thus P-cell preferred, or coarse information (and so LSF), and thus M-cell-preferred (Derrington & Lennie, 1984; Livingstone & Hubel, 1988; Tootell, Switkes, Silverman, & Hamilton, 1988). We tested participants’ sensitivity to these different stimuli under narrow versus wide attended-regions. If the pattern of SSE observed in Experiment 1 was a product of a selective effect of attended-region size on P-cell contribution to visual perception, then SSE should be observed again in Experiment 2, such that perception of the HSF stimulus would improve under a narrow versus wide attended region, whereas perception of the LSF stimulus would be unaffected. In contrast, if both the LSF and HSF stimuli are improved, or indeed another pattern is observed, then this suggests that selective P-cell enhancement is not the underlying mechanism for the SSE observed in Experiment 1.

Method

Participants

Forty participants (26 females, 14 males) completed the experiment in exchange for payment. Their mean age was 22.3 years (SD = 4.1). Of these, 33 reported being right-handed and 7 left-handed.

Stimuli and apparatus

The same inducer stimuli were used as those in Experiment 1. Now target stimuli were Gabors (sine waves of a given frequency in a Gaussian envelope). The total size of the Gabor was 0.6°, with a standard deviation of 0.14°. The phase of the Gabor was randomly assigned on each trial. Targets were oriented to the left or right of vertical (the precise value of the angular offset from vertical was determined by participants’ performance in the practice block) and were always presented at central fixation. The LSF targets had a spatial frequency of 1 cpd, whereas the HSF targets were 10 cpd. These spatial frequencies are tailored to the physiological sensitivities of M versus P cells and have been found psychophysically to selectively tap M cell and P cell processes respectively in previous work (Goodhew & Clarke, 2016). The contrast of both targets was set to 50%.

Procedure

Attended-region size was still blocked (narrow vs. wide) and counterbalanced; however, the different target types (HSF vs. LSF) were randomly intermixed within blocks. Within each block, a given trial was randomly assigned to be either an inducer or target identification trial with an 80% and 20% probability respectively, with the constraint that two target identification trials could not occur consecutively.

Participants commenced the experimental session with a practice block, which consisted of 40 trials. In the practice block, inducer and target presentation time was 1,000 ms for trials 1–5; 253 ms for trials 6–10; 107 ms for trials 10–20; 67 ms for the inducers; and 107 ms for the target for the last 20 trials. This gradual decrease in target presentation time was designed to allow participants to familiarise themselves with the Gabor stimuli, particularly the HSF stimulus, which is more challenging to identify. Trial-by-trial feedback on the accuracy of their responses helped them to calibrate to the task as per Experiment 1.

The angular offset of the target Gabors from a vertical orientation was varied in order to adjust task difficulty, such that the closer to vertical the angular offset of the targets, the more difficult the task. Again, participants were assigned to different difficulty conditions based on their performance in the practice block and the different difficulty conditions were identical in all other respects except angular offset of the target orientation from vertical. If participants scored 90% or above for the inducer trials and 85% or above for the target trials they were assigned to the most difficult task parameter settings (12° target offsets from vertical), if they scored 75% or above for the inducer trials and above 70% but less than 85% for the target trials, then they were assigned the intermediate settings (15°), whereas if they scored 75% or above for the inducer task and 60% or above for the target task, then they were assigned to easiest discriminable condition (18°). If they scored below 75% on the inducers or below 60% on the targets, they were deemed unsuccessful and had to repeat the practice block until they were successful before progressing to the experiment proper.

On each trial, a white fixation dot was presented for 253 ms. On the inducer trials, the inducer annulus was displayed for 67 ms, and then the screen went blank until a response was made. Whereas on target trials, the target Gabor was displayed for 106 ms, and then the screen went blank until a response was registered. There was still a 1,000 ms intertrial interval during which the screen was blank until the start of the next trial. On the inducer trials, participants’ task was still to make a circle versus ellipse discrimination judgement, whereas on the target trials, the task was to identify whether the target was oriented to the left or right of vertical. Participants used the “z” and “x” keys to respond regarding the inducer identity and the left and right arrow keys to identify the orientation of the target. The experiment consisted of a total of 480 trials: 240 per region size block. Self-paced rest breaks were provided every 120 trials.

Results and discussion

Inducer task accuracy

Performance on the inducer identification task for the remaining participants was high (88.6% for the narrow attended-region and 95.7% for the wide region). All individual participants’ accuracies exceeded 60%.

Target identification accuracy

Based on their performance in the practice block, 8 participants were assigned to the smallest target angular offset from vertical orientation presentation parameters, 16 were assigned intermediate, and 16 were assigned to the largest target angular offset parameters. Five participants scored below chance overall in a target condition (all in the HSF condition) and were excluded from further analysis.

The pattern of results from the remaining 35 datasets is shown in Fig. 3 (right side). These were submitted to a 2 (attended-region size) × 2 (target SF) repeated-measures ANOVA. This revealed a significant main effect of target type, F(1, 34) = 33.68, p < 0.001, ηp 2 = 0.498, and a significant main effect of attended-region size, F(1, 34) = 11.28, p = 0.002, ηp 2 = 0.249, which was qualified by a significant interaction between attended-region size and target SF, F(1, 34) = 4.85, p = 0.035, ηp 2 = 0.125. Repeated measures t tests revealed that this interaction reflected the fact that accuracy was significantly greater in the narrow region condition than in the wide region condition for HSF targets (p = 0.001), whereas there was no reliable difference between the narrow and wide region conditions for the LSF targets (p = .390). These results support the notion that the mechanism responsible for the pattern of SSE was a selective enhancement of P-cell and not M-cell contribution to visual perception by a narrow attended-region (versus wide) attended-region. In addition to Experiment 1, this is further evidence that is inconsistent with any notion of universal perceptual enhancement, according to which a narrow size of the attended region should improve the perception of targets irrespective of their spatial frequency content. Instead, it illustrates how changing the size of the attended region can have pathway-selective effects on visual perception.

Checking for ceiling effects

The absence of M-cell-mediated modulation of perceptual performance cannot be explained away as reflecting floor or ceiling effects, because the level of accuracy for the spatial gap identification that is modulated in Experiment 1 is similar to the level of accuracy for the LSF target identification performance that is not, and vice versa: the level of accuracy that is modulated in Experiment 2 in the HSF target condition is similar to the level of accuracy in the temporal order judgement condition that is not modulated by attended-region size. This is apparent in Fig. 3. However, to double-check that the absence of attended-region size modulation for the LSF was not a result of a ceiling effect, we excluded participants whose average accuracy equalled or exceeded 90% in the LSF conditions. This made the average level of performance for the remaining 21 participants 85.4%—well off ceiling and yet there was still no modulation (p = 0.346).

Checking for speed-accuracy trade-offs

Once again, the orientation identification judgements were unspeeded and thus accuracy was the primary metric of interest. However, to check for putative speed-accuracy trade-offs, we examined response times for the different spatial frequency targets as a function of attended-region size. Responses were not reliably different between the two attended-region sizes for either the LSF targets (1,500 ms vs. 1,504 ms for narrow vs. wide respectively, p = 0.810) or the HSF targets (1,523 ms vs. 1,531 ms for narrow vs. wide respectively, p = 0.789). This means that the enhanced accuracy observed with the narrow versus wide region size condition for the HSF targets was not a product of a speed-accuracy trade-off.

Considering sufficient separation in the SFs

Whereas Experiment 1 gauged spatial versus temporal acuity, which are qualitatively different properties to which M versus P cells are attuned, Experiment 2 examined sensitivity to different points along the spatial frequency continuum. This means that an important consideration is whether 1 cpd sufficiently low and 10 cpd sufficiently high to selectively tap M versus P cells respectively. Consideration of previous evidence indicates that they were most likely sufficiently tailored. An influential psychophysical study that assessed pathway-selective spatial frequencies found preferential P-cell sensitivity above 4 cpd (Leonova, Pokorny, & Smith, 2003). While these authors were able to assess the unique contribution of M cells below 4 cpd, the clearest differentiation was for 1 cpd and below (Leonova et al., 2003). From this perspective, the 1 cpd and 10 cpd used to define LSF and HSF should be well-suited to assess the contribution of M cells and P cells respectively. Moreover, reliable M and P differentiation, including complete trade-offs in performance as a function of spatial frequency have been found with the 1 cpd and 10 cpd used here (Goodhew & Clarke, 2016).

The only further caveat one could argue would be that while the 10 cpd was sufficiently high to be selectively P cell, the 1 cpd was an “intermediate” spatial-frequency and thus reflecting competing effects of M cells and P cells that cancel one another. We do not think that this was so for the reasons outlined above. However, if it was, then it would support STO and not universal perceptual enhancement. That is, it cannot be the case that mixed M and P effects would obscure a pattern of universal perceptual enhancement for the two spatial frequencies. These effects would work together, thereby producing a pattern of improvement for the narrow condition for the LSF targets along with the HSF targets. It could only be the case that qualitatively different effects of M and P cells summed to zero, which assumes an underlying model of STO. Thus, we can confidently say the results are not consistent with the notion of universal enhancement, thereby refuting the generality of the zoom-lens model. There does remain the possibility that a true STO pattern was obscured here and instead it just appeared as SSE. If so, then this would still indicate pathway-selective effects of attended-region size. We revisit this issue in the General discussion.

Experiment 3

Experiments 1 and 2 have demonstrated a pattern of SSE, whereby tasks that gauge P-cell processing have been modulated by the size of the attended region, whereas tasks that gauge M-cell processing have not. One potential explanation for this absence of M-cell-mediated visual perception is that in both of these experiments, the stimuli were presented in the centre of the screen and therefore would have appeared in participants’ fovea. While there are indeed M cells in the fovea, their proportion increases in the periphery. It is possible, therefore, that eccentric presentation of stimuli is more conducive to modulating the relative contribution of M cells to visual perception. Indeed, peripheral stimulus presentation (4°) have been used where M-cell contributions to visual perception have been modulated by hand proximity (Bush & Vecera, 2014; Gozli et al., 2012) (although the detrimental effect of cueing on temporal gap detection is observed even with centrally presented stimuli, see Yeshurun & Levy, 2003). The purpose of Experiment 3 was to establish whether the pattern of Selective Spatial Enhancement is specific to central presentation, or whether another pattern, such as STO would be observed with noncentral stimulus presentations. In terms of task, because it is tasks of spatial and temporal gap detection that have been used most commonly in related areas where trade-offs between M-cell and P-cell processing have been observed (Gozli et al., 2012; Yeshurun & Levy, 2003), we used these tasks in Experiment 3.

Method

Participants

Forty participants (33 females, 7 males) completed the experiment in exchange for payment or course credit. Their mean age was 19.98 years (SD = 2.7). Of these, 37 reported being right-handed and 3 left-handed.

Stimuli and apparatus

With respect to choosing an eccentricity at which to present the target stimuli, there were multiple competing factors we had to balance. One factor was that we wanted to maximise the eccentricity to maximise the beneficial effect that this would have on our ability to detect any potential eccentricity-sensitive trade-off. Another important factor to consider was that we needed to ensure that participants did not need to move their attention from fixation to the stimulus in order to perceive the target and thus complete the task, or else we would be confounding changes in attended-region size with covert shifts of attention. This requirement restricted the eccentricity that we could use. Similarly, another important factor was we would need to increase the diameter of the narrow inducer as a function of target eccentricity to ensure that the narrow attended-region condition still encompassed the target. It was not feasible to increase the size of the wide inducer concurrently, as this risked moving the inducer so far into the periphery as to undermine its visibility psychophysically, and also even a modest increase in size would have hit the limits of the screen size. This means that as the size of the narrow attended-region increased, the difference in size between the narrow and wide regions decreased, thus reducing the power of the manipulation of attended-region size. This requirement, therefore, also compelled that we keep eccentricity to a minimum. On balance, therefore, we chose an eccentricity of 2° for target presentation. That is, targets were placed 2° to the left or right of central fixation.

Target stimuli were small (0.4°) black circles. In the spatial gap detection block, participants judged whether there was a small gap (0.07°) in the bottom of the circle (spatial gap present) or not (spatial gap absent). In the temporal gap detection block, participants judged whether there was a brief period during which the circle disappeared from the screen before reappearing (temporal gap present), or whether the circle was presented continuously (temporal gap absent). The inducer task was the same as the previous experiments, with the exception that now the narrow inducer subtended 7° diameter and the diffuse inducer still subtended 20° when they were circles. When the narrow inducer was an ellipse, it was 6.3° or 5.6° along the horizontal or vertical dimension, while it was 7° on the other.

Procedure

The Procedure was identical to Experiment 1, except where specified. After the fixation dot, on inducer trials, the inducer was presented for 67 ms, before the screen went blank until a response was registered. On target trials in the spatial gap detection block, after the fixation dot the target circle was then shown for 80 ms, 93 ms, or 107 ms (depending on to which difficulty condition participants were assigned), and then the screen went blank (grey) until a response was registered. On target trials in the temporal gap detection block, after fixation, when the temporal gap was absent, the circle was presented for 93 ms, 107 ms, or 120 ms. When the temporal gap was present, the circle appeared for 40 ms, disappeared for 13 ms, 27 ms, or 40 ms, and then reappeared for 40 ms (total 93 ms, 107 ms, or 120 ms exposure). Targets appeared with equal probability to the left and right of fixation. This was done to prevent participants from moving the focus of their attention out to a peripheral location. For the target detection task, the gap (spatial and temporal) was present for 50% of trials. Note that a stimulus was always present, participants’ task was to determine whether the gap in the stimulus was present or absent, by pressing designated response keys. For the spatial gap detection task, based on their performance in the practice task, 15 were assigned to the fastest spatial gap stimulus presentation time (80 ms), 19 to the intermediate (93 ms), and 6 to the slowest (107 ms). For the temporal gap detection task, 13 were assigned to the fastest temporal gap presentation (13 ms), 22 to the intermediate (27 ms), and 5 to the slowest (40 ms).

Results and discussion

Inducer task accuracy

Data from one participant was not included due to a running error (they were run on the temporal condition twice and not on the spatial condition). Data from three participants were excluded due to being below chance on the target task, indicating that they were unwilling or unable to comply with task instructions. Another eight were excluded due to poor performance (<60% accuracy on the inducer task). After exclusion, the resulting inducer task accuracy was 90.4% for the narrow condition and 91.6% for the wide condition.

Target identification task accuracy

Target identification accuracy scores were submitted to a 2 (task) × 2 (attended-region size) repeated measures ANOVA on target identification accuracy. This revealed no main effect of task, F(1, 27) = 2.84, p = 0.104, ηp 2 = 0.095, a trend toward a main effect of attended-region size, F(1, 27) = 3.48, p = 0.073, ηp 2 = 0.114, which was qualified by a trend toward an interaction between task and attended-region size, F(1, 27) = 3.43, p = 0.075, ηp 2 = 0.113. Since the interaction was at trend-level significance, and the following comparisons were planned a priori, we proceeded to conduct repeated measures t tests which revealed that for the spatial gap detection task, responses were significantly more accurate in the narrow (M = 88.3%) than in the wide (M = 82.1%) condition (p = 0.015), whereas there was no difference in response accuracy between the narrow and wide conditions (82.3% for both) for the temporal gap detection task (p = 0.993). Moreover, the fact that accuracy was equivalent for both tasks rules out the possibility that the absence of modulation on the temporal acuity task was the result of a floor or ceiling effect (Fig. 4).

Fig. 4
figure 4

Shows accuracy (% correct) as a function of attended-region size (narrow vs. wide) for spatial versus temporal acuity tasks for Experiment 3 where the targets were presented 2° to the left and right of fixation. As per Experiments 1 and 2, P-cell-mediated visual perception (spatial acuity) was improved by a narrow versus wide attended-region, whereas M-cell-mediated visual perception (temporal acuity) was unchanged

Response times

Correct response times actually appeared to demonstrate an effect of attended-region size, even more strongly than accuracy. A 2 (task) × 2 (attended-region size) repeated-measures ANOVA on response times revealed no main effect of task, F(1, 27) = 1.70, p = 0.204, ηp 2 = 0.059, but a significant main effect of attended-region size, F(1, 27) = 5.00, p = 0.034, ηp 2 = 0.156, which was qualified by an interaction between task and attended-region size, F(1, 27) = 18.20, p < 0.001, ηp 2 = 0.403. Planned repeated measures t tests revealed that responses were significantly faster in the narrow condition (M = 744 ms) compared with the wide condition (M = 823 ms; p = 0.006). This is not a speed-accuracy trade-off, because it converges with the pattern for accuracy: participants were faster and more accurate in the narrow versus wide attentional region condition. For the temporal gap detection task, there was no significant difference between response times for the wide (875 ms) versus the narrow (843 ms) conditions (p = 0.462). That is, although acuity was hypothesised to be the key measure, response times showed a clear pattern of selective spatial enhancement. Responses were facilitated under the narrow compared with the wide attended-region condition for the task designed to gauge spatial acuity, but not for the task designed to gauge temporal acuity. This pattern held up even though targets were no longer presented centrally, as they were in the previous experiments, but instead were presented in the periphery.

General discussion

Across three experiments, we found evidence which refutes the universal perceptual enhancement assumption, instead demonstrating results consistent with pathway-specific effects of changes in the size of the attended region. That is, the evidence was consistent with model of Selective Spatial Enhancement (SSE), according to which P-cell-mediated visual processes are enhanced by a narrow versus wide size of the attended region, whereas M-cell-mediated visual processes are not. In Experiment 1, with centrally presented stimuli, spatial gap identification benefited from the narrow attended-region, while temporal order judgements did not. This is consistent with previous work showing that spatial gap detection benefits from a narrow attended-region whereas temporal gap detection performance is unchanged (Goodhew et al., 2016). Critically, Experiment 2 revealed evidence that selective enhancement of P cells and not M cells is the underlying mechanism for this pattern of results, by demonstrating that the perception of LSF targets also is unaffected by changes in attended-region size. Finally, Experiment 3 assessed to what extent the pattern of SSE was specific to centrally presented stimuli and found that even when the target stimuli were presented at 2° of eccentricity, spatial gap detection responses were facilitated by a narrower region compared with a wider attended-region size.

This means that the evidence from this study points to SSE as the framework for understanding the impact of the size of the attended region on visual perception. We cannot rule out the possibility that another model provides the correct explanation for the effect of attended-region size on visual perception under other conditions. For example, we have used tasks designed to purely gauge the operation of M cells and P cells, but for higher level tasks, the distinction between dorsal versus ventral cortical processing is more relevant. That is, M and P cells are aligned in independent, parallel channels that extend from the retina into V1 via the lateral geniculate nucleus (LGN). From V1, M cells preferentially innervate the dorsal cortical stream, which encodes the location of objects in space, including the integration of signals to perceive global motion (Rees, Friston, & Koch, 2000) and underlies visually-guided motor control (Goodale & Milner, 1992). In contrast, P cells have a more equitable input into ventral cortical stream, which encodes form and identity information and subserves object recognition (Ferrera, Nealey, & Maunsell, 1992; Goodale, 2008; Goodale & Milner, 1992; Grill-Spector, Tammar, Hendler, & Malach, 2000; Mishkin, Ungerleider, & Macko, 1983). It remains to be seen what impact attended-region size will have on more ecologically relevant visual tasks that tap cortical rather than early visual pathway function, such as object recognition (ventral) and global motion perception (dorsal). It may be that at the cortical level, a pattern of STO is observed for tasks that reflect dorsal versus ventral specialised visual processing. Indeed, it has been shown that performing a secondary task improves global motion perception relative to single-task conditions (Motoyoshi, Ishii, & Kamachi, 2015). If the effect of such a cognitive load (due to the secondary task) is assumed to broaden the attended region (Lavie, 2005), then this is consistent with a model of STO whereby dorsally mediated visual perception improves with a wide region and is degraded with a narrow one. Such questions await future research, which we are currently in the process of conducting. What this does mean is that at this point in time, SSE is more of a working hypothesis regarding the impact of attended-region size on visual pathway processing in the brain. However, what we can say unequivocally is that the present evidence clearly demonstrates that the zoom-lens model (i.e., the trade-off between attended-region size and magnitude of perceptual enchantment) does not generalise across all aspects of visual perception. This conclusion has important theoretical implications for a range of fields, as discussed below.

Theoretical implications for cognition and cognitive neuroscience

The zoom-lens has been an influential model for understanding the distribution of attentional resources. It is understandable that in the 1980s when the model was first proposed (Eriksen & St. James, 1986; Eriksen & Yeh, 1985), when knowledge of the major visual pathways and their implications for visual-cognitive processes was still in its infancy, that such a framework was not incorporated in theorising about attentional mechanisms and their impact on visual perception. Now, we have the benefit of a well-developed understanding of M- and P-cell properties and function, and here we have demonstrated how visual attended-region size appears to have selective impacts on these two pathways and therefore different aspects of visual perception. This, together with the existing evidence that covert shifts of attention also have selective effects (albeit in a different pattern, such that temporal acuity is actively impaired) (Yeshurun & Hein, 2011; Yeshurun & Levy, 2003; Yeshurun & Marom, 2008; Yeshurun & Sabo, 2012), tells us that the impact of these two fundamental mechanisms of visual attention are best understood through an M-cell and P-cell framework. The development of a more sophisticated understanding of other related processes is also occurring as M-cell and P-cell frameworks are adopted. For example, processes such as emotion and hand proximity were initially thought to be nonspecifically beneficial to visual perception, due to a generic effect of attention (Phelps, Ling, & Carrasco, 2006; Reed et al., 2006). It is now understood that emotion (Bocanegra & Zeelenberg, 2011) and hand proximity (Goodhew et al., 2015; Gozli et al., 2012) can both improve and impair performance, dependent on the nature of the task, as encapsulated within the dual-channel framework.

Implications for our understanding of altered visual processing near the hands

In line with this development, Bush and Vecera (2014) found a pattern of spatiotemporal trade-off that they attributed changes in the size of the attended region. Specifically, they found that the typical enhancement in temporal acuity and impairment in spatial acuity that occurs for visual stimuli when the two hands are placed either side of the screen in a power grasp could be reversed to an enhancement in spatial acuity and impairment in temporal acuity when only one hand was placed next to the screen. The authors attributed this to a change in the size of the attended region with a concomitant shift in spatial versus temporal acuity. In other words, they invoked a spatiotemporal trade-off model of attended-region size, despite this notion being untested at the time. Bush and Vecera (2014) made some methodological choices that could potentially undermine their findings, such as using different baseline conditions to compare the effect of two versus one hands on visual perception and only having a practice block for two-hands conditions and not the one-hand condition. However, if we take their results at face value, then when considered in light of the present findings of selective spatial enhancement, it appears that a change in attended-region size cannot account for the spatiotemporal trade-off Bush and Vecera (2014) observed. Instead, the contribution of M versus P cells to visual perception must be shifted by another, as yet unspecified, mechanism when one swaps between the one- versus two-hand arrangement.

Similarly, Goodhew and Clarke (2016) also attributed their P-cell advantage for dense visual search arrays near the hands to a change in the size of the attended region. In light of the present results of SSE, it is interesting the Goodhew and Clarke’s (2016) results were most robust in terms of facilitation for HSFs, rather than impairment for LSFs (see Experiment 2). This suggests that there remains open the possibility that attended-region interacts with hand proximity. Future research should test this possibility directly.

Effect of visual attention on magnocellular mediated visual processes

One possible interpretation of our results is that magnocellular-mediated visual processes simply do not require attention. It is true that LSFs represent the “gist” of a scene; they are simpler and more rapidly processed and often facilitate subsequent sustained and complex processing (Bar et al., 2006). We think that this interpretation is inconsistent with the previous literature, which shows that both LSFs and temporal acuity is reliably modulated by other forms of attention. Even texture segmentation is modulated by shifts of attention (Yeshurun & Carrasco, 1998), and texture segmentation in the service of figure-ground segmentation is attributed to M-cell function (Livingstone & Hubel, 1988). Furthermore, transient shifts of attention impair while sustained shifts of attention improve temporal acuity (Hein et al., 2006; Yeshurun & Hein, 2011; Yeshurun & Levy, 2003; Yeshurun & Marom, 2008). While it might be debated whether attention is a necessary condition for magnocellular-mediated visual processes, it is certainly not the case that magnocellular mediated visual perception is impervious to all aspects of attention. It, however, does appear to be insensitive to the size of the attended region.

Another possible interpretation is methodological: that is, while it is conceptually possible for attention to influence M-cell-mediated tasks, the specific tasks we used to gauge M-cell-mediated processes may have been impervious to modulation. A number of lines of evidence undermine this possibility. First, the specific temporal order judgement task we used to assess magnocellular function is Experiment 1 has been found to be reliably impacted by covert shifts of attention, and this pattern can be an impairment or enhancement depending on cue predictability (Hein et al., 2006). Second, the orientation-discrimination task of low-spatial frequency (1 cpd) Gabor targets that we used in Experiment 2 to assess M-cell contribution has been found to be affected by covert shifts of attention across a range of spatial-frequencies (0.5 cpd to 10 cpd) (Cameron, Tai, & Carrasco, 2002). It also has been found to be moderated by hand proximity for targets of 2.1 cpd and 4.2 cpd (Abrams & Weidler, 2013), such that performance was improved or impaired depending on target spatial frequency. Moreover, the precise spatial frequencies used—1 cpd and 10 cpd—in the same laboratory as the present study have driven disordinal (cross-over) interactions between hand proximity and spatial frequency (Goodhew & Clarke, 2016). Third, the temporal gap detection task used in Experiment 3 is impaired by covert shifts of attention (Yeshurun & Levy, 2003) and improved by hand proximity (Gozli et al., 2012). Altogether, this is compelling evidence that the tasks we used to gauge M-cell contribution to visual perception are sensitive to modulation and that the absence of modulation in the current experiments is theoretically meaningful. That is, changes in the size of the attended-region modulates P-cell-mediated and not M-cell-mediated visual performance.

Conclusions

Across three experiments, we found that while spatial acuity and high spatial-frequency perception was improved by a narrow versus wide attended-region size, temporal acuity and low-spatial frequency perception was unchanged. This clearly demonstrates that the zoom-lens model does not generalise across different perceptual domains. Instead, it is evidence consistent with the pathway-selective effects of visual attention and in particular the model SSE, which proposes that P-cell-mediated visual perception benefits from a narrow versus wide attended region, whereas M-cell-mediated visual perception is unaffected by changes in the size of the attended region. This highlights the need to consider the contrasting major visual pathways when theorising about the impact of visual attention.