Introduction

The most widely adopted approach for studying the organization of cognitive functions, at least in cognitive neuroscience, is based on identifying dissociations in task performance between subjects, both with and without brain damage. While this approach is not without problems (Gerlach, Lissau, & Hildebrandt, 2018), it has been fairly successful in pinpointing domain-specific effects. As an example, it has been shown that inversion is generally more harmful for face than for object recognition, suggesting that face and object processing differ in important respects (Bruyer, 2011; Yin, 1969). What this approach does not consider are operations that are commonly engaged across tasks, that is, domain-general operations. The reliance on dissociations as the only means of “carving nature at its joints” is somewhat unfortunate, as identification of associations (commonalities) should be just as important as identification of dissociations in attempts to fully understand cognitive functioning. In line with this reasoning, Wilmer (2008) has advocated for the use of an individual difference approach where individual differences among measures are not simply treated as noise, as is often the case in group studies, but rather as potentially useful information that can assist in determining whether, and how much, different tasks rely on the same resources.

The use of an individual difference approach has become increasingly prevalent in studies of face recognition where several attempts have been made to examine whether individual differences in measures of holistic processing (e.g., the composite face paradigm (Young, Hellawell, & Hay, 2013), and the part-whole paradigm (Tanaka & Farah, 1993)) can predict face recognition performance. Unfortunately, the results from these studies have been rather inconsistent. Some studies have reported positive findings (DeGutis, Wilmer, Mercado, & Cohan, 2013; Richler, Cheung, & Gauthier, 2011; Wang, Li, Fang, Tian, & Liu, 2012), whereas others have not (Konar, Bennett, & Sekuler, 2010; Richler, Floyd, & Gauthier, 2015; Sunday, Richler, & Gauthier, 2017). Although some studies suggest that these measures of holistic processing tap the same process (DeGutis et al., 2013), other studies have failed to find a correlation between them (Rezlescu, Susilo, Wilmer, & Caramazza, 2017; Wang et al., 2012). Correspondingly, there also appear to be many different interpretations of what “holistic” means (Richler, Palmeri, & Gauthier, 2012), and until this uncertainty has been resolved it has been suggested that the term be dropped in favor of just referring to the specific effects (the composite effect, the part-whole effect, the inversion effect, etc.) (Rezlescu et al., 2017).

The focus of the present paper is on the global precedence effect, which may or may not be considered an aspect of holistic processing depending on which definition of holistic one is committed to. The global precedence effect, as its name implies, refers to the observation that the overall layout (global level) of a stimulus is often processed before its details (local level). The effect is probably best known in the context of Navon’s paradigm (Navon, 1977), which typically uses stimuli where large letters (global level) are composed of smaller letters (local level), and in which the global and the local letters may be the same (consistent) or different (inconsistent) (Fig. 1a). While different effects may be obtained with this paradigm depending on exposure duration, masking, letter spacing, attentional demands, etc. (Kimchi, 1992; Navon, 2003; Yovel, Levy, & Yovel, 2001), three effects are usually found: (i) a global precedence effect, which is the effect we will focus on here, with faster judgments of the identity of the large letter (the global shape) compared with the small letters, (ii) an interference effect, with slower responses to inconsistent than consistent stimuli, and (iii) an inter-level interference effect with greater interference effects on local compared with global identity trials.

Fig. 1 Example stimuli from: (a) the Navon paradigm – consistent and inconsistent stimuli, and (b) the object-decision task – real object (whistle) and non(sense)-object (half wolf, half mule)

The global precedence effect seems to fit with findings from other paradigms suggesting a coarse-to-fine temporal dynamic in visual object processing (Bar et al., 2006; Gerlach, 2017; Hegde, 2008; Mace, Joubert, Nespoulous, & Fabre-Thorpe, 2009; Poncet & Fabre-Thorpe, 2014; Sanocki, 1993; Schyns & Oliva, 1994; Wu, Crouzet, Thorpe, & Fabre-Thorpe, 2015), and patients with visual recognition deficits following brain damage have also been found to perform abnormally in Navon’s paradigm (Behrmann & Kimchi, 2003; Gerlach, Marstrand, Habekost, & Gade, 2005). This suggests that Navon’s paradigm may tap some of the same operations that underlie visual object processing. This aspect is not trivial considering that compound stimuli differ from common objects in that they are formations of elements, rather than actual objects, and that the elements that constitute their global shape are not features of the global shape but objects in their own right. In more direct support of the notion that the global precedence effect may tap operations underlying visual object recognition, we recently showed – by means of an individual difference approach – that global precedence effects in Navon’s paradigm can explain a considerable amount of variance in two standard object classification paradigms: object decision and superordinate categorization (Gerlach & Poirel, 2018).

There is reason to believe that a coarse-to-fine temporal dynamic may not only characterize visual object processing but also face processing where the global structure of the face may dominate in the initial phases providing a stable representation of the image wherein more detailed information regarding features can later be embedded. Evidence supporting this notion has been presented by Goffaux et al. (2011). They showed that processing of face information in most face-preferring brain regions, and especially in the right fusiform face area, initially relies on low spatial frequencies, which are deemed particularly important for processing of configural information (Goffaux, Hault, Michel, Vuong, & Rossion, 2005), but that with increasing exposure durations, these regions attenuate processing of low spatial frequencies in favor of higher spatial frequencies. Likewise, the human N170, which is an electrophysiological component associated with face detection, is stronger in response to low-pass than high-pass filtered images of faces (Flevaris, Robertson, & Bentin, 2008).

If face processing, like object processing, is characterized by a coarse-to-fine temporal dynamic, it seems reasonable to expect that individual differences in face recognition performance may also be systematically related to the global precedence effect as measured in Navon’s paradigm. Some indication of a connection was obtained in a study by Gao, Flevaris, Robertson, and Bentin (2011). They showed that the composite face effect was larger for subjects when they had attended to the global level of compound stimuli in Navon’s paradigm compared with the local level (or had not been primed at all). Even though this result does not reflect a global precedence effect as such, it does demonstrate that a global shape bias can affect face recognition performance. More direct evidence for a systematic relationship between global precedence effects and face recognition performance was obtained in a recent study of ours in a group of developmental prosopagnosics (DPs) – individuals with a disorder characterized by profound and lifelong difficulties with face recognition (Duchaine, 2011). We initially showed that face recognition ability in this group correlated with recognition of objects when the objects were presented as silhouettes and as fragmented forms (Gerlach, Klargaard, & Starrfelt, 2016), stimuli that are likely to place particular demands on global shape information in the recognition process (Gerlach, 2009; Gerlach & Toft, 2011). Furthermore, the magnitude of the DPs’ global precedence effects in Navon’s paradigm correlated with both their face recognition performance and their ability to recognize degraded objects (silhouettes and fragmented forms) (Gerlach, Klargaard, Petersen, & Starrfelt, 2017).

The objectives of the present experiments were to examine whether the global precedence effect, as measured with Navon’s paradigm, would also be systematically related to face recognition performance in typically developed subjects, and whether such an effect would be of similar magnitude for face compared with object recognition in the same subjects.

Method

Participants

Eighty-seven first-year psychology students (66 females; mean age = 23.2 years, SD = 5.4 years) from the University of Southern Denmark, naïve to the specific hypotheses tested, took part in the study as part of their course in cognitive psychology. The course is approved by the study board at the Department of Psychology, University of Southern Denmark, and the experiments conducted do not require formal ethical approval/registration according to Danish Law and the institutional requirements. Prior to participation the students were informed that data collected in the experiments might be used in an anonymous form in future publications. Participants were free to opt out if they wished, and participation in the experiments was taken as consent. No participants opted out and hence the sample size was determined by the number of students who took the course that year and were present at the individual test dates. No participants were excluded from the analyses reported below.

Procedure and stimuli

Navon’s paradigm

The participants were presented with large letters, either ‘H’ or ‘S’, that could consist of smaller ‘H’s or ‘S’s (Fig. 1a).

Each participant was presented with four experimental blocks presented in an ABBA design. The participants were required to report the identity of the global letter in two of the blocks and the identity of the local letters in the others. Half of the participants began with global identity judgements.

The large letters subtended 3.91° × 5.25° and the small letters 0.47° × 0.67° of visual angle. The fixation cross presented before stimulus onset subtended 0.95° × 0.95° of visual angle. All stimuli were black and presented on a white background on a computer screen.
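As a rough illustration of what these angular sizes mean on screen, the sketch below converts visual angle to physical extent. The viewing distance is not reported in the text, so the 57 cm used here is purely a hypothetical value, and the function name is likewise illustrative.

```python
import math


def visual_angle_to_cm(angle_deg, viewing_distance_cm):
    """Physical extent (cm) that subtends angle_deg at the given distance."""
    return 2 * viewing_distance_cm * math.tan(math.radians(angle_deg) / 2)


# Hypothetical viewing distance; the text does not report one.
VIEWING_DISTANCE_CM = 57.0

# Angular sizes reported in the text as (width, height), in degrees.
stimuli_deg = {
    "global letter": (3.91, 5.25),
    "local letter": (0.47, 0.67),
    "fixation cross": (0.95, 0.95),
}

for name, (w_deg, h_deg) in stimuli_deg.items():
    w_cm = visual_angle_to_cm(w_deg, VIEWING_DISTANCE_CM)
    h_cm = visual_angle_to_cm(h_deg, VIEWING_DISTANCE_CM)
    print(f"{name}: {w_cm:.2f} cm x {h_cm:.2f} cm")
```

With a different viewing distance the physical sizes scale accordingly, while the angular sizes reported above stay the same.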

Participants performed a total of 80 trials in each block (40 consistent/40 inconsistent). Stimuli were shown at four different positions (top, bottom, left, and right) relative to the fixation cross. An equal number of stimuli within each block (10 consistent/10 inconsistent) were presented at the four locations and the center of the global shape was positioned 3.34° of visual angle from the fixation cross. The order of position and consistency was randomized.
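The block structure described above can be sketched as follows. This is a minimal illustration, not the software used in the study: the function names are hypothetical, and the balancing of global letter identity within each cell is an assumption that the text does not specify.

```python
import random

POSITIONS = ("top", "bottom", "left", "right")
LETTERS = ("H", "S")


def build_block(task, rng, per_cell=10):
    """Return a shuffled trial list for one block.

    task: "global" or "local" (the level the participant reports).
    per_cell: trials per position x consistency cell (10 in the text).
    """
    trials = []
    for position in POSITIONS:
        for consistent in (True, False):
            for i in range(per_cell):
                # Assumption: global letter identity is balanced within a cell.
                global_letter = LETTERS[i % 2]
                if consistent:
                    local_letter = global_letter
                else:
                    local_letter = "S" if global_letter == "H" else "H"
                trials.append({
                    "task": task,
                    "position": position,
                    "consistent": consistent,
                    "global_letter": global_letter,
                    "local_letter": local_letter,
                })
    rng.shuffle(trials)  # randomize order of position and consistency
    return trials


def build_session(first_task, rng):
    """Four blocks in ABBA order, starting with first_task."""
    other = "local" if first_task == "global" else "global"
    order = [first_task, other, other, first_task]
    return [build_block(task, rng) for task in order]


session = build_session("global", random.Random(0))
print([block[0]["task"] for block in session])
print([len(block) for block in session])
```

Calling `build_session("local", ...)` for half of the participants would give the counterbalanced starting task described in the text.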

A trial began with a fixation cross presented in the middle of the screen for 1 s, which the participants were instructed to look at. This was followed by stimulus onset, which was replaced after 180 ms by a blank screen that remained until response. Reaction times (RTs) were recorded by means of the keyboard. Before each block, the participants performed 16 practice trials. Feedback (accuracy) was provided on the practice but not the experimental trials.

The Cambridge Face Memory Test (CFMT) and the Cambridge Car Memory Test (CCMT)

In the CFMT (Duchaine & Nakayama, 2006) and the CCMT (Dennett et al., 2012), the participant is introduced to six target stimuli, either faces or cars, and then tested for recognition of these after a short delay with forced choice among three stimuli. A total of 72 trials are distributed over three phases: (i) an 18-trial intro-phase where the study stimulus and the target stimulus are identical, (ii) a 30-trial novel-phase where the target differs from the study stimulus in pose and/or lighting, and (iii) a 24-trial novel+noise phase where the target differs from the study stimulus in pose and/or lighting and where Gaussian noise is added to the target. The dependent measure is number of correct trials.

Object decision

In this task participants must decide whether stimuli depict real objects or non(sense)-objects. All non-objects were chimeric combinations of real objects (Fig. 1b).

The participants were instructed to press the ‘M’ key for a real object and the ‘N’ key for a non-object, and they were encouraged to respond as fast and as accurately as possible. Prior to data collection the participants performed a practice version of the task. Stimuli used in the practice version were not used in the actual experimental condition.

One hundred and sixty pictures (80 real objects/80 non-objects) were presented in random order. All stimuli were presented centrally on a white background and subtended 3–5° of visual angle. The stimuli were displayed until the participants gave a response. The interval between response and presentation of the next object was 1 s.

The participants first performed the Navon task (N = 87). One week later they performed the CFMT (n = 81) and the CCMT (n = 83), with task order counterbalanced across participants. Finally, they performed the object-decision task (n = 81), separated by a week from the Cambridge tests. The difference in the number of participants across the tasks is primarily due to some participants being absent on some of the test dates, but also to a few instances where data files from the Cambridge tests were accidentally overwritten.

Data analysis

Following Wang et al. (2012), we use a relative measure of face recognition ability (FRA) based on subtracting car recognition performance from face recognition performance in the two otherwise identical Cambridge tests. This procedure is assumed to isolate processes relatively specific to face recognition because variance reflecting domain-general cognitive processes (visual-discrimination ability, visual attention, etc.) is subtracted out (Wang et al., 2012). This aspect is particularly important as it has been shown that previous findings of correlations between the CFMT and measures of holistic processing may have reflected a learning component in the task (the same target stimuli are repeated) rather than a perceptual component of it (Richler et al., 2015). We note, however, that the validity of the subtraction approach is debated. DeGutis et al. (2013), for example, have pointed out that a potential weakness of the subtraction approach is that the dependent measure, being a relative measure, may reflect variation in both the measure of interest (here CFMT performance) and the “baseline” measure (here CCMT performance), making interpretations of potential correlations with the relative measure (here FRA) difficult. Hence, a potential positive correlation between the FRA and global precedence effects may reflect that positive global precedence effects are either beneficial to face recognition (CFMT), harmful for car recognition (CCMT), or a combination of both effects. Instead DeGutis et al. (2013) argue for a regression-based approach because regressing the control conditions from the condition of interest creates a better measure that is independent of the control condition. Whether this regression-based approach is in fact better than an approach based on subtraction is not clear, and it is not the case that the regression approach always yields better results than an approach based on subtraction (Ross, Richler, & Gauthier, 2015). 
It could be argued, for example, that the subtraction approach is more “true” to the individual difference approach because each subject serves as its own control rather than each subject’s performance (on the measure of interest) being estimated based on deviation from the relationship for the group as a whole (the regression line relating performance on the variable of interest and the control variable). However, given this uncertainty, we adopt both approaches here.
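The two ways of deriving a relative measure discussed above can be illustrated with a minimal sketch. The scores below are made up for illustration, the function names are hypothetical, and ordinary least squares with a single predictor is assumed for the regression-based variant.

```python
from statistics import mean


def subtraction_scores(cfmt, ccmt):
    """Relative measure by subtraction: each subject is their own control."""
    return [f - c for f, c in zip(cfmt, ccmt)]


def residual_scores(cfmt, ccmt):
    """Relative measure by regression: residuals of CFMT regressed on CCMT.

    Each residual is the deviation from the group-level regression line,
    so the resulting scores are uncorrelated with CCMT by construction.
    """
    mx, my = mean(ccmt), mean(cfmt)
    sxx = sum((x - mx) ** 2 for x in ccmt)
    sxy = sum((x - mx) * (y - my) for x, y in zip(ccmt, cfmt))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(ccmt, cfmt)]


# Made-up numbers of correct trials (out of 72), for illustration only.
cfmt = [58, 62, 49, 66, 55, 60]
ccmt = [50, 47, 52, 55, 41, 53]

print(subtraction_scores(cfmt, ccmt))
print([round(r, 2) for r in residual_scores(cfmt, ccmt)])
```

The contrast mirrors the argument in the text: the subtraction score depends only on a subject's own two scores, whereas the residual score also depends on how the rest of the sample performed.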

As demonstrated by Gerlach and Krumborg (2014), many formerly used indexes derived from Navon’s paradigm have probably been unreliable. To avoid this problem, we use an index of global/local shape bias which is based on the standardized mean difference (Cohen’s d) between RTs to Local Consistent and Global Consistent trials. In comparison with other indexes derived from Navon’s paradigm, this index, which we term the Global-Local precedence index, is pure, because it measures differences in global and local processing that are not confounded by interference effects (Gerlach & Krumborg, 2014). The higher the score on this index, the faster are responses to global as compared with local shape characteristics. This index is also the same as that used in the study by Gerlach et al. (2017), which showed a correlation with face recognition performance in DP, and in the study by Gerlach and Poirel (2018), which showed a correlation with object classification performance.
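To make the index concrete, the sketch below computes a standardized mean difference between a single participant's RTs on Local Consistent and Global Consistent trials. The use of a pooled standard deviation in the denominator is an assumption (one common form of Cohen's d), the RTs are made up, and the function name is hypothetical.

```python
import math
from statistics import mean, variance


def global_local_precedence_index(local_rts, global_rts):
    """Cohen's d for Local Consistent minus Global Consistent RTs.

    Positive values mean faster responses to the global level.
    Assumption: pooled-SD form of Cohen's d.
    """
    n_local, n_global = len(local_rts), len(global_rts)
    pooled_var = (
        (n_local - 1) * variance(local_rts)
        + (n_global - 1) * variance(global_rts)
    ) / (n_local + n_global - 2)
    return (mean(local_rts) - mean(global_rts)) / math.sqrt(pooled_var)


# Made-up RTs (ms) for one participant, for illustration only.
local_consistent = [500, 520, 540]
global_consistent = [450, 470, 490]

print(global_local_precedence_index(local_consistent, global_consistent))
```

Because only consistent trials enter the computation, the score is not affected by interference from a conflicting letter at the other level, which is the property the text describes as making the index pure.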

Analysis of the object-decision task was restricted to real objects only as the non-objects serve no other purpose in the present study than to ensure detailed shape processing of the real objects (Gerlach & Toft, 2011). The mean percentage of correct answers was 95% (SD = 3.7), with some individuals performing at ceiling. Hence, we decided to use the mean correct RT for each individual as the dependent measure in the correlation analyses with the Global-Local precedence index.

The Pearson product-moment correlation coefficient was used to examine associations, and the confidence intervals (CIs) of these were computed by means of bias corrected and accelerated bootstrap analyses with 1,000 samples. To estimate the reliabilities of the indexes on which these correlation analyses are based we computed their Spearman-Brown-corrected split-half reliabilities. As can be seen in Table 1, these can be deemed satisfactory (Cook & Beckman, 2006).
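The reliability and interval estimates described above can be sketched in a few lines. Two caveats apply: the bootstrap shown here is the simpler percentile variant, not the bias corrected and accelerated (BCa) variant used in the analyses, and the half-test scores are made up. All function names are hypothetical.

```python
import math
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test reliability."""
    return 2 * r_half / (1 + r_half)


def percentile_bootstrap_ci(x, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for r (a simpler stand-in for BCa)."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample subjects
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue  # r is undefined for a constant resample
        stats.append(pearson_r(xs, ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Made-up per-participant scores from odd and even trials.
odd_half = [0.9, 1.4, 0.3, 1.1, 0.7, 1.6, 0.5, 1.2]
even_half = [1.0, 1.2, 0.5, 0.9, 0.8, 1.5, 0.2, 1.3]

r_half = pearson_r(odd_half, even_half)
reliability = spearman_brown(r_half)
ci_low, ci_high = percentile_bootstrap_ci(odd_half, even_half)
print(f"r_half = {r_half:.2f}, corrected = {reliability:.2f}")
print(f"95% percentile CI for r_half: {ci_low:.2f} to {ci_high:.2f}")
```

The correction step reflects that each half contains only half of the trials, so the raw half-to-half correlation underestimates the reliability of the full-length measure.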

Table 1 Mean values and 95% confidence intervals (CIs) for face recognition ability (FRA), the Cambridge Face Memory Test, the Cambridge Car Memory Test, the Global-Local precedence index derived from Navon’s paradigm, and the reaction time (ms) in the object recognition task (object decision). Also shown is the Spearman-Brown-corrected split-half reliability associated with the individual measures

As can be seen from Table 1, the participants exhibited the usual pattern found with Navon’s paradigm as 92% (Fig. 2b) were faster with global than with local identity judgements on consistent trials (corresponding to a positive score on the Global-Local precedence index).

Fig. 2 (a) Individual scores on the face recognition index (FRA) computed by subtracting the score on the Cambridge Car Memory Test from the score on the Cambridge Face Memory Test, and (b) the individual scores on the Global-Local precedence index. For the former index, positive values represent better performance with faces relative to cars. For the latter index, positive scores represent a global bias, and scores are expressed as effect sizes (Cohen’s d). For both indexes, scores are ordered numerically

Results

The mean FRA-score was 9.2, reflecting the fact that faces were recognized more efficiently than cars (Table 1). This was true for the majority of the individuals (84%) (Fig. 2a).

The Global-Local precedence index correlated positively with the FRA index (r = .25, 95% CI = .06 to .45) (Fig. 3a), and negatively with RTs in the object-decision task (r = -.22, 95% CI = -.39 to -.02) (Fig. 3b).

Fig. 3 Scatterplots showing the relationship between: (a) face recognition index (FRA) and the Global-Local precedence index, and (b) object recognition performance and the Global-Local precedence index. Also shown are the Pearson correlation coefficients (r) and their associated 95% confidence intervals (CIs)

To examine whether the Global-Local precedence index would also correlate with CFMT performance when performance on the CCMT was used as a regressor (rather than as part of a subtraction), we computed the partial correlation between the Global-Local precedence index and CFMT performance controlling for CCMT performance. This yielded results similar to those of the subtraction-based approach (r = .21, 95% CI = .02 to .40). While neither the CFMT nor the CCMT correlated reliably with the Global-Local precedence index when assessed individually, we note that the correlation was positive for the CFMT (r = .16, 95% CI = -.12 to .39) and negative for the CCMT (r = -.14, 95% CI = -.36 to .11). The correlation between the CFMT and CCMT was, however, reliable (r = .37, 95% CI = .16 to .55), suggesting that these tasks share a considerable amount of variance.
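For three variables, a first-order partial correlation can be obtained from the three pairwise Pearson coefficients. The sketch below applies that standard formula to the rounded coefficients reported in this paragraph; the function name is hypothetical. Because the reported coefficients are rounded and the number of participants differed across tasks, the output (about .23) should not be expected to match the reported partial correlation (r = .21) exactly.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with z partialled out of both."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator


# Rounded pairwise coefficients as reported in the text.
r_index_cfmt = 0.16   # Global-Local precedence index with CFMT
r_index_ccmt = -0.14  # Global-Local precedence index with CCMT
r_cfmt_ccmt = 0.37    # CFMT with CCMT

r_partial = partial_correlation(r_index_cfmt, r_index_ccmt, r_cfmt_ccmt)
print(f"index-CFMT correlation controlling for CCMT: {r_partial:.2f}")
```

The formula also shows why the partial correlation exceeds the zero-order correlation with the CFMT here: the index relates negatively to the CCMT while the CCMT relates positively to the CFMT, so removing CCMT variance increases the numerator.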

Discussion

There has been an increase in studies using an individual difference approach to examine the organization of cognitive functions. This movement has been especially prevalent in studies of face recognition, where researchers have sought to tie individual differences in face recognition performance to measures of holistic processing. Findings have, however, been rather inconsistent. Whether this is due to poor reliability of the measures used (DeGutis et al., 2013; Ross et al., 2015; Sunday et al., 2017; Wang et al., 2012), inefficient control of the variables that may have driven observed effects (Richler et al., 2015), uncertainty regarding whether the measures employed tap the same construct (Rezlescu et al., 2017; Wang et al., 2012), or a combination is unclear. Still, an individual difference approach to visual cognition seems to be a valuable and also necessary complement to studies examining dissociations (Wilmer, 2008) as it can reveal domain-general operations. Indeed, a full understanding of visual processing must ultimately account for both domain-specific and domain-general operations.

One aspect that may characterize both face and object recognition, and thus be a candidate for a domain-general modus operandi in visual cognition, is a coarse-to-fine temporal dynamic where global shape information is derived prior to local shape information and may even augment the build-up of an elaborate visual representation of the stimulus (Gerlach, 2017; Goffaux et al., 2011). Perhaps the best-known example of a coarse-to-fine temporal dynamic is the global precedence effect in Navon’s paradigm (Navon, 1977), where people are faster at recognizing the large letter than the small letters in compound stimuli. Even though compound stimuli differ from common objects in important aspects (cf. the introduction), we have recently shown that individual differences in the magnitude of the global precedence effect in Navon’s paradigm can explain a considerable amount of variance in normal object classification performance (Gerlach & Poirel, 2018). In the present experiments we examined if global precedence effects can also explain individual differences in face recognition performance, and if so, whether the amount of variance accounted for by global precedence effects is similar for face and object recognition performance.

In line with our previous study (Gerlach & Poirel, 2018), we find a negative correlation between RTs in the object recognition task (object decision) and our index of global precedence derived from Navon’s paradigm (the Global-Local precedence index). The larger the global precedence effect, the more efficient the object recognition performance. In the present study the correlation was r = -.22 (95% CI = -.39 to -.02), compared with r = -.31 (95% CI = -.46 to -.11) in our previous study. Pooling data from the two studies, comprising a total of 197 subjects, yielded a correlation of r = -.28 (95% CI = -.42 to -.13). Of more interest in the present context was the finding that the Global-Local precedence index also correlated positively with our relative measure of face recognition ability (FRA), which was based on subtracting car recognition performance (indexed by the Cambridge Car Memory Test; CCMT) from face recognition performance (indexed by the Cambridge Face Memory Test; CFMT). The larger the global precedence effect, the more efficient was the face recognition performance (relative to car recognition performance). The effect size was also quite similar for FRA (r = .25) and for object recognition (r = -.22). These results suggest that some of the individual variation in both object and face recognition performance can be accounted for by a similar measure of global precedence effects, and this in turn supports the notion that both face and object recognition are characterized by a coarse-to-fine temporal dynamic where global shape information is derived prior to local shape information and may augment subsequent processing (Gerlach, 2017; Goffaux et al., 2011).
Hence, just as object-based grouping mechanisms have been shown to also support (holistic) face processing (Curby, Entenman, & Fleming, 2016; Curby, Goldstein, & Blacker, 2013; Zhao, Bulthoff, & Bulthoff, 2016), global precedence effects seem to affect face and object processing in a domain-general manner.

A theoretical explanation of why fast derivation of global relative to local shape information seems important for both face and object recognition can be found in the PACE model of visual object processing (Gerlach, 2009, 2017). This model assumes the existence of two operations: shape configuration and selection. Shape configuration refers to the binding of visual elements into elaborate shape descriptions in which relationships between the parts are specified, whereas selection refers to the matching of visual impressions to representations stored in visual long-term memory (VLTM). The matching process is thought of as a race among VLTM representations that compete for selection, and the VLTM representation that matches the configured representation the best according to a given criterion will win the competition, and hence be selected. The race is initiated by matching the outline (gestalt) of the stimulus to VLTM representations. This first-pass access to VLTM yields initial hypotheses regarding the likely identity of the stimulus. These hypotheses are then used in a top-down manner to augment the buildup of a more detailed description of the visual impression of the stimulus (i.e., shape configuration), which again serves as input for a more specific match with VLTM representations (Gerlach, 2017). The greater the demand placed on perceptual differentiation, the more loops comprising VLTM access and shape configuration are required to reach a successful match between the visual input and VLTM representations (i.e., recognition). According to this model fast derivation of global shape information is important in the recognition process because it facilitates the matching process, by narrowing down the scope of likely VLTM candidates, but also because it provides the initial frame in which local details can later be embedded (Sanocki, 1993, 2001). 
Hence, when interpreted within the PACE framework it makes good sense that faster derivation of global than local shape information is associated with (more) efficient recognition, and that this applies to both faces and objects.

A possible limitation of the present study is that the correlation between face recognition ability and global precedence effects was only found using a relative measure of face recognition ability based on subtracting CCMT performance from CFMT performance; i.e., the Global-Local precedence index did not correlate reliably with the absolute scores on the CFMT. A similar effect, however, was found by regressing CCMT from CFMT performance, rather than by subtracting the measures. We suggest that this is because the shared processes tapped by the CFMT and the CCMT (visual attention, stimulus repetition, etc.) are removed in these analyses. It is, however, also possible that global precedence effects have opposing impacts on car and face recognition so that a relative measure captures both effects. Hence, we can only conclude that global precedence effects predicted CFMT performance above and beyond what could be accounted for by the CCMT alone. This is in keeping with the notion that global precedence effects exert a positive influence on face recognition performance.

A question related to the possible limitation discussed above is why CCMT performance did not correlate positively with the Global-Local precedence index if object recognition is generally facilitated by global precedence effects, which we argue is the case. We see two possibilities that are not mutually exclusive: (i) It may be that cars form a special class of objects that have less in common with other non-face classes than faces do (e.g., susceptibility to global precedence effects). That this is not merely a tautology is supported by studies showing lower correlations in general between car recognition and recognition of non-face categories than between face recognition and the same non-face categories (McGugin, Richler, Herzmann, Speegle, & Gauthier, 2012; Van Gulick, McGugin, & Gauthier, 2016). This picture is corroborated by a study by Richler, Wilmer, and Gauthier (2017) showing that car recognition in general correlated less with recognition of other object categories than these other object categories correlated with one another. (ii) Another possible explanation is that the CCMT is primarily based on recognition of features whereas the CFMT and the object-decision task (our object recognition measure) necessitate configural processing that is supported by global shape information (Gerlach & Poirel, 2018; Goffaux et al., 2005). While the correlations between the Global-Local precedence index and performance on the CCMT and the CFMT were both unreliable, we note that the relationship was negative for the CCMT and positive for the CFMT, a pattern that is compatible with the post-hoc explanation that CCMT performance might primarily rely on featural (local) processing, whereas CFMT performance is likely to require global shape processing. 
This is not to say that global shape does not also benefit recognition of cars when cars are presented with other objects as in the object-decision task, but rather that the global shape of cars might be of little value when cars must be discriminated from other cars (within-class recognition) as in the CCMT.

We will close by noting that the correlations reported here may appear small in that global precedence effects account for no more than 5–6% of the variance in face and object recognition performance. On the other hand, given that face and object recognition are complex cognitive functions, which are affected by a number of other factors ranging from visual acuity to semantic access, and that the Navon paradigm uses neither faces nor objects as stimuli, the effects are not insignificant. In fact, considering that previous studies have found effects of similar or only slightly larger magnitude, if any effects at all, by comparing face-based measures with face recognition performance, the present findings are noteworthy. Also, even if the existence of domain-general operations, such as the one uncovered here concerning a global-to-local temporal dynamic, may be considered unsurprising, such operations do provide a bound for where face and object recognition might functionally diverge – an aspect that should be relevant for any model of visual object and face processing.
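The proportion of variance referred to above follows from squaring the correlation coefficients. A minimal sketch using the two coefficients reported in the Results (the function name is hypothetical):

```python
def variance_explained(r):
    """Proportion of variance shared by two measures correlated at r."""
    return r ** 2


# Correlations with the Global-Local precedence index, as reported.
reported_r = {"FRA": 0.25, "object decision RT": -0.22}

for label, r in reported_r.items():
    share = 100 * variance_explained(r)
    print(f"{label}: r = {r:+.2f}, variance explained = {share:.1f}%")
```

The sign of r drops out when squaring, which is why the positive correlation with FRA and the negative correlation with RTs can be compared directly in terms of shared variance.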

Author Note

This work was supported by a grant from the Danish Research Council for the Humanities (DFF – 4001-00115). We wish to express our gratitude to the Friends of Fakutsi Association (FFA).