Elsevier

Brain Research

Volume 1117, Issue 1, 30 October 2006, Pages 154-161

Research Report
Is high-spatial frequency information used in the early stages of face detection?

https://doi.org/10.1016/j.brainres.2006.07.059

Abstract

The present study examined the role of high-spatial frequency information in early face processing, as indexed by the N170 face-sensitive ERP component. Participants detected four versions of famous faces: full spectrum faces, and bandpass filtered faces containing predominantly high-spatial frequencies, low-spatial frequencies or both. The power spectra of all stimuli were balanced by superimposing the faces onto a visual noise background that included the spatial frequency information that was missing in filtered faces, e.g., high-spatial frequency faces were presented on a high- and low-spatial frequency background. An additional condition comprising filtered visual noise only was also created to ensure that any observed effects were related to the processing of faces and not simply due to variations in spatial frequency content. Both behavioral and electrophysiological results replicated previous findings of a low-spatial frequency advantage for face processing. However, our results also show that faces containing both high- and low-spatial frequency information are detected faster and more accurately than faces containing predominantly low-spatial frequencies. Furthermore, this advantage was accompanied by an enhanced amplitude of the N170. Together, these findings refute the suggestion that high-spatial frequencies are redundant in face perception.

Introduction

Faces contain a wide spectrum of spatial frequency information. Several studies have used spatial frequency filtering techniques to investigate the minimal spectral information required for face recognition. Variations among the spatial filtering techniques used, as well as differences in experimental design, have often led to contradictory results and hence to different conclusions about the relative importance of different spatial frequencies for face recognition. However, it is generally agreed that much of the information relevant for face recognition is conveyed by spatial frequencies from the middle (8–16 cycles per face) (Collin et al., 2004, Costen et al., 1994, Fiorentini et al., 1983, Grabowska and Nowicka, 1996, Morrison and Schyns, 2001, Näsänen, 1999, Parker and Costen, 1999) as well as the lower spatial frequency (LSF) bands (Ginsburg, 1978, Harmon, 1973, Harmon and Julesz, 1973). For example, Harmon (1973) showed that removal of high-spatial frequency (HSF) information had little effect on participants' ability to recognize faces, and Ginsburg (1978) showed that faces containing only LSFs could be matched successfully against broad-spectrum face images. The results of these behavioral studies parallel those of more recent electrophysiological investigations demonstrating that the face-sensitive N170 event-related potential (ERP) component appears to be sensitive to LSF, but not HSF, information (Goffaux et al., 2003a, Goffaux et al., 2003b). While the aforementioned studies have shown how variations in spatial frequency content affect recognition of faces, other investigations have concluded that face recognition is more strongly affected by spatial frequency overlap (SFO) than by the actual spatial frequency content of face images (Collin et al., 2004, Kornowski and Petersik, 2003, Liu et al., 2000). For example, Liu et al. (2000) defined SFO as the range of spatial frequencies shared by a pair of filtered images and found that greater overlap between learned and test images yielded better performance on face matching tasks. Moreover, this effect appears to be confined to faces, as object recognition did not show the same sensitivity to SFO (Collin et al., 2004). However, since these studies did not examine the role of spatial frequency overlap in tasks other than face matching, it remains to be determined whether these results generalize to other face perception tasks, e.g., detection, gender or emotional expression analysis.

The claim that face recognition is dependent on LSF information, with higher spatial frequencies contributing essentially redundant information, has been refuted by some investigators. For example, Fiorentini et al. (1983) trained participants to identify (by name learning) nine originally unfamiliar male faces, and examined how recognition (name recall) was affected when faces were filtered so that they contained only LSF information, only HSF information, or a combination of both high- and low-spatial frequencies. The results showed that both coarse and fine scale information can be used to identify faces, but that the addition of HSFs to LSFs significantly improved face recognition. Based on this finding, the authors concluded that HSF information is not necessarily redundant in face perception. In an attempt to resolve this controversy, Sergent and Hellige (1986) suggested that there is no “critical bandwidth” of spatial frequencies that is necessary for face perception. Rather, the relative contribution of spatial frequencies depends on the kind of face judgement participants are asked to make. This idea forms the basis of the “flexible usage hypothesis” proposed by Schyns and Oliva (1997, 1999). According to this hypothesis, distinct categorizations of an image will require different perceptual cues, which themselves could be associated with different regions of the spatial spectrum. In the case of faces, distinct spatial frequencies may convey face identity, gender and expression (Schyns and Oliva, 1999, Sergent and Hellige, 1986). For example, the age of a face is probably best conveyed by HSFs, which represent information pertaining to the features of the face, such as wrinkles around the eyes and mouth or creases on the forehead. In contrast, other categorizations might require information represented at a coarser scale.
Thus, Schyns and Oliva (1999) suggest that scale usage for categorization may be flexible and determined by the usefulness (or diagnosticity) of cues at specific spatial scales. This view has received support from behavioral studies (Schyns and Oliva, 1999, Sergent and Hellige, 1986, Smith et al., 2005), as well as partial support from an ERP study in which the N170 was larger for LSF than HSF faces during a gender task, but not during a familiarity task (Goffaux et al., 2003b). However, since the predicted larger N170 for HSF than LSF faces in the familiarity task was not obtained, there is now increasing support for the idea that HSFs are not critical, at least for the early stages of face processing.

The aim of the current study was to further investigate whether HSF information has any influence in the early stages of face processing as indexed by the N170. In the following experiments the spatial frequency content of black and white photographic images of famous faces was manipulated so that the relative importance of LSF and HSF in the early stages of face processing could be assessed using a face detection task. We used a face detection task for two reasons: (1) the N170 is thought to reflect early face processing mechanisms related to the structural encoding of faces (Bentin and Deouell, 2000, Bentin et al., 1996, Eimer, 2000a, Holmes et al., 2003) and has been found to be insensitive to changes in facial expression (Eimer et al., 2003, Holmes et al., 2003) or familiarity (Eimer, 2000b), and (2) face detection is not thought to rely heavily on HSF information, so a detection task provides a particularly strict test of whether HSFs contribute to the processing of faces. Previous studies investigating the effect of spatial frequency on face recognition have typically used learning methods (such as name recall) to familiarize participants with previously unknown faces; we used famous faces because we thought this would involve a more naturalistic processing strategy. As mentioned, the N170 is thought to reflect the early stages of encoding of facial information (detection of the basic configuration of the face and/or of the eyes in particular; Bentin and Deouell, 2000, Bentin et al., 1996). There is evidence to suggest that the perceptual mechanisms underlying the processing of featural and configural information differ, and that the difference may be related to underlying properties of the channels involved in the processing of low- and high-spatial frequencies.
The processing of configural (or global) information is thought to depend on its LSF content and the processing of featural information is thought to rely more on its HSF content (Boeschoten et al., 2005). Moreover, evidence from neuroimaging studies suggests that HSF and LSF information engage different brain areas. While some studies suggest that HSF information is processed in the left hemisphere and LSF information in the right, others conclude that the processing of configural and featural information is mapped according to spatial frequency, that is, according to the retinotopic map of the visual cortex (Boeschoten et al., 2005).

The general aim of this study was to investigate how varying the spatial frequency information depicted in faces affects (a) the amplitude and/or latency of the N170 and (b) the accuracy and speed with which faces are detected. The different face stimuli used were: (1) original faces: these were unfiltered images of faces and hence contained a wide range of spatial frequency information. Inclusion of this condition allowed us to replicate the basic prior findings with the N170; (2) low-spatial frequency faces: these faces contained only low-spatial frequency information and therefore depicted information thought to be predominantly relevant to the overall configural properties of faces; (3) high-spatial frequency faces: these faces contained only high-spatial frequency information and therefore depicted information predominantly relevant to the textural/featural properties of faces; and (4) high- and low-spatial frequency faces: these faces contained both low- and high-spatial frequency information but differed from unfiltered faces in that they did not contain mid-band spatial frequency information. All the faces were superimposed onto a structured visual noise background that consisted of the spatial frequency information that was missing as a consequence of bandpass filtering the faces, i.e., both HSF and LSF faces were superimposed on an HSF and LSF noise background. In this way, we were able to ensure that the spectral energy of our stimuli was balanced, allowing us to rule out early visual effects that might trivially arise from power spectra variations among the different stimulus conditions. An additional condition was included that comprised HSF and LSF structured visual noise only. The low-level physical properties of High-and-Low Faces and High-and-Low Noise are identical; therefore, any difference in the amplitude and/or latency of the N170 between these two conditions is attributable to the presence of face information.
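The stimulus construction described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: it assumes a hard-edged radial filter in the Fourier domain, square grayscale images, and a noise image supplied by the caller, and it reads "the information that was missing" as the exact complement of the face's passband. The cutoffs (below 8 and above 24 cycles per image) are those reported in the Stimuli section; the function names `radial_frequency`, `filter_image` and `make_stimulus` are hypothetical.

```python
import numpy as np


def radial_frequency(shape):
    """Radial frequency of every FFT coefficient, in cycles per image."""
    fy = np.fft.fftfreq(shape[0]) * shape[0]
    fx = np.fft.fftfreq(shape[1]) * shape[1]
    return np.hypot(fy[:, None], fx[None, :])


def filter_image(image, keep_mask):
    """Zero all Fourier coefficients outside keep_mask (hard-edged filter)."""
    spectrum = np.fft.fft2(image)
    # The mask depends only on |frequency|, so the result stays real
    # up to rounding error for a real-valued input image.
    return np.real(np.fft.ifft2(spectrum * keep_mask))


def make_stimulus(face, noise, face_band):
    """Band-limited face plus noise restricted to the complementary band.

    Assumption: the noise background carries exactly the frequencies
    removed from the face, so every stimulus spans the full spectrum.
    """
    radius = radial_frequency(face.shape)
    masks = {
        "low": radius < 8,    # LSF cutoff from the Stimuli section
        "high": radius > 24,  # HSF cutoff from the Stimuli section
    }
    masks["high_and_low"] = masks["low"] | masks["high"]
    keep = masks[face_band]
    return filter_image(face, keep) + filter_image(noise, ~keep)
```

Because each Fourier coefficient of the result comes from either the face or the noise, never both, no spatial frequency band is left empty in any condition. Whether the amplitude spectra actually match across conditions depends on how the noise is generated, which this sketch leaves open.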

We predicted that if HSFs contribute to the early stages of face processing, the N170 should be (a) greater to HSF faces than to spatial frequency matched noise, and (b) greater to High-and-Low faces than to LSF faces. In addition, we predicted that if HSF information contributes to face detection, participants would detect High-and-Low faces more accurately than LSF faces.

Section snippets

N170 amplitude

Only those ERP trials in which participants made correct responses were included in the analysis of the amplitude of the N170. A repeated measures ANOVA showed that there was a main effect of condition on the amplitude of the N170 (F(4,44) = 46.84, P < 0.001) (relevant means and standard deviations are shown in Table 1). As predicted, the amplitude of the N170 was larger for HSF faces compared to noise (F(1,11) = 36.14, P < 0.001). The N170 was also larger for faces containing both high and low-spatial

Discussion

Varying the spatial frequency content of faces affects the early encoding of faces as indexed by the N170. LSF faces elicited a robust N170 that was only somewhat smaller than that elicited by broadband faces, while HSF faces elicited a significantly smaller response. Our finding showing a larger N170 to LSF faces compared to HSF faces is consistent with previous event-related potential studies (Goffaux et al., 2003a, Goffaux et al., 2003b), but is inconsistent with the results of a recent

Participants

Twelve paid volunteers (6 females, mean age 30.1 years, age range 23–42 years) participated in the experiment. All participants were right-handed and had normal or corrected-to-normal vision.

Stimuli

The stimuli were derived from black and white images of 22 famous faces (see Fig. 1 for examples of the stimuli used). Four face types were used: original (full spectrum), high (> 24 cycles per image), low (< 8 cycles per image) and High-and-Low Faces (superimposition of high and low faces). In order to

References (28)

  • S. Bentin et al. Electrophysiological studies of face perception in humans. J. Cogn. Neurosci. (1996)
  • C.A. Collin et al. Face recognition is affected by similarity in spatial frequency range to a greater degree than within-category object recognition. J. Exp. Psychol. Hum. Percept. Perform. (2004)
  • N.P. Costen et al. Spatial content and spatial quantisation effects in face recognition. Perception (1994)
  • N.P. Costen et al. Effects of high-pass and low-pass spatial filtering on face identification. Percept. Psychophys. (1996)