Introduction

A common notion in psychology holds that faces are processed holistically (Farah et al. 1998; Fitousi 2013; Galton 1879). According to this idea, face parts are grouped together and perceived as a unitary whole or Gestalt. A compelling piece of evidence in favor of holistic face processing comes from the composite face illusion (Young et al. 1987). In this phenomenon, the top and bottom halves of faces from two well-known people are fused together to create a completely novel and unfamiliar face. When asked to recognize the top half of this composite face, people are slower and more error prone relative to a condition in which the halves are misaligned. This composite face effect (i.e., CFE) has also been demonstrated with unfamiliar faces (Hole 1994; Fitousi et al. 2010). Composite faces have become the primary tool for testing holistic face processing (for recent reviews, see Richler & Gauthier 2014; Rossion 2013).

The present study calls the common belief that composite faces are processed holistically into question. When submitted to rigorous tests of perceptual independence, the compelling impression invoked by composite faces appears to be misleading. In this sense, the present effort suggests a disillusionment from the composite face illusion.

The traditional explanation of the composite face effect postulates that one facial half is not perceived independently of the other half, such that parts are grouped into a holistic representation (Rossion 2013). The notion of independence (or the lack thereof) is central to research on holistic processing of faces. However, many studies do not provide a clear definition of independence (Garner and Morton 1969; Fitousi and Wenger 2013; Fitousi 2013; Ashby and Townsend 1986; Maddox 1992). The recent debate over the “correct” assay for measuring the composite face effect (Richler and Gauthier 2014; Rossion 2013) attests to this confusion. It is therefore crucial to ask whether holistic processing with composite faces is supported by other tasks and measures. Various operational and theoretical tests of independence were used in the current study. The main goal has been to provide converging operations on the notion of holistic processing with this class of stimuli (Fitousi and Wenger 2013; Fitousi 2013; 2014; Garner et al. 1956).

The many faces of perceptual independence

Independence refers to the ability of an observer to simultaneously process two sources of information or perform two tasks, such that performance with one is not affected by the other (Garner and Morton 1969; Fitousi 2013). Independence and its complement, dependence, have played an important role in psychological theorizing (Garner and Morton 1969; Luce and Tukey 1964; Morton 1969; Townsend 1971). The many possible definitions and measures of independence have been investigated by both experimentally and theoretically inclined researchers (Ashby and Townsend 1986; Fitousi and Wenger 2013; Fitousi 2013; Garner 1974; Rumelhart and McClelland 1981; Miller 1982; Massaro and Friedman 1990; Movellan and McClelland 2001; Townsend and Ashby 1983; Mordkoff and Yantis 1991). A fundamental distinction has been made between formal characterizations of independence in rate (Fifić and Townsend 2010; Fific et al. 2008; Teodorescu and Usher 2013; Logan et al. 2014; Townsend and Nozawa 1995; Townsend and Wenger 2004; Fific et al. 2010) and independence in information (Ashby & Townsend 1986; Garner & Morton 1969; Fitousi & Wenger 2013; Fitousi, 2013; Kadlec & Townsend 1992). Recently, however, formalized accounts have been developed that can deal with both types of dependence simultaneously (Townsend & Altieri 2012).

Facial holism: operational versus theoretical accounts

A brief review of the vast literature on face perception (Maurer et al., 2002; Diamond & Carey, 1986; Searcy & Bartlett, 1996; Leder & Bruce, 2000; Young et al., 1987; Tanaka & Farah, 1993; Farah et al., 1998; Ellison & Massaro, 1997; Gold et al., 2012; Loftus et al., 2004; Macho & Leder, 1998) reveals that definitions of independence are of two primary classes: operational and theoretical (Garner et al. 1956). Operational accounts rely on verbal propositions. These are tested with respect to differences in a performance measure (e.g., mean RT, mean accuracy) between two experimentally manipulated conditions. The majority of the composite face studies (Young et al. 1987) belong in that class. In contrast, theoretical definitions of holistic processing, dating at least to O’Toole, Wenger, and Townsend (2001), include formal, computational, and mathematical models (Wenger and Townsend 2001a, b; Copeland and Wenger 2006; Cornes et al. 2011; Mestry et al. 2012; Mestry et al. 2014; Donnelly et al. 2012; Fifić and Townsend 2010; O’Toole et al. 2001; Wenger and Ingvalson 2002; 2003; Fitousi and Wenger 2013; Fitousi 2013). These accounts avoid circularity and are open to falsification because they rely on explicit and well-defined processing models of independence.

The operational–theoretical bifurcation goes well beyond the methods and research strategies employed by researchers from the two camps. In general, the two branches seem to provide opposite answers to the question of facial holism. Operational studies often support the notion of facial holism whereas quantitative computational models are much less supportive. Studies from the latter tradition have shown that facial features can withstand strong tests of statistical independence (Ellison and Massaro 1997; Gold et al. 2012; Loftus et al. 2004; Macho and Leder 1998), geometrical independence (Sergent 1984; Tversky and Krantz 1969), independence in processing rate (Bradshaw and Wallace 1971; Wenger and Townsend 2006; Donnelly et al. 2012), and perceptual independence (Wenger & Ingvalson, 2002, 2003; but see Mestry et al., 2012).

Challenges facing the operational definitions

The operational definitions of independence and facial holism have certainly contributed to our understanding of face processing. However, these types of definitions are problematic because they allow for circularity and a lack of consistency. For example, in some paradigms, such as the whole-part task (Tanaka and Farah 1993), the context of the whole face facilitates performance, whereas in other paradigms, such as the composite face effect (Young et al. 1987), the context of a whole face hinders performance. An advocate of holistic perception could have guessed either outcome, facilitation or inhibition (Campbell et al. 2001).

Another related problem concerns the notable discrepancies observed across closely related experimental paradigms. These paradigms are designed to measure the same construct but often fail to do so. An instructive example comes from the work of Amishav and Kimchi (2010). These researchers applied the Garner paradigm (Garner 1974), a classic selective attention tool, to facial attributes. Participants classified faces on eye and mouth shape (cf. Experiments 1a and b). The results indicated that participants could selectively attend to one component (i.e., eye shape) while avoiding interference from irrelevant variation on the other component (i.e., mouth shape). Similar conclusions were drawn by Pomerantz and his colleagues (Pomerantz et al. 2003) in a study that applied the Garner task to simple face drawings.

Richler, Palmeri, and Gauthier (2013) rightly noted that the absence of interference with face stimuli in the Garner task is inconsistent with the routinely observed interference in the composite face paradigm. After all, the Garner paradigm is one of the most powerful tools for detecting holistic processing (Algom and Fitousi 2014; Pomerantz and Pristach 1989). To address this discrepancy, Richler et al. subjected Amishav and Kimchi’s stimuli to a standard same–different composite face task in which a study face was presented followed by a test face. The observer had to judge whether the top part of the test face was the same as or different from that of the study face. Richler et al. demonstrated a composite face effect, but only if configural changes across trials (i.e., altering the distance between the eyes) were added.

The present study

The discord between the Garner (Amishav and Kimchi 2010; Pomerantz et al. 2003) and composite face paradigms (Rossion 2013), as well as the conceptual weakness of the operational definitions, strongly suggest that the assumption of independence in the processing of composite faces should be systematically examined. To achieve this goal, I used various theoretical and operational characterizations of independence. These included species of independence pertinent to the traditional operational definitions of the composite face effect (Hole 1994), the Garner approach (Garner et al. 1956; Garner and Morton 1969), and the redundant target task (Miller 1982; Townsend and Nozawa 1995). The present work is part of a continuous effort at examining the representational and processing characteristics of independence in performance with composite faces. The structure and logic of this study are indebted to work by Von Der Heide and colleagues (2014). Following is a brief exposition of the Garner and redundant-target paradigms that were applied to composite faces in this study.

Garner speeded classification task

Garner’s speeded classification (Garner 1974; 1978; Garner & Felfoldy, 1970) is a classic test of selective attention. It was designed to assess perceptual independence between any pair of dimensions (e.g., color and shape). The Garner paradigm (for a review, see Algom & Fitousi, 2014) has been applied extensively to many pairs of facial dimensions such as: identity and expression (Etcoff 1984; Fitousi and Wenger 2013; Ganel and Goshen-Gottstein 2004), identity and speech information (Schweinberger et al. 1999; Schweinberger and Soukup 1998), identity and gender (Gal and Bruce 2002), configural and holistic dimensions (Amishav and Kimchi 2010), and contours inside and outside of facial context (Pomerantz et al. 2003).

The Garner paradigm (Garner 1974) consists of three primary experimental conditions. In baseline, the task-relevant dimension (e.g., shape) is varied while the task-irrelevant dimension (e.g., color) is held constant at one level (e.g., red). In filtering, values of the task-irrelevant dimension are allowed to vary from trial to trial (e.g., red and green). In correlated blocks, values from one dimension are correlated with values from the other dimension (e.g., red circle, green triangle). The task of judging the level of the relevant dimension (e.g., shape) remains the same throughout the three blocks.

Comparable performance in filtering and baseline implies perfect selective attention. Deviation from parity—Garner interference—is due to variation on the irrelevant dimension (Pomerantz 1986). A difference in RTs between correlated and baseline blocks is called redundancy gain. This measure indicates whether observers reaped gain from the experimental co-variation of the two dimensions. Dimensions that produce neither filtering costs nor redundancy gains are dubbed separable dimensions. Dimensions that give rise to filtering costs and redundancy gains are called integral dimensions. Color and shape, for example, are separable dimensions, whereas hue and saturation combine to form integral dimensions (Garner 1974). The separable-integral distinction implies profound differences in processing and structure (Fitousi and Wenger 2013; Fific et al. 2008; Garner 1974; Maddox 1992; Pomerantz 1986; Shepard 1964). The Garner paradigm is applied to composite faces in the present study for the first time. The goal is to determine whether composite face halves are processed as integral or separable dimensions.
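The two Garner measures described above reduce to simple differences between block means. The following is a minimal sketch, assuming that both measures are computed on mean correct RTs and signed so that positive values indicate a filtering cost or a redundancy gain; the function name `garner_measures` and the RT values are hypothetical and are not data from this study.

```python
from statistics import mean

def garner_measures(baseline_rts, filtering_rts, correlated_rts):
    """Return (Garner interference, redundancy gain) from correct-trial RTs."""
    baseline_mean = mean(baseline_rts)
    # Garner interference: cost of irrelevant variation (filtering minus baseline).
    interference = mean(filtering_rts) - baseline_mean
    # Redundancy gain: benefit of co-variation (baseline minus correlated).
    redundancy_gain = baseline_mean - mean(correlated_rts)
    return interference, redundancy_gain

# Hypothetical RTs in ms, invented for illustration (not data from this study).
interference, redundancy_gain = garner_measures(
    baseline_rts=[500, 520, 510],
    filtering_rts=[540, 560, 550],
    correlated_rts=[480, 500, 490],
)
print(interference, redundancy_gain)
```

With these invented values the interference is 40 ms and the gain is 20 ms, which is the integral pattern; values near zero on both measures would correspond to the separable pattern.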

Redundant target designs

When a visual display includes two or more targets that require the same response, reaction time is improved relative to a display that includes only one target (Raab 1962). This redundant-target effect (RTE) has been replicated in a great many studies using a variety of stimuli and dimensions (Miller 1982; 1986; Mordkoff and Egeth 1993; Townsend and Nozawa 1995). It has recently been harnessed to probe facial dimensions (Donnelly et al. 2012; Fitousi and Wenger 2013; Ingvalson and Wenger 2005; Wenger and Townsend 2001b; Yankouskaya et al. 2012). Application of the redundant target task to composite faces is a novel contribution of the present study. Augmented by allied methodologies and recent formal advancements (Townsend and Nozawa 1995; Townsend and Wenger 2004; Townsend and Eidels 2011), the redundant target task can serve as a powerful means for probing the separate but interrelated characteristics of workload capacity and independence in rate in the processing of composite faces.

There are two versions of the redundant target task (Townsend and Wenger 2004). In the OR version, observers are asked to indicate whether at least one of the targets is present. Response can terminate when the first target is detected. In the AND version, observers are asked to respond when both targets are present. The AND task requires exhaustive processing of all pertinent stimuli or dimensions. The classic independent race model (Raab 1962) accounts for the redundant target effect in the OR task as resulting from a statistical facilitation. This facilitation occurs due to a horse race between two independent channels. Alternative coactivation models (Miller 1982) postulate that information from the two targets is accumulated in a single pool of evidence until a decision threshold is crossed and a response is emitted.
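The statistical facilitation predicted by the independent race model can be illustrated with a small simulation. This is only a sketch under stated assumptions: the Gaussian finishing-time distributions, their parameters (mean 500 ms, SD 50 ms), and the function name `simulate_race` are hypothetical choices made for illustration, not a model fitted in this study.

```python
import random
from statistics import mean

def simulate_race(n_trials=10_000, seed=0):
    """Simulate an OR task with two independent parallel channels."""
    rng = random.Random(seed)
    # Assumed channel finishing times (ms); the distributions are illustrative only.
    top = [rng.gauss(500, 50) for _ in range(n_trials)]
    bottom = [rng.gauss(500, 50) for _ in range(n_trials)]
    # On redundant-target trials the faster channel determines the response.
    redundant = [min(a, b) for a, b in zip(top, bottom)]
    return mean(top), mean(bottom), mean(redundant)

top_mean, bottom_mean, redundant_mean = simulate_race()
print(top_mean, bottom_mean, redundant_mean)
```

The mean of the minimum is smaller than either single-channel mean even though neither channel speeds up, which is the sense in which the race model attributes the redundant-target effect to statistical facilitation.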

Workload capacity

The concept of processing capacity is ubiquitous in psychology (Kahneman 1973). Capacity refers to the efficiency or total amount of energy expended in a task (Fitousi and Wenger 2011; Townsend and Ashby 1978). Psychologists have argued that a demanding process may tax the efficiency of a concurrent process because central processes share capacity. This entails that capacity is related, though indirectly, to the construct of independence. Yet, in order to derive exact predictions regarding their relations, a dedicated model is needed (Logan et al. 2014; Townsend and Nozawa 1995; Townsend and Wenger 2004).

The system factorial technology (SFT), which has been developed by Townsend and his colleagues over the last couple of decades (Houpt et al. 2013; Townsend and Nozawa 1995; Townsend and Wenger 2004; Townsend and Eidels 2011; Townsend and Altieri 2012), provides such a model. SFT is a rigorous theory of information processing in OR and AND redundant target designs. SFT establishes the link between four aspects of information processing that can be rigorously defined and tested: architecture (i.e., serial, parallel, coactive), stopping rule (i.e., exhaustive or self-terminating), workload capacity (i.e., limited, unlimited, super), and independence in rate (positive or negative inter-dependencies). The present study harnessed the workload capacity and independence measures to study how composite faces are processed.

The capacity coefficient

The capacity coefficient, C(t), is a valuable workload measure that has been developed in SFT (Townsend & Nozawa 1995). It quantifies the change in efficiency when observers shift from processing one target to processing two targets. Refinements that consider both accuracy and RT do exist (Townsend & Altieri 2012). Specific versions of the capacity coefficient are available for the self-terminating (OR) task and for the exhaustive (AND) task. Here I present the latency capacity coefficient for the OR task:

$$\begin{array}{@{}rcl@{}} C_{OR}(t) & = & \frac{H_{TB}(t)}{H_{T}(t) + H_{B}(t)} \end{array} $$
(1)

where

$$\begin{array}{@{}rcl@{}} H(t) & = & \int\limits_{0}^{t} h(\tau) d\tau \\ & = & -\log[S(t)] \end{array} $$

is the integrated hazard function, a measure of the cumulative level of work accomplished by time t (Fitousi and Wenger 2011; 2013; Townsend and Ashby 1978; Townsend and Nozawa 1995; Wenger and Townsend 2006; 2000), which is equal to the negative log of the survivor function (Wenger and Townsend 2000; Wenger and Gibson 2004). The subscripts T, B, and TB refer to the conditions in which the top, bottom, or both composite face halves are processed. The capacity measure is calibrated against a yardstick unlimited-capacity, independent, parallel (UCIP) model. Note that when the efficiency of recognizing a double-target face (e.g., both top and bottom halves are targets) is equal to the sum of efficiencies for recognizing single-target faces (the top or the bottom half is a target), \(H_{TB} = H_{T} + H_{B}\), capacity is unlimited and \(C_{OR}(t) = 1\). This prediction follows from the UCIP model. When the efficiency of recognizing the double-target face is lower than the sum of efficiencies for the single targets, \(H_{TB} < H_{T} + H_{B}\), capacity is limited, and \(C_{OR}(t) < 1\). When the efficiency of processing a double-target is greater than the sum of efficiencies for processing the single-targets, \(H_{TB} > H_{T} + H_{B}\), capacity is super, and \(C_{OR}(t) > 1\).
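To make the definition concrete, the capacity coefficient can be estimated at a single time point from three sets of RTs. The following is a minimal sketch, assuming that the survivor function is estimated by the proportion of RTs longer than t; other estimators of H(t) are possible. The function names and the RT values are hypothetical and are not data from this study.

```python
import math

def survivor(rts, t):
    """Empirical survivor function: proportion of RTs longer than t."""
    return sum(rt > t for rt in rts) / len(rts)

def integrated_hazard(rts, t):
    """H(t) = -log S(t); infinite once every response has occurred."""
    s = survivor(rts, t)
    if s == 0:
        return math.inf
    return -math.log(s)

def capacity_or(rts_tb, rts_t, rts_b, t):
    """C_OR(t) = H_TB(t) / (H_T(t) + H_B(t)), or None where it is undefined."""
    h_tb = integrated_hazard(rts_tb, t)
    denominator = integrated_hazard(rts_t, t) + integrated_hazard(rts_b, t)
    if denominator == 0 or math.isinf(denominator) or math.isinf(h_tb):
        return None
    return h_tb / denominator

# Hypothetical RTs in ms, chosen so that S_T(500) = S_B(500) = 0.5 and S_TB(500) = 0.25.
rts_top = [400, 450, 550, 600]
rts_bottom = [420, 470, 530, 580]
rts_both = [400, 420, 450, 600]
c_500 = capacity_or(rts_both, rts_top, rts_bottom, t=500)
print(c_500)
```

Because the double-target survivor value equals the product of the single-target values in this invented example, the coefficient is approximately 1, the UCIP benchmark.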

Miller’s inequality

Miller (1982, 1986) proposed an upper bound on RTs in OR (minimum time) designs, called the race model inequality. This inequality has existed for some time, and is known in mathematics as Boole’s inequality. Formally, Miller’s inequality is:

$$\begin{array}{@{}rcl@{}} F_{TB}(t) \leq F_{T}(t) + F_{B}(t) \end{array} $$
(2)

where \(F_{TB}(t)\) is the cumulative distribution function for the double-target (e.g., top and bottom halves) and \(F_{T}(t)\) and \(F_{B}(t)\) are, respectively, the cumulative distribution functions for the top single-target and bottom single-target. \(F_{TB}(t)\) cannot exceed the sum of the single-target top and bottom halves’ cumulative distribution functions, \(F_{T}(t) + F_{B}(t)\), if processing is a race.

Miller’s inequality is not assumption-free. It requires the following conditions to hold: (1) the channels are parallel, (2) the stopping rule is self-terminating, (3) the channels are stochastically independent in rate, and (4) context invariance is in force, meaning that the activity of a single target, say, the top half of a face, is not altered by the presentation of a second target, say, the bottom half of a face (see Townsend & Wenger, 2004, p. 1013).

It has been argued (Miller 1982; 1986) that violations of the inequality necessarily reflect coactivation—an architecture by which the activation from each channel is integrated and accumulated in a single pool of evidence. Nevertheless, the inequality cannot distinguish parallel from coactive architectures, primarily because other candidate systems such as serial (Townsend and Nozawa 1997; Fific et al. 2008; Fifić et al. 2008) or parallel architectures with channel interdependencies (Townsend and Wenger 2004; Mordkoff and Yantis 1991) can lead to violations of the inequality.

Miller’s inequality can be used to support inferences regarding capacity (Townsend and Wenger 2004). If the system is limited-capacity, Miller’s inequality will be satisfied. However, a super-capacity system may, but need not, generate RTs that violate Miller’s inequality (Townsend and Wenger 2004).
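A descriptive check of Miller’s inequality can be run on empirical cumulative distribution functions. The sketch below is an illustration under stated assumptions, not the analysis used in this study: it compares raw empirical CDFs at a few time points without any inferential test, and the function names and RT values are hypothetical.

```python
def ecdf(rts, t):
    """Empirical CDF: proportion of RTs at or below t."""
    return sum(rt <= t for rt in rts) / len(rts)

def miller_violations(rts_tb, rts_t, rts_b, times):
    """Return the time points at which F_TB(t) > F_T(t) + F_B(t)."""
    return [
        t for t in times
        if ecdf(rts_tb, t) > ecdf(rts_t, t) + ecdf(rts_b, t)
    ]

# Hypothetical RTs in ms, invented for illustration (not data from this study).
rts_both = [300, 320, 340, 360]
rts_top = [350, 400, 450, 500]
rts_bottom = [360, 410, 460, 510]
violations = miller_violations(rts_both, rts_top, rts_bottom, times=[330, 380, 430])
print(violations)  # [330, 380]
```

In this invented example the double-target responses are so fast that the bound is exceeded at 330 ms and 380 ms but not at 430 ms, where the sum of the single-target CDFs reaches 1.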

Grice’s inequality

Grice and his colleagues (Grice et al. 1984) proposed a lower bound on RT in OR designs:

$$\begin{array}{@{}rcl@{}} F_{TB}(t) \geq MAX [F_{T}(t),F_{B}(t)] \end{array} $$
(3)

According to this inequality, performance on double-target trials (both top and bottom face halves are targets) should be at least as fast as performance on the faster of the single-target trials (top face half, bottom face half). Violations of this inequality imply limited capacity in a strong sense.

Unified spaces for capacity measures

The capacity coefficient and Miller’s and Grice’s inequalities are different indexes of workload capacity, which until recently had to be plotted on different scales. This has rendered comparison challenging. Townsend and Eidels (2011) have developed a unified framework that allows one to present the three measures within the same space. Here I present the relevant formulas.

Taking advantage of the fact that the integral of the hazard function H(t) equals the minus log of the survivor function (\(H(t) = -\log [S(t)]\)), the capacity coefficient can also be written as:

$$\begin{array}{@{}rcl@{}} \begin{array}{llll} C_{OR}(t) &=& \frac{\log[S_{TB}(t)]}{\log[S_{T}(t) \cdot S_{B}(t)]} \end{array} \end{array} $$
(4)

Miller’s race inequality can be written in terms of the capacity coefficient:

$$\begin{array}{@{}rcl@{}} \begin{array}{llll} C_{OR}(t) &\leq & \frac{\log[S_{T}(t) + S_{B}(t) -1]}{\log[S_{T}(t) \cdot S_{B}(t)]} \end{array} \end{array} $$
(5)

meaning that \(C_{OR}(t)\) values that exceed the values on the right-hand side violate the race model inequality. Similarly, Grice’s inequality can be written as:

$$\begin{array}{@{}rcl@{}} \begin{array}{llll} C_{OR}(t) &\geq & \frac{\log\{MIN[S_{T}(t),S_{B}(t)]\}}{\log[S_{T}(t) \cdot S_{B}(t)]} \end{array} \end{array} $$
(6)

meaning that \(C_{OR}(t)\) values that are smaller than the right-hand side violate Grice’s bound.
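The three quantities in Eqs. 4–6 can be computed side by side from the same survivor estimates. Because the common denominator, the log of the product of the single-target survivor functions, is negative, dividing by it reverses the direction of each inequality; this is why Miller’s bound becomes an upper bound and Grice’s bound a lower bound on the capacity coefficient. The following is a minimal sketch, assuming simple proportion-based survivor estimates; the function names and RT values are hypothetical and are not data from this study.

```python
import math

def survivor(rts, t):
    """Empirical survivor function: proportion of RTs longer than t."""
    return sum(rt > t for rt in rts) / len(rts)

def unified_capacity(rts_tb, rts_t, rts_b, t):
    """Return (C_OR, Miller upper bound, Grice lower bound) at time t, or None."""
    s_tb, s_t, s_b = (survivor(x, t) for x in (rts_tb, rts_t, rts_b))
    # The logs are undefined at S = 0, and the ratio is undefined when S_T * S_B = 1.
    if min(s_tb, s_t, s_b) <= 0 or s_t * s_b >= 1:
        return None
    denominator = math.log(s_t * s_b)  # negative, so inequalities flip
    c_or = math.log(s_tb) / denominator
    grice = math.log(min(s_t, s_b)) / denominator
    # When S_T + S_B <= 1, Miller's bound cannot be violated at this t.
    miller = math.log(s_t + s_b - 1) / denominator if s_t + s_b > 1 else math.inf
    return c_or, miller, grice

# Hypothetical RTs in ms: S_T(500) = S_B(500) = 0.75 and S_TB(500) = 0.5.
rts_top = [400, 520, 540, 560]
rts_bottom = [450, 510, 530, 550]
rts_both = [400, 450, 520, 540]
c_or, miller_bound, grice_bound = unified_capacity(rts_both, rts_top, rts_bottom, t=500)
print(c_or, miller_bound, grice_bound)
```

In this invented example the coefficient sits exactly on Miller’s bound and well above Grice’s bound, so neither inequality is violated at 500 ms.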

Summary and predictions

The central idea that guided me throughout the present investigation was the following. Holistic processing of composite faces can be expressed in various “languages” of independence. These languages should denote a common meaning and therefore converge (Garner et al. 1956; Fitousi 2013; Fitousi and Wenger 2013; Fitousi 2014). To test this convergence, three such “languages” were used: (a) the composite face paradigm (Young et al. 1987), (b) the Garner paradigm (Garner 1974; Algom and Fitousi 2014), and (c) the redundant target paradigm (Miller 1982; Townsend and Nozawa 1995). Each paradigm is associated with a different class of definitions and measures. The composite face paradigm and the Garner paradigm employ operational definitions of independence, whereas the redundant target paradigm combines operational and theoretical definitions, along with formal measures. Support for a given operational or theoretical definition can be interpreted as support for holism, and a lack of support for the definition as a lack of support for holism.

A series of three experiments was conducted. In Experiment 1, I applied the Garner task (Garner 1974) and the composite face task (Young et al. 1987) in a single experimental design. This allowed a joint investigation of two potential markers of holistic processing. In Experiment 2, the same set of faces was subjected to tests of workload capacity and independence within the redundant target design (Miller 1982). In Experiment 3, the redundant target design was conjoined with a “same–different” composite face task (Hole 1994; Ingvalson and Wenger 2005; Donnelly et al. 2012), which permitted testing with a large number of composite faces.

Table 1 presents a summary of the hypotheses pertinent to the processing of these composite faces. The first hypothesis refers to the traditional composite face effect (Hole 1994), predicting a strong interference in the aligned condition but no (or smaller) interference in the misaligned condition. The second set of hypotheses concerns performance in the Garner paradigm. Observers will be asked to judge the top (bottom) half and ignore variations on the bottom (top) half. The holistic view predicts that Garner interference and redundancy gains (i.e., indicators of integrality) should be observed with aligned composite faces. The rationale is that in the aligned condition the face halves combine to form an integral object. A good example of integral objects is color patches composed of hue and saturation (Garner and Felfoldy 1970). Classification of hue is hindered when the irrelevant saturation varies orthogonally. This toll on performance occurs because observers process the irrelevant saturation inadvertently. In the correlated block, values on the top and bottom halves co-vary. It is therefore expected that this condition will give rise to facilitation (Algom and Fitousi 2014; Garner and Felfoldy 1970). Often, performance in the correlated block with integral dimensions benefits from the variability on the irrelevant dimension because the irrelevant dimension predicts the relevant dimension (Fitousi et al. 2009; Fitousi and Wenger 2013; Amishav and Kimchi 2010; Garner 1974). In contrast, with misaligned faces neither Garner interference nor redundancy gains should be observed (i.e., a pattern indicating separability). According to the holistic prediction, the face halves in the misaligned condition behave like separable dimensions. Separable dimensions, such as color and shape (Garner 1974; Garner and Felfoldy 1970), allow for perfect selective attention to either attribute and generate neither interference nor facilitation (Garner 1974).

Table 1 Summary of the hypotheses regarding independence from the perspectives of the composite, Garner, and redundant target tasks

The third set of hypotheses concerns performance in the redundant target task. This set of predictions was tested in two experiments. In Experiment 2, the double-target stimulus consisted of a single composite/identity. In this experiment, the holistic approach predicts super-capacity in the aligned condition but unlimited-capacity in the misaligned condition. The predictions of the holistic approach in Experiment 3 are more involved. This is because the task was not identification, but rather a same–different judgment (Hole 1994). In particular, the predictions for the effect of alignment in Experiment 3 are conditional on the components of the composite double-target. Aligned faces constructed from the same identity should support super-capacity processing if the stimulus is processed holistically, while misaligned faces constructed from the same identity should support unlimited-capacity processing. However, opposite predictions apply with composites formed from different identities: Aligned stimuli constructed from different identities should produce limited-capacity processing, while misaligned stimuli constructed from different identities should support unlimited-capacity processing. It should be emphasized, though, that the present study focused on testing the former set of predictions, as the double-target composite was always constructed from the same identity. Also note that, where predicted, super-capacity may be accompanied by violations of Miller’s (1982) bound, whereas limited-capacity may be accompanied by violations of Grice et al.’s (1984) bound.

Experiment 1

The goal of the first experiment was to assess the degree to which composite face halves interact. This was done by jointly deriving the composite face effect (Hole 1994) and Garner’s speeded classification measures (Garner 1974) within a single experimental design. The main hypothesis was that face halves would appear as integral dimensions in the aligned condition but separable dimensions in the misaligned condition. Selective attention to the relevant part (e.g., top half) should fail in the former, but not in the latter. In addition, a composite face effect should be observed.

Method

Participants

Sixteen young students (men and women) were recruited from the Ariel University population. They participated as part of a course requirement. All participants reported normal or corrected-to-normal vision.

Stimuli and apparatus

The face stimuli for all the experiments were initially published as part of Cornes et al. (2011). The stimuli for Experiment 1 consisted of a sample of four aligned composite faces and their four complementary misaligned versions (cf. Figs. 1 and 2). These were presented as gray-scale images. To create these images, pictures of eight young male volunteers were taken under natural conditions. The hair was removed from the picture of each face and the faces were placed within a standard oval frame. The resulting eight ovals were then separated into top and bottom halves. Four top halves and four bottom halves were randomly chosen from the eight pictures. From this selection, I created four novel composite faces, under the constraint that none of the resulting composites reproduced an original face. Assume that a bottom dimension is A and a top dimension is B. If each dimension has two levels, then the following four stimuli are created: \(A_1B_1\), \(A_1B_2\), \(A_2B_1\), and \(A_2B_2\).

In the aligned condition, bottom and top parts were fused together and separated by a thin white bar. In the misaligned condition, the parts were maximally shifted from each other on the horizontal plane and connected by a white bar. All images were equated on brightness, size, and shape. The resulting aligned and misaligned composite faces are presented in Figs. 1 and 2. Viewed at a fixed distance of 76 cm, the aligned images subtended 4.4° of visual angle horizontally and 2.3° vertically, whereas the misaligned images subtended 8.5° of visual angle horizontally and 2.3° vertically. To exclude the possibility that observers would use information from the different shapes of faces, all faces were presented within the same oval shape. The faces were presented as gray-scale images over a black background.

Fig. 1

Aligned version of the composite faces in Experiment 1. The faces were created by a full factorial combination of two bottom and two top face halves

Fig. 2

Misaligned version of the composite faces in Experiment 1. The faces were created by a full factorial combination of two bottom and two top face halves

Design and procedure

The experiment was designed as a 2 (alignment: aligned, misaligned) × 2 (task: top, bottom) × 3 (block type: filtering, baseline, correlated) full factorial within-participants design. The dependent variables were mean RT and error rate.

Aligned and misaligned composite faces were presented in separate blocks. The order of presentation was determined randomly by the computer. Before the experiment, observers were familiarized with the four composite faces. In the experiment, on each trial, one of the four composites was presented. Two tasks were employed throughout the experiment and performed in two separate sequences of experimental blocks. One task required observers to classify the bottom halves as \(A_1\) or \(A_2\); the second task required discrimination between top halves \(B_1\) and \(B_2\). In baseline blocks, the relevant dimension, say the bottom half, varied from trial to trial, whereas the irrelevant dimension, say the top half, was held fixed (e.g., discriminate between \(A_1B_1\) and \(A_2B_1\)). In filtering blocks, both relevant and irrelevant dimensions varied orthogonally (e.g., discriminate between the pair \(A_1B_1\), \(A_1B_2\) and the pair \(A_2B_1\), \(A_2B_2\)).

In the correlated blocks, a given pair of values of top and bottom halves appeared together (\(A_1B_1\) versus \(A_2B_2\), and \(A_1B_2\) versus \(A_2B_1\)). A sequence of a filtering block, two correlated blocks, and two baseline blocks was presented for each relevant dimension. The order of blocks within a sequence was random, subject to the restriction that blocks of the same type (i.e., baseline, correlated) were presented consecutively. Each baseline and correlated block consisted of 27 trials. Each filtering block consisted of 54 trials. Participants pressed the right-hand key (“m”) if the bottom half was \(A_1\), and the left-hand key (“z”) if the bottom half was \(A_2\). A similar response mapping was assigned to top halves \(B_1\) and \(B_2\).
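The stimulus sets that define the three block types can be enumerated from the 2 × 2 stimulus space. The following is a minimal sketch with hypothetical names (`STIMULI`, `garner_blocks`); it assumes that the two baseline blocks hold the irrelevant half fixed at its first and second level, respectively, a detail that the description above does not spell out.

```python
from itertools import product

LEVELS = (1, 2)
# Each stimulus is (bottom level of A, top level of B), e.g. (1, 2) stands for A1B2.
STIMULI = list(product(LEVELS, LEVELS))

def garner_blocks(relevant):
    """Stimulus sets per block type; relevant is 0 (bottom, A) or 1 (top, B)."""
    irrelevant = 1 - relevant
    # Baseline: irrelevant half held fixed (assumed: one block per fixed level).
    baseline = [[s for s in STIMULI if s[irrelevant] == level] for level in LEVELS]
    # Filtering: both halves vary orthogonally, so all four stimuli appear.
    filtering = [list(STIMULI)]
    # Correlated: the two halves co-vary (A1B1 vs A2B2, and A1B2 vs A2B1).
    correlated = [
        [s for s in STIMULI if s[0] == s[1]],
        [s for s in STIMULI if s[0] != s[1]],
    ]
    return {"baseline": baseline, "filtering": filtering, "correlated": correlated}

blocks_bottom_task = garner_blocks(relevant=0)
print(blocks_bottom_task)
```

Note that the correlated sets do not depend on which half is task-relevant, whereas the baseline sets do, because only the irrelevant half is held constant there.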

Each 1-h session began with a 5-min period of dark adaptation. All trials were initiated by the computer. The target display was presented for 2 s or until response. The stimuli were presented with equal frequency in each block of trials. Each observer completed four identical cycles of aligned and misaligned blocks. This amounted to 622 trials. Unbeknownst to the participants, the first ten trials were considered as training, and were deleted from data analysis.

Results

Error trials averaged 5.3 % across all observers. Those trials were excluded from the analyses, along with trials (1.9 %) in which RTs were shorter than 150 ms or longer than 1600 ms. Table 2 presents mean RTs and error percentages for judgments of bottom and top halves in aligned and misaligned conditions.
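The exclusion rule can be sketched as a simple filter. The trial format (a dict with `rt` in milliseconds and a boolean `correct`) and the function name `trim_trials` are assumptions made for illustration; only the cutoffs come from the text.

```python
def trim_trials(trials, lo_ms=150, hi_ms=1600):
    """Keep correct trials whose RT is neither shorter than lo_ms nor longer than hi_ms."""
    return [t for t in trials if t["correct"] and lo_ms <= t["rt"] <= hi_ms]


# Hypothetical trials, not data from the experiment.
example_trials = [
    {"rt": 120, "correct": True},   # too fast, dropped
    {"rt": 640, "correct": True},   # kept
    {"rt": 700, "correct": False},  # error, dropped
    {"rt": 1700, "correct": True},  # too slow, dropped
]
kept = trim_trials(example_trials)
```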

Table 2 Experiment 1: mean reaction times (RTs) (in milliseconds) and proportions of errors (percent) for judgments of bottom (dimension A) and top (dimension B) halves, in baseline, filtering, and correlated dimensions tasks with aligned and misaligned faces

Composite face effect

To be able to relate the outcome from the Garner test to the composite face effect, it is first crucial to demonstrate a composite face effect within the Garner design itself. This goal is feasible. Recall that in the original paradigm (Hole 1994) observers are asked to judge whether the top part of the target face is the ‘same’ as that of the study face. Typically, the composite face effect is computed as a difference in mean RT or accuracy between ‘same’ trials in aligned and misaligned conditions (Susilo et al. 2009). The task performed by observers in the present study is classification rather than same-different. However, the quality of ‘sameness’ can still be defined in the filtering block. This block introduces an orthogonal trial-to-trial variation with all possible face stimuli.

By definition, a given composite face half on trial n is either the ‘same’ as or ‘different’ from its predecessor on trial n−1. For example, if the observer is now judging the top part of stimulus \(A_1B_1\) in a standard Garner task, he or she may be influenced by the status of the previous stimulus, say \(A_2B_1\) (cf. Dyson & Quinlan, 2010). In that case, the top half (\(B_1\)) is a repetition of the previous top half (\(B_1\)) and thus can be considered as the “same”, whereas the bottom half (\(A_1\)) is an alternation from the previous bottom half (\(A_2\)), and thus can be considered as “different.” The composite face effect is computed as a difference in performance between “same” trials in aligned and misaligned conditions in judgments of the top half.
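The sequential coding just described can be sketched as follows. This is an illustration under stated assumptions rather than the analysis script: trials are assumed to be (bottom, top) tuples in presentation order, and the function names and RT values are hypothetical.

```python
def label_top_repetitions(trials):
    """Label each trial from the second onward by whether its top half
    repeats the top half of the immediately preceding trial."""
    labels = []
    for previous, current in zip(trials, trials[1:]):
        labels.append("same" if current[1] == previous[1] else "different")
    return labels


def composite_face_effect(aligned_same_rts, misaligned_same_rts):
    """Mean RT on 'same' trials, aligned minus misaligned (ms)."""
    def mean(values):
        return sum(values) / len(values)
    return mean(aligned_same_rts) - mean(misaligned_same_rts)


# The example from the text: A2B1 followed by A1B1 repeats the top half.
labels = label_top_repetitions([("A2", "B1"), ("A1", "B1"), ("A1", "B2")])
# Made-up RTs chosen only to illustrate the subtraction.
effect = composite_face_effect([700, 684], [620, 614])
```

A positive value indicates slower ‘same’ responses with aligned than with misaligned composites, which is the direction of the composite face effect.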

A composite face effect was found. ‘Same’ responses for the top half in the aligned condition were slower (692 ms) than ‘same’ responses for the top half in the comparable misaligned condition (617 ms) [t(1,15)=1.91,p<0.05]. A similar analysis performed on the \(\arcsin(\sqrt{x})\) transformation of error rate revealed a marginally significant effect of alignment, with more errors committed in the aligned (7.7 %) than in the misaligned (3.7 %) condition [t(1,15)=2.39,p=0.09]. These results entail a composite face effect comparable to that observed in previous studies (Rossion 2013). The effect reproduced here is even more compelling than the regular composite face effect (Hole 1994) because observers were not asked to explicitly judge the relations between pairs of composite faces.

Garner measures

A three-way ANOVA with alignment (aligned, misaligned) × dimension (top, bottom) × task (baseline, filtering, correlated) revealed no effect of alignment [F<1]. The effect of dimension was marginally significant [F(1,15)=3.84, η²=0.02, p=0.07]. The effect of task [F(2,30)=1.53, η²=0.006, p=0.24] was not significant, pointing to the absence of redundancy gains or Garner interference. These results entail that composite face halves, aligned or misaligned, are separable dimensions. Observers could pay perfect selective attention to either of these dimensions, showing neither interference in the filtering block nor gain in the correlated blocks. The Garner results replicate those documented by Amishav and Kimchi (2010), as well as by Pomerantz et al. (2003).
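The two Garner measures are conventionally indexed as differences from baseline: Garner interference is the cost in the filtering block, and redundancy gain is the benefit in the correlated blocks. A minimal sketch follows, using made-up mean RTs rather than the values in Table 2; the function name is hypothetical.

```python
def garner_measures(mean_rt):
    """mean_rt: dict of mean RTs (ms) keyed by 'baseline', 'filtering', 'correlated'."""
    return {
        # Positive values indicate a failure of selective attention.
        "garner_interference": mean_rt["filtering"] - mean_rt["baseline"],
        # Positive values indicate faster responses when the halves covary.
        "redundancy_gain": mean_rt["baseline"] - mean_rt["correlated"],
    }


example_means = {"baseline": 640.0, "filtering": 645.0, "correlated": 638.0}  # hypothetical
measures = garner_measures(example_means)
```

Separable dimensions are indicated when both differences are statistically indistinguishable from zero, which is what the nonsignificant task effect implies here.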

Discussion

The aim of Experiment 1 has been to gauge two markers of holistic processing within a single experimental set up. The first marker—composite face effect (Hole 1994)—registers the slower and more error-prone ‘same’ responses to aligned faces than to misaligned faces. The second index—Garner interference (Garner 1974)—records the quality of selective attention to a relevant face part. The outcome of this joint investigation is rather surprising. A composite face effect has been observed but no Garner interference obtained. This entails that according to one measure (i.e., composite face effect) composites are processed holistically, whereas according to another measure (i.e., Garner interference), they are processed analytically.

Recall that Richler et al. (2013) have explained away a similar inconsistency between the composite face effect and the Garner interference as resulting from the absence of trial-to-trial configural changes in the Garner paradigm. However, this explanation cannot account for the present results. Here I was able to reproduce a composite face effect in tandem with the absence of Garner interference, all within a single experimental design. No configural modifications were added. If so, how can one explain the inconsistency? Before attempting to address this question, it is worth repeating that the data patterns from the Garner paradigm and the composite face paradigm are used as operational definitions of holism, but they are susceptible to various limitations (Maddox 1992) and may rely on performance differences that are easily altered by psychophysical factors (Fitousi and Wenger 2013; Melara and Algom 2003). Moreover, these operational tests lack a model that defines independence in a rigorous manner. Such a model is available in the application of systems factorial technology (SFT, Townsend & Nozawa, 1995) to redundant target designs with composite faces.

Experiment 2

The same composite faces from Experiment 1 were implemented within an OR redundant target design (Miller 1982). Observers were asked to look for a predefined top or bottom target face half and respond when at least one of the target halves is present. Several marked differences distinguish this experiment from the previous one. First, the Garner task made use of selective attention, whereas the redundant target task calls for divided attention. Second, the new design allows for rigorous theory-based definitions of holism in terms of mathematically stated measures of workload capacity and RT inequalities (Miller 1982; Townsend and Nozawa 1995).

It was predicted that aligned composite faces would give rise to improved performance in the double-target condition relative to the single-target conditions. In particular, the holistic view predicts super-capacity as indicated by \(C_{OR} > 1\), violations of Miller’s bound, and no violations of Grice’s bound. In contrast, misaligned composite faces should demonstrate unlimited capacity with \(C_{OR} = 1\), no violations of Miller’s bound, and no violations of Grice’s bound.
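For reference, the three measures named in these predictions can be written out in their standard forms (Townsend and Nozawa 1995; Miller 1982; Grice et al. 1984). The subscripts below are notation assumed here for illustration: A denotes the bottom-half target alone, B the top-half target alone, and AB the redundant target. F(t) is the cumulative distribution function of correct RTs and S(t) = 1 − F(t) is the survivor function.

```latex
\begin{align*}
H(t) &= -\ln S(t) = -\ln\bigl[1 - F(t)\bigr]
  && \text{(cumulative hazard)} \\
C_{OR}(t) &= \frac{H_{AB}(t)}{H_{A}(t) + H_{B}(t)}
  && \text{(capacity coefficient)} \\
F_{AB}(t) &\le F_{A}(t) + F_{B}(t)
  && \text{(Miller's bound)} \\
F_{AB}(t) &\ge \max\bigl[F_{A}(t),\, F_{B}(t)\bigr]
  && \text{(Grice's bound)}
\end{align*}
```

Values of \(C_{OR}(t)\) above 1 indicate super-capacity, values equal to 1 indicate unlimited capacity, and values below 1 indicate limited capacity.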

Ideally, the capacity measures are tested at the level of the individual observer (Townsend and Nozawa 1995; Fitousi and Wenger 2011; 2013). Therefore, all analyses in the next two experiments were performed at this level.

Method

Participants

A new group of eight young male and female students was recruited from the Ariel University population. None of them participated in the previous experiment. They participated as part of a course requirement. All participants reported normal or corrected-to-normal vision.

Stimuli and apparatus

The exact set of stimuli from Experiment 1 was used. The experiment was designed as a 2 (alignment: aligned, misaligned) × 4 (target: redundant target, bottom-target, top-target, and no-target) full factorial within-participants design. The dependent variables were mean RT and error rates.

Aligned and misaligned composite faces were presented in separate blocks. The order of presentation was determined randomly by a computer. A single task was employed throughout the entire experiment. The task required a response based on the presence or absence of predefined top or bottom halves. The response rule required observers to press one key whenever the bottom half was at level 1 OR the top half was at level 1, and to press another key when neither of these targets was present (i.e., non-target). Hence, the composite face \(A_1B_1\) served as the redundant target, \(A_1B_2\) or \(A_2B_1\) served as single targets, and \(A_2B_2\) served as non-target (see Figs. 1 and 2).

There were 30 experimental blocks, each consisting of 60 trials. The two choice responses were made equally probable (Mordkoff and Yantis 1991). The non-target stimulus was presented on half of the trials (i.e., 30 trials per block), while the three target stimuli were presented on the remaining half with equal probability (i.e., ten per stimulus). Each experimental block was repeated 15 times in the aligned version, and another 15 times in the misaligned version of the stimuli. The order of blocks was determined randomly by the computer. In total, each observer completed 1800 trials.
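The OR response rule and the composition of a block can be checked with a short sketch. It illustrates the design arithmetic only; the labels, function names, and the fixed random seed are assumptions introduced here.

```python
import random
from collections import Counter


def target_type(bottom, top):
    """Classify a composite under the OR rule (targets: bottom A1, top B1)."""
    bottom_is_target = bottom == "A1"
    top_is_target = top == "B1"
    if bottom_is_target and top_is_target:
        return "redundant"
    if bottom_is_target:
        return "bottom-target"
    if top_is_target:
        return "top-target"
    return "no-target"


def build_block(rng):
    """One 60-trial block: 30 non-target trials and 10 of each target stimulus."""
    trials = (
        [("A2", "B2")] * 30
        + [("A1", "B1")] * 10
        + [("A1", "B2")] * 10
        + [("A2", "B1")] * 10
    )
    rng.shuffle(trials)
    return trials


block = build_block(random.Random(0))  # seed fixed only for reproducibility
counts = Counter(target_type(bottom, top) for bottom, top in block)
total_trials = 30 * len(block)  # 30 blocks of 60 trials
```

Because the non-target appears on half of the trials, the two responses are equally probable, as the design requires.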

Participants pressed the right-hand key (“m”) if the composite face contained at least one of the targets, and the left-hand key (“z”) if neither of the possible target halves was present.

Each 1-h session began with a 5-min period of dark adaptation. All trials were initiated by the computer. The target display was presented for 2 s or until response. Unbeknownst to the participants, the first ten trials of each experimental block were considered as training and were deleted from the analysis.

Results

Error trials averaged 6.6 % across all observers and were excluded from RT analyses. So were RTs shorter than 150 ms or longer than 2500 ms (1.2 %). Mean RTs were computed for each observer in each condition (redundant-target, top or bottom half target, and no-target). There were three main types of analyses. The first concerned the redundant target effect (Raab 1962). The second addressed Miller’s and Grice’s inequalities (Miller 1982; Grice et al. 1984). The third analysis was dedicated to the capacity coefficient (Townsend and Nozawa 1995). Using the unified methodology developed by Townsend and Eidels (2011), the three capacity measures (the capacity coefficient and Miller’s and Grice’s bounds) were presented within a single space. All analyses were performed separately per observer and alignment condition (Table 3).

Table 3 Experiment 2: ANOVA on RTs for correct responses separately for each observer

Redundant target effects

Mean RTs were computed at each level of alignment, stimulus, and observer. A two-way ANOVA with alignment (aligned, misaligned) × target type (redundant, single-top, single-bottom) was performed. Table 3 presents the results of this analysis. As can be noted, the main effect of target type was significant for all observers. RTEs were tested by comparing performance in conditions of double-target and in each of the single-target (top-target and bottom-target) conditions. Figure 3 presents the RTE in aligned and misaligned conditions. A main effect of alignment was found in six out of eight observers, entailing longer RTs for aligned than misaligned faces. The interaction term, which was reliable for six observers, indicated larger RTE for misaligned than aligned faces.

Fig. 3 Experiment 2: mean RTs in double-target (top or bottom), bottom single-target, and top single-target for aligned (top figure) and misaligned (bottom figure) conditions

Capacity

Figure 4 depicts the capacity coefficient \(C_{OR}(t)\), along with Miller’s and Grice’s bounds. These measures are presented for each observer and alignment condition within a common unified capacity space (Townsend and Eidels 2011). The unified space enables a comparison of the values of the different indexes. All three measures are continuous functions that vary with time. Note that \(C_{OR}(t)>1\) at time point t implies super-capacity at that time. Violations of Miller’s inequality are indicated when the capacity coefficient function exceeds Miller’s bound. Violations of Grice’s inequality are indicated when the capacity coefficient function falls below Grice’s bound.
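To make these time-varying measures concrete, the sketch below computes a plug-in estimate of the capacity coefficient and checks the two inequalities at a single time point. It is an illustration under stated assumptions: the function names and RT samples are hypothetical, the cumulative hazard is estimated simply as minus the log of the empirical survivor function, and the inequalities are checked on the raw distribution functions rather than in the unified capacity space. It is not the estimator or the test statistic used in the reported analyses.

```python
import math
from bisect import bisect_right


def ecdf(rts, t):
    """Empirical P(RT <= t)."""
    ordered = sorted(rts)
    return bisect_right(ordered, t) / len(ordered)


def cumulative_hazard(rts, t):
    """Plug-in estimate H(t) = -ln S(t); infinite once all RTs are <= t."""
    survivor = 1.0 - ecdf(rts, t)
    return math.inf if survivor <= 0.0 else -math.log(survivor)


def capacity_or(redundant, single_a, single_b, t):
    """C_OR(t) = H_AB(t) / (H_A(t) + H_B(t)); NaN where it is undefined."""
    numerator = cumulative_hazard(redundant, t)
    denominator = cumulative_hazard(single_a, t) + cumulative_hazard(single_b, t)
    if not math.isfinite(numerator) or not math.isfinite(denominator) or denominator <= 0:
        return math.nan
    return numerator / denominator


def miller_violated(redundant, single_a, single_b, t):
    """True if F_AB(t) exceeds F_A(t) + F_B(t)."""
    return ecdf(redundant, t) > ecdf(single_a, t) + ecdf(single_b, t)


def grice_violated(redundant, single_a, single_b, t):
    """True if F_AB(t) falls below max(F_A(t), F_B(t))."""
    return ecdf(redundant, t) < max(ecdf(single_a, t), ecdf(single_b, t))


# Made-up RT samples (ms), chosen so that the redundant condition is much faster.
rt_ab = [300, 350, 400, 450]
rt_a = [400, 450, 500, 550]
rt_b = [420, 470, 520, 570]
c_at_400 = capacity_or(rt_ab, rt_a, rt_b, 400)
```

In this toy example the coefficient at 400 ms is well above 1 and Miller’s inequality is violated, the pattern that the holistic view predicts for aligned faces.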

Fig. 4 Experiment 2: unified capacity spaces of the capacity coefficient C(t), Miller’s and Grice’s inequalities. The line at C(t)=1 is for unlimited-capacity, independent, parallel model. Spaces are depicted separately for each observer and alignment condition

Statistical tests on the capacity coefficient were performed according to the Houpt–Townsend (2012) test, with the R sft package (Houpt et al. 2013). This test adopts the two-tailed null hypothesis that the capacity values emerge from an unlimited-capacity, independent, parallel (UCIP) model with \(C_{OR}(t)=1\) for all time points. The test produces a single z-score statistic that represents the deviation from \(C_{OR}(t)=1\). Statistical tests on Miller’s and Grice’s inequalities were performed by applying Kolmogorov–Smirnov (KS) tests at the level of the empirical cumulative distribution functions (Townsend and Nozawa 1995; Maris and Maris 2003). It is worth emphasizing that visual inspection of the unified capacity spaces should be supported by the statistical tests. If, for instance, C(t) appears mostly above 1, but only for a range of times in which the estimate is highly variable, and then below 1 for a smaller time range in which the estimator is more certain, the two ranges may trade off to yield a nonsignificant estimate (Joseph Houpt, pers. comm., September 18, 2014).

Table 4 presents the values of the Houpt–Townsend test. Only three observers (3, 4, and 6) out of eight exhibited patterns of workload capacity that are consistent with holistic processing. These observers showed super-capacity in the processing of aligned faces but unlimited or limited capacity in the processing of misaligned faces. The super-capacity values were accompanied by violations of Miller’s bound. In these cases, the capacity coefficient function rose above the Miller’s bound function in the unified space.

Table 4 Experiment 2: z scores for the Houpt–Townsend (2012) test

Five out of eight observers (1, 2, 5, 7, and 8) demonstrated patterns that were inconsistent with holistic processing. Observer 5 showed super-capacity with both aligned and misaligned faces. Observer 8 exhibited limited capacity in both conditions. Observer 2 and observer 7 evinced limited capacity with aligned faces, but unlimited capacity with misaligned faces. Observer 1 exhibited unlimited capacity with aligned faces, but limited capacity with misaligned faces, which was supported by violations of the Grice’s inequality for long t values. Observer 2 demonstrated violations of the Grice’s bound in both aligned and misaligned conditions. Taken together, the results entail that the majority of observers (five out of eight) were processing the faces in a non-holistic manner.

Discussion

The same faces from Experiment 1 were used in the present experiment. Observers were required to process both face halves in a redundant target task. In such a design, holistic processing should have emerged in the form of enhanced efficiency in the processing of double-target aligned faces. Generally, the results from the capacity coefficient and the Miller’s and Grice’s inequalities were inconsistent with the predictions of holistic processing. Only three out of eight participants exhibited the expected signature of holistic processing—super-capacity with aligned faces and unlimited or limited capacity with misaligned faces. Five out of eight observers demonstrated patterns of capacity that were incommensurate with the holistic predictions.

The validity of the conclusions drawn in Experiment 2 may be limited. First, the results are based on a relatively small sample of faces. Second, the identification task employed here is not typical. The majority of composite face studies have used a “same–different” task (Rossion 2013). The aims of the next experiment were to increase the number of composite faces and to devise a same–different task within a redundant target design (Hole 1994).

Experiment 3

The goal of the experiment was to assess the workload capacity and independence of composite face halves within the original study-test paradigm (Hole 1994). The design combined an OR redundant target task with a same–different task (Ingvalson and Wenger 2005). Observers were presented with a study face and a test face. The task assigned one response to conditions in which either the top half OR the bottom half of the test face was the “same” as that of the study face. Another response was assigned to conditions in which neither of the target halves was the “same” as that of the study half (Ingvalson and Wenger 2005; Donnelly et al. 2012). The redundant target in such designs is defined as a condition in which both of the target’s halves are the “same” as the pertinent study half (i.e., same–same), a single target is defined as a condition in which only one of the halves is the “same” as the pertinent study half (i.e., same–different, different–same). A no-target is defined as a condition in which both halves are “different” from the pertinent study half (i.e., different–different).

The present design closely mimics traditional assays that proved apt at recording composite face effects (Richler & Gauthier, 2014). It does so by employing a same–different task with a large number of faces and a great deal of trial-to-trial configural variability (Richler et al. 2013). Note that this design encourages super-capacity and holistic processing, as observers must process both face parts coming from the same identity. Super-capacity results from interdependencies across the face halves. If the holistic view is correct, then participants should exhibit super-capacity in the aligned condition but unlimited capacity in the misaligned condition with same identity faces.

Method

Participants

A new group of eight young male and female students was recruited from the Ariel University population. None of these participants took part in the previous experiments. They participated as part of a course requirement. All participants reported normal or corrected-to-normal vision.

Stimuli and apparatus

A subset of 40 face images from the Cornes et al. (2011) study was chosen. The faces were manipulated and prepared to fit into a standard oval frame. In all of the images, the hair was removed and brightness was standardized. Out of this set, 20 pairs were combined to form 20 new composite identities. From those faces, I created 20 gray-scale front-view aligned composite faces and 20 misaligned composite faces. The experiment was designed as a 2 (alignment: aligned, misaligned) × 4 (target: redundant target, bottom-target, top-target, and no-target) full factorial within-participants design. Aligned and misaligned composite faces were presented in separate blocks.

The task required a response based on the status of the target’s top or bottom half as the “same” as or “different” from the comparable study half. In the redundant target condition, both top and bottom parts required a “same” response. The two single-target conditions included cases in which the top half was the “same” and the bottom half was “different” or the top half was “different” and the bottom half was the “same.” In a no-target condition, both top and bottom parts were “different”. Each stimulus type appeared equally often. Observers were asked to respond with one key (‘m’) when at least one of the top or bottom halves of the target face was the “same” as that of the study face, and with another key (‘z’) when neither of the target halves was the “same” as that of the study face. On each trial, a fixation cross was presented for 500 ms. It was then replaced by a study face that was presented for 2000 ms. The test face appeared after an inter-stimulus-interval (ISI) of 200 ms and remained on the screen until response.
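The mapping from a study-test pair to a target condition and a response key can be sketched as follows. The representation of a face as a (top identity, bottom identity) tuple, the identity labels, and the function name are assumptions made for illustration.

```python
def classify_test_face(study, test):
    """Return (condition, key) for a study-test pair of (top, bottom) identities."""
    top_same = study[0] == test[0]
    bottom_same = study[1] == test[1]
    if top_same and bottom_same:
        condition = "redundant"      # same-same
    elif top_same:
        condition = "top-target"     # same-different
    elif bottom_same:
        condition = "bottom-target"  # different-same
    else:
        condition = "no-target"      # different-different
    # OR rule: 'm' if at least one half matches the study face, otherwise 'z'.
    key = "m" if (top_same or bottom_same) else "z"
    return condition, key


study_face = ("id07", "id12")  # hypothetical identities
result_redundant = classify_test_face(study_face, ("id07", "id12"))
result_top_only = classify_test_face(study_face, ("id07", "id03"))
result_none = classify_test_face(study_face, ("id01", "id03"))
```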

All the remaining details of the apparatus were identical to those reported in Experiment 2.

Results

Error trials averaged 22 % across all observers and were excluded from the RT analysis. RTs shorter than 150 ms or longer than 2500 ms (2.5 %) were also excluded from RT analyses. The relatively high error rate reflects the difficulty of the task, but is still within the accepted confines for calculating the latency capacity measures (Townsend and Wenger 2004). Mean RTs were computed for each observer in each condition (redundant-target, top or bottom half target, and no-target). The analyses were identical to those performed in Experiment 2, and included the redundant target effect (Raab 1962), Miller’s and Grice’s bounds (Miller 1982), and the capacity coefficient (Townsend and Nozawa 1995).

Redundant target effects

Mean RTs were computed at each level of alignment, stimulus, and observer. The data were submitted to two-way ANOVA with alignment (aligned, misaligned) × target type (redundant, single-top, single-bottom). Table 5 presents the results of this analysis. The main effect of target type was significant for all eight observers. The RTE was further assessed by comparing performance in redundant target (i.e., same–same) with single top-target (i.e., same–different), and single bottom-target (i.e., different–same). Figure 5 depicts the mean RTs for each of these conditions. As can be noted, RTEs were observed in the data of all observers. A main effect of alignment was demonstrated in the data of six out of eight observers, entailing longer RTs for misaligned than for aligned faces. The interaction was significant for only three out of the eight observers, and therefore was not further interpreted.

Fig. 5 Experiment 3: mean RTs in double-target (top or bottom), bottom single-target, and top single-target for aligned (top figure) and misaligned (bottom figure) conditions

Table 5 Experiment 3: ANOVA on RTs for correct responses separately for each of the observers

Capacity

Figure 6 depicts the capacity coefficient \(C_{OR}(t)\), along with Miller’s and Grice’s bounds for each participant in each alignment condition. The three measures are presented within a single unified space (Townsend and Eidels 2011). Statistical tests on the capacity coefficients were performed according to the Houpt–Townsend (2012) test using the R sft package (Houpt et al. 2013). Violations of Miller’s and Grice’s inequalities were assessed using the Kolmogorov–Smirnov test (Maris and Maris 2003). Note, again, that the capacity measures are taken across time points. Reliance on visual inspection alone can sometimes be misleading or may lead to incorrect inferences. An advantage of the Houpt–Townsend test is that it weights the evidence and provides a single statistic that reflects the dominant trend. Values of the Houpt–Townsend statistic appear in Table 6.

Fig. 6 Experiment 3: unified capacity spaces of the capacity coefficient C(t), Miller’s and Grice’s inequalities. The line at C(t)=1 is for unlimited-capacity, independent, parallel model. Spaces are depicted separately for each observer and alignment condition

Table 6 Experiment 3: z scores for the Houpt and Townsend (2012) test

Only three (2, 3, and 4) out of the eight observers demonstrated capacity patterns that were consistent with holistic processing, that is, super-capacity with aligned faces and unlimited capacity with misaligned faces. Super-capacity in the data of these observers was also supported by considerable violations of Miller’s inequality. Five (1, 5, 6, 7, and 8) out of eight observers exhibited capacity patterns that were inconsistent with holistic processing. Observers 1, 5, 7, and 8 processed both aligned and misaligned faces with unlimited capacity. Their capacity coefficient values were roughly equal to 1 for most of the t points, and no violations of Miller’s bound were recorded for them. Observer 6 switched from unlimited capacity in aligned faces to limited capacity in misaligned faces. Since Grice’s bound was violated for short t values in the misaligned condition data of this observer, it is likely that capacity was severely limited at those times.

The results of this experiment are very similar to those recorded in the previous experiment. Aligned composite faces from the same identity did not produce the expected super-capacity. Moreover, the alignment manipulation did not alter significantly the levels of workload capacity and independence. These results clearly indicate that composite faces from the same identity were not processed holistically, and are difficult to reconcile with a holistic view of face processing.

Discussion

Experiment 3 provided another rigorous test of the notion that composite faces are processed holistically. I hypothesized that, if the holistic view is correct, aligned composite faces from the same identity should exhibit super-capacity, whereas misaligned faces from the same identity should demonstrate unlimited capacity. These predictions were tested in a design that combined a redundant target task and study-test procedure (Hole 1994). The predictions of the holistic approach were not supported. The majority of observers exhibited neither super-capacity with aligned composite faces of the same identity nor qualitative differences in workload capacity between aligned and misaligned composites of the same identity.

Experiments 2 and 3 shared the same conceptual thread that permitted the computation of the capacity measures. However, they differed considerably in terms of the task employed, the number of stimuli presented, and the general procedure administered. In spite of these differences, the experiments converged on the same conclusion. Composite faces were not processed holistically by the lion’s share of observers. Offsetting the face parts in a composite face was not associated with significant quantitative or qualitative changes in processing efficiency. Of the 16 observers who participated in Experiments 2 and 3, ten demonstrated patterns of workload capacity and independence that were incommensurate with holistic processing. The fact that a minority of observers did show holistic processing attests to the validity of the methods used here. Had it been a prominent strategy, holistic processing should have surfaced more often.

General discussion

The present work sought to examine the notion that composites are processed holistically by capitalizing on the strong logic of converging operations (Garner et al. 1956; Fitousi and Wenger 2013). Composite faces were submitted to a series of tests of perceptual independence. These tests incorporated both operational and theoretical definitions. Three paradigms were used: the composite face task (Hole 1994), Garner’s speeded classification task (Garner 1974), and the redundant target paradigm (Miller 1982). In Experiment 1, a composite face effect was replicated, but neither Garner interference nor redundancy gains were obtained. In Experiment 2, a redundant target task revealed that the faces from Experiment 1 did not produce super-capacity for a large number of observers. In Experiment 3, the redundant target task was applied to a modified composite face procedure (Hole 1994), again producing weak evidence for super-capacity and therefore compelling evidence for analytic rather than holistic processing. Taken together, the results from the various definitions, measures, and paradigms converged on the conclusion that composite faces are processed analytically rather than holistically by most observers.

Composite faces are considered the epitome of holistic face perception. This belief is so deeply rooted in current research that composite faces have themselves become a tool for assessing sundry aspects of face perception (Rossion 2013). However, if one jettisons the compelling illusion they create, one is left with only sparse evidence that composites are indeed processed holistically. The lion’s share of the evidence relies on a limited number of experimental procedures that have given rise to so-called “composite face effects” (Hole 1994). The appropriate procedure for testing these effects, as well as their “correct” interpretation, has been a matter of much debate (Rossion 2013; Richler and Gauthier 2014).

The present study challenges the orthodox holistic view (Farah et al. 1998; Young et al. 1987). It is consistent with a growing body of work, coming mainly from computational and formal quarters (Ellison and Massaro 1997; Gold et al. 2012; Loftus et al. 2004; Macho and Leder 1998; Tversky and Krantz 1969; Bradshaw and Wallace 1971; Wenger and Townsend 2006; Donnelly et al. 2012; Wenger and Ingvalson 2002; 2003), that defies the notion of holistic processing. Particularly relevant here is a recent study by Donnelly, Cornes, and Menneer (2012), who applied the capacity coefficient to the Thatcher illusion (Thompson 1980). In this illusion, inverting the eyes and mouth of a face results in the perception of a grotesque expression. This illusion disappears when the entire “Thatcherized” face is inverted. Like the composite face effect, the Thatcher illusion is considered to be a strong marker of holistic processing. Donnelly et al. found no evidence for super-capacity with “Thatcherized” faces.

Redundant target with distractor

One of the reviewers rightly noted that the single-target face halves in the present study were not truly single because they were presented with a face half irrelevant to the task at hand (i.e., a distractor). This departure from the original design has been adopted in many comparable studies (see also, Eidels, Townsend, & Algom, 2010; Ingvalson and Wenger 2005), but hitherto the consequences of such an application have not been studied. They certainly deserve appropriate consideration. Under some assumptions, an additional irrelevant distractor may lead to deviations of the estimated capacity coefficient from its true value. This issue was examined in a series of numerical simulations (see Appendix).

The simulations revealed that adding a distractor may have some impact on the estimated capacity coefficient values. However, this can occur only in a small number of cases and under the following implausible processing assumptions: (a) the distractor receives a large portion of the total amount of attentional resources, and (b) the distractor consumes much more capacity than the target. Various sources of information run against these assumptions in the current data. First, capitalizing on Garner’s converging operations logic (Garner et al. 1956), the evidence from Experiment 1 clearly showed that irrelevant distractors did not receive attention in the Garner paradigm. Second, had the distractor had any impact, it should have affected both aligned and misaligned composites, as both were presented with a distractor. Also note that the alternative of presenting the composite face half in the absence of any other face half would have significantly hampered the presumable Gestalt structure of the composites presented here, rendering comparison with other studies difficult. In conclusion, the addition of so-called distractors is unlikely to bias theoretical resolution with respect to alignment of composite faces.

Individual differences in processing composite faces

A major contribution of the present study is the demonstration of individual differences in the processing of composite faces. Most of the participants exhibited patterns that were inconsistent with accepted meanings of holistic processing. However, some of the participants did show patterns that comply with holistic processing. The majority of studies on the composite face illusion report the results of the aggregate. However, accumulating evidence in several research domains shows that the aggregate may conceal diverse individual processing strategies (Townsend and Fifić 2004; Fitousi and Wenger 2011; Estes 1956; Ashby et al. 1994). A notable demonstration of considerable individual differences in the SFT paradigm is provided in a study by Townsend and Fifić (2004). The composite face effect itself has been shown to vary considerably across observers. When individual data are carefully considered, it is often found that some observers do not show the composite face effect (Avidan et al. 2011; Ramon et al. 2010).

Selective versus divided attention to face parts

A recent debate in the composite face literature (Richler et al. 2012; Rossion 2013) concerns the role played by attention in the processing of composite faces (Curby et al. 2013). Is divided attention necessary for holistic processing to occur? At first glance, the distinction between divided attention and selective attention may seem inconsequential. However, these are two diametrically opposed processes (Fitousi and Wenger 2013; Melara and Algom 2003). The former entails association of information, whereas the latter implies dissociation of information. Research in the domains of attention (Eidels et al. 2010), categorization (Maddox and Ashby 1996), and face recognition (Fitousi and Wenger 2013) reveals that the task demands of these attentional strategies differ to the extent that they change the underlying representation itself.

It has been argued that divided attention tasks allow for much stronger tests of holistic processing than those offered by selective attention tasks. Townsend and Wenger (2014) were explicit about this point. They argued persuasively that the majority of experimental studies of holistic processing have used selective attention tasks. These tasks cannot assess potential violations of strong forms of independence, because they provide only partial information about the observer’s state. As a result, researchers cannot relate the experimental evidence to the theoretical construct of the Gestalt. In this sense, the redundant target designs employed here provide an adequate solution.

The composite face effect is not a valid marker of holistic processing

Let me conclude with a caveat. If composite faces are not processed holistically, what does the composite face effect really measure? I submit that the effect registers a response conflict that is unrelated to holistic processes (cf. Richler et al. 2008; Wenger & Ingvalson 2002, 2003). Previous studies have shown that the magnitude of the composite face effect exhibits either no correlation (Konar et al. 2010) or a very weak correlation (Wang et al. 2012) with face recognition performance across individuals. Moreover, the fact that misalignment of the irrelevant face half considerably reduces the magnitude of interference, or abolishes it altogether, can be readily explained by a dilution account (Kahneman and Chajczyk 1983; Fitousi and Wenger 2011). This account was originally proposed by Kahneman and Chajczyk to explain the reduction in Stroop effects with spatially separated color words and color bars. According to this explanation, attention to the irrelevant dimension is reduced and sensory processing of the distractor decreases, leading to a weaker response conflict between color and word. A similar case can be made for composite faces: misaligning the irrelevant half may reduce the attention it receives and thereby weaken the response conflict it generates. Moreover, the dissociation between the Garner and the composite face effects provides additional support for a response-conflict account. Note that a response conflict is absent from the Garner paradigm but is inherent to the same-different task that gives rise to the composite face effect. In summary, the results of the present study cast serious doubt on the validity of the composite face effect as a tool for measuring truly perceptual effects of holistic processing.