General recognition theory with individual differences: a new method for examining perceptual and decisional interactions with an application to face perception

  • Theoretical Review
Psychonomic Bulletin & Review

Abstract

A common question in perceptual science is to what extent different stimulus dimensions are processed independently. General recognition theory (GRT) offers a formal framework in which different notions of independence can be defined and tested rigorously, while also dissociating perceptual from decisional factors. This article presents a new GRT model that overcomes several shortcomings of previous approaches by providing a clearer separation between perceptual and decisional processes and a more complete description of such processes. The model assumes that different individuals share similar perceptual representations, but vary in their attention to dimensions and in the decisional strategies they use. We apply the model to the analysis of interactions between identity and emotional expression during face recognition. The results of previous research aimed at this problem have been disparate. Participants identified four faces, which resulted from the combination of two identities and two expressions. An analysis using the new GRT model showed a complex pattern of dimensional interactions. The perception of emotional expression was not affected by changes in identity, but the perception of identity was affected by changes in emotional expression. There were violations of decisional separability of expression from identity and of identity from expression, with the former being more consistent across participants than the latter. One explanation for the disparate results in the literature is that decisional strategies may have varied across studies and influenced the results of tests of perceptual interactions, as previous studies lacked the ability to dissociate between perceptual and decisional interactions.

References

  • Akaike, H. (1974). A New Look at the Statistical Model Identification. IEEE Transactions on Automatic Control, 19, 716–723.

  • Ashby, F. G., & Lee, W. W. (1991). Predicting similarity and categorization from identification. Journal of Experimental Psychology: General, 120(2), 150.

  • Ashby, F. G., & Maddox, W. T. (1994). A response time theory of separability and integrality in speeded classification. Journal of Mathematical Psychology, 38(4), 423–466.

  • Ashby, F. G., & Soto, F. A. (2014). Multidimensional signal detection theory. In J. R. Busemeyer, J. T. Townsend, Z. Wang, & A. Eidels (Eds.), Oxford handbook of computational and mathematical psychology. New York: Oxford University Press (in press).

  • Ashby, F. G., & Townsend, J. T. (1986). Varieties of perceptual independence. Psychological Review, 93(2), 154–179.

  • Ashby, F. G., Waldron, E. M., Lee, W. W., & Berkman, A. (2001). Suboptimality in human categorization and identification. Journal of Experimental Psychology: General, 130(1), 77.

  • Baudouin, J. Y., Martin, F., Tiberghien, G., Verlut, I., & Franck, N. (2002). Selective attention to facial emotion and identity in schizophrenia. Neuropsychologia, 40(5), 503–511.

  • Billingsley, P. (2012). Probability and Measure. Hoboken, New Jersey: John Wiley & Sons.

  • Blais, C., Arguin, M., & Marleau, I. (2009). Orientation invariance in visual shape perception. Journal of Vision, 9(2), 1–23.

  • Borg, I., & Groenen, P. (2005). Modern Multidimensional Scaling: Theory and Applications. New York: Springer.

  • Brainard, D. H. (1997). The psychophysics toolbox. Spatial Vision, 10(4), 433–436.

  • Bruce, V., & Young, A. (1986). Understanding face recognition. British Journal of Psychology, 77(3), 305–327.

  • Burnham, K. P., & Anderson, D. R. (2004). Multimodel inference: Understanding AIC and BIC in model selection. Sociological Methods and Research, 33(2), 261–304.

  • Carroll, J. D., & Chang, J. J. (1970). Analysis of individual differences in multidimensional scaling via an N-way generalization of “Eckart-Young” decomposition. Psychometrika, 35(3), 283–319.

  • Cornes, K., Donnelly, N., Godwin, H., & Wenger, M. J. (2011). Perceptual and decisional factors influencing the discrimination of inversion in the Thatcher illusion. Journal of Experimental Psychology: Human Perception and Performance, 37(3), 645.

  • D’Errico, J. (2006). Adaptive robust numerical differentiation. MATLAB Central File Exchange. Retrieved April 19, 2014, from http://www.mathworks.com/matlabcentral/fileexchange/file_infos/13490-adaptive-robust-numerical-differentiation

  • Dailey, M., Cottrell, G. W., & Reilly, J. (2001). California facial expressions, CAFE. Unpublished digital images, University of California, San Diego, Computer Science and Engineering Department.

  • de Beeck, H. P. O., Haushofer, J., & Kanwisher, N. G. (2008). Interpreting fMRI data: maps, modules and dimensions. Nature Reviews Neuroscience, 9(2), 123–135.

  • Ekman, P., Friesen, W. V., & Hager, J. (1978). The Facial Action Coding System (FACS): A technique for the measurement of facial action. Palo Alto: Consulting Psychologists.

  • Ellamil, M., Susskind, J. M., & Anderson, A. K. (2008). Examinations of identity invariance in facial expression adaptation. Cognitive, Affective, and Behavioral Neuroscience, 8(3), 273.

  • Ennis, D. M., & Ashby, F. G. (2003). Fitting the decision bound models to identification categorization data. Santa Barbara: University of California.

  • Etcoff, N. L. (1984). Selective attention to facial identity and facial emotion. Neuropsychologia, 22(3), 281–295.

  • Fitousi, D., & Wenger, M. J. (2013). Variants of independence in the perception of facial identity and expression. Journal of Experimental Psychology: Human Perception and Performance, 39(1), 133–155.

  • Fox, C. J., & Barton, J. J. S. (2007). What is adapted in face adaptation? The neural representations of expression in the human visual system. Brain Research, 1127, 80–89.

  • Fox, C. J., Oruç, I., & Barton, J. J. S. (2008). It doesn’t matter how you feel. The facial identity aftereffect is invariant to changes in facial expression. Journal of Vision, 8(3), 11.

  • Ganel, T., & Goshen-Gottstein, Y. (2004). Effects of familiarity on the perceptual integrality of the identity and expression of faces: The parallel-route hypothesis revisited. Journal of Experimental Psychology: Human Perception and Performance, 30(3), 583–596.

  • Ganel, T., Valyear, K. F., Goshen-Gottstein, Y., & Goodale, M. A. (2005). The involvement of the “fusiform face area” in processing facial expression. Neuropsychologia, 43(11), 1645–1654.

  • Garner, W. R. (1974). The processing of information and structure. New York: Erlbaum.

  • Hartigan, J. A., & Hartigan, P. M. (1985). The dip test of unimodality. The Annals of Statistics, 70–84.

  • Haxby, J. V., Hoffman, E. A., & Gobbini, M. I. (2000). The distributed human neural system for face perception. Trends in Cognitive Sciences, 4(6), 223–232.

  • Kadlec, H., & Townsend, J. T. (1992a). Signal detection analysis of multidimensional interactions. In F. G. Ashby (Ed.), Multidimensional Models of Perception and Cognition (pp. 181–231). Hillsdale, NJ: Erlbaum.

  • Kadlec, H., & Townsend, J. T. (1992b). Implications of marginal and conditional detection parameters for the separabilities and independence of perceptual dimensions. Journal of Mathematical Psychology, 36(3), 325–374.

  • Kanwisher, N. (2000). Domain specificity in face perception. Nature Neuroscience, 3, 759–763.

  • Lee, M. D., & Wetzels, R. (2010). Individual differences in attention during category learning. In: R. Catrambone & S. Ohlsson (Eds.), Proceedings of the 32nd Annual Conference of the Cognitive Science Society (pp. 387–392). Austin, TX: Cognitive Science Society.

  • Lehky, S. R. (2000). Fine discrimination of faces can be performed rapidly. Journal of Cognitive Neuroscience, 12(5), 848–855.

  • Mack, M. L., Richler, J. J., Gauthier, I., & Palmeri, T. J. (2011). Indecision on decisional separability. Psychonomic Bulletin & Review, 18(1), 1–9.

  • Maddox, W. T., & Ashby, F. G. (1996). Perceptual separability, decisional separability, and the identification-speeded classification relationship. Journal of Experimental Psychology: Human Perception & Performance, 22, 795–817.

  • Maddox, W. T., Ashby, F. G., & Waldron, E. M. (2002). Multiple attention systems in perceptual categorization. Memory and Cognition, 30, 325–339.

  • Mestry, N., Wenger, M. J., & Donnelly, N. (2012). Identifying sources of configurality in three face processing tasks. Frontiers in Perception Science, 3, 456.

  • Navarro, D. J., Griffiths, T. L., Steyvers, M., & Lee, M. D. (2006). Modeling individual differences using Dirichlet processes. Journal of Mathematical Psychology, 50(2), 101–122.

  • Pell, P. J., & Richards, A. (2013). Overlapping facial expression representations are identity-dependent. Vision Research, 79(7), 1–7.

  • Preacher, K. J., & Merkle, E. C. (2012). The problem of model selection uncertainty in structural equation modeling. Psychological Methods, 17(1), 1.

  • Richler, J. J., Gauthier, I., Wenger, M. J., & Palmeri, T. J. (2008). Holistic Processing of Faces: Perceptual & Decisional Components. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34(2), 328–342.

  • Schweinberger, S. R., Burton, A. M., & Kelly, S. W. (1999). Asymmetric dependencies in perceiving identity and emotion: Experiments with morphed faces. Perception & Psychophysics, 61(6), 1102–1115.

  • Schweinberger, S. R., & Soukup, G. R. (1998). Asymmetric relationships among perceptions of facial identity, emotion, and facial speech. Journal of Experimental Psychology: Human Perception and Performance, 24(6), 1748–1765.

  • Silbert, N. H. (2012). Syllable structure and integration of voicing and manner of articulation information in labial consonant identification. The Journal of the Acoustical Society of America, 131(5), 4076–4086.

  • Silbert, N. H., & Thomas, R. (2013). Decisional separability, model identification, and statistical inference in the general recognition theory framework. Psychonomic Bulletin & Review, 20(1), 1–20.

  • Soto, F. A., & Wasserman, E. A. (2011). Asymmetrical interactions in the perception of face identity and emotional expression are not unique to the primate visual system. Journal of Vision, 11(3).

  • Stankiewicz, B. J. (2002). Empirical evidence for independent dimensions in the visual representation of three-dimensional shape. Journal of Experimental Psychology: Human Perception and Performance, 28(4), 913–932.

  • Thomas, R. (2001). Perceptual interactions of facial dimensions in speeded classification and identification. Attention, Perception, & Psychophysics, 63(4), 625–650.

  • Thomas, R. D., & Silbert, N. H. (2014). Technical clarification to Silbert and Thomas (2013): “Decisional separability, model identification, and statistical inference in the general recognition theory framework”. Psychonomic Bulletin & Review, 21(2), 574–575.

  • Ungerleider, L. G., & Haxby, J. V. (1994). “What” and “where” in the human brain. Current Opinion in Neurobiology, 4(2), 157–165.

  • Vogels, R., Biederman, I., Bar, M., & Lorincz, A. (2001). Inferior temporal neurons show greater sensitivity to nonaccidental than to metric shape differences. Journal of Cognitive Neuroscience, 13(4), 444–453.

  • Wald, A. (1943). Tests of statistical hypotheses concerning several parameters when the number of observations is large. Transactions of the American Mathematical Society, 54(3), 426–482.

  • Yankouskaya, A., Booth, D. A., & Humphreys, G. (2012). Interactions between facial emotion and identity in face processing: Evidence based on redundancy gains. Attention, Perception and Psychophysics, 74(8), 1692–1711.

Author Note

Preparation of this article was supported in part by AFOSR grant FA9550-12-1-0355, NIH (NINDS) Grant No. P01NS044393, and by Grant No. W911NF-07-1-0072 from the U.S. Army Research Office through the Institute for Collaborative Biotechnologies. The US government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the US Government.

Author information

Corresponding author

Correspondence to Fabian A. Soto.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Table S1

(PDF 35 kb)

Appendix

Here we prove that the problem of non-identifiability of decisional separability described by Silbert and Thomas (2013) occurs in GRT-wIND only in the special case in which the decision bounds of all participants for each dimension are parallel to each other. We also describe procedures to: (1) estimate the parameters of a GRT-wIND model from identification data using maximum likelihood estimation, (2) run statistical tests for perceptual independence, perceptual separability and decisional separability, and (3) estimate parameters and test different types of independence as in previous applications of GRT.

Identifiability of decisional separability in the 2 × 2 GRT-wIND model

Silbert and Thomas (2013) showed analytically that a failure of decisional separability is non-identifiable in the Gaussian GRT model for a 2 × 2 identification experiment. That is, if the data from an experiment can be fit by a GRT model in which decisional separability fails, then it is always possible to find a different GRT model in which decisional separability holds and that predicts the exact same data pattern. We call this result the Silbert-Thomas non-identifiability, or STn for short.

We start by summarizing the proof offered by Silbert and Thomas (2013). Their theorem states that “Any perceptually separable but decisionally nonseparable configuration can be transformed to a configuration that is perceptually nonseparable, decisionally separable, and equivalent with respect to predicted response probabilities” (p. 17). Thus, they focus on the case in which the original configuration exhibits perceptual separability but violations of decisional separability. However, the more general result is that decisional separability is nonidentifiable in this model. As the authors indicate, “Any arbitrary (and, in general, not perceptually separable) linear bound model without decisional separability can be rotated and sheared to produce a model with decisional separability […], failure of decisional separability is never identifiable in this model” (pp. 4–5).

The proof of this theorem starts with a configuration without decisional separability that has been translated so that the origin of the xy-plane coincides with the intersection of the two decision bounds \( h_A \) and \( h_B \). The angle between \( h_B \) and the x-axis is represented by ϕ, and the angle between the bounds \( h_A \) and \( h_B \) is represented by ω. Decisional separability holds when ϕ = 0 and ω = π/2. Rotation of the original configuration by the angle ϕ brings \( h_B \) to be parallel to the x-axis (and orthogonal to the y-axis), achieving decisional separability of component B from A. A horizontal shear transformation changes the angle that every line in the plane makes with the x-axis, except for lines parallel to the x-axis, which keep their orientation. Thus, for any value of ω, a horizontal shear transformation can be found that brings this angle to π/2 while keeping \( h_B \) parallel to the x-axis, thus achieving decisional separability of component A from B while preserving decisional separability of component B from A.

The rotation and shear transformations can be represented by the transformation matrices \( {\mathbf{\mathsf{L}}}_1 \) and \( {\mathbf{\mathsf{L}}}_2 \), respectively, which combine to produce:

$$ \mathbf{\mathsf{L}}={\mathbf{\mathsf{L}}}_2{\mathbf{\mathsf{L}}}_1=\begin{bmatrix} 1 & -\frac{1}{\tan \omega} \\ 0 & 1 \end{bmatrix}\begin{bmatrix} \cos \varphi & -\sin \varphi \\ \sin \varphi & \cos \varphi \end{bmatrix}=\begin{bmatrix} \cos \varphi -\frac{\sin \varphi }{\tan \omega} & -\sin \varphi -\frac{\cos \varphi }{\tan \omega} \\ \sin \varphi & \cos \varphi \end{bmatrix} $$
(A1)

This is an area-preserving affine transformation, because both the rotation and the shear have a determinant of 1. The change-of-variables theorem for densities guarantees that probabilities are preserved under such a transformation (Billingsley 2012). This means that the response probabilities predicted by the original configuration and by the decisionally separable configuration are the same, as the values of the integrals involved do not change. The means and covariance matrices in the decisionally separable configuration can be computed from the original means and covariance matrices by using the formulas:

$$ {\underset{\bar{\mkern6mu}}{\mu}}_{\mathit{\mathsf{T}}}=\mathbf{\mathsf{L}}\underset{\bar{\mkern6mu}}{\mu } $$
(A2)
$$ {\boldsymbol{\Sigma}}_{\mathit{\mathsf{T}}}=\mathbf{\mathsf{L}}\boldsymbol{\Sigma} {\mathbf{\mathsf{L}}}^{\mathsf{T}} $$
(A3)
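
As an illustration of Equations A1 to A3, the following Python sketch builds the combined transformation and applies it to a single perceptual distribution. The values of ϕ, ω, the mean, and the covariance matrix are hypothetical, and the helper names (`transformation`, `matmul`, `det2`) are illustrative assumptions rather than part of any GRT software.

```python
import math


def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def det2(a):
    """Determinant of a 2 x 2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def transformation(phi, omega):
    """Combined shear-after-rotation matrix L = L2 L1 of Equation A1."""
    rotation = [[math.cos(phi), -math.sin(phi)],
                [math.sin(phi), math.cos(phi)]]
    shear = [[1.0, -1.0 / math.tan(omega)],
             [0.0, 1.0]]
    return matmul(shear, rotation)


# Hypothetical angles and perceptual distribution, for illustration only.
phi, omega = math.radians(20.0), math.radians(70.0)
L = transformation(phi, omega)

mu = [[1.0], [0.5]]                   # column vector of means
sigma = [[1.0, 0.3], [0.3, 2.0]]      # covariance matrix

mu_T = matmul(L, mu)                                  # Equation A2
sigma_T = matmul(matmul(L, sigma), transpose(L))      # Equation A3

print("det(L) =", det2(L))
print("transformed mean:", mu_T)
print("transformed covariance:", sigma_T)
```

Because the determinant of the combined matrix is 1, the determinant of the covariance matrix is unchanged by the transformation, which gives a quick numerical check on the area-preserving property.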

In the remainder of this section, we show the conditions under which decisional separability is non-identifiable in GRT models with more than one bound per dimension. GRT-wIND and n × m GRT models with n > 2 and m > 2 are special cases of this general class. We start by identifying the conditions under which STn holds for a model with two bounds per dimension. It is then straightforward to see that the same conditions apply for any larger number of bounds per dimension.

Theorem

In a Gaussian GRT model with two dimensions and two linear bounds per dimension, where the ith bound for dimension A is represented as \( h_{Ai} \) and the jth bound for dimension B as \( h_{Bj} \), the non-identifiability of decisional separability identified by Silbert and Thomas (2013) holds if and only if \( h_{A1} \parallel h_{A2} \) and \( h_{B1} \parallel h_{B2} \).

Proof

We first prove that if \( h_{A1} \parallel h_{A2} \) and \( h_{B1} \parallel h_{B2} \), then STn holds. As with the proof of STn, we start with a configuration without decisional separability that has been translated so that the origin of the xy-plane coincides with the intersection of \( h_{A1} \) and \( h_{B1} \). We represent the angle between \( h_{Bj} \) and the x-axis as \( \phi_j \) and the angle between \( h_{Ai} \) and \( h_{Bj} \) as \( \omega_{ij} \). Because \( h_{B1} \) and \( h_{B2} \) are parallel to each other, but not parallel to the x-axis, they intersect the latter at congruent angles; that is, \( \phi_1 = \phi_2 \). Thus, rotation of the original configuration by the angle \( \phi_1 \) brings both \( h_{B1} \) and \( h_{B2} \) to be parallel to the x-axis and orthogonal to the y-axis, achieving decisional separability of component B from A. After rotation, it is still true that \( h_{A1} \parallel h_{A2} \) and \( h_{B1} \parallel h_{B2} \), because rotation preserves parallelism. This means that \( \omega_{ij} = \omega \) for all i and j. Thus, a single shear transformation can bring this angle to π/2, achieving decisional separability of component A from B while also keeping decisional separability of component B from A.

To complete the proof, we must show that if STn holds, then \( h_{A1} \parallel h_{A2} \) and \( h_{B1} \parallel h_{B2} \). For STn to hold, a decisionally separable configuration must exist that can be found by applying an affine transformation L to an original configuration without decisional separability. By definition, in this decisionally separable configuration \( h_{A1} \perp x \), \( h_{A2} \perp x \), \( h_{B1} \perp y \) and \( h_{B2} \perp y \). Because two lines in the same plane that are both perpendicular to a third line are parallel to each other, \( h_{A1} \parallel h_{A2} \) and \( h_{B1} \parallel h_{B2} \) in the decisionally separable configuration. To go from the decisionally separable configuration to the original configuration, we must apply the transformation \( \mathbf{L}^{-1} \). This inverse transformation exists because both shear and rotation are invertible transformations. The inverse of an affine transformation is itself an affine transformation that conserves parallelism, so application of \( \mathbf{L}^{-1} \) to the decisionally separable configuration conserves the property that \( h_{A1} \parallel h_{A2} \) and \( h_{B1} \parallel h_{B2} \). Thus, if STn holds, then bounds must be parallel in the decisionally separable configuration as well as in the original configuration.

This completes the proof for the case in which there are two linear bounds per dimension. A corollary is that for models with more than two bounds per dimension, STn holds if and only if each bound in one dimension is parallel to each of the other bounds in that specific dimension.
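
The two geometric facts used in the proof can be checked numerically. The sketch below, written under the assumption that each bound is represented only by a direction vector, shows that an invertible linear transformation keeps parallel bounds parallel, and that a single horizontal shear can make two bounds vertical only when they share the same direction. The matrix and the direction vectors are hypothetical values chosen for illustration.

```python
def apply(L, d):
    """Apply the 2 x 2 matrix L to the direction vector d."""
    return (L[0][0] * d[0] + L[0][1] * d[1],
            L[1][0] * d[0] + L[1][1] * d[1])


def cross(u, v):
    """2-D cross product; it is zero exactly when u and v are parallel."""
    return u[0] * v[1] - u[1] * v[0]


def shear_to_vertical(d):
    """Shear factor k for which [[1, k], [0, 1]] maps direction d onto the y-axis."""
    return -d[0] / d[1]


# Hypothetical invertible transformation and bound directions.
L = [[0.8, -1.1], [0.3, 0.9]]
h_a1 = (1.0, 2.0)
h_a2_parallel = (2.0, 4.0)   # same direction as h_a1
h_a2_tilted = (1.0, 3.0)     # not parallel to h_a1

# Parallelism is conserved: the cross product stays at zero.
parallel_after = cross(apply(L, h_a1), apply(L, h_a2_parallel))

# Non-parallel bounds stay non-parallel under an invertible transformation.
tilted_after = cross(apply(L, h_a1), apply(L, h_a2_tilted))

# A single shear factor works for both bounds only in the parallel case.
k_parallel = (shear_to_vertical(h_a1), shear_to_vertical(h_a2_parallel))
k_tilted = (shear_to_vertical(h_a1), shear_to_vertical(h_a2_tilted))

print(parallel_after, tilted_after)
print(k_parallel, k_tilted)
```

In the non-parallel case the two bounds require different shear factors, which is the numerical counterpart of the "only if" direction of the theorem.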

Here we have exclusively dealt with part (i) of the theorem proposed by Silbert and Thomas (2013). Part (ii) of this theorem proposes that a configuration with mean shift integrality and decisional separability is unidentifiable from a configuration with perceptual separability and without decisional separability. This theorem also deals with the non-identifiability of decisional separability, so as before it only holds for models with more than one bound per dimension if those bounds are parallel. Furthermore, an additional condition for this theorem to hold is that all covariance matrices in the model must be identical (Thomas and Silbert 2014). This in general is not the case in GRT-wIND or in traditional GRT models for designs larger than 2 × 2, which allow for estimation of different variances and covariances for each perceptual distribution.

In conclusion, STn is not generally true in GRT-wIND or any other model with more than one bound per dimension. The non-identifiability of decisional separability arises in such models only under very specific circumstances.

Maximum likelihood estimation for GRT-wIND

The data from each participant in an identification experiment are summarized in a confusion matrix, with rows corresponding to each stimulus in the experiment, columns corresponding to each response, and response frequencies reported in each cell of the matrix. Let \( S_1, S_2, \dots, S_n \) denote the n stimuli in an identification experiment and let \( R_1, R_2, \dots, R_n \) denote the n responses. Let \( r_{ij} \) denote the frequency with which the participant responded \( R_j \) on trials when stimulus \( S_i \) was presented. Finally, there are N participants in the experiment, indexed by k = 1, 2, …, N. Given a set of parameter values for the model, the likelihood of these confusion data is computed in two steps.

In the first step, the predicted confusion matrix of each participant is computed using standard methods. For example, the predicted probability that a participant responds \( R_j \) on trials when stimulus \( S_i \) was presented, denoted by \( P(R_j \mid S_i) \), is computed as the volume under the \( S_i \) perceptual distribution within response region \( R_j \). A numerical approximation to this multiple integral can be computed efficiently using Cholesky factorization (Ennis and Ashby 2003; for a tutorial overview, see Ashby and Soto 2014).
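
To make the first step concrete, the following sketch approximates one row of a predicted confusion matrix for a 2 × 2 design. It is not the numerical integration method of Ennis and Ashby (2003); it swaps in a simple Monte Carlo estimate, using a Cholesky factor only to draw correlated samples. The bound parameterization (a linear discriminant `b1*x + b2*y + c` per dimension), the function names, and all numerical values are assumptions made for illustration.

```python
import math
import random


def cholesky2(cov):
    """Lower-triangular Cholesky factor of a 2 x 2 covariance matrix."""
    a = math.sqrt(cov[0][0])
    b = cov[1][0] / a
    c = math.sqrt(cov[1][1] - b * b)
    return a, b, c


def predicted_row(mean, cov, bound_a, bound_b, n_samples=20000, seed=0):
    """Monte Carlo estimate of P(R_j | S_i) for the four responses.

    Each bound is a tuple (b1, b2, c); a percept (x, y) is assigned to
    level 2 of that dimension when b1*x + b2*y + c > 0, else to level 1.
    Responses are ordered A1B1, A1B2, A2B1, A2B2.
    """
    rng = random.Random(seed)
    a, b, c = cholesky2(cov)
    counts = [0, 0, 0, 0]
    for _ in range(n_samples):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = mean[0] + a * z1
        y = mean[1] + b * z1 + c * z2
        level_a = int(bound_a[0] * x + bound_a[1] * y + bound_a[2] > 0)
        level_b = int(bound_b[0] * x + bound_b[1] * y + bound_b[2] > 0)
        counts[2 * level_a + level_b] += 1
    return [count / n_samples for count in counts]


# Hypothetical stimulus centered on the intersection of two orthogonal bounds.
row = predicted_row(mean=(0.0, 0.0),
                    cov=[[1.0, 0.0], [0.0, 1.0]],
                    bound_a=(1.0, 0.0, 0.0),
                    bound_b=(0.0, 1.0, 0.0))
print(row)
```

A sampling estimate of this kind is noisy and slow compared with deterministic integration, so it is better suited to checking an implementation than to model fitting.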

The second step is to compute the log of the likelihood function for participant k:

$$ \log {L}_k=\sum_{i=1}^{n}\sum_{j=1}^{n}{r}_{ij} \log P\left({R}_j \mid {S}_i\right) $$
(A4)

These log-likelihoods are then summed across all participants:

$$ \log L={\displaystyle \sum_{k=1}^N \log {L}_k} $$
(A5)

The maximum likelihood estimates of the parameters in a GRT-wIND model are those that maximize the expression in Equation A5.
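
The two sums in Equations A4 and A5 can be written directly in code. In this minimal sketch the confusion matrices and predicted probabilities are hypothetical 2 × 2 examples, and cells with zero observed frequency are skipped because they contribute nothing to the sum.

```python
import math


def log_likelihood(confusion, predicted):
    """Equation A4: sum of r_ij * log P(R_j | S_i) over all cells."""
    return sum(r * math.log(p)
               for row_r, row_p in zip(confusion, predicted)
               for r, p in zip(row_r, row_p)
               if r > 0)


def group_log_likelihood(confusions, predictions):
    """Equation A5: sum of the participants' log-likelihoods."""
    return sum(log_likelihood(c, p) for c, p in zip(confusions, predictions))


# Hypothetical data: observed frequencies and model-predicted probabilities.
confusion = [[8, 2], [1, 9]]
predicted = [[0.8, 0.2], [0.1, 0.9]]

log_l_k = log_likelihood(confusion, predicted)
log_l = group_log_likelihood([confusion, confusion], [predicted, predicted])
print(log_l_k, log_l)
```

In an actual fit, an optimizer would adjust the model parameters, recompute the predicted probabilities, and repeat until the summed log-likelihood stops increasing.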

Statistical tests of independence with GRT-wIND

The large number of parameters in a GRT-wIND model makes the computational cost of using likelihood ratio tests and model selection procedures prohibitive. Thus, we recommend a deviation from the custom of computing such tests in GRT analyses. The strategy used here consists of fitting the full GRT-wIND model and testing maximum-likelihood parameter estimates against expected values from null hypotheses using a Wald test (Wald 1943).

Let \( \underset{\bar{\mkern6mu}}{\widehat{\theta}} \) be a column vector containing the maximum likelihood parameter estimates. The Wald test can be used to test any null hypothesis that can be expressed in the form of linear restrictions on \( \underset{\bar{\mkern6mu}}{\widehat{\theta}} \):

$$ \begin{array}{c} {H}_0:\mathbf{R}\underset{\bar{\mkern6mu}}{\widehat{\theta}}-\underset{\bar{\mkern6mu}}{q}=0 \\ {H}_1:\mathbf{R}\underset{\bar{\mkern6mu}}{\widehat{\theta}}-\underset{\bar{\mkern6mu}}{q}\ne 0 \end{array} $$

where R is a matrix with number of columns equal to the number of parameters and number of rows equal to the number of restrictions being tested, and \( \underset{\bar{\mkern6mu}}{q} \) is a column vector with number of rows equal to the number of restrictions being tested. For example, if we wanted to test the hypothesis that \( {\widehat{\theta}}_1=0 \), then R would have a single row (we are testing a single restriction) with a +1 in the first cell of that row and zeros in all other cells, while \( \underset{\bar{\mkern6mu}}{q} \) would have a single cell with a zero in it. If we want to additionally test the hypothesis that \( {\widehat{\theta}}_2-{\widehat{\theta}}_3=10 \), then we would add a second row to R with a +1 in the second column (corresponding to \( +{\widehat{\theta}}_2 \)) and –1 in the third column (corresponding to \( -{\widehat{\theta}}_3 \)), while \( \underset{\bar{\mkern6mu}}{q} \) would now have a second cell with the value 10 in it.
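
The worked example in the preceding paragraph can be written out as follows. The sketch assumes a model with four parameters and uses hypothetical estimates; only the construction of R and q follows the text.

```python
def restriction_residuals(R, q, theta_hat):
    """Evaluate R * theta_hat - q, one value per restriction."""
    return [sum(r * t for r, t in zip(row, theta_hat)) - q_i
            for row, q_i in zip(R, q)]


n_params = 4  # hypothetical number of free parameters

# Two restrictions: theta_1 = 0 and theta_2 - theta_3 = 10.
R = [[0.0] * n_params for _ in range(2)]
R[0][0] = 1.0    # +1 in the first column for theta_1
R[1][1] = 1.0    # +1 in the second column for theta_2
R[1][2] = -1.0   # -1 in the third column for theta_3
q = [0.0, 10.0]

theta_hat = [0.2, 12.5, 3.0, 7.0]  # hypothetical estimates
residuals = restriction_residuals(R, q, theta_hat)
print(residuals)
```

The residuals measure how far the estimates are from satisfying each restriction; the Wald statistic weighs them by their estimated sampling variability.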

Null hypotheses are tested using the Wald statistic:

$$ W={\left[\mathbf{R}\underset{\bar{\mkern6mu}}{\widehat{\theta}}-\underset{\bar{\mkern6mu}}{q}\right]}^{\mathrm{T}}{\left[\mathbf{R}{\boldsymbol{\Sigma}}_{\underset{\bar{\mkern6mu}}{\widehat{\theta}}}{\mathbf{R}}^{\mathrm{T}}\right]}^{-1}\left[\mathbf{R}\underset{\bar{\mkern6mu}}{\widehat{\theta}}-\underset{\bar{\mkern6mu}}{q}\right] $$
(A6)

where \( {[\,]}^{\mathrm{T}} \) represents matrix transpose. Under the null hypothesis, the statistic W asymptotically follows a chi-squared distribution with degrees of freedom equal to the number of restrictions being tested (the length of \( \underset{\bar{\mkern6mu}}{q} \)). Computing W requires the covariance matrix of the maximum likelihood estimates, which can be estimated using the Hessian of the negative log-likelihood function at the solution:

$$ {\boldsymbol{\Sigma}}_{\underset{\bar{\mkern6mu}}{\widehat{\theta}}}=H{\left(\underset{\bar{\mkern6mu}}{\widehat{\theta}}\right)}^{-1} $$
(A7)

Usually the Hessian in Eq. A7 can be obtained from the same optimization software that is used to obtain the parameter estimates that maximize the log-likelihood, but better estimates are obtained from numerical differentiation software. In this study, we used the DERIVEST suite (D’Errico 2006) to obtain estimates of the Hessian.
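
The chain from Hessian to covariance matrix to Wald statistic can be sketched end to end. The example below does not use DERIVEST or a GRT-wIND likelihood; as stated assumptions, it substitutes a plain central finite-difference Hessian and a toy likelihood with two independent binomial proportions, and it tests the single restriction that the two proportions are equal. All function names and data values are hypothetical.

```python
import math


def neg_log_lik(theta, data):
    """Toy negative log-likelihood: one binomial proportion per parameter."""
    return -sum(k * math.log(p) + (n - k) * math.log(1.0 - p)
                for p, (k, n) in zip(theta, data))


def hessian(f, x, h=1e-4):
    """Central finite-difference approximation to the Hessian of f at x."""
    def shifted(i, si, j, sj):
        y = list(x)
        y[i] += si * h
        y[j] += sj * h
        return f(y)
    d = len(x)
    return [[(shifted(i, 1, j, 1) - shifted(i, 1, j, -1)
              - shifted(i, -1, j, 1) + shifted(i, -1, j, -1)) / (4.0 * h * h)
             for j in range(d)] for i in range(d)]


def solve(a, b):
    """Solve a X = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(a[i]) + list(b[i]) for i in range(n)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(n):
            if r != c:
                factor = m[r][c] / m[c][c]
                m[r] = [u - factor * v for u, v in zip(m[r], m[c])]
    return [[m[i][j] / m[i][i] for j in range(n, len(m[i]))] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def wald(theta_hat, cov, R, q):
    """Wald statistic of Equation A6 for the restrictions R theta = q."""
    diff = [[sum(r * t for r, t in zip(row, theta_hat)) - q_i]
            for row, q_i in zip(R, q)]
    R_T = [list(col) for col in zip(*R)]
    middle = matmul(matmul(R, cov), R_T)
    weighted = solve(middle, diff)
    return sum(d[0] * w[0] for d, w in zip(diff, weighted))


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable, integer df >= 1."""
    half = x / 2.0
    if df % 2:
        k, tail = 1, math.erfc(math.sqrt(half))
    else:
        k, tail = 2, math.exp(-half)
    while k < df:
        tail += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return tail


data = [(60, 100), (45, 100)]                 # hypothetical (successes, trials)
theta_hat = [k / n for k, n in data]          # maximum likelihood estimates
H = hessian(lambda t: neg_log_lik(t, data), theta_hat)
cov = solve(H, [[1.0, 0.0], [0.0, 1.0]])      # Equation A7: inverse Hessian
W = wald(theta_hat, cov, R=[[1.0, -1.0]], q=[0.0])
p_value = chi2_sf(W, df=1)
print(W, p_value)
```

With fixed-step finite differences the accuracy of the Hessian depends on the step size, which is one reason an adaptive differentiation routine may be preferable in practice.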

For the 2 × 2 identification design used here, the restrictions imposed on the model by perceptual separability of dimension A from dimension B are the following:

$$ \begin{array}{ccc} {\mu}_{A_1{B}_21} & = & 0 \\ {\sigma}_{A_1{B}_21} & = & 1 \\ {\mu}_{A_2{B}_11}-{\mu}_{A_2{B}_21} & = & 0 \\ {\sigma}_{A_2{B}_11}-{\sigma}_{A_2{B}_21} & = & 0 \end{array} $$

The restrictions imposed in the model by perceptual separability of dimension B from dimension A are the following:

$$ \begin{array}{ccc} {\mu}_{A_2{B}_12} & = & 0 \\ {\sigma}_{A_2{B}_12} & = & 1 \\ {\mu}_{A_1{B}_22}-{\mu}_{A_2{B}_22} & = & 0 \\ {\sigma}_{A_1{B}_22}-{\sigma}_{A_2{B}_22} & = & 0 \end{array} $$

The restrictions imposed in the model by perceptual independence in each of the perceptual distributions are the following:

$$ \begin{array}{ccc}\hfill {\rho}_{A_1{B}_1}\hfill & \hfill =\hfill & \hfill 0\hfill \\ {}\hfill {\rho}_{A_1{B}_2}\hfill & \hfill =\hfill & \hfill 0\hfill \\ {}\hfill {\rho}_{A_2{B}_1}\hfill & \hfill =\hfill & \hfill 0\hfill \\ {}\hfill {\rho}_{A_2{B}_2}\hfill & \hfill =\hfill & \hfill 0\hfill \end{array} $$

The Wald test allows tests of decisional separability for the whole group or for each participant individually. Here, we focus on the latter kind of test. Testing whether decisional separability of dimension A from dimension B holds in participant k involves a single restriction:

$$ {b}_{A_{k2}}=0 $$

Testing whether decisional separability of dimension B from dimension A holds in participant k involves the following restriction:

$$ {b}_{B_{k1}}=0 $$

Model fit and selection in the traditional GRT approach

To fit any GRT model to data (e.g., the models in the hierarchy shown in Fig. 5), the confusion matrix from a single participant is used to find the values of the free parameters that maximize Eq. A5.

A popular method to test assumptions about independence and separability is to fit a restricted and an unrestricted version of the model to data. The restricted model contains a number of parameters that are set to values reflecting the assumption under test. For example, testing perceptual independence would require setting all ρ parameters to zero. The same parameters would be free to vary in the unrestricted model. Once both models are fit to data, the likelihood of the data at the solutions (\( L_U \) and \( L_R \) for the unrestricted and restricted versions, respectively) can be used to run a likelihood ratio test, by computing the following statistic:

$$ \varDelta =-2\left( \log {L}_R- \log {L}_U\right), $$
(A8)

which asymptotically follows a chi-squared distribution with degrees of freedom equal to the difference in number of free parameters between the two models.
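
A minimal sketch of the likelihood ratio test follows. The log-likelihood values and the number of restricted parameters are hypothetical, and the chi-squared tail probability is computed with a small helper that assumes integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable, integer df >= 1."""
    half = x / 2.0
    if df % 2:
        k, tail = 1, math.erfc(math.sqrt(half))
    else:
        k, tail = 2, math.exp(-half)
    while k < df:
        tail += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return tail


def likelihood_ratio_test(log_l_restricted, log_l_unrestricted, df):
    """Return the statistic of Equation A8 and its p-value."""
    delta = -2.0 * (log_l_restricted - log_l_unrestricted)
    return delta, chi2_sf(delta, df)


# Hypothetical fits: the restricted model fixes two parameters.
delta, p_value = likelihood_ratio_test(-105.0, -100.0, df=2)
print(delta, p_value)
```

A small p-value indicates that the restricted model fits significantly worse than the unrestricted one, which counts as evidence against the assumption under test.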

The likelihood ratio test can only be applied to select between two nested models. To select between two non-nested models, it is possible to use the Akaike information criterion (AIC, Akaike 1974) for model comparison. Here we use a version of AIC corrected for a bias problem present when the number of data points is small compared to the number of free parameters (see Burnham and Anderson 2004):

$$ AI{C}_C=-2 \log L+2m+\frac{2m\left(m+1\right)}{{n}^2-m-1}, $$
(A9)

where m is the number of free parameters in the model and \( n^2 \) is the number of cells in the confusion matrix. The first two terms in Eq. A9 correspond to the traditional definition of AIC and the last term corresponds to the correction factor. A smaller value of \( AIC_C \) indicates a better model once both fit and number of free parameters are taken into account.
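
The corrected criterion is simple to compute once a model has been fit. The sketch below assumes a 2 × 2 design, so the confusion matrix has 16 cells, and uses a hypothetical log-likelihood and parameter count.

```python
def aic_c(log_l, m, n_cells):
    """Corrected AIC for a model with m free parameters and n_cells data cells."""
    denominator = n_cells - m - 1
    if denominator <= 0:
        raise ValueError("the correction requires n_cells > m + 1")
    aic = -2.0 * log_l + 2.0 * m
    return aic + 2.0 * m * (m + 1) / denominator


# Hypothetical fit: log-likelihood of -100 with 5 free parameters.
value = aic_c(log_l=-100.0, m=5, n_cells=16)
print(value)  # 216.0
```

The correction term grows quickly as the number of free parameters approaches the number of cells, which penalizes heavily parameterized models in small designs.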

In the present study, as in previous model-based applications of GRT (e.g., Ashby and Lee 1991; Ashby et al. 2001; Fitousi and Wenger 2013; Thomas 2001), a hierarchy of models was fit to the data from each participant (see Fig. 5). The procedure starts at the top of the hierarchy and compares nested models through likelihood ratio tests until the test results in a non-significant increase in fit. If more than one candidate model survives this process, the model with the smallest \( AIC_C \) is selected.


Cite this article

Soto, F.A., Vucovich, L., Musgrave, R. et al. General recognition theory with individual differences: a new method for examining perceptual and decisional interactions with an application to face perception. Psychon Bull Rev 22, 88–111 (2015). https://doi.org/10.3758/s13423-014-0661-y


  • DOI: https://doi.org/10.3758/s13423-014-0661-y
