Generalization and similarity in exemplar models of categorization: Insights from machine learning

  • Theoretical and Review Articles
Psychonomic Bulletin & Review

Abstract

Exemplar theories of categorization depend on similarity for explaining subjects’ ability to generalize to new stimuli. A major criticism of exemplar theories concerns their lack of abstraction mechanisms and thus, seemingly, of generalization ability. Here, we use insights from machine learning to demonstrate that exemplar models can actually generalize very well. Kernel methods in machine learning are akin to exemplar models and are very successful in real-world applications. Their generalization performance depends crucially on the chosen similarity measure. Although similarity plays an important role in describing generalization behavior, it is not the only factor that controls generalization performance. In machine learning, kernel methods are often combined with regularization techniques in order to ensure good generalization. These same techniques are easily incorporated in exemplar models. We show that the generalized context model (Nosofsky, 1986) and ALCOVE (Kruschke, 1992) are closely related to a statistical model called kernel logistic regression. We argue that generalization is central to the enterprise of understanding categorization behavior, and we suggest some ways in which insights from machine learning can offer guidance.
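To make the claimed relationship concrete, here is a minimal sketch in plain Python. It is not taken from the article: the function names, the toy data, the exponential similarity with Euclidean distance, and all parameter values (`c`, `lam`, `lr`, `steps`) are assumptions chosen for illustration. The first function is a stripped-down summed-similarity choice rule in the spirit of the generalized context model (no attention weights or response bias); the second fits one weight per stored exemplar by regularized kernel logistic regression, using the same similarity function as the kernel.

```python
import math

def similarity(x, y, c=1.0):
    """Exponential similarity exp(-c * d), with d the Euclidean distance (assumed form)."""
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return math.exp(-c * d)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gcm_prob_a(probe, exemplars, labels, c=1.0):
    """Summed similarity to category-A exemplars over summed similarity to all exemplars."""
    sims = [similarity(probe, e, c) for e in exemplars]
    sim_a = sum(s for s, y in zip(sims, labels) if y == 1)
    return sim_a / sum(sims)

def fit_klr(exemplars, labels, c=1.0, lam=0.1, lr=0.5, steps=2000):
    """Kernel logistic regression with one weight per exemplar.

    Minimizes mean logistic loss + (lam / 2) * alpha' K alpha by gradient descent.
    """
    n = len(exemplars)
    K = [[similarity(a, b, c) for b in exemplars] for a in exemplars]
    alpha = [0.0] * n
    for _ in range(steps):
        f = [sum(K[i][j] * alpha[j] for j in range(n)) for i in range(n)]
        # Gradient is K @ r, where r combines the residual and the penalty term.
        r = [(sigmoid(f[i]) - labels[i]) / n + lam * alpha[i] for i in range(n)]
        grad = [sum(K[i][j] * r[j] for j in range(n)) for i in range(n)]
        alpha = [a - lr * g for a, g in zip(alpha, grad)]
    return alpha

def klr_prob_a(probe, exemplars, alpha, c=1.0):
    """Probability of category A: logistic function of the weighted summed similarity."""
    return sigmoid(sum(a * similarity(probe, e, c) for a, e in zip(alpha, exemplars)))

# Hypothetical toy data: two exemplars per category in a 2-D stimulus space.
exemplars = [(0.0, 0.0), (0.0, 1.0), (3.0, 3.0), (3.0, 4.0)]
labels = [1, 1, 0, 0]  # 1 = category A, 0 = category B
alpha = fit_klr(exemplars, labels)
probe = (0.2, 0.5)
print(gcm_prob_a(probe, exemplars, labels), klr_prob_a(probe, exemplars, alpha))
```

Both rules respond to a new stimulus through its similarity to stored exemplars. In this sketch the regularization constant `lam` limits how large the exemplar weights can grow, which is one way the abstract's point about regularization controlling generalization could be explored.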

References

  • Aizerman, M. A., Braverman, E. M., & Rozonoer, L. I. (1964). The probability problem of pattern recognition learning and the method of potential functions. Automation & Remote Control, 25, 1175–1190.

  • Alfonso-Reese, L. A., Ashby, F. G., & Brainard, D. H. (2002). What makes a categorization task difficult? Perception & Psychophysics, 64, 570–583.

  • Ashby, F. G., & Alfonso-Reese, L. A. (1995). Categorization as probability density estimation. Journal of Mathematical Psychology, 39, 216–233.

  • Ashby, F. G., & Gott, R. E. (1988). Decision rules in the perception and categorization of multidimensional stimuli. Journal of Experimental Psychology: Learning, Memory, & Cognition, 14, 33–53.

  • Ashby, F. G., & Maddox, W. T. (1992). Complex decision rules in categorization: Contrasting novice and experienced performance. Journal of Experimental Psychology: Human Perception & Performance, 18, 50–71.

  • Ashby, F. G., & Maddox, W. T. (1993). Relations between prototype, exemplar, and decision bound models of categorization. Journal of Mathematical Psychology, 37, 372–400.

  • Ashby, F. G., Waldron, E. M., Lee, W. W., & Berkman, A. (2001). Suboptimality in human categorization and identification. Journal of Experimental Psychology: General, 130, 77–96.

  • Beals, R., Krantz, D. H., & Tversky, A. (1968). Foundations of multidimensional scaling. Psychological Review, 75, 127–142.

  • Bishop, C. M. (1995). Neural networks for pattern recognition. Oxford: Oxford University Press, Clarendon Press.

  • Bousquet, O., & Elisseeff, A. (2002). Stability and generalization. Journal of Machine Learning Research, 2, 499–526.

  • Bradley, R. A. (1976). Science, statistics, and paired comparisons. Biometrics, 32, 213–239.

  • Briscoe, E., & Feldman, J. (2006). Conceptual complexity and the bias-variance tradeoff. In R. Sun, N. Miyake, & C. Schunn (Eds.), Proceedings of the 28th Annual Conference of the Cognitive Science Society (pp. 1038–1043). Mahwah, NJ: Erlbaum.

  • Brown, J. S. (1965). Generalization and discrimination. In D. I. Mostofsky (Ed.), Stimulus generalization (pp. 7–23). Stanford: Stanford University Press.

  • Bülthoff, H. H., & Edelman, S. (1992). Psychophysical support for a two-dimensional view interpolation theory of object recognition. Proceedings of the National Academy of Sciences, 89, 60–64.

  • Bush, R. R., & Mosteller, F. (1951). A model for stimulus generalization and discrimination. Psychological Review, 58, 413–423.

  • Chater, N., & Vitányi, P. M. B. (2003). The generalized universal law of generalization. Journal of Mathematical Psychology, 47, 346–369.

  • Cristianini, N., & Schölkopf, B. (2002). Support vector machines and kernel methods: The new generation of learning machines. AI Magazine, 23(3), 31–42.

  • David, H. A. (1988). The method of paired comparisons (2nd ed.). London: Griffin.

  • Fass, D., & Feldman, J. (2003). Categorization under complexity: A unified MDL account of human learning of regular and irregular categories. In S. Becker, S. Thrun, & K. Obermayer (Eds.), Advances in neural information processing systems 15 (pp. 35–42). Cambridge, MA: MIT Press.

  • Feldman, J. (2000). Minimization of Boolean complexity in human concept learning. Nature, 407, 630–633.

  • Fried, L. S., & Holyoak, K. J. (1984). Induction of category distributions: A framework for classification learning. Journal of Experimental Psychology: Learning, Memory, & Cognition, 10, 234–257.

  • Garner, W. R. (1974). The processing of information and structure. Potomac, MD: Erlbaum.

  • Ghirlanda, S., & Enquist, M. (2003). A century of generalization. Animal Behaviour, 66, 15–36.

  • Graf, A. B. A., & Wichmann, F. A. (2004). Insights from machine learning applied to human visual classification. In S. Thrun, L. K. Saul, & B. Schölkopf (Eds.), Advances in neural information processing systems 16 (pp. 905–912). Cambridge, MA: MIT Press.

  • Graf, A. B. A., Wichmann, F. A., Bülthoff, H. H., & Schölkopf, B. (2006). Classification of faces in man and machine. Neural Computation, 18, 143–165.

  • Hastie, T., Tibshirani, R., & Friedman, J. (2001). The elements of statistical learning: Data mining, inference, and prediction. New York: Springer.

  • Jäkel, F., Schölkopf, B., & Wichmann, F. A. (2007). A tutorial on kernel methods for categorization. Journal of Mathematical Psychology, 51, 343–358.

  • Jäkel, F., Schölkopf, B., & Wichmann, F. A. (2008). Similarity, kernels, and the triangle inequality. Manuscript submitted for publication.

  • Kruschke, J. K. (1992). ALCOVE: An exemplar-based connectionist model of category learning. Psychological Review, 99, 22–44.

  • Lamberts, K. (1994). Flexible tuning of similarity in exemplar-based categorization. Journal of Experimental Psychology: Learning, Memory, & Cognition, 20, 1003–1021.

  • Logothetis, N. K., Pauls, J., Bülthoff, H. H., & Poggio, T. (1994). View-dependent object recognition by monkeys. Current Biology, 4, 401–414.

  • Logothetis, N. K., Pauls, J., & Poggio, T. (1995). Shape representation in the inferior temporal cortex of monkeys. Current Biology, 5, 552–563.

  • Love, B. C., Medin, D. L., & Gureckis, T. M. (2004). SUSTAIN: A network model of category learning. Psychological Review, 111, 309–332.

  • Luce, R. D. (1959). Individual choice behavior: A theoretical analysis. New York: Wiley.

  • Luce, R. D. (1963). Detection and recognition. In R. D. Luce, R. R. Bush, & E. Galanter (Eds.), Handbook of mathematical psychology (Vol. 1, pp. 103–189). New York: Wiley.

  • Luce, R. D. (1977). The choice axiom after twenty years. Journal of Mathematical Psychology, 15, 215–233.

  • McKinley, S. C., & Nosofsky, R. M. (1995). Investigations of exemplar and decision bound models in large, ill-defined category structures. Journal of Experimental Psychology: Human Perception & Performance, 21, 128–148.

  • McKinley, S. C., & Nosofsky, R. M. (1996). Selective attention and the formation of linear decision boundaries. Journal of Experimental Psychology: Human Perception & Performance, 22, 294–317.

  • Medin, D. L., Goldstone, R. L., & Gentner, D. (1993). Respects for similarity. Psychological Review, 100, 254–278.

  • Medin, D. L., & Schaffer, M. M. (1978). Context theory of classification learning. Psychological Review, 85, 207–238.

  • Mostofsky, D. I. (Ed.) (1965). Stimulus generalization. Stanford: Stanford University Press.

  • Navarro, D. J. (2002). Representing stimulus similarity. Unpublished doctoral dissertation, University of Adelaide, Adelaide, Australia.

  • Navarro, D. J. (2007). On the interaction between exemplar-based concepts and a response scaling process. Journal of Mathematical Psychology, 51, 85–98.

  • Nosofsky, R. M. (1986). Attention, similarity, and the identification-categorization relationship. Journal of Experimental Psychology: General, 115, 39–57.

  • Nosofsky, R. M. (1987). Attention and learning processes in the identification and categorization of integral stimuli. Journal of Experimental Psychology: Learning, Memory, & Cognition, 13, 87–108.

  • Nosofsky, R. M. (1990). Relations between exemplar-similarity and likelihood models of classification. Journal of Mathematical Psychology, 34, 393–418.

  • Nosofsky, R. M. (1991a). Tests of an exemplar model for relating perceptual classification and recognition memory. Journal of Experimental Psychology: Human Perception & Performance, 17, 3–27.

  • Nosofsky, R. M. (1991b). Typicality in logically defined categories: Exemplar-similarity versus rule instantiation. Memory & Cognition, 19, 131–150.

  • Nosofsky, R. M. (1992). Exemplar-based approach to relating categorization, identification, and recognition. In F. G. Ashby (Ed.), Multidimensional models of perception and cognition (pp. 363–393). Hillsdale, NJ: Erlbaum.

  • Nosofsky, R. M., & Zaki, S. R. (2002). Exemplar and prototype models revisited: Response strategies, selective attention, and stimulus generalization. Journal of Experimental Psychology: Learning, Memory, & Cognition, 28, 924–940.

  • Ohl, F. W., Scheich, H., & Freeman, W. J. (2001). Change in the pattern of ongoing cortical activity with auditory category learning. Nature, 412, 733–736.

  • Op de Beeck, H., Wagemans, J., & Vogels, R. (2001). Inferotemporal neurons represent low-dimensional configurations of parameterized shapes. Nature Neuroscience, 4, 1244–1252.

  • Op de Beeck, H., Wagemans, J., & Vogels, R. (2004). A diverse stimulus representation underlies shape categorization by primates (Abstract). Journal of Vision, 4(8), 518a.

  • Orr, G. B., & Müller, K.-R. (Eds.) (1998). Neural networks: Tricks of the trade. Berlin: Springer.

  • Palmeri, T. J., & Gauthier, I. (2004). Visual object understanding. Nature Reviews Neuroscience, 5, 291–303.

  • Pitt, M. A., Myung, I. J., & Zhang, S. (2002). Toward a method of selecting among computational models of cognition. Psychological Review, 109, 472–491.

  • Poggio, T. (1990). A theory of how the brain might work. Cold Spring Harbor Symposia on Quantitative Biology, 55, 899–910.

  • Poggio, T., & Bizzi, E. (2004). Generalization in vision and motor control. Nature, 431, 768–774.

  • Poggio, T., & Edelman, S. (1990). A network that learns to recognize three-dimensional objects. Nature, 343, 263–266.

  • Poggio, T., & Girosi, F. (1989). A theory of networks for approximation and learning (A.I. Memo No. 1140). Cambridge, MA: MIT Artificial Intelligence Laboratory and Center for Biological Information Processing, Whitaker College.

  • Poggio, T., Rifkin, R., Mukherjee, S., & Niyogi, P. (2004). General conditions for predictivity in learning theory. Nature, 428, 419–422.

  • Poggio, T., & Smale, S. (2003). The mathematics of learning: Dealing with data. Notices of the American Mathematical Society, 50, 537–544.

  • Posner, M. I., & Keele, S. W. (1968). On the genesis of abstract ideas. Journal of Experimental Psychology, 77, 353–363.

  • Reed, S. K. (1972). Pattern recognition and categorization. Cognitive Psychology, 3, 382–407.

  • Riesenhuber, M., & Poggio, T. (1999). Hierarchical models of object recognition in cortex. Nature Neuroscience, 2, 1019–1025.

  • Rosch, E., & Mervis, C. B. (1975). Family resemblances: Studies in the internal structure of categories. Cognitive Psychology, 7, 573–605.

  • Rosch, E., Mervis, C. B., Gray, W. D., Johnson, D. M., & Boyes-Braem, P. (1976). Basic objects in natural categories. Cognitive Psychology, 8, 382–439.

  • Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65, 386–408.

  • Rosseel, Y. (2002). Mixture models of categorization. Journal of Mathematical Psychology, 46, 178–210.

  • Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning internal representations by error propagation. In D. E. Rumelhart, J. L. McClelland, & the PDP Research Group (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition (Vol. 1, pp. 318–362). Cambridge, MA: MIT Press, Bradford Books.

  • Schoenberg, I. J. (1938). Metric spaces and positive definite functions. Transactions of the American Mathematical Society, 44, 522–536.

  • Schölkopf, B., & Smola, A. J. (2002). Learning with kernels: Support vector machines, regularization, optimization, and beyond. Cambridge, MA: MIT Press.

  • Shepard, R. N. (1957). Stimulus and response generalization: A stochastic model relating generalization to distance in psychological space. Psychometrika, 22, 325–345.

  • Shepard, R. N. (1958). Stimulus and response generalization: Deduction of the generalization gradient from a trace model. Psychological Review, 65, 242–256.

  • Shepard, R. N. (1962). The analysis of proximities: Multidimensional scaling with an unknown distance function. Part I. Psychometrika, 27, 125–140.

  • Shepard, R. N. (1964). Attention and the metric structure of the stimulus space. Journal of Mathematical Psychology, 1, 54–87.

  • Shepard, R. N. (1965). Approximation to uniform gradients of generalization by monotone transformations of scale. In D. I. Mostofsky (Ed.), Stimulus generalization (pp. 94–110). Stanford: Stanford University Press.

  • Shepard, R. N. (1987). Toward a universal law of generalization for psychological science. Science, 237, 1317–1323.

  • Shepard, R. N., & Chang, J.-J. (1963). Stimulus generalization in the learning of classifications. Journal of Experimental Psychology, 65, 94–102.

  • Shepard, R. N., Hovland, C. I., & Jenkins, H. M. (1961). Learning and memorization of classifications. Psychological Monographs, 75(13, Whole No. 517), 1–42.

  • Sigala, N., Gabbiani, F., & Logothetis, N. K. (2002). Visual categorization and object representation in monkeys and humans. Journal of Cognitive Neuroscience, 14, 187–198.

  • Sigala, N., & Logothetis, N. K. (2002). Visual categorization shapes feature selectivity in the primate temporal cortex. Nature, 415, 318–320.

  • Smith, J. D., & Minda, J. P. (1998). Prototypes in the mist: The early epochs of category learning. Journal of Experimental Psychology: Learning, Memory, & Cognition, 24, 1411–1436.

  • Smith, J. D., & Minda, J. P. (2000). Thirty categorization results in search of a model. Journal of Experimental Psychology: Learning, Memory, & Cognition, 26, 3–27.

  • Spence, K. W. (1937). The differential response in animals to stimuli varying within a single dimension. Psychological Review, 44, 430–444.

  • Tenenbaum, J. B., & Griffiths, T. L. (2001). Generalization, similarity and Bayesian inference. Behavioral & Brain Sciences, 24, 629–640.

  • Train, K. E. (2003). Discrete choice methods with simulation. Cambridge: Cambridge University Press.

  • Tversky, A. (1977). Features of similarity. Psychological Review, 84, 327–352.

  • Tversky, A., & Gati, I. (1982). Similarity, separability, and the triangle inequality. Psychological Review, 89, 123–154.

  • Vapnik, V. N. (2000). The nature of statistical learning theory (2nd ed.). New York: Springer.

  • Verguts, T., Ameel, E., & Storms, G. (2004). Measures of similarity in models of categorization. Memory & Cognition, 32, 379–389.

  • Wichmann, F. A., Graf, A. B. A., Simoncelli, E. P., Bülthoff, H. H., & Schölkopf, B. (2005). Machine learning applied to perception: Decision images for gender classification. In L. K. Saul, Y. Weiss, & L. Bottou (Eds.), Advances in neural information processing systems 17 (pp. 1489–1496). Cambridge, MA: MIT Press.

Author information

Correspondence to Frank Jäkel.

Cite this article

Jäkel, F., Schölkopf, B. & Wichmann, F.A. Generalization and similarity in exemplar models of categorization: Insights from machine learning. Psychonomic Bulletin & Review 15, 256–271 (2008). https://doi.org/10.3758/PBR.15.2.256

  • DOI: https://doi.org/10.3758/PBR.15.2.256
