Elsevier

Neural Networks

Volume 37, January 2013, Pages 52-65
Essentials of the self-organizing map

https://doi.org/10.1016/j.neunet.2012.09.018

Abstract

The self-organizing map (SOM) is an automatic data-analysis method. It is widely applied to clustering problems and data exploration in industry, finance, natural sciences, and linguistics. The most extensive applications, exemplified in this paper, can be found in the management of massive textual databases and in bioinformatics. The SOM is related to the classical vector quantization (VQ), which is used extensively in digital signal processing and transmission. As in VQ, the SOM represents a distribution of input data items using a finite set of models. In the SOM, however, these models are automatically associated with the nodes of a regular (usually two-dimensional) grid in an orderly fashion: more similar models become associated with nodes that are adjacent in the grid, whereas less similar models are situated farther away from each other in the grid. This organization, a kind of similarity diagram of the models, makes it possible to gain insight into the topographic relationships of data, especially of high-dimensional data items. If the data items belong to certain predetermined classes, the models (and the nodes) can be calibrated according to these classes. An unknown input item is then classified according to the node whose model is most similar to it in the metric used in the construction of the SOM. A new finding introduced in this paper is that an input item can be represented even more accurately by a linear mixture of a few best-matching models. This is made possible by a least-squares fitting procedure in which the coefficients of the linear mixture of models are constrained to nonnegative values.

Section snippets

Brain maps

It has been known for over a hundred years that various cortical areas of the brain are specialized to different modalities of cognitive functions. However, it was not until the work of, e.g., Mountcastle (1957) and Hubel and Wiesel (1962) that certain single neural cells in the brain were found to respond selectively to some specific sensory stimuli. These cells often form local assemblies, in which their topographic location corresponds to some feature value of a specific stimulus in an orderly fashion.

The classical vector quantization (VQ)

The implementation of optimally tuned feature-sensitive filters by competitive learning was actually demonstrated in abstract form much earlier in signal processing, namely in the classical vector quantization (VQ), the basic idea of which was introduced (in scalar form) by Lloyd (1957) and (in vector form) by Forgy (1965). In fact, the optimal quantization of a vector space dates back to 1850; it is called the Dirichlet tessellation in two- and three-dimensional spaces and the Voronoi tessellation in

Motivation of the SOM

Around 1981–82 this author introduced a new, nonlinearly projecting mapping, called the self-organizing map (SOM), which otherwise resembles the VQ, but in which, additionally, the models (corresponding to the codebook vectors in the VQ) become spatially, globally ordered (Kohonen, 1982a, 1982b, 1990, 2001).

The SOM models are associated with the nodes of a regular, usually two-dimensional grid (Fig. 1). The SOM algorithm constructs the models such that:

More similar

The original, stepwise recursive SOM algorithm

The original formulation of the SOM algorithm resembles a gradient-descent procedure. It must be emphasized, however, that this version of the algorithm was introduced heuristically, in an attempt to realize the general learning principle given in Section 3.1. This basic form has not yet been shown to be derivable from any energy function. An approximate and purely formal, though not very rigorous, derivation follows from the stochastic approximation method (Robbins & Monro, 1951); it was applied
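
As a minimal sketch of the stepwise update discussed in this section, the Python code below uses the commonly cited form of the rule: find the best-matching model for an input x, then move every model m_i toward x by alpha * h(winner, i) * (x - m_i). The Gaussian neighborhood function, the linear decay of alpha and sigma, the grid size, and the name train_som are illustrative assumptions, not details taken from this section.

```python
import math
import random


def train_som(data, rows=5, cols=5, n_steps=2000, alpha0=0.5, sigma0=2.5, seed=0):
    """Stepwise SOM training sketch (hypothetical helper, standard library only).

    data: list of equal-length lists of floats.
    Returns (models, grid): one model vector per node and the node coordinates.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # Nodes of a regular two-dimensional grid, addressed by (row, column).
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    # Assumption: random initial models in [0, 1).
    models = [[rng.random() for _ in range(dim)] for _ in grid]

    for t in range(n_steps):
        x = rng.choice(data)
        frac = t / n_steps
        alpha = alpha0 * (1.0 - frac)            # assumed learning-rate schedule
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # assumed neighborhood radius

        # Winner: the node whose model is closest to x (squared Euclidean distance).
        winner = min(
            range(len(models)),
            key=lambda i: sum((m - v) ** 2 for m, v in zip(models[i], x)),
        )
        wr, wc = grid[winner]

        # Move every model toward x, weighted by its grid distance to the winner.
        for i, (r, c) in enumerate(grid):
            h = math.exp(-((r - wr) ** 2 + (c - wc) ** 2) / (2.0 * sigma ** 2))
            models[i] = [m + alpha * h * (v - m) for m, v in zip(models[i], x)]

    return models, grid
```

Because the neighborhood weight h depends only on distances in the grid, nodes near the winner are pulled toward the same inputs, which is what produces the spatial ordering of the models.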

Main application areas of the SOM

Before looking into the details, one may want to know what justifies the SOM method. Briefly, by the end of the year 2005 we had documented 7768 scientific publications that analyze, develop, or apply the SOM; cf. Kaski, Kangas et al. (1998), Oja, Kaski et al. (2003), and Pöllä et al. (2009). The following short list gives the main application areas:

  1. Statistical methods at large
     (a) exploratory data analysis
     (b) statistical analysis and organization of texts
  2. Industrial analyses, control,

Approximation of an input data item by a linear mixture of models

An analysis that has so far been largely unknown is introduced in this chapter; cf. also Kohonen (2007) and Kohonen (2008). The purpose is to extend the use of the SOM by showing that, instead of a single winner model, a set of several models can be used that together approximate the input data item more accurately. It should be emphasized that we do not mean k winners that are rank-ordered according to their matching. Instead, the input data item is
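
To illustrate the idea of a nonnegative linear mixture, the sketch below fits coefficients k_i >= 0 that minimize the squared error between an input x and the sum of k_i * m_i over the models m_i. It uses plain projected gradient descent as a simple stand-in for a dedicated nonnegative least-squares solver; the function name nonneg_mixture, the iteration count, and the toy vectors in the note that follows are assumptions made for illustration only.

```python
def nonneg_mixture(x, models, n_iter=2000):
    """Fit k >= 0 minimizing 0.5 * ||x - sum_i k[i] * models[i]||^2.

    Hypothetical helper: projected gradient descent, not a production solver.
    x: list of floats; models: list of lists with the same length as x.
    """
    n = len(models)
    # Gram matrix G[i][j] = <m_i, m_j> and correlations c[i] = <m_i, x>.
    gram = [[sum(a * b for a, b in zip(mi, mj)) for mj in models] for mi in models]
    corr = [sum(a * b for a, b in zip(mi, x)) for mi in models]

    # The gradient is G k - c. The trace of G bounds its largest eigenvalue,
    # so 1 / trace(G) is a safe (conservative) step size.
    step = 1.0 / max(sum(gram[i][i] for i in range(n)), 1e-12)

    k = [0.0] * n
    for _ in range(n_iter):
        grad = [sum(gram[i][j] * k[j] for j in range(n)) - corr[i] for i in range(n)]
        # Gradient step followed by projection onto the nonnegative orthant.
        k = [max(0.0, ki - step * gi) for ki, gi in zip(k, grad)]
    return k
```

With x = [1.0, 1.0] and the two models [1.0, 0.0] and [0.0, 1.0], either model alone leaves a squared error of 1, whereas the fitted mixture has both coefficients close to 1 and reproduces x.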

Discussion

The self-organizing map (SOM) principle has been used extensively as an analytical and visualization tool in exploratory data analysis. It has had plenty of practical applications ranging from industrial process control and finance analyses to the management of very large document collections. New, very promising applications exist in bioinformatics. The largest applications so far have been in the management and retrieval of textual documents, of which this paper contains two large-scale

Acknowledgments

The author is indebted to all of his collaborators who over the years have implemented the SOM program packages and applications. Dr. Merja Oja has kindly provided the picture and associated material about the more recent HERV studies.

References (95)

  • Anderberg, M. (1973). Cluster analysis for applications.
  • Bishop, C.M., et al. (1998). GTM: the generative topographic mapping. Neural Computation.
  • Cheng, Y. (1997). Convergence and ordering of Kohonen’s Batch map. Neural Computation.
  • Cottrell, M., et al. (1987). Étude d’un processus d’auto-organization. Annales de l’Institut Henri Poincaré.
  • Cottrell, M., Fort, J.C., & Pagés, G. (1997). Theoretical aspects of the SOM algorithm. In Proceedings of the WSOM 97,...
  • Deboeck, G., et al. (1998). Visual explorations in finance with self-organizing maps.
  • Deerwester, S., et al. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science.
  • Dirichlet, G.L. (1850). Über die Reduktion der positiven quadratischen Formen mit drei unbestimmten ganzen Zahlen. Journal für die Reine und Angewandte Mathematik.
  • Forgy, E.W. (1965). Cluster analysis of multivariate data: efficiency vs. interpretability of classifications. Biometrics.
  • Gersho, A. (1979). On the structure of vector quantizers. IEEE Transactions on Information Theory.
  • Gray, R.M. (1984). Vector quantization. IEEE ASSP Magazine.
  • Grossberg, S. (1976). On the development of feature detectors in the visual cortex with applications to learning and reaction–diffusion systems. Biological Cybernetics.
  • Hartigan, J. (1975). Clustering algorithms.
  • Heskes, T.M., et al. Error potential for self-organization.
  • Hubel, D.H., et al. (1962). Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. Journal of Physiology.
  • Jain, A.K., et al. (1988). Algorithms for clustering of data.
  • Kaski, S. Dimensionality reduction by random mapping.
  • Kaski, S., et al. (1998). Bibliography of self-organizing map (SOM) papers: 1981–1997. Neural Computing Surveys.
  • Kohonen, T. (1982). Self-organized formation of topologically correct feature maps. Biological Cybernetics.
  • Kohonen, T. Clustering, taxonomy, and topological maps of patterns.
  • Kohonen, T. (1989). Self-organization and associative memory.
  • Kohonen, T. (1990). The self-organizing map. Proceedings of the IEEE.
  • Kohonen, T. Emergence of invariant-feature detectors in self organization.
  • Kohonen, T. (1996). Emergence of invariant-feature detectors in the adaptive-subspace self organizing map. Biological Cybernetics.
  • Kohonen, T. (2001). Self-organizing maps.
  • Kohonen, T. (2005). Pointwise organizing projections. In Proceedings of the WSOM05, 5th workshop on self-organizing...
  • Kohonen, T. (2007). Description of input patterns by linear mixtures of SOM models. In WSOM 2007 CD-ROM proceedings,...
  • Kohonen, T. Data management by self-organizing maps.
  • Kohonen, T., Hynninen, J., Kangas, J., & Laaksonen, J. (1996). The self-organizing map program package, report A31....
  • Kohonen, T., et al. (2000). Self organization of a massive document collection. IEEE Transactions on Neural Networks.
  • Kohonen, T., et al. (1997). Self-organized formation of various invariant-feature filters in the adaptive-subspace SOM. Neural Computation.
  • Kohonen, T., et al. (1996). Engineering applications of the self-organizing map. Proceedings of the IEEE.
  • Kohonen, T., et al. Contextually self-organized maps of Chinese words.
  • Kruskal, J.B., et al.
  • Lagus, K., et al. Keyword selection method for characterizing text document maps.
  • Lampinen, J., et al. Self-organizing maps for spatial and temporal AR models.
  • Lawson, C.L., et al. (1974). Solving least-squares problems.