Skip to main content
Log in

The Earth Mover's Distance as a Metric for Image Retrieval

  • Published:
International Journal of Computer Vision Aims and scope Submit manuscript

Abstract

We investigate the properties of a metric between two distributions, the Earth Mover's Distance (EMD), for content-based image retrieval. The EMD is based on the minimal cost that must be paid to transform one distribution into the other, in a precise sense, and was first proposed for certain vision problems by Peleg, Werman, and Rom. For image retrieval, we combine this idea with a representation scheme for distributions that is based on vector quantization. This combination leads to an image comparison framework that often accounts for perceptual similarity better than other previously proposed methods. The EMD is based on a solution to the transportation problem from linear optimization, for which efficient algorithms are available, and also allows naturally for partial matching. It is more robust than histogram matching techniques, in that it can operate on variable-length representations of the distributions that avoid quantization and other binning problems typical of histograms. When used to compare distributions with the same overall mass, the EMD is a true metric. In this paper we focus on applications to color and texture, and we compare the retrieval performance of the EMD with that of other distances.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Similar content being viewed by others

References

  • Ahuja, R.K., Magnanti, T.L., and Orlin, J.B. 1993. Network Flows. Prentice Hall: Englewood Cliffs, NJ.

    Google Scholar 

  • Belongie, S., Carson, C., Greenspan, H., and Malik, J. 1998. Colorand texture-based image segmentation using EM and its application to content-based image retrieval. In IEEE International Conference on Computer Vision, Bombay, India. pp. 675–682.

  • Bentley, J.L. 1975. Multidimensional binary search trees used for associative searching. Communications of the ACM, 18:509–517.

    Google Scholar 

  • Bigün, J. and du Buf, J.M. 1994. N-folded symmetries by complex moments in Gabor space and their application to unsupervised texture segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(1):80–87.

    Google Scholar 

  • Bovik, A.C., Clark, M., and Geisler, W.S. 1990. Multichannel texture analysis using localized spatial filters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(12):55–73.

    Google Scholar 

  • Bozkaya, T. and Ozsoyoglu, M. 1997. Distance-based indexing for high-dimensional metric spaces. SIGMOD Record (ACM Special Interest Group on Management of Data), 26(2):357–368.

    Google Scholar 

  • Brodatz, P. 1966. Textures: A Photographic Album for Artists and Designers: Dover: New York, NY.

    Google Scholar 

  • Clarkson, K.L. 1997. Nearest neighbor queries in metric spaces. In ACM Symposium on the Theory of Computing, pp. 609–617.

  • Cover, T.M. and Thomas, J.A. 1991. Elements of Information Theory. Wiley Series in Telecommunications. John Wiley & Sons: New York, NY, USA.

    Google Scholar 

  • Das, M., Riseman, E.M., and Draper, B.A. 1997. FOCUS: Searching for multi-colored objects in a diverse image database. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 756–761.

  • Daugman, J.D. 1998. Complete discrete 2-d Gabor transforms by neural networks for image analysis and compression. IEEE Transactions on Acoustics, Speech, and Signal Processing, 36:1169–1179.

    Google Scholar 

  • Duda, R.O. and Hart, P.E. 1973. Pattern Classification and Scene Analysis. Wiley: New York.

    Google Scholar 

  • Farrokhnia, F. and Jain, A.K. 1991. A multi-channel filtering approach to texture segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 364–370.

  • Field, D.J. 1987. Relations between the statistics of natural images and the response properties of cortical cells. Journal of the Optical Society of America A, 4(12):2379–2394.

    Google Scholar 

  • Gabor, D. 1946. Theory of communication. The Journal of the Institute of Electrical Engineers, Part III, 93(21):429–457.

    Google Scholar 

  • Hafner, J., Sawhney, H.S., Equitz, W., Flickner, M., and Niblack, W. 1995. Efficient color histogram indexing for quadratic form distance functions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(7):729–735.

    Google Scholar 

  • Hillier, F.S. and Lieberman, G.J. 1990. Introduction to Mathematical Programming. McGraw-Hill, New York, NY.

    Google Scholar 

  • Hitchcock, F.L. 1941. The distribution of a product from several sources to numerous localities. J. Math. Phys., 20:224–230.

    Google Scholar 

  • Karmarkar, N. 1984. A new polynomial-time algorithm for linear programming. In Proceedings of the Sixteenth Annual ACM Symposium on Theory of Computing, Washington, DC. pp. 302–311.

  • Klee, V. and Minty, G. 1972. How good is the simplex algorithm. In Inequalities, Vol. III, O. Shisha (Ed.). Academic Press: New York, NY, pp. 159–175.

    Google Scholar 

  • Kullback, S. 1968. Information Theory and Statistics Dover: New York, NY.

    Google Scholar 

  • Liu, F. and Picard, R.W. 1996. Periodicity, directionality, and randomness: Wold features for image modeling and retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(7):722–733.

    Google Scholar 

  • Manjunath, B.S. and Ma, W.Y. 1996. Texture features for browsing and retrieval of image data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(8):837–842.

    Google Scholar 

  • Nasrabad, N.M. and King, R.A. 1988. Image coding using vector quantization: A review. IEEE Transactions on Communication, 36(8):957–971.

    Google Scholar 

  • Niblack, W., Barber, R., Equitz, W., Flickner, M.D., Glasman, E.H., Petkovic, D., Yanker, P., Faloutsos, C., Taubin, G., and Heights, Y. 1993. Querying images by content, using color, texture, and shape. In SPIE Conference on Storage and Retrieval for Image and Video Databases, Vol. 1908, pp. 173–187.

    Google Scholar 

  • Peleg, S., Werman, M., and Rom, H. 1989. A unified approach to the change of resolution: Space and gray-level. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11:739–742.

    Google Scholar 

  • Poynton, C. 1996. A Technical Introduction to Digital Video. John Wiley and Sons, New York, NY.

    Google Scholar 

  • Puzicha, J., Hofmann, T., and Buhmann, J. 1997. Non-parametric similarity measures for unsupervised texture segmentation and image retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 267–272.

  • Rachev, S.T. 1984. The Monge-Kantorovich mass transference problem and its stochastic applications. Theory of Probability and its Applications, XXIX(4):647–676.

    Google Scholar 

  • Russell, E.J. 1969. Extension of Dantzig's algorithm to finding an initial near-optimal basis for the transportation problem. Operations Research, 17:187–191.

    Google Scholar 

  • Shen, H.C. and Wong, A.K.C. 1983. Generalized texture representation and metric. Computer, Vision, Graphics, and Image Processing, 23:187–206.

    Google Scholar 

  • Smith, J.R. 1997. Integrated Spatial and Feature Image Systems: Retrieval, Analysis and Compression. Ph.D. Thesis, Columbia University.

  • Stolfi, J. 1994. Personal communication.

  • Stricker, M. and Orengo, M. 1995. Similarity of color images. In SPIE Conference on Storage and Retrieval for Image and Video Databases III, Vol. 2420, pp. 381–392.

    Google Scholar 

  • Swain, M.J. and Ballard, D.H. 1991. Color indexing. International Journal of Computer Vision, 7(1):11–32.

    Google Scholar 

  • Tversky, A. 1977. Features of similarity. Psychological Review, 84(4):327–352.

    Google Scholar 

  • Werman, M., Peleg, S., and Rosenfeld, A. 1985. A distance metric for multi-dimensional histograms. Computer, Vision, Graphics, and Image Processing, 32:328–336.

    Google Scholar 

  • Wyszecki, G. and Stiles, W.S. 1982. Color Science: Concepts and Methods, Quantitative Data and Formulae. John Wiley and Sons: New York, NY.

    Google Scholar 

  • Zikan, K. 1990. The Theory and Applications of Algebraic Metric Spaces. Ph.D. Thesis, Stanford University.

Download references

Author information

Authors and Affiliations

Authors

Rights and permissions

Reprints and permissions

About this article

Cite this article

Rubner, Y., Tomasi, C. & Guibas, L.J. The Earth Mover's Distance as a Metric for Image Retrieval. International Journal of Computer Vision 40, 99–121 (2000). https://doi.org/10.1023/A:1026543900054

Download citation

  • Issue Date:

  • DOI: https://doi.org/10.1023/A:1026543900054

Navigation