Elsevier

Cognition

Volume 122, Issue 3, March 2012, Pages 280-291
The communicative function of ambiguity in language

https://doi.org/10.1016/j.cognition.2011.10.004

Abstract

We present a general information-theoretic argument that all efficient communication systems will be ambiguous, assuming that context is informative about meaning. We also argue that ambiguity allows for greater ease of processing by permitting efficient linguistic units to be re-used. We test predictions of this theory in English, German, and Dutch. Our results and theoretical analysis suggest that ambiguity is a functional property of language that allows for greater communicative efficiency. This provides theoretical and empirical arguments against recent suggestions that core features of linguistic systems are not designed for communication.

Introduction

Ambiguity is a pervasive phenomenon in language which occurs at all levels of linguistic analysis. Out of context, words have multiple senses and syntactic categories, requiring listeners to determine which meaning and part of speech was intended. Morphemes may also be ambiguous out of context, as in the English –s, which can denote either a plural noun marking (trees), a possessive (Dylan’s), or a present tense verb conjugation (runs). Phonological forms are often mapped to multiple distinct word meanings, as in the homophones too, two, and to. Syllables are almost always ambiguous in isolation, meaning that they can be interpreted as providing incomplete information about the word the speaker is intending to communicate. Syntactic and semantic ambiguity are frequent enough to present a substantial challenge to natural language processing. The fact that ambiguity occurs on so many linguistic levels suggests that a far-reaching principle is needed to explain its origins and persistence.

The existence of ambiguity provides a puzzle for functionalist theories which attempt to explain properties of linguistic systems in terms of communicative pressures (e.g. Hockett, 1960, Pinker and Bloom, 1990). One might imagine that in a perfect communication system, language would completely disambiguate meaning. Each linguistic form would map bijectively to a meaning, and comprehenders would not need to expend effort inferring what the speaker intended to convey. This would reduce the computational difficulties in language understanding and comprehension because recovering meaning would be no more complex than, for instance, compiling a computer program. The communicative efficacy of language might be enhanced since there would be no danger of comprehenders incorrectly inferring the intended meaning. Confusion about “who’s on first” could not occur.

Indeed, the existence of ambiguity in language has been argued to show that the key structures and properties of language have not evolved for purposes of communication or use:

The natural approach has always been: Is [language] well designed for use, understood typically as use for communication? I think that’s the wrong question. The use of language for communication might turn out to be a kind of epiphenomenon… If you want to make sure that we never misunderstand one another, for that purpose language is not well designed, because you have such properties as ambiguity. If we want to have the property that the things that we usually would like to say come out short and simple, well, it probably doesn’t have that property. (Chomsky, 2002, p. 107).

Here, we argue that this perspective on ambiguity is exactly backwards. We argue, contrary to the Chomskyan view, that ambiguity is in fact a desirable property of communication systems, precisely because it allows for a communication system which is “short and simple.” We argue for two beneficial properties of ambiguity: first, where context is informative about meaning, unambiguous language is partly redundant with the context and therefore inefficient; and second, ambiguity allows the re-use of words and sounds which are more easily produced or understood. Our approach follows directly from the hypothesis that language approximates an optimal code for human communication, following a tradition of research spearheaded by Zipf which has recently come back into favor to explain both the online behavior of language users (e.g. Genzel and Charniak, 2002, Aylett and Turk, 2004, Jaeger, 2006, Levy and Jaeger, 2007 i.a.) and the structure of languages themselves (e.g. Ferrer i Cancho and Solé, 2003, Ferrer i Cancho, 2006, Piantadosi et al., 2011). In fact, our specific hypothesis is closely related to a theory initially suggested by Zipf (1949).

In Zipf’s view, ambiguity fits within the framework of his unifying principle of least effort, and could be understood by considering the competing desires of the speaker and the listener. Speakers can minimize their effort if all meanings are expressed by one simple, maximally ambiguous word, say, ba. To express a meaning such as “The accordion box is too small,” the speaker would simply say ba. To say “It will rain next Wednesday,” the speaker would say ba. Such a system is very easy for speakers since they do not need to expend any effort thinking about or searching memory to retrieve the correct linguistic form to produce. Conversely, from the comprehender’s perspective, effort is minimized if each meaning maps to a distinct linguistic form, assuming that handling many distinct word forms is not overly difficult for comprehenders. In that type of system, the listener does not need to expend effort inferring what the speaker intended, since the linguistic signal would leave only one possibility.

Zipf suggested that natural language would strike a balance between these two opposing forces of unification and diversification, arriving at a middle ground with some, but not total, ambiguity. Zipf argued this balance of speakers’ and comprehenders’ interests will be observed in a balance between frequency of words and number of words: speakers want a single (therefore highly frequent) word, and comprehenders want many (therefore less frequent) words. He suggested the balancing of these two forces could be observed in the relationship between word frequency and rank frequency: the vocabulary was “balanced” because a word’s frequency multiplied by its frequency rank was roughly a constant, a celebrated statistical law of language. Ferrer i Cancho and Solé (2003) provide a formal backing to Zipf’s intuitive explanation, showing that the power law distribution arises when information-theoretic difficulty for speakers and comprehenders is appropriately “balanced.” Zipf (1949) further extends his thinking to the distribution of word meanings by testing a quantitative relationship between word frequency and number of meanings. He derives a law of meaning distribution from his posited forces of unification and diversification, arguing that the number of meanings a word has should scale with the square root of its frequency. Zipf reports a very close empirical fit for this prediction. Functionalist linguistic theories have also posited trade-offs between total ambiguity and perfect and unambiguous logical communication (e.g. Givón, 2009), although to our knowledge these have not been evaluated empirically.
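The two quantitative claims attributed to Zipf above can be written compactly. The notation here is an assumption introduced for illustration and is not taken from the original text: r is a word's frequency rank, f(r) its frequency, m(r) its number of meanings, and C a constant.

```latex
% Assumed notation: r = frequency rank, f(r) = word frequency,
% m(r) = number of meanings, C = a constant.
\begin{align*}
f(r)\, r &\approx C \quad\Longrightarrow\quad f(r) \propto r^{-1}, \\
m(r) &\propto \sqrt{f(r)} \quad\Longrightarrow\quad m(r) \propto r^{-1/2}.
\end{align*}
```

The second implication simply substitutes the first law into the square-root relationship, so under both claims the number of meanings would fall off with the square root of rank.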

Zipf’s hypothesis of the way ambiguity might arise from a trade-off between speaker and hearer pressures has certain shortcomings. As pointed out by Wasow, Perfors, and Beaver (2005), it is unlikely that a speaker’s effort is minimized by a totally ambiguous language, since confusion means that the speaker may need to expend effort clarifying what was intended. Our argument shows how the utility of ambiguity can be derived without positing that speakers want to produce one single concise word, or that comprehenders want a completely unambiguous system. We argue that Zipf’s basic intuition about ambiguity, that it results from a rational process of communication, is fundamentally correct. Instead of unification and diversification, we argue that ambiguity can be understood by the trade-off between two communicative pressures which are inherent to any communicative system: clarity and ease. A clear communication system is one in which the intended meaning can be recovered from the signal with high probability. An easy communication system is one in which signals are efficiently produced, communicated, and processed. There are many factors which likely determine ease for human language: for instance, words which are easy to process are likely short, frequent, and phonotactically well-formed. Clarity and ease are opposed because there are a limited number of “easy” signals which can be used. This means that in order to assign meanings unambiguously or clearly, one must also use words which are more difficult.
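The claim that "easy" signals are in limited supply can be illustrated with a simple counting sketch. It treats ease as nothing more than form length, which is a deliberate simplification (frequency and phonotactics are ignored), and the alphabet size and number of meanings are hypothetical values chosen for illustration only.

```python
def forms_up_to(alphabet_size, max_length):
    """Count the distinct strings of length 1..max_length over an alphabet."""
    return sum(alphabet_size ** n for n in range(1, max_length + 1))


def shortest_max_length(alphabet_size, num_meanings):
    """Smallest maximum length that gives every meaning its own distinct form."""
    length = 1
    while forms_up_to(alphabet_size, length) < num_meanings:
        length += 1
    return length


# Hypothetical values, not estimates for any real language.
ALPHABET_SIZE = 10      # number of distinct sound units
NUM_MEANINGS = 50_000   # number of meanings to be expressed

# Forms of at most two units: the "easy" signals in this toy setting.
easy_forms = forms_up_to(ALPHABET_SIZE, 2)

# Longest form needed if no form may be shared between meanings.
needed_length = shortest_max_length(ALPHABET_SIZE, NUM_MEANINGS)

print(easy_forms, needed_length)
```

With these assumed values only 110 forms have two units or fewer, so a fully unambiguous assignment must push most meanings onto longer forms, whereas re-using the short forms keeps signals easy at the cost of out-of-context ambiguity.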

One example that illustrates this trade-off is the NATO phonetic alphabet. The NATO phonetic alphabet is the system of naming letters which is used by the military and pilots—A is “Alpha”, B is “Bravo”, C is “Charlie”, etc. This system was created to avoid the confusion that might occur when one attempts to communicate similar-sounding letter names across a noisy acoustic channel. The way this was done was by changing letters to full words, adding redundant information so that a listener can recognize the correct letter in the presence of noise. The downside is that instead of letters having relatively short names, they have mostly bisyllabic full-word names—which take more time and effort to produce and comprehend—trading ease for clarity. Trade-offs in the other direction are also common in language: pronouns, for instance, allow speakers to refer to locally salient discourse entities in a concise way. They are ambiguous because they could potentially refer to anyone, but allow for greater ease of communication by being short and frequent, and potentially less difficult for syntactic systems (Marslen-Wilson et al., 1982, Ariel, 1990, Gundel et al., 1993, Warren and Gibson, 2002, Arnold, 2008, Tily and Piantadosi, 2009).

Beyond Zipf, several authors have previously discussed the possibility that ambiguity is a useful feature of language. Several cognitive explanations of ambiguity were discussed by Wasow et al. (2005). One is the possibility that ambiguity reduces the memory demands of storing a lexicon, though they conclude that human memory is probably not a bottleneck for vocabulary size. They also hypothesize that there may be some processing constraint against longer morphemes which leads to shorter morphemes being recycled for multiple meanings. This is one case of the theory we present and test in the next section: that forms are re-used when they are easy to process. Wasow et al. (2005) also suggest ambiguity might be useful in language contact situations, where speakers of both languages should ideally be able to handle words meaning two different things in two different situations. They also point out that ambiguity does sometimes serve a communicative function when speakers wish to be ambiguous intentionally, giving the example of a dinner guest who says “Nothing is better than your cooking” to express a compliment and an insult simultaneously. Neither of these arguments is especially compelling because it is unclear how they could explain the fact that linguistic ambiguity is so common.

Some previous work has suggested that ambiguity may be advantageous for a communication system. One such suggestion, by Ferrer i Cancho and Loreto (in preparation), holds that ambiguity is a necessary precondition of combinatorial systems, since combining multiple units has no advantage when each unambiguously communicates a full meaning. Ambiguity (there defined more broadly as less than total specification of meaning within a unit) is thus predicted to arise in any morphosyntactic system. A second, information-theoretic direction was pursued by Juba, Kalai, Khanna, and Sudan (2011), who argue that ambiguity allows for more efficient compression when speakers and listeners have boundedly different prior distributions on meanings. This complements the information-theoretic analysis we present in the next section, although studying boundedly different priors requires a considerably more sophisticated analysis.

The goal of the present paper is to develop an explanation for ambiguity which makes fewer assumptions than previous work, and is more generally applicable. Our approach complements previous work arguing that ambiguity is rarely harmful to communication in practice thanks to the comprehender’s ability to effectively disambiguate between possible meanings (Wasow and Arnold, 2003, Wasow et al., 2005, Jaeger, 2006, Roland et al., 2006, Ferreira, 2008, Jaeger, 2010). The explanations we present demonstrate that ambiguity is a desirable feature of any communicative system when context is informative about meaning. We argue that the generality of our results explains the pervasiveness of ambiguity in language, and shows how ambiguity likely results from ubiquitous pressure for efficient communication.
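The core claim, that an unambiguous signal is partly redundant when context is informative about meaning, can be illustrated with a small numerical sketch. The toy world below (contexts, meanings, and probabilities) is entirely hypothetical and is not drawn from the paper's data; it only compares the entropy of meanings, H(M), with their entropy given context, H(M | C).

```python
import math
from collections import defaultdict


def entropy(dist):
    """Shannon entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def conditional_entropy(joint):
    """H(M | C) in bits for a dict mapping (context, meaning) to joint probability."""
    context_totals = defaultdict(float)
    for (context, _meaning), p in joint.items():
        context_totals[context] += p
    h = 0.0
    for (context, _meaning), p in joint.items():
        if p > 0:
            # p / context_totals[context] is P(meaning | context)
            h -= p * math.log2(p / context_totals[context])
    return h


# Hypothetical toy world: four equally likely meanings, two contexts,
# and each context rules out two of the four meanings.
joint = {
    ("kitchen", "cup"): 0.25,
    ("kitchen", "plate"): 0.25,
    ("garage", "wrench"): 0.25,
    ("garage", "hammer"): 0.25,
}

# Marginal distribution over meanings, ignoring context.
meaning_marginal = defaultdict(float)
for (_context, meaning), p in joint.items():
    meaning_marginal[meaning] += p

bits_without_context = entropy(meaning_marginal)  # H(M)
bits_with_context = conditional_entropy(joint)    # H(M | C)

print(bits_without_context, bits_with_context)
```

In this toy setting an unambiguous code needs 2 bits per meaning, while a code that relies on context needs only 1 bit. The cheaper code re-uses the same two signals in both contexts, so each signal is ambiguous when examined out of context.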

Section snippets

Two benefits of ambiguity

In this section we argue that efficient communication systems will be ambiguous when context is informative about what is being communicated. We present two similar perspectives on this point. The first shows that the most efficient communication system will not convey information already provided by the context. Such communication systems necessarily appear to be ambiguous when examined out of context. Second, we argue that specifically for the human language processing mechanisms, ambiguity

Empirical evaluation of ambiguity and effort

In the previous section, we presented two closely related arguments that ambiguity allows for more efficient communication systems. Both assumed that information is typically present to resolve ambiguities, and that using this information is relatively “cheap.” The first argument looked at ambiguity from the perspective of coding theory, arguing that when context is informative, any good communication system will leave out information already in the context. The second assumed that codewords

General discussion

We have presented two related arguments that show a well-designed communication system will be ambiguous, when examined out of context. We tested predictions of this theory, showing that words and syllables which are more efficient are preferentially re-used in language through ambiguity, allowing for greater ease overall. Our regressions on homophones, polysemous words, and syllables, though similar, are theoretically and statistically independent. We therefore interpret positive results in each

Conclusion

We have provided several kinds of evidence for the view that ambiguity results from a pressure for efficient communication. We argued that any efficient communication system will necessarily be ambiguous when context is informative about meaning. The units of an efficient communication system will not redundantly specify information provided by the context; when examined out of context, these units will appear not to completely disambiguate meaning. We have also argued that ambiguity allows

Acknowledgements

We would like to thank Mike Frank, Florian Jaeger, and Roger Levy for helpful discussions about this work. We thank three anonymous reviewers for suggesting improvements to this paper. This work was supported by a National Science Foundation graduate research fellowship and a Social, Behavioral & Economic Sciences Doctoral Dissertation Research Improvement Grant in linguistics (to S.T.P.), and National Science Foundation Grant 0844472 (to E.G.). This research was conducted by S.P while at the

References (78)

  • Y. Kamide et al.

    The time-course of prediction in incremental sentence processing: Evidence from anticipatory eye movements

    Journal of Memory and Language

    (2003)
  • T. Kraljic et al.

    Prosodic disambiguation of syntactic structure: For the speaker or for the addressee?

    Cognitive Psychology

    (2005)
  • R. Levy

    Expectation-based syntactic comprehension

    Cognition

    (2008)
  • D. Roland et al.

    Why is that? Structural prediction and ambiguity resolution in a very large corpus of English sentences

    Cognition

    (2006)
  • J. Sedivy

    Invoking discourse-based contrast sets and resolving syntactic ambiguities

    Journal of Memory and Language

    (2002)
  • J. Sedivy et al.

    Achieving incremental semantic interpretation through contextual representation

    Cognition

    (1999)
  • J. Snedeker et al.

    Using prosody to avoid ambiguity: Effects of speaker awareness and referential context

    Journal of Memory and Language

    (2003)
  • J. Trueswell et al.

    Semantic influences on parsing: Use of thematic role information in syntactic ambiguity resolution

    Journal of Memory and Language

    (1994)
  • T. Warren et al.

    The influence of referential processing on sentence complexity

    Cognition

    (2002)
  • D. Allbritton et al.

    Reliability of prosodic cues for resolving syntactic ambiguity

    Journal of Experimental Psychology: Learning, Memory, and Cognition

    (1996)
  • M. Ariel

    Accessing noun-phrase antecedents

    (1990)
  • J. Arnold

    Reference production: Production-internal and addressee-oriented processes

    Language and Cognitive Processes

    (2008)
  • M. Aylett et al.

    The smooth signal redundancy hypothesis: A functional explanation for relationships between redundancy, prosodic prominence and duration in spontaneous speech

    Language and Speech

    (2004)
  • Baayen, H.R., Piepenbrock, R., & Gulikers, L. (1995). The CELEX lexical database. Release 2 (CD-ROM). Linguistic Data...
  • N. Chomsky

    An interview on minimalism

    N. Chomsky, On Nature and Language

    (2002)
  • T. Cover et al.

    Elements of information theory

    (2006)
  • S. Crain et al.

    On not being led up the garden path: The use of context by the psychological parser

    Natural Language Parsing

    (1985)
  • Demberg, V. (2010). A broad-coverage model of prediction in human sentence processing. Unpublished doctoral...
  • C. Fellbaum

    WordNet: An electronic lexical database

    (1998)
  • R. Ferrer i Cancho

    When language breaks into pieces: A conflict between communication through isolated signals and language

    Biosystems

    (2006)
  • Ferrer i Cancho, R., & Loreto, V. (in preparation). Conditions for the emergence of combinatorial...
  • R. Ferrer i Cancho et al.

    Least effort and the origins of scaling in human language

    Proceedings of the National Academy of Sciences of the United States of America

    (2003)
  • Frank, A., & Jaeger, T. (2008). Speaking rationally: Uniform information density as an optimal strategy for language...
  • Frazier, L. (1979). On comprehending sentences: Syntactic parsing strategies. ETD Collection for University of...
  • Frisch, S., Broe, M., & Pierrehumbert, J. (1997). Similarity and phonotactics in...
  • S. Frisson et al.

    Effects of contextual predictability and transitional probability on eye movements during reading

    Journal of Experimental Psychology: Learning, Memory, and Cognition

    (2005)
  • Gale, W., Church, K., & Yarowsky, D. (1992). One sense per discourse. In Proceedings of the workshop on Speech and...
  • A. Gelman et al.

    Data analysis using regression and hierarchical/multilevel models

    (2006)
  • Genzel, D., & Charniak, E. (2002). Entropy rate constancy in text. In Proceedings of the 40th annual meeting on...