The communicative function of ambiguity in language
Introduction
Ambiguity is a pervasive phenomenon in language which occurs at all levels of linguistic analysis. Out of context, words have multiple senses and syntactic categories, requiring listeners to determine which meaning and part of speech was intended. Morphemes may also be ambiguous out of context, as in the English –s, which can denote a plural noun marking (trees), a possessive (Dylan’s), or a present tense verb conjugation (runs). Phonological forms are often mapped to multiple distinct word meanings, as in the homophones too, two, and to. Syllables are almost always ambiguous in isolation, meaning that they can be interpreted as providing incomplete information about the word the speaker is intending to communicate. Syntactic and semantic ambiguity are frequent enough to present a substantial challenge to natural language processing. The fact that ambiguity occurs on so many linguistic levels suggests that a far-reaching principle is needed to explain its origins and persistence.
The existence of ambiguity provides a puzzle for functionalist theories which attempt to explain properties of linguistic systems in terms of communicative pressures (e.g. Hockett, 1960, Pinker and Bloom, 1990). One might imagine that in a perfect communication system, language would completely disambiguate meaning. Each linguistic form would map bijectively to a meaning, and comprehenders would not need to expend effort inferring what the speaker intended to convey. This would reduce the computational difficulties in language comprehension because recovering meaning would be no more complex than, for instance, compiling a computer program. The communicative efficacy of language might be enhanced since there would be no danger of comprehenders incorrectly inferring the intended meaning. Confusion about “who’s on first” could not occur.
Indeed, the existence of ambiguity in language has been argued to show that the key structures and properties of language have not evolved for purposes of communication or use:
The natural approach has always been: Is [language] well designed for use, understood typically as use for communication? I think that’s the wrong question. The use of language for communication might turn out to be a kind of epiphenomenon… If you want to make sure that we never misunderstand one another, for that purpose language is not well designed, because you have such properties as ambiguity. If we want to have the property that the things that we usually would like to say come out short and simple, well, it probably doesn’t have that property. (Chomsky, 2002, p. 107).
Here, we argue that this perspective on ambiguity is exactly backwards. We argue, contrary to the Chomskyan view, that ambiguity is in fact a desirable property of communication systems, precisely because it allows for a communication system which is “short and simple.” We argue for two beneficial properties of ambiguity: first, where context is informative about meaning, unambiguous language is partly redundant with the context and therefore inefficient; and second, ambiguity allows the re-use of words and sounds which are more easily produced or understood. Our approach follows directly from the hypothesis that language approximates an optimal code for human communication, following a tradition of research spearheaded by Zipf which has recently come back into favor to explain both the online behavior of language users (e.g. Genzel and Charniak, 2002, Aylett and Turk, 2004, Jaeger, 2006, Levy and Jaeger, 2007 i.a.) and the structure of languages themselves (e.g. Ferrer i Cancho and Solé, 2003, Ferrer i Cancho, 2006, Piantadosi et al., 2011). In fact, our specific hypothesis is closely related to a theory initially suggested by Zipf (1949).
In Zipf’s view, ambiguity fits within the framework of his unifying principle of least effort, and could be understood by considering the competing desires of the speaker and the listener. Speakers can minimize their effort if all meanings are expressed by one simple, maximally ambiguous word, say, ba. To express a meaning such as “The accordion box is too small,” the speaker would simply say ba. To say “It will rain next Wednesday,” the speaker would say ba. Such a system is very easy for speakers since they do not need to expend any effort thinking about or searching memory to retrieve the correct linguistic form to produce. Conversely, from the comprehender’s perspective, effort is minimized if each meaning maps to a distinct linguistic form, assuming that handling many distinct word forms is not overly difficult for comprehenders. In that type of system, the listener does not need to expend effort inferring what the speaker intended, since the linguistic signal would leave only one possibility.
Zipf suggested that natural language would strike a balance between these two opposing forces of unification and diversification, arriving at a middle ground with some, but not total, ambiguity. Zipf argued this balance of speakers’ and comprehenders’ interests will be observed in a balance between frequency of words and number of words: speakers want a single (therefore highly frequent) word, and comprehenders want many (therefore less frequent) words. He suggested the balancing of these two forces could be observed in the relationship between word frequency and rank frequency: the vocabulary was “balanced” because a word’s frequency multiplied by its frequency rank was roughly a constant, a celebrated statistical law of language. Ferrer i Cancho and Solé (2003) provide a formal backing to Zipf’s intuitive explanation, showing that the power law distribution arises when information-theoretic difficulty for speakers and comprehenders is appropriately “balanced.” Zipf (1949) further extends his thinking to the distribution of word meanings by testing a quantitative relationship between word frequency and number of meanings. He derives a law of meaning distribution from his posited forces of unification and diversification, arguing that the number of meanings a word has should scale with the square root of its frequency. Zipf reports a very close empirical fit for this prediction. Functionalist linguistic theories have also posited trade-offs between total ambiguity and perfect and unambiguous logical communication (e.g. Givón, 2009), although to our knowledge these have not been evaluated empirically.
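The two regularities just described can be made concrete with a minimal Python sketch. It builds an idealized vocabulary whose frequencies fall off as 1/rank, checks that frequency multiplied by rank stays constant, and encodes the square-root relationship between frequency and number of meanings. The function names, the toy frequencies, and the `scale` constant are assumptions introduced for illustration; none of them are taken from Zipf's data or analysis.

```python
import math

def zipf_frequencies(n_words, top_frequency):
    """Idealized Zipfian counts: the word of rank r occurs top_frequency / r times."""
    return [top_frequency / rank for rank in range(1, n_words + 1)]

def rank_frequency_products(frequencies):
    """Multiply each frequency by its rank (rank 1 = most frequent word)."""
    ordered = sorted(frequencies, reverse=True)
    return [freq * rank for rank, freq in enumerate(ordered, start=1)]

def predicted_meanings(frequency, scale=1.0):
    """Meaning-frequency law as stated in the text: meanings grow with sqrt(frequency).

    `scale` is a hypothetical proportionality constant, not an estimated value.
    """
    return scale * math.sqrt(frequency)

# Toy vocabulary of five words; the most frequent occurs 1200 times (assumed numbers).
freqs = zipf_frequencies(5, 1200.0)
products = rank_frequency_products(freqs)
print("frequency x rank:", products)

# Under the square-root law, quadrupling a word's frequency doubles its predicted meanings.
ratio = predicted_meanings(400.0) / predicted_meanings(100.0)
print("meaning ratio for a 4x frequency increase:", ratio)
```

Real corpora only approximate the 1/rank pattern, so on observed counts the products would be roughly, not exactly, constant.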
Zipf’s hypothesis of the way ambiguity might arise from a trade-off between speaker and hearer pressures has certain shortcomings. As pointed out by Wasow, Perfors, and Beaver (2005), it is unlikely that a speaker’s effort is minimized by a totally ambiguous language, since confusion means that the speaker may need to expend effort clarifying what was intended. Our argument shows how the utility of ambiguity can be derived without positing that speakers want to produce one single concise word, or that comprehenders want a completely unambiguous system. We argue that Zipf’s basic intuition about ambiguity (that it results from a rational process of communication) is fundamentally correct. Instead of unification and diversification, we argue that ambiguity can be understood by the trade-off between two communicative pressures which are inherent to any communicative system: clarity and ease. A clear communication system is one in which the intended meaning can be recovered from the signal with high probability. An easy communication system is one in which signals are efficiently produced, communicated, and processed. There are many factors which likely determine ease for human language: for instance, words which are easy to process are likely short, frequent, and phonotactically well-formed. Clarity and ease are opposed because there are a limited number of “easy” signals which can be used. This means that in order to assign meanings unambiguously or clearly, one must also use words which are more difficult.
One example that illustrates this trade-off is the NATO phonetic alphabet. The NATO phonetic alphabet is the system of naming letters which is used by the military and pilots—A is “Alpha”, B is “Bravo”, C is “Charlie”, etc. This system was created to avoid the confusion that might occur when one attempts to communicate similar-sounding letter names across a noisy acoustic channel. The way this was done was by changing letters to full words, adding redundant information so that a listener can recognize the correct letter in the presence of noise. The downside is that instead of letters having relatively short names, they have mostly bisyllabic full-word names—which take more time and effort to produce and comprehend—trading ease for clarity. Trade-offs in the other direction are also common in language: pronouns, for instance, allow speakers to refer to locally salient discourse entities in a concise way. They are ambiguous because they could potentially refer to anyone, but allow for greater ease of communication by being short and frequent, and potentially less difficult for syntactic systems (Marslen-Wilson et al., 1982, Ariel, 1990, Gundel et al., 1993, Warren and Gibson, 2002, Arnold, 2008, Tily and Piantadosi, 2009).
Beyond Zipf, several authors have previously discussed the possibility that ambiguity is a useful feature of language. Several cognitive explanations of ambiguity were discussed by Wasow et al. (2005). One is the possibility that ambiguity reduces the memory demands of storing a lexicon, though they conclude that human memory is probably not a bottleneck for vocabulary size. They also hypothesize that there may be some processing constraint against longer morphemes which leads to shorter morphemes being recycled for multiple meanings. This is one case of the theory we present and test in the next section: that forms are re-used when they are easy to process. Wasow et al. (2005) also suggest ambiguity might be useful in language contact situations, where speakers of both languages should ideally be able to handle words meaning two different things in two different situations. They also point out that ambiguity does sometimes serve a communicative function when speakers wish to be ambiguous intentionally, giving the example of a dinner guest who says “Nothing is better than your cooking” to express a compliment and an insult simultaneously. Neither of these arguments is especially compelling because it is unclear how they could explain the fact that linguistic ambiguity is so common.
Some previous work has suggested that ambiguity may be advantageous for a communication system. One such suggestion, by Ferrer i Cancho and Loreto (in preparation) holds that ambiguity is a necessary precondition of combinatorial systems, since combining multiple units has no advantage when each unambiguously communicates a full meaning. Ambiguity (there defined more broadly as less than total specification of meaning within a unit) is thus predicted to arise in any morphosyntactic system. A second, information-theoretic direction was pursued by Juba, Kalai, Khanna, and Sudan (2011), who argue that ambiguity allows for more efficient compression when speakers and listeners have boundedly different prior distributions on meanings. This complements the information-theoretic analysis we present in the next section, although studying boundedly different priors requires a considerably more sophisticated analysis.
The goal of the present paper is to develop an explanation for ambiguity which makes fewer assumptions than previous work, and is more generally applicable. Our approach complements previous work arguing that ambiguity is rarely harmful to communication in practice thanks to the comprehender’s ability to effectively disambiguate between possible meanings (Wasow and Arnold, 2003, Wasow et al., 2005, Jaeger, 2006, Roland et al., 2006, Ferreira, 2008, Jaeger, 2010). The explanations we present demonstrate that ambiguity is a desirable feature of any communicative system when context is informative about meaning. We argue that the generality of our results explains the pervasiveness of ambiguity in language, and shows how ambiguity likely results from ubiquitous pressure for efficient communication.
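The claim that informative context makes unambiguous signals partly redundant can be illustrated with a small entropy calculation. The Python sketch below compares the uncertainty about a meaning without context, H(M), to the uncertainty that remains once the context is known, H(M | C). The contexts, meanings, and probabilities are invented toy values, and the function names are assumptions made for this illustration; it is not the analysis carried out in the paper.

```python
import math
from collections import defaultdict

def entropy(distribution):
    """Shannon entropy in bits of a {outcome: probability} mapping."""
    return -sum(p * math.log2(p) for p in distribution.values() if p > 0)

def meaning_marginal(joint):
    """P(meaning), obtained by summing the joint distribution over contexts."""
    marginal = defaultdict(float)
    for (_context, meaning), p in joint.items():
        marginal[meaning] += p
    return dict(marginal)

def conditional_entropy(joint):
    """H(M | C) in bits for a joint given as {(context, meaning): probability}."""
    context_totals = defaultdict(float)
    for (context, _meaning), p in joint.items():
        context_totals[context] += p
    return -sum(
        p * math.log2(p / context_totals[context])
        for (context, _meaning), p in joint.items()
        if p > 0
    )

# Hypothetical joint distribution: each context narrows four meanings down to two.
joint = {
    ("kitchen", "knife"): 0.25,
    ("kitchen", "spoon"): 0.25,
    ("garden", "rake"): 0.25,
    ("garden", "hose"): 0.25,
}

h_meaning = entropy(meaning_marginal(joint))
h_meaning_given_context = conditional_entropy(joint)
print(f"H(M)     = {h_meaning:.2f} bits")
print(f"H(M | C) = {h_meaning_given_context:.2f} bits")
```

In this toy distribution H(M) is 2 bits and H(M | C) is 1 bit, so a listener who knows the context needs only two distinct signals rather than four. Reusing the same two signals in both contexts is cheaper, and the resulting code looks ambiguous only when it is examined out of context.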
Two benefits of ambiguity
In this section we argue that efficient communication systems will be ambiguous when context is informative about what is being communicated. We present two similar perspectives on this point. The first shows that the most efficient communication system will not convey information already provided by the context. Such communication systems necessarily appear to be ambiguous when examined out of context. Second, we argue that specifically for the human language processing mechanisms, ambiguity
Empirical evaluation of ambiguity and effort
In the previous section, we presented two closely related arguments that ambiguity allows for more efficient communication systems. Both assumed that information is typically present to resolve ambiguities, and that using this information is relatively “cheap.” The first argument looked at ambiguity from the perspective of coding theory, arguing that when context is informative, any good communication system will leave out information already in the context. The second assumed that codewords
General discussion
We have presented two related arguments that show a well-designed communication system will be ambiguous, when examined out of context. We tested predictions of this theory, showing that words and syllables which are more efficient are preferentially re-used in language through ambiguity, allowing for greater ease overall. Our regressions on homophones, polysemous words, and syllables, though similar, are theoretically and statistically independent. We therefore interpret positive results in each
Conclusion
We have provided several kinds of evidence for the view that ambiguity results from a pressure for efficient communication. We argued that any efficient communication system will necessarily be ambiguous when context is informative about meaning. The units of an efficient communication system will not redundantly specify information provided by the context; when examined out of context, these units will appear not to completely disambiguate meaning. We have also argued that ambiguity allows
Acknowledgements
We would like to thank Mike Frank, Florian Jaeger, and Roger Levy for helpful discussions about this work. We thank three anonymous reviewers for suggesting improvements to this paper. This work was supported by a National Science Foundation graduate research fellowship and a Social, Behavioral & Economic Sciences Doctoral Dissertation Research Improvement Grant in linguistics (to S.T.P.), and National Science Foundation Grant 0844472 (to E.G.). This research was conducted by S.P while at the
References (78)
- et al. Avoiding the garden path: Eye movements in context. Journal of Memory and Language (1992).
- et al. Incremental interpretation at verbs: Restricting the domain of subsequent reference. Cognition (1999).
- et al. Interaction with context during human sentence processing. Cognition (1988).
- et al. Avoiding attachment ambiguities: The role of constituent ordering. Journal of Memory and Language (2004).
- Ambiguity, accessibility, and a division of labor for communicative success. Psychology of Learning and Motivation (2008).
- et al. Effect of ambiguity and lexical availability on syntactic and lexical production. Cognitive Psychology (2000).
- et al. How do speakers avoid ambiguous linguistic expressions? Cognition (2005).
- et al. Influence of contextual contrast on syntactic processing: evidence for strong-interaction in sentence comprehension. Cognition (2005).
- A probabilistic model of lexical and syntactic access and disambiguation. Cognitive Science (1996).
- et al. The role of discourse context in the processing of a flexible word-order language. Cognition (2004).