The influence of language experience on categorical perception of pitch contours
Research Highlights
► Categorical effect is stronger for native tone language listeners.
► Categorical effect is stronger in speech context for native tone language listeners.
► The sensitivity to the pitch-contour change is shaped by the relevant tone contrasts.
Introduction
We perceive speech sounds categorically; that is, we are more likely to notice differences between categories than differences within a category. Categorical perception (CP) has the following characteristics (Repp, 1984): (1) in the labeling function, there is a sharp boundary between two categories; (2) in the discrimination function, accuracy peaks at the category boundary but is at or near chance level within a category; (3) the discrimination function can be predicted from the identification function.
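Characteristic (3) can be illustrated with a short sketch. It assumes the classical two-category ABX prediction usually attributed to Liberman et al. (1957), in which predicted accuracy for a stimulus pair is 0.5 * (1 + (pA - pB)^2), where pA and pB are the proportions of one label given to each stimulus. The discrimination task used in this study is not described in this section, so the ABX formula, the function names, and the identification values below are illustrative assumptions only.

```python
def predicted_abx_accuracy(p_a, p_b):
    """Predicted ABX accuracy for two stimuli, given the proportion of
    'category 1' labels each receives (classical two-category formula)."""
    return 0.5 * (1.0 + (p_a - p_b) ** 2)


def predict_discrimination(ident, step=2):
    """Predict accuracy for every pair of stimuli `step` positions apart
    along a continuum, from a list of identification proportions."""
    return [
        predicted_abx_accuracy(ident[i], ident[i + step])
        for i in range(len(ident) - step)
    ]


if __name__ == "__main__":
    # Made-up identification proportions for an 11-step continuum.
    ident = [1.0, 1.0, 0.98, 0.95, 0.80, 0.50, 0.20, 0.05, 0.02, 0.0, 0.0]
    for i, acc in enumerate(predict_discrimination(ident, step=2), start=1):
        print(f"pair {i}-{i + 2}: predicted accuracy {acc:.2f}")
```

Under this formula, pairs that fall within one category (equal labeling proportions) are predicted at chance (0.5), and pairs that straddle a sharp boundary approach 1.0, which is the peak described in characteristic (2).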
Therefore, the major signature of CP is better discrimination across category boundaries than for equivalently separated stimuli within the same category. This phenomenon has been demonstrated in several domains, such as auditory perception of speech (Liberman, Harris, Hoffman, & Griffith, 1957; Liberman, 1996), visual perception of colors (Bornstein, Kessen, & Weiskopf, 1976), and perception of facial expressions (Etcoff & Magee, 1992). Animals also exhibit CP. Morse and Snowden (1975) found that rhesus monkeys showed CP for formant transitions, and Kuhl and Miller (1975) found that chinchillas showed CP for voice onset time (VOT). A recent study reported that zebra finches could not only discriminate and categorize monosyllabic words that differ in their vowel, but could also transfer this categorization to the same words spoken by novel speakers, independent of the speaker's gender (Ohms, Gill, Van Heijningen, Beckers, & Ten Cate, 2010). This study showed that birds, like humans, use intrinsic and extrinsic speaker normalization to perform categorization. This finding may imply that there is no need to invoke special mechanisms for speaker normalization in human speech perception (Johnson, 2005; Moore & Jongman, 1997; Wong & Diehl, 2003). All in all, CP might be a very basic ability with which an organism organizes the world in which it lives, and it might be dynamic in nature, relating to dynamic normalization such as speaker normalization. CP and normalization might both be involved in extracting invariant features from signals whose physical forms change continuously.
Earlier studies on speech CP mainly focused on segmental features. Liberman et al. (1957) reported that when people listened to sounds that varied along a formant transition continuum, they heard only ba's, da's, or ga's, but nothing in between. Instead of changing gradually, the perceived quality jumped abruptly from one category to another at a certain point along a continuum. CP was also found for a voicing continuum in 1- and 4-month-old infants (Eimas, Siqueland, Jusczyk, & Vigorito, 1971). The infants reacted to a 20 ms VOT difference when accompanied by a phonemic difference (a change from /ba/ to /pa/), but hardly reacted to the same VOT difference when it was not accompanied by a phonemic difference.
Phonemic categories in speech perception and production typically develop during the first year of life: infants can discriminate the phonetic contrasts of all languages during the first 3 months, but by around 11 months this ability has started to decline for foreign-language consonants and to increase for native-language consonants (Kuhl, 2004). The progressive development of CP from infancy to adolescence is presumably influenced by spoken communication. The increase in CP makes within-category differences less discriminable, thereby preventing non-relevant information, e.g. phonetic variations of the same phoneme, from reaching the mental lexicon. This should facilitate word recognition, especially under difficult listening conditions, although other factors, such as context cuing, the listener's guesses, and frequencies of occurrence, are also helpful. Therefore, CP, which develops as the infant ages, enhances communication. In addition to the behavioral evidence, Rivera-Gaxiola, Silva-Pereyra, and Kuhl (2005) discovered neural correlates of this dynamic language development via electrophysiological measurements, and pointed out that individual developmental differences might have an impact on language development.
The linguistic environment must have a crucial influence on the development of CP. There are thousands of languages in the world that make use of pitch patterns (Yip, 2002) to build words, much as vowels and consonants are used. Chinese is perhaps the best known example of these (Wang, 1973), where ‘ma1’ (mother), ‘ma2’ (hemp), ‘ma3’ (horse), and ‘ma4’ (to scold) share the same segments but differ in their pitch patterns, with the numbers ‘1’, ‘2’, ‘3’, and ‘4’ indicating different lexical tones (Wang, 1967, 1972, 1973; Peng, 2006).
CP of pitch contours for subjects with different language backgrounds was first shown behaviorally for Mandarin lexical tones by Wang (1976), who demonstrated the existence of a linguistic boundary for native Chinese (tone language) subjects but a psychophysical boundary for native American English (non-tone language) subjects. Since then, several studies of CP of pitch contours have investigated the impact of language experience on pitch perception, mainly focusing on the contrast between native tone language and non-tone language subjects (Francis, Ciocca, & Ng, 2003; Hallé, Chang, & Best, 2004; Xu, Gandour, & Francis, 2006). Abramson (1979), by contrast, claimed that tone perception in Thai is not categorical. Nonetheless, there is no strict dichotomy between CP and continuous perception. We agree with Hallé, Chang, and Best (2004) that a more stringent test of how categorical the perception of tones by native listeners of a tone language is requires a comparison with the perception of tones by listeners of a non-tone language.
Different inventories of tones may further influence pitch perception. Gandour (1983) investigated the perceptual dimensions of tones and the effect of linguistic experience on perception of tones by listeners from five language groups (Cantonese, Mandarin, Taiwanese, Thai, and English). He found that listeners from these five groups could be classified into tone language vs. non-tone language listeners, Thai vs. Chinese listeners, and Cantonese vs. Mandarin and Taiwanese listeners on the basis of their patterns of dimension weights, indicating that the experience of different tone language speakers further affected pitch perception. Lee, Vakoch, and Wurm (1996) found that Cantonese listeners were better than both Mandarin and English listeners at discriminating Cantonese tones, and Mandarin listeners did better than both Cantonese and English listeners at discriminating Mandarin tones. In both cases, the tone language listeners did better than the English listeners. However, as far as we know, there is still a lack of studies on whether and how different tone inventories affect pitch perception in terms of CP.
Mandarin and Cantonese are two of the seven major Chinese dialects. Mandarin is spoken throughout China; Cantonese is spoken mainly in south China (Wang, 1973). While Mandarin and Cantonese each have many varieties of speech, in this paper Mandarin refers to the speech of Beijing and Cantonese refers to the speech of Hong Kong. Using the five tone letters proposed by Chao (1930), the four Mandarin lexical tones in citation form are 55 (high level tone, Tone 1), 35 (high rising tone, Tone 2), 214 (low falling rising tone, Tone 3), and 51 (high falling tone, Tone 4). Cantonese has six lexical tones (ignoring duration differences): 55 (high level tone, Tone 1), 35 (high rising tone, Tone 2), 33 (middle level tone, Tone 3), 21 (low falling tone, Tone 4), 23 (low rising tone, Tone 5), and 22 (low level tone, Tone 6) (Bauer & Benedict, 1997). Since the tone inventories of Mandarin and Cantonese are very different, we examine whether this difference further influences categorical pitch perception.
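The two inventories above can be written down as a small data structure, which makes the difference between them easy to inspect. The sketch below is only an illustration: the dictionary layout and the helper names `contour_shape` and `shared_tone_letters` are hypothetical, and the shape labels are derived mechanically from the Chao tone letters rather than taken from the paper's own analysis.

```python
# Chao tone letters for each lexical tone, as listed in the text.
MANDARIN = {1: "55", 2: "35", 3: "214", 4: "51"}
CANTONESE = {1: "55", 2: "35", 3: "33", 4: "21", 5: "23", 6: "22"}


def contour_shape(tone_letters):
    """Classify a tone-letter string by the direction of its pitch movement."""
    levels = [int(digit) for digit in tone_letters]
    steps = [b - a for a, b in zip(levels, levels[1:])]
    if all(s == 0 for s in steps):
        return "level"
    if all(s >= 0 for s in steps):
        return "rising"
    if all(s <= 0 for s in steps):
        return "falling"
    # Mixed directions: name the shape after the order of the movements.
    return "falling-rising" if steps[0] < 0 else "rising-falling"


def shared_tone_letters():
    """Tone-letter values that occur in both inventories."""
    return set(MANDARIN.values()) & set(CANTONESE.values())


if __name__ == "__main__":
    for name, inventory in (("Mandarin", MANDARIN), ("Cantonese", CANTONESE)):
        shapes = {tone: contour_shape(t) for tone, t in inventory.items()}
        print(name, shapes)
    print("shared:", sorted(shared_tone_letters()))
```

Laid out this way, the Cantonese inventory contains three level tones against one in Mandarin, while only the values 55 and 35 occur in both systems.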
It is generally known that other suprasegmental features, especially the intensity profile, are highly correlated with tone perception (Abramson, 1972; Howie, 1976; Liu & Samuel, 2004; Whalen & Xu, 1992; Zee, 1978). In this study, however, we focus on the primary cue for lexical tone perception, fundamental frequency (the physical correlate of pitch), and hold the other features constant. We reexamine the effect of tone language experience on categorical pitch perception by comparing the identification and discrimination performances of tone language (Mandarin and Cantonese) listeners vs. non-tone language (German) listeners. Moreover, we explore the influence of different tone inventories on pitch perception by comparing the identification and discrimination performances of Mandarin vs. Cantonese listeners.
Section snippets
Materials
Two types of continua (rising and falling) were constructed for both a speech context (the Mandarin syllable /i/) and a nonspeech context (a pure tone). Fig. 1 shows a schematic diagram of the pitch contours of the 11 stimuli for the rising continuum (on the left, following Wang, 1976) and the 11 stimuli for the falling continuum (on the right). In Mandarin, the syllable /i/ means “clothes” with the high level tone, represented by stimulus Number 11 in both continua, means “aunt” with the high
Identification and discrimination curves
Identification and discrimination curves are shown in Figs. 3 and 4. The estimated boundary position and width, obtained by probit analysis, are shown in Table 1.
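As a rough illustration of how a boundary position and width can be read off an identification curve, the sketch below fits a straight line to probit-transformed (z-scored) identification proportions. Several things here are assumptions rather than the authors' procedure: the boundary is taken as the 50% crossover point, the width as the distance between the 25% and 75% points, and an ordinary least-squares fit stands in for the maximum-likelihood fit of a full probit analysis. The function name, the clipping constant `eps`, and the example proportions are hypothetical.

```python
from statistics import NormalDist


def probit_boundary_and_width(stimuli, proportions, eps=0.01):
    """Estimate boundary position (50% point) and width (25%-75% distance)
    from identification proportions via a least-squares probit line."""
    std_normal = NormalDist()
    # Clip proportions away from 0 and 1 so the z-transform stays finite.
    z = [std_normal.inv_cdf(min(max(p, eps), 1.0 - eps)) for p in proportions]

    n = len(stimuli)
    mean_x = sum(stimuli) / n
    mean_z = sum(z) / n
    sxx = sum((x - mean_x) ** 2 for x in stimuli)
    sxz = sum((x - mean_x) * (zi - mean_z) for x, zi in zip(stimuli, z))

    # Raises ZeroDivisionError if the curve is completely flat (slope 0).
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x

    boundary = -intercept / slope  # stimulus value where z = 0 (p = 0.5)
    z_range = std_normal.inv_cdf(0.75) - std_normal.inv_cdf(0.25)
    width = z_range / abs(slope)   # distance between the 25% and 75% points
    return boundary, width


if __name__ == "__main__":
    # Made-up identification proportions for an 11-step continuum.
    steps = list(range(1, 12))
    ident = [0.01, 0.02, 0.05, 0.12, 0.30, 0.55, 0.80, 0.93, 0.97, 0.99, 0.99]
    position, width = probit_boundary_and_width(steps, ident)
    print(f"boundary position: {position:.2f}, boundary width: {width:.2f}")
```

A steeper identification curve gives a larger slope and therefore a narrower width, so under these definitions a narrow boundary width corresponds to a more abrupt change from one category to the other.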
Position of category boundary
A three-way mixed design repeated measures analysis of variance (ANOVA) was conducted to determine the impact of language group (Mandarin, Cantonese, and German), tone continuum (rising and falling continua), and stimulus context (speech and nonspeech) on the boundary position, with group as the between-subject factor, and continuum
Discussion
This study has examined pitch contour perception by three groups of listeners: one German group, and two Chinese groups (Cantonese and Mandarin). Our results confirm some of the findings reported previously that have contrasted native tone language listeners and non-tone language listeners. Given the different tone inventories of Mandarin and Cantonese, the influence of these two tone systems on pitch contour perception was indeed reflected in the identification and discrimination curves for
Conclusion
In this study, we have examined the influence of different language experience on the perception of pitch contours in the framework of CP. From the identification curves shown in Figs. 3 and 4, we see clearly that changes from one category to another are more abrupt for the two groups of tone language listeners. This is also reflected in the narrower boundary width shown in Table 1, and in Table 3 where the boundary widths for the rising and falling continua were pooled together, especially in
Acknowledgements
The work described in this paper was partially supported by grants from the Shun Hing Institute of Advanced Engineering of The Chinese University of Hong Kong (Project No. BME-8115020), from the National Science Foundation of China (NSFC: 11074267), and from the Research Grant Council of Hong Kong. We thank the members of the Language Engineering Laboratory of The Chinese University of Hong Kong for many helpful discussions. We thank the editor and the two reviewers for their constructive help in
References (41)
- et al. Categorical perception of facial expressions. Cognition (1992)
- Tone perception in Far Eastern languages. Journal of Phonetics (1983)
- et al. Identification and discrimination of Mandarin Chinese tones by Mandarin Chinese vs. French listeners. Journal of Phonetics (2004)
- et al. Non-parametric techniques for pitch-scale and time-scale modification of speech. Speech Communication (1995)
- Duration and intensity as correlates of F0. Journal of Phonetics (1978)
- Tonal experiments with whispered Thai
- The noncategorical perception of tone categories in Thai
- et al. Modern Cantonese phonology (1997)
- Boersma, P., & Weenink, D. (2009). Praat: Doing phonetics by computer....
- et al. Color vision and hue categorization in young infants. Journal of Experimental Psychology: Human Perception & Performance (1976)
- A system of tone letters. Le Maître Phonétique
- On short and long auditory stores. Psychological Bulletin
- Auditory sensory storage in relation to the growth of sensation and acoustic information extraction. Journal of Experimental Psychology: Human Perception and Performance
- Absolute pitch among students in an American music conservatory: Association with tone language fluency. Journal of the Acoustical Society of America
- Absolute pitch among American and Chinese conservatory students: Prevalence differences, and evidence for a speech-related critical period (L). Journal of the Acoustical Society of America
- Speech perception in infants. Science
- Probit analysis
- Loudness, its definition, measurement and calculation. Journal of the Acoustical Society of America
- On the (non)categorical perception of lexical tones. Perception & Psychophysics
- Acoustical studies of Mandarin vowels and tones