Abstract
The McGurk effect, where an incongruent visual syllable influences identification of an auditory syllable, does not always occur, suggesting that perceivers sometimes fail to use relevant visual phonetic information. We tested whether another visual phonetic effect, which involves the influence of visual speaking rate on perceived voicing (Green & Miller, 1985), would occur in instances when the McGurk effect does not. In Experiment 1, we established this visual rate effect using auditory and visual stimuli matching in place of articulation, finding a shift in the voicing boundary along an auditory voice-onset-time continuum with fast versus slow visual speech tokens. In Experiment 2, we used auditory and visual stimuli differing in place of articulation and found a shift in the voicing boundary due to visual rate when the McGurk effect occurred and, more critically, when it did not. The latter finding indicates that phonetically relevant visual information is used in speech perception even when the McGurk effect does not occur, suggesting that the incidence of the McGurk effect underestimates the extent of audiovisual integration.
References
Brancazio, L. (2004). Lexical influences in audiovisual speech perception. Journal of Experimental Psychology: Human Perception & Performance, 30, 445–463.
Brancazio, L., Miller, J. L., & Paré, M. A. (1999). Perceptual effects of place of articulation on voicing for audiovisually discrepant stimuli. Journal of the Acoustical Society of America, 106, 2270.
Brancazio, L., Miller, J. L., & Paré, M. A. (2003). Visual influences on the internal structure of phonetic categories. Perception & Psychophysics, 65, 591–601.
Burnham, D. (1998). Language specificity in the development of auditory-visual speech perception. In R. Campbell, B. Dodd, & D. Burnham (Eds.), Hearing by eye II: Advances in the psychology of speechreading and auditory-visual speech (pp. 27–60). Hove, U.K.: Psychology Press.
Carney, A. E., Clement, B. R., & Cienkowski, K. M. (1999). Talker variability effects in auditory-visual speech perception. Journal of the Acoustical Society of America, 106, 2270.
Chen, T. H., & Massaro, D. W. (2004). Mandarin speech perception by ear and eye follows a universal principle. Perception & Psychophysics, 66, 820–836.
Cohen, J., MacWhinney, B., Flatt, M., & Provost, J. (1993). PsyScope: An interactive graphic system for designing and controlling experiments in the psychology laboratory using Macintosh computers. Behavior Research Methods, Instruments, & Computers, 25, 257–271.
Erber, N. P. (1969). Interaction of audition and vision in the recognition of oral speech stimuli. Journal of Speech & Hearing Research, 12, 423–425.
Fowler, C. A. (1986). An event approach to the study of speech perception from a direct-realist perspective. Journal of Phonetics, 14, 3–28.
Green, K. P. (1998). The use of auditory and visual information during phonetic processing: Implications for theories of speech perception. In R. Campbell, B. Dodd, & D. Burnham (Eds.), Hearing by eye II: Advances in the psychology of speechreading and auditory-visual speech (pp. 3–25). Hove, U.K.: Psychology Press.
Green, K. P., & Gerdeman, A. (1995). Cross-modal discrepancies in coarticulation and the integration of speech information: The McGurk effect with mismatched vowels. Journal of Experimental Psychology: Human Perception & Performance, 21, 1409–1426.
Green, K. P., & Kuhl, P. K. (1989). The role of visual information in the processing of place and manner features in speech perception. Perception & Psychophysics, 45, 34–42.
Green, K. P., Kuhl, P. K., Meltzoff, A. N., & Stevens, E. B. (1991). Integrating speech information across talkers, gender, and sensory modality: Female faces and male voices in the McGurk effect. Perception & Psychophysics, 50, 524–536.
Green, K. P., & Miller, J. L. (1985). On the role of visual rate information in phonetic perception. Perception & Psychophysics, 38, 269–276.
Green, K. P., & Norrix, L. W. (1997). Acoustic cues to place of articulation and the McGurk effect: The role of release bursts, aspiration, and formant transitions. Journal of Speech & Hearing Research, 40, 646–665.
Jordan, T. R., & Bevan, K. (1997). Seeing and hearing rotated faces: Influences of facial orientation on visual and audiovisual speech recognition. Journal of Experimental Psychology: Human Perception & Performance, 23, 388–403.
Lisker, L., & Abramson, A. S. (1964). A cross-language study of voicing in initial stops: Acoustical measurements. Word, 20, 384–422.
Lisker, L., & Abramson, A. S. (1970). The voicing dimension: Some experiments in comparative phonetics. In Proceedings of the Sixth International Congress of Phonetic Sciences (pp. 563–567). Prague: Academia.
MacDonald, J., Andersen, S., & Bachmann, T. (2000). Hearing by eye: How much spatial degradation can be tolerated? Perception, 29, 1155–1168.
MacDonald, J., & McGurk, H. (1978). Visual influences on speech perception processes. Perception & Psychophysics, 24, 253–257.
Manuel, S. Y., Repp, B. H., Liberman, A. M., & Studdert-Kennedy, M. (1983, November). Exploring the "McGurk effect." Paper presented at the 24th meeting of the Psychonomic Society, San Diego.
Massaro, D. W. (1987). Speech perception by ear and eye: A paradigm for psychological inquiry. Hillsdale, NJ: Erlbaum.
Massaro, D. W. (1998). Perceiving talking faces: From speech perception to a behavioral principle. Cambridge, MA: MIT Press.
Massaro, D. W., & Cohen, M. M. (1983). Evaluation and integration of visual and auditory information in speech perception. Journal of Experimental Psychology: Human Perception & Performance, 9, 753–771.
Massaro, D. W., & Cohen, M. M. (1996). Perceiving speech from inverted faces. Perception & Psychophysics, 58, 1047–1065.
Massaro, D. W., Cohen, M. M., Gesi, A., Heredia, R., & Tsuzaki, M. (1993). Bimodal speech perception: An examination across languages. Journal of Phonetics, 21, 445–478.
Massaro, D. W., Cohen, M. M., & Smeele, P. M. T. (1995). Crosslinguistic comparisons in the integration of visual and auditory speech. Memory & Cognition, 23, 113–131.
McGurk, H., & MacDonald, J. (1976). Hearing lips and seeing voices. Nature, 264, 746–748.
Miller, J. L. (1977). Nonindependence of feature processing in initial consonants. Journal of Speech & Hearing Research, 20, 519–528.
Miller, J. L. (1981). Effects of speaking rate on segmental distinctions. In P. D. Eimas & J. L. Miller (Eds.), Perspectives on the study of speech (pp. 39–74). Hillsdale, NJ: Erlbaum.
Pitt, M. A. (1995). The locus of the lexical shift in phoneme identification. Journal of Experimental Psychology: Learning, Memory, & Cognition, 21, 1037–1052.
Pitt, M. A., & Samuel, A. G. (1993). An empirical and meta-analytic evaluation of the phoneme identification task. Journal of Experimental Psychology: Human Perception & Performance, 19, 699–725.
Rosenblum, L. D., & Saldaña, H. M. (1996). An audiovisual test of kinematic primitives for visual speech perception. Journal of Experimental Psychology: Human Perception & Performance, 22, 318–331.
Schwartz, J.-L., Robert-Ribes, J., & Escudier, P. (1998). Ten years after Summerfield: A taxonomy of models for audio-visual fusion in speech perception. In R. Campbell, B. Dodd, & D. Burnham (Eds.), Hearing by eye II: Advances in the psychology of speechreading and auditory-visual speech (pp. 85–108). Hove, U.K.: Psychology Press.
Sekiyama, K. (1997). Cultural and linguistic factors in audiovisual speech processing: The McGurk effect in Chinese subjects. Perception & Psychophysics, 59, 73–80.
Sekiyama, K., & Tohkura, Y. (1991). McGurk effect in non-English listeners: Few visual effects for Japanese subjects hearing Japanese syllables of high auditory intelligibility. Journal of the Acoustical Society of America, 90, 1797–1805.
Sekiyama, K., & Tohkura, Y. (1993). Inter-language differences in the influence of visual cues in speech perception. Journal of Phonetics, 21, 427–444.
Sumby, W. H., & Pollack, I. (1954). Visual contribution to speech intelligibility in noise. Journal of the Acoustical Society of America, 26, 212–215.
Summerfield, Q. (1981). Articulatory rate and perceptual constancy in phonetic perception. Journal of Experimental Psychology: Human Perception & Performance, 7, 1074–1095.
Summerfield, Q. (1987). Some preliminaries to a comprehensive account of audio-visual speech perception. In B. Dodd & R. Campbell (Eds.), Hearing by eye: The psychology of lip reading (pp. 3–51). Hillsdale, NJ: Erlbaum.
Volaitis, L. E., & Miller, J. L. (1992). Phonetic prototypes: Influence of place of articulation and speaking rate on the internal structure of voicing categories. Journal of the Acoustical Society of America, 92, 723–735.
Werker, J. F., Frost, P. E., & McGurk, H. (1992). La langue et les lèvres: Cross-language influences on bimodal speech perception. Canadian Journal of Psychology, 46, 551–568.
Additional information
This research was supported by NIH postdoctoral fellowship F32 DC00373, awarded to L. B., and NIH Grant R01 DC00130, awarded to J. L. M. Preparation of the manuscript was also supported by NIH Grant HD01994, awarded to Haskins Laboratories.
Cite this article
Brancazio, L., Miller, J.L. Use of visual information in speech perception: Evidence for a visual rate effect both with and without a McGurk effect. Perception & Psychophysics 67, 759–769 (2005). https://doi.org/10.3758/BF03193531