Abstract
Three experiments are reported on the influence of different timing relations on the McGurk effect. In the first experiment, it is shown that strict temporal synchrony between auditory and visual speech stimuli is not required for the McGurk effect. Subjects were strongly influenced by the visual stimuli when the auditory stimuli lagged the visual stimuli by as much as 180 msec. In addition, a stronger McGurk effect was found when the visual and auditory vowels matched. In the second experiment, we paired auditory and visual speech stimuli produced under different speaking conditions (fast, normal, clear). The results showed that the manipulations in both the visual and auditory speaking conditions independently influenced perception. In addition, there was a small but reliable tendency for the better matched stimuli to elicit more McGurk responses than unmatched conditions. In the third experiment, we combined auditory and visual stimuli produced under different speaking conditions (fast, clear) and delayed the acoustics with respect to the visual stimuli. The subjects showed the same pattern of results as in the second experiment. Finally, the delay did not cause different patterns of results for the different audiovisual speaking style combinations. The results suggest that perceivers may be sensitive to the concordance of the time-varying aspects of speech but they do not require temporal coincidence of that information.
Additional information
This work was supported by a grant from NSERC and NIH Grant DC-00594.
Cite this article
Munhall, K.G., Gribble, P., Sacco, L. et al. Temporal constraints on the McGurk effect. Perception & Psychophysics 58, 351–362 (1996). https://doi.org/10.3758/BF03206811
DOI: https://doi.org/10.3758/BF03206811