The Influence of Vision on Auditory Communication in Primates

Neural Correlates of Auditory Cognition

Part of the book series: Springer Handbook of Auditory Research ((SHAR,volume 45))

Abstract

All primate vocal signals are produced through the coordinated movements of the lungs, the larynx (vocal folds), and the supralaryngeal vocal tract (Fitch & Hauser, 1995; Ghazanfar & Rendall, 2008). As a result, humans and other primates share a remarkable number of similarities in their vocal signaling, ranging from production mechanisms and signal structure to the inextricable link between vision and audition. In human speech, the signal is amplitude modulated across all languages and contexts, with a rhythm that ranges from 2 to 7 Hz (Drullman, 1995; Greenberg et al., 2003; Chandrasekaran et al., 2009), roughly matching the timescale of syllable production. Temporal modulation in similar frequency ranges also seems to be a common feature of several nonhuman primate vocalizations. For example, common marmoset (Callithrix jacchus) twitter calls are modulated in the 5–9 Hz range (Wang et al., 1995); squirrel monkey (Saimiri sciureus) vocalizations in the 6–10 Hz range (Godey et al., 2005); and cotton-top tamarin (Saguinus oedipus), macaque (Macaca spp.), and chimpanzee (Pan troglodytes) vocalizations all seem to be modulated in the 3–8 Hz range (Cohen et al., 2007).
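
As a rough illustration of what it means for a vocal signal to be amplitude modulated at a syllable-like rate, the following minimal sketch builds a synthetic tone whose amplitude fluctuates at 5 Hz and then recovers that rate from the signal's envelope. It is not the analysis used in the studies cited above: the signal is artificial, the envelope is approximated by simple rectification, and all names and parameter values (`modulation_spectrum`, `dominant_modulation`, the 220 Hz carrier, the 1000 Hz sample rate) are assumptions chosen for illustration.

```python
import math


def modulation_spectrum(signal, sample_rate, freqs):
    """Power of the rectified, mean-removed signal at each candidate frequency.

    Rectification (absolute value) is a crude stand-in for envelope
    extraction; it is an assumption of this sketch, not a published method.
    """
    envelope = [abs(x) for x in signal]
    mean = sum(envelope) / len(envelope)
    centered = [e - mean for e in envelope]
    n_total = len(centered)
    powers = []
    for f in freqs:
        # Direct evaluation of one discrete Fourier component at frequency f.
        re = sum(c * math.cos(2 * math.pi * f * n / sample_rate)
                 for n, c in enumerate(centered))
        im = sum(c * math.sin(2 * math.pi * f * n / sample_rate)
                 for n, c in enumerate(centered))
        powers.append((re * re + im * im) / n_total ** 2)
    return powers


def dominant_modulation(signal, sample_rate, freqs):
    """Candidate frequency carrying the most envelope power."""
    powers = modulation_spectrum(signal, sample_rate, freqs)
    best = max(range(len(freqs)), key=powers.__getitem__)
    return freqs[best]


# Hypothetical test signal: a 220 Hz tone amplitude modulated at 5 Hz.
SAMPLE_RATE = 1000      # samples per second (assumed)
DURATION = 2.0          # seconds (assumed)
CARRIER_HZ = 220.0      # assumed carrier frequency
MODULATION_HZ = 5.0     # assumed modulation rate, inside the 2-7 Hz range
DEPTH = 0.8             # assumed modulation depth

n_samples = int(SAMPLE_RATE * DURATION)
signal = [
    (1 + DEPTH * math.cos(2 * math.pi * MODULATION_HZ * n / SAMPLE_RATE))
    * math.sin(2 * math.pi * CARRIER_HZ * n / SAMPLE_RATE)
    for n in range(n_samples)
]

# Candidate modulation rates from 1 Hz to 20 Hz in 0.5 Hz steps.
candidates = [0.5 * k for k in range(2, 41)]
peak = dominant_modulation(signal, SAMPLE_RATE, candidates)
print(f"Dominant modulation rate: {peak} Hz")
```

With real recordings, a band-limited envelope (for example, from a Hilbert transform followed by low-pass filtering) would normally replace the rectification step, and the peak would be broader than in this idealized case.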

References

  • Abry, C., Lallouache, M. T., & Cathiard, M. A. (1996). How can coarticulation models account for speech sensitivity in audio-visual desynchronization? In D. Stork & M. Henneke (Eds.), Speechreading by humans and machines: Models, systems and applications (pp. 247–255). Berlin: Springer-Verlag.

  • Barnes, C. L., & Pandya, D. N. (1992). Efferent cortical connections of multimodal cortex of the superior temporal sulcus in the rhesus monkey. Journal of Comparative Neurology, 318(2), 222–244.

  • Barraclough, N. E., Xiao, D., Baker, C. I., Oram, M. W., & Perrett, D. I. (2005). Integration of visual and auditory information by superior temporal sulcus neurons responsive to the sight of actions. Journal of Cognitive Neuroscience, 17(3), 377–391.

  • Baylis, G. C., Rolls, E. T., & Leonard, C. M. (1987). Functional subdivisions of the temporal lobe neocortex. The Journal of Neuroscience, 7(2), 330–342.

  • Bernstein, L. E., Auer, E. T., & Takayanagi, S. (2004). Auditory speech detection in noise enhanced by lipreading. Speech Communication, 44(1–4), 5–18.

  • Besle, J., Fort, A., Delpuech, C., & Giard, M.-H. (2004). Bimodal speech: Early suppressive visual effects in human auditory cortex. European Journal of Neuroscience, 20(8), 2225–2234.

  • Bruce, C., Desimone, R., & Gross, C. G. (1981). Visual properties of neurons in a polysensory area in superior temporal sulcus of the macaque. Journal of Neurophysiology, 46(2), 369–384.

  • Buzsaki, G., & Draguhn, A. (2004). Neuronal oscillations in cortical networks. Science, 304, 1926–1929.

  • Cappe, C., & Barone, P. (2005). Heteromodal connections supporting multisensory integration at low levels of cortical processing in the monkey. European Journal of Neuroscience, 22(11), 2886–2902.

  • Chandrasekaran, C., & Ghazanfar, A. A. (2009). Different neural frequency bands integrate faces and voices differently in the superior temporal sulcus. Journal of Neurophysiology, 101(2), 773–788.

  • Chandrasekaran, C., Trubanova, A., Stillittano, S., Caplier, A., & Ghazanfar, A. A. (2009). The natural statistics of audiovisual speech. PLoS Computational Biology, 5(7), e1000436.

  • Chandrasekaran, C., Lemus, L., Trubanova, A., Gondan, M., & Ghazanfar, A. A. (2011). Monkeys and humans share a common computation for face/voice integration. PLoS Computational Biology, 7, e1002165.

  • Cohen, Y. E., Theunissen, F., Russ, B. E., & Gill, P. (2007). Acoustic features of rhesus vocalizations and their representation in the ventrolateral prefrontal cortex. Journal of Neurophysiology, 97(2), 1470–1484.

  • Driver, J., & Noesselt, T. (2008). Multisensory interplay reveals crossmodal influences on ‘sensory-specific’ brain regions, neural responses, and judgments. Neuron, 57(1), 11–23.

  • Drullman, R. (1995). Temporal envelope and fine structure cues for speech intelligibility. The Journal of the Acoustical Society of America, 97(1), 585–592.

  • Drullman, R., Festen, J. M., & Plomp, R. (1994). Effect of reducing slow temporal modulations on speech reception. The Journal of the Acoustical Society of America, 95(5 Pt 1), 2670–2680.

  • Ettlinger, G., & Wilson, W. A. (1990). Cross-modal performance: Behavioural processes, phylogenetic considerations and neural mechanisms. Behavioural Brain Research, 40, 169–192.

  • Evans, T. A., Howell, S., & Westergaard, G. C. (2005). Auditory-visual cross-modal perception of communicative stimuli in tufted capuchin monkeys (Cebus apella). Journal of Experimental Psychology: Animal Behavior Processes, 31, 399–406.

  • Fitch, W. T. (1997). Vocal tract length and formant frequency dispersion correlate with body size in rhesus macaques. The Journal of the Acoustical Society of America, 102(2), 1213–1222.

  • Fitch, W. T., & Hauser, M. (1995). Vocal production in nonhuman primates: Acoustics, physiology, and functional constraints on “honest” advertisement. American Journal of Primatology, 37(3), 191–219.

  • Fu, K. M. G., Shah, A. S., O’Connell, M. N., McGinnis, T., Eckholdt, H., Lakatos, P., et al. (2004). Timing and laminar profile of eye-position effects on auditory responses in primate auditory cortex. Journal of Neurophysiology, 92(6), 3522–3531.

  • Ghazanfar, A. A., & Chandrasekaran, C. F. (2007). Paving the way forward: Integrating the senses through phase-resetting of cortical oscillations. Neuron, 53(2), 162–164.

  • Ghazanfar, A. A., & Logothetis, N. K. (2003). Facial expressions linked to monkey calls. Nature, 423(6943), 937–938.

  • Ghazanfar, A. A., & Rendall, D. (2008). Evolution of human vocal production. Current Biology, 18(11), R457–460.

  • Ghazanfar, A. A., & Schroeder, C. E. (2006). Is neocortex essentially multisensory? Trends in Cognitive Sciences, 10(6), 278–285.

  • Ghazanfar, A. A., Smith-Rohrberg, D., & Hauser, M. D. (2001). The role of temporal cues in rhesus monkey vocal recognition: Orienting asymmetries to reversed calls. Brain, Behavior and Evolution, 58(3), 163–172.

  • Ghazanfar, A. A., Smith-Rohrberg, D., Pollen, A. A., & Hauser, M. D. (2002). Temporal cues in the perception of long calls by tamarins. Animal Behaviour, 64, 427–438.

  • Ghazanfar, A. A., Maier, J. X., Hoffman, K. L., & Logothetis, N. K. (2005). Multisensory integration of dynamic faces and voices in rhesus monkey auditory cortex. The Journal of Neuroscience, 25(20), 5004–5012.

  • Ghazanfar, A. A., Nielsen, K., & Logothetis, N. K. (2006). Eye movements of monkey observers viewing vocalizing conspecifics. Cognition, 101(3), 515–529.

  • Ghazanfar, A. A., Turesson, H. K., Maier, J. X., van Dinther, R., Patterson, R. D., & Logothetis, N. K. (2007). Vocal-tract resonances as indexical cues in rhesus monkeys. Current Biology, 17(5), 425–430.

  • Ghazanfar, A. A., Chandrasekaran, C., & Logothetis, N. K. (2008). Interactions between the superior temporal sulcus and auditory cortex mediate dynamic face/voice integration in rhesus monkeys. The Journal of Neuroscience, 28(17), 4457–4469.

  • Ghazanfar, A. A., Chandrasekaran, C., & Morrill, R. J. (2010). Dynamic, rhythmic facial expressions and the superior temporal sulcus of macaque monkeys: Implications for the evolution of audiovisual speech. European Journal of Neuroscience, 31(10), 1807–1817.

  • Godey, B. T., Atencio, C. A., Bonham, B. H., Schreiner, C. E., & Cheung, S. W. (2005). Functional organization of squirrel monkey primary auditory cortex: Responses to frequency-modulation sweeps. Journal of Neurophysiology, 94(2), 1299–1311.

  • Gothard, K. M., Battaglia, F. P., Erickson, C. A., Spitler, K. M., & Amaral, D. G. (2007). Neural responses to facial expression and face identity in the monkey amygdala. Journal of Neurophysiology, 97(2), 1671–1683.

  • Greenberg, S., Carvey, H., Hitchcock, L., & Chang, S. (2003). Temporal properties of spontaneous speech—a syllable-centric perspective. Journal of Phonetics, 31(3–4), 465–485.

  • Hauser, M. D., & Andersson, K. (1994). Left hemisphere dominance for processing vocalizations in adult, but not infant, rhesus monkeys: Field experiments. Proceedings of the National Academy of Sciences of the USA, 91(9), 3946–3948.

  • Hauser, M. D., & Ybarra, M. S. (1994). The role of lip configuration in monkey vocalizations—experiments using xylocaine as a nerve block. Brain and Language, 46(2), 232–244.

  • Hauser, M. D., Evans, C. S., & Marler, P. (1993). The role of articulation in the production of rhesus-monkey, Macaca mulatta, vocalizations. Animal Behaviour, 45(3), 423–433.

  • Hauser, M. D., Agnetta, B., & Perez, C. (1998). Orienting asymmetries in rhesus monkeys: The effect of time-domain changes on acoustic perception. Animal Behaviour, 56(1), 41–47.

  • Helfer, K. S., & Freyman, R. L. (2005). The role of visual speech cues in reducing energetic and informational masking. The Journal of the Acoustical Society of America, 117(2), 842–849.

  • Hikosaka, K., Iwai, E., Saito, H., & Tanaka, K. (1988). Polysensory properties of neurons in the anterior bank of the caudal superior temporal sulcus of the macaque monkey. Journal of Neurophysiology, 60(5), 1615–1637.

  • Jordan, K. E., Brannon, E. M., Logothetis, N. K., & Ghazanfar, A. A. (2005). Monkeys match the number of voices they hear to the number of faces they see. Current Biology, 15(11), 1034–1038.

  • Kayser, C., & Logothetis, N. K. (2009). Directed interactions between auditory and superior temporal cortices and their role in sensory integration. Frontiers in Integrative Neuroscience, 3, 7. doi: 10.3389/neuro.07.007.2009.

  • Kayser, C., Petkov, C. I., & Logothetis, N. K. (2008). Visual modulation of neurons in auditory cortex. Cerebral Cortex, 18, 1560–1574.

  • Klin, A., Jones, W., Schultz, R., Volkmar, F., & Cohen, D. (2002). Visual fixation patterns during viewing of naturalistic social situations as predictors of social competence in individuals with autism. Archives of General Psychiatry, 59, 809–816.

  • Kuraoka, K., & Nakamura, K. (2007). Responses of single neurons in monkey amygdala to facial and vocal emotions. Journal of Neurophysiology, 97(2), 1379–1387.

  • Lakatos, P., Shah, A. S., Knuth, K. H., Ulbert, I., Karmos, G., & Schroeder, C. E. (2005). An oscillatory hierarchy controlling neuronal excitability and stimulus processing in the auditory cortex. Journal of Neurophysiology, 94(3), 1904–1911.

  • Lakatos, P., Chen, C.-M., O’Connell, M. N., Mills, A., & Schroeder, C. E. (2007). Neuronal oscillations and multisensory interaction in primary auditory cortex. Neuron, 53(2), 279–292.

  • Lansing, C. R., & McConkie, G. W. (2003). Word identification and eye fixation locations in visual and visual-plus-auditory presentations of spoken sentences. Perception & Psychophysics, 65(4), 536–552.

  • Logothetis, N. K. (2002). The neural basis of the blood-oxygen-level-dependent functional magnetic resonance imaging signal. Philosophical Transactions of the Royal Society B: Biological Sciences, 357(1424), 1003–1037.

  • MacNeilage, P. F. (1998). The frame/content theory of evolution of speech production. Behavioral and Brain Sciences, 21(4), 499–511.

  • Munhall, K. G., & Tohkura, Y. (1998). Audiovisual gating and the time course of speech perception. The Journal of the Acoustical Society of America, 104(1), 530–539.

  • Munhall, K. G., & Vatikiotis-Bateson, E. (2004). Spatial and temporal constraints on audiovisual speech perception. In G. A. Calvert, C. Spence, & B. E. Stein (Eds.), The handbook of multisensory processes (pp. 177–188). Cambridge, MA: MIT Press.

  • Okun, M., Naim, A., & Lampl, I. (2010). The subthreshold relation between cortical local field potential and neuronal firing unveiled by intracellular recordings in awake rats. The Journal of Neuroscience, 30(12), 4440–4448.

  • Otero-Millan, J., Troncoso, X. G., Macknik, S. L., Serrano-Pedraza, I., & Martinez-Conde, S. (2008). Saccades and microsaccades during visual fixation, exploration, and search: Foundations for a common saccadic generator. Journal of Vision, 8(14), 1–18.

  • Parr, L. A. (2004). Perceptual biases for multimodal cues in chimpanzee (Pan troglodytes) affect recognition. Animal Cognition, 7(3), 171–178.

  • Partan, S. R. (2002). Single and multichannel facial composition: Facial expressions and vocalizations of rhesus macaques (Macaca mulatta). Behaviour, 139, 993–1027.

  • Pevsner, J. (2002). Leonardo da Vinci’s contributions to neuroscience. Trends in Neurosciences, 25, 217–220.

  • Puce, A., & Perrett, D. I. (2003). Electrophysiology and brain imaging of biological motion. Philosophical Transactions of the Royal Society B: Biological Sciences, 358(1431), 435–445.

  • Rajkai, C., Lakatos, P., & Schroeder, C. E. (2008). Visual fixation-related neuronal activity in the auditory cortex of monkeys. Paper presented at the Society for Neuroscience, Washington, DC, November 15–19. Abstract 770.2.

  • Redican, W. K. (1975). Facial expressions in nonhuman primates. In L. A. Rosenblum (Ed.), Primate behavior: Developments in field and laboratory research (pp. 103–194). New York: Academic Press.

  • Romanski, L. M., Averbeck, B. B., & Diltz, M. (2005). Neural representation of vocalizations in the primate ventrolateral prefrontal cortex. Journal of Neurophysiology, 93(2), 734–747.

  • Ross, L. A., Saint-Amour, D., Leavitt, V. M., Javitt, D. C., & Foxe, J. J. (2007). Do you see what I am saying? Exploring visual enhancement of speech comprehension in noisy environments. Cerebral Cortex, 17(5), 1147–1153.

  • Scalaidhe, S. P., Albright, T. D., Rodman, H. R., & Gross, C. G. (1995). Effects of superior temporal polysensory area lesions on eye movements in the macaque monkey. Journal of Neurophysiology, 73(1), 1–19.

  • Schroeder, C. E., Lakatos, P., Kajikawa, Y., Partan, S., & Puce, A. (2008). Neuronal oscillations and visual amplification of speech. Trends in Cognitive Sciences, 12(3), 106–113.

  • Schwartz, J.-L., Berthommier, F., & Savariaux, C. (2004). Seeing to hear better: evidence for early audio-visual interactions in speech identification. Cognition, 93(2), B69–B78.

  • Seltzer, B., & Pandya, D. N. (1994). Parietal, temporal, and occipital projections to cortex of the superior temporal sulcus in the rhesus monkey: A retrograde tracer study. Journal of Comparative Neurology, 343(3), 445–463.

  • Senkowski, D., Schneider, T. R., Foxe, J. J., & Engel, A. K. (2008). Crossmodal binding through neural coherence: Implications for multisensory processing. Trends in Neurosciences, 31(8), 401–409.

  • Shannon, R. V., Zeng, F.-G., Kamath, V., Wygonski, J., & Ekelid, M. (1995). Speech recognition with primarily temporal cues. Science, 270(5234), 303–304.

  • Shepherd, S. V., Steckenfinger, S. A., Hasson, U., & Ghazanfar, A. A. (2010). Human-monkey gaze correlations reveal convergent and divergent patterns of movie viewing. Current Biology, 20, 649–656.

  • Smith, Z. M., Delgutte, B., & Oxenham, A. J. (2002). Chimaeric sounds reveal dichotomies in auditory perception. Nature, 416(6876), 87–90.

  • Sugihara, T., Diltz, M. D., Averbeck, B. B., & Romanski, L. M. (2006). Integration of auditory and visual communication information in the primate ventrolateral prefrontal cortex. The Journal of Neuroscience, 26(43), 11138–11147.

  • Sumby, W. H., & Pollack, I. (1954). Visual contribution to speech intelligibility in noise. Journal of the Acoustical Society of America, 26(2), 212–215.

  • Teufel, C., Ghazanfar, A. A., & Fischer, J. (2010). On the relationship between lateralized brain function and orienting asymmetries. Behavioral Neuroscience, 124, 437–445.

  • van der Horst, R., Leeuw, A. R., & Dreschler, W. A. (1999). Importance of temporal-envelope cues in consonant recognition. Journal of the Acoustical Society of America, 105(3), 1801–1809.

  • van Wassenhove, V., Grant, K. W., & Poeppel, D. (2005). Visual speech speeds up the neural processing of auditory speech. Proceedings of the National Academy of Sciences of the USA, 102(4), 1181–1186.

  • van Wassenhove, V., Grant, K. W., & Poeppel, D. (2007). Temporal window of integration in auditory-visual speech perception. Neuropsychologia, 45(3), 598–607.

  • Vatikiotis-Bateson, E., Eigsti, I. M., Yano, S., & Munhall, K. G. (1998). Eye movement of perceivers during audiovisual speech perception. Perception & Psychophysics, 60(6), 926–940.

  • Wang, X., Merzenich, M. M., Beitel, R., & Schreiner, C. E. (1995). Representation of a species-specific vocalization in the primary auditory cortex of the common marmoset: Temporal and spectral characteristics. Journal of Neurophysiology, 74(6), 2685–2706.

  • Werner-Reiss, U., Kelly, K. A., Trause, A. S., Underhill, A. M., & Groh, J. M. (2003). Eye position affects activity in primary auditory cortex of primates. Current Biology, 13(7), 554–562.

  • Yehia, H. C., Kuratate, T., & Vatikiotis-Bateson, E. (2002). Linking facial animation, head motion and speech acoustics. Journal of Phonetics, 30(3), 555–568.

Acknowledgments

The authors gratefully acknowledge the scientific contributions of, and numerous discussions with, Ipek Kulahci, Joost Maier, Darshana Narayanan, Stephen Shepherd, and Daniel Takahashi. This work was supported by NIH R01NS054898, an NSF BCS-0547760 CAREER Award, and the James S. McDonnell Scholar Award.

Author information

Corresponding author

Correspondence to Asif A. Ghazanfar.

Copyright information

© 2013 Springer Science+Business Media New York

About this chapter

Cite this chapter

Ghazanfar, A.A., Chandrasekaran, C. (2013). The Influence of Vision on Auditory Communication in Primates. In: Cohen, Y., Popper, A., Fay, R. (eds) Neural Correlates of Auditory Cognition. Springer Handbook of Auditory Research, vol 45. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-2350-8_7
