NeuroImage

Volume 175, 15 July 2018, Pages 56-69

Subcortical sources dominate the neuroelectric auditory frequency-following response to speech

https://doi.org/10.1016/j.neuroimage.2018.03.060

Abstract

Frequency-following responses (FFRs) are neurophonic potentials that provide a window into the encoding of complex sounds (e.g., speech/music), auditory disorders, and neuroplasticity. While the neural origins of the FFR have long been debated, controversy reemerged after a demonstration that FFRs recorded via magnetoencephalography (MEG) are dominated by cortical rather than brainstem structures, contrary to what was previously assumed. Here, we recorded high-density (64-channel) FFRs via EEG and applied state-of-the-art source imaging techniques to multichannel data (discrete dipole modeling, distributed imaging, independent component analysis, computational simulations). Our data confirm a mixture of generators localized to bilateral auditory nerve (AN), brainstem inferior colliculus (BS), and bilateral primary auditory cortex (PAC). However, frequency-specific scrutiny of source waveforms showed the relative contribution of these nuclei to the aggregate FFR varied across stimulus frequencies. Whereas AN and BS sources produced robust FFRs up to ∼700 Hz, PAC showed weak phase-locking with little FFR energy above the speech fundamental (100 Hz). Notably, CLARA imaging further showed PAC activation was eradicated for FFRs >150 Hz, above which only subcortical sources remained active. Our results show (i) the site of FFR generation varies critically with stimulus frequency; and (ii) opposite the pattern observed in MEG, subcortical structures make the largest contribution to electrically recorded FFRs (AN ≥ BS > PAC). We infer that cortical dominance observed in previous neuromagnetic data is likely due to the bias of MEG to superficial brain tissue, underestimating subcortical structures that drive most of the speech-FFR. Cleanly separating subcortical from cortical FFRs can be achieved by ensuring stimulus frequencies are >150–200 Hz, above the phase-locking limit of cortical neurons.

Introduction

The auditory frequency-following response (FFR) is a neurophonic potential recorded at the scalp that reflects sustained, phase-locked neural ensemble activity of the auditory system. FFRs reflect the neural encoding of dynamic, spectrotemporal features of periodic acoustic stimuli and consequently, provide a “neural fingerprint” of sound within the human electroencephalogram (EEG). The remarkable fidelity of FFRs is evident in behavioral experiments in which the neural responses are replayed as audio signals. These studies demonstrate that FFRs evoked by speech sounds (neural potentials) are highly intelligible to the point they can be reliably identified by external observers (Bidelman, 2018; Galbraith et al., 1995; Weiss and Bidelman, 2015). Given their remarkable spectrotemporal detail, FFRs have provided important insight into auditory processing including individual differences in speech listening skills (Anderson et al., 2011; Bidelman, 2017b; Bidelman and Alain, 2015; Song et al., 2011), neuroplasticity of learning and language experience (Carcagno and Plack, 2011; Chandrasekaran et al., 2012; Kraus and Chandrasekaran, 2010; Krishnan and Gandour, 2009; Krizman et al., 2012), the neurobiology of music (Bidelman, 2013; Bidelman and Krishnan, 2009, 2011; Bidelman et al., 2011c; Bones et al., 2014; Cousineau et al., 2015), auditory aging (Anderson et al., 2013; Bidelman et al., 2014a; Parthasarathy and Bartlett, 2012), and abnormal encoding of complex sounds in clinical populations (Bellier et al., 2015b; Bidelman et al., 2017; Billiet and Bellis, 2011; Chandrasekaran et al., 2009; Cunningham et al., 2001; Kraus et al., 2017; Rocha-Muniz et al., 2012; Song et al., 2008; White-Schwoch et al., 2015).
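The idea of replaying a neural response as an audio signal can be sketched in a few lines. The example below is a minimal illustration, not the procedure used in the cited studies: the function name `ffr_to_wav`, the peak normalization, and the 16-bit mono format are all assumptions made here for demonstration.

```python
import wave

import numpy as np


def ffr_to_wav(ffr, fs, path, repeats=1):
    """Write a 1-D response waveform to a 16-bit mono WAV file for listening.

    `ffr` is an averaged response waveform (hypothetical input), `fs` its
    sampling rate in Hz, and `repeats` loops the waveform to lengthen playback.
    """
    x = np.asarray(ffr, dtype=float)
    x = x - x.mean()                      # remove any DC offset
    peak = np.max(np.abs(x))
    if peak > 0:
        x = x / peak                      # scale to the range [-1, 1]
    pcm = (np.tile(x, repeats) * 32767).astype("<i2")
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)                # mono
        wf.setsampwidth(2)                # 2 bytes = 16-bit samples
        wf.setframerate(int(fs))
        wf.writeframes(pcm.tobytes())
    return pcm
```

Because scalp responses are tiny in amplitude, some rescaling is needed before playback; peak normalization is only one simple choice.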

Despite an abundance of FFR studies, surprisingly little is understood about the response's basic characteristics, first among them its anatomical origin(s) (i.e., source generators). Converging evidence from single-unit recordings in animal models, human scalp M/EEG, and lesion studies in several species suggests that an array of sources contributes to FFR generation (Bidelman, 2015b; Chandrasekaran and Kraus, 2010; Coffey et al., 2016; Smith et al., 1975; Sohmer et al., 1977). Many of these studies have proposed the inferior colliculus (IC) of the brainstem as the FFR's primary neural generator. Several observations support a midbrain origin: (i) the short latency of the response (∼5–10 ms) aligns with first spike latencies in the IC (Langner and Schreiner, 1988) and is earlier than possible from cortical generators (Liégeois-Chauvel et al., 1994); (ii) FFRs contain phase-locked activity (e.g., >1000 Hz) well beyond the upper limit of phase-locking of cortical neurons (i.e., ∼100 Hz) (Aiken and Picton, 2008; Akhoun et al., 2008; Wallace et al., 2000); (iii) there is a high correspondence between far-field and near-field intracranial FFRs recorded directly from the IC (Smith et al., 1975); (iv) cryogenic cooling of the IC results in disappearance of FFRs within the colliculi and at the scalp (Smith et al., 1975); and (v) the response is eradicated with focal lesions to the IC (Sohmer and Pratt, 1977).

Nevertheless, the dominant sources of the FFR are likely to vary across species. In cat, the auditory nerve (AN), cochlear nucleus (CN) and other olivary nuclei may contribute ∼50% of the amplitude to the scalp-recorded FFR, with weaker contributions from IC (Gardi et al., 1979). One study has also documented FFRs recorded from medial geniculate body (MGB) (Weinberger et al., 1970). However, Smith et al. (1975) noted that the small amplitude of these responses may have been a far-field radiation of FFRs picked up from lower nuclei and concluded it improbable that the scalp-FFR originates from a locus rostral to IC. Still, it is clear from these early invasive and cross-species studies that a mixture of brainstem sources is involved in the generation of FFRs (Bidelman, 2015b; Chandrasekaran and Kraus, 2010; Stillman et al., 1978; Tichko and Skoe, 2017).

Localizing scalp potentials in humans is challenged by the fact that far-field potentials are volume conducted to the scalp, making it difficult to circumscribe evoked responses to any one anatomical site. In a recent study (Bidelman, 2015b), we recorded high-density (64 channel) speech-evoked FFRs, which allowed us the unique opportunity to map the FFR's topography and triangulate the response using source modeling techniques commonly applied to the auditory cortical event-related potentials (ERPs) (Picton et al., 1999). Dipole mapping and 3-channel Lissajous analyses (Pratt et al., 1984, 1987) were used to localize the most likely FFR generator and the orientation of its voltage trajectory in 3D space. We found that FFRs were described by two proximal dipole sources within the midbrain (i.e., brainstem IC) each having an oblique, fronto-centrally oriented voltage gradient that oscillated parallel to the brainstem. Our findings were consistent with a midbrain (IC) origin to the FFR noted in previous studies (Bidelman, 2015b; Smith et al., 1975; Sohmer and Pratt, 1977; Zhang and Gong, 2017). However, we also demonstrated that the strength of the response varied dramatically across the scalp; electrodes near the mastoids were able to record FFRs up to ∼1100 Hz whereas locations over cerebral sites showed FFRs only up to ∼100 Hz, corresponding to the fundamental frequency (F0) of our speech stimulus (see Fig. 6 of Bidelman, 2015b). The presence of higher-frequency FFRs near the mastoids is consistent with the notion that these channels predominantly record more peripheral following responses, i.e., cochlear microphonic or neural FFRs emitted from auditory nerve (AN) (Chimento and Schreiner, 1990; Marsh et al., 1970).

Despite ample evidence for a brainstem origin (Bidelman, 2015b; Gardi et al., 1979; Smith et al., 1975; Sohmer et al., 1977), renewed controversy surrounding the FFR's source(s) emerged after demonstration that FFRs to the voice pitch (F0 = 100 Hz) of speech were observable in MEG responses recorded near auditory cortex (Coffey et al., 2016). Moreover, distributed source analysis suggested that auditory cortex accounted for the highest relative percentage of the neuromagnetic FFR signal, with brainstem nuclei (e.g., CN, IC, MGB) accounting for surprisingly little (∼10%) of the response variance (see Fig. S6 of Coffey et al., 2016). Since the FFR is typically interpreted as reflecting a subcortical (brainstem) origin (Chandrasekaran and Kraus, 2010; Krishnan and Gandour, 2009; Skoe and Kraus, 2010; Tzounopoulos and Kraus, 2009; Wong et al., 2007), a cortical contribution would qualify theoretical understanding of the response and the locus of experience-dependent plasticity often interpreted in the context of human FFR studies (Bidelman, 2013; Chandrasekaran et al., 2009; Kraus and Chandrasekaran, 2010; Krishnan and Gandour, 2009). However, an important property of sustained potentials like the FFR and auditory steady-state response (ASSR) is that the relative contribution of brainstem and cortical sources varies systematically with stimulus frequency (Bidelman, 2015b). Higher frequencies (>80 Hz) evoke peripheral (AN) and brainstem generators whereas low frequencies (<100 Hz) recruit mainly cortical neural ensembles (Herdman et al., 2002; Kuwada et al., 2002). Indeed, MEG-FFRs are only observable at the speech F0 (100 Hz) (Coffey et al., 2016), indicating that cortical contributions, if present, are restricted to the lowest frequencies of the speech spectrum. Thus, a critical but overlooked point in previous studies is that frequency of the phase-locked activity must be considered to properly interpret the generation sites of the FFR.
Problematically, MEG is largely insensitive to deep sources (Baillet, 2017; Baillet et al., 2001; Cohen and Cuffin, 1983; Hillebrand and Barnes, 2002). Consequently, the natural bias of MEG to superficial brain tissue makes it unclear if the “cortical dominance” in FFRs suggested by Coffey et al. (2016) is idiosyncratic to the MEG modality, which tends to inherently underemphasize deep source contributions (i.e., brainstem) known to play an important role in FFR generation (e.g., Bidelman, 2015b; Chimento and Schreiner, 1990; Gardi et al., 1979; Smith et al., 1975; Sohmer et al., 1977; Tichko and Skoe, 2017; Zhang and Gong, 2017).
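The frequency argument above rests on the notion of a phase-locking limit. One way to build intuition for such a limit is a toy simulation in which each trial is the same sinusoid shifted by a random timing jitter. Treating timing jitter as the limiting factor, the 1 ms jitter in the usage note, and the function name `phase_locking_value` are all assumptions made for this sketch and are not taken from the study.

```python
import numpy as np


def phase_locking_value(freq_hz, jitter_sd_s, n_trials=200, fs=10000,
                        dur_s=0.1, seed=0):
    """Across-trial phase-locking value (0 to 1) at the stimulus frequency.

    Each simulated trial is a sinusoid delayed by Gaussian timing jitter
    (a hypothetical stand-in for neural timing variability).
    """
    rng = np.random.default_rng(seed)
    t = np.arange(int(fs * dur_s)) / fs
    delays = rng.normal(0.0, jitter_sd_s, size=n_trials)
    trials = np.sin(2 * np.pi * freq_hz * (t[None, :] - delays[:, None]))
    spec = np.fft.rfft(trials, axis=1)
    k = int(round(freq_hz * dur_s))       # FFT bin holding freq_hz
    phases = np.angle(spec[:, k])
    # Length of the mean unit phasor: 1 = perfect locking, ~0 = random phase.
    return float(np.abs(np.mean(np.exp(1j * phases))))
```

Calling `phase_locking_value(100.0, 0.001)` and `phase_locking_value(700.0, 0.001)` shows the qualitative pattern: with the same jitter, phase-locking is strong at 100 Hz and largely lost at 700 Hz. For Gaussian jitter with standard deviation σ, the expected value is exp(−(2πfσ)²/2), so a generator with larger timing variability loses phase-locking at lower frequencies.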

Given equivocal findings, the aim of the present study was to better elucidate the various neural generators contributing to the FFR. To this end, we recorded multichannel speech-FFRs from normal-hearing listeners and applied state-of-the-art source imaging techniques to the neural recordings (discrete dipole modeling, distributed imaging, independent component analysis, computational simulations). With the exception of our recent work (Bidelman, 2015b), we are not aware of any EEG studies that have undertaken such a comprehensive source modeling of the FFR as most reports are limited to a single-channel montage. Under similar channel counts, the spatial resolution of EEG source estimation is comparable to MEG (∼8–10 mm) (Cohen and Cuffin, 1991; Cohen et al., 1990; Hedrich et al., 2017). Yet, a major advantage is that activity of deep sources is more readily recorded with EEG than MEG, allowing us the opportunity to more veridically image both deep (brainstem) and putative superficial (cortical) sources of the FFR (cf. Coffey et al., 2016). We reasoned that cortical contributions to the FFR (if present in EEG recordings) would diminish at higher frequencies and cease altogether above the phase-locking limit of cortical neurons. To test this hypothesis, we performed time-frequency analysis on source-level FFRs extracted from various subcortical and cortical regions of interest (ROIs) to evaluate the frequency-dependence of different FFR sites along the auditory neuroaxis. Our findings confirm a mixture of FFR generators localized to bilateral auditory nerve (AN), upper brainstem (BS) IC, and bilateral primary auditory cortex (PAC). However, we show that the site of FFR generation varies critically with stimulus frequency with subcortical structures making the largest contribution to the electrically recorded FFR (i.e., AN ≥ BS > PAC), opposite the pattern observed in MEG (Coffey et al., 2016). 
While weak cortical FFRs are apparent for low-frequency stimulation (∼100 Hz), we find only subcortical FFR sources (AN, BS) are active for frequencies spanning the majority of the speech bandwidth (i.e., all spectral cues above a voice pitch of F0 = 100 Hz).
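The frequency-dependence hypothesis can be illustrated with a toy spectral simulation. The sketch below is not the source analysis or the computational simulation used in the study. It assumes three hypothetical sources whose harmonic amplitudes roll off above a cutoff (about 700 Hz for AN and BS, about 100 Hz for PAC, loosely following the values quoted in the abstract), with made-up gains and a made-up roll-off order, and it ignores volume conduction and phase differences between sources.

```python
import numpy as np

FS = 5000                            # sampling rate in Hz (assumed)
DUR_S = 0.2                          # analysis window in s (assumed)
F0 = 100.0                           # voice pitch named in the text
HARMONICS = F0 * np.arange(1, 8)     # 100, 200, ..., 700 Hz

# Hypothetical sources: (roll-off cutoff in Hz, gain). Illustrative values only.
SOURCES = {"AN": (700.0, 1.0), "BS": (700.0, 0.8), "PAC": (100.0, 0.3)}


def simulate_source(cutoff_hz, gain, order=4):
    """Harmonic complex whose harmonic amplitudes roll off above cutoff_hz."""
    t = np.arange(int(FS * DUR_S)) / FS
    amps = gain / np.sqrt(1.0 + (HARMONICS / cutoff_hz) ** (2 * order))
    return sum(a * np.sin(2 * np.pi * f * t) for a, f in zip(amps, HARMONICS))


def harmonic_amplitudes(x):
    """Single-sided FFT amplitude at each harmonic of F0."""
    spec = 2.0 * np.abs(np.fft.rfft(x)) / len(x)
    bins = np.round(HARMONICS * DUR_S).astype(int)   # bin index = f * duration
    return spec[bins]


amps = {name: harmonic_amplitudes(simulate_source(*p))
        for name, p in SOURCES.items()}
total = sum(amps.values())
pac_share = amps["PAC"] / total      # cortical fraction of summed amplitude

for f, share in zip(HARMONICS, pac_share):
    print(f"{f:5.0f} Hz  PAC share of summed amplitude: {share:.3f}")
```

Under these assumptions the cortical share is appreciable only at F0 and becomes negligible at the higher harmonics, which is the qualitative pattern the hypothesis predicts. The actual proportions depend entirely on the assumed gains and cutoffs.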

Section snippets

Participants

We recorded multichannel FFRs from n = 18 young adults (μ ± SD age: 25.0 ± 2.7 years; 4 males, 14 females). All had obtained a similar level of formal education (at least an undergraduate degree) and were monolingual speakers of American English. All participants were required to have minimal (<3 years) formal musical training (μ ± SD: 1.8 ± 1.9 years) as musicianship is known to enhance brainstem and cortical auditory evoked potentials (e.g., Bidelman and Alain, 2015; Bidelman et al., 2014b;

Results

Speech-evoked FFRs appeared as a sustained neurophonic potential, reflecting phase-locked activity to the spectrotemporal features of speech (Fig. 1). As an initial, data-driven approach to unravel the sources of the FFR, we conducted ICA decomposition on the scalp-recorded data (Fig. 2). Of the first 10 IC topographies accounting for ∼80% of the data variance, three components explaining 29.1, 12.3 and 2.6% of the variance were localized spatially to the caudal brainstem (midway between IC and

Discussion

By recording multichannel FFRs to speech via EEG, our data confirm a mixture of generators localized to bilateral AN, brainstem IC, and bilateral PAC. These findings corroborate MEG recordings, which have shown that portions of the speech-FFR are generated at both subcortical and cortical levels of the auditory system (cf. Coffey et al., 2017; Coffey et al., 2016). However, frequency-specific scrutiny of source waveforms revealed that the relative contribution of these nuclei to the aggregate

Conclusions

Results of the present study reveal that the methodology used to record FFRs is not unbiased and acts to “color” the response, akin to how different microphones might color the recording of sounds they detect. Clearly, it is important to keep the methodology in mind when interpreting far-field responses called the “FFR” and interpreting its underlying generators in a given task or stimulus paradigm. MEG seems to yield FFRs with (overemphasized) neocortical contributions (Coffey et al., 2016)

Acknowledgements

The author thanks Megan Howell and Mary Katherine Davis for assistance in data collection and Bonnie Brown for comments on earlier versions of the manuscript. This work was supported by grants from the American Hearing Research Foundation (AHRF), American Academy of Audiology (AAA) Foundation, and University of Memphis Research Investment Fund (UMRIF) awarded to G.M.B.

References (154)

  • G.M. Bidelman

    Amplified induced neural oscillatory activity predicts musicians' benefits in categorical speech perception

    Neuroscience

    (2017)
  • G.M. Bidelman

    Sonification of scalp-recorded frequency-following responses (FFRs) offers improved response detection over conventional statistical metrics

    J. Neurosci. Methods

    (2018)
  • G.M. Bidelman et al.

    Tone-language speakers show hemispheric specialization and differential cortical processing of contour and interval cues for pitch

    Neuroscience

    (2015)
  • G.M. Bidelman et al.

    Bilinguals at the "cocktail party": dissociable neural activity in auditory-linguistic brain regions reveals neurobiological basis for nonnative listeners' speech-in-noise recognition deficits

    Brain Lang.

    (2015)
  • G.M. Bidelman et al.

    Musicians and tone-language speakers share enhanced brainstem encoding but not perceptual benefits for musical pitch

    Brain Cogn.

    (2011)
  • G.M. Bidelman et al.

    Functional organization for musical consonance and tonal pitch hierarchy in human auditory cortex

    Neuroimage

    (2014)
  • G.M. Bidelman et al.

    Functional changes in inter- and intra-hemispheric auditory cortical processing underlying degraded speech perception

    Neuroimage

    (2016)
  • G.M. Bidelman et al.

    Tracing the emergence of categorical speech perception in the human auditory system

    Neuroimage

    (2013)
  • G.M. Bidelman et al.

    Age-related changes in the subcortical-cortical encoding and categorical perception of speech

    Neurobiol. Aging

    (2014)
  • O. Bones et al.

    Phase locked neural activity in the human brainstem predicts preference for musical consonance

    Neuropsychologia

    (2014)
  • B. Chandrasekaran et al.

    Context-dependent encoding in the human auditory brainstem relates to hearing speech in noise: implications for developmental dyslexia

    Neuron

    (2009)
  • T.C. Chimento et al.

    Selectively eliminating cochlear microphonic contamination from the frequency-following response

    Electroencephalogr. Clin. Neurophysiol.

    (1990)
  • C.G. Clinard et al.

    Aging alters the perception and physiological representation of frequency: evidence from human frequency-following response recordings

    Hear. Res.

    (2010)
  • D. Cohen et al.

    Demonstration of useful differences between magnetoencephalogram and electroencephalogram

    Electroencephalogr. Clin. Neurophysiol.

    (1983)
  • B.N. Cuffin et al.

    Experimental tests of EEG source localization accuracy in spherical head models

    Clin. Neurophysiol.

    (2001)
  • J. Cunningham et al.

    Neurobiologic responses to speech in noise in children with learning problems: deficits and strategies for improvement

    Clin. Neurophysiol.

    (2001)
  • A. Delorme et al.

    EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis

    J. Neurosci. Methods

    (2004)
  • G.C. Galbraith et al.

    Cross-correlation and latency compensation analysis of click-evoked and frequency-following brain-stem responses in man

    Electroencephalogr. Clin. Neurophysiol.

    (1990)
  • T. Hedrich et al.

    Comparison of the spatial resolution of source imaging techniques in high-density EEG and MEG

    Neuroimage

    (2017)
  • A. Hillebrand et al.

    A quantitative assessment of the sensitivity of whole-head MEG to activity in the adult human cortex

    Neuroimage

    (2002)
  • J. Kayser et al.

    Issues and considerations for using the scalp surface Laplacian in EEG/ERP research: a tutorial review

    Int. J. Psychophysiol.

    (2015)
  • A. Krishnan et al.

    The role of the auditory brainstem in processing linguistically-relevant pitch patterns

    Brain Lang.

    (2009)
  • A. Krishnan et al.

    Experience-dependent enhancement of pitch-specific responses in the auditory cortex is limited to acceleration rates in normal voice range

    Neuroscience

    (2015)
  • A. Krishnan et al.

    Encoding of pitch in the human brainstem is sensitive to language experience

    Brain Res. Cogn. Brain Res.

    (2005)
  • C. Liégeois-Chauvel et al.

    Evoked potentials recorded from the auditory cortex in man: evaluation and topography of the middle latency components

    Electroencephalogr. Clin. Neurophysiol.

    (1994)
  • Z.A. Acar et al.

    Effects of forward model errors on EEG source localization

    Brain Topogr.

    (2013)
  • C. Alain et al.

    Neural correlates of speech segregation based on formant frequencies of adjacent vowels

    Sci. Rep.

    (2017)
  • C. Alain et al.

    Noise-induced increase in human auditory evoked neuromagnetic fields

    Eur. J. Neurosci.

    (2009)
  • S. Anderson et al.

    A neural basis of speech-in-noise perception in older adults

    Ear Hear.

    (2011)
  • S. Anderson et al.

    Reversal of age-related neural timing delays with training

    Proc. Natl. Acad. Sci. U. S. A.

    (2013)
  • S.D. Arlinger et al.

    Thresholds for linear frequency ramps of a continuous pure tone

    Acta Oto-Laryngologica

    (1977)
  • P.F. Assmann et al.

    Modeling the perception of concurrent vowels: vowels with the same fundamental frequency

    J. Acoust. Soc. Am.

    (1989)
  • P.F. Assmann et al.

    Modeling the perception of concurrent vowels: vowels with different fundamental frequencies

    J. Acoust. Soc. Am.

    (1990)
  • S. Baillet

    Magnetoencephalography for brain electrophysiology and imaging

    Nat. Neurosci.

    (2017)
  • S. Baillet et al.

    Electromagnetic brain mapping

    IEEE Signal Process. Mag.

    (2001)
  • V.M. Bajo et al.

    The descending corticocollicular pathway mediates learning-induced auditory plasticity

    Nat. Neurosci.

    (2010)
  • P. Bashivan et al.

    Spectrotemporal dynamics of the EEG during working memory encoding and maintenance predicts individual behavioral capacity

    Eur. J. Neurosci.

    (2014)
  • L. Bellier et al.

    Topographic recordings of auditory evoked potentials to speech: subcortical and cortical responses

    Psychophysiology

    (2015)
  • G.M. Bidelman

    The role of the auditory brainstem in processing musically-relevant pitch

    Front. Psychol.

    (2013)
  • G.M. Bidelman

    Communicating in challenging environments: noise and reverberation
