Subcortical sources dominate the neuroelectric auditory frequency-following response to speech
Introduction
The auditory frequency-following response (FFR) is a neurophonic potential recorded at the scalp that reflects sustained, phase-locked neural ensemble activity of the auditory system. FFRs reflect the neural encoding of dynamic, spectrotemporal features of periodic acoustic stimuli and consequently, provide a “neural fingerprint” of sound within the human electroencephalogram (EEG). The remarkable fidelity of FFRs is evident in behavioral experiments in which the neural responses are replayed as audio signals. These studies demonstrate that FFRs evoked by speech sounds (neural potentials) are highly intelligible to the point they can be reliably identified by external observers (Bidelman, 2018; Galbraith et al., 1995; Weiss and Bidelman, 2015). Given their remarkable spectrotemporal detail, FFRs have provided important insight into auditory processing including individual differences in speech listening skills (Anderson et al., 2011; Bidelman, 2017b; Bidelman and Alain, 2015; Song et al., 2011), neuroplasticity of learning and language experience (Carcagno and Plack, 2011; Chandrasekaran et al., 2012; Kraus and Chandrasekaran, 2010; Krishnan and Gandour, 2009; Krizman et al., 2012), the neurobiology of music (Bidelman, 2013; Bidelman and Krishnan, 2009, 2011; Bidelman et al., 2011c; Bones et al., 2014; Cousineau et al., 2015), auditory aging (Anderson et al., 2013; Bidelman et al., 2014a; Parthasarathy and Bartlett, 2012), and abnormal encoding of complex sounds in clinical populations (Bellier et al., 2015b; Bidelman et al., 2017; Billiet and Bellis, 2011; Chandrasekaran et al., 2009; Cunningham et al., 2001; Kraus et al., 2017; Rocha-Muniz et al., 2012; Song et al., 2008; White-Schwoch et al., 2015).
Despite an abundance of FFR studies, surprisingly little is understood about the response's basic characteristics, first among them its anatomical origin(s) (i.e., source generators). Converging evidence from single-unit recordings in animal models, human scalp M/EEG, and lesion studies in several species suggests that an array of sources contributes to FFR generation (Bidelman, 2015b; Chandrasekaran and Kraus, 2010; Coffey et al., 2016; Smith et al., 1975; Sohmer et al., 1977). Many of these studies have proposed the inferior colliculus (IC) of the brainstem as the FFR's primary neural generator. The notion of a midbrain origin to the FFR is supported by five observations: (1) the short latency of the response (∼5–10 ms) aligns with first spike latencies in the IC (Langner and Schreiner, 1988) and is earlier than possible from cortical generators (Liégeois-Chauvel et al., 1994); (2) FFRs contain phase-locked activity (e.g., >1000 Hz) well beyond the upper limit of phase-locking of cortical neurons (i.e., ∼100 Hz) (Aiken and Picton, 2008; Akhoun et al., 2008; Wallace et al., 2000); (3) there is a high correspondence between far-field and near-field intracranial FFRs recorded directly from the IC (Smith et al., 1975); (4) cryogenic cooling of the IC results in disappearance of FFRs within the colliculi and at the scalp (Smith et al., 1975); and (5) the response is eradicated with focal lesions to the IC (Sohmer and Pratt, 1977).
Nevertheless, the dominant sources of the FFR are likely to vary across species. In cat, the auditory nerve (AN), cochlear nucleus (CN) and other olivary nuclei may contribute ∼50% of the amplitude to the scalp-recorded FFR, with weaker contributions from IC (Gardi et al., 1979). One study has also documented FFRs recorded from medial geniculate body (MGB) (Weinberger et al., 1970). However, Smith et al. (1975) noted that the small amplitude of these responses may have been a far-field radiation of FFRs picked up from lower nuclei and concluded it improbable that the scalp-FFR originates from a locus rostral to IC. Nevertheless, it is clear from these early invasive and cross-species studies that a mixture of brainstem sources is involved in the generation of FFRs (Bidelman, 2015b; Chandrasekaran and Kraus, 2010; Stillman et al., 1978; Tichko and Skoe, 2017).
Localizing scalp potentials in humans is challenged by the fact that far-field potentials are volume conducted to the scalp, making it difficult to circumscribe evoked responses to any one anatomical site. In a recent study (Bidelman, 2015b), we recorded high-density (64 channel) speech-evoked FFRs, which allowed us the unique opportunity to map the FFR's topography and triangulate the response using source modeling techniques commonly applied to the auditory cortical event-related potentials (ERPs) (Picton et al., 1999). Dipole mapping and 3-channel Lissajous analyses (Pratt et al., 1984, 1987) were used to localize the most likely FFR generator and the orientation of its voltage trajectory in 3D space. We found that FFRs were described by two proximal dipole sources within the midbrain (i.e., brainstem IC), each having an oblique, fronto-centrally oriented voltage gradient that oscillated parallel to the brainstem. Our findings were consistent with a midbrain (IC) origin to the FFR noted in previous studies (Bidelman, 2015b; Smith et al., 1975; Sohmer and Pratt, 1977; Zhang and Gong, 2017). However, we also demonstrated that the strength of the response varied dramatically across the scalp; electrodes near the mastoids were able to record FFRs up to ∼1100 Hz whereas locations over cerebral sites showed FFRs only up to ∼100 Hz, corresponding to the fundamental frequency (F0) of our speech stimulus (see Fig. 6 of Bidelman, 2015b). The presence of higher frequency FFRs near the mastoids is consistent with the notion that these channels predominantly record more peripheral following responses, i.e., cochlear microphonic or neural FFRs emitted from auditory nerve (AN) (Chimento and Schreiner, 1990; Marsh et al., 1970).
Despite ample evidence for a brainstem origin (Bidelman, 2015b; Gardi et al., 1979; Smith et al., 1975; Sohmer et al., 1977), renewed controversy surrounding the FFR's source(s) emerged after the demonstration that FFRs to the voice pitch (F0 = 100 Hz) of speech were observable in MEG responses recorded near auditory cortex (Coffey et al., 2016). Moreover, distributed source analysis suggested that auditory cortex accounted for the highest relative percentage of the neuromagnetic FFR signal, with brainstem nuclei (e.g., CN, IC, MGB) accounting for surprisingly little (∼10%) of the response variance (see Fig. S6 of Coffey et al., 2016). Since the FFR is typically interpreted as reflecting a subcortical (brainstem) origin (Chandrasekaran and Kraus, 2010; Krishnan and Gandour, 2009; Skoe and Kraus, 2010; Tzounopoulos and Kraus, 2009; Wong et al., 2007), a cortical contribution would qualify theoretical understanding of the response and the locus of experience-dependent plasticity often interpreted in the context of human FFR studies (Bidelman, 2013; Chandrasekaran et al., 2009; Kraus and Chandrasekaran, 2010; Krishnan and Gandour, 2009). However, an important property of sustained potentials like the FFR and auditory steady-state response (ASSR) is that the relative contribution of brainstem and cortical sources varies systematically with stimulus frequency (Bidelman, 2015b). Higher frequencies (>80 Hz) evoke peripheral (AN) and brainstem generators whereas low frequencies (<100 Hz) recruit mainly cortical neural ensembles (Herdman et al., 2002; Kuwada et al., 2002). Indeed, MEG-FFRs are only observable at the speech F0 (100 Hz) (Coffey et al., 2016), indicating that cortical contributions, if present, are restricted to the lowest frequencies of the speech spectrum. Thus, a critical but overlooked point in previous studies is that the frequency of the phase-locked activity must be considered to properly interpret the generation sites of the FFR.
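The frequency argument above can be illustrated with a minimal sketch that measures response amplitude separately at the F0 and at each harmonic. Everything here is an illustrative assumption rather than the study's analysis pipeline: the two waveforms are synthetic, their amplitudes are arbitrary toy values, and the helper names (`amplitude_at`, `synth`) are hypothetical. Only the 100 Hz F0 comes from the text.

```python
import math

def amplitude_at(signal, fs, freq):
    """Amplitude of the sinusoidal component at `freq` (Hz), via a single-bin DFT."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq * i / fs) for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * i / fs) for i, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

def synth(fs, dur, components):
    """Sum of sinusoids; `components` is a list of (frequency_hz, amplitude) pairs."""
    n = int(round(fs * dur))
    return [sum(a * math.sin(2 * math.pi * f * i / fs) for f, a in components)
            for i in range(n)]

FS = 10000   # sampling rate in Hz (assumed)
DUR = 0.1    # seconds; holds a whole number of cycles of every harmonic of 100 Hz
F0 = 100     # voice pitch used in the text

# Toy "subcortical" waveform: energy at F0 and harmonics up to 1000 Hz (amplitude 1/k).
subcortical = synth(FS, DUR, [(F0 * k, 1.0 / k) for k in range(1, 11)])
# Toy "cortical" waveform: energy at F0 only (amplitude is an arbitrary choice).
cortical = synth(FS, DUR, [(F0, 0.2)])

for k in range(1, 11):
    f = F0 * k
    print(f, round(amplitude_at(subcortical, FS, f), 3), round(amplitude_at(cortical, FS, f), 3))
```

Reading the amplitudes frequency by frequency, rather than as a single broadband value, is what lets a source confined to the F0 be told apart from one that follows the higher harmonics.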
Problematically, MEG is largely insensitive to deep sources (Baillet, 2017; Baillet et al., 2001; Cohen and Cuffin, 1983; Hillebrand and Barnes, 2002). Consequently, the natural bias of MEG to superficial brain tissue makes it unclear if the “cortical dominance” in FFRs suggested by Coffey et al. (2016) is idiosyncratic to the MEG modality, which tends to inherently underemphasize deep source contributions (i.e., brainstem) known to play an important role in FFR generation (e.g., Bidelman, 2015b; Chimento and Schreiner, 1990; Gardi et al., 1979; Smith et al., 1975; Sohmer et al., 1977; Tichko and Skoe, 2017; Zhang and Gong, 2017).
Given equivocal findings, the aim of the present study was to better elucidate the various neural generators contributing to the FFR. To this end, we recorded multichannel speech-FFRs from normal-hearing listeners and applied state-of-the-art source imaging techniques to the neural recordings (discrete dipole modeling, distributed imaging, independent component analysis, computational simulations). With the exception of our recent work (Bidelman, 2015b), we are not aware of any EEG studies that have undertaken such a comprehensive source modeling of the FFR as most reports are limited to a single-channel montage. Under similar channel counts, the spatial resolution of EEG source estimation is comparable to MEG (∼8–10 mm) (Cohen and Cuffin, 1991; Cohen et al., 1990; Hedrich et al., 2017). Yet, a major advantage is that activity of deep sources is more readily recorded with EEG than MEG, allowing us the opportunity to more veridically image both deep (brainstem) and putative superficial (cortical) sources of the FFR (cf. Coffey et al., 2016). We reasoned that cortical contributions to the FFR (if present in EEG recordings) would diminish at higher frequencies and cease altogether above the phase-locking limit of cortical neurons. To test this hypothesis, we performed time-frequency analysis on source-level FFRs extracted from various subcortical and cortical regions of interest (ROIs) to evaluate the frequency-dependence of different FFR sites along the auditory neuroaxis. Our findings confirm a mixture of FFR generators localized to bilateral auditory nerve (AN), upper brainstem (BS) IC, and bilateral primary auditory cortex (PAC). However, we show that the site of FFR generation varies critically with stimulus frequency with subcortical structures making the largest contribution to the electrically recorded FFR (i.e., AN ≥ BS > PAC), opposite the pattern observed in MEG (Coffey et al., 2016). 
While weak cortical FFRs are apparent for low-frequency stimulation (∼100 Hz), we find only subcortical FFR sources (AN, BS) are active for frequencies spanning the majority of the speech bandwidth (i.e., all spectral cues above a voice pitch of F0 = 100 Hz).
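The reasoning behind this hypothesis can be sketched with a toy model in which each region's phase-locking strength falls off with frequency like a first-order low-pass filter. This is only a qualitative illustration under loudly stated assumptions: the filter shape, the equal low-frequency gains, and the corner frequencies for AN and BS are hypothetical choices, and only the ∼100 Hz cortical limit is taken from the text. It is not the source model used in the study.

```python
import math

# Hypothetical corner frequencies in Hz. Only the ~100 Hz cortical (PAC) value
# echoes the text; the AN and BS values are placeholders for illustration.
CUTOFF_HZ = {"AN": 2000.0, "BS": 1000.0, "PAC": 100.0}

def gain(freq_hz, cutoff_hz):
    """First-order low-pass magnitude: 1 at 0 Hz, about 0.707 at the cutoff."""
    return 1.0 / math.sqrt(1.0 + (freq_hz / cutoff_hz) ** 2)

def relative_share(freq_hz, cutoffs=CUTOFF_HZ):
    """Fraction of the summed gain attributable to each region at one frequency."""
    gains = {roi: gain(freq_hz, fc) for roi, fc in cutoffs.items()}
    total = sum(gains.values())
    return {roi: g / total for roi, g in gains.items()}

for f in (100, 300, 500, 1000):
    shares = relative_share(f)
    print(f, {roi: round(s, 2) for roi, s in shares.items()})
```

Under these assumptions the cortical share shrinks steadily as frequency rises while the subcortical shares grow, which is the qualitative pattern the hypothesis predicts. A smooth low-pass only attenuates and never reaches zero, so it does not capture the predicted complete cessation above the cortical phase-locking limit.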
Participants
We recorded multichannel FFRs from n = 18 young adults (μ ± SD age: 25.0 ± 2.7 years; 4 males, 14 females). All had obtained a similar level of formal education (at least an undergraduate degree) and were monolingual speakers of American English. All participants were required to have minimal (<3 years) formal musical training (μ ± SD: 1.8 ± 1.9 years) as musicianship is known to enhance brainstem and cortical auditory evoked potentials (e.g., Bidelman and Alain, 2015; Bidelman et al., 2014b;
Results
Speech-evoked FFRs appeared as a sustained neurophonic potential, reflecting phase-locked activity to the spectrotemporal features of speech (Fig. 1). As an initial, data-driven approach to unravel the sources of the FFR, we conducted ICA decomposition on the scalp-recorded data (Fig. 2). Of the first 10 IC topographies accounting for ∼80% of the data variance, three components explaining 29.1, 12.3 and 2.6% of the variance were localized spatially to the caudal brainstem (midway between IC and
Discussion
By recording multichannel FFRs to speech via EEG, our data confirm a mixture of generators localized to bilateral AN, brainstem IC, and bilateral PAC. These findings corroborate MEG recordings, which have shown that portions of the speech-FFR are generated at both subcortical and cortical levels of the auditory system (cf. Coffey et al., 2017; Coffey et al., 2016). However, frequency-specific scrutiny of source waveforms revealed that the relative contribution of these nuclei to the aggregate
Conclusions
Results of the present study reveal that the methodology used to record FFRs is not unbiased and acts to “color” the response, akin to how different microphones might color the recording of sounds they detect. Clearly, it is important to keep the methodology in mind when interpreting far-field responses called the “FFR” and interpreting its underlying generators in a given task or stimulus paradigm. MEG seems to yield FFRs with (overemphasized) neocortical contributions (Coffey et al., 2016)
Acknowledgements
The author thanks Megan Howell and Mary Katherine Davis for assistance in data collection and Bonnie Brown for comments on earlier versions of the manuscript. This work was supported by grants from the American Hearing Research Foundation (AHRF), American Academy of Audiology (AAA) Foundation, and University of Memphis Research Investment Fund (UMRIF) awarded to G.M.B.
References (154)
- et al. Population responses in primary auditory cortex simultaneously represent the temporal envelope and periodicity features in natural speech. Hear. Res. (2017)
- et al. Envelope and spectral frequency-following responses to vowel sounds. Hear. Res. (2008)
- et al. The temporal relationship between speech auditory brainstem responses and the acoustic pattern of the phoneme /ba/ in normal-hearing adults. Clin. Neurophysiol. (2008)
- et al. Optimal modulation frequency for amplitude-modulation following response in young children during sleep. Hear. Res. (1993)
- et al. Speech auditory brainstem response through hearing aid stimulation. Hear. Res. (2015)
- et al. A fast method for forward computation of multiple-shell spherical head models. Electroencephalogr. Clin. Neurophysiol. (1994)
- Induced neural beta oscillations predict categorical speech perception abilities. Brain Lang. (2015)
- Multichannel recordings of the human brainstem frequency-following response: scalp topography, source generators, and distinctions from the transient ABR. Hear. Res. (2015)
- Sensitivity of the cortical pitch onset response to height, time-variance, and directionality of dynamic pitch. Neurosci. Lett. (2015)
- Towards an optimal paradigm for simultaneously recording cortical and brainstem auditory evoked potentials. J. Neurosci. Methods (2015)