Speaking Rate and Fundamental Frequency as Speech Cues to Perceived Age
Introduction
Although the population of older adults is growing, the effect of normal aging on speech production, and the way listeners perceive the resulting changes, is only partly understood. A better understanding of how aging affects speech production should contribute to general models of the anatomical and physiological consequences of this process. Speaker age is also likely to be an important “indexical” (nonlinguistic) aspect of the speech signal. Other indexical properties of speech, such as token characteristics, speaker identity, and gender, have been shown in prior research to be stored in an integrated fashion with linguistic information (eg, phonemes, syllables, words) in long-term memory.1 These indexical properties have been shown to play an important role in speech perception and verbal learning.2 If speech perception and verbal learning are to be modeled, the role of all relevant indexical properties must be understood, and speaker age is likely to be relevant for several reasons. Important among them is the finding that listeners are relatively accurate in estimating age from speech samples.3, 4, 5, 6 Finally, research on speech and aging is vital for the clinical evaluation of speech disorders in the elderly population. Without a model of these characteristics for normal, healthy older individuals, it becomes difficult to isolate the consequences of different speech and/or voice pathologies in that population.
As stated, prior work has shown that listeners can accurately gauge speaker age, with correlations between chronologic and perceived age ranging from r = 0.68 to r = 0.90.3, 4, 5, 6 The acoustic cues that are presumed to signal speaker age are likely the product of age-related physiological changes to the vocal tract. Some of the physiological changes that have been identified include the lengthening of the vocal tract or oral cavity,7, 8 a reduction in pulmonary function,9 laryngeal cartilage ossification,10 an increased stiffening of the vocal folds,10, 11, 12, 13 and a reduction in vocal fold closure.14, 15, 16 These physiological changes have been used to predict acoustic correlates of aging in voices. These predictions can be summarized as follows:
Fundamental frequency. Mean fundamental frequency (f0) increases in older males while remaining constant or decreasing in older females.15, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 Based on these studies, mean f0 does not appear to change linearly across most of the adult life span, but rather shifts after late middle age. f0 variability also has been shown to increase with age for both sexes, indicating reduced laryngeal control.9, 19, 20, 32, 33, 34, 35 It must be noted, however, that a subset of these studies failed to find changes in f0 variability and/or in mean f0 as a function of age for both men and women.18, 20, 36, 37 Nonetheless, the most consistent pattern that emerges across them is one in which male and female voices converge with respect to mean f0 from middle to old age, a pattern that partly reverses the divergence observed at puberty.38, 39, 40
Speaking rate. Speaking rate appears to slow as a function of age for both male and female speakers. This rate reduction has been observed in extemporaneous speech and read speech, including paragraphs, sentences, words, and segments.26, 29, 41, 42, 43, 44, 45 However, it has not necessarily been observed in specific phonetic cues of segments, such as voice onset time, although an extensive range of segmental cues has not been examined to date.46, 47, 48, 49 Overall, speaking rate has been consistently shown to be an acoustic cue differentiating speakers by age; it reflects a more generalized slowing in motor processes in the elderly population.50
Formant frequencies. Vowel formant frequencies have shown modest decreases with age.7, 8, 32, 51 This shift is presumed to be a by-product of the lowering of the vocal folds over the life span, which results in a longer vocal tract. However, some studies have failed to confirm this pattern.8, 42, 52, 53 In particular, Xue and Hao8 did not observe this decrease in formant frequency in all types of vowels measured, as would be expected if the predicted glottal lowering were responsible for a consistent, perceptually relevant cue signaling speaker age.
Fundamental frequency perturbation. Investigators examining cycle-to-cycle variation in f0 in sustained vowels have reported increased f0 perturbation in both older males and females.20, 54 However, older male speakers appear to display more f0 perturbation than do older females.20 In addition, the values observed have been modest in magnitude and their perceptibility has not been studied directly. For example, the observed values are not comparable with those measured in pathological voices. Moreover, other investigators have not found f0 perturbation to be a significant aging cue for males37 or for females.15, 26, 32 Thus, to date, studies of f0 perturbation have not provided consistent evidence that this measure is a robust cue of speaker age.
Other cues. A few other cues also have been examined by means of acoustic studies of vocal aging. Increased shimmer in older males37 and females20 has been observed by some investigators. Moreover, breathiness has been found to differentiate older females from both middle-aged and young females (as measured by both harmonics-to-noise ratio15 and long-term averaged spectra16). Disfluency rate was not found to vary between young, middle-aged, and older males,44 although Shuey55 did observe a significant increase in speech errors due to age in some vowels and final consonants. Finally, intensity has been suggested to decrease with age because of reduced pulmonary function. This prediction has been supported by some studies,48 but others have found no significant effect or even the opposite pattern.41
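The measures summarized above can be made concrete with a small sketch. The Python below is an illustration only, not the analysis procedure of any cited study: the function names, the convention that unvoiced frames carry an f0 of 0, and the choice of semitones for f0 variability are all assumptions made here. The perturbation measure follows one common definition of local jitter (mean absolute difference between consecutive periods, divided by the mean period).

```python
import math
from statistics import mean, stdev


def hz_to_semitones(f_hz, ref_hz):
    """Express a frequency in semitones relative to a reference frequency."""
    return 12.0 * math.log2(f_hz / ref_hz)


def f0_summary(f0_track_hz):
    """Return (mean f0 in Hz, f0 SD in semitones about the mean).

    Assumption: frames with f0 <= 0 mark unvoiced speech and are skipped.
    """
    voiced = [f for f in f0_track_hz if f > 0]
    mean_hz = mean(voiced)
    sd_semitones = stdev([hz_to_semitones(f, mean_hz) for f in voiced])
    return mean_hz, sd_semitones


def speaking_rate(n_syllables, duration_s):
    """Speaking rate in syllables per second (pauses included in duration)."""
    return n_syllables / duration_s


def local_jitter_percent(periods_s):
    """Local jitter: mean absolute difference between consecutive glottal
    periods, as a percentage of the mean period."""
    diffs = [abs(a - b) for a, b in zip(periods_s, periods_s[1:])]
    return 100.0 * mean(diffs) / mean(periods_s)


if __name__ == "__main__":
    # Hypothetical values, chosen only to exercise the functions.
    track = [0.0, 118.0, 120.0, 122.0, 0.0, 120.0]
    print(f0_summary(track))
    print(speaking_rate(n_syllables=20, duration_s=5.0))
    print(local_jitter_percent([0.010, 0.011, 0.010]))
```

Expressing f0 variability in semitones rather than Hz is a design choice that keeps the measure comparable across speakers with different mean f0.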
In summary, with the exception of speaking rate and (to a lesser extent) mean f0, findings for most of the acoustic cues of aging have not been consistent across studies. Moreover, few if any perceptual experiments have been conducted to validate the relevance of any of these cues. Specifically, prior research involving perceptual tasks has focused on either listeners' abilities to accurately gauge age3, 56 or impressionistic characteristics of aged voices.4 As a result, acoustic studies of speaker age are difficult to evaluate, because the perceptual relevance of any particular age-related change in an acoustic cue is unknown. For example, how perceptually relevant is a 100-Hz shift in the second formant frequency of the vowel /i/ in older male voices?8 How important is this formant lowering to perceived age relative to, say, a shift in fundamental frequency? Questions such as these cannot be answered without first validating the perceptual relevance of acoustic cues. These observations motivated the design of the present studies, which combine an acoustic analysis of potential cues to speaker age with a corresponding perceptual evaluation in which these cues are systematically manipulated.
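Listener accuracy in the earlier work cited above is reported as a correlation between chronologic and perceived age (r = 0.68 to 0.90). As a minimal sketch, assuming that r denotes a Pearson product-moment coefficient, such a value could be computed as follows. The function name and the ages in the usage example are hypothetical and are not data from any study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical chronologic ages and mean perceived ages, one per talker.
    chronologic = [22, 25, 45, 60, 78, 84]
    perceived = [25, 24, 40, 55, 70, 80]
    print(round(pearson_r(chronologic, perceived), 2))
```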
The purpose of this study was to:
1. Verify, if possible, certain effects of vocal aging by examining two acoustic cues cited in earlier work, namely, mean fundamental frequency and speaking rate (experiment 1).
2. Determine the perceptual relevance of these cues in perception tests in which the speech materials used in the acoustic analyses are modified by resynthesis (experiments 2 and 3).
Mean fundamental frequency and speaking rate were selected because a relatively large number of previous studies have shown them to vary consistently and significantly with perceived or chronologic age. Moreover, these cues can be systematically manipulated in resynthesis without introducing significant unwanted artifacts in the modified speech signals.
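The arithmetic behind such manipulations can be sketched briefly. The Python below covers only the parameter side (the ratios applied to an f0 contour and to the time axis) and is not the resynthesis procedure used in this study. All function names, the convention that unvoiced frames carry an f0 of 0, and the numeric targets are assumptions for illustration.

```python
def f0_ratio(source_mean_hz, target_mean_hz):
    """Multiplicative factor that moves a talker's mean f0 to a target."""
    return target_mean_hz / source_mean_hz


def duration_ratio(source_rate, target_rate):
    """Time-stretch factor that moves speaking rate to a target.

    Rates are in syllables per second; a ratio above 1 lengthens
    (slows) the utterance.
    """
    return source_rate / target_rate


def scale_f0(f0_track_hz, ratio):
    """Scale voiced frames by ratio; unvoiced frames (f0 <= 0) stay at 0."""
    return [f * ratio if f > 0 else 0.0 for f in f0_track_hz]


def stretch_times(frame_times_s, ratio):
    """Map original frame times onto the stretched time axis."""
    return [t * ratio for t in frame_times_s]


if __name__ == "__main__":
    # Hypothetical targets: raise mean f0 from 110 to 121 Hz and
    # slow the rate from 4.0 to 3.2 syllables per second.
    print(scale_f0([0.0, 100.0, 120.0], f0_ratio(110.0, 121.0)))
    print(stretch_times([0.0, 0.5, 1.0], duration_ratio(4.0, 3.2)))
```

Scaling f0 multiplicatively preserves the shape of the intonation contour on a logarithmic scale, which is one reason a ratio is used here rather than a fixed offset in Hz.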
Stimulus materials and subjects
The “Rainbow Passage”57 was read by 30 males, 16 of whom were chronologically old (within the age range of 74–88 years, with a mean age of 82 years) and 14 of whom were chronologically young (within the age range of 21–29 years, with a mean age of 24 years). Only the second sentence of the passage was used as the basis for the acoustic analysis. Two criteria were used in selecting these speakers: (1) their chronological age and (2) data from a procedure designed to ensure that they were
Stimulus materials
A subset of the stimulus materials from experiment 1 was used in experiment 2. Specifically, they involved a single sentence from the “Rainbow Passage” produced by 26 males, 13 old and 13 young. These 26 talkers were selected randomly from the 30 males analyzed in experiment 1. Only 26 talkers were used to ensure that the perception test was not excessively long (ie, to avoid the effects of listener fatigue).
The stimulus sentence from each talker was resynthesized under four conditions:
1. No
Stimulus materials
The stimulus materials for this study consisted of a single sentence from the “Rainbow Passage” uttered by 20 males, 5 old, 5 middle aged, and 10 young. These materials were selected from the materials described in experiment 1, as well as materials recorded under very similar conditions. All stimulus materials were consistently identified as falling within the appropriate age category by age estimation studies.
The stimulus sentence from the old and middle-aged talkers was resynthesized under
General discussion
The results of the two perception experiments, along with the acoustic analysis, show that speaking rate acts as a significant cue to perceived age. On average, it has a modest but significant effect, shifting the perceived ages of old and middle-aged male voices toward each other by 4–6 years, depending on the group. Thus, while it is a cue to perceived age, it may not be the primary one, although it may prove to exert a greater influence on perceived age when interacting with other
Acknowledgments
We would like to thank Lauren Rusnick and Jenna Silver for their assistance in data collection and processing. We would also like to thank Hideki Kawahara for providing the resynthesis software used in this study.
References (61)
- et al. Perceptual and acoustic correlates of aging in the speech of males. J Commun Disord (1974)
- et al. Influences of listener characteristics on perceived age estimations. J Voice (1987)
- Connective tissue changes in the larynx and their effects on voice. J Voice (1987)
- Harmonics-to-noise ratio: an index of vocal aging. J Voice (2002)
- Source characteristics of aged voice assessed from long-term average spectra. J Voice (2002)
- et al. Speaking fundamental frequency characteristics of Australian women: then and now. J Phonet (1982)
- et al. Vocal jitter in young adult and aged female voices. J Voice (1989)
- et al. Speaking fundamental frequency characteristics as a function of age and professional singing. J Voice (1991)
- et al. Acoustic and temporal correlates of perceived age. J Voice (1992)
- et al. Have women's voices lowered across time? A cross-sectional study of Australian women's voices. J Voice (1998)
- Age-related differences in speech variability among women. J Commun Disord
- “Old voices”: what do we really know about them? J Voice
- Effects of physiological aging on speaking and reading rates. J Commun Disord
- Disfluency and rate characteristics of young adult, middle-aged, and older males. J Commun Disord
- Age-related voice measures among adult women. J Voice
- Vocal tract resonance analysis of aging voice using long-term average spectra. J Voice
- Intelligibility of older versus younger adults' CVC productions. J Commun Disord
- Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds. Speech Commun
- Echoes of echoes: an episodic theory of lexical access. Psychol Rev
- Speech perception without speaker normalization.
- Perception of the aging male voice. J Speech Hear Res
- Accuracy of listener judgments of perceived age relative to chronological age in adults. Folia Phoniatr
- Voice spectrograms as a function of age, voice disguise and voice imitation. J Acoust Soc Am
- Changes in the human vocal tract due to aging and the acoustic correlates of speech production: a pilot study. J Speech Lang Hear Res
- Phonatory and related changes with advanced age. J Speech Hear Res
- Laryngoscopic and voice characteristics of aged persons. Arch Otolaryngol
- Age-related histological changes in the human male and female laryngeal cartilages: biological and functional implications.
- A survey of age-related changes in the connected tissues of the adult human larynx.
- Harmonics-to-noise ratio and psychophysical measurement of the degree of hoarseness. J Speech Hear Res
- Pitch and duration characteristics of older males. J Speech Hear Res
This paper reports research that was presented at the 34th Annual Symposium: Care of the Professional Voice, June 5, 2005.