Elsevier

Journal of Voice

Volume 22, Issue 1, January 2008, Pages 58-69

Speaking Rate and Fundamental Frequency as Speech Cues to Perceived Age

https://doi.org/10.1016/j.jvoice.2006.07.004

Summary

This study aimed to specify a set of acoustic cues fundamental to vocal aging and to establish their perceptual relevance, using acoustic analysis and perceptual testing. Three experiments were conducted to identify the perceptual correlates of the aging voice. The first experiment analyzed important voice parameters that signal a person's age for 16 older males and 14 younger males. In the second and third experiments, these acoustic patterns were systematically shifted through resynthesis to see if perceived age would be significantly influenced. In the second experiment, the older and younger male voices were resynthesized by manipulating speaking rate and fundamental frequency to shift the perceived age of the groups toward each other. In the third experiment, older and middle-aged male voices were resynthesized in a similar manner. In both perceptual studies, an age estimation task with naive listeners was used. The results of the first experiment showed that, in older speakers, sentence, word, and diphthong durations were all significantly longer and mean fundamental frequency was significantly higher than for the younger group. In the second experiment, only the manipulation of speaking rate resulted in a significant shift in perceived age, and it did so only for the older subjects. In the third experiment, a significant shift in age estimates was observed for the middle-aged, but not the older, voices when speaking rate was manipulated. The results of both perception tests suggest that speaking rate, but possibly not fundamental frequency, is a perceptually relevant cue to age in voice.

Introduction

Although the population of older adults is growing, the effects of normal aging on speech production, and the way listeners perceive these changes, are only marginally understood. A better understanding of how aging affects speech production should contribute to general models of the anatomical and physiological consequences of this process. Speaker age is also likely to be an important “indexical” (nonlinguistic) aspect of the speech signal. Prior research has demonstrated that other indexical properties of speech, such as token characteristics, speaker identity, and gender, are stored in an integrated fashion with linguistic information (eg, phonemes, syllables, words) in long-term memory.1 These indexical properties have been shown to play an important role in speech perception and verbal learning.2 If speech perception and verbal learning are to be modeled, the role of all relevant indexical properties must be understood, and speaker age is likely to be relevant for several reasons. Important among them is the finding that listeners are relatively accurate in estimating age from speech samples.3, 4, 5, 6 Finally, research on speech and aging is vital for the clinical evaluation of speech disorders in the elderly population. Without a model of these characteristics for normal, healthy older individuals, it is difficult to isolate the consequences of different speech and/or voice pathologies in that population.

As stated, prior work has shown that listeners can accurately gauge speaker age, with correlations between chronologic and perceived age ranging from r = 0.68 to r = 0.90.3, 4, 5, 6 The acoustic cues that are presumed to signal speaker age are likely the product of age-related physiological changes to the vocal tract. Some of the physiological changes that have been identified include the lengthening of the vocal tract or oral cavity,7, 8 a reduction in pulmonary function,9 laryngeal cartilage ossification,10 an increased stiffening of the vocal folds,10, 11, 12, 13 and a reduction in vocal fold closure.14, 15, 16 These physiological changes have been used to predict acoustic correlates of aging in voices. These predictions can be summarized as follows:

  • Fundamental frequency. Mean fundamental frequency (f0) increases in older males while remaining constant or decreasing in older females.15, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 Based on these studies, it does not appear that mean f0 changes linearly across most of the adult life span, but rather shows a shift after late middle age. f0 also has been shown to increase in variability with age for both sexes, indicating reduced laryngeal control.9, 19, 20, 32, 33, 34, 35 It must be noted, however, that a subset of these studies failed to find changes in f0 variability and/or in mean f0 as a function of age for both men and women.18, 20, 36, 37 Nonetheless, the most consistent pattern that emerges across them is one in which male and female voices coalesce with respect to mean f0 from middle to old age—a pattern that modifies the shift observed at puberty.38, 39, 40

  • Speaking rate. Speaking rate appears to slow as a function of age for both male and female speakers. This rate reduction has been observed in extemporaneous speech and read speech, including paragraphs, sentences, words, and segments.26, 29, 41, 42, 43, 44, 45 However, it has not necessarily been observed in specific phonetic cues of segments, such as voice onset time, although a wide range of segmental cues has not yet been examined.46, 47, 48, 49 Overall, speaking rate has been consistently shown to be an acoustic cue differentiating speakers by age; it reflects a more generalized slowing in motor processes in the elderly population.50

  • Formant frequencies. Vowel formant frequencies have shown modest decreases with age.7, 8, 32, 51 This shift is presumed to be a by-product of the lowering of the vocal folds over the life span, which results in a longer vocal cavity. However, some studies have failed to confirm this pattern.8, 42, 52, 53 In particular, Xue and Hao8 did not observe this decrease in formant frequency in all types of vowels measured, as they should have if the predicted glottal lowering is responsible for a consistent, perceptually relevant cue signaling speaker age.

  • Fundamental frequency perturbation. Investigators examining cyclic variation in f0 in sustained vowels have reported increased f0 perturbation in both older males and females.20, 54 However, older male speakers appear to display more f0 perturbation than do older females.20 In addition, the values observed have been modest in magnitude and their perceptibility has not been studied directly. For example, the observed values are not comparable with those measured in pathological voices. Moreover, other investigators have not found f0 perturbation to be a significant aging cue for males37 or for females.15, 26, 32 Thus, to date, studies of f0 perturbation have not provided useful evidence suggesting that this measure is a robust cue of speaker age.

  • Other cues. A few other cues also have been examined by means of acoustic studies of vocal aging. Increased shimmer in older males37 and females20 has been observed by some investigators. Moreover, breathiness has been found to differentiate older females from both middle-aged and young females (as measured by both harmonics-to-noise ratio15 and long-term averaged spectra16). Disfluency rate was not found to vary between young, middle-aged, and older males,44 although Shuey55 did observe a significant increase in speech errors due to age in some vowels and final consonants. Finally, intensity has been suggested to decrease with age because of reduced pulmonary function. This prediction has been supported by some studies,48 but others have found no significant effect or even the opposite pattern.41
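The perturbation measures in the list above can be illustrated with a minimal sketch. It assumes the common "local" definition of jitter and shimmer (mean absolute difference between consecutive cycles, divided by the mean value); the studies cited may have used other variants. The function name and the cycle values are hypothetical, chosen only for illustration.

```python
from statistics import mean


def local_perturbation(values):
    """Mean absolute cycle-to-cycle difference, as a percentage of the mean.

    Applied to glottal cycle periods this gives local jitter; applied to
    per-cycle peak amplitudes it gives local shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


# Hypothetical measurements from a sustained vowel (not data from the study).
periods = [0.0100, 0.0102, 0.0099, 0.0101, 0.0100]   # seconds per cycle
amplitudes = [0.80, 0.82, 0.79, 0.81, 0.80]           # arbitrary units

jitter_percent = local_perturbation(periods)
shimmer_percent = local_perturbation(amplitudes)

# Mean f0 estimated here as the reciprocal of the mean period; averaging
# the per-cycle frequencies instead would give a slightly different value.
mean_f0_hz = 1.0 / mean(periods)

print(f"jitter  = {jitter_percent:.2f} %")
print(f"shimmer = {shimmer_percent:.2f} %")
print(f"mean f0 = {mean_f0_hz:.1f} Hz")
```

Using one helper for both measures reflects that, under this definition, jitter and shimmer differ only in the quantity measured per cycle.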

In summary, findings for most of the acoustic cues of aging, with the exception of speaking rate and (to a lesser extent) mean f0, have not been particularly consistent across studies. Moreover, few if any perceptual experiments have been conducted to validate the relevance of any of these cues. Specifically, prior research involving perceptual tasks has focused on either listeners' abilities to accurately gauge age3, 56 or impressionistic characteristics of aged voices.4 As a result, acoustic studies of speaker age are difficult to evaluate because the perceptual relevance of a particular acoustic cue that changes with age is unknown. For example, how perceptually relevant is a 100-Hz shift in the second formant frequency of the vowel /i/ in older male voices?8 How important is this formant lowering to perceived age relative to, say, a shift in fundamental frequency? Questions such as these cannot be answered without first validating the perceptual relevance of acoustic cues. These observations motivated the design of the studies that follow. They combine an acoustic analysis of potential cues to speaker age with a corresponding perceptual evaluation in which these cues are systematically manipulated.
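The listener-accuracy figures quoted earlier (r = 0.68 to r = 0.90 between chronologic and perceived age) can be made concrete with a short sketch. It assumes the reported values are Pearson product-moment coefficients, which this text does not state explicitly, and the ages below are hypothetical rather than data from any cited study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical talkers: chronologic age and mean age estimated by listeners.
chronologic_age = [22, 24, 27, 76, 82, 88]
perceived_age = [25, 23, 30, 70, 75, 72]

r = pearson_r(chronologic_age, perceived_age)
print(f"r = {r:.2f}")
```

A high r of this kind shows only that listeners rank talkers by age consistently; it does not identify which acoustic cues they rely on, which is the gap the perceptual experiments address.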

The purpose of this study was to:

  1. Verify, if possible, certain effects of vocal aging by examining two acoustic cues cited in earlier work, namely, mean fundamental frequency and speaking rate (experiment 1).

  2. Determine the perceptual relevance of these cues in perception tests in which the speech materials used in the acoustic analyses are modified by resynthesis (experiments 2 and 3).

Mean fundamental frequency and speaking rate were selected because they have been shown to consistently and significantly vary due to perceived/chronologic age in a relatively large number of previous studies. Moreover, these cues can be systematically manipulated in resynthesis without introducing significant unwanted artifacts in the modified speech signals.
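As a rough illustration of what manipulating these two cues involves, the sketch below computes the parameters a resynthesis tool would typically need: a duration scale factor and an f0 shift. It is not the procedure used in this study, whose manipulation details are not given in this excerpt. The function names and numeric values are hypothetical.

```python
from math import log2


def duration_scale(source_duration_s, target_duration_s):
    """Factor by which to stretch (>1) or compress (<1) the source utterance."""
    if source_duration_s <= 0 or target_duration_s <= 0:
        raise ValueError("durations must be positive")
    return target_duration_s / source_duration_s


def f0_shift_semitones(source_f0_hz, target_f0_hz):
    """Shift from source to target mean f0, in semitones (12 per octave)."""
    if source_f0_hz <= 0 or target_f0_hz <= 0:
        raise ValueError("f0 values must be positive")
    return 12.0 * log2(target_f0_hz / source_f0_hz)


# Hypothetical example: move an older talker's sentence toward assumed
# younger-group means (illustrative values only, not results from the study).
older_sentence_s, younger_mean_s = 3.0, 2.4
older_f0_hz, younger_mean_f0_hz = 130.0, 110.0

scale = duration_scale(older_sentence_s, younger_mean_s)
shift = f0_shift_semitones(older_f0_hz, younger_mean_f0_hz)

print(f"duration scale factor: {scale:.2f}")
print(f"f0 shift: {shift:.2f} semitones")
```

Expressing the f0 change in semitones rather than hertz is a design choice: a ratio-based unit keeps the shift comparable across talkers with different baseline pitch.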

Section snippets

Stimulus materials and subjects

The “Rainbow Passage”57 was read by 30 males, 16 of whom were chronologically old (within the age range of 74–88 years, with a mean age of 82 years) and 14 of whom were chronologically young (within the age range of 21–29 years, with a mean age of 24 years). Only the second sentence of the passage was used as the basis for the acoustic analysis. Two criteria were used in selecting these speakers: (1) their chronological age and (2) data from a procedure designed to ensure that they were

Stimulus materials

A subset of the stimulus materials from experiment 1 was used in experiment 2. Specifically, they involved a single sentence from the “Rainbow Passage” produced by 26 males, 13 old and 13 young. These 26 talkers were selected randomly from the 30 males analyzed in experiment 1. Only 26 talkers were used to ensure that the perception test was not excessively long (ie, to avoid the effects of listener fatigue).

The stimulus sentence from each talker was resynthesized under four conditions:

  • 1.

    No

Stimulus materials

The stimulus materials for this study consisted of a single sentence from the “Rainbow Passage” uttered by 20 males, 5 old, 5 middle aged, and 10 young. These materials were selected from the materials described in experiment 1, as well as materials recorded under very similar conditions. All stimulus materials were consistently identified as falling within the appropriate age category by age estimation studies.

The stimulus sentence from the old and middle-aged talkers was resynthesized under

General discussion

The results of the two perception experiments, along with the acoustic analysis, show that speaking rate acts as a significant cue to perceived age. It has a modest, although significant, effect on average in shifting perceived age in old and middle-aged male voices (4–6 years, depending on the group) toward each other. Thus, while it is a cue to perceived age, it may not be the primary one, although it may prove to exert a greater influence on perceived age when interacting with other

Acknowledgments

We would like to thank Lauren Rusnick and Jenna Silver for their assistance in data collection and processing. We would also like to thank Hideki Kawahara for providing the resynthesis software used in this study.

References (61)

  • R. Morris et al. Age-related differences in speech variability among women. J Commun Disord (1994)
  • H. Hollien. “Old voices”: what do we really know about them? J Voice (1987)
  • L. Ramig. Effects of physiological aging on speaking and reading rates. J Commun Disord (1983)
  • S. Duchin et al. Disfluency and rate characteristics of young adult, middle-aged, and older males. J Commun Disord (1987)
  • R. Morris et al. Age-related voice measures among adult women. J Voice (1987)
  • S. Linville et al. Vocal tract resonance analysis of aging voice using long-term average spectra. J Voice (2001)
  • E. Shuey. Intelligibility of older versus younger adults' CVC productions. J Commun Disord (1989)
  • H. Kawahara et al. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds. Speech Commun (1999)
  • S. Goldinger. Echoes of echoes: an episodic theory of lexical access. Psychol Rev (1998)
  • K. Johnson. Speech perception without speaker normalization.
  • T. Shipp et al. Perception of the aging male voice. J Speech Hear Res (1969)
  • G.S. Neiman et al. Accuracy of listener judgments of perceived age relative to chronological age in adults. Folia Phoniatr (1990)
  • W. Endres et al. Voice spectrograms as a function of age, voice disguise and voice imitation. J Acoust Soc Am (1970)
  • S. Xue et al. Changes in the human vocal tract due to aging and the acoustic correlates of speech production: a pilot study. J Speech Lang Hear Res (2003)
  • P.H. Ptacek et al. Phonatory and related changes with advanced age. J Speech Hear Res (1966)
  • I. Honjo et al. Laryngoscopic and voice characteristics of aged persons. Arch Otolaryngol (1980)
  • J.C. Kahane. Age-related histological changes in the human male and female laryngeal cartilages: biological and functional implications.
  • J.C. Kahane. A survey of age-related changes in the connected tissues of the adult human larynx.
  • E. Yumoto et al. Harmonics-to-noise ratio and psychophysical measurement of the degree of hoarseness. J Speech Hear Res (1984)
  • E. Mysak. Pitch and duration characteristics of older males. J Speech Hear Res (1959)

This paper reports research that was presented at the 34th Annual Symposium: Care of the Professional Voice, June 5, 2005.
