Meter and speech
Introduction
As Peter Jusczyk observed, children learn prosodic structure very early in the language acquisition process (Jusczyk, 1997; Mehler et al., 1988). Other data show that prebabbling infants notice deviations from regular timing of perceptual centers, or P-centers, which are temporally close to vowel onsets (Fowler, Smith, & Tassinary, 1986). These observations suggest that children may be able to both produce and perceive simple periodic patterns in speech well before producing their first words. Speakers frequently produce speech in a periodic way, sometimes by coupling their speech production to another speaker or to a metronomic pattern, e.g., when chanting or declaiming. This essay reviews some relevant phenomena and proposes general theoretical mechanisms to account for these behaviors. Since the mechanisms are very simple, we might expect them to appear fairly early in the development of speech.
We will review some experimental observations on the kind of event that most often recurs periodically and some properties of periodic speech, then sketch some basic ideas about how periodic patterns in speech might arise.
One of the most important discoveries about periodicity in speech has been known for some time, although its importance for global aspects of speech timing may have been underestimated. George Allen (1972, 1975) showed that if English speakers are asked to align a finger tap with a word, they will line it up close to the onset of the vowel in the stressed syllable of the word. This implies that there is a perceptually salient acoustic event at these time points in speech. Subsequent research on ‘perceptual centers’ or ‘P-centers’ refined the notion of the ‘beat’ associated with prominent syllables by showing that large initial clusters tend to move the P-center temporally to the left (into the consonant cluster, e.g., in skate vs. ate), while final clusters can move the P-center somewhat to the right (into the vowel, as in baa vs. banks; Morton, Marcus, & Frankish, 1976; Marcus, 1981; Pompino-Marschall, 1989). These perturbations, however, are small (5–15 ms) relative to the repetition cycle (typically about 500 ms). Apparently, the beat location can be approximated automatically (Scott, 1993) by measuring the amount of energy in lower frequencies (between 200 and 800 Hz), smoothing sufficiently, and then looking for large energy onsets, which are prominently encoded in the auditory nerve (Delgutte & Kiang, 1984). These observations about beats and pulses imply that when speakers attempt to produce a series of regular events with their speech, they will regularize the spacing of vowel onsets, especially of stressed vowels (at least for English). This clarifies the question of what it is that is periodic, and thus what ‘periodically produced speech’ might mean. And since other aspects of the signal play only a fairly small role, these findings encourage the use of automatic measurement methods that simulate the beat-locating aspects of auditory performance.
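The beat-locating procedure just described (low-frequency energy, smoothing, then large energy onsets) can be sketched in a few lines. This is only a minimal illustration and not Scott's (1993) algorithm: it assumes the energy in the 200–800 Hz band has already been computed as one value per frame, it treats an upward crossing of a threshold as a "large onset", and the function names, frame duration, smoothing window and threshold ratio are all hypothetical choices.

```python
def smooth(envelope, window=5):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(envelope)):
        lo = max(0, i - half)
        hi = min(len(envelope), i + half + 1)
        out.append(sum(envelope[lo:hi]) / (hi - lo))
    return out


def find_beats(envelope, frame_ms=10.0, window=5, threshold_ratio=0.5):
    """Return approximate beat times in ms.

    envelope: assumed per-frame energy in the 200-800 Hz band.
    A beat is placed wherever the smoothed energy crosses
    threshold_ratio * max(smoothed) in the upward direction
    (an assumed, simplified stand-in for a 'large energy onset').
    """
    if not envelope:
        return []
    s = smooth(envelope, window)
    threshold = threshold_ratio * max(s)
    beats = []
    for i in range(1, len(s)):
        if s[i - 1] < threshold <= s[i]:
            beats.append(i * frame_ms)
    return beats


# Toy envelope (hypothetical data): two vowel-like energy bursts
# whose onsets are 400 ms apart at 10 ms per frame.
toy = [0.0] * 10 + [5.0] * 20 + [0.0] * 20 + [5.0] * 20 + [0.0] * 10
print(find_beats(toy))  # [100.0, 500.0]
```

On real speech the envelope would first have to be extracted by band-pass filtering, and the threshold would need tuning; the sketch only shows why vowel onsets, which carry most of the low-frequency energy, dominate the estimated beat locations.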
Aside from the P-center work, there has been other research on simple periodic speech phenomena. Cyclic repetition of a short phrase has been shown to lead to the harmonic timing effect. A number of studies have shown that when speakers repeat a short piece of text many times, they exhibit a strong preference for locating prominent (e.g., stressed) syllable onsets at simple harmonic fractions of the repetition cycle (Port, Tajima, & Cummins, 1996; Cummins & Port, 1998; Tajima, 1998). For example, Cummins and Port (1998) presented English-speaking subjects with a two-tone metronome pattern. Tone A marked the beginning of each cycle and alternated with tone B, which was randomly located at phase angles between 0.20 and 0.80 of the A–A cycle. The subjects’ task was to repeat a phrase like ‘Dig for a duck’ so that the first stressed word lines up with tone A and the final stress lines up with tone B. The location of the onset of the final syllable, duck, was measured as a phase angle between 0 and 1 (the beginning of the next cycle). The frequency histogram of performance for all speakers is shown in Fig. 1. Although the target phase angles for the onset of the final syllable were distributed uniformly over the interval from 0.20 to 0.80 of the repetition cycle, the speakers did not reproduce anything resembling the flat input distribution but were strongly biased to locate their onsets near just three locations in the cycle: 1/3 for all the early phase angle targets, 1/2 for targets near the middle of the cycle, and 2/3 for most target phases later than about 0.57. This bias is called the harmonic timing effect because locations like 1/2 and 1/3 would be the phase-zero pulses for (phase-locked) harmonic frequencies of the fundamental.
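The phase measure used in this task is simple arithmetic, and a short sketch may make it concrete. The code below is an illustration only: the onset times are hypothetical, and the nearest-fraction rule is a descriptive convenience for labeling a produced phase, not the model proposed in this essay (the observed boundary near 0.57 shows that real behavior is not a pure nearest-neighbor assignment).

```python
from fractions import Fraction

# The three favored locations reported for this task.
ATTRACTORS = [Fraction(1, 3), Fraction(1, 2), Fraction(2, 3)]


def phase(onset, cycle_start, next_cycle_start):
    """Phase of an onset within one A-A cycle, on a 0-1 scale."""
    return (onset - cycle_start) / (next_cycle_start - cycle_start)


def nearest_attractor(phi, attractors=ATTRACTORS):
    """Label a phase with the closest simple harmonic fraction."""
    return min(attractors, key=lambda a: abs(float(a) - phi))


# Hypothetical onset times in ms: the phrase-initial stress ('Dig')
# at 1000, the final stress ('duck') at 1400, the next 'Dig' at 2200.
phi = phase(onset=1400.0, cycle_start=1000.0, next_cycle_start=2200.0)
print(round(phi, 3), nearest_attractor(phi))  # 0.333 1/3
```

Applied to every repetition in a recording, a histogram of `phi` values is the kind of summary plotted in Fig. 1.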
Similar results have now been observed in a number of experiments and the phenomenon can easily be demonstrated to oneself (by repeating a 4–6 syllable phrase and noticing where the phrase-final stress occurs when the pattern is stably repeated).
Although this experiment employed only English speakers, one might expect that speakers of other languages would at least show a bias to pay special attention to vowel onsets and to favor low-frequency harmonics whenever nested meters are constructed. There are some data directly comparing English and Japanese in a similar task (Tajima & Port, 2003). The speakers of the two languages adjusted to perturbing influences on timing in language-specific ways, but the data clearly showed that speakers of Japanese were paying attention to the vowel onsets in this task just as much as the English speakers were.
Notice that the speech results demonstrate regularity not merely at the frequency of the metronome, but also at higher frequencies. There is periodicity on two time scales: one at the repetition cycle rate and another either 2 or 3 times faster than the metronome but phase-locked to it (so the phase-zero pulse is actually two simultaneous pulses). The primary issue we are concerned with here is what kind of cognitive mechanism could account for these particular constraints on speech timing.
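The relation between phase-locked harmonics and the favored phases can be worked out directly. The sketch below is an illustration under a simplifying assumption (each oscillator is reduced to an ideal pulse train, with hypothetical function names): an oscillator running n times faster than the repetition cycle and phase-locked to it pulses at phases k/n of the base cycle, so its phase-zero pulse always coincides with the base pulse.

```python
from fractions import Fraction


def harmonic_pulse_phases(n):
    """Phases (0-1) of the base cycle at which the n-th harmonic pulses."""
    return [Fraction(k, n) for k in range(n)]


def attractor_phases(harmonics=(2, 3)):
    """Union of pulse phases over the given phase-locked harmonics."""
    phases = set()
    for n in harmonics:
        phases.update(harmonic_pulse_phases(n))
    return sorted(phases)


print([str(p) for p in attractor_phases()])  # ['0', '1/3', '1/2', '2/3']
```

Phase 0 appears for every harmonic, which is the sense in which the phase-zero pulse is several simultaneous pulses; the remaining values are the 1/3, 1/2 and 2/3 locations seen in the speech data.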
These experiments show that when there is periodicity at one level, there may sometimes be periodicity at a harmonic of that frequency. This feature of motor temporal behavior is not restricted to speech, but can be observed in simple limb movements as well.
Non-speech periodic behavior and the HKB model
It may be appropriate to compare the harmonic timing phenomenon to oscillatory finger motion as illustrated in Kelso's finger-wagging task. Kelso (1984) had subjects oscillate one finger on each hand to the left and right. When the phase relationship of the fingers is such that they simultaneously move toward and away from the midline (described as 0 phase), performance is easiest. Most phase relationships between the fingers are very unstable although, at a slow enough tempo, the fingers can
Meter and periodicity
Although linguists and students of poetics often describe meter in terms of serial patterns of strong and weak syllables, the most intuitively natural notion of meter seems to be that of music where it is based on periodic structures in continuous time.
Conclusion
The issue explored in this essay is that, in many situations, speakers will exhibit periodic location of salient events like vowel onsets. For English, this is especially true of syllables with pitch accent or stress. It is proposed that this periodic behavior reflects periodic attractors in relative phase that are generated by one or more internal oscillators producing pulses that are sometimes coupled to external periodicities. These oscillations can be described as neurocognitive because
Acknowledgements
Thanks to Adam Leary for help with figures and to Sarah Hawkins and Noel Nguyen for helpful comments on earlier drafts.
References (41)
Speech rhythm: Its relation to performance universals and articulatory timing. Journal of Phonetics (1975)
et al. Rhythmic constraints on speech timing. Journal of Phonetics (1998)
et al. Non-equilibrium phase transitions in coordinated biological motion: Critical fluctuations. Physics Letters A (1986)
On synchronizing movements to music. Human Movement Science (2000)
et al. Perceiving temporal regularity in music. Cognitive Science (2002)
On the psychoacoustic nature of the P-center phenomenon. Journal of Phonetics (1989)
et al. Non-equilibrium phase transitions in coordinated biological motion: Critical slowing down and switching time. Physics Letters A (1987)
et al. The acquisition of movement skills: Practice enhances the dynamic stability of bimanual coordination. Human Movement Science (2001)
Elements of general phonetics (1967)
The location of rhythmic stress beats in English: An experimental study I. Language and Speech (1972)
Infants perception of speech units: Primary representational capacities
Speech coding in the auditory nerve: I. Vowel-like sounds. Journal of the Acoustical Society of America
Finding downbeats with a relaxation oscillator. Psychological Research
Perception of syllable timing by prebabbling infants. Journal of the Acoustical Society of America
Prosodic structure in young children's language production. Language
A theoretical model of phase transitions in human hand movements. Biological Cybernetics
Dynamic attending and responses to time. Psychological Review
The discovery of spoken language
Phase transitions and critical behavior in human bimanual coordination. American Journal of Physiology
Dynamic patterns: The self-organization of brain and behavior