Glottal signal, vocal tract resonances and output sound
This page is an appendix to our Introduction to voice acoustics: here we illustrate some aspects of the source-filter model. We show measurements of a signal at the level of the vocal folds, measurements of the vocal tract resonances and of the formants in the output sound of the voice. We compare different phonations or ways of producing a sound at the glottis: normal voice (periodic vibration of the folds), whisper (almost no motion of the folds) and creak or vocal fry (non-periodic motion of the folds).
An electroglottograph (EGG) measures the glottal movement (i.e. the vibration of the vocal folds). The EGG works by passing a small 2 MHz electric current through 4 electrodes secured around the subject's neck and measuring the variation in conductance between pairs of electrodes over time. The tract resonances were measured using broadband excitation at the lips.
Normal speech: vowel in the English word "heard"
The upper trace shows the oscillogram of the output sound of the vowel in the English word "heard". The lower trace shows, for the same sample, the EGG signal: a signal related to the source signal at the glottis.
In both traces, the vertical units are arbitrary. The ticks on the time axis are at 5 ms, so the period is about 10 ms and the frequency about 100 Hz. A strong fifth harmonic is clearly visible in the output signal, but not at all in the conductance signal.
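The arithmetic behind these figures can be sketched in a few lines. This is only an illustration of the relation f = 1/T and of where the n-th harmonic falls; the function names are hypothetical and the 10 ms period is the approximate value read from the trace above.

```python
def fundamental_hz(period_ms: float) -> float:
    """Fundamental frequency in Hz for a period given in milliseconds."""
    return 1000.0 / period_ms


def harmonic_hz(period_ms: float, n: int) -> float:
    """Frequency in Hz of the n-th harmonic (n = 1 is the fundamental)."""
    return n * fundamental_hz(period_ms)


# Approximate period read from the oscillogram (an estimate, not a fit).
f0 = fundamental_hz(10.0)
fifth = harmonic_hz(10.0, 5)
print(f0, fifth)
```

With a period of about 10 ms, the fifth harmonic lies near 500 Hz.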
Sound file of the output sound for this vowel.
Spectrum of the EGG signal (the neck conductance measured at 2 MHz) during production of the vowel in "heard". Note the absence of formants in this signal: in that sense, this signal is similar to the source function. (See What is a sound spectrum?)
Sound file of the EGG signal, i.e. of the signal whose waveform and spectrum are shown above. It is a rather dull sound with weak high frequencies and it lacks formants.
How to measure the vocal tract resonances? Here we do so using broadband excitation of the vocal tract, during speech. The harmonics of the voice are the narrow vertical lines. The broadband signal shows the ratio γ = Z∥/Zrad, where Zrad is the acoustical impedance of the radiation field, as baffled by the subject's face, at the mouth. Z∥ is the impedance of the vocal tract in parallel with Zrad, measured at the same position.
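As a rough sketch of the algebra only (not of the measurement technique itself), the ratio γ can be written in terms of the tract impedance and the radiation impedance using the usual rule for two impedances in parallel. The names `parallel`, `gamma`, `z_tract` and `z_rad` are hypothetical, and the numerical values are arbitrary illustrations, not measured impedances.

```python
def parallel(z1: complex, z2: complex) -> complex:
    """Impedance of two elements in parallel: z1*z2 / (z1 + z2)."""
    return z1 * z2 / (z1 + z2)


def gamma(z_tract: complex, z_rad: complex) -> complex:
    """Ratio of (tract in parallel with radiation field) to radiation field.

    Algebraically this equals z_tract / (z_tract + z_rad).
    """
    return parallel(z_tract, z_rad) / z_rad


# Arbitrary illustrative values in arbitrary units (assumptions, not data).
z_tract = 2.0 + 1.5j
z_rad = 0.5 + 1.0j
print(abs(gamma(z_tract, z_rad)))
```

Written this way, γ tends towards 1 when the tract impedance is much larger than the radiation impedance, and towards 0 when it is much smaller.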
See this link for an explanation of the technique.
Note the peaks near 0.6, 1.3, 2.3 and 3.3 kHz.
Spectrum of the output sound of the vowel in "heard".
Note the strong formants near 0.5, 1.3, 2.3 and 3.4 kHz.
Sound file of the output speech sound for this vowel. The strong fifth harmonic, due to the first resonance of the tract, is near the centre of the first formant.
The glottal vibration is periodic, with the folds opening and closing repeatedly in a regular manner. This periodic behaviour is present, too, in the speech signal. In the frequency domain, the periodicity appears as evenly spaced harmonics. The harmonics are bounded by an envelope created by the vocal tract filter. This envelope contains the formants: the broad resonance peaks that are responsible for the intelligibility of the vowel. Notice too that, at high frequencies, the nonperiodic components due to turbulent noise in the throat have increasing importance in the output sound.
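The link between periodicity in time and evenly spaced harmonics in frequency can be checked with a small numerical sketch. It uses an idealised pulse train, not the measured EGG or speech signal, and the sampling rate, the duration and the name `dft_magnitude` are assumptions made for the illustration. The discrete Fourier transform is computed by direct summation so that only the standard library is needed.

```python
import cmath

SAMPLE_RATE_HZ = 8000   # assumed sampling rate
PERIOD_SAMPLES = 80     # 10 ms period at 8 kHz, i.e. a 100 Hz fundamental
N_SAMPLES = 800         # 100 ms of signal: exactly 10 periods

# Idealised periodic source: one unit pulse at the start of each period.
signal = [1.0 if n % PERIOD_SAMPLES == 0 else 0.0 for n in range(N_SAMPLES)]


def dft_magnitude(x, k):
    """Magnitude of DFT bin k of the sequence x (direct sum, no FFT)."""
    n_total = len(x)
    total = sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_total)
                for n in range(n_total))
    return abs(total)


bin_hz = SAMPLE_RATE_HZ / N_SAMPLES  # frequency resolution: 10 Hz per bin

# Keep the frequencies (excluding 0 Hz) of bins that carry energy.
harmonic_freqs = [k * bin_hz
                  for k in range(1, N_SAMPLES // 2)
                  if dft_magnitude(signal, k) > 1e-6]

# Set of distinct spacings between neighbouring harmonics.
spacings = {b - a for a, b in zip(harmonic_freqs, harmonic_freqs[1:])}
print(harmonic_freqs[:5], spacings)
```

For this idealised source all harmonics have equal amplitude. In real speech the vocal tract filter shapes them into the formant envelope described above.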
Whispered speech: vowel in the English word "heard"
The upper trace shows the oscillogram of the output sound of the whispered vowel in the English word "heard". The lower trace shows an oscillogram of the EGG signal during production of the same sample. For whispering, the glottis is open and varies little with time, so this EGG signal does not show the source function, which is mainly due to turbulent noise.
The sound of the whispered vowel whose signal is shown here.
The smaller EGG signal shows us that the folds are barely moving. Air flows through the slightly open glottis and becomes turbulent, giving a completely aperiodic sound source. Thus, like the croak voice, the whisper voice produces a smooth envelope for the sound and EGG spectra.
Sound file of the output sound for this vowel.
Spectrum of the output sound of the whispered vowel in "heard".
Creak voice: vowel in the English word "heard"
The upper trace shows the oscillogram of the output sound of the vowel in the English word "heard". The subject is using the aperiodic "croak" voice, also called "creak" or "vocal fry". The lower trace shows, for the same sample, a signal closely related to the source signal at the glottis: it is an oscillogram of the EGG signal during creak production of the vowel in "heard". In both traces, the vertical units are arbitrary. Note the ringing of the vocal tract (upper trace) after the nearly pulsatile stimulation (lower trace).
The time between peaks in the lower signal is about 30 ms, but the time varies from one cycle to the next, which is why we call it aperiodic.
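One simple way to quantify this cycle-to-cycle variation is to list the intervals between successive peaks and look at their spread. The sketch below does that; the peak times are made-up illustrative values chosen to average 30 ms, not measurements taken from the trace, and the variable names are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical peak times in ms (illustrative only, NOT measured data).
peak_times_ms = [0.0, 28.0, 61.0, 88.0, 121.0, 150.0]

# Interval between each pair of successive peaks.
intervals_ms = [b - a for a, b in zip(peak_times_ms, peak_times_ms[1:])]

mean_interval_ms = mean(intervals_ms)   # average cycle length
spread_ms = pstdev(intervals_ms)        # population standard deviation
relative_spread = spread_ms / mean_interval_ms

print(intervals_ms, mean_interval_ms, round(relative_spread, 3))
```

For a strictly periodic signal every interval would be equal and the spread would be zero. A non-zero spread of this kind is what is meant here by aperiodic.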
Spectrum of the EGG signal during production of the creak vowel in "heard".
Spectrum of the output sound of the croak vowel in "heard". The croak signal is aperiodic. The frequencies of vibration are much lower than for normal speech, and the spectral components are broad rather than narrow harmonics. This results in a smoother envelope for the sound spectrum, which allows better resolution in detecting the vocal tract resonances.
Sound file of the source sound for this croaked vowel.
Some related pages and explanatory notes
The measurements above were made by Yoni Swerdlin and Joe Wolfe, as an introductory part of Yoni's undergraduate thesis. The results of that thesis are now published: