Representation of Complex Sounds
Complex periodic sounds can be represented as the sum of a number of pure tone components.
The pure tone components are found at integer multiples of the fundamental frequency of the sound and are called harmonics.
The fundamental frequency, in the case of voiced speech sounds, corresponds to the rate of vibration of the larynx, that is, the number of opening-closing cycles of the larynx per second.
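The idea of building a complex periodic sound out of harmonics can be sketched in a few lines of Python. This is only an illustration: the function names and the amplitude values are assumptions made for the example, not measurements of any real voice.

```python
import math

def harmonic_frequencies(f0_hz, count):
    """Return the first `count` harmonics: integer multiples of f0."""
    return [k * f0_hz for k in range(1, count + 1)]

def complex_wave_sample(t_s, f0_hz, amplitudes):
    """Sum pure tones at the harmonics of f0, evaluated at time t_s (seconds).

    amplitudes[k - 1] is the amplitude of harmonic k (illustrative values).
    """
    return sum(
        a * math.sin(2 * math.pi * k * f0_hz * t_s)
        for k, a in enumerate(amplitudes, start=1)
    )

# Harmonics of a 100 Hz fundamental
print(harmonic_frequencies(100, 5))  # [100, 200, 300, 400, 500]

# One sample of a wave built from three harmonics (assumed amplitudes)
print(complex_wave_sample(0.003, 100, [1.0, 0.5, 0.25]))
```

Because every component is a multiple of 100 Hz, the summed wave repeats every 1/100 s, which is why the complex sound is heard with a 100 Hz fundamental.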
The harmonic content of a complex sound can be represented in a graph called a spectrum.
- The horizontal axis of a spectrum corresponds to frequency.
- The vertical axis corresponds to the amplitude of each harmonic component in the complex wave.
Here is the spectrum of a vowel sound whose fundamental frequency is 100 Hz.
The voiced source and its spectrum
The airflow through the vibrating larynx is the sound source.
It is a complex wave, looking something like this:
What does it sound like?
Listen to a sentence from siSwati (a Southern Bantu language; Swaziland).
Listen to the glottal wave recorded from the subject's larynx while the sentence is being produced.
For more info on electroglottography (EGG) go here.
What is the spectrum of such a complex wave?
The harmonics of the voiced source decrease (exponentially) in amplitude as a function of frequency.
Filter Function of Vocal Tract
The supralaryngeal vocal tract can be characterized by a filter function, which specifies (for each frequency) the relative amount of energy that is passed through the filter and out the mouth.
The peaks in the filter function of the vocal tract are resonances of the vocal tract when it is in a particular configuration, and are referred to as formant frequencies.
For a relatively unconstricted 17 cm vocal tract, the resonances occur at about 500 Hz, 1500 Hz, 2500 Hz, and so on.
The resonances of an unconstricted tube or pipe (closed at one end and open at the other) are a function of the length of the tube:
f = n * c / (4 * L)
for n = 1, 3, 5, ...
f = formant frequency in Hz
c = speed of sound, 34,000 cm/s
L = length of vocal tract in cm
So the lowest formant frequency (n = 1) in a 17 cm vocal tract is:
f = c / (4 * L)
= 34,000 / (4 * 17)
= 500 Hz
And the spacing between formants is:
2 * c / (4 * L) = c / (2 * L)
(always twice the lowest f)
= 34,000 / (2 * 17)
= 1000 Hz
What are the formants for a young child (12 months)?
Vorperian et al. (2005). Development of vocal tract length during early childhood: A magnetic resonance imaging study. Journal of the Acoustical Society of America, 117, 338-350.
For an 8 cm vocal tract, the lowest resonance is:
f = c / (4 * L)
= 34,000 / (4 * 8)
= 1062.5 Hz
And the spacing between resonances will be about 2125 Hz (twice the lowest resonance).
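The two calculations above (17 cm and 8 cm vocal tracts) follow the same quarter-wavelength rule, so they can be checked with one small function. This is a minimal sketch: the name tube_resonances and its parameters are made up for the example, and it assumes an idealized uniform tube with c = 34,000 cm/s as in the text.

```python
def tube_resonances(length_cm, count=3, speed_of_sound_cm_s=34_000):
    """Resonances of an idealized uniform tube: f = n * c / (4 * L), n = 1, 3, 5, ...

    Returns the first `count` resonance frequencies in Hz.
    """
    return [
        (2 * k - 1) * speed_of_sound_cm_s / (4 * length_cm)
        for k in range(1, count + 1)
    ]

adult = tube_resonances(17)
child = tube_resonances(8)

print(adult)                # [500.0, 1500.0, 2500.0]
print(child)                # [1062.5, 3187.5, 5312.5]
print(child[1] - child[0])  # 2125.0, the spacing between resonances
```

Halving the tube length roughly doubles every resonance, which is why a young child's formants sit so much higher than an adult's.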
Combining Source and Filter
The output energy (at the mouth) for a given frequency is equal to the amplitude of the source harmonic, multiplied by the magnitude of the filter function for that frequency.
Source-filter examples: F0 = 100 Hz, F0 = 200 Hz
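The multiplication of source and filter can be sketched as follows for F0 = 100 Hz and F0 = 200 Hz. Everything specific here is an assumption made for illustration: the exponential decay rate of the source, the shape and bandwidth of the resonance peaks, and the function names are hypothetical, and the formants are those of the idealized 17 cm tube (500, 1500, 2500 Hz).

```python
import math

def source_amplitude(freq_hz, f0_hz, decay_per_khz=2.0):
    """Hypothetical voiced source: amplitude decays exponentially with frequency."""
    return math.exp(-decay_per_khz * (freq_hz - f0_hz) / 1000.0)

def filter_gain(freq_hz, formants_hz=(500, 1500, 2500), bandwidth_hz=100.0):
    """Hypothetical filter function: one simple resonance peak per formant."""
    half_bw = bandwidth_hz / 2
    return sum(
        1.0 / (1.0 + ((freq_hz - fc) / half_bw) ** 2)
        for fc in formants_hz
    )

def output_spectrum(f0_hz, max_hz=3000):
    """Output amplitude of each harmonic = source amplitude * filter gain."""
    spectrum = []
    for k in range(1, int(max_hz // f0_hz) + 1):
        freq = k * f0_hz
        amp = source_amplitude(freq, f0_hz) * filter_gain(freq)
        spectrum.append((freq, amp))
    return spectrum

for f0 in (100, 200):
    print(f"F0 = {f0} Hz")
    for freq, amp in output_spectrum(f0):
        print(f"  {freq:5d} Hz  {amp:.3f}")
```

With F0 = 100 Hz there is a harmonic exactly at each formant. With F0 = 200 Hz the harmonics fall at 400 and 600 Hz but not at 500 Hz, so the same filter function is sampled more sparsely by the source.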