Formation of the Actor’s/Speaker’s Formant: A Study
Applying Spectrum Analysis and Computer Modeling
*Timo Leino, *Anne-Maria Laukkanen, and †Vojtěch Radolf, *Tampere, Finland, †Prague, Czech Republic
Summary: Hypothesis. A strong peak between 3 and 4 kHz in the long-term average spectrum (LTAS) of speech
has been found to be one correlate of a good male speaking voice, for example, among actors. The actor’s or speaker’s
formant (resembling the singer’s formant) can be established by certain vocal training. This study investigates the origin
of the speaker’s formant.
Study Design and Setting. The immediate effects of a vocal exercise series on speaking voice were studied in
a Finnish male actor, who is an experienced teacher of the exercises. They consist of nasal vowel syllable strings
and words containing nasals. Before and after 30 minutes of exercising, the subject (1) read aloud at three loudness levels
and (2) phonated the Finnish vowels at habitual level.
Methods. Formant frequencies were estimated from spectra of the vowel samples. LTAS was made and equivalent
sound level (Leq) was measured for the text samples. Formant frequencies were used as the input for a one-dimensional
(1D) mathematical model.
Results. After the exercise, the peak at 3.5 kHz in the LTAS of the reading samples was stronger, although Leq was the
same as before, suggesting a level-independent resonance change. Reading samples after exercising were evaluated to
sound better in voice quality than before exercising. The strong peak at 3.5 kHz was present in all vowels, and it was
mainly formed by clustering of F4 and F5.
Conclusions. A 1D model-based optimization suggested that this kind of formant cluster could be best established
by simultaneously narrowing the epilaryngeal tube, widening the pharynx and narrowing the front of the oral cavity.
Key Words: Vocal exercising–Voice quality–Spectrum analysis–Mathematical modeling–Optimization.
Accepted for publication October 8, 2009.
From the *Department of Speech Communication and Voice Research, University of Tampere, Tampere, Finland; and the †Department of Dynamics and Vibrations, Institute of Thermomechanics, the Academy of Sciences of the Czech Republic, Prague, Czech Republic.
Address correspondence and reprint requests to Timo Leino, Department of Speech Communication and Voice Research, FIN-33014, University of Tampere, Tampere, Finland. E-mail: Timo.Leino@uta.fi
Journal of Voice, Vol. 25, No. 2, pp. 150-158
0892-1997/$36.00
© 2011 The Voice Foundation
doi:10.1016/j.jvoice.2009.10.002

INTRODUCTION
The long-term average spectrum (LTAS) provides information on the spectral distribution of the speech signal over a period of time.1 If the signal is sufficiently long (eg, 1 minute) and phonetically balanced, LTAS gives information on the average voice quality.1 The method is attractive, because it is easy to use in the vocologist’s daily routine. The method has been found to distinguish between normal and pathological speaking voices and between different degrees of hoarseness.2,3 It also reveals differences between phonation types.4–7 It has, moreover, been used to study singing voice in different styles and song genres8–10 and the effects of singing voice training.11 Leino12 applied LTAS for studying the speaking-voice quality of professional male actors. In these results, samples evaluated as representing poor voice quality were distinguished from those with rather poor, fairly good, and good voice qualities by the steepest spectral slope.12 The best voices, in turn, were characterized especially by a prominent peak between 3 and 4 kHz.12

The concept “good voice quality” is naturally very difficult to define exhaustively and most likely consists of various elements. However, Leino’s12 results suggest that the shape of LTAS has a certain perceptual relevance in the evaluation of voice quality. A gentle spectral slope and a prominent peak at 3.5 kHz seem to be some of the features often characterizing a good male speaking voice.

These spectral characteristics were taken as goals on a special 8-month intense voice training course for student actors.13 Real-time spectrum analysis was used as an aid in the training sessions. According to the LTAS results of speaking samples recorded before and after the training period, the goals of the training were reached. The samples were also evaluated to sound better after training. The listeners consisted of theater and speech professionals, student actors, and university students. The results confirm the earlier finding that the spectral slope and prominence of the peak between 3 and 4 kHz are characteristics of a good voice quality in speech.

A relatively strong peak at 3.5 kHz can also be seen in the LTAS of speakers of other languages than Finnish: for example, in the article by Dejonckere, it can be seen for a French-speaking male subject14; in the article by Frøkjær-Jensen and Prytz, for a Danish speaker15; and in the book by Nolan, for an English speaker.6 Nawka et al have reported it in good voices of German speakers16; Bele17 observed it in Norwegian male professional speakers (actors and teachers); and Master et al18 observed it in Brazilian Portuguese-speaking male actors. Cleveland et al report it in country singers’ singing voices.9

The prominent peak between 3 and 4 kHz in the LTAS of a good male speaking voice seems to resemble the singer’s formant, a strong energy concentration between 2 and 3 kHz in male operatic singing voice19 (Figure 1). Although the singer’s formant lies lower in frequency and is stronger than the “actor’s formant,” both seem to be correlates of good voice quality, and both can be achieved through training. However, an actor’s formant can also be seen in the LTAS of untrained good male voices, whereas the singer’s formant is mainly achieved through classical singing training.
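As an illustration of the LTAS measures discussed in this introduction, the following Python sketch averages Hann-windowed power spectra over a signal and compares the strongest component between 3 and 4 kHz with the strongest component below 1 kHz. It is not part of the original study: the function names, the frame length, the small pure-Python FFT, and the synthetic three-tone signal are assumptions made only for illustration.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def ltas_db(signal, fs, frame_len=512):
    """Average the power spectra of Hann-windowed, half-overlapping frames."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
              for i in range(frame_len)]
    n_bins = frame_len // 2 + 1
    acc = [0.0] * n_bins
    n_frames = 0
    for start in range(0, len(signal) - frame_len + 1, frame_len // 2):
        spec = fft([signal[start + i] * window[i] for i in range(frame_len)])
        for k in range(n_bins):
            acc[k] += abs(spec[k]) ** 2
        n_frames += 1
    freqs = [k * fs / frame_len for k in range(n_bins)]
    # Small offset avoids log10(0) in empty bins.
    levels = [10 * math.log10(a / n_frames + 1e-20) for a in acc]
    return freqs, levels

def band_peak(freqs, levels, lo_hz, hi_hz):
    """Return (frequency, level) of the strongest bin within [lo_hz, hi_hz]."""
    in_band = [(f, l) for f, l in zip(freqs, levels) if lo_hz <= f <= hi_hz]
    return max(in_band, key=lambda pair: pair[1])

# Synthetic stand-in for speech (assumed, not data from the study):
# a strong low component plus weaker components at 2.5 and 3.5 kHz.
fs = 16000
signal = [math.sin(2 * math.pi * 250 * n / fs)
          + 0.05 * math.sin(2 * math.pi * 2500 * n / fs)
          + 0.3 * math.sin(2 * math.pi * 3500 * n / fs)
          for n in range(4096)]

freqs, levels = ltas_db(signal, fs)
low_peak = band_peak(freqs, levels, 0, 1000)
actor_band_peak = band_peak(freqs, levels, 3000, 4000)
relative_level_db = actor_band_peak[1] - low_peak[1]
```

In this toy signal, `relative_level_db` expresses how far the 3 to 4 kHz peak lies below the strongest low-frequency peak; a larger (less negative) value would correspond to a more prominent peak in that band. Real speech would additionally require excluding voiceless segments and using a much longer sample.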
Timo Leino, et al Actor’s Formant 151
FIGURE 1. LTAS of a reading sample of the famous late Finnish radio announcer Carl-Eric Creutz (gray line) and a singing sample of the Finnish baritone Jorma Hynninen (black line). Reprinted with permission from Laukkanen A-M, Leino T. Ihmeellinen ihmisääni [The Amazing Human Voice]. Helsinki: Gaudeamus; 1999.

The acoustic differences between the singer’s formant and the actor’s formant seem to have perceptual relevance. Some trained operatic (male) singers tend to use the same kind of voice quality in both singing and speaking. In the LTAS of such speaking samples, no actor’s formant can be seen, but instead, a prominent peak between 2 and 3 kHz can be seen, although it is not so strong as in singing. According to the listening evaluations conducted by Leino, these kinds of speaking samples of singers tend to score lower than the good voices with the 3.5-kHz peak. Listeners have commented that the voice sounds strange, not like a normal speaking voice. Thus, there seems to be a certain timbre difference related to the exact location of the high-frequency peak in the LTAS of speaking. This corroborates with the results of Berndtsson and Sundberg on the effects of the location of the singer’s formant on perceived voice classification.20 The lower frequency of the peak of singers may be related to lowering of the larynx.

Singer’s formant has been explained as the consequence of clustering of the upper formants F3, F4, and F5 occurring when the laryngeal tube forms an independent resonator.19 This happens when the cross-sectional area of the outlet of the larynx tube is sufficiently different from the cross-sectional area of the pharynx. This effect can be obtained by lowering the larynx.19 The role of a laryngeal resonator in the formation of the singer’s formant has been questioned, for example, by Dettweiler.21 There are also observations of singers capable of producing a singer’s formant although the larynx rises.22 This would suggest that there may also be other mechanisms to establish the singer’s formant.

The present study investigates the formation of the actor’s or speaker’s formant in one male subject. Earlier observations by the first author using spectrum analysis as an aid in vocal training sessions have shown that the prominence of the actor’s formant may be increased through a short vocal warm-up exercising, at least if this characteristic of speech has been established earlier through training. Therefore, the present study concentrates on the vocal changes after a 30-minute special vocal warm-up exercise. To obtain an estimate of the origin of the increased prominence of the actor’s formant, the results from the subject were further studied using an optimization approach based on one-dimensional (1D) mathematical modeling.23

MATERIALS AND METHODS
Subject and tasks
A professional Finnish male actor (1) read aloud from a text at three loudness levels (habitual, softer, and louder) and (2) phonated the eight Finnish vowels separately at habitual loudness before and after a vocal warm-up exercise of 30 minutes. Vowel and text reading samples and vocal exercising were recorded in a sound-treated studio with a digital recorder Tascam DA-20 (Teac Corporation, Tokyo, Japan) and Brüel & Kjær 4165 omnidirectional microphone (Brüel & Kjær Sound & Vibration Measurement A/S, Naerum, Denmark); the mouth to microphone distance was 40 cm, and the distance was controlled carefully by measuring and monitoring the subject’s position during recording. The recordings were calibrated for equivalent sound level (Leq) measurement using a sine wave generator and a sound-level meter (Brüel & Kjær Frequency Analyzer 2120).

Vocal warm-up was performed using the “Kuukka exercise series.”24 Niilo Kuukka, the well-known late Finnish voice trainer from the Theatre Academy of Finland, designed this exercise series especially for student actors to develop a well-projecting stage speech. The series is also used by many actors for warming up before entering the stage. The exercises consist of nasal vowel syllable strings and words containing nasals phonated at varying pitch and loudness and with varying rhythm patterns. (For a detailed description, see Laukkanen et al.24) The aim is a clear, “well-resonating,” “ringing,” and well-carrying voice quality. The subject in the present study was chosen, because he is an experienced teacher of this exercise series and because earlier recordings and acoustic and perceptual analyses have shown a clear change in his speech after warming up with these exercises.

Analyses
LTAS analysis was performed on the text reading samples with Hewlett-Packard (HP) Dynamic Signal Analyzer 3561 A (Hewlett-Packard, Palo Alto, CA). 10-kHz range and Hanning window were used. Voiceless sounds were excluded from the analysis.

LTAS of the samples before and after exercising were compared with each other by normalizing them according to the strongest spectral peak below 1 kHz. In this way, the spectral slope differences can be studied visually.

Fast Fourier Transform (FFT) line spectra and spectrograms were made for the separately phonated vowels recorded before and after exercising. The analyses were performed with a signal analysis system Intelligent Speech Analyser (ISA), developed by Raimo Toivonen, M.Sc. Eng. Formant frequencies were estimated on the basis of manual measurements of local amplitude maxima in the spectra. Leq of the 1-minute reading samples was measured with ISA. Linear weighting was used.
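The level measurement, the LTAS normalization, and the peak picking described in the Analyses can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure implemented in ISA or in the HP analyzer: the function names, the reference value, and the toy spectra are hypothetical and do not represent measurements from the study.

```python
import math

def leq_db(samples, ref=1.0):
    """Equivalent level in dB re `ref`: 10*log10 of the mean square.

    No frequency weighting is applied (linear weighting).
    """
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square / (ref * ref))

def normalize_to_low_peak(freqs_hz, levels_db, limit_hz=1000.0):
    """Shift a spectrum so its strongest component below `limit_hz` is at 0 dB."""
    ref = max(l for f, l in zip(freqs_hz, levels_db) if f < limit_hz)
    return [l - ref for l in levels_db]

def local_maxima(freqs_hz, levels_db):
    """Frequencies of bins higher than both neighbours (candidate peaks)."""
    return [freqs_hz[i] for i in range(1, len(levels_db) - 1)
            if levels_db[i - 1] < levels_db[i] > levels_db[i + 1]]

# Hypothetical example data (assumed for illustration only).
fs = 8000
tone = [math.sin(2 * math.pi * 100 * n / fs) for n in range(fs)]
tone_leq = leq_db(tone)  # unit-amplitude sine, whole number of periods

freqs = [250.0, 500.0, 1000.0, 2000.0, 3500.0, 5000.0]
before = [62.0, 70.0, 60.0, 48.0, 40.0, 30.0]
after = [60.0, 66.0, 58.0, 47.0, 50.0, 29.0]

before_norm = normalize_to_low_peak(freqs, before)
after_norm = normalize_to_low_peak(freqs, after)
gain_at_3500 = after_norm[4] - before_norm[4]
peaks_after = local_maxima(freqs, after)
```

Normalizing both spectra to their strongest peak below 1 kHz removes overall level differences, so `gain_at_3500` reflects a change in spectral shape rather than in loudness. Automatic peak picking such as `local_maxima` only approximates the manual formant measurements reported in the study.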
152 Journal of Voice, Vol. 25, No. 2, 2011
FIGURE 2. Schema of 1D modeling based on magnetic resonance imaging (MRI). A. MRI registration of the vocal tract of a Czech male speaker phonating on [a:]. B. Dividing the 3D model into 24 cross-sections. C. Reconstruction of a model for [a:] with 1D conical elements. Reprinted with permission from Laukkanen A-M, Radolf V, Horáček J, Leino T. Estimation of the origin of a speaker’s and singer’s formant cluster using an optimization of 1D vocal tract model. Proceedings of the 3rd Advanced Voice Function Assessment International Workshop, 18th-20th May 2009, Madrid, Spain.

Listening evaluation
Text reading and vowel samples were evaluated in a sound-treated room by four theater school teachers and seven student actors. The samples were replayed with DAT recorder Tascam DA-20 (Teac Corporation, Tokyo, Japan) and Genelec Biamp loudspeaker (Genelec, Iisalmi, Finland). Samples recorded before and after exercising were replayed in randomized pairs. The listeners’ task was to judge which sample in each pair represented better voice quality or whether there was no difference between the samples. The listeners were requested to pay attention to voice and not to articulation, interpretation, and others. However, good voice quality was not predefined.

Modeling approach
The possible vocal tract changes resulting in the formation of an actor’s formant were studied using a 1D mathematical model of voice production. The 1D vocal tract model was developed from a 3D volume model obtained from the magnetic resonance images of a Czech male speaker.23 Figure 2 illustrates the scheme of the modeling. The formant frequencies measured from the vowels [a:, i:] recorded from the subject of the present study before and after exercising were given to the model, and through a tuning procedure (changing the vocal tract shape, ie, the size of area cross-sections and the length of the conical elements), the best-fitting vocal tract configurations were obtained. These vowels, that is, an open back vowel and a closed front vowel, were chosen, because they represent two opposite vocal tract settings. Details of the model are presented as follows.

One-dimensional model with variable cross-sections and lengths. The transfer matrix method applying conic elements was used. The basis of this method is the wave equation of an acoustic duct with variable cross-section $A(x)$ and viscous losses (specific acoustic resistance $r_S$)25

$$\frac{\partial^2 \varphi}{\partial x^2} + \frac{1}{A}\cdot\frac{\partial A}{\partial x}\cdot\frac{\partial \varphi}{\partial x} - \frac{1}{c_0^2}\cdot\left(\frac{\partial^2 \varphi}{\partial t^2} + \frac{r_S}{\rho_0}\cdot\frac{\partial \varphi}{\partial t}\right) = 0. \tag{1}$$

The vocal tract was constructed of 23 conic elements as a model of oral and epilaryngeal cavities from vocal folds to mouth. Radiation impedance (RI) was considered.

We can describe the relation between the input and output of the vocal tract by the equation

$$\begin{bmatrix} p_{OUT} \\ W_{OUT} \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \cdot \begin{bmatrix} p_{IN} \\ W_{IN} \end{bmatrix}, \tag{RI 1}$$

where $p$ is the acoustic pressure, $W$ is the volume velocity, and

$$T_{OUT,IN} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$
is the transfer matrix of the acoustic system (vocal tract). Complex elements $a$, $b$, $c$, and $d$ are functions of the vocal tract shape, medium (air density $\rho_0$, viscosity $\mu$, and sound velocity $c_0$), and frequency $f$ of the propagating wave.

For the eigenfrequency calculation, we assume the boundary conditions:
– The input is closed (by rigid wall) and hence $W_{IN} = 0$
– The output loaded by the acoustic RI of vibrating circular plate with radius $R$ placed in an infinite wall23

$$Z_{A\,rad} = \frac{c_0 \rho_0}{\pi R^2}\cdot\left[1 - \frac{J_1(2kR)}{kR} + j\,\frac{H_1(2kR)}{kR}\right], \tag{RI 2}$$

where $J_1$ is the Bessel function of the first kind of order 1, $H_1$ is the Struve function of order 1, and $k$ is the wave number.

As

$$Z_{A\,rad} = \frac{p_{OUT}}{W_{OUT}}, \tag{RI 3}$$

we obtain

$$\begin{bmatrix} Z_{A\,rad}\cdot W_{OUT} \\ W_{OUT} \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \cdot \begin{bmatrix} p_{IN} \\ 0 \end{bmatrix} \tag{RI 4}$$

and after eliminating $p_{IN}$

$$a - Z_{A\,rad}\cdot c = 0, \tag{RI 5}$$

which is the frequency equation for a complex wave number $k_n$. From the relation

$$k = \frac{\omega}{c_0} = \frac{2\pi f}{c_0} \tag{RI 6}$$

we can obtain an eigenfrequency $f_n$ in Hertz as

$$f_n = \operatorname{Re}\{k_n\}\cdot\frac{c_0}{2\pi}. \tag{RI 7}$$

We derived that the sensitivity of a particular eigenfrequency $f_n$ to a change in cross-sectional area $A_i$ in dimensionless form

$$S^{A}_{i,n} = \frac{\Delta f_n / f_n}{\Delta A_i / A_i} \tag{2}$$

can be substituted by the relation

$$S^{\tilde{A}}_{i,n} = \frac{1}{2}\left\{ L_{i-1}\,\partial_i^{\,i-1}\cdot\left[\frac{\rho_0}{2A_{i-1\,red}^2}\left(W_{i-1,n}^2 + W_{i,n}^2\right) - \frac{1}{\rho_0 c_0^2}\,p_{i-1,n}^2\right] + L_i\,\partial_i^{\,i}\cdot\left[\frac{\rho_0}{2A_{i\,red}^2}\left(W_{i,n}^2 + W_{i+1,n}^2\right) - \frac{1}{\rho_0 c_0^2}\,p_{i,n}^2\right]\right\} \times \frac{A_i}{\sum_{e=2}^{N_e+1} m_e W_{e,n}^2}, \tag{3}$$

where $A_{i\,red}$ is a reduced cross-sectional area of an $i$th conic element

$$A_{i\,red} = \frac{1}{3}\left(A_i + \sqrt{A_i A_{i+1}} + A_{i+1}\right), \tag{4}$$

$m_i$ is an acoustic mass lumped into $i$th cross-section

$$m_i = \frac{\rho_0}{2}\left(\frac{L_{i-1}}{A_{i-1\,red}} + \frac{L_i}{A_{i\,red}}\right), \tag{5}$$

$\partial_i^{\,i-1}$ and $\partial_i^{\,i}$ are partial derivatives of the reduced cross-section.

$$\partial_i^{\,i-1} = \frac{\partial A_{i-1\,red}}{\partial A_i} = \frac{1}{3}\cdot\left(1 + \frac{1}{2}\sqrt{\frac{A_{i-1}}{A_i}}\right), \tag{6}$$

$$\partial_i^{\,i} = \frac{\partial A_{i\,red}}{\partial A_i} = \frac{1}{3}\cdot\left(1 + \frac{1}{2}\sqrt{\frac{A_{i+1}}{A_i}}\right), \tag{7}$$

$N_e$ is number of conic elements, $p_{i,n}$ is an amplitude of acoustic pressure in the $i$th cross-section for the $n$th eigenfrequency $f_n$, $W_{i,n}$ is an amplitude of volume velocity in the $i$th cross-section for the $n$th eigenfrequency $f_n$.

Moreover, we derived that the sensitivity of a particular eigenfrequency $f_n$ to a change in length $L_i$ in dimensionless form

$$S^{L}_{i,n} = \frac{\Delta f_n / f_n}{\Delta L_i / L_i} \tag{8}$$

can be substituted by the relation

$$S^{\tilde{L}}_{i,n} = -\frac{1}{2}\left[\frac{A_{i\,red}}{\rho_0 c_0^2}\,p_{i,n}^2 + \frac{\rho_0}{2A_{i\,red}}\left(W_{i,n}^2 + W_{i+1,n}^2\right)\right] \times \frac{L_i}{\sum_{e=2}^{N_e+1} m_e W_{e,n}^2}. \tag{9}$$

Close to the results of Story,26 using the sensitivity function, an iterative process was used for an $i$th cross-sectional area computation

$$(A_i)_{k+1} = (A_i)_k \cdot \left[1 + \sum_n \left(z_n \cdot S^{\tilde{A}}_{i,n}\right)_k\right], \tag{10}$$

and for an $i$th element-length computation

$$(L_i)_{k+1} = (L_i)_k \cdot \left[1 + \sum_n \left(z_n \cdot S^{\tilde{L}}_{i,n}\right)_k\right], \tag{11}$$

where $z_n$ is a function of the difference between desired and instantaneous $n$th eigenfrequency.

The speed of sound, the density, and the dynamic viscosity of the air were considered as follows: $c_0 = 353\ \mathrm{m\,s^{-1}}$; $\rho_0 = 1.2\ \mathrm{kg\,m^{-3}}$, $\mu = 1.8 \times 10^{-5}\ \mathrm{kg\,m^{-1}\,s^{-1}}$.

The geometry marked “before (warming up)” was obtained by the iterative process (mentioned earlier) with initial geometry for the (Czech) vowels ([a:] and [i:]), which was known. Desired eigenfrequencies for these vowels were set to formant values:

Vowel [a:]: 597, 989, 2711, 3768, 4220 Hz
Vowel [i:]: 385, 2348, 2590, 3858, 4341 Hz

This computed geometry became the initial geometry for the next step. In this step, desired eigenfrequencies were set to formant values after warming up. These values were:

Vowel [a:]: 687, 1020, 2801, 3556, 3918 Hz
Vowel [i:]: 355, 2439, 3103, 3737, 4039 Hz

The variables were all of 24 cross-sections and the lengths of the first, second, and the last element. However, in some cases, the iteration process converged better without variable lengths.

RESULTS
Long-term average spectrum and listening evaluation
Figure 3 shows LTAS of reading samples produced before and after exercising. The sample recorded before warm-up (Figure 3A) represents loud reading and that recorded after it represents reading at “habitual” loudness; yet, Leq was the same in these samples (83 dB, at a mouth to microphone distance of 40 cm). In spite of the same Leq in the samples, the speaker’s formant is much more prominent in the sample recorded after exercising. This suggests a level-independent change in voice quality, either because of a change in voice source, in formant frequencies, or both.

FIGURE 3. LTAS of a reading sample before (black line) and after (gray line) vocal warm-up. A. Sound pressure level 83 dB in both samples and B. Sound pressure level 75 and 76 dB, respectively.

Figure 3B shows the LTAS of softer reading. The sample recorded before exercising represents reading at habitual loudness and the sample after exercising represents reading at “softer” loudness. Leq was approximately the same in both samples (75 and 76 dB, respectively). In these samples, the difference is similar to that found for louder reading, although the actor’s formant is not as prominent. In the LTAS of all the reading samples after warm-up, the third peak (F3 variation range) had also shifted higher in frequency.

Table 1 summarizes the results of the listening evaluation. It can be seen that among listeners, there was a high degree of agreement regarding voice quality in that it was better in the samples recorded after exercising. However, the teachers seemed to prefer the samples recorded after exercising more than did the students. This may be a sign of a better auditory discrimination ability in the teachers or may be related to a difference in taste.

TABLE 1.
Evaluation of Voice Quality in Soft and Loud Text Reading and Production of Vowels Before and After Vocal Warm-up

          Before   After   No difference
Soft      3 (S)    7       1 (T)
Loud      2 (S)    9       -
Vowels    1 (S)    10      -

Abbreviations: S, student actor; T, theater school teacher.
The number of listeners regarding the voice better/finding no differences between the samples before and after. Listeners n = 11 in total.

Separately produced vowels
Examples of FFT spectra from vowel samples before and after exercising can be seen in Figure 4. After exercising, the formants F4 and F5 have come closer in frequency, thus forming a cluster. Such a cluster was present in all vowels. Figure 5 shows the changes in formant frequencies. The trend was that F3 increased and F4 and F5 decreased. The rise of F3 may explain the frequency shift of the third peak in LTAS (Figure 3). Changes in F4 and F5 seemed to be less marked in rounded vowels.

Results from modeling
Figure 6 shows how the shape of the vocal tract can be changed by changing the spectrum of acoustic signals according to the formant values measured for the subject before and after exercising. A good qualitative agreement was seen with the acoustic spectra and the corresponding computed transfer functions,
suggesting that a 1D model has applicability in estimating the background of the effects of vocal training. According to the results, a speaker’s formant in vowel [a:] could be obtained through a slight narrowing of the epilaryngeal region, widening of the back of the oral cavity, and narrowing of the front part of it (Figure 6A, B). Similar results were obtained for [i:], although changes in the epilaryngeal region were very small (Figure 6C, D).

FIGURE 4. LTAS of separately phonated Finnish vowels [ae:] A. and [y:] B. before (dashed line) and after (solid line) warm-up.

FIGURE 5. Changes in formant frequencies (F1–F5) of the eight Finnish vowels measured before and after warm-up.

DISCUSSION
The results suggest that the actor’s/speaker’s formant is formed by a clustering of the upper formants. The prominence of the actor’s formant seems to increase through a vocal warm-up. One may indeed ask whether this kind of vocal exercising should be called warming up or placement exercising. The former refers to exercising that aims at immediate positive vocal effects, possibly explicable through physiological changes in the laryngeal muscles and vocal fold tissue. The latter implies that a change in the phonatory and/or articulatory setting is aimed at. Probably, both factors are usually involved in vocal exercising.

Nolan6 conducted experiments in voice quality variation according to the classification presented by Laver.27 In the LTAS of the habitual speaking sample of Nolan, there is a clear peak at 3.5 kHz. In the samples of creak and creaky voice, the peak is stronger, whereas it is largely absent in the samples representing whispery voice and falsetto and speaking with lowered or raised larynx. Leino has also found cases in which the strong peak at 3.5 kHz was exceptionally related to creaky voice quality. The resonance frequency of the laryngeal tube is approximately 3.5 kHz.28 Nolan considers the possibility that the 3.5-kHz peak is formed in the same way as the singer’s formant according to Sundberg,19 that is, resulting when the cross-sectional area of the outlet of the larynx tube is sufficiently different from the cross-sectional area of the pharynx. The calculations by Titze and Story29 suggest that a narrowing of the epilaryngeal region tends to raise the three lowest formants of the vocal tract and lower F4 and F5. As the authors put it “The narrowed epilarynx tube therefore ‘attracts’ all formant frequencies toward the 2500–3000 Hz region.” That is the frequency range of the singer’s formant. This kind of epilaryngeal setting most likely plays a role in the formation of the speaker’s formant as well. The somewhat vowel-dependent changes in F3, F4, and F5 could be explained by an epilaryngeal narrowing. Some vowel dependency on the changes in formant frequencies is understandable. The same change in articulation is likely to result in different changes in the formant frequencies of different vowels.28

The results of modeling suggest that a speaker’s formant could be obtained through a slight narrowing of the
epilaryngeal region, widening of the back of the mouth cavity, and narrowing of the front part of it (Figure 2). The changes in the mouth cavity would imply a more frontal position of the tongue. A slight lowering of the larynx could also result in a narrowing of the epilaryngeal tube and lowering of the tongue root.

FIGURE 6. Results from modeling. A. Possible geometry of the vocal tract for vowel [a:] before and after warm-up. B. Transfer function of the vocal tract for vowel [a:] before and after warm-up. C. Possible geometry of the vocal tract for vowel [i:] before and after warm-up. D. Transfer function of the vocal tract for vowel [i:] before and after warm-up.

It is known that the 1D models can replicate the behavior of the 3D vocal tract only up to 3000 Hz, because at higher frequencies, transverse modes occur in the 3D vocal tract model.23 However, the results obtained in the present study are fairly close to those reported by Sundberg19 and Titze and Story.29 Thus, a 1D model also seems to have applicability in estimating the background of the effects of vocal training.

The use of nasals, mainly /m:/, in vocal exercises, either in words or produced separately as prolonged (often as “humming”), is especially widespread in the voice training literature.30–33 Pahn32 has used nasalization exercises but mainly on [s] sound instead of [m, n] as used in the Kuukka exercises. The nasalization exercises of Pahn have been reported to lead to lowering of the laryngeal position and increased sound energy around 3 kHz.32,34 The fact that intensification of sound energy is seen at somewhat lower frequency than the “actor’s formant” in Finnish speakers may be explained by (a stronger) lowering of the larynx. The results by Perkell35 and Yanagisawa et al.36 suggest that nasals tend to lower the larynx. This could result in a relative narrowing of the epilaryngeal region. On the other hand, preceding antiresonances may also improve the perceptual prominence of spectral peaks in nasalization. Furthermore, the results obtained by Sundberg et al37 concerning classical singing suggest that nasalization may be used to attenuate F1, which in turn would enhance the relative level of the singer’s formant. These may also be among the reasons for using nasalization in speaking-voice exercises. It is clear, however, that voice quality after nasalization exercises must not sound nasal.

According to the calculations of Titze and Story,29 the three lowest nasal resonances were at 1575, 3150, and 4725 Hz. The effects of nasality on acoustic voice quality were found to be minor. A downward shift of F3 and F4 was reported. Titze and Story came to the conclusion that the benefit of nasalization, for example, in singing, is less acoustic than biomechanical.29 Nasals as vocal exercises warrant further study, for example, applying magnetic resonance imaging or X-ray registration of the vocal tract and electromyography of the laryngeal muscles.

The results of the listening evaluation suggest that there was a relation between increased prominence of the speaker’s formant and perception of a better voice quality. However, in theory, it is also possible that some other characteristics in speech may have affected the evaluation. On the other hand, the speech tempo and the lower formants (F1–F2), for instance, were not markedly different after warming up. Furthermore, the listeners were asked to pay attention to voice quality and not to other speech characteristics. Therefore, the results of the present study may be taken to support the earlier observations of the role of the speaker’s formant as one correlate of a good voice quality in speech.

The actor’s formant seems to have perceptual relevance at least from the aesthetic point of view. It gives the voice a ringing quality as the singer’s formant does to the singing voice, although the timbre is clearly different. However, the specific role of the singer’s formant is related to the audibility of the voice. The singer’s formant causes the singing voice to carry over the orchestra with relatively low load imposed on the vocal organ. The main role of the actor’s formant is most likely also related to audibility: Such a resonance-based energy concentration at the frequency range with relatively low auditory threshold is likely to increase the voice loudness in a voice-hygienic
way. Thus, an actor’s formant is an entirely reasonable goal in actors’ vocal training.

The frequency of the upper formants F4 and F5 varies little between vowels; therefore, these formants have been regarded to mainly reflect the general vocal tract setting without any marked language-related role. However, they vary according to consonant environment, which suggests that they may also have some linguistic role.28,38 It may, thus, be hypothesized that strengthening the upper formants could affect speech intelligibility by improving the recognition of consonants, for example, in impaired listening conditions. On the other hand, both the actor’s formant and the singer’s formant are very stable in frequency. Thus, it seems most likely that they cannot have any direct effect on speech intelligibility, because improving speech intelligibility requires increased differentiation between the speech sounds in the acoustic structure. However, both the singer’s formant and the actor’s formant may also affect speech intelligibility in an indirect way. By increasing the audibility of voice, they draw the listener’s attention to it and, thus, may also help the listener to solve the linguistic code of the signal.

The present study focused on only one subject. However, the results should be generalizable at least to some extent, because a similar speaker’s formant has been observed in the speaking samples of many other subjects, both in speakers of Finnish and other languages. In the same way as in the case of a singer’s formant, the phenomenon may be achieved slightly differently in different subjects because of differences in the vocal tract.

CONCLUSIONS
A strong sound energy concentration at about 3.5 kHz in speech, that is, an actor’s formant, can be strengthened through an exercise series containing nasals. The actor’s formant seems to be formed by a cluster of F3–F5 (increase of F3 and decrease of F4 and F5) as in the case of a singer’s formant.

Results of 1D modeling suggest that these changes can be achieved through epilaryngeal narrowing with a widening of the back of the oral cavity and a narrowing of the front part of it. A slight lowering of the larynx and/or a more frontal tongue position with lowering of the tongue root could result in these changes. 1D model seems to be applicable in studying

REFERENCES
1. Löfqvist A, Mandersson B. Long-time average spectrum of speech and voice analysis. Folia Phoniatr. 1987;39:221–229.
2. Wendler J, Doherty ET, Hollien H. Voice classification by means of long-term speech spectra. Folia Phoniatr. 1980;32:51–60.
3. Dejonckere PH, Villarosa D. Analyse spectrale moyennée de la voix. Comparaison de voix normales et de voix altérées par différentes catégories de pathologies laryngées. Acta Otorhinolar Belg. 1986;40:426–435.
4. Kitzing P. LTAS criteria pertinent to the measurement of voice quality. J Phon. 1986;14:477–482.
5. Pittam J. Discrimination of five voice qualities and prediction to perceptual ratings. Phonetica. 1987;44:38–49.
6. Nolan F. The Phonetic Bases of Speaker Recognition. Cambridge, UK: Cambridge University Press; 1983.
7. Laukkanen A-M, Björkner E, Sundberg J. Throaty voice quality: subglottal pressure, voice source and formant characteristics. J Voice. 2006;20:25–37.
8. Rossing T, Sundberg J, Ternström S. Acoustic comparison of voice use in solo and choir singing. J Acoust Soc Am. 1986;79:1975–1981.
9. Cleveland T, Sundberg J, Stone RE. Long-term-average spectrum characteristics of country singers during speaking and singing. J Voice. 2001;15:54–60.
10. Stone RE Jr, Cleveland TF, Sundberg PJ. Acoustic and aerodynamic characteristics of Country-Western, Operatic and Broadway singing styles compared to speech. J Acoust Soc Am. 2003;113:2242–2243.
11. Flach M, Schwickardi H, Dickopf G, Pabst F. DDR: Zur Beurteilung sängerischer Stimmenentwicklung mittels LTAS und Stimmfeldmessung. Folia Phoniatr. 1989;41:4–5.
12. Leino T. Long-term average spectrum study on speaking voice quality in male actors. In: Friberg A, Iwarsson J, Jansson E, Sundberg J, eds. SMAC93, Proceedings of the Stockholm Music Acoustics Conference, July 28-August 1, 1993. Stockholm, Sweden: The Royal Swedish Academy of Music; 1994:206–210.
13. Leino T, Kärkkäinen P. On the effects of vocal training on the speaking voice quality of male student actors. In: Elenius K, Branderud P, eds. Proceedings of the XIIIth International Congress of Phonetic Sciences, Stockholm, Sweden 13–19 August, 1995, Vol. 3 of 4. Stockholm, Sweden: Department of Speech Communication and Music Acoustics, Royal Institute of Technology, and the Department of Linguistics, Stockholm University; 1995:496–499.
14. Dejonckere PH. Analyse acoustique de la production vocale. Essai de synthèse dans une optique clinique. Acta Otorhinolar Belg. 1986;40:377–385.
15. Frøkjær-Jensen B, Prytz S. Registration of voice quality. Brüel Kjær Tech Rev. 1976;3:3–17.
16. Nawka T, Anders LC, Cebulla M, Zurakowski D. The speaker’s formant in male voices. J Voice. 1997;11:422–428.
17. Bele IV. The speaker’s formant. J Voice. 2006;20:555–578.
18. Master S, De Biase N, Chiari BM, Laukkanen A-M. Acoustic and percep-
tual analysis of Brazilian male actors and non-actors voice: long term aver-
the effects of vocal training. The role of nasals as vocal exer- age spectrum and the actor’s formant. J Voice. 2008;22:146–154.
cises should be studied further, for example, using computed to- 19. Sundberg J. Articulatory interpretation of the singing formant. J Acoust Soc
mography scanning of the vocal tract and electromyographic Am. 1974;55:838–844.
registration of the laryngeal muscles. 20. Berndtsson G, Sundberg J. Perceptual significance of the center frequency
of singer’s formant. Scand J Logop Phoniatr. 1995;20:35–41.
21. Dettweiler RF. An investigation of the laryngeal system as the resonance
Acknowledgments source of the singer’s formant. J Voice. 1994;18:303–313.
The authors warmly thank the subject for his patient participa- 22. Pabst F, Sundberg J. Tracking multi-channel electroglottograph measure-
ment of larynx height in singers. Speech Transm Lab Q Prog Status Rep.
tion in the studies. The valuable comments of Dr. Johan Sund-
1993;2–3:67–78.
berg are greatly appreciated. The assistance of special 23. Vampola T, Hora´cek J, Sˇvec J. FE modeling of human vocal tract acoustics.
laboratory technician Jussi Helin in preparing the analyses is Part I: Production of Czech vowels. Acta Acust. 2008;94:433–447.
also acknowledged. Mrs. Virginia Mattila is thanked for lan- 24. Laukkanen A-M, Syrja¨ T, Laitala M, Leino T. Effects of two-month vocal
guage correction. exercising with and without spectral biofeedback on student actors’ speak-
ing voice. Logoped Phoniatr Vocol. 2004;29:66–76.
This research was financially supported by the Grant Agency
25. Merhaut J. Theoretical Foundation of Electroacoustics. Prague, Czech Re-
of the Czech Republic, project No. 101/08/1155 ‘‘Computer public: Academia; 1971. (in Czech).
and physical modeling of vibroacoustic properties of human vo- 26. Story BH. Technique for ‘‘tuning’’ vocal tract area functions based on
cal tract for optimization of voice quality.’’ acoustic sensitivity functions. J Acoust Soc Am. 2006;119:715–718.
158 Journal of Voice, Vol. 25, No. 2, 2011
27. Laver J. The Phonetic Description of Voice Quality. Cambridge, UK: Cambridge University Press; 1980.
28. Fant G. Acoustic Theory of Speech Production. With Calculations Based on X-ray Studies of Russian Articulations. 2nd ed. The Hague, The Netherlands: Mouton; 1970.
29. Titze IR, Story BH. Acoustic interactions of the voice source with the lower vocal tract. J Acoust Soc Am. 1997;101:2234–2243.
30. Anderson V. Training the Speaking Voice. 2nd ed. New York: Oxford University Press; 1961.
31. Machlin E. Speech for the Stage. New York: Theatre Art Books; 1966.
32. Pahn J. Stimmübungen für Sprechen und Singen. Berlin, Germany: VEB Verlag Volk und Gesundheit; 1968.
33. Berry C. Your Voice and How to Use it Successfully. London, UK: Harrap; 1975.
34. Tinge GJ. The nasaling approach [abstract]. Folia Phoniatr. 1989;41:4–5.
35. Perkell JS. Physiology of Speech Production. Results and Implications of a Quantitative Cineradiographic Study. York, PA: The Massachusetts Institute of Technology, the Maple Press; 1969.
36. Yanagisawa E, Kmucha ST, Estill J. Role of the soft palate in laryngeal functions and selected voice qualities. Ann Otol Rhinol Laryngol. 1990;99:18–28.
37. Sundberg J, Birch P, Gümoes B, Stavad H, Prytz S, Karle A. Experimental findings on the nasal tract resonator in singing. J Voice. 2007;21:127–137.
38. Iivonen A, Laukkanen A-M. Explanations of the qualitative variation of Finnish vowels. In: Iivonen A, Lehtihalmes M, eds. Studies in Logopedics and Phonetics 4. Series B: Phonetics, Logopedics and Speech Communication 5. Helsinki, Finland: Department of Phonetics, University of Helsinki; 1993:29–54.