Formation of the Actor’s/Speaker’s Formant: A Study Applying Spectrum Analysis and Computer Modeling

*Timo Leino, *Anne-Maria Laukkanen, and †Vojtěch Radolf, *Tampere, Finland, †Prague, Czech Republic

Summary:
Hypothesis. A strong peak between 3 and 4 kHz in the long-term average spectrum (LTAS) of speech has been found to be one correlate of a good male speaking voice, for example, among actors. The actor’s or speaker’s formant (resembling the singer’s formant) can be established by certain vocal training. This study investigates the origin of the speaker’s formant.
Study Design and Setting. The immediate effects of a vocal exercise series on speaking voice were studied in a Finnish male actor, who is an experienced teacher of the exercises. They consist of nasal vowel syllable strings and words containing nasals. Before and after a 30-minute exercising, the subject (1) read aloud at three loudness levels and (2) phonated the Finnish vowels at habitual level.
Methods. Formant frequencies were estimated from spectra of the vowel samples. LTAS was made and equivalent sound level (Leq) was measured for the text samples. Formant frequencies were used as the input for a one-dimensional (1D) mathematical model.
Results. After the exercise, the peak at 3.5 kHz in the LTAS of the reading samples was stronger, although Leq was the same as before, suggesting a level-independent resonance change. Reading samples after exercising were evaluated to sound better in voice quality than before exercising. The strong peak at 3.5 kHz was present in all vowels, and it was mainly formed by clustering of F4 and F5.
Conclusions. A 1D model-based optimization suggested that this kind of a formant cluster could be best established by simultaneously narrowing the epilaryngeal tube, widening the pharynx and narrowing the front of the oral cavity.

Key Words: Vocal exercising–Voice quality–Spectrum analysis–Mathematical modeling–Optimization.

Accepted for publication October 8, 2009.
From the *Department of Speech Communication and Voice Research, University of Tampere, Tampere, Finland; and the †Department of Dynamics and Vibrations, Institute of Thermomechanics, the Academy of Sciences of the Czech Republic, Prague, Czech Republic.
Address correspondence and reprint requests to Timo Leino, Department of Speech Communication and Voice Research, FIN-33014, University of Tampere, Tampere, Finland. E-mail: Timo.Leino@uta.fi
Journal of Voice, Vol. 25, No. 2, pp. 150-158
0892-1997/$36.00
© 2011 The Voice Foundation
doi:10.1016/j.jvoice.2009.10.002

INTRODUCTION

The long-term average spectrum (LTAS) provides information on the spectral distribution of the speech signal over a period of time [1]. If the signal is sufficiently long (eg, 1 minute) and phonetically balanced, LTAS gives information on the average voice quality [1]. The method is attractive, because it is easy to use in the vocologist’s daily routine. The method has been found to distinguish between normal and pathological speaking voices and between different degrees of hoarseness [2,3]. It also reveals differences between phonation types [4–7]. It has, moreover, been used to study singing voice in different styles and song genres [8–10] and the effects of singing voice training [11]. Leino [12] applied LTAS for studying the speaking-voice quality of professional male actors. In these results, samples evaluated as representing poor voice quality were distinguished from those with rather poor, fairly good, and good voice qualities by the steepest spectral slope [12]. The best voices, in turn, were characterized especially by a prominent peak between 3 and 4 kHz [12].

The concept “good voice quality” is naturally very difficult to define exhaustively and most likely consists of various elements. However, Leino’s [12] results suggest that the shape of LTAS has a certain perceptual relevance in the evaluation of voice quality. A gentle spectral slope and a prominent peak at 3.5 kHz seem to be some of the features often characterizing a good male speaking voice.

These spectral characteristics were taken as goals on a special 8-month intense voice training course for student actors [13]. Real-time spectrum analysis was used as an aid in the training sessions. According to the LTAS results of speaking samples recorded before and after the training period, the goals of the training were reached. The samples were also evaluated to sound better after training. The listeners consisted of theater and speech professionals, student actors, and university students. The results confirm the earlier finding that the spectral slope and prominence of the peak between 3 and 4 kHz are characteristics of a good voice quality in speech.

A relatively strong peak at 3.5 kHz can also be seen in the LTAS of speakers of other languages than Finnish: for example, in the article by Dejonckere, it can be seen for a French-speaking male subject [14]; in the article by Frøkjær-Jensen and Prytz, for a Danish speaker [15]; and in the book by Nolan, for an English speaker [6]. Nawka et al have reported it in good voices of German speakers [16]; Bele [17] observed it in Norwegian male professional speakers (actors and teachers); and Master et al [18] observed it in Brazilian Portuguese-speaking male actors. Cleveland et al report it in country singers’ singing voices [9].

The prominent peak between 3 and 4 kHz in the LTAS of a good male speaking voice seems to resemble the singer’s formant, a strong energy concentration between 2 and 3 kHz in male operatic singing voice [19] (Figure 1). Although the singer’s formant lies lower in frequency and is stronger than the “actor’s formant,” both seem to be correlates of good voice quality, and both can be achieved through training. However, an actor’s formant can also be seen in the LTAS of untrained good male voices, whereas the singer’s formant is mainly achieved through classical singing training.

FIGURE 1. LTAS of a reading sample of the famous late Finnish radio announcer Carl-Eric Creutz (gray line) and a singing sample of the Finnish baritone Jorma Hynninen (black line). Reprinted with permission from Laukkanen A-M, Leino T. Ihmeellinen ihmisääni [The Amazing Human Voice]. Helsinki: Gaudeamus; 1999.

The acoustic differences between the singer’s formant and the actor’s formant seem to have perceptual relevance. Some trained operatic (male) singers tend to use the same kind of voice quality in both singing and speaking. In the LTAS of such speaking samples, no actor’s formant can be seen, but instead, a prominent peak between 2 and 3 kHz can be seen, although it is not so strong as in singing. According to the listening evaluations conducted by Leino, these kinds of speaking samples of singers tend to score lower than the good voices with the 3.5-kHz peak. Listeners have commented that the voice sounds strange, not like a normal speaking voice. Thus, there seems to be a certain timbre difference related to the exact location of the high-frequency peak in the LTAS of speaking. This corroborates with the results of Berndtsson and Sundberg on the effects of the location of the singer’s formant on perceived voice classification [20]. The lower frequency of the peak of singers may be related to lowering of the larynx.

Singer’s formant has been explained as the consequence of clustering of the upper formants F3, F4, and F5 occurring when the laryngeal tube forms an independent resonator [19]. This happens when the cross-sectional area of the outlet of the larynx tube is sufficiently different from the cross-sectional area of the pharynx. This effect can be obtained by lowering the larynx [19]. The role of a laryngeal resonator in the formation of the singer’s formant has been questioned, for example, by Dettweiler [21]. There are also observations of singers capable of producing a singer’s formant although the larynx rises [22]. This would suggest that there may also be other mechanisms to establish the singer’s formant.

The present study investigates the formation of the actor’s or speaker’s formant in one male subject. Earlier observations by the first author using spectrum analysis as an aid in vocal training sessions have shown that the prominence of the actor’s formant may be increased through a short vocal warm-up exercising, at least if this characteristic of speech has been established earlier through training. Therefore, the present study concentrates on the vocal changes after a 30-minute special vocal warm-up exercise. To obtain an estimate of the origin of the increased prominence of the actor’s formant, the results from the subject were further studied using an optimization approach based on one-dimensional (1D) mathematical modeling [23].

MATERIALS AND METHODS

Subject and tasks

A professional Finnish male actor (1) read aloud from a text at three loudness levels (habitual, softer, and louder) and (2) phonated the eight Finnish vowels separately at habitual loudness before and after a vocal warm-up exercise of 30 minutes. Vowel and text reading samples and vocal exercising were recorded in a sound-treated studio with a digital recorder Tascam DA-20 (Teac Corporation, Tokyo, Japan) and Brüel & Kjær 4165 omnidirectional microphone (Brüel & Kjær Sound & Vibration Measurement A/S, Naerum, Denmark); the mouth to microphone distance was 40 cm, and the distance was controlled carefully by measuring and monitoring the subject’s position during recording. The recordings were calibrated for equivalent sound level (Leq) measurement using a sine wave generator and a sound-level meter (Brüel & Kjær Frequency Analyzer 2120).

Vocal warm-up was performed using the “Kuukka exercise series” [24]. Niilo Kuukka, the well-known late Finnish voice trainer from the Theatre Academy of Finland, designed this exercise series especially for student actors to develop a well-projecting stage speech. The series is also used by many actors for warming up before entering the stage. The exercises consist of nasal vowel syllable strings and words containing nasals phonated at varying pitch and loudness and with varying rhythm patterns. (For a detailed description, see Laukkanen et al [24].) The aim is a clear, “well-resonating,” “ringing,” and well-carrying voice quality. The subject in the present study was chosen, because he is an experienced teacher of this exercise series and because earlier recordings and acoustic and perceptual analyses have shown a clear change in his speech after warming up with these exercises.

Analyses

LTAS analysis was performed on the text reading samples with Hewlett-Packard (HP) Dynamic Signal Analyzer 3561 A (Hewlett-Packard, Palo Alto, CA). 10-kHz range and Hanning window were used. Voiceless sounds were excluded from the analysis.

LTAS of the samples before and after exercising were compared with each other by normalizing them according to the strongest spectral peak below 1 kHz. In this way, the spectral slope differences can be studied visually.

Fast Fourier Transform (FFT) line spectra and spectrograms were made for the separately phonated vowels recorded before and after exercising. The analyses were performed with a signal analysis system Intelligent Speech Analyser (ISA), developed by Raimo Toivonen, M.Sc. Eng. Formant frequencies were estimated on the basis of manual measurements of local amplitude maxima in the spectra. Leq of the 1-minute reading samples was measured with ISA. Linear weighting was used.

Listening evaluation

Text reading and vowel samples were evaluated in a sound-treated room by four theater school teachers and seven student actors. The samples were replayed with DAT recorder Tascam DA-20 (Teac Corporation, Tokyo, Japan) and Genelec Biamp loudspeaker (Genelec, Iisalmi, Finland). Samples recorded before and after exercising were replayed in randomized pairs. The listeners’ task was to judge which sample in each pair represented better voice quality or whether there was no difference between the samples. The listeners were requested to pay attention to voice and not to articulation, interpretation, and others. However, good voice quality was not predefined.

Modeling approach

The possible vocal tract changes resulting in the formation of an actor’s formant were studied using a 1D mathematical model of voice production. The 1D vocal tract model was developed from a 3D volume model obtained from the magnetic resonance images of a Czech male speaker [23]. Figure 2 illustrates the scheme of the modeling. The formant frequencies measured from the vowels [a:, i:] recorded from the subject of the present study before and after exercising were given to the model, and through a tuning procedure (changing the vocal tract shape, ie, the size of area cross-sections and the length of the conical elements), the best-fitting vocal tract configurations were obtained. These vowels, that is, an open back vowel and a closed front vowel, were chosen, because they represent two opposite vocal tract settings. Details of the model are presented as follows.

FIGURE 2. Schema of 1D modeling based on magnetic resonance imaging (MRI). A. MRI registration of the vocal tract of a Czech male speaker phonating on [a:]. B. Dividing the 3D model into 24 cross-sections. C. Reconstruction of a model for [a:] with 1D conical elements. Reprinted with permission from Laukkanen A-M, Radolf V, Horáček J, Leino T. Estimation of the origin of a speaker’s and singer’s formant cluster using an optimization of 1D vocal tract model. Proceedings of the 3rd Advanced Voice Function Assessment International Workshop, 18th-20th May 2009, Madrid, Spain.

One-dimensional model with variable cross-sections and lengths. The transfer matrix method applying conic elements was used. The base of this method is wave equation of an acoustic duct with variable cross-section $A(x)$ and viscous losses (specific acoustic resistance $r_S$) [25]

$$\frac{\partial^2 \varphi}{\partial x^2} + \frac{1}{A}\cdot\frac{\partial A}{\partial x}\cdot\frac{\partial \varphi}{\partial x} - \frac{1}{c_0^2}\cdot\left(\frac{\partial^2 \varphi}{\partial t^2} + \frac{r_S}{\rho_0}\cdot\frac{\partial \varphi}{\partial t}\right) = 0. \tag{1}$$

The vocal tract was constructed of 23 conic elements as a model of oral and epilaryngeal cavities from vocal folds to mouth. Radiation impedance (RI) was considered.

We can describe relation between input and output of the vocal tract by the equation

$$\begin{bmatrix} p_{OUT} \\ W_{OUT} \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \cdot \begin{bmatrix} p_{IN} \\ W_{IN} \end{bmatrix}, \tag{RI 1}$$

where $p$ is the acoustic pressure, $W$ is the volume velocity, and

$$T_{OUT,IN} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

is the transfer matrix of the acoustic system (vocal tract).
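As a rough illustration of how a tract-level transfer matrix of the form in Equation (RI 1) can be assembled, the sketch below multiplies per-element matrices from the glottal end to the lip end. It is not the model used in the study: it assumes lossless cylindrical elements in place of the lossy conical ones, and the function names, the uniform 17 cm tube, and the 3 cm² area are hypothetical choices made only for the example.

```python
import math

C0 = 353.0    # speed of sound in m/s (value stated in the text)
RHO0 = 1.2    # air density in kg/m^3 (value stated in the text)


def element_matrix(area, length, freq):
    """Transfer matrix (a, b, c, d) of one lossless cylindrical element.

    Assumption: cylinders without viscous losses stand in for the conical,
    lossy elements of the model described in the text.
    """
    k = 2.0 * math.pi * freq / C0      # wave number
    z0 = RHO0 * C0 / area              # characteristic acoustic impedance
    cos_kl = math.cos(k * length)
    sin_kl = math.sin(k * length)
    return (complex(cos_kl), -1j * z0 * sin_kl,
            -1j * sin_kl / z0, complex(cos_kl))


def multiply(m2, m1):
    """Return the 2x2 product m2 * m1, each matrix stored as (a, b, c, d)."""
    a2, b2, c2, d2 = m2
    a1, b1, c1, d1 = m1
    return (a2 * a1 + b2 * c1, a2 * b1 + b2 * d1,
            c2 * a1 + d2 * c1, c2 * b1 + d2 * d1)


def tract_matrix(areas, lengths, freq):
    """Chain the element matrices from the glottal end to the lip end."""
    total = (1 + 0j, 0j, 0j, 1 + 0j)   # identity matrix
    for area, length in zip(areas, lengths):
        total = multiply(element_matrix(area, length, freq), total)
    return total


def propagate(matrix, p_in, w_in):
    """Apply [p_out, W_out] = T * [p_in, W_in]."""
    a, b, c, d = matrix
    return a * p_in + b * w_in, c * p_in + d * w_in


# Hypothetical example: 23 elements forming a uniform 17 cm tube.
areas = [3.0e-4] * 23
lengths = [0.17 / 23] * 23
a, b, c, d = tract_matrix(areas, lengths, 1000.0)
p_out, w_out = propagate((a, b, c, d), 1.0, 0.0)   # closed input: W_in = 0
```

For a uniform tube the chained element a reduces to cos(kL) with L the total length, and the determinant ad − bc stays equal to 1, which gives two quick sanity checks on the chaining.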
Complex elements $a$, $b$, $c$, and $d$ are functions of the vocal tract shape, medium (air density $\rho_0$, viscosity $\mu$, and sound velocity $c_0$), and frequency $f$ of the propagating wave.

For the eigenfrequency calculation, we assume the boundary conditions:

– The input is closed (by rigid wall) and hence $W_{IN} = 0$
– The output loaded by the acoustic RI of vibrating circular plate with radius $R$ placed in an infinite wall [23]

$$Z_{A\,rad} = \frac{c_0 \rho_0}{\pi R^2} \cdot \left(1 - \frac{J_1(2kR)}{kR} + j\,\frac{H_1(2kR)}{kR}\right), \tag{RI 2}$$

where $J_1$ is the Bessel function of the first kind of order 1, $H_1$ is the Struve function of order 1, and $k$ is the wave number. As

$$Z_{A\,rad} = \frac{p_{OUT}}{W_{OUT}}, \tag{RI 3}$$

we obtain

$$\begin{bmatrix} Z_{A\,rad} \cdot W_{OUT} \\ W_{OUT} \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \cdot \begin{bmatrix} p_{IN} \\ 0 \end{bmatrix} \tag{RI 4}$$

and after eliminating $p_{IN}$

$$a - Z_{A\,rad} \cdot c = 0, \tag{RI 5}$$

which is the frequency equation for a complex wave number $k_n$. From the relation

$$k = \frac{\omega}{c_0} = \frac{2\pi f}{c_0} \tag{RI 6}$$

we can obtain an eigenfrequency $f_n$ in Hertz as

$$f_n = \mathrm{Re}\left(k_n \cdot \frac{c_0}{2\pi}\right). \tag{RI 7}$$

We derived that the sensitivity of a particular eigenfrequency $f_n$ to a change in cross-sectional area $A_i$ in dimensionless form

$$S^{A}_{i,n} = \frac{\Delta f_n / f_n}{\Delta A_i / A_i} \tag{2}$$

can be substituted by the relation

$$\tilde{S}^{A}_{i,n} = \frac{1}{2}\left\{ L_{i-1}\, v_i^{\,i-1} \cdot \left[ \frac{\rho_0}{2A_{i-1\,red}^{2}} \left(W_{i-1,n}^{2} + W_{i,n}^{2}\right) - \frac{1}{\rho_0 c_0^{2}}\, p_{i-1,n}^{2} \right] + L_i\, v_i^{\,i} \cdot \left[ \frac{\rho_0}{2A_{i\,red}^{2}} \left(W_{i,n}^{2} + W_{i+1,n}^{2}\right) - \frac{1}{\rho_0 c_0^{2}}\, p_{i,n}^{2} \right] \right\} \times \frac{A_i}{\sum_{e=2}^{N_e+1} m_e W_{e,n}^{2}}, \tag{3}$$

where $A_{i\,red}$ is a reduced cross-sectional area of an ith conic element

$$A_{i\,red} = \frac{1}{3}\left(A_i + \sqrt{A_i A_{i+1}} + A_{i+1}\right), \tag{4}$$

$m_i$ is an acoustic mass lumped into ith cross-section

$$m_i = \frac{\rho_0}{2}\left(\frac{L_{i-1}}{A_{i-1\,red}} + \frac{L_i}{A_{i\,red}}\right), \tag{5}$$

$v_i^{\,i-1}$ and $v_i^{\,i}$ are partial derivatives of the reduced cross-section

$$v_i^{\,i-1} = \frac{\partial A_{i-1\,red}}{\partial A_i} = \frac{1}{3} \cdot \left(1 + \frac{1}{2}\sqrt{\frac{A_{i-1}}{A_i}}\right), \tag{6}$$

$$v_i^{\,i} = \frac{\partial A_{i\,red}}{\partial A_i} = \frac{1}{3} \cdot \left(1 + \frac{1}{2}\sqrt{\frac{A_{i+1}}{A_i}}\right), \tag{7}$$

$N_e$ is number of conic elements, $p_{i,n}$ is an amplitude of acoustic pressure in the ith cross-section for the nth eigenfrequency $f_n$, $W_{i,n}$ is an amplitude of volume velocity in the ith cross-section for the nth eigenfrequency $f_n$.

Moreover, we derived that the sensitivity of a particular eigenfrequency $f_n$ to a change in length $L_i$ in dimensionless form

$$S^{L}_{i,n} = \frac{\Delta f_n / f_n}{\Delta L_i / L_i} \tag{8}$$

can be substituted by the relation

$$\tilde{S}^{L}_{i,n} = \frac{1}{2}\left[ \frac{A_{i\,red}}{\rho_0 c_0^{2}}\, p_{i,n}^{2} + \frac{\rho_0}{2A_{i\,red}} \left(W_{i,n}^{2} + W_{i+1,n}^{2}\right) \right] \times \frac{L_i}{\sum_{e=2}^{N_e+1} m_e W_{e,n}^{2}}. \tag{9}$$

Close to the results of Story [26], using the sensitivity function, an iterative process was used for an ith cross-sectional area computation

$$(A_i)_{k+1} = (A_i)_k \cdot \left[1 + \sum_n \left(z_n \cdot \tilde{S}^{A}_{i,n}\right)_k\right], \tag{10}$$

and for an ith element-length computation

$$(L_i)_{k+1} = (L_i)_k \cdot \left[1 + \sum_n \left(z_n \cdot \tilde{S}^{L}_{i,n}\right)_k\right], \tag{11}$$

where $z_n$ is a function of the difference between desired and instantaneous nth eigenfrequency.

The speed of sound, the density, and the dynamic viscosity of the air were considered as follows: $c_0 = 353\ \mathrm{m\,s^{-1}}$; $\rho_0 = 1.2\ \mathrm{kg\,m^{-3}}$; $\mu = 1.8 \times 10^{-5}\ \mathrm{kg\,m^{-1}\,s^{-1}}$.

The geometry marked “before (warming up)” was obtained by the iterative process (mentioned earlier) with initial geometry for the (Czech) vowels ([a:] and [i:]), which was known. Desired eigenfrequencies for these vowels were set to formant values:

Vowel [a:]: 597, 989, 2711, 3768, 4220 Hz
Vowel [i:]: 385, 2348, 2590, 3858, 4341 Hz

FIGURE 3. LTAS of a reading sample before (black line) and after (gray line) vocal warm-up. A. Sound pressure level 83 dB in both samples and B. Sound pressure level 75 and 76 dB, respectively.
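The multiplicative update of Equation (11) can be tried on a deliberately small toy case. The sketch below is only an illustration under stated assumptions: a single uniform closed-open tube with first resonance c0/(4L) replaces the 23-element tract, the sensitivity is estimated by a finite difference following the definition in Equation (8) rather than by the energy-based expression, and z is assumed to be the relative frequency error (the text only says it depends on the difference between desired and instantaneous eigenfrequency). The target of 597 Hz is the F1 value listed for [a:] before warm-up; all function names are hypothetical.

```python
C0 = 353.0  # speed of sound in m/s (value stated in the text)


def first_resonance(length):
    """First resonance of a uniform closed-open tube, f = c0 / (4 L).

    Assumption: this closed-form toy model replaces the 23-element tract.
    """
    return C0 / (4.0 * length)


def length_sensitivity(length, rel_step=1e-4):
    """Finite-difference estimate of (df/f) / (dL/L), as defined in Eq. 8."""
    f0 = first_resonance(length)
    f1 = first_resonance(length * (1.0 + rel_step))
    return ((f1 - f0) / f0) / rel_step


def tune_length(length, target_hz, tol_hz=0.01, max_iter=50):
    """Iterate L_{k+1} = L_k * (1 + z * S) until the resonance hits the target."""
    for iteration in range(max_iter):
        freq = first_resonance(length)
        if abs(freq - target_hz) < tol_hz:
            return length, iteration
        z = (target_hz - freq) / freq          # assumed form of z_n
        length *= 1.0 + z * length_sensitivity(length)
    raise RuntimeError("no convergence within max_iter iterations")


# Start from a hypothetical 17 cm tube and tune it towards 597 Hz.
tuned_length, n_iter = tune_length(0.17, 597.0)
```

In this one-parameter case the estimated sensitivity is close to −1, so each step shortens or lengthens the tube in proportion to the remaining frequency error and the iteration settles within a few steps. With many areas and several target formants, as in the study, convergence behaviour can differ.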
The geometry computed for the “before” condition became the initial geometry for the next step. In this step, desired eigenfrequencies were set to formant values after warming up. These values were:

Vowel [a:]: 687, 1020, 2801, 3556, 3918 Hz
Vowel [i:]: 355, 2439, 3103, 3737, 4039 Hz

The variables were all of 24 cross-sections and the lengths of the first, second, and the last element. However, in some cases, the iteration process converged better without variable lengths.

RESULTS

Long-term average spectrum and listening evaluation

Figure 3 shows LTAS of reading samples produced before and after exercising. The sample recorded before warm-up (Figure 3A) represents loud reading and that recorded after it represents reading at “habitual” loudness; yet, Leq was the same in these samples (83 dB, at a mouth to microphone distance of 40 cm). In spite of the same Leq in the samples, the speaker’s formant is much more prominent in the sample recorded after exercising. This suggests a level-independent change in voice quality, either because of a change in voice source, in formant frequencies, or both.

Figure 3B shows the LTAS of softer reading. The sample recorded before exercising represents reading at habitual loudness and the sample after exercising represents reading at “softer” loudness. Leq was approximately the same in both samples (75 and 76 dB, respectively). In these samples, the difference is similar to that found for louder reading, although the actor’s formant is not as prominent. In the LTAS of all the reading samples after warm-up, the third peak (F3 variation range) had also shifted higher in frequency.

Table 1 summarizes the results of the listening evaluation. It can be seen that among listeners, there was a high degree of agreement regarding voice quality in that it was better in the samples recorded after exercising. However, the teachers seemed to prefer the samples recorded after exercising more than did the students. This may be a sign of a better auditory discrimination ability in the teachers or may be related to a difference in taste.

TABLE 1. Evaluation of Voice Quality in Soft and Loud Text Reading and Production of Vowels Before and After Vocal Warm-up

          Before   After   No difference
Soft      3 (S)    7       1 (T)
Loud      2 (S)    9
Vowels    1 (S)    10

Abbreviations: S, student actor; T, theater school teacher.
The number of listeners regarding the voice better/finding no differences between the samples before and after. Listeners n = 11 in total.

Separately produced vowels

Examples of FFT spectra from vowel samples before and after exercising can be seen in Figure 4. After exercising, the formants F4 and F5 have come closer in frequency, thus forming a cluster. Such a cluster was present in all vowels. Figure 5 shows the changes in formant frequencies. The trend was that F3 increased and F4 and F5 decreased. The rise of F3 may explain the frequency shift of the third peak in LTAS (Figure 3). Changes in F4 and F5 seemed to be less marked in rounded vowels.

FIGURE 4. LTAS of separately phonated Finnish vowels [ae:] (A) and [y:] (B) before (dashed line) and after (solid line) warm-up.

FIGURE 5. Changes in formant frequencies (F1–F5) of the eight Finnish vowels measured before and after warm-up.

Results from modeling

Figure 6 shows how the shape of the vocal tract can be changed by changing the spectrum of acoustic signals according to the formant values measured for the subject before and after exercising. A good qualitative agreement was seen with the acoustic spectra and the corresponding computed transfer functions, suggesting that a 1D model has applicability in estimating the background of the effects of vocal training. According to the results, a speaker’s formant in vowel [a:] could be obtained through a slight narrowing of the epilaryngeal region, widening of the back of the oral cavity, and narrowing of the front part of it (Figure 6A, B). Similar results were obtained for [i:], although changes in the epilaryngeal region were very small (Figure 6C, D).

FIGURE 6. Results from modeling. A. Possible geometry of the vocal tract for vowel [a:] before and after warm-up. B. Transfer function of the vocal tract for vowel [a:] before and after warm-up. C. Possible geometry of the vocal tract for vowel [i:] before and after warm-up. D. Transfer function of the vocal tract for vowel [i:] before and after warm-up.

DISCUSSION

The results suggest that the actor’s/speaker’s formant is formed by a clustering of the upper formants. The prominence of the actor’s formant seems to increase through a vocal warm-up. One may indeed ask whether this kind of vocal exercising should be called warming up or placement exercising. The former refers to exercising that aims at immediate positive vocal effects, possibly explicable through physiological changes in the laryngeal muscles and vocal fold tissue. The latter implies that a change in the phonatory and/or articulatory setting is aimed at. Probably, both factors are usually involved in vocal exercising.

Nolan [6] conducted experiments in voice quality variation according to the classification presented by Laver [27]. In the LTAS of the habitual speaking sample of Nolan, there is a clear peak at 3.5 kHz. In the samples of creak and creaky voice, the peak is stronger, whereas it is largely absent in the samples representing whispery voice and falsetto and speaking with lowered or raised larynx. Leino has also found cases in which the strong peak at 3.5 kHz was exceptionally related to creaky voice quality. The resonance frequency of the laryngeal tube is approximately 3.5 kHz [28]. Nolan considers the possibility that the 3.5-kHz peak is formed in the same way as the singer’s formant according to Sundberg [19], that is, resulting when the cross-sectional area of the outlet of the larynx tube is sufficiently different from the cross-sectional area of the pharynx. The calculations by Titze and Story [29] suggest that a narrowing of the epilaryngeal region tends to raise the three lowest formants of the vocal tract and lower F4 and F5. As the authors put it, “The narrowed epilarynx tube therefore ‘attracts’ all formant frequencies toward the 2500–3000 Hz region.” That is the frequency range of the singer’s formant. This kind of epilaryngeal setting most likely plays a role in the formation of the speaker’s formant as well. The somewhat vowel-dependent changes in F3, F4, and F5 could be explained by an epilaryngeal narrowing. Some vowel dependency on the changes in formant frequencies is understandable. The same change in articulation is likely to result in different changes in the formant frequencies of different vowels [28].

The results of modeling suggest that a speaker’s formant could be obtained through a slight narrowing of the epilaryngeal region, widening of the back of the mouth cavity, and narrowing of the front part of it (Figure 6). The changes in the mouth cavity would imply a more frontal position of the tongue. A slight lowering of the larynx could also result in a narrowing of the epilaryngeal tube and lowering of the tongue root.

It is known that the 1D models can replicate the behavior of the 3D vocal tract only up to 3000 Hz, because at higher frequencies, transverse modes occur in the 3D vocal tract model [23]. However, the results obtained in the present study are fairly close to those reported by Sundberg [19] and Titze and Story [29]. Thus, a 1D model also seems to have applicability in estimating the background of the effects of vocal training.

The use of nasals, mainly /m:/, in vocal exercises, either in words or produced separately as prolonged (often as “humming”), is especially widespread in the voice training literature [30–33]. Pahn [32] has used nasalization exercises but mainly on [s] sound instead of [m, n] as used in the Kuukka exercises. The nasalization exercises of Pahn have been reported to lead to lowering of the laryngeal position and increased sound energy around 3 kHz [32,34]. The fact that intensification of sound energy is seen at somewhat lower frequency than the “actor’s formant” in Finnish speakers may be explained by (a stronger) lowering of the larynx. The results by Perkell [35] and Yanagisawa et al [36] suggest that nasals tend to lower the larynx. This could result in a relative narrowing of the epilaryngeal region. On the other hand, preceding antiresonances may also improve the perceptual prominence of spectral peaks in nasalization. Furthermore, the results obtained by Sundberg et al [37] concerning classical singing suggest that nasalization may be used to attenuate F1, which in turn would enhance the relative level of the singer’s formant. These may also be among the reasons for using nasalization in speaking-voice exercises. It is clear, however, that voice quality after nasalization exercises must not sound nasal.

According to the calculations of Titze and Story [29], the three lowest nasal resonances were at 1575, 3150, and 4725 Hz. The effects of nasality on acoustic voice quality were found to be minor. A downward shift of F3 and F4 was reported. Titze and Story came to the conclusion that the benefit of nasalization, for example, in singing, is less acoustic than biomechanical [29]. Nasals as vocal exercises warrant further study, for example, applying magnetic resonance imaging or X-ray registration of the vocal tract and electromyography of the laryngeal muscles.

The results of the listening evaluation suggest that there was a relation between increased prominence of the speaker’s formant and perception of a better voice quality. However, in theory, it is also possible that some other characteristics in speech may have affected the evaluation. On the other hand, the speech tempo and the lower formants (F1–F2), for instance, were not markedly different after warming up. Furthermore, the listeners were asked to pay attention to voice quality and not to other speech characteristics. Therefore, the results of the present study may be taken to support the earlier observations of the role of the speaker’s formant as one correlate of a good voice quality in speech.

The actor’s formant seems to have perceptual relevance at least from the aesthetic point of view. It gives the voice a ringing quality as the singer’s formant does to the singing voice, although the timbre is clearly different. However, the specific role of the singer’s formant is related to the audibility of the voice. The singer’s formant causes the singing voice to carry over the orchestra with relatively low load imposed on the vocal organ. The main role of the actor’s formant is most likely also related to audibility: Such a resonance-based energy concentration at the frequency range with relatively low auditory threshold is likely to increase the voice loudness in a voice-hygienic way. Thus, an actor’s formant is an entirely reasonable goal in actors’ vocal training.

The frequency of the upper formants F4 and F5 varies little between vowels; therefore, these formants have been regarded to mainly reflect the general vocal tract setting without any marked language-related role. However, they vary according to consonant environment, which suggests that they may also have some linguistic role [28,38]. It may, thus, be hypothesized that strengthening the upper formants could affect speech intelligibility by improving the recognition of consonants, for example, in impaired listening conditions. On the other hand, both the actor’s formant and the singer’s formant are very stable in frequency. Thus, it seems most likely that they cannot have any direct effect on speech intelligibility, because improving speech intelligibility requires increased differentiation between the speech sounds in the acoustic structure. However, both the singer’s formant and the actor’s formant may also affect speech intelligibility in an indirect way. By increasing the audibility of voice, they draw the listener’s attention to it and, thus, may also help the listener to solve the linguistic code of the signal.

The present study focused on only one subject. However, the results should be generalizable at least to some extent, because a similar speaker’s formant has been observed in the speaking samples of many other subjects, both in speakers of Finnish and other languages. In the same way as in the case of a singer’s formant, the phenomenon may be achieved slightly differently in different subjects because of differences in the vocal tract.

CONCLUSIONS

A strong sound energy concentration at about 3.5 kHz in speech, that is, an actor’s formant, can be strengthened through an exercise series containing nasals. The actor’s formant seems to be formed by a cluster of F3–F5 (increase of F3 and decrease of F4 and F5) as in the case of a singer’s formant.

Results of 1D modeling suggest that these changes can be achieved through epilaryngeal narrowing with a widening of the back of the oral cavity and a narrowing of the front part of it. A slight lowering of the larynx and/or a more frontal tongue position with lowering of the tongue root could result in these changes. A 1D model seems to be applicable in studying the effects of vocal training. The role of nasals as vocal exercises should be studied further, for example, using computed tomography scanning of the vocal tract and electromyographic registration of the laryngeal muscles.

Acknowledgments

The authors warmly thank the subject for his patient participation in the studies. The valuable comments of Dr. Johan Sundberg are greatly appreciated. The assistance of special laboratory technician Jussi Helin in preparing the analyses is also acknowledged. Mrs. Virginia Mattila is thanked for language correction.

This research was financially supported by the Grant Agency of the Czech Republic, project No. 101/08/1155 “Computer and physical modeling of vibroacoustic properties of human vocal tract for optimization of voice quality.”

REFERENCES

1. Löfqvist A, Mandersson B. Long-time average spectrum of speech and voice analysis. Folia Phoniatr. 1987;39:221–229.
2. Wendler J, Doherty ET, Hollien H. Voice classification by means of long-term speech spectra. Folia Phoniatr. 1980;32:51–60.
3. Dejonckere PH, Villarosa D. Analyse spectrale moyennée de la voix. Comparaison de voix normales et de voix altérées par différentes catégories de pathologies laryngées. Acta Otorhinolar Belg. 1986;40:426–435.
4. Kitzing P. LTAS criteria pertinent to the measurement of voice quality. J Phon. 1986;14:477–482.
5. Pittam J. Discrimination of five voice qualities and prediction to perceptual ratings. Phonetica. 1987;44:38–49.
6. Nolan F. The Phonetic Bases of Speaker Recognition. Cambridge, UK: Cambridge University Press; 1983.
7. Laukkanen A-M, Björkner E, Sundberg J. Throaty voice quality: subglottal pressure, voice source and formant characteristics. J Voice. 2006;20:25–37.
8. Rossing T, Sundberg J, Ternström S. Acoustic comparison of voice use in solo and choir singing. J Acoust Soc Am. 1986;79:1975–1981.
9. Cleveland T, Sundberg J, Stone RE. Long-term-average spectrum characteristics of country singers during speaking and singing. J Voice. 2001;15:54–60.
10. Stone RE Jr, Cleveland TF, Sundberg PJ. Acoustic and aerodynamic characteristics of Country-Western, Operatic and Broadway singing styles compared to speech. J Acoust Soc Am. 2003;113:2242–2243.
11. Flach M, Schwickardi H, Dickopf G, Pabst F. DDR: Zur Beurteilung sängerischer Stimmenentwicklung mittels LTAS und Stimmfeldmessung. Folia Phoniatr. 1989;41:4–5.
12. Leino T. Long-term average spectrum study on speaking voice quality in male actors. In: Friberg A, Iwarsson J, Jansson E, Sundberg J, eds. SMAC93, Proceedings of the Stockholm Music Acoustics Conference, July 28-August 1, 1993. Stockholm, Sweden: The Royal Swedish Academy of Music; 1994:206–210.
13. Leino T, Kärkkäinen P. On the effects of vocal training on the speaking voice quality of male student actors. In: Elenius K, Branderud P, eds. Proceedings of the XIIIth International Congress of Phonetic Sciences, Stockholm, Sweden 13–19 August, 1995, Vol. 3 of 4. Stockholm, Sweden: Department of Speech Communication and Music Acoustics, Royal Institute of Technology, and the Department of Linguistics, Stockholm University; 1995:496–499.
14. Dejonckere PH. Analyse acoustique de la production vocale. Essai de synthèse dans une optique clinique. Acta Otorhinolar Belg. 1986;40:377–385.
15. Frøkjær-Jensen B, Prytz S. Registration of voice quality. Brüel Kjær Tech Rev. 1976;3:3–17.
16. Nawka T, Anders LC, Cebulla M, Zurakowski D. The speaker’s formant in male voices. J Voice. 1997;11:422–428.
17. Bele IV. The speaker’s formant. J Voice. 2006;20:555–578.
18. Master S, De Biase N, Chiari BM, Laukkanen A-M. Acoustic and perceptual analysis of Brazilian male actors and non-actors voice: long term average spectrum and the actor’s formant. J Voice. 2008;22:146–154.
19. Sundberg J. Articulatory interpretation of the singing formant. J Acoust Soc Am. 1974;55:838–844.
20. Berndtsson G, Sundberg J. Perceptual significance of the center frequency of singer’s formant. Scand J Logop Phoniatr. 1995;20:35–41.
21. Dettweiler RF. An investigation of the laryngeal system as the resonance source of the singer’s formant. J Voice. 1994;18:303–313.
22. Pabst F, Sundberg J. Tracking multi-channel electroglottograph measurement of larynx height in singers. Speech Transm Lab Q Prog Status Rep. 1993;2–3:67–78.
23. Vampola T, Horáček J, Švec J. FE modeling of human vocal tract acoustics. Part I: Production of Czech vowels. Acta Acust. 2008;94:433–447.
24. Laukkanen A-M, Syrjä T, Laitala M, Leino T. Effects of two-month vocal exercising with and without spectral biofeedback on student actors’ speaking voice. Logoped Phoniatr Vocol. 2004;29:66–76.
25. Merhaut J. Theoretical Foundation of Electroacoustics. Prague, Czech Republic: Academia; 1971. (in Czech).
26. Story BH. Technique for “tuning” vocal tract area functions based on acoustic sensitivity functions. J Acoust Soc Am. 2006;119:715–718.
27. Laver J. The Phonetic Description of Voice Quality. Cambridge, UK: Cambridge University Press; 1980.
28. Fant G. Acoustic Theory of Speech Production. With Calculations Based on X-ray Studies of Russian Articulations. 2nd ed. The Hague, The Netherlands: Mouton; 1970.
29. Titze IR, Story BH. Acoustic interactions of the voice source with the lower vocal tract.
35. Perkell JS. Physiology of Speech Production. Results and Implications of a Quantitative Cineradiographic Study. York, PA: The Massachusetts Institute of Technology, the Maple Press; 1969.
36. Yanagisawa E, Kmucha ST, Estill J. Role of the soft palate in laryngeal functions and selected voice qualities. Ann Otol Rhinol Laryngol. 1990;99:18–28.
J Acoust Soc Am. 1997;101:2234–2243. 37. Sundberg J, Birch P, Gu¨moes B, Stavad H, Prytz S, Karle A. Experimen- 30. Anderson V. Training the Speaking Voice. 2nd ed. New York: Oxford Uni- tal findings on the nasal tract resonator in singing. J Voice. 2007;21: versity Press; 1961. 127–137. 31. Machlin E. Speech for the Stage. New York: Theatre Art Books; 1966. 38. Iivonen A, Laukkanen A-M. Explanations of the qualitative variation of 32. Pahn J. Stimmu¨bungen fu¨r Sprechen und Singen. Berlin, Germany: VEB Finnish vowels. In: Iivonen A, Lehtihalmes M, eds. Studies in Logopedics Verlag Volk und Gesundheit; 1968. and Phonetics 4. Series B: Phonetics, Logopedics and Speech Communica- 33. Berry C. Your Voice and How to Use it Successfully. London, UK: Harrap; 1975. tion 5. Helsinki, Finland: Department of Phonetics, University of Helsinki; 34. Tinge GJ. The nasaling approach [abstract]. Folia Phoniatr. 1989;41:4–5. 1993:29–54.