US3524930A - Resonance synthesizer for speech research - Google Patents

Resonance synthesizer for speech research

Info

Publication number
US3524930A
Authority
US
United States
Prior art keywords
resonance
output
synthesizer
parameter
excitation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
US743138A
Inventor
Donald A Glace
Ignatius G Mattingly
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
US Department of Army
Original Assignee
US Department of Army
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems

Description

[Drawing sheets 1 to 4: D. A. Glace et al., Resonance Synthesizer for Speech Research, filed July 8, 1968, patented Aug. 18, 1970, No. 3,524,930. Inventors: Donald A. Glace, Ignatius G. Mattingly.]
United States Patent Office 3,524,930 RESONANCE SYNTHESIZER FOR SPEECH RESEARCH Donald A. Glace, Laurel, Md., and Ignatius G. Mattingly, Storrs, Conn., assignors to the United States of America as represented by the Secretary of the Army. Filed July 8, 1968, Ser. No. 743,138. Int. Cl. G10l 1/00. U.S. Cl. 179-1. 30 Claims.
ABSTRACT OF THE DISCLOSURE
In a resonance synthesizer for speech research, an input is fed to the speech synthesizer means from an analog input device wherein the selected parameters, those essential to all speech, are fed directly to the synthesizer means after filtering out generation transients, and those other parameters which are generated are fed to a parameter switch which selects those parameters which are necessary for the speech output desired, so that less than all the parameters generated are passed to the synthesizer means. The synthesizer means has two excitation circuits and three spectrum-shaping circuits, commonly referred to as resonance circuits.
BACKGROUND OF THE INVENTION
An acoustic phonetician needs both analytic and synthetic tools: the former to study the behavior of natural speech, the latter to test perceptually the hypothesis he has developed. Reasonably satisfactory and moderately priced analytic tools, such as the Kay Sona-Graph, have been commercially available for some time. Synthesis, however, has been more of a problem. The known commercially available synthesizers are quite expensive. Investigators who are engineers or who have good engineering support have built their own experimental synthesizers, but others have had to do without.
Several synthesizers have been built in the past in an attempt to overcome the difficulties encountered in speech synthesis. An example of one of these synthesizers of the prior art is the terminal analog type. This prior art device is intrinsically capable of producing high-quality speech; however, it does have certain limitations. Synthesizers of this type do not have separate nasal circuits, and those which do are not acceptable for faithful synthesis; therefore, acceptable nasal sounds, in these prior art devices, must be constructed by using the circuits for the oral resonances. The nasal consonants produced in this way seem quite acceptable, but there is no way for the synthesizer to produce nasalized vowels. Finally, in prior art synthesizers of this type there is a forced choice between turbulent and voiced energy, which makes it difficult to synthesize sounds in which both forms of excitation occur, such as the voiced fricatives.
These limitations of prior art synthesizers are overcome by the device of the present invention wherein a mixed excitation is possible, parameter selection is possible, and wherein a separate nasal circuit is provided so as to be able to produce nasalized vowels as well as nasal consonants. In addition, the parameter switch included in the present device, and described in more detail later on, enables only ten parameters to be passed to the synthesizer means instead of the usual fourteen, as is done in the prior art. Fourteen parameters could, however, be passed to the synthesizer means at a given time if that were also desired; however, in the present invention only ten channels are necessary for the parameter generation device, which, in the case of a digital computer being used as a parameter generation device, results in a substantial monetary saving due to the use of a lesser number of channels than has been done in the prior art. The utilization of the parameter switch of the present invention results in a versatility of connection to drive means and driven means not present in prior art devices.
Another example of the prior art is a speech sound generating device described by Vermeulen et al., in Pat. No. 2,458,227, issued on Jan. 4, 1949. The Vermeulen device performs synthesis as well as analysis. Four resonances are utilized, each having fixed bandwidth and carrying a given frequency range, with a given resonance shape output from the resonance circuit. The resonant frequency control is a reactance tube control, utilizing two circuits to cover the first formant range of 200 to 800 cycles per second. There is no disclosed capability of mixed excitation, which is necessary for certain speech sounds. The configuration shown in Vermeulen is of the fixed variety. There is an amplitude control of the resonance circuits of Vermeulen; however, there is no anti-logarithmic control of this amplitude function, which is necessary for a wide dynamic range. Furthermore, Vermeulen does not filter the generator parameters so as to remove the transients due to generation, such as pops and clicks, from the synthesized output. The Vermeulen device, as is common in the prior art devices, passes all the generated parameters to the synthesizer means.
The device of the present invention overcomes the inherent limitations in the Vermeulen device by several novel innovations. Seven resonance circuits instead of the four of Vermeulen are utilized so as to provide better fidelity, especially on fricatives. These resonance circuits are parallel resonance circuits having adjustable bandwidths, the proper bandwidth being important to fidelity. A choice of resonance shapes may be obtained from the device of the present invention; i.e., the conjugate pole pair or the conjugate pole pair with a zero at zero frequency. The present invention utilizes a chopper-type control of the resonant frequency so as to be able to cover the first formant range with only one formant resonance circuit. In the circuit of the present invention, a parameter switch is utilized to minimize the number of inputs, such minimization being in accordance with the parameters necessary for the synthesis of a given speech sound. The circuit of the present invention has flexibility; for example, the input circuits can be readily adapted to different ranges of input parameter voltages. The resonance circuits of the present invention have an anti-logarithmic control for the amplitude parameter, which will be explained in more detail later on. Lastly, the parameter switch, as well as a separate parameter filter for those parameters not passed through the switch, filters out transients in the parameter signals, such as those due to generation and switching, so as to eliminate the pops and clicks from the synthesized output.
With these and other disadvantages of the prior art in view, an object of the present invention is to provide a new and improved resonance synthesizer.
Another object of the present invention is to provide a new and improved resonance synthesizer having a switching means for minimization of inputs according to the type of sound.
Still another object of the present invention is to provide a new and improved resonance synthesizer capable of providing a mixed excitation necessary for certain sounds.
A still further object of the present invention is to provide a new and improved resonance synthesizer which is more versatile than those of the prior art.
Other objects and many of the intended advantages of this invention will be readily appreciated as the invention becomes better understood by reference to the following description when taken in conjunction with the following drawings wherein:
FIG. 1 is a block diagram of the preferred embodiment of the present invention.
FIG. 2 is a schematic diagram of the parameter switch shown in FIG. 1.
FIG. 3 is a schematic diagram of one of the four electronic switches of the parameter switch shown in FIG. 2.
FIG. 4 is a graphic illustration of various signals present in the excitation selector and mixer means shown in FIG. 1.
FIG. 5 is a block diagram of a resonance circuit shown in FIG. 1.
FIG. 6 is a block diagram of the amplitude control circuit of the resonance circuit shown in FIG. 5.
FIG. 7 is a schematic diagram of the conjugate pole resonance of the resonance circuit shown in FIG. 5.
THEORY
Referring now to FIG. 1, the resonance synthesizer of the present invention has two excitation circuits 10 and 11 and three spectrum-shaping circuits 12, 13, and 14. The excitation circuits 10 and 11 consist of a random-noise generator 11 for turbulent excitation, and a pulse generator 10, the frequency of which is variable logarithmically over the range 55 c.p.s.-330 c.p.s., for periodic excitation. For synthesis of women's and children's voices this range can be altered to 110 c.p.s.-660 c.p.s. by switching. The spectrum-shaping circuits 12, 13, and 14 consist of a vowel circuit 12, a nasal circuit 13, and a consonant circuit 14. The vowel circuit 12 is composed of four parallel formant resonances 17, 18, 19, and 20. The amplitudes (A17, A18, A19, A20) of these four resonances, and the frequencies of the lowest three (F17, F18, F19), are dynamically variable; the frequency of the fourth, F20, is manually variable, the manual adjustment being made by a screw adjustment on the synthesizer 22. The nasal circuit 13 is a single resonance 23 of dynamically variable frequency FN and amplitude AN. The consonant circuit 14 consists of two parallel resonances 24 and 25 of dynamically variable frequency FK24 and FK25, and amplitude AK24 and AK25. The bandwidths of all resonances are manually variable. The amplitudes are variable over a 50 db range. The ranges over which resonance frequencies and bandwidths are variable are:
The synthesizer 22 has four states, selected by a two-bit switch function, SXX, shown illustratively in FIG. 4, generated from an analogue input device 28. Each bit has two states, either 0 or 1. The first bit selects the spectrum-shaping circuits and assigns the control functions to the appropriate parameters; the second bit determines the form of the excitation. Each state is appropriate for particular types of speech sound.
When the first bit state is 0, the vowel resonances 17, 18, 19, and 20, and the nasal resonance 23, operate in parallel and the outputs of the consonant resonances 24 and 25 are switched out. The dynamic parameters are F0, F17, F18, F19, FN, A17, A18, A19, A20, and AN. When the first bit state equals 1, the nasal resonance 23 and the fourth vowel resonance 20 are replaced by the two consonant resonances 24 and 25, F17 assumes a manually preset value, and the dynamic parameters are: F0, F18, F19, FK24, FK25, A17, A18, A19, AK24, and AK25. Thus, even though the synthesizer 22 has a total of 14 dynamic parameters in addition to the two-bit state function, SXX, only 10 control functions (parameters) are needed at any one time. FN, F17, AN, and A20 alternate with FK24, FK25, AK24, and AK25, respectively, under control of the first bit of the state function.
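The parameter bookkeeping described above can be sketched in a few lines. This is only an illustration: the parameter names follow the text, while the function name `active_parameters` and the tuple layout are assumptions introduced here.

```python
# Parameters that are dynamic in both states of the first bit.
ALWAYS_DYNAMIC = ("F0", "F18", "F19", "A17", "A18", "A19")
# Parameters used only when the first bit is 0 (vowel and nasal circuits).
NASAL_GROUP = ("FN", "F17", "AN", "A20")
# Parameters used only when the first bit is 1 (consonant circuit).
FRICATIVE_GROUP = ("FK24", "FK25", "AK24", "AK25")


def active_parameters(first_bit):
    """Return the ten dynamic parameters needed for the given first bit."""
    if first_bit not in (0, 1):
        raise ValueError("first_bit must be 0 or 1")
    group = FRICATIVE_GROUP if first_bit else NASAL_GROUP
    return ALWAYS_DYNAMIC + group


# Ten control functions at a time, fourteen distinct parameters overall.
state0 = active_parameters(0)
state1 = active_parameters(1)
all_parameters = set(state0) | set(state1)
```

Because the two groups never appear together, four shared channels can carry either group, which is how ten channels suffice for fourteen parameters.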
When the second bit of the state function equals 0, the excitation is random noise. F0 is, of course, superfluous. When the second bit of the state function equals 1 and the first bit of the state function equals 0, the excitation is periodic; when the second bit of the state function equals 1 and the first bit of the state function also equals 1, the excitation is mixed: noisy and periodic in varying proportions for the three vowel resonances 17, 18, and 19, and the consonant resonances 24 and 25. The mixture for each resonance is provided by means of the excitation selector and mixer network 29.
The state wherein the first bit of the state function is equal to 0 and the second bit of the state function is also equal to 0 is appropriate for h-like sounds, vowel onsets after voiceless consonants, and whispered vowels. The state wherein the first bit equals 0 and the second bit equals 1 is appropriate for semi-vowels, vowels, nasals, nasalized vowels, liquids, and the closure periods of voiced stops. The state wherein the first bit equals 1 and the second bit equals 0 is appropriate for voiceless fricatives, affricates, and stop bursts. The state wherein the first bit equals 1 and the second bit also equals 1 is appropriate for voiced fricatives and affricates, and noisy liquids and semi-vowels.
Since the switching function is rather complex, overall performance depends to a large extent on the quality of the switching circuits 30, shown in detail in FIGS. 2 and 3. Bilateral transistors 33, 34, and 35 are used so as to minimize distortion and reduce the complexity of the circuits.
The conjugate pole pair resonances 36, shown in FIG. 5, and in more detail in FIG. 7, will be simulated by second-order operational amplifier loops, such as shown in FIG. 7. Since recent performance improvements and price reductions have made them very attractive, operational amplifiers will be used to a great extent throughout the synthesizer 22. Their use makes the output of the synthesizer 22 readily predictable from its input, simplifies calibration procedures, facilitates modifications and assists in maintenance.
The conversion of the synthesizer 22 control functions will be implemented by using anti-logarithmic modulators 37 for the fundamental frequency and the amplitudes of the resonance inputs, as shown in FIG. 6, and pulse-width modulators 38 for the resonance frequencies, as shown in FIGS. 1 and 5. A stable internal clock 39 is included to provide a quality time-base for the pulse-width modulators 38.
There are a number of methods by which the synthesizer 22 of the present invention may be driven. For example, it may be driven by a mechanical function generator which converts hand-painted parameter functions to control voltages; by a punched tape to be mounted on a motor-driven set of pulleys, with suitable decoding circuitry; by a digital magnetic tape; by a laboratory computer with a digital-to-analog converter; or by a large central computer with a time-sharing system, operating in conjunction with a laboratory digital-to-analog converter. The choice would depend upon the needs and resources of the particular users. All methods except the first require a computer. In the second and third methods the computer operates off-line to generate a control tape; in the fourth and fifth methods the computer operates on-line. For the first method, modification of the synthesizer to operate the switching function, SXX, by an analog control function is required; otherwise, the two bits of the state function are transmitted to the synthesizer digitally. In the case of the fifth method, it would be desirable to set the synthesizer clock by means of a digital duration signal from the computer. The synthesizer would then request new control values only when the values actually changed, rather than at fixed intervals, and so free the computer for other work.
OPERATION
The synthesizer 22 of the present invention uses an analog input device 28 as a parameter generation means for driving the synthesizer 22. Eleven channels of the analog input device 28 are utilized. One channel provides the switch function SXX. One channel provides the pitch frequency, F0, which controls the rate of pulse excitation. Five channels provide the input parameters A17, A18, F18, A19, and F19. The remaining four channels provide an input to the parameter switch 42 which will yield eight other speech signal control parameters F17, AK24, A20, FK24, FN, AK25, AN, and FK25.
The remaining speech parameter F20 is set by means of a screw adjustment 43 on the synthesizer 22.
The input parameters A17, A18, F18, A19, and F19 are fed into a parameter filter 44 before passing on to the synthesizer 22 and the respective resonance circuits 45. This parameter filter 44 filters out transients from the parameter control signals, due to the parameter generation. The five parameters A17, A18, F18, A19, and F19, which are passed directly to the synthesizer 22 after passing through the parameter filter 44, are the input parameters which have been determined to be the necessary speech ingredients for most speech sounds.
PARAMETER SWITCH
The parameter switch 42, shown in FIG. 2, which yields the parameters F17, AK24, A20, FK24, FN, AK25, AN, and FK25 from the four-channel input from the analog input device 28, selects those parameters which are necessary for either the nasal sounds or the fricatives. The amplitude (A) parameters, taken both directly from the analog input device 28 and the parameter switch 42, are amplitude control functions; the frequency (F) parameters are center frequency control functions. Referring now to FIGS. 1, 2, and 3, the parameter switch 42 consists of four electronic switches 30 connected in parallel across two input leads, 50 and 51, both carrying the switch function SXX, each switch 30 having a parameter output pair, 47, 48, and an external input device 49.
Referring specifically now to FIG. 3, which shows a detailed schematic diagram of one of the four electronic switches 30 of the parameter switch 42, the external input device 49 may be utilized to provide any external voltage source capable of supplying some non-zero input to the electronic switch 30. If desired, the external input 49 may be set to 0. The electronic switch 30 consists of three transistors 33, 34, and 35 and three operational amplifiers 53, 54, and 55, with associated circuitry. The switch function SXX taken from the analog input device 28 determines the condition of the three transistors 33, 34, and 35. There are two leads 50 and 51 to the transistors of the electronic switch 30, each lead being connected to the channel containing the switch function SXX. Both leads 50 and 51 contain the switch function, but may differ in voltage level. One of the two leads, 51, goes to the transistor pair 33 and 35, which operate concurrently; the other lead 50 goes to the third transistor 34, which operates inversely to the operation of the transistor pair 33 and 35.
ELECTRONIC SWITCHES
The electronic switch 30 is a parallel switch wherein the switch is off when the respective transistor is on. The switch is a one-bit switch having two states. In the first state, when the transistor pair 33 and 35 is off and the third transistor 34 is on, we have both the output of the first operational amplifier 53 through the third operational amplifier 55, and the output of the external input device 49 through the second operational amplifier 54. If it is desired to set the first output 47 to a given value, this can be accomplished by setting the value of the external input device 49, and both outputs 47 and 48 may then pass to the associated resonance circuits. If it is desired that the first output 47 be equal to 0, then the external input device 49 may in reality be set to 0, or be replaced by grounding this connection. The value of this external input device 49 is the manually preset value discussed previously.
In the other state of the electronic switch 30, the transistor pair 33 and 35 is on, and the third transistor 34 is off. We then have the second output 48, that is the output through the third operational amplifier 55, equal to 0 due to the transistor 35 associated with this output 48 being effectively a ground, and the first output 47 being equal to the input parameter through the first operational amplifier 53 and on through the second operational amplifier 54; the external input 49 being grounded through its associated transistor 33, thereby having a zero output passing through the first leg 47 to the resonance circuit associated with that leg. The selected parameter passes through the other output leg 48 to its associated resonance circuit. The output leg 47 passing the 0 to the associated resonance circuit effectively shuts this circuit off.
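The two states of one electronic switch 30 can be summarized as a small truth-table function. This is a behavioral sketch only, not the circuit of FIG. 3. The text is not fully consistent about which output leg carries the zero in the second state; the sketch assumes the reading in which transistor 35 grounds output 48 and the input parameter reaches output 47. The function and argument names are hypothetical.

```python
def electronic_switch(parameter, external_input, pair_on):
    """Return (output_47, output_48) for one electronic switch 30.

    pair_on=True models transistors 33 and 35 conducting (34 off);
    pair_on=False models transistor 34 conducting (33 and 35 off).
    """
    if pair_on:
        # 35 grounds output 48 and 33 grounds the external input 49,
        # so only the input parameter appears, on output 47 (assumed reading).
        return parameter, 0.0
    # 34 conducts instead: output 48 carries the input parameter and
    # output 47 carries the manually preset external value.
    return external_input, parameter


# One shared channel drives either leg, depending on the switch state.
nasal_state = electronic_switch(parameter=3.0, external_input=1.5, pair_on=True)
fricative_state = electronic_switch(parameter=3.0, external_input=1.5, pair_on=False)
```

Under this reading, a pair such as F17 and AK24 behaves as the text describes: in one state F17 follows the channel and AK24 is zero, in the other AK24 follows the channel and F17 takes its preset value.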
The parameters output by the parameter switch 42 can be broken down into two groups. These groups are known as phonemic groups; the first group being the nasal group comprising the parameters F17, A20, FN, and AN; the second group being the fricative group comprising the parameters AK24, FK24, AK25, and FK25. The nasal group is present when the first bit of the state function is equal to 0, and the fricative group is present when the first bit of the state function is equal to 1.
In addition to acting as a selection means for these two phonemic groups, the parameter switch 42 also acts as a filtering means to filter out transients, such as the pops and clicks due to switching, or the transients due to parameter generation. The transients are eliminated from the synthesizer 22 output by the action of the capacitors 57, 58, and 59 associated with the first 53, second 54, and third 55 operational amplifiers in the electronic switch 30, acting as filters for the generated parameters.
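The smoothing effect of a capacitor on an abruptly switched parameter can be illustrated with a discrete one-pole low-pass filter. This is a generic sketch, not a model of the actual component values, which the text does not give; the smoothing factor `alpha` and the function name are assumptions.

```python
def smooth_parameter(samples, alpha=0.2):
    """One-pole low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    y = samples[0] if samples else 0.0  # start at the first input value
    for x in samples:
        y += alpha * (x - y)            # move a fraction of the way to x
        smoothed.append(y)
    return smoothed


# A step in a control parameter becomes a gradual rise instead of a jump.
step = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
result = smooth_parameter(step)
```

A sudden jump in a control voltage would otherwise reach the resonance circuits as a click; the filtered version approaches the new value gradually.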
EXCITATION SELECTOR AND MIXER
A state function channel SXX from the analogue input device 28 provides the input leads 50 and 51 to the parameter switch 42, which leads provide the parallel connection means for the electronic switches 30 which make up the parameter switch 42; it is also connected directly to the synthesizer 22 through a second switching means 62. The second switching means 62 is composed of a mechanical switch 63 and a Boolean AND function circuit 64. The switch 62 together with the pulse source 10 and the noise source 11 provide the inputs for the excitation selector and mixer 29.
The pulse source 10 may be of the conventional type, not shown, consisting of a function generator, a voltage controlled oscillator, and a standard pulse generator, wherein the F0 pitch frequency parameter is converted to an approximate exponential function in a standard two-break-point diode function generator, which drives the linear voltage controlled oscillator; the output of the voltage controlled oscillator triggering the pulse generator so as to yield an output pulse excitation whose width is constant and whose frequency is an exponential function of the F0 parameter input.
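An exponential mapping from the F0 control value to pulse frequency can be sketched as follows. The sketch assumes an idealized exponential (the diode function generator only approximates one) and a control value normalized to the range 0 to 1, which is an assumption of this example; the frequency limits are those stated for the pulse generator.

```python
def pitch_frequency(control, low_hz=55.0, high_hz=330.0):
    """Map a normalized control value in [0, 1] to a pulse frequency.

    Equal steps in the control value give equal frequency ratios, which
    is what a logarithmically variable frequency means.
    """
    if not 0.0 <= control <= 1.0:
        raise ValueError("control must be in [0, 1]")
    return low_hz * (high_hz / low_hz) ** control


# Default range, and the switched range for women's and children's voices.
low_end = pitch_frequency(0.0)
midpoint = pitch_frequency(0.5)
high_end = pitch_frequency(1.0)
switched_mid = pitch_frequency(0.5, low_hz=110.0, high_hz=660.0)
```

The midpoint of the control range lands on the geometric mean of the two limits rather than the arithmetic mean, so pitch moves in musical intervals rather than in cycles per second.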
The noise source 11 may also be conventional, wherein a noise diode, the output of which is amplified by 60 db, low-pass filtered and then amplified again by 20 db, so as to yield a noise power which is approximately constant over the frequency range of i to 8000 c.p.s., is used to provide the noise excitation.
The second switching means 62 provides the synthesizer 22 with the mixed excitation capability mentioned previously. The pulse 10 and noise 11 excitations may be mixed to obtain a multiplicative output in the state when the first bit of the state function is equal to 1 and the second bit of the state function is also equal to 1. This multiplicative output will only occur in this state.
Referring now to FIG. 4, which is a graphic illustration of various signals present in the input to the excitation selector and mixer 29, SXX represents the state function input to the second switching means 62; P0 represents the pulse excitation input to the Boolean AND function circuit 64, and to the excitation selector and mixer 29; S11(P0) represents the multiplicative output of the Boolean AND function circuit 64, this signal appearing only when both P0 and S11 are present; N0 represents the noise excitation 11 output; and S11(P0)(N0) represents the multiplicative output of the excitation switch 66 which is passed to the adder circuit 67 of the excitation selector and mixer 29.
The multiplicative output function of the Boolean AND function circuit 64, S11(P0), is utilized in the excitation switch 66 to switch the noise excitation, N0, on through the excitation switch 66. The noise excitation N0 only passes when both the multiplicative signal S11(P0) and the noise excitation N0 are present. In the S11 state, that is, when the first bit of the state function is 1 and the second bit of the state function is also 1, the excitation switch 66 does not allow the pulse excitation source 10 to pass by itself to the adder circuit 67. The adder circuit 67 of the excitation selector and mixer 29 is always present. In the S11 state the input to the adder is S11(P0)(N0) and the other input is equal to 0; the input to the excitation switch 66 being N0 and S11(P0). The signal P0 is also passed to the excitation switch 66, but this switch 66 does not pass P0 to the output and subsequently to the adder 67.
In the other three states in the four state function, when the first bit is equal to 0 and the second bit is equal to 1 (S01), when the first bit is equal to 1 and the second bit is equal to 0 (S10), and when the first bit is equal to 0 and the second bit is also equal to 0 (S00), the second switching means 62 only passes the state function SXX to the excitation switch 66. At these times P0 and N0 are also fed into the input of the excitation switch 66. In the S10 and S00 states, only N0 is passed to the adder circuit 67, thereby yielding a continuous noise output. In the S01 state, only P0 is passed to the adder circuit 67, yielding a continuous pulse output.
If an additive, S11(P0) plus N0, instead of a multiplicative S11(P0)(N0) excitation in the S11 state is desired, this may be accomplished by merely allowing both P0 and N0 to pass at the same time through the excitation switch 66.
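The selection logic for the four states can be written out sample by sample. This is a behavioral sketch under the assumption that the pulse signal is 0 or 1 at each sample, so that gating the noise by the pulse is the same as multiplying the two; the function name is hypothetical.

```python
def select_excitation(first_bit, second_bit, pulse, noise):
    """Return the excitation sample passed to the adder for one state."""
    if first_bit not in (0, 1) or second_bit not in (0, 1):
        raise ValueError("state bits must be 0 or 1")
    if second_bit == 0:
        return noise          # S00 and S10: continuous noise
    if first_bit == 0:
        return pulse          # S01: pulse excitation only
    return pulse * noise      # S11: noise passes only while the pulse is present


# The same pulse and noise samples evaluated in all four states.
s00 = select_excitation(0, 0, pulse=1.0, noise=0.3)
s10 = select_excitation(1, 0, pulse=1.0, noise=0.3)
s01 = select_excitation(0, 1, pulse=1.0, noise=0.3)
s11_pulse_on = select_excitation(1, 1, pulse=1.0, noise=0.3)
s11_pulse_off = select_excitation(1, 1, pulse=0.0, noise=0.3)
```

The additive alternative mentioned in the text would return `pulse + noise` in the S11 branch instead of the product.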
The excitation switch 66 may be a standard transistorized switch similar in nature to the parameter switch 42.
The output of the excitation switch 66 is fed to the adder 67 and then from the adder 67 to the pulse width modulator 38 to obtain a pulse width modulated waveform e0 of the excitation input; the clock source 39 is used as a reference for the circuit, as is standard.
The clock source 39 provides the reference signal c in a conventional manner by utilizing the output of a stable crystal oscillator to generate a standard clock pulse and a stable triangular wave used in the pulse width modulators 38. The crystal oscillator may be a 100 k.c.p.s. oscillator for driving a two-stage counter in order to divide the frequency by a factor of four to yield a 25 k.c.p.s. signal. This signal then triggers a standard pulse generator whose pulses control the period of a ramp generating circuit, resetting the ramp each time a pulse occurs, yielding a clock source 39 output c which is a stable triangular wave, suitable for driving the pulse width modulator circuits 38.
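The relation between the ramp reference and the pulse width can be sketched numerically. The sketch assumes the usual comparator form of pulse-width modulation (output high while the control level exceeds the ramp), which the text does not spell out; the normalized levels, step counts, and names are illustrative assumptions.

```python
CRYSTAL_HZ = 100_000
CLOCK_HZ = CRYSTAL_HZ // 4   # two-stage counter divides by four


def ramp(step, steps_per_period):
    """Ramp in [0, 1) that resets at the start of every clock period."""
    return (step % steps_per_period) / steps_per_period


def pwm_duty(control, steps_per_period=100, periods=10):
    """Fraction of time a comparator output is high for a control in [0, 1]."""
    total = steps_per_period * periods
    high = sum(1 for n in range(total) if ramp(n, steps_per_period) < control)
    return high / total


# With a linear ramp, the duty cycle tracks the control level directly.
quarter = pwm_duty(0.25)
full = pwm_duty(1.0)
off = pwm_duty(0.0)
```

A linear ramp is what makes the pulse width proportional to the control level, which is why the stability of the ramp matters for the accuracy of the frequency control.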
RESONANCE CIRCUITS
The excitation selector and mixer output e0 and the output c of the clock source 39 are fed directly to the resonance circuits 45 shown in block form in FIGS. 5 and 6 and in detail in FIG. 7. There are seven resonance circuits, as was previously mentioned: four vowel circuits 17, 18, 19, and 20; one nasal circuit 23; and two consonant circuits 24 and 25. Each of these resonance circuits has an input consisting of the excitation selector and mixer output e0, the clock source 39 signal c, an associated amplitude parameter AX, and an associated frequency parameter FX.
Referring now to FIG. 5, which is a block diagram of any of the resonance circuits of the present invention, each resonance circuit comprises a frequency position drive 38, which is in effect no more than a pulse width modulator, having an input of its associated frequency parameter and the clock source 39 signal c, and an output of the frequency control signal ec; an amplitude control anti-logarithmic circuit 37 having an input of the associated amplitude parameter AX and the excitation selector and mixer output e0, and a multiplicative output of these two signals (e0)(AX); and a conjugate pole pair resonance circuit 36 having an adjustable bandwidth, having an input of the frequency control signal ec and the multiplicative amplitude control signal (e0)(AX) and an output of the desired formant.
Referring now to FIG. 6, which is a block diagram of the amplitude control anti-logarithmic circuit 37 of the resonance circuit shown in FIG. 5, the amplitude control anti-logarithmic circuit 37 consists of an anti-logarithmic circuit 68, which yields a linear db output for a linear voltage input amplitude parameter; a balanced modulator 69 connected to this anti-logarithmic circuit 68, which removes the components of the multiplicative (e0)(AX) signal in the band of interest that cannot be removed by the low-pass filter 70 which is connected to the balanced modulator 69 and from which the output of this amplitude-control anti-logarithmic circuit 37 is taken. The input to the balanced modulator 69, in addition to the linear output of the anti-logarithmic circuit 68, is the excitation selector and mixer output e0. Low-pass filter 70 gets rid of the unwanted components of the excitation selector and mixer output signal e0. Balanced modulator 69 gets rid of the unwanted components of the amplitude parameter signal AX. Through the utilization of this amplitude control circuit 37 we can obtain an output signal which does not contain unwanted signals which appear throughout the multiplicative output signal e0(AX).
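The anti-logarithmic amplitude control amounts to treating the control voltage as a level in decibels and converting it to a linear gain. The sketch below assumes a control value normalized to 0 to 1, spanning the 50 db range stated earlier, and the voltage convention of 20 db per decade; these scalings and the function names are assumptions of this example, not values from the circuit.

```python
def antilog_gain(control, range_db=50.0):
    """Convert a linear control in [0, 1] to a linear gain.

    control = 1 gives 0 db (gain 1); control = 0 gives -range_db.
    """
    if not 0.0 <= control <= 1.0:
        raise ValueError("control must be in [0, 1]")
    level_db = (control - 1.0) * range_db
    return 10.0 ** (level_db / 20.0)


def apply_amplitude(excitation, control):
    """Scale excitation samples by the anti-logarithmic gain."""
    gain = antilog_gain(control)
    return [gain * sample for sample in excitation]


# Equal control steps change the output by equal ratios (equal db steps).
ratio_low = antilog_gain(0.4) / antilog_gain(0.2)
ratio_high = antilog_gain(0.6) / antilog_gain(0.4)
scaled = apply_amplitude([1.0, -0.5], 1.0)
```

This is why an anti-logarithmic control suits a wide dynamic range: a modest voltage swing covers the full 50 db without crowding the quiet levels into a tiny part of the scale.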
Referring now to FIG. 7, which is a schematic diagram of the conjugate pole pair resonance circuit 36 of the resonance circuits, the conjugate pole pair resonance circuit 36 is a modified second-order operational amplifier loop, having two inputs 71 and 72, as can be seen from examining FIG. 7. The first input 71 is to the transistor pair 73 and 74, and the second input 72, which is the output of the amplitude-control circuit 37, is to the circuit 36 across a variable resistance 75. The two transistors 73 and 74 shown in FIG. 7 change the mode of the loop from an operate to a hold condition at a rate which is at least twice as high as the highest frequency of interest. The circuit output in the operate condition is that of a conjugate pole pair network, such as a common R-L-C network. By controlling the amount of time the circuit is in the operate mode, the frequency location of the poles may be varied, so that the output impulse response in time will be of the form:
where M is the fraction of time the circuit is in the operate mode and wR is the natural frequency of thecircuit.
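The effect of the operate/hold switching can be explored with a small simulation. The sketch below is an assumption-laden model, not the circuit of FIG. 7: it treats the loop as a two-integrator resonator whose state advances only during the operate interval and is frozen during hold, so the ringing frequency is expected to come out near M times the natural frequency. The function names, sample rate, switching rate, and damping value are all hypothetical.

```python
import math


def gated_resonator(f_r, duty, fs=1_000_000, switch_rate=10_000,
                    dur=0.05, zeta=0.01):
    """Impulse response of a second-order loop that integrates only in 'operate'.

    f_r  : natural frequency in Hz when the loop operates continuously
    duty : fraction M of each switching period spent in the operate mode
    """
    w = 2.0 * math.pi * f_r
    dt = 1.0 / fs
    period = round(fs / switch_rate)   # samples per switching period
    on = round(duty * period)          # operate samples per period
    x, v = 0.0, w                      # initial velocity stands in for an impulse
    out = []
    for n in range(round(dur * fs)):
        if n % period < on:            # operate: integrators run
            v += dt * (-2.0 * zeta * w * v - w * w * x)
            x += dt * v
        # hold: both integrator states are simply left unchanged
        out.append(x)
    return out


def ringing_frequency(samples, fs=1_000_000):
    """Estimate the ringing frequency from the number of zero crossings."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0.0)
    return crossings * fs / (2.0 * len(samples))


if __name__ == "__main__":
    for m in (1.0, 0.5, 0.25):
        print(m, ringing_frequency(gated_resonator(2000.0, m)))
```

In this model the switching rate (10 kHz) is kept well above twice the highest ringing frequency (2 kHz), in line with the condition stated above.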
The switching function performed by the two transistors 73 and 74 is accomplished by using the low collector to emitter resistance of the junction transistors operated in the inverted configuration shown in FIG. 7. The frequency control signal ec, which is fed to the input 71 to the transistor pair 73 and 74, and which controls the state of the transistors 73 and 74, is, as was previously mentioned, generated by a pulse-width modulator 38 known as the frequency position drive source 38, the input of which is the frequency control parameter FX, which determines the pulse width.
The conjugate pole resonance circuit 36 has a manually adjustable bandwidth, being adjustable by means of the variable resistance 75. The frequency control signal ec controls the transistor switches 73 and 74 to switch the circuit from the hold (off) to the operate (on) mode. The frequency control signal ec is either positive or negative; when it is positive the circuit is on, and when it is negative the circuit is off.
When the frequency parameter FX is 0, we are operating at the lowest frequency in our desired range. This is due to an in-circuit adjustment of the frequency position drive source 38, which is standard in this type of circuitry.
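The relationship between the frequency parameter and the two-valued frequency control signal can be sketched as a simple pulse-width modulator. This is only an illustration under stated assumptions: the function name `pwm_control_signal`, the normalization of FX to the range 0 to `fx_max`, the period length, and the minimum duty `min_duty` (standing in for the in-circuit adjustment that places FX = 0 at the lowest frequency of the range) are hypothetical values, not figures from the specification.

```python
def pwm_control_signal(fx, n_samples, period=100,
                       min_duty=0.1, max_duty=1.0, fx_max=1.0):
    """Return a +1/-1 control signal whose positive pulse width follows fx.

    +1 stands for 'operate' (on) and -1 for 'hold' (off).
    min_duty is a hypothetical offset so that fx = 0 still gives a nonzero
    operate time, i.e. the lowest frequency of the range rather than zero.
    """
    fraction = max(0.0, min(1.0, fx / fx_max))        # clamp to the valid range
    duty = min_duty + (max_duty - min_duty) * fraction
    on = round(duty * period)                          # operate samples per period
    return [1 if (n % period) < on else -1 for n in range(n_samples)]


if __name__ == "__main__":
    for fx in (0.0, 0.5, 1.0):
        signal = pwm_control_signal(fx, 1000)
        operate_fraction = sum(1 for s in signal if s > 0) / len(signal)
        print(fx, operate_fraction)
```

The operate fraction produced here plays the role of M in the description of the conjugate pole resonance circuit 36.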
The output of the resonance circuit is taken from the conjugate pole resonance circuit 36. We have a choice of three outputs 76, 77, and 78, which may be selected by means of a switch 80. The outputs 76, 77, and 78 depend upon the point in the conjugate pole resonance circuit 36 at which the output is taken. The first output 76, which is the normal output, is taken at the output of the second operational amplifier 81; a second output 77 may be taken at the output of the third operational amplifier 82, said output being the negative of the normal output; and a third output 78 may be taken at the output of the first operational amplifier 83, said output being a negative of the normal output with a spectral zero at zero frequency. The normal output of the resonance circuit is the convolution of the multiplicative signal e0(AX) and the impulse response of the conjugate pole resonance circuit 36, such output being a formant.
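The difference between the three selectable outputs can be illustrated in the frequency domain. The sketch below assumes textbook transfer functions for a two-integrator loop (a conjugate pole pair for the normal output, its negative, and a version with an added zero at zero frequency); these forms, the function name `loop_outputs`, and the example values are assumptions for illustration and are not taken from FIG. 7.

```python
import math


def loop_outputs(f, f_r, bandwidth):
    """Frequency response of three assumed outputs of a second-order loop.

    f         : evaluation frequency in Hz
    f_r       : resonant (natural) frequency in Hz
    bandwidth : bandwidth in Hz, set here by a single damping term
    """
    s = 2j * math.pi * f
    w = 2.0 * math.pi * f_r
    b = 2.0 * math.pi * bandwidth
    normal = (w * w) / (s * s + b * s + w * w)   # conjugate pole pair
    inverted = -normal                            # negative of the normal output
    zero_at_dc = -(s / w) * normal                # negative, with a zero at f = 0
    return normal, inverted, zero_at_dc


if __name__ == "__main__":
    for f in (0.0, 250.0, 500.0, 1000.0):
        n, i, z = loop_outputs(f, 500.0, 50.0)
        print(f, abs(n), abs(i), abs(z))
```

With these assumed forms, all three outputs share the same resonance peak, while only the third falls to zero at zero frequency.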
The fourth vowel resonance circuit operates in the same manner as the other resonance circuits, except that the input frequency parameter F20 is a manually variable frequency arrived at by an internal potentiometer adjustment, F20 being internal to the synthesizer means 22.
During the operation of the resonance circuits, input parameters A17, A18, and A19 are always present in the synthesizer 22 and therefore in the associated resonance circuits. If desired, these parameters could be set to 0. Input parameters AK24, A20, AK25 and AN are switched by the state function SX in the parameter switch 42 so as to select the desired parameters to be passed to the associated amplitude control circuit 37. The amplitude of the resonance circuit 45 input is controlled by an anti-logarithmic circuit 68, as explained previously. The converted amplitude control parameter is multiplied by a pulse-width modulated form of the selected excitation in a balanced modulator 69, and the resultant product is low-pass filtered through a low-pass filter 70 to remove unwanted extraneous components in the modulation product. The output is then the selected excitation at an amplitude which is proportional to the anti-logarithm of the input amplitude parameter. The amplitude parameters are critical factors which determine whether or not the resonance circuits are on; the amplitude parameters not being passed to the resonance circuits being equivalent to 0. The frequency parameter inputs to the resonance circuits are important in that they control the resonant frequency of the associated resonance circuits.
OUTPUT MIXER
The selected outputs of these resonance circuits 45, which are speech formants, are then passed into an output mixer 84. The output mixer 84 accomplishes summation of these speech formants and final low-pass filtering to produce connected speech sounds from these formants so as to form words by combining the formants to form phonemes, and phonemes to form words. These words can be connected together in sentence form, and in ensuing paragraph form in any desired speech arrangement.
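The summation and final low-pass filtering performed by the output mixer can be sketched in a few lines. This is a simplified digital stand-in, assuming sampled formant signals of equal length and a one-pole low-pass filter; the function name `mix_formants`, the cutoff frequency, and the sample rate are hypothetical and are not given in the specification.

```python
import math


def mix_formants(formants, cutoff_hz=5000.0, fs=20000.0):
    """Sum several formant signals sample by sample, then low-pass filter.

    formants  : list of equal-length sample lists, one per resonance circuit
    cutoff_hz : hypothetical cutoff of the final low-pass filter
    fs        : hypothetical sample rate of the simulation
    """
    # Summation of the selected formant outputs.
    summed = [sum(column) for column in zip(*formants)]
    # One-pole low-pass filter standing in for the final filtering stage.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    y = 0.0
    out = []
    for sample in summed:
        y += alpha * (sample - y)
        out.append(y)
    return out


if __name__ == "__main__":
    f1 = [math.sin(2.0 * math.pi * 500.0 * n / 20000.0) for n in range(200)]
    f2 = [0.5 * math.sin(2.0 * math.pi * 1500.0 * n / 20000.0) for n in range(200)]
    print(mix_formants([f1, f2])[:5])
```

A formant whose amplitude parameter is 0 contributes nothing to the sum, which matches the role of the amplitude parameters in turning resonance circuits on and off.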
The resonance synthesizer apparatus of the present invention thus makes use of the fact that only ten parameters (control functions) are necessary for speech sounds at any one time, thus allowing for more efficient and economical synthesis of speech.
It is to be understood that the above-described embodiment of the invention is merely illustrative of the principles thereof and that numerous modifications and embodiments of the invention may be derived within the spirit and scope thereof; such as, utilizing individual excitation selector and mixer circuits for each of the resonance circuits so as to provide each with a separate and distinct mixed excitation output.
What is claimed is:
1. A resonance synthesizer comprising:
a parameter generating means for generating dynamically variable speech signal parameters and at least one state function;
a parameter switching means for selecting a desired phonemic group from said speech signal parameters;
a speech synthesizer means, having at least one state, for synthesizing the speech signal parameters to form connected speech sounds wherein less than all the generated speech signal parameters and at least one state function are utilized to control the speech output of the synthesizer;
a first plurality of connecting means, for connecting the parameter generating means to the speech synthesizer means and to the parameter switching means, the parameter switching means being connected to the parameter generating means by less than the total number of first connecting means; said state function being passed to the parameter switching means and to the synthesizer means by at least one of the first plurality of connecting means; and
a second plurality of connecting means, for connecting the parameter switching means to the synthesizer means wherein less than all the speech signal parameters passed to the parameter switching means by less than the total number of first connecting means are passed to the synthesizer means by less than the total number of second connecting means so that the desired phonemic group is passed to the synthesizer means.
2. A resonance synthesizer in accordance with claim 1 wherein said first plurality of connecting means includes a parameter filtering means for filtering at least one of the generated parameters of the total number of generated speech signal parameters, connected to the speech synthesizer means by the first plurality of connecting means so as to eliminate transients.
3. A resonance synthesizer in accordance with claim 2 wherein the parameter generating means is an analogue input device.
4. A resonance synthesizer in accordance with claim 3 wherein the analogue input device is a digital computer.
5. A resonance synthesizer in accordance with claim 3 wherein the parameter switching means is an electronic switching means comprising at least one electronic switch.
6. A resonance synthesizer in accordance with claim 5 wherein the electronic switch includes:
a transistor means;
an external input means for generating a desired external input speech signal parameter;
a first output comprising one of the generated speech signal parameters or the desired external input speech signal parameter; and
a second output comprising another generated speech signal parameter.
7. A resonance synthesizer in accordance with claim 6 wherein said state function passed to the parameter switching means controls the operation of the transistor means so as to select which output generated parameter to pass to the synthesizer means.
8. A resonance synthesizer in accordance with claim 4 wherein said electronic switch comprises:
a first input from the parameter generating means;
a first impedance connected to said first input;
a parallel second impedance;
a first operational amplifier connected in parallel with said second impedance;
a parallel third impedance;
a second operational amplifier connected in parallel with said third impedance;
a series fourth impedance connected between said first operational amplifier and said second operational amplifier;
a first transistor, having an emitter, base and collector,
connected to the fourth impedance;
a fifth impedance connected to the base of the first transistor;
a first connecting means for connecting the fifth impedance and the number of connecting means of the first plurality of connecting means passing the state function to the parameter switching means and the synthesizer means;
an external input means for supplying desired external speech signal parameters;
a second input wherein said second input is the input from said external input means;
a sixth impedance connected between said second input and said second operational amplifier;
a second transistor, having an emitter base and a collector connected to the sixth impedance;
a seventh impedance connected to the base of the second transistor;
a second connecting means for connecting the base of the second transistor and the number of connecting means of the first plurality of connecting means passing the state function to the parameter switching means and the synthesizer means;
a series eighth impedance connected to the output of the first operational amplifier;
a parallel ninth impedance;
a third operational amplifier connected in parallel with said ninth impedance and in series with said eighth impedance;
a third transistor, having an emitter, base and collector,
connected to the eighth impedance;
a tenth impedance connected between the base of said third transistor and the second connecting means;
a first and a second speech signal parameter output at the second and third operational amplifiers, respectively;
said second and third transistors operating together and inversely to the operation of the first transistor so as to pass only one of the two generated speech signal parameter outputs to the synthesizer means, said external input means passing the desired external speech signal parameter to the synthesizer means in addition to said passed one generated speech signal parameter when said second and third transistors do not operate and said first transistor does operate.
9. A resonance synthesizer in accordance with claim 8 wherein:
said parallel second, third and ninth impedances comprise a resistor and capacitor connected in parallel;
said first, fourth, fifth, sixth, seventh, eighth, and tenth impedances comprise resistors;
said first, second, and third transistors are connected to said fourth, sixth, and eighth impedances, respectively, by their respective bases; the respective emitters being grounded.
10. A resonance synthesizer in accordance with claim 9 wherein said second, third, and ninth impedance capacitors are filters for filtering the generated speech signal parameters so as to eliminate transients.
11. A resonance synthesizer in accordance with claim 10 wherein the electronic switching means comprising at least one electronic switch comprises more than one electronic switch, said electronic switches being connected in parallel across said first and second connecting means, each electronic switch having a separate input from the parameter generating means at the first input;
a separate desired input from the external input means at the second input associated with one of the two generated speech signal parameters; and
a separate output pair of two speech signal parameters to the synthesizer means, only one generated speech signal parameter of each output pair being passed to the synthesizer means, and the separate desired external input being passed to the synthesizer means when the generated speech signal parameter associated with the external input means is not passed to the synthesizer means.
12. A resonance synthesizer in accordance with claim 11 wherein said state function passed to the parameter switching means controls the operation of the transistors so as to select which output generated parameter to pass to the synthesizer means.
13. A resonance synthesizer in accordance with claim 7 wherein the speech synthesizer means comprise:
a first switching means having a first and a second input and an output, the first input being connected to the number of connecting means of the first plurality of connecting means passing the state function to the parameter switching means and the synthesizer means;
an excitation selector and mixer means having an input and an output, connected at the input to said first switching means at the output of the first switching means;
a pulse excitation source means, having an input and an output, the output being connected to the second input of said first switching means and to the input of said excitation selector and mixer means, and the input being connected to a parameter of the parameter generating means;
a noise excitation source means connected to the input of the excitation selector and mixer means;
a clock source means connected to said excitation selector and mixer means;
a plurality of resonance means connected to the excitation selector and mixer means, at least one of said plurality of resonance means being connected to the clock source means for producing speech formants;
a second switching means connected to the plurality of resonance means for selecting a desired output form of the formant; and
an output mixer means connected to the second switching means for yielding an output consisting of connected speech sounds.
14. A resonance synthesizer in accordance with claim 13 wherein the pulse excitation source input parameter is a pitch frequency parameter for controlling the rate of pulse excitation.
15. A resonance synthesizer in accordance with claim 14 wherein the first switching means includes a Boolean logic AND function circuit for combining the pulse excitation source output and the state function to obtain a first multiplicative output of the state function and pulse excitation source in at least one state of the state function.
16. A resonance synthesizer in accordance with claim 15 wherein the excitation selector and mixer means comprise:
an excitation switch, said excitation switch being connected to the pulse excitation source means, the first switching means, and the noise excitation source means, for selecting the desired excitation combination;
an adder circuit connected to the excitation switch for adding the desired excitation combination; and
a pulse-width modulator means connected to the adder circuit and to the clock source means for obtaining a desired excitation output to pass to the plurality of resonance means.
17. A resonance synthesizer in accordance with claim 16 wherein said excitation switch is an electronic switch.
18. A resonance synthesizer in accordance with claim 17 wherein said state function passed to the first switching means, and by said first switching means to said excitation switch, controls the operation of the electronic excitation switch so as to control the mixing of the pulse and noise excitation source means.
19. A resonance synthesizer in accordance with claim 18 wherein in the state of said state function, when the pulse excitation source is multiplicative with the state function the noise excitation source is fed into the excitation switch means by the first switch means to obtain a second multiplicative output of the state function, pulse excitation source and noise excitation source.
20. A resonance synthesizer in accordance with claim 19 wherein said plurality of resonance means comprises vowel resonance means, nasal resonance means, and consonant resonance means.
21. A resonance synthesizer means in accordance with claim 20 wherein said vowel resonance means comprise first, second, third and fourth vowel resonance circuits, said nasal resonance means comprise a nasal resonance circuit, and said consonant resonance means comprise a first and second consonant circuit.
22. A resonance synthesizer means in accordance with claim 21 wherein all resonance circuit means have associated manually variable bandwidths.
23. A resonance synthesizer in accordance with claim 22 wherein said dynamically variable speech signal parameters comprise a first type and a second type.
24. A resonance synthesizer in accordance with claim 23 wherein said first type is amplitude and said second type is frequency, each parameter being associated with a selected resonance circuit.
25. A resonance synthesizer in accordance with claim 25 wherein the amplitude parameters control the output of the resonance means so as to pass the desired speech formants to the output mixer means through the second switching means, for obtaining the desired connected speech sound output of the synthesizer means.
26. A resonance synthesizer in accordance with claim 25 wherein the resonance circuits each have an associated frequency parameter input, a dynamically variable amplitude parameter input, a clock source input, and an input of the excitation output, and an output of the desired speech formant.
27. A resonance synthesizer means in accordance with claim 26 wherein the resonance circuits each comprise:
a frequency position drive source having an input of the clock source and of the associated frequency parameter and an output of a frequency control signal;
an amplitude control circuit having an input of the associated dynamically variable amplitude parameter and of the excitation output, and an output of an amplitude control signal; and
a first conjugate pole resonance circuit having the associated manually variable bandwidth and having an input of the frequency control signal output and of the amplitude control signal output, and an output of the desired speech formant to the output mixer means.
28. A resonance synthesizer in accordance with claim 27 wherein the amplitude control circuit is anti-logarithmic and comprises:
an anti-logarithmic circuit having an input of the associated dynamically variable amplitude parameter in the form of a linear voltage input, and an output in the form of a linear decibel output of the linear voltage input;
a low-pass filter; and
a balanced modulator circuit, connected between the anti-logarithmic circuit and the low-pass filter and having an input of the anti-logarithmic output and of the excitation output for forming a third multiplicative amplitude control signal output and for removing the unwanted components of the third multiplicative output not removed by the low-pass filter.
29. A resonance synthesizer means in accordance with claim 28 wherein the associated frequency parameter inputs of the first, second, and third vowel resonance circuits, the nasal resonance circuit, and the first and second consonant resonance circuits are dynamically variable parameters obtained from the parameter generating means, and the associated frequency parameter input of the fourth resonance circuit is a manually variable internal parameter.
30. A resonance synthesizer in accordance with claim 29 wherein the first and second conjugate pole resonance circuits comprise a modified second-order operational amplifier loop having a first and second mode changing transistor connected to the circuit frequency control signal input and a first, second, and third operational amplifier having first, second, and third output connections, respectively, connected to the second switching means for obtaining the desired output form of the desired speech formant to be passed to the output mixer means.
References Cited
UNITED STATES PATENTS
2,243,089 5/1941 Dudley.
2,243,526 5/1941 Dudley.
3,158,685 11/1964 Gerstman et al.
3,319,002 5/1967 Clark et al.
OTHER REFERENCES
Johan C. W. A. Liljencrants: The OVE III Speech Synthesizer, IEEE Transactions on Audio & Electroacoustics, vol. AU-16, No. 1, March 1968.
KATHLEEN H. CLAFFY, Primary Examiner
C. W. JIRAUCH, Assistant Examiner
U.S. Cl. X.R. 179-1555
US743138A 1968-07-08 1968-07-08 Resonance synthesizer for speech research Expired - Lifetime US3524930A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US74313868A 1968-07-08 1968-07-08

Publications (1)

Publication Number Publication Date
US3524930A true US3524930A (en) 1970-08-18

Family

ID=24987662

Family Applications (1)

Application Number Title Priority Date Filing Date
US743138A Expired - Lifetime US3524930A (en) 1968-07-08 1968-07-08 Resonance synthesizer for speech research

Country Status (1)

Country Link
US (1) US3524930A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US2243526A (en) * 1940-03-16 1941-05-27 Bell Telephone Labor Inc Production of artificial speech
US2243089A (en) * 1939-05-13 1941-05-27 Bell Telephone Labor Inc System for the artificial production of vocal or other sounds
US3158685A (en) * 1961-05-04 1964-11-24 Bell Telephone Labor Inc Synthesis of speech from code signals
US3319002A (en) * 1963-05-24 1967-05-09 Clerk Joseph L De Electronic formant speech synthesizer

Similar Documents

Publication Publication Date Title
US3836717A (en) Speech synthesizer responsive to a digital command input
Winham et al. Input generators for digital sound synthesis
JPS58117599A (en) Method and apparatus for compressing time region information signal
US4470150A (en) Voice synthesizer with automatic pitch and speech rate modulation
Klatt Acoustic theory of terminal analog speech synthesis
US3524930A (en) Resonance synthesizer for speech research
US4351219A (en) Digital tone generation system utilizing fixed duration time functions
US3078345A (en) Speech compression systems
US4566117A (en) Speech synthesis system
US3268660A (en) Synthesis of artificial speech
US4805220A (en) Conversionless digital speech production
US4173915A (en) Programmable dynamic filter
US3830977A (en) Speech-systhesiser
JPS60100199A (en) Electronic musical instrument
US3280266A (en) Synthesis of artificial speech
JPS6038912B2 (en) Signal processing method
Strube Synthesis part of a" Log area ratio" vocoder in analog hardware
US3491205A (en) Plural formant speech synthesizer
US5231240A (en) Digital tone mixer
JPS6265100A (en) Csm type voice synthesizer
US3042748A (en) Dynamic analog speech synthesizer
SU120658A1 (en) Method of analysis and synthesis of speech formant or vocative type
Baggi Implementation of a channel vocoder synthesizer using a fast, time-multiplexed digital filter
US3499986A (en) Speech synthesizer
JP2535807B2 (en) Speech synthesizer