US3042748A - Dynamic analog speech synthesizer - Google Patents

Dynamic analog speech synthesizer

Publication number
US3042748A
Authority
US
United States
Prior art keywords
transmission line
control
signal
sections
variable
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
US757170A
Inventor
Rosen George
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00

Definitions

  • This invention relates to a system for bandwidth compression of speech which includes a speech analyzer and synthesizer, and more particularly to a dynamic analog speech synthesizer.
  • Speech compression systems are comprised of three parts: an analyzer, a communication channel such as a telephone line or radio link, and a synthesizer.
  • The analyzer accepts a speech wave and delivers to the channel a set of signals which are a coded representation of the important attributes of the input speech and which require little channel capacity for their transmission.
  • The channel delivers the signals to the synthesizer, which treats them as instructions for constructing a replica equivalent by some criterion to the speech at the input to the system.
  • Compression systems such as the vocoder in Patent No. 2,243,527, May 27, 1941, to Homer W. Dudley, have produced natural-sounding speech but require a channel of relatively high capacity.
  • Formant-coding compression systems require much less channel capacity. Examination of speech spectra and of the mechanism of vowel production suggests how the ensemble of spectra may be reduced.
  • The vowels are characterized by formants, of which the lowest two or three are sufficient to specify the vowel both acoustically and perceptually. Specification of a vowel spectrum by the frequencies of the first three formants rather than by, say, the amplitudes at 16 points along the frequency scale is a much more efficient representation. If, for example, each formant frequency were specified to within 100 cps, the total number of vowel spectra would be about $2^{13}$, assuming the over-all amplitude of the vowel to be quantized in eight steps. Thus, a system in which formant frequencies were coded would have a distinct advantage over a vocoder system, as far as channel capacity requirements are concerned.
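The channel-capacity comparison can be made concrete with a little arithmetic. The sketch below takes the count of vowel spectra to be 2^13 and the amplitude to be quantized in eight steps, as the passage describes; the assumption that each of the 16 spectrum amplitudes would also be quantized in eight steps is ours, introduced only to have something to compare against.

```python
import math

# Figures taken from the passage.
formant_code_spectra = 2 ** 13   # total number of vowel spectra in the formant code
amplitude_steps = 8              # over-all amplitude quantized in eight steps

bits_formant_code = math.log2(formant_code_spectra)       # bits per vowel spectrum
bits_amplitude = math.log2(amplitude_steps)                # bits spent on over-all amplitude
bits_formant_frequencies = bits_formant_code - bits_amplitude

# Assumption for comparison only: 16 spectrum amplitudes, each in eight steps.
spectrum_points = 16
bits_amplitude_code = spectrum_points * math.log2(amplitude_steps)

print(f"formant code:      {bits_formant_code:.0f} bits per spectrum")
print(f"  of which formant frequencies: {bits_formant_frequencies:.0f} bits")
print(f"16-point amplitudes: {bits_amplitude_code:.0f} bits per spectrum (assumed quantization)")
```

Under these assumptions the formant description needs roughly a quarter of the bits per spectrum, which is the sense in which it is "much more efficient".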
  • The present invention provides a speech synthesizer for a formant coding system which is a dynamically controllable electrical analog of the vocal tract. It synthesizes sequences of speech sounds so that they are highly intelligible and natural, in contrast to previous speech synthesizers.
  • A speech synthesizer which is a dynamic analog of the vocal tract also has utility in studies of speech production. Certain classes of consonants such as stops, semi-vowels, and affricates exist only by virtue of articulatory changes. Analog studies of these consonants in particular and of connected speech in general require dynamically variable speech synthesizers.
  • The ability of a variable analog to synthesize syllables is helpful in the study of sounds such as nasals and fricatives which can exist in the steady state.
  • The dynamically variable feature is helpful both because formant transitions may be studied and because synthetic stimuli can be presented in context.
  • The vocal tract is treated as an acoustic tube whose cross sectional dimensions are small as compared with a wavelength, i.e., there are no cross modes of vibration in the frequency range of interest.
  • The cross sectional area of the tube is a function of distance along its axis.
  • This smooth tube is approximated in shape by a cascade of cylinders each of which has a uniform cross sectional area.
  • The analog of the acoustical system is an electrical system wherein voltage is identified with pressure and current is identified with volume velocity.
  • Each cylinder is represented by a T-section whose series elements are inductances (L) and whose shunt element is a capacitance (C).
  • the dynamic analog to be described is terminated by an inductor which approximates the radiation impedance seen from the mouth.
  • The output is the voltage across a small inductor in series with the radiation impedance. The current through the radiation impedance is treated as the strength (volume velocity) of a simple source, and hence the voltage across the small inductor behaves as the sound pressure at some distance from that source.
  • the dynamic analog of the vocal tract (DAVO) speech synthesizer is a transmission-line analog of the human vocal tract.
  • The vocal tract is considered as an acoustic tube of variable cross sectional area, terminated by the glottis at one end and the lips at the other.
  • the acoustic transmission line is simulated electrically by a number of variable LC sections, each of which represents a short length of the vocal tract.
  • The dynamic analog is comprised of a number of fixed and variable LC sections.
  • The inductance and capacitance in each of the variable sections are continuously variable through a range of values corresponding to the range of variation of the dimensions of the human vocal tract.
  • The electrical transmission line receives excitation from a quasi-periodic saw-tooth current source at one end, representing the output of the vocal folds, and/or from a noise source of suitable spectrum, inserted at an appropriate point along the transmission line to represent the turbulent noise that is created when the air flows through a constriction.
  • The dynamic analog is comprised of a lumped transmission line, sources of excitation for the line, and a device for controlling the configuration of the line and its excitation.
  • Both the inductor and the capacitor of a given section of the line are controlled by a common control voltage in such manner that the length of that section, proportional to $(LC)^{1/2}$, remains constant as its area, proportional to $(C/L)^{1/2}$, changes in response to the control voltage.
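The two proportionalities follow from the usual lumped description of a short uniform cylinder, in which the series element is ρl/A and the shunt element is Al/(ρc²). The sketch below works directly in acoustic CGS units rather than the scaled electrical units of the hardware; the values of `RHO` and `C_SOUND` and all function names are assumptions chosen for illustration.

```python
import math

RHO = 1.14e-3     # assumed density of air in the tract, g/cm^3
C_SOUND = 3.5e4   # assumed speed of sound, cm/s


def section_elements(area_cm2, length_cm):
    """Return (L, C) for a uniform cylinder in acoustic CGS units."""
    inductance = RHO * length_cm / area_cm2                     # series element
    capacitance = area_cm2 * length_cm / (RHO * C_SOUND ** 2)   # shunt element
    return inductance, capacitance


def length_from_lc(inductance, capacitance):
    """Section length implied by the LC product: l = c * sqrt(LC)."""
    return C_SOUND * math.sqrt(inductance * capacitance)


def area_from_lc(inductance, capacitance):
    """Section area implied by the C/L ratio: A = rho * c * sqrt(C/L)."""
    return RHO * C_SOUND * math.sqrt(capacitance / inductance)


# The two extremes of the 100:1 area range for a 1.5 cm section.
narrow = section_elements(0.17, 1.5)
wide = section_elements(17.0, 1.5)

print("LC product (narrow, wide):", narrow[0] * narrow[1], wide[0] * wide[1])
print("recovered length, cm:", length_from_lc(*narrow), length_from_lc(*wide))
print("recovered area, cm^2:", area_from_lc(*narrow), area_from_lc(*wide))
```

Holding the length fixed while the area sweeps 100:1 forces L and C each to sweep 100:1 in opposite directions, with their product unchanged.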
  • the transmission line is excited at the left by glottal current pulses of variable frequency and amplitude.
  • a noise voltage generator is inserted in series with the transmission line inductor corresponding to the place where the turbulence is produced.
  • the noise amplitude is controlled and band-pass filtering is provided.
  • The generator of click components of velar stops is inserted in the line in a like manner.
  • The clicks or noise correspond to sounds generated by sudden pressure releases or by turbulent air flow in the vocal tract.
  • a nasal analog is inserted in the dynamic vocal tract.
  • the analog is comprised also of LC sections and operates to simulate nasal sounds.
  • the nasal analog is coupled into the speech synthesizer by means of a nasal coupling device.
  • The nasal analog represents the nasal cavity.
  • Each section of the aforementioned transmission line is comprised of three parts, a variable inductance, a variable capacitance circuit, and a control amplifier.
  • the variable inductor utilizes changes in the incremental permeability of a ferromagnetic material and the variable capacitor utilizes changes in the magnitude of the Miller effect.
  • The area range for each section is 100:1 (0.17 cm² to 17 cm²). Satisfactory vowels have been produced using only settings between 0.59 cm² and 17 cm², a ratio of 29:1, but it is desirable to have a full continuous range of 100:1 in the dynamic analog.
  • the impedance level was chosen to be 0.77 4 (electrical ohm)1/cm.2 of area.
  • Each section of the dynamic analog corresponds to a given portion of the human vocal tract, and the length of each section of the analog remains constant, independent of the area changes. The requirement of constant length for each section implies that LC be constrained to remain constant for each section. In order to realize the 100:1 range in area, both the inductance and the capacitance of each section have a 100:1 range.
  • The control characteristics are such as to make the value of each element an exponential function of its control voltage.
  • The constraint is achieved by the control exercised on the variable capacitance and inductance. Since the variable capacitance appears between ground and the input of the capacitor amplifier, its value is determined by VCC, the capacitor control voltage.
  • The variable inductance, whose value is determined by VCL, the inductor control voltage, appears across a signal winding of a saturable inductor.
  • the control characteristics of both variable capacitance circuit and the variable inductance circuit are exponential.
  • $V_{CL} = d - (b_C / b_L)\,V_{CC}$ (3), where $a_C$, $b_C$, $a_L$, $b_L$, and $d$ are constants.
  • The control amplifier maintains a linear relation (Eq. 3) between VCL and VCC so that the LC product remains constant.
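The reason a linear relation between the two control voltages suffices can be written out in a few lines. The exponential forms below are an assumption: the text states only that both characteristics are exponential and names the constants, so the exact expressions are our reading rather than the patent's own equations.

```latex
% Assumed exponential control characteristics:
C = a_C \, e^{\,b_C V_{CC}}, \qquad L = a_L \, e^{\,b_L V_{CL}}

% Product of the two elements:
LC = a_C \, a_L \, e^{\,b_C V_{CC} + b_L V_{CL}}

% Substituting the linear relation V_{CL} = d - (b_C / b_L) V_{CC}:
LC = a_C \, a_L \, e^{\,b_L d} \qquad \text{(independent of } V_{CC}\text{)}

% The ratio, which sets the area, remains exponential in the control voltage:
\sqrt{C / L} = \sqrt{a_C / a_L} \; e^{-b_L d / 2} \; e^{\,b_C V_{CC}}
```

Under these assumed forms the section length stays fixed while the area varies exponentially with VCC, and the increments of VCL and VCC have the constant ratio $-b_C/b_L$.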
  • VCC and VCL are determined by VCS, the control input voltage which determines the section area.
  • the length of the section remains constant as its area changes exponentially over a 100:1 range.
  • The exponential characteristic of the capacitor is achieved by proper design of the electronic attenuator.
  • Each section is constructed as an L-section.
  • A nonuniform line can be approximated more closely by a given number of T-sections than by the same number of L-sections. It is also possible to utilize a π configuration.
  • the transmission line whose maximum length is 17.5 cm., is comprised of 14 sections.
  • The three sections adjacent to the glottis are each 1.0 cm. in length and have fixed areas approximately equal to 1.0 cm².
  • Each of the four sections nearest to the termination, the lip sections, is normally 1.0 cm. in length. However, two of these can be shortened to simulate the shortening of the human vocal tract which occurs at wide mouth openings. For this purpose, the inductors and capacitors in those sections are controlled separately. The seven remaining sections are all 1.5 cm. long.
  • The present invention thus provides a dynamically controllable electrical analog of the vocal tract capable of synthesizing speech sounds.
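The section lengths quoted here can be tallied against the stated maximum line length of 17.5 cm. and the count of 14 sections. The ordering from glottis to lips and the variable names are our own bookkeeping, not taken from the patent.

```python
# Section lengths in cm, listed from the glottis to the lips.
glottal_fixed_sections = [1.0] * 3   # fixed sections adjacent to the glottis
middle_sections = [1.5] * 7          # the seven remaining variable sections
lip_sections = [1.0] * 4             # lip sections at their normal length

section_lengths_cm = glottal_fixed_sections + middle_sections + lip_sections

total_sections = len(section_lengths_cm)
variable_sections = len(middle_sections) + len(lip_sections)
max_length_cm = sum(section_lengths_cm)

print(total_sections, "sections,", variable_sections, "variable,", max_length_cm, "cm")
```

The total is a maximum because two of the lip sections can be shortened for wide mouth openings.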
  • the acoustic transmission line between the glottis and lips in the human vocal tract is realized electrically by eleven electronically controlled variable LC sections plus three fixed sections.
  • Each variable capacitor is realized by means of Miller effect and the inductors are saturable reactors. Both elements of each section are controlled by a single voltage in a manner that constrains the LC product to remain constant, and varies the ratio L/C to simulate changes in effective cross sectional area.
  • The analog is excited by a buzz source to simulate the glottal tone, by a noise source inserted at various distances from the glottis to simulate the noise of turbulence, or by both together.
  • vowels and consonants are produced with equal facility by the dynamic analog.
  • An object of the present invention is to provide a dynamic analog speech synthesizer.
  • a further object of the present invention is to provide a dynamically controllable electrical analog of the vocal tract.
  • FIG. 1 shows a block diagram of the dynamic analog speech synthesizer
  • FIG. 2 shows the nasal circuit to simulate the nasal cavity
  • FIG. 3 shows the geometrical shape simulated by the electrical analog of the vocal tract
  • FIG. 4 shows a block diagram of a typical variable LC section
  • FIG. 5 shows schematically an amplifier and fixed capacitor utilized to realize the Miller effect.
  • FIG. 1 shows the dynamic analog speech synthesizer; it is comprised of a lumped transmission line, sources of excitation for said line, and apparatus for controlling the configuration of said line and its excitation.
  • The source of excitation, in this instance, is a set of control signals which may be derived from a compatible speech analyzer. These control signals are fed to the input terminals of said speech synthesizer.
  • Input terminal 1 is adapted to receive a control signal representing buzz intensity; 2, pitch frequency; 3, time of click; 4, noise intensity; 5, noise spectrum; 6a-6m, location of noise and click insertion; 7a-7m, control voltages for each variable section of said lumped transmission line; 8, the degree of nasal coupling; and 9, relative contribution of the nasal circuit.
  • Buzz generator 11 receives a control signal representing pitch frequency from terminal 2, thereby generating the desired buzz or pitch frequency; this buzz frequency signal is fed to variable gain amplifier 12, which also receives a signal representing buzz intensity from terminal 1.
  • Variable gain amplifier 12 produces an output signal which simulates the voicing action of the larynx. This is fed to a summing network 13 which simultaneously receives a bias from bias generator 10.
  • Summing network 13 delivers to input 18 of transmission line 20 a signal for voiced sounds.
  • Transmission line 20 is excited in this manner by glottal current pulses of variable frequency and amplitude.
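The voiced excitation path (buzz generator, variable gain amplifier, bias, summing network) can be mimicked numerically. This is only a sketch: the rising-ramp shape of the saw-tooth, the function name, and the parameter names are assumptions, since the text specifies a quasi-periodic saw-tooth but not its exact waveform.

```python
def glottal_current(t_s, pitch_hz, buzz_gain, bias=0.0):
    """Saw-tooth source sample at time t_s (seconds).

    pitch_hz  plays the role of the pitch-frequency control (terminal 2),
    buzz_gain plays the role of the buzz-intensity control (terminal 1),
    bias      plays the role of the bias added in the summing network.
    """
    phase = (t_s * pitch_hz) % 1.0      # position within the current pitch period
    return bias + buzz_gain * phase     # rising ramp that resets every period


# One pitch period of a 100 cps buzz, sampled at 10 kHz.
samples = [glottal_current(n / 10_000, 100.0, 1.0) for n in range(100)]
print(samples[:5], "...", samples[-1])
```

Changing `pitch_hz` and `buzz_gain` from sample to sample gives the variable frequency and amplitude that the control signals are meant to provide.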
  • Transmission line 20 represents the air in the acoustic path between the glottis and the lips and is comprised of three fixed LC sections 24-26 and eleven electronically variable LC sections 27-37. The fixed sections represent lower portion of the vocal tract where variation in cross sectional area is small and non-critical.
  • Fixed section 24 is shown schematically. Fixed sections 25 and 26 are shown in block form.
  • Click generator 14 receives a signal representing time of click from input terminal 3. Click generator 14 then generates clicks in response thereto which are fed to summing network 17.
  • Noise generator 15 and its associated filters receive a signal representing noise spectrum from input terminal 5, thereby generating a noise signal at the output, which is then fed to variable gain amplifier 16.
  • Gain amplifier 16 also receives a noise intensity signal from input terminal 4. The resulting noise signal from gain amplifier 16 is also fed to summing network 17.
  • Summing network 17 produces an output signal containing noise and click.
  • the clicks or noise correspond to sudden pressure releases or to turbulent air flow in the vocal tract and are utilized for fricatives and stops. This noise and click signal from summing network 17 is inserted in each LC section of transmission line 20 corresponding to the place where turbulence is produced.
  • Noise and click input circuit 21 is shown and described for LC section 27 and is identical for all LC sections.
  • a control signal is available which provides the information to operate relay 23. This signal determines noise and click insertion into transmission line 20 at LC section 27.
  • the noise and click signal from summing network 17 is fed to transformer 22, which serves to transmit the control signal from terminal 6a to relay 23.
  • When relay 23 receives a signal from input terminal 6a by way of transformer 22, the noise and click signal is inserted into LC section 27 by way of relay 23. In this manner, the noise and click signal may be inserted into any given one of variable LC sections 27-37, each being associated with its corresponding noise and click input circuit.
  • Nasal circuit 38 is provided and it represents the acoustic path through the nose.
  • a nasal coupling control voltage is provided from terminal 8 and it is fed to terminal 39 of nasal coupler 40, thereby controlling the degree of nasal coupling from transmission line 20 by way of lines 43 and 44.
  • Nasal circuit 38 provides the necessary reactive network to supply a nasal analog and is connected to terminals 41 and 42 of nasal coupler 40.
  • Nasal circuit 38 provides a nasal output signal to variable gain amplifier 45.
  • Terminal 9 provides a relative contribution control signal to variable gain amplifiers 45 and 46.
  • Variable gain amplifier 45 thereby provides at its output a nasal analog signal with its amplitude so controlled as to supply a nasal signal of the proper relative proportion. This nasal signal output from amplifier 45 is fed to summing network 47.
  • Nasal coupler 40 is comprised of a saturable inductor controlled by a current driver tube. Coupler 40 is connected to transmission line 20 of FIG. 1 by way of lines 43 and 44, which provide the nasal circuit connection in the transmission line analog of the vocal tract. Coupler 40 also receives a signal from terminal 8 to control the degree of nasal coupling by way of the aforesaid current driver tube and its associated saturable inductor.
  • Nasal circuit 38 is connected to coupler 40 by way of terminals 41 and 42. The controlled nasal signal is then fed from nasal circuit 38 to variable gain amplifier 45 of FIG. 1.
  • Transmission line 20 is comprised of 14 LC sections connected in series. Sections 24-26 are fixed sections and 27-37 are variable. Transmission line 20 simulates electrically an acoustic transmission line such as shown in FIG. 3. Since the vocal tract is considered an acoustic tube of variable cross sectional area, terminated by the glottis at one end and the lips at the other, transmission line 20 represents the entire length of the vocal tract. Fixed sections 24-26 are analogous to the lower portion of the vocal tract, where variations in cross sectional area are assumed to be small and non-critical. Variable sections 27-37 represent sections of the vocal tract whose cross sectional area may be varied.
  • Each of the variable sections is continuously variable through a range of values corresponding to the range of variation of the dimensions of the human vocal tract.
  • Each of the LC sections 27-37 is varied by its own control voltage, which is received from terminals 7a-7m respectively. Both the L and C of each section are controlled by a common voltage in such manner that the length of that section, proportional to $(LC)^{1/2}$, remains constant as its area, proportional to $(C/L)^{1/2}$, changes in response to the control voltage.
  • transmission line 20 is excited at terminal 18 by glottal current pulses of variable frequency and amplitude.
  • click and noise generators 14 and 16 are inserted in series with the transmission line inductor corresponding to the place where turbulence is produced.
  • The nasal cavity is approximated by insertion of nasal circuit 38 into transmission line 20 by way of nasal coupler 40 and lines 43 and 44.
  • Transmission line 20 is terminated by radiation impedance 60, which approximates the radiation impedance seen from the mouth. It is comprised of inductance 61 and 62 in series. The output is taken from point 63. Radiation impedance 60 receives its control signal from terminal 59.
  • Transmission line 20 is comprised of fourteen sections.
  • Three fixed LC sections 24-26 are identical and are of conventional LC configuration.
  • Eleven sections 27-37 are identical, each section being arranged so that its capacitance and inductance are variable by a single control voltage.
  • Sections 24-37 are cascaded.
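A cascade of this kind can be checked numerically by multiplying the chain (ABCD) matrices of the sections. The sketch below is an idealization under stated assumptions: lossless T-sections in acoustic CGS units, a uniform 1.0 cm² tube, an ideal current source at the glottis, and a short circuit at the lips in place of the radiation inductance. The constants `RHO` and `C_SOUND` and all function names are ours.

```python
import math

RHO = 1.14e-3     # assumed density of air, g/cm^3
C_SOUND = 3.5e4   # assumed speed of sound, cm/s


def t_section_abcd(area_cm2, length_cm, freq_hz):
    """Chain matrix (A, B, C, D) of a lossless T-section: L/2, shunt C, L/2."""
    w = 2 * math.pi * freq_hz
    z_half = 1j * w * (RHO * length_cm / area_cm2) / 2        # each series arm
    y = 1j * w * area_cm2 * length_cm / (RHO * C_SOUND ** 2)  # shunt admittance
    return (1 + z_half * y, z_half * (2 + z_half * y), y, 1 + z_half * y)


def cascade(m1, m2):
    """Product of two 2x2 chain matrices stored as (A, B, C, D)."""
    a1, b1, c1, d1 = m1
    a2, b2, c2, d2 = m2
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2)


def line_abcd(areas_cm2, lengths_cm, freq_hz):
    """Chain matrix of the whole line, glottis end first."""
    total = (1, 0, 0, 1)
    for area, length in zip(areas_cm2, lengths_cm):
        total = cascade(total, t_section_abcd(area, length, freq_hz))
    return total


def first_resonance_hz(areas_cm2, lengths_cm, f_lo=100, f_hi=1000):
    """Frequency on a 1 Hz grid where |D| is smallest.

    With the lips shorted, lip current / glottal current = 1 / D,
    so the smallest |D| marks the first peak of the transfer function.
    """
    return min(range(f_lo, f_hi + 1),
               key=lambda f: abs(line_abcd(areas_cm2, lengths_cm, f)[3]))


lengths = [1.0] * 3 + [1.5] * 7 + [1.0] * 4   # 14 sections, 17.5 cm in all
areas = [1.0] * 14                            # uniform tube
f1 = first_resonance_hz(areas, lengths)
print("first resonance of the uniform 14-section line:", f1, "Hz")
```

For these assumed constants the quarter-wavelength resonance of an ideal 17.5 cm tube is c/(4l) = 500 cps, and the 14-section lumped line should land very close to it; replacing `areas` with a non-uniform profile moves the resonances, which is how the analog produces different vowels.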
  • the typical variable section is illustrated by section 27 which is comprised of variable inductance circuit 50, variable capacitance circuit 49, and control amplifier 48.
  • the variability of capacitor circuit 49 is realized by the Miller effect, and the variability of inductance circuit 50 is achieved by utilizing saturable inductors. Both elements 49 and 50 are controlled by a single voltage by way of control amplifier 48 which in turn receives the control signal from terminal 7a.
  • Capacitor circuit 49 receives its control voltage directly from amplifier 48, whereas inductance circuit 50 receives it from amplifier 48 by way of summing network 51. Summing network 51 simultaneously receives an input signal from the preceding section.
  • Control amplifier 48 also delivers a signal to the following section.
  • FIG. 4 shows in detail a block diagram of a typical variable section such as 27.
  • Each section of the dynamic analog corresponds to a given portion of the human vocal tract.
  • the length of each section of the analog remains constant, independent of area changes.
  • the requirement of constant length for each section implies that the LC product be constrained to remain constant for each section.
  • A range of 100:1 in area is utilized, so both the inductance and the capacitance of each section have a 100:1 range.
  • the Miller effect depends upon an amplifier to magnify the capacitance of a fixed physical capacitor.
  • The variable capacitance realized by means of the Miller effect is shown schematically in FIG. 5.
  • The amplifier impedance at terminals 1 is $Z_{in} = \dfrac{R_0 + 1/(j\omega C)}{1 + K e^{j\theta}}$.
  • In the useful band of the amplifier, $R_0$ and $\theta$ are made small so that $Z_{in} \approx \dfrac{1}{j\omega C\,(1 + K)}$. Thus, an apparent variable capacitance is seen at the input of the amplifier at terminals 1: the capacitance of the fixed capacitor magnified by one plus the gain of the amplifier.
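The magnification of the fixed capacitor by one plus the gain can be checked numerically. The impedance expression used below is the standard Miller-effect result for an inverting amplifier of gain magnitude K, phase error θ, and output resistance R0; treating it as the patent's own expression is an assumption, as are the function names and the example component values.

```python
import cmath
import math


def miller_input_impedance(c_fixed, gain, freq_hz, r_out=0.0, phase=0.0):
    """Input impedance with c_fixed connected from amplifier output to input."""
    w = 2 * math.pi * freq_hz
    return (r_out + 1 / (1j * w * c_fixed)) / (1 + gain * cmath.exp(1j * phase))


def apparent_capacitance(c_fixed, gain):
    """Ideal apparent capacitance: the fixed capacitor times (1 + gain)."""
    return (1 + gain) * c_fixed


# Hypothetical values: a 1000 pF fixed capacitor and an attenuator-set gain of 9.
c_fixed = 1000e-12
gain = 9.0
freq_hz = 1000.0

z_in = miller_input_impedance(c_fixed, gain, freq_hz)
c_seen = -1 / (2 * math.pi * freq_hz * z_in.imag)   # capacitance implied by Z_in

print("ideal apparent capacitance:", apparent_capacitance(c_fixed, gain))
print("capacitance implied by Z_in:", c_seen)

# Sweeping the gain from 0 to 99 would cover a 100:1 capacitance range.
c_ratio = apparent_capacitance(c_fixed, 99.0) / apparent_capacitance(c_fixed, 0.0)
```

Passing a non-zero `r_out` or `phase` shows how far the input departs from a pure capacitance when those quantities are not kept small.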
  • variable capacitance circuit 49 is shown.
  • the Miller effect amplifier is comprised of input amplifier 1, electronic attenuator 2, and low impedance output amplifier 3.
  • Fixed capacitor 4 is connected between the output of amplifier 3 and the input of amplifier 1.
  • Attenuator 2 receives a control voltage VCC from terminal 5.
  • VCC varies the gain of attenuator 2 so that the amplitude of the output signal from amplifier 3 is varied.
  • the apparent capacitance across terminal 6 is varied in accordance with control voltage VCC.
  • Variable inductance circuit 50 is also shown in FIG. 4.
  • the variable inductance is realized as saturable inductor 7.
  • a saturable inductor has a magnetic circuit consisting of three portions or legs arranged in a figure eight.
  • Control winding 8 is wound on the inner leg producing a control flux whose paths are the outer legs.
  • Signal winding 9 is split into two halves, each placed on an outer leg and connected in series aiding; as a result, the mutual inductances between each half of signal winding 9 and control winding 8 have opposite polarities and hence the net coupling between the signal and control winding is very small.
  • Control flux through the magnetic material of the outer legs varies the incremental permeability of that material through values which lie between those of the demagnetized and fully saturated states.
  • the signal winding inductance is proportional to the permeability of the core.
  • The inductance of signal winding 9 is not an exponential function of the current through control winding 8.
  • Control current is, therefore, obtained from a non-linear circuit comprising non-linear cathode feedback 10 and current driver 12 such that the inductance of the inductor signal winding is an exponential function of VCL, the control input to variable inductance circuit 50.
  • VCL is obtained from terminal 13 and is fed to summing network 11 which simultaneously receives the output signal from non-linear cathode feedback 10.
  • Current driver 12 has a high output impedance.
  • a triode is utilized with a non-linear cathode resistor to provide the feedback.
  • the output signal from network 11 is fed to current driver 12.
  • the output signal from current driver 12 is fed simultaneously to control winding 8 of saturable inductor 7 and non-linear cathode feedback 10.
  • The current in control winding 8 controls the degree of saturation of the core material in the flux path of signal winding 9.
  • the inductance appearing between terminals 14 of signal winding 9 is therefore a function of the control winding current.
  • the required variable inductance appears between terminals 14 in response to variation of control voltage VCL.
  • The variable capacitance appears between terminals 6. Its value is determined by VCC, the capacitor control voltage.
  • The variable inductance, whose value is determined by VCL, the inductor control voltage, appears across terminals 14.
  • The control characteristics of both the variable capacitance circuit and the variable inductance circuit are exponential.
  • Control amplifier 48 maintains a linear relation between VCL and VCC so that the LC product remains constant. Both VCC and VCL are determined by VCS, the control input voltage which determines the section area. Thus, the length of a variable LC section remains constant as its area changes exponentially over a 100:1 range.
  • A typical single variable LC section as shown in FIG. 4 is comprised of variable capacitance circuit 49 and variable inductance circuit 50 connected to control amplifier 48, so that VCC and VCL are related and both are controlled by VCS.
  • the form of restraint depends on the form of the control characteristics of inductor 50 and capacitor 49. The control characteristics make the value of each element an exponential function of its control voltage, requiring increments in the two control voltages to have a constant ratio and to be of opposite sign. To realize this voltage restraint, inverter 29 was utilized.
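The opposite-sign, constant-ratio constraint on the two control voltages can be exercised numerically. Everything specific in this sketch is hypothetical: the exponential forms, the constants `A_C`, `B_C`, `A_L`, `B_L`, `D`, and the function names are illustrative stand-ins for characteristics the text describes only qualitatively.

```python
import math

# Hypothetical constants for the exponential control characteristics.
A_C, B_C = 1.0, 1.0
A_L, B_L = 1.0, 2.0
D = 0.0


def capacitance(v_cc):
    """Assumed capacitor characteristic: exponential in its control voltage."""
    return A_C * math.exp(B_C * v_cc)


def inductance(v_cl):
    """Assumed inductor characteristic: exponential in its control voltage."""
    return A_L * math.exp(B_L * v_cl)


def inductor_control(v_cc):
    """Linear relation between the control voltages (increments of opposite sign)."""
    return D - (B_C / B_L) * v_cc


# Sweep the capacitor control voltage far enough to cover a 100:1 range.
sweep = [math.log(100.0) * k / 10 for k in range(11)]
products = [capacitance(v) * inductance(inductor_control(v)) for v in sweep]
areas = [math.sqrt(capacitance(v) / inductance(inductor_control(v))) for v in sweep]

print("LC product at each step:", [round(p, 6) for p in products])
print("relative area at each step:", [round(a, 3) for a in areas])
```

With these constants each increment of VCC is matched by an increment of VCL half as large and of opposite sign, which is the role the inverter plays in the control amplifier.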
  • the area of this typical variable LC section is determined by the voltage at the control terminal 18.
  • Amplifier 16 and cathode follower 17 preceding control terminal 18 function as a voltage generator controlled by the voltage, VCS, at terminal 7.
  • Signal control path in control amplifier 48 is provided by summing network 15 receiving simultaneously area control signal, VCS, supplied from terminal 7 and feedback signal from terminal 18.
  • Summing network 15 delivers its signal to cathode follower 17 by way of amplifier 16; thus network 15, amplifier 16, and cathode follower 17 function as a feedback amplifier.
  • Terminal 53 delivers a signal to the summing network of the following LC section.
  • the signal voltage present at terminal 18 is then utilized as the control voltage, VCC, for variable capacitance circuit 49.
  • VCC is fed to summing network 51.
  • Summing network 51 also receives a control signal from terminal 21 which is adapted to receive signal VCC, from the preceding LC section.
  • The output of summing network 51 is fed to inverter 29, which changes the sign at the output, thus providing the control signal, VCL, for variable inductance circuit 50.
  • the output signal from transmission line 20 is fed to radiation impedance 60.
  • Radiation impedance 60 approximates the radiation impedance seen from the mouth.
  • the output is the voltage across small inductor 61.
  • the current through the radiation impedance 60 is treated as the strength (volume velocity) of a simple source and hence the voltage across inductor 62 behaves as the sound pressure at some distance from the source.
  • the output from inductance 62 is fed to variable gain amplifier 46.
  • Amplifier 46 simultaneously receives a signal from terminal 9 representative of the relative contribution from radiation impedance 60.
  • Summing network 47 receives simultaneously two signals, one from variable gain amplifier 46 and the other from variable gain amplifier 45. Output terminal 64 is thus provided a signal from network 47 which may be utilized to operate a loudspeaker for the reproduction of audio.
  • a speech synthesizer comprising lumped transmission line circuitry having a series of sections, each of said sections including capacitance and inductance elements, control means for said lumped transmission line circuitry, said control means being constituted by an additional section of circuitry including signal summation components simulating sound-producing portions of the human anatomy and electrically connected in current-feeding relationship to said lumped transmission line circuitry to control continuously throughout the sound-producing cycle, the electrical value of said capacitance and inductance elements while retaining the product, the electrical value of said capacitance and inductance elements constant, means including inductance connected to the output of said lumped transmission line circuitry to reproduce said sound producing cycle, means to excite said lumped transmission line circuitry for varied audio sounds with an electrical signal of variable frequency and amplitude characteristics patterned upon the sound waves of the human vocal tract.
  • a speech synthesizer comprising lumped transmission line circuitry having a series of fourteen sections, each of said sections including capacitance and inductance elements, the first three of said sections having a fixed capacitance and inductance electrical value, the next eleven sections having said capacitance and inductance electrical value variable, control means for said lumped transmission line circuitry, said control means being constituted by an additional section of circuitry including signal summation components simulating sound-producing portions of the human anatomy and electrically connected in current-feeding relationship to said lumped transmission line circuitry to control, continuously throughout the sound-producing cycle, the electrical value of said variable capacitance and inductance elements while retaining the electrical value of the product of said capacitance and inductance elements constant, said control means receiving a first electrical signal representative of the changing articulatory configurations of said human anatomy occurring during periods of said sound-producing cycle, a reactive circuit coupled into said lumped transmission line circuitry between said fourth and fifth section, said reactive circuit receiving a second electrical signal representative of the charging nasal
  • a speech synthesizer comprising lumped transmission line circuitry having a series of sections, each of said sections including capacitance and inductance elements, control means for said lumped transmission line circuitry, said control means being constituted by an additional section of circuitry including signal summation components simulating sound-producing portions of the human anatomy and electrically connected in current feeding relationship to said lumped transmission line circuitry to control, continuously throughout the sound-producing cycle, the electrical value of said capacitance and inductance elements while retaining the electrical value of the product thereof constant, means to excite said lumped transmission circuitry at the input thereof with a first electrical signal representative of the glottal tone of said human anatomy occurring during said sound-producing cycle, means to inject into said lumped transmission line circuitry by way of the input of each of said sections having said capacitance and inductance elements whose electrical value is being controlled a second electrical signal representative of the turbulence characteristics of the sound waves of the vocal tract of said human anatomy occurring during said sound-producing cycle, means for controlling which of said
  • a speech synthesizer comprising lumped transmission line circuitry having a series of sections, each of said sections including capacitance and inductance elements, control means for said lumped transmission line circuitry, said control means being constituted by an additional section of circuitry including signal summation components simulating sound-producing portions of the human anatomy and electrically connected in current feeding relationship to said lumped transmission line circuitry to control, continuously throughout the sound-producing cycle, the electrical value of said capacitance and inductance while retaining the electrical value of the product thereof constant, reactive means coupled into said transmission line circuitry by way of the input of the first of said sections having said capacitance and inductance controlled, said reactive means receiving a first electrical signal representative of the nasal sound components of said human anatomy occurring during said sound-producing cycle, means to inject a second electrical signal into said lumped transmission line circuitry by way of the input of each of said sections whose capacitance and inductance is being controlled, said second electrical signal being representative of the turbulence sound components of said human anatomy occurring during said sound-producing cycle,
  • a speech synthesizing system comprising lumped transmission circuitry having a series of sections, each of said sections including capacitance and inductance elements, control means for said lumped transmission circuitry, said control means being constituted by an additional section of circuitry including signal summation components simulating the human vocal tract, and electrically connected in current-feeding relationship to said lumped transmission line circuitry to continuously and automatically vary the parameters of said transmission line in accordance with electrical control signals representative of the physical variations of the human vocal tract during periods of continuous, uninterrupted speech, and means to combine the output electrical signals from said lumped transmission line circuitry to reproduce said continuous, uninterrupted speech.
  • a speech synthesizing system comprising lumped transmission line circuitry having a series of sections, each of said sections including capacitance and inductance elements, control means for said lumped transmission line circuitry, said control means being constituted by an additional section of circuitry including signal summation components simulating the vocal tract, and electrically connected in current-feeding relationship to said lumped transmission circuitry to control continuously and automatically the parameters of said transmission line in accordance with a first electrical control signal representative of physical variations in a human vocal tract during periods of continuous uninterrupted speech, means also having electrical connection to said transmission line to excite said transmission line with a second electrical control signal representative of the glottal tone of said vocal tract during said periods of said continuous uninterrupted speech, means including additional current-feeding means connected with said transmission line circuitry to inject a third electrical control signal into said line by the way of the inputs of said sections, said third electrical signal being representative of turbulent sounds emanating from said vocal tract during said periods of continuous, uninterrupted speech, means to control which of said sections receives said injected third electrical signal, and means
  • a speech synthesizing system comprising lumped transmission line circuitry having a series of sections, each of said sections including capacitance and inductance elements, control means for said transmission line circuitry, said control means being constituted by an additional section of circuitry including signal summation components simulating the human vocal tract, and electrically connected in current-feeding relationship to said lumped transmission line circuitry to control continuously and automatically the parameters of said transmission line in accordance with a first electrical control signal representative of physical variations in said human vocal tract during periods of continuous uninterrupted speech, means to excite said transmission line at the input thereof with a second electrical control signal representative of the glottal tone of said vocal tract occurring during said periods of said continuous uninterrupted speech, means including a relay to inject a third electrical control signal into said transmission line by way of the inputs of said sections, said third electrical signal being representative of turbulent sounds emanating from said vocal tract occurring during said periods of continuous, uninterrupted speech, means to control which of said sections receives said injected third electrical signal, and means to combine the output electrical signals from said transmission line in
  • a speech synthesizing system comprising lumped transmission line circuitry having a series of sections including capacitance and inductance elements, control means for said lumped transmission line circuitry, said control means being constituted by an additional section of circuitry including signal summation components simulating the vocal tract, and electrically connected in current-feeding relationship to said lumped transmission circuitry to control continuously the parameters of said transmission line in accordance with a first electrical control signal representative of the physical variations of the human vocal tract during periods of continuous, uninterrupted speech, means to excite said transmission by applying at the input thereof a second electrical control signal representative of the glottal tones of said vocal tract occurring during said periods of said continuous, uninterrupted speech, first means to inject into said transmission line by the way of the inputs of said sections a third electrical signal representative of turbulent noises emanating from said vocal tract occurring during said periods of said continuous, uninterrupted speech, means to control which of said sections receives said injected third electrical signal, second means, including reactance, for injecting into said transmission line by the way
  • a dynamic analog speech synthesizer comprising lumped transmission line circuitry having a series of sections, each of said sections including capacitance and inductance elements, control means for said lumped transmission circuitry, said control means being constituted by an additional section of circuitry including signal summation components simulating the human vocal tract, and electrically connected in current-feeding relationship to said lumped transmission line circuitry to control continuously the parameters of said lumped transmission line circuitry, said control means receiving a first electrical control signal representative of the physical variations of the human vocal tract during periods of continuous, uninterrupted speech, a buzz generator connected to the input of said transmission line, said buzz generator controlled by a second electrical signal representative of the glottal tone of said vocal tract occurring during said periods of said continuous, uninterrupted speech, means to inject an electrical noise signal into said transmission line by way of the inputs of said sections, said electrical noise signal being representative of turbulence existing in said vocal tract occurring during said periods of said continuous, uninterrupted speech, relay means to control which of said sections receives said injected electrical noise signal, and

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)

Description

July 3, 1962. G. ROSEN. DYNAMIC ANALOG SPEECH SYNTHESIZER. Filed Aug. 25, 1958. 3 Sheets.
United States Patent Office 3,042,748 DYNAMIC ANALOG SPEECH SYNTHESIZER George Rosen, Philadelphia, Pa., assignor to the United States of America as represented by the Secretary of the Air Force. Filed Aug. 25, 1958, Ser. No. 757,170. 9 Claims. (Cl. 179-1)
This invention relates to a system for bandwidth compression of speech which includes a speech analyzer and synthesizer, and more particularly to a dynamic analog speech synthesizer.
Speech compression systems are comprised of three parts: an analyzer, a communication channel such as a telephone line or radio link, and a synthesizer. The analyzer accepts a speech wave and delivers to the channel a set of signals which are a coded representation of the important attributes of the input speech and which require little channel capacity for their transmission. The channel delivers the signals to the synthesizer, which treats them as instructions for constructing a replica equivalent by some criterion to the speech at the input to the system. Compression systems such as the vocoder in Patent No. 2,243,527, May 27, 1941, to Homer W. Dudley, have produced natural-sounding speech but require a channel of relatively high capacity.
Formant-coding compression systems require much less channel capacity. Examination of speech spectra and of the mechanism of vowel production suggests how the ensemble of spectra may be reduced. The vowels are characterized by formants, of which the lowest two or three are sufficient to specify the vowel both acoustically and perceptually. Specification of a vowel spectrum by the frequencies of the first three formants rather than by, say, the amplitudes at 16 points along the frequency scale is a much more efficient representation. If, for example, each formant frequency were specified to within 100 cps, the total number of vowel spectra would be about 2^13, assuming the over-all amplitude of the vowel to be quantized in eight steps. Thus, a system in which formant frequencies were coded would have a distinct advantage over a vocoder system, as far as channel capacity requirements are concerned.
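The order of magnitude of this vowel-spectrum count can be checked with a little arithmetic. The sketch below assumes, purely for illustration, that each of the three formants spans roughly 1000 cps, so that a 100 cps resolution gives about ten distinguishable values per formant; that span is not stated in the text.

```python
import math

# Assumed (not from the text): each formant spans about 1000 cps,
# so a 100 cps resolution gives about 10 distinguishable values.
FORMANT_SPAN_CPS = 1000
RESOLUTION_CPS = 100
NUM_FORMANTS = 3
AMPLITUDE_STEPS = 8  # from the text: over-all amplitude in eight steps

values_per_formant = FORMANT_SPAN_CPS // RESOLUTION_CPS
num_spectra = values_per_formant ** NUM_FORMANTS * AMPLITUDE_STEPS
bits_per_spectrum = math.ceil(math.log2(num_spectra))

print(num_spectra, bits_per_spectrum)
```

Under these assumptions the count is 8000 spectra, which needs 13 bits per spectrum.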
The present invention provides a speech synthesizer for a formant coding system which is a dynamically controllable electrical analog of the vocal tract. It synthesizes sequences of speech sounds so that they are highly intelligible and natural in contrast to previous speech synthesizers.
A speech synthesizer which is a dynamic analog of the vocal tract also has utility in studies of speech production. Certain classes of consonants such as stops, semi-vowels, and affricates exist only by virtue of articulatory changes. Analog studies of these consonants in particular and of connected speech in general require dynamically variable speech synthesizers. The ability of a variable analog to synthesize syllables is helpful in the study of sounds such as nasals and fricatives which can exist in the steady state. The dynamically variable feature is helpful both because formant transitions may be studied and because synthetic stimuli can be presented in context.
The vocal tract is treated as an acoustic tube whose cross sectional dimensions are small as compared with a wavelength, i.e., there are no cross modes of vibration in the frequency range of interest. The cross sectional area of the tube is a function of distance along its axis. This smooth tube is approximated in shape by a cascade of cylinders each of which has a uniform cross sectional area. The analog of the acoustical system is an electrical system wherein voltage is identified with pressure and current is identified with volume velocity. Thus, each cylinder is represented by a T-section whose series elements are inductances (L) and whose shunt element is a capacitance (C). The relations between acoustic and electrical quantities are such that the length of each cylinder is proportional to (LC)^(1/2) and the cross sectional area is proportional to (C/L)^(1/2).
The dynamic analog to be described is terminated by an inductor which approximates the radiation impedance seen from the mouth. The output is the voltage across a small inductor in series with the radiation impedance. The current through the radiation impedance is treated as the strength (volume velocity) of a simple source and hence the voltage across the small inductor behaves as the sound pressure at some distance from that source.
All the section areas of the dynamic analog can be varied together by means of externally generated electrical control voltages to simulate the changing articulatory configurations of the human vocal tract. When buzz and noise excitation are synchronized with articulatory changes, the synthesizer will produce connected speech.
In accordance with the present invention, the dynamic analog of the vocal tract (DAVO) speech synthesizer is a transmission-line analog of the human vocal tract. The vocal tract is considered as an acoustic tube of variable cross sectional area, terminated by the glottis at one end and the lips at the other. The acoustic transmission line is simulated electrically by a number of variable LC sections, each of which represents a short length of the vocal tract. The dynamic analog is comprised of a number of fixed and variable LC sections. The inductance and capacitance in each of the variable sections are continuously variable through a range of values corresponding to the range of variation of the dimensions of the human vocal tract. The electrical transmission line receives excitation from a quasi-periodic saw-tooth current source at one end, representing the output of the vocal folds and/or from a noise source of suitable spectrum, inserted at an appropriate point along the transmission line to represent the turbulent noise that is created when the air flows through a constriction.
The dynamic analog is comprised of a lumped transmission line, sources of excitation for the line, and a device for controlling the configuration of the line and its excitation. Both the inductor and the capacitor of a given section of the line are controlled by a common control voltage in such manner that the length of that section, proportional to (LC)^(1/2), remains constant as its area, proportional to (C/L)^(1/2), changes in response to the control voltage. For voiced sounds, the transmission line is excited at the left by glottal current pulses of variable frequency and amplitude. For fricatives and other turbulent sounds, a noise voltage generator is inserted in series with the transmission line inductor corresponding to the place where the turbulence is produced. The noise amplitude is controlled and band-pass filtering is provided. The generator of click components of velar stops is inserted in the line in a like manner. The clicks or noise correspond to sounds generated by sudden pressure releases or by turbulent air flow in the vocal tract. A nasal analog is inserted in the dynamic vocal tract. The analog is comprised also of LC sections and operates to simulate nasal sounds. The nasal analog is coupled into the speech synthesizer by means of a nasal coupling device. The nasal analog represents the nasal cavity.
Each section of the aforementioned transmission line is comprised of three parts, a variable inductance, a variable capacitance circuit, and a control amplifier. The variable inductor utilizes changes in the incremental permeability of a ferromagnetic material and the variable capacitor utilizes changes in the magnitude of the Miller effect.
The area range for each section is 100:1 (0.17 cm^2 to 17 cm^2). Satisfactory vowels have been produced using only settings between 0.59 cm^2 and 17 cm^2, a ratio of 29:1, but it is desirable to have a full continuous range of 100:1 in the dynamic analog.
For each section, the length is fixed at (LC)^(1/2) and the area A = (1/k)(C/L)^(1/2). The impedance level k, which is the same for all sections, is an arbitrary parameter of the transmission line. The impedance level was chosen to be 0.77 4 (electrical ohm)1/cm.2 of area.
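The two relations for length and area determine both element values of a section once its area is chosen. The following is a minimal sketch of that inversion; the function name and the two numeric constants are hypothetical and chosen only for illustration, and they are not the values used in the patent.

```python
import math

def section_elements(area_cm2, sqrt_lc, k):
    """Return (L, C) for one section.

    sqrt_lc : the fixed value of (LC)^(1/2) that represents the section length
    k       : impedance level, defined by A = (1/k) * (C/L)^(1/2)
    """
    ratio = k * area_cm2             # (C/L)^(1/2)
    inductance = sqrt_lc / ratio     # L = (LC)^(1/2) / (C/L)^(1/2)
    capacitance = sqrt_lc * ratio    # C = (LC)^(1/2) * (C/L)^(1/2)
    return inductance, capacitance

K_EXAMPLE = 1.0e-4        # hypothetical impedance level, illustration only
SQRT_LC_EXAMPLE = 3.0e-5  # hypothetical value of (LC)^(1/2), illustration only

l_small, c_small = section_elements(0.17, SQRT_LC_EXAMPLE, K_EXAMPLE)
l_large, c_large = section_elements(17.0, SQRT_LC_EXAMPLE, K_EXAMPLE)
```

Going from the smallest to the largest area raises C by a factor of 100 and lowers L by the same factor, while the product LC is unchanged.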
Since each section of the dynamic analog corresponds to a given portion of the human vocal tract and since the length of each section of the analog remains constant, independent of the area changes, the requirement of constant length for each section implies that LC be constrained to remain constant for each section. In order to realize the 100:1 range in area both the inductance and the capacitance of each section have a 100:1 range.
The constraint (LC=constant) which exists between the inductance and the capacitance requires a constraint between their respective control voltages. The control characteristics are such as to make the value of each element an exponential function of its control voltage. The constraint is achieved by the control exercised on the variable capacitance and inductance. Since the variable capacitance appears between ground and the input of the capacitor amplifier, its value is determined by VCC, the capacitor control voltage. The variable inductance, whose value is determined by VCL, the inductor control voltage, appears across a signal winding of a saturable inductor. The control characteristics of both the variable capacitance circuit and the variable inductance circuit are exponential.
C = aC exp(bC·VCC) (1)
L = aL exp(bL·VCL) (2)
VCL = d - (bC/bL)·VCC (3)
where aC, bC, aL, bL, and d are constants. The control amplifier maintains a linear relation (Eq. 3) between VCL and VCC so that the LC product remains constant. Both VCC and VCL are determined by VCS, the control input voltage which determines the section area. Thus, the length of the section remains constant as its area changes exponentially over a 100:1 range. The exponential characteristic of the capacitor is achieved by proper design of the electronic attenuator.
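The effect of the linear relation between the two control voltages can be checked numerically. The sketch below assumes exponential control characteristics for both elements, as the text describes; all numeric constants are hypothetical and serve only to show that the LC product stays fixed while the area-like quantity (C/L)^(1/2) sweeps a 100:1 range.

```python
import math

# Hypothetical constants, chosen only for illustration.
A_C, B_C = 1.0e-9, 0.5   # C = A_C * exp(B_C * v_cc)
A_L, B_L = 1.0e-1, 0.25  # L = A_L * exp(B_L * v_cl)
D = 0.0                  # offset in the linear relation

def capacitance(v_cc):
    return A_C * math.exp(B_C * v_cc)

def inductor_control(v_cc):
    # Linear relation: VCL = d - (bC/bL) * VCC
    return D - (B_C / B_L) * v_cc

def inductance(v_cl):
    return A_L * math.exp(B_L * v_cl)

def section_state(v_cc):
    """Return (LC product, quantity proportional to area)."""
    c = capacitance(v_cc)
    l = inductance(inductor_control(v_cc))
    return l * c, math.sqrt(c / l)

v_span = math.log(100.0) / B_C  # control swing giving a 100:1 area range
lc_low, area_low = section_state(0.0)
lc_high, area_high = section_state(v_span)
```

With bC and bL of the same sign, an increase in VCC calls for a decrease in VCL, so the two control voltages move in opposite directions with a constant ratio.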
Each section is constructed as an L-section. A nonuniform line, however, can be approximated more closely by a given number of T-sections than by the same number of L-sections. It is also possible to utilize a π configuration.
The transmission line, whose maximum length is 17.5 cm., is comprised of 14 sections. The three sections adjacent to the glottis are each 1.0 cm. in length and have fixed areas approximately equal to 1.0 cm^2. Each of the four sections nearest to the termination, the lip sections, is normally 1.0 cm. in length. However, two of these can be shortened to simulate the shortening of the human vocal tract which occurs at wide mouth openings. For this purpose, the inductors and capacitors in those sections are controlled separately. The seven remaining sections are all 1.5 cm. long.
The present invention thus provides a dynamically controllable electrical analog of the vocal tract capable of synthesizing speech sounds. The acoustic transmission line between the glottis and lips in the human vocal tract is realized electrically by eleven electronically controlled variable LC sections plus three fixed sections. Each variable capacitor is realized by means of Miller effect and the inductors are saturable reactors. Both elements of each section are controlled by a single voltage in a manner that constrains the LC product to remain constant, and varies the ratio L/C to simulate changes in effective cross sectional area. The analog is excited by a buzz source to simulate the glottal tone, by a noise source inserted at various distances from the glottis to simulate the noise of turbulence, or by both together. Thus, vowels and consonants are produced with equal facility by the dynamic analog.
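The section count and the maximum length quoted for the line follow directly from the individual section lengths. A short check, using only the figures given in the text (the variable names are illustrative):

```python
# Section lengths in cm, ordered from the glottis to the lips.
FIXED_GLOTTAL = [1.0] * 3  # three fixed sections adjacent to the glottis
MIDDLE = [1.5] * 7         # seven remaining variable sections
LIP = [1.0] * 4            # four sections nearest the termination

section_lengths_cm = FIXED_GLOTTAL + MIDDLE + LIP
total_length_cm = sum(section_lengths_cm)

print(len(section_lengths_cm), total_length_cm)  # prints: 14 17.5
```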
An object of the present invention is to provide a dynamic analog speech synthesizer.
A further object of the present invention is to provide a dynamically controllable electrical analog of the vocal tract.
The various objects and features of novelty which characterize the present invention will appear more fully from the following detailed description when read in conjunction with the attached drawings showing the preferred embodiment.
In the drawings:
FIG. 1 shows a block diagram of the dynamic analog speech synthesizer;
FIG. 2 shows the nasal circuit to simulate the nasal cavity;
FIG. 3 shows the geometrical shape simulated by the electrical analog of the vocal tract;
FIG. 4 shows a block diagram of a typical variable LC section; and
FIG. 5 shows schematically an amplifier and fixed capacitor utilized to realize the Miller effect.
Referring now in more detail to aforesaid preferred embodiment of my present invention with particular reference to the block diagram of FIG. 1 which shows the dynamic analog speech synthesizer, it is comprised of a lumped transmission line, sources of excitation for said line, and apparatus for controlling the configuration of said line and its excitation. The source of excitation, in this instance, is control signals which may be derived from a compatible speech analyzer. These control signals are fed to the input terminals of said speech synthesizer. Input terminal 1 is adapted to receive a control signal representing buzz intensity; 2, pitch frequency; 3, time of click; 4, noise intensity; 5, noise spectrum; 6a-6m, location of noise and click insertion; 7a-7m, control voltages for each variable section of said lumped transmission line; 8, the degree of nasal coupling; and 9, relative contribution of the nasal circuit.
Buzz generator 11 receives a control signal representing pitch frequency from terminal 2, thereby generating the desired buzz or pitch frequency. This buzz frequency signal is fed to variable gain amplifier 12 which also receives a signal representing buzz intensity from terminal 1. Variable gain amplifier 12 produces an output signal which simulates the voicing action of the larynx. This is fed to a summing network 13 which simultaneously receives a bias from bias generator 10. Summing network 13 delivers to input 18 of transmission line 20 a signal for voiced sounds. Transmission line 20 is excited in this manner by glottal current pulses of variable frequency and amplitude. Transmission line 20 represents the air in the acoustic path between the glottis and the lips and is comprised of three fixed LC sections 24-26 and eleven electronically variable LC sections 27-37. The fixed sections represent the lower portion of the vocal tract where variation in cross sectional area is small and non-critical. Fixed section 24 is shown schematically. Fixed sections 25 and 26 are shown in block form.
Click generator 14 receives a signal representing time of click from input terminal 3. Click generator 14 then generates clicks in response thereto which are fed to summing network 17. Noise generator 15 and its associated filters receive a signal representing noise spectrum from input terminal 5, thereby generating noise signals at its output which are then fed to variable gain amplifier 16. Gain amplifier 16 also receives a noise intensity signal from input terminal 4. The resulting noise signal from gain amplifier 16 is also fed to summing network 17. Summing network 17 produces an output signal containing noise and click. The clicks or noise correspond to sudden pressure releases or to turbulent air flow in the vocal tract and are utilized for fricatives and stops. This noise and click signal from summing network 17 is inserted in each LC section of transmission line 20 corresponding to the place where turbulence is produced. Noise and click input circuit 21 is shown and described for LC section 27 and is identical for all LC sections. From input terminal 6a, a control signal is available which provides the information to operate relay 23. This signal determines noise and click insertion into transmission line 20 at LC section 27. The noise and click signal from summing network 17 is fed to transformer 22, which serves to transmit the control signal from terminal 6a to relay 23. When relay 23 receives a signal from input terminal 6a by way of transformer 22, the noise and click signal is inserted into LC section 27 by way of relay 23. In this manner, the noise and click signal may be inserted into any given one of variable LC sections 27-37, each being associated with its corresponding noise and click input circuit.
Nasal circuit 38 is provided and it represents the acoustic path through the nose. A nasal coupling control voltage is provided from terminal 8 and it is fed to terminal 39 of nasal coupler 40, thereby controlling the degree of nasal coupling from transmission line 20 by way of lines 43 and 44. Nasal circuit 38 provides the necessary reactive network to supply a nasal analog and is connected to terminals 41 and 42 of nasal coupler 40. Nasal circuit 38 provides a nasal output signal to variable gain amplifier 45. Terminal 9 provides a relative contribution control signal to variable gain amplifiers 45 and 46. Variable gain amplifier 45 thereby provides at its output a nasal analog signal with its amplitude so controlled as to supply a nasal signal of the proper relative proportion. This nasal signal output from amplifier 45 is fed to summing network 47.
Now referring to FIG. 2 to describe in detail the circuitry of the nasal circuit of FIG. 1, there is shown nasal coupler 40 which is comprised of a saturable inductor controlled with a current driver tube. Coupler 40 is connected to transmission line 20 of FIG. 1 by way of lines 43 and 44, which provide the nasal circuit connection in the transmission line analog of the vocal tract. Coupler 40 also receives a signal from terminal 8 to control the degree of nasal coupling by way of aforesaid current driver tube and its associated saturable inductor. Nasal circuit 38 is connected to coupler 40 by way of terminals 41 and 42. The controlled nasal signal is then fed from nasal circuit 38 to variable gain amplifier 45 of FIG. 1.
Referring again to FIG. 1, transmission line 20 is comprised of 14 LC sections connected in series. Sections 24-26 are fixed sections and 27-37 are variable. Transmission line 20 simulates electrically an acoustic transmission line such as shown in FIG. 3. Since the vocal tract is considered an acoustic tube of variable cross sectional area, terminated by the glottis at one end and the lips at the other, transmission line 20 represents the entire length of the vocal tract. Fixed sections 24-26 are analogous to the lower portion of the vocal tract, where variations in cross sectional area are assumed to be small and non-critical. Variable sections 27-37 represent sections of the vocal tract whose cross sectional area may be varied. The inductance and capacitance in each of the variable sections are continuously variable through a range of values corresponding to the range of variation of the dimensions of the human vocal tract. Each of the LC sections 27-37 is varied by its own control voltage which is received from terminals 7a-7m respectively. Both the L and C of each section are controlled by a common voltage in such manner that the length of that section, proportional to (LC)^(1/2), remains constant as its area, proportional to (C/L)^(1/2), changes in response to the control voltage. For voiced sounds, transmission line 20 is excited at terminal 18 by glottal current pulses of variable frequency and amplitude. For fricatives and other turbulent sounds, click and noise generators 14 and 16, respectively, are inserted in series with the transmission line inductor corresponding to the place where turbulence is produced. The nasal cavity is approximated by insertion of nasal circuit 38 into transmission line 20 by way of nasal coupler 40 and lines 43 and 44. Transmission line 20 is terminated by radiation impedance 60, which approximates the radiation impedance seen from the mouth. It is comprised of inductances 61 and 62 in series. The output is taken from point 63.
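The signal flow of the noise and click path can be summarized in a few lines. This is only a behavioral sketch under simplifying assumptions: signals are treated as single numeric samples, each relay is modelled as an on/off switch, and the function and variable names are hypothetical rather than taken from the patent.

```python
SECTION_IDS = list(range(27, 38))  # variable LC sections 27-37

def excitation_sample(noise, noise_gain, click):
    """Summing network 17: gain-controlled noise plus click."""
    return noise_gain * noise + click

def route_to_sections(sample, relay_closed):
    """Deliver the summed signal only to sections whose relay is closed.

    relay_closed maps a section number to True/False, the role played by
    the control signals on terminals 6a-6m.
    """
    return {sec: (sample if relay_closed.get(sec, False) else 0.0)
            for sec in SECTION_IDS}

sample = excitation_sample(noise=0.2, noise_gain=0.5, click=1.0)
injected = route_to_sections(sample, {30: True})
```

Here only section 30 receives the summed noise and click signal; every other variable section receives nothing.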
Radiation impedance 60 receives its control signal from terminal 59.
Transmission line 20 is comprised of fourteen sections. Three fixed LC sections 24-26 are identical and are of conventional LC configuration. Eleven sections 27-37 are identical, each section being arranged so that its capacitance and inductance are variable by a single control voltage. Sections 24-37 are cascaded.
The typical variable section is illustrated by section 27 which is comprised of variable inductance circuit 50, variable capacitance circuit 49, and control amplifier 48. The variability of capacitor circuit 49 is realized by the Miller effect, and the variability of inductance circuit 50 is achieved by utilizing saturable inductors. Both elements 49 and 50 are controlled by a single voltage by way of control amplifier 48 which in turn receives the control signal from terminal 7a. Capacitor circuit 49 receives its control voltage directly from amplifier 48 whereas inductance circuit 50 receives it from amplifier 48 by way of summing network 51. Summing network 51 simultaneously receives an input signal from the preceding section. Control amplifier 48 also delivers a signal to the following section.
FIG. 4 shows in detail a block diagram of a typical variable section such as 27. Each section of the dynamic analog corresponds to a given portion of the human vocal tract. The length of each section of the analog remains constant, independent of area changes. The requirement of constant length for each section implies that the LC product be constrained to remain constant for each section. A range of 100:1 in area is utilized, so both the inductance and the capacitance of each section have a 100:1 range.
To realize the capacitance range a method and means of utilizing the Miller effect is provided. The Miller effect depends upon an amplifier to magnify the capacitance of a fixed physical capacitor. The variable capacitance realized by means of the Miller effect is shown schematically in FIG. 5. In general the amplifier output is
EO = -Ei K∠θ
and the amplifier impedance at terminal 1 is
Zin = (R0 + 1/(jωC)) / (1 + K∠θ)
In the useful band of the amplifier, R0 and θ are made small so that
Zin = 1/(jωC(1 + K))
Thus, an apparent variable capacitance is seen at the input of the amplifier at terminals 1; the capacitance of the fixed capacitor magnified by one plus the gain of the amplifier.
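The magnification of the fixed capacitor by one plus the gain can be illustrated numerically. The sketch below assumes an idealized inverting amplifier with gain magnitude `gain`, phase shift `phase_rad`, and output resistance `r_out`, with the fixed capacitor connected from output to input; these names and the example values are assumptions for illustration and are not component values from the patent.

```python
import cmath
import math

def miller_input_impedance(freq_hz, c_fixed, gain, r_out=0.0, phase_rad=0.0):
    """Impedance seen at the amplifier input with c_fixed from output to input.

    The amplifier is modelled as inverting, with open-circuit output
    -gain * exp(j*phase_rad) * e_in behind an output resistance r_out.
    """
    omega = 2.0 * math.pi * freq_hz
    z_feedback = r_out + 1.0 / (1j * omega * c_fixed)
    return z_feedback / (1.0 + gain * cmath.exp(1j * phase_rad))

def apparent_capacitance(c_fixed, gain):
    """Ideal case (r_out and phase negligible): C magnified by (1 + gain)."""
    return c_fixed * (1.0 + gain)

# Example values (hypothetical): 1 nF fixed capacitor, gain of 99, at 1 kHz.
z_in = miller_input_impedance(1000.0, 1.0e-9, 99.0)
c_app = apparent_capacitance(1.0e-9, 99.0)
z_ideal = 1.0 / (1j * 2.0 * math.pi * 1000.0 * c_app)
```

In this idealized case the input impedance is that of a capacitor 100 times larger than the fixed one, so varying the gain varies the apparent capacitance.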
Referring again to FIG. 4, variable capacitance circuit 49 is shown. The Miller effect amplifier is comprised of input amplifier 1, electronic attenuator 2, and low impedance output amplifier 3. Fixed capacitor 4 is connected between the output of amplifier 3 and the input of amplifier 1. Attenuator 2 receives a control voltage VCC from terminal 5. VCC varies the gain of attenuator 2 so that the amplitude of the output signal from amplifier 3 is varied. Thus, the apparent capacitance across terminal 6 is varied in accordance with control voltage VCC.
Variable inductance circuit 50 is also shown in FIG. 4. The variable inductance is realized as saturable inductor 7. A saturable inductor has a magnetic circuit consisting of three portions or legs arranged in a figure eight. Control winding 8 is wound on the inner leg producing a control flux whose paths are the outer legs. Signal winding 9 is split into two halves, each placed on an outer leg and connected in series aiding; as a result, the mutual inductances between each half of signal winding 9 and control winding 8 have opposite polarities and hence the net coupling between the signal and control winding is very small. Control flux through the magnetic material of the outer legs varies the incremental permeability of that material through values which lie between those of the demagnetized and fully saturated states. The signal winding inductance is proportional to the permeability of the core.
The inductance of signal winding 9 is not an exponential function of the current through control winding 8. Control current is, therefore, obtained from a non-linear circuit comprising non-linear cathode feedback 10 and current driver 12 such that the inductance of the inductor signal winding is an exponential function of VCL, the control input to variable inductance circuit 50. VCL is obtained from terminal 13 and is fed to summing network 11 which simultaneously receives the output signal from non-linear cathode feedback 10. Current driver 12 has a high output impedance. A triode is utilized with a non-linear cathode resistor to provide the feedback.
The output signal from network 11 is fed to current driver 12. The output signal from current driver 12 is fed simultaneously to control winding 8 of saturable inductor 7 and non-linear cathode feedback 10. The current in control winding 8 controls the degree of saturation of the core material in the flux path of signal winding 9. The inductance appearing between terminals 14 of signal winding 9 is therefore a function of the control winding current. The required variable inductance appears between terminals 14 in response to variation of control voltage VCL.
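The role of the non-linear feedback can be illustrated numerically: it shapes the control current so that the overall voltage-to-inductance law becomes exponential even though the inductor's own current-to-inductance law is not. The sketch below is only an analogy under stated assumptions. The saturation curve `signal_inductance`, the constants `L_MAX`, `I_KNEE` and `B_L`, and the closed-form inversion are all hypothetical; the patent achieves the shaping with an analog feedback circuit, not a formula.

```python
import math

L_MAX = 1.0              # hypothetical unsaturated inductance, henries
I_KNEE = 1.0e-3          # hypothetical current scale of the saturation curve, amperes
B_L = math.log(100.0)    # hypothetical exponent giving a 100:1 range per volt


def signal_inductance(i_ctrl):
    """Assumed (non-exponential) dependence of inductance on control current."""
    return L_MAX / (1.0 + i_ctrl / I_KNEE)


def control_current(v_cl):
    """Current that yields L = L_MAX * exp(B_L * v_cl) for v_cl <= 0.

    This inverts the assumed saturation curve; it stands in for the
    combined action of the summing network, current driver and
    non-linear feedback described in the text.
    """
    target = L_MAX * math.exp(B_L * v_cl)
    return I_KNEE * (L_MAX / target - 1.0)


if __name__ == "__main__":
    for v_cl in (0.0, -0.25, -0.5, -0.75, -1.0):
        i = control_current(v_cl)
        print(f"VCL={v_cl:5.2f} V  I={i * 1e3:7.3f} mA  L={signal_inductance(i):.4f} H")
```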
The variable capacitance appears between terminals 6. Its value is determined by VCC, the capacitor control voltage. The variable inductance, whose value is determined by VCL, the inductor control voltage, appears across terminals 14. The control characteristics of both the variable capacitance circuit and the variable inductance circuit are exponential.
where aC, bC, aL, bL, and d are constants. Control amplifier 48 maintains a linear relation between VCL and VCC so that the LC product remains constant. Both VCC and VCL are determined by VCS, the control input voltage which determines the section area. Thus, the length of a variable LC section remains constant as its area changes exponentially over a 100:1 range.
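A short numerical sketch of the constant-LC behaviour follows. It assumes exponential control laws of the form C = aC·exp(bC·VCC) and L = aL·exp(bL·VCL), and a linear relation VCL = offset − (bC/bL)·VCC. These forms, the constant values, and the name `offset` are assumptions chosen to match the description, not equations recovered from the patent. The relative area is taken as proportional to sqrt(C/L), following the usual acoustic-to-electrical analogy in which compliance scales with area and inertance with its reciprocal.

```python
import math

A_C, B_C = 1.0e-9, math.log(100.0)   # hypothetical capacitor law constants
A_L, B_L = 1.0e-1, math.log(100.0)   # hypothetical inductor law constants


def capacitance(v_cc):
    """Assumed exponential control law for the variable capacitance."""
    return A_C * math.exp(B_C * v_cc)


def inductance(v_cl):
    """Assumed exponential control law for the variable inductance."""
    return A_L * math.exp(B_L * v_cl)


def inductor_control(v_cc, offset=0.0):
    """Linear relation between the control voltages that keeps L*C fixed."""
    return offset - (B_C / B_L) * v_cc


def section(v_cc):
    """Return (L, C, relative_area) for one variable LC section."""
    c = capacitance(v_cc)
    l = inductance(inductor_control(v_cc))
    return l, c, math.sqrt(c / l)


if __name__ == "__main__":
    for v in (0.0, 0.5, 1.0):
        l, c, area = section(v)
        print(f"VCC={v:.1f}  L={l:.3e}  C={c:.3e}  LC={l * c:.3e}  area~{area:.3e}")
```

Under these assumptions the LC product (and hence the section's electrical length) stays fixed while the relative area sweeps a 100:1 range as VCC goes from 0 to 1.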
A typical single variable LC section as shown in FIG. 4 is comprised of variable capacitance circuit 49 and variable inductance circuit 50 connected to control amplifier 48, so that VCC and VCL are related and both are controlled by VCS. The constraint (LC=constant) which exists between the inductance 50 and capacitance 49 implies a constraint between their respective control voltages, VCL and VCC. The form of restraint depends on the form of the control characteristics of inductor 50 and capacitor 49. The control characteristics make the value of each element an exponential function of its control voltage, requiring increments in the two control voltages to have a constant ratio and to be of opposite sign. To realize this voltage restraint, inverter 29 was utilized. With this constraint in effect, the area of this typical variable LC section is determined by the voltage at the control terminal 18. Amplifier 16 and cathode follower 17 preceding control terminal 18 function as a voltage generator controlled by the voltage, VCS, at terminal 7. The signal control path in control amplifier 48 is provided by summing network 15, which simultaneously receives the area control signal, VCS, supplied from terminal 7 and the feedback signal from terminal 18. Summing network 15 delivers its signal to cathode follower 17 by way of amplifier 16; thus network 15, amplifier 16 and cathode follower 17 function as a feedback amplifier. Terminal 53 delivers a signal to the summing network of the following LC section. The signal voltage present at terminal 18 is then utilized as the control voltage, VCC, for variable capacitance circuit 49. From terminal 18, VCC is fed to summing network 51. Summing network 51 also receives a control signal from terminal 21, which is adapted to receive signal VCC from the preceding LC section. The output of summing network 51 is fed to inverter 29, which changes the sign at the output, thus providing the control signal, VCL, for variable inductance circuit 50.
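The statement that the control-voltage increments must have a constant ratio and opposite sign can be made explicit with a short worked step. The exponential forms below are an assumption consistent with the description (the symbols $a_C$, $b_C$, $a_L$, $b_L$ are used as generic constants of such laws), not the patent's own equations.

```latex
% Assumed exponential control characteristics
C = a_C \, e^{\,b_C V_{CC}}, \qquad L = a_L \, e^{\,b_L V_{CL}}

% Take the logarithm of the product
\ln(LC) = \ln(a_L a_C) + b_L V_{CL} + b_C V_{CC}

% Holding LC constant, small changes must satisfy
b_L \, \Delta V_{CL} + b_C \, \Delta V_{CC} = 0
\quad\Longrightarrow\quad
\frac{\Delta V_{CL}}{\Delta V_{CC}} = -\frac{b_C}{b_L}
```

When $b_C$ and $b_L$ have the same sign, the ratio is a negative constant, which is the sign reversal an inverter stage supplies.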
Now referring again to FIG. 1, the output signal from transmission line 20 is fed to radiation impedance 60. Radiation impedance 60 approximates the radiation impedance seen from the mouth. The output is the voltage across small inductor 61. The current through the radiation impedance 60 is treated as the strength (volume velocity) of a simple source and hence the voltage across inductor 62 behaves as the sound pressure at some distance from the source. The output from inductance 62 is fed to variable gain amplifier 46. Amplifier 46 simultaneously receives a signal from terminal 9 representative of the relative contribution from radiation impedance 60. Summing network 47 receives simultaneously two signals, one from variable gain amplifier 46 and the other from variable gain amplifier 45. Output terminal 64 is thus provided a signal from network 47 which may be utilized to operate a loudspeaker for the reproduction of audio.
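The reason an inductor voltage can stand in for radiated sound pressure is that both are proportional to a time derivative: the voltage across an inductor is L·di/dt, and the far-field pressure of a simple source is proportional to the time derivative of its volume velocity. The sketch below illustrates only the electrical half, using a central-difference derivative on a sampled sinusoidal current. The sample rate, inductance value and function name are illustrative assumptions.

```python
import math


def inductor_voltage(current, dt, inductance):
    """Central-difference estimate of v = L * di/dt for interior samples."""
    return [
        inductance * (current[n + 1] - current[n - 1]) / (2.0 * dt)
        for n in range(1, len(current) - 1)
    ]


if __name__ == "__main__":
    fs = 100_000.0     # hypothetical sample rate, Hz
    f0 = 100.0         # test-tone frequency, Hz
    l_small = 1.0e-3   # hypothetical small output inductor, henries
    dt = 1.0 / fs
    i = [math.sin(2.0 * math.pi * f0 * n * dt) for n in range(2000)]
    v = inductor_voltage(i, dt, l_small)
    # For a sinusoid the voltage amplitude should approach 2*pi*f0*L,
    # so higher frequencies are emphasized in proportion to frequency.
    print(max(v), 2.0 * math.pi * f0 * l_small)
```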
The foregoing specific embodiment describes the preferred manner for practicing the invention. It will be apparent to those skilled in the art that many modifications are possible within the spirit and scope of the invention.
What is claimed is:
1. A speech synthesizer comprising lumped transmission line circuitry having a series of sections, each of said sections including capacitance and inductance elements, control means for said lumped transmission line circuitry, said control means being constituted by an additional section of circuitry including signal summation components simulating sound-producing portions of the human anatomy and electrically connected in current-feeding relationship to said lumped transmission line circuitry to control continuously throughout the sound-producing cycle, the electrical value of said capacitance and inductance elements while retaining the product, the electrical value of said capacitance and inductance elements constant, means including inductance connected to the output of said lumped transmission line circuitry to reproduce said sound producing cycle, means to excite said lumped transmission line circuitry for varied audio sounds with an electrical signal of variable frequency and amplitude characteristics patterned upon the sound waves of the human vocal tract.
2. A speech synthesizer comprising lumped transmission line circuitry having a series of fourteen sections, each of said sections including capacitance and inductance elements, the first three of said sections having a fixed capacitance and inductance electrical value, the next eleven sections having said capacitance and inductance electrical value variable, control means for said lumped transmission line circuitry, said control means being constituted by an additional section of circuitry including signal summation components simulating sound-producing portions of the human anatomy and electrically connected in current-feeding relationship to said lumped transmission line circuitry to control, continuously throughout the sound-producing cycle, the electrical value of said variable capacitance and inductance elements while retaining the electrical value of the product of said capacitance and inductance elements constant, said control means receiving a first electrical signal representative of the changing articulatory configurations of said human anatomy occurring during periods of said sound-producing cycle, a reactive circuit coupled into said lumped transmission line circuitry between said fourth and fifth section, said reactive circuit receiving a second electrical signal representative of the changing nasal articulation configurations of said human anatomy occurring during said sound-producing cycle, means including inductance connected to the output of said lumped transmission line circuitry to reproduce said sound-producing cycle, means to excite said lumped transmission line circuitry at the input thereof with a third electrical signal representative of the glottal tones of said human anatomy occurring during said sound-producing cycle, means to inject a fourth electrical signal into said lumped transmission line circuitry by way of the input of each of said sections having said variable inductance and capacitance, said fourth signal being representative of turbulence in said human anatomy occurring during said sound-producing cycle, and means for controlling which of said sections having said variable capacitance and inductance receives said injected fourth signal.
3. A speech synthesizer comprising lumped transmission line circuitry having a series of sections, each of said sections including capacitance and inductance elements, control means for said lumped transmission line circuitry, said control means being constituted by an additional section of circuitry including signal summation components simulating sound-producing portions of the human anatomy and electrically connected in current feeding relationship to said lumped transmission line circuitry to control, continuously throughout the sound-producing cycle, the electrical value of said capacitance and inductance elements while retaining the electrical value of the product thereof constant, means to excite said lumped transmission circuitry at the input thereof with a first electrical signal representative of the glottal tone of said human anatomy occurring during said sound-producing cycle, means to inject into said lumped transmission line circuitry by way of the input of each of said sections having said capacitance and inductance elements whose electrical value is being controlled a second electrical signal representative of the turbulence characteristics of the sound waves of the vocal tract of said human anatomy occurring during said sound-producing cycle, means for controlling which of said sections, whose said capacitance and inductance elements are being controlled, receives said injected second electrical signal, and means, including inductance, connected to the output of said lumped transmission line circuitry for the reproduction of said sound-producing cycle.
4. A speech synthesizer comprising lumped transmission line circuitry having a series of sections, each of said sections including capacitance and inductance elements, control means for said lumped transmission line circuitry, said control means being constituted by an additional section of circuitry including signal summation components simulating sound-producing portions of the human anatomy and electrically connected in current feeding relationship to said lumped transmission line circuitry to control, continuously throughout the sound-producing cycle, the electrical value of said capacitance and inductance while retaining the electrical value of the product thereof constant, reactive means coupled into said transmission line circuitry by way of the input of the first of said sections having said capacitance and inductance controlled, said reactive means receiving a first electrical signal representative of the nasal sound components of said human anatomy occurring during said sound-producing cycle, means to inject a second electrical signal into said lumped transmission line circuitry by way of the input of each of said sections whose capacitance and inductance is being controlled, said second electrical signal being representative of the turbulence sound components of said human anatomy occurring during said sound-producing cycle, and means to control which of said sections whose said capacitance and inductance is being controlled receives said injected second electrical signal, means including inductance connected to the output of said transmission line circuitry to reproduce said sound-producing cycle.
5. A speech synthesizing system comprising lumped transmission circuitry having a series of sections, each of said sections including capacitance and inductance elements, control means for said lumped transmission circuitry, said control means being constituted by an additional section of circuitry including signal summation components simulating the human vocal tract, and electrically connected in current-feeding relationship to said lumped transmission line circuitry to continuously and automatically vary the parameters of said transmission line in accordance with electrical control signals representative of the physical variations of the human vocal tract during periods of continuous, uninterrupted speech, and means to combine the output electrical signals from said lumped transmission line circuitry to reproduce said continuous, uninterrupted speech.
6. A speech synthesizing system comprising lumped transmission line circuitry having a series of sections, each of said sections including capacitance and inductance elements, control means for said lumped transmission line circuitry, said control means being constituted by an additional section of circuitry including signal summation components simulating the vocal tract, and electrically connected in current-feeding relationship to said lumped transmission circuitry to control continuously and automatically the parameters of said transmission line in accordance with a first electrical control signal representative of physical variations in a human vocal tract during periods of continuous uninterrupted speech, means also having electrical connection to said transmission line to excite said transmission line with a second electrical control signal representative of the glottal tone of said vocal tract during said periods of said continuous uninterrupted speech, means including additional current-feeding means connected with said transmission line circuitry to inject a third electrical control signal into said line by the way of the inputs of said sections, said third electrical signal being representative of turbulent sounds emanating from said vocal tract during said periods of continuous, uninterrupted speech, means to control which of said sections receives said injected third electrical signal, and means, including inductance, connected to the output of said transmission line circuitry to reproduce said speech.
7. A speech synthesizing system comprising lumped transmission line circuitry having a series of sections, each of said sections including capacitance and inductance elements, control means for said transmission line circuitry, said control means being constituted by an additional section of circuitry including signal summation components simulating the human vocal tract, and electrically connected in current-feeding relationship to said lumped transmission line circuitry to control continuously and automatically the parameters of said transmission line in accordance with a first electrical control signal representative of physical variations in said human vocal tract during periods of continuous uninterrupted speech, means to excite said transmission line at the input thereof with a second electrical control signal representative of the glottal tone of said vocal tract occurring during said periods of said continuous uninterrupted speech, means including a relay to inject a third electrical control signal into said transmission line by way of the inputs of said sections, said third electrical signal being representative of turbulent sounds emanating from said vocal tract occurring during said periods of continuous, uninterrupted speech, means to control which of said sections receives said injected third electrical signal, and means to combine the output electrical signals from said transmission line in the proper sequence to reproduce said continuous, uninterrupted speech.
8. A speech synthesizing system comprising lumped transmission line circuitry having a series of sections including capacitance and inductance elements, control means for said lumped transmission line circuitry, said control means being constituted by an additional section of circuitry including signal summation components simulating the vocal tract, and electrically connected in current-feeding relationship to said lumped transmission circuitry to control continuously the parameters of said transmission line in accordance with a first electrical control signal representative of the physical variations of the human vocal tract during periods of continuous, uninterrupted speech, means to excite said transmission by applying at the input thereof a second electrical control signal representative of the glottal tones of said vocal tract occurring during said periods of said continuous, uninterrupted speech, first means to inject into said transmission line by the way of the inputs of said sections a third electrical signal representative of turbulent noises emanating from said vocal tract occurring during said periods of said continuous, uninterrupted speech, means to control which of said sections receives said injected third electrical signal, second means, including reactance, for injecting into said transmission line by the way of the input of the first of said controlled sections a fourth electrical signal representative of nasal sounds emanating from said vocal tract occurring during said periods of said continuous, uninterrupted speech, and means to combine the output electrical signals from said transmission line to reproduce said continuous, uninterrupted speech.
9. A dynamic analog speech synthesizer comprising lumped transmission line circuitry having a series of sections, each of said sections including capacitance and inductance elements, control means for said lumped transmission circuitry, said control means being constituted by an additional section of circuitry including signal summation components simulating the human vocal tract, and electrically connected in current-feeding relationship to said lumped transmission line circuitry to control continuously the parameters of said lumped transmission line circuitry, said control means receiving a first electrical control signal representative of the physical variations of the human vocal tract during periods of continuous, uninterrupted speech, a buzz generator connected to the input of said transmission line, said buzz generator controlled by a second electrical signal representative of the glottal tone of said vocal tract occurring during said periods of said continuous, uninterrupted speech, means to inject an electrical noise signal into said transmission line by way of the inputs of said sections, said electrical noise signal being representative of turbulence existing in said vocal tract occurring during said periods of said continuous, uninterrupted speech, relay means to control which of said sections receives said injected electrical noise signal, and means to combine the electrical output signals from said transmission line to reproduce said continuous, uninterrupted speech.
References Cited in the file of this patent
UNITED STATES PATENTS
Miller Feb. 25, 1958
OTHER REFERENCES
An Electrical Analog of the Vocal Tract, Journal of Acoustical Society of America, vol. 25, July 1953 (pp. 734-743 relied on).
David: Signal Theory in Speech Transmission, IRE Transactions on Circuit Theory, vol. CT-3, No. 4, December 1956, pp. 232, 243.
US757170A 1958-08-25 1958-08-25 Dynamic analog speech synthesizer Expired - Lifetime US3042748A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US757170A US3042748A (en) 1958-08-25 1958-08-25 Dynamic analog speech synthesizer

Publications (1)

Publication Number Publication Date
US3042748A true US3042748A (en) 1962-07-03

Family

ID=25046674

Family Applications (1)

Application Number Title Priority Date Filing Date
US757170A Expired - Lifetime US3042748A (en) 1958-08-25 1958-08-25 Dynamic analog speech synthesizer

Country Status (1)

Country Link
US (1) US3042748A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3428748A (en) * 1965-12-28 1969-02-18 Bell Telephone Labor Inc Vowel detector
US3542955A (en) * 1968-04-29 1970-11-24 Bell Telephone Labor Inc Automatic generation of voiceless excitation in a vocal-tract synthesizer

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US2824906A (en) * 1952-04-03 1958-02-25 Bell Telephone Labor Inc Transmission and reconstruction of artificial speech
