US2458227A - Device for artificially generating speech sounds by electrical means - Google Patents

Device for artificially generating speech sounds by electrical means

Publication number
US2458227A
Authority
US
United States
Prior art keywords
frequency
speech
oscillations
fundamental frequency
control
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
US662960A
Inventor
Vermeulen Roelof
Six Willem
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hartford National Bank and Trust Co
Original Assignee
Hartford National Bank and Trust Co
Application filed by Hartford National Bank and Trust Co

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis

Definitions

  • This invention relates to a device for the synthetic composition of speech and the use of such a device in the electrical transmission of speech.
  • In the case of transmitting speech electrically the invention has for its purpose to effect this transmission in such manner that the required frequency band is much narrower than the frequency band occupied by the speech oscillations.
  • the speech spectrum is subdivided into ten frequency subbands and in each of these subbands the mean amplitude of the oscillations appearing therein is determined and these characterizing amplitudes are transmitted by means of a control voltage. Consequently altogether eleven control voltages are required viz. one which is a measure of the instantaneous value of the fundamental frequency, the remaining ten control voltages indicating the amplitude of the oscillations in ten subbands of the speech spectrum.
  • the characteristics transmitted by the control oscillations serve to control a device capable of producing artificial speech oscillations.
  • the control voltage which is a measure of the fundamental frequency of the speech spectrum, controls the oscillation produced by an impulse generator in such manner that the fundamental frequency of the impulses always corresponds to the fundamental frequency of the initial speech spectrum.
  • the impulses containing the fundamental frequency and in addition a large number of higher harmonics are supplied to ten band-pass filters, whose transmission bands respectively correspond to the ten subbands into which is divided the speech spectrum on the transmitter side.
  • each band-pass filter is connected to an amplifier whose amplification factor is automatically controlled by the action of the control voltage, which is a measure of the mean amplitude of the initial speech oscillations, located in the subband corresponding to the transmission band of the band-pass filter.
  • the hissing sounds consist of a continuous frequency spectrum.
  • a source producing a continuous frequency spectrum is provided on the receiver side, which source is automatically connected in place of the impulse generator if no fundamental frequency occurs in the speech vibrations to be transmitted.
  • the characteristics are deduced from the initial speech on the transmitter side.
  • these characteristics may be produced electro-mechanically, for instance, by means of a number of keys operated by hand. These keys respectively control, for instance, the amplitude of the oscillations supplied to the ten band-pass filters, the switching on of the impulse generator and its fundamental frequency or the switching on of the source producing a continuous frequency spectrum.
  • the produced oscillations are, in accordance with the invention, supplied to a number of resonant circuits, whose tuning is respectively controlled in accordance with the formants of the speech sounds to be produced; furthermore the voltages taken from the resonant circuits are combined in an amplitude ratio corresponding to the amplitude ratio of the formants in the speech sounds to be produced.
  • the tuning of the resonant circuits and the combination of the voltages set up across the resonant circuits in the correct amplitude-ratio is controlled by means of keys operated by hand.
  • the control is effected by means of characteristics which are deduced from the speech at the transmitter side.
  • the speech spectrum to be transmitted is divided into a number of frequency subbands and the frequency and the amplitude of the formant occurring in each of these subbands is transmitted respectively by a control voltage, the control voltages which are a measure of the frequency of the formants respectively controlling the tuning of the resonant circuits on the receiver side, and the amplitude of the voltages taken from these resonant circuits being controlled respectively in accordance with the control voltages that are a measure of the amplitude of the formants.
  • the frequency and the amplitude of the oscillation having a maximum amplitude are determined respectively in a number of frequency bands of the speech spectrum and these speech characteristics are transmitted by means of control voltages.
  • the air compressed in the lungs passes to the exterior along the vocal cords through the cavity of the throat and the mouth, and sometimes also through the cavity of the nose.
  • the flow of energy of the compressed air is translated into wave energy at definite points where the air current is contracted. This may take place, for instance, in the gap between the vocal cords, in the space between the tongue and the uvula, between the teeth, between the lips.
  • the produced oscillations are usually relaxation oscillations and consequently contain a large number of harmonics.
  • this initial relaxation oscillation enters one or more of the resonance chambers constituted by the cavities of the throat, nose and mouth, the latter are excited and caused to resonate as a result of which definite harmonics of the initial vibration are amplified and others are damped.
  • the frequencies that are most favoured, the so-called formants, differ from sound to sound.
  • consonants viz. the sonants and all vowels, that are the actual voice carriers, are substantially formed with cooperation of the said three resonant chambers, so that these sounds will essentially comprise three formants.
  • To characterize the latter it is necessary to determine the value and the location of the three resonances, so that it is necessary to divide the speech spectrum into at least three frequency subbands and to determine in each of them at which frequency a resonance maximum occurs and the value of this maximum.
  • consonants for instance the explosive consonants are characterized by transition phenomena which occur before or after the production of a vowel.
  • consonant is abruptly released and stopped respectively as a result of which the resonance cavities of the speech member are excited in their natural vibrations and die out respectively. If the said consonants precede a vowel the character of the consonant determines the manner of building up of the relaxation oscillation of the air-current produced by the vocal cords. If the consonant is preceded by the vowel the consonant determines the manner of dying out of the relaxation oscillation.
  • the device for the artificial production of the speech sounds according to the invention must comprise at least three resonant circuits whose tuning is controlled. This requires three control voltages. Furthermore three control voltages are necessary for controlling the amplitude of the voltages set up across the resonant circuits, and another characterizing magnitude has to control the switching on of the source supplying a continuous spectrum of oscillations or the switching on of the impulse generator and the frequency of the latter. This consequently requires altogether at least seven control voltages.
  • Fig. 1 of the drawing is a diagram of a device according to the invention for electrically transmitting speech.
  • Fig. 2 is a graph illustrating the frequency spec- Fig. 5 shows the circuit diagram of elements A1 to A4 in Fig. 1,
  • Fig. 6 represents the circuit arrangement of element G and element R in Fig. 1, and
  • Fig. 7 represents the circuit diagram of elements R1 to R4 in Fig. 1.
  • the output circuits of the filters are connected to analysators A1, A2, A3 and A4 which determine in each of the four frequency bands the frequency of the oscillation having maximum amplitude and the value of this amplitude.
  • Each analysator supplies two control voltages, one of which is proportional to the frequency and the other to the amplitude of the oscillation having the maximum amplitude which appears in one frequency subband.
  • an analysator A5 is provided on the transmitter side which determines whether a fundamental frequency is available in the oscillations produced by the microphone and the value of this frequency. This frequency is also defined by a control voltage.
  • The nine control voltages thus obtained are the speech characteristics which are transmitted through lines L1 to L9 to the receiver side 0 and control the device provided on the receiver side for the artificial production of speech.
  • the last-mentioned device comprises an impulse generator G which produces impulses consisting of a fundamental frequency with a large number of higher harmonics, and a generator R producing a continuous spectrum of electrical oscillations.
  • generator G is put into circuit and the generator R is cut out.
  • the control voltage controls at the same time the fundamental frequency of the impulses produced by the generator G, so that this frequency corresponds to the fundamental frequency of the oscillations produced by the microphone M on the transmitter side.
  • the tuning of these resonant circuits is controlled by the control voltages that are a measure of the frequency of the oscillation having a maximum amplitude which appears in each of the four different frequency subbands into which the frequency spectrum on the transmitter side is divided by the filters F1 to F4.
  • the voltages set up across the resonant circuits R1 to R4 are supplied to amplifiers V1 to V4 respectively which comprise means for automatically controlling the amplification factor. This control takes place in accordance with the control voltages that are a measure of the amplitude of the oscillation having a maximum amplitude in the four frequency bands.
  • a telephone T is connected to the common output circuit.
  • the vowel a is to be transmitted whose frequency spectrum is represented in Fig. 2.
  • the fundamental frequency amounts to 128 cycles/sec. so that the spectrum consists of a number of higher harmonics of 128 cycles/sec., whose amplitudes are related as the lengths of the vertical lines in Fig. 2.
  • the ends of these lines lie on an enveloping curve which may be imagined to be composed of the three dotted curves which have the shape of resonance curves.
  • these curves may be conceived to represent the resonance curves of the resonance chambers constituted by the cavity of the mouth, nose and throat and determining the formants.
  • the frequency of the relaxation generator G must be adjusted at 128 cycles/sec. and the resonant circuits R2, R3 and R4 at the frequencies of 640, 1280 and 2688 cycles/sec. respectively, the amplification of the amplifiers V2, V3 and V4 being controlled in accordance with the amplitudes I1, I2 and I3 of the formants in the spectrum shown in Fig. 2. From the combination of the characteristic curves of the resonant circuits R2 to R4 and the amplifiers V2 to V4 then results a frequency characteristic of the receiving device 0 which approximately corresponds to the enveloping curve of the vertical lines in Fig. 2.
  • control voltages are transmitted in the form of control voltages through separate lines L1 to L9 from the transmitter side to the receiver side.
  • these control voltages are preferably modulated on carrier waves, e.g. in the manner known from multiplex carrier wave telegraphy.
  • Analysator for the fundamental frequency A5: Fig. 3 represents the circuit arrangement of the analysator A5 of Fig. 1 for determining the fundamental frequency.
  • a filter 3 having a transmission band of 50 to 400 cycles/sec. is connected to the input terminals.
  • the fundamental frequency of sounds possessing such a fundamental frequency lies between 75 and 350 cycles/sec. and varies with the pitch of the voice of the various persons. In the case that a fundamental frequency is present this frequency will consequently fall within the transmission band of the filter 3.
  • the low frequencies below 50 cycles/sec. should be cut off by the filter to avoid wrong response of the analysator at low frequencies ensuing as a result of the pauses between succeeding syllables and words.
  • the output circuit of the filter 3 is connected to the control grid circuit of an amplifying tube 4 in which circuit is inserted a resistance 6 shunted by a condenser 5, which are so proportioned that the voltage drop across the resistance 6 caused by the control-grid current flowing in the tube 4 adjusts the grid bias to such value that grid current only flows at a voltage peak of the oscillations supplied through the filter 3.
  • a condenser 11 which is charged through a resistance 12 by a source of direct voltage as a result of which the anode of the relay tube 10 acquires a positive voltage relatively to the cathode.
  • the circuit is adjusted in such manner that whenever the voltage across the secondary of the transformer 9 supplies a positive impulse to the control grid of the relay tube the latter is ignited and the condenser 11 is abruptly discharged, whereupon the discharge through the relay tube is extinguished and the condenser is charged again across the resistance 12.
  • the current impulses produced at successive ignitions produce periodic voltage impulses across a resistance 13, as represented in Fig. 4d, whose frequency corresponds to the fundamental frequency.
  • the direct voltage component of these impulses which is a measure of the fundamental frequency is sieved out by a filter consisting of a resistance 14 and a condenser 15 and is supplied to a pair of terminals 16, 17.
  • control voltage which is a measure of the fundamental frequency of the speech to be transmitted.
  • This control voltage is transmitted to the receiver side through the line L9 of Fig. 1 which is connected to the terminals 16 and 17 shown in Fig. 3.
  • the fundamental frequency is not at all present in the speech sounds or is very little pronounced; however, the speech sounds will then contain a number of higher harmonics. In such cases the frequency of the fundamental frequency corresponds to the periodicity of the sound.
  • Analysators A1 to A4: Fig. 5 shows the circuit diagram of the analysators A1 to A4 each of which comprises a device for determining the frequency and a device for determining the amplitude of the oscillation having a maximum amplitude in one of the four frequency subbands into which the speech to be transmitted is divided by the filters F1 to F4.
  • This frequency subband is supplied to the input terminals 18 and 19.
  • the circuit and operation of the first-mentioned device substantially corresponds to that of the device for determining the fundamental frequency shown in Fig. 3 so that we refer to this figure for a further description.
  • control voltage which is a measure of the frequency of the oscillation having the maximum amplitude is set up between the terminals 16 and 17.
  • corresponding parts of the circuit bear the same reference numerals as in Fig. 3.
  • a transformer 20 whose secondary is connected to the device for determining the amplitude of the oscillation having a maximum amplitude.
  • This device comprises a rectifier 21 which is connected in series with a resistance 22 shunted by a condenser 23.
  • the circuit acts as a peak detector so that a rectified voltage is set up across the resistance which voltage is proportional to the amplitude of the oscillation having a maximum amplitude.
  • a filter consisting of a resistance 24 and a condenser 25
  • the rectified voltage across the resistance 22 is supplied to a pair of output terminals 26, 27.
  • FIG. 6 represents the circuit of the impulse generator G and the circuit of the generator R for producing a continuous spectrum of oscillations.
  • the impulse generator comprises a discharge tube 100 whose grids 101 and 102 are coupled together through a condenser 103.
  • a positive voltage is supplied to the grid 102 through a resistance 104 and the control voltage transmitted through the line L9 in Fig. 1 is supplied to the grid 101 through a resistance 105.
  • This control voltage appears between the pair of terminals 106, 107.
  • a condenser 108 is connected between the grid 101 and earth.
  • the circuit of this impulse generator is well known so that further explanation is superfluous.
  • the frequency of the impulses produced by the generator G is controlled by the control voltage between the terminals 106 and 107.
  • the so controlled impulses are supplied to the grid circuit of a discharge tube 109.
  • the frequency control just mentioned is effected in such manner that the frequency always corresponds to the fundamental frequency of the initial speech spectrum received by the microphone M in Fig. 1. If the speech spectrum does not contain a fundamental frequency, in which case there is no control voltage between the terminals 106 and 107, the generator G is blocked and no impulses are produced.
  • the generator R for producing a continuous spectrum of oscillations consists of a resistance 110 in the input circuit of an amplifier 111. Owing to the thermal agitation of the electrons in the resistance material a voltage appears across the resistance 110 which, as is well known,
  • the operation of the attenuation network 112 together with the tube 115 is such that in the absence of a control voltage between the terminals 106 and 107 the attenuation network transmits the voltage set up in the output circuit of the amplifier 111 to the grid circuit of an amplifying tube 116. This transmission is blocked as soon as a control voltage is set up between the terminals 106 and 107.
  • the impulse generator G supplies impulses to the tube 109, the frequency of these impulses corresponding to the fundamental frequency of the speech sounds to be transmitted. No transmission of the continuous spectrum of oscillations set up across the resistance 110 then takes place to the tube 116. In the absence of the said control voltage, however, the tube 109 does not receive impulses and the generator producing a continuous spectrum of oscillations comes into action and this spectrum is supplied to the tube 116.
  • the terminals 118 and 119 of the resistance 117 are connected to the input terminals of the tunable resonant circuits.
  • Tunable resonant circuits R1 to R4: All of the tunable resonant circuits R1 to R4 are connected in the same way so that it is sufficient to describe one of them, for instance R1.
  • the circuit is represented in Fig. 7.
  • the tunable resonant circuit is constituted by a condenser 201 in parallel with the input impedance Z of a discharge tube 202 which is connected as a reactance tube.
  • the anode circuit of this tube includes a resistance 203, and between the anode and the grid an inductance coil 204 is connected.
  • the input impedance Z of this circuit is given by the expression $Z = \frac{j\omega L}{1+SR}$, where L represents the inductance of the coil 204, R the resistance 203 and S the mutual conductance of the tube 202. It follows from this expression that Z represents an inductive reactance, the value of the inductance amounting to $\frac{L}{1+SR}$, and therefore depends on the mutual conductance S.
  • This mutual conductance is controlled by the control voltage, which is a measure of the frequency of the formant in the frequency band 200 to 400 cycles/second and is transmitted by the line L2 in Fig. 1 to the terminals 205 and 206 in Fig. 7.
  • the resonant circuit constituted by the condenser 201 and the input impedance Z is so controlled that the resonant frequency corresponds to the frequency of the formant in the frequency band of 200 to 400 cycles/sec.
  • the tuned resonant circuit is connected through a transformer 207 to a pair of terminals 208, 209 to which are supplied the oscillations produced by the source G or R in Fig. 1.
  • the voltage set up across the resonant circuit is also supplied through a third transformer winding to an amplifier V1 whose amplification can be controlled.
  • the amplification is controlled by the control voltage, which is transmitted by the line L1 in Fig. 1 to the terminals 210 and 211 in Fig. 7.
  • the network consists of four non-linear resistances (for instance dry rectifiers) 212, 213, 214 and 215. Between the junction point 216 of the resistances 214 and 215 and a centre tapping 217 of the primary winding of a transformer 218 are connected two resistances 219 and 220 which are at the same time inserted in the anode circuit of amplifying tube 221. The control grid and the cathode of this tube are connected to the terminals 210 and 211 respectively.
  • the circuit is adjusted in such manner that no voltage occurs between the points 216 and 217 at an average value of the control voltage supplied to the terminals 210 and 211.
  • With an increase in control voltage between the terminals 210 and 211 the point 217 becomes more negative as a result of which the resistance of the non-linear resistances 212 and 213 increases and the resistance of the non-linear resistances 214 and 215 decreases.
  • If the control voltage between the terminals 210 and 211 falls below the average value the point 217 becomes more positive and the resistances 212 and 213 decrease, whereas the resistances 214 and 215 increase.
  • In the first case the attenuation network effects an increased attenuation and in the second case a decreased attenuation.
  • the voltage set up across the resonant circuit constituted by the condenser 201 and the impedance Z will appear with such amplification in the output circuit 222, 223 of the amplifier V1 as corresponds to the amplitude of the formant in the frequency band 200 to 400 cycles/second.
  • control range of the resonant circuit is so chosen that only one formant falls within this range.
  • the device on the receiver side in Fig. 1 may also be used independently for the artificial production of speech.
  • the required control voltages may be derived from a number of sources of potential shunted by potentiometers. These potentiometers may be adjusted by means of keys. The playing of the keys requires some exercise but a trained person will be able to produce speech sounds, words and sentences by means of this device.
  • the device according to the invention used in telephony yields a material saving in the frequency band required for the transmission of a conversation.
  • a frequency band of about 3000 cycles/second is required, but in the example given above a bandwidth of only 225 cycles/second is necessary so that in the bandwidth of 3000 cycles/second of the usual telephony systems ten conversations may be accommodated when making use of the invention.
  • the device according to the invention may also be used with advantage in microphone-loudspeaker installations, in which special attention had to be paid hitherto to the avoidance of acoustic feedback.
  • a periodic impulse generator providing harmonically-related sound frequency oscillations having a fundamental frequency
  • means to control the periodicity of said generator in accordance with the fundamental frequency of a desired speech sound, means to select from the output of said generator three oscillations whose respective frequencies correspond to the formants of the desired speech sound, and means to combine said selected oscillations in an amplitude ratio corresponding to the amplitude ratio of the formants of the desired speech sound.
  • a periodic impulse generator providing harmonically-related sound frequency oscillations having a fundamental frequency, means to control the periodicity of said generator in accordance with the fundamental frequency of a desired speech sound, a plurality of resonant circuits coupled to said generator each circuit being continuously tunable within a distinct subband of the sound frequency spectrum, means to tune three of said resonant circuits to frequencies corresponding to the formants of the desired speech sound, and means to combine the voltages developed in said three resonant circuits in an amplitude ratio corresponding to the amplitude ratio of the formants of the desired speech sound.
  • a periodic impulse generator for producing harmonically-related sound frequency oscillations having a fundamental frequency, means to vary the periodicity of said generator in accordance with said first control voltage, a plurality of resonant circuits each coupled to said generator, each resonant circuit being tunable within a successive subband of the sound frequency spectrum, means to vary the tuning of respective resonant circuits in accordance with said second control voltages, a common output channel, a plurality of adjustable amplifying means coupling each of said resonant circuits to said output channel, and means for adjusting respective amplifying means in accordance with said third control voltages.
  • said speech including sounds having a fundamental frequency and formants, a microphone for converting said speech sounds into corresponding sound frequency oscillations, a plurality of band-pass filters coupled to said microphone for dividing said sound frequency oscillations into frequency subbands, means coupled to said microphone for developing a first control voltage proportional to the value of the fundamental frequency of said sound frequency oscillations, means coupled to each of said band-pass filters for developing second control voltages proportional to the frequencies of the formants lying in the subbands and third control voltages proportional to the amplitudes of the formants lying in the subbands, a periodic impulse generator for producing harmonically-related sound frequency oscillations having a fundamental frequency, means to vary the periodicity of said generator in accordance with said first control voltage, a plurality of resonant circuits each coupled to said generator, each resonant circuit being continuously tunable within a respective subband, means for separately varying the tuning of said resonant circuits in accordance with said second control voltages
  • a speech transmission system comprising an input for speech waves, a plurality of filters for dividing said speech waves into frequency subbands, means responsive to the occurrence of a fundamental frequency in said speech waves for developing a first control voltage proportional to said fundamental frequency, a first generator arranged to be actuated by said first control voltage for producing harmonically-related sound frequency oscillations whose fundamental frequency varies in accordance with said first control voltage, a second generator arranged to be actuated in the absence of said first control voltage for producing a continuous spectrum of sound frequency oscillations, a plurality of resonant circuits each continuously tunable within a respective subband, the inputs of said resonant circuits being coupled to said first and second generators, means coupled to said filters for developing second control voltages proportional to the frequencies of the formants lying in respective subbands and third control voltages proportional to the amplitudes of said formants, a sound reproducer, a plurality of adjustable amplifying means coupling the output of each of said resonant circuits to said reproduce
  • the method of artificial speech production which comprises the steps of generating harmonically-related sound frequency oscillations having a fundamental frequency, controlling the fundamental frequency of said generated oscillations in accordance with the fundamental frequency of a desired speech sound, selecting from said harmonically-related oscillations a plurality of oscillations whose frequencies correspond to the formants of the desired speech sound, and combining said selected oscillations in an amplitude ratio corresponding to the amplitude ratio of the formants in the desired speech sound.
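
The vowel example in the list above (generator G at 128 cycles/sec., resonant circuits R2 to R4 at 640, 1280 and 2688 cycles/sec.) can be checked numerically. The following is only a sketch under stated assumptions: the resonance formula is a generic tuned-circuit response, and the quality factor Q and the amplitude values standing in for I1, I2 and I3 are hypothetical, since the source gives no figures for them.

```python
import math

F0 = 128                       # fundamental of the vowel example, cycles/sec
FORMANTS = [640, 1280, 2688]   # tunings of R2, R3, R4 from the example
AMPLITUDES = [1.0, 0.6, 0.3]   # hypothetical stand-ins for I1, I2, I3
Q = 10.0                       # hypothetical quality factor of each circuit


def resonance(f, fr, q=Q):
    """Relative response of a simple tuned circuit resonant at fr (assumed model)."""
    return 1.0 / math.sqrt(1.0 + (q * (f / fr - fr / f)) ** 2)


def envelope(f):
    """Combined characteristic: each resonance scaled by its amplifier gain."""
    return sum(a * resonance(f, fr) for a, fr in zip(AMPLITUDES, FORMANTS))


harmonics = [h * F0 for h in range(1, 24)]   # 128 ... 2944 cycles/sec
levels = [envelope(f) for f in harmonics]

# Harmonics that stand out above both neighbours mark the formants.
peaks = [
    harmonics[i]
    for i in range(1, len(levels) - 1)
    if levels[i] > levels[i - 1] and levels[i] > levels[i + 1]
]
print(peaks)
```

With these assumed values the harmonics that rise above their neighbours are the 5th, 10th and 21st, that is, the three formant frequencies of the example. Changing `AMPLITUDES` alters the height of the three humps but not their position.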

Description

Jan. 4, 1949. R. VERMEULEN ET AL 2,458,227
DEVICE FOR ARTIFICIALLY GENERATING SPEECH SOUNDS BY ELECTRICAL MEANS
Filed April 18, 1946 3 Sheets-Sheet 1

[Drawing sheets: Figs. 1 to 7]

Patented Jan. 4, 1949
DEVICE FOR ARTIFICIALLY GENERATING SPEECH SOUNDS BY ELECTRICAL MEANS
Roelof Vermeulen and Willem Six, Eindhoven, Netherlands, assignors, by mesne assignments, to Hartford National Bank and Trust Company, Hartford, Conn., as trustee
Application April 18, 1946, Serial No. 662,960
In the Netherlands June 20, 1941
Section 1, Public Law 690, August 8, 1946
Patent expires January 17, 1964
8 Claims.
This invention relates to a device for the synthetic composition of speech and the use of such a device in the electrical transmission of speech. In the case of transmitting speech electrically the invention has for its purpose to effect this transmission in such manner that the required frequency band is much narrower than the frequency band occupied by the speech oscillations.
In a well-known device for transmitting speech of the general type disclosed in the patent to H. W. Dudley (No. 2,151,091, March 21, 1939) definite characteristics of the speech are deduced from the speech oscillations by means of an analysator on the side of the transmitter. These characteristics are expressed in control voltages, which are modulated on carrier waves having a comparatively low frequency and are then transmitted. The characteristics in question are the fundamental frequency of the speech spectrum, if present, and the amplitude of the oscillations in definite parts of the speech frequency spectrum. In the well-known device the speech spectrum is subdivided into ten frequency subbands and in each of these subbands the mean amplitude of the oscillations appearing therein is determined and these characterizing amplitudes are transmitted by means of a control voltage. Consequently altogether eleven control voltages are required, viz. one which is a measure of the instantaneous value of the fundamental frequency, the remaining ten control voltages indicating the amplitude of the oscillations in ten subbands of the speech spectrum.
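
The analysis just described, one value for the fundamental frequency plus ten subband amplitudes, can be sketched on a digital frame of samples. This is a minimal illustration, not the circuit of the cited patent: the sample rate, the ten equal 300-cycle subbands, the 75 to 350 cycles/sec. search range, and the use of autocorrelation and a naive DFT are all assumptions made here. The sketch also makes no voiced/unvoiced decision.

```python
import cmath
import math

FS = 8000          # sample rate in Hz (assumption)
BAND_WIDTH = 300   # ten equal subbands covering 0-3000 Hz (assumption)
N_BANDS = 10


def fundamental(frame, fs=FS, fmin=75, fmax=350):
    """Estimate the fundamental from the strongest autocorrelation lag."""
    best_lag, best_r = 0, float("-inf")
    for lag in range(int(fs / fmax), int(fs / fmin) + 1):
        r = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if r > best_r:
            best_lag, best_r = lag, r
    return fs / best_lag


def band_amplitudes(frame, fs=FS):
    """Mean spectral amplitude in each of the ten subbands (naive DFT)."""
    n = len(frame)
    sums = [0.0] * N_BANDS
    counts = [0] * N_BANDS
    for k in range(n // 2):
        band = int((k * fs / n) // BAND_WIDTH)
        if band >= N_BANDS:
            break
        x_k = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        sums[band] += 2 * abs(x_k) / n   # scale so a unit sine reads as 1.0
        counts[band] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def analyse(frame, fs=FS):
    """Return eleven control values: the fundamental, then ten band amplitudes."""
    return [fundamental(frame, fs)] + band_amplitudes(frame, fs)


# Test frame: a 100 Hz fundamental with two weaker harmonics.
frame = [
    math.sin(2 * math.pi * 100 * t / FS)
    + 0.5 * math.sin(2 * math.pi * 200 * t / FS)
    + 0.25 * math.sin(2 * math.pi * 300 * t / FS)
    for t in range(400)
]
controls = analyse(frame)
print(len(controls), controls[0])
```

The list returned by `analyse` plays the part of the eleven control voltages: its first entry tracks the fundamental and the remaining ten track the energy in each subband.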
On the receiver side the characteristics transmitted by the control oscillations serve to control a device capable of producing artificial speech oscillations. On the receiver side the control voltage, which is a measure of the fundamental frequency of the speech spectrum, controls the oscillation produced by an impulse generator in such manner that the fundamental frequency of the impulses always corresponds to the fundamental frequency of the initial speech spectrum. The impulses containing the fundamental frequency and in addition a large number of higher harmonics are supplied to ten band-pass filters, whose transmission bands respectively correspond to the ten subbands into which is divided the speech spectrum on the transmitter side. The
output circuit of each band-pass filter is connected to an amplifier whose amplification factor is automatically controlled by the action of the control voltage, which is a measure of the mean amplitude of the initial speech oscillations, located in the subband corresponding to the transmission band of the band-pass filter.
Not all of the speech sounds comprise a fundamental oscillation and a number of higher harmonics, as is the case with vowels. Thus, for instance, the hissing sounds consist of a continuous frequency spectrum. In order that such sounds may also be transmitted a source producing a continuous frequency spectrum is provided on the receiver side, which source is automatically connected in place of the impulse generator if no fundamental frequency occurs in the speech vibrations to be transmitted.
In the device referred to above for the transmission of speech the characteristics are deduced from the initial speech on the transmitter side. As an alternative, however, these characteristics may be produced electro-mechanically, for instance by means of a number of keys operated by hand. These keys respectively control, for instance, the amplitude of the oscillations supplied to the ten band-pass filters, the switching on of the impulse generator and its fundamental frequency, or the switching on of the source producing a continuous frequency spectrum. By playing the keys correctly any desired frequency spectrum corresponding to that of the various speech sounds can be composed fully synthetically.
In the well-known device all of the oscillations located in the same subband are reproduced with the same amplitude due to which there is a loss of faithfulness of the speech reproduced. In fact, researches on speech have confirmed the fact that due to resonance phenomena in the cavities of the mouth, throat and nose definite frequencies, the so-called formants, come strongly to the fore in the various speech sounds, which formants essentially determine the character of the speech. Now these formants cannot be fully reproduced by means of the well-known device.
In the device according to the invention this drawback is obviated and, moreover, the advantage is obtained that the number of characteristics required for the synthetical composition of speech, is smaller than with the known device referred to above.
In a device for the artificial production of speech sounds by electrical means, in which according to the character of the speech sound to be produced a continuous spectrum of oscillations or impulses is produced, in which latter case the frequency of the impulses is controlled in accordance with the fundamental frequency of the speech sounds (containing such a fundamental frequency) to be produced, the produced oscillations are, in accordance with the invention, supplied to a number of resonant circuits, whose tuning is respectively controlled in accordance with the formants of the speech sounds to be produced; furthermore the voltages taken from the resonant circuits are combined in an amplitude ratio corresponding to the amplitude ratio of the formants in the speech sounds to be produced.
In the case of a device for synthetically building up speech the switching on of the source producing a continuous spectrum of oscillations or of the impulse generator, the tuning of the resonant circuits and the combination of the voltages set up across the resonant circuits in the correct amplitude ratio are controlled by means of keys operated by hand.
In devices for the transmission of speech by electrical means the control is effected by means of characteristics which are deduced from the speech at the transmitter side. To this end, according to the invention, the speech spectrum to be transmitted is divided into a number of frequency subbands and the frequency and the amplitude of the formant occurring in each of these subbands is transmitted respectively by a control voltage, the control voltages which are a measure of the frequency of the formants respectively controlling the tuning of the resonant circuits on the receiver side, and the amplitude of the voltages taken from these resonant circuits being controlled respectively in accordance with the control voltages that are a measure of the amplitude of the formants.
As has been pointed out, in the device according to the invention the frequency and the amplitude of the oscillation having a maximum amplitude are determined respectively in a number of frequency bands of the speech spectrum and these speech characteristics are transmitted by means of control voltages. Now the question arises in what minimum number of frequency bands the determination of the said characteristics should take place in order to obtain a still intelligible transmission of speech. This may be elucidated by the following explanation of the mechanism for producing speech sounds.
The air compressed in the lungs passes to the exterior along the vocal cords through the cavity of the throat and the mouth, and sometimes also through the cavity of the nose. In pronouncing various sounds the flow of energy of the compressed air is translated into wave energy at definite points where the air current is contracted. This may take place, for instance, in the gap between the vocal cords, in the space between the tongue and the uvula, between the teeth, or between the lips. The produced oscillations are usually relaxation oscillations and consequently contain a large number of harmonics. When this initial relaxation oscillation enters one or more of the resonance chambers constituted by the cavities of the throat, nose and mouth, the latter are excited and caused to resonate, as a result of which definite harmonics of the initial vibration are amplified and others are damped. The frequencies that are most favoured, the so-called formants, differ from sound to sound.
Most consonants, viz. the sonants and all vowels, that are the actual voice carriers, are substantially formed with cooperation of the said three resonant chambers, so that these sounds will essentially comprise three formants. To characterize the latter it is necessary to determine the value and the location of the three resonances, so that it is necessary to divide the speech spectrum into at least three frequency subbands and to determine in each of them at which frequency a resonance maximum occurs and the value of this maximum.
Other consonants, for instance the explosive consonants are characterized by transition phenomena which occur before or after the production of a vowel.
In producing these sounds the current of air is abruptly released and stopped respectively, as a result of which the resonance cavities of the speech member are excited in their natural vibrations and die out respectively. If the said consonants precede a vowel the character of the consonant determines the manner of building up of the relaxation oscillation of the air-current produced by the vocal cords. If the consonant is preceded by the vowel the consonant determines the manner of dying out of the relaxation oscillation.
For the artificial production of these sounds a correct control of the frequency during building up or dying out of the impulse generator electrically imitating the vocal cords comes primarily into question.
Experiments have proved that most of the artificially produced speech sounds are already sufficiently intelligible if they comprise these three formants. Hence the device for the artificial production of the speech sounds according to the invention must comprise at least three resonant circuits whose tuning is controlled. This requires three control voltages. Furthermore three control voltages are necessary for controlling the amplitude of the voltages set up across the resonant circuits, and another characterizing magnitude has to control the switching on of the source supplying a continuous spectrum of oscillations or the switching on of the impulse generator and the frequency of the latter. This consequently requires altogether at least seven control voltages.
Hence in devices for electrically transmitting speech also seven control voltages are sufficient.
The invention will be more fully explained by reference to the accompanying drawings, given by way of example.
Fig. 1 of the drawing is a diagram of a device according to the invention for electrically transmitting speech.
Fig. 2 is a graph illustrating the frequency spectrum of the vowel a,
Fig. 3 represents the circuit arrangement of the analysator A5 in Fig. 1,
Fig. 4 shows the voltage wave forms (Figs. 4a to 4d) occurring in the circuit of Fig. 3,
Fig. 5 shows the circuit diagram of elements A1 to A4 in Fig. 1,
Fig. 6 represents the circuit arrangement of element G and element R in Fig. 1, and
Fig. 7 represents the circuit diagram of elements R1 to R4 in Fig. 1.
The spectrum of oscillations produced on the transmitter side Z by talking into the microphone M is divided into four octaves by means of four band-pass filters F1, F2, F3 and F4. It is assumed that, as is well known from telephone technique, intelligibility of speech requires a frequency range of 200 to 3200 cycles/sec. to be reproduced, so that the transmission bands of the filters F1 to F4 may be chosen as follows:
F1 200 to 400 cycles/sec.
F2 400 to 800 cycles/sec.
F3 800 to 1600 cycles/sec.
F4 1600 to 3200 cycles/sec.
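The octave subdivision listed above follows from repeatedly doubling the lower band edge. A minimal sketch in Python, in which the function name `octave_bands` is illustrative and does not come from the patent:

```python
def octave_bands(f_low, f_high):
    """Split the range f_low..f_high into successive octaves (upper edge = 2 x lower edge)."""
    bands = []
    lo = f_low
    while lo * 2 <= f_high:
        bands.append((lo, lo * 2))
        lo *= 2
    return bands

# Speech range assumed in the text: 200 to 3200 cycles/sec.
for i, (lo, hi) in enumerate(octave_bands(200, 3200), start=1):
    print(f"F{i}: {lo} to {hi} cycles/sec.")
```

Four doublings of 200 reach 3200 exactly, which is why four filters cover the assumed speech range.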
The output circuits of the filters are connected to analysators A1, A2, A3 and A4, which determine in each of the four frequency bands the frequency of the oscillation having maximum amplitude and the value of this amplitude. Each analysator supplies two control voltages, one of which is proportional to the frequency and the other to the amplitude of the oscillation having the maximum amplitude which appears in one frequency subband. Furthermore an analysator A5 is provided on the transmitter side which determines whether a fundamental frequency is available in the oscillations produced by the microphone and the value of this frequency. This frequency is also defined by a control voltage.
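The task of the analysators A1 to A4 can be illustrated digitally: for each subband, pick the component of largest amplitude and report its frequency and amplitude. The sketch below is only an analogue of the idea, not the circuit of the patent; the line spectrum is hypothetical, and treating the lower band edge as inclusive is an assumption.

```python
BANDS = [(200, 400), (400, 800), (800, 1600), (1600, 3200)]  # filters F1 to F4

def analyse(components, bands=BANDS):
    """For each band return (frequency, amplitude) of the strongest component,
    or None if no component falls in the band."""
    result = []
    for lo, hi in bands:
        in_band = [(f, a) for f, a in components if lo <= f < hi]
        result.append(max(in_band, key=lambda fa: fa[1]) if in_band else None)
    return result

# Hypothetical line spectrum: (frequency in cycles/sec., relative amplitude).
components = [(256, 0.2), (384, 0.3), (512, 0.6), (640, 1.0), (768, 0.5),
              (1152, 0.4), (1280, 0.7), (1408, 0.3), (2560, 0.2), (2688, 0.5)]
print(analyse(components))
```

Each returned pair corresponds to the two control voltages supplied by one analysator.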
The nine control voltages thus obtained are the speech characteristics which are transmitted through lines L1 to L9 to the receiver side O and control the device provided on the receiver side for the artificial production of speech. The last-mentioned device comprises an impulse generator G which produces impulses consisting of a fundamental frequency with a large number of higher harmonics, and a generator R producing a continuous spectrum of electrical oscillations. The control voltage, indicating whether or not a fundamental frequency is available in the oscillations produced by the microphone M and also indicating its frequency, is transmitted through the line L9 and controls the generators G and R. This control is such that on receiving a control voltage through the line L9 the generator G is put into circuit and the generator R is cut out. In this case the control voltage controls at the same time the fundamental frequency of the impulses produced by the generator G, so that on the receiver side this frequency corresponds to the fundamental frequency of the oscillations produced by the microphone M.
When no control voltage is obtained through the line L9 the generator R is automatically put into circuit.
The oscillations produced by the generators G or R are supplied to four tunable resonant circuits R1, R2, R3 and R4. The tuning of these resonant circuits is controlled by the control voltages that are a measure of the frequency of the oscillation having a maximum amplitude which appears in each of the four different frequency subbands into which the frequency spectrum on the transmitter side is divided by the filters F1 to F4. The voltages set up across the resonant circuits R1 to R4 are supplied to amplifiers V1 to V4 respectively, which comprise means for automatically controlling the amplification factor. This control takes place in accordance with the control voltages that are a measure of the amplitude of the oscillation having a maximum amplitude in the four frequency bands.
The output circuits of the amplifiers V1 to V4 are connected in parallel with each other. A telephone T is connected to the common output circuit.
To make the operation of the device referred to better understood it is assumed that the vowel a is to be transmitted, whose frequency spectrum is represented in Fig. 2. The fundamental frequency amounts to 128 cycles/sec., so that the spectrum consists of a number of higher harmonics of 128 cycles/sec., whose amplitudes are related as the lengths of the vertical lines in Fig. 2. The ends of these lines lie on an enveloping curve which may be imagined to be composed of the three dotted curves which have the shape of resonance curves. After what has been said above about the formation of vowels these curves may be conceived to represent the resonance curves of the resonance chambers constituted by the cavity of the mouth, nose and throat and determining the formants. In the spectrum of the vowel a these formants are located at 640, 1280 and 2688 cycles/sec. in Fig. 2, i. e. the 5th, 10th and 21st harmonic of the fundamental frequency of 128 cycles/sec. produced by the vocal cords.
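The relation between the quoted formant frequencies and the fundamental can be checked by dividing each frequency by 128 cycles/sec. (1280/128 = 10, for example), and the octave in which each formant falls indicates which resonant circuit has to reproduce it. A short sketch; the helper names are illustrative only:

```python
F0 = 128                      # fundamental frequency in cycles/sec. (from the text)
FORMANTS = [640, 1280, 2688]  # formant frequencies quoted for Fig. 2
BANDS = {"R1": (200, 400), "R2": (400, 800), "R3": (800, 1600), "R4": (1600, 3200)}

def harmonic_number(freq, f0=F0):
    """Return n such that freq == n * f0; raise if freq is not a harmonic of f0."""
    n, remainder = divmod(freq, f0)
    if remainder != 0:
        raise ValueError(f"{freq} is not a harmonic of {f0}")
    return n

def circuit_for(freq):
    """Name of the resonant circuit whose octave contains freq (lower edge inclusive)."""
    for name, (lo, hi) in BANDS.items():
        if lo <= freq < hi:
            return name
    return None

for f in FORMANTS:
    print(f, harmonic_number(f), circuit_for(f))
```

With these figures no formant falls in the lowest octave, so the circuit R1 is not needed for this vowel.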
For the artificial production of the frequency spectrum represented in Fig. 2 on the receiver side O in Fig. 1 the frequency of the relaxation generator G must be adjusted at 128 cycles/sec. and the resonant circuits R2, R3 and R4 at the frequencies of 640, 1280 and 2688 cycles/sec. respectively, the amplification of the amplifiers V2, V3 and V4 being controlled in accordance with the amplitudes I1, I2 and I3 of the formants in the spectrum shown in Fig. 2. From the combination of the characteristic curves of the resonant circuits R2 to R4 and the amplifiers V2 to V4 then results a frequency characteristic of the receiving device O which approximately corresponds to the enveloping curve of the vertical lines in Fig. 2.
When the impulses produced by the generator G and having a fundamental frequency of 128 cycles/sec. are supplied to this device there occurs in the common output circuit of the amplifiers V1 to V4 a frequency spectrum which approximately corresponds to the initial frequency spectrum according to Fig. 2. This frequency spectrum is reproduced as the speech sound a by the telephone T.
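The shaping of the harmonic spectrum by tuned circuits and amplifiers can be sketched numerically. The example below assumes a standard single-tuned resonance curve with an arbitrary quality factor `q`, adds the magnitudes of the three branches while ignoring phase, and uses hypothetical amplifier gains in place of the amplitudes I1 to I3, whose values the text does not give.

```python
import math

def resonance_gain(f, f_res, q):
    """Magnitude response of a simple tuned circuit, normalised to 1 at f_res."""
    detune = f / f_res - f_res / f
    return 1.0 / math.sqrt(1.0 + (q * detune) ** 2)

def synthesise_spectrum(f0, formants, q=8.0, f_max=3200):
    """Amplitude of each harmonic of f0 after the tuned circuits and amplifiers.

    formants: list of (resonant frequency, amplifier gain) pairs (gains are assumed).
    Branch magnitudes are simply added; phase is ignored in this sketch.
    """
    spectrum = []
    n = 1
    while n * f0 <= f_max:
        f = n * f0
        amp = sum(gain * resonance_gain(f, f_res, q) for f_res, gain in formants)
        spectrum.append((f, amp))
        n += 1
    return spectrum

spectrum = synthesise_spectrum(128, [(640, 1.0), (1280, 0.7), (2688, 0.5)])
for f, amp in spectrum:
    print(f, round(amp, 3))
```

The printed amplitudes peak at the harmonics nearest the three tuned frequencies, which is the envelope behaviour described for Fig. 2.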
For the sake of simplicity it has been assumed in Fig. 1 that the speech characteristics are transmitted in the form of control voltages through separate lines L1 to L9 from the transmitter side to the receiver side. In practice these control voltages are preferably modulated on carrier waves, e. g. in the manner known from multiplex carrier wave telegraphy. For the transmission of the control voltages modulated on carrier waves a bandwidth of 25 cycles/sec. is sufficient, so that in the system as shown in Fig. 1, in which nine control voltages must be transmitted, a total bandwidth of 9 × 25 = 225 cycles/sec. is required, provided that only one side band of each carrier wave is transmitted.
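The bandwidth figure follows from simple counting: each subband contributes one frequency and one amplitude control voltage, and one further voltage carries the fundamental frequency. A small sketch of this arithmetic, with illustrative function names:

```python
def control_voltage_count(n_subbands):
    """Two control voltages per subband (frequency, amplitude) plus one for the fundamental."""
    return 2 * n_subbands + 1

def total_bandwidth(n_subbands, per_channel=25):
    """Total bandwidth in cycles/sec., one side band of per_channel width per control voltage."""
    return control_voltage_count(n_subbands) * per_channel

print(control_voltage_count(4), total_bandwidth(4))  # four subbands, as in Fig. 1
print(control_voltage_count(3), total_bandwidth(3))  # the minimum of three subbands
```

With three subbands the same count gives the seven control voltages stated earlier as the minimum.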
After the above explanation of the diagram shown in Fig. 1 embodying the idea underlying the invention the parts thereof will be more fully described hereinafter.
Analysator for the fundamental frequency A5
Fig. 3 represents the circuit arrangement of the analysator A5 of Fig. 1 for determining the fundamental frequency. The speech spectrum to be analysed and originating from the microphone M in Fig. 1 is supplied to the input terminals 1 and 2 of the analysator in Fig. 3. A filter 3 having a transmission band of 50 to 400 cycles/sec. is connected to the input terminals. In fact, researches on speech have proved that the fundamental frequency of sounds possessing such a fundamental frequency lies between 75 and 350 cycles/sec. and varies with the pitch of the voice of the various persons. In the case that a fundamental frequency is present this frequency will consequently fall within the transmission band of the filter 3.
It is desirable that the low frequencies below 50 cycles/sec. should be cut off by the filter to avoid wrong response of the analysator at low frequencies ensuing as a result of the pauses between succeeding syllables and words.
Owing to the comparatively great width of the transmission band of the filter 3 it will in certain cases transmit the fundamental frequency and in addition harmonics of this oscillation; however, the amplitude of the fundamental frequency always dominates to a greater or less degree. This amplitude difference is utilised for completely sieving out the fundamental frequency. The output circuit of the filter 3 is connected to the control grid circuit of an amplifying tube 4, in which circuit is inserted a resistance 6 shunted by a condenser 5, which are so proportioned that the voltage drop across the resistance 6 caused by the control-grid current flowing in the tube 4 adjusts the grid bias to such value that grid current only flows at a voltage peak of the oscillations supplied through the filter 3. When using a tube having a characteristic which is strongly curved in the surroundings of the point where the grid voltage is zero it is achieved that the oscillation having the largest amplitude, i. e. the fundamental frequency, is more amplified than any harmonics of the fundamental frequency that may be available. The amplified oscillations appear across a resistance 7 in the cathode lead of the tube 4.
In practice it is advisable that a number of stages connected similarly to the tube 4 should be used in cascade to render the amplitude difference between the fundamental frequency and the available harmonics so large as to produce across the resistance 7 of the last stage practically only an oscillation having the fundamental frequency. This oscillation is supplied to a limiter tube 8 which transforms the sine-shaped fundamental frequency shown in Fig. 4a into an oscillation having an approximately rectangular shape as shown in Fig. 4b. By means of a differentiating transformer 9 the oscillation shown in Fig. 4b is translated into an impulse-shaped voltage represented in Fig. 4c which is supplied to the control grid of a relay tube 10. Between the anode and the cathode of this tube is connected a condenser 11 which is charged through a resistance 12 by a source of direct voltage, as a result of which the anode of the relay tube 10 acquires a positive voltage relatively to the cathode. The circuit is adjusted in such manner that whenever the voltage across the secondary of the transformer 9 supplies a positive impulse to the control grid of the relay tube the latter is ignited and the condenser 11 is abruptly discharged, whereupon the discharge through the relay tube is extinguished and the condenser is charged again through the resistance 12. The current impulses produced at successive ignitions produce periodic voltage impulses across a resistance 13, as represented in Fig. 4d, whose frequency corresponds to the fundamental frequency. The direct voltage component of these impulses, which is a measure of the fundamental frequency, is sieved out by a filter consisting of a resistance 14 and a condenser 15 and is supplied to a pair of terminals 16, 17.
In this manner we obtain the control voltage which is a measure of the fundamental frequency of the speech to be transmitted. This control voltage is transmitted to the receiver side through the line L9 of Fig. 1 which is connected to the terminals 16 and 17 shown in Fig. 3.
In many cases the fundamental frequency is not at all present in the speech sounds or is very little pronounced; however, the speech sounds will then contain a number of higher harmonics. In such cases the fundamental frequency corresponds to the periodicity of the sound.
In the analysator this oscillation is distorted, because the large amplitudes are amplified to a higher degree than the small amplitudes. At the same time limiter action takes place by grid current so that the largest amplitudes are amplified to a definite value. This takes place in the tube 4 and any preceding stages connected in the same way. In the last stage the amplitude difference is so large that solely the largest amplitude is transmitted.
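The principle of the analysator A5, one standard impulse per cycle whose mean (direct voltage) value is proportional to the frequency, can be imitated on sampled data. The sketch below replaces limiter, differentiating transformer and relay tube by a count of rising zero crossings; the sample rate, the starting phase and the unit charge per impulse are assumptions of the example.

```python
import math

def tone(freq, sample_rate, seconds, phase=-0.1):
    """Sampled sine wave standing in for the filtered fundamental oscillation."""
    n = int(sample_rate * seconds)
    return [math.sin(2 * math.pi * freq * k / sample_rate + phase) for k in range(n)]

def pulse_rate_voltage(samples, sample_rate, pulse_charge=1.0):
    """Mean value of a train of equal impulses, one per rising zero crossing.

    Each impulse contributes the same charge, so the mean is proportional
    to the number of cycles per second.
    """
    pulses = sum(1 for prev, cur in zip(samples, samples[1:]) if prev < 0 <= cur)
    duration = len(samples) / sample_rate
    return pulse_charge * pulses / duration

sr = 8000
print(pulse_rate_voltage(tone(128, sr, 1.0), sr))
print(pulse_rate_voltage(tone(256, sr, 1.0), sr))
```

Doubling the input frequency doubles the mean value, which is what makes the smoothed voltage usable as a measure of the fundamental frequency.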
Analysators A1 to A4
Fig. 5 shows the circuit diagram of the analysators A1 to A4, each of which comprises a device for determining the frequency and a device for determining the amplitude of the oscillation having a maximum amplitude in one of the four frequency subbands into which the speech to be transmitted is divided by the filters F1 to F4. This frequency subband is supplied to the input terminals 18 and 19. The circuit and operation of the first-mentioned device substantially corresponds to that of the device for determining the fundamental frequency shown in Fig. 3, so that we refer to this figure for a further description. It is to be noted, however, that the control voltage which is a measure of the frequency of the oscillation having the maximum amplitude is set up between the terminals 16 and 17. In Fig. 5 corresponding parts of the circuit bear the same reference numerals as in Fig. 3.
Between the tubes 4 and 8 is provided a transformer 20 whose secondary is connected to the device for determining the amplitude of the oscillation having a maximum amplitude. This device comprises a rectifier 21 which is connected in series with a resistance 22 shunted by a condenser 23. The circuit acts as a peak detector, so that a rectified voltage is set up across the resistance, which voltage is proportional to the amplitude of the oscillation having a maximum amplitude. After having been smoothed by a filter consisting of a resistance 24 and a condenser 25, the rectified voltage across the resistance 22 is supplied to a pair of output terminals 26, 27.
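The behaviour of such a peak detector can be sketched with an idealised model: the condenser follows the input whenever the input exceeds the stored voltage and otherwise discharges exponentially through the resistance. The ideal rectifier (no forward drop), the time constant and the test tone are assumptions of this example, not values from the patent.

```python
import math

def peak_detect(samples, sample_rate, time_constant):
    """Ideal rectifier charging a condenser that leaks through a resistance.

    time_constant is the product of resistance and capacitance, in seconds.
    """
    decay = math.exp(-1.0 / (sample_rate * time_constant))
    out = 0.0
    trace = []
    for x in samples:
        out = max(x, out * decay)  # charge instantly on peaks, leak in between
        trace.append(out)
    return trace

sr = 64000
signal = [0.7 * math.sin(2 * math.pi * 640 * n / sr) for n in range(6400)]
trace = peak_detect(signal, sr, time_constant=0.05)
print(round(trace[-1], 3))
```

The output stays close to the amplitude of the oscillation, sagging only slightly between successive peaks; a longer time constant reduces the ripple at the cost of a slower response.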
Impulse generator and generator for producing a continuous spectrum of oscillations
Fig. 6 represents the circuit of the impulse generator G and the circuit of the generator R for producing a continuous spectrum of oscillations. The impulse generator comprises a discharge tube 100 whose grids 101 and 102 are coupled together through a condenser 103. A positive voltage is supplied to the grid 102 through a resistance 104 and the control voltage transmitted through the line L9 in Fig. 1 is supplied to the grid 101 through a resistance 105. This control voltage appears between the pair of terminals 106, 107. A condenser 108 is connected between the grid 101 and earth. The circuit of this impulse generator is well known, so that further explanation is superfluous.
The frequency of the impulses produced by the generator G is controlled by the control voltage between the terminals 106 and 107. The so controlled impulses are supplied to the grid circuit of a discharge tube 109. The frequency control just mentioned is effected in such manner that the frequency always corresponds to the fundamental frequency of the initial speech spectrum received by the microphone M in Fig. 1. If the speech spectrum does not contain a fundamental frequency, in which case there is no control voltage between the terminals 106 and 107, the generator G is blocked and no impulses are produced.
The generator R for producing a continuous spectrum of oscillations consists of a resistance 110 in the input circuit of an amplifier 111. Owing to the thermal agitation of the electrons in the resistance material a voltage appears across the resistance 110 which, as is well known, consists of a continuous spectrum of oscillations. This continuous spectrum of oscillations is amplified by the amplifier 111 and supplied to an attenuation network 112 whose attenuation is controlled by the voltage drop across resistances 113 and 114 in the anode circuit of a discharge tube 115. The control grid of this tube is supplied with a control voltage set up between the terminals 106 and 107. The circuit of the attenuation network 112 and of the tube 115 entirely corresponds to that of the network 212 to 216 and the tube 221, which will be more fully described hereinafter in connection with Fig. 7. The operation of the attenuation network 112 together with the tube 115 is such that in the absence of a control voltage between the terminals 106 and 107 the attenuation network transmits the voltage set up in the output circuit of the amplifier 111 to the grid circuit of an amplifying tube 116. This transmission is blocked as soon as a control voltage is set up between the terminals 106 and 107.
Summarising, it appears from what has been said above that in the presence of a control voltage between the terminals 106 and 107 the impulse generator G supplies impulses to the tube 109, the frequency of these impulses corresponding to the fundamental frequency of the speech sounds to be transmitted. No transmission of the continuous spectrum of oscillations set up across the resistance 110 then takes place to the tube 116. In the absence of the said control voltage, however, the tube 109 does not receive impulses and the generator producing a continuous spectrum of oscillations comes into action and this spectrum is supplied to the tube 116. Consequently, across a resistance 117 inserted in the common anode circuit of the tubes 109 and 116, there occur either voltage impulses or a voltage composed of a continuous spectrum of oscillations according as to whether the speech sounds to be transmitted contain a fundamental frequency or not.
The terminals 118 and 119 of the resistance 117 are connected to the input terminals of the tunable resonant circuits.
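The switching between the two sources can be summarised as a single rule: a control voltage selects impulses at the corresponding frequency, and its absence selects the continuous spectrum. The sketch below is a digital stand-in only; the scale `hz_per_volt`, the sample rate and the use of Gaussian pseudo-random numbers in place of thermal noise are all assumptions.

```python
import random

def excitation(control_voltage, n_samples, sample_rate=12800, hz_per_volt=64.0, seed=0):
    """Return an impulse train when a control voltage is present, otherwise noise.

    hz_per_volt is a hypothetical scale linking control voltage to impulse frequency.
    """
    if control_voltage > 0:
        f0 = control_voltage * hz_per_volt
        period = max(1, round(sample_rate / f0))  # samples between impulses
        return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]
    rng = random.Random(seed)  # stands in for the thermal noise of the resistance
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

voiced = excitation(2.0, 12800)    # 2.0 x 64 = 128 impulses per second
unvoiced = excitation(0.0, 12800)  # no control voltage: continuous spectrum
print(sum(voiced), len(unvoiced))
```

Either list would then be applied to the tuned circuits, in the same way as the voltage across the common anode resistance.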
Tunable resonant circuits R1 to R4
All of the tunable resonant circuits R1 to R4 are connected in the same way, so that it is sufficient to describe one of them, for instance R1. The circuit is represented in Fig. 7. The tunable resonant circuit is constituted by a condenser 201 in parallel with the input impedance Z of a discharge tube 202 which is connected as a reactance tube. The anode circuit of this tube includes a resistance 203, and between the anode and the grid an inductance coil 204 is connected. The input impedance Z of this circuit is given by the expression $Z = \frac{j\omega L}{1 + SR}$, where L represents the inductance of the coil 204, R the resistance 203 and S the mutual conductance of the tube 202. It follows from this expression that Z represents an inductive reactance, the value of the inductance amounting to $\frac{L}{1 + SR}$, and therefore depends on the mutual conductance S. This mutual conductance is controlled by the control voltage, which is a measure of the frequency of the formant in the frequency band 200 to 400 cycles/second and is transmitted by the line L2 in Fig. 1 to the terminals 205 and 206 in Fig. 7. The resonant circuit constituted by the condenser 201 and the input impedance Z is so controlled that the resonant frequency corresponds to the frequency of the formant in the frequency band of 200 to 400 cycles/sec. The tuned resonant circuit is connected through a transformer 207 to a pair of terminals 208, 209 to which are supplied the oscillations produced by the source G or R in Fig. 1. The voltage set up across the resonant circuit is also supplied through a third transformer winding to an amplifier V1 whose amplification can be controlled. The amplification is controlled by the control voltage, which is transmitted by the line L1 in Fig. 1 to the terminals 210 and 211 in Fig. 7, and which effects the attenuation of an attenuation network inserted in the input circuit of the amplifier V1.
The network consists of four non-linear resistances (for instance dry rectifiers) 212, 213, 214 and 215. Between the junction point 216 of the resistances 214 and 215 and a centre tapping 217 of the primary winding of a transformer 218 are connected two resistances 219 and 220 which are at the same time inserted in the anode circuit of an amplifying tube 221. The control grid and the cathode of this tube are connected to the terminals 210 and 211 respectively. The circuit is adjusted in such manner that no voltage occurs between the points 216 and 217 at an average value of the control voltage supplied to the terminals 210 and 211. With an increase in control voltage between the terminals 210 and 211 the point 217 becomes more negative, as a result of which the resistance of the non-linear resistances 212 and 213 increases and the resistance of the non-linear resistances 214 and 215 decreases. If, in contradistinction thereto, the control voltage between the terminals 210 and 211 falls below the average value, the point 217 becomes more positive and the resistances 212 and 213 decrease, whereas the resistances 214 and 215 increase. In the first-mentioned case the attenuation network effects an increased attenuation and in the second case a decreased attenuation.
By the action of the control of the attenuation network by the control voltage the voltage set up across the resonant circuit constituted by the condenser 201 and the impedance Z will appear with such amplification in the output circuit 222, 223 of the amplifier V1 as corresponds to the amplitude of the formant in the frequency band 200 to 400 cycles/second.
The control range of the resonant circuit is so chosen that only one formant falls within this range.
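The tuning action of the reactance tube can be worked through numerically. With an effective inductance L/(1 + SR) in parallel with the condenser, the resonant frequency is 1/(2π√(L_eff·C)), so it rises with the square root of 1 + SR. The component values below are hypothetical and chosen only to give round numbers; the patent states none.

```python
import math

def resonant_frequency(L, C, S, R):
    """Resonant frequency of the condenser C with the effective inductance L / (1 + S*R)."""
    L_eff = L / (1.0 + S * R)
    return 1.0 / (2.0 * math.pi * math.sqrt(L_eff * C))

def required_transconductance(f_target, L, C, R):
    """Mutual conductance S that tunes the circuit to f_target."""
    f_base = 1.0 / (2.0 * math.pi * math.sqrt(L * C))  # frequency at S = 0
    if f_target < f_base:
        raise ValueError("target lies below the frequency obtained with S = 0")
    return ((f_target / f_base) ** 2 - 1.0) / R

# Hypothetical component values: 10 H, 0.1 microfarad, 10 kilohms.
L, C, R = 10.0, 0.1e-6, 10e3
for f in (200.0, 300.0, 400.0):
    S = required_transconductance(f, L, C, R)
    print(f, S, resonant_frequency(L, C, S, R))
```

Since the frequency varies as the square root of 1 + SR, covering one octave requires that quantity to change by a factor of four.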
The device on the receiver side in Fig. 1 may also be used independently for the artificial production of speech. In this case the required control voltages may be derived from a number of sources of potential shunted by potentiometers. These potentiometers may be adjusted by means of keys. The playing of the keys requires some exercise but a trained person will be able to produce speech sounds, words and sentences by means of this device.
The device according to the invention used in telephony yields a material saving in the frequency band required for the transmission of a conversation. In the usual methods of transmitting telephony a frequency band of about 3000 cycles/second is required, but in the example given above a bandwidth of only 225 cycles/second is necessary, so that in the bandwidth of 3000 cycles/second of the usual telephony systems ten conversations may be accommodated when making use of the invention.
The device according to the invention may also be used with advantage in microphone-loudspeaker installations, in which special attention had to be paid hitherto to the avoidance of acoustic feedback.
When using a device according to the invention in such installations there is no risk of such feedback, since the speech oscillations re-- frequencies correspond to the formants of the desired speech sound, and means to combine said selected oscillations in an amplitude ratio corresponding to the amplitude ratio of the formants of the desired speech sound.
2. In an electrical system for producing artificial speech sounds, a periodic impulse generator providing harmonically-related sound frequency oscillations having a fundamental frequency, means to control the periodicity of said generator in accordance with the fundamental frequency of a desired speech sound, means to select from the output of said generator three oscillations whose respective frequencies correspond to the formants of the desired speech sound, and means to combine said selected oscillations in an amplitude ratio corresponding to the amplitude ratio of the formants of the desired speech sound.
3. In an electrical system for producing artificial speech sounds, a periodic impulse generator providing harmonically-related sound frequency oscillations having a fundamental frequency, means to control the periodicity of said generator in accordance with the fundamental frequency of a desired speech sound, a plurality of resonant circuits coupled to said generator, each circuit being continuously tunable within a distinct subband of the sound frequency spectrum, means to tune three of said resonant circuits to frequencies corresponding to the formants of the desired speech sound, and means to combine the voltages developed in said three resonant circuits in an amplitude ratio corresponding to the amplitude ratio of the formants of the desired speech sound.
4. In an electrical system for producing artificial speech sounds having a fundamental frequency and formants, means to develop a first control voltage proportional to the fundamental frequency of a desired speech sound, means to develop second control voltages proportional to the respective frequencies of the formants of the desired speech sounds, means to develop third control voltages proportional to the respective amplitudes of said formants, a periodic impulse generator for producing harmonically-related sound frequency oscillations having a fundamental frequency, means to vary the periodicity of said generator in accordance with said first control voltage, a plurality of resonant circuits each coupled to said generator, each resonant circuit being tunable within a successive subband of the sound frequency spectrum, means to vary the tuning of respective resonant circuits in accordance with said second control voltages, a common output channel, a plurality of adjustable amplifying means coupling each of said resonant circuits to said output channel, and means for adjusting respective amplifying means in accordance with said third control voltages.
5. In an electrical speech transmission system, said speech including sounds having a fundamental frequency and formants, a microphone for converting said speech sounds into corresponding sound frequency oscillations, a plurality of band-pass filters coupled to said microphone for dividing said sound frequency oscillations into frequency subbands, means coupled to said microphone for developing a first control voltage proportional to the value of the fundamental frequency of said sound frequency oscillations, means coupled to each of said band-pass filters for developing second control voltages proportional to the frequencies of the formants lying in the subbands and third control voltages proportional to the amplitudes of the formants lying in the subbands, a periodic impulse generator for producing harmonically-related sound frequency oscillations having a fundamental frequency, means to vary the periodicity of said generator in accordance with said first control voltage, a plurality of resonant circuits each coupled to said generator, each resonant circuit being continuously tunable within a respective subband, means for separately varying the tuning of said resonant circuits in accordance with said second control voltages, a sound reproducer, a plurality of adjustable amplifier means coupling each of said resonant circuits to said reproducer, and means to adjust separately said amplifier means in accordance with said third control voltages.
6. A speech transmission system comprising an input for speech waves, a plurality of filters for dividing said speech waves into frequency subbands, means responsive to the occurrence of a fundamental frequency in said speech waves for developing a first control voltage proportional to said fundamental frequency, a first generator arranged to be actuated by said first control voltage for producing harmonically-related sound frequency oscillations whose fundamental frequency varies in accordance with said first control voltage, a second generator arranged to be actuated in the absence of said first control voltage for producing a continuous spectrum of sound frequency oscillations, a plurality of resonant circuits each continuously tunable within a respective subband, the inputs of said resonant circuits being coupled to said first and second generators, means coupled to said filters for developing second control voltages proportional to the frequencies of the formants lying in respective subbands and third control voltages proportional to the amplitudes of said formants, a sound reproducer, a plurality of adjustable amplifying means coupling the output of each of said resonant circuits to said reproducer, means to vary separately the tuning of said resonant circuits in accordance with said second control voltages, and means to adjust separately said amplifying means in accordance with said third control voltages.
7. The method of artificial speech production which comprises the steps of generating harmonically-related sound frequency oscillations having a fundamental frequency, controlling the fundamental frequency of said generated oscillations in accordance with the fundamental frequency of a desired speech sound, selecting from said harmonically-related oscillations a plurality of oscillations whose frequencies correspond to the formants of the desired speech sound, and combining said selected oscillations in an amplitude ratio corresponding to the amplitude ratio of the formants in the desired speech sound.
8. The method of artificial speech production, said speech consisting alternately of first sounds having a continuous frequency spectrum and second sounds having a fundamental frequency and formants, said method comprising the steps of alternatingly generating a continuous spectrum of oscillations to simulate said first sounds and generating harmonically-related sound frequency oscillations having a fundamental frequency, controlling the fundamental frequency of said harmonically-related sound frequency oscillations in accordance with the fundamental frequency of said second sounds, selecting from said harmonically-related sound frequency oscillations a plurality of oscillations whose frequencies correspond to the formants of said second sounds and combining said selected oscillations in an amplitude ratio corresponding to the amplitude ratio of the formants to simulate said second sounds.
ROELOF VERMEULEN.
WILLEM SIX.
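The synthesis chain recited in the claims (a periodic impulse generator or a continuous-spectrum source, resonant circuits tuned to the formants, and a mix in a set amplitude ratio) can be illustrated with a minimal digital sketch. Everything specific in it is an assumption and not part of the patent: the two-pole digital resonator stands in for a tuned circuit, and the sample rate, resonator bandwidth, formant frequencies, amplitude ratios, and all function names are hypothetical values chosen only for illustration.

```python
import math
import random


def impulse_train(f0_hz, n_samples, sample_rate):
    """Periodic impulses: a source containing the harmonics of f0_hz."""
    period = max(1, round(sample_rate / f0_hz))
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]


def noise(n_samples, seed=0):
    """Continuous-spectrum source for sounds that have no fundamental."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


def resonator(signal, centre_hz, bandwidth_hz, sample_rate):
    """Two-pole digital resonator (assumed stand-in for one tuned circuit)."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius, below 1
    a1 = 2.0 * r * math.cos(2.0 * math.pi * centre_hz / sample_rate)
    a2 = -r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize(f0_hz, formants_hz, amplitudes,
               n_samples=800, sample_rate=8000, bandwidth_hz=80.0):
    """Excite one resonator per formant and mix the outputs in the given ratio.

    Passing f0_hz=None selects the noise source instead of the impulse train.
    """
    if f0_hz is None:
        source = noise(n_samples)
    else:
        source = impulse_train(f0_hz, n_samples, sample_rate)
    bands = [resonator(source, f, bandwidth_hz, sample_rate) for f in formants_hz]
    return [sum(a * band[i] for a, band in zip(amplitudes, bands))
            for i in range(n_samples)]


# Hypothetical vowel-like settings: three formants and their amplitude ratio.
voiced = synthesize(100, [700, 1100, 2400], [1.0, 0.5, 0.25])
unvoiced = synthesize(None, [700, 1100, 2400], [1.0, 0.5, 0.25])
```

In this sketch the three arguments of `synthesize` play the role of the first, second and third control voltages: the fundamental frequency sets the impulse period, the formant frequencies tune the resonators, and the amplitudes set the gain of each branch before the common output.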
REFERENCES CITED
The following references are of record in the file of this patent:
UNITED STATES PATENTS
Number Name Date
2,151,091 Dudley Mar. 21, 1939
2,183,248 Riesz Dec. 12, 1939
2,243,089 Dudley May 27, 1941
2,243,525 Dudley May 27, 1941
2,243,526 Dudley May 27, 1941
2,243,527 Dudley May 27, 1941
US662960A 1941-06-20 1946-04-18 Device for artificially generating speech sounds by electrical means Expired - Lifetime US2458227A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
NL2458227X 1941-06-20

Publications (1)

Publication Number Publication Date
US2458227A (en) 1949-01-04

Family

ID=19874270

Family Applications (1)

Application Number Title Priority Date Filing Date
US662960A Expired - Lifetime US2458227A (en) 1941-06-20 1946-04-18 Device for artificially generating speech sounds by electrical means

Country Status (1)

Country Link
US (1) US2458227A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US2635146A (en) * 1949-12-15 1953-04-14 Bell Telephone Labor Inc Speech analyzing and synthesizing communication system
US2817707A (en) * 1954-05-07 1957-12-24 Bell Telephone Labor Inc Synthesis of complex waves
US2817711A (en) * 1954-05-10 1957-12-24 Bell Telephone Labor Inc Band compression system
US2866001A (en) * 1957-03-05 1958-12-23 Caldwell P Smith Automatic voice equalizer
US2883465A (en) * 1953-12-17 1959-04-21 Vilbig Friedrich Frequency band transformer
US2891111A (en) * 1957-04-12 1959-06-16 Flanagan James Loton Speech analysis
US2938079A (en) * 1957-01-29 1960-05-24 James L Flanagan Spectrum segmentation system for the automatic extraction of formant frequencies from human speech
US2990453A (en) * 1955-12-06 1961-06-27 James L Flanagan Automatic spectrum analyzer
US3078345A (en) * 1958-07-31 1963-02-19 Melpar Inc Speech compression systems
US3222507A (en) * 1958-07-31 1965-12-07 Melpar Inc Speech compression systems
US3322898A (en) * 1963-05-16 1967-05-30 Meguer V Kalfaian Means for interpreting complex information such as phonetic sounds

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US2151091A (en) * 1935-10-30 1939-03-21 Bell Telephone Labor Inc Signal transmission
US2183248A (en) * 1939-12-12 Wave translation
US2243526A (en) * 1940-03-16 1941-05-27 Bell Telephone Labor Inc Production of artificial speech
US2243527A (en) * 1940-03-16 1941-05-27 Bell Telephone Labor Inc Production of artificial speech
US2243525A (en) * 1940-03-16 1941-05-27 Bell Telephone Labor Inc Production of artificial speech
US2243089A (en) * 1939-05-13 1941-05-27 Bell Telephone Labor Inc System for the artificial production of vocal or other sounds

Similar Documents

Publication Publication Date Title
US2151091A (en) Signal transmission
US2183248A (en) Wave translation
US2458227A (en) Device for artificially generating speech sounds by electrical means
US2635146A (en) Speech analyzing and synthesizing communication system
US4611342A (en) Digital voice compression having a digitally controlled AGC circuit and means for including the true gain in the compressed data
US2817711A (en) Band compression system
US2243527A (en) Production of artificial speech
US3213199A (en) System for masking information
US1948973A (en) Wave transmission with narrowed band
US3024313A (en) Carrier-wave telephony transmitters for the transmission of single-sideband speech signals
US2014081A (en) Wave transmission system
US1836824A (en) Wave transmission with narrowed bands
US3499991A (en) Voice-excited vocoder
US1882653A (en) Signal transmission system
US3328525A (en) Speech synthesizer
US3280266A (en) Synthesis of artificial speech
US2890285A (en) Narrow band transmission of speech
DE2613513A1 (en) Hearing aid adapting output to wearers disability - halves frequencies and mixes them back with original microphone output
US3042748A (en) Dynamic analog speech synthesizer
US2406825A (en) Privacy system for speech transmission
US3007042A (en) Communication system
US2282404A (en) Transmission system
US654630A (en) Radiophony.
US1986599A (en) Frequency stabilizing means
US1622033A (en) Radiotelephony