US2243526A - Production of artificial speech - Google Patents

Production of artificial speech

Info

Publication number
US2243526A
Authority
US
United States
Prior art keywords
speech
circuit
relay
analyzer
waves
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
US324287A
Inventor
Homer W Dudley
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia Bell Labs
Original Assignee
Nokia Bell Labs
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Bell Labs
Priority to US324287A
Application granted
Publication of US2243526A
Anticipated expiration
Application status is Expired - Lifetime

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders

Description

May 27, 1941. H. W. DUDLEY 2,243,526
PRODUCTION OF ARTIFICIAL SPEECH
Filed March 16, 1940. 4 Sheets (Sheets 1 to 4, Figs. 1 to 5; drawings not reproduced)
Inventor: H. W. Dudley

Patented May 27, 1941

PRODUCTION OF ARTIFICIAL SPEECH Homer W. Dudley, Garden City, N. Y., assignor to Bell Telephone Laboratories, Incorporated, New York, N. Y., a corporation of New York. Application March 16, 1940, Serial No. 324,287

7 Claims.

The present invention relates to the artificial production of speech or similar sound waves and to such artificial production in connection with a system for analyzing speech or similar waves. The construction of the artificial speech is preferably carried out under control of the speech analyzer. However, the analyzer and the synthesizer may each be used independently of the other, if desired. Where the synthesizer is controlled by the speech analyzer the control may be local and immediate or separated in time or distance, as for example, where a record is made of the analyzed speech or where the result of the analysis is transmitted to a distance over a line or other medium. In both the case where records are used, or where transmission to a distance is used, the frequency range of the waves recorded or transmitted may be very much reduced in comparison with the frequency range of the analyzed speech or of the reconstructed speech.

My prior Patent 2,151,091, issued March 21, 1939, discloses a speech communication system in which at a transmitting station an analyzer is employed to determine the fundamental frequency of a speech signal and the average power in properly chosen subbands of frequency, and this information is transmitted in the form of control currents to a synthesizer at a receiving station to reconstruct speech waves from sources of energy local to the receiving station. In order to produce at the synthesizer a simulation of the signal from the waves supplied from the local source, frequency subbands of these locally derived waves are selected which may be respectively coextensive with the selected subbands of the speech signal, and the average power in each subband of the locally supplied waves is varied in accordance with the power in the corresponding selected subband of the signal. This variation is effected in response to the information transmitted from the sending end of the system regarding the average power in the selected subbands of the signal.

As disclosed in my prior patent, the analysis of the speech waves at the input end of the system derives not only an indication of the distribution of power among the frequency subbands for a particular sound but also derives an indication as to whether a particular sound is a voiced or an unvoiced sound and in the case of the former derives also an indication of the fundamental vocal cord frequency or pitch. The synthesizing portion of the system similarly contains a source of energy simulating waves produced by the vibration of the vocal cords (sometimes referred to as buzz) and a source of continuous energy spectrum waves simulating the sound of the breath in whispering (sometimes referred to as hiss). The type of waves used in reconstructing the speech is determined by the presence or absence of an indication received from the analyzer that the sound at a particular instant includes a fundamental frequency component. If the fundamental frequency is present the energy source corresponding to voiced sounds (buzz) is selected and the pitch is determined by the pitch indication received from the analyzer. If no fundamental frequency component is present, the continuous energy spectrum source (hiss) is selected.

The waves from whichever source is selected are applied to the group of selective circuits in the speech synthesizer and the particular selective circuits employed are controlled from instant to instant in accordance with the indications received from the analyzer as to the power distribution among the frequency subbands of the analyzed speech. In this way the speech is reconstructed both as to its fundamental pitch variations and to its energy frequency distribution with time.
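The choice between the two local energy sources described above can be sketched in a few lines of Python. This is only an illustration in sampled form: the function name, the sample rate, the unit-pulse waveform and the uniform noise are all assumptions, since the patent describes analog sources and gives no such figures.

```python
import random


def excitation(voiced, pitch_hz, n_samples, sample_rate=8000, seed=0):
    """Return n_samples of 'buzz' if voiced, otherwise 'hiss'.

    sample_rate, seed and the unit-pulse waveform are illustrative
    assumptions; the patent specifies analog sources, not sampled ones.
    """
    if voiced:
        # Buzz: one unit pulse per pitch period, a crude stand-in for the
        # source that simulates the vibration of the vocal cords.
        period = max(1, round(sample_rate / pitch_hz))
        return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]
    # Hiss: uniform random noise as a stand-in for the continuous
    # energy spectrum source.
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


buzz = excitation(voiced=True, pitch_hz=100, n_samples=800)
hiss = excitation(voiced=False, pitch_hz=0, n_samples=800)
```

In a fuller model the selected waveform would then be passed through the selective circuits, which are not modeled here.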

In accordance with the present invention, the speech or other input waves are analyzed and speech is artificially reconstructed under control of the analyzed waves so that to this extent the invention resembles the disclosure of my prior patent referred to. The present invention, however, employs novel types of analysis and synthesis and, in general, the manner of analysis and also the manner of synthesis are made to simulate more closely the human vocal system with the object of securing a more faithful reproduction of the speech or similar waves, while at the same time retaining or even obtaining to a greater degree the advantages of compressed frequency band and other transmission advantages of the system disclosed in my prior patent referred to.

The various objects and the features of novelty which characterize the present invention will appear more fully from the following detailed description when read in conjunction with the attached drawings showing preferred embodiments.

In the drawings:

Figs. 1 and 2 when placed side by side with Fig. 1 at the left show in the form of a schematic circuit diagram an analyzer and a synthesizer in accordance with one form of the invention;

Figs. 2A and 2B show respectively relays and varistors for controlling the connections in the synthesizer;

Fig. 3 shows an analyzer-synthesizer in accordance with the invention similar in general to that in Figs. 1 and 2 but with provision for locating the analyzer at any desired distance from the synthesizer and transmitting over the intervening distance the indications necessary to reconstruct the speech waves at the receiver;

Figs. 4 and 5 show in plan view and section, respectively, an alternative relay structure that may replace the chain of relays shown in Fig. 1.

Referring first to Fig. 1, the wave input for speech or other sounds to be analyzed is indicated at 1 and this may be a microphone or any other type of energy converter for converting from sound vibrations, mechanical vibrations or light variations into electrical variations. The pick-up 1 feeds into a "vogad" 2 (voice operated gain adjusting device) for reducing to a constant level, in its output, waves of various levels applied to its input and coming from different talkers or input sources of different energy levels. This vogad may be of the type disclosed in Mitchell-Schott Patent 2,019,577, issued November 5, 1935, or in the Hogg et al. Patent 1,853,974, issued April 12, 1932, referred to herein. As a result of using the vogad, the speech waves impressed on the analyzer are maintained at constant average level even though the talker volume at the pick-up 1 may vary widely.
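The leveling action of the vogad can be illustrated with a slow automatic gain control. The following Python sketch is not the circuit of either cited patent; the function name, the running-average level detector and the `smoothing`, `target_level` and `floor` parameters are assumptions chosen only to show the idea of dividing by a slowly tracked input level.

```python
def vogad(samples, target_level=1.0, smoothing=0.001, floor=1e-6):
    """Scale samples so that their slowly tracked level approaches target_level.

    All parameter values are illustrative assumptions. A small `smoothing`
    makes the level estimate follow the talker slowly rather than following
    individual sounds.
    """
    level = None
    leveled = []
    for sample in samples:
        magnitude = abs(sample)
        if level is None:
            level = magnitude  # start from the first observed magnitude
        else:
            level += smoothing * (magnitude - level)  # slow running average
        leveled.append(sample * target_level / max(level, floor))
    return leveled


quiet_talker = vogad([0.25] * 100)
loud_talker = vogad([4.0] * 100)
```

Both a quiet and a loud steady input come out at the same level, which is the property the analyzer relies on.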

The speech waves on the output side of the vogad pass into three separate branches. The uppermost branch leading through amplifier 15 is the analyzer for the explosive consonants, to be described later. The central branch leading through band-pass filter 7 is the fundamental frequency analyzer branch or pitch control branch, to be described later. The lowermost branch leading through equalizer 3 or 4 is the branch leading to the spectrum analyzer for determining the speech frequency subband having the predominant power at a particular instant. Equalizer 3 is used if the relay 5 is not energized (in a manner to be described) which means that the waves present are unvoiced waves. If the waves contain vocal cord energy, equalizer 4 is substituted for equalizer 3 by the fact that relay 5 is energized. The purpose of these equalizers will be explained at a later point.

After passing one or the other of the equalizers, the waves are applied through repeating coil 6 to the inputs of narrow band filters 101 to 106 which may comprise any suitable number, but which in the present description, purely for illustration, will be considered to comprise thirty-two such filters, only a few being actually shown in the drawings.

Each subdividing filter is followed by a rectifier R1, R2, etc., which may also contain smoothing elements for enabling the waves transmitted by the particular filter to operate a polarized relay. There is one such relay for each rectifier and filter. For example, filter 101 and rectifier R1 are connected to relay 107; filter 102 and rectifier R2 are connected to relay 108, etc. The various relays are paired and the relays of a pair operate in opposition to each other on a common armature, such as 109, in the case of relays 107 and 108. The construction is such that the common armature is attracted in one direction or the other depending upon the relative current strengths in the individual relays.

The currents transmitted through all of the subdividing filters "compete" with one another by means of a pyramidal arrangement of the relays in order to establish one and only one circuit at a given instant, which circuit determines as a result of the competitive action the particular subdividing filter that is transmitting the maximum average power at the particular instant of time. Let it be assumed by way of example that at a particular instant this filter is filter 101.

Under this assumption, relay 107 predominates over relay 108 so that armature 109 is attracted to its uppermost position and carries spring 110 with it on account of their mechanical linkage. The springs 109 and 110 are electrically insulated from each other. As a result of this action relay 113 is also energized in parallel with relay 107 and resistance 111, the circuit extending from the output of rectifier R1, lower conductor, upper contact of armature spring 109, resistance 111, winding of relay 113 to ground and back to the opposite terminal of rectifier R1. Relay 113 overpowers relay 114 (under the assumption made above) and causes relay 125 to be energized in parallel with relays 107 and 113 over a circuit extending from the high potential end of resistance 111, upper closed contact of armature spring 115, and winding of relay 125. This process continues through the other ranks of relays until the final pair of relays 132, 132 is reached. Under the assumption made relay 132 is energized in parallel with the relays 107, 113, 125 and other corresponding relays in the chain by the extension of the energizing circuit through the upper contact in each case of the armature springs 109, 115, 127, etc. It may be noted that even though other relays, such as 118 or 120, were energized simultaneously with relays 107 and 108 and in consequence caused the energization of a relay such as 114 in the second rank, this would not interfere with the closure of the contacts, as described, of the upper relays of each pair, in view of the assumption that a predominant energy is transmitted through the filter 101.

As a result of the foregoing operation a circuit is closed from ground 134 through the upper contact of spring 133, the corresponding upper contact of the upper spring of each of the other pairs leading to spring 128, upper closed contacts of springs 128, 116 and 110 to conductor 135 which leads to a particular contact on the interconnecting block 136 of Fig. 2 for the purpose of controlling the synthesizer in a manner to be described at a later point.

If the subband filter 102 had been assumed to be carrying the maximum average energy, relay 108 would have predominated over relay 107, and the relays 113, 125 and eventually 132 would have been energized as before. This would have resulted in the closure of a circuit from ground 134 through the various closed relay contacts, through the lower contact of spring 110 and to conductor 137 instead of conductor 135. Similarly, if subband filter 103 had been assumed to be carrying maximum energy, relay 118 would have predominated over relay 120, relay 114 would have predominated over relay 113 with the result that a circuit would have been closed from ground 134, upper contact of spring 133, upper contact of spring 128, lower contact of spring 116, upper contact of spring 132 and conductor 138. These illustrations are sufficient to show that a different one of the thirty-two conductors in the group of conductors 140 is connected to ground through the chain of relays depending on which one of the thirty-two analyzing filters is carrying the maximum energy at a particular instant of time.
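The competition among the filter outputs amounts to a pairwise elimination that ends with a single winning conductor. A minimal Python sketch of that logic follows; it models only the comparison structure, not the relay circuit, and the function name and the zero-based conductor index are assumptions made for illustration.

```python
def select_conductor(band_powers):
    """Pick the strongest band by pairwise elimination, rank by rank.

    Returns (winning_index, number_of_ranks). The list length is assumed
    to be a power of two, as with the thirty-two filters of the description.
    """
    n = len(band_powers)
    if n == 0 or n & (n - 1):
        raise ValueError("expects a power-of-two number of bands")
    contenders = list(range(n))
    ranks = 0
    while len(contenders) > 1:
        # Each adjacent pair competes; the stronger one advances to the
        # next rank, like the paired relays acting on a common armature.
        contenders = [
            a if band_powers[a] >= band_powers[b] else b
            for a, b in zip(contenders[0::2], contenders[1::2])
        ]
        ranks += 1
    return contenders[0], ranks


powers = [0.1] * 32
powers[2] = 0.9  # third filter carries the maximum average power
winner, ranks = select_conductor(powers)
```

With thirty-two inputs the elimination takes five rounds, one per rank of relays.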

The effect of grounding one or another of these conductors is to cause the synthesizer shown in Fig. 2 to reconstruct in the outgoing circuit shown as loud-speaker 77 a particular sound corresponding to the individual sound in the analyzer which led to the grounding of the particular conductor. The type of synthesizer shown in Fig. 2 is substantially the same as that disclosed in Fig. 1 of my prior application Serial No. 181,275, filed December 23, 1937, of which the present application is in part a continuation.

In the analyzer shown in that application, thirty-two equalizer networks are individually selected by a corresponding number of keys so that the output sides of the equalizers are connected one at a time to the outgoing circuit leading to loud-speaker 77. In the present disclosure, the output circuit leading to the loud-speaker 77 is by way of repeating coil 75, one terminal of which is connected to each of the thirty-two equalizer networks, while the opposite terminal of the repeating coil is grounded. The grounding of one of the individual conductors of group 140 by the analyzer completes a connection from the output side of the corresponding one of the equalizer networks 40 to 71, inclusive, to the repeating coil 75, and therefore has the same effect as closing one of the keys in my prior application disclosure. The individual conductors from the equalizer networks 40 to 71, inclusive, are brought out to a terminal strip 141 which is shown facing terminal strip 136 on which the individual conductors in the group 140 terminate. Suitable cross-connections from terminal strip 136 to terminal strip 141 are indicated in the figure by dotted lines. It is necessary that each conductor in the group 140 shall be cross-connected to the proper terminal on strip 141 in order to select and connect into the outgoing circuit the proper one of the networks 40 to 71, inclusive. The use of the terminal strips 136 and 141 facilitates the proper cross-connection, which can be determined by trial.

As in my prior application referred to, the input sides of the equalizer networks 40 to 71, inclusive, are connected to two sources of waves, resistance noise source 36 and relaxation oscillator 30, the latter producing sounds simulating the vibrations of the vocal cords. The resistance noise source 36 produces continuous spectrum noise corresponding to the hiss above referred to and may be of the type shown in my prior application or the type using a gas-filled tube such as disclosed in my application Serial No. 273,429, filed May 13, 1939. These two sources of wave energy are properly connected to the equalizer networks so as to deliver energy [...] my prior application consists in the manner in which the individual equalizer networks are switched into circuit as above described and the manner in which the pitch of the relaxation oscillator 30 and the volume control for the amplifier 76 are controlled, as will now be described. It will be observed that the resistor 33 in the relaxation oscillator 30 is connected by way of leads 14 to the output of the pitch control branch of Fig. 1 and that the volume controlling or gain controlling relay 78 in the input of amplifier 76 is connected by way of circuit 20 to the explosive analyzer branch of Fig. 1.

The pitch control circuit extends from the output of the vogad 2 through band-pass filter 7 whose pass range may be such as to pass the essential or important speech frequencies, for example, 50 to 3,000 cycles. This filter selects the fundamental voice frequency component and a few of the lower harmonics thereof which are rectified in the rectifier bridge 8 to derive the fundamental frequency component of the speech. This is passed through equalizer 9, which as disclosed in R. R. Riesz Patent 2,183,248, has its loss increasing with frequency so as to insure that the fundamental frequency which may vary, for example, from about 80 to some 400 cycles, comes out at a high power level compared to any upper harmonics that may be present. The equalizer 9 is connected to frequency measuring circuit 10 which may be of the type shown in Fig. 2 of the Riesz patent referred to and the function of which is to produce a direct current the strength of which varies in proportion to the fundamental frequency of the input speech. This is sent through the 25-cycle low-pass filter 12 and delay circuit 13 to the circuit 14 which leads to the relaxation oscillator 30 of Fig. 2. Variations in the strength of the current transmitted over circuit 14 produce variations in the fundamental frequency of the waves produced in the relaxation oscillator 30 so that these waves follow the frequency variations of the vocal cord waves of the talker. The delay circuit 13 is included to permit time for the operation of the relays in the analyzer.
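The essential property of the frequency measuring circuit is a control quantity proportional to the fundamental frequency. The Python sketch below substitutes a simple count of upward zero crossings for the Riesz circuit, which is not reproduced here; the function names, the sample rate and the `amps_per_hz` scale factor are assumptions for illustration only.

```python
import math


def fundamental_hz(samples, sample_rate):
    """Estimate frequency by counting upward zero crossings per second.

    Zero-crossing counting is a stand-in technique, not the measuring
    circuit of the cited Riesz patent.
    """
    if not samples:
        raise ValueError("need at least one sample")
    upward = sum(
        1 for prev, cur in zip(samples, samples[1:]) if prev < 0.0 <= cur
    )
    return upward * sample_rate / len(samples)


def control_current(frequency_hz, amps_per_hz=1e-5):
    """Direct current proportional to frequency (scale is an assumption)."""
    return amps_per_hz * frequency_hz


rate = 8000
# One second of a 100-cycle tone with an arbitrary phase offset.
tone = [math.sin(2 * math.pi * 100 * n / rate + 1.0) for n in range(rate)]
estimate = fundamental_hz(tone, rate)
current = control_current(estimate)
```

The 25-cycle low-pass filter and the delay circuit of the description are omitted from this sketch.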

When voiced sounds are present in the pitch control circuit they cause the energization of relay 5, since direct current is then impressed on the circuit 11. Relay 5 in energizing substitutes equalizer 3 for equalizer 4 in the input circuit to [...] amount of energy transmitted through each of [...] vocal cord characteristic to those shaping [...] method of operation of the synthesizer shown in Fig. 2 of the present application drawings. The principal difference over the disclosure of [...] equalizing filter for the same level of input speech. In this case the equalizer 4 might be omitted since the frequency energy distribution of the unvoiced sounds is nearly uniform. On the other hand, if the band width of the analyzer filters is chosen on a logarithmic basis, the voiced sounds for normal vocal cord output would give nearly uniform power in each filter band, in which case the equalizer 3 might be omitted but equalizer 4 would be necessary in order to equalize the normal power output of the various filters in the case of the unvoiced sounds. Either equalizer may be omitted, if desired.

The uppermost channel of Fig. 1 is the stop consonant analyzer and includes an amplifier 15 for isolating this channel from the other circuits connected across the output terminals of the vogad 2. The output of amplifier 15 is rectified at 16 and the rectified output is passed through a low-pass filter 17 which passes the band between 0 and 80 cycles to eliminate the fundamental frequency of any voiced sound that may be present. A large inductance 18 is shunted across the output of filter 17 after which comes an amplifier 19, if desired. The design of this circuit branch is such that when a sudden change in the energy level occurs, such as is caused by a stop consonant sound, a sudden change in energy goes through this branch and builds up a potential across inductance 18. This pulse operates polarized electromagnet 78 (Fig. 2) over circuit 20, when there is a sudden drop in the energy from filter 17, producing an abrupt drop in the gain of the amplifier 76 to assist in the reconstructing of stop consonant sounds in the final output. An increase in energy from filter 17 restores the contact of the electromagnet 78. Sustained energy of average level in the voice waves will not produce sufficient potential across the inductance 18 to cause the operation of the electromagnet 78.
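The behavior of this branch, responding to sudden changes in the speech envelope but not to sustained levels, can be sketched as a difference detector that latches the output gain. In the Python sketch below the function name, the thresholds and the reduced gain value are illustrative assumptions; the input is taken to be an already rectified and smoothed envelope.

```python
def stop_gain_track(envelope, drop_threshold=0.5, rise_threshold=0.5,
                    low_gain=0.1):
    """Return the output gain for each envelope sample.

    A sudden drop in the envelope switches the gain to low_gain, and a
    sudden rise restores full gain. A sustained level changes nothing,
    since only sample-to-sample differences are examined. All threshold
    and gain values are assumptions.
    """
    gains = []
    gain = 1.0
    prev = envelope[0] if envelope else 0.0
    for level in envelope:
        change = level - prev
        if change <= -drop_threshold:
            gain = low_gain   # abrupt drop: reduce the amplifier gain
        elif change >= rise_threshold:
            gain = 1.0        # abrupt rise: restore the contact
        gains.append(gain)
        prev = level
    return gains


gains = stop_gain_track([1.0, 1.0, 0.2, 0.2, 1.0, 1.0])
```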

In the operation of the system as thus far described, the speech waves spoken into the microphone 1 are analyzed in the analyzer into their component sounds, this analysis being on the basis of the frequency subband carrying maximum average power in particular instants of time. Corresponding controls are exerted over the individual conductors 135, etc., in the group of conductors 140 for connecting the outputs of selected individual shaping networks 40 to 68, inclusive, to the final output circuits 75, 76, 77. At the same time the input speech is analyzed for its fundamental pitch and a variable strength current is sent over circuit 14 to the relaxation oscillator 30 to adjust the frequency of this oscillator to the required value. In the case of the explosive consonants, these effect actuation of electromagnet 78 over circuit 20 to produce abrupt volume changes in the final output circuit. It will be understood that for these explosive consonants the proper shaping network 40 to 68, inclusive, is also connected in circuit under control of the analyzer. The speech is in this way reconstructed on the basis of the individual phonetic sounds which follow one another in rapid succession as in the case of the spoken words at the input end of the system. The intonations are preserved by the pitch channel and the control which it exercises.

It has already been stated that the analyzer filters may have different band widths, such as uniform bands or widths that are logarithmically related. There is, of course, considerable choice as to the frequencies to be passed by the different band filters. While uniform band width throughout is probably the simplest arrangement, it may be desirable to use wider bands at certain portions of the speech band, thus cutting down on the number of filters required; or it may be advantageous to use level adjustments in the different bands so as to make particular bands more sensitive for the sounds that they are supposed to select. In general, it is required that each sound have some frequency range where it is the strongest.

There is also wide latitude as to the number of sub-dividing filters to use in the analyzer. If thirty-two filters are used as disclosed, then five ranks of relays are required. If the number of filters is between thirty-two and sixty-four, six ranks of relays will be required. It may be noted that the relays are not required to operate at excessively high speed. For example, the resonant characteristics of speech vary at the rate of about 20 per second. It is necessary that the relays operate only sufficiently fast so that the five relay operations involved in a single chain occur together in about a twentieth of a second.
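The relation between the number of filters and the number of relay ranks is the base-two logarithm rounded up, and the timing figure implies roughly a hundredth of a second per relay operation when five ranks share a twentieth of a second. A short Python check follows; the function names are assumptions introduced for illustration.

```python
def relay_ranks(n_filters):
    """Ranks needed for pairwise elimination among n_filters inputs."""
    if n_filters < 1:
        raise ValueError("need at least one filter")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1.
    return (n_filters - 1).bit_length()


def time_per_rank(n_filters, chain_time_s=1 / 20):
    """Time available to each rank if the whole chain must settle in
    chain_time_s, here taken as the twentieth of a second in the text."""
    ranks = relay_ranks(n_filters)
    return chain_time_s / ranks if ranks else 0.0


ranks_for_32 = relay_ranks(32)
ranks_for_33 = relay_ranks(33)
ranks_for_64 = relay_ranks(64)
budget_32 = time_per_rank(32)
```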

The vogad circuit 2 should operate over periods of time slightly longer than the syllabic periods of time. If the vogad operates over shorter periods of time, it will tend to level out the rise and fall of energy corresponding to the stop consonants and thus make the explosive circuit fail to operate satisfactorily. Alternatively, the explosive circuit may be branched off from the input 1 at a point ahead of the vogad 2, although in this case it might be necessary to use a very sluggish volume control to correct from one talker level to another. The use of the explosive analyzer channel is advantageous, since otherwise it would be necessary to analyze in the main analyzer circuit sounds of very small time interval. This would require the speeding up of the circuits of the main analyzer. The explosive analysis should determine such unvoiced sounds as p, t, k, as well as the voiced sounds b, d and g.

In the modification shown in Fig. 2A, an arrangement is provided for connecting the outputs of the shaping networks 40, 41, etc., to the outgoing circuit gradually instead of abruptly. For this purpose, a number of relays 160, 161, etc., are interposed between the terminal strip 141 and the various networks 40, 41, etc., each relay having an armature slidable over a resistance 162, 163, etc., so as to cut the networks into the circuit gradually after the manner of a potentiometer starting from an off-position. If desired, solenoid type electromagnets may be used with plungers adjusted to a period of operation as long as desirable.

In the alternative type of circuit, shown in Fig. 2B, varistor circuits are interposed between the terminal strip 141 and the various shaping networks 40, 41, etc. Each of these may comprise copper-oxide rectifiers 164 and 165 and resistances 166 and 167 connected in the form of a bridge with the output conductors from the corresponding shaping network, such as 40, connected across one diagonal of the bridge and the outgoing circuit leading to repeating coil 75 connected across the opposite diagonal of the bridge. The bridge is normally balanced so that no transmission into the outgoing circuit takes place. When ground is placed on one of the conductors on the terminal strip 141 from the distant analyzer, a circuit is established for the flow of direct current from battery 168 so as to polarize the copper-oxide rectifiers and so change the resistance of the two variable arms of the bridge as to unbalance the bridge and permit waves from the output of the shaping network, such as 40, to flow into the outgoing circuit through repeating coil 75. This action may be slowed up to the desired extent by the use of shunt capacity 169 and series inductance 170. The condenser 171 is a stopping condenser. It is understood that there would be one of these varistor-bridge circuits for each of the shaping networks 40, 41, etc.

Fig. 3 illustrates how the analyzing and synthesizing circuits, shown in Figs. 1 and 2, may be used at opposite ends of a long transmission circuit of any desired length where it would be impractical to extend the group of conductors 140 [...] quite general and to include not only a transmission line but a carrier or a radio channel or any other type of transmission medium or channel. The principal modification made in the analyzer is in the inclusion of alternating voltage sources 145, 146, etc., in the switching circuits of the analyzing relays. Each of these generators produces a different frequency wave which may have any convenient values either within the speech range or below the speech range or above the speech range; but from the standpoint of frequency compression and conservation of band width these frequencies are preferably quite low and may all lie below some moderate frequency such as 500 or 600 cycles. It will be observed that the arrangement is such that a different one of these alternating voltage sources has its output circuit established by the operation of the analyzer relays so that a wave of a different frequency is transmitted to line for each analyzed sound. For example, in the figure, if the relay springs 110, 116, 128 and 133 are all closed on their upper contacts, a circuit is established which includes source 145, each of the relay contacts mentioned in series and repeating coil 144 so that a wave of the frequency of the source 145 is transmitted over the branch 149 to the line 150. If the relay operation were the same except that relay spring 110 had closed its lowermost contact, a wave from source 146 would have been sent to line; and so on for the other wave sources.

The stop consonant circuit has also been modified to provide a relay 21 connected across the output of amplifier 19 (see Fig. 1) so that this relay is energized by explosive consonant sounds when there is a sudden energy level drop and connects alternating wave source 22 to the outgoing line circuit. The pitch control circuit may be the same as in Fig. 1, its output circuit 14 extending to the sending terminal of the channel 150.

Referring to the receiving end of the channel 150, a number of tuned circuits 151, 152, etc., are connected across the line, each selective of a different one of the waves from the generators 145, 146, etc. Following each of these tuned circuits is a detector such as 153 for detecting the received alternating current wave and operating an individual relay such as 154. For example, if the analyzed sound at the distant terminal is such as to select wave source 145, it may be assumed that relay 154 responds and connects the output of the shaping network 58 to the repeating coil 75 leading to the final output. Similarly, each of the other wave sources in the analyzer will actuate a different relay in the synthesizer. The pitch control is carried out in the synthesizer as in the case of Figs. 1 and 2 by the control currents transmitted through the 25-cycle low-pass filter 155, which current is impressed across the resistance 33 and controls the frequency of the oscillator 30.

The receiving end of the stop consonant or explosive channel is connected to tuned circuit 156 and detector 157 which are selective of the wave from source 22, so that when relay 21 at the transmitter is energized the electromagnet 78 at the receiver is also energized and causes an abrupt drop in the gain of the amplifier 76. The remaining portions of the synthesizer circuit may be recognized by the reference numerals which are the same as those used on Fig. 2.

It will be understood that in the case of a radio or carrier channel at 150, a modulator of the usual type will be necessary at the sending end and a demodulator at the receiving end in accordance with standard practice. The frequencies of the wave generators 145, 146, etc., and also that of the alternator 22 must be chosen so as not to interfere with each other or with the pitch channel which uses frequencies between 0 and 25 cycles.
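One way to satisfy these constraints is to give every analyzed sound, and the stop consonant signal, its own tone spaced evenly above the pitch channel. The Python sketch below is only one possible allocation: the description gives no spacing, so the function name, the 50-cycle lower limit and the even spacing up to 600 cycles are all assumptions.

```python
def assign_tones(n_sounds=32, low_hz=50.0, high_hz=600.0):
    """Return (tones for the analyzed sounds, tone for the stop channel).

    The tones are evenly spaced between low_hz and high_hz. The limits are
    assumptions: low_hz is placed above the 0 to 25 cycle pitch channel and
    high_hz at the upper figure mentioned in the description.
    """
    n_tones = n_sounds + 1  # one extra tone for the stop consonant channel
    step = (high_hz - low_hz) / (n_tones - 1)
    tones = [low_hz + i * step for i in range(n_tones)]
    return tones[:n_sounds], tones[-1]


sound_tones, stop_tone = assign_tones()
```

A practical allocation would also have to allow for the selectivity of the tuned circuits at the receiver, which this sketch ignores.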

Figs. 4 and 5 disclose a multicontact type of relay which may replace all of the relays following the filters 101 to 106, inclusive, in the analyzer circuit of Fig. 1. This multicontact relay comprises a shell of iron or other magnetic material containing, for example, thirty-two pole-pieces 112, 113, etc., projecting inwardly from the rim. Each pole-piece has mounted on it a winding connected to the output of a respective one of the rectifiers R1, R2, etc. A flexible brush or contact member is carried on a centrally mounted magnetic member 114 which is resiliently mounted to occupy normally a central position but to be moved in any radial direction under control of the magnetic field set up by the flux in the various pole-pieces as a result of current flow through the coils mounted on the pole-pieces. As the flexibly mounted contact member 114 is moved in a radial direction, after a small extent of travel, it makes contact with fixed contact plates mounted in a spherical surface 115 within the circular space bounded by the pole-pieces. In the form shown in Fig. 4, there are thirty-two separate contacts arranged around the member 110. For any given sound there will be a maximum attraction in some radial direction which may be due to current in one of the pole-pieces 112 or the resultant due to currents in a group of windings on these pole-pieces. In either case, the member 114 will take up an angular position corresponding to the maximum magnetic attraction and will close a single one of the contacts in the surface 115 resulting in the grounding of an individual one of the lines such as 135, 131, etc., in the group of conductors 140. This will have the same kind of effect as the operation of a chain of relays in Fig. 1 resulting in the grounding of one of the conductors in the group 100. The type of multiple contact relay shown in Figs. 4 and 5 permits a large number of circuits to be readily controlled.
For example, with the circumference of the surface I15 divided into thirty-two parts and each of these parts divided in three sections in a radial direction corresponding to three different magnitudes of attractive force, a total of ninety-six different contacts may be provided. If this is more than needed in a particular case, neighboring contacts may be multipled together.
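
The selection performed by the multicontact relay can be pictured numerically: each pole pulls the armature along its own radial direction, and the armature closes the contact lying nearest the resultant pull. The sketch below is a simplified model, not the patented mechanism. It assumes the pull of each pole is proportional to its coil current and uses made-up thresholds for the three radial sections; the function and parameter names are hypothetical. Only the thirty-two poles, the three radial sections, and the total of ninety-six contacts come from the description.

```python
import math

NUM_POLES = 32  # pole-pieces around the rim (from the text)
NUM_RINGS = 3   # radial sections per angular part (from the text)

def select_contact(currents, ring_thresholds=(1.0, 2.0, 3.0)):
    """Return (sector, ring) of the contact closed by the armature, or None if the
    resultant pull is too weak to move it off its neutral position.

    Assumption: pull of each pole is proportional to its coil current."""
    n = len(currents)
    # Vector sum of the radial pulls of all poles.
    fx = sum(c * math.cos(2 * math.pi * k / n) for k, c in enumerate(currents))
    fy = sum(c * math.sin(2 * math.pi * k / n) for k, c in enumerate(currents))
    magnitude = math.hypot(fx, fy)
    if magnitude < ring_thresholds[0]:
        return None
    angle = math.atan2(fy, fx) % (2 * math.pi)
    sector = int(round(angle / (2 * math.pi / n))) % n
    # Stronger pull reaches a contact section farther out (assumed thresholds).
    ring = sum(1 for t in ring_thresholds if magnitude >= t) - 1
    return sector, ring

# One energized pole: the armature moves toward that pole.
single = [0.0] * NUM_POLES
single[5] = 2.5

# Two energized poles: the armature follows the resultant, midway between them.
pair = [0.0] * NUM_POLES
pair[4] = 1.0
pair[6] = 1.0

total_contacts = NUM_POLES * NUM_RINGS
```

In this model a single energized pole selects its own sector, while equal currents in two neighbouring-but-one poles select the sector between them, which mirrors the "resultant due to currents in a group of windings" mentioned in the description.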

What is claimed is:

1. In a speech producing system a plurality of formant circuits corresponding to individual phonetic characters, a common output circuit adapted for connection to the output sides of said formant circuits, means to impress wave energy covering the frequency range of the sound effects to be produced connected to the input sides of said formant circuits, a speech analyzer for analyzing applied speech waves to determine their component sounds in consecutive time intervals, and means controlled by said analyzer for selectively controlling connection of said formant circuits individually to said common output circuit in accordance with said analyzed sounds.

2. In a speech producing system a plurality of formant circuits corresponding to individual phonetic characters, an output circuit for speech-representing waves, input energy means for supplying energy to said formant circuits, individual control circuits for selectively connecting said formant circuits in circuit with said input energy means and said output circuit, and a speech analyzer for actuating said control circuits, comprising means for detecting the subdivision of the input speech spectrum that is carrying maximum power.

3. In a speech producing system, a source of voiced waves and a source of unvoiced waves, an output circuit for speech-representing waves, a plurality of frequency selective circuits adapted to be variably controlled for selectively passing waves of particular frequencies from said sources to said output circuit, a speech analyzer for variably controlling said selective circuits on the basis of the existence of predominant speech energy in a different part of the speech frequency band for each component sound, said analyzer including means for determining from instant to instant the part of the speech frequency band containing such predominant energy.

4. In speech analysis and reconstruction, an analyzer for producing a separate indication for each principal component sound of impressed speech comprising a plurality of frequency selective circuits and detecting means for indicating which component frequencies carry greatest power at a particular instant, a synthesizer including means for generating waves in the speech frequency range, wave modifying means for producing from said waves the elemental speech sounds, and means for controlling said wave modifying means in accordance with said indications produced by said analyzer for artificially constructing speech sounds.

5. The combination claimed in claim 4 in which said analyzer comprises a plurality of narrow-band filters and relays interconnected to operate differentially in accordance with the relative amounts of power in the waves transmitted through the different filters, said relays closing a different circuit for each principal component sound.

6. The combination claimed in claim 4 in which said analyzer comprises a plurality of narrow-band filters each followed by a detector, a relay structure having a common magnetic core structure with a plurality of poles arranged in a circle and projecting radially inward, a winding on each pole connected in the output of a respective detector, a common armature resiliently constrained to neutral position but adapted to be moved in different angular directions by the conjoint attraction of said poles, a plurality of circularly arranged contacts positioned to be selectively closed one at a time by said armature and control circuits controlled individually by said contacts.

7. An analyzer-synthesizer for speech or similar waves comprising a plurality of frequency selective circuits and differentially operating relays for determining in successive time intervals the portion of the speech frequency band in which the energy predominates, a plurality of formant circuits corresponding in number to the phonetic characters of the speech to be synthesized, energy sources of frequencies covering the frequency range of the speech to be synthesized, means supplying energy from said sources to said formant circuits, an outgoing circuit common to said formant circuits, and means controlled by said analyzer for selectively connecting said formant circuits individually to said outgoing circuit in accordance with the particular portion of the frequency band of the analyzed speech in which the energy predominates.

HOMER DUDLEY.

US324287A 1940-03-16 1940-03-16 Production of artificial speech Expired - Lifetime US2243526A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US324287A US2243526A (en) 1940-03-16 1940-03-16 Production of artificial speech

Publications (1)

Publication Number Publication Date
US2243526A true US2243526A (en) 1941-05-27

Family

ID=23262932

Family Applications (1)

Application Number Title Priority Date Filing Date
US324287A Expired - Lifetime US2243526A (en) 1940-03-16 1940-03-16 Production of artificial speech

Country Status (1)

Country Link
US (1) US2243526A (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US2458227A (en) * 1941-06-20 1949-01-04 Hartford Nat Bank & Trust Co Device for artificially generating speech sounds by electrical means
US2522539A (en) * 1948-07-02 1950-09-19 Bell Telephone Labor Inc Frequency control for synthesizing systems
US2635146A (en) * 1949-12-15 1953-04-14 Bell Telephone Labor Inc Speech analyzing and synthesizing communication system
US2824906A (en) * 1952-04-03 1958-02-25 Bell Telephone Labor Inc Transmission and reconstruction of artificial speech
US2908761A (en) * 1954-10-20 1959-10-13 Bell Telephone Labor Inc Voice pitch determination
US2927969A (en) * 1954-10-20 1960-03-08 Bell Telephone Labor Inc Determination of pitch frequency of complex wave
US3180936A (en) * 1960-12-01 1965-04-27 Bell Telephone Labor Inc Apparatus for suppressing noise and distortion in communication signals
US3268661A (en) * 1962-04-09 1966-08-23 Melpar Inc System for determining consonant formant loci
US3524930A (en) * 1968-07-08 1970-08-18 Us Army Resonance synthesizer for speech research
US3600516A (en) * 1969-06-02 1971-08-17 Ibm Voicing detection and pitch extraction system
US3903366A (en) * 1974-04-23 1975-09-02 Us Navy Application of simultaneous voice/unvoice excitation in a channel vocoder
US5940791A (en) * 1997-05-09 1999-08-17 Washington University Method and apparatus for speech analysis and synthesis using lattice ladder notch filters
US6256609B1 (en) 1997-05-09 2001-07-03 Washington University Method and apparatus for speaker recognition using lattice-ladder filters
