US3872250A - Method and system for speech compression - Google Patents
- Publication number
- US3872250A
- Authority
- US
- United States
- Prior art keywords
- signal
- speech
- formant
- baseband signal
- baseband
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
Definitions
- An unvoice or fricative parameter can be transmitted along with the baseband signal and utilized at the receiver to generate a synthetic noise burst signal for combining with the baseband signal and the regenerated second formant.
- the present invention pertains to speech compression and, more particularly, to a method and system for speech compression to reduce transmission bandwidth by using baseband vocal regeneration.
- The channel vocoder method of speech compression is based on the recognition of the carrier nature of speech; that is, the carrier or pitch frequency of the larynx is modulated by the voice tract frequencies formed in the mouth.
- Channel vocoders, thus, include a pitch tracker for detecting and tracking the pitch frequency and supplying a signal through a frequency counter, low pass filter or the like representative of average frequency of the larynx as well as a plurality of bandpass filters covering the voice frequency range from about 300 to 3,000 Hz to measure the energy in each portion of the voice frequency and provide a stepped representation of the spectrum.
- the outputs from the filters are detected with low pass filters, and the pitch and voice tract signals are transmitted and used to regenerate speech at the receiver.
- Channel vocoders while providing acceptable results from an information transmission standpoint, have the disadvantage that the reconstructed or reproduced speech at the receiver lacks naturalness and has poor voice quality. That is, intelligibility, speaker recognition and emotional characteristics of the reconstructed speech have not been adequately reproduced with the result that the reconstructed speech has an irritating synthetic sound.
- One of the primary reasons for the unnaturalness of the reconstructed speech is due to the necessity of tracking pitch in that pitch trackers have not been developed sufficiently to provide accurate tracking.
- Speech compression systems based on mathematical derivation and linear prediction have been proposed; however, after development and hardware implementation, such systems have been revealed to require pitch tracking or its equivalent and, thus, have the disadvantages associated with channel and formant tracking vocoders and improved variations and modifications thereof.
- Another prior method of compressing the bandwidth required for speech transmission is based on transmitting certain key bands in the normal speech range, generally from three to six bands, whose total bandwidth is less than that required for normal speech transmission.
- This method has the disadvantage that when the bands are made sufficiently narrow to achieve a useful bandwidth compression of 4:1 or greater, the quality of the reconstructed speech is characterized by heavily filtered sound which is disturbing to the listener.
- the primary object of the present invention is accomplished by transmitting a baseband signal from approximately 300 to 700 Hz and utilizing the baseband signal at a receiver to generate an upper formant signal from known definable relationships between formants, particularly between the first and second formants in accordance with the Peterson-Barney estimates.
- Another object of the present invention is to utilize the inverse relationship of the first and second formants for front vowels in regenerating the second formant at a receiver when only the first formant is transmitted due to the increased importance of the second formant in defining front vowels relative to the importance of the second formant in defining back vowels.
- a further object of the present invention is to regenerate the second formant from the first formant by frequency translation with the use of a local frequency source and a balanced modulator to translate and invert the first formant to produce a signal harmonically related to the pitch frequency representing the second formant for front vowels.
- the present invention has another object in the reconstruction of good quality, articulate speech from a small portion of the normal voice bandwidth with or without an additional signal of low bandwidth.
- Yet an additional object of the present invention is to eliminate the requirement for devices which track either pitch or formant frequency in a speech compression system while reconstructing speech based on formant frequencies.
- a baseband signal including the first formant and a fricative parameter including noise burst
- unvoice information with the second formant being regenerated from a portion of the spectrum containing the first formant and the unvoice noise burst being synthetically generated corresponding to the fricative parameter and combined with the regenerated second formant and the baseband signal to provide a speech output.
- Another object of the present invention is to regenerate the second formant from a baseband signal including the first formant by forming the regenerated second formant of frequencies only harmonically related to the pitch frequency of the baseband signal.
- a speech signal can be processed to permit voice communications occupying a reduced bandwidth with the same signal-to-noise ratio or at a lower bit rate than the unprocessed speech, sufficient intelligibility and voice naturalness are maintained so that the speaker can be recognized from his voice and the voice quality does not sound synthetic
- the system of the present invention can be used in tandem without appreciable reduction in voice quality or intelligence due to the synthetic production of noise burst signals and the regenerating of the second formant from the baseband signal and the circuitry required is simple thereby permitting the method and system of the present invention to be economically implemented and reliable in operation.
- the present invention is generally characterized in a method of transmitting a speech signal in reduced bandwidth including the steps of transmitting a baseband signal having a frequency range including the first formant, receiving the baseband signal regenerating the second formant from the baseband signal, and combining the baseband signal with the regenerated second formant to reconstruct the speech signal.
- The present invention is further generally characterized in a speech compression system including a speech source providing a speech signal having first and second formants, a filter for passing a baseband signal of the speech signal including the first formant, means for transmitting and receiving the baseband signal, a signal combining network for producing a reconstructed speech signal, means supplying the received baseband signal to the signal combining network, and means responsive to the received baseband signal for regenerating the second formant.
- BRIEF DESCRIPTION OF THE DRAWINGS
- FIG. 1 is a plot illustrating the relationship between first and second formant frequencies for the most common vowel sounds.
- FIG. 2 is a diagram of a speech compression system according to the present invention.
- FIG. 3 is a diagram of a modification of the speech compression system of FIG. 2.
- the other major class of sounds bearing speech intelligence are unvoice or fricative sounds, generated without vibrating the larynx by blowing air through a constricted point somewhere along the voice tract.
- the combination of these voiced and unvoiced sounds comprises the generation of human speech; and, accordingly, speech signals are formed of the pitch frequency and the formants which are harmonics of the pitch frequency and which form the voice portion of speech and the fricative sounds which form the unvoice portion of speech.
- First and second formants F1 and F2 are illustrated in FIG. 1 for the most common vowel sounds, and from FIG. 1, it will be appreciated that a practically linear inverse relationship exists between F1 and F2 for the front vowels, whereas F1 and F2 have a substantially direct relationship for the back vowels. While the frequency and amplitude of the first formant F1 are important to human perception of front and back vowel sounds, it has been found that the second formant F2 is relatively significant to the perception of front vowel sounds only, in that perception of back vowel sounds is strongly linked to the F1 frequency only.
- One example of a system according to the present invention is illustrated in FIG. 2 and includes a speech source 10 providing a speech signal on an output 12 to a baseband filter 14 passing frequencies within the band from 300 to 700 Hz and to a gate 16 and a voice/unvoice decision circuit 18.
- Decision circuit 18 is operative to enable gate 16 when an unvoice signal is received to pass the unvoice signal to a bandpass filter 20 passing the band of frequencies between 1,800 and 3,200 Hz, the output from filter 20 being supplied through a detector 22 to a low pass filter 24 to provide a DC to 50 Hz output.
- A 300 to 700 Hz band analog signal is supplied from baseband filter 14 and a DC to 50 Hz signal is supplied from low pass filter 24, the signals from filters 14 and 24 providing a baseband signal including F1 and a noise burst or fricative parameter, respectively, and being supplied to a suitable transmitter 26 for transmitting the signals to a receiver 28.
- the baseband signal and the fricative parameter can be transmitted in any suitable manner, for instance, by
- the baseband signal is provided on an output 30 and supplied directly to a summing network 34 on a lead 32 and to a balanced modulator 36 and through a half-wave rectifier 38 to a 2,500 Hz ringing circuit or narrow-band filter 40.
- the output from the narrowband filter 40 is supplied to the balanced modulator 36 which has an output supplied through a low pass filter 42 having an upper frequency cutoff skirt at approximately 2,400 Hz to summing network 34.
- the fricative parameter is supplied on an output 44 to a modulator 46 which also receives as an input noise from a 300 to 4,000 Hz, broadband random noise generator 48 through a filter 50 having a 3,000 Hz pole and a 1,500 Hz zero.
- the output from modulator 46 is supplied to summing network 34.
- Summing network 34 receives a baseband signal input on lead 32 including F1, an input from filter 42 corresponding to regenerated F2, and a noise burst input from modulator 46, and combines the signals to provide a reconstructed speech signal on an output 52 for reproduction by a suitable transducer.
- Baseband filter 14 passes a band of frequencies from 200 to 1,000 Hz, preferably from 300 to 700 Hz, such that the transmitted signal includes a baseband signal including the first formant F1.
- The voice/unvoice decision circuit 18 detects whether the speech signal is voiced, that is, pitch excited and periodic, or unvoiced, that is, random noise-like, as is conventional and described in detail on pages 95-97 of a report entitled Formant Tracking Vocoder System prepared for the U.S. Army under Contract Number DA36-039-AMC-00006(E), Oct. 1965.
- When an unvoiced sound is detected, gate 16 is enabled to permit the speech signal to be supplied to the filter 20, the speech signal supplied to filter 20 thus corresponding to a noise burst or fricative parameter representing the S, T and K sounds, for example.
- Filter 20 is centered in the band of greatest noise energy, that is, between 1,800 and 3,200 Hz; and, the energy in the noise band is detected by detector 22 and passed by low pass filter 24 to provide the DC to 50 Hz fricative parameter.
- a baseband signal and a fricative parameter are transmitted representative of the speech signal.
- the noise bursts from the DC to 50 Hz fricative parameter are reproduced by controlling the modulator 46 by the fricative parameter to pass noise bursts from the combination of the random noise generator 48 and the filter 50 of amplitude proportional to the fricative parameter voltage.
- the filter 50 is centered around 3,000 Hz and, preferably, has a pole at 3,000 Hz and a zero at 1,500 Hz which represents a median tuning in order to pass a sound somewhere between S and SH.
- the fricative parameter transmitted corresponding thereto is supplied to the modulator 46 such that the modulator 46 provides the product of the fricative parameter and the noise from filter 50 to produce a variable amplitude noise burst signal proportional to the original unvoice sound to the summing network 34.
- the half-wave rectifier 38 distorts the baseband signal to generate a spectrum including all the harmonics of the pitch frequency at high frequencies which are in the pass band of the 2,500 Hz filter 40.
- The narrowband filter 40 has a center frequency of 2,500 Hz and a band wide enough to pass at least one harmonic of the basic pitch frequency, preferably a band 150 Hz wide to provide plus or minus 75 Hz at the 3 dB points with broad skirt selectivity. While filter 40 is described as having a center frequency of 2,500 Hz, the center frequency can be within the range of from 2,200 to 2,800 Hz and preferably from 2,400 to 2,600 Hz.
- the filter 40 should have a bandwidth such that at least one pitch harmonic of the highest larynx frequency anticipated is always captured by the filter.
- The balanced modulator 36 combines or processes the baseband signal with the frequency from filter 40 to produce a double sideband of plus and minus frequencies relative to the pitch harmonic near 2,500 Hz captured in the filter 40, and only the lower sideband of the signal is passed by low pass filter 42 such that the output from filter 42 represents a folded over or inverted F1 constituting a regenerated F2 signal for the front vowels.
- The low pass filter 42 separates the lower sideband which contains the difference between 2,500 Hz and the F1 frequency which constitutes the regenerated second formant F2.
- While a narrowband filter 40 is disclosed for isolating a harmonic of the pitch frequency, this function can be provided by any suitable circuitry, for example, by a phase-locked loop or a combination of a narrowband filter, a clipper and another narrowband filter in order to assure that only a single harmonic is modulated with the baseband signal.
- The regenerating circuit assures that the regenerated F2 is harmonically related to the pitch frequency in that the frequency of the signal from the narrowband filter 40 is a harmonic of the pitch frequency of the baseband signal and, since the baseband signal is formed of harmonics of the pitch frequency, modulator 36 operates to add and subtract one harmonic frequency from another. Accordingly, after passing through low pass filter 42, the regenerated second formant F2 is formed of a harmonic frequency or frequencies obtained by subtracting the harmonic or harmonics of the baseband signal from the harmonic from narrowband filter 40.
- a local oscillator such as a 2,500 Hz sine wave
- New frequencies, not harmonics of the fundamental pitch frequency, would be produced if the local oscillator was not an exact multiple of pitch frequency; and, thus, the regenerated frequencies of F2 would not be harmonics of pitch frequency and, therefore, would produce beats in the human ear.
- the use of a ringing circuit or narrowband filter instead of a local oscillator is extremely important to the voice quality obtained with the present invention in that the use of such circuitry assures that no new frequencies which are not harmonics of fundamental pitch frequency would be produced, it being appreciated that in normal human speech, all frequencies in voiced sounds are harmonics of the pitch frequency.
- A front-back vowel detector may be utilized to permit regeneration of F2 for the back vowels, such as that described on pages 81-83 of the above-mentioned report entitled Formant Tracking Vocoder System prepared for the U.S. Army under Contract Number DA36-039-AMC-00006(E), Oct. 1965; however, it has been found that for some of the back vowels, F2 is transmitted within the baseband signal and, further, that folding over or inverting F1 to regenerate F2 for the back vowels did not adversely affect intelligibility or voice quality, probably because sufficient formant energy is passed by the baseband to permit the ear to pick up the back vowel. While the reason for the good quality obtained even with the inverted regeneration of F2 for back vowels is not completely understood, the use of the F2 regenerating circuitry of FIG. 2 is preferable due to the advantages of simplified circuitry and economy.
- A modification of the second formant regenerating circuitry is illustrated in FIG. 3 with the primary difference relative to the regenerating circuitry of FIG. 2 being that the frequency of the baseband signal is doubled by a frequency doubler 54 prior to being supplied to balanced modulator 36.
- Frequency doubler 54 is essentially a squaring modulator receiving the baseband signal at both inputs.
- The frequency subtracted in modulator 36 from the harmonic frequency isolated by filter 40 is doubled to produce a regenerated F2 closer to the F2 shown in FIG. 1 for each front vowel.
- The baseband signal could be altered in any manner, such as tripling, while maintaining its harmonic relationship with the pitch frequency in order to obtain a regenerated F2 closer to the original F2.
- the method and system of speech compression according to the present invention thus, permit transmission bandwidth to be reduced to 700 Hz with an open channel between the fricative parameter and the baseband signal or further to 450 Hz.
- The most important portion of the speech signal is utilized to reconstruct the speech signal in that this baseband essentially contains the personality of the speaker as well as pitch information to obviate pitch tracking and the first formant F1 which contains excellent intelligibility.
- a method of transmitting a speech signal in reduced bandwidth comprising the steps of transmitting a baseband signal having a frequency range including the first formant;
- a method of transmitting a speech signal in reduced bandwidth comprising the steps of transmitting a baseband signal having a frequency range including the first formant; regenerating the second formant from the baseband signal of frequencies harmonically related to the pitch frequency of the baseband signal including generating harmonics of the pitch frequency of the baseband signal and modulating one of the generated harmonics with the baseband signal; and combining the baseband signal with the regenerated second formant to reconstruct the speech signal.
- a speech compression system comprising a speech source providing a speech signal having first and second formants
- filter means for passing a baseband signal of said speech signal including said first formant
- signal combining means for producing a reconstructed speech signal; means supplying said baseband signal from said receiving means to said signal combining means;
- said means receiving said baseband signal from said receiving means for generating at least one harmonic of the pitch frequency and processing said at least one harmonic with said baseband signal to regenerate said second formant and supply said regenerated second formant to said signal combining means whereby said reconstructed speech signal includes said first formant and said regenerated second formant.
- a speech compression system comprising a speech source providing a speech signal having first and second formants
- filter means for passing a baseband signal of said speech signal including said first formant
- said means supplying said baseband signal from said receiving means to said signal combining means; and means receiving said baseband signal from said receiving means for regenerating said second formant from said baseband signal and supplying said regenerated second formant to said signal combining means including means for processing said baseband signal to produce harmonics of the pitch frequency of said speech signal, means for isolating at least one of said harmonics, and means for modulating said at least one harmonic with said baseband signal to regenerate said second formant harmonically related to said baseband signal whereby said reconstructed speech signal includes said first formant and said regenerated second formant.
- a speech compression system as recited in claim 10 wherein said speech signal includes unvoice sounds and further comprising means for detecting said unvoice sounds and generating a fricative parameter corresponding thereto and means responsive to the received fricative parameter for synthetically generating a noise burst signal and supplying said noise burst signal to said signal combining means whereby said reconstructed speech signal includes said first formant, said regenerated second formant and said noise burst signal.
- a receiver for reconstructing a speech signal represented by a frequency band signal comprising means for receiving said frequency band signal;
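The requirement stated above for narrowband filter 40, that its band be wide enough to capture at least one pitch harmonic, can be checked with simple arithmetic. The following is a minimal sketch, assuming an ideal pass band of 2,500 Hz plus or minus half of the 150 Hz width; the function name `harmonics_in_passband` and the sample pitch values are hypothetical, and the broad filter skirts mentioned in the text are ignored.

```python
def harmonics_in_passband(pitch_hz, center_hz=2500.0, bandwidth_hz=150.0):
    """List the pitch harmonics inside an idealized pass band.

    The pass band is modeled as center_hz +/- bandwidth_hz / 2 with
    brick-wall edges, which is a simplifying assumption.
    """
    low = center_hz - bandwidth_hz / 2.0
    high = center_hz + bandwidth_hz / 2.0
    first = int(low // pitch_hz)
    last = int(high // pitch_hz)
    return [k * pitch_hz for k in range(first, last + 1)
            if low <= k * pitch_hz <= high]


# Hypothetical pitch frequencies for illustration only.
for pitch in (90.0, 120.0, 150.0, 200.0):
    print(pitch, harmonics_in_passband(pitch))
```

Because adjacent harmonics are spaced by the pitch frequency, any pitch at or below the band width always places at least one harmonic inside the band. In this idealized model a 200 Hz pitch places none inside, which is consistent with the text's point that the band must be chosen for the highest larynx frequency anticipated.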
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
A method and system of speech compression by baseband vocal regeneration wherein a baseband signal including the first formant, approximately between 300 and 700 Hz, is transmitted to a receiver and supplied to a signal combining network along with a second formant regenerated from the baseband signal to provide a reconstructed speech signal including the first and second formants. An unvoice or fricative parameter can be transmitted along with the baseband signal and utilized at the receiver to generate a synthetic noise burst signal for combining with the baseband signal and the regenerated second formant.
Description
United States Patent [19]
Coulter
[45] Mar. 18, 1975
[76] Inventor: David C. Coulter, 9613 Pembroke Pl., Vienna, Va. 22180
[22] Filed: Feb. 28, 1973
[21] Appl. No.: 336,705
[52] U.S. Cl.: 179/1 SA
[51] Int. Cl.: G 1/10
[58] Field of Search: 179/1 SA, 15.55 R, 15.55 T, 15 BW
Primary Examiner: David L. Stewart
Attorney, Agent, or Firm: Robert H. Epstein
18 Claims, 3 Drawing Figures
[Drawing: block diagram with baseband filter, half wave rectifier, balanced modulator, low pass filter, summing network, voice/unvoice decision circuit, detector, noise generator and modulator]
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention pertains to speech compression and, more particularly, to a method and system for speech compression to reduce transmission bandwidth by using baseband vocal regeneration.
2. Discussion of the Prior Art
Since the inception of practical speech compression coinciding with the channel vocoder concept of H. Dudley, a great amount of effort has been expended in attempts to improve the naturalness of reproduced speech transmitted by speech compression techniques. The channel vocoder method of speech compression is based on the recognition of the carrier nature of speech; that is, the carrier or pitch frequency of the larynx is modulated by the voice tract frequencies formed in the mouth. Channel vocoders, thus, include a pitch tracker for detecting and tracking the pitch frequency and supplying a signal through a frequency counter, low pass filter or the like representative of average frequency of the larynx as well as a plurality of bandpass filters covering the voice frequency range from about 300 to 3,000 Hz to measure the energy in each portion of the voice frequency and provide a stepped representation of the spectrum. The outputs from the filters are detected with low pass filters, and the pitch and voice tract signals are transmitted and used to regenerate speech at the receiver.
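The analysis side of a channel vocoder, as described in the paragraph above, can be sketched in a few lines. This is only an illustration: the channel count of nine, the equal-width band layout, the falling harmonic amplitudes and the name `band_energies` are assumptions made here and are not taken from the patent. A voiced sound is modeled as harmonics of the pitch frequency, and each channel reports the power falling inside its band, which gives the stepped representation of the spectrum.

```python
def band_energies(harmonics, low_hz=300.0, high_hz=3000.0, channels=9):
    """Return one power value per equal-width band between low_hz and high_hz.

    harmonics: iterable of (frequency_hz, amplitude) pairs.
    The band layout and channel count are illustrative assumptions.
    """
    width = (high_hz - low_hz) / channels
    energies = [0.0] * channels
    for freq, amp in harmonics:
        if low_hz <= freq < high_hz:
            index = int((freq - low_hz) // width)
            energies[index] += amp * amp / 2.0  # mean power of a sinusoid
    return energies


# Hypothetical voiced sound: 100 Hz pitch, amplitude falling with harmonic number.
pitch_hz = 100.0
voiced = [(k * pitch_hz, 1.0 / k) for k in range(1, 40)]
steps = band_energies(voiced)
print(steps)
```

The pitch value would be transmitted alongside these band energies, which is why the quality of such a system depends on the pitch tracker.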
Channel vocoders, while providing acceptable results from an information transmission standpoint, have the disadvantage that the reconstructed or reproduced speech at the receiver lacks naturalness and has poor voice quality. That is, intelligibility, speaker recognition and emotional characteristics of the reconstructed speech have not been adequately reproduced with the result that the reconstructed speech has an irritating synthetic sound. One of the primary reasons for the unnaturalness of the reconstructed speech is due to the necessity of tracking pitch in that pitch trackers have not been developed sufficiently to provide accurate tracking.
In attempting to avoid the requirement of a pitch tracker in speech compression systems, it was suggested to band the first few harmonics of the pitch frequency together and to transmit this band of energy, which is within a general range of from 300 to 700 Hz, instead of transmitting the pitch parameter. At the receiver, the band of energy is spread to generate all possible harmonics up to 3,000 Hz instead of just the first few harmonics. This voice excited vocoder provides good voice quality; however, a 9600 bit/sec. rate is required to transmit the band of energy as compared with the 2400 bit/sec. rate required for channel vocoders due to the slowly varying nature of the pitch frequency. Another disadvantage of the voice excited vocoder is that commercial applications are limited due to the complexity of the equipment and the high cost thereof. That is, at the present, the cost of voice excited vocoder systems approaches the cost of extra speech channels; and, therefore, voice excited vocoders do not represent a viable method for reducing bandwidth channels for commercial voice transmission.
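The idea of spreading a narrow band of low harmonics into harmonics across the voice band can be illustrated numerically. The sketch below assumes a half-wave rectifier as the spreading nonlinearity, the device this document assigns to rectifier 38; the paragraph above does not say how prior-art voice excited vocoders perform the spreading. The sample rate, the 100 Hz pitch, the equal harmonic amplitudes and the helper names are all hypothetical.

```python
import cmath
import math

SAMPLE_RATE = 8000      # Hz, assumed
PITCH_HZ = 100          # assumed pitch; divides SAMPLE_RATE exactly
NUM_SAMPLES = 1600      # 20 pitch periods


def baseband(n):
    """Sum of the pitch harmonics lying in the 300 to 700 Hz band."""
    t = n / SAMPLE_RATE
    return sum(math.cos(2 * math.pi * k * PITCH_HZ * t) for k in range(3, 8))


def magnitude_at(signal, freq_hz):
    """Normalized magnitude of a single DFT bin at freq_hz."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * n / SAMPLE_RATE)
              for n, x in enumerate(signal))
    return abs(acc) / len(signal)


# Half-wave rectification: keep positive samples, zero the rest.
rectified = [max(baseband(n), 0.0) for n in range(NUM_SAMPLES)]

# Strongest pitch harmonic between 2,000 and 3,000 Hz ...
harmonic_peak = max(magnitude_at(rectified, f)
                    for f in range(2000, 3001, PITCH_HZ))
# ... compared with a frequency midway between two harmonics.
off_harmonic = magnitude_at(rectified, 2550)
print(harmonic_peak, off_harmonic)
```

Because the rectified signal repeats at the pitch period, its energy appears only at multiples of the pitch frequency, so the value midway between harmonics is zero apart from rounding error.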
Other attempts to provide speech compression systems have included tracking the first three formants, which have been considered the most important in transmitting speech intelligence. However, the frequency bands of the first three formants are not well defined in that the frequencies of the first three formants overlap; and, accordingly, this lack of isolation of formant frequency range has rendered tracking of the first three formants extremely difficult. Thus, formant tracking speech compression systems requiring the tracking of the first three formants have not been developed to the point of ultimate fruition.
Speech compression systems based on mathematical derivation and linear prediction have been proposed; however, after development and hardware implementation, such systems have been revealed to require pitch tracking or its equivalent and, thus, have the disadvantages associated with channel and formant tracking vocoders and improved variations and modifications thereof.
Another prior method of compressing the bandwidth required for speech transmission is based on transmitting certain key bands in the normal speech range, generally from three to six bands, whose total bandwidth is less than that required for normal speech transmission. This method has the disadvantage that when the bands are made sufficiently narrow to achieve a useful bandwidth compression of 4:1 or greater, the quality of the reconstructed speech is characterized by heavily filtered sound which is disturbing to the listener.
SUMMARY OF THE INVENTION
Accordingly, it is a primary object of the present invention to overcome the disadvantages of the prior art by providing a speech compression method and system having more natural and recognizable voice quality and intelligence characteristics of reconstructed speech while being relatively simple in nature and exhibiting low cost relative to transmission bandwidth reduction.
The primary object of the present invention is accomplished by transmitting a baseband signal from approximately 300 to 700 Hz and utilizing the baseband signal at a receiver to generate an upper formant signal from known definable relationships between formants, particularly between the first and second formants in accordance with the Peterson-Barney estimates.
Another object of the present invention is to utilize the inverse relationship of the first and second formants for front vowels in regenerating the second formant at a receiver when only the first formant is transmitted due to the increased importance of the second formant in defining front vowels relative to the importance of the second formant in defining back vowels.
A further object of the present invention is to regenerate the second formant from the first formant by frequency translation with the use of a local frequency source and a balanced modulator to translate and invert the first formant to produce a signal harmonically related to the pitch frequency representing the second formant for front vowels.
The present invention has another object in the reconstruction of good quality, articulate speech from a small portion of the normal voice bandwidth with or without an additional signal of low bandwidth.
Yet an additional object of the present invention is to eliminate the requirement for devices which track either pitch or formant frequency in a speech compression system while reconstructing speech based on formant frequencies.
mission of a baseband signal including the first formant and a fricative parameter including noise burst, unvoice information with the second formant being regenerated from a portion of the spectrum containing the first formant and the unvoice noise burst being synthetically generated corresponding to the fricative parameter and combined with the regenerated second formant and the baseband signal to provide a speech output.
Still further, another object of the present invention is to regenerate the second formant from a baseband signal including the first formant by forming the regenerated second formant of frequencies only harmonically related to the pitch frequency of the baseband signal.
Some of the advantages of the present invention over the prior art are that no tracking devices are required, a speech signal can be processed to permit voice communications occupying a reduced bandwidth with the same signal-to-noise ratio or at a lower bit rate than the unprocessed speech, sufficient intelligibility and voice naturalness are maintained so that the speaker can be recognized from his voice and the voice quality does not sound synthetic, the system of the present invention can be used in tandem without appreciable reduction in voice quality or intelligence due to the synthetic production of noise burst signals and the regenerating of the second formant from the baseband signal and the circuitry required is simple thereby permitting the method and system of the present invention to be economically implemented and reliable in operation.
The present invention is generally characterized in a method of transmitting a speech signal in reduced bandwidth including the steps of transmitting a baseband signal having a frequency range including the first formant, receiving the baseband signal, regenerating the second formant from the baseband signal, and combining the baseband signal with the regenerated second formant to reconstruct the speech signal.
The present invention is further generally characterized in a speech compression system including a speech source providing a speech signal having first and second formants, a filter for passing a baseband signal of the speech signal including the first formant, means for transmitting and receiving the baseband signal, a signal combining network for producing a reconstructed speech signal, means supplying the received baseband signal to the signal combining network, and means responsive to the received baseband signal for regenerating the second formant and supplying the regenerated second formant to the signal combining network.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a plot illustrating the relationship between first and second formant frequencies for the most common vowel sounds.
FIG. 2 is a diagram of a speech compression system according to the present invention.
FIG. 3 is a diagram of a modification of the speech compression system of FIG. 2.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
Speech is characterized by the production of larynx impulses at a repetitive frequency referred to as pitch frequency. These larynx pulses are rich in harmonics due to their impulse nature, and these harmonics along with the pitch fundamental frequency are transmitted through the voice tract to the outside via the lips. According to the shape and size of the voice tract, as determined by the tongue, lower jaw and lips, certain frequency ranges are selectively emphasized such that three principal bands of harmonics called formants are emphasized. The frequency positions and relative motions of these bands of harmonics determine the transmission of speech intelligence for voiced sounds. The other major class of sounds bearing speech intelligence are unvoice or fricative sounds, generated without vibrating the larynx by blowing air through a constricted point somewhere along the voice tract. The combination of these voiced and unvoiced sounds comprises the generation of human speech; and, accordingly, speech signals are formed of the pitch frequency and the formants which are harmonics of the pitch frequency and which form the voice portion of speech and the fricative sounds which form the unvoice portion of speech.
The relationship of the first and second formants F1 and F2 is illustrated in FIG. 1 for the most common vowel sounds, and from FIG. 1, it will be appreciated that a practically linear inverse relationship exists between F1 and F2 for the front vowels, whereas F1 and F2 have a substantially direct relationship for the back vowels. While the frequency and amplitude of the first formant F1 are important to human perception of front and back vowel sounds, it has been found that the second formant F2 is relatively significant to the perception of front vowel sounds only in that perception of back vowel sounds is strongly linked to the F1 frequency only. Thus, in accordance with the method and system of the present invention, only a small band of energy including F1 is transmitted, and F2 is regenerated at the receiver to maintain sufficient intelligibility and voice naturalness such that the speaker can be recognized from the reconstructed speech and the voice quality does not have synthetic characteristics.
One example of a system according to the present invention is illustrated in FIG. 2 and includes a speech source 10 providing a speech signal on an output 12 to a baseband filter 14 passing frequencies within the band from 300 to 700 Hz and to a gate 16 and a voice/unvoice decision circuit 18. Decision circuit 18 is operative to enable gate 16 when an unvoice signal is received to pass the unvoice signal to a bandpass filter 20 passing the band of frequencies between 1,800 and 3,200 Hz, the output from filter 20 being supplied through a detector 22 to a low pass filter 24 to provide a DC to 50 Hz output. Thus, a 300 to 700 Hz band analog signal is supplied from baseband filter 14 and a DC to 50 Hz signal is supplied from low pass filter 24, the signals from filters 14 and 24 providing a baseband signal including F1 and a noise burst or fricative parameter, respectively, and being supplied to a suitable transmitter 26 for transmitting the signals to a receiver 28.
The baseband signal and the fricative parameter can be transmitted in any suitable manner, for instance, by combining the baseband signal and the fricative parameter by summation, by modulating the fricative parameter on a subcarrier of approximately 200 Hz by frequency modulation or by amplitude modulation.
At the receiver, the baseband signal is provided on an output 30 and supplied directly to a summing network 34 on a lead 32 and to a balanced modulator 36 and through a half-wave rectifier 38 to a 2,500 Hz ringing circuit or narrowband filter 40. The output from the narrowband filter 40 is supplied to the balanced modulator 36 which has an output supplied through a low pass filter 42 having an upper frequency cutoff skirt at approximately 2,400 Hz to summing network 34. The fricative parameter is supplied on an output 44 to a modulator 46 which also receives as an input noise from a 300 to 4,000 Hz, broadband random noise generator 48 through a filter 50 having a 3,000 Hz pole and a 1,500 Hz zero. The output from modulator 46 is supplied to summing network 34. Thus, summing network 34 receives a baseband signal input on lead 32 including F1, an input from filter 42 corresponding to regenerated F2, and a noise burst input from modulator 46 and combines the signals to provide a reconstructed speech signal on an output 52 for reproduction by a suitable transducer.
In operation, baseband filter 14 passes a band of frequencies from 200 to 1,000 Hz, preferably from 300 to 700 Hz, such that the transmitted signal includes a baseband signal including the first formant F1. The voice/unvoice decision circuit 18 detects whether the speech signal is voiced, that is, pitch excited and periodic, or unvoiced, that is, random noise-like, as is conventional and described in detail on pages 95-97 of a report entitled Formant Tracking Vocoder System prepared for the U.S. Army under Contract Number DA36-039-AMC-00006(E), Oct. 1965. When an unvoiced sound is detected, gate 16 is enabled to permit the speech signal to be supplied to the filter 20, the speech signal supplied to filter 20 thus corresponding to a noise burst or fricative parameter representing the S, T and K sounds, for example. Filter 20 is centered in the band of greatest noise energy, that is, between 1,800 and 3,200 Hz; and, the energy in the noise band is detected by detector 22 and passed by low pass filter 24 to provide the DC to 50 Hz fricative parameter. Thus, a baseband signal and a fricative parameter are transmitted representative of the speech signal.
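The transmitter-side chain for the fricative parameter (gate 16, bandpass filter 20, detector 22, low pass filter 24) can be sketched digitally as follows. This is only an illustrative sketch, not the patented analog circuit: the sample rate, the second-order band-pass form, the one-pole smoother, the full-wave detector and all function names are assumptions, and the voice/unvoice decision is taken as a given list of flags rather than implemented.

```python
import math
import random

def biquad_bandpass(x, fs, f_lo, f_hi):
    """Second-order band-pass (common 'cookbook' biquad form, unity peak gain)."""
    f0 = math.sqrt(f_lo * f_hi)            # geometric center of the pass band
    q = f0 / (f_hi - f_lo)
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b0, b2 = alpha, -alpha                 # b1 is zero for this form
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for xn in x:
        yn = (b0 * xn + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y

def one_pole_lowpass(x, fs, fc):
    """Simple smoothing filter with a cutoff near fc Hz."""
    a = 1.0 - math.exp(-2.0 * math.pi * fc / fs)
    y, state = [], 0.0
    for xn in x:
        state += a * (xn - state)
        y.append(state)
    return y

def fricative_parameter(speech, unvoiced, fs):
    """Gate -> 1,800-3,200 Hz band-pass -> detector -> DC-50 Hz low-pass."""
    gated = [s if u else 0.0 for s, u in zip(speech, unvoiced)]  # gate 16
    band = biquad_bandpass(gated, fs, 1800.0, 3200.0)            # filter 20
    detected = [abs(v) for v in band]                            # detector 22 (full-wave, assumed)
    return one_pole_lowpass(detected, fs, 50.0)                  # filter 24

# Hypothetical test signal: a voiced tone, a noise burst, then the tone again.
fs = 8000
rng = random.Random(0)
tone = [math.sin(2.0 * math.pi * 150.0 * n / fs) for n in range(800)]
hiss = [rng.gauss(0.0, 1.0) for _ in range(800)]
speech = tone + hiss + tone
unvoiced = [False] * 800 + [True] * 800 + [False] * 800
param = fricative_parameter(speech, unvoiced, fs)
```

The resulting `param` stays at zero while the gate is closed, rises during the noise burst and decays shortly after it, which is the slowly varying envelope the text describes as the fricative parameter.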
At the receiver, the noise bursts from the DC to 50 Hz fricative parameter are reproduced by controlling the modulator 46 by the fricative parameter to pass noise bursts from the combination of the random noise generator 48 and the filter 50 of amplitude proportional to the fricative parameter voltage. The filter 50 is centered around 3,000 Hz and, preferably, has a pole at 3,000 Hz and a zero at 1,500 Hz which represents a median tuning in order to pass a sound somewhere between S and SH. Thus, when an unvoice signal is detected by the voice/unvoice decision circuit 18, the fricative parameter transmitted corresponding thereto is supplied to the modulator 46 such that the modulator 46 provides the product of the fricative parameter and the noise from filter 50 to produce a variable amplitude noise burst signal proportional to the original unvoice sound to the summing network 34.
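The receiver-side noise burst synthesis described above amounts to multiplying shaped noise by the received fricative parameter. The sketch below is a rough illustration under stated assumptions: Gaussian white noise stands in for generator 48 (its 300 to 4,000 Hz band limiting is omitted), filter 50 is read as a single first-order section with a zero at 1,500 Hz and a pole at 3,000 Hz discretized by the bilinear transform without pre-warping (the text does not give the filter order), and the sample rate and function names are hypothetical.

```python
import math
import random

def pole_zero_filter(x, fs, f_zero=1500.0, f_pole=3000.0):
    """First-order H(s) = (s + wz) / (s + wp), discretized by the bilinear transform."""
    k = 2.0 * fs
    wz = 2.0 * math.pi * f_zero
    wp = 2.0 * math.pi * f_pole
    b0, b1 = k + wz, wz - k
    a0, a1 = k + wp, wp - k
    y, x_prev, y_prev = [], 0.0, 0.0
    for xn in x:
        yn = (b0 * xn + b1 * x_prev - a1 * y_prev) / a0
        x_prev, y_prev = xn, yn
        y.append(yn)
    return y

def noise_bursts(fricative_param, fs, seed=0):
    """Modulator 46: product of the fricative parameter and the shaped noise."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in fricative_param]    # stand-in for generator 48
    shaped = pole_zero_filter(noise, fs)                       # stand-in for filter 50
    return [p * n for p, n in zip(fricative_param, shaped)]

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

# Hypothetical parameter track: silence, a half-level burst, a full-level burst.
fs = 16000
param = [0.0] * 2000 + [0.5] * 2000 + [1.0] * 2000
burst = noise_bursts(param, fs)
```

With this parameter track the output is silent in the first segment, and the full-level segment is roughly twice as loud as the half-level one, so the burst amplitude follows the fricative parameter voltage.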
In the circuit for regenerating F2, the half-wave rectifier 38 distorts the baseband signal to generate a spectrum including all the harmonics of the pitch frequency at high frequencies which are in the pass band of the 2,500 Hz filter 40. The narrowband filter 40 has a center frequency of 2,500 Hz and a band wide enough to pass at least one harmonic of the basic pitch frequency, preferably a band 150 Hz wide to provide plus or minus 75 Hz at the 3 dB points with broad skirt selectivity. While filter 40 is described as having a center frequency of 2,500 Hz, the center frequency can be within the range of from 2,200 to 2,800 Hz and preferably from 2,400 to 2,600 Hz. Basically, the filter 40 should have a bandwidth such that at least one pitch harmonic of the highest larynx frequency anticipated is always captured by the filter. The balanced modulator 36 combines or processes the baseband signal with the frequency from filter 40 to produce a double sideband of plus and minus frequencies relative to the pitch harmonic near 2,500 Hz captured in the filter 40, and only the lower sideband of the signal is passed by low pass filter 42 such that the output from filter 42 represents a folded over or inverted F1 constituting a regenerated F2 signal for the front vowels. That is, the low pass filter 42 separates the lower sideband which contains the difference between 2,500 Hz and the F1 frequency which constitutes the regenerated second formant F2. While a narrowband filter 40 is disclosed for isolating a harmonic of the pitch frequency, this function can be provided by any suitable circuitry, for example, by a phase-locked loop or a combination of a narrowband filter, a clipper and another narrowband filter in order to assure that only a single harmonic is modulated with the baseband signal.
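The regeneration chain just described (half-wave rectifier 38, narrowband filter 40, balanced modulator 36, low pass filter 42) can be illustrated with a small numerical sketch. It is not the analog circuit of FIG. 2: ideal brick-wall filters built on a plain DFT replace the ringing circuit and the low pass filter, and the sample rate, the 125 Hz pitch, the harmonic amplitudes and phases, and all function names are assumptions chosen so that every harmonic falls on an exact DFT bin.

```python
import cmath
import math

def dft(x):
    """Plain O(N^2) discrete Fourier transform (adequate for short demo signals)."""
    n = len(x)
    w = [cmath.exp(-2j * math.pi * k / n) for k in range(n)]
    return [sum(x[t] * w[(k * t) % n] for t in range(n)) for k in range(n)]

def brickwall(x, fs, lo, hi):
    """Idealized filter: zero every DFT bin whose frequency lies outside [lo, hi] Hz."""
    n = len(x)
    spec = dft(x)
    kept = [c if lo <= min(k, n - k) * fs / n <= hi else 0j
            for k, c in enumerate(spec)]
    # Inverse DFT via conjugation; the input was real, so the real part is the signal.
    return [v.real / n for v in dft([c.conjugate() for c in kept])]

def regenerate_f2(baseband, fs, center=2500.0, half_bw=75.0, cutoff=2400.0):
    rectified = [max(v, 0.0) for v in baseband]                 # half-wave rectifier 38
    carrier = brickwall(rectified, fs, center - half_bw, center + half_bw)  # filter 40
    product = [c * b for c, b in zip(carrier, baseband)]        # balanced modulator 36
    return brickwall(product, fs, 0.0, cutoff)                  # low pass filter 42

def dominant_frequencies(x, fs, rel=0.1):
    """Frequencies whose DFT magnitude exceeds rel times the largest magnitude."""
    n = len(x)
    mags = [abs(c) for c in dft(x)[: n // 2]]
    top = max(mags)
    return [k * fs / n for k, m in enumerate(mags) if m > rel * top]

# Hypothetical baseband: 3rd, 4th and 5th harmonics of a 125 Hz pitch (375-625 Hz).
fs, pitch, n = 8000, 125.0, 320
parts = [(3, 1.0, 0.0), (4, 0.8, 0.7), (5, 0.6, 1.9)]   # (harmonic, amplitude, phase)
baseband = [sum(a * math.sin(2.0 * math.pi * h * pitch * t / fs + p)
                for h, a, p in parts) for t in range(n)]
regenerated = regenerate_f2(baseband, fs)
peaks = dominant_frequencies(regenerated, fs)
```

In this setup the only rectifier harmonic inside the 2,425 to 2,575 Hz pass band is the 20th (2,500 Hz), so the lower sideband consists of 2,500 Hz minus each baseband component, and `peaks` lists the 15th to 17th harmonics of the pitch. The baseband ordering is inverted in the output, which is the "folded over" F1 of the text.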
The regenerating circuit, thus, assures that the regenerated F2 is harmonically related to the pitch frequency in that the frequency of the signal from the narrowband filter 40 is a harmonic of the pitch frequency of the baseband signal and, since the baseband signal is formed of harmonics of the pitch frequency, modulator 36 operates to add and subtract one harmonic frequency from another. Accordingly, after passing through low pass filter 42, the regenerated second formant F2 is formed of a harmonic frequency or frequencies obtained by subtracting the harmonic or harmonics of the baseband signal from the harmonic from narrowband filter 40. If a local oscillator, such as a 2,500 Hz sine wave, were utilized in place of the harmonic generating circuitry, new frequencies, not harmonics of the fundamental pitch frequency, would be produced if the local oscillator was not an exact multiple of pitch frequency; and, thus, the regenerated frequencies of F2 would not be harmonics of pitch frequency and, therefore, would produce beats in the human ear. The use of a ringing circuit or narrowband filter instead of a local oscillator is extremely important to the voice quality obtained with the present invention in that the use of such circuitry assures that no new frequencies which are not harmonics of fundamental pitch frequency would be produced, it being appreciated that in normal human speech, all frequencies in voiced sounds are harmonics of the pitch frequency.
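The difference between translating with a captured pitch harmonic and translating with a fixed 2,500 Hz oscillator can be checked with simple arithmetic. The sketch below assumes a hypothetical 130 Hz pitch and a baseband holding its 3rd, 4th and 5th harmonics; the function names are illustrative only.

```python
def nearest_harmonic(pitch, center):
    """Pitch harmonic closest to the center frequency of the narrowband filter."""
    return pitch * round(center / pitch)

def lower_sideband(carrier, baseband_harmonics):
    """Difference frequencies passed by the low pass filter after modulation."""
    return [carrier - f for f in baseband_harmonics]

def is_harmonic(freq, pitch, tol=1e-9):
    """True when freq is an integer multiple of the pitch frequency."""
    return abs(freq / pitch - round(freq / pitch)) < tol

pitch = 130.0                                   # hypothetical pitch frequency, Hz
baseband = [3 * pitch, 4 * pitch, 5 * pitch]    # 390, 520 and 650 Hz

captured = nearest_harmonic(pitch, 2500.0)      # 19th harmonic, 2,470 Hz
from_harmonic = lower_sideband(captured, baseband)    # 2,080, 1,950, 1,820 Hz
from_oscillator = lower_sideband(2500.0, baseband)    # 2,110, 1,980, 1,850 Hz

harmonic_ok = all(is_harmonic(f, pitch) for f in from_harmonic)
oscillator_ok = any(is_harmonic(f, pitch) for f in from_oscillator)
```

Here the captured harmonic lies within plus or minus 75 Hz of 2,500 Hz and yields the 16th, 15th and 14th harmonics of the pitch, whereas the fixed oscillator yields components 30 Hz away from the nearest harmonics, which is the condition the text associates with audible beats.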
With respect to the back vowels, a front-back vowel detector may be utilized to permit regeneration of F2 for the back vowel, such as that described on pages 81-83 of the above-mentioned report entitled Formant Tracking Vocoder System prepared for the U.S. Army under Contract Number DA36-039-AMC-00006(E), Oct. 1965; however, it has been found that for some of the back vowels, F2 is transmitted within the baseband signal and, further, that folding over or inverting F1 to regenerate F2 for the back vowels did not adversely affect intelligibility or voice quality, probably because sufficient F1 is passed by the baseband to permit the ear to pick up the back vowel. While the reason for the good quality obtained even with the inverted regeneration of F2 for back vowels is not completely understood, the use of the F2 regenerating circuitry of FIG. 2 is preferable due to the advantages of simplified circuitry and economy.
A modification of the second formant regenerating circuitry is illustrated in FIG. 3 with the primary difference relative to the regenerating circuitry of FIG. 2 being that the frequency of the baseband signal is doubled by a frequency doubler 54 prior to being supplied to balanced modulator 36. Frequency doubler 54 is essentially a squaring modulator receiving the baseband signal at both inputs. By utilizing the frequency doubler, the frequency subtracted in modulator 36 from the harmonic frequency isolated by filter 40 is doubled to produce a regenerated F2 closer to the F2 shown in FIG. 1 for each front vowel. Of course, the baseband signal could be altered in any manner, such as tripling, while maintaining its harmonic relationship with the pitch frequency in order to obtain a regenerated F2 closer to the original F2.
The method and system of speech compression according to the present invention, thus, permit transmission bandwidth to be reduced to 700 Hz with an open channel between the fricative parameter and the baseband signal or further to 450 Hz. By transmitting a baseband of 300 to 700 Hz, the most important portion of the speech signal is utilized to reconstruct the speech signal in that this baseband essentially contains the personality of the speaker as well as pitch information to obviate pitch tracking and the first formant F1 which contains excellent intelligibility. The combination of the regenerated F2 with the baseband signal provides a reconstructed speech signal having good voice quality and intelligibility, and these reconstructed speech characteristics are further improved when the synthetic noise burst signal is combined with the baseband signal and the regenerated F2.
Inasmuch as the present invention is subject to many variations, modifications and changes in detail, it is intended that all subject matter described above or shown in the accompanying drawings be interpreted as illustrative and not in a limiting sense.
I claim:
1. A method of transmitting a speech signal in reduced bandwidth comprising the steps of transmitting a baseband signal having a frequency range including the first formant;
receiving the baseband signal;
regenerating the second formant from the baseband signal by producing harmonics of the pitch frequency of the baseband signal, isolating at least one of the harmonics and processing the at least one harmonic with the baseband signal; and
combining the baseband signal with the regenerated second formant to reconstruct the speech signal.
2. A method of transmitting a speech signal in reduced bandwidth comprising the steps of transmitting a baseband signal having a frequency range including the first formant;
regenerating the second formant from the baseband signal of frequencies harmonically related to the pitch frequency of the baseband signal including generating harmonics of the pitch frequency of the baseband signal and modulating one of the generated harmonics with the baseband signal; and
combining the baseband signal with the regenerated second formant to reconstruct the speech signal.
3. A method of transmitting a speech signal as recited in claim 2 wherein said step of regenerating the second formant includes modulating the baseband signal to provide a double sideband signal and passing only the lower sideband signal to be combined with the baseband signal.
4. A method of transmitting a speech signal as recited in claim 3 and further comprising the steps of transmitting a fricative parameter of the speech signal, generating a synthetic noise burst signal in accordance with the fricative parameter and combining the noise burst signal with the baseband parameter and the regenerated second formant to produce the speech signal.
5. A method of transmitting a speech signal as recited in claim 3 wherein said baseband signal represents the band of frequencies from 200 to 1,000 Hz.
6. A method of transmitting a speech signal as recited in claim 3 wherein said baseband signal represents the band of frequencies from 300 to 700 Hz.
7. A method of transmitting a speech signal as recited in claim 3 wherein the one harmonic is within a range of from 2,200 to 2,800 Hz.
8. A method of transmitting a speech signal as recited in claim 3 wherein the one harmonic is within a range of from 2,400 to 2,600 Hz.
9. A speech compression system comprising a speech source providing a speech signal having first and second formants;
filter means for passing a baseband signal of said speech signal including said first formant;
means for transmitting said baseband signal;
means for receiving said baseband signal;
signal combining means for producing a reconstructed speech signal;
means supplying said baseband signal from said receiving means to said signal combining means; and
means receiving said baseband signal from said receiving means for generating at least one harmonic of the pitch frequency and processing said at least one harmonic with said baseband signal to regenerate said second formant and supply said regenerated second formant to said signal combining means whereby said reconstructed speech signal includes said first formant and said regenerated second formant.
10. A speech compression system comprising a speech source providing a speech signal having first and second formants;
filter means for passing a baseband signal of said speech signal including said first formant;
means for transmitting said baseband signal;
means for receiving said baseband signal;
signal combining means for producing a reconstructed speech signal;
means supplying said baseband signal from said receiving means to said signal combining means; and
means receiving said baseband signal from said receiving means for regenerating said second formant from said baseband signal and supplying said regenerated second formant to said signal combining means including means for processing said baseband signal to produce harmonics of the pitch frequency of said speech signal, means for isolating at least one of said harmonics, and means for modulating said at least one harmonic with said baseband signal to regenerate said second formant harmonically related to said baseband signal whereby said reconstructed speech signal includes said first formant and said regenerated second formant.
11. A speech compression system as recited in claim 10 wherein said isolating means includes a narrowband filter having a center frequency within the range of from 2,200 to 2,800 Hz.
12. A speech compression system as recited in claim 10 wherein said isolating means includes a narrowband filter having a center frequency substantially at 2,500 Hz.
13. A speech compression system as recited in claim 12 wherein said processing means is a half wave rectifier, said modulating means is a balanced modulator providing a double sideband signal and said means for regenerating said second formant further includes a low pass filter for passing the lower sideband signal from said balanced modulator to said signal combining means.
14. A speech compression system as recited in claim 13 wherein said speech signal includes unvoice sounds and further comprising means for detecting said unvoice sounds and generating a fricative parameter corresponding thereto and means responsive to the received fricative parameter for synthetically generating a noise burst signal and supplying said noise burst signal to said signal combining means whereby said reconstructed speech signal includes said first formant, said regenerated second formant and said noise burst signal.
15. A speech compression system as recited in claim 10 wherein said speech signal includes unvoice sounds and further comprising means for detecting said unvoice sounds and generating a fricative parameter corresponding thereto and means responsive to the received fricative parameter for synthetically generating a noise burst signal and supplying said noise burst signal to said signal combining means whereby said reconstructed speech signal includes said first formant, said regenerated second formant and said noise burst signal.
16. A receiver for reconstructing a speech signal represented by a frequency band signal comprising means for receiving said frequency band signal;
means receiving said frequency band signal from said receiving means for processing said frequency band signal to produce harmonics of said frequency band signal;
means for isolating at least one of said harmonics;
means for modulating said at least one harmonic with said frequency band signal to produce a modulated signal; and
signal combining means receiving said frequency band signal from said receiving means and said modulated signal from said modulating means for combining said frequency band signal and said modulated signal to produce a reconstructed speech signal.
17. A receiver as recited in claim 16 wherein said modulating means includes a balanced modulator providing a double sideband signal and low pass filter means for passing the lower sideband signal to said signal combining means as said modulated signal.
18. A receiver as recited in claim 17 wherein said isolating means includes a narrowband filter having a center frequency within the range of from 2,200 to 2,800 Hz.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US336705A US3872250A (en) | 1973-02-28 | 1973-02-28 | Method and system for speech compression |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US336705A US3872250A (en) | 1973-02-28 | 1973-02-28 | Method and system for speech compression |
Publications (1)
Publication Number | Publication Date |
---|---|
US3872250A (en) | 1975-03-18 |
Family
ID=23317296
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US336705A Expired - Lifetime US3872250A (en) | 1973-02-28 | 1973-02-28 | Method and system for speech compression |
Country Status (1)
Country | Link |
---|---|
US (1) | US3872250A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4158751A (en) * | 1978-02-06 | 1979-06-19 | Bode Harald E W | Analog speech encoder and decoder |
US4286116A (en) * | 1978-09-29 | 1981-08-25 | Thomson-Csf | Device for the processing of voice signals |
US4355204A (en) * | 1979-11-09 | 1982-10-19 | U.S. Philips Corporation | Speech synthesizing arrangement having at least two distortion circuits |
WO1991006944A1 (en) * | 1989-10-25 | 1991-05-16 | Motorola, Inc. | Speech waveform compression technique |
US5060269A (en) * | 1989-05-18 | 1991-10-22 | General Electric Company | Hybrid switched multi-pulse/stochastic speech coding technique |
US5369730A (en) * | 1991-06-05 | 1994-11-29 | Hitachi, Ltd. | Speech synthesizer |
US20040199380A1 (en) * | 1998-02-05 | 2004-10-07 | Kandel Gillray L. | Signal processing circuit and method for increasing speech intelligibility |
US20120004906A1 (en) * | 2009-02-04 | 2012-01-05 | Martin Hagmuller | Method for separating signal paths and use for improving speech using electric larynx |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3127476A (en) * | | 1964-03-31 | | david |
US3139487A (en) * | 1960-12-27 | 1964-06-30 | Bell Telephone Labor Inc | Bandwidth reduction system |
US3431362A (en) * | 1966-04-22 | 1969-03-04 | Bell Telephone Labor Inc | Voice-excited, bandwidth reduction system employing pitch frequency pulses generated by unencoded baseband signal |
US3483325A (en) * | 1966-04-22 | 1969-12-09 | Santa Rita Technology Inc | Speech processing system |
US3499986A (en) * | 1966-09-28 | 1970-03-10 | Philco Ford Corp | Speech synthesizer |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US2151091A (en) | | Signal transmission |
Flanagan et al. | | Phase vocoder |
Makhoul et al. | | High-frequency regeneration in speech coding systems |
Holmes | | The JSRU channel vocoder |
US4752956A (en) | | Digital speech coder with baseband residual coding |
US5054073A (en) | | Voice analysis and synthesis dependent upon a silence decision |
US3872250A (en) | | Method and system for speech compression |
US2817711A (en) | | Band compression system |
US4195202A (en) | | Voice privacy system with amplitude masking |
US4255620A (en) | | Method and apparatus for bandwidth reduction |
Halsey et al. | | Analysis-synthesis telephony, with special reference to the vocoder |
US4170719A (en) | | Speech transmission system |
US2810787A (en) | | Compressed frequency communication system |
US3528011A (en) | | Limited energy speech transmission and receiving system |
US3431362A (en) | | Voice-excited, bandwidth reduction system employing pitch frequency pulses generated by unencoded baseband signal |
US3995115A (en) | | Speech privacy system |
US2824906A (en) | | Transmission and reconstruction of artificial speech |
Golden | | Improving Naturalness and Intelligibility of Helium‐Oxygen Speech, Using Vocoder Techniques |
US3381093A (en) | | Speech coding using axis-crossing and amplitude signals |
US3268660A (en) | | Synthesis of artificial speech |
US3499991A (en) | | Voice-excited vocoder |
US3124654A (en) | | Transmitter |
US4039949A (en) | | Pulse code modulation with dynamic range limiting |
US3091665A (en) | | Autocorrelation vocoder equalizer |
David et al. | | Voice-excited vocoders for practical speech bandwidth reduction |