US3681530A - Method and apparatus for signal bandwidth compression utilizing the fourier transform of the logarithm of the frequency spectrum magnitude - Google Patents

Info

Publication number
US3681530A
Authority
US
United States
Prior art keywords
signal
predetermined
output
signals
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
US46128A
Inventor
Harold J Manley
Harry L Shaffer
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
GTE Sylvania Inc
Original Assignee
GTE Sylvania Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by GTE Sylvania Inc

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

A bandwidth compression system such as a digital vocoder including an analysis section employs a transducer to convert an input speech wave into an electrical signal which is then digitized by an analog-to-digital converter. The digitized signal is directed through a spectrum device where the magnitudes of the frequency spectrum of the input speech wave are obtained. These magnitudes are then directed to a logging circuit to obtain the logarithm of the frequency spectrum magnitudes of the input speech signal. The logged magnitudes of the frequency spectrum are then directed to a computer where the discrete Fourier transform of the logged spectrum magnitudes is obtained to form the Fourier transform of the logarithm of the frequency spectrum magnitude (FTLSM) of the input speech signal. An encoding unit selects and encodes certain ones of the FTLSM coefficients for transmission to a remote terminal for analysis. The encoded signals include pitch data and vocal tract impulse data, both of which are derived from the FTLSM signals. The analysis section of a vocoder terminal employs a decoding device which decodes the received data and separates it into pitch data and vocal tract impulse data. Connected to the decoding device is a computing device for computing the logarithm of the spectrum envelope of the vocal tract impulse response function using the discrete Fourier transform. The logged spectrum is directed through a delogging device to a fast Fourier transform (FFT) computer where the Fourier sine transform of the received spectrum signals (the impulse response) is obtained. A convolution unit then convolves the pitch data with the impulse response data to yield the desired synthesized speech signal.
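
The signal flow described in the abstract can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch under stated assumptions, not the patented circuitry: the function names `analyze` and `synthesize`, the Hann window, the number of retained coefficients, the pitch search range, the voicing threshold, and the zero-phase impulse response are all choices made here for illustration and do not come from the patent.

```python
import numpy as np


def analyze(frame, n_keep=32, pitch_min=40, threshold=0.1):
    """Return (low-order FTLSM coefficients, pitch lag in samples, 0 if unvoiced).

    Assumed parameters (not from the patent): n_keep, pitch_min, threshold.
    """
    windowed = frame * np.hanning(len(frame))      # weighting of the digitized frame
    magnitude = np.abs(np.fft.rfft(windowed))      # spectrum magnitudes
    log_mag = np.log(magnitude + 1e-12)            # logging step (small offset avoids log(0))
    # The log magnitude is real and even, so the forward and inverse DFT
    # differ only by a scale factor; the inverse is used here for convenience.
    ftlsm = np.fft.irfft(log_mag)
    tract = ftlsm[:n_keep]                         # low-order part: vocal tract information
    search = ftlsm[pitch_min:len(ftlsm) // 2]      # high-order part: excitation information
    peak = int(np.argmax(search)) + pitch_min
    pitch = peak if ftlsm[peak] > threshold else 0
    return tract, pitch


def synthesize(tract, pitch, n_fft, n_out, seed=0):
    """Rebuild a signal from the retained coefficients and the pitch lag."""
    coeffs = np.zeros(n_fft)
    coeffs[:len(tract)] = tract
    coeffs[-(len(tract) - 1):] = tract[1:][::-1]   # mirror to restore even symmetry
    log_envelope = np.fft.rfft(coeffs).real        # log of the spectrum envelope
    envelope = np.exp(log_envelope)                # delogging
    impulse = np.fft.irfft(envelope)               # zero-phase impulse response (assumption)
    impulse = np.roll(impulse, n_fft // 2)         # center the response in the buffer
    if pitch > 0:
        excitation = np.zeros(n_out)
        excitation[::pitch] = 1.0                  # buzz: pulses at the pitch period
    else:
        excitation = np.random.default_rng(seed).standard_normal(n_out)  # hiss
    return np.convolve(excitation, impulse)[:n_out]
```

Keeping only the first few coefficients is what reduces the data to be transmitted: the retained low-order values describe the smooth spectrum envelope, while a single lag value stands in for the excitation.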

Description

United States Patent
Manley et al.
[15] 3,681,530
[45] Aug. 1, 1972
[54] METHOD AND APPARATUS FOR SIGNAL BANDWIDTH COMPRESSION UTILIZING THE FOURIER TRANSFORM OF THE LOGARITHM OF THE FREQUENCY SPECTRUM MAGNITUDE
[72] Inventors: Harold J. Manley, Sudbury; Harry L. Shaffer, Lynnfield, both of Mass.
[73] Assignee: GTE Sylvania Incorporated
[22] Filed: June 15, 1970
[21] Appl. No.: 46,128
[52] U.S. Cl.: 179/1 SA
[51] Int. Cl.: G10l 1/02, G10l 1/08
[58] Field of Search: 179/1 SA, 15.55 R; 324/77 C, 324/77 F
[56] References Cited
UNITED STATES PATENTS
3,448,216   6/1969   Kelly       179/1 SA
3,566,035   2/1971   Noll        179/1 SA
3,344,349   9/1967   Schroeder   179/1 SA
3,403,227   9/1968   Malm        179/1 SA
3,330,910   7/1967   Flanagan    179/1 SA
3,493,684   2/1970   Kelly       179/1 SA
3,471,648   10/1969  Miller      179/1 SA
OTHER PUBLICATIONS
Noll, Short-Time Spectrum and Cepstrum Techniques for Vocal Pitch Detection, J.A.S.A., 2/1964, pp. 296-302.
Shively, A Digital Processor to Generate Spectra in Real Time, IEEE Trans. on Computers, 5/1968, pp. 485-491.
Primary Examiner: Kathleen H. Claffy
Assistant Examiner: Jon Bradford Leaheey
Attorney: Norman J. O'Malley, Elmer J. Nealon and Robert T. Orner
37 Claims, 19 Drawing Figures
METHOD AND APPARATUS FOR SIGNAL BANDWIDTH COMPRESSION UTILIZING THE FOURIER TRANSFORM OF THE LOGARITHM OF THE FREQUENCY SPECTRUM MAGNITUDE
BACKGROUND OF THE INVENTION
This invention relates to speech compression systems and in particular to digital vocoder systems.
It is well known that the vocal tract, consisting of throat, mouth, tongue, lips, teeth and nasal passages, forms a time varying linear filter in which the amplitude response versus frequency characteristic is responsible for practically all the information content in a speech signal. This filter is driven by energy sources, commonly known as "buzz" and "hiss" energy sources.
The term "buzz" is associated with the type of vocal source excitation function which exists when the vocal cords are oscillating at some quasi-periodic rate (called the pitch). Under this condition the chest cavity is supplying puffs of air to the vocal tract at the quasi-periodic rate at which the vocal cords are oscillating. The term "hiss" is associated with the type of vocal source excitation which exists when the vocal cords are not oscillating in a quasi-periodic manner but are always allowing air to pass through from the chest cavity and excite the vocal tract.
For voiced sounds, e.g., vowels, the excitation is from the buzz energy source. For unvoiced sounds, e.g., ss, sh, f and whispered speech, the excitation is from the hiss source. The information content is impressed upon the speech signal by the vocal tract acting essentially as a time varying distributed constant linear filter. Thus, to recreate speech which is both intelligible and natural sounding, it is necessary to use both the information describing the time varying spectral shape and the information describing the buzz and hiss energy sources. The latter information generally takes the form of measurements of the fundamental frequency of the buzz source as a function of time (pitch extraction). Information as to whether the excitation is buzz or hiss is used by the speech compression system. Combinations of buzz and hiss excitation are used to generate some sounds, but speech compression systems do not generally try to detect the combined excitation. A decision is usually made as to whether to use buzz or hiss excitation for this combined excitation in a speech compression system of this type.
Speech compression systems using spectral analysis are generally called vocoders. In existing speech compression systems, the spectrum data are transmitted by digitally encoding the logarithm of about 16 voltage spectrum amplitudes which are derived from a filter bank spectrum analyzer. This method is known to be inefficient because of the high correlations among the various spectrum amplitudes. Various techniques are now used to remove these correlations and therefore reduce the required data rate for a given transmission fidelity. One approach which produces some improvement is the use of a delta pulse code modulation scheme in which only the decibel differences in level between adjacent frequency channels are transmitted. Another scheme is to form weighted sums of the logged, digitized spectrum amplitudes, the weighting being arranged so that the cross-correlation of the speech wave against a waveform derived from the input speech is markedly reduced.
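The delta scheme mentioned above can be illustrated with a short Python sketch. It is an assumption-laden illustration only: the function names and the choice to send the first channel level unchanged are hypothetical, and no quantizer is modeled.

```python
import numpy as np


def delta_encode(levels_db):
    """Keep the first channel level, then only channel-to-channel differences."""
    levels_db = np.asarray(levels_db, dtype=float)
    return np.concatenate(([levels_db[0]], np.diff(levels_db)))


def delta_decode(deltas):
    """Recover the channel levels by accumulating the differences."""
    return np.cumsum(deltas)
```

Because adjacent channel levels are highly correlated, the differences usually span a smaller range than the levels themselves and so need fewer bits per channel.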
Another type of vocoder is called the autocorrelation vocoder, which derives its name from the fact that in the first step of the analysis process the autocorrelation function of the speech input is measured in terms of orthonormal functions. Just as the power spectrum of the speech input varies with time (as a talker articulates various sounds), so does the autocorrelation function. There is a one-to-one correspondence between the power spectrum and the autocorrelation function of the speech signal so that measuring one is equivalent to measuring the other. Mathematically, the power spectrum and the autocorrelation function are Fourier transform pairs. Thus, autocorrelation is simply an alternative method of measuring the short time energy spectrum of the speech signal. In an autocorrelation vocoder, the input signal is first applied to the inputs of a set of orthogonal filters. The filter output signals are multiplied by the input speech signal, and the product signal is then directed through low pass filters. The output signals from the low pass filter are the coefficients in an expansion of the power spectrum.
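The Fourier transform pair relationship between the power spectrum and the autocorrelation function can be checked numerically. The Python sketch below uses the circular (periodic) autocorrelation of a finite real sequence, which is an assumption made to keep the discrete identity exact; the function names are illustrative.

```python
import numpy as np


def autocorr_direct(x):
    """Circular autocorrelation computed lag by lag in the time domain."""
    n = len(x)
    return np.array([np.dot(x, np.roll(x, -k)) for k in range(n)])


def autocorr_from_power_spectrum(x):
    """Circular autocorrelation obtained as the inverse DFT of the power spectrum."""
    power = np.abs(np.fft.fft(x)) ** 2
    return np.fft.ifft(power).real
```

Both functions return the same values for a real input, which is why measuring one quantity is equivalent to measuring the other.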
The power spectrum $P(f)$ of a speech signal is the product of the power spectrum of a pitch excitation, $V(f)$, and the magnitude squared $|H(f)|^2$ of a vocal tract transfer function $H(f)$:
$$P(f) = |H(f)|^2 \, V(f) \qquad (1)$$
As stated above, the autocorrelation function is the Fourier transform of $P(f)$ and is composed of the convolution of the transforms of $|H(f)|^2$ and $V(f)$. Practically, this means that the autocorrelation function repeats itself at multiples of the pitch period, and it is necessary to represent the vocal tract out to fairly large delay values (near one-half of a pitch period) in order to represent the speech spectrum with any fidelity. The overlap of successive autocorrelation functions due to convolution properties raises some doubt as to the validity of the values of the autocorrelation function alone as a measure of the vocal tract shape. While the autocorrelation vocoder obtains nearly independent spectral measurements, it does not solve the problem caused by confounding the spectral envelope (vocal tract) data with the excitation spectrum data, which results in higher order transmitted coefficients. Furthermore, this type of vocoder is basically an analog device yielding an output consisting of voltage spectrum values which are subsequently digitized.
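A short worked step may help show why taking a logarithm before the transform addresses this confounding. It starts from the product form of the power spectrum given above; the symbol $\mathcal{F}$ for the Fourier transform is notation introduced here, not taken from the patent text.

```latex
\begin{align*}
\log P(f) &= \log |H(f)|^2 + \log V(f) \\
\mathcal{F}\{\log P(f)\} &= \mathcal{F}\{\log |H(f)|^2\} + \mathcal{F}\{\log V(f)\}
\end{align*}
```

The logarithm turns the product into a sum, and because the Fourier transform is linear the two terms stay additive rather than convolved. The slowly varying envelope term contributes mainly to the low-order coefficients, while the excitation term, whose spectrum is roughly periodic in $f$ for voiced speech, contributes near multiples of the pitch period, so the two can be picked out from different portions of the same transform.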
SUMMARY OF THE INVENTION
Briefly, a bandwidth compression system according to the present invention includes a means for generating an electrical signal representing the Fourier transform of the logarithm of the spectrum magnitudes (FTLSM) of an input signal having excitation and impulse response information included therein. A first detection means, coupled to the means for generating the FTLSM electrical signal, is operative to separate out a first predetermined portion of the FTLSM electrical signal to represent the excitation information of the input signal. A second detection means, also coupled to the means for generating the FTLSM electrical signal, is operative to separate out a second predetermined portion of the FTLSM electrical signal to represent the impulse response information of the input signal. The bandwidth required to pass the combined first and

Claims (37)

1. A bandwidth compression system including an analysis section comprising: means for generating electrical signals representing the Fourier transform of the logarithm of the magnitudes of the spectrum of an input signal, said input signal having excitation and impulse response information included therein; first detection means coupled to said means for generating electrical signals and being operative to provide from said electrical signals an output signal representing the excitation information of said input signal; and second detection means coupled to said means for generating electrical signals and being operative to separate out a predetermined portion of said electrical signals, said predetermined portion representing the impulse response information of said input signal.
2. A processor according to claim 1 including a synthesis section comprising: impulse response means coupled to said second detection means and being operative in response to the predetermined portion of said electrical signals to generate an output signal corresponding to the impulse response information; excitation means coupled to said first detection means and being operative in response to the output signal from said first detection means to generate an excitation carrier signal; and convolution means having input connections from said impulse response means and from said excitation means and being operative to convolve the output signals from said impulse response means and from said excitation means to thereby synthesize the input signal.
3. A digital vocoder including an analysis section comprising: means for obtaining spectrum magnitude signals of an input speech signal having voicing and vocal tract information; logging means coupled to said means for obtaining spectrum magnitude signals and being operative to generate output signals representing the logarithm of the spectrum magnitude of the input speech signal; first Fourier transform means coupled to said logging means and being operative to generate output signals having magnitude and positions and representing the Fourier transform of the logarithm of spectrum magnitudes of the input speech signal; pitch detection logic means coupled to said Fourier transform means and being operative to extract a pitch signal from the output signal of said first Fourier transform means, said pitch signal having a magnitude representing the voicing information of the input speech signal; and selecting means coupled to said first Fourier transform means and being operative to select a predetermined number of the output signals of said first Fourier transform means, said predetermined number of output signals representing the vocal tract information of the input speech signal.
4. A digital vocoder according to claim 3 including an encoding means coupled to said selecting means and being operative to quantize at a predetermined rate and scale by a predetermined factor each of the predetermined number of output signals of said Fourier transform means selected by said selecting means.
5. A digital vocoder according to claim 3 including a synthesis section comprising: second Fourier transform means being operative in response to the selected output signals of said first Fourier transform means to generate output signals representing the Fourier transform of said selected output signals of said first Fourier transform means; delogging means coupled to said second Fourier transform means and being operative to generate output signals representing the antilogarithm of the output signals of said second Fourier transform means; third Fourier transform means coupled to said delogging means and being operative to generate output signals representing the vocal tract information of the input speech signal; pitch carrier generator coupled to said pitch detection logic means and being operative in response to said pitch signal to generate pitch carrier signals having predetermined rates; and convolution unit coupled to said third Fourier transform means and to said pitch carrier generator and being operative to combine the output signals of said third Fourier transform means and the pitch carrier signals from said pitch carrier generator to thereby generate the synthesized version of the input speech signal.
6. A digital vocoder according to claim 3 wherein said means for obtaining the spectrum magnitude signals of an input speech signal includes: transducer means being operative to convert said input signal into an electrical input speech signal; an analog to digital converter connected to said transducer means and being operative to convert said electrical input speech signal into a digital speech signal; computer means coupled to said analog to digital converter and being operative to generate real and imaginary signals representing the spectrum of the digital speech signal; and a magnitude computation circuit connected to said computer means and being operative to combine in a predetermined manner said real and imaginary signals to generate the spectrum magnitude signals of said input speech signal.
7. A digital vocoder according to claim 6 further including a normalization unit connected between said analog to digital converter means and said computer means and being operative to change the level of the input signals a predetermined factor to maintain the peak value of the digital speech signal to said computer means within a predetermined dynamic range.
8. A digital vocoder according to claim 6 further including a weighting function circuit connected between said analog to digital converter means and said computer means and being operative to weight the digital speech signal to obtain a smooth spectral signal from said computer means.
9. A digital vocoder according to claim 3 wherein said pitch detection logic means includes: selection means having an input connection from said first Fourier transform means and being operative to select the output signal of said first Fourier transform means having the largest magnitude; first comparator means having an input connection from said selection means and a first and second output connection, said first comparator means being operative to compare the magnitude of the selected output signal of said selection means to a predetermined threshold level and to generate an output signal at said first output connection if the magnitude of said selected output signal exceeds the predetermined threshold level and to generate a predetermined output signal at said second output connection if the magnitude of said selected output signal is less than the predetermined threshold level; and buffer storage means having a first input connection connected to the common juncture of said selection means and said first comparator means, a second input connection connected to the first output connection of said first comparator means and an output terminal and being operative to store the output signal from said selection means and to shift the stored signal to the output terminal upon the receipt of a signal from said first comparator means, whereby an unvoiced speech signal is indicated when said first comparator means has an output signal at said second output connection and a voiced speech signal is indicated when the output signal of said first Fourier transform means is shifted to the output of said buffer storage means.
10. A digital vocoder according to claim 9 further including means for determining voicing information having input connections connected to said means for obtaining spectrum magnitude signals and the first output connection of said first comparator means, a first output connection connected to the second input connection of said buffer storage means and a second output connection and being operative in response to the spectrum magnitude signals to provide an output at said first output connection when said spectrum magnitude signals include a voiced signal and to provide an output signal at said second output connection when said spectrum magnitude signals include an unvoiced signal.
11. A digital vocoder according to claim 10 wherein said means for determining voicing information comprises: means connected to said means for obtaining spectrum magnitude signals for computing a first output signal representing the low-band energy of the spectrum magnitude signals and a second output signal representing the high-band energy of the spectrum magnitude signals; means for combining said first output signal representing the low-band energy with said second output signal representing the high-band energy to form a composite signal representing the ratio of said first and second output signals; second comparator means having an input connection coupled to said means for computing, an output connection, and a predetermined threshold level and being operative to generate an output signal at its input connection when the output signal representing the low-band energy is greater than its predetermined threshold level; third comparator means having an input connection coupled to said means for combining, an output connection and a predetermined threshold level and being operative to generate an output signal at its output connection when said composite signal representing the ratio of said first and second output signals is greater than its predetermined threshold level; and fourth comparator means having a first input connection coupled to the output connection of said second comparator means, a second input connection coupled to the output connection of said third comparator means and a first output connection coupled to said buffer storage means and a second output connection and being operative to generate a signal at its first output connection when two predetermined signals are received at its first and second input connections, respectively, and to generate a signal at its second output connection when only one predetermined signal is received at either its first or second input connection.
12. A digital vocoder according to claim 7 further including a denormalizing unit coupled to said normalization unit and to said first Fourier transform means and being operative to alter the magnitude of the output signal of said first Fourier transform means in a predetermined manner related to the predetermined factor of said normalization unit.
13. A digital vocoder according to claim 12 wherein said denormalizing unit is a computer capable of solving the equation $C_0 = C'_0 - 16\sqrt{N}\,\log_2(G_N)$, where $C_0$ is the altered magnitude, $C'_0$ is the unaltered magnitude, $N$ is the selected predetermined number of output signals from said first Fourier transform means and $G_N$ is the predetermined factor of said normalization unit.
14. A digital vocoder according to claim 4 wherein said encoding means comprises: scaling factor storage means operative to store a predetermined scaling factor for each of the predetermined number of output signals of said first Fourier transform means; scaling means coupled to said scaling factor storage means and to selecting means and being operative to add each of the predetermined scaling factors to a separate one of the predetermined number of output signals of said first Fourier transform means to eliminate negative values in said predetermined number of output signals; ratio storage means operative to store a predetermined ratio signal for each of the predetermined number of output signals of said first Fourier transform means; and multiplier means coupled to said scaling means and said ratio storage means and being operative to multiply each of the scaled output signals of said scaling means by a corresponding ratio signal stored in said ratio storage means to thereby quantize each of the predetermined numbers of output signals of said first Fourier transform means.
15. A digital vocoder according to claim 14 further including gating means coupled to said multiplier means and being operable to gate certain ones of said predetermined number of output signals of said first Fourier transform means at a first predetermined rate and to gate the remainder of the output signals of said first Fourier transform means at a second predetermined rate.
16. A digital vocoder according to claim 5 wherein said second Fourier transform means is a Fourier transform computer means operable to solve the expression where Vn is the nth frequency sample of the selected output signals of said first Fourier transform means, Ck is the kth sample of the selected output signals of said first Fourier transform means and K and R are the limits of summation.
17. A digital vocoder according to claim 5 wherein said pitch carrier generator includes: first means responsive to said pitch signal from said pitch detection logic means for generating a first predetermined pitch carrier signal when the magnitude of the pitch signal indicates a voiced signal; second means responsive to said pitch signal from said pitch detection logic means for generating a second predetermined pitch carrier signal when the magnitude of the pitch signal indicates an unvoiced signal; and gating means coupled to said first and second means for generating and being operative to gate a first predetermined pitch carrier signal to said convolution means when the magnitude of the pitch signal is less than a predetermined magnitude and to gate a second predetermined pitch carrier signal to said convolution means when the magnitude of the pitch signal is greater than a predetermined magnitude.
18. A digital vocoder according to claim 17 wherein said first means for generating includes: third means for generating signals, the magnitudes of which describe a predetermined function; fourth means for generating signals, the magnitudes of which describe the slope of a line connecting the magnitudes of two successive pitch signals from said pitch detection means; and first comparator means having input connections coupled to said third and fourth means for generating and an output connection coupled to said gating means and being operative to generate a first predetermined pulse when the signals from said fourth means for generating are equal to or greater than the magnitude of the signal from said third means for generating.
19. A digital vocoder according to claim 18 including an inhibiting means responsive to said pitch signal from said pitch detection logic means to inhibit the second predetermined pitch carrier signal of said second means for generating.
20. A digital vocoder according to claim 18 wherein said third means for generating signals includes: first storage counter means having a first input connection and an output connection and being operative to store a first predetermined signal, to add to said first predetermined signal a second predetermined signal appearing at said first input connection and to supply the resultant signal to said output connection; slope means for generating a third predetermined signal; and first summation means having a first input connection coupled to the output connection of said first storage counter means, a second input connection coupled to said slope means and an output connection coupled to said first input connection of said storage counter means and to said gating means of said pitch carrier generator, said first summation means being operative to add the resultant signal of said first storage counter means to the third predetermined signal from said slope means to form said second predetermined signal and to direct said second predetermined signal simultaneously to said gating means of said pitch carrier generator and to said first storage counter means to update said first predetermined signal stored therein.
21. A digital vocoder according to claim 18 wherein said fourth means for generating signals includes: means for computing a slope signal m wherein $m = (T_{p(n-1)} - T_{pn})/T$, where $T_{pn}$ is a first pitch signal received from said pitch detection logic at a first predetermined time, $T_{p(n-1)}$ is a second pitch signal received from said pitch detection logic at a second predetermined time and $T$ is the elapsed time between said first and second predetermined times; second storage counter means having a first input connection and an output connection and being operative to store a first predetermined signal, to add to said first predetermined signal a second predetermined signal appearing at said first input connection and to supply the resultant signal to said output connection; and second summation means having a first input connection coupled to the output connection of said second storage counter means, a second input connection coupled to said means for computing a slope signal and an output connection coupled to said first input connection of said second storage counter means and to said gating means of said pitch carrier generator, said second summation means being operative to add the resultant signal of said second storage counter means to the slope signal from said means for computing a slope signal to form said second predetermined signal and to direct said second predetermined signal to said gating means of said pitch carrier generator and to said second storage means to update said first predetermined signal stored therein.
22. A digital vocoder according to claim 5 including a weighting circuit having an input connection coupled to said third Fourier transform means and an output connection coupled to said convolution means and being operative to apply weighting function signals to the output signals of said third Fourier transform means to thereby improve the quality of the synthesized version of the input speech signal.
23. A digital vocoder according to claim 22 wherein the weighting circuit includes: a masking circuit having an input connection coupled to said third Fourier transform means and being operative to select a predetermined number of the output signals of said third Fourier transform means; weighting function storage means being operative to store a predetermined number of signals corresponding to the predetermined number of output signals selected by said masking circuit; and multiplier means having input connections coupled to said masking circuit and to said weighting function storage means and an output connection coupled to said convolution means and being operative to multiply each of the predetermined number of output signals selected by said masking circuit by a different one of the predetermined number of signals stored in said weighting function storage means to thereby weight the vocal tract data being directed to said convolution means.
24. A digital vocoder according to claim 5 wherein said convolution unit includes: logic means having a first input connection coupled to said pitch carrier generator, a second input connection coupled to said third Fourier transform means, and first, second, third and fourth output connections, said logic means being operative in response to a first predetermined time period to provide a data path from said first and second input connections to said first and second output connections, respectively, and being operative in response to a second predetermined time period to provide a data path from said first and second input connections to said third and fourth output connections respectively; first storage means having first and second input connections coupled respectively to said first and second output connections of said logic means and a plurality of output connections, said first storage means being operative to store the output signals representing the vocal tract information received from the third Fourier transform means via the data path established by said logic means during said first predetermined time period and to gate from a different one of said plurality of output connections a complete set of vocal tract signals upon the receipt of each signal from said pitch carrier generator during said first predetermined time period; second storage means having first and second input connections coupled respectively to said third and fourth output connections of said logic means and a plurality of output connections, said second storage means being operative to store the output signals representing the vocal tract information received from the third Fourier transform means via the data path established by said logic means during said second predetermined time period and to gate from a different one of said plurality of output connections a complete set of vocal tract signals upon receipt of each signal from said pitch carrier generator during said second 
predetermined time period; and summing means having a plurality of input connections each coupled to one of said plurality of output connections of said first and second storage means and being operative to add the vocal tract signals from said first and second storage means whereby a synthesized version of the input speech signal is obtained.
25. A vocoder system for synthesizing a first speech signal and analyzing a second speech signal simultaneously, said first and second speech signals including voicing and vocal tract information, said digital vocoder comprising: means for generating a pitch carrier signal from said first speech signal; means for obtaining the frequency spectrum magnitude signals of said first speech signal; means coupled to said means for obtaining the frequency spectrum magnitudes of said first speech signal for converting the frequency spectrum magnitudes into signals having a first predetermined symmetry; means for obtaining the frequency spectrum magnitudes of a second speech signal; means coupled to said means for obtaining the frequency spectrum magnitudes of said second speech signal for generating signals having a second predetermined symmetry and representing the logarithm of the frequency spectrum magnitudes of said second speech signal; summing means coupled to said means for converting and to said means for generating and being operative to sum said signals having a first predetermined symmetry and said signals having a second predetermined symmetry to form a composite signal; computing means having an input connection coupled to said summing means and first and second output connections, said computing means being operative to compute a first and second set of signals representing the complex Fourier transform of said composite signal, said first set of signals having said first predetermined symmetry and being directed to said first output connection and said second set of signals having said second predetermined symmetry and being directed to said second output connection; convolution means coupled to said means for generating a pitch carrier signal and to said first output connection of said computing means and being operative to combine in a predetermined manner said pitch carrier signal and said first set of signals having said first predetermined symmetry to 
thereby generate a synthesized version of said first speech signal; pitch detection means coupled to said second output connection of said computing means and being operative to extract the voicing information of said second speech signal from said second set of signals having said second predetermined symmetry; and selection means coupled to said second output connection of said computing means and being operative to select a predetermined number of said set of signals having said second predetermined symmetry, said selected signals representing the vocal tract information of said second speech signal.
26. A vocoder system for synthesizing a first speech signal and analyzing a second speech signal simultaneously with said first and second speech signals including voicing and vocal tract information, said digital vocoder system comprising: means for generating a pitch carrier signal from said first speech signal; means for obtaining the frequency spectrum magnitudes of said first speech signal; means coupled to said means for obtaining the frequency spectrum magnitude of said first speech signal for converting the frequency spectrum magnitudes into signals having a first predetermined symmetry; computing means having first and second input ports and first, second, third and fourth output ports and being operable to compute simultaneously the Fourier transform of a set of first predetermined input signals at said first input ports, said set of first predetermined signals having a composite symmetry of said first and second predetermined symmetries and the Fourier transform of a set of second predetermined input signals at said second input port, said set of second predetermined signals having first and second predetermined symmetries and operable to direct to said first, second, third and fourth output ports respectively a first set of output signals representing the Fourier transform of the portion of the set of first predetermined input signals having the second predetermined symmetry, a second set of output signals representing the Fourier transform of the portion of the set of first predetermined input signals having the first predetermined symmetry, a third set of output signals having the first predetermined symmetry and representing the Fourier transform of the portion of the set of second predetermined input signals at said second input port and a fourth set of output signals representing the Fourier transform of the portion of the set of second predetermined input signals having the second predetermined symmetry; sampling means having an output connection 
coupled to said first input port of said computing means and being operable to sample said second speech signal over a first predetermined time interval, said first and second sets of output signals of said computing means representing the spectrum of said sampled second input speech signal; magnitude means coupled to said first and second output ports of said computing means and being operative to combine in a predetermined manner said first and second sets of output signals of said computing means to generate signals representing the frequency spectrum magnitudes of said second speech signal; means coupled to said magnitude means for generating output signals having a second predetermined symmetry and representing the logarithm of the frequency spectrum magnitudes of said second speech signal; summing means having input connections coupled to said means for converting and to said means for generating and an output connection coupled to said second input port of said computing means and being operative to sum said signals having a first predetermined symmetry with said signals having said second predetermined symmetry to form said set of second predetermined input signals, whereby said third set of output signals of said computing means represents the vocal tract information of said first speech signal and said fourth set of output signals of said computing means is the Fourier transform of the logarithm of the spectrum magnitudes representing the voicing and vocal tract data of said second speech input signal; pitch detection logic means coupled to the fourth output port of said computing means and being operative to extract a pitch signal from the fourth set of output signals of said computing means to thereby represent the voicing information of said second input speech signal; selecting means coupled to the fourth output port of said computing means and being operative to select a predetermined number of the fourth set of output signals to represent the vocal 
tract information of said second input speech signal; and convolution means coupled to said means for generating a pitch carrier signal from said first speech signal and to the third output port of said computing means and being operative to combine in a predetermined manner the pitch carrier signals with the third set of output signals of said computing means to synthesize the first speech signal.
27. A vocoder system according to claim 26 wherein said means for generating a pitch carrier signal includes: first means responsive to said first speech signal for generating a first predetermined pitch carrier signal when the magnitude of the pitch signal indicates a voiced signal; second means responsive to said pitch signal from said pitch detection logic means for generating a second predetermined pitch carrier signal when the magnitude of the pitch signal indicates an unvoiced signal; and gating means coupled to said first and second means for generating and being operative to gate a first predetermined pitch carrier signal to said convolution means when the magnitude of the pitch signal is less than a predetermined magnitude and to gate a second predetermined pitch carrier signal to said convolution means when the magnitude of the pitch signal is greater than a predetermined magnitude.
28. A vocoder system according to claim 27 wherein said first means for generating includes: third means for generating signals, the magnitudes of which describe a predetermined function; fourth means for generating signals, the magnitudes of which describe the slope of a line connecting the magnitudes of the voiced information of two successive first input signals; and first comparator means having input connections coupled to said third and fourth means for generating and an output connection coupled to said gating means of said means for generating a pitch carrier signal and being operative to generate a first predetermined pulse when the signals from said fourth means for generating are equal to or greater than the magnitude of the signal from said third means for generating.
29. A vocoder system according to claim 28 including an inhibiting means responsive to said voicing information of said first input speech signal to inhibit the second predetermined pitch carrier signal of said second means for generating when the voicing information exceeds a predetermined magnitude.
30. A vocoder system according to claim 29 wherein said third means for generating signals includes: first storage counter means having a first input connection and an output connection and being operative to store a first predetermined signal, to add to said first predetermined signal a second predetermined signal appearing at said first input connection and to supply the resultant signal to said output connection; slope means for generating a third predetermined signal; and first summation means having a first input connection coupled to the output connection of said first storage counter means, a second input connection coupled to said slope means and an output connection coupled to said first input connection of said storage counter means and to said gating means of said means for generating a carrier generator, said first summation means being operative to add the resultant signal of said first storage counter means to the third predetermined signal from said slope means to form said second predetermined signal and to direct said second predetermined signal simultaneously to said gating means of said means for generating a pitch carrier and to said first storage counter means to update said first predetermined signal stored therein.
31. A vocoder system according to claim 30 wherein said fourth means for generating signals includes: means for computing a slope signal m wherein $m = (T_{p(n-1)} - T_{pn})/T$, where $T_{pn}$ is a first voicing signal received from said first input speech signal at a first predetermined time, $T_{p(n-1)}$ is a voicing signal received from said first input speech signal at a second predetermined time and $T$ is the elapsed time between said first and second predetermined times; second storage counter means having a first input connection and an output connection and being operative to store a first predetermined signal, to add to said first predetermined signal a second predetermined signal appearing at said first input connection and to supply the resultant signal to said output connection; and second summation means having a first input connection coupled to the output connection of said second storage counter means, a second input connection coupled to said means for computing a slope signal and an output connection coupled to said first input connection of said second storage counter means and to said gating means of said means for generating a pitch carrier signal, said second summation means being operative to add the resultant signal of said second storage counter means to the slope signal from said means for computing a slope signal to form said second predetermined signal and to direct said second predetermined signal to said gating means of said means for generating a pitch carrier signal and to said second storage counter means to update said first predetermined signal stored therein.
32. A vocoder system according to claim 25 wherein said means for obtaining the frequency spectrum magnitude of said first speech signal includes: Fourier transform computer means operable to solve the expression where Vn is the nth frequency sample of said first speech signal, Ck is the kth sample of said first speech signal and k and R are predetermined limits of summation; and delogging computer means operative to obtain the antilogarithm of said expression to yield the frequency spectrum magnitude of said first speech signal.
33. A vocoder system according to claim 26 wherein: said first and second input ports of said computing means are real and imaginary input ports respectively; said first and second predetermined symmetries of said set of first predetermined input signals are even and odd symmetries respectively; said set of first predetermined input signals includes 256 samples of said input speech signal; said first set of output signals at said first output port of said computing means includes 128 samples having even symmetry and representing the Fourier transform of the even portion of the 256 samples at the real input port of said computing means; said second set of output signals at said second output port of said computing means includes 128 samples representing the Fourier transform of the portion of the 256 input samples at said real input port having odd symmetry, said first and second sets of output signals representing, respectively, the real and imaginary parts of the frequency spectrum of the 256 samples of the second input speech signal at the real input port of said computing means; said set of second predetermined input signals at said imaginary input port of said computing means includes 256 samples having even and odd symmetry associated therewith, said even symmetry portion representing the logarithm of the spectrum magnitudes of the second input speech signal and said odd symmetry portion representing the frequency spectrum of the first input speech signal; said third set of output signals at the third output port of said computing means includes 128 samples having odd symmetry and representing the Fourier transform of the odd symmetry portion of 256 samples at said imaginary input port of said computing means, said 128 samples at said third output port of said computing means represents the vocal tract information of said first speech signal; and said fourth set of output signals at the fourth output port of said computing means includes 128 samples having 
even symmetry and representing the Fourier transform of the logarithm of the spectrum magnitudes from which the vocal tract and the voicing information of the second input speech signal are derived.
34. A vocoder system according to claim 33 wherein said pitch detection logic means includes: selection means having an input connection coupled to said fourth output port of said computing means and being operative to select the sample of said fourth set of output signals having the largest magnitude; first comparator means having an input connection coupled to said selection means and a first and second output connection, said first comparator means being operative to compare the magnitude of the selected output signal of said selection means to a predetermined threshold level and to generate an output signal at said first output connection if the magnitude of said selected sample exceeds the predetermined threshold level and to generate a predetermined output signal at said second output connection if the magnitude of said selected sample is less than the predetermined threshold level; and buffer storage means having a first input connection connected to the common juncture of said selection means and said first comparator means, a second input connection connected to the first output connection of said first comparator means and an output terminal and being operative to store the selected sample from said selection means and to shift the stored sample to said output terminal upon receipt of a signal from said first comparator means, whereby an unvoiced second speech signal is indicated when said first comparator means has an output signal at said second output connection and a voiced signal is indicated when the fourth output signal of said computing means is shifted to the output of said buffer storage means.
35. A vocoder system according to claim 33 wherein said convolution unit includes: logic means having a first input connection coupled to said means for generating a pitch carrier signal, a second input connection coupled to the third output port of said computing means and first, second, third and fourth output connections, said logic means being operative in response to a first predetermined time period to provide a data path from said first and second input connections to said first and second output connections, respectively, and being operative in response to a second predetermined time period to provide a data path from said first and second input connections to said third and fourth output connections respectively; first storage means having first and second input connections coupled respectively to said first and second output connections of said logic means and a plurality of output connections, said first storage means being operative to store the output signals representing the vocal tract information received from the third output port of said computing means via the data path established by said logic means during said first predetermined time period and to gate from a different one of said plurality of output connections a complete set of vocal tract signals upon the receipt of each signal from said means for generating a pitch carrier signal during said first predetermined time period; second storage means having first and second input connections coupled respectively to said third and fourth output connections of said logic means and a plurality of output connections, said second storage means being operative to store the output signals representing the vocal tract information received from the third output port of said computing means via the data path established by said logic means during said second predetermined time period and to gate from a different one of said plurality of output connections a complete set of vocal tract signals upon 
receipt of each signal from said pitch carrier generator during said second predetermined time period; and summing means having a plurality of input connections each coupled to one of said plurality of output connections of said first and second storage means and being operative to add the vocal tract signals from said first and second storage means whereby a synthesized version of the first input speech signal is obtained.
36. A method of compressing the bandwidth of an input signal having an excitation portion and an impulse response portion comprising the steps of: generating a time variant electrical signal representing the Fourier transform of the logarithm of the spectrum magnitude of the input signal; separating out a first time interval signal of said time variant electrical signal to represent the impulse response portion of the input signal; and separating out a second time interval signal of said time variant electrical signal to represent the excitation portion of the input signal, said first and second time interval signals of said time variant electrical signal having a reduced bandwidth.
37. A method of simultaneously synthesizing a first speech signal and analyzing a second speech signal, said first and second speech signals including voicing and vocal tract data, said method comprising the steps of: generating a pitch carrier signal from said first speech signal; generating the frequency spectrum magnitude signals of said first speech signal; converting the frequency spectrum magnitude signals into signals having a first predetermined symmetry; generating the frequency spectrum magnitude signals of the second speech signal; converting the frequency spectrum magnitude signals of the second speech signal into a series of signals having a second predetermined symmetry and representing the logarithm of the frequency spectrum magnitudes of said second speech signal; combining the signals having the first predetermined symmetry with the series of signals having the second predetermined symmetry to generate a series of composite signals; generating from said series of composite signals first and second sets of signals representing the complex Fourier transform of the composite signal, said first set of signals having said first predetermined symmetry and said second set of signals having said second predetermined symmetry; combining the pitch carrier signal from said first speech signal and said first set of signals having said first predetermined symmetry to thereby generate a synthesized version of the first speech signal; selecting a predetermined number of said second set of signals to represent the vocal tract data of said second input speech signal; and selecting a predetermined number of the remaining signals of said second set of signals to represent the voicing information of said second input speech signal.
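Illustrative sketch (not part of the claims): the voicing logic recited in claim 11 can be modeled informally in software. The Python below is a minimal sketch under stated assumptions: band energy is taken to be the sum of squared spectrum magnitudes, and the names `band_energies`, `voicing_decision`, `split`, `energy_thresh` and `ratio_thresh` are hypothetical. The claim does not say what happens when neither comparator fires, so the sketch returns "none" in that case.

```python
def band_energies(spectrum_mag, split):
    """Return (low-band, high-band) energy; sum of squares is an assumption."""
    low = sum(m * m for m in spectrum_mag[:split])
    high = sum(m * m for m in spectrum_mag[split:])
    return low, high


def voicing_decision(spectrum_mag, split, energy_thresh, ratio_thresh):
    """Model the second, third and fourth comparators of claim 11."""
    low, high = band_energies(spectrum_mag, split)
    # Composite signal: ratio of low-band to high-band energy.
    ratio = low / high if high > 0 else float("inf")
    second_fires = low > energy_thresh    # second comparator
    third_fires = ratio > ratio_thresh    # third comparator
    if second_fires and third_fires:
        return "first"    # both signals present: first output connection
    if second_fires != third_fires:
        return "second"   # exactly one signal present: second output connection
    return "none"         # case not covered by the claim (assumption)
```

With a spectrum dominated by low-frequency energy both comparators fire, which corresponds to the signal routed to the buffer storage means.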
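Illustrative sketch (not part of the claims): claims 26 and 33 rely on a standard property of the discrete Fourier transform. A real even sequence transforms to a real even sequence, and a real odd sequence transforms to a purely imaginary odd sequence, so several transforms can share one complex computation and be separated afterwards by symmetry. The Python below is a sketch under assumptions: the "real" and "imaginary" input ports are modeled as the real and imaginary parts of one complex input, a naive O(N^2) DFT stands in for the Fourier transform computer, and all function names are hypothetical.

```python
import cmath


def dft(z):
    """Naive discrete Fourier transform (stand-in for the computing means)."""
    n = len(z)
    return [sum(z[k] * cmath.exp(-2j * cmath.pi * i * k / n) for k in range(n))
            for i in range(n)]


def even_part(v):
    n = len(v)
    return [(v[k] + v[-k % n]) / 2 for k in range(n)]


def odd_part(v):
    n = len(v)
    return [(v[k] - v[-k % n]) / 2 for k in range(n)]


def split_transform(x, composite):
    """One complex DFT of x + j*composite, separated into four real outputs.

    x         : real samples on the real input port
    composite : real sequence (even part + odd part) on the imaginary port
    """
    z = dft([complex(a, b) for a, b in zip(x, composite)])
    re = [v.real for v in z]
    im = [v.imag for v in z]
    x_real = even_part(re)                  # real part of DFT(x)
    x_imag = odd_part(im)                   # imaginary part of DFT(x)
    from_even = even_part(im)               # DFT of the even part (real)
    from_odd = [-v for v in odd_part(re)]   # imaginary part of DFT of the odd part
    return x_real, x_imag, from_even, from_odd
```

The four returned lists play the role of the four output ports. Which port carries which set in the patent's hardware is fixed by claim 33, not by this sketch.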
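Illustrative sketch (not part of the claims): the method of claim 36, together with the peak-and-threshold pitch logic of claim 34, can be modeled as taking the Fourier transform of the logarithm of the spectrum magnitude and then splitting the result into a short early interval (impulse response, i.e. vocal tract) and a later interval searched for a peak (excitation, i.e. pitch). The Python below is a sketch under assumptions: a naive DFT is used, a small `floor` avoids the logarithm of zero, and the parameters `n_tract`, `min_lag` and `threshold` are hypothetical values rather than figures taken from the patent.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * i * k / n) for k in range(n))
            for i in range(n)]


def cepstrum(frame, floor=1e-9):
    """Fourier transform of the log spectrum magnitude of one real frame."""
    n = len(frame)
    log_mag = [math.log(max(abs(v), floor)) for v in dft(frame)]
    # log_mag is real and even, so its transform is real; scale by 1/N.
    return [v.real / n for v in dft(log_mag)]


def analyze(frame, n_tract=8, min_lag=16, threshold=0.1):
    """Return (vocal tract samples, pitch lag in samples or None if unvoiced)."""
    c = cepstrum(frame)
    tract = c[:n_tract]                       # first time interval
    half = len(c) // 2
    lag = max(range(min_lag, half), key=lambda q: c[q])   # largest later sample
    voiced = c[lag] > threshold               # comparator against a threshold
    return tract, (lag if voiced else None)
```

For a frame built from pulses repeating every 32 samples, the later interval shows its largest sample at lag 32; a frame with no periodic excitation falls below the threshold and is reported as unvoiced.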
US46128A 1970-06-15 1970-06-15 Method and apparatus for signal bandwidth compression utilizing the fourier transform of the logarithm of the frequency spectrum magnitude Expired - Lifetime US3681530A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US4612870A 1970-06-15 1970-06-15

Publications (1)

Publication Number Publication Date
US3681530A true US3681530A (en) 1972-08-01

Family

ID=21941776

Family Applications (1)

Application Number Title Priority Date Filing Date
US46128A Expired - Lifetime US3681530A (en) 1970-06-15 1970-06-15 Method and apparatus for signal bandwidth compression utilizing the fourier transform of the logarithm of the frequency spectrum magnitude

Country Status (1)

Country Link
US (1) US3681530A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3344349A (en) * 1963-10-07 1967-09-26 Bell Telephone Labor Inc Apparatus for analyzing the spectra of complex waves
US3330910A (en) * 1964-05-06 1967-07-11 Bell Telephone Labor Inc Formant analysis and speech reconstruction
US3403227A (en) * 1965-10-22 1968-09-24 Page Comm Engineers Inc Adaptive digital vocoder
US3493684A (en) * 1966-06-15 1970-02-03 Bell Telephone Labor Inc Vocoder employing composite spectrum-channel and pitch analyzer
US3471648A (en) * 1966-07-28 1969-10-07 Bell Telephone Labor Inc Vocoder utilizing companding to reduce background noise caused by quantizing errors
US3448216A (en) * 1966-08-03 1969-06-03 Bell Telephone Labor Inc Vocoder system
US3566035A (en) * 1969-07-17 1971-02-23 Bell Telephone Labor Inc Real time cepstrum analyzer

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Noll, Short Time Spectrum and Cepstrum Techniques for Vocal Pitch Detection, J.A.S.A. 2/1964 pp. 296-302. *
Shively, A Digital Processor to Generate Spectra in Real Time, IEEE Trans. on Computers, 5/1968 pp. 485-491. *

Cited By (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4076960A (en) * 1976-10-27 1978-02-28 Texas Instruments Incorporated CCD speech processor
US4184049A (en) * 1978-08-25 1980-01-15 Bell Telephone Laboratories, Incorporated Transform speech signal coding with pitch controlled adaptive quantizing
DE2934489A1 (en) * 1978-08-25 1980-03-27 Western Electric Co CIRCUIT AND METHOD FOR VOICE SIGNAL PROCESSING
US4310721A (en) * 1980-01-23 1982-01-12 The United States Of America As Represented By The Secretary Of The Army Half duplex integral vocoder modem system
US4495620A (en) * 1982-08-05 1985-01-22 At&T Bell Laboratories Transmitting data on the phase of speech
US4914749A (en) * 1983-10-27 1990-04-03 Nec Corporation Method capable of extracting a value of a spectral envelope parameter with a reduced amount of operations and a device therefor
US4941178A (en) * 1986-04-01 1990-07-10 Gte Laboratories Incorporated Speech recognition using preclassification and spectral normalization
US5179626A (en) * 1988-04-08 1993-01-12 At&T Bell Laboratories Harmonic speech coding arrangement where a set of parameters for a continuous magnitude spectrum is determined by a speech analyzer and the parameters are used by a synthesizer to determine a spectrum which is used to determine senusoids for synthesis
US5216748A (en) * 1988-11-30 1993-06-01 Bull, S.A. Integrated dynamic programming circuit
US5715363A (en) * 1989-10-20 1998-02-03 Canon Kabushika Kaisha Method and apparatus for processing speech
US5412589A (en) * 1990-03-20 1995-05-02 University Of Michigan System for detecting reduced interference time-frequency distribution
US5809453A (en) * 1995-01-25 1998-09-15 Dragon Systems Uk Limited Methods and apparatus for detecting harmonic structure in a waveform
US6108621A (en) * 1996-10-18 2000-08-22 Sony Corporation Speech analysis method and speech encoding method and apparatus
US6026348A (en) * 1997-10-14 2000-02-15 Bently Nevada Corporation Apparatus and method for compressing measurement data correlative to machine status
US6507804B1 (en) 1997-10-14 2003-01-14 Bently Nevada Corporation Apparatus and method for compressing measurement data corelative to machine status
US6725108B1 (en) 1999-01-28 2004-04-20 International Business Machines Corporation System and method for interpretation and visualization of acoustic spectra, particularly to discover the pitch and timbre of musical sounds
US7333930B2 (en) * 2003-03-14 2008-02-19 Agere Systems Inc. Tonal analysis for perceptual audio coding using a compressed spectral representation
US20040181393A1 (en) * 2003-03-14 2004-09-16 Agere Systems, Inc. Tonal analysis for perceptual audio coding using a compressed spectral representation
US20050201578A1 (en) * 2004-03-05 2005-09-15 Siemens Audiologische Technik Gmbh Method and arrangement for transmitting signals to a hearing aid
US7580534B2 (en) * 2004-03-05 2009-08-25 Siemens Audiologische Technik Gmbh Method and arrangement for transmitting signals to a hearing aid
US20050273323A1 (en) * 2004-06-03 2005-12-08 Nintendo Co., Ltd. Command processing apparatus
US8447605B2 (en) * 2004-06-03 2013-05-21 Nintendo Co., Ltd. Input voice command recognition processing apparatus
US20070250558A1 (en) * 2004-09-28 2007-10-25 Rohde & Schwarz Gmbh & Co. Kg Method and Device for Performing Spectrum Analysis of a Wanted Signal or Noise Signal
US8768638B2 (en) * 2004-09-28 2014-07-01 Rohde & Schwarz Gmbh & Co. Kg Method and device for performing spectrum analysis of a wanted signal or noise signal
US20080281588A1 (en) * 2005-03-01 2008-11-13 Japan Advanced Institute Of Science And Technology Speech processing method and apparatus, storage medium, and speech system
US8065138B2 (en) * 2005-03-01 2011-11-22 Japan Advanced Institute Of Science And Technology Speech processing method and apparatus, storage medium, and speech system
US20090254350A1 (en) * 2006-07-13 2009-10-08 Nec Corporation Apparatus, Method and Program for Giving Warning in Connection with inputting of unvoiced Speech
US8364492B2 (en) * 2006-07-13 2013-01-29 Nec Corporation Apparatus, method and program for giving warning in connection with inputting of unvoiced speech
US9743152B2 (en) * 2008-03-31 2017-08-22 Echostar Technologies L.L.C. Systems, methods and apparatus for transmitting data over a voice channel of a wireless telephone network
US20150033250A1 (en) * 2008-03-31 2015-01-29 Echostar Technologies L.L.C. Systems, methods and apparatus for transmitting data over a voice channel of a wireless telephone network
US10186247B1 (en) 2018-03-13 2019-01-22 The Nielsen Company (Us), Llc Methods and apparatus to extract a pitch-independent timbre attribute from a media signal
US10482863B2 (en) 2018-03-13 2019-11-19 The Nielsen Company (Us), Llc Methods and apparatus to extract a pitch-independent timbre attribute from a media signal
US10629178B2 (en) 2018-03-13 2020-04-21 The Nielsen Company (Us), Llc Methods and apparatus to extract a pitch-independent timbre attribute from a media signal
US10902831B2 (en) 2018-03-13 2021-01-26 The Nielsen Company (Us), Llc Methods and apparatus to extract a pitch-independent timbre attribute from a media signal
US20210151021A1 (en) * 2018-03-13 2021-05-20 The Nielsen Company (Us), Llc Methods and apparatus to extract a pitch-independent timbre attribute from a media signal
US11749244B2 (en) * 2018-03-13 2023-09-05 The Nielson Company (Us), Llc Methods and apparatus to extract a pitch-independent timbre attribute from a media signal
US12051396B2 (en) 2018-03-13 2024-07-30 The Nielsen Company (Us), Llc Methods and apparatus to extract a pitch-independent timbre attribute from a media signal

Similar Documents

Publication Publication Date Title
US3681530A (en) Method and apparatus for signal bandwidth compression utilizing the fourier transform of the logarithm of the frequency spectrum magnitude
US3624302A (en) Speech analysis and synthesis by the use of the linear prediction of a speech wave
US5305421A (en) Low bit rate speech coding system and compression
Denes et al. Spoken digit recognition using time‐frequency pattern matching
US3995116A (en) Emphasis controlled speech synthesizer
GB1569990A (en) Frequency compensation method for use in speech analysis apparatus
US3566035A (en) Real time cepstrum analyzer
US4991215A (en) Multi-pulse coding apparatus with a reduced bit rate
US5048088A (en) Linear predictive speech analysis-synthesis apparatus
Flanagan Band width and channel capacity necessary to transmit the formant information of speech
US3448216A (en) Vocoder system
US4914702A (en) Formant pattern matching vocoder
Harris Some acoustic cues for the fricative consonants
Kelly Speech and vocoders
Makhoul Methods for nonlinear spectral distortion of speech signals
JPS6032100A (en) Lsp type pattern matching vocoder
JP2615991B2 (en) Linear predictive speech analysis and synthesis device
Nakatsui et al. Nature of helium-speech and its unscrambling
JPS61252600A (en) Lsp type pattern matching vocoder
Morris et al. An economical hardware realization of a digital linear predictive speech synthesizer
JPS58211796A (en) Voice synthesizer
EP0119033B1 (en) Speech encoder
Edwards et al. Better vocoders are coming
US5899974A (en) Compressing speech into a digital format
Kingsbury et al. A robust channel vocoder for adverse environments