EP0570362B1 - Digital speech decoder having a postfilter with reduced spectral distortion - Google Patents
Digital speech decoder having a postfilter with reduced spectral distortion
- Publication number
- EP0570362B1 (application number EP90913916A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- component
- postfilter
- speech
- signal
- filter
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/26—Pre-filtering or post-filtering
Definitions
- This invention relates generally to speech coders, and more particularly to digital speech coders that use postfilters to enhance the speech quality.
- Speech coders and decoders are known in the art. Some speech coders convert analog voice samples into digitized representations, and subsequently represent the spectral speech information through use of linear predictive coding (see, for example, document EP-A-0 294 020). Other speech coders improve upon ordinary linear predictive coding (LPC) techniques by providing an excitation signal that is related to the original voice signal.
- U.S. Patent No. 4,817,157 describes a digital speech coder and decoder having an improved vector excitation source wherein a codebook of codebook excitation vectors is accessed to select a codebook excitation signal that best fits the available information, and is used to provide a synthesized speech signal from an LPC filter that closely represents the original.
- Various post-LPC filters are often used to further condition the signal.
- One such filter is an adaptive spectral postfilter (which is typically intended to enhance the perceptual quality of the synthetic speech), and another is a post emphasis filter (which contributes brightness to the synthetic speech result).
- The numerator term attempts to cancel the general spectral shape introduced by the denominator. In prior art applications, the denominator weighting factor α is often set to about 0.8, and the numerator weighting factor β to about 0.5.
- The numerator polynomial is only partially successful in tracking the spectral shape of the denominator (in effect, the spectral characteristic of the filter tilts with time), and that discrepancy typically manifests itself as a time-varying modulation of the postfiltered speech brightness.
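The prior-art weighting described above can be illustrated with a short sketch. It assumes the common convention that the LPC polynomial is A(z) = 1 - Σ a_i z^-i, so that A(z/γ) has coefficients a_i·γ^i; that sign convention, the function name, and the sample coefficients are assumptions for illustration and are not taken from the patent text.

```python
def weight_coefficients(a, factor):
    """Return [a_1*factor, a_2*factor**2, ...], i.e. the coefficients of A(z/factor)."""
    return [coef * factor ** (i + 1) for i, coef in enumerate(a)]


# Hypothetical 2nd-order predictor coefficients, for illustration only.
a = [1.2, -0.6]
den = weight_coefficients(a, 0.8)  # denominator weighting of about 0.8
num = weight_coefficients(a, 0.5)  # numerator weighting of about 0.5
```

Because both polynomials are derived from the same coefficients with fixed factors, the numerator can only follow the denominator's shape approximately, which is the shortcoming the text goes on to address.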
- A method for producing a synthesized speech signal comprising the steps of:
  - providing an excitation signal to a linear predictive coding (LPC) filter;
  - providing from the LPC filter a synthesized speech signal;
  - providing a speech synthesis postfilter that requires a first component and a second component;
  - providing the first component including a first set of coefficients;
- the method characterised by the steps of:
  - transforming at least some of the first set of coefficients into an autocorrelation domain set of parameters;
  - spectrally smoothing the autocorrelation domain set of parameters to provide a modified first set of coefficients;
  - using the modified first set of coefficients to provide the second component for use by the speech synthesis postfilter;
  - filtering the synthesized speech signal in the speech synthesis postfilter using the first component and the second component to provide a filtered synthesized speech signal, wherein the second component adaptively tracks and cancels out a general spectral shape of the first component; and
  - rendering the filtered synthesized speech signal audible.
- Z-transform (filter) coefficients that represent the first component are converted to the autocorrelation domain.
- A spectral smoothing technique that makes use of a bandwidth expansion function is then applied to the autocorrelation sequence, and the second component polynomial coefficients are calculated from the modified autocorrelation sequence via the Levinson recursion.
- The first component is then used as the denominator, and the second component as the numerator, in the above noted filter characteristic.
- The numerator polynomial is replaced by a spectrally smoothed version of the denominator polynomial A(z/α).
- Formant bandwidth expansion does not change the smoothed spectral envelope.
- The spectrally smoothed, bandwidth-expanded version of the A(z/α) polynomial effectively minimizes time-varying spectral tilt and allows the numerator to adaptively track the general spectral shape of the denominator and cancel it out.
- An additional post emphasis filter can be used to afford more control over postfiltered speech brightness.
- This filter is a first-order filter of the form [equation not reproduced], where typically 0.2 ≤ u ≤ 0.5.
- A radio (100) embodying the invention includes an antenna (102) for receiving a speech coded radio frequency (RF) signal (101).
- An RF unit (103) processes the received signal to recover the speech coded information.
- This information is provided to a parameter decoder (105) that develops control parameters for various subsequent processes.
- An excitation source (104) as described above utilizes the parameters provided to it to create an excitation signal.
- This resultant excitation signal from the excitation source (104) is provided to an LPC filter (106) that yields a synthesized speech signal in accordance with the coded information.
- The synthesized speech signal is then pitch postfiltered (107) and spectrally postfiltered (108) to enhance the quality of the reconstructed speech.
- A post emphasis filter (109) can also be included to further enhance the resultant speech signal. (Additional details regarding the spectral postfilter (108) and the post emphasis filter (109) will be provided below.)
- The speech signal is then processed in an audio processing unit (111) and rendered audible by an audio transducer (112).
- The excitation source (104), LPC filter (106), pitch postfilter (107), adaptive spectral postfilter (108), and post emphasis filter (109) can all be provided through appropriate programming of a DSP (113).
- The adaptive spectral postfilter (108) is characterized by a first component (a denominator that is related to the filter characteristics of the LPC filter (106)) and a second component (a numerator that adaptively tracks the general spectral shape of the denominator to thereby cancel it out).
- The general form of such a filter is described in an article entitled "Real-Time Vector APC Speech Coding at 4800 bps With Adaptive Postfiltering," by Chen and Gersho, which appeared in the April, 1987 edition of the Proceedings of The International Conference on Acoustics, Speech, and Signal Processing, at pages 2185-2188.
- The numerator is developed by applying spectral smoothing techniques to the denominator polynomial.
- Spectral smoothing techniques are described in an article entitled "Spectral Smoothing Technique in PARCOR Speech Analysis - Synthesis," by Tohkura, Itakura, and Hashimoto, which appeared in the December, 1978 edition of the I.E.E.E. Transactions on Acoustics, Speech, and Signal Processing.
- Z-transform coefficients that represent the denominator are converted to the autocorrelation domain.
- Examples of such conversions can be found in Markel, J.D., and Gray, A.H., Jr., Linear Prediction of Speech (Springer-Verlag, Berlin, Heidelberg, New York, 1976).
- The spectral smoothing technique bandwidth expansion function is then applied to the autocorrelation sequence, with the numerator polynomial coefficients being calculated from the modified autocorrelation sequence via the Levinson recursion.
- The autocorrelation coefficients are multiplied by the following factors to provide the resultant numerator coefficients:
Autocorrelation Lag | Spectral Smoothing Factor |
---|---|
0 | 1.0000000 |
1 | 0.9230769 |
2 | 0.7252747 |
3 | 0.4835164 |
4 | 0.2719780 |
5 | 0.1279896 |
6 | 4.9773753E-02 |
7 | 1.5718028E-02 |
8 | 3.9295070E-03 |
9 | 7.4847753E-04 |
10 | 1.0206513E-04 |
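The smoothing factors listed above appear to coincide numerically with a binomial lag window, w[k] = C(2n, n+k) / C(2n, n) with n = 12. This is an inference from the numbers, not something stated in the text, and the function name and the choice n = 12 are assumptions. A minimal sketch that checks the match:

```python
from math import comb, isclose

# Factors as listed in the text for lags 0..10.
LISTED = [1.0000000, 0.9230769, 0.7252747, 0.4835164, 0.2719780,
          0.1279896, 4.9773753e-02, 1.5718028e-02, 3.9295070e-03,
          7.4847753e-04, 1.0206513e-04]


def binomial_window(order, n=12):
    """Binomial lag window w[k] = C(2n, n+k) / C(2n, n) for k = 0..order."""
    centre = comb(2 * n, n)
    return [comb(2 * n, n + k) / centre for k in range(order + 1)]


window = binomial_window(10)
# True when every computed factor agrees with the listed one to about 6 digits.
matches = all(isclose(w, t, rel_tol=1e-6) for w, t in zip(window, LISTED))
```

If the match holds, the window can be regenerated for other filter orders instead of being stored as a table, although the patent text itself only gives the eleven values shown.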
- The denominator and numerator are then used to characterize the adaptive spectral postfilter (108).
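Once both polynomials are available, applying the postfilter amounts to running a pole-zero difference equation over the synthesized speech. The sketch below assumes the filter has the form H(z) = (1 - Σ b_i z^-i) / (1 - Σ d_i z^-i), with `num` holding the b_i and `den` holding the d_i; the sign convention and the function name are assumptions for illustration, since the transfer function is not reproduced in this text.

```python
def pole_zero_filter(x, num, den):
    """Filter x through H(z) = (1 - sum num[i] z^-(i+1)) / (1 - sum den[i] z^-(i+1))."""
    y = []
    for n in range(len(x)):
        acc = x[n]
        for i, b in enumerate(num):   # feed-forward (numerator) taps
            if n - 1 - i >= 0:
                acc -= b * x[n - 1 - i]
        for i, d in enumerate(den):   # feedback (denominator) taps
            if n - 1 - i >= 0:
                acc += d * y[n - 1 - i]
        y.append(acc)
    return y


# Impulse response for hypothetical first-order components.
impulse = pole_zero_filter([1.0, 0.0, 0.0], [0.5], [0.8])
```

When the numerator equals the denominator the two components cancel exactly and the signal passes through unchanged, which is the limiting case of the "track and cancel" behaviour the text describes.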
- The numerator polynomial is provided by a spectrally smoothed version of the denominator polynomial.
- The spectrally smoothed, bandwidth-expanded version of the denominator polynomial effectively minimizes time-varying spectral tilt and allows the numerator to adaptively track the general spectral shape of the denominator and cancel it out.
- A bandwidth expansion factor (which specifies the degree of smoothing that is performed on the denominator) of about 1,200 Hz was used.
- The adaptive spectral postfilter is characterized by a first component, or denominator, and a second component, or numerator.
- The first component, which can be expressed as [equation not reproduced], is provided in block 202.
- The z-transform coefficients are converted to the autocorrelation domain.
- A spectral smoothing bandwidth expansion function is applied to the autocorrelation sequence, and, in the subsequent block (205), the numerator (second component) polynomial coefficients are calculated from the autocorrelation sequence modified in the previous step (204), through the use of the Levinson recursion.
- The numerator, or second component, can be expressed as 1 - B(z).
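The flow from the denominator to the numerator can be sketched end to end. This is an illustration under stated assumptions, not the patent's implementation: the denominator is taken as 1 - Σ d_i z^-i, the conversion to the autocorrelation domain is approximated by autocorrelating a truncated impulse response of the all-pole filter (a stand-in for the conversions referenced in Markel and Gray), and all function names and sample coefficients are hypothetical.

```python
def impulse_response(den, length):
    """Impulse response of 1 / (1 - sum_i den[i-1] z^-i), truncated to `length` samples."""
    h = []
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0
        for i, d in enumerate(den):
            if n - 1 - i >= 0:
                acc += d * h[n - 1 - i]
        h.append(acc)
    return h


def autocorrelation(h, max_lag):
    """Autocorrelation r[0..max_lag] of the finite sequence h."""
    return [sum(h[n] * h[n + k] for n in range(len(h) - k))
            for k in range(max_lag + 1)]


def levinson(r, order):
    """Levinson recursion: coefficients c_1..c_order of 1 - sum_i c_i z^-i."""
    c = [0.0] * order
    err = r[0]
    for m in range(order):
        k = (r[m + 1] - sum(c[j] * r[m - j] for j in range(m))) / err
        prev = c[:m]
        for j in range(m):
            c[j] = prev[j] - k * prev[m - 1 - j]
        c[m] = k
        err *= 1.0 - k * k
    return c


def smoothed_numerator(den, window, length=512):
    """Denominator -> autocorrelation -> lag-window smoothing -> numerator coefficients."""
    order = len(den)
    r = autocorrelation(impulse_response(den, length), order)
    r_smooth = [rk * wk for rk, wk in zip(r, window[:order + 1])]
    return levinson(r_smooth, order)


# Hypothetical 2nd-order denominator coefficients, for illustration only.
den = [0.96, -0.384]
# First three smoothing factors from the table in the text (lags 0, 1, 2).
b = smoothed_numerator(den, [1.0, 0.9230769, 0.7252747])
```

With a window of all ones the recursion returns the denominator coefficients themselves, so the lag window is the only step that makes the numerator a smoothed rather than identical copy of the denominator.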
- The first and second components are used to characterize the adaptive spectral postfilter, whose transfer function is the ratio of the second component (numerator) to the first component (denominator).
- The post emphasis filter (109) may be provided to afford more control over postfiltered speech brightness.
- This filter is a first-order filter of the form [equation not reproduced].
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Filters That Use Time-Delay Elements (AREA)
- Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US42292689A | 1989-10-17 | 1989-10-17 | |
US422926 | 1989-10-17 | ||
PCT/US1990/005190 WO1991006093A1 (en) | 1989-10-17 | 1990-09-17 | Digital speech decoder having a postfilter with reduced spectral distortion |
Publications (3)
Publication Number | Publication Date |
---|---|
EP0570362A4 (en) | 1993-07-01 |
EP0570362A1 (en) | 1993-11-24 |
EP0570362B1 (en) | 1999-03-17 |
Family
ID=23676980
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP90913916A (Expired - Lifetime; EP0570362B1 (en)) | Digital speech decoder having a postfilter with reduced spectral distortion | 1989-10-17 | 1990-09-17 |
Country Status (8)
Country | Link |
---|---|
EP (1) | EP0570362B1 (es) |
JP (1) | JP3158434B2 (es) |
CN (1) | CN1078371C (es) |
AT (1) | ATE177867T1 (es) |
AU (1) | AU635342B2 (es) |
DE (1) | DE69033011T2 (es) |
ES (1) | ES2131498T3 (es) |
WO (1) | WO1991006093A1 (es) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR2729246A1 (fr) * | 1995-01-06 | 1996-07-12 | Matra Communication | Analysis-by-synthesis speech coding method |
FR2729244B1 (fr) * | 1995-01-06 | 1997-03-28 | Matra Communication | Analysis-by-synthesis speech coding method |
FR2729247A1 (fr) * | 1995-01-06 | 1996-07-12 | Matra Communication | Analysis-by-synthesis speech coding method |
JP2993396B2 (ja) * | 1995-05-12 | 1999-12-20 | Mitsubishi Electric Corporation | Speech processing filter and speech synthesis apparatus |
DE19643900C1 (de) * | 1996-10-30 | 1998-02-12 | Ericsson Telefon Ab L M | Post-filtering of audio signals, especially speech signals |
US6137844A (en) * | 1998-02-02 | 2000-10-24 | Oki Telecom, Inc. | Digital filter for noise and error removal in transmitted analog signals |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4301329A (en) * | 1978-01-09 | 1981-11-17 | Nippon Electric Co., Ltd. | Speech analysis and synthesis apparatus |
US4617676A (en) * | 1984-09-04 | 1986-10-14 | At&T Bell Laboratories | Predictive communication system filtering arrangement |
JP2535833B2 (ja) * | 1986-07-03 | 1996-09-18 | NEC Corporation | Integrated circuit |
US4852169A (en) * | 1986-12-16 | 1989-07-25 | GTE Laboratories, Incorporation | Method for enhancing the quality of coded speech |
US4969192A (en) * | 1987-04-06 | 1990-11-06 | Voicecraft, Inc. | Vector adaptive predictive coder for speech and audio |
US4817157A (en) * | 1988-01-07 | 1989-03-28 | Motorola, Inc. | Digital speech coder having improved vector excitation source |
1990
- 1990-09-17 WO PCT/US1990/005190 patent/WO1991006093A1/en active IP Right Grant
- 1990-09-17 AT AT90913916T patent/ATE177867T1/de not_active IP Right Cessation
- 1990-09-17 AU AU64114/90A patent/AU635342B2/en not_active Expired
- 1990-09-17 JP JP51307390A patent/JP3158434B2/ja not_active Expired - Lifetime
- 1990-09-17 DE DE69033011T patent/DE69033011T2/de not_active Expired - Lifetime
- 1990-09-17 ES ES90913916T patent/ES2131498T3/es not_active Expired - Lifetime
- 1990-09-17 EP EP90913916A patent/EP0570362B1/en not_active Expired - Lifetime
- 1990-10-15 CN CN90108435.2A patent/CN1078371C/zh not_active Expired - Lifetime
Also Published As
Publication number | Publication date |
---|---|
ES2131498T3 (es) | 1999-08-01 |
JPH05500573A (ja) | 1993-02-04 |
DE69033011T2 (de) | 2001-10-04 |
EP0570362A4 (en) | 1993-07-01 |
AU6411490A (en) | 1991-05-16 |
DE69033011D1 (de) | 1999-04-22 |
ATE177867T1 (de) | 1999-04-15 |
CN1078371C (zh) | 2002-01-23 |
AU635342B2 (en) | 1993-03-18 |
JP3158434B2 (ja) | 2001-04-23 |
EP0570362A1 (en) | 1993-11-24 |
CN1051101A (zh) | 1991-05-01 |
WO1991006093A1 (en) | 1991-05-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP3678519B2 (ja) | Method for linear predictive analysis of an audio frequency signal, and method for coding and decoding an audio frequency signal including its application | |
EP0732686B1 (en) | Low-delay code-excited linear-predictive coding of wideband speech at 32kbits/sec | |
US6807524B1 (en) | Perceptual weighting device and method for efficient coding of wideband signals | |
JP3653826B2 (ja) | Speech decoding method and apparatus | |
DE69934320T2 (de) | Speech coder and method for codebook search | |
US7191123B1 (en) | Gain-smoothing in wideband speech and audio signal decoder | |
DE69934608T3 (de) | Adaptive compensation of the spectral distortion of a synthesized speech residual | |
EP1141946B1 (en) | Coded enhancement feature for improved performance in coding communication signals | |
US5241650A (en) | Digital speech decoder having a postfilter with reduced spectral distortion | |
EP0570362B1 (en) | Digital speech decoder having a postfilter with reduced spectral distortion | |
JP3319556B2 (ja) | Formant enhancement method | |
JPH0876799A (ja) | Wideband speech signal restoration method | |
KR100421816B1 (ko) | Speech decoding method and portable terminal device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
17P | Request for examination filed |
Effective date: 19920514 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AT BE CH DE DK ES FR GB IT LI LU NL SE |
|
17Q | First examination report despatched |
Effective date: 19961018 |
|
GRAG | Despatch of communication of intention to grant |
Free format text: ORIGINAL CODE: EPIDOS AGRA |
|
GRAG | Despatch of communication of intention to grant |
Free format text: ORIGINAL CODE: EPIDOS AGRA |
|
GRAH | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOS IGRA |
|
GRAH | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOS IGRA |
|
GRAA | (expected) grant |
Free format text: ORIGINAL CODE: 0009210 |
|
AK | Designated contracting states |
Kind code of ref document: B1 Designated state(s): AT BE CH DE DK ES FR GB IT LI LU NL SE |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: BE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 19990317 |
|
REF | Corresponds to: |
Ref document number: 177867 Country of ref document: AT Date of ref document: 19990415 Kind code of ref document: T |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: NV Representative's name: JOHN P. MUNZINGER INGENIEUR-CONSEIL Ref country code: CH Ref legal event code: EP |
|
ITF | It: translation for a ep patent filed | ||
REF | Corresponds to: |
Ref document number: 69033011 Country of ref document: DE Date of ref document: 19990422 |
|
ET | Fr: translation filed | ||
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: DK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 19990617 |
|
REG | Reference to a national code |
Ref country code: ES Ref legal event code: FG2A Ref document number: 2131498 Country of ref document: ES Kind code of ref document: T3 |
|
PLBE | No opposition filed within time limit |
Free format text: ORIGINAL CODE: 0009261 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
|
26N | No opposition filed | ||
REG | Reference to a national code |
Ref country code: GB Ref legal event code: IF02 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: LU Payment date: 20060628 Year of fee payment: 17 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: CH Payment date: 20060707 Year of fee payment: 17 |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: PCAR Free format text: JOHN P. MUNZINGER C/O CRONIN INTELLECTUAL PROPERTY;CHEMIN DE PRECOSSY 31;1260 NYON (CH) |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: PL |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: CH Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20070930 Ref country code: LI Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20070930 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: LU Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20070917 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: ES Payment date: 20090915 Year of fee payment: 20 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: AT Payment date: 20090807 Year of fee payment: 20 Ref country code: SE Payment date: 20090904 Year of fee payment: 20 Ref country code: NL Payment date: 20090915 Year of fee payment: 20 Ref country code: GB Payment date: 20090807 Year of fee payment: 20 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: DE Payment date: 20090930 Year of fee payment: 20 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: IT Payment date: 20090912 Year of fee payment: 20 |
|
REG | Reference to a national code |
Ref country code: NL Ref legal event code: V4 Effective date: 20100917 |
|
REG | Reference to a national code |
Ref country code: GB Ref legal event code: PE20 Expiry date: 20100916 |
|
EUG | Se: european patent has lapsed | ||
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: GB Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION Effective date: 20100916 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: NL Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION Effective date: 20100917 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: FR Payment date: 20090916 Year of fee payment: 20 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: DE Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION Effective date: 20100917 |
|
REG | Reference to a national code |
Ref country code: ES Ref legal event code: FD2A Effective date: 20130719 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: ES Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION Effective date: 20100918 |
|
P01 | Opt-out of the competence of the unified patent court (upc) registered |
Effective date: 20230520 |