WO1995028699A1 - Differential-transform-coded excitation for speech and audio coding - Google Patents

Differential-transform-coded excitation for speech and audio coding

Info

Publication number
WO1995028699A1
Authority
WO
WIPO (PCT)
Prior art keywords
signal
quantization
sound signal
transform
spectral
Prior art date
Application number
PCT/CA1995/000216
Other languages
French (fr)
Inventor
Jean-Pierre Adoul
Claude Laflamme
Redwan Salami
Roch Lefebvre
Original Assignee
Universite De Sherbrooke
Priority date
Filing date
Publication date
Application filed by Universite De Sherbrooke filed Critical Universite De Sherbrooke
Priority to AU22509/95A priority Critical patent/AU2250995A/en
Publication of WO1995028699A1 publication Critical patent/WO1995028699A1/en

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 - Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 - Speech or audio signals analysis-synthesis techniques for redundancy reduction using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032 - Quantisation or dequantisation of spectral components
    • G10L19/038 - Vector quantisation, e.g. TwinVQ audio

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

A method for coding speech is disclosed. This method, called DTCX (Differential-Transform-Coded Excitation), combines in a new way the best features of time-domain techniques such as CELP (Code-Excited Linear Prediction) with the best features of frequency-domain techniques such as TC (Transform Coding), while avoiding their respective drawbacks. The invention preserves the principle of error minimization in the perceptually-weighted speech (/audio) domain which is found in CELP, along with techniques such as linear filtering and pitch prediction; yet it circumvents the complexity of the CELP analysis-by-synthesis approach by quantizing the target signal directly. The invention also takes advantage of the efficient frequency-domain differential-quantization techniques typical of transform coding (TC), such as spectral decimation, flexible bit allocation, as well as numerous forms of stored or algebraic vector quantization techniques. In addition, it is the difference between the current and previous spectra which is quantized, resulting in enhanced performance, in particular for audio coding. Yet unlike TC, the invention is essentially free from the framing problems that plague block transforming of continuous processes.

Description

DIFFERENTIAL-TRANSFORM-CODED EXCITATION
FOR SPEECH AND AUDIO CODING
BACKGROUND OF THE INVENTION
1. Field of the invention:
The present invention relates to a new technique for digitally encoding and decoding, in particular but not exclusively, speech and audio signals, with a view to transmitting and synthesizing these speech and audio signals.
2. Brief description of the prior art:
Efficient digital speech coding techniques with good subjective quality/bit-rate tradeoffs are increasingly in demand for numerous applications. Recently, CELP [Schroeder, M.R. & B. Atal, "Code-Excited Linear Prediction (CELP): high-quality speech at very low bit rates", IEEE ICASSP 1985] and Algebraic CELP [Adoul, J.-P. & Laflamme, C., "Dynamic Codebook For Efficient Speech Coding Based on Algebraic Codes", WO 91/13432 published on September 5, 1991] techniques have been developed successfully for voice transmission at rates between 4 and 8 kbps for applications such as land mobile, digital radio, secure telephony, etc. However (unless block sizes are reduced to but a few samples), CELP becomes impractical above 8 kbps as codebook sizes and search times increase exponentially with bit rate.
Differences and similarities between the prior art and the present invention:
The present invention, called DTCX (Differential-Transform-Coded Excitation), retains several features of CELP but circumvents the complexity limitation (DTCX's complexity tends to decrease with bit rate).
Along with CELP, the invention belongs to the "excited linear prediction (LP)" techniques. In this class of techniques, the reconstructed (i.e. decoded) signal is obtained by exciting a slowly-varying linear prediction (LP) filter, also referred to as the "synthesis filter". Since the excitation is an LP-residual (i.e. whitened) signal, the signal-to-noise performance of this type of coder readily reaps the full benefit of the linear-prediction gain.
TCX (Transform-Coded Excitation) and CELP use, however, opposite search strategies to achieve these common goals. To understand the fundamental difference, it is best to refer to the so-called "backward filtering" formulation of CELP [Adoul, J.-P.; Mabilleau, Ph.; Delprat, M.; and Morissette, S.; "Fast CELP Coding Based on Algebraic Code", IEEE ICASSP 1987]. In this formulation, a "target signal" is computed. Simply put, the coding problem is to find the winner, that is, the particular innovative component which, once LP-synthesized, will be the closest (in the mean-squared-error sense) to this appropriately named target signal.
The way CELP solves the problem is called "analysis-by-synthesis". In this approach, each possible innovative component (i.e. codebook entry) is LP-filtered one by one to yield the winner.
By contrast, and this is the crux, the invention takes the reverse path. Namely, the target signal itself is differentially quantized, and the winning innovation component is reached by (single) inverse LP-filtering of this quantized target.
There is still another very useful feature, so far left unmentioned, that the invention shares with CELP and which constitutes a distinct improvement over older "excited linear prediction (LP)" techniques such as RELP (Residual Excited LP). This has to do with the question of properly chaining up the reconstructed output blocks in the context of block processing. The target signals are not sample blocks taken out of a continuous process, but are so-called "decontextualized" signals (free of edge, or "ringing", considerations).
The fact that the invention is based on the differential quantization of a target signal offers the distinct possibility of taking advantage of efficient frequency-domain quantization techniques typical of Transform Coding (TC) while staying essentially free from the framing problems that plague block transforming of continuous processes. As a matter of fact, statistically invariant properties of speech (/audio) are often more readily usable in the frequency domain. This fact enables many efficient coding techniques including spectral decimation, flexible bit allocation, as well as numerous forms of stored or algebraic vector quantization techniques.
OBJECTS OF THE INVENTION
The main object of the invention is to formulate a general speech/audio-coding framework which combines in a new way the advantages of both the most efficient time-domain and frequency-domain analysis and encoding methods.
It is also an object of the invention to describe typical examples of perceptually-meaningful differential quantization procedures in the frequency domain to be used within the said general framework.
A further object of the invention is to provide an "excited linear prediction (LP) " technique using short-term (and, possibly, long-term) prediction analysis to obtain a residual (i.e. whitened) signal to which a series of perceptual and frequency transformations are applied in order to perform both a perceptually-meaningful and efficient differential quantization procedure in the frequency domain.
SUMMARY OF THE INVENTION
More specifically, in accordance with the present invention, there is provided a method of coding a sound signal to produce an index signal to be decoded into an excitation signal to be supplied to a synthesis filter to synthesize the sound signal, comprising the steps of: converting the sound signal into a frequency-domain signal by means of a given frequency transform; subtracting a previous frequency-domain signal produced by the converting step, from a current frequency-domain signal produced by this converting step to generate a difference signal; and conducting a spectral quantization on the difference signal to produce the index signal.
Also in accordance with the present invention, there is provided a device for coding a sound signal to produce an index signal to be decoded into an excitation signal to be supplied to a synthesis filter to synthesize this sound signal, comprising: first means for converting the sound signal into a frequency-domain signal by means of a given frequency transform; second means for subtracting a previous frequency-domain signal produced by the converting means, from a current frequency-domain signal produced by these converting means to generate a difference signal; and third means for spectrally quantizing the difference signal to produce the index signal.
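To make the core of the method concrete, the following is a minimal sketch of one encode/decode cycle, assuming a plain DFT as the frequency transform and a small stored codebook of difference spectra; the function names, the codebook, and the distortion measure are illustrative assumptions, not the patent's prescribed implementation.

```python
import numpy as np

def dtcx_encode_frame(v, X_prev, codebook):
    """One coding cycle: transform, subtract previous spectrum, quantize.

    v: current target frame; X_prev: previous quantized spectrum held by
    the local decoder; codebook: (K, N) array of complex difference spectra.
    """
    X = np.fft.fft(v)                # any reversible transform would do
    D = X - X_prev                   # differential spectrum
    # Single pass over the codebook; no analysis-by-synthesis filtering.
    err = np.sum(np.abs(codebook - D) ** 2, axis=1)
    i = int(np.argmin(err))          # index signal sent to the decoder
    X_hat = X_prev + codebook[i]     # updated local-decoder spectrum
    return i, X_hat

def dtcx_decode_frame(i, X_prev, codebook):
    """Decoder mirror: add the quantized difference back, invert the transform."""
    X_hat = X_prev + codebook[i]
    v_hat = np.fft.ifft(X_hat).real  # quantized signal, fed to the synthesis filter
    return v_hat, X_hat
```

Both sides carry X_prev across frames, so only the spectral difference is ever quantized, which is the differential aspect described above.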
According to preferred embodiments of the invention:
- the difference signal is quantized using a weighted mean-squared error criterion;
- the sound signal is sampled and arranged into frames of N consecutive samples applied to the converting step, N being an integer;
- a pitch-correlated component based on a past excitation signal is produced and subtracted from the sound signal prior to spectral quantization;
- the sound signal is perceptually weighted through a filter means, or the difference signal is perceptually weighted through the spectral quantization, which is based on a weighted-distortion measure;
- a ringing component is produced and removed from the sound signal prior to spectral quantization, this ringing component being a current effect of quantization errors incurred in previous sample frames;
- spectral quantization comprises a decimation step; and
- spectral quantization comprises decomposing the difference signal into amplitude and phase components prior to quantization, quantizing the amplitude components through at least one stored or algebraic vector quantization technique, and quantizing the phase components with either a lattice or a trellis based on a weighted cosine distortion measure.
The objects, advantages and other features of the present invention will become more apparent upon reading of the following non-restrictive description of preferred embodiments thereof, given by way of example only with reference to the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
In the appended drawings:
Figure 1 is a schematic block diagram of a general speech/audio-coding framework in accordance with the present invention, describing the coder (Note that the coder incorporates a local decoder. Hence, the decoder structure is not repeated.);
Figure 2 provides details for a typical implementation of the pitch prediction module of Figure 1;
Figures 3 and 4 provide two alternate approaches for implementing the perceptually-weighted differential-transform quantization in accordance with the general speech/audio-coding framework introduced in the present invention;
Figure 5 shows details for quantizers of Figures 3 and 4; and
Figures 6 and 7 show alternate methods to remove the ringing component.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
Figure 1 illustrates a schematic block diagram for the general speech (/audio) encoding framework in accordance with the present invention.
Before being encoded by the device of Figure 1, an analog input speech or audio signal is band-filtered and sampled at the Nyquist rate (e.g. 8 kHz for telephony and 16 kHz or more for wideband applications). The resulting signal comprises a train of samples of varying amplitudes represented by 12 to 16 bits of a digital code.
Transmission is based on the encoding of blocks of N consecutive samples referred to as frames (e.g., N = 48 samples). Within a frame, samples are numbered by an index n (e.g., n = 0, 1, ..., N-1).
Let s[n] be the input signal (i.e. sampled speech or audio) for the current frame. Let also ŝ[n] be the corresponding received signal, also called the synthesized signal as it is outputted by the synthesis filter 5. Before encoding the current input frame, the received signal ŝ[n] is known up to n = -1 (or, equivalently, up to n = N-1 of the previous frame). Furthermore, it is already known that previous encodings will add a fortuitous component, z[n], to the received signal during the current frame. This phenomenon is referred to as the "ringing component" in the CELP literature. Basically, quantization errors of previous frames are still causing some "ringing" at the output of the synthesis filter.
To take this phenomenon into account, z[n] is first removed from s[n] (see 100). The difference signal, s[n] - z[n], is filtered by an analysis filter 1 to produce a residual signal r[n]. The purpose of the analysis filter 1 is to whiten the residual signal. Let A(z) be the transfer function of analysis filter 1. It is changed from frame to frame to take into consideration the varying spectral content of the input signal. Typically, A(z) is an m-th order FIR (finite impulse response) filter whose m coefficients are obtained using the well-known autocorrelation method according to either a forward or backward approach. At frame k, the transfer function of the synthesis filter 5 is 1/Ak(z), that is, the exact inverse of the analysis filter Ak(z).
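By way of illustration, the autocorrelation method referred to above can be realized with the classical Levinson-Durbin recursion; the sketch below makes assumed choices (no windowing, scipy for the FIR filtering, an arbitrary order m) that the text leaves open.

```python
import numpy as np
from scipy.signal import lfilter

def lpc_autocorrelation(x, m):
    """Coefficients [1, a1, ..., am] of A(z) obtained by the autocorrelation
    method, computed with the Levinson-Durbin recursion."""
    N = len(x)
    r = np.correlate(x, x, mode="full")[N - 1:N + m]  # lags 0..m
    a = np.zeros(m + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, m + 1):
        k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err  # reflection coeff.
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k            # prediction-error update
    return a

def whiten(x, a):
    """Residual r[n]: the frame filtered by the analysis filter A(z)."""
    return lfilter(a, [1.0], x)
```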
Next, a pitch-prediction component, p[n], is removed from the residual r[n] (see 101). The resulting difference signal, v[n] = r[n] - p[n], is inputted to the "Perceptually-weighted spectral-quantization" module 3, which produces the digital output i(k) transmitted to the decoder 102. From this digital output, the decoder will be able to retrieve the quantized version of the difference signal, v̂[n], by the inverse transformation 4. Adding back the pitch-prediction component p[n] to the signal v̂[n] (see 103) yields the quantized version of the residual, r̂[n], also called the "excitation" of the synthesis filter 5.
The pitch prediction component, p[n], is produced by a pitch prediction module 6 which is detailed in Figure 2. The pitch analysis is similar to that used in CELP/ACELP coders. However, this invention proposes a new variant which improves performance, particularly in the case of backward analysis/synthesis filter adaptation. In the prior art, the pitch lag T (typically T ∈ {20, 21, ..., 147} for telephone applications) and the prediction gain Gp which minimize the following expression are searched:
min over T and Gp:  Σn=0..N-1 ( r[n] - Gp·r̂[n-T] )²

The improvement introduced in this invention consists of considering the "corrected" excitation, r̃[n-T], instead of the traditional r̂[n-T]. As illustrated in Figure 2, the "corrected" excitation r̃[n-T] is obtained by supplying the signal ŝ[n] to a pitch delay buffer 61 to obtain a synthesized output ŝ[n-T], and filtering this synthesized output with the current analysis filter Ak(z) (see 62). The "corrected" excitation r̃[n-T] is then amplified (gain Gp, 63) to obtain the pitch prediction component p[n].
Note that ŝ[n-T] belongs to some previous frame, say frame k-j (i.e. j = 1, 2, ...). Therefore, it was synthesized from r̂[n-T] using 1/Ak-j(z), a filter possibly very different from the inverse of Ak(z) when speech undergoes rapid spectral changes.
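One possible reading of this search, in sketch form: for each candidate lag T the optimal gain Gp has a closed form, so it suffices to maximize the normalized correlation between r[n] and the corrected excitation r̃[n-T]. The buffer handling (in particular the repetition of lags shorter than the frame) is an assumption, not spelled out in the text.

```python
import numpy as np
from scipy.signal import lfilter

def pitch_search(r, s_hat_past, a_current, t_min=20, t_max=147):
    """r: current residual frame; s_hat_past: past synthesized samples
    (at least t_max of them) ending at n = -1; a_current: current A_k(z)."""
    N, L = len(r), len(s_hat_past)
    # "Corrected" excitation: past synthesis refiltered by the CURRENT A_k(z),
    # as in blocks 61-62 of Figure 2.
    r_tilde = lfilter(a_current, [1.0], s_hat_past)
    best = (-1.0, t_min, 0.0)
    for T in range(t_min, t_max + 1):
        seg = r_tilde[L - T:L - T + N]
        if len(seg) < N:              # lag shorter than frame: repeat the period
            seg = np.tile(r_tilde[L - T:], int(np.ceil(N / T)))[:N]
        num = float(np.dot(r, seg))
        den = float(np.dot(seg, seg)) + 1e-12
        # Maximizing num^2/den minimizes sum_n (r[n] - Gp*seg[n])^2 over Gp.
        if num * num / den > best[0]:
            best = (num * num / den, T, num / den)
    _, T_opt, Gp = best
    return T_opt, Gp                  # p[n] = Gp * r_tilde[n - T_opt]
```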
Figure 3 describes the perceptually-weighted spectral quantization module 3 of Figure 1. The ultimate purpose of this module is simply to quantize v[n] into v̂[n], in both the most efficient and the most subjectively meaningful way possible.
Spectral quantization (i.e. quantization performed in the frequency domain) is used for its efficiency. Among other advantages, it allows dimensionality reduction.
Turning now to the concern for subjectively meaningful quantization, it is well known in the CELP literature that minimizing the error in the so-called weighted speech (/audio) domain is subjectively meaningful; the widely used weighting filter is of the following form, where γ is a scalar constant typically between 0.7 and 0.9:
W(z) = A(z) / A(z/γ)
Unlike CELP, the present invention uses quantization. However, the quantization also seeks to minimize the (quantization) error in the weighted speech (/audio) domain.
In Figure 3, a filter F(z) 30 provides the needed weighting. As a matter of fact, by setting its transfer function to F(z) = 1/A(z/γ), this filter combines with the synthesis filter 5 (Figure 1) to create the desired global weighting: A(z)F(z) = W(z).
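Because A(z/γ) follows from A(z) simply by scaling the i-th coefficient by γ^i, both F(z) = 1/A(z/γ) and the global weighting W(z) = A(z)/A(z/γ) are direct to realize. A short sketch, with γ = 0.8 as an assumed value inside the 0.7-0.9 range quoted above:

```python
import numpy as np
from scipy.signal import lfilter

def bandwidth_expand(a, gamma):
    """Coefficients of A(z/γ): the i-th coefficient of A(z) times γ^i."""
    return a * gamma ** np.arange(len(a))

def weighting_filter_F(x, a, gamma=0.8):
    """F(z) = 1 / A(z/γ), the filter 30 of Figure 3."""
    return lfilter([1.0], bandwidth_expand(a, gamma), x)

def perceptual_weight_W(x, a, gamma=0.8):
    """W(z) = A(z) / A(z/γ), the global weighting A(z)F(z)."""
    return lfilter(a, bandwidth_expand(a, gamma), x)
```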
The filter F(z) 30 is followed by a transform such as the odd DFT (Discrete Fourier Transform) 31. Any (orthonormal) transformation can be used with varying measures of success; these include (but do not exhaust) the traditional DFT, cosine, Hadamard, Karhunen-Loève and SVD transforms. The transform output, X[j], is a spectral signal with frequency-domain index j. The transform output X'[j] from the previously received subframe is removed (see 33) from the transform output X[j], and the difference is quantized according to an MSE (mean square error) distortion (see block 32). The index, i = i(k), outputted by the quantizer 32 constitutes the digital codes at frame k. From this index, the decoder will retrieve the (best) quantization value X̂i[j], which will yield v̂[n] after applying successively the inverse transform and the inverse filtering (i.e. 1/F(z) with zero initial state) (see Figure 1).

Figure 4 describes an alternate approach for implementing the perceptually-weighted spectral-quantization module 3. In this approach, the (spectral) weighting is no longer applied through filtering; it is introduced instead in the distortion measure of the quantizer. Consequently, the difference signal, v[n], is directly applied to the frequency transform. The odd DFT 34 is used in Figure 4 for illustration purposes; again, any transformation can be used with varying measures of success. Finally, the transform output X'[j] from the previously received subframe is removed (see 35) from the transform output X[j] (a vector of N/2 complex components in the odd-DFT case), and the difference is quantized (see 36) using the following weighted mean-squared error criterion.
min Σj q²[j] | X[j] - X̂[j] |²
Where q[j] is the weight vector. Taking q[j] equal to the modulus of F(z) = 1/A(z/γ) evaluated at z = exp(i2πj/N) will result in a structure functionally equivalent to that of Figure 3.
Note that in the alternate approach of Figure 4, any weighting filter W(z) (i.e. deemed ideal) can be implemented by taking q[j] equal to the modulus of F(z) = W(z)/A(z), since W(z) no longer needs to be known at the receiver. Note further that q[j] can implement any spectral weighting based on current and past frames.

The spectral quantizer modules of Figures 3 and 4 (i.e. modules 32 or 36) can be implemented in various ways. In particular, they can make use of one, or a combination, of Vector Quantization (VQ) techniques including, but not limited to, the stored-VQ variety (e.g. gain/shape VQ, tree-structured VQ, multistage VQ, split VQ) and the algebraic variety (e.g. lattice quantization (Q), trellis-coded Q, permutation Q).
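A sketch of this second arrangement: an odd DFT producing N/2 complex bins for a real frame (matching the "vector of N/2 complex components" above), followed by weighted-MSE selection over a hypothetical codebook of difference spectra; the weight vector q would be derived from |F(z)| as just described.

```python
import numpy as np

def odd_dft(x):
    """Odd DFT: bins at frequencies 2π(j + 1/2)/N, j = 0..N/2-1. For a real
    frame these N/2 complex values carry all the information."""
    N = len(x)
    n = np.arange(N)
    j = np.arange(N // 2)
    return np.exp(-2j * np.pi * np.outer(j + 0.5, n) / N) @ x

def quantize_weighted(X, X_prev, q, codebook):
    """Pick index i minimizing sum_j q[j]^2 |D[j] - C_i[j]|^2 for the
    difference spectrum D = X - X_prev (block 36 of Figure 4)."""
    D = X - X_prev
    err = np.sum((q ** 2) * np.abs(codebook - D) ** 2, axis=1)
    return int(np.argmin(err))
```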
Figure 5 details one typical implementation for module 32 or 36. The difference between the (complex) spectral signal, X[j], and the received spectral signal X'[j] from the previous subframe is first computed. This difference is decimated, according to a rule specified by index i1, in module 50. The (dimensionally) reduced difference spectral signal is decomposed into amplitude 51 and phase 52 components prior to quantization. The amplitudes, |X[j]|, are then quantized by one or a combination of Vector Quantization techniques (module 53) of the stored or algebraic varieties. The phase components, φ[j], are quantized (module 54) with either a lattice (e.g. Gosset, Barnes-Wall, Leech ...) or a trellis, based on the following novel criterion called the weighted cosine distortion measure.
Max Σj q²[j] |X[j]| |X̂[j]| cos( φ[j] - φ̂[j] )
Where hats refer to the quantized values. The weighting vector q[j] should be omitted for the MSE quantizer 32 of Figure 3. Indexes i1, i2 and i3 are then multiplexed (module 55) to yield the (global) differential spectral quantizer index, i = i(k), at frame k.
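The amplitude/phase split of Figure 5 and the weighted cosine criterion can be sketched as follows; the amplitude codebook and the finite set of candidate phase vectors (standing in for lattice or trellis points) are hypothetical placeholders, and the decimation step is omitted.

```python
import numpy as np

def quantize_polar(D, q, amp_codebook, phase_candidates):
    """D: (reduced) difference spectrum; amp_codebook: (K2, M) magnitudes;
    phase_candidates: (K3, M) phase vectors in radians."""
    amp, phase = np.abs(D), np.angle(D)
    # Amplitudes: nearest codeword in plain MSE (modules 51 and 53).
    i2 = int(np.argmin(np.sum((amp_codebook - amp) ** 2, axis=1)))
    amp_hat = amp_codebook[i2]
    # Phases: maximize the weighted cosine measure (modules 52 and 54).
    score = np.sum((q ** 2) * amp * amp_hat *
                   np.cos(phase - phase_candidates), axis=1)
    i3 = int(np.argmax(score))
    D_hat = amp_hat * np.exp(1j * phase_candidates[i3])
    return i2, i3, D_hat
```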
As indicated in the foregoing description, DTCX (Differential-Transform-Coded Excitation) retains several features of CELP but circumvents the complexity limitation of CELP (DTCX's complexity tends to decrease with bit rate). CELP uses the approach called "analysis-by-synthesis", in which each possible innovative component (i.e. codebook entry) is LP-filtered one by one to yield the winner; by contrast, the present invention takes the reverse path, in which the target signal itself is differentially quantized and the winning innovation component is reached by (single) inverse LP-filtering of this quantized target. The differential quantization of a target signal offers the distinct possibility of taking advantage of efficient frequency-domain quantization techniques typical of Transform Coding (TC) while staying essentially free from the framing problems that plague block transforming of continuous processes. As a matter of fact, statistically invariant properties of speech (/audio) are often more readily usable in the frequency domain. This fact enables many efficient coding techniques including spectral decimation, flexible bit allocation, as well as numerous forms of stored or algebraic vector quantization techniques.
Figures 6 and 7 describe two alternate methods for removing the ringing component, among variants of the method used in Figures 1 and 2. The ringing computation is based on the discrepancy between quantized and unquantized signals (i.e. v[n] - v̂[n], or r[n] - r̂[n], or s[n] - ŝ[n]), and the proper ringing can be removed from the residual as in Figure 6 (or added to p[n]). If the weighting filter F(z) 30 of Figure 3 is implemented, an elegant solution is to remove the proper ringing from the initial filter state, as illustrated in Figure 7.
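In either variant, the ringing component itself is just the zero-input response of the filter in question, i.e. the filter run on zeros from the state left over by previous frames. A minimal sketch, assuming the synthesis filter 1/A(z) and a scipy state vector of length len(a) - 1 carried across frames:

```python
import numpy as np
from scipy.signal import lfilter

def ringing_component(a, state, N):
    """Zero-input response z[n] of 1/A(z) over one frame of N samples,
    given the internal filter state left by previously decoded frames;
    z[n] is then subtracted from s[n] as in block 100 of Figure 1."""
    z, _ = lfilter([1.0], a, np.zeros(N), zi=state)
    return z
```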
Although the present invention has been described hereinabove by way of a preferred embodiment thereof, this embodiment can be modified at will, within the scope of the appended claims, without departing from the spirit and nature of the subject invention.

Claims

The embodiments of the invention in which an exclusive property or privilege is claimed are defined as follows:
1. A method of coding a sound signal to produce an index signal to be decoded into an excitation signal to be supplied to a synthesis filter to synthesize said sound signal, comprising the steps of: converting said sound signal into a frequency-domain signal by means of a given frequency transform; subtracting a previous frequency-domain signal produced by the converting step, from a current frequency-domain signal produced by said converting step to generate a difference signal; and conducting a spectral quantization on said difference signal to produce the index signal.
2. The method of claim 1, wherein said difference signal is quantized using a weighted mean-squared error criterion.
3. The method of claim 1, further comprising the step of sampling the sound signal and arranging said sampled sound signal into frames of N consecutive samples applied to said converting step, N being an integer.
4. The method of claim 1, further comprising the step of producing a pitch-correlated component based on a past excitation signal, and the step of subtracting said pitch-correlated component from said sound signal prior to said spectral quantization.
5. The method of claim 1, further comprising the step of perceptually weighting said sound signal through a filter means.
6. The method of claim 1, further comprising the step of perceptually weighting said difference signal through said spectral quantization which is based on a weighted-distortion measure.
7. The method of claim 6, comprising the steps of sampling said sound signal and arranging said sampled sound signal into frames of N consecutive samples, N being an integer, wherein said weighted-distortion measure uses a weighting filter at a current frame.
8. The method of claim 6, comprising the steps of sampling said sound signal and arranging said sampled sound signal into frames of N consecutive samples, N being an integer, wherein said weighted-distortion measure implements a spectral weighting based on current and past frames.
9. The method of claim 1, further comprising the steps of: sampling said sound signal; arranging said sampled sound signal into frames of N consecutive samples, N being an integer; and producing a ringing component and removing said ringing component from said sound signal prior to said spectral quantization, said ringing component being a current effect of quantization errors incurred in previous sample frames.
10. The method of claim 9, further comprising the step of perceptually weighting said sound signal through a filter, wherein said ringing component is removed by modifying an initial state of said filter.
11. The method of claim 1, in which said spectral quantization uses a reversible frame transform.
12. The method of claim 11, in which said reversible frame transform is selected from the group consisting of discrete Fourier transform, odd discrete Fourier transform, cosine transform, Karhunen-Loeve transform, and SVD transform.
13. The method of claim 1, in which said spectral quantization comprises a decimation step.
14. The method of claim 1, in which said step of conducting a spectral quantization comprises using one, or a combination of vector quantization techniques.
15. The method of claim 14, in which said quantization techniques are selected from the group consisting of gain/shape vector quantization, tree-structured vector quantization, multistage vector quantization, split vector quantization, lattice quantization, trellis-coded quantization, and permutation quantization.
16. The method of claim 1, in which said step of conducting a spectral quantization comprises: decomposing said difference signal into amplitude and phase components prior to quantization; quantizing the amplitude components through at least one stored or algebraic vector quantization technique; and quantizing the phase components with either a lattice or a trellis based on a weighted cosine distortion measure.
17. The method of claim 4, comprising improving the pitch-correlated component by considering a refined version of the past excitation signal which reflects a current transfer function of the synthesis filter, said refined version of the excitation signal being obtained by inverse filtering the past synthesized signal.
18. A device for coding a sound signal to produce an index signal to be decoded into an excitation signal to be supplied to a synthesis filter to synthesize said sound signal, comprising: first means for converting said sound signal into a frequency-domain signal by means of a given frequency transform; second means for subtracting a previous frequency-domain signal produced by the converting means, from a current frequency-domain signal produced by said converting means to generate a difference signal; and third means for spectrally quantizing said difference signal to produce the index signal.
19. A device as recited in claim 18, wherein said third means comprises means for quantizing said difference signal using a weighted mean-squared error criterion.
20. A device as recited in claim 18, further comprising means for sampling the sound signal and means for arranging said sampled sound signal into frames of N consecutive samples applied to said converting means, N being an integer.
21. A device as recited in claim 18, further comprising means for producing a pitch-correlated component based on a past excitation signal, and means for subtracting said pitch-correlated component from said sound signal prior to said spectral quantization.
22. A device as recited in claim 18, further comprising filter means for perceptually weighting said sound signal.
23. A device as recited in claim 18, further comprising means for perceptually weighting said difference signal through said spectral quantization which is based on a weighted-distortion measure.
24. A device as recited in claim 18, further comprising: means for sampling said sound signal; means for arranging said sampled sound signal into frames of N consecutive samples, N being an integer; means for producing a ringing component which is a current effect of quantization errors incurred in previous sample frames; and means for removing said ringing component from said sound signal prior to said spectral quantization.
25. A device as recited in claim 24, further comprising filter means for perceptually weighting said sound signal, wherein said ringing component removing means comprises means for modifying an initial state of said filter means to remove said ringing component.
26. A device as recited in claim 18, in which said third means comprises: means for decomposing said difference signal into amplitude and phase components prior to quantization; means for quantizing the amplitude components through at least one stored or algebraic vector quantization technique; and means for quantizing the phase components with either a lattice or a trellis based on a weighted cosine distortion measure.
27. A device as recited in claim 21, comprising means for improving the pitch-correlated component by considering a refined version of the past excitation signal which reflects a current transfer function of the synthesis filter, said refined version of the excitation signal being obtained by inverse filtering the past synthesized signal.
PCT/CA1995/000216 1994-04-19 1995-04-18 Differential-transform-coded excitation for speech and audio coding WO1995028699A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU22509/95A AU2250995A (en) 1994-04-19 1995-04-18 Differential-transform-coded excitation for speech and audio coding

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CA 2121667 CA2121667A1 (en) 1994-04-19 1994-04-19 Differential-transform-coded excitation for speech and audio coding
CA2,121,667 1994-04-19

Publications (1)

Publication Number Publication Date
WO1995028699A1 true WO1995028699A1 (en) 1995-10-26

Family

ID=4153411

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CA1995/000216 WO1995028699A1 (en) 1994-04-19 1995-04-18 Differential-transform-coded excitation for speech and audio coding

Country Status (3)

Country Link
AU (1) AU2250995A (en)
CA (1) CA2121667A1 (en)
WO (1) WO1995028699A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002093559A1 (en) * 2001-05-11 2002-11-21 Matsushita Electric Industrial Co., Ltd. Device to encode, decode and broadcast audio signal with reduced size spectral information
EP2077551A1 (en) * 2008-01-04 2009-07-08 Dolby Sweden AB Audio encoder and decoder
US7738558B2 (en) 2007-07-23 2010-06-15 Huawei Technologies Co., Ltd. Vector coding method and apparatus and computer program
CN101086845B (en) * 2006-06-08 2011-06-01 北京天籁传音数字技术有限公司 Sound coding device and method and sound decoding device and method
CN103366751A (en) * 2012-03-28 2013-10-23 北京天籁传音数字技术有限公司 Sound coding and decoding apparatus and sound coding and decoding method
US9224403B2 (en) 2010-07-02 2015-12-29 Dolby International Ab Selective bass post filter

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4907276A (en) * 1988-04-05 1990-03-06 The Dsp Group (Israel) Ltd. Fast search method for vector quantizer communication and pattern recognition systems
US5206884A (en) * 1990-10-25 1993-04-27 Comsat Transform domain quantization technique for adaptive predictive coding
EP0590155A1 (en) * 1992-03-18 1994-04-06 Sony Corporation High-efficiency encoding method

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4907276A (en) * 1988-04-05 1990-03-06 The Dsp Group (Israel) Ltd. Fast search method for vector quantizer communication and pattern recognition systems
US5206884A (en) * 1990-10-25 1993-04-27 Comsat Transform domain quantization technique for adaptive predictive coding
EP0590155A1 (en) * 1992-03-18 1994-04-06 Sony Corporation High-efficiency encoding method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
BHASKAR: "ADAPTIVE PREDICTIVE CODING WITH TRANSFORM DOMAIN QUANTIZATION", IN "SPEECH AND AUDIO CODING FOR WIRELESS AND NETWORK APPLICATIONS" BY ATAL, CUPERMAN AND GERSHO, BOSTON - DORDRECHT - LONDON, XP000470450 *
BOCHOW ET AL.: "MULTIPROCESSOR IMPLEMENTATION OF AN ATC AUDIO CODEC", INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING 89, vol. 3, 23 May 1989 (1989-05-23) - 26 May 1989 (1989-05-26), GLASGOW, GB, pages 1981 - 1984, XP000089270 *
LEFEBVRE ET AL.: "8 kbit/s coding of speech with 6 ms frame-length", INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING 93, 27 April 1993 (1993-04-27) - 30 April 1993 (1993-04-30), MINNEAPOLIS, MN, US, pages 612 - 615 *

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002093559A1 (en) * 2001-05-11 2002-11-21 Matsushita Electric Industrial Co., Ltd. Device to encode, decode and broadcast audio signal with reduced size spectral information
CN101086845B (en) * 2006-06-08 2011-06-01 北京天籁传音数字技术有限公司 Sound coding device and method and sound decoding device and method
US7738558B2 (en) 2007-07-23 2010-06-15 Huawei Technologies Co., Ltd. Vector coding method and apparatus and computer program
US7738559B2 (en) 2007-07-23 2010-06-15 Huawei Technologies Co., Ltd. Vector decoding method and apparatus and computer program
US7746932B2 (en) 2007-07-23 2010-06-29 Huawei Technologies Co., Ltd. Vector coding/decoding apparatus and stream media player
US8938387B2 (en) 2008-01-04 2015-01-20 Dolby Laboratories Licensing Corporation Audio encoder and decoder
EP2077551A1 (en) * 2008-01-04 2009-07-08 Dolby Sweden AB Audio encoder and decoder
WO2009086919A1 (en) * 2008-01-04 2009-07-16 Dolby Sweden Ab Audio encoder and decoder
US8484019B2 (en) 2008-01-04 2013-07-09 Dolby Laboratories Licensing Corporation Audio encoder and decoder
US8494863B2 (en) 2008-01-04 2013-07-23 Dolby Laboratories Licensing Corporation Audio encoder and decoder with long term prediction
US8924201B2 (en) 2008-01-04 2014-12-30 Dolby International Ab Audio encoder and decoder
US9558753B2 (en) 2010-07-02 2017-01-31 Dolby International Ab Pitch filter for audio signals
US9858940B2 (en) 2010-07-02 2018-01-02 Dolby International Ab Pitch filter for audio signals
US9224403B2 (en) 2010-07-02 2015-12-29 Dolby International Ab Selective bass post filter
US9343077B2 (en) 2010-07-02 2016-05-17 Dolby International Ab Pitch filter for audio signals
US9396736B2 (en) 2010-07-02 2016-07-19 Dolby International Ab Audio encoder and decoder with multiple coding modes
US9552824B2 (en) 2010-07-02 2017-01-24 Dolby International Ab Post filter
US9558754B2 (en) 2010-07-02 2017-01-31 Dolby International Ab Audio encoder and decoder with pitch prediction
US11610595B2 (en) 2010-07-02 2023-03-21 Dolby International Ab Post filter for audio signals
US9595270B2 (en) 2010-07-02 2017-03-14 Dolby International Ab Selective post filter
US9830923B2 (en) 2010-07-02 2017-11-28 Dolby International Ab Selective bass post filter
US11183200B2 (en) 2010-07-02 2021-11-23 Dolby International Ab Post filter for audio signals
US10236010B2 (en) 2010-07-02 2019-03-19 Dolby International Ab Pitch filter for audio signals
US10811024B2 (en) 2010-07-02 2020-10-20 Dolby International Ab Post filter for audio signals
CN103366751B (en) * 2012-03-28 2015-10-14 北京天籁传音数字技术有限公司 A kind of sound codec devices and methods therefor
CN103366751A (en) * 2012-03-28 2013-10-23 北京天籁传音数字技术有限公司 Sound coding and decoding apparatus and sound coding and decoding method

Also Published As

Publication number Publication date
CA2121667A1 (en) 1995-10-20
AU2250995A (en) 1995-11-10

Similar Documents

Publication Publication Date Title
US4868867A (en) Vector excitation speech or audio coder for transmission or storage
JP4662673B2 (en) Gain smoothing in wideband speech and audio signal decoders.
EP0942411B1 (en) Audio signal coding and decoding apparatus
EP0910067B1 (en) Audio signal coding and decoding methods and audio signal coder and decoder
RU2327230C2 (en) Method and device for frquency-selective pitch extraction of synthetic speech
EP1262956B1 (en) Signal encoding method and apparatus
JP4567238B2 (en) Encoding method, decoding method, encoder, and decoder
US6081776A (en) Speech coding system and method including adaptive finite impulse response filter
US7260523B2 (en) Sub-band speech coding system
US6782359B2 (en) Determining linear predictive coding filter parameters for encoding a voice signal
USRE43099E1 (en) Speech coder methods and systems
CA1219079A (en) Multi-pulse type vocoder
JPH11510274A (en) Method and apparatus for generating and encoding line spectral square root
EP1513137A1 (en) Speech processing system and method with multi-pulse excitation
JPH10214100A (en) Voice synthesizing method
EP2559028B1 (en) Flexible and scalable combined innovation codebook for use in celp coder and decoder
WO2009125588A1 (en) Encoding device and encoding method
US20040153317A1 (en) 600 Bps mixed excitation linear prediction transcoding
US6269332B1 (en) Method of encoding a speech signal
EP0919989A1 (en) Audio signal encoder, audio signal decoder, and method for encoding and decoding audio signal
WO1995028699A1 (en) Differential-transform-coded excitation for speech and audio coding
WO2000057401A1 (en) Computation and quantization of voiced excitation pulse shapes in linear predictive coding of speech
Lefebvre et al. 8 kbit/s coding of speech with 6 ms frame-length
JP2000132193A (en) Signal encoding device and method therefor, and signal decoding device and method therefor
JPH10260698A (en) Signal encoding device

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AM AT AU BB BG BR BY CH CN CZ DE DK EE ES FI GB GE HU IS JP KE KG KP KR KZ LK LR LT LU LV MD MG MN MW MX NO NZ PL PT RO RU SD SE SG SI SK TJ TM TT UA US UZ VN

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): KE MW SD SZ UG AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase