EP0614075A2 - Speech coding method using trellis coded quantization for LPC coding

Info

Publication number
EP0614075A2
EP0614075A2 EP94103204A EP94103204A EP0614075A2 EP 0614075 A2 EP0614075 A2 EP 0614075A2 EP 94103204 A EP94103204 A EP 94103204A EP 94103204 A EP94103204 A EP 94103204A EP 0614075 A2 EP0614075 A2 EP 0614075A2
Authority
EP
European Patent Office
Prior art keywords
quantization
state
lsp
trellis
fact
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
EP94103204A
Other languages
German (de)
English (en)
Other versions
EP0614075B1 (fr)
EP0614075A3 (fr)
Inventor
Marco Fratti
Silvio Cucchi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alcatel Lucent SAS
Original Assignee
Alcatel Italia SpA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alcatel Italia SpA
Publication of EP0614075A2
Publication of EP0614075A3
Application granted
Publication of EP0614075B1
Anticipated expiration
Expired - Lifetime

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/06 Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/002 Dynamic bit allocation

Definitions

  • the present invention relates to a method for speech coding as set forth in the preamble of claim 1 and a speech coder as set forth in the preamble of claim 17.
  • From the field of communication theory, and in particular from modulation theory, the Trellis Coded Modulation technique is well known.
  • the Trellis Coded Modulation paradigm, combined with well-known quantization theories, gave rise to the Trellis Coded Quantization (TCQ) algorithm.
  • TCQ: Trellis Coded Quantization
  • TCQ is a recent technique for efficient scalar encoding of any source.
  • the main object of the present invention is therefore an efficient and effective way to apply the TCQ technique.
  • the method for speech coding is constructed as set forth in claim 1 and the speech coder as set forth in claim 17.
  • Figure 1 shows the general structure of a Trellis Coded Quantizer
  • Figure 2 is a Trellis Coded Quantizer with variable bit allocation
  • Figure 3 is a TCQ scheme for LSP difference quantization
  • Figure 4 shows the updated paths in the TCQ.
  • the trellis encoder is completely specified by:
  • the 4-state trellis is fully connected.
  • the number associated with each state transition (branch) represents the quantization value corresponding to that branch.
  • the trellis is an N-stage one, that is, it is employed for coding an N-component input vector.
  • a trellis scheme like the one in Figure 1 is an example of what is generally found in the literature. That is, the topological configuration of the trellis is the same at each quantization step.
  • the quantization level number is the same at each quantization step.
  • This configuration may not be ideal when one needs to quantize a vector whose scalar components have a different 'importance scale', according to a predefined performance criterion. In this case, a different number of bits per sample may be necessary for each vector component.
  • both the bit/sample increase and the decrease can be easily realized while carrying out the Viterbi algorithm in the encoding process.
  • the addition of parallel transitions implies an increase in the number of local transition state metrics to be evaluated.
  • pruning a state transition branch implies the assignment of a corresponding 'infinite' local transition state metric.
  • LSP: line spectrum pairs
  • the ordering property of the LSP parameters can be exploited by quantizing the differences between adjacent LSP frequencies instead of the absolute values of the LSP frequencies.
  • a proper bit allocation can be assigned to each LSP difference according to its perceptual importance.
  • When applied to the quantization of the LSP differences, the TCQ algorithm proves particularly effective, since the quantization error accumulated in quantizing the first (i-1) LSP differences can be taken into account in the search for the optimum quantization level for the i-th LSP difference.
  • Each trellis state will be assigned a "history path" at each i-th trellis stage; each state transition belonging to this history path will correspond to a pointer to the quantization level of the corresponding LSP difference.
  • the i-th quantized LSP can be reconstructed (note that this reconstructed LSP will, in general, be different for each trellis state).
  • the quantity refers to the quantized i-th LSP difference, according to the corresponding quantization level belonging to the j-th state history path.
  • the transition cost from each j-th state to each possible k-th future state is computed.
  • This transition cost is related to the quantization level associated with the corresponding transition branch; with reference to Figure 3, the transition cost is denoted as C_jk.
  • C_jk depends on the quantization error, which is measured (according to a proper metric) as a function of the "transitional" quantization level.
  • the final state with the minimum accumulated cost is selected as the "winning" one. Its index (or, equivalently, the index of the corresponding initial state) is transmitted, together with the state transition labels of its history path.
  • the initial winning state index and the state transition labels of its history path are input. All the LSP differences can be recovered from the state transition label pointers to the quantization level table. Afterward, the LSP frequencies can be reconstructed.
  • the working principle is analogous to the one described previously for the 1-D (i.e. intra-frame) case, which simply exploits the LSP ordering property.
  • the combination weights are the 2-D predictor coefficients.
  • a multi-coefficient 2-D predictor can be employed, both in the intra-frame and in the inter-frame sense, as follows: It is worth noting that the predictor length is not necessarily the same in each dimension.
  • $f_i(n) = a_i f_{i-1}(n) + b_i f_i(n-1) + c_i f_{i-1}(n-1)$ (4), that is, we introduce another inter-frame/intra-frame dependency, namely the one related to the previous (in the intra-frame sense) LSP of the previous (in the inter-frame sense) frame.
  • the third weighting coefficient can be determined in an 'optimal' way, as will be described in a following section.
  • the concept can be extended further, by introducing a multi-coefficient multi-dimensional predictor, operating with different prediction orders, according to the prediction 'direction' (i.e. intra-frame, inter-frame, various intra-frame/inter-frame combinations).
  • LSP differences can be recovered from the best state information and the related history path.
  • the LSP values can then be reconstructed by re-adding the previously (both in the intra-frame and in the inter-frame sense) reconstructed parameters, after weighting them by the corresponding predictor coefficients.
  • TCQ of the LSP parameters in a generic differential sense
  • TCQ of the LSP parameters can be carried out according to any suitable metric that allows an overall distortion to be measured as a function of successive local distortions.
  • a simple mean squared error could be used as the local metric for the quantization error.
  • the transition cost defined previously (i.e. see the 1-D case)
  • the transition cost defined previously could be defined as:
  • a weighted MSE could be employed, following (e.g.) the guidelines specified in K.K. Paliwal, B.S. Atal, "Efficient Vector Quantization of LPC Parameters at 24 Bits/Frame", Proc. ICASSP '91, pp. 661-664, where the spectral content of the speech signal at the LSP frequency location is taken into account explicitly.
  • a WMSE criterion that considers the relative weight of the specific LSP that is being quantized could also be considered.
  • formula (5) could be re-written as: where f(.,.) is a (one-dimensional or two-dimensional) weighting function that takes into account the differential LSP to be quantized and/or the quantization level that is being considered.
  • a recursive structure may be used for the computation of the reflection coefficients, starting either from the values of the autocorrelation function (and thereby using the well-known Leroux-Gueguen algorithm) or from the values of the signal covariance function (by employing the so-called covariance-lattice formulation, as explained in A. Cumani, "On a Covariance-Lattice Algorithm for Linear Prediction", Proc. ICASSP '82, pp. 651-654).
  • the Leroux-Gueguen algorithm should be suitably reformulated in order to take into account the possible quantization of the reflection coefficients after their computation at each step of the recursion.
  • each reflection coefficient can be computed as a function of each particular state.
  • the computed reflection coefficient can be quantized according to the quantization level subset 'seen' by each particular trellis state.
  • the recursive algorithm for reflection coefficient computation with embedded TCQ may be stated as follows (only the formulation related to the covariance-lattice approach is given, since the corresponding formalism for the autocorrelation approach may be derived in an analogous way; also, note that the formalism used resembles the one described in: A. Cumani, "On a Covariance-Lattice Algorithm for Linear Prediction", Proc. ICASSP '82, pp. 651-654).
  • the i-th LSP difference must be quantized; its value is computed by taking the difference between the i-th LSF and the reconstructed (i.e. quantized) (i-1)-th LSF.
  • This reconstructed (i-1)-th LSF is different for each trellis state; the i-th LSF difference thus obtained must be quantized according to the level partition "seen" by the corresponding trellis state.
  • TCQ training procedure: in particular, we start from a unique set of quantization values for each state subset; these values can be found by using a standard scalar quantization clustering procedure.
  • each possible quantization path is assigned a partition; the corresponding "cluster vector" can be derived by simply taking a proper mean of each partition value and assigning this mean value to the corresponding path state.
  • the LSF value training sequence is again input to the TCQ; a new partition set can be generated and the corresponding set of cluster vectors can be found.
  • Trellis Coded Vector Quantization is a generalization of the TCQ concept. It was introduced in T.R. Fischer, M.W. Marcellin, M. Wang, "Trellis Coded Vector Quantization", IEEE Trans. on Information Theory, Vol. IT-37, Nov. 1991, and, again, consists of using a structured codebook with an expanded set of quantization levels.
  • the trellis structure prunes the expanded number of quantization reproduction vectors down to the desired encoding rate.
  • the TCVQ procedure is carried out in exactly the same way as for the TCQ counterpart, both in the 1-D case (i.e., taking into account only the intra-frame dependency of the LSP parameters), and in the case of multi-dimensional prediction (i.e., exploiting both the intra-frame and the inter-frame dependency of LSP parameters, with any prediction length in either direction).
  • the TCVQ procedure can be carried out by recursive quantization of reflection coefficient couples (if the subvector dimension is actually 2), by using the same strategy employed for the TCQ case.
  • TCVQ generalization for the LSP multi-dimensional predictor case and for the reflection coefficients case can be derived in a straightforward manner following the corresponding TCQ descriptions.
  • the trellis level reoptimization procedure can be carried out in an analogous way as for the TCQ case.
  • the vector clusters can be constructed in an iterative way, as a function of the different trellis states and of the corresponding encoding paths. These clusters are obtained as the 'centroid' (according to a predetermined metric) of the corresponding partitions of the input vector set.
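
The 1-D scheme outlined in the excerpts above (a history path and a reconstructed LSP per trellis state, transition costs C_jk, pruned branches with an 'infinite' metric, and decoding from the initial state plus the transition labels) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method as claimed: the names example_levels, tcq_encode and tcq_decode, the level values, the plain squared-error metric, the use of the next-state index as transition label, and the choice to start every state from a reconstructed value of 0 are all hypothetical.

```python
import math


def example_levels(n_states=4, base=50.0, step=25.0):
    """Hypothetical level table: levels[j][k] is the quantized LSP
    difference (say, in Hz) attached to the branch from state j to state k."""
    return [[base + step * (n_states * k + j) for k in range(n_states)]
            for j in range(n_states)]


def tcq_encode(lsp, levels):
    """Viterbi-style search over a fully connected trellis, one stage per LSP.

    A branch whose level is None is treated as pruned (infinite metric).
    Returns (initial_state, labels, total_cost); labels[i] is the state
    entered at stage i.
    """
    n = len(levels)
    cost = [0.0] * n                  # accumulated cost of each state
    recon = [0.0] * n                 # last reconstructed LSP of each state
    paths = [[j] for j in range(n)]   # history paths, initial state first
    for f in lsp:
        new_cost, new_recon, new_paths = [], [], []
        for k in range(n):
            best_c, best_r, best_p = math.inf, 0.0, None
            for j in range(n):
                q = levels[j][k]
                if q is None:         # pruned branch: infinite local metric
                    continue
                diff = f - recon[j]   # difference w.r.t. state j's own reconstruction
                c = cost[j] + (diff - q) ** 2   # C_jk as a squared error (assumption)
                if c < best_c:
                    best_c, best_r, best_p = c, recon[j] + q, paths[j] + [k]
            new_cost.append(best_c)
            new_recon.append(best_r)
            new_paths.append(best_p)
        cost, recon, paths = new_cost, new_recon, new_paths
    winner = min(range(n), key=lambda s: cost[s])
    return paths[winner][0], paths[winner][1:], cost[winner]


def tcq_decode(initial_state, labels, levels):
    """Rebuild the LSPs by accumulating the levels pointed to by the labels."""
    state, f, out = initial_state, 0.0, []
    for k in labels:
        f += levels[state][k]
        out.append(f)
        state = k
    return out


if __name__ == "__main__":
    frame = [180.0, 420.0, 610.0, 930.0, 1280.0, 1500.0]  # made-up LSPs
    table = example_levels()
    s0, labels, total = tcq_encode(frame, table)
    print(s0, labels, tcq_decode(s0, labels, table), total)
```

Because each state carries its own reconstructed LSP, the difference quantized at a given stage already absorbs the error accumulated along that state's path. Setting an entry of the table to None removes the corresponding branch; varying the bit allocation from one vector component to the next would additionally require a separate table per stage, which this sketch does not attempt.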

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Error Detection And Correction (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
EP94103204A 1993-03-03 1994-03-03 Method and apparatus for speech coding using trellis coded quantization for linear predictive coding quantization Expired - Lifetime EP0614075B1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
ITMI930406 1993-03-03
ITMI930406A IT1271959B (it) 1993-03-03 1993-03-03 Linear-prediction speech codec excited by a codebook

Publications (3)

Publication Number Publication Date
EP0614075A2 (fr) 1994-09-07
EP0614075A3 (fr) 1995-08-02
EP0614075B1 EP0614075B1 (fr) 2000-06-21

Family

ID=11365221

Family Applications (1)

Application Number Title Priority Date Filing Date
EP94103204A Expired - Lifetime EP0614075B1 (fr) 1993-03-03 1994-03-03 Method and apparatus for speech coding using trellis coded quantization for linear predictive coding quantization

Country Status (3)

Country Link
EP (1) EP0614075B1 (fr)
DE (1) DE69424960T2 (fr)
IT (1) IT1271959B (fr)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0230001A1 (fr) * 1985-12-17 1987-07-29 CSELT Centro Studi e Laboratori Telecomunicazioni S.p.A. Method and device for coding and decoding speech by subband analysis and vector quantization with dynamic bit allocation
US4975956A (en) * 1989-07-26 1990-12-04 Itt Corporation Low-bit-rate speech coder using LPC data reduction processing

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ADVANCES IN SPEECH CODING, 1 January 1991 pages 47 - 56 M.W. MARCELLIN ET AL. 'A Trellis-searched 16 kbit/sec speech coder with low-delay' *
ICASSP 91, vol.1, 14 May 1991, TORONTO pages 661 - 664 K.K. PALIWAL ET AL. 'Efficient vector quantization of LPC parameters at 24 bits/frame' *
IEEE TRANSACTIONS ON COMMUNICATIONS, vol.38, no.1, January 1990 pages 82 - 93 M.W.MARCELLIN ET AL. 'Trellis Coded Quantization of Memoryless and Gauss-Markov sources' *
K. SAM SHANMUGAM: "Digital and Analog Communication Systems" , , JOHN WILEY & SONS, NEW YORK *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0708435A1 (fr) * 1994-10-18 1996-04-24 Matsushita Electric Industrial Co., Ltd. Encoding and decoding apparatus of LSP (line spectrum pair) parameters
US5802487A (en) * 1994-10-18 1998-09-01 Matsushita Electric Industrial Co., Ltd. Encoding and decoding apparatus of LSP (line spectrum pair) parameters
USRE40968E1 (en) * 1994-10-18 2009-11-10 Panasonic Corporation Encoding and decoding apparatus of LSP (line spectrum pair) parameters
EP1072103A1 (fr) * 1998-04-14 2001-01-31 Motorola, Inc. Method and apparatus for quantizing a signal in a digital system
EP1072103A4 (fr) * 1998-04-14 2008-02-06 Motorola Inc Method and apparatus for quantizing a signal in a digital system
CN110853659A (zh) * 2014-03-28 2020-02-28 Samsung Electronics Co., Ltd. Quantization device for encoding an audio signal
US11848020B2 (en) 2014-03-28 2023-12-19 Samsung Electronics Co., Ltd. Method and device for quantization of linear prediction coefficient and method and device for inverse quantization
CN110853659B (zh) * 2014-03-28 2024-01-05 Samsung Electronics Co., Ltd. Quantization device for encoding an audio signal
US11922960B2 (en) 2014-05-07 2024-03-05 Samsung Electronics Co., Ltd. Method and device for quantizing linear predictive coefficient, and method and device for dequantizing same

Also Published As

Publication number Publication date
EP0614075B1 (fr) 2000-06-21
EP0614075A3 (fr) 1995-08-02
DE69424960D1 (de) 2000-07-27
ITMI930406A1 (it) 1994-09-03
ITMI930406A0 (it) 1993-03-03
IT1271959B (it) 1997-06-10
DE69424960T2 (de) 2001-01-11

Similar Documents

Publication Publication Date Title
US5271089A (en) Speech parameter encoding method capable of transmitting a spectrum parameter at a reduced number of bits
KR100492965B1 (ko) Fast search method for vector quantization
US9245532B2 (en) Variable bit rate LPC filter quantizing and inverse quantizing device and method
JP3114197B2 (ja) Speech parameter encoding method
US20030135365A1 (en) Efficient excitation quantization in noise feedback coding with general noise shaping
US6161086A (en) Low-complexity speech coding with backward and inverse filtered target matching and a tree structured mutitap adaptive codebook search
CA2037475C (fr) Method for reducing search complexity in analysis-by-synthesis coding
US20070233473A1 (en) Multi-path trellis coded quantization method and multi-path coded quantizer using the same
EP0614075B1 (fr) Method and apparatus for speech coding using trellis coded quantization for linear predictive coding quantization
US7206740B2 (en) Efficient excitation quantization in noise feedback coding with general noise shaping
KR100465316B1 (ko) Speech encoder and speech encoding method using the same
Bouzid et al. Optimized trellis coded vector quantization of LSF parameters, application to the 4.8 kbps FS1016 speech coder
US7110942B2 (en) Efficient excitation quantization in a noise feedback coding system using correlation techniques
Cao et al. A fast search algorithm for vector quantization using a directed graph
CN100367347C (zh) Speech signal encoder and speech signal decoder
Lahouti et al. Quantization of LSF parameters using a trellis modeling
EP0483882B1 (fr) Speech parameter encoding method capable of transmitting a spectrum parameter at a reduced number of bits
US8560306B2 (en) Method and apparatus to search fixed codebook using tracks of a trellis structure with each track being a union of tracks of an algebraic codebook
Korse et al. GMM-based iterative entropy coding for spectral envelopes of speech and audio
Mohammadi et al. Low cost vector quantization methods for spectral coding in low rate speech coders
Popescu et al. CELP coding using trellis-coded vector quantization of the excitation
EP0755047B1 (fr) Speech parameter encoding method capable of transmitting a spectrum parameter at a reduced bit rate
Ghido et al. Optimization-quantization for least squares estimates and its application for lossless audio compression
Chatterjee et al. A mixed-split scheme for 2-D DPCM based LSF quantization
CA2494946C (fr) Speech codec

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): DE FR GB IT

PUAL Search report despatched

Free format text: ORIGINAL CODE: 0009013

AK Designated contracting states

Kind code of ref document: A3

Designated state(s): DE FR GB IT

17P Request for examination filed

Effective date: 19960111

RHK1 Main classification (correction)

Ipc: G10L 9/14

16A New documents despatched to applicant after publication of the search report
17Q First examination report despatched

Effective date: 19980803

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

RTI1 Title (correction)

Free format text: METHOD AND APPARATUS FOR SPEECH CODING USING TRELLIS CODED QUANTIZATION FOR LINEAR PREDICTIVE CODING QUANTIZATION

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

RIC1 Information provided on ipc code assigned before grant

Free format text: 7G 10L 19/04 A

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: ALCATEL

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): DE FR GB IT

ET Fr: translation filed
ITF It: translation for a ep patent filed

Owner name: BORSANO CORRADO

REF Corresponds to:

Ref document number: 69424960

Country of ref document: DE

Date of ref document: 20000727

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20010214

Year of fee payment: 8

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: FR

Payment date: 20010313

Year of fee payment: 8

Ref country code: DE

Payment date: 20010313

Year of fee payment: 8

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed
REG Reference to a national code

Ref country code: GB

Ref legal event code: IF02

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20020303

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20021001

GBPC Gb: european patent ceased through non-payment of renewal fee

Effective date: 20020303

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: FR

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20021129

REG Reference to a national code

Ref country code: FR

Ref legal event code: ST

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IT

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES;WARNING: LAPSES OF ITALIAN PATENTS WITH EFFECTIVE DATE BEFORE 2007 MAY HAVE OCCURRED AT ANY TIME BEFORE 2007. THE CORRECT EFFECTIVE DATE MAY BE DIFFERENT FROM THE ONE RECORDED.

Effective date: 20050303