EP0734013A2 - Determination of an excitation vector in a CELP coder - Google Patents

Determination of an excitation vector in a CELP coder

Info

Publication number
EP0734013A2
EP0734013A2 EP96410028A EP96410028A EP0734013A2 EP 0734013 A2 EP0734013 A2 EP 0734013A2 EP 96410028 A EP96410028 A EP 96410028A EP 96410028 A EP96410028 A EP 96410028A EP 0734013 A2 EP0734013 A2 EP 0734013A2
Authority
EP
European Patent Office
Prior art keywords
excitation
vector
subset
code
codes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
EP96410028A
Other languages
English (en)
French (fr)
Other versions
EP0734013A3 (de)
EP0734013B1 (de)
Inventor
Mustapha Bouraoui
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
STMicroelectronics SA
Original Assignee
STMicroelectronics SA
SGS Thomson Microelectronics SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by STMicroelectronics SA, SGS Thomson Microelectronics SA
Publication of EP0734013A2
Publication of EP0734013A3
Application granted
Publication of EP0734013B1
Anticipated expiration
Expired - Lifetime

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/10 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a multipulse excitation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L2019/0001 Codebooks
    • G10L2019/0013 Codebook search algorithms
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L2019/0001 Codebooks
    • G10L2019/0013 Codebook search algorithms
    • G10L2019/0014 Selection criteria for distances

Definitions

  • The present invention relates to the compression of speech signals to be transmitted over a telephone line, and more particularly to the determination of an excitation vector for compression by Code-Excited Linear Prediction (CELP).
  • FIG. 1 shows, very schematically, a CELP compression circuit.
  • Such a circuit is based on a model of the vocal cords and of the resonance chamber formed by the oral cavity, the throat and the larynx. The compression method is therefore optimized for processing speech signals.
  • The oral cavity, throat and larynx are modeled by a "linear prediction" filter 10, whose transfer function generally has ten poles.
  • The vocal cords are modeled by an excitation E processed by a comb filter 12.
  • A digitized speech signal S is analyzed in frames by an analysis circuit 14.
  • The analysis circuit 14 determines the coefficients a_1 to a_10 of the transfer function of filter 10, the step p of the comb filter 12, and a gain G applied at 16 to the excitation E at the input of filter 12.
  • The values a_i, p and G are calculated for each frame to account, respectively, for variations in the oral cavity, in the frequency spectrum of the vocal cords, and in the amplitude of the sound. The aim is for the output of filter 10 to equal the signal S.
  • The values a_i, p and G are transmitted so that a decoder receiving them can reconstruct the corresponding frames of the signal S.
  • The decoder must also know the excitation E to be used.
  • Determining the values a_i, p and G poses no particular problem.
  • Finding the optimal excitation remains the heaviest procedure in terms of computational load, and it is always of great interest to simplify it, even at the cost of a significant reduction in the quality of the compression.
  • The excitation E was selected in a table 18 (called a "codebook") containing several possible excitations which were, in fact, pieces of white noise.
  • A control circuit 20 scans table 18 until the difference e, formed at 22, between the current frame of the signal S and the corresponding frame at the output of filter 10 is minimal. (Of course, instead of comparing the signal S with the output of filter 10, one can also compare the excitation E with the frame of the signal S after it has undergone the inverse processing of filters 10 and 12.)
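The exhaustive table scan described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the circuit of FIG. 1: the comb filter 12 is omitted, the sign convention of the all-pole filter is assumed, and the toy codebook and the names `synthesize` and `search_codebook` are hypothetical.

```python
def synthesize(excitation, lpc, gain=1.0):
    """All-pole filter: y[n] = gain * x[n] + sum_k lpc[k-1] * y[n-k] (assumed convention)."""
    out = []
    for n, x in enumerate(excitation):
        y = gain * x
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                y += a * out[n - k]
        out.append(y)
    return out


def search_codebook(frame, codebook, lpc, gain=1.0):
    """Return (address, error energy) of the entry whose synthesis is closest to the frame."""
    best_address, best_error = None, float("inf")
    for address, excitation in enumerate(codebook):
        synth = synthesize(excitation, lpc, gain)
        error = sum((s - y) ** 2 for s, y in zip(frame, synth))
        if error < best_error:
            best_address, best_error = address, error
    return best_address, best_error


# Toy data (hypothetical): three ternary excitations and a one-pole filter.
codebook = [[1, 0, -1, 0], [0, 1, 0, -1], [1, 1, 0, 0]]
lpc = [0.5]
frame = synthesize(codebook[1], lpc)  # a frame that entry 1 reproduces exactly
address, error = search_codebook(frame, codebook, lpc)
print(address, error)
```

Every entry is filtered and compared, so the cost grows linearly with the table size; this is the load that the binary excitation codes discussed next aim to reduce.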
  • The address C selecting the best excitation E in table 18 is supplied to a decoder provided with an identical table.
  • Each excitation contained in table 18 is a sequence of digital samples corresponding respectively to the samples of each frame of the signal to be compressed.
  • Each sample of an excitation sequence can take only three values, namely 0, 1 or -1 (ternary excitation sequence). It was found that this does not perceptibly change the quality of the compression.
  • FIG. 2 shows an example of an excitation sequence E that has been proposed to further reduce the complexity of the search.
  • This excitation sequence is called binary. It includes several non-zero samples of value 1 or -1; two successive non-zero samples, or pulses, are separated by a constant number of zero samples, here 3.
  • Such an excitation sequence can be represented by a binary number (or excitation code) C whose bits are associated with the pulses and give their polarity. The code C supplied by the control circuit 20 then corresponds directly to an excitation sequence, and table 18 is eliminated.
  • The complexity is further reduced because the samples to be taken into account are limited to the pulses, whose number is, in the example of FIG. 2, four times smaller than the total number of samples in a sequence.
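As an illustration of how an excitation code maps directly to a binary excitation sequence, here is a short sketch. The bit order (bit i, counted from the least significant bit, drives pulse i), the mapping of bit 0 to +1 and bit 1 to -1, and the function name `excitation_from_code` are assumptions made for the example.

```python
def excitation_from_code(code, num_pulses, spacing=3):
    """Expand an excitation code into a sequence of samples in {0, 1, -1}.

    Assumed conventions: bit i (least significant bit first) drives pulse i,
    bit 0 gives +1 and bit 1 gives -1; `spacing` zero samples follow each pulse.
    """
    step = spacing + 1                      # one pulse every `step` samples
    sequence = [0] * (num_pulses * step)
    for i in range(num_pulses):
        bit = (code >> i) & 1
        sequence[i * step] = -1 if bit else 1
    return sequence


sequence = excitation_from_code(0b0011, num_pulses=4)
print(sequence)
```

With 4 pulses and a spacing of 3, the 16-sample sequence is fully described by 4 bits, so no table lookup is needed.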
  • The structure of filters 10 and 12 is simplified.
  • Each code C is associated with an excitation vector C whose components are the values 1 and -1 corresponding to the bits 0 and 1 of the code C.
  • The criterion m to be maximized is $m = \mathrm{scal}^2(T, C_i) / \mathrm{mod}^2(F C_i)$, where C_i is the excitation vector tested; T is a target vector formed by samples of the frame of the signal S being analyzed after it has undergone the inverse processing of filters 10 and 12, these samples being those corresponding to the values 1 and -1 of the vector C_i; and F denotes the matrix representing the transfer function of filters 10 and 12, in which only the rows corresponding to the values 1 and -1 of the vector C_i have been kept.
  • The notations scal(., .) and mod(.) denote the scalar product and the modulus (norm), respectively.
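The criterion m can be written out in a few lines. The sketch below assumes F is given as a list of rows and uses a hypothetical function name `criterion`; the numerical values are invented for illustration only.

```python
def criterion(T, C, F):
    """m = scal(T, C)^2 / mod(F C)^2 for target T, excitation vector C and matrix F."""
    scal = sum(t * c for t, c in zip(T, C))                       # scalar product
    filtered = [sum(f * c for f, c in zip(row, C)) for row in F]  # F C
    energy = sum(v * v for v in filtered)                         # squared modulus
    return scal * scal / energy


# Toy data (hypothetical): with F equal to the identity, mod(F C)^2 is the number of pulses.
T = [0.5, -2.0, 1.0, -0.25]
C = [1, -1, 1, -1]
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
m = criterion(T, C, identity)
print(m)
```

Computing the denominator requires a matrix-vector product per candidate vector, which is why it dominates the cost of evaluating the complete criterion.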
  • The denominator of the criterion m is approximately constant, whatever the excitation vector C_i.
  • The criterion m is therefore approximately maximized by maximizing its numerator. The numerator is maximal when each component of the excitation vector C_i has the same sign as the corresponding sample of the target vector T.
  • An approximately optimal excitation code is thus obtained directly by taking as its bits the sign bits of the target vector samples (or the inverses of the sign bits).
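The sign-based shortcut can be checked against a brute-force search over every vector of components 1 and -1. This is a sketch under assumptions: the bit order (bit i, least significant first, for sample i), the treatment of a zero sample as positive, and the name `preselect` are choices made for the example.

```python
from itertools import product


def preselect(T):
    """Return (code, vector) whose components carry the signs of the target samples."""
    vector = [-1 if t < 0 else 1 for t in T]
    code = sum(1 << i for i, c in enumerate(vector) if c < 0)  # sign bit 1 means negative
    return code, vector


def numerator(T, C):
    """Numerator of the criterion: squared scalar product of T and C."""
    return sum(t * c for t, c in zip(T, C)) ** 2


T = [0.5, -2.0, 1.0, -0.25]  # hypothetical target samples
code, vector = preselect(T)
chosen_value = numerator(T, vector)
best_value = max(numerator(T, C) for C in product((1, -1), repeat=len(T)))
print(code, vector, chosen_value == best_value)
```

The brute-force loop visits 2^P vectors for P pulses, whereas the sign rule needs a single pass over the target samples.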
  • An object of the present invention is to provide a method that limits the number of calculations needed to maximize the aforementioned criterion m when the usable excitation codes belong to a subset representative of a larger set.
  • The present invention provides a method for determining an excitation vector associated with a frame of a speech signal to be compressed, said vector belonging to a subset associated with a larger set of excitation vectors likely to maximize a criterion, and having as components values 1 and -1 corresponding to a sequence of excitation samples of a linear prediction filter.
  • The criterion is equal to the square of the ratio between, on the one hand, the scalar product of the excitation vector and a target vector formed by samples of the frame having undergone inverse linear prediction filtering and, on the other hand, the modulus of the excitation vector having undergone direct linear prediction filtering.
  • The method comprises the steps of preselecting an excitation vector whose components have the same sign as the corresponding samples of the target vector, or the inverse sign, and, if the preselected excitation vector does not belong to said subset, selecting as the excitation vector the one that maximizes said criterion among the vectors of the subset respectively associated with the preselected vector and with the vectors closest to it in the larger set.
  • The excitation vectors are associated with excitation codes whose bits correspond to the signs of the components of the excitation vectors; the subset of excitation codes associated with said subset of vectors is formed by binary values supplemented by error correction bits, each excitation code being associated with an excitation code of the subset by an error correction function.
  • The method comprises the steps of forming the group comprising the preselected code associated with the preselected vector and the codes closest to it, each of these closest codes differing from the preselected code by a single bit; submitting the codes of this group to the error correction function to obtain a group of corrected codes belonging to the subset; and selecting as the excitation code, among the corrected codes, the one associated with the vector that maximizes said criterion.
  • The error correction bits are bits of a Hamming correcting code.
  • The sign-bit method described above is not directly applicable when the possible excitation codes belong to a subset representative of a larger set, for example when this subset is formed from values of n bits to which N-n bits of an error correcting code are added. It is then very likely that the excitation code found does not belong to the subset. In this case, one could consider reducing the excitation code found to an excitation code belonging to the subset by applying to it an error correction function associated with the correcting code. This yields the excitation code of the subset that is closest to the excitation code found. The effect of this "error correction" is to modify at least one bit of the excitation code; in some cases this bit has a great influence on the value of the criterion m, so that the final excitation code gives poor results.
  • A Hamming correcting code denoted H(N, n, 3) is used below, where 3 denotes the minimum Hamming distance separating two elements of the representative subset.
  • The Hamming distance between two values is defined as the number of bit positions in which the two values differ.
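For codes held as integers, the Hamming distance defined above reduces to counting the set bits of an exclusive OR. A minimal sketch (the function name `hamming_distance` is an assumption):

```python
def hamming_distance(a, b):
    """Number of bit positions in which the integers a and b differ."""
    return bin(a ^ b).count("1")


print(hamming_distance(0b1011, 0b0010))
```

A minimum distance of 3 between elements of the subset means that any word differing from a subset element in a single bit is still closer to that element than to any other.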
  • A group of excitation codes is formed, comprising an initial code found by maximizing the simplified criterion m together with all the other codes obtained from it by modifying a single bit.
  • As a consequence, with a Hamming correcting code able to correct a single bit (minimum Hamming distance 3), each excitation code of the group is close to a distinct code of the usable subset.
  • The Hamming error correction function is applied to each code of the group, which brings each of them to the closest code of the subset.
  • Among the corrected codes, the one retained as the approximately optimal code is the one that maximizes the complete criterion m, that is, with both its numerator and its denominator calculated.
  • FIG. 3 shows, in diagram form, the method according to the invention that has just been described.
  • The analyzed frame of the signal S undergoes, at 24, the inverse of the processing of filters 10 and 12 of FIG. 1.
  • A target vector T is thus obtained, of which only the samples corresponding to the pulses of the excitation sequences are kept.
  • Of the samples of the vector T, only the sign bits (or their inverses) are retained, providing an initial excitation code C_0.
  • This code C_0 is "corrupted" at 28 to form a group of codes comprising the code C_0 and all the other codes C_1 to C_N obtained by modifying a single bit of the code C_0.
  • Each code C_0 to C_N undergoes an "error correction" at 30 to provide a group of corrected codes C'_0 to C'_N.
  • Each of the vectors associated with the corrected codes is compared with the target vector T, and the optimal excitation code C_opt is the one associated with the vector that maximizes the complete criterion m.
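The steps of FIG. 3 can be sketched end to end as follows. This is an illustration under assumptions, not the patented implementation: it uses a Hamming code with N = 7 and n = 4 decoded by syndrome, a bit order in which bit i (least significant first) corresponds to pulse i, a toy lower-triangular matrix F built from an invented impulse response, and hypothetical names (`correct`, `vector_of`, `criterion`, `search`). Corrected codes that coincide are merged in a set before the comparison.

```python
N = 7  # code length of the assumed Hamming H(7, 4, 3) code


def correct(code):
    """Map any 7-bit word to the nearest codeword by syndrome decoding."""
    syndrome = 0
    for position in range(1, N + 1):
        if (code >> (position - 1)) & 1:
            syndrome ^= position            # XOR of the positions of the set bits
    if syndrome:
        code ^= 1 << (syndrome - 1)         # flip the bit the syndrome points to
    return code


def vector_of(code):
    """Excitation vector of a code: bit 0 gives +1, bit 1 gives -1."""
    return [-1 if (code >> i) & 1 else 1 for i in range(N)]


def criterion(T, C, F):
    """Complete criterion m = scal(T, C)^2 / mod(F C)^2."""
    scal = sum(t * c for t, c in zip(T, C))
    filtered = [sum(f * c for f, c in zip(row, C)) for row in F]
    return scal * scal / sum(v * v for v in filtered)


def search(T, F):
    c0 = sum(1 << i for i, t in enumerate(T) if t < 0)   # sign bits of T: C_0
    group = [c0] + [c0 ^ (1 << i) for i in range(N)]     # "corrupted" codes C_0 .. C_N
    corrected = {correct(c) for c in group}              # corrected codes, duplicates merged
    return max(corrected, key=lambda c: criterion(T, vector_of(c), F))


# Toy data (hypothetical): decaying impulse response and seven target samples.
h = [1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125, 0.015625]
F = [[h[i - j] if i >= j else 0.0 for j in range(N)] for i in range(N)]
T = [0.9, -1.2, 0.4, 0.7, -0.3, -0.8, 0.5]
c_opt = search(T, F)
print(format(c_opt, "07b"))
```

The complete criterion is evaluated for at most N + 1 candidates instead of for every code of the subset, which is where the saving in calculations comes from.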
  • The position of the first pulse of the excitation sequences E is variable.
  • This position can be one of the first four; it is identified by two additional bits to be transmitted to the decoder, which multiplies by four the number of excitation vectors to be tested.
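A sketch of how the four candidate positions might be enumerated is given below. It is an assumption-laden illustration: positions are compared using only the numerator of the criterion for the sign-matched vector, which presumes the denominators are comparable across positions; the name `best_first_position` and the target samples are hypothetical.

```python
def best_first_position(target, spacing=3):
    """Try each possible first-pulse position and keep the best sign-matched vector."""
    step = spacing + 1
    best = None
    for offset in range(step):                      # four candidate positions
        samples = target[offset::step]              # target samples under the pulses
        score = sum(abs(t) for t in samples) ** 2   # numerator for the sign-matched vector
        if best is None or score > best[0]:
            best = (score, offset, [-1 if t < 0 else 1 for t in samples])
    _, offset, vector = best
    return offset, vector


# Hypothetical 16-sample target frame.
target = [0.1, 0.9, -0.2, 0.0, -0.3, -1.1, 0.2, 0.1,
          0.0, 0.8, 0.1, -0.1, 0.2, -0.7, 0.0, 0.3]
offset, vector = best_first_position(target)
print(offset, vector)
```

The chosen offset fits in two bits, matching the two additional bits sent to the decoder.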
EP96410028A 1995-03-24 1996-03-19 Determination of an excitation vector in a CELP coder Expired - Lifetime EP0734013B1 (de)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FR9503735 1995-03-24
FR9503735A FR2732148B1 (fr) 1995-03-24 1995-03-24 Determination d'un vecteur d'excitation dans un codeur celp

Publications (3)

Publication Number Publication Date
EP0734013A2 EP0734013A2 (de) 1996-09-25
EP0734013A3 EP0734013A3 (de) 1997-05-28
EP0734013B1 EP0734013B1 (de) 2001-08-22

Family

ID=9477567

Family Applications (1)

Application Number Title Priority Date Filing Date
EP96410028A Expired - Lifetime EP0734013B1 (de) 1995-03-24 1996-03-19 Determination of an excitation vector in a CELP coder

Country Status (5)

Country Link
US (1) US5719994A (de)
EP (1) EP0734013B1 (de)
JP (1) JPH0990996A (de)
DE (1) DE69614594D1 (de)
FR (1) FR2732148B1 (de)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ATE192259T1 (de) * 1995-11-09 2000-05-15 Nokia Mobile Phones Ltd Verfahren zur synthetisierung eines sprachsignalblocks in einem celp-kodierer

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0294012A2 (de) * 1987-06-05 1988-12-07 AT&T Corp. Entgegenwirkung gegen die Auswirkungen von Kanalrauschen in digitaler Informationsübertragung
EP0515138A2 (de) * 1991-05-20 1992-11-25 Nokia Mobile Phones Ltd. Digitaler Sprachkodierer

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5138661A (en) * 1990-11-13 1992-08-11 General Electric Company Linear predictive codeword excited speech synthesizer

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0294012A2 (de) * 1987-06-05 1988-12-07 AT&T Corp. Entgegenwirkung gegen die Auswirkungen von Kanalrauschen in digitaler Informationsübertragung
EP0515138A2 (de) * 1991-05-20 1992-11-25 Nokia Mobile Phones Ltd. Digitaler Sprachkodierer

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
IEEE TRANSACTIONS ON COMMUNICATIONS, vol. 42, no. 2/3/4, 1 February 1994, pages 248-251, XP000445940 RAMABADRAN ET AL.: "Complexity reduction of CELP speech coders through the use of phase information" *
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, vol. 1, no. 3, July 1993, US, pages 315-325, XP000388575 AHMED ET AL.: "Fast methods for code search in CELP" *
INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING 1987, vol. 4, 6 - 9 April 1987, DALLAS, TX, US, pages 1953-1956, XP000568035 ADOUL ET AL.: "A comparison of some algebraic structures for CELP coding of speech" *
PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING (ICSLP), vol. 1, 18 - 22 November 1990, KOBE, JP, pages 49-52, XP000503311 LEE ET AL.: "An Improved Method for Multipulse Speech Analysis" *

Also Published As

Publication number Publication date
FR2732148B1 (fr) 1997-06-13
EP0734013A3 (de) 1997-05-28
US5719994A (en) 1998-02-17
FR2732148A1 (fr) 1996-09-27
DE69614594D1 (de) 2001-09-27
JPH0990996A (ja) 1997-04-04
EP0734013B1 (de) 2001-08-22

Similar Documents

Publication Publication Date Title
EP1994531B1 (de) Verbesserte celp kodierung oder dekodierung eines digitalen audiosignals
EP0481895B1 (de) Verfahren und Einrichtung zur Übertragung mit niedriger Bitrate eines Sprachsignals mittels CELP-Codierung
EP0782128A1 (de) Verfahren zur Analyse eines Audiofrequenzsignals durch lineare Prädiktion, und Anwendung auf ein Verfahren zur Kodierung und Dekodierung eines Audiofrequenzsignals
EP0608174A1 (de) System zur prädiktiven Kodierung/Dekodierung eines digitalen Sprachsignals mittels einer adaptiven Transformation mit eingebetteten Kodes
EP0749626A1 (de) Verfahren zur sprachkodierung mittels linearer prädiktion und anregung durch algebraische kodes
EP0428445B1 (de) Verfahren und Einrichtung zur Codierung von Prädiktionsfiltern in Vocodern mit sehr niedriger Datenrate
FR2702075A1 (fr) Procédé de génération d'un filtre de pondération spectrale du bruit dans un codeur de la parole.
EP0685833A1 (de) Verfahren zur Sprachkodierung mittels linearer Prädiktion
CA2340028C (fr) Reseau neuronal et son application pour la reconnaissance vocale
EP1836699A1 (de) Verfahren und vorrichtung zur ausführung einer optimalen kodierung zwischen zwei langzeitvorhersagemodellen
EP0734013B1 (de) Bestimmung eines Anregungsvektors in einem CELP-Kodierer
EP0616315A1 (de) Vorrichtung zur digitalen Sprachkodierung und -dekodierung, Verfahren zum Durchsuchen eines pseudologarithmischen LTP-Verzögerungskodebuchs und Verfahren zur LTP-Analyse
EP0573358B1 (de) Verfahren und Vorrichtung zur Sprachsynthese mit variabler Geschwindigkeit
EP0347307B1 (de) Kodierungsverfahren und linearer Prädiktionssprachkodierer
EP1192619B1 (de) Audio-kodierung, dekodierung zur interpolation
EP0469997B1 (de) Kodierungsverfahren und Sprachkodierer unter Anwendung von Analyse durch lineare Prädiktion
EP1605440A1 (de) Verfahren zur Quellentrennung eines Signalgemisches
EP1192618B1 (de) Audiokodierung mit adaptiver lifterung
EP1192621B1 (de) Audiokodierung mit harmonischen komponenten
FR2709366A1 (fr) Procédé de stockage de vecteurs de coefficient de réflexion.
EP1190414A1 (de) Audio-kodierung, dekodierung, mit harmonischen komponenten und minimaler phase
CA2079884A1 (fr) Procede et dispositif de codage bas debit de la parole
WO2001003119A1 (fr) Codage et decodage audio incluant des composantes non harmoniques du signal
EP0796490A1 (de) Verfahren und vorrichtung zur signalprädiktion für einen sprachkodierer
FR2739964A1 (fr) Dispositif de prediction de periode de voisement pour codeur de parole

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): DE FR GB IT

PUAL Search report despatched

Free format text: ORIGINAL CODE: 0009013

AK Designated contracting states

Kind code of ref document: A3

Designated state(s): DE FR GB IT

17P Request for examination filed

Effective date: 19971030

RAP3 Party data changed (applicant data changed or rights of an application transferred)

Owner name: STMICROELECTRONICS S.A.

17Q First examination report despatched

Effective date: 19991001

RIC1 Information provided on ipc code assigned before grant

Free format text: 7G 10L 19/10 A

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): DE FR GB IT

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; WARNING: LAPSES OF ITALIAN PATENTS WITH EFFECTIVE DATE BEFORE 2007 MAY HAVE OCCURRED AT ANY TIME BEFORE 2007. THE CORRECT EFFECTIVE DATE MAY BE DIFFERENT FROM THE ONE RECORDED.

Effective date: 20010822

REF Corresponds to:

Ref document number: 69614594

Country of ref document: DE

Date of ref document: 20010927

GBT Gb: translation of ep patent filed (gb section 77(6)(a)/1977)

Effective date: 20011005

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20011123

RAP2 Party data changed (patent owner data changed or rights of a patent transferred)

Owner name: STMICROELECTRONICS S.A.

REG Reference to a national code

Ref country code: GB

Ref legal event code: IF02

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20060816

Year of fee payment: 11

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: FR

Payment date: 20060825

Year of fee payment: 11

GBPC Gb: european patent ceased through non-payment of renewal fee

Effective date: 20070319

REG Reference to a national code

Ref country code: FR

Ref legal event code: ST

Effective date: 20071130

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20070319

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: FR

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20070402