WO2000016501A1 - Method and apparatus for coding an information signal - Google Patents

Method and apparatus for coding an information signal

Info

Publication number
WO2000016501A1
Authority
WO
WIPO (PCT)
Prior art keywords
positions
pulse
pulses
signal
combinations
Prior art date
Application number
PCT/US1999/019217
Other languages
English (en)
French (fr)
Inventor
James P. Ashley
Weimin Peng
Original Assignee
Motorola Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Motorola Inc.
Priority to JP2000570919A (patent JP4460165B2)
Priority to DE69931641T (patent DE69931641T2)
Priority to EP99943854A (patent EP1112625B1)
Publication of WO2000016501A1

Links

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/12 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/10 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a multipulse excitation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/10 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a multipulse excitation
    • G10L19/107 Sparse pulse excitation, e.g. by using algebraic codebook

Definitions

  • the present invention relates, in general, to communication systems and, more particularly, to coding information signals in such communication systems.
  • CDMA communication systems are well known.
  • One exemplary CDMA communication system is the so-called IS-95, which is defined for use in North America by the Telecommunications Industry Association (TIA).
  • TIA Telecommunications Industry Association
  • TIA/EIA/IS-95 Mobile Station-Base-station Compatibility Standard for Dual Mode Wideband Spread Spectrum Cellular System, January 1997, published by the Electronic Industries Association (EIA), 2001 Eye Street, N.W., Washington, D.C. 20006.
  • a variable rate speech codec, specifically a Code Excited Linear Prediction (CELP) codec, for use in communication systems compatible with IS-95 is defined in the document known as IS-127 and titled Enhanced Variable Rate Codec, Speech Service Option 3 for Wideband Spread Spectrum Digital Systems, September 1996. IS-127 is also published by the Electronic Industries Association (EIA), 2001 Eye Street, N.W., Washington, D.C. 20006.
  • EIA Electronic Industries Association
  • FIG. 1 generally depicts a CELP decoder as is known in the prior art.
  • FIG. 2 generally depicts a Code Excited Linear Prediction (CELP) encoder as is known in the prior art.
  • CELP Code Excited Linear Prediction
  • FIG. 3 generally depicts a joint interleaved pulse permutation matrix in accordance with the invention.
  • FIG. 4 generally depicts a flow chart describing how the codebook is generated in accordance with the invention.
  • FIG. 5 generally depicts a joint interleaved pulse permutation matrix for pulses 3 and 4 in accordance with the present invention.
  • a method for coding an information signal comprises the steps of dividing the information signal into blocks and deriving a target signal based on a block of the information signal.
  • the method further includes the steps of coding the target signal using pulse positioning techniques based on an error criteria, wherein the allowable positions of a given pulse are dependent on the positions of one or more other pulses, to produce coded pulse positions and transmitting the coded pulse positions to a destination.
  • the information signal further comprises a speech signal or an audio signal, and a block of the information signal further comprises a frame or a subframe of the information signal.
  • the error criteria further comprises a perceptually weighted squared error criteria and the allowable pulse positions are determined using an arbitrary closed-form expression E(⁇), in which at least one of the conditions within the expression pertains to at least two of the elements within ⁇.
  • FIG. 1 generally depicts a Code Excited Linear Prediction (CELP) decoder 100 as is known in the art.
  • CELP Code Excited Linear Prediction
  • This signal is scaled using the FCB gain factor γ and combined with a signal E(n) output from an adaptive codebook 104 (ACB) and scaled by a factor β, which is used to model the long term (or periodic) component of a speech signal (with period τ).
  • the signal E_t(n), which represents the total excitation, is used as the input to the LPC synthesis filter 106, which models the coarse short term spectral shape, commonly referred to as "formants".
  • the output of the synthesis filter 106 is then perceptually postfiltered by perceptual postfilter 108, in which the coding distortions are effectively "masked" by amplifying the signal spectra at frequencies that contain high speech energy and attenuating those frequencies that contain less speech energy. Additionally, the total excitation signal E_t(n) is used as the adaptive codebook for the next block of synthesized speech.
  • FIG. 2 generally depicts a CELP encoder 200.
  • the goal is to code the perceptually weighted target signal x_w(n), which can be represented in general terms by the z-transform:
  • W(z) is the transfer function of the perceptual weighting filter 208, and is of the form:
  • H(z) is the transfer function of the perceptually weighted synthesis filters 206 and 210, and is of the form:
  • H_ZS(z) is the "zero state" response of H(z) from filter 206, in which the initial state of H(z) is all zeroes
  • H_ZIR(z) is the "zero input response" of H(z) from filter 210, in which the previous state of H(z) is allowed to evolve with no input excitation.
  • the initial state used for generation of H_ZIR(z) is derived from the total excitation E_t(n) from the previous subframe.
  • FCB perceptually weighted target signal x_w(n) and the perceptually weighted excitation signal x̂_w(n). This can be expressed in time domain form as:
  • c_k(n) is the codevector corresponding to FCB codebook index k
  • γ_k is the optimal FCB gain associated with codevector c_k(n)
  • h(n) is the impulse response of the perceptually weighted synthesis filter H(z)
  • M is the codebook size
  • L is the subframe length
  • x̂_w(n) = γ_k c_k(n) * h(n).
  • speech is coded every 20 milliseconds (ms) and each frame includes three subframes of length L.
  • Eq. 4 can also be expressed in vector-matrix form as:
  • H is the L × L zero-state convolution matrix
  • the FCB utilizes a multipulse configuration in which the excitation vector c_k contains very few non-zero, unit magnitude values. This configuration is known in the art as Algebraic CELP, or ACELP.
  • Table 1 generally depicts pulse positions defined for IS-127 Rate 1/2.
  • the excitation codevector c_k can contain "holes" in which certain positions are not represented by the vector space. That is, an optimal match to the target vector may require a pulse at position 12, but the definitions of the pulse positions in Table 1 do not allow a pulse to be located at that position.
  • the constraints on positions may cause the pulse to be placed either at locations close to the optimal position, or worse, the energy of the target signal may be completely missed at that position. This can cause distortion, and possibly audible artifacts in the synthesized speech signal.
  • the bit allocation of 16 bits would be divided between the four tracks equally so that each track would receive four bits.
  • the four bits per track would further be composed of three bits for position (comprising 8 different positions) and one sign bit to indicate the polarity of the pulse.
  • the pulse positions can then be extracted at the decoder by:
  • the respective positions of pulse 0 are shown along the horizontal axis, and the positions of pulse 1 are shown along the vertical axis.
  • the "forbidden" pulse combinations are designated by the shaded regions while the allowable combinations are unshaded.
  • FIG. 4 generally depicts a flow chart describing how the codebook is generated in accordance with the invention.
  • the flowchart shows a basic nested loop structure in which all permutations of 0 ≤ i < M and 0 ≤ j < N are generated.
  • N and M are the total number of allowable positions for each pulse.
  • the decision in the innermost loop simply checks for forbidden combinations [i,j] according to function F(i,j) at step 402, which in the example of FIG. 3 is described as:
  • This function returns a value of 1 when the absolute value of the difference of i and j is an element of the given set; otherwise, a zero is returned. This is shown in step 403.
  • the elements of the given set correspond to the distances between the diagonal shaded elements of FIG. 3, and the expression is therefore sufficient in describing all necessary shaded regions.
  • the respective positions are calculated using the following expression:
  • is the decimated track position
  • N_track is the number of tracks
  • n is the track number.
  • FIG. 5 generally depicts a joint interleaved pulse permutation matrix for pulses p_2 and p_3 in accordance with the present invention. As shown in FIG.
  • n is the number of pulses.
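
The nested-loop codebook generation described for FIG. 4 can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function names, the 8 x 8 position grid, and the forbidden-distance set {3, 5} are assumptions chosen for the example, since the set used with FIG. 3 is not reproduced in the text above.

```python
from typing import Callable, Dict, List, Set, Tuple


def make_forbidden(distances: Set[int]) -> Callable[[int, int], int]:
    """Build F(i, j): returns 1 when |i - j| is in `distances`, else 0.

    `distances` is a hypothetical stand-in for "the given set" in the text.
    """
    def F(i: int, j: int) -> int:
        return 1 if abs(i - j) in distances else 0
    return F


def build_codebook(M: int, N: int,
                   F: Callable[[int, int], int]) -> List[Tuple[int, int]]:
    """Enumerate all pairs 0 <= i < M, 0 <= j < N and keep the allowed ones.

    The list index of each surviving pair serves as its codebook index.
    """
    codebook = []
    for i in range(M):          # outer loop over positions of the first pulse
        for j in range(N):      # inner loop over positions of the second pulse
            if F(i, j) == 0:    # skip forbidden (shaded) combinations
                codebook.append((i, j))
    return codebook


# Illustrative values only (assumptions, not taken from the patent).
F = make_forbidden({3, 5})
codebook = build_codebook(8, 8, F)

# Encoder side: look up the index of an allowed pair.
index_of: Dict[Tuple[int, int], int] = {pair: k for k, pair in enumerate(codebook)}

# Decoder side: the index alone recovers both pulse positions.
k = index_of[(2, 4)]
i, j = codebook[k]
```

With these made-up values, 16 of the 64 combinations are forbidden, so 48 joint entries remain. The positions of the two pulses are coded as one index rather than as two independent fields, which is how the allowable positions of one pulse come to depend on the position of the other.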
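
The conventional 16-bit allocation described above (four tracks, each with three position bits and one sign bit) can be sketched in the same way. The bit layout inside each 4-bit field, the ordering of tracks within the word, and the interleaved mapping from decimated track position to subframe position (N_track times the decimated position, plus the track number n) are assumptions made for illustration; the patent's own expressions are not reproduced in the text above.

```python
from typing import List, Tuple

N_TRACKS = 4                              # four tracks, as in the text
POS_BITS = 3                              # three position bits per track
POSITIONS_PER_TRACK = 1 << POS_BITS       # 8 positions per track
FIELD_BITS = POS_BITS + 1                 # plus one sign bit = 4 bits per track


def pack_track(decimated_pos: int, negative: bool) -> int:
    """Pack one pulse into 4 bits: sign in the top bit (assumed layout)."""
    if not 0 <= decimated_pos < POSITIONS_PER_TRACK:
        raise ValueError("decimated position out of range")
    return (int(negative) << POS_BITS) | decimated_pos


def unpack_track(code: int) -> Tuple[int, bool]:
    """Inverse of pack_track: return (decimated position, is_negative)."""
    return code & (POSITIONS_PER_TRACK - 1), bool((code >> POS_BITS) & 1)


def pack_codeword(pulses: List[Tuple[int, bool]]) -> int:
    """Concatenate one 4-bit field per track into a 16-bit word (track 0 first)."""
    if len(pulses) != N_TRACKS:
        raise ValueError("expected one pulse per track")
    word = 0
    for decimated_pos, negative in pulses:
        word = (word << FIELD_BITS) | pack_track(decimated_pos, negative)
    return word


def unpack_codeword(word: int) -> List[Tuple[int, bool]]:
    """Split a 16-bit word back into per-track (position, sign) pairs."""
    pulses = []
    for n in range(N_TRACKS):
        shift = FIELD_BITS * (N_TRACKS - 1 - n)
        pulses.append(unpack_track((word >> shift) & 0xF))
    return pulses


def subframe_position(decimated_pos: int, track: int) -> int:
    """Assumed interleaving: position = N_TRACKS * decimated_pos + track."""
    return N_TRACKS * decimated_pos + track


pulses = [(0, False), (7, True), (3, False), (5, True)]
word = pack_codeword(pulses)
decoded = unpack_codeword(word)
```

Because each track is coded independently here, every one of the 8 x 8 x 8 x 8 position combinations is representable, but positions outside a pulse's own track are not, which is the source of the "holes" the text describes.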

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
  • Reduction Or Emphasis Of Bandwidth Of Signals (AREA)
  • Paper (AREA)
  • Control Of El Displays (AREA)
  • Control Of Motors That Do Not Use Commutators (AREA)
PCT/US1999/019217 1998-09-11 1999-08-24 Method and apparatus for coding an information signal WO2000016501A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2000570919A JP4460165B2 (ja) 1998-09-11 1999-08-24 Method and apparatus for encoding an information signal
DE69931641T DE69931641T2 (de) 1998-09-11 1999-08-24 Method for coding information signals
EP99943854A EP1112625B1 (de) 1998-09-11 1999-08-24 Method for coding information signals

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US15143098A 1998-09-11 1998-09-11
US09/151,430 1998-09-11

Publications (1)

Publication Number Publication Date
WO2000016501A1 (en) 2000-03-23

Family

ID=22538745

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1999/019217 WO2000016501A1 (en) 1998-09-11 1999-08-24 Method and apparatus for coding an information signal

Country Status (6)

Country Link
EP (1) EP1112625B1 (de)
JP (1) JP4460165B2 (de)
KR (1) KR100409167B1 (de)
AT (1) ATE328407T1 (de)
DE (1) DE69931641T2 (de)
WO (1) WO2000016501A1 (de)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5963897A (en) * 1998-02-27 1999-10-05 Lernout & Hauspie Speech Products N.V. Apparatus and method for hybrid excited linear prediction speech encoding
US5970444A (en) * 1997-03-13 1999-10-19 Nippon Telegraph And Telephone Corporation Speech coding method

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2579356B1 (fr) * 1985-03-22 1987-05-07 Cit Alcatel Low bit-rate speech coding method using a multi-pulse excitation signal
SE463691B (sv) * 1989-05-11 1991-01-07 Ericsson Telefon Ab L M Method of positioning excitation pulses for a linear predictive coder (LPC) operating according to the multipulse principle
US5754976A (en) * 1990-02-23 1998-05-19 Universite De Sherbrooke Algebraic codebook with signal-selected pulse amplitude/position combinations for fast coding of speech
JP3057907B2 (ja) * 1992-06-16 2000-07-04 松下電器産業株式会社 Speech coding apparatus
KR950011967B1 (ko) * 1992-07-31 1995-10-12 임홍식 Memory organizing device for a semiconductor recorder
JP3196595B2 (ja) * 1995-09-27 2001-08-06 日本電気株式会社 Speech coding apparatus
JP4063911B2 (ja) * 1996-02-21 2008-03-19 松下電器産業株式会社 Speech coding apparatus
JP3180762B2 (ja) * 1998-05-11 2001-06-25 日本電気株式会社 Speech coding apparatus and speech decoding apparatus
JP3824810B2 (ja) * 1998-09-01 2006-09-20 富士通株式会社 Speech coding method, speech coding apparatus, and speech decoding apparatus

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1132893A3 (de) * 2000-02-15 2002-10-16 Lucent Technologies Inc. Pulse position control for a CELP speech coder
US6539349B1 (en) 2000-02-15 2003-03-25 Lucent Technologies Inc. Constraining pulse positions in CELP vocoding
RU2471288C2 (ru) * 2008-03-13 2012-12-27 Моторола Мобилити, Инк. Apparatus and method for low-complexity combinatorial coding of signals

Also Published As

Publication number Publication date
KR20010073146A (ko) 2001-07-31
KR100409167B1 (ko) 2003-12-12
EP1112625B1 (de) 2006-05-31
ATE328407T1 (de) 2006-06-15
EP1112625A1 (de) 2001-07-04
JP2002525667A (ja) 2002-08-13
DE69931641T2 (de) 2006-10-05
DE69931641D1 (de) 2006-07-06
EP1112625A4 (de) 2004-06-16
JP4460165B2 (ja) 2010-05-12

Similar Documents

Publication Publication Date Title
US6236960B1 (en) Factorial packing method and apparatus for information coding
US6141638A (en) Method and apparatus for coding an information signal
US7280959B2 (en) Indexing pulse positions and signs in algebraic codebooks for coding of wideband signals
DE69928288T2 (de) Coding of periodic speech
US6055496A (en) Vector quantization in celp speech coder
EP1235203B1 (de) Concealment method for loss of speech frames and decoder therefor
KR20010024935A (ko) Speech coding
EP2805324B1 (de) System and method for mixed codebook excitation for speech coding
AU2002221389A1 (en) Indexing pulse positions and signs in algebraic codebooks for coding of wideband signals
KR20020052191A (ko) Variable bit rate CELP coding method for speech using speech classification
US6678651B2 (en) Short-term enhancement in CELP speech coding
US6415252B1 (en) Method and apparatus for coding and decoding speech
EP1103953B1 (de) Concealment method for loss of speech frames
EP1112625B1 (de) Method for coding information signals
KR100718487B1 (ko) Harmonic noise weighting in digital speech coders
Bessette et al. Techniques for high-quality ACELP coding of wideband speech
WO2002023536A2 (en) Formant emphasis in celp speech coding
JP3166697B2 (ja) Speech coding/decoding apparatus and system
Saleem et al. Implementation of Low Complexity CELP Coder and Performance Evaluation in terms of Speech Quality
EP1212750A1 (de) Multimodal VSELP speech coder
Ravishankar et al. Voice Coding Technology for Digital Aeronautical Communications

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): BR JP KR

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWE Wipo information: entry into national phase

Ref document number: 1999943854

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 1020017003129

Country of ref document: KR

ENP Entry into the national phase

Ref country code: JP

Ref document number: 2000 570919

Kind code of ref document: A

Format of ref document f/p: F

WWP Wipo information: published in national office

Ref document number: 1999943854

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1020017003129

Country of ref document: KR

WWG Wipo information: grant in national office

Ref document number: 1020017003129

Country of ref document: KR

WWG Wipo information: grant in national office

Ref document number: 1999943854

Country of ref document: EP