WO2000016501A1 - Method and apparatus for coding an information signal - Google Patents
- Publication number
- WO2000016501A1 (PCT/US1999/019217)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- positions
- pulse
- pulses
- signal
- combinations
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/12—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/10—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a multipulse excitation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/10—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a multipulse excitation
- G10L19/107—Sparse pulse excitation, e.g. by using algebraic codebook
Definitions
- the present invention relates, in general, to communication systems and, more particularly, to coding information signals in such communication systems.
- CDMA communication systems are well known.
- One exemplary CDMA communication system is the so-called IS-95, which is defined for use in North America by the Telecommunications Industry Association (TIA).
- IS-95 is described in TIA/EIA/IS-95, Mobile Station-Base-station Compatibility Standard for Dual Mode Wideband Spread Spectrum Cellular System, January 1997, published by the Electronic Industries Association (EIA), 2001 Eye Street, N.W., Washington, D.C. 20006.
- A variable rate speech codec, specifically a Code Excited Linear Prediction (CELP) codec, for use in communication systems compatible with IS-95 is defined in the document known as IS-127 and titled Enhanced Variable Rate Codec, Speech Service Option 3 for Wideband Spread Spectrum Digital Systems, September 1996. IS-127 is also published by the Electronic Industries Association (EIA), 2001 Eye Street, N.W., Washington, D.C. 20006.
- FIG. 1 generally depicts a CELP decoder as is known in the prior art.
- FIG. 2 generally depicts a Code Excited Linear Prediction (CELP) encoder as is known in the prior art.
- FIG. 3 generally depicts a joint interleaved pulse permutation matrix in accordance with the invention.
- FIG. 4 generally depicts a flow chart describing how the codebook is generated in accordance with the invention.
- FIG. 5 generally depicts a joint interleaved pulse permutation matrix for pulses 3 and 4 in accordance with the present invention.
- a method for coding an information signal comprises the steps of dividing the information signal into blocks and deriving a target signal based on a block of the information signal.
- the method further includes the steps of coding the target signal using pulse positioning techniques based on an error criteria, wherein the allowable positions of a given pulse are dependent on the positions of one or more other pulses, to produce coded pulse positions and transmitting the coded pulse positions to a destination.
- The information signal further comprises a speech signal or an audio signal, and a block of the information signal further comprises a frame or a subframe of the information signal.
- The error criterion further comprises a perceptually weighted squared error criterion, and the allowable pulse positions are determined using an arbitrary closed-form expression E( ⁇ ), in which at least one of the conditions within the expression pertains to at least two of the elements within ⁇ .
- FIG. 1 generally depicts a Code Excited Linear Prediction (CELP) decoder 100 as is known in the art.
- The codevector output by the fixed codebook (FCB) is scaled by the FCB gain factor $\gamma$ and combined with a signal $E(n)$ that is output from an adaptive codebook 104 (ACB) and scaled by a factor $\beta$. The ACB signal models the long-term (or periodic) component of a speech signal (with period $\tau$).
- The signal $E_t(n)$, which represents the total excitation, is used as the input to the LPC synthesis filter 106, which models the coarse short-term spectral shape, commonly referred to as "formants".
- The output of the synthesis filter 106 is then perceptually postfiltered by perceptual postfilter 108, in which the coding distortions are effectively "masked" by amplifying the signal spectra at frequencies that contain high speech energy and attenuating those frequencies that contain less speech energy. Additionally, the total excitation signal $E_t(n)$ is used as the adaptive codebook for the next block of synthesized speech.
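The decoder signal flow described above can be illustrated with a minimal sketch. The function names, the sign convention $A(z) = 1 + \sum_i a_i z^{-i}$ for the LPC polynomial, and the state layout are assumptions made for illustration only; they are not taken from the patent or from IS-127, and the postfilter is omitted.

```python
def total_excitation(fcb_vector, fcb_gain, acb_vector, acb_gain):
    """Total excitation: fcb_gain * c_k(n) + acb_gain * E(n), sample by sample."""
    if len(fcb_vector) != len(acb_vector):
        raise ValueError("codebook vectors must have the same length")
    return [fcb_gain * c + acb_gain * e for c, e in zip(fcb_vector, acb_vector)]


def lpc_synthesis(excitation, lpc_coeffs, state=None):
    """All-pole filter 1/A(z), assuming A(z) = 1 + sum_i a_i z^-i.

    `state` holds past output samples, most recent first.
    Returns the output samples and the updated state.
    """
    order = len(lpc_coeffs)
    history = list(state) if state is not None else [0.0] * order
    output = []
    for sample in excitation:
        y = sample - sum(a * past for a, past in zip(lpc_coeffs, history))
        output.append(y)
        history = ([y] + history)[:order]  # keep only the last `order` outputs
    return output, history
```

In a complete decoder the total excitation would also be written back into the adaptive codebook memory for the next block, as the text notes.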
- FIG. 2 generally depicts a CELP encoder 200.
- the goal is to code the perceptually weighted target signal $x_w(n)$, which can be represented in general terms by the z-transform:
- $W(z)$ is the transfer function of the perceptual weighting filter 208, and is of the form:
- $H(z)$ is the transfer function of the perceptually weighted synthesis filters 206 and 210, and is of the form:
- $H_{ZS}(z)$ is the "zero state" response of $H(z)$ from filter 206, in which the initial state of $H(z)$ is all zeroes
- $H_{ZIR}(z)$ is the "zero input response" of $H(z)$ from filter 210, in which the previous state of $H(z)$ is allowed to evolve with no input excitation.
- the initial state used for generation of $H_{ZIR}(z)$ is derived from the total excitation $E_t(n)$ from the previous subframe.
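The display equations referred to above are not reproduced in this text. As a hedged reconstruction only, the following is a form that is conventional for CELP coders of this type and consistent with the definitions given here. The symbols $S(z)$ (input speech), $\beta$ (ACB gain), $A(z)$ (unquantized LPC polynomial), $A_q(z)$ (quantized LPC polynomial), and the weighting constants $\lambda_1, \lambda_2$ are assumptions and may differ from the notation of the original filing.

```latex
\begin{align}
X_w(z) &= S(z)\,W(z) \;-\; \beta\,E(z)\,H_{ZS}(z) \;-\; H_{ZIR}(z) \\
W(z)   &= \frac{A(z/\lambda_1)}{A(z/\lambda_2)} \\
H(z)   &= \frac{A(z/\lambda_1)}{A_q(z)\,A(z/\lambda_2)} \;=\; \frac{W(z)}{A_q(z)}
\end{align}
```

Read this way, the target is the weighted input speech with the adaptive codebook contribution and the filter ringing from the previous subframe removed, which leaves only the part the fixed codebook must match.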
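- The FCB search minimizes the squared error between the perceptually weighted target signal $x_w(n)$ and the perceptually weighted excitation signal $\hat{x}_w(n)$. This can be expressed in time domain form as:

$$\min_{k}\left\{ \sum_{n=0}^{L-1} \bigl( x_w(n) - \gamma_k\, c_k(n) * h(n) \bigr)^2 \right\}, \qquad 0 \le k < M \qquad \text{(Eq. 4)}$$

- $c_k(n)$ is the codevector corresponding to FCB codebook index $k$
- $\gamma_k$ is the optimal FCB gain associated with codevector $c_k(n)$
- $h(n)$ is the impulse response of the perceptually weighted synthesis filter $H(z)$
- $M$ is the codebook size
- $L$ is the subframe length
- $*$ denotes convolution, so that $\hat{x}_w(n) = \gamma_k\, c_k(n) * h(n)$.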
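The search criterion above can be sketched as an exhaustive loop over the codebook. This is an illustrative brute-force version, not the search procedure of the patent or of IS-127; the function names are hypothetical, and the gain is computed as the standard least-squares optimum (correlation divided by energy) for each candidate.

```python
def convolve_zero_state(codevector, impulse_response):
    """Return the first L samples of codevector * impulse_response (zero initial state)."""
    length = len(codevector)
    out = [0.0] * length
    for n in range(length):
        for m in range(min(n + 1, len(impulse_response))):
            out[n] += impulse_response[m] * codevector[n - m]
    return out


def search_codebook(target, codebook, impulse_response):
    """Pick the index k minimizing sum_n (x_w(n) - gain_k * y_k(n))^2.

    Returns (k, gain, error), or None if no candidate has non-zero energy.
    """
    best = None
    for k, codevector in enumerate(codebook):
        filtered = convolve_zero_state(codevector, impulse_response)
        energy = sum(y * y for y in filtered)
        if energy == 0.0:
            continue  # an all-zero filtered vector cannot reduce the error
        correlation = sum(x * y for x, y in zip(target, filtered))
        gain = correlation / energy  # least-squares optimal gain for this k
        error = sum((x - gain * y) ** 2 for x, y in zip(target, filtered))
        if best is None or error < best[2]:
            best = (k, gain, error)
    return best
```

Practical ACELP encoders avoid this full enumeration by exploiting the sparse pulse structure of the codevectors; the sketch only makes the error criterion concrete.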
- speech is coded every 20 milliseconds (ms) and each frame includes three subframes of length L.
- Eq.4 can also be expressed in vector-matrix form as:
- H is the L x L zero-state convolution matrix
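The vector-matrix expression itself is not reproduced in this text. A form consistent with Eq. 4 and with the definition of $\mathbf{H}$ is sketched below; the boldface vector notation and the explicit lower-triangular Toeplitz layout of $\mathbf{H}$ are assumptions based on the usual meaning of a zero-state convolution matrix.

```latex
\[
\min_{k}\left\{ (\mathbf{x}_w - \gamma_k \mathbf{H}\mathbf{c}_k)^{T}
                (\mathbf{x}_w - \gamma_k \mathbf{H}\mathbf{c}_k) \right\},
\qquad 0 \le k < M
\]
\[
\mathbf{H} =
\begin{bmatrix}
h(0)   & 0      & \cdots & 0 \\
h(1)   & h(0)   & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
h(L-1) & h(L-2) & \cdots & h(0)
\end{bmatrix}
\]
```

With this layout, the product $\mathbf{H}\mathbf{c}_k$ equals the first $L$ samples of the convolution $c_k(n) * h(n)$, so the two forms describe the same error.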
- the FCB utilizes a multipulse configuration in which the excitation vector $c_k$ contains very few non-zero, unit magnitude values. This configuration is known in the art as Algebraic CELP, or ACELP.
- Table 1 generally depicts pulse positions defined for IS-127 Rate 1/2.
- The excitation codevector $c_k$ can contain "holes" in which certain positions are not represented by the vector space. That is, an optimal match to the target vector may require a pulse at position 12, but the pulse position definitions in Table 1 do not allow a pulse to be located at that position.
- the constraints on positions may cause the pulse to be placed either at locations close to the optimal position, or worse, the energy of the target signal may be completely missed at that position. This can cause distortion, and possibly audible artifacts in the synthesized speech signal.
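The "holes" problem can be made concrete with a small sketch. The track layout below is hypothetical and is NOT Table 1 of the source (which is not reproduced in this text); it is chosen only so that some subframe positions, including position 12, are unreachable. The function names are likewise illustrative.

```python
# Hypothetical track layout for illustration only; not the IS-127 table.
HYPOTHETICAL_TRACKS = {
    0: [0, 5, 10, 15, 20, 25, 30, 35],
    1: [1, 6, 11, 16, 21, 26, 31, 36],
    2: [3, 8, 13, 18, 23, 28, 33, 38],
}


def find_holes(tracks, subframe_length):
    """List the subframe positions that no track can represent."""
    allowed = {p for positions in tracks.values() for p in positions}
    return [n for n in range(subframe_length) if n not in allowed]


def nearest_allowed(tracks, wanted):
    """Closest representable position to `wanted`; ties go to the lower position."""
    allowed = {p for positions in tracks.values() for p in positions}
    return min(allowed, key=lambda p: (abs(p - wanted), p))
```

Under this assumed layout a pulse wanted at position 12 would have to be placed at a neighboring position instead, which is the kind of mismatch the text describes.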
- the bit allocation of 16 bits would be divided between the four tracks equally so that each track would receive four bits.
- the four bits per track would further be composed of three bits for position (comprising 8 different positions) and one sign bit to indicate the polarity of the pulse.
- the pulse positions can then be extracted at the decoder by:
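The extraction expression is not reproduced in this text. As a minimal sketch of the bit allocation described above (four tracks, each with three position bits and one sign bit, 16 bits in total), the following packs and unpacks such a codeword. The field order, the placement of the sign bit above the position bits, and the convention that a set sign bit means negative polarity are all assumptions made for illustration.

```python
BITS_PER_POSITION = 3  # 8 positions per track
TRACKS = 4             # 4 tracks * (3 + 1) bits = 16 bits
FIELD_BITS = BITS_PER_POSITION + 1


def pack_pulses(pulses):
    """pulses: one (position_index, sign) pair per track, sign in {+1, -1}."""
    if len(pulses) != TRACKS:
        raise ValueError("expected one pulse per track")
    codeword = 0
    for position_index, sign in pulses:
        if not 0 <= position_index < (1 << BITS_PER_POSITION):
            raise ValueError("position index out of range")
        if sign not in (1, -1):
            raise ValueError("sign must be +1 or -1")
        sign_bit = 1 if sign < 0 else 0
        field = (sign_bit << BITS_PER_POSITION) | position_index
        codeword = (codeword << FIELD_BITS) | field  # track 0 ends up in the top bits
    return codeword


def unpack_pulses(codeword):
    """Inverse of pack_pulses: recover (position_index, sign) for each track."""
    pulses = []
    for track in range(TRACKS):
        shift = FIELD_BITS * (TRACKS - 1 - track)
        field = (codeword >> shift) & ((1 << FIELD_BITS) - 1)
        position_index = field & ((1 << BITS_PER_POSITION) - 1)
        sign = -1 if field >> BITS_PER_POSITION else 1
        pulses.append((position_index, sign))
    return pulses
```

Because each track is coded independently in this layout, every combination of positions is representable, which is the baseline that jointly constrained pulse positions depart from.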
- the respective positions of pulse 0 are shown along the horizontal axis, and the positions of pulse 1 are shown along the vertical axis.
- the "forbidden" pulse combinations are designated by the shaded regions while the allowable combinations are unshaded.
- FIG. 4 generally depicts a flow chart describing how the codebook is generated in accordance with the invention.
- the flowchart shows a basic nested loop structure in which all permutations of $0 \le i < M$ and $0 \le j < N$ are generated.
- N and M are the total number of allowable positions for each pulse.
- the decision in the innermost loop simply checks for forbidden combinations [i,j] according to function F(i,j) at step 402, which in the example of FIG. 3 is described as:
- This function returns a value of 1 when the absolute value of the difference of i and j is an element of the given set; otherwise, a zero is returned. This is shown in step 403.
- the elements of the given set correspond to the distances between the diagonal shaded elements of FIG. 3, and the expression is therefore sufficient in describing all necessary shaded regions.
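The nested-loop generation with a forbidden-combination test can be sketched as follows. The set of forbidden distances used in FIG. 3 is not reproduced in this text, so the set is left as a parameter here; the function names are hypothetical, and any concrete set passed in is an assumption rather than the patent's values.

```python
def make_forbidden(distances):
    """Build F(i, j): returns 1 if |i - j| is in `distances`, otherwise 0."""
    distance_set = frozenset(distances)

    def forbidden(i, j):
        return 1 if abs(i - j) in distance_set else 0

    return forbidden


def generate_codebook(m, n, forbidden):
    """Enumerate the allowed combinations [i, j] for 0 <= i < m and 0 <= j < n."""
    allowed = []
    for i in range(m):          # outer loop over positions of the first pulse
        for j in range(n):      # inner loop over positions of the second pulse
            if forbidden(i, j) == 0:
                allowed.append((i, j))
    return allowed
```

For example, with 8 positions per pulse and an illustrative distance set of {0, 4}, 16 of the 64 combinations are excluded and 48 remain; an index into the resulting list is one possible way to label the surviving combinations.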
- the respective positions are calculated using the following expression:
- ⁇ is the decimated track position
- $N_{track}$ is the number of tracks
- n is the track number.
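The position expression is not reproduced in this text. A minimal sketch, assuming the common interleaved ACELP layout in which the subframe position equals the number of tracks times the decimated track position plus the track number, is shown below. Both the formula and the function names are assumptions and may differ from the expression in the original filing.

```python
def interleaved_position(decimated, track, num_tracks):
    """Map a decimated track position to a subframe position (assumed layout)."""
    if not 0 <= track < num_tracks:
        raise ValueError("track number out of range")
    if decimated < 0:
        raise ValueError("decimated position must be non-negative")
    return num_tracks * decimated + track


def split_position(position, num_tracks):
    """Inverse mapping: return (decimated position, track number)."""
    return divmod(position, num_tracks)
```

Under this assumption, decimated position 3 on track 2 of 5 tracks corresponds to subframe position 17, and the inverse mapping recovers the pair (3, 2).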
- FIG. 5 generally depicts a joint interleaved pulse permutation matrix for pulses $p_2$ and $p_3$ in accordance with the present invention.
- n is the number of pulses.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Computational Linguistics (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Control Of El Displays (AREA)
- Control Of Motors That Do Not Use Commutators (AREA)
- Paper (AREA)
- Reduction Or Emphasis Of Bandwidth Of Signals (AREA)
- Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2000570919A JP4460165B2 (ja) | 1998-09-11 | 1999-08-24 | Method and apparatus for encoding an information signal |
DE69931641T DE69931641T2 (de) | 1998-09-11 | 1999-08-24 | Method for coding information signals |
EP99943854A EP1112625B1 (en) | 1998-09-11 | 1999-08-24 | Method for coding an information signal |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15143098A | 1998-09-11 | 1998-09-11 | |
US09/151,430 | 1998-09-11 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2000016501A1 (en) | 2000-03-23 |
Family
ID=22538745
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1999/019217 WO2000016501A1 (en) | 1998-09-11 | 1999-08-24 | Method and apparatus for coding an information signal |
Country Status (6)
Country | Link |
---|---|
EP (1) | EP1112625B1 (ja) |
JP (1) | JP4460165B2 (ja) |
KR (1) | KR100409167B1 (ja) |
AT (1) | ATE328407T1 (ja) |
DE (1) | DE69931641T2 (ja) |
WO (1) | WO2000016501A1 (ja) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5963897A (en) * | 1998-02-27 | 1999-10-05 | Lernout & Hauspie Speech Products N.V. | Apparatus and method for hybrid excited linear prediction speech encoding |
US5970444A (en) * | 1997-03-13 | 1999-10-19 | Nippon Telegraph And Telephone Corporation | Speech coding method |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR2579356B1 (fr) * | 1985-03-22 | 1987-05-07 | Cit Alcatel | Low bit-rate speech coding method using a multi-pulse excitation signal |
SE463691B (sv) * | 1989-05-11 | 1991-01-07 | Ericsson Telefon Ab L M | Method of positioning excitation pulses for a linear predictive coder (LPC) operating according to the multipulse principle |
US5754976A (en) * | 1990-02-23 | 1998-05-19 | Universite De Sherbrooke | Algebraic codebook with signal-selected pulse amplitude/position combinations for fast coding of speech |
JP3057907B2 (ja) * | 1992-06-16 | 2000-07-04 | 松下電器産業株式会社 | Speech coding apparatus |
KR950011967B1 (ko) * | 1992-07-31 | 1995-10-12 | 임홍식 | Memory organizing device for a semiconductor recorder |
JP3196595B2 (ja) * | 1995-09-27 | 2001-08-06 | 日本電気株式会社 | Speech coding apparatus |
JP4063911B2 (ja) * | 1996-02-21 | 2008-03-19 | 松下電器産業株式会社 | Speech coding apparatus |
JP3180762B2 (ja) * | 1998-05-11 | 2001-06-25 | 日本電気株式会社 | Speech coding apparatus and speech decoding apparatus |
JP3824810B2 (ja) * | 1998-09-01 | 2006-09-20 | 富士通株式会社 | Speech coding method, speech coding apparatus, and speech decoding apparatus |
1999
- 1999-08-24 DE: application DE69931641T, patent DE69931641T2 (de), status: Expired - Lifetime
- 1999-08-24 JP: application JP2000570919A, patent JP4460165B2 (ja), status: Expired - Fee Related
- 1999-08-24 WO: application PCT/US1999/019217, publication WO2000016501A1 (en), status: Active, IP Right Grant
- 1999-08-24 EP: application EP99943854A, patent EP1112625B1 (en), status: Expired - Lifetime
- 1999-08-24 AT: application AT99943854T, patent ATE328407T1 (de), status: IP Right Cessation
- 1999-08-24 KR: application KR10-2001-7003129A, patent KR100409167B1 (ko), status: IP Right Cessation
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1132893A3 (en) * | 2000-02-15 | 2002-10-16 | Lucent Technologies Inc. | Constraining pulse positions in CELP vocoding |
US6539349B1 (en) | 2000-02-15 | 2003-03-25 | Lucent Technologies Inc. | Constraining pulse positions in CELP vocoding |
RU2471288C2 (ru) * | 2008-03-13 | 2012-12-27 | Моторола Мобилити, Инк. | Apparatus and method for low-complexity combinatorial coding of signals |
Also Published As
Publication number | Publication date |
---|---|
DE69931641T2 (de) | 2006-10-05 |
KR20010073146A (ko) | 2001-07-31 |
EP1112625B1 (en) | 2006-05-31 |
DE69931641D1 (de) | 2006-07-06 |
KR100409167B1 (ko) | 2003-12-12 |
EP1112625A4 (en) | 2004-06-16 |
ATE328407T1 (de) | 2006-06-15 |
EP1112625A1 (en) | 2001-07-04 |
JP2002525667A (ja) | 2002-08-13 |
JP4460165B2 (ja) | 2010-05-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6236960B1 (en) | Factorial packing method and apparatus for information coding | |
US6141638A (en) | Method and apparatus for coding an information signal | |
US7280959B2 (en) | Indexing pulse positions and signs in algebraic codebooks for coding of wideband signals | |
DE69928288T2 (de) | Coding of periodic speech | |
US6055496A (en) | Vector quantization in celp speech coder | |
EP1235203B1 (en) | Method for concealing erased speech frames and decoder therefor | |
KR20010024935A (ko) | Speech coding | |
EP2805324B1 (en) | System and method for mixed codebook excitation for speech coding | |
AU2002221389A1 (en) | Indexing pulse positions and signs in algebraic codebooks for coding of wideband signals | |
US6678651B2 (en) | Short-term enhancement in CELP speech coding | |
US6415252B1 (en) | Method and apparatus for coding and decoding speech | |
EP1103953B1 (en) | Method for concealing erased speech frames | |
EP1112625B1 (en) | Method for coding an information signal | |
KR100718487B1 (ko) | Harmonic noise weighting in digital speech coders | |
Bessette et al. | Techniques for high-quality ACELP coding of wideband speech | |
WO2002023536A2 (en) | Formant emphasis in celp speech coding | |
JP3166697B2 (ja) | Speech coding/decoding apparatus and system | |
Saleem et al. | Implementation of Low Complexity CELP Coder and Performance Evaluation in terms of Speech Quality | |
WO2001009880A1 (en) | Multimode vselp speech coder | |
WO2000042601A1 (en) | A method and device for designing and searching large stochastic codebooks in low bit rate speech encoders | |
Ravishankar et al. | Voice Coding Technology for Digital Aeronautical Communications |
Legal Events
Code | Title | Description |
---|---|---|
AK | Designated states | Kind code of ref document: A1; Designated state(s): BR JP KR |
AL | Designated countries for regional patents | Kind code of ref document: A1; Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE |
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101) | |
WWE | WIPO information: entry into national phase | Ref document number: 1999943854; Country of ref document: EP |
WWE | WIPO information: entry into national phase | Ref document number: 1020017003129; Country of ref document: KR |
ENP | Entry into the national phase | Ref country code: JP; Ref document number: 2000 570919; Kind code of ref document: A; Format of ref document f/p: F |
WWP | WIPO information: published in national office | Ref document number: 1999943854; Country of ref document: EP |
WWP | WIPO information: published in national office | Ref document number: 1020017003129; Country of ref document: KR |
WWG | WIPO information: grant in national office | Ref document number: 1020017003129; Country of ref document: KR |
WWG | WIPO information: grant in national office | Ref document number: 1999943854; Country of ref document: EP |