EP0749626A1 - Speech coding method using linear prediction and algebraic code excitation - Google Patents

Speech coding method using linear prediction and algebraic code excitation

Info

Publication number
EP0749626A1
EP0749626A1 EP96901020A EP96901020A EP0749626A1 EP 0749626 A1 EP0749626 A1 EP 0749626A1 EP 96901020 A EP96901020 A EP 96901020A EP 96901020 A EP96901020 A EP 96901020A EP 0749626 A1 EP0749626 A1 EP 0749626A1
Authority
EP
European Patent Office
Prior art keywords
pulse
integer
excitation
components
repertoire
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
EP96901020A
Other languages
English (en)
French (fr)
Other versions
EP0749626B1 (de)
Inventor
Claude Lamblin
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Orange SA
Original Assignee
France Telecom SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by France Telecom SA
Publication of EP0749626A1
Application granted
Publication of EP0749626B1
Anticipated expiration
Expired - Lifetime

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/10 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a multipulse excitation
    • G10L19/107 Sparse pulse excitation, e.g. by using algebraic codebook
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L2019/0001 Codebooks
    • G10L2019/0007 Codebook element generation
    • G10L2019/0008 Algebraic codebooks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L2019/0001 Codebooks
    • G10L2019/0013 Codebook search algorithms
    • G10L2019/0014 Selection criteria for distances

Definitions

  • the present invention relates to a method of digital coding, in particular of speech signals.
  • CELP Code Excited Linear Prediction
  • CELP coding with an algebraic repertoire has been further improved by the introduction of ACELP (Algebraic Code Excited Linear Prediction) coders, which combine an algebraic repertoire with a focused search using adaptive thresholds, allowing the computational complexity to be adjusted.
  • CELP coders belong to the family of analysis-by-synthesis coders, in which the synthesis model is used at the coder.
  • the signals to be coded can be sampled at the telephone frequency (Fe = 8 kHz) or at a higher frequency, for example 16 kHz for wideband coding (bandwidth from 0 to 7 kHz).
  • the compression rate varies from 1 to 16: CELP coders operate at rates of 2 to 16 kbit/s in the telephone band, and at rates of 16 to 32 kbit/s in wideband.
  • the speech signal is sampled and converted into a series of frames of L samples.
  • Each frame is synthesized by filtering a waveform extracted from a repertoire (also called dictionary), multiplied by a gain, through two filters varying over time.
  • the excitation repertoire is a set of K codes or waveforms of L samples.
  • the waveforms are numbered by an integer index k, k ranging from 0 to K-1, K being the size of the repertoire.
  • the first filter is the long-term prediction filter.
  • LTP Long Term Prediction
  • LPC linear prediction coding
  • the excitation of the synthesis model is therefore constituted by waveforms extracted from a repertoire.
  • the repertoires of the first CELP coders consisted of stochastic waveforms. These repertoires are obtained either by learning or by random generation. Their major drawback is their lack of structure, which means that they must be stored and which results in high implementation complexity.
  • the excitation repertoire of the first CELP coder was a stochastic dictionary composed of a set of 1024 waveforms of 40 Gaussian samples. This CELP coder did not run in real time on the most powerful computers of the time. Other stochastic dictionaries that reduce both the memory and the computing time required were introduced; however, both the complexity and the required memory capacity remained significant.
  • ACELP coders have been proposed as candidates for several standardizations: the ITU (International Telecommunication Union) standardization at 8 kbit/s, and the ITU standardization for PSTN video telephony at 6.8 kbit/s-5.4 kbit/s.
  • the short-term prediction, LTP analysis and perceptual weighting modules are similar to those used in a conventional CELP coder.
  • the originality of the ACELP encoder resides in the module for searching for the excitation signal.
  • the ACELP coder has two major advantages: great flexibility in bit rate and adjustable implementation complexity. The flexibility in bit rate comes from the method used to generate the repertoire. The possibility of adjusting the complexity is due to the waveform selection procedure, which uses a focused search with adaptive thresholds.
  • the excitation repertoire is a virtual set (in the sense that it is not stored), generated algebraically.
  • the algebraic code generator produces, in response to an index k, k varying from 0 to K-1, a code vector of L samples having very few non-zero components.
  • Let N be the number of non-zero components.
  • the dimension of the code words is extended to L + N, and the last N components are zero. It is assumed here, without affecting the generality of the presentation, that L is a multiple of N.
  • the code words c k are therefore composed of N pulses.
  • the amplitudes of the pulses are fixed (for example ±1).
  • Permitted positions for pulse p are of the form
  • the search for the best waveform in CELP is carried out by looking for the one which minimizes the quadratic error between the weighted original signal and the weighted synthetic signal. This amounts to maximizing the quantity $Cr_k = P_k^2 / \alpha_k^2$
  • T denotes the matrix transposition.
  • D is a target vector which depends on the input signal, the synthetic signal passed and the filter composed of the synthesis and perceptual weighting filters.
  • h be the vector of the impulse response of this composite filter:
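  • $U = H^T \cdot H$ is the covariance matrix of h.
  • the waveform c_k is composed of N pulses of positions $pos_{i(q,k),q}$ and of amplitudes $S_q$ (0 ≤ q < N)
  • the scalar product $P_k$ of the target vector D with a waveform c_k and the energy $\alpha_k^2$ of the filtered waveform c_k have as expression, writing $pos_q$ for the position of pulse q in c_k: $P_k = \sum_{q=0}^{N-1} S_q\, D(pos_q)$ and $\alpha_k^2 = \sum_{q=0}^{N-1} S_q^2\, U(pos_q, pos_q) + 2 \sum_{q=1}^{N-1} \sum_{t=0}^{q-1} S_t S_q\, U(pos_t, pos_q)$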
  • the exploration is accelerated by calculating before entering the search procedure an adaptive threshold for each loop. One enters the search loop of the pulse q only if a partial quantity Cr k (q-1), calculated from the pulses 0 to q-1 previously determined in the upper loops, exceeds a threshold calculated for the loop q-1.
  • the output signal typically 80 to 200 words or bytes.
  • the covariance matrix accounts for the major part of the memory size. It is noted that, for a given application, the memory space necessary for the intermediate signals cannot be reduced; if the overall memory size is to be reduced, it therefore seems that the only option is to act on the memory needed for the covariance matrix. However, until now, those skilled in the art knew that this matrix was symmetric about the main diagonal and that certain terms were not used, but they believed that the unused terms were scattered through the matrix in no particular order.
  • a main object of the present invention is to propose an ACELP type coding method which notably reduces the size of the memory necessary for the coder.
  • the invention thus proposes a speech coding method with linear prediction and excitation by codes (CELP), in which a speech signal is digitized in successive frames of L samples, on the one hand, synthesis parameters defining synthesis filters are determined, and on the other hand excitation parameters including for each frame positions of pulses of an excitation code of L samples belonging to a predetermined algebraic repertoire and an associated excitation gain, and quantization values representative of the determined parameters are transmitted.
  • the algebraic repertoire is defined from at least one group of N sets of possible pulse positions in codes of at least L samples, a code of the repertoire being represented by N pulse positions belonging respectively to the N sets of a group.
  • $P_k = D \cdot c_k^T$ denotes the scalar product between the code c_k of the repertoire and a target vector D dependent on the speech signal of the frame and the synthesis parameters
  • $\alpha_k^2$ denotes the energy over the frame of the code c_k filtered by a filter composed of the synthesis filters and of a perceptual weighting filter.
  • the memorized components of the covariance matrix are only, for at least one group of N sets, those of the form:
  • the stored components of the covariance matrix are structured, for a group, in the form of N correlation vectors and N (N-1) / 2 correlation matrices.
  • This arrangement of the components of the covariance matrix facilitates their access during the search for the ACELP excitation, so as to reduce or at least not increase the complexity of this module.
  • the method according to the invention is applicable to various types of algebraic codes, that is to say whatever the structure of the sets of possible positions for the different pulses of the codes of the directory.
  • the procedure for calculating correlation vectors and correlation matrices can be made relatively simple and efficient when, in a group of N sets, the sets of possible positions for a pulse of the codes of the repertoire all have the same cardinality L' and the position of order i in the set of possible positions for the pulse p (0 ≤ i < L', 0 ≤ p < N) is given by $pos_{i,p} = \delta\,(iN + p) + \varepsilon$, δ and ε being two integers such that δ > 0 and ε ≥ 0.
  • FIGS. 1 and 2 are block diagrams of a CELP decoder and coder using an algebraic repertoire according to the invention
  • FIGS. 3 and 4 are flowcharts illustrating the calculation of correlation vectors and correlation matrices in a first embodiment of the invention
  • FIGS. 6 to 8 are flowcharts illustrating the calculation of correlation vectors and correlation matrices in a second embodiment of the invention.
  • FIG. 9 is a flowchart illustrating a sub-optimal procedure for seeking excitation in the second embodiment.
  • The speech synthesis process implemented in a CELP coder and decoder is illustrated in FIG. 1.
  • An excitation generator 10 delivers an excitation code c k belonging to a predetermined directory in response to an index k.
  • An amplifier 12 multiplies this excitation code by an excitation gain β, and the resulting signal is subjected to a long-term synthesis filter 14.
  • the output signal u of the filter 14 is in turn subjected to a short-term synthesis filter 16, the output of which constitutes what is considered here as the synthesized speech signal.
  • filters can also be implemented at the level of the decoder, for example post-filters, as is well known in the field of speech coding.
  • the aforementioned signals are digital signals represented for example by 16-bit words at a sampling rate Fe equal for example to 8 kHz.
  • the synthesis filters 14, 16 are generally purely recursive filters.
  • the delay T and the gain G constitute long-term prediction parameters (LTP) which are determined adaptively by the coder.
  • the LPC parameters of the short-term synthesis filter 16 are determined at the coder by a linear prediction of the speech signal.
  • the transfer function of the filter 16 is thus of the form 1/A(z), where, in the case of a linear prediction of order P (typically P ≈ 10), a_i represents the i-th linear prediction coefficient.
  • FIG. 2 shows the diagram of a CELP coder.
  • the speech signal s (n) is a digital signal, for example supplied by an analog-digital converter 20 processing the amplified and filtered output signal from a microphone 22.
  • the LPC, LTP and EXC parameters are obtained at the coder by three respective analysis modules 24, 26, 28. These parameters are then quantized in a known manner for efficient digital transmission, then fed to a multiplexer 30 which forms the output signal of the coder. These parameters are also supplied to a module 32 for calculating the initial states of certain coder filters.
  • This module 32 essentially comprises a decoding chain such as that represented in FIG. 1. The module 32 makes it possible to know at the level of the coder the previous states of the synthesis filters 14, 16 of the decoder, determined according to the synthesis and excitation parameters prior to the subframe under consideration.
  • the short-term analysis module 24 determines the LPC parameters (coefficients a_i of the short-term synthesis filter) by analyzing the short-term correlations of the speech signal s(n). This determination is made for example once per frame of Λ samples, so as to adapt to the evolution of the spectral content of the speech signal. LPC analysis methods are well known in the art, and will therefore not be detailed here. Reference may be made, for example, to the book "Digital Processing of Speech Signals" by L.R. Rabiner and R.W. Schafer, Prentice-Hall Int., 1978.
  • the next step in coding is to determine the LTP parameters for long-term synthesis. These are for example determined once per subframe of L samples.
  • a subtractor 34 subtracts from the speech signal s(n) the response of the short-term synthesis filter 16 to a zero input signal. This response is determined by a filter 36 of transfer function 1/A(z) whose coefficients are given by the LPC parameters determined by the module 24, and whose initial states are supplied by the module 32 so as to correspond to the last P samples of the synthetic signal.
  • the output signal from the subtractor 34 is subjected to a perceptual weighting filter 38.
  • the transfer function W (z) of this perceptual weighting filter is determined from the LPC parameters.
  • $W(z) = A(z) / A(z/\gamma)$, where γ is a coefficient on the order of 0.8.
  • the role of the perceptual weighting filter 38 is to accentuate the portions of the spectrum where the errors are most perceptible.
  • the closed-loop LTP analysis performed by the module 26 consists, in a conventional manner, in selecting for each subframe the delay T which maximizes a normalized correlation between x'(n) and y_T(n), where x'(n) denotes the output signal of the filter 38 during the subframe considered, and y_T(n) denotes the convolution product u(n-T) * h'(n).
  • h'(0), h'(1), ..., h'(L-1) denotes the impulse response of the weighted synthesis filter, with transfer function W(z)/A(z).
  • This impulse response h' is obtained by a module 40 for calculating impulse responses, as a function of the LPC parameters which have been determined for the subframe.
  • the samples u(n-T) are the previous states of the long-term synthesis filter 14, provided by the module 32.
  • the missing samples u(n-T) are obtained by interpolation on the basis of previous samples, or from the speech signal.
  • the delays T, whole or fractional, are selected in a specific window, ranging for example from 20 to 143 samples.
  • the open-loop search consists more simply in determining the delay T' which maximizes the autocorrelation of the speech signal s(n), possibly filtered by the inverse filter of transfer function A(z). Once the delay T has been determined, the long-term prediction gain G is then computed.
  • the signal G·y_T(n), which has been calculated by the module 26 for the optimal delay T, is first subtracted from the signal x'(n) by the subtractor 42
  • the resulting signal x (n) is subjected to a reverse filter 44 which provides a signal D (n) given by:
  • h (0), h (1), ..., h (L-1) denotes the impulse response of the filter composed of the synthesis filters and the perceptual weighting filter, calculated by module 40.
  • the compound filter has the transfer function W (z) / [A (z) .B (z)].
  • the vector D constitutes a target vector for the module 28 for searching for the excitation.
  • This module 28 determines a code word from the repertoire which maximizes the normalized correlation $P_k^2 / \alpha_k^2$ in which
  • the algebraic repertoire of possible excitation codes is defined from at least one group of N sets E_0, E_1, ..., E_{N-1} of possible positions for pulses of order 0, 1, ..., N-1 and of amplitude S_0, S_1, ..., S_{N-1} in codes of at least L samples.
  • a code of the repertoire is represented by N pulse positions belonging respectively to the sets E_0, E_1, ..., E_{N-1} of the same group of N sets.
  • the cardinalities L'_0, L'_1, ..., L'_{N-1} of the sets E_0, E_1, ..., E_{N-1} can be equal or different, and these sets may or may not be disjoint.
  • $pos_{i,p} = \delta\,(iN + p) + \varepsilon$ (2), δ and ε being two integers such that 0 ≤ ε < δ.
  • After having calculated and memorized certain terms of the covariance matrix U, the module 28 proceeds to the search for the excitation code for the current sub-frame.
  • the memorized components of the covariance matrix are on the one hand those of the form:
  • the calculation of the N correlation vectors R p, p is carried out by the module 28 in the manner illustrated in FIG. 3.
  • This calculation comprises a loop indexed by an integer i decreasing from L'-1 to 0.
  • the integer variable k is taken equal to L-δL'N-ε (we assume here L-δL'N-ε ≥ 0), and the accumulation variable cor is taken equal to 0.
  • the components R p, p (i) are calculated successively for p decreasing from N-1 to 0.
  • the variable p is first taken equal to N-1 (step 52).
  • the calculation of the N correlation vectors requires on the order of δL'N additions, δL'N multiplications and L'N memory loads.
  • the initialization 50 of the calculation could be different.
  • the integer k can also be initialized to L-δL'N in step 50, each iteration in the loops indexed by p decreasing from N-1 to 0 then consisting of δ-ε executions of step 54, followed by step 56, followed by ε executions of step 54.
  • the calculation remains correct because in total δ steps 54 are carried out between two successive memorizations of terms R p, p (i).
  • the calculation of the N(N-1)/2 correlation matrices R p, q can be performed by module 28 as illustrated in FIG. 4.
  • this calculation includes a loop B t, d ' , indexed by an integer i decreasing from L'-1-d' to 0.
  • the integer t is taken equal to 1.
  • the integer d ' is then taken equal to 0 in step 72.
  • Step 74 corresponds to the initialization of the loop indexed by the integer i.
  • the integer i is initialized to L'-1-d ", the integer j to L'-1, the integer d to ⁇ .
  • Step 78 is then executed δ times; it consists in adding the term h(k)·h(k+d) to the accumulation variable cor and incrementing the variable k by one.
  • step 80 the component R p, q (i, j) is taken equal to the accumulation variable cor, and the integers p and q are each decremented by one.
  • Test 82 is then performed on the integer p. If p ≥ 0, we return before step 78, which is executed again δ times. If test 82 shows that p < 0, test 84 is performed on the integer i. If i > 0, we go to step 86 where the integer p' is initialized to N-1, the integer q remaining equal to t-1.
  • Step 86 is followed by δ successive executions of step 88 consisting, like step 78, of adding h(k)·h(k+d) to the accumulation variable cor and of incrementing the integer variable k by one. Then, in step 90, the component R q, p' (j, i-1) is taken equal to the accumulation variable cor, and the integers p' and q are each decremented by one. Test 92 is then performed on the value of the integer q. If q ≥ 0, we return before step 88, which is executed again δ times.
  • step 94 we return before step 76 for the execution of the next iteration in the loop B t, d' .
  • This loop is terminated when test 84 shows that i ≤ 0. The integer d' is then incremented by one
  • the calculation of the N(N-1)/2 correlation matrices requires only about δL'²N(N-1)/2 additions, δL'²N(N-1)/2 multiplications and L'²N(N-1)/2 memory loads.
  • N-1 partial thresholds T (0), ..., T (N-2) are first calculated, and the threshold T (N-1) is initialized to a negative value, for example -1.
  • the partial thresholds T(0), ..., T(N-2) are positive and calculated as a function of the input vector D and of the targeted compromise between the efficiency of the search for excitation and the simplicity of this search. High values of the partial thresholds tend to decrease the amount of computation necessary to search for excitation, while low values of the partial thresholds lead to a more exhaustive search in the ACELP repertoire.
  • index i 0 is taken equal to 0.
  • the iteration of index i_0 in the loop B_0 comprises a step 124_0 of calculation of two terms P(0) and α²(0) according to:
  • the comparison 126_0 is then carried out between the quantities P²(0) and T(0)·α²(0). If P²(0) ≤ T(0)·α²(0), then we go to step 130_0, which increments the index i_0, then to test 132_0, where the index i_0 is compared with the number L'. When i_0 becomes equal to L', the search for excitation is finished. Otherwise, we return before step 124_0 to proceed to the next iteration in the loop B_0. If comparison 126_0 shows that P²(0) > T(0)·α²(0), then the loop B_1 is executed.
  • the loops B_q, for 0 < q < N-1, consist of identical instructions:
  • if comparison 126_q shows that P²(q) ≤ T(q)·α²(q), increment 130_q of the index i_q, then comparison 132_q between the index i_q and the number L';
  • Loop B_{N-1} consists of the same instructions as the previous loops. However, if the comparison 126_{N-1} shows that P²(N-1) > T(N-1)·α²(N-1), then a step 128 is executed before going to step 130_{N-1} for incrementing the index i_{N-1}.
  • N indexes i 0 , i 1 , ..., i N-1 used to find the positions of N code pulses.
  • the N indexes i 0 , i 1 , ..., i N-1 can be compiled into a global index k given by:
  • this index k being coded on N·log2(L') bits. It is noted that the arrangement of the components in correlation matrices makes it possible, during the nested-loop search, to address the components of the matrix U needed by a loop simply by incrementing the pointers i_q by one, instead of having to perform more complicated address calculations as in the case of the previous ACELP coder.
  • the last sequence numbers are preferably assigned to the pulses in question. If there are n_q possible amplitude values for the pulse q, then the loop B_q of the flowchart in FIGS. 5A and 5B is executed n_q times, each time with a different value of the amplitude S_q, and step 128 additionally stores the number of times that the loop B_q had been executed before a value of P²(N-1)/α²(N-1) exceeding the current threshold was encountered. This number is also transmitted to the decoder, which is thus able to recover the amplitude S_q to be applied to the corresponding pulse of the excitation code.
  • the ACELP decoder comprises a demultiplexer 8 receiving the bit stream from the coder.
  • the quantized values of the excitation parameters EXC and of the synthesis parameters LTP and LPC are supplied to the generator 10, to the amplifier 12 and to the filters 14, 16 to reconstruct the synthetic signal s, which can for example be converted into analog by the converter 18 before being amplified and then applied to a loudspeaker 19 to restore the original speech.
  • the integer variable k is initialized to L-δL'N at the initialization 74 of a loop B t, d' ;
  • step 78 and step 80 are
  • the search for excitation can be carried out simply by performing, once for each of the M groups, the nested-loop search represented in FIGS. 5A and 5B. It then suffices to store in step 128 the number of times that the nested-loop search has been executed in full before the current search in order to obtain the index m of the group, which makes it possible to reconstruct the selected excitation code.
  • the second embodiment with M> 1 makes it possible to implement a suboptimal search procedure which still provides significant savings in memory space.
  • This procedure consists in memorizing the correlation vectors R p, p (m) and the correlation matrices R p, q (m) only for ⁇ of group indices m (1 ⁇ ⁇ M). The additional gain in memory space is then a factor ⁇ / M.
  • This procedure amounts to subdividing the covariance matrix U into sub-blocks with the approximation U(i, j) ≈ U(i-1, j-1) within each sub-block.
  • the steps 55 m , 79 m and 89 m are bypassed relative to those of the indexes m for which the correlation vectors R p, p ( m) and the correlation matrices R p, q (m) .
  • the search for excitation can be carried out in accordance with the flow diagram of FIGS. 5A and 5B by modifying the loops B q (0 ⁇ q ⁇ N) in the manner indicated in FIG. 9.
  • test 126_q shows that P²(q)/α²(q) is greater than the threshold T(q)
  • we execute the lower loops starting with B_{q+1} or, if q = N-1, we perform the updating 128 of the threshold and of the excitation parameters, which also include the index m, then taken equal to 0.
  • step 125_q is executed directly if the test 126_q shows that P²(q) ≤ T(q)·α²(q).
  • LPC coefficients are converted into line spectrum pair (LSP) parameters, which are vector-quantized.
  • LTP delays, which can take 256 integer or fractional values between 19⅓ and 143, are quantized on 8 bits. These 8 bits are transmitted in sub-frames 1 and 4 and, for the other sub-frames, a differential value is coded on 5 bits only.
  • the implementation of the invention makes it possible to divide by 2.5 the size of the memory required for the coder to store the components of the covariance matrix, while obtaining output signals identical to those obtained with the previous ACELP coder.
  • digital signal processors (DSP)
  • the LPC and LTP parameters are determined as in Example 1.
  • the bit rate is then 158 bits per frame, or 5.3 kbit/s.
  • the implementation of the invention makes it possible to divide by 2.8 the memory required for the coder to store the components of the covariance matrix while obtaining identical output signals (a saving of 1488 16-bit words, allowing 12-bit addressing in RAM).
  • the synthesis parameters being coded as in the case of Examples 1 and 2, the coder produces 153 bits per frame, which represents a bit rate of 5.1 kbit/s.
  • the second embodiment of the invention, applied without the sub-optimal procedure, would require storing 832 components of the matrix U.
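
The position rule and the global index described in the definitions above can be sketched as follows. This is only a minimal Python illustration, not the patent's implementation: the function names, the parameter names `delta` and `eps` for the two integers of the position rule, and the values N = 5, L' = 8 and L = 40 are assumptions chosen for the example. The mixed-radix packing of the indices i_0, ..., i_{N-1} is likewise an assumed convention; it only shows why N·log2(L') bits are enough when L' is a power of two.

```python
from math import log2


def pulse_position(i, p, n_pulses, delta=1, eps=0):
    """Position of order i allowed for pulse p: delta * (i * N + p) + eps."""
    return delta * (i * n_pulses + p) + eps


def pack_index(indices, card):
    """Pack i_0..i_{N-1} (each in 0..card-1) into one integer.

    Assumed convention: i_0 is the least significant base-card digit.
    """
    k = 0
    for i_q in reversed(indices):
        k = k * card + i_q
    return k


def unpack_index(k, n_pulses, card):
    """Inverse of pack_index: recover the N per-pulse indices."""
    indices = []
    for _ in range(n_pulses):
        k, i_q = divmod(k, card)
        indices.append(i_q)
    return indices


def build_code(indices, amplitudes, length, delta=1, eps=0):
    """Build the sparse code vector: one pulse taken from each position set."""
    n_pulses = len(indices)
    code = [0] * length
    for p, (i_p, s_p) in enumerate(zip(indices, amplitudes)):
        code[pulse_position(i_p, p, n_pulses, delta, eps)] += s_p
    return code


# Hypothetical configuration: N = 5 pulses, L' = 8 positions each, L = 40.
N, CARD, L = 5, 8, 40
indices = [0, 3, 7, 1, 2]
amplitudes = [1, -1, 1, -1, 1]

positions = [pulse_position(i, p, N) for p, i in enumerate(indices)]
code = build_code(indices, amplitudes, L)
k = pack_index(indices, CARD)
bits = N * log2(CARD)
```

With these assumed values the five position sets interleave and together cover every sample of the frame, and the decoder can rebuild the code vector from `k` alone with `unpack_index` followed by `build_code`.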
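
The definitions above express the scalar product P_k and the energy of the filtered code in terms of the pulses only. The sketch below checks that idea numerically. It assumes that the covariance matrix is U = HᵀH with H the lower-triangular Toeplitz matrix built from the impulse response h (the text does not print H explicitly), and the data, function names and pulse positions are hypothetical.

```python
def covariance(h):
    """U = H^T H for the lower-triangular Toeplitz matrix H built from h."""
    n = len(h)
    u = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            u[i][j] = u[j][i] = sum(h[m - i] * h[m - j] for m in range(j, n))
    return u


def filter_code(code, h):
    """Convolve the code with h, truncated to the frame length."""
    return [sum(h[m - j] * code[j] for j in range(m + 1)) for m in range(len(code))]


def sparse_terms(positions, amplitudes, target, u):
    """Return (P_k, energy) using only the pulse positions and amplitudes."""
    p_k = sum(s * target[pos] for pos, s in zip(positions, amplitudes))
    energy = sum(s * s * u[pos][pos] for pos, s in zip(positions, amplitudes))
    for q in range(len(positions)):
        for t in range(q):
            energy += 2 * amplitudes[t] * amplitudes[q] * u[positions[t]][positions[q]]
    return p_k, energy


# Hypothetical data: a decaying impulse response and an arbitrary target vector.
L = 20
h = [0.9 ** m for m in range(L)]
target = [((7 * m) % 11) - 5.0 for m in range(L)]
positions, amplitudes = [1, 7, 12], [1, -1, 1]

code = [0] * L
for pos, s in zip(positions, amplitudes):
    code[pos] = s

# Sparse evaluation: N diagonal terms plus N(N-1)/2 cross terms of U.
p_sparse, e_sparse = sparse_terms(positions, amplitudes, target, covariance(h))
# Dense reference: filter the whole code vector and sum the squares.
p_dense = sum(d * c for d, c in zip(target, code))
e_dense = sum(y * y for y in filter_code(code, h))
```

The sparse route touches only N + N(N-1)/2 entries of U per candidate code, which is why the search needs fast access to exactly those entries and to no others.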
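
The flowchart steps described above repeatedly add h(k)·h(k+d) to an accumulation variable cor while incrementing k. One way to see why this works, assuming U(i, j) is the sum of h(n-i)·h(n-j) for n from max(i, j) to L-1 (an assumed definition, since the text does not print it), is the identity U(i-1, j-1) = U(i, j) + h(L-i)·h(L-j): each step up a diagonal of U adds exactly one product. The sketch below walks one diagonal this way; it stores every entry, whereas the method described keeps only the entries at permitted pulse positions. All names are hypothetical.

```python
def covariance_entry(h, i, j):
    """Direct evaluation of U(i, j) under the assumed definition."""
    n = len(h)
    return sum(h[m - i] * h[m - j] for m in range(max(i, j), n))


def diagonal_by_accumulation(h, d):
    """Entries U(j, j + d), j = 0..L-1-d, from one running sum.

    Walking j downwards from L-1-d, each entry equals the previous one
    plus a single product h(k) * h(k + d).
    """
    n = len(h)
    diag = [0.0] * (n - d)
    cor = 0.0
    k = 0
    for j in range(n - 1 - d, -1, -1):
        cor += h[k] * h[k + d]
        k += 1
        diag[j] = cor
    return diag


# Hypothetical impulse response with alternating signs.
h = [(0.85 ** m) * (1 if m % 3 else -1) for m in range(16)]
d = 3
diag = diagonal_by_accumulation(h, d)
direct = [covariance_entry(h, j, j + d) for j in range(len(h) - d)]
max_error = max(abs(a - b) for a, b in zip(diag, direct))
```

The same identity suggests why the approximation U(i, j) ≈ U(i-1, j-1) used in the sub-optimal procedure can be acceptable: the two entries differ by a single product of late samples of h, which is small when the impulse response decays.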
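
The storage layout described above, N correlation vectors plus N(N-1)/2 correlation matrices per group, can be illustrated with a small sketch. The table layout (a list of vectors and a dictionary keyed by the pair (p, q) with p < q), the function names and the values N = 4, L' = 5, L = 20 are assumptions made for this example; the covariance entries are computed directly rather than by the accumulation loops of the flowcharts.

```python
def covariance_entry(h, a, b):
    """U(a, b) for the covariance of the truncated convolution with h."""
    n = len(h)
    return sum(h[m - a] * h[m - b] for m in range(max(a, b), n))


def build_tables(h, n_pulses, card, delta=1, eps=0):
    """Keep only the entries of U that a code of the repertoire can address."""
    def pos(i, p):
        return delta * (i * n_pulses + p) + eps

    # One vector per pulse: U(pos(i, p), pos(i, p)).
    vectors = [[covariance_entry(h, pos(i, p), pos(i, p)) for i in range(card)]
               for p in range(n_pulses)]
    # One matrix per pair p < q: U(pos(i, p), pos(j, q)).
    matrices = {
        (p, q): [[covariance_entry(h, pos(i, p), pos(j, q)) for j in range(card)]
                 for i in range(card)]
        for p in range(n_pulses) for q in range(p + 1, n_pulses)
    }
    return vectors, matrices


def energy_from_tables(indices, amplitudes, vectors, matrices):
    """Energy of the filtered code, addressed by the indices i_p only."""
    energy = sum(s * s * vectors[p][i]
                 for p, (i, s) in enumerate(zip(indices, amplitudes)))
    for (p, q), table in matrices.items():
        energy += 2 * amplitudes[p] * amplitudes[q] * table[indices[p]][indices[q]]
    return energy


# Hypothetical configuration: N = 4 pulses, L' = 5 positions each, L = 20.
N, CARD, L = 4, 5, 20
h = [0.9 ** m for m in range(L)]
vectors, matrices = build_tables(h, N, CARD)
stored = N * CARD + len(matrices) * CARD * CARD
full = L * L

indices, amplitudes = [2, 0, 4, 1], [1, -1, 1, -1]
energy = energy_from_tables(indices, amplitudes, vectors, matrices)

# Dense reference: build the code vector, filter it, sum the squares.
code = [0] * L
for p, (i, s) in enumerate(zip(indices, amplitudes)):
    code[i * N + p] = s
filtered = [sum(h[m - j] * code[j] for j in range(m + 1)) for m in range(L)]
reference = sum(y * y for y in filtered)
```

In this toy configuration the tables hold 170 values against 400 for the full matrix. The reduction factors quoted in the examples above depend on the actual coder parameters and are not reproduced by these hypothetical numbers.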
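
The nested-loop search with partial thresholds described above can be sketched as a recursion, one level per pulse. This is a simplified illustration under stated assumptions: a candidate is explored further only when P²(q) strictly exceeds T(q)·α²(q) (written `e_q` below), the final threshold starts at -1 and is raised to the best ratio found, U is taken as a full matrix rather than the structured tables, and the data and names are hypothetical.

```python
from itertools import product
from math import sin


def covariance(h):
    """U = H^T H for the lower-triangular Toeplitz matrix H built from h."""
    n = len(h)
    return [[sum(h[m - i] * h[m - j] for m in range(max(i, j), n))
             for j in range(n)] for i in range(n)]


def focused_search(target, u, pos_sets, amps, partial_thresholds):
    """Depth-first search over pulse positions with threshold pruning."""
    n_pulses = len(pos_sets)
    best = {"ratio": -1.0, "positions": None}  # final threshold starts negative
    visited = [0]

    def loop(q, chosen, p_prev, e_prev):
        s = amps[q]
        for pos in pos_sets[q]:
            visited[0] += 1
            p_q = p_prev + s * target[pos]
            cross = sum(amps[t] * u[chosen[t]][pos] for t in range(q))
            e_q = e_prev + s * s * u[pos][pos] + 2 * s * cross
            if q < n_pulses - 1:
                # Enter the next loop only if the partial ratio is promising.
                if p_q * p_q > partial_thresholds[q] * e_q:
                    loop(q + 1, chosen + [pos], p_q, e_q)
            elif p_q * p_q > best["ratio"] * e_q:
                best["ratio"] = p_q * p_q / e_q
                best["positions"] = chosen + [pos]

    loop(0, [], 0.0, 0.0)
    return best, visited[0]


def brute_force(target, u, pos_sets, amps):
    """Reference: evaluate every code of the repertoire on the dense vector."""
    n = len(target)
    best = -1.0
    for combo in product(*pos_sets):
        c = [0.0] * n
        for pos, s in zip(combo, amps):
            c[pos] += s
        p = sum(d * x for d, x in zip(target, c))
        e = sum(c[i] * u[i][j] * c[j] for i in range(n) for j in range(n))
        best = max(best, p * p / e)
    return best


# Hypothetical setup: L = 12, N = 3 pulses, L' = 4 interleaved positions each.
L, N, CARD = 12, 3, 4
h = [0.8 ** m for m in range(L)]
target = [sin(1.3 * m + 0.4) for m in range(L)]
pos_sets = [[i * N + p for i in range(CARD)] for p in range(N)]
amps = [1, 1, 1]
u = covariance(h)

exhaustive, n_all = focused_search(target, u, pos_sets, amps, [-1.0, -1.0])
pruned, n_pruned = focused_search(target, u, pos_sets, amps, [0.2, 0.4])
reference = brute_force(target, u, pos_sets, amps)
```

With negative partial thresholds nothing is pruned and the recursion visits all 4 + 16 + 64 candidates; raising the thresholds trades a possibly worse optimum for fewer visits, which is the complexity adjustment the text refers to. The comparisons are written as products so that no division is needed until a new best code is recorded.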
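
The perceptual weighting filter mentioned above has the form W(z) = A(z)/A(z/γ). Whatever sign convention is used for A(z), replacing z by z/γ multiplies the coefficient of z^-i by γ^i, so the denominator can be derived from the LPC polynomial directly. The sketch below assumes the polynomial is passed as its full coefficient list starting with 1, uses a hypothetical first-order A(z), and takes γ = 0.8 as in the text; the function names are not from the source.

```python
def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma) given those of A(z) in powers of z^-1."""
    return [a_i * gamma ** i for i, a_i in enumerate(a)]


def impulse_response(num, den, length):
    """First samples of the impulse response of num(z)/den(z), den[0] == 1."""
    out = []
    for n in range(length):
        acc = num[n] if n < len(num) else 0.0
        # Recursive part: subtract the feedback from earlier outputs.
        for j in range(1, min(n, len(den) - 1) + 1):
            acc -= den[j] * out[n - j]
        out.append(acc)
    return out


# Hypothetical first-order predictor: A(z) = 1 - 0.9 z^-1.
a = [1.0, -0.9]
gamma = 0.8
a_weighted = bandwidth_expand(a, gamma)
w = impulse_response(a, a_weighted, 3)
```

Because γ is below 1, the poles of 1/A(z/γ) are pulled towards the origin, which broadens the formant peaks of the weighting and lets more coding noise sit where the speech spectrum is strongest.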

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Optimization (AREA)
  • General Physics & Mathematics (AREA)
  • Algebra (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
EP96901020A 1995-01-06 1996-01-04 Speech coding method using linear prediction and algebraic code excitation Expired - Lifetime EP0749626B1 (de)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
FR9500133 1995-01-06
FR9500133A FR2729245B1 (fr) 1995-01-06 1995-01-06 Speech coding method using linear prediction and algebraic code excitation
PCT/FR1996/000017 WO1996021221A1 (fr) 1995-01-06 1996-01-04 Speech coding method using linear prediction and algebraic code excitation

Publications (2)

Publication Number Publication Date
EP0749626A1 (de) 1996-12-27
EP0749626B1 (de) 1999-10-20

Family

ID=9474930

Family Applications (1)

Application Number Title Priority Date Filing Date
EP96901020A Expired - Lifetime EP0749626B1 (de) 1995-01-06 1996-01-04 Speech coding method using linear prediction and algebraic code excitation

Country Status (8)

Country Link
US (1) US5717825A (de)
EP (1) EP0749626B1 (de)
JP (1) JP3481251B2 (de)
KR (1) KR100389693B1 (de)
CA (1) CA2182386C (de)
DE (1) DE69604729T2 (de)
FR (1) FR2729245B1 (de)
WO (1) WO1996021221A1 (de)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102194461A (zh) * 2006-03-10 2011-09-21 松下电器产业株式会社 Fixed codebook search device

Families Citing this family (40)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2729247A1 (fr) * 1995-01-06 1996-07-12 Matra Communication Analysis-by-synthesis speech coding method
FR2729246A1 (fr) * 1995-01-06 1996-07-12 Matra Communication Analysis-by-synthesis speech coding method
US5646867A (en) * 1995-07-24 1997-07-08 Motorola Inc. Method and system for improved motion compensation
JP3094908B2 (ja) * 1996-04-17 2000-10-03 日本電気株式会社 Speech coding apparatus
JP3707154B2 (ja) * 1996-09-24 2005-10-19 ソニー株式会社 Speech coding method and apparatus
DE19641619C1 (de) * 1996-10-09 1997-06-26 Nokia Mobile Phones Ltd Method for synthesizing a frame of a speech signal
US5970444A (en) * 1997-03-13 1999-10-19 Nippon Telegraph And Telephone Corporation Speech coding method
US5924062A (en) * 1997-07-01 1999-07-13 Nokia Mobile Phones ACLEP codec with modified autocorrelation matrix storage and search
CA2254620A1 (en) * 1998-01-13 1999-07-13 Lucent Technologies Inc. Vocoder with efficient, fault tolerant excitation vector encoding
US6266412B1 (en) * 1998-06-15 2001-07-24 Lucent Technologies Inc. Encrypting speech coder
US6556966B1 (en) 1998-08-24 2003-04-29 Conexant Systems, Inc. Codebook structure for changeable pulse multimode speech coding
US6714907B2 (en) 1998-08-24 2004-03-30 Mindspeed Technologies, Inc. Codebook structure and search for speech coding
SE521225C2 (sv) * 1998-09-16 2003-10-14 Ericsson Telefon Ab L M Method and device for CELP encoding/decoding
JP4005359B2 (ja) 1999-09-14 2007-11-07 富士通株式会社 Speech coding and speech decoding apparatus
US6738733B1 (en) 1999-09-30 2004-05-18 Stmicroelectronics Asia Pacific Pte Ltd. G.723.1 audio encoder
JP3449339B2 (ja) * 2000-06-08 2003-09-22 日本電気株式会社 Decoding apparatus and decoding method
US7363219B2 (en) * 2000-09-22 2008-04-22 Texas Instruments Incorporated Hybrid speech coding and system
JP3449348B2 (ja) * 2000-09-29 2003-09-22 日本電気株式会社 Correlation matrix learning method and apparatus, and storage medium
JP3536921B2 (ja) * 2001-04-18 2004-06-14 日本電気株式会社 Correlation matrix learning method, apparatus and program
DE10140507A1 (de) * 2001-08-17 2003-02-27 Philips Corp Intellectual Pty Method for the algebraic codebook search of a speech signal encoder
US7383283B2 (en) * 2001-10-16 2008-06-03 Joseph Carrabis Programable method and apparatus for real-time adaptation of presentations to individuals
US8195597B2 (en) * 2002-02-07 2012-06-05 Joseph Carrabis System and method for obtaining subtextual information regarding an interaction between an individual and a programmable device
US8655804B2 (en) 2002-02-07 2014-02-18 Next Stage Evolution, Llc System and method for determining a characteristic of an individual
JP4290917B2 (ja) * 2002-02-08 2009-07-08 株式会社エヌ・ティ・ティ・ドコモ Decoding device, encoding device, decoding method, and encoding method
US7003461B2 (en) * 2002-07-09 2006-02-21 Renesas Technology Corporation Method and apparatus for an adaptive codebook search in a speech processing system
EP1383109A1 (de) * 2002-07-17 2004-01-21 STMicroelectronics N.V. Method and device for wideband speech coding
US7249014B2 (en) 2003-03-13 2007-07-24 Intel Corporation Apparatus, methods and articles incorporating a fast algebraic codebook search technique
GB0307752D0 (en) * 2003-04-03 2003-05-07 Seiko Epson Corp Apparatus for algebraic codebook search
FI118835B (fi) * 2004-02-23 2008-03-31 Nokia Corp Selection of coding model
KR100668299B1 (ko) * 2004-05-12 2007-01-12 삼성전자주식회사 Method and apparatus for encoding/decoding a digital signal using piecewise linear quantization
SG123639A1 (en) 2004-12-31 2006-07-26 St Microelectronics Asia A system and method for supporting dual speech codecs
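KR101542069B1 (ko) * 2006-05-25 2015-08-06 삼성전자주식회사 Fixed codebook search method and apparatus, and speech signal encoding/decoding method and apparatus using the same
KR101196506B1 (ko) * 2007-06-11 2012-11-01 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. Audio encoder and encoding method for encoding an audio signal having an impulse-like portion and a stationary portion, decoder, decoding method, and encoded audio signal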
WO2009033288A1 (en) * 2007-09-11 2009-03-19 Voiceage Corporation Method and device for fast algebraic codebook search in speech and audio coding
ATE518224T1 (de) * 2008-01-04 2011-08-15 Dolby Int Ab Audio encoder and decoder
CN101615394B (zh) * 2008-12-31 2011-02-16 华为技术有限公司 Method and apparatus for allocating subframes
EP2665060B1 (de) * 2011-01-14 2017-03-08 Panasonic Intellectual Property Corporation of America Apparatus for coding a speech/sound signal
CN102623012B (zh) * 2011-01-26 2014-08-20 华为技术有限公司 Vector joint encoding/decoding method and codec
CA2887009C (en) * 2012-10-05 2019-12-17 Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung E.V. An apparatus for encoding a speech signal employing acelp in the autocorrelation domain
EP2919232A1 (de) * 2014-03-14 2015-09-16 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Encoder, decoder and method for encoding and decoding

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA1229681A (en) * 1984-03-06 1987-11-24 Kazunori Ozawa Method and apparatus for speech-band signal coding
CA1255802A (en) * 1984-07-05 1989-06-13 Kazunori Ozawa Low bit-rate pattern encoding and decoding with a reduced number of excitation pulses
US4868867A (en) * 1987-04-06 1989-09-19 Voicecraft Inc. Vector excitation speech or audio coder for transmission or storage
US4910781A (en) * 1987-06-26 1990-03-20 At&T Bell Laboratories Code excited linear predictive vocoder using virtual searching
US4899385A (en) * 1987-06-26 1990-02-06 American Telephone And Telegraph Company Code excited linear predictive vocoder
CA2005115C (en) * 1989-01-17 1997-04-22 Juin-Hwey Chen Low-delay code-excited linear predictive coder for speech or audio
EP0422232B1 (de) * 1989-04-25 1996-11-13 Kabushiki Kaisha Toshiba Voice coder
CA2027705C (en) * 1989-10-17 1994-02-15 Masami Akamine Speech coding system utilizing a recursive computation technique for improvement in processing speed
CA2010830C (en) * 1990-02-23 1996-06-25 Jean-Pierre Adoul Dynamic codebook for efficient speech coding based on algebraic codes
US5195137A (en) * 1991-01-28 1993-03-16 At&T Bell Laboratories Method of and apparatus for generating auxiliary information for expediting sparse codebook search
US5495555A (en) * 1992-06-01 1996-02-27 Hughes Aircraft Company High quality low bit rate celp-based speech codec
FR2700632B1 (fr) * 1993-01-21 1995-03-24 France Telecom Predictive coding-decoding system for a digital speech signal by adaptive transform with embedded codes.

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO9621221A1 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102194461A (zh) * 2006-03-10 2011-09-21 松下电器产业株式会社 Fixed codebook search device
CN102194461B (zh) * 2006-03-10 2013-01-23 松下电器产业株式会社 Fixed codebook search device

Also Published As

Publication number Publication date
DE69604729T2 (de) 2002-07-25
EP0749626B1 (de) 1999-10-20
KR100389693B1 (ko) 2003-12-01
FR2729245B1 (fr) 1997-04-11
JP3481251B2 (ja) 2003-12-22
US5717825A (en) 1998-02-10
JPH10502191A (ja) 1998-02-24
CA2182386A1 (fr) 1996-07-11
CA2182386C (fr) 2003-09-09
DE69604729D1 (de) 1999-11-25
FR2729245A1 (fr) 1996-07-12
KR970701901A (ko) 1997-04-12
WO1996021221A1 (fr) 1996-07-11

Similar Documents

Publication Publication Date Title
EP0749626B1 (de) Algebraic code-excited linear prediction speech coding method
EP0608174B1 (de) System for predictive coding/decoding of a digital speech signal by adaptive transform with embedded codes
EP0782128B1 (de) Method for analysing an audio-frequency signal by linear prediction, and application to a method for coding and decoding an audio-frequency signal
EP1994531B1 (de) Improved CELP coding or decoding of a digital audio signal
KR930010399B1 (ko) Method for selecting a specific excitation code word
EP1692689B1 (de) Optimized multiple coding method
EP0801790B1 (de) Analysis-by-synthesis speech coding method
FR2731548A1 (fr) Depth-first search in an algebraic codebook for fast speech encoding
FR2706064A1 (fr) Vector quantization method and device.
EP0801788B1 (de) Analysis-by-synthesis speech coding method
EP0721180B1 (de) Analysis-by-synthesis speech coding
EP0428445B1 (de) Method and device for coding prediction filters in very low bit rate vocoders
FR2690551A1 (fr) Method for quantizing a predictor filter for a very low bit rate vocoder.
EP1836699B1 (de) Method and device for performing optimized audio coding between two long-term prediction models
EP0616315A1 (de) Device for digital speech coding and decoding, method for searching a pseudo-logarithmic LTP delay codebook, and method for LTP analysis
EP1383109A1 (de) Method and device for wideband speech coding
WO2011144863A1 (fr) Coding with noise shaping in a hierarchical coder
EP1192619B1 (de) Audio coding and decoding by interpolation
WO2002029786A1 (fr) Method and device for segmental coding of an audio signal
FR2709366A1 (fr) Method for storing reflection coefficient vectors.
JP2001100799A (ja) Speech coding device, speech coding method, and computer-readable recording medium storing a speech coding algorithm
FR2709387A1 (fr) Radio communication system.
FR2980620A1 (fr) Processing for improving the quality of decoded audio-frequency signals

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 19960912

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): DE GB IT NL SE

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

17Q First examination report despatched

Effective date: 19990210

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): DE GB IT NL SE

REF Corresponds to:

Ref document number: 69604729

Country of ref document: DE

Date of ref document: 19991125

GBT Gb: translation of ep patent filed (gb section 77(6)(a)/1977)

Effective date: 19991203

ITF It: translation for a ep patent filed

Owner name: BARZANO' E ZANARDO MILANO S.P.A.

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed

REG Reference to a national code

Ref country code: GB

Ref legal event code: IF02

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: SE

Payment date: 20141223

Year of fee payment: 20

Ref country code: GB

Payment date: 20141219

Year of fee payment: 20

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: NL

Payment date: 20141217

Year of fee payment: 20

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: IT

Payment date: 20141222

Year of fee payment: 20

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20141218

Year of fee payment: 20

REG Reference to a national code

Ref country code: DE

Ref legal event code: R071

Ref document number: 69604729

Country of ref document: DE

REG Reference to a national code

Ref country code: NL

Ref legal event code: MK

Effective date: 20160103

REG Reference to a national code

Ref country code: GB

Ref legal event code: PE20

Expiry date: 20160103

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION

Effective date: 20160103

REG Reference to a national code

Ref country code: SE

Ref legal event code: EUG