WO2002029786A1

WO2002029786A1 - Method and device for segmental coding of an audio signal

Info

Publication number: WO2002029786A1
Application number: PCT/FR2001/003060
Authority: WO
Inventors: Alain Le Guyader; Catherine Quinquis
Original assignee: France Telecom
Priority date: 2000-10-06
Filing date: 2001-10-04
Publication date: 2002-04-11
Also published as: FR2815160A1; AU2001293950A1; FR2815160B1

Abstract

An audio-frequency signal is quantized in a CELP-type encoder or the like using perceptual weighting. The synthesis filters receive, frame by frame, an excitation signal formed as the product of a waveform by a gain. The waveform is partitioned into segments of predefined size. This partitioning is used to carry out an efficient calculation of the excitation, thereby limiting the complexity of the encoder while remaining very close to the optimal solution.

Description

METHOD AND DEVICE FOR SEGMENTAL CODING OF AN AUDIO SIGNAL

The present invention relates to methods for coding audio-frequency signals. These methods find particular application in analysis-by-synthesis coders, of which the most widespread type is the CELP (“Code-Excited Linear Prediction”) coder.

Predictive coding techniques with analysis by synthesis are currently very widely used for coding speech in the telephone band (300-3400 Hz) at bit rates that can go down to 8 kbit/s while maintaining telephone quality.

The first CELP coder standardized at the ITU (International Telecommunication Union), in 1992, was G.728, which uses the LD-CELP (“Low-Delay CELP”) technique. This technique reduces the bit rate by a factor of 4 compared with PCM coding at 64 kbit/s. A general description of G.728 can be found in the article "A Low-Delay CELP coder for the CCITT 16 kb/s Speech Coding Standard" by J.-H. Chen et al., published in the IEEE Journal on Selected Areas in Communications, Vol. 10, No. 5, June 1992, pages 830-849.

A few years later, in 1995, the ITU standardized a CELP coder called G.729, operating at 8 kbit/s. At the decoder, this algorithm includes a synthesis filter comprising a short-term and a long-term predictor and receiving an excitation signal as input. In the coder, the short-term and long-term predictors are first calculated by linear-prediction methods. Once these predictors are known, the excitation signal can be calculated by the so-called analysis-by-synthesis method. The algorithms used in the G.729 coder are described in the article "Design and description of CS-ACELP: a toll quality 8 kb/s speech coder" by R. Salami et al., published in IEEE Trans. on Speech and Audio Processing, Vol. 6, No. 2, March 1998, pages 116-130.

Several CELP coding standards have, in their basic principle, an organization similar to that described in that article. Examples are the mobile-communication standards GSM 06.60 and GSM 06.90, the latter also called NB-AMR ("Narrowband Adaptive Multi-Rate"), published by ETSI ("European Telecommunications Standards Institute"), and annexes D and E of G.729.

In the CELP coders mentioned above, the excitation signal is modeled by a waveform taken from a dictionary CB (codebook) and brought to the correct scale by multiplication by a gain factor. The efficiency of CELP coding comes from using, in the analysis-by-synthesis process, a replica of the synthesis model of the decoder for the selection of the excitation signal: for each waveform of index i of the repertoire of excitation models, the synthetic signal is evaluated at the output of the synthesis model, and the excitation signal finally selected for transmission is the one that comes closest to the signal to be coded according to a perceptual error criterion.

By separating the contribution of the past ("zero-input response") from the synthesis with zero initial conditions ("zero-memory response"), the perceptual error criterion $E(i)$ to be minimized can be written from the time-domain signals, for $i = 0, 1, \ldots, NC-1$:

$$E(i) = \left\| t - g(i)\, H\, U_i \right\|^2 \tag{1}$$

where $t$ is the target vector of dimension $L$ to be modeled; $g(i)$ is the gain of the waveform; and $H$ is the filtering matrix associated with the filter $H(z)$ combining the perceptual weighting filter $W(z)$, the long-term synthesis filter $1/(1 - b z^{-T})$ and the short-term synthesis filter $1/A(z)$:

$$H(z) = \frac{W(z)}{\left(1 - b z^{-T}\right) A(z)} \tag{2}$$

$U_i$ is the waveform of index $i$ from the dictionary CB, called the repertoire of waveforms, and $NC$ is the number of waveforms in the dictionary. The filtering matrix $H$ can be expressed as a function of the impulse response $h_0, h_1, \ldots, h_{L-1}$, over $L$ samples, of the filter with transfer function $H(z)$:

$$H = \begin{bmatrix} h_0 & 0 & \cdots & 0 \\ h_1 & h_0 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ h_{L-1} & \cdots & h_1 & h_0 \end{bmatrix} \tag{3}$$

Equation (1) shows that the minimization of the CELP criterion amounts to a vector quantization with a weighted error criterion. Most current CELP coders include such a mechanism for selecting the excitation signal. The calculations carried out to arrive at this perceptual error criterion are described in the article "The processing of the voice signal" by P. Combescure et al., published in the Telecommunications Annals, vol. 50-1, 1995, pages 142-164. The dimension L of the dictionary waveforms varies from a few samples (5 for G.728) to a few tens (40 for G.729). The index among the NC dictionary waveforms is coded on a number of bits that commonly ranges from 4 to a few tens. If this number of bits exceeds about ten, focused search methods must be used, as is the case for the G.729 coder and the coders of the ACELP ("Algebraic CELP") family, in which the excitation vector is formed by a few non-zero pulses.
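As an illustration of the exhaustive search implied by equation (1), the following minimal sketch builds the lower-triangular Toeplitz filtering matrix from an impulse response and scans a small dictionary. It is written under assumptions: the function names (`filter_matrix`, `best_waveform`) and the toy data are hypothetical and do not come from any standard; the gain is taken at its optimal unquantized value.

```python
def filter_matrix(h):
    """Lower-triangular Toeplitz matrix: H[q][r] = h[q - r] for q >= r, else 0."""
    L = len(h)
    return [[h[q - r] if q >= r else 0.0 for r in range(L)] for q in range(L)]

def matvec(A, x):
    """Matrix times column vector."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def best_waveform(t, h, codebook):
    """Exhaustive search returning (index, gain, error) minimising |t - g.H.U_i|^2."""
    H = filter_matrix(h)
    best = None
    for i, U in enumerate(codebook):
        y = matvec(H, U)                    # filtered waveform H.U_i
        num, den = dot(t, y), dot(y, y)
        if den == 0.0:
            continue                        # an all-zero waveform cannot be scaled
        gain = num / den                    # unquantized optimal gain for this waveform
        err = dot(t, t) - num * num / den   # residual error at the optimal gain
        if best is None or err < best[2]:
            best = (i, gain, err)
    return best

# Toy example: the target is exactly 2 x (filtered waveform of index 1).
h = [1.0, 0.5, 0.25]
codebook = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, -1.0, 1.0]]
result = best_waveform([0.0, 2.0, 1.0], h, codebook)
```

The cost of this loop grows linearly with the number of waveforms, which is why the dictionary size is the limiting factor discussed in the text.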

Setting the derivative of the perceptual error criterion with respect to $g(i)$ to zero gives the optimal value of the gain:

$$g(i) = \frac{t^T H\, U_i}{U_i^T H^T H\, U_i} \tag{4}$$

Substituting this value of $g(i)$ into the criterion, the index $i_{opt}$ of the optimal waveform, to be transmitted to the decoder, is the one for which the following CELP criterion is minimum:

$$E(i) = t^T t - \frac{\left(t^T H\, U_i\right)^2}{U_i^T H^T H\, U_i} \tag{5}$$

NC is the number of vectors in the dictionary, and therefore the number of searches to perform. The optimal gain $g(i_{opt})$ corresponding to the optimal index is quantized and its quantized value β is transmitted. Minimizing $E(i)$ is equivalent to maximizing the term:

$$\mathrm{Crit}(i) = \frac{[\mathrm{Num}(i)]^2}{\mathrm{Den}(i)}, \qquad \mathrm{Num}(i) = t^T H\, U_i, \qquad \mathrm{Den}(i) = U_i^T H^T H\, U_i \tag{6}$$

To avoid divisions, it is also possible to run a search loop iterating on the indices i, for example initialized by $i = i_{opt} = 0$, each iteration i consisting in replacing $i_{opt}$ by i if

$$[\mathrm{Num}(i_{opt})]^2\, \mathrm{Den}(i) < [\mathrm{Num}(i)]^2\, \mathrm{Den}(i_{opt}) \tag{7}$$

and in storing the quantities $[\mathrm{Num}(i_{opt})]^2$ and $\mathrm{Den}(i_{opt})$ before incrementing i. This procedure leads to several similar search structures, examples of which can be found in the GSM and ITU standards mentioned above.

In practice, so that this method of selecting the excitation signal remains of reasonable complexity, special dictionaries are used whose organization reduces the complexity of finding the optimal waveform. Among these dictionaries one can cite the ACELP dictionaries, consisting of a small number of pulses per block of samples (for G.729, typically 5 non-zero pulses per block of 40 samples of excitation signal).
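The division-free comparison of relation (7) can be sketched as follows. This is a minimal illustration assuming that the numerators and denominators have already been computed and that every denominator is strictly positive; the function name is hypothetical.

```python
def search_without_division(nums, dens):
    """Return the index maximising nums[i]**2 / dens[i] by cross-multiplication only.

    Assumes dens[i] > 0 for every i, so that cross-multiplying keeps the inequality sign.
    """
    i_opt = 0
    n2_opt, d_opt = nums[0] ** 2, dens[0]       # stored [Num(i_opt)]^2 and Den(i_opt)
    for i in range(1, len(nums)):
        n2 = nums[i] ** 2
        if n2_opt * dens[i] < n2 * d_opt:       # relation (7)
            i_opt, n2_opt, d_opt = i, n2, dens[i]
    return i_opt

# Criterion values are 1.0, 4.5 and 4.0, so index 1 wins.
winner = search_without_division([1.0, -3.0, 2.0], [1.0, 2.0, 1.0])
```

Only two multiplications and one comparison are needed per candidate, and the sign of the numerator is irrelevant because it is squared.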

One can also cite the VSELP (“Vector Sum Excited Linear Prediction”) excitation, made up of Gray binary codes, which is used to model the excitation signal in the GSM 06.20 standard and whose principle is described in the article “Vector Sum Excited Linear Prediction (VSELP) speech coding at 8 kbps” by I. A. Gerson et al., published in the proceedings of the ICASSP conference, 1990, pages 461-464. Natural binary codes have also been used, as described in the article "A robust and fast CELP coder at 16 kbit/s" by A. Le Guyader et al., published in the journal Speech Communication, vol. 7, 1988, pages 217-226.

In the G.728 coder standardized at the ITU, the excitation signal is coded on 10 bits for each vector of 5 samples with the following distribution: 2 bits for the gain $g(i)$ and 8 bits for the excitation waveform $U_i$ (NC = 256). In this case, the gain takes 20% of the bit rate, which is substantial, and yet the gain is quantized on too few bits. The effects of this coarse quantization are offset by the richness of the dictionary, at the cost of a significant bit rate.

Another approach is to take longer blocks, for example 10 samples, and to quantize the gain properly on 3 or 4 bits. In this example, 16 bits remain to quantize the waveforms, so that $2^{16}$ analysis-by-synthesis searches have to be carried out for each vector as indicated above. Given that, with current components, a number of searches of no more than a few thousand is reasonable, it is not possible to afford $2^{16}$ searches, each requiring the calculation of the quantities Num(i) and Den(i) of relation (6).

When speech signals, possibly noisy, are to be modeled at bit rates close to 1 bit/sample, the conventional methods prove difficult to implement. Taking for example an excitation vector of length L = 20 with 3 bits for the gain and 16 bits for the waveforms, we are brought back to the previous case: the complexity of the $2^{16}$ searches is prohibitive with conventional methods.

In the coder described in US Patent 5,274,741, the excitation signal comes from several dictionaries whose waveforms consist of non-zero portions, followed and/or preceded by blocks of zero values. Defining the dictionaries in this way allows an identical search procedure to be used for all of them. The excitation signal is in effect seen as a sum of waveforms coming from different dictionaries, certain regions of which consist of zero values. One disadvantage is that ignoring the presence of the zero values in the dictionaries leads to unnecessary computation. In addition, the minimization is carried out over the whole length of the excitation signal, even where the dictionary contains blocks of zero values. In the sequential minimization process, these locations contribute significantly to the final error although the corresponding block of the excitation signal is not yet known, which is largely suboptimal.

In the coder described in US Patent 5,621,852, the excitation signal of a CELP coder is a sum of two waveforms from two dictionaries, to which a single gain is applied. The minimization is again carried out over the whole length of the excitation signal, which is that of the summed waveforms. There is no partitioning of the kind found in US Patent 5,274,741.

An object of the invention is to propose, for a coder of the CELP type or the like, a model of the excitation and processing operations of reasonable complexity that allow audio-frequency signals to be coded efficiently.

The invention thus proposes a method of coding an audio-frequency signal digitized in successive frames of L samples, comprising the following steps:

- adaptive determination of synthesis parameters defining synthesis filters;

- adaptive determination of excitation parameters defining an excitation signal to be applied to the synthesis filters to produce a synthetic signal representative of the audio signal, by minimizing the energy of an error signal resulting from the filtering of the difference between the audio signal and the synthetic signal by at least one perceptual weighting filter; and

- production of quantization values representative of the determined parameters,

in which the excitation parameters include, for at least some of the frames, a waveform of L samples associated with a gain, the excitation signal relating to the frame resulting from the multiplication of the waveform of L samples by the associated gain, the waveform being made up of a number K, greater than 1, of segments juxtaposed according to a determined partition of the frame and coming from respective dictionaries, the k-th segment being made up of $L_k$ samples for $1 \le k \le K$, the numbers $L_1$ to $L_K$ being integers whose sum is equal to L.

According to the invention, the determination of the excitation parameters for at least one signal frame comprises the following steps:

/a/ calculation of a target vector of L samples from the signal frame and from parameters determined for at least one previous frame;

/b/ calculation, over a length of L samples, of an impulse response $h_0, h_1, \ldots, h_{L-1}$ of a filter composed of the synthesis filters and a perceptual weighting filter;

/c/ precalculation of vectors $V_{k,k'}$ and of matrices $M_{k,k',k''}$ for integers k, k' and k'' such that $1 \le k'' \le k' \le k \le K$, with $V_{k,k'} = t_k^T H_{k,k'}$ and $M_{k,k',k''} = H_{k,k''}^T H_{k,k'}$, where $t_k$ denotes the k-th vector, of $L_k$ samples, obtained by subdividing the target vector according to said partition, $H_{k,k'}$ denotes the matrix of $L_k$ rows and $L_{k'}$ columns whose term located in the (q+1)-th row and the (r+1)-th column ($0 \le q < L_k$, $0 \le r < L_{k'}$) is $H_{k,k'}(q,r) = h_{q+\Lambda_k-r-\Lambda_{k'}}$ with, for $1 \le k \le K$, $\Lambda_k = \sum_{k'=1}^{k-1} L_{k'}$ and $H_{k,k}(q,r) = 0$ if $q < r$, and $(\cdot)^T$ denotes transposition;

/d/ selection of at least one segment from the dictionary relating to the first segment, by maximizing a criterion of the form:

$$\mathrm{Crit}_1(i1) = \frac{[\mathrm{Num}_1(i1)]^2}{\mathrm{Den}_1(i1)} \tag{8}$$

where:

$$\mathrm{Num}_1(i1) = V_{1,1}\, U1_{i1} \tag{9}$$

$$\mathrm{Den}_1(i1) = U1_{i1}^T\, M_{1,1,1}\, U1_{i1} \tag{10}$$

and $U1_{i1}$ denotes a segment of index i1 from the first dictionary,

and the following steps /e/ and /f/ for k ranging from 2 to K:

/e/ for each (k-1)-tuple of segments $U1_{i1[n]}, \ldots, U(k-1)_{i(k-1)[n]}$ previously selected, calculation of numbers FN[n] and FD[n] and of a vector C[n] of the form:

$$FN[n] = \mathrm{Num}_{k-1}(i1[n], \ldots, i(k-1)[n]) + \sum_{k'=1}^{k-1} V_{k,k'}\, Uk'_{ik'[n]} \tag{11}$$

$$FD[n] = \mathrm{Den}_{k-1}(i1[n], \ldots, i(k-1)[n]) + \sum_{k'=1}^{k-1} Uk'^{\,T}_{ik'[n]}\, M_{k,k',k'}\, Uk'_{ik'[n]} + 2 \sum_{k'=2}^{k-1} \sum_{k''=1}^{k'-1} Uk''^{\,T}_{ik''[n]}\, M_{k,k',k''}\, Uk'_{ik'[n]} \tag{12}$$

$$C[n] = 2 \sum_{k'=1}^{k-1} Uk'^{\,T}_{ik'[n]}\, M_{k,k,k'} \tag{13}$$

/f/ selection of at least one k-tuple of segments coming respectively from the k dictionaries relating to segments 1 to k, whose first k-1 segments $U1_{i1[n]}, \ldots, U(k-1)_{i(k-1)[n]}$ come from a previously selected (k-1)-tuple, by maximizing a criterion of the form:

$$\mathrm{Crit}_k(i1[n], \ldots, i(k-1)[n], ik) = \frac{[\mathrm{Num}_k(i1[n], \ldots, i(k-1)[n], ik)]^2}{\mathrm{Den}_k(i1[n], \ldots, i(k-1)[n], ik)} \tag{14}$$

where:

$$\mathrm{Num}_k(i1[n], \ldots, i(k-1)[n], ik) = FN[n] + V_{k,k}\, Uk_{ik} \tag{15}$$

$$\mathrm{Den}_k(i1[n], \ldots, i(k-1)[n], ik) = FD[n] + C[n]\, Uk_{ik} + Uk_{ik}^T\, M_{k,k,k}\, Uk_{ik} \tag{16}$$

and $Uk_{ik}$ denotes a segment of index ik from the k-th dictionary, the waveform retained for the frame consisting of a K-tuple of segments selected in step /f/ for k = K.
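The sequential search of steps /d/ to /f/ can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the method: indices are 0-based, the names (`segmental_search`, `n_best`) are hypothetical, and instead of storing the matrices $M_{k,k',k''}$ the sketch works on filtered vectors, so that FD[n], C[n].Uk and Uk^T.M.Uk appear as the energy of the past contribution, twice a cross product, and the energy of the filtered new segment. The quantities obtained are the same as in relations (11) to (16).

```python
def block(h, lam_k, L_k, lam_j, L_j):
    """Block H_{k,k'}: L_k rows, L_k' columns, entry h[q + lam_k - r - lam_j] (0 if index < 0)."""
    return [[h[q + lam_k - r - lam_j] if q + lam_k - r - lam_j >= 0 else 0.0
             for r in range(L_j)] for q in range(L_k)]

def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def segmental_search(t, h, dictionaries, n_best=2):
    """Keep the n_best partial waveforms after each segment; return (indices, gain)."""
    sizes = [len(d[0]) for d in dictionaries]
    lam = [sum(sizes[:k]) for k in range(len(sizes))]        # offsets Lambda_k
    cands = [((), 0.0, 0.0)]                                 # (indices, Num, Den)
    for k, dico in enumerate(dictionaries):
        t_k = t[lam[k]:lam[k] + sizes[k]]
        H_kk = block(h, lam[k], sizes[k], lam[k], sizes[k])
        filtered = [matvec(H_kk, U) for U in dico]           # done once, outside the loops
        scored = []
        for idx, num_prev, den_prev in cands:
            # contribution of the segments already chosen to the k-th block of rows
            past = [0.0] * sizes[k]
            for j, i_j in enumerate(idx):
                H_kj = block(h, lam[k], sizes[k], lam[j], sizes[j])
                past = [p + v for p, v in zip(past, matvec(H_kj, dictionaries[j][i_j]))]
            fn = num_prev + dot(t_k, past)                   # FN[n], relation (11)
            fd = den_prev + dot(past, past)                  # FD[n], relation (12)
            for i_k, z in enumerate(filtered):
                num = fn + dot(t_k, z)                       # relation (15)
                den = fd + 2.0 * dot(past, z) + dot(z, z)    # relation (16)
                if den > 0.0:
                    scored.append((num * num / den, idx + (i_k,), num, den))
        scored.sort(key=lambda s: s[0], reverse=True)
        cands = [(idx, num, den) for _, idx, num, den in scored[:n_best]]
    idx, num, den = cands[0]
    return idx, num / den

# Toy example: the target is 3 x the filtered waveform built from segments (2, 0).
h = [1.0, 0.5, 0.25, 0.125]
dictionaries = [[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
                [[1.0, -1.0], [0.0, 1.0], [1.0, 0.0]]]
indices, gain = segmental_search([3.0, 4.5, 5.25, -0.375], h, dictionaries, n_best=1)
```

With `n_best` equal to the product of the dictionary sizes the search becomes exhaustive; smaller values trade optimality for a number of criterion evaluations that grows with the sum, rather than the product, of the dictionary sizes.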

This method makes it possible to model the excitation signal by juxtaposed waveform segments, so that the overall waveform can be seen as a waveform partitioned into successive blocks.

Thanks to this partitioning, new methods of calculating the excitation can be adopted, with a complexity reduced to a reasonable level. The search for the excitation proceeds by sequential and progressive optimization over the length of the successive segments of the waveform.

The gain in complexity makes it possible to build large dictionaries by juxtaposing dictionaries of reasonable size (typically < $2^{10}$ searches), which results in better modeling of the excitation signal and therefore a higher listening quality.

In the case of structured dictionaries (binary, ACELP, ...), the method makes it possible to exploit the structure of the dictionaries to reduce the computational load of the implementation even further. The quantities needed inside the optimal waveform search loops are calculated by fast algorithms outside these loops. Generally, the calculations described below are arranged, in their optimal form, so that as much of the calculation as possible is done outside the search loops of the optimal waveform.

Structured dictionaries are often generated using transformation matrices.

In this case, for $1 \le k \le K$, each segment of $L_k$ samples of the dictionary relating to the k-th segment of the frame partition is the product of a predefined transformation matrix $F_k$ of $L_k$ rows and nck columns by a respective code of nck samples belonging to a repertoire generating the dictionary, nck being an integer smaller than $L_k$. The determination of the excitation parameters for at least one signal frame then comprises the following steps:

/a/ calculation of the target vector of L samples from the signal frame and from the parameters determined for at least one previous frame;

/b/ calculation, over a length of L samples, of the impulse response of the composite filter;

/c'/ precalculation of vectors $V_{k,k'}$ and of matrices $M_{k,k',k''}$ for integers k, k' and k'' such that $1 \le k'' \le k' \le k \le K$, with $V_{k,k'} = t_k^T H_{k,k'} F_{k'}$ and $M_{k,k',k''} = F_{k''}^T H_{k,k''}^T H_{k,k'} F_{k'}$;

/d'/ selection of at least one code from the repertoire relating to the first segment, by maximizing a criterion of the form:

$$\mathrm{Crit}_1(i1) = \frac{[\mathrm{Num}_1(i1)]^2}{\mathrm{Den}_1(i1)} \tag{8}$$

where:

$$\mathrm{Num}_1(i1) = V_{1,1}\, C1_{i1} \tag{17}$$

$$\mathrm{Den}_1(i1) = C1_{i1}^T\, M_{1,1,1}\, C1_{i1} \tag{18}$$

and $C1_{i1}$ denotes a code of index i1 from the first repertoire,

and the following steps /e'/ and /f'/ for k ranging from 2 to K:

/e'/ for each (k-1)-tuple of codes $C1_{i1[n]}, \ldots, C(k-1)_{i(k-1)[n]}$ previously selected, calculation of numbers FN[n] and FD[n] and of a vector C[n] of the form:

$$FN[n] = \mathrm{Num}_{k-1}(i1[n], i2[n], \ldots, i(k-1)[n]) + \sum_{k'=1}^{k-1} V_{k,k'}\, Ck'_{ik'[n]} \tag{19}$$

$$FD[n] = \mathrm{Den}_{k-1}(i1[n], i2[n], \ldots, i(k-1)[n]) + \sum_{k'=1}^{k-1} Ck'^{\,T}_{ik'[n]}\, M_{k,k',k'}\, Ck'_{ik'[n]} + 2 \sum_{k'=2}^{k-1} \sum_{k''=1}^{k'-1} Ck''^{\,T}_{ik''[n]}\, M_{k,k',k''}\, Ck'_{ik'[n]} \tag{20}$$

$$C[n] = 2 \sum_{k'=1}^{k-1} Ck'^{\,T}_{ik'[n]}\, M_{k,k,k'} \tag{21}$$

/f'/ selection of at least one k-tuple of codes coming respectively from the k repertoires relating to segments 1 to k, whose first k-1 codes $C1_{i1[n]}, \ldots, C(k-1)_{i(k-1)[n]}$ come from a previously selected (k-1)-tuple, by maximizing a criterion of the form:

$$\mathrm{Crit}_k(i1[n], \ldots, i(k-1)[n], ik) = \frac{[\mathrm{Num}_k(i1[n], \ldots, i(k-1)[n], ik)]^2}{\mathrm{Den}_k(i1[n], \ldots, i(k-1)[n], ik)} \tag{14}$$

where:

$$\mathrm{Num}_k(i1[n], \ldots, i(k-1)[n], ik) = FN[n] + V_{k,k}\, Ck_{ik} \tag{22}$$

$$\mathrm{Den}_k(i1[n], \ldots, i(k-1)[n], ik) = FD[n] + C[n]\, Ck_{ik} + Ck_{ik}^T\, M_{k,k,k}\, Ck_{ik} \tag{23}$$

and $Ck_{ik}$ denotes a code of index ik from the k-th repertoire, the waveform retained for the frame consisting of a K-tuple of segments respectively obtained from a K-tuple of codes selected in step /f'/ for k = K.
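The benefit of folding the transformation matrix into the precalculated quantities can be illustrated for a single segment. In the minimal sketch below, assuming hypothetical helper names and toy values, `precompute` forms $V = t^T H F$ and $M = F^T H^T H F$ once, after which the criterion for any code only involves vectors and matrices of the (smaller) code dimension.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][q] * B[q][j] for q in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def precompute(t_k, H, F):
    """Return V = t^T.H.F (row vector) and M = F^T.H^T.H.F for one segment."""
    HF = matmul(H, F)                                        # L_k rows, nck columns
    V = [sum(t_k[q] * HF[q][c] for q in range(len(HF))) for c in range(len(F[0]))]
    M = matmul(transpose(HF), HF)                            # nck x nck
    return V, M

def criterion(V, M, code):
    """[Num]^2 / Den computed directly in the code domain, as in relations (17) and (18)."""
    num = sum(v * c for v, c in zip(V, code))
    den = sum(code[r] * M[r][c] * code[c]
              for r in range(len(code)) for c in range(len(code)))
    return num * num / den

# Toy example: 3-sample segment generated from 2-sample codes.
H = [[1.0, 0.0, 0.0], [0.5, 1.0, 0.0], [0.25, 0.5, 1.0]]
F = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
V, M = precompute([1.0, 2.0, 3.0], H, F)
value = criterion(V, M, [1.0, -1.0])
```

The value obtained is the same as when the segment F.C is first expanded to $L_k$ samples and then filtered, but the per-code work no longer depends on $L_k$.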

Another aspect of the present invention relates to an audio-frequency signal coder comprising means suitable for implementing a method as defined above.

Other features and advantages of the present invention will appear in the following description of non-limiting exemplary embodiments, with reference to the appended drawings, in which:

- Figures 1 and 2 are block diagrams of a CELP decoder and a CELP coder capable of implementing the invention;

- Figures 3 and 4 are flowcharts of procedures for searching for the excitation vector that can be used in a coder according to Figure 2.

The invention is described below in its application to a CELP-type speech coder. The speech synthesis process implemented in a CELP coder and decoder is illustrated in Figure 1.

An excitation generator 10 delivers an excitation waveform $U_{i1,i2,\ldots,iK}$ of L samples, denoted $U_{i1,i2,\ldots,iK}(n)$, belonging to a predetermined repertoire, in response to an index produced by the coder. In the present case, this index is a composite index formed by K integers i1, i2, ..., iK ($K \ge 2$), the repertoire being constituted by juxtaposing, block by block, K predetermined dictionaries CB1, CB2, ..., CBK. The excitation waveform $U_{i1,i2,\ldots,iK}$ is thus decomposed into respective segments $U1_{i1}, U2_{i2}, \ldots, UK_{iK}$ of $L_1, L_2, \ldots, L_K$ samples belonging to the K elementary dictionaries ($L_1 + L_2 + \ldots + L_K = L$):

$$U_{i1,i2,\ldots,iK} = \begin{bmatrix} U1_{i1} \\ U2_{i2} \\ \vdots \\ UK_{iK} \end{bmatrix} \tag{24}$$

with, for $1 \le k \le K$:

$$Uk_{ik} = \left[\, Uk_{ik}(0),\ Uk_{ik}(1),\ \ldots,\ Uk_{ik}(L_k - 1) \,\right]^T \tag{25}$$

The K dictionaries CB1, ..., CBK from which the segments $U1_{i1}, U2_{i2}, \ldots, UK_{iK}$ are extracted can be of different dimensions ($L_k \ne L_{k'}$ for $k \ne k'$).

The dictionary CBk ($1 \le k \le K$) consists of NCk distinct segments indexed by the integer ik with $1 \le ik \le NCk$. In general, NCk will be an integer power of 2 ($NCk = 2^{nck}$), and the transmission of the composite index will therefore require nc1 + nc2 + ... + ncK bits per excitation waveform, the complete repertoire comprising NC = NC1 × NC2 × ... × NCK waveforms.
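The relationship between the dictionary sizes, the number of bits of the composite index and the size of the complete repertoire can be checked with a few lines. This is a minimal sketch; the function name is hypothetical and each NCk is assumed to be a power of two, as in the general case mentioned above.

```python
def composite_index_size(dictionary_sizes):
    """Return (bits, NC) for dictionaries of sizes NC1..NCK, each a power of two."""
    bits, total = 0, 1
    for nc in dictionary_sizes:
        nck = nc.bit_length() - 1              # log2 of NCk when NCk is a power of two
        if nc < 1 or (1 << nck) != nc:
            raise ValueError("dictionary size must be a power of two")
        bits += nck                            # nc1 + nc2 + ... + ncK
        total *= nc                            # NC1 x NC2 x ... x NCK
    return bits, total

# Two dictionaries of 256 segments: a 16-bit composite index addresses 65536 waveforms.
bits, total = composite_index_size([256, 256])
```

The bit count adds up across dictionaries while the number of addressable waveforms multiplies, which is what makes a juxtaposition of moderate dictionaries equivalent to one very large repertoire.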

An amplifier 12 multiplies the waveform $U_{i1,i2,\ldots,iK}$ by an excitation gain β, and the resulting signal is subjected to a long-term synthesis filter 14. The output signal u of the filter 14 is in turn subjected to a short-term synthesis filter 16, the output of which constitutes what is considered here as the synthesized speech signal. Of course, other filters can also be implemented at the decoder, for example post-filters, as is well known in the field of speech coding.

The aforementioned signals are digital signals represented for example by 16-bit words at a sampling rate $F_e$ equal for example to 8 kHz. The synthesis filters 14, 16 are generally purely recursive filters. The long-term synthesis filter 14 typically has a transfer function of the form 1/B(z) with $B(z) = 1 - b z^{-T}$. The delay T and the gain b constitute long-term prediction (LTP) parameters which are determined adaptively by the coder. The LPC parameters of the short-term synthesis filter 16 are determined at the coder by a linear prediction of the speech signal. The transfer function of the filter 16 is thus of the form 1/A(z), where A(z) is a polynomial in $z^{-1}$ of order p in the case of a linear prediction of order p (typically p ≈ 10), $a_i$ representing the i-th linear prediction coefficient.

The signal u(n) applied to the short-term synthesis filter 16 comprises an LTP component b·u(n-T) and a residual component, or innovation sequence, $\beta\, U_{i1,i2,\ldots,iK}(n)$. In an analysis-by-synthesis coder, the parameters characterizing the residual component and, optionally, the LTP component are evaluated in a closed loop, using a perceptual weighting filter.
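The decoder chain just described (amplifier, long-term filter, short-term filter) can be sketched sample by sample. This is a minimal illustration under assumptions that the text does not settle: the sign convention $A(z) = 1 - \sum_i a_i z^{-i}$ is assumed, the delay T is an integer, and the function name and the way filter memories are passed in are hypothetical.

```python
def synthesize(waveform, beta, b, T, a, u_hist, s_hist):
    """Decode one sub-frame.

    u(n) = beta * U(n) + b * u(n - T)            (long-term synthesis, 1 / (1 - b z^-T))
    s(n) = u(n) + sum_i a[i-1] * s(n - i)        (short-term synthesis, assumed sign convention)
    u_hist and s_hist hold past samples, most recent last, with at least T and len(a) entries.
    """
    u, s, out = list(u_hist), list(s_hist), []
    for x in waveform:
        u_n = beta * x + b * u[-T]
        s_n = u_n + sum(a_i * s[-i] for i, a_i in enumerate(a, start=1))
        u.append(u_n)
        s.append(s_n)
        out.append(s_n)
    return out

# Single pulse, gain 2, pitch delay of 2 samples, first-order short-term predictor.
decoded = synthesize([1.0, 0.0, 0.0, 0.0], beta=2.0, b=0.5, T=2, a=[0.5],
                     u_hist=[0.0, 0.0], s_hist=[0.0])
```

Because the histories are extended as the loop advances, delays shorter than the sub-frame length are handled by the recursion itself in this sketch.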

Figure 2 shows the diagram of a CELP coder. The speech signal s(n) is a digital signal, for example supplied by an analog-to-digital converter 20 processing the amplified and filtered output signal of a microphone 22. The signal s(n) is digitized in successive frames of $N_T$ samples, themselves divided into sub-frames, or excitation frames, of L samples (for example $N_T$ = 320, L = 32). The LPC, LTP and EXC parameters (indices i1, i2, ..., iK and excitation gain β) are obtained at the coder by three respective analysis modules 24, 26, 28. These parameters are then quantized in a known manner for efficient digital transmission, then supplied to a multiplexer 30 which forms the output signal of the coder. These parameters are also supplied to a module 32 for calculating the initial states of certain filters of the coder. This module 32 essentially comprises a decoding chain such as that shown in Figure 1. Like the decoder, the module 32 operates on the basis of the quantized LPC, LTP and EXC parameters. If an interpolation of the LPC parameters is carried out at the decoder, as is common, the same interpolation is carried out by the module 32. In the absence of transmission errors, the module 32 makes it possible to know, at the coder, the previous states of the synthesis filters 14, 16 of the decoder, determined as a function of the synthesis and excitation parameters prior to the sub-frame considered.

In a first step of the coding process, the short-term analysis module 24 determines the LPC parameters (coefficients $a_i$ of the short-term synthesis filter) by analyzing the short-term correlations of the speech signal s(n). This determination is made for example once per frame of $N_T$ samples, so as to adapt to the evolution of the spectral content of the speech signal. LPC analysis methods are well known in the art. Reference may be made, for example, to the book “Digital Processing of Speech Signals” by L. R. Rabiner and R. W. Schafer, Prentice-Hall Int., 1978. The quantization of the LPC parameters can be performed on the coefficients $a_i$ directly, or on other parameters to which they are linked, such as reflection coefficients, LAR ("log-area ratios") or line spectrum parameters (LSP for "line spectrum pairs", or LSF for "line spectrum frequencies"), as is common in audio coding techniques. An example of module 24 is described in standard G.729 published by the ITU.

The next step in coding is to determine the LTP parameters for long-term prediction. These are for example determined once per sub-frame of L samples. A subtractor 34 subtracts from the speech signal s(n) the response of the short-term synthesis filter 16 to a zero input signal. This response is determined by a filter 36 of transfer function 1/A(z) whose coefficients are given by the LPC parameters determined by the module 24, and whose initial states are provided by the module 32 so as to correspond to the $M_p$ last samples of the synthetic signal. The output signal of the subtractor 34 is subjected to a perceptual weighting filter 38 whose role is to emphasize the portions of the spectrum where errors are most perceptible, that is to say the inter-formant regions.

The transfer function W(z) of the perceptual weighting filter is of the general form $W(z) = A_N(z)/A_P(z)$, where $A_N(z)$ and $A_P(z)$ are transfer functions of finite impulse response (FIR) type, of order $M_p$, calculated from A(z), for example $A_N(z) = A(z/\gamma_1)$ and $A_P(z) = A(z/\gamma_2)$, $\gamma_1$ and $\gamma_2$ being two coefficients such that $0 < \gamma_2 < \gamma_1 < 1$ ($M_p = p$).
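Replacing z by z/γ in A(z) simply scales the i-th coefficient by γ^i, which makes the weighting filter inexpensive to derive from the LPC parameters. The following minimal sketch illustrates this under assumptions: the sign convention $A(z) = 1 - \sum_i a_i z^{-i}$ is assumed, the filter starts from zero initial states, and the function names are hypothetical.

```python
def expand(a, gamma):
    """Coefficients of A(z / gamma): the i-th coefficient a_i is scaled by gamma**i."""
    return [a_i * gamma ** i for i, a_i in enumerate(a, start=1)]

def perceptual_weighting(x, a, gamma1, gamma2):
    """Filter x through W(z) = A(z/gamma1) / A(z/gamma2), zero initial states.

    With the assumed convention A(z) = 1 - sum_i a_i z^-i this gives
    y(n) = x(n) - sum_i a_i gamma1^i x(n-i) + sum_i a_i gamma2^i y(n-i).
    """
    num, den = expand(a, gamma1), expand(a, gamma2)
    y = []
    for n in range(len(x)):
        acc = x[n]
        for i in range(1, len(a) + 1):
            if n - i >= 0:
                acc -= num[i - 1] * x[n - i]      # FIR part A(z/gamma1)
                acc += den[i - 1] * y[n - i]      # recursive part 1/A(z/gamma2)
        y.append(acc)
    return y

# Impulse response of W(z) for a first-order predictor.
w = perceptual_weighting([1.0, 0.0, 0.0], a=[1.0], gamma1=0.5, gamma2=0.25)
```

Applying the same routine to a unit impulse, as above, gives the first samples of the impulse response of W(z).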

The closed-loop LTP analysis performed by the module 26 consists, in a conventional manner, in selecting for each sub-frame the delay T which maximizes the normalized correlation:

where x'(n) denotes the output signal of the filter 38 during the sub-frame considered, and $y_T(n)$ denotes the convolution product $s_{LTP}(n-T) * h'(n)$. In the above expression, h'(0), h'(1), ..., h'(L-1) denotes the impulse response of the weighted synthesis filter, with transfer function W(z)/A(z). This impulse response h' is obtained by a module 40 for calculating impulse responses, as a function of the LPC parameters determined for the sub-frame, where appropriate after quantization and interpolation. The samples $s_{LTP}(n-T)$ are the previous states of the long-term synthesis filter 14, provided by the module 32. For delays T shorter than the length of a sub-frame, the missing samples $s_{LTP}(n-T)$ are obtained by interpolation on the basis of previous samples, or from the residual speech signal. The delays T, integer or fractional, are selected in a specific window. To reduce the closed-loop search range, and therefore the number of convolutions $y_T(n)$ to calculate, an open-loop delay T' can first be determined, for example once per frame, the closed-loop delays then being selected for each sub-frame in a reduced interval around T'. The open-loop search consists more simply in determining the delay T' which maximizes the autocorrelation of the speech signal s(n), possibly filtered by the inverse filter with transfer function A(z). Once the delay T has been determined, the long-term prediction gain $g_{LTP}$ is obtained by:

This gain $g_{LTP}$ is quantized, and its quantized value b is transmitted to the decoder in the LTP parameters.
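The closed-loop delay search can be sketched as follows. The expressions of the normalized correlation and of the gain are not reproduced in the text above, so this sketch assumes the usual forms, namely the squared correlation between x'(n) and $y_T(n)$ divided by the energy of $y_T(n)$, and a gain equal to that correlation divided by that energy. It is further restricted to integer delays at least as long as the sub-frame, and all names are hypothetical.

```python
def ltp_search(x, past, h, delays):
    """Closed-loop LTP search over integer delays T >= len(x).

    x      weighted target x'(n), L samples
    past   previous excitation samples, most recent last
    h      impulse response h'(n) of the weighted synthesis filter, L samples
    Returns (T, gain) under the assumed criterion (sum x'.y_T)^2 / sum y_T^2.
    """
    L = len(x)
    best = None
    for T in delays:
        seg = [past[len(past) - T + n] for n in range(L)]            # samples at n - T
        y = [sum(h[j] * seg[n - j] for j in range(n + 1)) for n in range(L)]
        num = sum(a * b for a, b in zip(x, y))
        den = sum(v * v for v in y)
        # comparison of num^2/den against the best so far, without division
        if den > 0.0 and (best is None or num * num * best[2] > best[1] ** 2 * den):
            best = (T, num, den)
    T, num, den = best
    return T, num / den

# Toy example: the target is 0.5 x the filtered excitation delayed by 4 samples.
delay, g_ltp = ltp_search([-0.5, 0.0], [1.0, 2.0, -1.0, 0.5, 3.0, 0.0], [1.0, 0.5], [2, 3, 4])
```

Restricting the candidate delays to a short interval around an open-loop estimate, as described above, reduces the number of convolutions in proportion.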

To find the CELP excitation relating to a sub-frame, the signal $b\, y_T(n)$, which has been calculated by the module 26 for the optimal delay T, is first subtracted from the signal x'(n) by the subtractor 42. The resulting signal t(n) is called the target, and is denoted vectorially t in relation (1). The impulse response $h_0, h_1, \ldots, h_{L-1}$ of the filter composed of the synthesis filters and the perceptual weighting filter (cf. relations (2) and (3)) is calculated by the module 40.

As a function of this impulse response and of the target t, the CELP excitation search module 28 determines the EXC parameters which will be communicated to the decoder, that is to say the composite index i1, i2, ..., iK and the quantized excitation gain β.

With reference to FIG. 1, the CELP decoder comprises a demultiplexer 8 receiving the bit stream from the coder. The quantized values of the excitation parameters EXC and of the synthesis parameters LTP and LPC are supplied to the generator 10, to the amplifier 12 and to the filters 14, 16 to reconstruct the synthetic signal s, which can for example be converted into analog by the converter 18 before being amplified and then applied to a loudspeaker 19 to restore the original speech.

We will now describe the operation of the module 28 of the coder of FIG. 2.

The error criterion to be minimized, E(i1, i2, ..., iK), is expressed as a function of the composite index i1, i2, ..., iK and the associated gain g(i1, i2, ..., iK):

$$E(i1, i2, \ldots, iK) = \left\| t - g(i1, i2, \ldots, iK)\, H\, U_{i1,i2,\ldots,iK} \right\|^2$$

where the target vector $t = [\, t_1^T,\ t_2^T,\ \ldots,\ t_K^T \,]^T$ has been decomposed according to the same partition as the excitation waveform $U_{i1,i2,\ldots,iK}$ of relation (24), that is to say into K vectors $t_k$ of $L_k$ samples. The same applies to the impulse response matrix H:

$$H = \begin{bmatrix} H_{1,1} & 0 & \cdots & 0 \\ H_{2,1} & H_{2,2} & \ddots & \vdots \\ \vdots & \vdots & \ddots & 0 \\ H_{K,1} & H_{K,2} & \cdots & H_{K,K} \end{bmatrix}, \qquad H_{k,k'} = \begin{bmatrix} H_{k,k'}(0,0) & \cdots & H_{k,k'}(0, L_{k'}-1) \\ H_{k,k'}(1,0) & \cdots & H_{k,k'}(1, L_{k'}-1) \\ \vdots & & \vdots \\ H_{k,k'}(L_k-1, 0) & \cdots & H_{k,k'}(L_k-1, L_{k'}-1) \end{bmatrix}$$

$H_{k,k'}$ being a matrix of $L_k$ rows and $L_{k'}$ columns whose term located in the (q+1)-th row and the (r+1)-th column ($0 \le q < L_k$ and $0 \le r < L_{k'}$) is $H_{k,k'}(q,r) = h_{q+\Lambda_k-r-\Lambda_{k'}}$, with $H_{k,k}(q,r) = 0$ if $q < r$.

Note that the filtering matrix H is a particular matrix, since the terms on each parallel to the main diagonal are equal: it is a so-called Toeplitz matrix. In addition, the upper part of the matrix H is zero. These remarks lead to the following equalities:

for $1 \le k < k' \le K$:

$$H_{k,k'} = (0) \tag{31}$$

for $1 \le k \le K$ and $1 \le i \le K-k$ (when the blocks are the same size):

$$H_{k+i,\,i+1} = H_{k,1} \tag{32}$$

which the module 28 takes advantage of to accelerate the calculations necessary for determining the optimal indices to be transmitted.
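Properties (31) and (32) can be checked numerically on a small example. The sketch below, with a hypothetical helper name and arbitrary values, extracts blocks of the lower-triangular Toeplitz matrix using 1-based block indices as in the text.

```python
def block(h, sizes, k, j):
    """Block H_{k,j} (1-based k and j) of the lower-triangular Toeplitz matrix built from h."""
    lam_k, lam_j = sum(sizes[:k - 1]), sum(sizes[:j - 1])      # offsets Lambda_k, Lambda_j
    return [[h[q + lam_k - r - lam_j] if q + lam_k - r - lam_j >= 0 else 0.0
             for r in range(sizes[j - 1])] for q in range(sizes[k - 1])]

h = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
sizes = [2, 2, 2]                      # K = 3 blocks of equal size

upper = block(h, sizes, 1, 2)          # relation (31): zero above the block diagonal
same_a = block(h, sizes, 3, 2)         # relation (32) with k = 2, i = 1
same_b = block(h, sizes, 2, 1)
```

With equal block sizes, only the K blocks of the first block column need to be formed; every other block is either zero or a copy of one of them.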

The optimal gain g(i1, i2, ..., iK) is obtained by setting to zero the derivative of E(i1, i2, ..., iK) with respect to the gain:

$$g(i1, i2, \ldots, iK) = \frac{\mathrm{Num}(i1, i2, \ldots, iK)}{\mathrm{Den}(i1, i2, \ldots, iK)} \tag{33}$$

with:

$$\mathrm{Num}(i1, i2, \ldots, iK) = \sum_{k=1}^{K} \sum_{k'=1}^{k} t_k^T\, H_{k,k'}\, Uk'_{ik'} \tag{34}$$

$$\mathrm{Den}(i1, i2, \ldots, iK) = \sum_{k=1}^{K} \left\| \sum_{k'=1}^{k} H_{k,k'}\, Uk'_{ik'} \right\|^2 \tag{35}$$

The indices of the optimal waveforms $ik_{opt}$ ($1 \le k \le K$) are obtained when the error criterion E(i1, i2, ..., iK) is minimal or, equivalently, by maximizing:

$$\mathrm{Crit}(i1, i2, \ldots, iK) = \frac{[\mathrm{Num}(i1, i2, \ldots, iK)]^2}{\mathrm{Den}(i1, i2, \ldots, iK)} \tag{36}$$

In this case, the criterion Crit(i1, i2, ..., iK) is maximized over the whole set of NC = NC1 × NC2 × ... × NCK waveforms, which forces the values NCk to correspond to no more than a few bits each if the complexity is to remain reasonable for implementation on current signal processors.

However, for an implementation of minimal complexity, the module 28 can take advantage of the fact that it is not necessary to recalculate everything inside the search loops over the waveforms.

In what follows, we describe an optimal search process for the CELP excitation in the case K = 2, in order to lighten the notation.

In the equations giving Num(i1, i2) and Den(i1, i2), all the quantities that do not depend on the indices i1 and i2 can be calculated once and for all outside the search loops. This is highlighted by rewriting Num(i1, i2) and Den(i1, i2):

$$\mathrm{Num}(i1, i2) = (V_{1,1} + V_{2,1})\, U1_{i1} + V_{2,2}\, U2_{i2} \tag{37}$$

$$\mathrm{Den}(i1, i2) = U1_{i1}^T\, (M_{1,1,1} + M_{2,1,1})\, U1_{i1} + 2\, U1_{i1}^T\, M_{2,2,1}\, U2_{i2} + U2_{i2}^T\, M_{2,2,2}\, U2_{i2} \tag{38}$$

with:

$$V_{1,1} = t_1^T\, H_{1,1} \tag{39}$$

$$V_{2,1} = t_2^T\, H_{2,1} \tag{40}$$

$$V_{2,2} = t_2^T\, H_{2,2} \tag{41}$$

$$M_{1,1,1} = H_{1,1}^T\, H_{1,1} \tag{42}$$

$$M_{2,1,1} = H_{2,1}^T\, H_{2,1} \tag{43}$$

$$M_{2,2,1} = H_{2,1}^T\, H_{2,2} \tag{44}$$

$$M_{2,2,2} = H_{2,2}^T\, H_{2,2} \tag{45}$$

The row vectors $(V_{1,1} + V_{2,1})$ and $V_{2,2}$ and the matrices $(M_{1,1,1} + M_{2,1,1})$, $(2\, M_{2,2,1})$ and $M_{2,2,2}$ are calculated once and for all before proceeding to the search loop. In addition, the only scalar product in relations (37) and (38) that has to be calculated NC1 × NC2 times is the second term of Den(i1, i2), the others being calculable in a loop over a single index.

The flowchart in Figure 3 describes the sequencing of the calculations for the simultaneous (optimal) search for the indices i1 and i2 in the two dictionaries CB1 and CB2.

In the preliminary step 50, the vectors $(V_{1,1} + V_{2,1})$ and $V_{2,2}$ and the matrices $(M_{1,1,1} + M_{2,1,1})$, $(2 \cdot M_{2,2,1})$ and $M_{2,2,2}$ are precalculated according to relations (39) to (45) as a function of the impulse response $h_0, h_1, \ldots, h_{L-1}$ and of the target t, and one of the indices (i1 in the illustration) is initialized to 1. In the following loop (step 51), the terms

$$P1(i1) = (V_{1,1} + V_{2,1}) \cdot U1_{i1} \quad \text{and} \quad D1(i1) = U1_{i1}^T \cdot (M_{1,1,1} + M_{2,1,1}) \cdot U1_{i1}$$

are calculated, as well as the row vector

$$C(i1) = 2 \cdot U1_{i1}^T \cdot M_{2,2,1}$$

This step 51 is executed for each value of i1 between 1 and NC1 (steps 52-53). When i1 = NC1 in test 52, the module 28 begins the search for the optimal waveform in step 54: the index i2 is initialized to 1 and the numbers N_opt and D_opt are initialized so as to represent an arbitrarily small quotient N_opt / D_opt.

This search includes a double loop on the indices i1 and i2. In the outer loop, in step 55, the terms

$$P2(i2) = V_{2,2} \cdot U2_{i2} \quad \text{and} \quad D2(i2) = U2_{i2}^T \cdot M_{2,2,2} \cdot U2_{i2}$$

are calculated, and the index i1 is set to 1 to initialize the inner loop. The inner loop begins at step 56 with the calculation of the cross term $D3 = C(i1) \cdot U2_{i2}$. In the next step 57, the numerator NNum and the denominator Den of the criterion to be maximized are calculated according to

$$NNum = [P1(i1) + P2(i2)]^2 \quad \text{and} \quad Den = D1(i1) + D2(i2) + D3,$$

then this criterion is compared to the previous optimum, without performing a division, in step 58. If N_opt.Den < D_opt.NNum, the current indices i1 and i2 are assigned to the indices i1_opt and i2_opt in step 59, and the numbers N_opt and D_opt are updated accordingly. Then, or if N_opt.Den ≥ D_opt.NNum, the test 60 for the end of the inner loop (on i1) is carried out. Incrementing i1 in step 61 returns to step 56 for examining the next segment of the dictionary CB1. When i1 = NC1 in test 60, the test 62 for the end of the outer loop (on i2) is carried out.

Incrementing i2 in step 63 returns to step 55 for examining the next segment of the dictionary CB2.

The search is finished when i2 = NC2 in test 62. The module 28 then determines the gain in step 64 according to the relation (33), that is to say

$$g(i1_{opt}, i2_{opt}) = \frac{P1(i1_{opt}) + P2(i2_{opt})}{D_{opt}}$$
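The sequencing of FIG. 3 can be sketched in pure Python as follows. This is only an illustration under stated assumptions: the function `exhaustive_search` and its helpers are hypothetical names, indices are 0-based, the dictionaries are plain lists of segments, and the "arbitrarily small quotient" is represented by N_opt = -1, D_opt = 1.

```python
# Sketch of the exhaustive (optimal) search of FIG. 3 for K = 2.
# Names and 0-based indexing are assumptions made for this example.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(A, x):
    return [dot(row, x) for row in A]

def vecmat(x, A):
    return [sum(x[q] * A[q][r] for q in range(len(A)))
            for r in range(len(A[0]))]

def gram(A, B):                        # A^T . B
    return [[sum(A[q][i] * B[q][j] for q in range(len(A)))
             for j in range(len(B[0]))] for i in range(len(A[0]))]

def block(h, rows, cols, row_off, col_off):
    return [[h[q + row_off - r - col_off] if q + row_off >= r + col_off else 0.0
             for r in range(cols)] for q in range(rows)]

def exhaustive_search(h, t, CB1, CB2):
    L1, L2 = len(CB1[0]), len(CB2[0])
    t1, t2 = t[:L1], t[L1:L1 + L2]
    H11 = block(h, L1, L1, 0, 0)
    H21 = block(h, L2, L1, L1, 0)
    H22 = block(h, L2, L2, L1, L1)
    # Step 50: quantities independent of i1 and i2
    VA = [a + b for a, b in zip(vecmat(t1, H11), vecmat(t2, H21))]
    V22 = vecmat(t2, H22)
    MA = [[a + b for a, b in zip(ra, rb)]
          for ra, rb in zip(gram(H11, H11), gram(H21, H21))]
    MB = [[2.0 * x for x in row] for row in gram(H21, H22)]
    M222 = gram(H22, H22)
    # Steps 51-53: single-index terms for the first dictionary
    P1 = [dot(VA, u) for u in CB1]
    D1 = [dot(u, matvec(MA, u)) for u in CB1]
    C = [vecmat(u, MB) for u in CB1]
    # Step 54: arbitrarily small starting quotient N_opt / D_opt
    n_opt, d_opt, best = -1.0, 1.0, (0, 0)
    for i2, u2 in enumerate(CB2):                  # outer loop, step 55
        p2 = dot(V22, u2)
        d2 = dot(u2, matvec(M222, u2))
        for i1 in range(len(CB1)):                 # inner loop, steps 56-61
            d3 = dot(C[i1], u2)
            nnum = (P1[i1] + p2) ** 2
            den = D1[i1] + d2 + d3
            if n_opt * den < d_opt * nnum:         # test 58, no division
                n_opt, d_opt, best = nnum, den, (i1, i2)
    i1, i2 = best
    gain = (P1[i1] + dot(V22, CB2[i2])) / d_opt    # step 64
    return i1, i2, gain
```

The cross-multiplied test relies on both denominators being positive, which holds as soon as the first sample of the impulse response and the candidate segments are non-zero.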

When the dictionaries have a particular structure, their properties can also be used to speed up the calculations. In particular, the calculation of the terms P1 (i1), P2 (i2), D1 (i1), D2 (i2) and D3 can be accelerated by fast algorithms known to those skilled in the art in the case of binary dictionaries.

The procedure according to FIG. 3 requires a double loop on the indices i1 and i2 (K nested loops in the general case), which consumes a relatively large amount of computing power. The invention proposes slightly sub-optimal search methods which solve this problem with less complexity, of the order of the size of a dictionary.

We can thus proceed in K phases. FIG. 4 illustrates the particular case where K = 2. In the preliminary step 70, the row vectors $V_{k,k'} = t_k^T \cdot H_{k,k'}$ and the matrices $M_{k,k',k''} = H_{k,k''}^T \cdot H_{k,k'}$, which correspond to those of equations (39) to (45) when K = 2, are precalculated for $1 \le k'' \le k' \le k \le K$.

In the first phase 71, the N1 best candidates of the first dictionary CB1 are selected by maximizing the criterion Crit ₁ (i1) of the relation (8) on the first segment of the partitioning, N1 being a number greater than or equal to 1.

This relation (8) is identical to (6) except that it takes into account only the length of the first segment of the excitation waveform, that is to say L_1 samples. The scalar products therefore require significantly fewer multiplications.

The N1 indices i1 for which $\mathrm{Crit}_1(i1)$ takes the largest values are denoted i1[1] to i1[N1]. For each candidate i1[n] selected (1 ≤ n ≤ N1), the module 28 calculates, at step 72, the following quantities FN[n] and FD[n] and row vector C[n], from the numerator and denominator obtained according to relations (9) and (10) in the search phase 71 and from the corresponding waveform $U1_{i1[n]}$ of the dictionary CB1:

$$FN[n] = \mathrm{Num}_1(i1[n]) + V_{2,1} \cdot U1_{i1[n]} \qquad (46)$$

$$FD[n] = \mathrm{Den}_1(i1[n]) + U1_{i1[n]}^T \cdot M_{2,1,1} \cdot U1_{i1[n]} \qquad (47)$$

$$C[n] = U1_{i1[n]}^T \cdot (2 \cdot M_{2,2,1}) \qquad (48)$$

These relations (46) to (48) correspond to relations (11) to (13) for k = 2.

In a second phase, for each of the candidates n selected in the first phase 71 (1 ≤ n ≤ N1), the module 28 maximizes the criterion $\mathrm{Crit}_2(i1[n], i2)$ over all the waveforms of the second dictionary CB2:

$$\mathrm{Crit}_2(i1[n], i2) = \frac{[\mathrm{Num}_2(i1[n], i2)]^2}{\mathrm{Den}_2(i1[n], i2)} \qquad (49)$$

with:

$$\mathrm{Num}_2(i1[n], i2) = FN[n] + V_{2,2} \cdot U2_{i2} \qquad (50)$$

$$\mathrm{Den}_2(i1[n], i2) = FD[n] + C[n] \cdot U2_{i2} + U2_{i2}^T \cdot M_{2,2,2} \cdot U2_{i2} \qquad (51)$$

These relations (49) to (51) correspond to relations (14) to (16) for k = 2.

If K = 2, the module 28 selects the pair of indices i1[n], i2 such that $\mathrm{Crit}_2(i1[n], i2)$ is maximum. The corresponding gain g(i1[n], i2) is taken equal to

$$\frac{\mathrm{Num}_2(i1[n], i2)}{\mathrm{Den}_2(i1[n], i2)}$$

before being quantified in a manner known per se.

The second phase is represented by the double loop 73-83 in FIG. 4 (it is a simple loop if N1 = 1). At the start 73 of the outer loop, the variable Crit_opt is initialized to 0 and the index i2 of the second segment to 1. The terms $V_{2,2} \cdot U2_{i2}$ and $U2_{i2}^T \cdot M_{2,2,2} \cdot U2_{i2}$ of relations (50) and (51), which do not depend on the candidate n, are first calculated in step 74. This can be generalized to any K: the terms $V_{k,k} \cdot Uk_{ik}$ and $Uk_{ik}^T \cdot M_{k,k,k} \cdot Uk_{ik}$ of relations (15) and (16) can be calculated before examining the N(k-1) candidates ((k-1)-tuples) previously selected (2 ≤ k ≤ K), which avoids having to redo the matrix products in each iteration of the inner loop.

At the start 75 of the inner loop, the index n of the candidate for the first segment is initialized to 1. In each iteration of this inner loop, the module 28 calculates the terms Num_2 and Den_2 according to relations (50) and (51) in step 76, then the quotient $\mathrm{Crit} = [\mathrm{Num}_2]^2 / \mathrm{Den}_2$ of relation (49) in step 77. If $\mathrm{Crit}_2(i1[n], i2) > \mathrm{Crit}_{opt}$ (test 78), the current indices i1[n] and i2 are assigned to the indices i1_opt and i2_opt in step 79, and Crit_opt is replaced by $\mathrm{Crit} = \mathrm{Crit}_2(i1[n], i2)$. Then, or if $\mathrm{Crit}_2(i1[n], i2) \le \mathrm{Crit}_{opt}$, the test 80 for the end of the inner loop (on n) is performed. Incrementing n at step 81 returns to step 76 for examining the next candidate segment of the dictionary CB1. When n = N1 in test 80, the test 82 for the end of the outer loop (on i2) is carried out. Incrementing i2 at step 83 returns to step 75 for examining the next segment of the dictionary CB2. The search is finished when i2 = NC2 in test 82. The module 28 then determines the gain in step 84 according to the relation (33), that is to say

$$g(i1_{opt}, i2_{opt}) = \frac{\mathrm{Num}_2(i1_{opt}, i2_{opt})}{\mathrm{Den}_2(i1_{opt}, i2_{opt})}$$
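The two-phase search of FIG. 4 can be sketched along the following lines. This is a minimal illustration under assumptions: `two_phase_search` and its helpers are hypothetical names, indices are 0-based, and Crit_opt starts at -1 rather than 0 so that a pair is always returned even if every criterion is zero.

```python
# Sketch of the sub-optimal two-phase search of FIG. 4 (K = 2).
# Names, 0-based indices and the -1 initial value are assumptions.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(A, x):
    return [dot(row, x) for row in A]

def vecmat(x, A):
    return [sum(x[q] * A[q][r] for q in range(len(A)))
            for r in range(len(A[0]))]

def gram(A, B):                        # A^T . B
    return [[sum(A[q][i] * B[q][j] for q in range(len(A)))
             for j in range(len(B[0]))] for i in range(len(A[0]))]

def block(h, rows, cols, row_off, col_off):
    return [[h[q + row_off - r - col_off] if q + row_off >= r + col_off else 0.0
             for r in range(cols)] for q in range(rows)]

def two_phase_search(h, t, CB1, CB2, N1):
    L1, L2 = len(CB1[0]), len(CB2[0])
    t1, t2 = t[:L1], t[L1:L1 + L2]
    H11 = block(h, L1, L1, 0, 0)
    H21 = block(h, L2, L1, L1, 0)
    H22 = block(h, L2, L2, L1, L1)
    # Step 70: precalculated vectors and matrices
    V11, V21, V22 = vecmat(t1, H11), vecmat(t2, H21), vecmat(t2, H22)
    M111, M211 = gram(H11, H11), gram(H21, H21)
    M221, M222 = gram(H21, H22), gram(H22, H22)
    # Phase 1 (step 71): Crit_1 on the first segment only
    scored = []
    for i1, u in enumerate(CB1):
        num1 = dot(V11, u)
        den1 = dot(u, matvec(M111, u))
        scored.append((num1 * num1 / den1, i1, num1, den1))
    scored.sort(key=lambda s: s[0], reverse=True)
    cands = scored[:N1]
    # Step 72: relations (46) to (48) for each retained candidate
    FN = [num1 + dot(V21, CB1[i1]) for _, i1, num1, _ in cands]
    FD = [den1 + dot(CB1[i1], matvec(M211, CB1[i1])) for _, i1, _, den1 in cands]
    C = [[2.0 * x for x in vecmat(CB1[i1], M221)] for _, i1, _, _ in cands]
    # Phase 2: double loop 73-83
    crit_opt, best, gain = -1.0, (cands[0][1], 0), 0.0
    for i2, u2 in enumerate(CB2):
        p2 = dot(V22, u2)                          # step 74
        d2 = dot(u2, matvec(M222, u2))
        for n in range(len(cands)):
            num2 = FN[n] + p2                      # relation (50)
            den2 = FD[n] + dot(C[n], u2) + d2      # relation (51)
            crit = num2 * num2 / den2              # relation (49)
            if crit > crit_opt:
                crit_opt, best, gain = crit, (cands[n][1], i2), num2 / den2
    return best[0], best[1], gain
```

With N1 equal to the size of the first dictionary, the sketch examines every pair and should return the same optimum as an exhaustive search; smaller values of N1 trade some optimality for fewer criterion evaluations.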

If K > 2, the process is repeated. At the start of the k-th phase, with 1 < k ≤ K, the quantities FN[n] and FD[n] and the row vector C[n] are updated according to relations (11) to (13). During this k-th phase, the module 28 selects Nk k-tuples by maximizing the criterion of relation (14).

The search is finished when k = K (with NK = 1). The module 28 selects the K-tuple of indices such that $\mathrm{Crit}_K(i1[n], \ldots, i(K-1)[n], iK)$ is maximum. The corresponding gain is taken equal to

$$\frac{\mathrm{Num}_K(i1[n], i2[n], \ldots, i(K-1)[n], iK)}{\mathrm{Den}_K(i1[n], i2[n], \ldots, i(K-1)[n], iK)}$$

before being quantified in a manner known per se.

At each phase k (1 ≤ k ≤ K), the search provides one or more waveform segments from the dictionary CBk of size NCk, using scalar products of vectors of dimension L_k. The optimization therefore relates exclusively to the segment of rank k of the partitioning of the waveforms. There is no need to consider the influence of the segment k on the preceding segments, since it is zero owing to the Toeplitz structure of the matrix H. The influence of the preceding segments, on the other hand, is taken into account in phase k. The appreciable saving in computation obtained by not considering (during phase k) the influence of the segment k on the following segments leads to a solution which is only slightly sub-optimal, since each of these following segments will be searched later in a dictionary.
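The absence of influence of a segment on the preceding ones can be made explicit for K = 2. The block form below is a sketch that assumes the filtered waveform is written y = H.u, with H the lower-triangular Toeplitz matrix built from the impulse response and with the blocks $H_{1,1}$, $H_{2,1}$, $H_{2,2}$ used in relations (39) to (45); the symbols $y_1$, $y_2$ are introduced here only for the illustration.

```latex
\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}
=
\begin{pmatrix} H_{1,1} & 0 \\ H_{2,1} & H_{2,2} \end{pmatrix}
\begin{pmatrix} U1_{i1} \\ U2_{i2} \end{pmatrix}
\quad\Longrightarrow\quad
y_1 = H_{1,1} \cdot U1_{i1}, \qquad
y_2 = H_{2,1} \cdot U1_{i1} + H_{2,2} \cdot U2_{i2}
```

The upper-right block is zero, so $y_1$ does not depend on $U2_{i2}$, whereas $y_2$ depends on $U1_{i1}$ through $H_{2,1}$; this is the contribution carried by the terms added in relations (46) to (48).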

The algorithm illustrated in FIG. 4 performs divisions at the iterations of step 77. It will have been noted that the optimization can be carried out without division, by the same method as in FIG. 3 (test 58).

The arrangement of the calculations illustrated in FIG. 4 makes it possible to further reduce the calculation load when the dictionaries CB1, CB2, ... CBK have an algebraic structure of the ACELP or binary type.

Thus, the introduction of a transformation matrix makes it possible to modify the properties of the waveform dictionary. The transformation matrices can be optimized on a reference corpus or have an algebraic structure themselves. In the latter case, they can notably be used to generate codes with ternary values +1 / 0 / -1 indexed by binary codes. Starting from $NCk = 2^{nck}$ binary codes $Ck_{ik}$ describing all the possible combinations of nck samples with values ±1, we form a dictionary of NCk waveform segments $Uk_{ik}$ of length $L_k$ by applying a transformation matrix $F_k$ with $L_k$ rows and nck columns:

$$Uk_{ik} = F_k \cdot Ck_{ik} \qquad (52)$$

The expressions (9) to (13) and (15) to (16) are modified by replacing the segments $Uk_{ik}$ by the corresponding binary codes $Ck_{ik}$ (which gives respectively the relations (17) to (23)), after having replaced the row vectors $(t_k^T \cdot H_{k,k'})$ by $(t_k^T \cdot H_{k,k'} \cdot F_{k'})$ and the matrices $(H_{k,k''}^T \cdot H_{k,k'})$ by $(F_{k''}^T \cdot H_{k,k''}^T \cdot H_{k,k'} \cdot F_{k'})$, for $1 \le k'' \le k' \le k \le K$, in the calculations made once and for all in step 70 to obtain the vectors $V_{k,k'}$ and the matrices $M_{k,k',k''}$.
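The substitution can be checked directly from relation (52). The short derivation below is a sketch that only uses $Uk_{ik} = F_k \cdot Ck_{ik}$ and the definitions of the row vectors and matrices given for step 70; it shows why the same search relations apply to the codes once $F_k$ is absorbed into the precalculated quantities.

```latex
t_k^T \cdot H_{k,k'} \cdot Uk'_{ik'}
  = \left( t_k^T \cdot H_{k,k'} \cdot F_{k'} \right) \cdot Ck'_{ik'}

(Uk''_{ik''})^T \cdot H_{k,k''}^T \cdot H_{k,k'} \cdot Uk'_{ik'}
  = (Ck''_{ik''})^T \cdot \left( F_{k''}^T \cdot H_{k,k''}^T \cdot H_{k,k'} \cdot F_{k'} \right) \cdot Ck'_{ik'}
```

The scalar products are then taken over vectors of nck samples with values ±1 instead of $L_k$ samples.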

Given the binary nature of the codes $Ck_{ik}$, the calculation of the numerators $\mathrm{Num}_k(i1[n], \ldots, i(k-1)[n], ik)$ and denominators $\mathrm{Den}_k(i1[n], \ldots, i(k-1)[n], ik)$ can moreover be carried out by fast algorithms known in the field of digital signal processing.

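For example, waveforms of L = 20 samples can be partitioned into K = 2 segments of L_1 = L_2 = 10 samples each, with two identical dictionaries CB1 = CB2 composed of 2^8 = 256 waveforms generated by a transformation matrix F_1 = F_2 = F given by:

$$F = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix} \qquad (53)$$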
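The generation of such a dictionary from binary codes can be sketched as follows. The layout of F used here is an assumed reading of relation (53) (10 rows, 8 columns, one unit entry per column, rows 8 and 10 all zero), and the function name `build_dictionary` and the ordering of the codes are hypothetical choices made for the example.

```python
from itertools import product

# Assumed reading of the matrix F of relation (53): 10 rows, 8 columns.
F = [
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

def build_dictionary(F):
    """Return all +/-1 codes and the segments Uk = F . Ck of relation (52)."""
    nck = len(F[0])
    codes = list(product((1, -1), repeat=nck))     # 2**nck binary codes
    waves = [[sum(f * c for f, c in zip(row, code)) for row in F]
             for code in codes]
    return codes, waves

codes, CB = build_dictionary(F)
```

Under this reading, each of the 256 segments has 10 samples with values +1, 0 or -1, the zero samples falling at positions 8 and 10.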

The dictionary structure can then be used to obtain the scalar products with a minimal computational load. The methods that can be used are described in particular in the references: "A robust and fast CELP coder at 16 kbit/s", by A. Le Guyader, D. Massaloux and J.F. Zurcher, published in the journal Speech Communication, Vol. 7, 1988, pages 217-226; and "Vector Sum Excited Linear Prediction (VSELP) speech coding at 8 kbps" by I.A. Gerson and M.A. Jasiuk, published in the proceedings of the ICASSP conference, 1990, pages 461-464.

In this example, satisfactory results have been obtained by choosing as candidates the N1 = 3 dictionary indices of the first partition which give the lowest error criterion. In the tests which were carried out, the speech signal was subjected to a voiced / unvoiced detector. According to the decision of this detector, each frame was then subjected to a search of the ACELP type for voiced frames, and to a search in the transformed binary dictionary defined by the matrix F of (53) for unvoiced frames.
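For this example, the number of criterion evaluations can be compared as follows. The count is a rough sketch: it considers only how many times a criterion is evaluated, not the cost of each evaluation, and assumes NC1 = NC2 = 256 and N1 = 3 as in the example.

```latex
\text{exhaustive search (FIG. 3):}\quad NC1 \times NC2 = 256 \times 256 = 65\,536

\text{two-phase search (FIG. 4):}\quad NC1 + N1 \times NC2 = 256 + 3 \times 256 = 1\,024
```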

Claims

1. Method for coding an audio-frequency signal digitized in successive frames of L samples, comprising the following steps:

- adaptive determination of synthesis parameters defining synthesis filters;

- adaptive determination of excitation parameters defining an excitation signal to be applied to the synthesis filters to produce a synthetic signal representative of the audio signal, by minimizing the energy of an error signal resulting from the filtering of the difference between the audio signal and the synthetic signal by at least one perceptual weighting filter; and

- production of quantization values representative of the determined parameters,

in which the excitation parameters include, for at least some of the frames, a waveform of L samples associated with a gain, said excitation signal relating to the frame resulting from the multiplication of the waveform of L samples by the associated gain, the waveform consisting of a number K, greater than 1, of segments juxtaposed according to a determined partition of the frame and coming from respective dictionaries, the k-th segment being made up of $L_k$ samples for 1 ≤ k ≤ K, the numbers $L_1$ to $L_K$ being integers whose sum is equal to L, characterized in that the determination of the excitation parameters for at least one signal frame comprises the following steps:

/a/ calculation of a target vector of L samples from the signal frame and the parameters determined for at least one previous frame;

/b/ calculation, over a length of L samples, of an impulse response $h_0, h_1, \ldots, h_{L-1}$ of a filter composed of the synthesis filters and of a perceptual weighting filter;

/c/ precalculation of vectors $V_{k,k'}$ and of matrices $M_{k,k',k''}$ for integers k, k' and k" such that 1 ≤ k" ≤ k' ≤ k ≤ K, with $V_{k,k'} = t_k^T \cdot H_{k,k'}$ and $M_{k,k',k''} = H_{k,k''}^T \cdot H_{k,k'}$, where $t_k$ denotes the k-th vector, of $L_k$ samples, obtained by subdividing the target vector according to said partition, $H_{k,k'}$ denotes the matrix of $L_k$ rows and $L_{k'}$ columns whose term located in the (q+1)-th row and in the (r+1)-th column ($0 \le q < L_k$, $0 \le r < L_{k'}$) equals $H_{k,k'}(q,r) = h_{q + \Lambda k - r - \Lambda k'}$, with, for 1 ≤ k ≤ K: $\Lambda k = \sum_{k'=1}^{k-1} L_{k'}$ and $H_{k,k}(q,r) = 0$ if q < r, and $(.)^T$ denotes the transposition operation;

/d/ selection of at least one segment from the dictionary relating to the first segment, by maximizing a criterion of the form:

$$\mathrm{Crit}_1(i1) = \frac{[\mathrm{Num}_1(i1)]^2}{\mathrm{Den}_1(i1)}$$

where: $\mathrm{Num}_1(i1) = V_{1,1} \cdot U1_{i1}$, $\mathrm{Den}_1(i1) = U1_{i1}^T \cdot M_{1,1,1} \cdot U1_{i1}$ and $U1_{i1}$ denotes a segment of index i1 of the first dictionary,

and the following steps /e/ and /f/ for k ranging from 2 to K:

/e/ for each (k-1)-tuple of segments $U1_{i1[n]}, \ldots, U(k-1)_{i(k-1)[n]}$ having just been selected, calculation of numbers FN[n] and FD[n] and of a vector C[n] of the form:

$$FN[n] = \mathrm{Num}_{k-1}(i1[n], i2[n], \ldots, i(k-1)[n]) + \sum_{k'=1}^{k-1} V_{k,k'} \cdot Uk'_{ik'[n]}$$

$$FD[n] = \mathrm{Den}_{k-1}(i1[n], i2[n], \ldots, i(k-1)[n]) + \sum_{k'=1}^{k-1} (Uk'_{ik'[n]})^T \cdot M_{k,k',k'} \cdot Uk'_{ik'[n]} + 2 \sum_{k'=2}^{k-1} \sum_{k''=1}^{k'-1} (Uk''_{ik''[n]})^T \cdot M_{k,k',k''} \cdot Uk'_{ik'[n]}$$

$$C[n] = 2 \sum_{k'=1}^{k-1} (Uk'_{ik'[n]})^T \cdot M_{k,k,k'}$$

/f/ selection of at least one k-tuple of segments respectively from the k dictionaries relating to the segments 1 to k, of which the first k-1 segments $U1_{i1[n]}, \ldots, U(k-1)_{i(k-1)[n]}$ come from a (k-1)-tuple previously selected, by maximizing a criterion of the form:

$$\mathrm{Crit}_k(i1[n], \ldots, i(k-1)[n], ik) = \frac{[\mathrm{Num}_k(i1[n], \ldots, i(k-1)[n], ik)]^2}{\mathrm{Den}_k(i1[n], \ldots, i(k-1)[n], ik)}$$

where: $\mathrm{Num}_k(i1[n], \ldots, i(k-1)[n], ik) = FN[n] + V_{k,k} \cdot Uk_{ik}$, $\mathrm{Den}_k(i1[n], \ldots, i(k-1)[n], ik) = FD[n] + C[n] \cdot Uk_{ik} + Uk_{ik}^T \cdot M_{k,k,k} \cdot Uk_{ik}$ and $Uk_{ik}$ denotes a segment of index ik of the k-th dictionary,

the waveform retained for the frame consisting of a K-tuple of segments selected in step /f/ for k = K.

2. Method for coding an audio-frequency signal digitized in successive frames of L samples, comprising the following steps:

- adaptive determination of synthesis parameters defining synthesis filters;

- adaptive determination of excitation parameters defining an excitation signal to be applied to the synthesis filters to produce a synthetic signal representative of the audio signal, by minimizing the energy of an error signal resulting from the filtering of the difference between the audio signal and the synthetic signal by at least one perceptual weighting filter; and

- production of quantization values representative of the determined parameters,

in which the excitation parameters include, for at least some of the frames, a waveform of L samples associated with a gain, said excitation signal relating to the frame resulting from the multiplication of the waveform of L samples by the associated gain, the waveform being made up of a number K, greater than 1, of segments juxtaposed according to a determined partition of the frame and coming from respective dictionaries, the k-th segment being made up of $L_k$ samples for 1 ≤ k ≤ K, the numbers $L_1$ to $L_K$ being integers whose sum is equal to L, and in which, for 1 ≤ k ≤ K, each segment of $L_k$ samples of the dictionary relating to the k-th segment of the partition is the product of a predefined transformation matrix $F_k$ of $L_k$ rows and nck columns by a respective code of nck samples belonging to a directory generating the dictionary, nck being an integer smaller than $L_k$, characterized in that the determination of the excitation parameters for at least one signal frame comprises the following steps:

/a/ calculation of a target vector of L samples from the signal frame and parameters determined for at least one previous frame;

/b/ calculation, over a length of L samples, of an impulse response $h_0, h_1, \ldots, h_{L-1}$ of a filter composed of the synthesis filters and of a perceptual weighting filter;

/c/ precalculation of vectors $V_{k,k'}$ and of matrices $M_{k,k',k''}$ for integers k, k' and k" such that 1 ≤ k" ≤ k' ≤ k ≤ K, with $V_{k,k'} = t_k^T \cdot H_{k,k'} \cdot F_{k'}$ and $M_{k,k',k''} = F_{k''}^T \cdot H_{k,k''}^T \cdot H_{k,k'} \cdot F_{k'}$, where $t_k$ denotes the k-th vector, of $L_k$ samples, obtained by subdividing the target vector according to said partition, $H_{k,k'}$ denotes the matrix of $L_k$ rows and $L_{k'}$ columns whose term located in the (q+1)-th row and in the (r+1)-th column ($0 \le q < L_k$, $0 \le r < L_{k'}$) is $H_{k,k'}(q,r) = h_{q + \Lambda k - r - \Lambda k'}$, with, for 1 ≤ k ≤ K: $\Lambda k = \sum_{k'=1}^{k-1} L_{k'}$ and $H_{k,k}(q,r) = 0$ if q < r, and $(.)^T$ denotes the transposition operation;

/d/ selection of at least one code from the directory relating to the first segment, by maximizing a criterion of the form:

$$\mathrm{Crit}_1(i1) = \frac{[\mathrm{Num}_1(i1)]^2}{\mathrm{Den}_1(i1)}$$

where: $\mathrm{Num}_1(i1) = V_{1,1} \cdot C1_{i1}$, $\mathrm{Den}_1(i1) = C1_{i1}^T \cdot M_{1,1,1} \cdot C1_{i1}$ and $C1_{i1}$ designates a code of index i1 of the first directory,

and the following steps /e/ and /f/ for k ranging from 2 to K:

/e/ for each (k-1)-tuple of codes $C1_{i1[n]}, \ldots, C(k-1)_{i(k-1)[n]}$ having just been selected, calculation of numbers FN[n] and FD[n] and of a vector C[n] of the form:

$$FN[n] = \mathrm{Num}_{k-1}(i1[n], i2[n], \ldots, i(k-1)[n]) + \sum_{k'=1}^{k-1} V_{k,k'} \cdot Ck'_{ik'[n]}$$

$$FD[n] = \mathrm{Den}_{k-1}(i1[n], i2[n], \ldots, i(k-1)[n]) + \sum_{k'=1}^{k-1} (Ck'_{ik'[n]})^T \cdot M_{k,k',k'} \cdot Ck'_{ik'[n]} + 2 \sum_{k'=2}^{k-1} \sum_{k''=1}^{k'-1} (Ck''_{ik''[n]})^T \cdot M_{k,k',k''} \cdot Ck'_{ik'[n]}$$

$$C[n] = 2 \sum_{k'=1}^{k-1} (Ck'_{ik'[n]})^T \cdot M_{k,k,k'}$$

/f/ selection of at least one k-tuple of codes respectively from the k directories relating to segments 1 to k, of which the first k-1 codes $C1_{i1[n]}, \ldots, C(k-1)_{i(k-1)[n]}$ come from a (k-1)-tuple previously selected, by maximizing a criterion of the form:

$$\mathrm{Crit}_k(i1[n], \ldots, i(k-1)[n], ik) = \frac{[\mathrm{Num}_k(i1[n], \ldots, i(k-1)[n], ik)]^2}{\mathrm{Den}_k(i1[n], \ldots, i(k-1)[n], ik)}$$

where: $\mathrm{Num}_k(i1[n], \ldots, i(k-1)[n], ik) = FN[n] + V_{k,k} \cdot Ck_{ik}$, $\mathrm{Den}_k(i1[n], \ldots, i(k-1)[n], ik) = FD[n] + C[n] \cdot Ck_{ik} + Ck_{ik}^T \cdot M_{k,k,k} \cdot Ck_{ik}$ and $Ck_{ik}$ denotes a code of index ik of the k-th directory,

the waveform retained for the frame consisting of a K-tuple of segments respectively obtained from a K-tuple of codes selected in step /f/ for k = K.

3. Method according to claim 2, in which the directory codes are binary codes with values ± 1.

4. Method according to any one of the preceding claims, in which the gain associated with the waveform selected for a frame is taken equal to the ratio

$$\frac{\mathrm{Num}_K(i1[n], i2[n], \ldots, i(K-1)[n], iK)}{\mathrm{Den}_K(i1[n], i2[n], \ldots, i(K-1)[n], iK)}$$

relating to said K-tuple selected in step /f/ for k = K.

5. Method according to any one of the preceding claims, in which the numbers L ₁ to L _κ are identical, as are the dictionaries relating to the K segments.

6. Method according to claim 5, in which K = 2.

7. Audiofrequency signal encoder, comprising:

means for receiving the digital audio signal in successive frames of L samples;

means of adaptive determination of synthesis parameters defining synthesis filters;

means of adaptive determination of excitation parameters defining an excitation signal to be applied to the synthesis filters to produce a synthetic signal representative of the audio frequency signal, by minimizing the energy of an error signal resulting from the filtering of the difference between the audio signal and the synthetic signal by at least one perceptual weighting filter; and

means for producing quantization values representative of the determined parameters,

in which the excitation parameters include, for at least some of the frames, a waveform of L samples associated with a gain, said excitation signal relating to the frame resulting from the multiplication of the waveform of L samples by the associated gain, the waveform being made up of a number K, greater than 1, of segments juxtaposed according to a determined partition of the frame and originating from respective dictionaries, the k-th segment consisting of $L_k$ samples for 1 ≤ k ≤ K, the numbers $L_1$ to $L_K$ being integers whose sum is equal to L, characterized in that the means for adaptively determining the excitation parameters include:

means for calculating a target vector of L samples from a signal frame and parameters determined for at least one previous frame;

means for calculating, over a length of L samples, an impulse response $h_0, h_1, \ldots, h_{L-1}$ of a filter composed of the synthesis filters and of a perceptual weighting filter;

means of precalculation of vectors $V_{k,k'}$ and of matrices $M_{k,k',k''}$ for integers k, k' and k" such that 1 ≤ k" ≤ k' ≤ k ≤ K, with $V_{k,k'} = t_k^T \cdot H_{k,k'}$ and $M_{k,k',k''} = H_{k,k''}^T \cdot H_{k,k'}$, where $t_k$ denotes the k-th vector, of $L_k$ samples, obtained by subdividing the target vector according to said partition, $H_{k,k'}$ denotes the matrix of $L_k$ rows and $L_{k'}$ columns whose term located in the (q+1)-th row and in the (r+1)-th column ($0 \le q < L_k$, $0 \le r < L_{k'}$) equals $H_{k,k'}(q,r) = h_{q + \Lambda k - r - \Lambda k'}$, with, for 1 ≤ k ≤ K: $\Lambda k = \sum_{k'=1}^{k-1} L_{k'}$ and $H_{k,k}(q,r) = 0$ if q < r, and $(.)^T$ denotes the transposition operation;

means for selecting at least one segment from the dictionary relating to the first segment, by maximizing a criterion of the form:

$$\mathrm{Crit}_1(i1) = \frac{[\mathrm{Num}_1(i1)]^2}{\mathrm{Den}_1(i1)}$$

where: $\mathrm{Num}_1(i1) = V_{1,1} \cdot U1_{i1}$, $\mathrm{Den}_1(i1) = U1_{i1}^T \cdot M_{1,1,1} \cdot U1_{i1}$ and $U1_{i1}$ designates a segment of index i1 from the first dictionary;

optimization means for performing the following operations for k ranging from 2 to K:

- for each (k-1)-tuple of segments $U1_{i1[n]}, \ldots, U(k-1)_{i(k-1)[n]}$ having just been selected, calculation of numbers FN[n] and FD[n] and of a vector C[n] of the form:

$$FN[n] = \mathrm{Num}_{k-1}(i1[n], i2[n], \ldots, i(k-1)[n]) + \sum_{k'=1}^{k-1} V_{k,k'} \cdot Uk'_{ik'[n]}$$

$$FD[n] = \mathrm{Den}_{k-1}(i1[n], i2[n], \ldots, i(k-1)[n]) + \sum_{k'=1}^{k-1} (Uk'_{ik'[n]})^T \cdot M_{k,k',k'} \cdot Uk'_{ik'[n]} + 2 \sum_{k'=2}^{k-1} \sum_{k''=1}^{k'-1} (Uk''_{ik''[n]})^T \cdot M_{k,k',k''} \cdot Uk'_{ik'[n]}$$

$$C[n] = 2 \sum_{k'=1}^{k-1} (Uk'_{ik'[n]})^T \cdot M_{k,k,k'}$$

- selection of at least one k-tuple of segments respectively from the k dictionaries relating to the segments 1 to k, of which the first k-1 segments $U1_{i1[n]}, \ldots, U(k-1)_{i(k-1)[n]}$ come from a (k-1)-tuple previously selected, by maximizing a criterion of the form:

$$\mathrm{Crit}_k(i1[n], \ldots, i(k-1)[n], ik) = \frac{[\mathrm{Num}_k(i1[n], \ldots, i(k-1)[n], ik)]^2}{\mathrm{Den}_k(i1[n], \ldots, i(k-1)[n], ik)}$$

where: $\mathrm{Num}_k(i1[n], \ldots, i(k-1)[n], ik) = FN[n] + V_{k,k} \cdot Uk_{ik}$, $\mathrm{Den}_k(i1[n], \ldots, i(k-1)[n], ik) = FD[n] + C[n] \cdot Uk_{ik} + Uk_{ik}^T \cdot M_{k,k,k} \cdot Uk_{ik}$ and $Uk_{ik}$ denotes a segment of index ik from the k-th dictionary; and

means for producing the waveform retained for a frame in the form of a K-tuple of segments selected by the optimization means for k = K.

8. Audiofrequency signal encoder, comprising:

means for receiving the digitized audiofrequency signal in successive frames of L samples;

means of adaptive determination of synthesis parameters defining synthesis filters;

means of adaptive determination of excitation parameters defining an excitation signal to be applied to the synthesis filters to produce a synthetic signal representative of the audio frequency signal, by minimizing the energy of an error signal resulting from the filtering of the difference between the audio signal and the synthetic signal by at least one perceptual weighting filter; and

means for producing quantization values representative of the determined parameters,

in which the excitation parameters include, for at least some of the frames, a waveform of L samples associated with a gain, said excitation signal relating to the frame resulting from the multiplication of the waveform of L samples by the associated gain, the waveform consisting of a number K, greater than 1, of segments juxtaposed according to a determined partition of the frame and coming from respective dictionaries, the k-th segment being made up of $L_k$ samples for 1 ≤ k ≤ K, the numbers $L_1$ to $L_K$ being integers whose sum is equal to L, and in which, for 1 ≤ k ≤ K, each segment of $L_k$ samples of the dictionary relating to the k-th segment of the partition is the product of a predefined transformation matrix $F_k$ of $L_k$ rows and nck columns by a respective code of nck samples belonging to a directory generating the dictionary, nck being an integer smaller than $L_k$, characterized in that the means for determining the excitation parameters comprise:

means for calculating a target vector of L samples from a signal frame and parameters determined for at least one previous frame;

means for calculating, over a length of L samples, an impulse response $h_0, h_1, \ldots, h_{L-1}$ of a filter composed of the synthesis filters and of a perceptual weighting filter;

means for precomputing vectors $V_{k,k'}$ and matrices $M_{k,k',k''}$ for integers k, k' and k" such that 1 ≤ k" ≤ k' ≤ k ≤ K, with $V_{k,k'} = t_k^T \cdot H_{k,k'} \cdot F_{k'}$ and $M_{k,k',k''} = F_{k''}^T \cdot H_{k,k''}^T \cdot H_{k,k'} \cdot F_{k'}$, where $t_k$ denotes the k-th vector, of $L_k$ samples, obtained by subdividing the target vector according to said partition, $H_{k,k'}$ denotes the matrix of $L_k$ rows and $L_{k'}$ columns whose term located in the (q+1)-th row and in the (r+1)-th column ($0 \le q < L_k$, $0 \le r < L_{k'}$) is $H_{k,k'}(q,r) = h_{q + \Lambda k - r - \Lambda k'}$, with, for 1 ≤ k ≤ K: $\Lambda k = \sum_{k'=1}^{k-1} L_{k'}$ and $H_{k,k}(q,r) = 0$ if q < r, and $(.)^T$ denotes the transposition operation;

means for selecting at least one code from the directory relating to the first segment, by maximizing a criterion of the form:

$$\mathrm{Crit}_1(i1) = \frac{[\mathrm{Num}_1(i1)]^2}{\mathrm{Den}_1(i1)}$$

where: $\mathrm{Num}_1(i1) = V_{1,1} \cdot C1_{i1}$, $\mathrm{Den}_1(i1) = C1_{i1}^T \cdot M_{1,1,1} \cdot C1_{i1}$ and $C1_{i1}$ denotes a code of index i1 from the first directory;

optimization means for performing the following operations for k ranging from 2 to K:

- for each (k-1)-tuple of codes $C1_{i1[n]}, \ldots, C(k-1)_{i(k-1)[n]}$ having just been selected, calculation of numbers FN[n] and FD[n] and of a vector C[n] of the form:

$$FN[n] = \mathrm{Num}_{k-1}(i1[n], i2[n], \ldots, i(k-1)[n]) + \sum_{k'=1}^{k-1} V_{k,k'} \cdot Ck'_{ik'[n]}$$

$$FD[n] = \mathrm{Den}_{k-1}(i1[n], i2[n], \ldots, i(k-1)[n]) + \sum_{k'=1}^{k-1} (Ck'_{ik'[n]})^T \cdot M_{k,k',k'} \cdot Ck'_{ik'[n]} + 2 \sum_{k'=2}^{k-1} \sum_{k''=1}^{k'-1} (Ck''_{ik''[n]})^T \cdot M_{k,k',k''} \cdot Ck'_{ik'[n]}$$

$$C[n] = 2 \sum_{k'=1}^{k-1} (Ck'_{ik'[n]})^T \cdot M_{k,k,k'}$$

- selection of at least one k-tuple of codes respectively from the k directories relating to segments 1 to k, of which the first k-1 codes $C1_{i1[n]}, \ldots, C(k-1)_{i(k-1)[n]}$ come from a (k-1)-tuple previously selected, by maximizing a criterion of the form:

$$\mathrm{Crit}_k(i1[n], \ldots, i(k-1)[n], ik) = \frac{[\mathrm{Num}_k(i1[n], \ldots, i(k-1)[n], ik)]^2}{\mathrm{Den}_k(i1[n], \ldots, i(k-1)[n], ik)}$$

where: $\mathrm{Num}_k(i1[n], \ldots, i(k-1)[n], ik) = FN[n] + V_{k,k} \cdot Ck_{ik}$, $\mathrm{Den}_k(i1[n], \ldots, i(k-1)[n], ik) = FD[n] + C[n] \cdot Ck_{ik} + Ck_{ik}^T \cdot M_{k,k,k} \cdot Ck_{ik}$ and $Ck_{ik}$ designates a code of index ik from the k-th directory; and

means for producing the waveform retained for a frame in the form of a K-tuple of segments respectively obtained from a K-tuple of codes selected by the optimization means for k = K.

9. The encoder according to claim 7 or 8, in which the means for determining the excitation parameters further comprise means for calculating the gain associated with the waveform retained for a frame according to the ratio

$$\frac{\mathrm{Num}_K(i1[n], i2[n], \ldots, i(K-1)[n], iK)}{\mathrm{Den}_K(i1[n], i2[n], \ldots, i(K-1)[n], iK)}$$

relating to said K-tuple selected by the optimization means for k = K.

10. Encoder according to any one of claims 7 to 9, in which the numbers L ₁ to L _κ are identical, as are the dictionaries relating to the K segments.