EP0428445A1

EP0428445A1 - Method and apparatus for coding of predictive filters in very low bitrate vocoders

Info

Publication number: EP0428445A1
Application number: EP90403195A
Authority: EP
Inventors: Pierre-André Laurent
Original assignee: Thomson CSF SA
Current assignee: Thales SA
Priority date: 1989-11-14
Filing date: 1990-11-09
Publication date: 1991-05-22
Anticipated expiration: 2010-11-09
Also published as: FR2654542B1; CA2029768A1; EP0428445B1; FR2654542A1; CA2029768C; ES2069044T3; US5243685A; DE69017842T2; DE69017842D1

Abstract

The method consists in dividing the voice signal into binary frames of specified duration, grouping them (12₁ ... 12₃) into packets of successive frames while associating a predictive filter with each frame of a packet, and quantizing (18₁ ... 18₃) the coefficients of each predictive filter taking account (20, 21) of the stable or unstable configuration of the voice signal. Application: speech encoding.

Description

The present invention relates to a method and a device for coding predictor filters for very low bit rate vocoders.

Among the methods of low bit rate speech digitization, the best known is linear predictive coding LPC10, where LPC10 is the abbreviation of "Linear predictive coding, order 10". According to this method, speech is synthesized by exciting, by means of a periodic signal or a noise source, a filter whose function is to give the frequency spectrum of the signal a form close to that of the original speech signal.

The major part of the bit rate, which is 2400 bits per second, is devoted to the transmission of the filter coefficients. For this purpose, the bit stream is split into 22.5 millisecond frames comprising 54 bits, 41 of which are used to adapt the transfer function of the filter.
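The frame figures quoted above can be checked with a few lines of arithmetic. The sketch below only restates the numbers given in the text (54 bits per 22.5 ms frame, 41 of them for the filter); the variable names are illustrative.

```python
# LPC10 frame parameters as stated in the text.
FRAME_MS = 22.5        # frame duration in milliseconds
BITS_PER_FRAME = 54    # total bits per frame
FILTER_BITS = 41       # bits that describe the filter transfer function

# Bit rate in bits per second: bits per frame divided by frame duration.
bit_rate = BITS_PER_FRAME * 1000 / FRAME_MS

# Fraction of the bit stream spent on the filter coefficients.
filter_share = FILTER_BITS / BITS_PER_FRAME

print(bit_rate)                 # 2400.0
print(round(filter_share, 2))   # 0.76
```

Roughly three quarters of every frame therefore describes the filter, which is why the filter coding is the natural target for bit rate reduction.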

A known method of bit rate reduction consists in compressing the 41 bits associated with a filter into 10 to 12 bits which represent the number of a predefined filter belonging to a dictionary of 2¹⁰ to 2¹² different filters, this filter being the one closest to the original filter. This method has, however, a first major drawback, which is that it requires the construction of a dictionary of filters whose content depends closely on the set of filters used to build it by conventional data-clustering techniques; as a result, the method is not well suited to real sound pick-up conditions.
A second drawback of this method is that its implementation requires a very large memory to store the dictionary (2¹⁰ to 2¹² sets of coefficients). Correspondingly, the computation times become significant because the dictionary must be searched for the filter closest to the original filter. Finally, this method does not reproduce stable sounds satisfactorily. This is because, even for a stationary sound, the LPC analysis in practice never selects the same original filter twice in succession, but successively chooses filters in the dictionary that are close but distinct.
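The dictionary approach described above amounts to a nearest-neighbour search over a table of predefined filters. The sketch below is a hedged illustration: the squared Euclidean distance, the tiny three-entry dictionary, the figure of 10 coefficients per entry (taken from the order of LPC10) and the names `squared_distance` and `nearest_filter` are assumptions made for the example, not details taken from the text.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two coefficient vectors (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_filter(dictionary, target):
    """Return the index of the dictionary entry closest to `target`.

    Every entry is visited, so the cost grows linearly with the dictionary size.
    """
    return min(range(len(dictionary)),
               key=lambda i: squared_distance(dictionary[i], target))

# Toy dictionary of three 2-coefficient "filters" (illustrative values only).
toy_dictionary = [[0.9, -0.5], [0.2, 0.1], [-0.7, 0.3]]
index = nearest_filter(toy_dictionary, [0.25, 0.0])   # entry 1 is the closest

# Storage for the real case: number of entries times coefficients per entry.
storage_min = 2 ** 10 * 10   # 10240 coefficients
storage_max = 2 ** 12 * 10   # 40960 coefficients
```

Both the storage and the exhaustive search scale with the number of entries, which is the cost the text objects to.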

As in television, where the reconstruction of a color image depends essentially on the quality of the luminance signal and not on that of the chrominance signal, which can therefore be transmitted with lower definition, it appears sufficient in speech synthesis to reproduce well only the energy contour of the voice signal, its coloring (voicing, spectrum shape) being of less importance for its reconstruction. Consequently, in the known methods of speech synthesis, the spectrum search process based on the evolution of the minimum distance separating the spectra of the original speech (of the speaker) and of the synthetic speech is not fully justified.

For example, different instances of the sound "A" spoken by different speakers, or recorded under different conditions, can have a large spectral distance but will still remain "A"s that can be recognized as such; and if there is ambiguity, reflected in a possible confusion with a similar sound, the listener can always correct it himself from the context. In fact, experience shows that by devoting no more than about thirty bits to the coefficients of the predictor filter instead of 41, the quality of reproduction remains satisfactory, even if a trained listener can perceive a slight difference between sounds synthesized with predictor coefficients defined on 30 or 41 bits. Moreover, since the transmission takes place at a distance and the recipient therefore cannot make this comparison, it appears sufficient that the listener can correctly recognize the synthesized sound.

It also appears important that, in the stable parts of the signal (vowels), the predictor filter remains stable and is as close as possible to the original predictor filter. In the unstable parts (transitions, unvoiced sounds), on the other hand, the transmitted predictor does not need to be a faithful copy of the original predictor.

The object of the invention is to overcome the aforementioned drawbacks.

To this end, the subject of the invention is a method for coding predictor filters of very low bit rate vocoders, of the type in which the voice signal is divided into binary frames of determined duration, characterized in that it consists in grouping the frames in packets of successive frames, in associating a predictor filter with each frame contained in a packet, and in quantizing the coefficients of each predictor filter taking account of the stable or non-stable configuration of the voice signal.

Other characteristics and advantages of the invention will appear below on reading the following description, made with reference to the appended drawings, which represent:

- Figure 1, a block diagram of a speech synthesizer of the known art;
- Figure 2, a table of the four possible codings of the predictor filters of the vocoder according to the invention;
- Figure 3, a flowchart illustrating the calculation of the prediction error of the predictor filters implemented by the invention;
- Figure 4, a graph of the transformation of the reflection coefficients of the predictor filters;
- Figure 5, the quantization law of the reflection coefficients of the filters transformed by the graph of Figure 4;
- Figure 6, a device for implementing the method according to the invention.

The speech synthesizer shown in Figure 1 comprises, in a known manner, a predictor filter 1 coupled by its input E₁ to a periodic signal generator 2 and to a noise generator 3 through a switch 4 and a variable-gain amplifier 5 connected in series. The switch 4 couples the input of the predictor filter 1 to the output of the periodic signal generator 2 or to the output of the noise generator 3, depending on whether the sound to be reproduced is voiced or not. The amplitude of the sound is controlled by the amplifier 5. The filter 1 reproduces on its output S a speech signal as a function of prediction coefficients applied to its input E₂.

Unlike what is shown in Figure 1, the speech synthesizers to which the method and the coding device of the invention apply must include three predictor filters 1 adapted to each group of three successive 22.5 ms frames of the speech signal, according to the stable or non-stable state of the sound to be synthesized. This organization makes it possible, for example, to reduce the bit rate from 2400 bits per second to 800 bits per second, by grouping the frames in packets of 3 x 22.5 = 67.5 milliseconds of 54 bits, in which 30 to 35 bits are used to describe, for example, the 10 predictor coefficients of the three successive filters necessary for implementing the LPC10 coding method described above, and two of these bits are used to define the configuration to be given to the three filters according to the stable or non-stable nature of the voice signal to be generated.

In the table of Figure 2, which records the four possible configurations of the three filters, state 00 of the two configuration bits corresponds to a first configuration where the three predictor filters are identical for the three frames of the voice signal. For the second configuration, the configuration bits have the value 01 and only the first two filters, those of frames 1 and 2, are identical. In the third configuration, corresponding to configuration bits 10, only the last two filters, those of frames 2 and 3, are identical. Finally, in the fourth configuration, corresponding to configuration bits 11, the three filters of frames 1 to 3 are different.

Naturally, this configuration mode is not unique and it is equally possible, while remaining within the scope of the invention, to define the number of frames in a packet by any number. However, for ease of implementation, this number may be between 2 and 4 inclusive. In these cases, of course, the number of possible configurations can be extended to a maximum of 8 or 16.

The definition of the filters is established according to steps 1 to 6 of the method represented by the flowchart of Figure 3. According to a first step of the method, bearing the reference 5 on the flowchart, the autocorrelation coefficients R_ik of the signal are calculated according to a relation of the form:

[equation not reproduced in the source text]

where S_in is sample n of the signal in frame i and W_n denotes the weighting window. In the second step, referenced 6, the reflection coefficients of the lattice predictor filter corresponding to the preceding coefficients R_ik are calculated by applying a standard algorithm, for example the known algorithm of LEROUX-GUEGUEN or SCHUR. At this stage, the coefficients R_ik are transformed into coefficients K_ij, where j is a positive integer taking the successive values from 1 to 10. In the third step, bearing the reference 7, the coefficients K_ij, whose values lie by definition between -1 and +1, are transformed into modified coefficients which range between "-infinity" and "+infinity" and which take into account the fact that the quantization of the coefficients K_ij must be fine when their absolute value is close to 1 and can be coarser when they are close to 0, for example. Each coefficient K_ij is, for example, transformed according to a relation of the form

[equation not reproduced in the source text]

whose graph is shown in Figure 4, or according to the relations L_ij = K_ij / (1 - |K_ij|), L_ij = arc cos K_ij, L_ij = arc sin K_ij, or by applying the method for calculating LSP coefficients described in the article by George S. Kang and Lawrence J. Fransen of the Naval Research Laboratory, Washington DC 20375, 1985, entitled "Application of line spectrum pairs to low bit rate speech encoder". In the fourth step, represented at 8, the coefficients L_ij are quantized on n_j bits each in a non-uniform manner, taking into account the distribution of the coefficients, to give a quantized value L̂_ij according to a distribution law represented by the histogram of the L_ij in Figure 5. In the fifth step, the values L̂_ij are in turn used to calculate coefficients K̂_ij according to the relation

[equation not reproduced in the source text]
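Steps 1 to 5 can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the method as implemented in the patent: the Hamming window, the Levinson-Durbin recursion (used here in place of the LEROUX-GUEGUEN or SCHUR algorithm, which yields the same reflection coefficients up to sign convention), the uniform quantizer (the text calls for a non-uniform one matched to the histogram), the clipping range `l_max`, the 8 kHz sampling rate and all function names are choices made for the example. The transform is one of the alternative relations listed above, read here as L = K / (1 - |K|), whose inverse is K = L / (1 + |L|). The bit allocation is the example given later in the text for the first configuration.

```python
import math

def autocorrelation(frame, order=10):
    """Step 1: windowed autocorrelation R[0..order] of one frame."""
    n = len(frame)
    # Hamming window (assumed; the text only says "weighting window").
    w = [0.54 - 0.46 * math.cos(2 * math.pi * t / (n - 1)) for t in range(n)]
    x = [w[t] * frame[t] for t in range(n)]
    return [sum(x[t] * x[t + k] for t in range(n - k)) for k in range(order + 1)]

def reflection_coefficients(r):
    """Step 2: Levinson-Durbin recursion, R[0..p] -> K[1..p]."""
    p = len(r) - 1
    a = [1.0] + [0.0] * p      # predictor polynomial, a[0] = 1
    err = r[0]                 # prediction error energy at order 0
    ks = []
    for m in range(1, p + 1):
        k = -sum(a[j] * r[m - j] for j in range(m)) / err
        ks.append(k)
        prev = a[:]
        for j in range(1, m + 1):
            a[j] = prev[j] + k * prev[m - j]
        err *= 1.0 - k * k
    return ks

def k_to_l(k):
    """Step 3: map K in (-1, 1) to L in (-inf, +inf) with L = K / (1 - |K|)."""
    return k / (1.0 - abs(k))

def l_to_k(l):
    """Step 5: inverse mapping K = L / (1 + |L|)."""
    return l / (1.0 + abs(l))

def quantize(l, n_bits, l_max=8.0):
    """Step 4 stand-in: uniform quantizer on [-l_max, l_max]; returns (index, value)."""
    if n_bits == 0:
        return 0, 0.0          # coefficient not transmitted
    levels = 1 << n_bits
    step = 2.0 * l_max / levels
    index = min(levels - 1, max(0, math.floor((l + l_max) / step)))
    return index, -l_max + (index + 0.5) * step

# Synthetic 180-sample frame (22.5 ms at an assumed 8 kHz sampling rate).
frame = [math.sin(0.3 * t) + 0.5 * math.sin(1.1 * t) + 0.01 * ((t * 7919) % 13 - 6)
         for t in range(180)]
bits = [5, 5, 4, 4, 4, 3, 2, 2, 2, 2]
ks = reflection_coefficients(autocorrelation(frame))
codes = [quantize(k_to_l(k), n) for k, n in zip(ks, bits)]
ks_hat = [l_to_k(value) for _, value in codes]   # quantized reflection coefficients
```

Because the inverse mapping always returns a value strictly between -1 and +1, the quantized reflection coefficients still describe a stable lattice filter whatever index is transmitted.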

These values K̂_ij represent the quantized values of the prediction coefficients, from which the coefficients of a predictor A_i(z) can be deduced by recurrence relations defined as follows:

[equation not reproduced in the source text]

for p = 1, 2, ..., 10.
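The recurrence itself is not reproduced in this text. A commonly used form is the step-up recursion, which builds the order-p predictor polynomial from the order-(p-1) polynomial and the p-th reflection coefficient. The sketch below assumes the convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p; the sign convention and the name `step_up` are assumptions, and the patent's own relations may differ.

```python
def step_up(ks):
    """Convert reflection coefficients K[1..p] into predictor coefficients [1, a_1, ..., a_p].

    At each order m: a_j(m) = a_j(m-1) + K_m * a_(m-j)(m-1), with a_0 = 1.
    """
    a = [1.0]
    for k in ks:
        prev = a + [0.0]                  # order m-1 polynomial, padded to length m+1
        m = len(prev) - 1
        a = [prev[j] + k * prev[m - j] for j in range(m + 1)]
    return a

coeffs = step_up([0.5, -0.2])   # approximately [1.0, 0.4, -0.2]
```

With ten reflection coefficients the function returns eleven values, the leading 1 followed by the ten predictor coefficients.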

Finally, in the last step, represented at 10, the energy of the prediction error is calculated by applying the following relation

[equation not reproduced in the source text]

or

[equation not reproduced in the source text]

with

[equation not reproduced in the source text]
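The relations for the prediction error energy are not reproduced in this text. For a predictor with coefficients a_0 = 1, a_1, ..., a_p applied to a frame with autocorrelation R_0, ..., R_p, the standard expression is the quadratic form E² = Σ_j Σ_k a_j a_k R_|j-k|, which can also be written using the autocorrelation B_k of the coefficient sequence. The sketch below shows both forms; treating them as the two relations intended by the text is an assumption, and the function names are illustrative.

```python
def error_energy_direct(a, r):
    """E^2 as the quadratic form sum_j sum_k a[j] * a[k] * r[|j - k|]."""
    p = len(a) - 1
    return sum(a[j] * a[k] * r[abs(j - k)]
               for j in range(p + 1) for k in range(p + 1))

def error_energy_via_b(a, r):
    """Same energy, using B[k] = sum_j a[j] * a[j + k] (autocorrelation of the coefficients)."""
    p = len(a) - 1
    b = [sum(a[j] * a[j + k] for j in range(p + 1 - k)) for k in range(p + 1)]
    return b[0] * r[0] + 2.0 * sum(b[k] * r[k] for k in range(1, p + 1))

# First-order example: predictor [1, -0.9] on a frame with R = [1.0, 0.9].
a = [1.0, -0.9]
r = [1.0, 0.9]
e_direct = error_energy_direct(a, r)   # approximately 0.19
e_via_b = error_energy_via_b(a, r)     # same value, fewer multiplications for large p
```

The second form needs only one pass over the lags once B_k is known, which matters when the same quantized predictor is evaluated against several frames.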

To complete the algorithm, it then suffices to test the four configurations described above by inserting, between the first and second steps of the method, an additional step that takes the possible configurations into account, and to retain only the configuration for which the total prediction error, summed over the three frames, is minimal.
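The selection rule can be sketched as follows. The grouping table mirrors the four configurations described for Figure 2; the callback `group_error`, which is assumed to return the prediction error energy obtained after quantization for a group of frames sharing one filter, and the other names are hypothetical.

```python
# Frames sharing one filter, for each value of the two configuration bits.
CONFIGURATIONS = {
    "00": [(1, 2, 3)],          # one filter for the three frames
    "01": [(1, 2), (3,)],       # frames 1 and 2 share a filter
    "10": [(1,), (2, 3)],       # frames 2 and 3 share a filter
    "11": [(1,), (2,), (3,)],   # three different filters
}

def select_configuration(group_error):
    """Return (bits, total_error) for the configuration with the smallest total error."""
    totals = {bits: sum(group_error(group) for group in groups)
              for bits, groups in CONFIGURATIONS.items()}
    best = min(totals, key=totals.get)
    return best, totals[best]

# Made-up error values, for illustration only.
example_errors = {(1, 2, 3): 9.0, (1, 2): 4.0, (3,): 2.5,
                  (1,): 3.0, (2, 3): 5.5, (2,): 3.5}
best_bits, best_total = select_configuration(example_errors.get)
```

With these made-up values the second configuration wins with a total of 6.5. Separate filters only help when the bits they consume do not degrade each filter too much, which is why the comparison is made on quantized predictors.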

In the first configuration the same filter is used for the three frames. A single fictitious fourth filter is then used for steps 2 to 6; it is calculated from the coefficients R_4j given by the relation

R_4j = R_1j + R_2j + R_3j

with j varying from 0 to 10.

The total prediction error is then equal to E₄², and the algorithm of the method in fact amounts to considering the three frames as a single frame of three times the duration.
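Pooling the frames of a group is a lag-by-lag addition of their autocorrelation coefficients. The sketch below assumes that the fictitious fourth filter uses R_4j = R_1j + R_2j + R_3j, by analogy with the sums R_5j and R_6j given in the text for the second and third configurations; the values and the helper `pooled_autocorrelation` are made up for illustration.

```python
def pooled_autocorrelation(*frames_r):
    """Add the autocorrelation sequences of several frames, lag by lag."""
    return [sum(values) for values in zip(*frames_r)]

# Illustrative autocorrelation values for lags 0..2 of three frames.
r1 = [4.0, 2.0, 1.0]
r2 = [3.0, 1.5, 0.5]
r3 = [5.0, 2.5, 1.5]

r4 = pooled_autocorrelation(r1, r2, r3)   # first configuration: [12.0, 6.0, 3.0]
r5 = pooled_autocorrelation(r1, r2)       # second configuration: [7.0, 3.5, 1.5]
r6 = pooled_autocorrelation(r2, r3)       # third configuration: [8.0, 4.0, 2.0]
```

Since only sums are needed, the per-frame autocorrelations are computed once and reused by every configuration.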

The coefficients L₁ to L₁₀ can then be quantized with, for example, 5, 5, 4, 4, 4, 3, 2, 2, 2, 2 bits respectively, or 33 bits in total.

According to the second configuration, in which the same filter is used for frames 1 and 2, the algorithm is executed with autocorrelation coefficients R_5j and R_3j defined as follows: R_5j = R_1j + R_2j, where j successively takes the values from 1 to 10, for the first two frames, and R_3j (j varying from 1 to 10) for the last frame.

The prediction error is equal to E₅² + E₃², which amounts to considering that frames 1 and 2 are grouped into a single frame of double duration, frame 3 remaining unchanged. It is then possible to quantize the coefficients L₁ to L₁₀ on frames 1 and 2 with respectively 5, 4, 4, 3, 3, 2, 2, 2, 0, 0 bits (25 bits in total, the coefficients L₉ and L₁₀ not being transmitted), and their variation to obtain those of the third frame using 3, 2, 2, 1, 0, 0, 0, 0, 0, 0 bits respectively (8 bits in total), that is to say 33 bits for the three frames.

Not transmitting the coefficients L₉ and L₁₀ is not a problem, since in this case the configuration corresponds to predictors which are changing and whose coefficients decrease in importance with their rank.

In the third configuration, where the same filter is used for frames 2 and 3, the same method as in the second configuration is used, grouping the coefficients R_ij of frames 2 and 3 such that R_6j = R_2j + R_3j. The same quantization method is used, but coding the predictor of frames 2 and 3 and the differential for frame 1.

Finally, for the last configuration, where all the filters are different, the three frames must be considered as decoupled and the total error is equal to E₁² + E₂² + E₃². In this case the coefficients L₁ to L₁₀ of frame 2 are quantized with respectively 4, 4, 3, 3, 3, 2, 2, 0, 0, 0 bits, or 21 bits, the differences for the first frame with 2, 2, 1, 1, 0, 0, 0, 0, 0, 0 bits, or 6 bits, and the differences for frame 3 likewise (6 additional bits). This last configuration corresponds to a coding of 21 + 6 + 6 = 33 bits.
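The bit budgets quoted for the four configurations can be checked mechanically. The sketch below transcribes the allocations from the text, with three assumptions that are flagged in the comments: the third configuration mirrors the second, the frame 2 list of the fourth configuration is completed with a zero so that it has ten entries, and the 6 bits for frame 3 are split like those of frame 1. The dictionary layout and names are illustrative.

```python
# Bits per coefficient L1..L10 for each configuration.
ALLOCATIONS = {
    "00": {"frames 1-3": [5, 5, 4, 4, 4, 3, 2, 2, 2, 2]},
    "01": {"frames 1-2": [5, 4, 4, 3, 3, 2, 2, 2, 0, 0],
           "frame 3 (differences)": [3, 2, 2, 1, 0, 0, 0, 0, 0, 0]},
    # Assumed mirror of "01": the text only says the same method is used.
    "10": {"frames 2-3": [5, 4, 4, 3, 3, 2, 2, 2, 0, 0],
           "frame 1 (differences)": [3, 2, 2, 1, 0, 0, 0, 0, 0, 0]},
    # Frame 2: tenth value taken as 0 so that ten coefficients total 21 bits.
    # Frame 3: split assumed identical to frame 1 (the text only gives the total of 6).
    "11": {"frame 2": [4, 4, 3, 3, 3, 2, 2, 0, 0, 0],
           "frame 1 (differences)": [2, 2, 1, 1, 0, 0, 0, 0, 0, 0],
           "frame 3 (differences)": [2, 2, 1, 1, 0, 0, 0, 0, 0, 0]},
}
CONFIG_BITS = 2            # bits that identify the configuration
PACKET_BITS = 54           # bits per packet of three frames
PACKET_MS = 3 * 22.5       # packet duration in milliseconds

totals = {cfg: sum(sum(bits) for bits in parts.values())
          for cfg, parts in ALLOCATIONS.items()}
filter_bits = {cfg: total + CONFIG_BITS for cfg, total in totals.items()}
packet_rate = PACKET_BITS * 1000 / PACKET_MS   # bits per second
```

Every configuration comes to 33 coefficient bits, so the filter description always occupies 35 of the 54 bits of a packet, and the packet rate works out to the 800 bits per second stated earlier.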

Le dispositif pour la mise en oeuvre du procédé qui est représenté à la figure 6 comporte un dispositif 1 de calcul des 10 coefficients d'autocorrélation pour chaque trame couplée à des éléments de retard formés par trois mémoires de trames 12₁ à 12₃ pour mémoriser les coefficients R_ij calculés à la première étape du procédé. Il comprend également un dispositif de calcul 13 des coefficients K_ij et L_ij suivant la deuxième étape du procédé. Un bus de données 14 véhicule les valeurs des ccefficients L_ij (i = 1 à 3, j = 1 à 10) et les valeurs des coefficients R_io représentant les énergies où i = 1 à 3. Le bus de données 14 relie les éléments de retard 12₁ à 12₃ et le dispositif de calcul 13 a quatre chaînes de calcul référencés de 15₁ à 15₄. Les chaînes de calcul 15₁ à 15₃ comprennent respectivement un dispositif sommateur, respectivement 16, à 16₃ qui est relié aux éléments de retard 12, à 12₃ pour calculer les coefficients R_4j, R_5j et R_6j suivant les 4 configurations décrites précédemment. Les sorties des dispositifs de sommation 16₁ à 16₃ sont reliées à des dispositifs de calcul respectivement 17, à 17₃ des coefficients L_4j, K_4j; K_Sj, L_5j et K_6j et L_6j. Les coefficients L_4j L_5j L_6j sont transmis respectivement à des dispositifs de quantification 18, à 18₃ pour calculer les coefficients L_ijconformément à la quatrième étape du procédé. Ces coefficients sont appliqués à des dispositifs de calcul d'erreur totale référencés respectivement de 19, à 19₃ pour fournir respectivement des erreurs de prédiction totale E₄ ², E_s ² + E₂ ² et enfin E,² + E₆ ² pour chacune des configurations 1 à 3 décrites précédemment. La chaîne de calcul 15₄ comprend relié au bus de données 14 un dispositif de quantification séparée 184 des coefficients L_ij. 
The device for implementing the method, shown in FIG. 6, comprises a device 1 for calculating the 10 autocorrelation coefficients of each frame, coupled to delay elements formed by three frame memories 12₁ to 12₃ that store the coefficients R_ij calculated in the first step of the method. It also includes a device 13 for calculating the coefficients K_ij and L_ij according to the second step of the method. A data bus 14 conveys the values of the coefficients L_ij (i = 1 to 3, j = 1 to 10) and the values of the coefficients R_i0 representing the energies, where i = 1 to 3. The data bus 14 connects the delay elements 12₁ to 12₃ and the calculation device 13 to four calculation chains referenced 15₁ to 15₄.
The calculation chains 15₁ to 15₃ each comprise a summing device, 16₁ to 16₃ respectively, which is connected to the delay elements 12₁ to 12₃ to calculate the coefficients R_4j, R_5j and R_6j according to the four configurations described previously. The outputs of the summing devices 16₁ to 16₃ are connected to devices 17₁ to 17₃ respectively, which calculate the coefficients K_4j and L_4j, K_5j and L_5j, and K_6j and L_6j. The coefficients L_4j, L_5j and L_6j are transmitted respectively to quantization devices 18₁ to 18₃ to calculate the quantized coefficients L_ij in accordance with the fourth step of the method. These coefficients are applied to total error calculation devices 19₁ to 19₃, which provide the total prediction errors E₄², E₅² + E₂² and E₁² + E₆² for configurations 1 to 3 described above. The calculation chain 15₄ comprises a separate quantization device 18₄ for the coefficients L_ij, connected to the data bus 14. The coefficients L_ij obtained at the output of the quantization device 18₄ are applied to a total error calculation device 19₄, which calculates the total error according to the relation E₁² + E₂² + E₃² defined above.
The outputs of the total error calculation devices 19₁ to 19₄ of the calculation chains 15₁ to 15₄ are applied to the respective inputs of a minimum total error search device 20. In addition, the outputs of the quantization devices 18₁ to 18₄, which supply the coefficients L_ij, are applied to a switching device 21 controlled by the output of the minimum total error search device 20, so as to select for transmission the coefficients L_ij that correspond to the minimum total error found by device 20. In this example, the output of the device comprises 35 bits: 33 bits representing the values of the coefficients L_ij obtained at the output of the switching device 21, and 2 bits identifying which of the four possible configurations was indicated by the minimum total error search device 20.
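The final selection stage can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `select_configuration`, `pack_packet` and `encode_packet` are hypothetical, each chain's 33 quantized bits are assumed to be already available as one integer, and placing the 2 configuration bits above the 33 coefficient bits is an arbitrary choice that the text does not specify.

```python
def select_configuration(total_errors):
    """Return the index (0..3) of the smallest total prediction error.

    Plays the role of the minimum total error search device 20.
    On a tie, the lowest index wins (an assumption of this sketch).
    """
    if len(total_errors) != 4:
        raise ValueError("expected one total error per configuration (4)")
    return min(range(4), key=lambda i: total_errors[i])


def pack_packet(config_index, coeff_bits):
    """Build a 35-bit word: 2 configuration bits, then 33 coefficient bits.

    The bit ordering is an assumption made for illustration only.
    """
    if not 0 <= config_index < 4:
        raise ValueError("configuration index must fit in 2 bits")
    if not 0 <= coeff_bits < (1 << 33):
        raise ValueError("coefficient payload must fit in 33 bits")
    return (config_index << 33) | coeff_bits


def encode_packet(total_errors, candidates):
    """Select the best chain and pack its payload.

    candidates[i] is the 33-bit quantized payload produced by chain i;
    choosing candidates[best] plays the role of the switching device 21.
    """
    best = select_configuration(total_errors)
    return pack_packet(best, candidates[best])
```

For instance, `encode_packet([0.9, 0.4, 0.7, 0.5], [10, 11, 12, 13])` selects the second chain (index 1) because its total error is the smallest, and the resulting word always fits in 35 bits.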

It goes without saying that the invention is not limited to the examples just described and that other variant embodiments are possible, depending in particular on the coefficients applied to the filters, which may differ from the coefficients L_ij defined above, and on the number of these coefficients, which may differ from 10. It is also clear that the invention can be applied to packets containing a number of frames other than three, or to a number of filtering configurations other than four, and that such variants naturally lead to total numbers of quantization bits other than (33 + 2) bits, with a different distribution per configuration.
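For the three-frame, four-configuration example, the 33-bit budget can be checked against the per-coefficient bit lengths listed in claim 8. The short Python sketch below does only that arithmetic. The names `ALLOCATIONS` and `total_bits` are hypothetical, and the intermediate-frame list of the fourth configuration is taken with ten entries, the last three being zero, which is an assumption consistent with ten coefficients and a 33-bit total.

```python
# Bits assigned to coefficients L1..L10, per configuration, following claim 8.
FULL = (5, 5, 4, 4, 4, 3, 2, 2, 2, 2)      # first configuration
MAIN = (5, 4, 4, 3, 3, 2, 2, 2, 0, 0)      # second/third configuration, first list
EXTRA = (3, 2, 2, 1, 0, 0, 0, 0, 0, 0)     # second/third configuration, second list
MIDDLE = (4, 4, 3, 3, 3, 2, 2, 0, 0, 0)    # fourth configuration, frame 2 (assumed 10 entries)
OUTER = (2, 2, 1, 1, 0, 0, 0, 0, 0, 0)     # fourth configuration, frames 1 and 3

ALLOCATIONS = {
    1: [FULL],
    2: [MAIN, EXTRA],
    3: [MAIN, EXTRA],
    4: [OUTER, MIDDLE, OUTER],
}


def total_bits(config):
    """Sum the bit lengths of every allocation list used by a configuration."""
    return sum(sum(lengths) for lengths in ALLOCATIONS[config])


if __name__ == "__main__":
    for config in sorted(ALLOCATIONS):
        print(config, total_bits(config))
```

Every configuration sums to 33 bits, so adding the 2 configuration bits gives the 35-bit output whichever configuration is selected.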

Claims

1. Method for coding predictive filters for very low bit rate vocoders of the type in which the voice signal is divided into binary frames of determined duration, characterized in that it consists in grouping (12₁ ... 12₃) the frames into packets of successive frames, in associating a predictor filter (1) with each frame contained in a packet, and in quantizing the coefficients of each predictor filter (5 ... 9) taking into account the stable (19) or non-stable configuration of the voice signal.

2. Method according to claim 1, characterized in that the number of frames in a packet is between 2 and 4 inclusive (12₁ ... 12₃).

3. Method according to claims 1 and 2, characterized in that the number of configurations is 4, 8 or 16.

4. Method according to claim 3, characterized in that it consists in limiting the choice of configurations to four:
a first configuration in which the predictor filters are identical, a second and a third configuration in which only two predictor filters are identical, and a fourth configuration in which the three predictor filters are different.

5. Method according to claim 4, characterized in that it consists in calculating (17, 18), for each configuration, the prediction coefficients and the energy (19) of the prediction error, in order to retain (20) only the prediction coefficients for which the prediction error is minimal.

6. Method according to claim 5, characterized in that, to calculate the prediction coefficients, it consists in calculating in each frame the autocorrelation coefficients R_{i,k} of the sampled voice signal and in applying the Leroux-Gueguen or Schur algorithm to determine the reflection coefficients of each predictor filter.

7. Method according to any one of claims 1 to 6, characterized in that the reflection coefficients L_{i,j} of the filters are 10 in number and are coded over a total length of 33 bits whatever the configuration.

8. Method according to claim 7, characterized in that the reflection coefficients L₁ to L₁₀ of the filters have respectively the following lengths:
(5,5,4,4,4,3,2,2,2,2) bits according to the first configuration;
(5,4,4,3,3,2,2,2,0,0) bits and (3,2,2,1,0,0,0,0,0,0) bits according to the second and third configurations;
(4,4,3,3,3,2,2,0,0,0) bits for coding the intermediate frame (frame 2) and (2,2,1,1,0,0,0,0,0,0) bits for each of the two other frames (frame 1 and frame 3) according to the fourth configuration.

9. Method according to claim 6, characterized in that the reflection coefficients of the filters are determined by the relation
L_{i,j} = K_{i,j} / (1 - K_{i,j}^2)^{-2}

10. Device for implementing the method according to any one of claims 1 to 9.
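
As an illustration of the reflection-coefficient computation named in claim 6, the following Python sketch runs a Schur-type recursion on the autocorrelation coefficients of one frame. It is a minimal sketch under stated assumptions, not the patented implementation: the function name `schur_reflection` is hypothetical, the sign convention (first coefficient equal to -R1/R0) is one of several in use, and the input is assumed to be a valid autocorrelation sequence with positive energy.

```python
def schur_reflection(r, order):
    """Reflection coefficients from autocorrelations r[0..order].

    Returns (k, err) where k holds `order` reflection coefficients and
    err is the residual prediction error energy r[0] * prod(1 - k_m**2).
    Sign convention (an assumption of this sketch): k[0] = -r[1] / r[0].
    """
    if len(r) < order + 1:
        raise ValueError("need order + 1 autocorrelation values")
    if r[0] <= 0:
        raise ValueError("r[0] (frame energy) must be positive")

    g = list(r[1:order + 1])  # forward sequence
    h = list(r[0:order])      # backward sequence; h[0] is the current error energy
    k = []
    err = float(r[0])
    for _ in range(order):
        km = -g[0] / h[0]
        k.append(km)
        err *= 1.0 - km * km
        # Both updates use the values of g and h from before this step.
        g, h = (
            [g[j + 1] + km * h[j + 1] for j in range(len(g) - 1)],
            [h[j] + km * g[j] for j in range(len(h) - 1)],
        )
    return k, err
```

With the autocorrelation of a first-order process, `schur_reflection([1.0, 0.5, 0.25], 2)`, the first coefficient is -0.5 and the second is zero, leaving a residual energy of 0.75. For a frame-based coder such as the one described, `order` would be 10.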