EP0734013A2 - Determination of an excitation vector in a CELP coder - Google Patents
Determination of an excitation vector in a CELP coder Download PDFInfo
- Publication number
- EP0734013A2 EP0734013A2 EP96410028A EP96410028A EP0734013A2 EP 0734013 A2 EP0734013 A2 EP 0734013A2 EP 96410028 A EP96410028 A EP 96410028A EP 96410028 A EP96410028 A EP 96410028A EP 0734013 A2 EP0734013 A2 EP 0734013A2
- Authority
- EP
- European Patent Office
- Prior art keywords
- excitation
- vector
- subset
- code
- codes
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
Images
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/10—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a multipulse excitation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L2019/0001—Codebooks
- G10L2019/0013—Codebook search algorithms
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L2019/0001—Codebooks
- G10L2019/0013—Codebook search algorithms
- G10L2019/0014—Selection criteria for distances
Definitions
- the present invention relates to the compression of speech signals to be transmitted over a telephone line and more particularly to the determination of an excitation vector in the context of compression according to the method of Linear Prediction Excited by Code (CELP).
- CELP Linear Prediction Excited by Code
- FIG. 1 very schematically represents a CELP compression circuit.
- Such a circuit is based on a modeling of the vocal cords and of the resonance chamber constituted by the oral cavities, the throat and the larynx. Such a compression method is therefore optimized for the processing of speech signals.
- the oral, throat and larynx cavities are modeled by a "linear prediction" filter 10, the transfer function of which generally comprises ten poles.
- the vocal cords are modeled by an excitation E treated by a comb filter 12.
- a digitized speech signal S is analyzed in frames by an analysis circuit 14.
- the analysis circuit 14 determines coefficients a 1 to a 10 of the transfer function of the filter 10, the step p of the filter comb 12, and a gain G applied to the excitation E at 16 at the input of the filter 12.
- the values a i , p and G are calculated for each frame to take account respectively of variations in the oral cavity, the frequency spectrum of the vocal cords and the amplitude of the sound. In this way an attempt is made to obtain that the output of the filter 10 is equal to the signal S.
- the coefficients a i , p, and G are transmitted so that a decoder which receives these coefficients reconstruct the corresponding frames of the signal S.
- the decoder must also know the excitation E to be used.
- the determination of the coefficients ai, p and G does not pose any particular problem.
- the procedure for finding the optimal excitation remains the heaviest in terms of computational load and it is always of great interest to simplify it even at the cost of a significant reduction in the quality of the compression.
- the excitation E was selected in a table 18 (called "codebook") containing several possible excitations which represented, in fact, pieces of white noise.
- codebook a table 18 containing several possible excitations which represented, in fact, pieces of white noise.
- a control circuit 20 scans the table 18 until the difference e, formed at 22, between the current frame of the signal S and the corresponding frame at the output of the filter 10 is minimal. (Of course, instead of comparing the signal S at the output of the filter 10, one can also compare the excitation E to the frame of the signal S having undergone the reverse processing of the filters 10 and 12.)
- the address C selecting the best excitation E in table 18 is supplied to a decoder provided with a homologous table.
- Each excitation contained in the table 18 is a sequence of digital samples corresponding respectively to the samples of each of the frames of the signal to be compressed.
- each sample of an excitation sequence can only take three values, namely, 0, 1 or -1 (ternary excitation sequence). We realized that this did not change the quality of the compression in a perceptible way.
- FIG. 2 represents an example of excitation sequence E which has been proposed to further reduce the complexity of the search.
- This excitation sequence is called binary. It includes several non-zero samples of values 1 and -1, two non-zero samples, or pulses, being separated by a constant number of zero samples, here 3.
- Such an excitation sequence can be represented by a binary number (or excitation code) C whose bits are associated with the pulses and correspond to the polarity of these. By doing so, the code C supplied by the control circuit 20 corresponds directly to an excitation sequence and the table 18 is deleted.
- the complexity is moreover reduced by the fact that the samples to be taken into account are reduced to pulses, the number of which is, in the example of FIG. 2, four times less than the total number of samples of a sequence.
- the structure of filters 10 and 12 is simplified.
- Each C code is associated with an excitation vector C having as components the values 1 and -1 corresponding to the bits 0 and 1 of the code C.
- m scal 2 (T, C i ) / mod 2 (FC i ), where C i is the excitation vector tested; where T is a target vector formed by samples of the frame being analyzed of the signal S having undergone the reverse processing of filters 10 and 12, these samples being those corresponding to the values 1 and -1 of the vector C i and where F denotes the matrix representing the transfer function of filters 10 and 12, in which only the rows corresponding to the values 1 and -1 of the vector C i have been kept.
- the notations scal (.,.) And mod (.,.) Denote the scalar product and the module respectively.
- the denominator of the criterion m is approximately constant, whatever the excitation vector C i .
- the criterion m is approximately maximized by maximizing the numerator. This numerator is maximized when each component of the excitation vector C i is that of the same sign as the corresponding sample of the target vector T.
- an approximate optimal excitation code is obtained directly by taking for its bits the sign bits of the target vector samples (or the inverse of the sign bits).
- An object of the present invention is to provide a method making it possible to limit the number of calculations necessary to maximize the aforementioned criterion m in the case where the usable excitation codes belong to a subset representative of a larger set.
- the present invention provides a method for determining an excitation vector associated with a frame of a speech signal to be compressed, said vector belonging to a subset associated with a larger set of vectors d excitation likely to maximize a criterion, and having as components values 1 and -1 corresponding to a sequence of excitation samples of a linear prediction filter.
- the criterion is equal to the square of the ratio between, on the one hand, the scalar product of the excitation vector by a target vector formed by samples of the frame having undergone inverse linear prediction filtering and, on the other hand, the module of the excitation vector having undergone direct linear prediction filtering.
- the method comprises the steps of preselecting an excitation vector having as components those of the same sign as the corresponding samples of the target vector, or those of inverse sign, and, if the preselected excitation vector does not belong to said sub- together, to select as the excitation vector the one which maximizes said criterion from the vectors of the subset which are respectively associated with the preselected vector and with the vectors closest to it in the larger set.
- the excitation vectors are associated with excitation codes, the bits of which correspond to the signs of the components of the excitation vectors, the subset of excitation codes associated with said sub- set of vectors being formed by binary values supplemented by error correction bits, each excitation code being associated with an excitation code of the subset by an error correction function.
- the method comprises the steps of forming the group comprising the preselected code associated with the preselected vector and the codes closest to it, so that each of these closest codes differs from the preselected code by a single bit, to be submitted the codes of this group to the error correction function to obtain a group of corrected codes belonging to the subset, and to select as excitation code, among the corrected codes, that associated with the vector which maximizes said criterion.
- the error correction bits are bits of a Hamming correcting code.
- This method is not directly applicable in the case where the possible excitation codes belong to a subset representative of a larger set, for example when this subset is formed from values of n bits to which one adds Nn bits of an error correcting code. Indeed, it is then very likely that the excitation code found does not belong to the subset. In this case, one could consider reducing the excitation code found to an excitation code belonging to the subset by applying to it an error correction function associated with the correction code. We then come across the excitation code of the subset which is closest to the excitation code found. The effect of this "error correction" is to modify at least one bit of the excitation code, this bit possibly having in some cases a great influence on the value of the criterion m, so that the final excitation code provides bad results.
- H Hamming correcting code denoted H (N, n, 3) is used below, where 3 denotes the minimum Hamming distance separating two elements belonging to the representative subset.
- the Hamming distance between two values is defined as the number of bitwise differences between these two values.
- a group of excitation codes comprising an initial code found by maximizing the simplified criterion m as well as all the other codes obtained from it by modifying a single bit.
- This has the consequence, by using a Hamming correction code allowing to correct a single bit (minimum Hamming distance 3), that each of the group excitation codes is close to a code distinct from the usable subset.
- the Hamming error correction function is applied to each of the codes in the group, which brings each of the codes in the group to the code closest to the subset.
- the corrected codes one retains as an approximate optimal code that which maximizes the complete criterion m, that is to say by calculating its numerator and its denominator.
- FIG. 3 represents in diagram form the method according to the invention which has just been described.
- the analyzed frame of the signal S undergoes the reverse processing at 24 of that of the filters 10 and 12 of FIG. 1.
- a target vector T is thus obtained, of which only the samples corresponding to the pulses of the excitation sequences are kept.
- samples of the vector T are retained only the sign bits (or their inverses) to provide an initial excitation code C 0 .
- This code C 0 is "corrupted" at 28 to form a group of codes comprising the code C 0 and all the other codes C 1 to C N obtained by modifying a single bit of the code C 0 .
- Each code C 0 to C N undergoes an "error correction" at 30 to provide a group of corrected codes C ' 0 to C' N.
- each of the vectors associated with the corrected codes is compared with the target vector T, and the optimal associated excitation code Copt is that associated with the vector which maximizes the complete criterion m.
- the position of the first pulse of the excitation sequences E is variable.
- this position can be one of the first four, which is determined by two additional bits to be transmitted to the decoder and which multiplies by four the number of excitation vectors to be tested.
Abstract
Description
La présente invention concerne la compression des signaux de parole à transmettre sur une ligne téléphonique et plus particulièrement la détermination d'un vecteur d'excitation dans le cadre d'une compression selon la méthode de la Prédiction Linéaire Excitée par Code (CELP).The present invention relates to the compression of speech signals to be transmitted over a telephone line and more particularly to the determination of an excitation vector in the context of compression according to the method of Linear Prediction Excited by Code (CELP).
La figure 1 représente très schématiquement un circuit de compression CELP. Un tel circuit est basé sur une modélisation des cordes vocales et de la chambre de résonance constituée par les cavités buccale, de la gorge et du larynx. Une telle méthode de compression est donc optimisée pour le traitement des signaux de parole.FIG. 1 very schematically represents a CELP compression circuit. Such a circuit is based on a modeling of the vocal cords and of the resonance chamber constituted by the oral cavities, the throat and the larynx. Such a compression method is therefore optimized for the processing of speech signals.
Les cavités buccale, de la gorge et du larynx sont modélisées par un filtre de "prédiction linéaire" 10 dont la fonction de transfert comporte en général dix pôles. Les cordes vocales sont modélisées par une excitation E traitée par un filtre en peigne 12.The oral, throat and larynx cavities are modeled by a "linear prediction"
Un signal de parole numérisé S est analysé par trames par un circuit d'analyse 14. A chaque trame, le circuit d'analyse 14 détermine des coefficients a1 à a10 de la fonction de transfert du filtre 10, le pas p du filtre en peigne 12, et un gain G appliqué à l'excitation E en 16 à l'entrée du filtre 12. Les valeurs ai, p et G sont calculées à chaque trame pour tenir compte respectivement des variations de la cavité buccale, du spectre de fréquence des cordes vocales et de l'amplitude du son. On tente de cette manière d'obtenir que la sortie du filtre 10 soit égale au signal S. Alors, au lieu de transmettre les échantillons du signal S, on transmet les coefficients ai, p, et G afin qu'un décodeur qui reçoit ces coefficients reconstitue les trames correspondantes du signal S.A digitized speech signal S is analyzed in frames by an
Bien entendu, le décodeur doit également connaître l'excitation E à utiliser. La détermination des coefficients ai, p et G ne pose pas de problème particulier. Toutefois, la procédure de recherche de l'excitation optimale reste la plus lourde en termes de charge de calcul et il est toujours d'un grand intérêt de la simplifier même au prix d'une sensible réduction de la qualité de la compression.Of course, the decoder must also know the excitation E to be used. The determination of the coefficients ai, p and G does not pose any particular problem. However, the procedure for finding the optimal excitation remains the heaviest in terms of computational load and it is always of great interest to simplify it even at the cost of a significant reduction in the quality of the compression.
A l'origine du codage CELP, l'excitation E était sélectionnée dans une table 18 (appelée "codebook") contenant plusieurs excitations possibles qui représentaient, en fait, des morceaux de bruit blanc. Dans ce cas, un circuit de commande 20 balaye la table 18 jusqu'à ce que la différence e, formée en 22, entre la trame courante du signal S et la trame correspondante en sortie du filtre 10 soit minimale. (Bien entendu, au lieu de comparer le signal S à la sortie du filtre 10, on peut également comparer l'excitation E à la trame du signal S ayant subi le traitement inverse des filtres 10 et 12.)At the origin of the CELP coding, the excitation E was selected in a table 18 (called "codebook") containing several possible excitations which represented, in fact, pieces of white noise. In this case, a
Avec cette technique, outre les coefficients ai, p et G, on fournit l'adresse C sélectionnant la meilleure excitation E dans la table 18 à un décodeur muni d'une table homologue.With this technique, in addition to the coefficients a i , p and G, the address C selecting the best excitation E in table 18 is supplied to a decoder provided with a homologous table.
Chaque excitation contenue dans la table 18 est une séquence d'échantillons numériques correspondant respectivement aux échantillons de chacune des trames du signal à comprimer. Pour que la compression soit de qualité acceptable, il est nécessaire de stocker un nombre relativement important, de l'ordre de 1000, de séquences d'excitation.Each excitation contained in the table 18 is a sequence of digital samples corresponding respectively to the samples of each of the frames of the signal to be compressed. For the compression to be of acceptable quality, it is necessary to store a relatively large number, of the order of 1000, of excitation sequences.
Pour limiter la complexité de la procédure de recherche, on a proposé que chaque échantillon d'une séquence d'excitation ne puisse prendre que trois valeurs, à savoir, 0, 1 ou -1 (séquence d'excitation ternaire). On s'est aperçu que cela ne modifiait pas la qualité de la compression de manière perceptible.To limit the complexity of the search procedure, it has been proposed that each sample of an excitation sequence can only take three values, namely, 0, 1 or -1 (ternary excitation sequence). We realized that this did not change the quality of the compression in a perceptible way.
La figure 2 représente un exemple de séquence d'excitation E que l'on a proposé pour réduire davantage la complexité de la recherche. Cette séquence d'excitation est dite binaire. Elle comprend plusieurs échantillons non-nuls de valeurs 1 et -1, deux échantillons non-nuls, ou impulsions, étant séparés par un nombre constant d'échantillons nuls, ici 3. Une telle séquence d'excitation peut être représentée par un nombre binaire (ou code d'excitation) C dont les bits sont associés aux impulsions et correspondent à la polarité de celles-ci. En procédant ainsi, le code C fourni par le circuit de commande 20 correspond directement à une séquence d'excitation et la table 18 est supprimée. La complexité est par ailleurs réduite du fait que les échantillons à prendre en compte se réduisent aux impulsions, dont le nombre est, dans l'exemple de la figure 2, quatre fois inférieur au nombre total d'échantillons d'une séquence. En outre, la structure des filtres 10 et 12 est simplifiée.FIG. 2 represents an example of excitation sequence E which has been proposed to further reduce the complexity of the search. This excitation sequence is called binary. It includes several non-zero samples of
Cette technique dégrade très légèrement la qualité de la compression, mais cette dégradation est compensée de manière simple par un traitement destiné à supprimer les effets de la régularité de l'espacement des échantillons non-nuls.This technique very slightly degrades the quality of the compression, but this degradation is compensated for in a simple way by a treatment intended to remove the effects of the regularity of the spacing of the non-zero samples.
A chaque code C est associé un vecteur d'excitation C ayant pour composantes les valeurs 1 et -1 correspondant aux bits à 0 et 1 du code C. Les termes "vecteur" et "code" sont confondus dans la suite de la description.Each C code is associated with an excitation vector C having as components the
Afin de réduire davantage le nombre d'essais pour minimiser l'erreur, on a proposé de limiter le nombre de vecteurs ou codes d'excitation possibles à un sous-ensemble représentatif d'un ensemble plus grand. L'article intitulé "A Comparison of some Algebraic Structures for CELP Coding of Speech" de J.P. Adoul et C. Lamblin dans Proc. ICASSP, 1987, décrit une telle méthode. Pour créer un sous-ensemble représentatif de tous les codes C de N bits, on forme l'ensemble des valeurs de n bits (n<N), dont chacune est complétée par N-n bits de correction d'erreur.In order to further reduce the number of tests to minimize the error, it has been proposed to limit the number of possible vectors or excitation codes to a subset representative of a larger set. The article entitled "A Comparison of some Algebraic Structures for CELP Coding of Speech "by JP Adoul and C. Lamblin in Proc. ICASSP, 1987, describes such a method. To create a representative subset of all N-bit C codes, we form the set of values of n bits (n <N), each of which is supplemented by Nn error correction bits.
Pour trouver le meilleur vecteur d'excitation C, on cherche généralement à maximiser un critère de sélection défini par :
L'essai de tous les vecteurs d'excitation Ci selon ce critère représente un nombre de calculs considérable à effectuer entre les arrivées de deux trames du signal S.The test of all the excitation vectors C i according to this criterion represents a considerable number of calculations to be carried out between the arrivals of two frames of the signal S.
Il a été constaté que le dénominateur du critère m est approximativement constant, quel que soit le vecteur d'excitation Ci. Ainsi, le critère m est approximativement maximisé en maximisant le numérateur. Ce numérateur est maximisé lorsque chaque composante du vecteur d'excitation Ci est celle de même signe que l'échantillon correspondant du vecteur cible T. En d'autres termes, un code d'excitation optimal approximatif est obtenu directement en prenant pour ses bits les bits de signe des échantillons du vecteur cible (ou les inverses des bits de signe).It was found that the denominator of the criterion m is approximately constant, whatever the excitation vector C i . Thus, the criterion m is approximately maximized by maximizing the numerator. This numerator is maximized when each component of the excitation vector C i is that of the same sign as the corresponding sample of the target vector T. In other words, an approximate optimal excitation code is obtained directly by taking for its bits the sign bits of the target vector samples (or the inverse of the sign bits).
Cette solution n'est pas applicable dans le cas où les codes d'excitation utilisables sont limités à un sous-ensemble représentatif d'un ensemble plus grand obtenu, par exemple, à l'aide d'un code correcteur d'erreurs.This solution is not applicable in the case where the usable excitation codes are limited to a subset representative of a larger set obtained, for example, using an error correcting code.
Un objet de la présente invention est de prévoir un procédé permettant de limiter le nombre de calculs nécessaires pour maximiser le critère m susmentionné dans le cas où les codes d'excitation utilisables appartiennent à un sous-ensemble représentatif d'un ensemble plus grand.An object of the present invention is to provide a method making it possible to limit the number of calculations necessary to maximize the aforementioned criterion m in the case where the usable excitation codes belong to a subset representative of a larger set.
Pour atteindre cet objet, la présente invention prévoit un procédé de détermination d'un vecteur d'excitation associé à une trame d'un signal de parole à comprimer, ledit vecteur appartenant à un sous-ensemble associé à un ensemble plus grand de vecteurs d'excitation susceptibles de maximiser un critère, et ayant pour composantes des valeurs 1 et -1 correspondant à une séquence d'échantillons d'excitation d'un filtre de prédiction linéaire. Le critère est égal au carré du rapport entre, d'une part, le produit scalaire du vecteur d'excitation par un vecteur cible formé par des échantillons de la trame ayant subi un filtrage de prédiction linéaire inverse et, d'autre part, le module du vecteur d'excitation ayant subi un filtrage de prédiction linéaire direct. Le procédé comprend les étapes consistant à présélectionner un vecteur d'excitation ayant pour composantes celles de même signe que les échantillons correspondants du vecteur cible, ou celles de signe inverse, et, si le vecteur d'excitation présélectionné n'appartient pas audit sous-ensemble, à sélectionner comme vecteur d'excitation celui qui maximise ledit critère parmi les vecteurs du sous-ensemble qui sont associés respectivement au vecteur présélectionné et aux vecteurs les plus proches de celui-ci dans l'ensemble plus grand.To achieve this object, the present invention provides a method for determining an excitation vector associated with a frame of a speech signal to be compressed, said vector belonging to a subset associated with a larger set of vectors d excitation likely to maximize a criterion, and having as
Selon un mode de réalisation de la présente invention, les vecteurs d'excitation sont associés à des codes d'excitation dont les bits correspondent aux signes des composantes des vecteurs d'excitation, le sous-ensemble de codes d'excitation associé audit sous-ensemble de vecteurs étant formé par des valeurs binaires complétées par des bits de correction d'erreurs, tout code d'excitation étant associé à un code d'excitation du sous-ensemble par une fonction de correction d'erreurs. Le procédé comprend les étapes consistant à former le groupe comprenant le code présélectionné associé au vecteur présélectionné et les codes les plus proches de celui-ci, de manière que chacun de ces codes les plus proches diffère du code présélectionné par un seul bit, à soumettre les codes de ce groupe à la fonction de correction d'erreur pour obtenir un groupe de codes corrigés appartenant au sous-ensemble, et à sélectionner comme code d'excitation, parmi les codes corrigés, celui associé au vecteur qui maximise ledit critère.According to an embodiment of the present invention, the excitation vectors are associated with excitation codes, the bits of which correspond to the signs of the components of the excitation vectors, the subset of excitation codes associated with said sub- set of vectors being formed by binary values supplemented by error correction bits, each excitation code being associated with an excitation code of the subset by an error correction function. The method comprises the steps of forming the group comprising the preselected code associated with the preselected vector and the codes closest to it, so that each of these closest codes differs from the preselected code by a single bit, to be submitted the codes of this group to the error correction function to obtain a group of corrected codes belonging to the subset, and to select as excitation code, among the corrected codes, that associated with the vector which maximizes said criterion.
Selon un mode de réalisation de la présente invention, les bits de correction d'erreurs sont des bits d'un code correcteur de Hamming.According to an embodiment of the present invention, the error correction bits are bits of a Hamming correcting code.
Ces objets, caractéristiques et avantages ainsi que d'autres de la présente invention seront exposés en détail dans la description suivante de modes de réalisation particuliers, faite à titre non-limitatif à l'aide des figures jointes parmi lesquelles :
- la figure 1, précédemment décrite, illustre une méthode de compression CELP ;
- la figure 2, précédemment décrite, représente un exemple de séquence d'excitation et du code correspondant ; et
- la figure 3 illustre des traitements à effectuer selon la présente invention pour sélectionner un vecteur d'excitation optimal dans le cas où celui-ci appartient à un sous-ensemble obtenu en utilisant un code correcteur d'erreurs.
- Figure 1, previously described, illustrates a CELP compression method;
- FIG. 2, previously described, shows an example of excitation sequence and the corresponding code; and
- FIG. 3 illustrates the processing operations to be carried out according to the present invention for selecting an optimal excitation vector in the case where it belongs to a subset obtained using an error correcting code.
Pour maximiser le critère m susmentionné, on a constaté que le dénominateur de ce critère, le carré du module du vecteur FCi, est approximativement constant, quel que soit le vecteur d'excitation Ci. Cette approximation est relativement bonne puisque le module du vecteur Ci est constant. Ainsi, pour maximiser le critère m de manière approchée, il suffit de maximiser un critère simplifié qui est le produit scalaire du vecteur cible T par le vecteur d'excitation Ci. Ce produit scalaire est maximal lorsque chaque composante (1 ou -1) du vecteur Ci est celle de même signe que l'échantillon correspondant du vecteur cible T. On obtient ainsi directement un vecteur d'excitation optimal approché Copt à partir du vecteur cible T.To maximize the above-mentioned criterion m, it has been found that the denominator of this criterion, the square of the module of the vector FC i , is approximately constant, whatever the excitation vector C i . This approximation is relatively good since the modulus of the vector C i is constant. So, to maximize the criterion m in an approximate way, it suffices to maximize a simplified criterion which is the scalar product of the target vector T by the excitation vector C i . This scalar product is maximal when each component (1 or -1) of the vector C i is that of the same sign as the corresponding sample of the target vector T. We thus obtain directly an optimal excitation vector approximated Copt from the target vector T.
Ce procédé n'est pas directement applicable dans le cas où les codes d'excitation possibles appartiennent à un sous-ensemble représentatif d'un ensemble plus grand, par exemple lorsque ce sous-ensemble est formé à partir de valeurs de n bits auxquels on ajoute N-n bits d'un code correcteur d'erreurs. En effet, il est alors très probable que le code d'excitation trouvé n'appartienne pas au sous-ensemble. Dans ce cas, on pourrait envisager de ramener le code d'excitation trouvé à un code d'excitation appartenant au sous-ensemble en lui appliquant une fonction de correction d'erreur associée au code correcteur. On tombe alors sur le code d'excitation du sous-ensemble qui est le plus proche du code d'excitation trouvé. Cette "correction d'erreur" a pour conséquence de modifier au moins un bit du code d'excitation, ce bit pouvant avoir dans certains cas une grande influence sur la valeur du critère m, de manière que le code d'excitation final fournisse de mauvais résultats.This method is not directly applicable in the case where the possible excitation codes belong to a subset representative of a larger set, for example when this subset is formed from values of n bits to which one adds Nn bits of an error correcting code. Indeed, it is then very likely that the excitation code found does not belong to the subset. In this case, one could consider reducing the excitation code found to an excitation code belonging to the subset by applying to it an error correction function associated with the correction code. We then come across the excitation code of the subset which is closest to the excitation code found. The effect of this "error correction" is to modify at least one bit of the excitation code, this bit possibly having in some cases a great influence on the value of the criterion m, so that the final excitation code provides bad results.
A titre d'exemple, on utilise ci-après un code correcteur de Hamming noté H(N, n, 3), où 3 désigne la distance minimale de Hamming séparant deux éléments appartenant au sous-ensemble représentatif. La distance de Hamming entre deux valeurs est définie comme étant le nombre de différences bit à bit entre ces deux valeurs. Avec cette solution, on crée un sous-ensemble de 2n vecteurs d'excitation de N bits.By way of example, a Hamming correcting code denoted H (N, n, 3) is used below, where 3 denotes the minimum Hamming distance separating two elements belonging to the representative subset. The Hamming distance between two values is defined as the number of bitwise differences between these two values. With this solution, we create a subset of 2 n excitation vectors of N bits.
Selon l'invention, on forme un groupe de codes d'excitation comprenant un code initial trouvé en maximisant le critère m simplifié ainsi que tous les autres codes obtenus à partir de celui-ci en modifiant un seul bit. Ceci a pour conséquence, en utilisant un code correcteur de Hamming permettant de corriger un bit unique (distance de Hamming minimale 3), que chacun des codes d'excitation du groupe est proche d'un code distinct du sous-ensemble utilisable. Ensuite, on applique à chacun des codes du groupe la fonction de correction d'erreur de Hamming, ce qui ramène chacun des codes du groupe au code le plus proche du sous-ensemble. On obtient un groupe de codes "corrigés" appartenant au sous-ensemble et qui "entoure" le code initialement trouvé. Parmi les codes corrigés, on retient comme code optimal approché celui qui maximise le critère m complet, c'est-à-dire en calculant son numérateur et son dénominateur.According to the invention, a group of excitation codes is formed comprising an initial code found by maximizing the simplified criterion m as well as all the other codes obtained from it by modifying a single bit. This has the consequence, by using a Hamming correction code allowing to correct a single bit (minimum Hamming distance 3), that each of the group excitation codes is close to a code distinct from the usable subset. Next, the Hamming error correction function is applied to each of the codes in the group, which brings each of the codes in the group to the code closest to the subset. We obtain a group of "corrected" codes belonging to the subset and which "surrounds" the code initially found. Among the corrected codes, one retains as an approximate optimal code that which maximizes the complete criterion m, that is to say by calculating its numerator and its denominator.
La figure 3 représente sous forme de schéma le procédé selon l'invention qui vient d'être décrit. La trame analysée du signal S subit en 24 le traitement inverse de celui des filtres 10 et 12 de la figure 1. On obtient ainsi un vecteur cible T, dont on ne garde que les échantillons correspondant aux impulsions des séquences d'excitation. En 26, on ne retient des échantillons du vecteur T que les bits de signe (ou leurs inverses) pour fournir un code d'excitation initial C0. Ce code C0 est "corrompu" en 28 pour former un groupe de codes comprenant le code C0 et tous les autres codes C1 à CN obtenus en modifiant un seul bit du code C0. Chaque code C0 à CN subit en 30 une "correction d'erreur" pour fournir un groupe de codes corrigés C'0 à C'N. En 32, chacun des vecteurs associés aux codes corrigés est comparé au vecteur cible T, et on retient comme code d'excitation optimal approché Copt celui associé au vecteur qui maximise le critère m complet.FIG. 3 represents in diagram form the method according to the invention which has just been described. The analyzed frame of the signal S undergoes the reverse processing at 24 of that of the
Généralement, pour obtenir de meilleurs résultats, la position de la première impulsion des séquences d'excitation E est variable. Dans l'exemple de la figure 2, cette position peut être l'une des quatre premières, ce qui est déterminé par deux bits supplémentaires à transmettre au décodeur et qui multiplie par quatre le nombre de vecteurs d'excitation à essayer. Dans ce cas, on commence par former, pour chacune des quatre positions possibles, un vecteur cible et un vecteur d'excitation dont les composantes sont celles de même signe que celles du vecteur cible correspondant. On retient comme vecteur d'excitation optimal approché, parmi les quatre vecteurs ainsi obtenus, celui qui maximise le critère m complet.Generally, to obtain better results, the position of the first pulse of the excitation sequences E is variable. In the example of FIG. 2, this position can be one of the first four, which is determined by two additional bits to be transmitted to the decoder and which multiplies by four the number of excitation vectors to be tested. In this case, we start by forming, for each of the four possible positions, a target vector and an excitation vector whose components are those of the same sign as those of the corresponding target vector. One retains as approximate optimal excitation vector, among the four vectors thus obtained, that which maximizes the complete criterion m.
Claims (3)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR9503735 | 1995-03-24 | ||
FR9503735A FR2732148B1 (en) | 1995-03-24 | 1995-03-24 | DETERMINATION OF AN EXCITATION VECTOR IN A CELP ENCODER |
Publications (3)
Publication Number | Publication Date |
---|---|
EP0734013A2 true EP0734013A2 (en) | 1996-09-25 |
EP0734013A3 EP0734013A3 (en) | 1997-05-28 |
EP0734013B1 EP0734013B1 (en) | 2001-08-22 |
Family
ID=9477567
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP96410028A Expired - Lifetime EP0734013B1 (en) | 1995-03-24 | 1996-03-19 | Determination of an excitation vector in a CELP coder |
Country Status (5)
Country | Link |
---|---|
US (1) | US5719994A (en) |
EP (1) | EP0734013B1 (en) |
JP (1) | JPH0990996A (en) |
DE (1) | DE69614594D1 (en) |
FR (1) | FR2732148B1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0773533B1 (en) * | 1995-11-09 | 2000-04-26 | Nokia Mobile Phones Ltd. | Method of synthesizing a block of a speech signal in a CELP-type coder |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0294012A2 (en) * | 1987-06-05 | 1988-12-07 | AT&T Corp. | Resisting the effects of channel noise in digital transmission of information |
EP0515138A2 (en) * | 1991-05-20 | 1992-11-25 | Nokia Mobile Phones Ltd. | Digital speech coder |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5138661A (en) * | 1990-11-13 | 1992-08-11 | General Electric Company | Linear predictive codeword excited speech synthesizer |
-
1995
- 1995-03-24 FR FR9503735A patent/FR2732148B1/en not_active Expired - Fee Related
-
1996
- 1996-03-19 DE DE69614594T patent/DE69614594D1/en not_active Expired - Lifetime
- 1996-03-19 EP EP96410028A patent/EP0734013B1/en not_active Expired - Lifetime
- 1996-03-22 JP JP8065885A patent/JPH0990996A/en active Pending
- 1996-03-22 US US08/621,084 patent/US5719994A/en not_active Expired - Lifetime
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0294012A2 (en) * | 1987-06-05 | 1988-12-07 | AT&T Corp. | Resisting the effects of channel noise in digital transmission of information |
EP0515138A2 (en) * | 1991-05-20 | 1992-11-25 | Nokia Mobile Phones Ltd. | Digital speech coder |
Non-Patent Citations (4)
Title |
---|
IEEE TRANSACTIONS ON COMMUNICATIONS, vol. 42, no. 2/3/4, 1 Février 1994, pages 248-251, XP000445940 RAMABADRAN ET AL.: "Complexity reduction of CELP speech coders through the use of phase information" * |
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING,, vol. 1, no. 3, Juillet 1993, US, pages 315-325, XP000388575 AHMED ET AL.: "Fast methods for code search in CELP" * |
INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING 1987, vol. 4, 6 - 9 Avril 1987, DALLAS, TX, US, pages 1953-1956, XP000568035 ADOUL ET AL.: "A comparison of some algebraic structures for CELP coding of speech" * |
PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING (ICSLP), vol. 1, 18 - 22 Novembre 1990, KOBE, JP, pages 49-52, XP000503311 LEE ET AL.: "An Improved Method for Multipulse Speech Analysis" * |
Also Published As
Publication number | Publication date |
---|---|
DE69614594D1 (en) | 2001-09-27 |
EP0734013A3 (en) | 1997-05-28 |
FR2732148B1 (en) | 1997-06-13 |
JPH0990996A (en) | 1997-04-04 |
EP0734013B1 (en) | 2001-08-22 |
FR2732148A1 (en) | 1996-09-27 |
US5719994A (en) | 1998-02-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP0749626B1 (en) | Speech coding method using linear prediction and algebraic code excitation | |
EP1994531B1 (en) | Improved celp coding or decoding of a digital audio signal | |
EP0481895B1 (en) | Method and apparatus for low bit rate transmission of a speech signal using CELP coding | |
EP0782128A1 (en) | Method of analysing by linear prediction an audio frequency signal, and its application to a method of coding and decoding an audio frequency signal | |
EP0608174A1 (en) | System for predictive encoding/decoding of a digital speech signal by an adaptive transform with embedded codes | |
EP0428445B1 (en) | Method and apparatus for coding of predictive filters in very low bitrate vocoders | |
FR2702075A1 (en) | A method of generating a spectral weighting filter of noise in a speech coder. | |
EP1836699B1 (en) | Method and device for carrying out optimized audio coding between two long-term prediction models | |
EP0685833A1 (en) | Method for speech coding using linear prediction | |
CA2340028C (en) | Neural network and its use for speech recognition | |
EP0734013B1 (en) | Determination of an excitation vector in a CELP coder | |
EP0616315A1 (en) | Digital speech coding and decoding device, process for scanning a pseudo-logarithmic LTP codebook and process of LTP analysis | |
EP0573358B1 (en) | Variable speed voice synthesizer method and apparatus | |
EP0347307B1 (en) | Coding method and linear prediction speech coder | |
EP1192619B1 (en) | Audio coding and decoding by interpolation | |
EP0469997B1 (en) | Coding method and speech coder using linear prediction analysis | |
EP1605440A1 (en) | Method for signal source separation from a mixture signal | |
EP1192618B1 (en) | Audio coding with adaptive liftering | |
EP1192621B1 (en) | Audio encoding with harmonic components | |
EP0796490B1 (en) | Signal prediction method and device for a speech coder | |
FR2709366A1 (en) | Method of storing reflection coefficient vectors | |
EP1190414A1 (en) | Encoding and decoding with harmonic components and minimum phase | |
EP0543719A1 (en) | Method and arrangement for voiced-unvoiced decision applied in a very low rate vocoder | |
WO2001003119A1 (en) | Audio encoding and decoding including non harmonic components of the audio signal | |
FR2739964A1 (en) | Speech signal transmission method requiring reduced data flow rate |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
AK | Designated contracting states |
Kind code of ref document: A2 Designated state(s): DE FR GB IT |
|
PUAL | Search report despatched |
Free format text: ORIGINAL CODE: 0009013 |
|
AK | Designated contracting states |
Kind code of ref document: A3 Designated state(s): DE FR GB IT |
|
17P | Request for examination filed |
Effective date: 19971030 |
|
RAP3 | Party data changed (applicant data changed or rights of an application transferred) |
Owner name: STMICROELECTRONICS S.A. |
|
17Q | First examination report despatched |
Effective date: 19991001 |
|
RIC1 | Information provided on ipc code assigned before grant |
Free format text: 7G 10L 19/10 A |
|
GRAG | Despatch of communication of intention to grant |
Free format text: ORIGINAL CODE: EPIDOS AGRA |
|
GRAG | Despatch of communication of intention to grant |
Free format text: ORIGINAL CODE: EPIDOS AGRA |
|
GRAH | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOS IGRA |
|
GRAH | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOS IGRA |
|
GRAA | (expected) grant |
Free format text: ORIGINAL CODE: 0009210 |
|
AK | Designated contracting states |
Kind code of ref document: B1 Designated state(s): DE FR GB IT |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: IT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRE;WARNING: LAPSES OF ITALIAN PATENTS WITH EFFECTIVE DATE BEFORE 2007 MAY HAVE OCCURRED AT ANY TIME BEFORE 2007. THE CORRECT EFFECTIVE DATE MAY BE DIFFERENT FROM THE ONE RECORDED.SCRIBED TIME-LIMIT Effective date: 20010822 |
|
REF | Corresponds to: |
Ref document number: 69614594 Country of ref document: DE Date of ref document: 20010927 |
|
GBT | Gb: translation of ep patent filed (gb section 77(6)(a)/1977) |
Effective date: 20011005 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: DE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20011123 |
|
RAP2 | Party data changed (patent owner data changed or rights of a patent transferred) |
Owner name: STMICROELECTRONICS S.A. |
|
REG | Reference to a national code |
Ref country code: GB Ref legal event code: IF02 |
|
PLBE | No opposition filed within time limit |
Free format text: ORIGINAL CODE: 0009261 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
|
26N | No opposition filed | ||
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: GB Payment date: 20060816 Year of fee payment: 11 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: FR Payment date: 20060825 Year of fee payment: 11 |
|
GBPC | Gb: european patent ceased through non-payment of renewal fee |
Effective date: 20070319 |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: ST Effective date: 20071130 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: GB Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20070319 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: FR Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20070402 |