EP2347411B1 - Pre-echo attenuation in a digital audio signal

Info

Publication number: EP2347411B1
Application number: EP09747881A
Authority: EP
Inventors: Balazs Kovesi; Stéphane RAGOT
Original assignee: France Telecom SA
Current assignee: Orange SA
Priority date: 2008-09-17
Filing date: 2009-09-15
Publication date: 2012-12-05
Anticipated expiration: 2029-09-15
Also published as: CN102160114A; KR20110076936A; ES2400987T3; JP2012503214A; US8676365B2; EP2347411A1; US20110178617A1; KR101655913B1; JP5295372B2; CN102160114B; RU2011115003A; RU2481650C2; WO2010031951A1

Description

The invention relates to a method and a device for attenuating pre-echoes when decoding a digital audio signal.

For transporting digital audio signals over transmission networks, whether fixed or mobile, or for storing such signals, compression (source coding) processes are used; these rely on coding systems of the time-domain coding type or of the transform-based frequency-domain coding type.

The method and the device that are the subject of the invention are thus applicable to the compression of sound signals, in particular digital audio signals coded by frequency transform.

Figure 1 shows, by way of illustration, a block diagram of the transform coding and decoding of a digital audio signal, including overlap-add analysis-synthesis, according to the prior art.

Certain musical sequences, such as percussion, and certain speech segments, such as plosives (/k/, /t/, ...), are characterized by extremely abrupt attacks, which result in very fast transitions and a very large variation in signal dynamics within a few samples. An example of a transition is given in Figure 1, starting at sample 410.

For coding and decoding, the input signal is divided into blocks of samples of length L (shown here by dotted vertical lines). The input signal is denoted x(n). Division into successive blocks defines the blocks x_N = [x(N·L) ... x(N·L+L-1)] = [x_N(0) ... x_N(L-1)], where N is the frame index and L is the frame length. In Figure 1, L = 160 samples. In the case of the modified discrete cosine transform (MDCT), two blocks x_N(n) and x_{N+1}(n) are analyzed jointly to produce a block of transform coefficients associated with the frame of index N.
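The block division above can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, the signal is a stand-in ramp rather than audio, and no actual MDCT is computed; the sketch only shows which samples form each block x_N and which pair of blocks is analyzed jointly for frame N.

```python
def split_into_frames(x, L):
    """Return the blocks x_N = [x(N*L) ... x(N*L + L - 1)]; an incomplete tail is dropped."""
    return [x[N * L:(N + 1) * L] for N in range(len(x) // L)]


def joint_analysis_inputs(frames):
    """Pair x_N with x_{N+1}: the 2L samples analyzed together for frame N."""
    return [frames[N] + frames[N + 1] for N in range(len(frames) - 1)]


L = 160                               # frame length used in Figure 1
x = list(range(4 * L))                # stand-in signal: sample value equals sample index
frames = split_into_frames(x, L)
windows = joint_analysis_inputs(frames)

# The transition at sample 410 falls in the frame with this index.
frame_of_transition = 410 // L
print(len(frames), len(windows), frame_of_transition)
```

With L = 160, sample 410 lies in the block covering samples 320 to 479, and that block is also part of the 2L-sample window starting at sample 160.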

The division into blocks, also called frames, performed by transform coding is entirely independent of the sound signal, so transitions can appear at any point in the analysis window. After transform decoding, however, the reconstructed signal is corrupted by "noise" (or distortion) generated by the quantization (Q) and inverse quantization (Q^-1) operation. This coding noise is spread relatively uniformly in time over the entire temporal support of the transformed block, that is, over the full length of the 2L-sample window (with an overlap of L samples). The energy of the coding noise is generally proportional to the energy of the block and depends on the decoding bit rate.

For a block containing an attack (such as block 320-480 in Figure 1), the signal energy is high, so the noise level is also high.

In transform coding, the level of the coding noise is lower than that of the signal for the high-energy samples that immediately follow the transition, but it is higher than that of the signal for the lower-energy samples, in particular in the part preceding the transition (samples 160-410 in Figure 1). For that part, the signal-to-noise ratio is negative, and the resulting degradation can be very annoying to the listener. The coding noise preceding the transition is called pre-echo, and the noise following the transition is called post-echo.
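As a rough numerical illustration of the negative signal-to-noise ratio described above, the sketch below assumes a single 160-sample block made of a quiet part followed by a loud attack, and models the coding noise as uniform over the block with a power 20 dB below the block's mean power. The amplitudes, the 20 dB figure, and the function names are illustrative assumptions, not values taken from the patent.

```python
import math


def mean_power(samples):
    """Average power: mean of the squared samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels."""
    return 10.0 * math.log10(signal_power / noise_power)


# Hypothetical block: 90 quiet samples, then a 70-sample attack.
quiet = [0.01] * 90
loud = [1.0] * 70
block = quiet + loud

# Assumption: noise is spread uniformly over the block, 20 dB below its mean power.
noise_power = mean_power(block) / 100.0

snr_before = snr_db(mean_power(quiet), noise_power)  # part preceding the transition
snr_after = snr_db(mean_power(loud), noise_power)    # part following the transition
print(f"SNR before attack: {snr_before:.1f} dB, after attack: {snr_after:.1f} dB")
```

Under these assumptions the same noise floor gives a negative ratio before the attack and a comfortably positive one after it, which is the asymmetry the text describes.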

Figure 1 shows that the pre-echo affects the frame preceding the transition as well as the frame in which the transition occurs.

Psychoacoustic experiments have shown that the human ear performs only fairly limited temporal pre-masking of sounds, on the order of a few milliseconds. The noise preceding the attack, or pre-echo, is audible when its duration exceeds that of the pre-masking.

The human ear also performs post-masking of longer duration, from 5 to 60 milliseconds, when passing from high-energy sequences to low-energy sequences. The acceptable rate or level of annoyance is therefore higher for post-echoes than for pre-echoes.

The pre-echo phenomenon, which is the more critical of the two, becomes more annoying as the block length in samples increases. In transform coding, however, the most significant frequency regions must be resolved faithfully. At a fixed sampling frequency and a fixed bit rate, increasing the number of points in the window makes more bits available to code the spectral lines deemed useful by the psychoacoustic model, hence the advantage of using long blocks. MPEG AAC (Advanced Audio Coding), for example, uses a long window containing a fixed number of samples, 2048, which corresponds to a duration of 64 ms at a sampling frequency of 32 kHz. Transform coders used for conversational applications often use a 40 ms window at 16 kHz and a frame update period of 20 ms.
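The window figures quoted above follow from simple arithmetic, reproduced here as a small check. The helper names are hypothetical.

```python
def window_duration_ms(num_samples, sample_rate_hz):
    """Duration in milliseconds of a window of num_samples at the given sampling rate."""
    return 1000.0 * num_samples / sample_rate_hz


def window_length_samples(duration_ms, sample_rate_hz):
    """Number of samples in a window of the given duration."""
    return round(duration_ms * sample_rate_hz / 1000.0)


aac_ms = window_duration_ms(2048, 32000)             # long AAC window at 32 kHz
conv_window = window_length_samples(40, 16000)       # 40 ms window at 16 kHz
conv_frame = window_length_samples(20, 16000)        # 20 ms frame update at 16 kHz
print(aac_ms, conv_window, conv_frame)
```

The conversational case gives a window twice as long as the frame update, consistent with a 2L-sample window advancing by L samples.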

Various solutions have been proposed to date to reduce the annoying effect of the pre-echo phenomenon described above.

A first solution consists in applying adaptive filtering. In the zone preceding the transition caused by the attack, the reconstructed signal in fact consists of the original signal with the quantization noise superimposed on it.

A corresponding filtering technique was described in the article "High Quality Audio Transform Coding at 64 kbits", IEEE Trans. on Communications, Vol. 42, No. 11, November 1994, by Y. Mahieux and J. P. Petit.

Implementing such filtering requires knowledge of parameters, some of which are estimated at the decoder from the noisy samples. In contrast, information such as the energy of the original signal can be known only at the coder and must therefore be transmitted. When the received block contains an abrupt variation in dynamics, the filtering is applied to it.

This filtering process does not recover the original signal, but it strongly reduces pre-echoes. It does, however, require additional auxiliary parameters to be transmitted to the decoder.

A technique that does not require the transmission of auxiliary parameters is described in French patent application FR 06 01466. The method described makes it possible to detect the presence of pre-echoes and to attenuate them in a digital audio signal generated by hierarchical coding (producing a multi-layer bitstream) that combines transform coding, which generates pre-echo, with time-domain coding, which does not.

More precisely, that patent application describes the detection at the decoder of a low-energy zone preceding a transition to a high-energy zone, the attenuation of pre-echoes in the detected low-energy zones, and the inhibition of pre-echo attenuation in the high-energy zone. The pre-echo attenuation processing is based on a comparison between the signal produced by transform decoding (which generates pre-echoes) and a signal produced by time-domain decoding (which does not generate pre-echoes).

This technique requires no specific auxiliary information from the coder, but it does require a reference signal produced by time-domain decoding.

Not all decoders that use transform decoding have a reference signal from time-domain decoding available. Moreover, even when such a reference signal is available at the decoder, it is not always suitable for computing the pre-echo attenuation.

A scalable stereo coder, for example the stereo extension of the ITU-T G.729.1 standard, can operate as described below.

The coder computes the average of the left and right channels of the stereo signal, codes this average with the G.729.1 coder, and finally transmits additional stereo extension parameters. The bitstream transmitted to the decoder therefore comprises a G.729.1 layer together with additional stereo extension layers. For example, a first additional layer contains parameters reflecting the per-sub-band energy difference (in the transform domain) between the two channels of the stereo signal. A second layer contains, for example, the transform coefficients of the residual signal, defined as the difference between the original signal and the signal decoded from the G.729.1 bitstream and the first layer.

The G.729.1 decoder in extended mode first decodes the mono signal and then, from the transmitted parameters, recovers the transform coefficients of the left and right channels.

Decoding the mono signal with a G.729.1-type decoder provides a reference signal based on the average of the two channels. When the level difference between the two channels is large, the temporal envelope of the mono signal is low relative to the inverse-transform output of the higher-level channel and high relative to the inverse-transform output of the lower-level channel.

Using a reference such as the output of the G.729.1 decoder to attenuate pre-echoes will therefore not be effective for stereo decoding: in the higher-level channel, too much pre-echo will be wrongly detected and useful signal will therefore be removed, whereas in the lower-level channel not all pre-echoes will be detected or removed.
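The mismatch between the mono reference and each stereo channel can be seen with a small numerical example. The channel amplitudes below are arbitrary illustrative values, and the energy helper is a hypothetical stand-in for a temporal envelope measure.

```python
def energy(block):
    """Sum of squared samples, used here as a simple envelope measure."""
    return sum(s * s for s in block)


# Illustrative channels with a large level difference.
left = [1.0, -1.0, 1.0, -1.0]
right = [0.1, -0.1, 0.1, -0.1]

# Mono downmix as the average of the two channels, as described above.
mono = [(l + r) / 2.0 for l, r in zip(left, right)]

e_left, e_right, e_mono = energy(left), energy(right), energy(mono)
print(e_left, e_mono, e_right)
```

The mono energy sits between the two channel energies, so a comparison against it would overestimate pre-echo in the louder channel and underestimate it in the quieter one.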

There is therefore a need for a technique that accurately attenuates pre-echoes at decoding when a signal from time-domain decoding is unavailable or ineffective and when no auxiliary information is transmitted by the coder. This technique must, moreover, work for both mono and stereo coding.

To this end, the present invention relates to a method for attenuating pre-echoes in a digital audio signal generated by transform coding, in which, at decoding, for a current frame of this digital audio signal, the method comprises:

a step of defining a concatenated signal from at least the reconstructed signal of the current frame;
a step of dividing said concatenated signal into sub-blocks of samples of determined length;
a step of computing the temporal envelope of the concatenated signal;
a step of detecting a transition of the temporal envelope to a high-energy zone;
a step of determining the low-energy sub-blocks preceding a sub-block in which a transition has been detected; and
a step of attenuation in the determined sub-blocks,

the method being characterized in that the attenuation is performed according to an attenuation factor computed for each of the determined sub-blocks, as a function of the temporal envelope of the concatenated signal and of the temporal envelope of the reconstructed signal of the previous frame.
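The first five steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions: the temporal envelope is taken as the energy per sub-block, the transition is assumed to be the sub-block of maximum energy, and a sub-block counts as low-energy when the peak exceeds its energy by a hypothetical ratio `min_ratio`. The function names, sub-block length, and sample values are not taken from the patent.

```python
def sub_block_energies(signal, sub_len):
    """Temporal envelope as one energy value per complete sub-block."""
    count = len(signal) // sub_len
    return [sum(s * s for s in signal[k * sub_len:(k + 1) * sub_len])
            for k in range(count)]


def transition_index(energies):
    """Assumed detector: index of the sub-block with the highest energy."""
    return max(range(len(energies)), key=lambda k: energies[k])


def low_energy_before(energies, k_trans, min_ratio):
    """Sub-blocks before the transition whose energy is more than min_ratio below the peak."""
    peak = energies[k_trans]
    return [k for k in range(k_trans) if energies[k] * min_ratio < peak]


# Toy data: quiet start, attack in the third sub-block, then decay.
current = [0.1] * 8 + [1.0] * 4 + [0.5] * 4   # reconstructed current frame
concatenated = current + [0.5] * 4            # plus the samples that follow it
energies = sub_block_energies(concatenated, 4)
k_trans = transition_index(energies)
candidates = low_energy_before(energies, k_trans, min_ratio=10.0)
print(k_trans, candidates)
```

Only the sub-blocks returned in `candidates` would receive an attenuation factor below 1; how that factor is derived is the subject of the characterizing part of the method.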

The attenuation factor is thus defined from characteristics of the decoded signal itself, which require neither information transmitted from the coder nor a signal from decoding that does not generate pre-echoes.

A factor that is adapted to each sub-block of the current frame and computed from the reconstructed signal improves the quality of the pre-echo attenuation processing.

The concatenated signal can be defined from the reconstructed signal of the current frame and the second part of the current frame, as defined later with reference to Figure 2. In this case, the method introduces no time delay.

If a time delay is allowed, the concatenated signal is defined as the reconstructed signal of the current frame and of the next frame.

The concatenated signal may be physically stored, sub-block by sub-block, at different locations.

The particular embodiments mentioned below may be added, independently or in combination with one another, to the steps of the method defined above.

Thus, in a particular embodiment, a minimum value is set for the attenuation factor as a function of the temporal envelope of the reconstructed signal of the previous frame.

This avoids too large a difference in attenuation from one frame to the next, in particular in the background noise level, and thus avoids audible artifacts.
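One way to picture such a minimum value is sketched below. The exact rule is not given in this passage, so the formula is an assumption: the floor is chosen so that the attenuated sub-block energy does not fall below the energy level retained from the previous frame. The function name and arguments are hypothetical.

```python
import math


def floor_attenuation(g, sub_block_energy, prev_frame_energy):
    """Limit factor g from below so that g**2 * sub_block_energy >= prev_frame_energy.

    Assumed rule for illustration: g_min = sqrt(prev / current), capped at 1.
    """
    if sub_block_energy <= 0.0:
        return 1.0  # nothing to attenuate in a silent sub-block
    g_min = min(1.0, math.sqrt(prev_frame_energy / sub_block_energy))
    return max(g, g_min)


limited = floor_attenuation(0.1, sub_block_energy=4.0, prev_frame_energy=1.0)
unchanged = floor_attenuation(0.8, sub_block_energy=4.0, prev_frame_energy=1.0)
inhibited = floor_attenuation(0.1, sub_block_energy=1.0, prev_frame_energy=4.0)
print(limited, unchanged, inhibited)
```

With this rule, a sub-block that is already quieter than the previous frame is left untouched, which keeps the background noise level continuous across frames.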

The temporal envelope of the reconstructed signal of the previous frame can, for example, be determined by computing the minimum energy per sub-block, by computing the mean energy, or by any other computation.
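The two options named above, minimum and mean energy per sub-block, can be sketched as a single helper. The function names and the `mode` parameter are hypothetical, and the frame values are toy data.

```python
def energies_per_sub_block(frame, sub_len):
    """Energy of each complete sub-block of the frame."""
    count = len(frame) // sub_len
    return [sum(s * s for s in frame[k * sub_len:(k + 1) * sub_len])
            for k in range(count)]


def previous_frame_level(frame, sub_len, mode="min"):
    """Summarize the previous frame's envelope as its minimum or mean sub-block energy."""
    energies = energies_per_sub_block(frame, sub_len)
    if mode == "min":
        return min(energies)
    if mode == "mean":
        return sum(energies) / len(energies)
    raise ValueError("mode must be 'min' or 'mean'")


previous = [1.0] * 4 + [2.0] * 4          # toy previous frame, two sub-blocks
level_min = previous_frame_level(previous, 4, "min")
level_mean = previous_frame_level(previous, 4, "mean")
print(level_min, level_mean)
```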

In a particular embodiment of the invention, the attenuation factor is determined as a function of the temporal envelope of said sub-block, of the maximum of the temporal envelope of the sub-block containing said transition, and of the temporal envelope of the reconstructed signal of the previous frame.

In an exemplary embodiment, the temporal envelope is determined by computing the energy per sub-block.

Advantageously, the method further comprises a step of computing and storing the temporal envelope of the current frame after the attenuation step in the determined sub-blocks.

This temporal envelope will therefore be used to process the next frame. The computation is accurate because the signal is no longer disturbed by pre-echoes.

Advantageously, an attenuation factor of 1 is assigned to the samples of the sub-block containing the transition and to the samples of the subsequent sub-blocks in the current frame.

Attenuation is therefore inhibited in these sub-blocks, which contain no pre-echoes.
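Expressed on a list of per-sub-block factors, this inhibition amounts to forcing the value 1 from the transition sub-block onward. The function name and the example factors are hypothetical.

```python
def inhibit_from_transition(factors, transition_index):
    """Keep factors before the transition; force 1.0 for the transition sub-block and after."""
    return [g if k < transition_index else 1.0 for k, g in enumerate(factors)]


raw_factors = [0.1, 0.5, 0.1, 0.1]        # illustrative factors, one per sub-block
final_factors = inhibit_from_transition(raw_factors, transition_index=2)
print(final_factors)
```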

In a particular embodiment, the attenuation factor is determined for each determined sub-block according to the following steps:

computing the ratio of the maximum energy determined in the sub-block containing a transition to the energy of the current sub-block;
comparing the ratio with a first threshold;
if the ratio is less than or equal to the first threshold, assigning to the attenuation factor a value that inhibits attenuation;
if the ratio is greater than the first threshold:
- comparing the ratio with a second threshold;
- if the ratio is less than or equal to the second threshold, assigning a weak attenuation value to the attenuation factor;
- if the ratio is greater than the second threshold, assigning a strong attenuation value to the attenuation factor.
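The steps above map directly onto a small decision function. The thresholds and the weak and strong factor values used as defaults here are hypothetical placeholders, since this passage does not give numerical values; a factor of 1.0 is the value that inhibits attenuation.

```python
def attenuation_factor(max_energy, sub_energy,
                       first_threshold=16.0, second_threshold=32.0,
                       weak=0.5, strong=0.1):
    """Two-threshold rule on the ratio max_energy / sub_energy.

    All default values are illustrative assumptions, not values from the patent.
    """
    # A silent sub-block gives an unbounded ratio; treat it as the largest case.
    ratio = max_energy / sub_energy if sub_energy > 0.0 else float("inf")
    if ratio <= first_threshold:
        return 1.0        # attenuation inhibited
    if ratio <= second_threshold:
        return weak       # weak attenuation
    return strong         # strong attenuation


g_none = attenuation_factor(16.0, 4.0)     # ratio 4
g_weak = attenuation_factor(80.0, 4.0)     # ratio 20
g_strong = attenuation_factor(400.0, 4.0)  # ratio 100
print(g_none, g_weak, g_strong)
```

Both comparisons are inclusive on the lower branch, matching the "less than or equal to" wording of the steps.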

This particular embodiment has proved particularly effective and is simple to implement.

Advantageously, the method provides for determining a smoothing function between the computed factors, sample by sample.

This also avoids audible artifacts caused by too abrupt a variation in the attenuation values.
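The form of the smoothing function is not specified in this passage. As one possible illustration, the sketch below expands the per-sub-block factors to one factor per sample and applies a first-order recursive smoothing; the recursion, the coefficient `alpha`, and the function names are assumptions made for the example.

```python
def per_sample_factors(sub_factors, sub_len):
    """Repeat each sub-block factor for every sample of that sub-block."""
    return [g for g in sub_factors for _ in range(sub_len)]


def smooth(factors, alpha=0.5, initial=1.0):
    """Assumed smoothing: y[n] = alpha * y[n-1] + (1 - alpha) * g[n]."""
    smoothed, previous = [], initial
    for g in factors:
        previous = alpha * previous + (1.0 - alpha) * g
        smoothed.append(previous)
    return smoothed


expanded = per_sample_factors([0.5, 1.0], sub_len=2)
ramp = smooth([0.0, 0.0, 1.0], alpha=0.5, initial=1.0)
print(expanded, ramp)
```

The smoothed sequence moves gradually toward each new factor instead of jumping at sub-block boundaries, which is the behavior the text asks of the smoothing function.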

In a variant implementation, the factor is corrected for the sub-block preceding the sub-block containing a transition, by applying a value that inhibits attenuation to the attenuation factor applied to a predetermined number of samples of that preceding sub-block.

This prevents the smoothing function defined for the attenuation values from reducing the amplitude of the attack.
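This correction can be sketched on a list of per-sample factors: the last few samples before the transition are given the value 1.0. The number of protected samples (`guard`) and the function name are hypothetical, and applying the correction before any smoothing is an assumption about ordering.

```python
def protect_attack(sample_factors, transition_start, guard):
    """Set the factor to 1.0 for the `guard` samples just before transition_start."""
    start = max(0, transition_start - guard)
    return (sample_factors[:start]
            + [1.0] * (transition_start - start)
            + sample_factors[transition_start:])


# Eight attenuated samples followed by a four-sample transition sub-block.
factors = [0.1] * 8 + [1.0] * 4
corrected = protect_attack(factors, transition_start=8, guard=2)
print(corrected)
```

Because the factor is already 1.0 slightly ahead of the attack, a smoothing step that lags behind its input has time to approach 1.0 before the attack begins.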

The present invention also relates to a device for attenuating pre-echoes in a digital audio signal generated by a transform coder, in which the device, associated with a decoder, comprises, for processing a current frame of this digital audio signal:

a module for defining a concatenated signal from at least the reconstructed signal of the current frame;
a module for dividing said concatenated signal into sub-blocks of samples of determined length;
a module for computing the temporal envelope of the concatenated signal;
a module for detecting a transition of the temporal envelope to a high-energy zone;
a module for determining the low-energy sub-blocks preceding a sub-block in which a transition has been detected; and
a module for attenuation in the determined sub-blocks.

The device is such that the attenuation module performs the attenuation according to an attenuation factor computed for each of the determined sub-blocks, as a function of the temporal envelope of the concatenated signal and of the temporal envelope of the reconstructed signal of the previous frame.

The invention relates to a decoder of a digital audio signal comprising a device as described above.

Such a decoder may, for example, be a G.729.1-SWB/stereo decoder as studied under Question 23 of ITU-T Study Group 16.

The invention can be integrated into such a decoder in stereo mode or in SWB (super wideband) mode.

Finally, the invention relates to a computer program comprising code instructions for implementing the steps of the attenuation method as described, when these instructions are executed by a processor.

Other features and advantages of the invention will appear more clearly on reading the following description, given solely by way of non-limiting example, and with reference to the appended drawings, in which:

figure 1, described previously, illustrates a transform coding-decoding system according to the state of the art;
figure 2 illustrates the configuration of the reconstructed signal with respect to the current frame of a signal;
figure 3 illustrates a pre-echo attenuation device in a digital audio signal decoder;
figure 4a represents the concatenated signal when a transition is located in the second part of the current frame;
figure 4b represents the concatenated signal when a transition is located in the reconstructed signal of the current frame;
figure 5 illustrates a flowchart representing a general embodiment of the steps for calculating the attenuation factor according to the invention;
figure 6 illustrates a detailed flowchart of the implementation of the attenuation method according to one embodiment of the invention;
figure 7 illustrates a particular embodiment of the calculation of the attenuation factor according to the invention;
figure 8a illustrates an example of a digital audio signal for which the invention according to one embodiment is implemented;
figure 8b illustrates the same digital audio signal for which the invention according to an alternative embodiment is implemented;
figure 9 illustrates the concatenated signal when the attack is located in the second sub-block of the second part of the current frame;
figure 10 illustrates the concatenated signal when the attack is located in the third sub-block of the second part of the current frame;
figure 11 illustrates the concatenated signal when the attack is located in the first sub-block of the second part of the current frame;
figure 12 illustrates the concatenated signal when the attack is located in the fourth sub-block of the second part of the current frame;
figures 13a and 13b respectively illustrate a coder and a decoder of the G.729.1 SWB/stereo type, the decoder comprising an attenuation device according to the invention;
figures 14a and 14b respectively illustrate a coder and a decoder of the G.729.1 SWB type, the decoder comprising an attenuation device according to the invention;
figure 15 illustrates an example of an attenuation device according to the invention.

Figure 2 shows a frame of the decoded signal together with the configuration of the signal reconstructed by overlap-add as described with reference to figure 1. In the following, the notation below is used with reference to figure 2 and to the following equation:

$x_{rec, N} (n) = h (n + L) x_{tr, N - 1} (n + L) + h (n) x_{tr, N} (n)$ for $n \in [0, L - 1]$

where N is the frame index, L is the frame length, x_rec,N is the reconstructed signal of frame N, and x_tr,N is the signal of length 2L output by the inverse MDCT of frame N. Without going into the details of the MDCT and of the inverse MDCT, the intermediate signal x_tr,N of length 2L for frame N is defined as:

$x_{tr, N} = [\underbrace{y_{r} (0) \dots y_{r} (\frac{L}{2} - 1)}_{y_{r}} \;\; \underbrace{- y_{r} (\frac{L}{2} - 1) \dots - y_{r} (0)}_{- y_{r} \text{ reversed}} \;\; \underbrace{y_{i} (0) \dots y_{i} (\frac{L}{2} - 1)}_{y_{i}} \;\; \underbrace{y_{i} (\frac{L}{2} - 1) \dots y_{i} (0)}_{y_{i} \text{ reversed}}]$

where y_r(n) and y_i(n) are intermediate signals that are not detailed here. It can then be shown that the reconstructed signal x_rec,N of frame N is given by:

$x_{rec, N} (n) = h (n + L) x_{tr, N - 1} (n + L) + h (n) x_{tr, N} (n)$ for $n \in [0, L - 1]$

Reconstruction is therefore performed by overlap-add.
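
As a minimal sketch of the reconstruction equation above, the following Python function combines the stored second half of the previous inverse-MDCT output with the first half of the current one. The function name `overlap_add`, the list-based signal representation and the sine window in the usage lines are assumptions made for illustration; the text does not specify the synthesis window h.

```python
import math


def overlap_add(x_tr_prev, x_tr_cur, h):
    """Return x_rec,N(n), n = 0..L-1, from two length-2L inverse-MDCT outputs.

    x_tr_prev is x_tr,N-1, x_tr_cur is x_tr,N, and h is the synthesis
    window; all three have length 2L.
    """
    if not (len(x_tr_prev) == len(x_tr_cur) == len(h)):
        raise ValueError("all inputs must have the same length 2L")
    L = len(x_tr_cur) // 2
    # x_rec,N(n) = h(n + L) x_tr,N-1(n + L) + h(n) x_tr,N(n)
    return [h[n + L] * x_tr_prev[n + L] + h[n] * x_tr_cur[n] for n in range(L)]


# Illustrative window only (assumption): a sine window of length 2L.
L = 4
h = [math.sin(math.pi * (n + 0.5) / (2 * L)) for n in range(2 * L)]
x_rec = overlap_add([1.0] * (2 * L), [1.0] * (2 * L), h)
```

Only the second half of the previous output and the first half of the current output are read, which is why the decoder needs to keep just L samples in memory between frames.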

Note that the intermediate signal comprises an antisymmetric part and a symmetric part. When decoding frame N, the received bit stream makes it possible to find x_tr,N; x_rec,N(n), n = 0...L-1, can therefore be reconstructed. On the other hand, only "half" of the information on the future frame of index N+1 is available, namely x_tr,N(n), n = L...2L-1. It is important to note that for all implementation variants of the MDCT (and of its inverse), an intermediate signal x_tr,N of the form defined above can always be defined. However, in some implementations the signal x_tr,N is not explicit as such; only the intermediate signals y_r(n) and y_i(n), which contain time-domain aliasing, are available.

Thus, in a transform decoder, the reconstructed signal of the current frame (x_rec,N(n), n = 0 to L-1) is obtained by weighted addition of the second part of the output of the inverse transform of the MDCT coefficients of the previous frame (x_tr,N-1(n), n = L to 2L-1) and the first part of the output of the inverse transform of the MDCT coefficients of the current frame (x_tr,N(n), n = 0 to L-1). The second part of the output of the inverse transform of the MDCT coefficients of the current frame (x_tr,N(n), n = L to 2L-1) is kept in memory and becomes x_tr,N-1(n), n = L to 2L-1, to be used to obtain the reconstructed signal of the next frame. For simplicity, the terms "first part of the current frame", "second part of the current frame" and "reconstructed signal of the current frame" are used in the following. In the next frame, the second part of the current frame thus becomes the second part of the previous frame.

To further simplify the figures, the following notation is also introduced for the level-adjusted second part of the current frame, that is to say the second part multiplied by the maximum value of the synthesis window of the MDCT transform:

$x_{cur2h, N} (n) = h (L) \cdot x_{tr, N} (L + n)$, n = 0 to L - 1

In particular, for an attack located in the current frame, in its first or second part, the pre-echo attenuation method according to one embodiment of the invention generates a concatenated signal [x_rec,N(0) ... x_rec,N(L-1) x_cur2h,N(0) ... x_cur2h,N(L-1)] from the reconstructed signal of the current frame x_rec,N(n) and the level-adjusted signal of the second part of the current frame x_cur2h,N(n).

This concatenated signal is divided into sub-blocks of samples of determined length, here an even number.
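
The construction of the concatenated signal and its division into sub-blocks can be sketched as follows. This is a hedged illustration: the function names `build_concatenated` and `split_sub_blocks`, the constant placeholder window and the sample values are hypothetical and serve only to show the indexing.

```python
def build_concatenated(x_rec, x_tr_cur, h):
    """Concatenate x_rec,N with the level-adjusted second part x_cur2h,N."""
    L = len(x_rec)
    # x_cur2h,N(n) = h(L) * x_tr,N(L + n), n = 0..L-1
    x_cur2h = [h[L] * x_tr_cur[L + n] for n in range(L)]
    return list(x_rec) + x_cur2h


def split_sub_blocks(signal, n2):
    """Split a signal into consecutive sub-blocks of n2 samples each."""
    if len(signal) % n2 != 0:
        raise ValueError("signal length must be a multiple of n2")
    return [signal[i:i + n2] for i in range(0, len(signal), n2)]


x_rec = [1.0, 2.0, 3.0, 4.0]                          # L = 4
x_tr_cur = [0.0, 0.0, 0.0, 0.0, 4.0, 6.0, 6.0, 4.0]   # second half is symmetric
h = [0.5] * 8                                         # placeholder window values
concat = build_concatenated(x_rec, x_tr_cur, h)
blocks = split_sub_blocks(concat, 2)                  # sub-blocks of 2 samples
```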

The method determines the sub-blocks of the current block that require pre-echo attenuation.

The attenuation method also includes a step of calculating the attenuation factor to be applied to the determined sub-blocks. The calculation is performed for each of the sub-blocks as a function of the temporal envelope of the concatenated signal.

This calculation can additionally take into account the temporal envelope of the reconstructed signal of the previous frame.

Thus, with reference to figure 3, an attenuation device 100 comprises a module 101 for defining a concatenated signal, a module 102 for dividing the concatenated signal into sub-blocks, a module 103 for calculating the temporal envelope of the concatenated signal, a module 104 for detecting a transition of the temporal envelope to a high-energy zone and for determining the low-energy sub-blocks preceding a sub-block in which a transition has been detected, and a module 105 for attenuation in the determined sub-blocks. The attenuation module is able to apply an attenuation factor to the sub-blocks determined by module 104, the attenuation factor being determined by the attenuation module as a function of the temporal envelope of the concatenated signal.

With reference to figure 3, the attenuation device is included in a decoder comprising an inverse quantization module 110 (Q^-1), an inverse transform module 120 (MDCT^-1), and a module 130 for reconstructing the signal by overlap-add (add/rec) as described with reference to figure 1, which delivers a reconstructed signal to the attenuation device according to the invention.

Figures 4a and 4b illustrate examples of signals containing transitions, or attacks. The pre-echo phenomenon occurs when the energy of one part of the signal in an MDCT window is markedly higher (attack) than that of the other parts. The pre-echo is then observed in the low-energy parts before the attack. It is therefore in these parts that the pre-echoes must be attenuated.

Two cases are possible: the attack or transition of the signal is located either in the current frame (first L samples) or in the next frame (following L samples), which corresponds to the second part of the current frame as shown in figure 2.

Figure 4a shows a concatenated signal with an attack in the second part of the current frame. The figure shows the division into K₂ sub-blocks k of N₂ samples each, with N₂ = L/K₂ and K₂ = 4. The first L samples represent the reconstructed signal of the current frame x_rec,N(n), n = 0, ..., L-1. The following L samples (L to 2L-1) represent the second part of the current frame x_cur2h,N(n), n = 0, ..., L-1. In the next frame, this second part becomes the first part of the previous frame.

Note that the second part of the current frame is symmetric, by a property of the inverse MDCT. According to the invention, the pre-echoes are attenuated without introducing additional delay in the transform decoding. When decoding the current frame, the decoder synthesizes the samples x_tr,N(n), n = 0, ..., 2L-1, but can only use the samples x_tr,N(n), n = 0, ..., L-1 to reconstruct x_rec,N(n), n = 0, ..., L-1.

It can be seen that the attack or transition is located in the next frame (although its position cannot yet be given), so the pre-echo must be attenuated for the first L samples, i.e. the current frame of the reconstructed signal.

Figure 4b shows the same signal one frame later; this time the attack is located in the current frame of the reconstructed signal, in the third sub-block (k = 2). The pre-echo must therefore be attenuated in the first two sub-blocks.

The pre-echo attenuation method according to the invention delivers pre-echo attenuation factors for each sample of the frame. This method will now be described with reference to figures 5 and 6.

The flowchart shown in figure 5 illustrates the different steps of calculating the attenuation factor according to the invention for a current frame.

In step 201, the temporal envelope of the reconstructed signal of the current frame is calculated, and in step 202, the temporal envelope of the level-adjusted second part of the current frame is calculated.

The temporal envelope is obtained, for example, by calculating the energy per sub-block as described with reference to figure 6. It can be obtained by other methods, for example by calculating the mean of the absolute values of the signal per sub-block, or the maximum value or the median value of each sub-block. The envelope can also be obtained, for example, with a Teager-Kaiser type operator followed by low-pass filtering. In all cases it is assumed here, without loss of generality, that the temporal envelope is defined with a temporal resolution of one value per sub-block, the size of the sub-blocks being flexible.
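
A sub-block envelope with one value per sub-block might be computed as in the sketch below. The function name `sub_block_envelope` and its `measure` argument are hypothetical, and taking the maximum over absolute values is one possible reading of "maximum value"; the Teager-Kaiser variant is not shown.

```python
def sub_block_envelope(signal, n2, measure="energy"):
    """Return one envelope value per sub-block of n2 samples."""
    if len(signal) % n2 != 0:
        raise ValueError("signal length must be a multiple of n2")
    envelope = []
    for start in range(0, len(signal), n2):
        block = signal[start:start + n2]
        if measure == "energy":
            # Sum of squared samples of the sub-block.
            envelope.append(sum(s * s for s in block))
        elif measure == "mean_abs":
            # Mean of the absolute values of the sub-block.
            envelope.append(sum(abs(s) for s in block) / n2)
        elif measure == "max_abs":
            # Assumption: the maximum is taken over absolute values.
            envelope.append(max(abs(s) for s in block))
        else:
            raise ValueError("unknown measure: " + measure)
    return envelope


env_energy = sub_block_envelope([1, -1, 2, -2], 2)
env_mean = sub_block_envelope([1, -1, 2, -2], 2, measure="mean_abs")
```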

In step 203, an attenuation factor function is defined from the envelopes of the current frame obtained in steps 201 and 202 and from the envelope of the reconstructed signal of the previous frame (T_env(x_rec,N-1(n))).

The optional step 204 defines a smoothing function on the attenuation factor values obtained, in order to avoid discontinuities that could appear in the processed signal.

With reference to figure 6, the attenuation method in a detailed embodiment of the invention will now be described.

Thus, in step 301, as illustrated in figure 4a or 4b, the signal is divided into sub-blocks of length N₂ = L/K₂. This gives 2K₂ sub-blocks.

In step 302, the energy En(k) of the K₂ sub-blocks of the reconstructed signal x_rec,N(n) is calculated.

In step 303, the energy of each sub-block of the level-adjusted second part of the current frame x_cur2h,N(n) is calculated. Only K₂/2 values are distinct, because of the symmetry of this part of the signal as shown in figure 4a.

The maximum of the energies of the sub-blocks of the signals x_rec,N(n) and x_cur2h,N(n) is calculated in step 304 over the K₂ + K₂/2 = 3K₂/2 blocks, and its index is stored in ind₁.

The value of the maximum energy max_en thus calculated is also stored.
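
Steps 302 to 304 can be sketched as below, assuming the energy of a sub-block is the sum of its squared samples. The function name `max_energy_search` and the sample values are hypothetical; only the first K₂/2 sub-blocks of the second part are scanned, since the remaining ones mirror them.

```python
def max_energy_search(x_rec, x_cur2h, k2):
    """Return (en, ind1, max_en) over the 3*K2/2 sub-blocks of steps 302-304."""
    L = len(x_rec)
    n2 = L // k2  # sub-block length N2 = L / K2

    def energy(signal, k):
        return sum(s * s for s in signal[k * n2:(k + 1) * n2])

    # Step 302: K2 sub-blocks of the reconstructed signal.
    en = [energy(x_rec, k) for k in range(k2)]
    # Step 303: only K2/2 distinct values in the symmetric second part.
    en += [energy(x_cur2h, k) for k in range(k2 // 2)]
    # Step 304: index and value of the maximum energy.
    ind1 = max(range(len(en)), key=lambda k: en[k])
    return en, ind1, en[ind1]


# Attack in the second part of the current frame (compare figure 4a).
en, ind1, max_en = max_energy_search(
    [0, 0, 0, 0, 1, 1, 0, 0], [0, 0, 3, 3, 3, 3, 0, 0], k2=4)
```

An index ind1 of K₂ or more means that the maximum lies in the second part of the current frame.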

In step 305 a loop counter is initialized. In the loop of steps 306 to 309, step 307 determines, for each sub-block preceding the sub-block of index ind₁, an attenuation factor g(k) as a function of its energy En(k), of the maximum energy max_en and of the average energy of the reconstructed signal of the previous frame x_rec,N-1; step 308 assigns this factor to all the samples of the sub-block.

In step 310, the index of the first sample of the maximum-energy sub-block is calculated. Step 311 checks whether this index is less than the frame length. If so, the maximum-energy sub-block lies in the current frame, and the factor 1, that is to say a value inhibiting attenuation, is assigned to all samples from the start of that sub-block to the end of the frame in the loop of steps 311-312-313.
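
The two loops of steps 305 to 313 can be sketched as follows. The name `per_sample_factors` is hypothetical, and `gain_fn` is a hypothetical placeholder for the factor calculation of step 307. Limiting the first loop to the K₂ sub-blocks of the current frame is an assumption, made because the factors are delivered for the L samples of the frame.

```python
def per_sample_factors(en, ind1, max_en, en_prev, k2, n2, gain_fn):
    """Return L = k2 * n2 per-sample attenuation factors for the current frame."""
    L = k2 * n2
    g = [1.0] * L
    # Steps 306-309: one factor per sub-block preceding sub-block ind1,
    # copied to every sample of that sub-block.
    for k in range(min(ind1, k2)):
        gk = gain_fn(en[k], max_en, en_prev)
        g[k * n2:(k + 1) * n2] = [gk] * n2
    # Steps 310-313: if the maximum-energy sub-block starts inside the
    # frame, force the factor to 1 from its first sample to the frame end.
    first = ind1 * n2
    if first < L:
        g[first:L] = [1.0] * (L - first)
    return g


# Placeholder gain rule (assumption): constant factor 0.5.
g = per_sample_factors([1.0, 1.0, 100.0, 1.0, 1.0, 1.0], ind1=2, max_en=100.0,
                       en_prev=1.0, k2=4, n2=2,
                       gain_fn=lambda en_k, max_en, en_prev: 0.5)
```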

In step 314 the average energy of the reconstructed current frame, that is to say of the first K₂ blocks of the reconstructed signal x_rec,N(n), is calculated and stored. It will be used in the next frame for the calculation of the new factors. In a variant, the equation of this step can be replaced by another that also takes the pre-echo attenuation into account, for example by the following equation:

${\overline{En}}_{prev} = \frac{1}{K_{2}} \sum_{k = 0}^{K_{2} - 1} En (k) \cdot g^{2} (k)$
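
A small sketch of step 314 and of its variant follows. The function name `average_energy` is hypothetical, and the unweighted form is assumed to be the plain mean of En(k) over the first K₂ sub-blocks, since the text gives only the weighted equation explicitly.

```python
def average_energy(en, k2, g=None):
    """Average energy of the first k2 sub-blocks, optionally weighted by g(k)^2."""
    if g is None:
        # Assumed plain mean for step 314.
        return sum(en[:k2]) / k2
    # Variant: En_prev = (1 / K2) * sum_k En(k) * g(k)^2
    return sum(en[k] * g[k] ** 2 for k in range(k2)) / k2


plain = average_energy([4.0, 4.0, 8.0, 8.0], 4)
weighted = average_energy([4.0, 4.0, 8.0, 8.0], 4, g=[0.5, 0.5, 1.0, 1.0])
```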

In this way, the processed signal, which is no longer disturbed by pre-echoes, is taken into account.

In steps 315 and 316, a smoothing function for the factors is determined and applied sample by sample to avoid overly abrupt variations of the factor.

This smoothing function is defined, for example, by the following equations:

$g_{pre} (0) = \alpha g_{old} + (1 - \alpha) g'_{pre} (0)$

$g_{pre} (i) = \alpha g_{pre} (i - 1) + (1 - \alpha) g'_{pre} (i)$, $i = 1, \dots, L - 1$

where the factor defined for the previous sample and the factor of the current sample are weighted to obtain the smoothed factor.
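
The recursion above is a first-order recursive smoother and can be sketched as below. The function name `smooth_factors` and the value of alpha in the usage line are hypothetical; the text does not give a value for α.

```python
def smooth_factors(g_raw, g_old, alpha):
    """Smooth per-sample factors g'_pre(i) starting from the stored g_old."""
    smoothed = []
    previous = g_old  # plays the role of g_pre(-1)
    for value in g_raw:
        # g_pre(i) = alpha * g_pre(i - 1) + (1 - alpha) * g'_pre(i)
        previous = alpha * previous + (1.0 - alpha) * value
        smoothed.append(previous)
    return smoothed


# alpha = 0.5 is an arbitrary illustrative value.
g_pre = smooth_factors([0.0, 0.0, 1.0], g_old=1.0, alpha=0.5)
```

With alpha equal to 0 the factors pass through unchanged; values closer to 1 give slower transitions.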

The last attenuation factor obtained for the last sub-block to be attenuated of the current frame is stored in step 315 for use in the next frame.

Other smoothing functions are possible, such as a linear transition between the two factor values, either with a constant slope (for example in steps of 0.05) or with a fixed length (for example over 16 samples).

Once the factors have been calculated in this way, pre-echo attenuation is applied to the reconstructed signal of the current frame by multiplying each sample by the corresponding factor:

$x_{recg, N}(n) = g(n)\, x_{rec, N}(n), \quad n = 0, \dots, L - 1$
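As a minimal sketch of this multiplication (the function name `apply_attenuation` is an assumption of this example):

```python
def apply_attenuation(x_rec, gains):
    """Multiply each reconstructed sample by its attenuation factor."""
    if len(x_rec) != len(gains):
        raise ValueError("signal and gain sequences must have the same length")
    return [g * x for g, x in zip(gains, x_rec)]
```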

Step 307, the calculation of the attenuation factor for a sub-block, is now detailed for a particular embodiment of the invention with reference to Figure 7.

In this embodiment, the ratio r = max_en / En(k) of the maximum energy determined in step 304 to the energy of the sub-block being processed is first calculated in step 401.

In practice, this ratio can be inverted and the thresholds adapted accordingly.

Step 402 tests whether this ratio is less than or equal to a first threshold S1. The value of S1 is set to 16 in the example, this value having been optimized experimentally.

If so, the energy variation relative to the maximum energy is too small to produce an annoying pre-echo, and no attenuation is necessary. The factor is then set in step 403 to a value that inhibits attenuation, that is to say 1.

Otherwise, step 404 tests whether the ratio r is less than or equal to a second threshold S2. The value of S2 is set to 32 in the example, this value having been optimized experimentally.

If so, a small annoying pre-echo may be present, which is attenuated slightly by setting the factor in step 405 to a light attenuation value, for example 0.5. When the ratio is greater than this second threshold, the risk of pre-echo is maximal and a strong attenuation value, for example 0.1, is assigned to the factor in step 406.

In most cases, especially when the pre-echo is annoying, the frame that precedes the pre-echo frame has a homogeneous energy that corresponds to the energy of the background noise at that moment. Experience shows that it is neither useful nor even desirable for the signal energy to fall below the average energy of the previous frame after pre-echo processing.

In step 407, a limit value lim_g of the factor is therefore calculated, namely the value with which the given sub-block obtains exactly the same energy as the average energy of the previous frame. Then, in step 408, this value is limited to a maximum of 1, since only attenuation values are of interest here.

The value lim_g thus obtained serves as a lower bound in the final calculation of the attenuation factor in step 409.
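Steps 401 to 409 can be sketched as below, using the example thresholds and factor values given above. The expression used for lim_g is an assumption of this sketch: since a gain applied to the samples scales the energy by its square, the gain that brings the sub-block energy down to the average energy of the previous frame is taken to be the square root of the energy ratio. The function name, the argument `prev_mean_en` and the guard for a silent sub-block are likewise hypothetical.

```python
import math

S1 = 16.0  # first threshold (example value from the description)
S2 = 32.0  # second threshold (example value from the description)


def attenuation_factor(max_en, en_k, prev_mean_en):
    """Attenuation factor for one sub-block.

    max_en:       maximum sub-block energy (step 304)
    en_k:         energy En(k) of the sub-block being processed
    prev_mean_en: average sub-block energy of the previous frame
                  (assumed to be measured on the same scale as en_k)
    """
    if en_k <= 0.0:
        return 1.0  # assumption: nothing to attenuate in a silent sub-block

    r = max_en / en_k                        # step 401
    if r <= S1:                              # step 402
        g = 1.0                              # step 403: no attenuation
    elif r <= S2:                            # step 404
        g = 0.5                              # step 405: light attenuation
    else:
        g = 0.1                              # step 406: strong attenuation

    lim_g = math.sqrt(prev_mean_en / en_k)   # step 407 (assumed formula)
    lim_g = min(lim_g, 1.0)                  # step 408
    return max(g, lim_g)                     # step 409: lim_g is a lower bound
```

The final `max` prevents the attenuated sub-block from dropping below the background-noise level of the previous frame.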

In a variant of the attenuation factor calculation, a bit-rate characteristic of the transmitted signal may be taken into account. In a low bit-rate transmission, the quantization noise is generally large, which increases the risk of annoying pre-echo. Conversely, at very high bit rates, the coding quality can be very good and no pre-echo attenuation is then necessary.

In the case of multi-rate coding/decoding, the bit-rate information can therefore be taken into account in determining the attenuation factor.

Figures 8a and 8b illustrate the implementation of the attenuation method of the invention on a typical example.

In this example the signal is sampled at 8 kHz, the frame length is 160 samples and each frame is divided into 4 sub-blocks of 40 samples.
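With these figures a frame lasts 20 ms and a sub-block 5 ms. The division into sub-blocks and the per-sub-block energies used as En(k) can be sketched as follows; the function name and the use of a plain sum of squares as the energy measure are assumptions of this example.

```python
FRAME_LEN = 160      # samples per frame (20 ms at 8 kHz)
SUBBLOCK_LEN = 40    # samples per sub-block (5 ms at 8 kHz)


def subblock_energies(signal, subblock_len=SUBBLOCK_LEN):
    """Split a signal into sub-blocks and return the energy of each one."""
    if len(signal) % subblock_len != 0:
        raise ValueError("signal length must be a multiple of the sub-block length")
    return [
        sum(s * s for s in signal[start:start + subblock_len])
        for start in range(0, len(signal), subblock_len)
    ]
```

The index of the largest value in the returned list gives the sub-block of maximum energy, and that value can play the role of max_en.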

Part a.) of Figure 8a shows 3 frames of the original signal, corresponding to the narrowband part (0-4000 Hz) of the left channel of a stereo signal sampled at 16 kHz. An attack, or transition, in the signal is located in the sub-block starting at index 360. This signal has been coded, for example, by a stereo extension of the G.729.1 coder.

Part b.) of Figure 8a shows the result of decoding (left channel only) without pre-echo processing. The pre-echo can be observed from sample 160 onwards (the start of the frame preceding the frame with the attack).

Part c.) shows the evolution of the pre-echo attenuation factor (solid line) obtained by implementing the method according to the invention. The dotted line represents the factor before smoothing.

Part d.) illustrates the result of decoding after application of the pre-echo processing (multiplication of signal b.) by signal c.)). It can be seen that the pre-echo has indeed been suppressed.

Figure 8b illustrates the same typical example, for which a variant embodiment of the attenuation method according to the invention is implemented.

A close look at Figure 8a shows that the smoothed factor does not return to 1 at the moment of the attack, which implies a reduction in the amplitude of the attack. The perceptible impact of this reduction is very small, but it can nevertheless be avoided.

To do this, the factor value 1 can for example be assigned, before smoothing, to the last few samples of the sub-block preceding the sub-block where the attack is located. Part c.) of Figure 8b gives an example of such a correction. In this example, the factor value 1 has been assigned to the last 16 samples of the sub-block preceding the sub-block with the attack, starting from index 344.

The smoothing function thus progressively increases the factor so that it has a value close to 1 at the moment of the attack. The amplitude of the attack is then preserved.
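The effect of this correction can be sketched as follows. The helper names, the smoothing weight `alpha = 0.85` and the toy factor sequence are assumptions of this example; only the release length of 16 samples comes from the description.

```python
def release_before_attack(raw_factors, attack_start, n_release=16):
    """Set the unsmoothed factor to 1 on the n_release samples before the attack."""
    out = list(raw_factors)
    for i in range(max(0, attack_start - n_release), attack_start):
        out[i] = 1.0
    return out


def smooth(raw_factors, g_old, alpha):
    """First-order recursive smoothing of the factors."""
    out, prev = [], g_old
    for g_raw in raw_factors:
        prev = alpha * prev + (1.0 - alpha) * g_raw
        out.append(prev)
    return out


# Toy case: one strongly attenuated sub-block (factor 0.1) followed by the
# attack sub-block (factor 1), with 40 samples per sub-block.
raw = [0.1] * 40 + [1.0] * 40
attack_start = 40

without_fix = smooth(raw, g_old=0.1, alpha=0.85)
with_fix = smooth(release_before_attack(raw, attack_start), g_old=0.1, alpha=0.85)

# Smoothed factor on the first sample of the attack, without and with the fix
print(without_fix[attack_start], with_fix[attack_start])
```

Without the correction the smoothed factor only starts to rise once the attack has begun; with it, the rise is completed just before the attack.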

The difficulty of this method is to know, in the frame preceding the frame containing the attack, whether or not the attack is located in the first sub-block.

If the attack is in the first sub-block, the factor value 1 must be assigned to the last samples of the frame. The problem is that the position of the attack cannot be determined with certainty from the concatenated signal, because of the symmetry of this part of the concatenated signal, which in fact reflects the well-known "time-domain aliasing" property of the MDCT transform.

Figures 9 and 10 illustrate the concatenated signal corresponding to the second frame of Figures 8a and 8b.

It can be seen that the attack is in sub-block k=5 of the concatenated signal. This attack will therefore be in either the second or the third sub-block of the reconstructed signal of the next frame, and so will not be in its first sub-block. It is then unnecessary to assign the factor value 1 to the last samples of the current frame. This holds whether the attack is actually in the second sub-block of the next frame (case of Figure 9) or in the third sub-block (case of Figure 10).

On the other hand, as shown in Figure 11 or 12, when the attack is in the 1st or the 4th sub-block of the next frame, the attack is detected in sub-block k=4 of the concatenated signal, because of the symmetry of this part of the concatenated signal.

Now, if the attack is in the first sub-block, the factor value 1 must be assigned to the last samples of the frame, whereas this is not necessary when the attack is in the 4th sub-block.

One solution is always to assign the factor value 1 to the last samples of the frame if the attack is detected in the 4th sub-block of the concatenated signal. If, in the next frame, the attack is in the first sub-block (case of Figure 11), operation is then optimal. On the other hand, when the attack is in the 4th sub-block (case of Figure 12), the attenuation is sub-optimal because, around the end of the frame, the pre-echo attenuation factor rises towards 1 for a few samples and then falls back to the correct attenuation level at the start of the next frame. The subjective impact of this sub-optimality is small because, when the attack is in the 4th sub-block of the next frame, its amplitude is considerably reduced by the analysis windowing. The pre-echo caused by this attack is weak.
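The ambiguity and the resulting decision rule can be sketched as follows. The sketch assumes that the sub-blocks of the concatenated signal are numbered from k=0, with k=0 to 3 covering the current frame and k=4 to 7 the aliased second part, which is consistent with the cases of Figures 9 to 12 described above; the function names are hypothetical.

```python
N_SUB = 4  # sub-blocks per frame in the example


def candidate_positions(attack_k, n_sub=N_SUB):
    """Possible 1-based sub-block positions of the attack in the next frame.

    Because of the symmetry of the aliased part, an attack detected in
    sub-block attack_k may lie in that sub-block or in its mirror image.
    """
    j = attack_k - n_sub  # 0-based position inside the aliased part
    if not 0 <= j < n_sub:
        raise ValueError("attack_k must lie in the second part of the concatenated signal")
    return sorted({j + 1, n_sub - j})


def must_release_at_frame_end(attack_k, n_sub=N_SUB):
    """True when the attack may be in the 1st sub-block of the next frame."""
    return 1 in candidate_positions(attack_k, n_sub)
```

Under this rule the factor value 1 is assigned to the last samples of the current frame whenever the first sub-block of the next frame is a candidate, accepting the slight sub-optimality when the attack is actually in the 4th sub-block.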

Figures 9 to 12 were obtained with the same input signal, shifted by the length of one sub-block to move the position of the attack within the frame. Comparing Figures 11 and 12, for example, shows the difference in pre-echo level as a function of the position of the attack: when the attack is in the 4th sub-block, the pre-echo is markedly weaker.

The method of the invention uses a particular example of calculating the start of the attack (search for the maximum energy per sub-block) but can work with any other method of determining the start of the attack.

The method of the invention applies to pre-echo attenuation in any transform coder that uses an MDCT filter bank or any real-valued or complex-valued perfect-reconstruction filter bank, or near-perfect-reconstruction filter banks, as well as filter banks using the Fourier transform or the wavelet transform.

It should be noted that, where a delay of one frame is tolerable at the decoder, the problems of locating the transient (attack) in the second part of the concatenated signal can be avoided. The pre-echo reduction method then applies directly to the reconstructed signal rather than to the concatenated signal, which is a hybrid of reconstructed signal and intermediate signal with time-domain aliasing. The transition detection, attenuation factor calculation and pre-echo reduction means described above still apply.

Moreover, where the concatenated signal is not defined explicitly, it is still possible to use the reconstructed signal of the current frame and an intermediate signal of the inverse MDCT to carry out the operations described above.

Examples of application of the invention are given below.

An example of a stereo signal coder is described with reference to Figure 13a. A suitable decoder comprising an attenuation device according to the invention is described with reference to Figure 13b.

Figure 13a shows an example of a coder for which stereo information is transmitted per frequency band and is decoded in the frequency domain.

A mono signal M is calculated from the input signals of the left channel L and the right channel R by matrixing means 500.

The coder also includes time-frequency transformation means 502, 503 and 504 capable of performing a transform, for example a discrete Fourier transform (DFT), a modified discrete cosine transform (MDCT) or a modulated complex lapped transform (MCLT).

Thus, from the values L, R and M corresponding to the left, right and mono time signals, the frequency-domain signal values left L, right R and mono M are obtained. In describing Figures 13 and 14, italic characters are used for signals in the frequency domain.

The mono signal M is also quantized and coded by the means 501, for example by the G.729.1 coder standardized by ITU-T. This module delivers the core bitstream bst₁ and also the decoded mono signal M̂ transformed into the frequency domain.

Module 505 performs parametric stereo coding from the frequency-domain signals L, R and M and the decoded signal M̂. It delivers the first optional extension layer of the bitstream, bst₂, and the two channels of the decoded stereo signal L̂ and R̂ obtained by decoding the two layers bst₁ and bst₂.

The stereo residual signal in the frequency domain is calculated by the means 506 and 507 and encoded by the coding means 508, giving the second optional extension layer of the bitstream, bst₃.

The core encoded signal bst₁ and the optional extension layers bst₂ and bst₃ are transmitted to the decoder.

Figure 13b shows an example of a decoder able to receive the core encoded signal bst₁ and the optional extension layers bst₂ and bst₃.

Decoding means 600 decode the core bitstream bst₁ to obtain the decoded mono signal M̂. If the first optional extension layer bst₂ is available, it can be decoded by the parametric stereo decoding means 601 to build the decoded stereo signal L̂ and R̂ from the decoded mono signal M̂. Otherwise, L̂ and R̂ are equal to M̂.

When the second optional extension layer bst₃ is also available, it is decoded by the decoding means 602 to obtain the stereo residual signal in the frequency domain. This residual is added to the decoded stereo signal L̂ and R̂ to increase the precision of the frequency representation of the signal. Otherwise, when this second extension layer is not available, L̂ and R̂ remain unchanged.
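The layered decoding logic just described can be sketched as below. This is only an illustration of the control flow: the parametric stereo decoding itself is not specified here, so it is represented by a hypothetical callable `param_decode`, and the function name and argument layout are assumptions. The sketch does not check that bst₃ is only used together with bst₂.

```python
def decode_stereo_spectrum(m_hat, param_decode=None, residual=None):
    """Build the decoded stereo spectra from the available layers.

    m_hat:        decoded mono spectrum (core layer bst1)
    param_decode: hypothetical callable standing in for the parametric
                  stereo decoding of bst2, or None if bst2 is absent
    residual:     pair (res_l, res_r) decoded from bst3, or None if absent
    """
    if param_decode is not None:
        l_hat, r_hat = param_decode(m_hat)
    else:
        # Without bst2, both channels are equal to the decoded mono signal
        l_hat, r_hat = list(m_hat), list(m_hat)

    if residual is not None:
        # With bst3, the stereo residual refines both channels
        res_l, res_r = residual
        l_hat = [a + b for a, b in zip(l_hat, res_l)]
        r_hat = [a + b for a, b in zip(r_hat, res_r)]

    return l_hat, r_hat
```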

These two signals undergo an inverse frequency-time transformation in modules 605 and 606, followed by overlap-add reconstruction in the respective modules 607 and 608. Pre-echo reduction according to the invention is then performed by the attenuation modules 609 and 610, as described with reference to Figure 3, to obtain the two channels of the decoded time-domain stereo signal L̃ and R̃.

Another example of a decoder comprising a device according to the invention is now described with reference to Figures 14a and 14b.

Figure 14a shows an example of a coder for the super-wideband extension of a wideband coder of the G.729.1 type. The super-wideband input signal S₃₂ is downsampled by the downsampling means 700 to obtain a wideband signal S₁₆. This signal is quantized and coded by the means 701, for example by the ITU-T G.729.1 coder. This module delivers the core bitstream bst₁ and also the decoded wideband signal Ŝ₁₆ in the frequency domain.

The super-wideband input signal S₃₂ is transformed into the frequency domain by the transformation means 704. The high-band frequencies (7000-14000 Hz band) not coded in the wideband part are encoded by the coding means 704. This coding is based on the spectrum of the decoded wideband signal Ŝ₁₆. The coded parameters constitute the first optional extension of the bitstream, bst₂.

A second optional layer of the bitstream, bst₃, provided by the coding means 705, contains the parameters for improving the quality of the wideband (50-7000 Hz).

The decoder of Figure 14b is a super-wideband (50-14000 Hz) decoder corresponding to the coder of Figure 14a. The core bitstream bst₁ is decoded by a wideband decoder of the G.729.1 type (module 800). The spectrum of the decoded wideband signal is thus obtained. This spectrum may be improved by decoding, in 801, the second optional extension layer bst₃. Module 801 also includes the frequency-time transformation of the wideband signal. The present invention is not used in this frequency-time transformation to reduce pre-echoes because echo-free time signals are available here (CELP and TDBWE components of the G.729.1 coder), and so the technique described in French patent application FR 06 01466 can be applied. The decoded wideband signal is then oversampled by a factor of 2 in the oversampling means 802.

When the first optional extension layer bst₂ is available at the decoder, it is decoded by the decoding means 803.

This decoding is based on the spectrum of the decoded wideband signal Ŝ₁₆. The spectrum thus obtained contains non-zero values only in the 7000-14000 Hz frequency range not coded by the wideband part. In this configuration, no pre-echo-free reference signals are available between 7000 and 14000 Hz. The attenuation device according to the invention is therefore used.

The time signal is obtained by inverse frequency-time transformation in module 504. The overlap-add reconstruction module provides a reconstructed signal. Pre-echo reduction according to the present invention is performed by the attenuation module 807, as described with reference to Figure 3.

Note that for this application, the signal after the inverse MDCT contains only frequencies above 7000 Hz. The temporal envelope of this signal can therefore be determined with very high precision, which increases the effectiveness of pre-echo attenuation by the attenuation method of the invention.

An embodiment of an attenuation device according to the invention is now described with reference to Figure 15.

In hardware terms, this device 100 within the meaning of the invention typically comprises a processor µP cooperating with a memory block BM including a storage and/or working memory, as well as the aforementioned buffer memory MEM as a means for storing, for example, the temporal envelope of the current frame, the attenuation factor calculated for the last sample of the current frame, the energy of the sub-blocks of the current frame, or any other data necessary for implementing the attenuation method as described with reference to Figures 5 to 7. This device receives as input successive frames of the digital signal Se and delivers the signal Sa reconstructed with pre-echo attenuation where applicable.

The memory block BM may comprise a computer program comprising the code instructions for implementing the steps of the method according to the invention when these instructions are executed by a processor µP of the device, and in particular: a step of defining a concatenated signal, on the basis at least of the reconstructed signal of the current frame; a step of dividing said concatenated signal into sub-blocks of samples of determined length; a step of calculating the temporal envelope of the concatenated signal; a step of detecting a transition of the temporal envelope to a high-energy zone; a step of determining the low-energy sub-blocks preceding a sub-block in which a transition has been detected; and a step of attenuation in the determined sub-blocks.
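By way of illustration only, the first five steps listed above (concatenation, division, envelope calculation, transition detection, determination of the low-energy sub-blocks) might be sketched as follows in Python. This is a minimal sketch under stated assumptions, not the implementation of the invention: the function names, the `next_part` argument (whatever signal the decoder concatenates with the current frame), the use of the maximum-energy sub-block as the transition sub-block, and the threshold value 16.0 are all hypothetical choices made for the example.

```python
from typing import List, Optional, Sequence


def sub_block_energies(signal: Sequence[float], sub_len: int) -> List[float]:
    """Temporal envelope taken as the energy of each consecutive sub-block."""
    return [
        sum(x * x for x in signal[start:start + sub_len])
        for start in range(0, len(signal) - sub_len + 1, sub_len)
    ]


def detect_transition(envelope: Sequence[float],
                      ratio_threshold: float = 16.0) -> Optional[int]:
    """Return the index of the sub-block holding the transition, or None.

    Assumption: a transition is declared when the ratio of the largest to
    the smallest sub-block energy exceeds ratio_threshold (hypothetical value).
    """
    if not envelope:
        return None
    peak = max(range(len(envelope)), key=lambda k: envelope[k])
    lowest = max(min(envelope), 1e-12)  # guard against division by zero
    if envelope[peak] / lowest <= ratio_threshold:
        return None
    return peak


def pre_echo_candidates(current_frame: Sequence[float],
                        next_part: Sequence[float],
                        sub_len: int,
                        ratio_threshold: float = 16.0) -> List[int]:
    """Indices of the sub-blocks that precede the detected transition."""
    # Step 1: concatenated signal built at least from the current frame.
    concatenated = list(current_frame) + list(next_part)
    # Steps 2 and 3: division into sub-blocks and envelope calculation.
    envelope = sub_block_energies(concatenated, sub_len)
    # Step 4: transition towards a high-energy zone.
    peak = detect_transition(envelope, ratio_threshold)
    # Step 5: sub-blocks located before the transition sub-block.
    return [] if peak is None else list(range(peak))
```

For instance, with a quiet current frame followed by an onset in the last sub-block of the concatenated signal, `pre_echo_candidates([0.01] * 8, [0.01] * 4 + [1.0] * 4, 4)` designates the three sub-blocks preceding the onset, whereas a signal of constant level yields no candidate.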

The attenuation is performed according to an attenuation factor calculated for each of the determined sub-blocks, as a function of the temporal envelope of the concatenated signal.
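As a purely illustrative sketch, the calculation of such a per-sub-block factor could look like the Python below. It follows the two-threshold comparison set out in the claims (ratio of the maximum energy of the transition sub-block to the energy of the current sub-block) and bounds the factor from below according to the envelope of the previous frame. The threshold values (16 and 32), the gain values (0.5 and 0.1), the exact form of the lower bound and all function names are assumptions chosen for the example, not values taken from the invention.

```python
import math
from typing import Dict, List, Sequence


def attenuation_factor(peak_energy: float,
                       block_energy: float,
                       prev_frame_energy: float,
                       first_threshold: float = 16.0,
                       second_threshold: float = 32.0,
                       low_attenuation: float = 0.5,
                       high_attenuation: float = 0.1) -> float:
    """Factor for one low-energy sub-block preceding the transition."""
    eps = 1e-12  # guard against division by zero
    ratio = peak_energy / max(block_energy, eps)
    if ratio <= first_threshold:
        factor = 1.0  # value inhibiting the attenuation
    elif ratio <= second_threshold:
        factor = low_attenuation
    else:
        factor = high_attenuation
    # Hypothetical lower bound: do not push the sub-block below the level
    # observed in the previous frame.
    floor = min(1.0, math.sqrt(prev_frame_energy / max(block_energy, eps)))
    return max(factor, floor)


def apply_attenuation(signal: Sequence[float],
                      sub_len: int,
                      factors: Dict[int, float]) -> List[float]:
    """Scale each sample by the factor of its sub-block (1.0 if absent)."""
    return [x * factors.get(i // sub_len, 1.0) for i, x in enumerate(signal)]
```

With these assumed values, a sub-block whose energy is 100 times lower than the transition peak receives a factor of 0.1 when the previous frame was silent, but only 0.5 when the previous frame carried a quarter of that sub-block's energy, which illustrates the role of the lower bound.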

Figures 5 to 7 may illustrate the algorithm of such a computer program.

This attenuation device according to the invention may be independent or integrated into a digital signal decoder.

Claims

Method for attenuating pre-echoes in a digital audio signal produced on the basis of a transform coding, in which, upon decoding, for a current frame of this digital audio signal, the method comprises:
- a step (CONC) of defining a concatenated signal, on the basis at least of the reconstructed signal of the current frame;

- a step (DIV, 301) of dividing said concatenated signal into sub-blocks of samples of determined length;

- a step (ENV, 302) of calculating a temporal envelope of the concatenated signal;

- a step (DETECT, 304) of detecting a transition of the temporal envelope to a high-energy zone;

- a step (DETECT, 304) of determining the sub-blocks of low energy preceding a sub-block in which a transition has been detected; and

- a step (ATT) of attenuation in the determined sub-blocks,
the method being characterized in that the attenuation is performed according to an attenuation factor calculated for each of the determined sub-blocks, as a function of the temporal envelope of the concatenated signal and of the temporal envelope of the reconstructed signal of the previous frame.
Method according to Claim 1, characterized in that a minimum value is fixed for an attenuation value of the factor as a function of the temporal envelope of the reconstructed signal of the previous frame.
Method according to Claim 1, characterized in that the attenuation factor is determined as a function of the temporal envelope of said sub-block, of the maximum of the temporal envelope of the sub-block comprising said transition and of the temporal envelope of the reconstructed signal of the previous frame.
Method according to one of Claims 1 to 3, characterized in that the temporal envelope is determined by a sub-block energy calculation.
Method according to Claim 1, characterized in that it furthermore comprises a step of calculating and storing the temporal envelope of the current frame after the step of attenuation in the determined sub-blocks.
Method according to Claim 1, characterized in that an attenuation factor of value 1 is allocated to the samples of said sub-block comprising the transition as well as to the samples of the following sub-blocks in the current frame.
Method according to Claim 4, characterized in that the attenuation factor is determined for each determined sub-block according to the following steps:
- calculation of the ratio of the maximum energy determined in the sub-block comprising a transition over the energy of the current sub-block;

- comparison of the ratio with a first threshold;

- in the case where the ratio is less than or equal to the first threshold, allocation of a value inhibiting the attenuation to the attenuation factor;

- in the case where the ratio is greater than the first threshold:
. comparison of the ratio with a second threshold;

. in the case where the ratio is less than or equal to the second threshold, allocation of a low attenuation value to the attenuation factor;

. in the case where the ratio is greater than the second threshold, allocation of a high attenuation value to the attenuation factor.
Method according to Claim 1, characterized in that a smoothing function is determined between the factors calculated sample by sample.
Method according to Claim 1, characterized in that a factor correction is performed for the sub-block preceding the sub-block comprising a transition, by applying an attenuation value inhibiting the attenuation, to the attenuation factor applied to a predetermined number of samples of the sub-block preceding the sub-block comprising a transition.
Device for attenuating pre-echoes in a digital audio signal produced on the basis of a transform coder, in which the device, associated with a decoder, comprises, for processing a current frame of this digital audio signal:
- a module (101) for defining a concatenated signal, on the basis at least of the reconstructed signal of the current frame;

- a module (102) for dividing said concatenated signal into sub-blocks of samples of determined length;

- a module (103) for calculating a temporal envelope of the concatenated signal;

- a module (104) for detecting a transition of the temporal envelope to a high-energy zone;

- a module (104) for determining the sub-blocks of low energy preceding a sub-block in which a transition has been detected; and

- a module (105) for attenuation in the determined sub-blocks,
the device being characterized in that the attenuation module performs the attenuation according to an attenuation factor calculated for each of the determined sub-blocks, as a function at least of the temporal envelope of the concatenated signal and of the temporal envelope of the reconstructed signal of the previous frame.
Decoder of a digital audio signal comprising a device according to Claim 10.
Computer program comprising code instructions for the implementation of the steps of the method according to one of Claims 1 to 9, when these instructions are executed by a processor.