EP3192073B1

EP3192073B1 - Discrimination and attenuation of pre-echoes in a digital audio signal

Info

Publication number: EP3192073B1
Application number: EP15771686.1A
Authority: EP
Inventors: Balazs Kovesi; Stéphane RAGOT
Original assignee: Orange SA
Current assignee: Orange SA
Priority date: 2014-09-12
Filing date: 2015-09-11
Publication date: 2018-08-01
Anticipated expiration: 2035-09-11
Also published as: CN112086107A; FR3025923A1; CN112086107B; KR20170055515A; EP3192073A1; WO2016038316A1; ES2692831T3; JP2020170187A; US20170263263A1; KR102000227B1; JP2017532595A; JP6728142B2; JP7008756B2; US10083705B2; CN106716529A; CN106716529B

Description

The invention relates to a method and a device for discriminating and attenuating pre-echoes when decoding a digital audio signal.

For the transmission of digital audio signals over telecommunications networks, whether fixed or mobile networks for example, or for the storage of signals, use is made of compression processes (or source coding) implementing coding systems which are generally of the time-domain coding type using linear prediction or of the frequency-domain coding type using a transform.

The method and the device which are the subject of the invention thus have as their field of application the compression of sound signals, in particular digital audio signals coded by frequency transform.

Figure 1 shows, by way of illustration, a block diagram of the coding and decoding of a digital audio signal by transform, including an overlap-add analysis-synthesis according to the prior art.

Certain musical sequences, such as percussion, and certain speech segments such as plosives (/k/, /t/, ...), are characterized by extremely abrupt attacks which result in very fast transitions and a very strong variation in the dynamic range of the signal within a few samples. An example of a transition is given in figure 1, starting at sample 410.

For the coding/decoding processing, the input signal is split into blocks of samples of length L, whose boundaries are shown in figure 1 by dotted vertical lines. The input signal is denoted x(n), where n is the sample index. Splitting into successive blocks (or frames) leads to defining the blocks $X_N(n) = [x(N \cdot L) \dots x(N \cdot L + L - 1)] = [x_N(0) \dots x_N(L - 1)]$, where N is the index of the block (or frame) and L is the frame length. In figure 1, L = 160 samples. In the case of the modified discrete cosine transform (MDCT), two blocks $X_N(n)$ and $X_{N+1}(n)$ are analyzed jointly to give a block of transformed coefficients associated with the frame of index N, and the analysis window is sinusoidal.
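The block splitting described above can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and the exact sine-window formula is an assumption (a common MDCT choice), since the text only states that the analysis window is sinusoidal.

```python
import numpy as np

def split_into_blocks(x, L):
    """Return an array whose row N is X_N = [x(N*L) ... x(N*L + L - 1)]."""
    n_blocks = len(x) // L  # trailing samples that do not fill a block are dropped
    return np.reshape(x[:n_blocks * L], (n_blocks, L))

def sine_window(L):
    """Sinusoidal window of length 2L (assumed formula, not given in the text)."""
    n = np.arange(2 * L)
    return np.sin(np.pi * (n + 0.5) / (2 * L))

L = 160                                # frame length used in figure 1
x = np.arange(4 * L, dtype=float)      # stand-in input signal x(n)
blocks = split_into_blocks(x, L)

# Blocks N and N+1 are analyzed jointly over one windowed 2L-sample segment.
N = 1
segment = np.concatenate([blocks[N], blocks[N + 1]]) * sine_window(L)
```

Each transform therefore spans 2L samples while advancing by L samples, which is why a transition can fall anywhere inside the analysis window.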

The division into blocks, also called frames, performed by transform coding is totally independent of the sound signal, and transitions can therefore appear at any point in the analysis window. However, after transform decoding, the reconstructed signal is tainted by "noise" (or distortion) generated by the quantization (Q) / inverse quantization (Q^-1) operation. This coding noise is distributed in time relatively uniformly over the entire temporal support of the transformed block, that is to say over the entire length of the window of 2L samples (with an overlap of L samples). The energy of the coding noise is in general proportional to the energy of the block and depends on the coding/decoding bit rate.

For a block containing an attack (such as block 320-480 in figure 1), the signal energy is high, so the noise level is also high.

In transform coding, the level of the coding noise is typically lower than that of the signal for the high-energy segments which immediately follow the transition, but it is higher than that of the signal for the lower-energy segments, in particular in the part preceding the transition (samples 160 to 410 in figure 1). For that part, the signal-to-noise ratio is negative and the resulting degradation can be very annoying when listening. The coding noise before the transition is called pre-echo, and the noise after the transition is called post-echo.

It can be seen in figure 1 that the pre-echo affects the frame preceding the transition as well as the frame in which the transition occurs.

Psychoacoustic experiments have shown that the human ear performs only fairly limited temporal pre-masking of sounds, of the order of a few milliseconds. The noise preceding the attack, or pre-echo, is audible when the duration of the pre-echo is greater than the duration of the pre-masking.

The human ear also performs post-masking of longer duration, from 5 to 60 milliseconds, when passing from high-energy sequences to low-energy sequences. The acceptable rate or level of annoyance is therefore higher for post-echoes than for pre-echoes.

The pre-echo phenomenon, which is more critical, is all the more annoying as the block length in number of samples increases. However, in transform coding, it is well known that for stationary signals, the longer the transform, the greater the coding gain. At a fixed sampling frequency and a fixed bit rate, if the number of points in the window (hence the transform length) is increased, more bits per frame are available for coding the frequency lines deemed useful by the psychoacoustic model, hence the advantage of using long blocks. MPEG AAC (Advanced Audio Coding), for example, uses a long window which contains a fixed number of samples, 2048, i.e. a duration of 64 ms at a sampling frequency of 32 kHz; the pre-echo problem is handled there by allowing switching from these long windows to 8 short windows by way of intermediate (so-called transition) windows, which requires a certain delay at the encoder in order to detect the presence of a transition and adapt the windows. The length of these short windows is therefore 256 samples (8 ms at 32 kHz). At low bit rates, an audible pre-echo of a few ms can still remain. Window switching makes it possible to attenuate the pre-echo but not to eliminate it. Transform coders used for conversational applications, such as ITU-T G.722.1, G.722.1C or G.719, often use a frame length of 20 ms and a window of 40 ms duration at 16, 32 or 48 kHz (respectively). It may be noted that the ITU-T G.719 coder incorporates a window switching mechanism with transient detection; however, the pre-echo is not completely reduced at low bit rates (typically at 32 kbit/s).
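The window durations quoted above follow directly from the sample counts and sampling frequencies. The short sketch below reproduces that arithmetic; the helper name `window_duration_ms` is hypothetical and used only for illustration.

```python
def window_duration_ms(num_samples, sampling_rate_hz):
    """Duration in milliseconds of a window of num_samples samples."""
    return 1000.0 * num_samples / sampling_rate_hz

# Long AAC window and one of the 8 short windows, both at 32 kHz.
long_ms = window_duration_ms(2048, 32000)
short_ms = window_duration_ms(2048 // 8, 32000)

# Number of samples in a 20 ms frame at 16, 32 and 48 kHz (integer arithmetic).
frame_samples = {fs: fs * 20 // 1000 for fs in (16000, 32000, 48000)}
```

A short window of 8 ms is still longer than the few milliseconds of pre-masking mentioned earlier, which is consistent with the remark that window switching attenuates the pre-echo without eliminating it.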

In order to reduce the aforementioned annoying effect of the pre-echo phenomenon, various solutions have been proposed at the encoder and/or the decoder.

Window switching was mentioned above; it requires the transmission of auxiliary information to identify the type of window used in the current frame. Another solution consists in applying adaptive filtering. In the zone preceding the attack, the reconstructed signal is viewed as the sum of the original signal and the quantization noise.

A corresponding filtering technique was described in the article by Y. Mahieux and J. P. Petit, "High Quality Audio Transform Coding at 64 kbits", IEEE Trans. on Communications, Vol. 42, No. 11, November 1994.

Implementing such filtering requires knowledge of parameters, some of which, such as the prediction coefficients and the variance of the signal corrupted by the pre-echo, are estimated at the decoder from the noisy samples. On the other hand, information such as the energy of the original signal can be known only at the encoder and must consequently be transmitted. This requires the transmission of additional information, which, at a constrained bit rate, reduces the relative budget allocated to transform coding. When the received block contains an abrupt variation in dynamics, the filtering processing is applied to it.

The aforementioned filtering process does not make it possible to recover the original signal, but provides a strong reduction of pre-echoes. It does, however, require the additional parameters to be transmitted to the decoder.

Unlike the preceding solutions, various pre-echo reduction techniques that require no specific transmission of information have been proposed. For example, a review of pre-echo reduction in the context of hierarchical coding is presented in the article B. Kövesi, S. Ragot, M. Gartner, H. Taddei, "Pre-echo reduction in the ITU-T G.729.1 embedded coder," EUSIPCO, Lausanne, Switzerland, August 2008.

A typical example of a pre-echo attenuation method without auxiliary information is described in French patent application FR 08 56248. In this example, attenuation factors are determined per sub-block, in the low-energy sub-blocks preceding a sub-block in which a transition or attack has been detected.

The attenuation factor g(k) in the k-th sub-block is calculated, for example, as a function of the ratio R(k) between the energy of the highest-energy sub-block and the energy of the k-th sub-block in question:

$g (k) = f (R (k))$

where f is a decreasing function with values between 0 and 1 and k is the number of the sub-block. Other definitions of the factor g(k) are possible, for example as a function of the energy En(k) in the current sub-block and the energy En(k-1) in the preceding sub-block.
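As an illustration of $g(k) = f(R(k))$, the sketch below computes sub-block energies and maps the energy ratio to a gain. The text does not specify f, so `f_hypothetical`, its thresholds `r_low` and `r_high`, and its floor `g_min` are purely assumed values chosen for the example.

```python
import numpy as np

def subblock_energies(frame, sub_len):
    """Energy En(k) of each sub-block; len(frame) must be a multiple of sub_len."""
    sub_blocks = np.reshape(np.asarray(frame, dtype=float), (-1, sub_len))
    return np.sum(sub_blocks ** 2, axis=1)

def f_hypothetical(ratio, r_low=16.0, r_high=32.0, g_min=0.1):
    """Assumed decreasing map from R(k) to a gain in [g_min, 1] (not from the text)."""
    if ratio <= r_low:
        return 1.0                      # energy close to the maximum: no attenuation
    if ratio >= r_high:
        return g_min                    # much weaker than the maximum: strongest attenuation
    t = (ratio - r_low) / (r_high - r_low)
    return 1.0 + t * (g_min - 1.0)      # linear interpolation in between

def attenuation_factors(energies, eps=1e-12):
    """g(k) = f(R(k)) with R(k) = max energy / En(k)."""
    e_max = float(np.max(energies))
    return np.array([f_hypothetical(e_max / max(float(e), eps)) for e in energies])
```

For sub-block energies `[1, 1, 1, 100]`, the first three sub-blocks have R(k) = 100 and receive the floor gain, while the high-energy sub-block has R(k) = 1 and is left untouched. In the method described, only the low-energy sub-blocks preceding the detected attack would actually be attenuated.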

If the energy of the sub-blocks varies little relative to the maximum energy in the sub-blocks considered in the current frame, no attenuation is necessary; the factor g(k) is then set to a value that inhibits attenuation, that is to say 1. Otherwise, the attenuation factor lies between 0 and 1.

In most cases, especially when the pre-echo is annoying, the frame preceding the pre-echo frame has a homogeneous energy which corresponds to the energy of a low-energy segment (typically background noise). Experience shows that it is neither useful nor even desirable for the signal energy, after pre-echo attenuation processing, to become lower than the average energy (per sub-block) of the signal preceding the processing zone, typically that of the preceding frame, denoted $\overline{En}$, or that of the second half of the preceding frame, denoted $\overline{En}'$.

For the sub-block of index k to be processed, the limit value of the attenuation factor, denoted $\lim_{g}(k)$, can be calculated so as to obtain exactly the same energy as the average energy per sub-block of the segment preceding the sub-block to be processed. This value is of course limited to a maximum of 1, since attenuation values are of interest here. More precisely, the following is defined here:

$\lim_{g} (k) = \min (\sqrt{\frac{\max (\overline{En}, \overline{En}')}{En (k)}},1)$

where the average energy of the preceding segment is approximated by the value $\max(\overline{En}, \overline{En}')$.

The value $\lim_{g}(k)$ thus obtained serves as a lower limit in the final calculation of the attenuation factor of the sub-block; it is therefore used as follows:

$g (k) = \max (g (k), \lim_{g} (k))$
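The limit value and its use as a lower bound can be sketched as follows. The function and parameter names are hypothetical, and the handling of a zero-energy sub-block is an assumption added only to keep the example safe to run.

```python
import math

def gain_lower_limit(en_k, en_prev_mean, en_prev_half_mean):
    """lim_g(k) = min(sqrt(max(En_bar, En_bar') / En(k)), 1)."""
    reference = max(en_prev_mean, en_prev_half_mean)
    if en_k <= 0.0:
        return 1.0  # assumption: a silent sub-block needs no attenuation
    return min(math.sqrt(reference / en_k), 1.0)

def limit_gain(g_k, en_k, en_prev_mean, en_prev_half_mean):
    """g(k) = max(g(k), lim_g(k)): never attenuate below the preceding energy level."""
    return max(g_k, gain_lower_limit(en_k, en_prev_mean, en_prev_half_mean))
```

With En(k) = 16 and a reference energy of 4, the limit is 0.5: a gain of 0.5 scales the sub-block energy by 0.25, bringing it exactly to 4. A proposed gain of 0.1 would therefore be raised to 0.5, while a proposed gain of 0.8 would be kept.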

The attenuation factors (or gains) g(k) determined per sub-block can then be smoothed by a smoothing function applied sample by sample, in order to avoid abrupt variations in the attenuation factor at block boundaries.

For example, the gain per sample can first be defined as a piecewise constant function:

$g_{pre} (n) = g (k), n = kL', \dots, (k + 1) L' - 1$

where L' represents the length of a sub-block.
The function is then smoothed according to the following equation:

$g_{pre} (n) := \alpha g_{pre} (n - 1) + (1 - \alpha) g_{pre} (n), n = 0, \dots, L - 1$

with the convention that $g_{pre}(-1)$ is the last attenuation factor obtained for the last sample of the preceding sub-block, and α is the smoothing coefficient, typically α = 0.85.
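The two steps above (expansion to a per-sample gain, then first-order recursive smoothing) might be written as in the sketch below. The function names and the default value of `g_prev_last` are assumptions for illustration.

```python
import numpy as np

def piecewise_constant_gain(g_sub, sub_len):
    """g_pre(n) = g(k) for n = k*L', ..., (k+1)*L' - 1."""
    return np.repeat(np.asarray(g_sub, dtype=float), sub_len)

def smooth_recursive(g_pre, g_prev_last=1.0, alpha=0.85):
    """g_pre(n) := alpha * g_pre(n-1) + (1 - alpha) * g_pre(n).

    g_prev_last plays the role of g_pre(-1), the last smoothed gain
    of the preceding sub-block (assumed to be 1.0 by default here).
    """
    out = np.empty(len(g_pre))
    prev = g_prev_last
    for n, g in enumerate(g_pre):
        prev = alpha * prev + (1.0 - alpha) * g  # uses the already smoothed g_pre(n-1)
        out[n] = prev
    return out
```

Because the recursion reuses the smoothed previous value, a step in the piecewise constant gain is turned into an exponential approach towards the new value, at a rate set by α.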

Other smoothing functions are also possible, for example a linear cross-fade over u samples:

$g_{pre} (n) = \frac{1}{u} \sum_{i = 0}^{u - 1} g_{pre}' (n - i), n = 0, \dots, L - 1$

where $g_{pre}'(n)$ is the unsmoothed attenuation and $g_{pre}(n)$ is the smoothed attenuation; the values $g_{pre}'(n)$ for n = -(u-1), ..., -1 are the last u-1 attenuation factors obtained for the last samples of the preceding sub-block. For example, u = 5 can be taken.
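The averaging over u samples can be sketched with a convolution, as below. The function name and the `history` argument (holding the u-1 unsmoothed gains from before the current frame, oldest first) are hypothetical conventions chosen for this example.

```python
import numpy as np

def smooth_moving_average(g_unsmoothed, history, u=5):
    """g_pre(n) = (1/u) * sum_{i=0}^{u-1} g_pre'(n - i), n = 0, ..., L - 1.

    history holds g_pre'(n) for n = -(u-1), ..., -1, oldest first.
    """
    history = np.asarray(history, dtype=float)
    if len(history) != u - 1:
        raise ValueError("history must contain exactly u - 1 values")
    padded = np.concatenate([history, np.asarray(g_unsmoothed, dtype=float)])
    kernel = np.full(u, 1.0 / u)
    # 'valid' keeps only the positions where the full u-sample average is defined,
    # which yields exactly one output per sample of the current frame.
    return np.convolve(padded, kernel, mode="valid")
```

Unlike the recursive form, this average reaches the new gain value exactly after u samples, so a step in the unsmoothed gain becomes a linear ramp.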

Once the factors $g_{pre}(n)$ have been calculated in this way, the pre-echo attenuation is applied to the reconstructed signal in the current frame, $x_{rec}(n)$, by multiplying each sample by the corresponding factor:

$x_{rec, g} (n) = g_{pre} (n) x_{rec} (n), n = 0, \dots, L - 1$

where $x_{rec,g}(n)$ is the signal decoded and post-processed by the pre-echo reduction.

Figures 2 and 3 illustrate the implementation of the attenuation method as described in the aforementioned prior-art patent application and summarized above.

In these examples the signal is sampled at 32 kHz, the frame length is L = 640 samples and each frame is divided into 8 sub-blocks of K = 80 samples.

Part a) of figure 2 shows a frame of an original signal sampled at 32 kHz. An attack (or transition) in the signal is located in the sub-block beginning at index 320. This signal was coded by an MDCT-type transform coder at a low bit rate (24 kbit/s).

Part b) of figure 2 illustrates the result of decoding without pre-echo processing. The pre-echo can be observed from sample 160 onwards, in the sub-blocks preceding the one containing the attack.

Part c) shows the evolution of the pre-echo attenuation factor (solid line) obtained by the method described in the aforementioned prior-art patent application. The dotted line represents the factor before smoothing. It can be noted here that the position of the attack is estimated to be around sample 380 (in the block delimited by samples 320 and 400).

Part d) illustrates the result of decoding after application of the pre-echo processing (multiplication of signal b) by signal c)). It can be seen that the pre-echo has indeed been attenuated. Figure 2 also shows that the smoothed factor does not rise back to 1 at the moment of the attack, which implies a reduction in the amplitude of the attack. The perceptible impact of this reduction is very small but can nevertheless be avoided. Figure 3 illustrates the same example as figure 2, in which, before smoothing, the attenuation factor value is forced to 1 for the last few samples of the sub-block preceding the sub-block in which the attack is located. Part c) of figure 3 gives an example of such a correction.

In this example, the factor value 1 has been assigned to the last 16 samples of the sub-block preceding the attack, starting from index 364. The smoothing function thus gradually increases the factor so that it has a value close to 1 at the moment of the attack. The amplitude of the attack is then preserved, as illustrated in part d) of figure 3; on the other hand, a few pre-echo samples are not attenuated.
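The effect of forcing the gain to 1 before smoothing can be reproduced numerically with the sketch below. The frame length (640), the attack position (380), the 16 forced samples and α = 0.85 come from the text; the attenuation value 0.1 and the start of the pre-echo zone at index 160 in the gain array are assumptions chosen only to make the example concrete.

```python
import numpy as np

def smooth_recursive(g_pre, g_prev_last=1.0, alpha=0.85):
    """First-order recursive smoothing of a per-sample gain."""
    out = np.empty(len(g_pre))
    prev = g_prev_last
    for n, g in enumerate(g_pre):
        prev = alpha * prev + (1.0 - alpha) * g
        out[n] = prev
    return out

L, attack_pos, n_forced = 640, 380, 16

g = np.ones(L)
g[160:attack_pos] = 0.1                              # assumed gain in the pre-echo zone

g_forced = g.copy()
g_forced[attack_pos - n_forced:attack_pos] = 1.0     # indices 364..379 forced to 1

plain = smooth_recursive(g)          # figure 2 behaviour: gain still low at the attack
forced = smooth_recursive(g_forced)  # figure 3 behaviour: gain close to 1 at the attack
```

Under these assumptions the smoothed gain at the attack position is roughly 0.24 without the correction and roughly 0.94 with it, at the cost of leaving samples 364 to 379 only partially attenuated.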

In the example of figure 3, pre-echo reduction by attenuation does not make it possible to reduce the pre-echo right up to the attack, because of the smoothing of the gain.

This pre-echo reduction technique can nevertheless be improved for certain types of signal, such as modern music signals for example. Indeed, in some cases, a false detection of pre-echo can occur. Figure 4 illustrates an example of such an original signal, which is uncoded and therefore free of pre-echo. It is a beat of an electronic/synthetic percussion instrument. It can be observed that before the sharp attack near index 1600 there is a synthetic noise which starts near index 1250. This synthetic noise, which is therefore part of the signal, would be detected as a pre-echo by the pre-echo detection algorithm described above, even assuming perfect coding/decoding of the signal. The pre-echo attenuation processing would therefore remove this component of the signal. This would distort the decoded signal (when the coding/decoding is perfect), which is undesirable.

There is therefore a need for an improved technique for discriminating and attenuating pre-echoes at decoding, which makes pre-echo detection more reliable and avoids false detections without any auxiliary information being transmitted by the encoder.

The present invention improves on the state of the art.

To this end, the present invention relates to a method for discriminating and attenuating pre-echo in a digital audio signal generated from transform coding, in which, for a current frame divided into sub-blocks, the low-energy sub-blocks preceding a sub-block in which a transition or attack is detected determine a pre-echo zone in which pre-echo attenuation processing is performed. The method is such that, when an attack is detected from the third sub-block of the current frame onward, it comprises the following steps:

calculating a leading coefficient (slope) of the energies for at least two sub-blocks of the current frame preceding the sub-block in which an attack is detected;
comparing the leading coefficient with a predefined threshold; and
inhibiting the pre-echo attenuation processing in the pre-echo zone when the calculated leading coefficient is below the predefined threshold.

The leading coefficient of the energies, calculated for the sub-blocks preceding the position of the attack, makes it possible to check whether the signal energy tends to increase in the pre-echo zone. This makes pre-echo detection more reliable by avoiding false detections. Indeed, Figure 1 shows that a pre-echo has a typical characteristic: its energy tends to increase as it approaches the attack that causes it. The shape of the overlap-add weighting windows explains this. Even if the pre-echo has a roughly constant energy before overlap-add, the signals at the input of the overlap-add module are multiplied by weighting windows whose weight decreases toward the past. In the example signal of Figure 4, the energy of the signal before the attack is approximately constant, which distinguishes it from a pre-echo. Checking for increasing signal energy in the pre-echo zone therefore increases the reliability of pre-echo detection.
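The effect of the weighting window described above can be illustrated numerically. The following is a minimal sketch, not the codec's actual windows: it assumes coding noise of constant amplitude before overlap-add and uses the rising half of a sine window as a stand-in for the synthesis weighting. The names `subblock_energies`, `flat_noise` and `window` are hypothetical.

```python
import math

def subblock_energies(signal, sub_len):
    """Sum of squared samples over consecutive sub-blocks of sub_len samples."""
    return [sum(x * x for x in signal[i:i + sub_len])
            for i in range(0, len(signal), sub_len)]

L = 640        # frame length, example value from the description
SUB_LEN = 80   # sub-block length, giving 8 sub-blocks

# Assumption: noise of constant amplitude before overlap-add.
flat_noise = [1.0] * L
# Assumption: rising half of a sine window, whose weight decreases toward the past.
window = [math.sin(math.pi * (n + 0.5) / (2 * L)) for n in range(L)]
weighted = [w * x for w, x in zip(window, flat_noise)]

# Energies of the windowed noise rise toward the attack (end of the frame),
# whereas the unwindowed constant noise has the same energy in every sub-block.
pre_echo_energies = subblock_energies(weighted, SUB_LEN)
flat_energies = subblock_energies(flat_noise, SUB_LEN)
```

Under these assumptions, the windowed noise shows the rising energy profile typical of a pre-echo, while a constant-energy component such as the synthetic noise of Figure 4 does not.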

In a particular embodiment, the method further comprises a step of decomposing the digital audio signal into at least two sub-signals according to a frequency criterion, and the calculation and comparison steps are performed for at least one of the sub-signals.

When the attack is detected in the third sub-block of the current frame, the energies of two sub-blocks in the pre-echo zone are used to calculate a leading coefficient and compare it with a threshold. With only two points, and in the case of a decomposition into two sub-signals, checking the high-frequency sub-signal alone is sufficient to identify a false pre-echo detection.

When the number of sub-blocks preceding the sub-block in which an attack position has been detected is sufficient, the method further comprises a step of decomposing the digital audio signal into at least two sub-signals according to a frequency criterion, and the calculation and comparison steps are performed for each of the sub-signals; the pre-echo attenuation processing is inhibited in the pre-echo zone of all the sub-signals when the calculated leading coefficient is below the predefined threshold for at least one sub-signal.

The division into sub-signals thus allows pre-echo attenuation to be carried out independently and in a manner adapted to each sub-signal. The reliability of pre-echo zone detection is reinforced for each sub-signal by checking the value of its respective leading coefficient.

According to a particular embodiment, a different threshold is defined for each sub-signal.

This makes it possible to adapt the check to the spectral characteristics of the sub-signals.
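The decision rule for several sub-signals can be sketched as follows. This is only an illustration of the "at least one sub-signal" rule with one threshold per sub-signal; the function name `inhibit_attenuation` and the numeric values in the usage note are hypothetical and do not come from the description.

```python
def inhibit_attenuation(slopes, thresholds):
    """Decide whether pre-echo attenuation is inhibited in all sub-signals.

    slopes[i] is the leading coefficient of the sub-block energies of
    sub-signal i in the pre-echo zone; thresholds[i] is the predefined
    threshold for that sub-signal.
    """
    if len(slopes) != len(thresholds):
        raise ValueError("one threshold per sub-signal is required")
    # A single sub-signal without a sufficiently rising energy is enough
    # to treat the detection as false for every sub-signal.
    return any(s < t for s, t in zip(slopes, thresholds))
```

For example, `inhibit_attenuation([0.5, 0.01], [0.1, 0.1])` inhibits the attenuation because the second sub-signal has an almost flat energy profile.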

In one embodiment, the leading coefficient is calculated by least-squares estimation.

This calculation method has low complexity.

In one possible embodiment, the leading coefficient is normalized.

The leading coefficient is then more easily compared with a threshold when that threshold is different from 0.

In one possible embodiment, when an attack is detected in the first or second sub-block of the current frame, a leading coefficient calculated for the previous frame is used for the comparison step.
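The choice between a leading coefficient computed on the current frame and one kept from the previous frame can be sketched as follows. This is a minimal sketch assuming 0-based sub-block indices (so the third sub-block has index 2); `slope_for_comparison`, `previous_slope` and the simplified `end_to_end_slope` estimator are hypothetical names introduced for illustration.

```python
def end_to_end_slope(energies):
    """Hypothetical stand-in estimator: mean energy change per sub-block."""
    return (energies[-1] - energies[0]) / (len(energies) - 1)

def slope_for_comparison(attack_subblock, current_energies,
                         previous_slope, estimate_slope):
    """Pick the leading coefficient used in the threshold comparison.

    attack_subblock: 0-based index of the sub-block holding the attack.
    current_energies: sub-block energies of the current frame.
    previous_slope: coefficient stored while processing the previous frame.
    estimate_slope: callable mapping a list of energies to a coefficient.
    """
    if attack_subblock >= 2:
        # At least two sub-blocks precede the attack in the current frame.
        return estimate_slope(current_energies[:attack_subblock])
    # Attack in the first or second sub-block: too few points in this frame.
    return previous_slope
```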

The present invention also relates to a device for discriminating and attenuating pre-echo in a digital audio signal generated from transform coding, comprising a transition or attack detection module, a pre-echo zone discrimination module and a pre-echo attenuation processing module, pre-echo attenuation processing being performed, for a current frame divided into sub-blocks, in the low-energy sub-blocks preceding a sub-block in which a transition or attack is detected, these sub-blocks determining a pre-echo zone. The device is such that, when an attack is detected from the third sub-block of the current frame onward, it further comprises:

a calculation module calculating a leading coefficient of the energies for at least two sub-blocks of the current frame preceding the sub-block in which an attack is detected;
a comparator capable of comparing the leading coefficient with a predefined threshold; and
a discrimination module capable of inhibiting the pre-echo attenuation processing in the pre-echo zone when the calculated leading coefficient is below the predefined threshold.

The advantages of this device are the same as those described for the discrimination and attenuation processing method that it implements.

The invention relates to a decoder of a digital audio signal comprising a device as described above.

The invention also relates to a computer program comprising code instructions for implementing the steps of the method as described above, when these instructions are executed by a processor.

Finally, the invention relates to a storage medium, readable by a processor, which may or may not be integrated into the processing device and may be removable, storing a computer program implementing a processing method as described above.

Other features and advantages of the invention will become more clearly apparent on reading the following description, given solely by way of non-limiting example and with reference to the appended drawings, in which:

Figure 1, described above, illustrates a transform coding-decoding system according to the state of the art;
Figure 2, described above, illustrates an example of a digital audio signal to which a state-of-the-art attenuation method is applied;
Figure 3 illustrates another example of a digital audio signal to which a state-of-the-art attenuation method is applied;
Figure 4, described above, illustrates an example of a signal for which the state-of-the-art technique would wrongly detect a pre-echo;
Figure 5 illustrates an embodiment of a method and a device for discriminating and attenuating pre-echo, included in a decoder according to the invention;
Figure 6 illustrates an example of low-delay analysis windows and synthesis windows for transform coding and decoding liable to create the pre-echo phenomenon;
Figure 7 illustrates an example of a digital audio signal to which the pre-echo attenuation method according to one embodiment of the invention is applied;
Figure 8 illustrates a hardware example of a discrimination and attenuation processing device according to the invention.

With reference to Figure 5, a device 600 for discriminating and attenuating pre-echo is described. The attenuation processing device 600 as described below is included in a decoder comprising an inverse quantization module 610 ($Q^{-1}$) receiving a signal S, an inverse transform module 620 ($MDCT^{-1}$), and a module 630 for reconstructing the signal by overlap-add (add/rec), as described with reference to Figure 1, which delivers a reconstructed signal $x_{rec}(n)$ to the discrimination and attenuation processing device according to the invention. The MDCT transform is taken here as an example because it is the most common in speech and audio coding; however, the device 600 also applies to any other type of transform (FFT, DCT, etc.).

At the output of the device 600, a processed signal Sa is provided in which pre-echo attenuation has been performed.

The device 600 implements a method for discriminating and attenuating pre-echoes in the decoded signal $x_{rec}(n)$.

In one embodiment of the invention, the discrimination and attenuation processing method comprises a step (E601) of detecting, in the decoded signal $x_{rec}(n)$, the attacks that may generate a pre-echo.

Thus, the device 600 comprises a detection module 601 capable of implementing a step (E601) of detecting the position of an attack in a decoded audio signal.

An attack (or onset) is a rapid transition and an abrupt change in the dynamics (or amplitude) of the signal. Signals of this type may be referred to by the more general term "transient". In what follows, and without loss of generality, only the terms attack and transition are used, and they also cover transients.

Each current frame of L samples of the decoded signal $x_{rec}(n)$ is divided into K sub-blocks of length L', with for example L = 640 samples (20 ms) at 32 kHz, L' = 80 samples (2.5 ms) and K = 8. The sub-blocks are thus preferably of identical size, but the invention remains valid and is easily generalized when the sub-blocks have variable sizes. This may be the case, for example, when the frame length L is not divisible by the number of sub-blocks K, or when the frame length is variable.
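The division of a frame into sub-blocks can be sketched as follows. The description does not specify how samples are distributed when L is not divisible by K; giving one extra sample to the first sub-blocks is an assumption made here for illustration, and `subblock_bounds` is a hypothetical name.

```python
def subblock_bounds(frame_len, num_subblocks):
    """Return the (start, end) sample indices of each sub-block, end exclusive."""
    base, extra = divmod(frame_len, num_subblocks)
    bounds = []
    start = 0
    for k in range(num_subblocks):
        # Assumption: the first `extra` sub-blocks absorb the remainder.
        length = base + (1 if k < extra else 0)
        bounds.append((start, start + length))
        start += length
    return bounds
```

With the example values of the description, `subblock_bounds(640, 8)` yields eight sub-blocks of 80 samples each, that is, 2.5 ms at 32 kHz.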

Special low-delay analysis-synthesis windows similar to those described in ITU-T Recommendation G.718 are used for the analysis part and for the synthesis part of the MDCT transform. An example of such windows is illustrated in Figure 6. The delay introduced by the transform is only 280 samples, as opposed to 640 samples when conventional sinusoidal windows are used. The MDCT memory with the special low-delay analysis-synthesis windows therefore contains only 140 independent samples (not folded with the current frame), as opposed to 320 samples when conventional sinusoidal windows are used.

Figure 6 shows, for the analysis windows (Ana.), that the folding zone is delimited by the dotted lines between samples 820 and 1100. The folding line is shown as a dash-dotted line at sample 960.

For synthesis (Synth.), only the samples in the interval M (140 samples) are needed to obtain the information on the folding zone of the analysis, by exploiting the symmetry. These samples held in memory are then used to decode this folding zone, together with the folded samples of the window of the next frame. If an attack occurs in this zone between samples 820 and 1100, the mean energy of the samples in the interval M is markedly higher than the energy of the sub-frames preceding sample 820. An abrupt increase in the energy of the interval M held in the MDCT memory can therefore indicate an attack in the next frame that may generate a pre-echo in the current frame.

The MDCT memory $x_{MDCT}(n)$, which gives a time-folded version of the future signal, is used. With the special low-delay analysis-synthesis windows illustrated in Figure 6, only one block (K' = 1) of length $L_m(0) = 140$ is retained, which contains all the independent samples of the MDCT memory. Although this sub-block contains more samples, its energy remains comparable with that of the sub-blocks of the current frame (if the signal remains stable) because the memory part has been windowed (and therefore attenuated) by the analysis window.

Indeed, Figure 1 shows that the pre-echo affects the frame preceding the frame in which the attack is located, and it is desirable to detect an attack in the future frame, which is partly contained in the MDCT memory.

The current frame and the MDCT memory can be seen as concatenated signals forming a signal divided into (K + K') consecutive sub-blocks. Under these conditions, the energy in the k-th sub-block is defined as:

$En(k) = \sum_{n = kL'}^{(k + 1) L' - 1} x_{rec}(n)^{2}, \quad k = 0, \dots, K - 1$

when the k-th sub-block is in the current frame, and as:

$En(k) = \sum_{n = 0}^{L_{mem} - 1} x_{MDCT}(n)^{2}$

when the sub-block is in the MDCT memory (which represents the signal available for the future frame), where $L_{mem}$ is the length of the sub-block of the memory part.

The mean energy of the sub-blocks in the current frame is thus obtained as:

$\overline{En} = \frac{1}{K} \sum_{k = 0}^{K - 1} En(k)$

The mean energy of the sub-blocks in the second half of the current frame is also defined as (assuming that K is even):

$\overline{En}' = \frac{2}{K} \sum_{k = K / 2}^{K - 1} En(k)$
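The energy definitions above can be sketched in code as follows. This is a minimal illustration assuming sub-blocks of equal length, a single memory sub-block (K' = 1) and an even K; the function names `frame_energies` and `mean_energies` are hypothetical.

```python
def frame_energies(x_rec, x_mdct, num_subblocks):
    """Energies En(k) of the K frame sub-blocks followed by the memory block."""
    sub_len = len(x_rec) // num_subblocks
    energies = [
        sum(v * v for v in x_rec[k * sub_len:(k + 1) * sub_len])
        for k in range(num_subblocks)
    ]
    # Single sub-block (K' = 1) covering the independent MDCT memory samples.
    energies.append(sum(v * v for v in x_mdct))
    return energies

def mean_energies(energies, num_subblocks):
    """Mean energy over the frame and over its second half (K even)."""
    k = num_subblocks
    mean_all = sum(energies[:k]) / k
    mean_second_half = 2.0 * sum(energies[k // 2:k]) / k
    return mean_all, mean_second_half
```

The memory sub-block is kept last in the list so that the energies follow chronological order, as in the concatenated signal of (K + K') sub-blocks.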

An attack associated with a pre-echo is detected if the ratio

$R(k) = \frac{\max_{n = 0, \dots, K + K' - 1} (En(n))}{En(k)}$

exceeds a predefined threshold in one of the sub-blocks considered. Other pre-echo detection criteria are possible without changing the nature of the invention.

Furthermore, the position of the attack is defined as

$pos = \min \left( L' \cdot \arg\max_{k = 0, \dots, K + K' - 1} (En(k)), \; L \right)$

where the limitation to L ensures that the MDCT memory is never modified. Other methods giving a more accurate estimate of the attack position are also possible.
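The ratio test and the position estimate can be sketched as follows. This is an illustrative sketch rather than the reference detector: restricting the test to sub-blocks that precede the energy maximum and treating zero-energy sub-blocks as an infinite ratio are assumptions, the default threshold of 16 is the typical value quoted in the description, and `detect_attack` is a hypothetical name.

```python
import math

def detect_attack(energies, sub_len, frame_len, threshold=16.0):
    """Return (detected, pos, flagged) from the K + K' sub-block energies.

    detected: True if R(k) exceeds the threshold in at least one sub-block.
    pos: attack position in samples, limited to the frame length.
    flagged: indices of the sub-blocks where R(k) exceeds the threshold.
    """
    peak = max(energies)
    k_max = energies.index(peak)  # first sub-block reaching the maximum
    pos = min(sub_len * k_max, frame_len)
    if peak <= 0.0:
        return False, pos, []
    flagged = []
    # Assumption: only sub-blocks before the maximum can hold a pre-echo.
    for k in range(k_max):
        ratio = peak / energies[k] if energies[k] > 0.0 else math.inf
        if ratio > threshold:
            flagged.append(k)
    return bool(flagged), pos, flagged
```

When the maximum lies in the memory sub-block, `pos` is clipped to the frame length, which corresponds to a pre-echo zone covering the whole current frame.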

The device 600 also comprises a pre-echo zone discrimination module 602 implementing a step (E602) of determining a pre-echo zone (ZPE) preceding the detected attack position. The pre-echo zone here means the zone covering the samples before the estimated position of the attack that are disturbed by the pre-echo generated by the attack and where attenuation of this pre-echo is desirable. In the embodiment presented, the pre-echo zone can be determined on the decoded signal.

In one embodiment for obtaining the pre-echo zones, the energies En(k) are concatenated in chronological order, with first the temporal envelope of the decoded signal, then the envelope of the signal of the next frame estimated from the memory of the MDCT transform. Based on this concatenated temporal envelope and on the mean energies $\overline{En}$ and $\overline{En}'$ of the previous frame, the presence of pre-echo is detected, for example, if the ratio R(k) exceeds a threshold; a typical value for this threshold is 16.

The sub-blocks in which a pre-echo has been detected thus constitute a pre-echo zone, which in general covers the samples n = 0, ..., pos - 1, that is, from the beginning of the current frame to the position of the attack (pos). Note that the pre-echo zone may well extend over the entire current frame if the attack has been detected in the future frame.

The device 600 comprises a calculation module 603 capable of implementing a step of calculating a leading coefficient (or indicator of the trend of variation) of the energies of the sub-blocks preceding the sub-block in which an attack has been detected.

A linear model representing a set of n observations $(t_i, e_i)$, $0 \le i < n$, where $t_i$ are the time indices of the sub-blocks and $e_i$ their energies, is defined by the equation

$e = b_{0} + b_{1} t$

where $b_0$ is the value at time t = 0 and $b_1$ is the leading coefficient (slope). The leading coefficient gives information on the (average) trend of variation of the energy. A positive leading coefficient indicates increasing energies. A value close to 0 indicates constant energy.

The value of $b_1$ can be determined, for example, by least-squares linear regression:

$b_{1} = \frac{\sum (t_{i} - \overline{t}) (e_{i} - \overline{e})}{\sum (t_{i} - \overline{t})^{2}}$

where the summation is carried out over predetermined indices $i$.

The value of $b_1$ also depends on the magnitude (in absolute value) of the energies; it has the dimension of energy over time. To compare the value of $b_1$ more easily with a threshold (for example a fixed one), this dependency can be removed. For example, the value of $b_1$ can be divided by the mean value of the energies to obtain the normalized slope coefficient:

$b_{1n} = \frac{b_{1} n}{\sum e_{i}}$
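
As an illustration only, the least-squares slope and its normalization by the mean energy can be sketched as follows. The function names `slope` and `normalized_slope` are hypothetical, and the sub-block indices are assumed to be $t_i = 0, \dots, n-1$.

```python
def slope(energies):
    """Least-squares slope b1 of the energies against indices t_i = 0..n-1."""
    n = len(energies)
    if n < 2:
        raise ValueError("need at least two sub-block energies")
    t_mean = (n - 1) / 2.0
    e_mean = sum(energies) / n
    num = sum((t - t_mean) * (e - e_mean) for t, e in enumerate(energies))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def normalized_slope(energies):
    """b1 divided by the mean energy, i.e. b1 * n / sum(e_i)."""
    total = sum(energies)
    if total <= 0:
        # Assumption: energies are sums of squares, so a non-positive
        # total means an all-zero zone where the ratio is undefined.
        raise ValueError("energies must have a positive sum")
    return slope(energies) * len(energies) / total
```

With this normalization, scaling all energies by a common factor leaves the result unchanged, which is what makes comparison with a fixed threshold meaningful.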

Alternatively, the correlation coefficient can be used:

$b_{1n\_alt} = \frac{\sum (t_{i} - \overline{t}) (e_{i} - \overline{e})}{\sqrt{\sum (t_{i} - \overline{t})^{2} \sum (e_{i} - \overline{e})^{2}}}$

This alternative has a higher computational complexity because it requires computing a square root.
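
A minimal sketch of this correlation-based alternative is given below, again assuming indices $t_i = 0, \dots, n-1$. The function name `trend_correlation` is hypothetical, and returning 0 for perfectly constant energies is an assumption made here to avoid a division by zero, not something stated in the text.

```python
import math


def trend_correlation(energies):
    """Correlation coefficient between indices 0..n-1 and the energies."""
    n = len(energies)
    if n < 2:
        raise ValueError("need at least two sub-block energies")
    t_mean = (n - 1) / 2.0
    e_mean = sum(energies) / n
    num = sum((t - t_mean) * (e - e_mean) for t, e in enumerate(energies))
    den_t = sum((t - t_mean) ** 2 for t in range(n))
    den_e = sum((e - e_mean) ** 2 for e in energies)
    if den_e == 0.0:
        # Assumption: constant energies are treated as "no trend".
        return 0.0
    return num / math.sqrt(den_t * den_e)
```

The result lies between -1 and 1 and, like the normalized slope, does not depend on the overall energy level.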

Other methods of estimating the slope coefficient are also possible, for example Tukey's median-median method.

Note also that when the slope coefficient is to be compared with a threshold of zero, which amounts to checking the sign of the coefficient, it is not necessary to normalize it.

Furthermore, instead of normalizing the slope coefficient, the threshold can be made variable, since the following relations are equivalent:

$b_{1n} = \frac{b_{1} n}{\sum e_{i}} < \text{threshold}$

$b_{1} < \text{threshold} \cdot \frac{\sum e_{i}}{n}$

If the attack is detected in the first or second sub-block, the verification according to the invention is not possible. If the attack is detected in the third sub-block, the energies of 2 sub-blocks in the pre-echo zone, $e_0$ and $e_1$, are available for this verification ($e_1$ being the closest to the attack). With 2 points, equation (3) for the normalized slope coefficient simplifies to:

$b_{1n} = \frac{2 (e_{1} - e_{0})}{e_{1} + e_{0}}$

If the attack is detected in the fourth sub-block, the energies of 3 sub-blocks in the pre-echo zone, $e_0$, $e_1$ and $e_2$, are available for this verification ($e_2$ being the closest to the attack). With 3 points, equation (3) for the normalized slope coefficient simplifies to:

$b_{1n} = \frac{3 (e_{2} - e_{0})}{2 (e_{2} + e_{1} + e_{0})}$
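
The two closed forms can be checked against the general regression formula. The sketch below does this numerically; the function names are hypothetical and the indices are assumed to be $t_i = 0, \dots, n-1$.

```python
def b1n_general(energies):
    """Normalized least-squares slope, b1 * n / sum(e_i)."""
    n = len(energies)
    t_mean = (n - 1) / 2.0
    e_mean = sum(energies) / n
    num = sum((t - t_mean) * (e - e_mean) for t, e in enumerate(energies))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return (num / den) * n / sum(energies)


def b1n_two_points(e0, e1):
    """Closed form for 2 sub-blocks (e1 closest to the attack)."""
    return 2.0 * (e1 - e0) / (e1 + e0)


def b1n_three_points(e0, e1, e2):
    """Closed form for 3 sub-blocks (e2 closest to the attack)."""
    return 3.0 * (e2 - e0) / (2.0 * (e2 + e1 + e0))
```

In the 3-point case the middle energy $e_1$ only enters through the normalization, because its index coincides with the mean index.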

If 4 or more sub-blocks are available, the slope coefficient can be computed over 4 or more sub-blocks. Experience shows that checking the slope coefficient computed over the 3 sub-blocks preceding the sub-block in which the attack was detected is sufficient to avoid false pre-echo detections. This conclusion applies to the case of 8 sub-blocks per 20 ms frame and can be adapted to the size of the sub-blocks and of the frame.

Thus, in the preferred embodiment, the slope coefficient is computed with at most 3 sub-blocks. This limits the maximum complexity of computing the slope coefficient.

According to the invention, the normalized slope coefficient $b_{1n}$ thus obtained is then compared, in step E604, by a comparator module 604 with a predefined threshold. The threshold may be set to a fixed value or may vary, for example according to the classification of the signal as speech or music. Typically this threshold is equal to 0 if one only checks that the energy does not decrease, or to 0.2 if a slight increase in energy is required in the pre-echo zone. If the normalized slope coefficient $b_{1n}$ is below this threshold, it is concluded that the signal in the pre-echo zone does not correspond to a typical pre-echo, and the attenuation of pre-echoes in this zone is inhibited in step E602. This prevents a decoded signal whose original input signal contains a low-energy component before an attack from being erroneously altered by the pre-echo attenuation module, which would otherwise treat this component as a pre-echo.
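
The decision described above might be sketched as follows. The function name `keep_pre_echo_attenuation`, its parameters and the use of `None` to signal that the check cannot be performed are all assumptions made for illustration; the text does not specify what happens when fewer than 2 sub-blocks precede the attack.

```python
def normalized_slope(energies):
    """Normalized least-squares slope over indices 0..n-1."""
    n = len(energies)
    t_mean = (n - 1) / 2.0
    e_mean = sum(energies) / n
    num = sum((t - t_mean) * (e - e_mean) for t, e in enumerate(energies))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return (num / den) * n / sum(energies)


def keep_pre_echo_attenuation(pre_attack_energies, threshold=0.2, max_blocks=3):
    """Return True to keep attenuation, False to inhibit it, None if unchecked.

    pre_attack_energies: energies of the sub-blocks preceding the attack,
    oldest first. Only the last `max_blocks` values are used.
    """
    window = list(pre_attack_energies)[-max_blocks:]
    if len(window) < 2 or sum(window) <= 0:
        # Assumption: the verification is not applicable in this case.
        return None
    # Below the threshold: not a typical pre-echo, so attenuation is inhibited.
    return normalized_slope(window) >= threshold
```

For example, a flat energy profile passes the check with a threshold of 0 but fails it with a threshold of 0.2, which requires a slight increase in energy.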

Pre-echo attenuation is carried out in step E607 by the attenuation module 607 for the discriminated pre-echo zone. The attenuation factor is computed, for example, as in application FR 08 56248. When the module 604 has identified a false pre-echo detection, either the attenuation factor can be forced to 1, thus inhibiting the attenuation, or the discrimination module 602 does not classify this zone as a pre-echo zone, in which case the attenuation module is not invoked.

In a particular embodiment, the device 600 further comprises a signal decomposition module 605, able to carry out a step E605 of decomposing the decoded signal into at least two sub-signals according to a predetermined criterion. This method is described in particular in application FR12 62598, some elements of which are recalled here.

In a particular embodiment of the invention, the decoded signal $x_{rec}(n)$ is decomposed in step E605 into two sub-signals as follows:

the first sub-signal $x_{rec,ss1}(n)$ is obtained by low-pass filtering using a 3-coefficient, zero-phase FIR (finite impulse response) filter with transfer function $c(n) z^{-1} + (1 - 2c(n)) + c(n) z$, with $c(n)$ a value between 0 and 0.25, where $[c(n), 1 - 2c(n), c(n)]$ are the coefficients of the low-pass filter; this filter is implemented with the difference equation:

$x_{rec,ss1}(n) = c(n) x_{rec}(n - 1) + (1 - 2c(n)) x_{rec}(n) + c(n) x_{rec}(n + 1)$

In a particular embodiment, a constant value $c(n) = 0.25$ is used.

It may be noted that the sub-signal $x_{rec,ss1}(n)$ resulting from this filtering therefore contains mainly low-frequency components of the decoded signal.

the second sub-signal $x_{rec,ss2}(n)$ is obtained by complementary high-pass filtering using a 3-coefficient, zero-phase FIR filter with transfer function $-c(n) z^{-1} + 2c(n) - c(n) z$, where $[-c(n), 2c(n), -c(n)]$ are the coefficients of the high-pass filter; this filter is implemented with the difference equation: $x_{rec,ss2}(n) = -c(n) x_{rec}(n - 1) + 2c(n) x_{rec}(n) - c(n) x_{rec}(n + 1)$. The sub-signal $x_{rec,ss2}(n)$ resulting from this filtering therefore contains mainly high-frequency components of the decoded signal.

Note that $x_{rec,ss1}(n) + x_{rec,ss2}(n) = x_{rec}(n)$.

It is therefore also possible to obtain $x_{rec,ss2}(n)$ by subtracting $x_{rec,ss1}(n)$ from $x_{rec}(n)$, which reduces the computational complexity: $x_{rec,ss2}(n) = x_{rec}(n) - x_{rec,ss1}(n)$.

The attenuated sub-signals are combined to obtain the attenuated signal Sa by simple addition of the attenuated sub-signals in step E608, described later.

So as not to use any future signal for these filtering operations, the decoded signal can for example be extended by one zero-valued sample at the end of the block. In that case, for $n = L - 1$, the sub-signal $x_{rec,ss1}(n)$ is obtained by:

$x_{rec,ss1}(L - 1) = c(L - 1) x_{rec}(L - 2) + (1 - 2c(L - 1)) x_{rec}(L - 1),$

and $x_{rec,ss2}(n)$ is still computed as $x_{rec,ss2}(n) = x_{rec}(n) - x_{rec,ss1}(n)$.
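
The decomposition can be sketched as below for a constant coefficient `c`. The function name `decompose` and the parameter `prev_sample` (the sample preceding the block, needed for $n = 0$) are assumptions; the text only specifies the zero-valued sample appended at the end of the block.

```python
def decompose(x_rec, c=0.25, prev_sample=0.0):
    """Split a block into low-pass and complementary high-pass sub-signals.

    x_rec: decoded samples of the block (length L).
    c: filter coefficient, between 0 and 0.25.
    prev_sample: hypothetical memory of the sample just before the block.
    """
    if not 0.0 <= c <= 0.25:
        raise ValueError("c must lie between 0 and 0.25")
    low = []
    for n, x in enumerate(x_rec):
        past = x_rec[n - 1] if n > 0 else prev_sample
        # The block is extended by a zero sample, so no future frame is needed.
        future = x_rec[n + 1] if n + 1 < len(x_rec) else 0.0
        low.append(c * past + (1.0 - 2.0 * c) * x + c * future)
    # High-pass part obtained by subtraction rather than a second filter.
    high = [x - lo for x, lo in zip(x_rec, low)]
    return low, high
```

By construction the two sub-signals add back to the decoded block sample by sample, so attenuating them separately and summing them leaves unattenuated regions unchanged.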

Note that the two sub-signals here remain at the same sampling frequency as the decoded signal.

A step E606 of computing pre-echo attenuation factors is carried out in the computation module 606. This computation is performed separately for the two sub-signals.

These attenuation factors are obtained per sample of the pre-echo zone determined in E602, as a function of the frame in which the attack was detected and of the previous frame.

The factors $g_{pre,ss1}'(n)$ and $g_{pre,ss2}'(n)$ are then obtained, where $n$ is the index of the corresponding sample. These factors may then be smoothed to obtain the factors $g_{pre,ss1}(n)$ and $g_{pre,ss2}(n)$ respectively. This smoothing matters most for the sub-signals containing the low-frequency components (hence for $g_{pre,ss1}'(n)$ in this example).

An example of attenuation computation is described in patent application FR 08 56248. The attenuation factors are computed per sub-block. In the method described here, they are in addition computed separately for each sub-signal. For the samples preceding the detected attack, the attenuation factors $g_{pre,ss1}'(n)$ and $g_{pre,ss2}'(n)$ are therefore computed. These attenuation values are then optionally smoothed to obtain the attenuation values per sample.

The computation of the attenuation factor of a sub-signal (for example $g_{pre,ss2}'(n)$) can be similar to that described in patent application FR 08 56248 for the decoded signal, as a function of the ratio $R(k)$ (also used for detecting the attack) between the energy of the highest-energy sub-block and the energy of the $k$-th sub-block of the decoded signal. $g_{pre,ss2}'(n)$ is initialized as:

$g_{pre,ss2}'(n) = g(k) = f(R(k)), \quad n = kL', \dots, (k + 1)L' - 1; \quad k = 0, \dots, K - 1$

where $f$ is a decreasing function with values between 0 and 1, for example $f = 1$ if $R(k) \le 16$, $f = 0.1$ if $16 < R(k) \le 32$ and $f = 0.01$ if $R(k) > 32$.

If the variation of the energy relative to the maximum energy is small, no attenuation is necessary. The factor is then set to a value that inhibits the attenuation, that is to say 1. Otherwise, the attenuation factor lies between 0 and 1. This initialization can be common to all the sub-signals.
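
A possible sketch of this initialization is given below. It assumes a three-level step function in which ratios up to 16 give a factor of 1 (no attenuation), ratios up to 32 give 0.1 and larger ratios give 0.01; the function name `initial_gains` is hypothetical, and the maximum energy is taken over the supplied sub-blocks only, which is a simplification.

```python
def initial_gains(sub_block_energies, sub_block_len):
    """Per-sample initial attenuation factors, constant over each sub-block.

    sub_block_energies: energy En(k) of each sub-block of the decoded signal.
    sub_block_len: number of samples L' per sub-block.
    """
    e_max = max(sub_block_energies)
    gains = []
    for e in sub_block_energies:
        # R(k): ratio of the highest sub-block energy to the k-th energy.
        ratio = e_max / e if e > 0 else float("inf")
        if ratio <= 16:
            g = 1.0   # small variation: attenuation inhibited
        elif ratio <= 32:
            g = 0.1
        else:
            g = 0.01
        gains.extend([g] * sub_block_len)
    return gains
```

The sub-block holding the maximum energy always has a ratio of 1 and therefore a factor of 1.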

The attenuation values are then refined per sub-signal so that the optimal attenuation level can be set for each sub-signal according to the characteristics of the decoded signal. For example, the attenuations can be limited as a function of the average energy of the sub-signal in the previous frame, because it is undesirable for the signal energy after pre-echo attenuation to fall below the average energy per sub-block of the signal preceding the processing zone (typically that of the previous frame or that of the second half of the previous frame).

This limitation can be carried out in a manner similar to that described in patent application FR 08 56248. For example, for the second sub-signal $x_{rec,ss2}(n)$, the energy in the $K$ sub-blocks of the current frame is first computed as:

$En_{ss2}(k) = \sum_{n = kL'}^{(k + 1)L' - 1} x_{rec,ss2}(n)^{2}, \quad k = 0, \dots, K - 1$

The average energy of the previous frame, $\overline{En_{ss2}}$, and that of the second half of the previous frame, $\overline{En_{ss2}}'$, are also known from memory; they can be computed (at the previous frame) as:

$\overline{En_{ss2}} = \frac{1}{K} \sum_{k = 0}^{K - 1} En_{ss2}(k)$

and

$\overline{En_{ss2}}' = \frac{2}{K} \sum_{k = K/2}^{K - 1} En_{ss2}(k)$

where the sub-block indices from 0 to $K - 1$ correspond to the current frame.

For the sub-block $k$ to be processed, the limit value of the factor, $\mathrm{lim}_{g,ss2}(k)$, can be computed so as to obtain exactly the same energy as the average energy per sub-block of the segment preceding the sub-block to be processed. This value is of course capped at 1, since only attenuation values are of interest here. More precisely:

$\mathrm{lim}_{g,ss2}(k) = \min \left( \sqrt{\frac{\max (\overline{En_{ss2}}, \overline{En_{ss2}}')}{En_{ss2}(k)}}, 1 \right)$

where the average energy of the preceding segment is approximated by $\max (\overline{En_{ss2}}, \overline{En_{ss2}}')$.

The value $\mathrm{lim}_{g,ss2}(k)$ thus obtained serves as a lower bound in the final computation of the attenuation factor of the sub-block:

$g_{pre,ss2}'(n) = \max (g_{pre,ss2}'(n), \mathrm{lim}_{g,ss2}(k)), \quad n = kL', \dots, (k + 1)L' - 1; \quad k = 0, \dots, K - 1$
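
The energy-based lower bound can be illustrated as follows. The function name `limit_gains` and its parameter names are hypothetical, and treating a zero-energy sub-block as needing no attenuation (limit of 1) is an assumption made to avoid a division by zero.

```python
import math


def limit_gains(gains, x_ss2, sub_block_len, prev_mean, prev_half_mean):
    """Raise each per-sample gain to the energy-preserving lower bound.

    gains: initial per-sample factors g'(n) for the current frame.
    x_ss2: samples of the sub-signal in the current frame.
    prev_mean, prev_half_mean: stored mean sub-block energies of the previous
    frame and of its second half.
    """
    target = max(prev_mean, prev_half_mean)
    limited = list(gains)
    for k in range(len(x_ss2) // sub_block_len):
        start, stop = k * sub_block_len, (k + 1) * sub_block_len
        energy = sum(x * x for x in x_ss2[start:stop])
        # Assumption: an all-zero sub-block gets a limit of 1.
        lim = min(math.sqrt(target / energy), 1.0) if energy > 0 else 1.0
        for n in range(start, stop):
            limited[n] = max(limited[n], lim)
    return limited
```

A sub-block attenuated by exactly `lim` ends up with energy equal to `target`, so the attenuated signal never drops below the level of the preceding segment.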

In a first variant embodiment, the pre-echo zone in which the attenuation is applied extends from the start of the current frame to the start of the sub-block in which the attack has been detected, that is, up to the index $pos$ where

$pos = \min \left( L' \cdot \left( \arg \max_{k = 0, \dots, K + K' - 1} (En(k)) \right), L \right).$

The attenuations associated with the samples of the sub-block containing the attack are all set to 1, even if the attack is located towards the end of this sub-block.

In another variant embodiment, the start position $pos$ of the attack is refined within the sub-block of the attack, for example by dividing the sub-block into sub-sub-blocks and observing the evolution of the energy of these sub-sub-blocks. Suppose that the start of the attack is detected in sub-block $k$, $k > 0$, and that the refined start of the attack $pos$ lies within this sub-block. The attenuation values for the samples of this sub-block located before the index $pos$ can then be initialized from the attenuation value of the last sample of the previous sub-block:

$g_{pre,ss2}'(n) = g_{pre,ss2}'(kL' - 1), \quad n = kL', \dots, pos - 1$

All attenuations from the index $pos$ onwards are set to 1.

For the first sub-signal, which contains the low-frequency components of the decoded signal, computing the attenuation values from the sub-signal $x_{rec,ss1}(n)$ can be similar to computing them from the decoded signal $x_{rec}(n)$. Thus, in a variant embodiment, in order to reduce computational complexity, the attenuation values can be determined from the decoded signal $x_{rec}(n)$. When attack detection is performed on the decoded signal, it is then no longer necessary to recompute the sub-block energies, because for this signal the energy values per sub-block have already been computed to detect the attacks. Since, for the vast majority of signals, the low frequencies carry much more energy than the high frequencies, the energies per sub-block of the decoded signal $x_{rec}(n)$ and of the sub-signal $x_{rec,ss1}(n)$ are very close, and this approximation gives a very satisfactory result.

The attenuation factors $g_{pre,ss1}(n)$ and $g_{pre,ss2}(n)$ determined per sub-block can then be smoothed by a smoothing function applied sample by sample, to avoid abrupt variations of the attenuation factor at block boundaries. This is particularly important for sub-signals containing low-frequency components, such as the sub-signal $x_{rec,ss1}(n)$, but is not necessary for sub-signals containing only high-frequency components, such as the sub-signal $x_{rec,ss2}(n)$.

Figure 7 illustrates an example of applying an attenuation gain with smoothing functions represented by the arrows L.

This figure shows in a) an example of an original signal, in b) the decoded signal without pre-echo attenuation, in c) the attenuation gains for the two sub-signals obtained in the decomposition step E605, and in d) the decoded signal with the pre-echo attenuation of steps E607 and E608 (that is, after combining the two attenuated sub-signals).

It can be seen in this figure that the attenuation gain shown as a dotted line, corresponding to the gain computed for the first sub-signal containing low-frequency components, includes smoothing functions as described above. The attenuation gain shown as a solid line, computed for the second sub-signal containing high-frequency components, includes no smoothing.

The signal shown in d) clearly demonstrates that the pre-echo has been effectively attenuated by the attenuation processing carried out.

The smoothing function is preferably defined, for example, by the following equation:

$g_{pre,ss1}(n) = \frac{1}{u} \sum_{i = 0}^{u - 1} g_{pre,ss1}'(n - i), \quad n = 0, \dots, L - 1$

with the convention that $g_{pre,ss1}'(n)$, $n = -(u - 1), \dots, -1$, are the last $u - 1$ attenuation factors obtained for the last samples of the previous sub-block of the sub-signal $x_{rec,ss1}(n)$. Typically $u = 5$, but another value could be used. Depending on the smoothing used, the pre-echo zone (the number of attenuated samples) may therefore differ between the 2 sub-signals processed separately, even if the attack is detected jointly on the basis of the decoded signal.

The smoothed attenuation factor does not return to 1 at the moment of the attack, which reduces the amplitude of the attack. The perceptible impact of this reduction is very small but should nevertheless be avoided. To overcome this problem, the attenuation factor can be forced to 1 for the $u - 1$ samples preceding the index $pos$ at which the attack begins. This is equivalent to moving the $pos$ marker $u - 1$ samples earlier for the sub-signal to which smoothing is applied. The smoothing function thus progressively increases the factor so that it reaches 1 at the moment of the attack. The amplitude of the attack is then preserved.
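
The moving-average smoothing, together with forcing the factor to 1 over the $u - 1$ samples before the attack, might be sketched as follows. The function name `smooth_gains` and the parameter `history` (the last $u - 1$ unsmoothed factors of the previous block) are assumptions made for illustration.

```python
def smooth_gains(raw_gains, history, u=5, pos=None):
    """Moving average of length u over per-sample attenuation factors.

    raw_gains: unsmoothed factors g'(n), n = 0..L-1.
    history: the u-1 unsmoothed factors preceding the block (oldest first).
    pos: start index of the attack; factors from pos-(u-1) on are set to 1
         before smoothing so the smoothed gain reaches 1 exactly at pos.
    """
    if len(history) != u - 1:
        raise ValueError("history must hold exactly u-1 factors")
    raw = list(raw_gains)
    if pos is not None:
        for n in range(max(0, pos - (u - 1)), len(raw)):
            raw[n] = 1.0
    extended = list(history) + raw
    # extended[n : n + u] holds g'(n - u + 1) .. g'(n).
    return [sum(extended[n:n + u]) / u for n in range(len(raw))]
```

Because the window at index `pos` then contains only factors equal to 1, the smoothed gain is exactly 1 at the onset and the attack amplitude is not reduced.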

In this embodiment with signal decomposition, the check on the increase in energy in the pre-echo zone according to the invention is carried out for at least one sub-signal, or for each of the sub-signals.

The comparison threshold may differ from one sub-signal to another and may depend on the number of sub-blocks available before the attack.

If, in at least one sub-signal, the normalized leading coefficient $b_{1n}$ is below the threshold for that sub-signal, pre-echo attenuation is inhibited for all the sub-signals.

When a pre-echo is present in a signal produced by an inverse MDCT transform, the energy of the pre-echo component increases, or is at least stable, in all the sub-signals. Pre-echo processing can be inhibited, for example, by setting the attenuation factors to 1, or by not classifying the zone as a pre-echo zone, in which case the pre-echo attenuation processing module is not invoked, as illustrated by way of example in the embodiment of figure 5 by the link between blocks 604 and 602.

In variants, the attenuation is inhibited separately for each sub-signal as soon as the normalized leading coefficient $b_{1n}$ is below the threshold for that sub-signal. The inhibition may, for example, be implemented by setting the attenuation factors to 1 or by not invoking the pre-echo module for the sub-signal concerned.
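The two inhibition policies (joint inhibition for all sub-signals, or the per-sub-signal variant) can be sketched in Python as follows. The function names and the example slopes and thresholds are hypothetical; only the comparison rule and the "set the factors to 1" mechanism come from the text.

```python
def inhibit_all(slopes, thresholds):
    """Joint policy: inhibit everywhere if any sub-signal is below its threshold."""
    return any(b < t for b, t in zip(slopes, thresholds))


def inhibit_each(slopes, thresholds):
    """Variant: one independent inhibition decision per sub-signal."""
    return [b < t for b, t in zip(slopes, thresholds)]


def neutralize(factors, inhibited):
    """Inhibit attenuation by setting all factors to 1, else keep them."""
    return [1.0] * len(factors) if inhibited else list(factors)


slopes = [0.5, 0.1]          # example leading coefficients, one per sub-signal
thresholds = [0.0, 0.2]      # example thresholds, one per sub-signal

joint = inhibit_all(slopes, thresholds)
separate = inhibit_each(slopes, thresholds)
```

Here the second sub-signal falls below its threshold, so the joint policy inhibits attenuation in both sub-signals, whereas the variant only inhibits it in the second one.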

Thus, in the particular embodiment described above with decomposition into two sub-signals, if the number of sub-blocks before the attack allows it, the evolution of the energy of the sub-blocks preceding the sub-block in which the attack was detected is checked in both sub-signals by linear regression. This check can be carried out according to steps E603 and E604 at any time after the decoded signal has been divided into sub-signals (E605) and before the pre-echo attenuation factors are applied (E607). The check is possible if at least two sub-blocks precede the sub-block in which the attack was detected. If the attack is detected in the first or second sub-block, the check according to the invention is not possible.

In variants, the leading coefficient(s) calculated in the previous frame, if any, may be reused when the attack is detected in the first or second sub-block of the current frame.
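One way to picture this variant is a small stateful helper that remembers the coefficient computed in the previous frame. The sketch below is an assumption-laden illustration: the class `SlopeTracker`, its method name, the use of `None` for "no attack" or "no coefficient available", and the choice of a simple endpoint difference as the coefficient are all hypothetical.

```python
from typing import Optional, Sequence


class SlopeTracker:
    """Remembers the leading coefficient computed in the previous frame."""

    def __init__(self) -> None:
        self.previous_slope: Optional[float] = None

    def process_frame(self, energies: Sequence[float],
                      attack_index: Optional[int]) -> Optional[float]:
        """Return the coefficient to test for this frame, or None if unavailable.

        energies:     sub-block energies of the current frame.
        attack_index: 0-based index of the sub-block containing the attack,
                      or None if no attack was detected.
        """
        if attack_index is None:
            self.previous_slope = None
            return None
        if attack_index >= 2:
            # Enough sub-blocks precede the attack: compute a fresh coefficient
            # over the last (up to three) sub-blocks before the attack.
            first = max(0, attack_index - 3)
            slope = energies[attack_index - 1] - energies[first]
            self.previous_slope = slope
            return slope
        # Attack in the first or second sub-block: reuse the previous frame's
        # coefficient if one exists; nothing is computed for the next frame.
        reused = self.previous_slope
        self.previous_slope = None
        return reused


tracker = SlopeTracker()
first_frame = tracker.process_frame([1.0, 2.0, 3.0, 10.0], attack_index=3)
second_frame = tracker.process_frame([9.0, 1.0, 1.0, 1.0], attack_index=0)
third_frame = tracker.process_frame([9.0, 1.0, 1.0, 1.0], attack_index=1)
```

In this run the second frame reuses the coefficient of the first frame, while the third frame has nothing to reuse because no coefficient was calculated in the second.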

If the attack is detected in the third sub-block, the energies of two sub-blocks in the pre-echo zone are available for this check. Experience shows that, with two points, the check is not sufficiently reliable in the low-frequency sub-signal $x_{rec, ss1}(n)$. Only the high-frequency sub-signal $x_{rec, ss2}(n)$ is therefore checked, and only to verify that its energy does not decrease. The leading coefficient of the high-frequency sub-signal $x_{rec, ss2}(n)$ is compared with a threshold of 0. Only its sign matters here, so no normalization is needed. It is therefore sufficient to calculate, in step E603, a simple leading coefficient (without normalization) as:

$b_{1ss2} = En_{ss2}(1) - En_{ss2}(0)$

If $b_{1ss2}$ is less than 0, pre-echo attenuation for this pre-echo zone is inhibited for all the sub-signals.
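For an attack in the third sub-block, the check reduces to the sign of a two-point energy difference in the high-frequency sub-signal. A minimal Python sketch, in which the function name and the example energies are assumptions:

```python
def inhibit_for_third_sub_block(en_ss2):
    """Two-point check when the attack is in the third sub-block.

    en_ss2: energies of the high-frequency sub-signal per sub-block;
            only the first two entries (sub-blocks 0 and 1) are used.
    Returns True if pre-echo attenuation should be inhibited.
    """
    b1_ss2 = en_ss2[1] - en_ss2[0]   # simple, non-normalized coefficient
    return b1_ss2 < 0


rising = inhibit_for_third_sub_block([2.0, 3.5, 80.0])    # energy increases
falling = inhibit_for_third_sub_block([3.5, 2.0, 80.0])   # energy decreases
```

With rising or stable energy the zone remains a candidate pre-echo zone; with falling energy the attenuation is inhibited.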

If the attack is detected in the fourth sub-block or a later sub-block, the evolution of the energy of the last 3 sub-blocks in the pre-echo zone preceding the sub-block in which the attack was detected is checked. The leading coefficient of the low-frequency sub-signal $x_{rec, ss1}(n)$ is compared with 0; only its sign matters, so this coefficient need not be normalized. It is therefore sufficient to calculate a simple leading coefficient. If the attack has been detected in the sub-block of index id, with id ≥ 3, this coefficient is determined as:

$b_{1ss1} = En_{ss1}(id - 1) - En_{ss1}(id - 3)$

If $b_{1ss1}$ is less than 0, pre-echo attenuation is inhibited for this pre-echo zone, for all the sub-signals.

The leading coefficient of the high-frequency sub-signal $x_{rec, ss2}(n)$ is compared with a threshold of 0.2. Here the normalized leading coefficient is calculated. If the attack has been detected in the sub-block of index id, with id ≥ 3, this coefficient is determined as:

$b_{1nss2} = \frac{3 (En_{ss2}(id - 1) - En_{ss2}(id - 2))}{2 (En_{ss2}(id - 1) + En_{ss2}(id - 2) + En_{ss2}(id - 3))}$

If $b_{1nss2}$ is less than 0.2, pre-echo attenuation is inhibited for this pre-echo zone, for all the sub-signals.
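The two checks applied when the attack lies in the fourth or a later sub-block can be sketched together in Python. The sketch follows the formulas as written in the text and rests on several assumptions: the function name is hypothetical, the low-frequency coefficient is taken from the low-frequency energies only, and an all-zero energy sum is treated as a normalized coefficient of 0 (a choice made here, not stated in the patent).

```python
def inhibit_for_late_attack(en_ss1, en_ss2, idx, hf_threshold=0.2):
    """Checks over the 3 sub-blocks preceding an attack at sub-block idx >= 3.

    en_ss1, en_ss2: per-sub-block energies of the low- and high-frequency
                    sub-signals.
    Returns True if pre-echo attenuation should be inhibited for all sub-signals.
    """
    if idx < 3:
        raise ValueError("this check needs three sub-blocks before the attack")

    # Low-frequency sub-signal: only the sign of the simple coefficient matters.
    b1_ss1 = en_ss1[idx - 1] - en_ss1[idx - 3]

    # High-frequency sub-signal: normalized coefficient compared with 0.2.
    numerator = 3.0 * (en_ss2[idx - 1] - en_ss2[idx - 2])
    denominator = 2.0 * (en_ss2[idx - 1] + en_ss2[idx - 2] + en_ss2[idx - 3])
    b1n_ss2 = numerator / denominator if denominator > 0 else 0.0  # assumption

    return b1_ss1 < 0 or b1n_ss2 < hf_threshold


keep = inhibit_for_late_attack([1.0, 1.0, 2.0, 50.0], [1.0, 1.0, 2.0, 50.0], idx=3)
lf_drop = inhibit_for_late_attack([3.0, 2.0, 1.0, 50.0], [1.0, 1.0, 2.0, 50.0], idx=3)
hf_flat = inhibit_for_late_attack([1.0, 2.0, 3.0, 50.0], [4.0, 4.0, 4.1, 50.0], idx=3)
```

In the first case both sub-signals show rising energy and attenuation stays enabled. In the other two it is inhibited, once because the low-frequency energy falls and once because the high-frequency energy is nearly flat.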

Note that the condition

$\frac{3 (En_{ss2}(id - 1) - En_{ss2}(id - 2))}{2 (En_{ss2}(id - 1) + En_{ss2}(id - 2) + En_{ss2}(id - 3))} < 0.2$

is equivalent to

$En_{ss2}(id - 1) - En_{ss2}(id - 2) < \frac{1}{7.5} (En_{ss2}(id - 1) + En_{ss2}(id - 2) + En_{ss2}(id - 3))$

which avoids a division operation, reducing complexity and facilitating implementation on a fixed-point DSP ("Digital Signal Processor").
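The equivalence can be checked numerically. The Python sketch below assumes non-negative integer energies (as would be typical in fixed-point arithmetic) and rescales the inequality by 15 so that it uses only integer multiplications, since multiplying both sides of the second form by 15 turns the factor 1/7.5 into 2. The function names are hypothetical, and the reference version uses exact rational arithmetic purely for comparison.

```python
from fractions import Fraction


def below_threshold_no_division(e1, e2, e3):
    """Division-free test: e1 - e2 < (e1 + e2 + e3) / 7.5, scaled by 15.

    e1, e2, e3 stand for En(id-1), En(id-2), En(id-3) of the high-frequency
    sub-signal, given as non-negative integers.
    """
    return 15 * (e1 - e2) < 2 * (e1 + e2 + e3)


def below_threshold_reference(e1, e2, e3):
    """Original form with a division, evaluated exactly (requires a sum > 0)."""
    total = e1 + e2 + e3
    return Fraction(3 * (e1 - e2), 2 * total) < Fraction(1, 5)


small_rise = below_threshold_no_division(120, 100, 90)   # 300 < 620
large_rise = below_threshold_no_division(300, 100, 90)   # 3000 < 980 is false
```

The two forms agree whenever the energy sum is strictly positive, which is the condition under which the inequality can be multiplied through by the denominator.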

Module 607 of the device 600 of figure 5 implements step E607 of pre-echo attenuation in the pre-echo zone of each of the sub-signals, by applying the attenuation factors thus calculated to the sub-signals.

The pre-echo attenuation is therefore performed independently in the sub-signals. Thus, in sub-signals representing different frequency bands, the attenuation can be chosen according to the spectral distribution of the pre-echo.

Finally, step E608 of the obtaining module 608 produces the attenuated output signal (the decoded signal after pre-echo attenuation) by combining (in this example by simple addition) the attenuated sub-signals, according to the equation:

$x_{ref, f}(n) = g_{pre, ss1}(n) x_{rec, ss1}(n) + g_{pre, ss2}(n) x_{rec, ss2}(n), \quad n = 0, \dots, L - 1$
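The combination step is a sample-by-sample weighted sum of the two sub-signals. A minimal Python sketch, in which the function name and the example values are assumptions:

```python
def combine_sub_signals(g_ss1, x_ss1, g_ss2, x_ss2):
    """Output sample n = g_ss1[n] * x_ss1[n] + g_ss2[n] * x_ss2[n]."""
    if not (len(g_ss1) == len(x_ss1) == len(g_ss2) == len(x_ss2)):
        raise ValueError("all four sequences must have the same length L")
    return [g1 * x1 + g2 * x2
            for g1, x1, g2, x2 in zip(g_ss1, x_ss1, g_ss2, x_ss2)]


# Two samples: the first is attenuated in both sub-signals, the second is not.
output = combine_sub_signals([0.5, 1.0], [2.0, 4.0], [0.25, 1.0], [4.0, -1.0])
```

When all factors equal 1 the function simply returns the sum of the two sub-signals, sample by sample.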

Unlike a conventional sub-band decomposition, the filtering used here is not associated with decimation of the sub-signals, and the complexity and delay ("lookahead" or future frame) are kept to a minimum.

An exemplary embodiment of a discrimination and attenuation processing device according to the invention is now described with reference to figure 8.

In hardware terms, this device 100 within the meaning of the invention typically comprises a processor µP cooperating with a memory block BM including a storage and/or working memory, as well as the aforementioned buffer memory MEM as a means of storing all the data needed to implement the discrimination and attenuation processing method described with reference to figure 5. The device receives successive frames of the digital signal Se as input and delivers the reconstructed signal Sa with pre-echo attenuation in the discriminated pre-echo zones, where applicable with reconstruction of the attenuated signal by combining attenuated sub-signals.

The memory block BM may contain a computer program comprising code instructions for implementing the steps of the method according to the invention when these instructions are executed by a processor µP of the device, in particular the steps of calculating a leading coefficient of the energies for at least two sub-blocks preceding the sub-block in which an attack is detected, comparing the leading coefficient with a predefined threshold, and inhibiting the pre-echo attenuation processing in the pre-echo zone when the calculated leading coefficient is below the predefined threshold.
Figure 5 can illustrate the algorithm of such a computer program.

This discrimination and attenuation processing device according to the invention can be standalone or integrated into a digital signal decoder. Such a decoder can be integrated into equipment for storing or transmitting digital audio signals, such as communication gateways, communication terminals or servers of a communication network.

Claims

Method for discriminating and attenuating pre-echo in a digital audio signal generated from a transform coding, in which, upon decoding, for a current frame decomposed into sub-blocks, the low-energy sub-blocks preceding a sub-block in which a transition or onset is detected (E601) determine a pre-echo zone (E602) in which a pre-echo attenuation processing is carried out (E607), the method being characterized in that, in the case where an onset is detected from the third sub-block of the current frame, it comprises the following steps:
- calculation (E603) of a leading coefficient of the energies for at least two sub-blocks of the current frame preceding the sub-block in which an onset is detected;

- comparison (E604) of the leading coefficient to a predefined threshold; and

- inhibition (E602) of the pre-echo attenuation processing in the pre-echo zone in the case where the calculated leading coefficient is below the predefined threshold.
Method according to Claim 1, characterized in that it further comprises a step of decomposition of the digital audio signal into at least two sub-signals as a function of a frequency criterion and in that the calculation and comparison steps are performed for at least one of the sub-signals.
Method according to Claim 1, characterized in that it further comprises a step of decomposition of the digital audio signal into at least two sub-signals as a function of a frequency criterion and in that the computation and comparison steps are performed for each of the sub-signals, the inhibition of the pre-echo attenuation processing in the pre-echo zone of all the sub-signals being performed when a calculated leading coefficient is below the predefined threshold for at least one sub-signal.
Method according to Claim 3, characterized in that a different threshold is defined for each sub-signal.
Method according to one of Claims 1 to 4, characterized in that the leading coefficient is calculated according to a least squares estimation method.
Method according to one of Claims 1 to 5, characterized in that the leading coefficient is normalized.
Method according to Claim 1, characterized in that, in the case where an onset is detected in the first or second sub-block of the current frame, a leading coefficient calculated for the preceding frame is used for the comparison step.
Device for discriminating and attenuating pre-echo in a digital audio signal generated by a transform coder, the device being associated with a decoder and comprising a transition or onset detection module (601), a pre-echo zone discrimination module (602) and a pre-echo attenuation processing module (607), an echo attenuation processing being performed for a current frame decomposed into sub-blocks, in the low-energy sub-blocks preceding a sub-block in which a transition or onset is detected determining a pre-echo zone, the device being characterized in that it further comprises:
- a computation module (603) calculating a leading coefficient of the energies for at least two sub-blocks of the current frame preceding the sub-block in which an onset is detected, in the case where an onset is detected from the third sub-block of the current frame;

- a comparator (604) capable of performing a comparison of the leading coefficient to a predefined threshold; and

- a discrimination module (602) capable of inhibiting the pre-echo attenuation processing in the pre-echo zone in the case where the calculated leading coefficient is below the predefined threshold.
Digital audio signal decoder comprising a pre-echo discrimination and attenuation device according to Claim 8.
Computer program comprising code instructions for implementing the steps of the method according to one of Claims 1 to 7, when these instructions are executed by a processor.
Storage medium that can be read by a pre-echo discrimination and attenuation processing device on which is stored a computer program comprising code instructions for executing the steps of the pre-echo discrimination and attenuation processing method according to one of Claims 1 to 7, when said storage medium runs on a computer.