EP1830349B1 - Method of noise reduction of an audio signal - Google Patents

Method of noise reduction of an audio signal

Info

Publication number
EP1830349B1
EP1830349B1 EP07290219A EP07290219A EP1830349B1 EP 1830349 B1 EP1830349 B1 EP 1830349B1 EP 07290219 A EP07290219 A EP 07290219A EP 07290219 A EP07290219 A EP 07290219A EP 1830349 B1 EP1830349 B1 EP 1830349B1
Authority
EP
European Patent Office
Prior art keywords
signal
speech
algorithm
noise
noisy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
EP07290219A
Other languages
German (de)
French (fr)
Other versions
EP1830349A1 (en
Inventor
Guillaume Pinto
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Parrot SA
Original Assignee
Parrot SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Parrot SA
Publication of EP1830349A1
Application granted
Publication of EP1830349B1
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering

Definitions

  • One of the aims of the invention is to overcome the drawbacks of the methods proposed up to now, by means of an improved denoising method applicable to a speech signal considered in isolation, in particular a signal picked up by a single microphone, a method which is based on the analysis of the temporal coherence of the captured signals.
  • the starting point of the invention lies in the observation that speech generally has a temporal coherence greater than noise and that, as a result, it is clearly more predictable.
  • the invention proposes to use this property to calculate a reference signal in which the speech has been attenuated more than the noise, in particular by applying a predictive algorithm, which may for example be of the LMS (Least Mean Squares) type.
  • This reference signal, derived from the speech signal to be denoised, may be used in a manner comparable to the signal of the second microphone in two-microphone beamforming techniques, for example techniques similar to those of Cohen and Berdugo [4, supra].
  • the calculation of a ratio between the respective energy levels of the original signal and the reference signal thus obtained makes it possible to discriminate between the speech components and the non-stationary noise, and provides an estimate of the probability of presence of speech independently of any statistical model.
  • the technique proposed by the invention implements an "intelligent subtraction" involving, after a linear prediction made on the past samples of the original signal (and not of a prefiltered, hence degraded, signal), a phase realignment between the original signal and the predicted signal.
  • the technique of the invention proves, in practice, sufficiently powerful to provide extremely effective denoising directly on the original signal, free of the distortions introduced by a prefiltering chain, which becomes unnecessary.
  • the predictive algorithm is advantageously a recursive adaptive algorithm of the LMS (Least Mean Squares) type.
  • Step c) advantageously comprises the application of a variable-gain algorithm that depends on the probability of presence/absence of speech, in particular an algorithm of the OM-LSA (Optimally Modified Log-Spectral Amplitude) gain type.
  • the signal to be denoised is a sampled digital signal x(n), where n denotes the sample number (n is the time variable).
  • the noisy signal x(n) is applied as input to a predictive LMS algorithm, represented by block 10, which includes the application of appropriate delays 12. The operation of this LMS algorithm is described below with reference to Figure 2.
  • the short-term Fourier transforms of the captured signal x(n) (block 16) and of the signal y(n) delivered by the predictive LMS algorithm (block 14) are then calculated. From these two transforms a reference signal is calculated (block 18), which is one of the input variables of an algorithm for calculating the probability of absence of speech (block 24). In parallel, the transform of the noisy signal x(n), from block 16, is also applied to the probability calculation algorithm.
  • Blocks 20 and 22 estimate the pseudo-stationary noise of the reference signal and of the noisy signal transform, and the results are also applied to the probability calculation algorithm.
  • the result of the speech absence probability calculation, together with the noisy signal transform, is input to an OM-LSA gain processing algorithm (block 26), whose result is subjected to an inverse Fourier transform (block 28) to give an estimate of the denoised speech.
  • the predictive LMS algorithm (block 10) is shown schematically in Figure 2.
  • $w_i(n+1) = w_i(n) + 2\mu\, e(n)\, x(n - \Delta - i + 1)$, $\mu$ being a gain constant which makes it possible to adjust the speed and the stability of the adaptation.
  • the respective signals x (n) and y (n) (noisy speech signal and linear prediction) are split into frames of identical lengths, and their short-term Fourier transform (denoted respectively X and Y ) is calculated for each frame.
  • the algorithm provides for a 50% overlap between consecutive frames, and the samples are multiplied by the coefficients of a Hanning window so that summing the even and odd frames reconstructs the original signal itself.
  • $E\{|\mathrm{Ref}(k,l)|^2\} = E\{|S(k,l)|^2\}\,\alpha_S(k) + E\{|D_t(k,l)|^2\}\,\alpha_{D_t}(k) + E\{|D_{ps}(k,l)|^2\}\,\alpha_{D_{ps}}(k)$, where $\alpha_S(k)$, $\alpha_{D_t}(k)$ and $\alpha_{D_{ps}}(k)$ represent the attenuation on the reference signal of the three signals in each spectrum segment.
  • S being a smoothed estimate of the instantaneous energy:
  • M being an estimator of the pseudo-stationary energy, which can be obtained for example by a method MCRA ( Minima Controlled Recursive Averaging ) of the same type as that described by Cohen and Berdugo [5, supra] (however, several alternatives exist in the literature).
  • L x and L Ref are transient detection thresholds.
  • ⁇ min (k) and ⁇ m ax (k) are the upper and lower limits for each spectrum segment. These various parameters are chosen so as to correspond to typical situations, close to reality.
  • the next step (corresponding to block 26 of Figure 1) consists of performing the denoising itself (enhancement of the speech component).
  • the estimator just described will be applied to the statistical model described by Ephraim and Malah [2, supra], which assumes that the noise and speech in each spectrum segment are independent Gaussian processes of respective variances $\lambda_x(k,l)$ and $\lambda_d(k,l)$.
  • This step may advantageously implement the OM-LSA gain algorithm ( Optimally Modified Log-Spectral Amplitude Gain ) described by Cohen and Berdugo [3, cited above].
  • the gain G_min, applied under the speech-absence hypothesis, is a lower limit for the noise reduction, intended to limit the distortion of speech.
  • the signal obtained at the end of this treatment is subjected to an inverse Fourier transform (block 28) to give the final estimate of the denoised speech.
  • the algorithm of the present invention is particularly effective in noisy environments disturbed both by mechanical noise, vibrations, etc. and by musical noise, situations characteristic of a car interior. Spectrograms show that the attenuation of the noise is not only effective, but is achieved without significant distortion of the speech after denoising.
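
The idea that speech is more predictable than noise, and the delayed LMS predictor of block 10, can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `lms_predict`, the filter order, the delay, the step size `mu` and the test signals are all hypothetical, and the coefficient update is the standard LMS rule. A pure tone stands in for a strongly predictable (voiced) component and white noise for an unpredictable one. The patent computes its reference signal in the frequency domain from the transforms of x(n) and y(n); that step is not reproduced here.

```python
import math
import random


def lms_predict(x, order=8, delay=1, mu=0.01):
    """Delayed LMS linear predictor (illustrative sketch).

    Returns (y, e): y[n] is the prediction of x[n] built from the past
    samples x[n - delay - i + 1], i = 1..order, and e[n] = x[n] - y[n].
    """
    w = [0.0] * order  # adaptive coefficients w_i, started at zero
    y = [0.0] * len(x)
    e = [0.0] * len(x)
    for n in range(len(x)):
        # Past samples x(n - delay - i + 1), i = 1..order (zero before start).
        past = [x[n - delay - i] if n - delay - i >= 0 else 0.0
                for i in range(order)]
        y[n] = sum(wi * pi for wi, pi in zip(w, past))
        e[n] = x[n] - y[n]
        # Standard LMS update: w_i(n+1) = w_i(n) + 2*mu*e(n)*x(n - delay - i + 1)
        w = [wi + 2.0 * mu * e[n] * pi for wi, pi in zip(w, past)]
    return y, e


def energy(v):
    return sum(s * s for s in v)


# Hypothetical test signals: a tone (predictable) and white noise (not).
random.seed(0)
N = 2000
tone = [math.sin(2 * math.pi * 0.05 * n) for n in range(N)]
noise = [random.gauss(0.0, 1.0) for _ in range(N)]

_, e_tone = lms_predict(tone)
_, e_noise = lms_predict(noise)

# Residual-to-input energy ratio over the second half (after adaptation).
tone_ratio = energy(e_tone[N // 2:]) / energy(tone[N // 2:])
noise_ratio = energy(e_noise[N // 2:]) / energy(noise[N // 2:])
print(f"tone residual ratio:  {tone_ratio:.3g}")
print(f"noise residual ratio: {noise_ratio:.3g}")
```

In this toy setting the predictable component is almost entirely removed from the prediction residual while the noise passes through largely unchanged, which is the property the reference signal relies on.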

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Soundproofing, Sound Blocking, And Sound Damping (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Signal Processing Not Specific To The Method Of Recording And Reproducing (AREA)
  • Noise Elimination (AREA)

Abstract

The method involves determining a reference signal by applying to a noisy audio signal a process that attenuates the voice components of that signal, using a predictive least mean squares (LMS) algorithm. The probability of presence or absence of voice is determined from the respective energy levels of the audio signal and the reference signal in the spectral domain. A noise spectrum is estimated, and a denoised estimate is derived from the audio signal using this probability.

Description

BACKGROUND OF THE INVENTION

Field of the invention

The present invention relates to the denoising of audio signals picked up by a microphone in a noisy environment.

The invention applies advantageously, but without limitation, to speech signals picked up by "hands-free" telephone devices or the like.

These devices include a sensitive microphone that picks up not only the user's voice but also the surrounding noise, a disturbance that can, in some cases, make the speaker's words unintelligible.

The same holds for speech recognition techniques, where it is very difficult to perform pattern recognition on words buried in a high level of noise.

This difficulty related to ambient noise is particularly constraining for "hands-free" devices in motor vehicles. In particular, the large distance between the microphone and the speaker results in a high relative noise level, which makes it difficult to extract the useful signal buried in the noise. In addition, the very noisy environment typical of a car has non-stationary spectral characteristics, that is, characteristics that change unpredictably with driving conditions: driving over uneven or cobbled road surfaces, a car radio playing, etc.

Description of the related art

Various techniques have been proposed to reduce the noise level of the signal picked up by a microphone.

For example, WO-A-98/45997 (Parrot SA) uses the press of a telephone's activation push-button (for example when the driver wants to answer an incoming call) to detect the start of a speech signal and to treat the signal picked up before that press as essentially noise. This stored signal is analyzed to give a weighted average energy spectrum of the noise, which is then subtracted from the noisy speech signal.

US-A-5 742 694 describes another technique, implementing a mechanism of the predictive adaptive filter type. This filter delivers a "reference signal" corresponding to the predictable part of the noisy signal and an "error signal" corresponding to the prediction error, then attenuates these two signals in variable proportions and recombines them to provide a denoised signal.

The major drawback of this denoising technique lies in the significant distortion introduced by the prefiltering, which yields an output signal of very degraded acoustic quality. It is also poorly suited to situations requiring vigorous denoising of a speech signal buried in noise of a complex and unpredictable nature, with non-stationary spectral characteristics.

Still other techniques, known as beamforming or double-phoning, use two separate microphones. The first is designed and placed to pick up mainly the speaker's voice, while the other is designed and placed to pick up a larger noise component than the main microphone. Comparing the captured signals makes it possible to extract the voice from the ambient noise efficiently, and by relatively simple software means.

This technique, based on an analysis of the spatial coherence of two signals, nevertheless has the drawback of requiring two microphones set apart from each other, which generally confines it to fixed or semi-fixed installations and prevents it from being integrated into an existing device simply by adding a software module. It also presupposes that the position of the speaker relative to the two microphones is roughly constant, which is generally the case for a car telephone used by the driver. Moreover, to obtain reasonably satisfactory denoising, the signals undergo substantial prefiltering, which here again has the drawback of introducing distortions that degrade the quality of the restored denoised signal.

The invention relates to a technique for denoising audio signals picked up by a single microphone recording a voice signal in a noisy environment.

A large share of the most effective methods used in single-microphone systems are based on the statistical model established by D. Malah and Y. Ephraim in:

  1. [1] Y. Ephraim and D. Malah, Speech Enhancement Using a Minimum Mean-Square Error Short-Time Spectral Amplitude Estimator, IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-32, No. 6, pp. 1109-1121, Dec. 1984, and
  2. [2] Y. Ephraim and D. Malah, Speech Enhancement Using a Minimum Mean-Square Error Log-Spectral Amplitude Estimator, IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-33, No. 2, pp. 443-445, April 1985.

Assuming that speech and noise are uncorrelated Gaussian processes, and that the spectral power of the noise is known, these two articles give an optimal solution to the noise reduction problem described above. The solution splits the noisy signal into independent frequency components using the discrete Fourier transform, applies an optimal gain to each component, and then recombines the processed signal. The two articles differ in the choice of optimality criterion. In [1], the applied gain is called the STSA gain and minimizes the mean squared distance between the estimated signal (at the output of the algorithm) and the original (noise-free) speech signal. In [2], a gain called the LSA gain minimizes the mean squared distance between the logarithm of the amplitude of the estimated signal and the logarithm of the amplitude of the original speech signal. This second criterion proves superior to the first because the chosen distance matches the behavior of the human ear much better and therefore gives qualitatively better results. In all cases, the essential idea is to reduce the energy of the very noisy frequency components by applying a low gain to them, while leaving intact (by applying a gain equal to 1) those that are only slightly noisy or not noisy at all.
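
The two optimality criteria can be written compactly as follows. The notation is an assumption borrowed from the usual presentation of [1] and [2], since the text does not introduce symbols: $X_k$ is the noisy spectral component in frequency bin $k$, $A_k$ the spectral amplitude of the clean speech in that bin, $\hat{A}$ a candidate estimate, and $G_k$ the gain applied to the bin. Following the titles of [1] and [2], both criteria are expressed on the spectral amplitude.

```latex
% [1] STSA criterion: mean squared error on the spectral amplitude
\hat{A}_k^{\mathrm{STSA}} = \arg\min_{\hat{A}} \, E\left\{ \left( A_k - \hat{A} \right)^2 \,\middle|\, X_k \right\}

% [2] LSA criterion: mean squared error on the log-amplitude
\hat{A}_k^{\mathrm{LSA}} = \arg\min_{\hat{A}} \, E\left\{ \left( \log A_k - \log \hat{A} \right)^2 \,\middle|\, X_k \right\}

% In both cases the estimate is obtained by applying a gain to the noisy component
\hat{S}_k = G_k \, X_k
```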

Although attractive because it is supported by a rigorous mathematical demonstration, this method cannot be used on its own. As indicated above, the spectral power of the noise is unknown and unpredictable ex ante. Moreover, the method does not attempt to evaluate when the speaker's speech is present in the captured signal. It simply assumes either that speech is always present or that it is present for a fixed fraction of the time, which can seriously limit the quality of the noise reduction.

It is therefore necessary to use another algorithm whose function is to evaluate the spectral power of the noise as well as the instants at which the speaker's speech is present in the raw captured signal. It even turns out that this estimate is the determining factor in the quality of the noise reduction, the Ephraim and Malah algorithm being merely the optimal way of using the information thus obtained.

The present invention provides an original solution to this twofold problem of evaluating the noise and the instants at which the speech signal is present.

These two questions are in fact intrinsically linked. Suppose that the raw captured signal is split into frames of equal length, and that the short-term Fourier transform of each frame is calculated.

For a given frequency component, knowing the indices of the frames in which speech is absent makes it possible to evaluate the power of the noise, as well as its evolution over time, in that segment of the spectrum. It suffices to measure the energy of the raw signal when speech is absent and to keep a continuously updated average of these measurements. The main question is therefore to know exactly when the speaker's speech is absent from the signal picked up by the microphone.
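
The continuously updated average of the noise energy can be sketched as follows. This is a minimal illustration, not the method of any cited article: the function name `update_noise_power`, the smoothing factor `alpha` and the per-bin Boolean speech-absence flags are assumptions introduced for the example.

```python
def update_noise_power(noise_power, frame_power, speech_absent, alpha=0.9):
    """One update of the per-bin noise power estimate (illustrative sketch).

    noise_power[k]   : current noise power estimate for frequency bin k
    frame_power[k]   : energy |X(k, l)|^2 of the current frame in bin k
    speech_absent[k] : True when bin k of this frame is judged speech-free
    alpha            : smoothing factor of the running average (assumed value)
    """
    return [
        # Update the average only where speech is absent; otherwise hold it.
        alpha * n + (1.0 - alpha) * p if absent else n
        for n, p, absent in zip(noise_power, frame_power, speech_absent)
    ]


# Bin 0 is speech-free and is updated; bin 1 contains speech and is held.
estimate = update_noise_power([1.0, 1.0], [2.0, 5.0], [True, False], alpha=0.5)
print(estimate)
```

Holding the estimate in bins where speech is present prevents speech energy from leaking into the noise spectrum, which is why the decision on speech absence is the critical input.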

If the noise is stationary or pseudo-stationary, this problem can easily be solved by declaring that speech is absent in a spectrum segment of a given frame when the spectral energy of the data for that segment has not changed, or has changed little, relative to the most recent frames. Conversely, speech is declared present when the behavior is non-stationary.
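
A stationarity test of this kind can be sketched for a single spectrum segment as follows. The function name `speech_absent_by_stationarity`, the use of the mean of the recent frames as the reference level and the `tolerance` factor are assumptions chosen for illustration; the text does not specify how "changed little" is measured.

```python
def speech_absent_by_stationarity(history, current, tolerance=2.0):
    """Declare speech absent in one spectrum segment (illustrative sketch).

    history   : energies of this segment over the most recent frames
    current   : energy of this segment in the current frame
    tolerance : allowed ratio between current and recent energy (assumed)
    """
    reference = sum(history) / len(history)
    if reference == 0.0:
        return current == 0.0
    ratio = current / reference
    # Stationary behavior: the energy stays within a factor `tolerance`
    # of the recent average, so the segment is treated as noise only.
    return 1.0 / tolerance <= ratio <= tolerance


recent = [1.0, 1.1, 0.9]
print(speech_absent_by_stationarity(recent, 1.2))  # small change
print(speech_absent_by_stationarity(recent, 8.0))  # sudden energy burst
```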

However, in a real environment, and all the more so in an automobile environment whose noise, as noted above, has many non-stationary spectral characteristics, this approach is easily defeated, since both speech and noise can exhibit transient behavior. If all transient components are kept, residual musical noise will remain in the denoised data; conversely, if transient components below a given energy threshold are suppressed, the weak components of speech will be erased, even though these components can be important both for their information content and for the overall intelligibility (low distortion) of the denoised signal restored after processing.

Various methods have been proposed in this regard. Among the most effective is the one described in:

  • [3] I. Cohen and B. Berdugo, Speech Enhancement for Non-Stationary Noise Environments, Signal Processing, Elsevier, Vol. 81, pp. 2403-2418, 2001.

As is common in the field, the method described in that article does not aim to identify precisely on which frequency components of which frames speech is absent, but rather to give a confidence index between 0 and 1, a value of 1 indicating that speech is certainly absent (according to the algorithm) and a value of 0 indicating the opposite. By its nature, this index is treated as the a priori speech absence probability, i.e. the probability that speech is absent on a given frequency component of the frame considered. This identification is of course not rigorous: even if the presence of speech is probabilistic ex ante, the signal picked up by the microphone can at any instant be in only one of two distinct states, either containing speech or not. In practice, however, the identification gives good results, which justifies its use. To estimate this absence probability, Cohen and Berdugo use averages of a priori signal-to-noise ratios, which are themselves computed and used in the algorithm of Ephraim and Malah. The same authors also describe the OM-LSA (Optimally-Modified Log-Spectral Amplitude) gain technique, which aims to improve the LSA gain by incorporating this speech absence probability.

This estimate of the a priori speech absence probability proves effective, but it depends directly on the statistical model developed by Ephraim and Malah rather than on a priori knowledge of the data.

To obtain an estimate of the absence probability that is independent of this statistical model, Cohen and Berdugo proposed, in:

  • [4] I. Cohen and B. Berdugo, Two Channel Signal Detection and Speech Enhancement Based on the Transient Beam-to-Reference Ratio, Proc. ICASSP 2003, Hong Kong, pp. 233-236, April 2003,

to compute the absence probability from signals picked up by two differently placed microphones, giving respective signals on two different channels whose combination yields a so-called output channel and a so-called reference noise channel. The analysis is based on the observation that speech components are relatively weaker on the reference noise channel, and that transient noise components have roughly the same energy on both channels. A speech presence probability for each spectrum segment of each frame is determined by computing an energy ratio between the non-stationary components of the respective signals of the two channels.

However, as with the beamforming or double-phoning techniques mentioned above, this method is rather constraining in that it requires two microphones.

SUMMARY OF THE INVENTION

One aim of the invention is to remedy the drawbacks of the methods proposed so far by means of an improved denoising method applicable to a speech signal considered in isolation, in particular a signal picked up by a single microphone, the method being based on an analysis of the temporal coherence of the captured signals.

The starting point of the invention is the observation that speech generally has greater temporal coherence than noise and is therefore markedly more predictable. In essence, the invention proposes to use this property to compute a reference signal in which speech has been attenuated more than the noise, in particular by applying a predictive algorithm, which may for example be of the LMS (Least Mean Squares) type. This reference signal, derived from the speech signal to be denoised, can be used in a manner comparable to the second-microphone signal in two-channel beamforming techniques, for example techniques similar to those of Cohen and Berdugo [4, cited above]. Computing a ratio between the respective energy levels of the original signal and of the reference signal thus obtained makes it possible to discriminate between speech components and non-stationary parasitic noise, and provides an estimate of the speech presence probability that is independent of any statistical model.

In other words, the technique proposed by the invention implements an "intelligent subtraction" involving, after a linear prediction performed on past samples of the original signal (and not of a prefiltered, hence degraded, signal), a phase realignment between the original signal and the predicted signal.

In practice, the technique of the invention proves effective enough to provide extremely efficient denoising directly on the original signal, avoiding the distortions introduced by a prefiltering chain, which becomes unnecessary.

More specifically, the present invention proposes, for denoising an original noisy audio signal comprising a speech component combined with a noise component that itself comprises a transient noise component and a pseudo-stationary noise component, to perform a temporal coherence analysis of the noisy signal by the steps of:

  a) determining a reference signal by applying to the noisy signal a processing suited to attenuating the speech components more strongly than the noise components of this noisy signal, said processing comprising: (a1) applying an adaptive linear prediction algorithm operating on a linear combination of earlier samples of the noisy signal, and (a2) determining said reference signal by a subtraction, with phase-shift compensation, between the original, unfiltered noisy signal and the signal delivered by the linear prediction algorithm;
  b) determining an a priori speech presence/absence probability from the respective energy levels, in the spectral domain, of the noisy signal and of the reference signal; and
  c) using this a priori speech absence probability to estimate a noise spectrum and to derive from the noisy signal a denoised estimate of the speech signal.

The reference signal may in particular be determined by applying, in step a2), a relation of the type:

$$\mathrm{Ref}(k,l) = X(k,l) - X(k,l)\,\frac{|Y(k,l)|}{|X(k,l)|}$$

where $X(k,l)$ and $Y(k,l)$ are the short-term Fourier transforms, for each spectrum segment $k$ of each frame $l$, of the original noisy signal and of the signal delivered by the linear prediction algorithm, respectively.

The predictive algorithm is advantageously a recursive adaptive algorithm of the least mean squares (LMS) type.

Step b) advantageously comprises applying an algorithm for estimating the energy of the pseudo-stationary noise component in the reference signal and in the noisy signal, in particular a minima-controlled recursive averaging (MCRA) algorithm as described in:

  • [5] I. Cohen and B. Berdugo, Noise Estimation by Minima Controlled Recursive Averaging for Robust Speech Enhancement, IEEE Signal Processing Letters, Vol. 9, No. 1, pp. 12-15, Jan. 2002.

Step c) advantageously comprises applying a variable-gain algorithm that depends on the speech presence/absence probability, in particular an algorithm of the optimally modified log-spectral amplitude (OM-LSA) gain type.

BRIEF DESCRIPTION OF THE DRAWINGS

An embodiment of the invention will now be described with reference to the appended drawings, in which the same reference numerals designate identical or functionally similar elements from one figure to the next.

  • Figure 1 is a schematic diagram illustrating the various operations performed by a denoising algorithm in accordance with the method of the invention.
  • Figure 2 is a schematic diagram illustrating more particularly the adaptive predictive LMS algorithm.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

The signal to be denoised is a sampled digital signal $x(n)$, where $n$ is the sample number ($n$ is thus the time variable).

The captured signal $x(n)$ is a combination of a speech signal $s(n)$ and an added, uncorrelated noise $d(n)$:

$$x(n) = s(n) + d(n)$$

This noise $d(n)$ has two independent components, namely a transient component $d_t(n)$ and a pseudo-stationary component $d_{ps}(n)$:

$$d(n) = d_t(n) + d_{ps}(n)$$

As illustrated in Figure 1, the noisy signal $x(n)$ is applied to the input of a predictive LMS algorithm, represented schematically by block 10, which includes the application of appropriate delays 12. The operation of this LMS algorithm is described below with reference to Figure 2.

The short-term Fourier transform is then computed for the captured signal $x(n)$ (block 16) and for the signal $y(n)$ delivered by the predictive LMS algorithm (block 14). From these two transforms a reference signal is computed (block 18), which is one of the input variables of an algorithm that computes the speech absence probability (block 24). In parallel, the transform of the noisy signal $x(n)$, output by block 16, is also applied to the probability computation algorithm.

Blocks 20 and 22 estimate the pseudo-stationary noise of the reference signal and of the transform of the noisy signal, and the result is also applied to the probability computation algorithm.

The result of the speech absence probability computation, together with the transform of the noisy signal, is applied to the input of an OM-LSA gain processing algorithm (block 26), whose output undergoes an inverse Fourier transform (block 28) to give an estimate of the denoised speech.

The various phases of this processing will now be described in more detail.

The predictive LMS algorithm (block 10) is shown schematically in Figure 2.

Since the signals involved are globally non-stationary but locally pseudo-stationary, it is advantageous to use an adaptive system, which can take account of variations in signal energy over time and converge toward the various local optima.

Essentially, if successive delays $\Delta$ are applied, the linear prediction $y(n)$ of the signal $x(n)$ is a linear combination of the earlier samples $\{x(n-\Delta-i+1)\}_{1 \le i \le M}$:

$$y(n) = \sum_{i=1}^{M} \omega_i\, x(n-\Delta-i+1)$$

which minimizes the mean square of the prediction error:

$$\epsilon(n) = x(n) - y(n)$$

The minimization consists in finding:

$$\min_{\omega_1,\,\omega_2,\,\ldots,\,\omega_M} E\left\{\left|\,x(n) - \sum_{i=1}^{M} \omega_i\, x(n-\Delta-i+1)\right|^2\right\}$$

To solve this problem, an LMS algorithm can be used. This algorithm is known per se and is described, for example, in:

  • [6] B. Widrow, Adaptive Filters, Aspects of Network and System Theory, R. E. Kalman and N. De Claris (Eds). New York: Holt, Rinehart and Winston, pp. 563-587, 1970, and
  • [7] B. Widrow et al., Adaptive Noise Cancelling: Principles and Applications, Proc. IEEE, Vol. 63, No. 12, pp. 1692-1716, Dec. 1975.

A recursive procedure for adapting the weights can be defined:

$$\omega_i(n+1) = \omega_i(n) + 2\mu\,\epsilon(n)\,x(n-\Delta-i+1)$$

$\mu$ being a gain constant that adjusts the speed and the stability of the adaptation.

General information on these aspects of the LMS algorithm can be found in:

  • [8] B. Widrow and S. Stearns, Adaptive Signal Processing, Prentice-Hall Signal Processing Series, Alan V. Oppenheim Series Editor, 1985.

It can be shown that such adaptive linear prediction discriminates effectively between noise and speech, because samples containing speech are predicted much better (smaller squared error between the prediction and the raw signal) than samples containing only noise.
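
The delayed LMS predictor and its weight update can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the filter order, the delay, the step size `mu`, and the test signals (a pure tone standing in for a highly predictable component, white Gaussian noise for an unpredictable one) are all assumptions chosen for the demonstration.

```python
import math
import random

def lms_predict(x, order=8, delay=1, mu=0.01):
    """Delayed LMS linear predictor (sketch).

    y(n) = sum_j w[j] * x(n - delay - j), j = 0..order-1 (j = i - 1 in the
    text's notation), with the update w[j] += 2*mu*e(n)*x(n - delay - j),
    where e(n) = x(n) - y(n). Returns the predictions y; entries are 0.0
    until enough past samples are available.
    """
    w = [0.0] * order
    y = [0.0] * len(x)
    for n in range(delay + order - 1, len(x)):
        past = [x[n - delay - j] for j in range(order)]
        y[n] = sum(wj * xj for wj, xj in zip(w, past))
        err = x[n] - y[n]
        w = [wj + 2.0 * mu * err * xj for wj, xj in zip(w, past)]
    return y

def residual_ratio(x, y, start):
    """Prediction-error energy divided by signal energy, from index `start`."""
    num = sum((a - b) ** 2 for a, b in zip(x[start:], y[start:]))
    den = sum(a * a for a in x[start:])
    return num / den

random.seed(0)
N = 4000
tone = [math.sin(2.0 * math.pi * 0.05 * n) for n in range(N)]   # predictable
noise = [random.gauss(0.0, 1.0) for _ in range(N)]              # unpredictable
r_tone = residual_ratio(tone, lms_predict(tone), N // 2)
r_noise = residual_ratio(noise, lms_predict(noise), N // 2)
```

After convergence the relative prediction error of the tone is far smaller than that of the white noise, which is the property the reference signal relies on.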

More precisely, the respective signals $x(n)$ and $y(n)$ (noisy speech signal and linear prediction) are split into frames of identical length, and their short-term Fourier transforms (denoted $X$ and $Y$ respectively) are computed for each frame. To avoid the effects of precision errors, the algorithm provides for a 50% overlap between consecutive frames, and the samples are multiplied by the coefficients of the Hanning window so that the sum of the even and odd frames corresponds to the original signal itself. For spectrum segment $k$ of an even frame $l$:

$$X(k,l) = \sum_{p=1}^{R} h(p)\, x(Rl + p)\, e^{-j 2\pi \frac{pk}{R}}$$

And for spectrum segment $k$ of an odd frame $l$:

$$X(k,l) = \sum_{p=1}^{R} h(p)\, x\!\left(\tfrac{R}{2}\,l + p\right) e^{-j 2\pi \frac{pk}{R}}$$

$h$ being the Hanning window.
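
The framing with a Hanning window and 50% overlap can be sketched as follows. The sketch makes several assumptions that are not in the source: indices are 0-based, every frame starts at a multiple of `R // 2`, the window is the periodic Hann window (for which overlapped copies sum exactly to one), and a direct DFT is used instead of an FFT to avoid external dependencies.

```python
import cmath
import math

def hann(R):
    """Periodic Hann (Hanning) window of length R."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * p / R)) for p in range(R)]

def stft(x, R):
    """Short-term Fourier transform with a Hann window and 50% overlap.

    Returns frames[l][k], frame l starting at sample l * (R // 2).
    A direct O(R^2) DFT keeps the sketch dependency-free.
    """
    h = hann(R)
    hop = R // 2
    frames = []
    for start in range(0, len(x) - R + 1, hop):
        seg = [h[p] * x[start + p] for p in range(R)]
        frames.append([
            sum(seg[p] * cmath.exp(-2j * math.pi * p * k / R) for p in range(R))
            for k in range(R)
        ])
    return frames
```

With this window, `h[p] + h[p + R // 2]` equals 1 for every `p`, so adding the overlapping even and odd frames restores the original samples.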

A first possibility is to define the reference signal by taking the Fourier transform of the prediction error:

$$\hat{\epsilon}(k,l) = X(k,l) - Y(k,l)$$

In practice, however, a certain phase shift is observed between $X$ and $Y$, due to imperfect convergence of the LMS algorithm, which prevents good discrimination between speech and noise. It is therefore preferred to adopt for the reference signal another definition that compensates for this phase shift, namely:

$$\mathrm{Ref}(k,l) = X(k,l) - X(k,l)\,\frac{|Y(k,l)|}{|X(k,l)|}$$
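
The phase-compensated subtraction can be written per spectrum segment as in the sketch below. Since $\mathrm{Ref} = X\,(1 - |Y|/|X|)$, its magnitude is $\big||X| - |Y|\big|$ whatever the phase of $Y$. The function name and the convention of returning zero when $X(k,l) = 0$ are assumptions made for the illustration.

```python
def reference_bin(X, Y):
    """Ref = X - X*|Y|/|X| for one spectrum segment of one frame.

    X, Y: complex short-term Fourier coefficients of the noisy signal and
    of the linear prediction. Only the magnitude of Y is used; the phase
    of X is kept. Returns 0 when X is 0 (assumed convention).
    """
    mag_x = abs(X)
    if mag_x == 0.0:
        return 0j
    return X - X * (abs(Y) / mag_x)
```

For example, with `X = 3+4j` and `Y = 2.5j`, the plain difference `X - Y` has magnitude about 3.35, whereas `reference_bin(X, Y)` has magnitude 2.5, i.e. `|X| - |Y|`.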

It is assumed that the spectral energy of the reference signal can be described in the form:

$$E\left\{|\mathrm{Ref}(k,l)|^2\right\} = E\left\{|S(k,l)|^2\right\}\alpha_S(k) + E\left\{|D_t(k,l)|^2\right\}\alpha_{D_t}(k) + E\left\{|D_{ps}(k,l)|^2\right\}\alpha_{D_{ps}}(k)$$

where

$$\alpha_S(k) < \alpha_{D_t}(k) < \alpha_{D_{ps}}(k)$$

represent the attenuation, on the reference signal, of the three signals in each spectrum segment.

The next step is to deliver an estimate $q(k,l)$ of the speech absence probability in the noisy signal:

$$q(k,l) = \Pr\big(H_0(k,l)\big)$$

$H_0(k,l)$ denoting the absence of speech (and $H_1(k,l)$ the presence of speech) in the $k$-th spectrum segment of the $l$-th frame.

The discrimination between transient noise and speech can be performed by a technique comparable to that of Cohen and Berdugo [5, cited above]. More precisely, the algorithm of the invention evaluates a ratio of the transient energies on the two channels, given by:

$$\Omega(k,l) = \frac{SX(k,l) - MX(k,l)}{SRef(k,l) - MRef(k,l)}$$

$S$ being a smoothed estimate of the instantaneous energy:

$$SX(k,l) = SX(k,l-1) + \sum_{i=-\omega}^{\omega} b(i)\,|X(k,l)|^2$$

$b$ being a window in the time domain and $M$ an estimator of the pseudo-stationary energy, which can be obtained for example by an MCRA (Minima Controlled Recursive Averaging) method of the same type as that described by Cohen and Berdugo [5, cited above] (although several alternatives exist in the literature).

In the presence of speech but in the absence of transient noise, this ratio is approximately:

$$\Omega(k,l) = \frac{1}{\alpha_{D_t}(k)} = \Omega_{\max}(k)$$

Conversely, in the absence of speech but in the presence of transient noise:

$$\Omega(k,l) = \frac{1}{\alpha_S(k)} = \Omega_{\min}(k)$$

If it is assumed that, in general:

$$\Omega_{\min}(k) \le \Omega(k,l) \le \Omega_{\max}(k)$$

a procedure for estimating $q(k,l)$ is given by the following algorithm in metalanguage:

For each frame $l$ and for each spectrum segment $k$:

  (i) Compute $SX(k,l)$, $MX(k,l)$, $SRef(k,l)$ and $MRef(k,l)$. Go to (ii).
  (ii) If $SX(k,l) > L_X\,MX(k,l)$ (transient detected on the noisy speech channel), go to (iii); otherwise $q(k,l) = 1$.
  (iii) If $SRef(k,l) > L_{Ref}\,MRef(k,l)$ (transient detected on the reference channel), go to (iv); otherwise $q(k,l) = 0$.
  (iv) Compute $\Omega(k,l)$. Go to (v).
  (v) Compute:

$$q(k,l) = \max\left(\min\left(\frac{\Omega_{\max}(k) - \Omega(k,l)}{\Omega_{\max}(k) - \Omega_{\min}(k)},\ 1\right),\ 0\right)$$
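
Steps (i) to (v) can be transcribed for a single spectrum segment of a single frame as in the sketch below. The function and parameter names are illustrative, the smoothed and pseudo-stationary energies are taken as inputs (their estimation is outside the sketch), and the thresholds are assumed to be at least 1 so that the denominator of $\Omega$ is positive when step (iv) is reached.

```python
def speech_absence_prior(SX, MX, SRef, MRef, L_X, L_Ref, omega_min, omega_max):
    """A priori speech absence probability q for one (k, l) pair.

    SX, SRef: smoothed energies of the noisy and reference channels.
    MX, MRef: pseudo-stationary energy estimates of the same channels.
    L_X, L_Ref: transient detection thresholds (assumed >= 1).
    omega_min, omega_max: limits of the ratio Omega for this segment.
    """
    # (ii) no transient on the noisy channel: speech is declared absent
    if SX <= L_X * MX:
        return 1.0
    # (iii) transient on the noisy channel but not on the reference channel
    if SRef <= L_Ref * MRef:
        return 0.0
    # (iv) ratio of transient energies on the two channels
    omega = (SX - MX) / (SRef - MRef)
    # (v) linear mapping of omega onto [0, 1], clipped at both ends
    q = (omega_max - omega) / (omega_max - omega_min)
    return max(min(q, 1.0), 0.0)
```

With hypothetical values `L_X = L_Ref = 1.5`, `omega_min = 1` and `omega_max = 5`, the inputs `SX = 10`, `MX = 1`, `SRef = 4`, `MRef = 1` give a ratio of 3 and therefore `q = 0.5`.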

The constants $L_X$ and $L_{Ref}$ are transient detection thresholds. $\Omega_{\min}(k)$ and $\Omega_{\max}(k)$ are the lower and upper limits for each spectrum segment. These various parameters are chosen to correspond to typical situations, close to reality.

The next step (corresponding to block 26 of Figure 1) is to perform the denoising proper (enhancement of the speech component). The estimator just described is applied to the statistical model described by Ephraim and Malah [2, cited above], which assumes that the speech and the noise in each spectrum segment are independent Gaussian processes with respective variances $\lambda_x(k,l)$ and $\lambda_d(k,l)$.

This step may advantageously implement the OM-LSA (Optimally Modified Log-Spectral Amplitude) gain algorithm described by Cohen and Berdugo [3, cited above]. The a priori signal-to-noise ratio is defined by:
$$\xi(k,l) = \frac{\lambda_x(k,l)}{\lambda_d(k,l)}$$

The a posteriori signal-to-noise ratio is defined by:
$$\gamma(k,l) = \frac{|X(k,l)|^2}{\lambda_d(k,l)}$$

The conditional probability of signal presence is:
$$p(k,l) = \Pr\bigl(H_1(k,l) \mid X(k,l)\bigr)$$

Under the Gaussian assumption and with the parameters above, this gives:
$$p(k,l) = \left\{1 + \frac{q(k,l)}{1-q(k,l)}\,\bigl(1+\xi(k,l)\bigr)\exp\bigl(-\upsilon(k,l)\bigr)\right\}^{-1}$$
with:
$$\upsilon(k,l) = \frac{\gamma(k,l)\,\xi(k,l)}{1+\xi(k,l)}$$
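A minimal sketch of this probability for one spectrum segment is given below, assuming the a priori SNR `xi`, the a posteriori SNR `gamma` and the a priori speech-absence probability `q` are already available. The function name is hypothetical, and the explicit handling of `q = 0` and `q = 1` is an added guard against division by zero, not something stated in the text.

```python
import math


def speech_presence_probability(xi, gamma, q):
    """Conditional speech-presence probability p(k, l) for one bin (sketch)."""
    # Guard the limiting cases of the a priori speech-absence probability.
    if q >= 1.0:
        return 0.0
    if q <= 0.0:
        return 1.0
    # v = gamma * xi / (1 + xi), as defined alongside p(k, l).
    v = gamma * xi / (1.0 + xi)
    return 1.0 / (1.0 + q / (1.0 - q) * (1.0 + xi) * math.exp(-v))
```

For example, with `xi = 1`, `gamma = 2` and `q = 0.5`, the exponent is `v = 1` and the result is `1 / (1 + 2/e)`, roughly 0.576.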

The optimal estimate of the denoised speech $\hat{S}(k,l)$ is given by:
$$\hat{S}(k,l) = G_{H_1}(k,l)^{\,p(k,l)}\; G_{min}^{\,1-p(k,l)}\; X(k,l)$$

$G_{H_1}$ being the gain under the hypothesis that speech is present, which is defined by:
$$G_{H_1}(k,l) = \frac{\xi(k,l)}{1+\xi(k,l)}\,\exp\left(\frac{1}{2}\int_{\upsilon(k,l)}^{\infty}\frac{e^{-t}}{t}\,dt\right)$$

The gain $G_{min}$ under the speech-absence hypothesis is a lower limit for the noise reduction, in order to limit speech distortion.
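The integral in the gain under the speech-presence hypothesis is the exponential integral $E_1(\upsilon)$. The sketch below evaluates it in pure Python (a power series for small arguments, a continued fraction otherwise) and then combines the two gains as in the estimate of the denoised speech. All function names, the tolerances and the small floor applied to $\upsilon$ are assumptions made for illustration, not part of the described method.

```python
import math

EULER_GAMMA = 0.5772156649015329


def exp1(x, tol=1e-12, max_iter=200):
    """Exponential integral E1(x), the integral of e^-t / t from x to infinity."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    if x <= 1.0:
        # Power series: E1(x) = -gamma - ln(x) - sum_{k>=1} (-x)^k / (k * k!)
        total = -EULER_GAMMA - math.log(x)
        term = 1.0
        for k in range(1, max_iter + 1):
            term *= -x / k              # term = (-x)^k / k!
            delta = -term / k
            total += delta
            if abs(delta) < tol * abs(total):
                break
        return total
    # Continued fraction evaluated with the modified Lentz method.
    b = x + 1.0
    c = 1e300
    d = 1.0 / b
    h = d
    for i in range(1, max_iter + 1):
        a = -float(i * i)
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < tol:
            break
    return h * math.exp(-x)


def gain_h1(xi, gamma):
    """Gain under the speech-presence hypothesis for one bin."""
    # Floor v to keep E1 finite when xi or gamma is zero (assumption).
    v = max(gamma * xi / (1.0 + xi), 1e-12)
    return xi / (1.0 + xi) * math.exp(0.5 * exp1(v))


def omlsa_gain(xi, gamma, p, g_min):
    """Geometric weighting of G_H1 and G_min by the presence probability p."""
    return gain_h1(xi, gamma) ** p * g_min ** (1.0 - p)
```

Multiplying the complex spectrum value `X(k, l)` by `omlsa_gain(...)` gives the denoised estimate for that bin. With `p = 0` the gain reduces to `g_min`, and with `p = 1` it reduces to `gain_h1(xi, gamma)`.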

The classical formula for estimating the a priori signal-to-noise ratio is:
$$\hat{\xi}(k,l) = a\,G_{H_1}^2(k,l-1)\,\gamma(k,l-1) + (1-a)\,\max\{\gamma(k,l)-1,\,0\}$$
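This recursion can be written directly for one spectrum segment, as in the sketch below. The function name is hypothetical, and the weighting factor `a` is left as a parameter because its value is not given here.

```python
def a_priori_snr_estimate(g_h1_prev, gamma_prev, gamma, a):
    """Estimate of the a priori SNR for frame l from frame l-1 (sketch)."""
    # Contribution of the previous frame: squared gain times a posteriori SNR.
    previous = g_h1_prev ** 2 * gamma_prev
    # Contribution of the current frame: a posteriori SNR minus one, floored at 0.
    current = max(gamma - 1.0, 0.0)
    return a * previous + (1.0 - a) * current
```

For instance, with `g_h1_prev = 0.5`, `gamma_prev = 4`, `gamma = 3` and `a = 0.9`, the estimate is `0.9 * 1 + 0.1 * 2 = 1.1`.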

The noise energy estimate is given by:
$$\hat{\lambda}_d(k,l+1) = \tilde{a}_d(k,l)\,\hat{\lambda}_d(k,l) + \beta\,\bigl(1-\tilde{a}_d(k,l)\bigr)\,|X(k,l)|^2$$

The smoothing parameter $\tilde{a}_d$ varies between a lower limit $a_d$ and 1, as a function of the conditional presence probability:
$$\tilde{a}_d(k,l) = a_d + (1-a_d)\,p(k,l)$$
where $\beta$, in the noise energy estimate, is an overestimation factor that compensates for the bias in the absence of signal.
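The noise update and its probability-dependent smoothing parameter can be sketched together for one spectrum segment. The function name is hypothetical; `a_d` and `beta` are left as parameters because their values are not given here.

```python
def update_noise_energy(lambda_d, x_power, p, a_d, beta):
    """Noise energy estimate for frame l+1 from frame l (sketch).

    lambda_d is the current noise energy estimate, x_power is |X(k, l)|^2,
    and p is the conditional speech-presence probability.
    """
    # The smoothing parameter moves from a_d (speech absent) to 1 (speech present).
    a_tilde = a_d + (1.0 - a_d) * p
    return a_tilde * lambda_d + beta * (1.0 - a_tilde) * x_power
```

When `p = 1` the smoothing parameter equals 1 and the estimate is frozen, so speech energy does not leak into the noise estimate; when `p = 0` the update is an ordinary recursive average with weight `a_d`.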

The signal obtained after this processing is subjected to an inverse Fourier transform (block 28) to give the final estimate of the denoised speech.

The algorithm of the present invention proves particularly effective in noisy environments corrupted both by mechanical noise, vibrations, etc. and by musical noise, situations typically encountered in a car interior. The spectrograms show that the noise attenuation is not only effective but is achieved without notable distortion of the speech after denoising.

Claims (8)

  1. A method for processing an audio signal, for the denoising of an original noisy signal comprising a speech component combined with a noise component, this noise component itself comprising a transient noise component and a pseudo-stationary noise component, characterized in that this method is a method of analyzing the temporal coherence of the sampled noisy signal comprising the steps of:
    a) determination of a reference signal by application to the noisy signal of a processing (10, 18) suitable for more significantly attenuating the speech components than the noise components of this noisy signal, the said processing comprising:
    a1) the application of an adaptive linear prediction algorithm operating on a linear combination of the earlier samples of the noisy signal, and
    a2) the determination of the said reference signal by a subtraction, with phase-shift compensation, between the original, non-prefiltered noisy signal and the signal delivered by the linear prediction algorithm;
    b) determination (24) of an a priori probability of presence/absence of speech on the basis of the respective energy levels in the spectral domain of the noisy signal and of the reference signal; and
    c) use of this a priori probability of absence of speech to estimate a noise spectrum and to derive (26) from the noisy signal a denoised estimate of the speech signal.
  2. The method of Claim 1, in which the said reference signal is determined by application in step a2) of a relation of the type: $$Ref(k,l) = X(k,l) - X(k,l)\,\frac{|Y(k,l)|}{|X(k,l)|}$$

    where X(k,l) and Y(k,l) are the short-term Fourier transforms of each spectrum segment k of each frame l, respectively of the original noisy signal and of the signal delivered by the linear prediction algorithm.
  3. The method of Claim 1, in which the linear prediction algorithm (10) is an algorithm of least mean squares, LMS, type.
  4. The method of Claim 1, in which the linear prediction algorithm (10) is a recursive adaptive algorithm.
  5. The method of Claim 1, in which step b) comprises the application of an algorithm for estimating the energy of the pseudo-stationary noise component in the reference signal and in the noisy signal.
  6. The method of Claim 5, in which the algorithm for estimating the energy of the pseudo-stationary noise component is an algorithm of minima controlled recursive averaging, MCRA, type.
  7. The method of Claim 1, in which step c) comprises the application of a variable gain algorithm dependent on the probability of presence/absence of speech.
  8. The method of Claim 7, in which the variable gain algorithm is an algorithm of optimally modified log-spectral amplitude, OM-LSA, gain type.
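For illustration only, the subtraction with phase-shift compensation of Claim 2 can be sketched for one spectrum segment, assuming the relation reads $Ref = X - X\,|Y|/|X|$, i.e. the magnitude of the predicted signal is subtracted using the phase of the noisy signal. The function name and the handling of a zero-magnitude bin are assumptions.

```python
def reference_bin(x, y):
    """Reference spectrum value for one bin from complex X(k, l) and Y(k, l)."""
    mag_x = abs(x)
    # Assumption: an empty bin yields an empty reference value.
    if mag_x == 0.0:
        return 0j
    # Subtract the magnitude of Y carried on the phase of X.
    return x - x * abs(y) / mag_x
```

With `x = 3 + 4j` (magnitude 5) and `y = 2.5j` (magnitude 2.5), the result is `x / 2 = 1.5 + 2j`, whatever the phase of `y`.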
EP07290219A 2006-03-01 2007-02-21 Method of noise reduction of an audio signal Active EP1830349B1 (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
FR0601822A FR2898209B1 (en) 2006-03-01 2006-03-01 METHOD FOR DENOISING AN AUDIO SIGNAL

Publications (2)

Publication Number Publication Date
EP1830349A1 EP1830349A1 (en) 2007-09-05
EP1830349B1 true EP1830349B1 (en) 2011-11-30

Family

ID=36992693

Family Applications (1)

Application Number Title Priority Date Filing Date
EP07290219A Active EP1830349B1 (en) 2006-03-01 2007-02-21 Method of noise reduction of an audio signal

Country Status (6)

Country Link
US (1) US7953596B2 (en)
EP (1) EP1830349B1 (en)
AT (1) ATE535905T1 (en)
ES (1) ES2378482T3 (en)
FR (1) FR2898209B1 (en)
WO (1) WO2007099222A1 (en)

Families Citing this family (43)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8949120B1 (en) 2006-05-25 2015-02-03 Audience, Inc. Adaptive noise cancelation
FR2908003B1 (en) * 2006-10-26 2009-04-03 Parrot Sa METHOD OF REDUCING RESIDUAL ACOUSTIC ECHO AFTER ECHO SUPPRESSION IN HANDS-FREE DEVICE
FR2908005B1 (en) * 2006-10-26 2009-04-03 Parrot Sa ACOUSTIC ECHO REDUCTION CIRCUIT FOR HANDS-FREE DEVICE FOR USE WITH PORTABLE TELEPHONE
FR2908004B1 (en) * 2006-10-26 2008-12-12 Parrot Sa ACOUSTIC ECHO REDUCTION CIRCUIT FOR HANDS-FREE DEVICE FOR USE WITH PORTABLE TELEPHONE
FR2932332B1 (en) * 2008-06-04 2011-03-25 Parrot AUTOMATIC GAIN CONTROL SYSTEM APPLIED TO AN AUDIO SIGNAL BASED ON AMBIENT NOISE
US8521530B1 (en) * 2008-06-30 2013-08-27 Audience, Inc. System and method for enhancing a monaural audio signal
DK2151820T3 (en) * 2008-07-21 2012-02-06 Siemens Medical Instr Pte Ltd Method of bias compensation for cepstro-temporal smoothing of spectral filter gain
JP5459688B2 (en) 2009-03-31 2014-04-02 ▲ホア▼▲ウェイ▼技術有限公司 Method, apparatus, and speech decoding system for adjusting spectrum of decoded signal
FR2945696B1 (en) * 2009-05-14 2012-02-24 Parrot METHOD FOR SELECTING A MICROPHONE AMONG TWO OR MORE MICROPHONES, FOR A SPEECH PROCESSING SYSTEM SUCH AS A "HANDS-FREE" TELEPHONE DEVICE OPERATING IN A NOISE ENVIRONMENT.
WO2010151183A1 (en) * 2009-06-23 2010-12-29 Telefonaktiebolaget L M Ericsson (Publ) Method and an arrangement for a mobile telecommunications network
FR2948484B1 (en) * 2009-07-23 2011-07-29 Parrot METHOD FOR FILTERING NON-STATIONARY SIDE NOISES FOR A MULTI-MICROPHONE AUDIO DEVICE, IN PARTICULAR A "HANDS-FREE" TELEPHONE DEVICE FOR A MOTOR VEHICLE
KR101587844B1 (en) * 2009-08-26 2016-01-22 삼성전자주식회사 Microphone signal compensation apparatus and method of the same
FR2950461B1 (en) * 2009-09-22 2011-10-21 Parrot METHOD OF OPTIMIZED FILTERING OF NON-STATIONARY NOISE RECEIVED BY A MULTI-MICROPHONE AUDIO DEVICE, IN PARTICULAR A "HANDS-FREE" TELEPHONE DEVICE FOR A MOTOR VEHICLE
US8219394B2 (en) * 2010-01-20 2012-07-10 Microsoft Corporation Adaptive ambient sound suppression and speech tracking
US8798290B1 (en) 2010-04-21 2014-08-05 Audience, Inc. Systems and methods for adaptive signal equalization
DK2395506T3 (en) * 2010-06-09 2012-09-10 Siemens Medical Instr Pte Ltd Acoustic signal processing method and system for suppressing interference and noise in binaural microphone configurations
US20120245927A1 (en) * 2011-03-21 2012-09-27 On Semiconductor Trading Ltd. System and method for monaural audio processing based preserving speech information
CN102740215A (en) * 2011-03-31 2012-10-17 Jvc建伍株式会社 Speech input device, method and program, and communication apparatus
FR2974655B1 (en) 2011-04-26 2013-12-20 Parrot MICROPHONE / HEADSET AUDIO COMBINATION COMPRISING MEANS FOR DENOISING A NEARBY SPEECH SIGNAL, IN PARTICULAR FOR A HANDS-FREE TELEPHONY SYSTEM.
FR2976111B1 (en) * 2011-06-01 2013-07-05 Parrot AUDIO EQUIPMENT COMPRISING MEANS FOR DENOISING A SPEECH SIGNAL BY FRACTIONAL TIME FILTERING, IN PARTICULAR FOR A HANDS-FREE TELEPHONY SYSTEM
FR2976710B1 (en) * 2011-06-20 2013-07-05 Parrot DENOISING METHOD FOR MULTI-MICROPHONE AUDIO EQUIPMENT, IN PARTICULAR FOR A HANDS-FREE TELEPHONY SYSTEM
US8880393B2 (en) * 2012-01-27 2014-11-04 Mitsubishi Electric Research Laboratories, Inc. Indirect model-based speech enhancement
US9258653B2 (en) * 2012-03-21 2016-02-09 Semiconductor Components Industries, Llc Method and system for parameter based adaptation of clock speeds to listening devices and audio applications
US9640194B1 (en) 2012-10-04 2017-05-02 Knowles Electronics, Llc Noise suppression for speech processing based on machine-learning mask estimation
US20140270249A1 (en) * 2013-03-12 2014-09-18 Motorola Mobility Llc Method and Apparatus for Estimating Variability of Background Noise for Noise Suppression
US20140278393A1 (en) 2013-03-12 2014-09-18 Motorola Mobility Llc Apparatus and Method for Power Efficient Signal Conditioning for a Voice Recognition System
US9536540B2 (en) 2013-07-19 2017-01-03 Knowles Electronics, Llc Speech signal separation and synthesis based on auditory scene analysis and speech modeling
US10141003B2 (en) * 2014-06-09 2018-11-27 Dolby Laboratories Licensing Corporation Noise level estimation
CN106797512B (en) 2014-08-28 2019-10-25 美商楼氏电子有限公司 Method, system and the non-transitory computer-readable storage medium of multi-source noise suppressed
WO2016100797A1 (en) 2014-12-18 2016-06-23 Conocophillips Company Methods for simultaneous source separation
US20170018273A1 (en) * 2015-07-16 2017-01-19 GM Global Technology Operations LLC Real-time adaptation of in-vehicle speech recognition systems
EP3356974B1 (en) 2015-09-28 2021-12-15 ConocoPhillips Company 3d seismic acquisition
FR3044197A1 (en) 2015-11-19 2017-05-26 Parrot AUDIO HELMET WITH ACTIVE NOISE CONTROL, ANTI-OCCLUSION CONTROL AND CANCELLATION OF PASSIVE ATTENUATION, BASED ON THE PRESENCE OR ABSENCE OF A VOICE ACTIVITY BY THE HELMET USER.
US10251002B2 (en) 2016-03-21 2019-04-02 Starkey Laboratories, Inc. Noise characterization and attenuation using linear predictive coding
US10564925B2 (en) * 2017-02-07 2020-02-18 Avnera Corporation User voice activity detection methods, devices, assemblies, and components
US10809402B2 (en) 2017-05-16 2020-10-20 Conocophillips Company Non-uniform optimal survey design principles
US10079026B1 (en) * 2017-08-23 2018-09-18 Cirrus Logic, Inc. Spatially-controlled noise reduction for headsets with variable microphone array orientation
CN108899043A (en) * 2018-06-15 2018-11-27 深圳市康健助力科技有限公司 The research and realization of digital deaf-aid instantaneous noise restrainable algorithms
WO2020069143A1 (en) 2018-09-30 2020-04-02 Conocophillips Company Machine learning based signal recovery
JP2020144204A (en) * 2019-03-06 2020-09-10 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカPanasonic Intellectual Property Corporation of America Signal processor and signal processing method
FR3113537B1 (en) 2020-08-19 2022-09-02 Faurecia Clarion Electronics Europe Method and electronic device for reducing multi-channel noise in an audio signal comprising a voice part, associated computer program product
CN112233688B (en) * 2020-09-24 2022-03-11 北京声智科技有限公司 Audio noise reduction method, device, equipment and medium
CN116644281B (en) * 2023-07-27 2023-10-24 东营市艾硕机械设备有限公司 Yacht hull deviation detection method

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4658426A (en) * 1985-10-10 1987-04-14 Harold Antin Adaptive noise suppressor
US5251263A (en) * 1992-05-22 1993-10-05 Andrea Electronics Corporation Adaptive noise cancellation and speech enhancement system and apparatus therefor
US5742694A (en) * 1996-07-12 1998-04-21 Eatwell; Graham P. Noise reduction filter
US5924061A (en) * 1997-03-10 1999-07-13 Lucent Technologies Inc. Efficient decomposition in noise and periodic signal waveforms in waveform interpolation
US6691092B1 (en) * 1999-04-05 2004-02-10 Hughes Electronics Corporation Voicing measure as an estimate of signal periodicity for a frequency domain interpolative speech codec system
JP2005249816A (en) * 2004-03-01 2005-09-15 Internatl Business Mach Corp <Ibm> Device, method and program for signal enhancement, and device, method and program for speech recognition
DE602004004242T2 (en) * 2004-03-19 2008-06-05 Harman Becker Automotive Systems Gmbh System and method for improving an audio signal
US7813499B2 (en) * 2005-03-31 2010-10-12 Microsoft Corporation System and process for regression-based residual acoustic echo suppression

Also Published As

Publication number Publication date
WO2007099222A1 (en) 2007-09-07
FR2898209A1 (en) 2007-09-07
ES2378482T3 (en) 2012-04-13
ATE535905T1 (en) 2011-12-15
US7953596B2 (en) 2011-05-31
FR2898209B1 (en) 2008-12-12
EP1830349A1 (en) 2007-09-05
US20070276660A1 (en) 2007-11-29

Similar Documents

Publication Publication Date Title
EP1830349B1 (en) Method of noise reduction of an audio signal
EP2057835B1 (en) Method of reducing the residual acoustic echo after echo removal in a hands-free device
EP1789956B1 (en) Method of processing a noisy sound signal and device for implementing said method
EP2293594B1 (en) Method for filtering lateral non stationary noise for a multi-microphone audio device
EP2309499B1 (en) Method for optimised filtering of non-stationary interference captured by a multi-microphone audio device, in particular a hands-free telephone device for an automobile.
EP1356461B1 (en) Noise reduction method and device
EP2538409B1 (en) Noise reduction method for multi-microphone audio equipment, in particular for a hands-free telephony system
EP2772916B1 (en) Method for suppressing noise in an audio signal by an algorithm with variable spectral gain with dynamically adaptive strength
EP1096471A1 (en) Method and means for a robust feature extraction for speech recognition
EP0666655B1 (en) Method and apparatus for analyzing a return signal and adaptive echo canceller using the same
EP0767569B1 (en) Method and device for adaptive identification and related adaptive echo canceller
EP2131357A1 (en) System for automatic control of the gain applied to an audio signal according to environmental noise
EP3192073B1 (en) Discrimination and attenuation of pre-echoes in a digital audio signal
EP1940139B1 (en) Control of echo suppression filters
EP1039736B1 (en) Method and device for adaptive identification and related adaptive echo canceller
EP0534837B1 (en) Speech processing method in presence of acoustic noise using non-linear spectral subtraction and hidden Markov models
EP3627510A1 (en) Filtering of an audio signal acquired by a voice recognition system
WO2001011605A1 (en) Method and device for detecting voice activity
EP2515300A1 (en) Method and System for noise reduction
FR2767941A1 (en) ECHO SUPPRESSOR BY SENSE TRANSFORMATION AND ASSOCIATED METHOD
FR3113537A1 (en) Method and electronic device for reducing multi-channel noise in an audio signal comprising a voice part, associated computer program product
Kim et al. Improved noise reduction with packet loss recovery based on post-filtering over IP networks
FR3054338A1 (en) METHOD OF CORRECTING DEFECTS INTRODUCED BY A SCANNING SYSTEM AND ASSOCIATED DEVICES
WO2010029247A1 (en) Low-distortion noise cancellation
WO2006077005A2 (en) Device for acoustic echo cancellation, and corresponding method and computer program

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

AX Request for extension of the european patent

Extension state: AL BA HR MK YU

17P Request for examination filed

Effective date: 20080219

17Q First examination report despatched

Effective date: 20080331

AKX Designation fees paid

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

Free format text: NOT ENGLISH

Ref country code: CH

Ref legal event code: EP

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

Free format text: LANGUAGE OF EP DOCUMENT: FRENCH

REG Reference to a national code

Ref country code: DE

Ref legal event code: R081

Ref document number: 602007019049

Country of ref document: DE

Owner name: PARROT AUTOMOTIVE, FR

Free format text: FORMER OWNER: PARROT, PARIS, FR

REG Reference to a national code

Ref country code: NL

Ref legal event code: T3

REG Reference to a national code

Ref country code: DE

Ref legal event code: R096

Ref document number: 602007019049

Country of ref document: DE

Effective date: 20120301

REG Reference to a national code

Ref country code: ES

Ref legal event code: FG2A

Ref document number: 2378482

Country of ref document: ES

Kind code of ref document: T3

Effective date: 20120413

LTIE Lt: invalidation of european patent or patent extension

Effective date: 20111130

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20111130

Ref country code: IS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20120330

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20111130

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20120301

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20111130

Ref country code: LV

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20111130

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20120330

REG Reference to a national code

Ref country code: IE

Ref legal event code: FD4D

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: CY

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20111130

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20111130

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20111130

Ref country code: BG

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20120229

Ref country code: CZ

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20111130

Ref country code: SK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20111130

Ref country code: EE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20111130

BERE Be: lapsed

Owner name: PARROT

Effective date: 20120228

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: RO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20111130

Ref country code: PL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20111130

REG Reference to a national code

Ref country code: AT

Ref legal event code: MK05

Ref document number: 535905

Country of ref document: AT

Kind code of ref document: T

Effective date: 20111130

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: MC

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20120229

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LI

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20120229

Ref country code: CH

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20120229

26N No opposition filed

Effective date: 20120831

REG Reference to a national code

Ref country code: DE

Ref legal event code: R097

Ref document number: 602007019049

Country of ref document: DE

Effective date: 20120831

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: BE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20120228

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: AT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20111130

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: ES

Payment date: 20130218

Year of fee payment: 7

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20111130

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: TR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20111130

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LU

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20120221

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: HU

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070221

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 9

REG Reference to a national code

Ref country code: ES

Ref legal event code: FD2A

Effective date: 20150327

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: ES

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20140222

REG Reference to a national code

Ref country code: DE

Ref legal event code: R081

Ref document number: 602007019049

Country of ref document: DE

Owner name: PARROT AUTOMOTIVE, FR

Free format text: FORMER OWNER: PARROT, PARIS, FR

REG Reference to a national code

Ref country code: GB

Ref legal event code: 732E

Free format text: REGISTERED BETWEEN 20151029 AND 20151104

REG Reference to a national code

Ref country code: FR

Ref legal event code: TP

Owner name: PARROT AUTOMOTIVE, FR

Effective date: 20151201

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 10

REG Reference to a national code

Ref country code: NL

Ref legal event code: PD

Owner name: PARROT AUTOMOTIVE; FR

Free format text: DETAILS ASSIGNMENT: VERANDERING VAN EIGENAAR(S), OVERDRACHT; FORMER OWNER NAME: PARROT

Effective date: 20151102

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 11

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 12

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: NL

Payment date: 20190219

Year of fee payment: 13

REG Reference to a national code

Ref country code: NL

Ref legal event code: MM

Effective date: 20200301

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: NL

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20200301

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: FR

Payment date: 20230119

Year of fee payment: 17

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: IT

Payment date: 20230120

Year of fee payment: 17

Ref country code: GB

Payment date: 20230121

Year of fee payment: 17

Ref country code: DE

Payment date: 20230119

Year of fee payment: 17