EP3516653B1 - Apparatus and method for generating noise estimates - Google Patents


Info

Publication number: EP3516653B1
Authority: EP (European Patent Office)
Prior art keywords: noise, frequency, spectral, cut, audio signal
Legal status: Active
Application number: EP16784821.7A
Other languages: German (de), English (en)
Other versions: EP3516653A1 (fr)
Inventors: Wenyu Jin, Wei Xiao
Original and current assignee: Huawei Technologies Co Ltd
Application filed by Huawei Technologies Co Ltd

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02: Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208: Noise filtering
    • G10L21/0216: Noise filtering characterised by the method used for estimating noise
    • G10L21/0232: Processing in the frequency domain

Definitions

  • This invention relates to an apparatus and a method for generating noise estimates.
  • NR: noise reduction
  • SNR: signal-to-noise ratio
  • Noise reduction methods based on single channel noise estimation can usually only deal with stationary noise scenarios and are vulnerable to non-stationary noise and interferers. Better differentiation between speech and noise can be achieved using multiple microphones. Using multiple microphones also facilitates accurate estimation of complex noise conditions and can lead to effective non-stationary noise suppression.
  • Examples of existing techniques that explore the possibility of noise estimation using multiple microphone arrays include techniques described in: "A microphone array with adaptive post-filtering for noise reduction in reverberant rooms" by R. Zelinski (Proc. ICASSP-88, vol. 5, 1988, pp. 2578-2581) and "Microphone array post-filter based on noise field coherence" by McCowan et al (IEEE Transactions on Speech and Audio Processing, vol. 11, no. 6, 2003, pp. 709-716). These techniques assume the noise is either spatially white (incoherent) or fully diffuse and cannot deal with time-varying noise and interference sources. They are also ineffective at low frequencies when the sound source is close to the microphone. Speech and noise signals show similar coherence properties under those conditions, meaning that it is not possible to distinguish one from the other on the basis of coherence alone.
  • US 2008/0159559 A1 describes a post-filter for a microphone array which is based on a transition frequency determined in accordance with a distance between microphones.
  • a wind noise reduction device is described in US 2008/0317261 A1 .
  • US 2014/0161271 A1 concerns a noise eliminating device and US 2016/0078856 A1 also addresses eliminating noise.
  • the noise estimator 100 comprises an estimator 101 and an adaptation unit 102.
  • the estimator is configured to receive an audio signal that is detected by microphone 103 (step S201).
  • the estimator is configured to receive audio signals that are detected by multiple microphones 104.
  • the microphones may be part of a device that incorporates the noise estimator. That device could be, for example, a mobile phone, smart phone, landline telephone, tablet, laptop, teleconferencing equipment or any generic user equipment, particularly user equipment that is commonly used to capture speech signals.
  • the audio signal represents sounds that have been captured by a microphone.
  • An audio signal will often be formed from a component that is wanted (which will usually be speech) and a component that is not wanted (which will usually be noise). Estimating the unwanted component means that it can be removed from the audio signal.
  • Each microphone will capture its own version of sounds in the surrounding environment, and those versions will tend to differ from each other depending on differences between the microphones themselves and on the respective positions of the microphones relative to the sound sources. If the sounds in the environment include speech and noise, each microphone will typically capture an audio signal that is representative of both speech and noise. Similarly, if the sounds in the environment just include noise (e.g. during pauses in speech), each microphone will capture an audio signal that represents just that noise. Sounds in the surrounding environment will typically be reflected differently in each individual audio signal. In some circumstances, these differences can be exploited to estimate the noise signal.
  • the estimator (101) is configured to generate an overall estimate of noise in the audio signal (steps 202 to 204). The estimated noise can then be removed from the audio signal by another part of the device.
  • the estimator is configured to generate the estimate based on one or more of the audio signals captured by the microphones.
  • the audio signals that are captured by the microphones may be pre-processed before being input into the estimator. Such "pre-processed" signals are also covered by the general term "audio signals" used herein.
  • Each audio signal can be considered as being formed from a series of complex sinusoidal functions. Each of those sinusoidal functions is a spectral component of the audio signal. Typically, each spectral component is associated with a particular frequency, phase and amplitude.
  • the audio signal can be disassembled into its respective spectral components by a Fourier analysis.
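As an illustration of this decomposition (a minimal sketch only; the tone frequency, sampling rate and FFT length below are arbitrary choices, not values from the patent):

```python
import numpy as np

# Hypothetical frame: a 200 Hz tone sampled at 8 kHz, 256 samples.
fs = 8000
n = 256
t = np.arange(n) / fs
frame = np.sin(2 * np.pi * 200 * t)

# rfft disassembles the real frame into complex spectral components;
# bin k corresponds to frequency k * fs / n.
spectrum = np.fft.rfft(frame)
freqs = np.fft.rfftfreq(n, d=1 / fs)

# The dominant component sits in the bin nearest 200 Hz
# (up to the bin spacing of fs / n Hz).
peak_bin = int(np.argmax(np.abs(spectrum)))
peak_freq = freqs[peak_bin]

# Each spectral component has an amplitude and a phase.
amplitude = np.abs(spectrum[peak_bin])
phase = np.angle(spectrum[peak_bin])
```

Each bin of `spectrum` is one complex sinusoidal component with its own frequency, amplitude and phase, as the description above states.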
  • the estimator 101 aims to form an overall noise estimate by generating a spectral noise estimate for each spectral component in the audio signal.
  • the estimator comprises a low-frequency estimator 105 and a high frequency estimator 106.
  • the low-frequency estimator 105 is configured to generate spectral noise estimates for the spectral components of the audio signal that are below a cut-off frequency. Those spectral noise estimates will form a low frequency section of the overall noise estimate.
  • the low frequency estimator achieves this by applying a first estimation technique to the audio signal to generate spectral noise estimates that are associated with frequencies below a cut-off frequency (step S202).
  • the high frequency estimator 106 is configured to generate spectral noise estimates for the spectral components of the audio signal that are above the cut-off frequency. Those spectral estimates will form a higher frequency section of the overall noise estimate. The high frequency estimator achieves this by applying a second estimation technique to the audio signal to generate spectral noise estimates that are associated with frequencies above the cut-off frequency (step S203).
  • the estimator also comprises a combine module 107 that is configured to form the overall noise estimate by combining the spectral noise estimates that are output by the low and high frequency estimators.
  • the combine module forms the overall noise estimate to have spectral noise estimates that are output by the low frequency estimator below the cut-off frequency and spectral noise estimates that are output by the high frequency estimator above the cut-off frequency (step S204).
  • the low and high frequency estimators will both be configured to generate spectral noise estimates across the whole frequency range of the audio signal. The combine module will then just select the appropriate spectral noise estimate to use for each frequency bin in the overall noise estimate, with that selection depending on the cut-off frequency.
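The per-bin selection performed by the combine module can be sketched as follows; the function name and the example numbers are illustrative, not taken from the patent:

```python
import numpy as np

def combine_estimates(low_est, high_est, freqs, cutoff_hz):
    """Per-bin selection: below the cut-off frequency use the
    low-frequency estimator's output, otherwise the high-frequency one."""
    low_est = np.asarray(low_est, dtype=float)
    high_est = np.asarray(high_est, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    return np.where(freqs < cutoff_hz, low_est, high_est)

# Both estimators produced values for every bin; the combine module
# simply picks per bin based on the (adjustable) cut-off frequency.
freqs = np.array([0.0, 500.0, 1000.0, 1500.0, 2000.0])
low = np.array([1.0, 1.0, 1.0, 1.0, 1.0])
high = np.array([2.0, 2.0, 2.0, 2.0, 2.0])
overall = combine_estimates(low, high, freqs, cutoff_hz=1000.0)
```

Because the selection is just a per-bin mask, adapting the cut-off frequency later only changes which bins are drawn from which estimator.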
  • the estimator 101 also comprises an adaptation unit 102.
  • the adaptation unit is configured to adjust the cut-off frequency.
  • the adaptation unit makes this adjustment to account for changes in the respective coherence properties of the speech and noise signals that are reflected in the audio signal (step S205).
  • the coherence properties of the noise signal generally vary in dependence on frequency. At low frequencies, speech and noise tend to show similar degrees of coherence, whereas at higher frequencies noise is often incoherent while speech is coherent. Coherence properties can also be affected by the distance between a sound source and a microphone: noise and speech show particularly similar coherence properties at low frequencies when the microphone and the sound source are close together.
  • the respective coherence properties displayed by the noise and speech signals will thus tend to vary with time, particularly in mobile and/or hands free scenarios where one or more sound sources (such as someone talking) may move with respect to the microphone.
  • One option is to track the coherence properties of both speech and noise.
  • it is the noise coherence that particularly changes. Consequently changes between the respective coherence properties of the speech and noise signals can be monitored by tracking the coherence properties of just the noise.
  • Adjusting the cut-off frequency so as to adapt to changes in the coherence properties of the noise signal that are represented in the audio signal may be advantageous because it enables the estimator to generate the overall noise estimate using techniques that work well for the particular coherence properties that are prevalent in the noise on either side of the cut-off frequency, and to alter that cut-off frequency to account for changes in those coherence properties with time. This is particularly useful for the complex noise scenarios that occur when user equipment is used in hands-free mode.
  • Figure 1 is intended to correspond to a number of functional blocks. This is for illustrative purposes only. Figure 1 is not intended to define a strict division between different parts of hardware on a chip or between different programs, procedures or functions in software.
  • some or all of the signal processing techniques described herein are likely to be performed wholly or partly in hardware. This particularly applies to techniques incorporating repetitive arithmetic operations. Examples of such repetitive operations might include Fourier transforms, auto- and cross-correlations and pseudo inverses.
  • at least some of the functional blocks are likely to be implemented wholly or partly by a processor acting under software control.
  • the processor could, for example, be a DSP of a mobile phone, smart phone, landline telephone, tablet, laptop, teleconferencing equipment or any generic user equipment with speech processing capability.
  • A more detailed example of a noise estimator is shown in Figure 3.
  • the system is configured to receive multiple audio signals X 1 to X m (301). Each of these audio signals represents a recording from a specific microphone. The number of microphones can thus be denoted M.
  • Each channel is provided with a segmentation/windowing module 302. These modules are followed by transform units 303 configured to convert the windowed signals into the frequency domain.
  • the transform units 303 are configured to implement the Fast Fourier Transform (FFT) to derive the short-term Fourier transform (STFT) coefficients for each input channel. These coefficients represent spectral components of the input signal.
  • the STFT is a Fourier-related transform used to determine the sinusoidal frequency and phase content of local sections of a signal as it changes over time.
  • the STFT may be computed by dividing the audio signal into short segments of equal length and then computing the Fourier transform separately on each short segment. The result is the Fourier spectrum for each short segment of the audio signal, giving the signal processor the changing frequency spectra of the audio signal as a function of time. Each spectral component thus has an amplitude and a time extension.
  • the length of the FFT can be denoted N. N represents a number of frequency bins, with the STFT essentially decomposing the original audio signal into those frequency bins.
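The STFT procedure described above (segment, window, transform each segment) can be sketched as follows; the window choice, frame length and hop size are illustrative assumptions, not values from the patent:

```python
import numpy as np

def stft(x, n_fft=256, hop=128):
    """Naive STFT: window equal-length segments of the signal and
    FFT each one. Returns shape (num_frames, n_fft // 2 + 1), i.e.
    one row of N-bin spectral coefficients per short segment."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

# One second of a 440 Hz tone at 8 kHz as a stand-in input channel.
fs = 8000
x = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)
coeffs = stft(x)
```

Each row of `coeffs` is the Fourier spectrum of one short segment, so the array as a whole gives the changing frequency content of the signal as a function of time, as described above.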
  • the outputs from the transform units 303 are input into the estimator, shown generally at 304.
  • the low frequency estimator is implemented by "SPP Based NE" unit 305 (which will be referred to as SPP unit 305 hereafter).
  • the low frequency estimator is configured to generate the spectral noise estimates below the cut-off frequency.
  • the high frequency estimator is implemented by the "Noise Coherence/Covariance” modelling unit 306 and the "MMSE Optimal NE Solver” 307 (which will be respectively referred to as modelling unit 306 and optimiser 307 hereafter).
  • the high frequency estimator is configured to generate the spectral noise estimates above the cut-off frequency.
  • the low frequency estimator 305 and the high frequency estimator 306, 307 process the outputs from the transform units using different noise estimation techniques.
  • the low frequency estimator suitably uses a technique that is adapted to the respective coherence properties of the noise signal and the speech signal that are expected to predominate in the audio signal below the cut-off frequency. In most embodiments this means that the low frequency estimator will apply an estimation technique that is adapted to a scenario in which the coherence of both signals is high and similar to the coherence of the other.
  • the low frequency estimator is configured to generate its spectral noise estimates based on a single microphone signal.
  • the high frequency estimator will similarly apply an estimation technique that is adapted to a coherence of the noise signal and the speech signal that is expected to predominate in the audio signal above the cut-off frequency.
  • the noise and speech signals are generally expected to show different coherence properties above the cut-off frequency, with the noise signal becoming less coherent than below the cut-off frequency.
  • a more accurate noise estimate may be obtained by combining signals from multiple microphones under these conditions, so the high frequency estimator may be configured to receive audio signals from multiple microphones.
  • the noise estimates that are output by the low frequency estimator 305 and the high frequency estimator 306, 307 take the form of power spectral densities (PSD).
  • PSD: power spectral density
  • a PSD represents the noise as a series of coefficients. Each coefficient represents an estimated power of the noise in an audio signal for a respective frequency bin. The coefficient in each frequency bin can be considered a spectral noise estimate.
  • the frequency bins suitably replicate the frequency bins into which the audio signals were decomposed by transform units 303.
  • the outputs of the low frequency estimator and the high frequency estimator thus represent spectral noise estimates for each spectral noise component of the audio signal.
  • the two sets of coefficients are input into the "Estimate Selection" unit 308.
  • This estimate selection unit combines the functionality of combine module 107 and adaptation unit 102 shown in Figure 1 .
  • the estimate selection unit is configured to choose between the coefficients that are output by the low frequency estimator and the high frequency estimator in dependence on frequency.
  • below the cut-off frequency, the estimate selection unit chooses the coefficients output by SPP unit 305.
  • above the cut-off frequency, the estimate selection unit chooses the coefficients output by the combination of the modelling unit 306 and the optimiser 307.
  • the estimate selection unit also monitors a coherence of the noise signal by means of the audio signal, and uses this to adapt the cut-off frequency.
  • the low frequency estimator may use any suitable estimation technique to generate spectral noise estimates that are below a cut-off frequency.
  • One option would be an MMSE-based spectral noise power estimation technique.
  • Another option is soft decision voice activity detection. This is the technique implemented by SPP unit 305, which is configured to implement a single-channel SPP-based method (where "SPP" stands for Speech Presence Probability). SPP maintains a quick noise tracking capability, results in less noise power overestimation and is computationally less expensive than other options.
  • SPP module 305 is configured to receive an audio signal from one microphone.
  • the SPP unit 305 is preferably configured to receive the single channel that corresponds to the device's "primary" microphone.
  • Model adaptation unit 306 is configured to update a noise coherence model and a noise covariance model in dependence on signals input from multiple microphones.
  • Optimiser 307 takes the outputs of the model adaptation unit and generates the optimum noise estimate for higher frequency sections of the overall noise estimate given those outputs.
  • step S401 the incoming signals 301 are received from multiple microphones.
  • step S402 those signals are segmented/windowed (by segmentation/windowing units 302) and converted into the frequency domain (by transform units 303).
  • P(ℓ, k) represents the probability of speech presence in frame ℓ and frequency bin k
  • X1 is the audio signal received by SPP unit 305
  • ξ_opt is a fixed, optimal a priori signal-to-noise ratio
  • σ_N,SPP(ℓ - 1, k) is the noise estimate of the previous frame
  • P(ℓ, k) is a value between 0 and 1, where 1 indicates speech presence
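The patent's exact SPP formulas do not survive in this text, so the sketch below only follows the general shape of SPP-based noise tracking (in the spirit of Gerkmann and Hendriks): compute a posterior speech presence probability from the ratio of the current periodogram to the previous noise estimate, then update the noise PSD cautiously where speech is likely. The constants, names and update rule are assumptions, not the patent's.

```python
import numpy as np

def spp_noise_update(noisy_power, noise_prev, xi_opt=15.0, smoothing=0.8):
    """One SPP-style noise PSD update per frequency bin (a sketch).
    noisy_power: |X1(l, k)|^2 for the current frame
    noise_prev:  noise estimate of the previous frame"""
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise_prev = np.asarray(noise_prev, dtype=float)
    # Posterior speech presence probability in [0, 1]; 1 indicates speech.
    ratio = noisy_power / np.maximum(noise_prev, 1e-12)
    p = 1.0 / (1.0 + (1.0 + xi_opt)
               * np.exp(-ratio * xi_opt / (1.0 + xi_opt)))
    # Where speech is likely, keep the previous estimate; where it is
    # unlikely, lean on the current periodogram. Then smooth over time.
    target = p * noise_prev + (1.0 - p) * noisy_power
    return smoothing * noise_prev + (1.0 - smoothing) * target, p

# Bin 0 looks like steady noise; bin 1 contains a strong (speech-like) burst.
noise, p = spp_noise_update(noisy_power=[0.1, 50.0], noise_prev=[0.1, 0.1])
```

Note how the burst in the second bin drives its probability toward 1, so the noise estimate there is barely updated; this is the quick-but-cautious tracking behaviour the description attributes to SPP.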
  • the speech presence probability calculation also triggers the updating of the noise coherence and covariance models by modelling unit 306, since these models are preferably updated in the absence of speech.
  • the model adaptation unit (306) is configured to track two qualities of the noise comprised in the incoming microphone signals: its coherence and its covariance (step S405).
  • the model adaptation unit is configured to track noise coherence using a model that is based on a coherence function.
  • the coherence function characterises a noise field by representing the coherence between two signals at points p and q.
  • the magnitude of the output of the noise coherence function is always less than or equal to one (i.e. |Γ_pq(ℓ, k)| ≤ 1).
  • the relevant distance in this scenario is between the jth and kth microphones, so the subscripts j and k will be substituted for p and q hereafter.
  • ℓ is the frame index
  • k is the frequency bin
  • Φ_jj(ℓ, k), Φ_kk(ℓ, k) and Φ_jk(ℓ, k) are the recursively smoothed auto-correlated and cross-correlated PSDs of the audio signals from the jth and kth microphones respectively.
  • P(ℓ, k) is the a posteriori SPP index for the current frame and is provided to model adaptation unit 306 by SPP unit 305.
  • P(ℓ, k) acts as the threshold for Γ_jk(ℓ, k) to be updated. In practice it is preferable to only update Γ_jk(ℓ, k) in periods where speech is absent.
  • a suitable value for the smoothing factor λ might be 0.95.
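The recursive smoothing and coherence update described above might be sketched as follows, updating only in bins judged speech-free. The symbol names mirror the reconstructed notation, and the hard SPP threshold is a simplification of whatever gating the patent actually applies:

```python
import numpy as np

def update_coherence(phi_jj, phi_kk, phi_jk, Xj, Xk, spp,
                     lam=0.95, spp_threshold=0.5):
    """Recursively smooth the auto/cross PSDs of microphones j and k,
    then recompute the coherence Gamma_jk = Phi_jk / sqrt(Phi_jj * Phi_kk).
    Bins where speech presence probability exceeds the threshold are
    left untouched (models are updated in the absence of speech)."""
    absent = spp < spp_threshold
    phi_jj = np.where(absent, lam * phi_jj + (1 - lam) * np.abs(Xj) ** 2, phi_jj)
    phi_kk = np.where(absent, lam * phi_kk + (1 - lam) * np.abs(Xk) ** 2, phi_kk)
    phi_jk = np.where(absent, lam * phi_jk + (1 - lam) * Xj * np.conj(Xk), phi_jk)
    gamma = phi_jk / np.sqrt(phi_jj * phi_kk)
    return phi_jj, phi_kk, phi_jk, gamma

# Identical signals at both microphones: coherence magnitude stays at 1.
phi_jj = np.array([1.0]); phi_kk = np.array([1.0]); phi_jk = np.array([1.0 + 0j])
Xj = np.array([2.0 + 0j]); Xk = Xj.copy()
spp = np.array([0.1])  # speech judged absent, so the models update
phi_jj, phi_kk, phi_jk, gamma = update_coherence(phi_jj, phi_kk, phi_jk, Xj, Xk, spp)
```

The magnitude of `gamma` stays at or below one, consistent with the bound stated for the coherence function.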
  • the model adaptation unit (306) is also configured to actively update a noise covariance matrix (also in step S405).
  • the model adaptation unit (306) is thus configured to establish the coherence and covariance models and update them as audio signals are received from the microphones.
  • diag and odiag represent the diagonal and off-diagonal elements respectively, written in vector form.
  • the updated models are used to generate a further noise estimate using an optimal least squares solution (step S406).
  • the values of R and ⁇ are suitably transferred from the model adaptation unit (306) to the optimiser (307).
  • the optimiser is configured to generate the noise estimate for higher frequencies by searching for an optimal least-squares solution to equation (7) in the minimum mean square error (MMSE) sense.
  • MMSE: minimum mean square error
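Equation (7) itself does not survive in this text, so the following is only a schematic of a least-squares step of this general shape: for one frequency bin, given measured noise covariance entries and the modelled coherence between microphone pairs, find the noise power that explains them best in the least-squares sense. The stacking of diag/odiag entries and the assumption of a common noise power across microphones are illustrative, not the patent's formulation:

```python
import numpy as np

def solve_noise_power(diag_entries, odiag_entries, odiag_coherence):
    """Least-squares fit of a single noise power sigma to the model
    diag(R) ~ sigma * 1 and odiag(R) ~ sigma * Gamma (one frequency bin)."""
    a = np.concatenate([np.ones(len(diag_entries)),
                        np.real(np.asarray(odiag_coherence))])
    b = np.concatenate([np.real(np.asarray(diag_entries)),
                        np.real(np.asarray(odiag_entries))])
    sigma, *_ = np.linalg.lstsq(a[:, None], b, rcond=None)
    return float(sigma[0])

# Synthetic, self-consistent data: true noise power 2.0, coherence 0.3,
# so the off-diagonal covariance entries are 2.0 * 0.3 = 0.6.
sigma = solve_noise_power(diag_entries=[2.0, 2.0, 2.0],
                          odiag_entries=[0.6, 0.6, 0.6],
                          odiag_coherence=[0.3, 0.3, 0.3])
```

On noisy real data the system is overdetermined and inconsistent, which is exactly when the least-squares (MMSE-sense) solution is useful.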
  • the estimate selector 308 is configured to form the overall noise estimate. It receives the estimates generated by both the SPP unit (305) and the optimiser (307) (denoted Φ_S and Φ_C respectively) and combines them to form the overall noise estimate (step S407).
  • estimate selector 308 is configured to adaptively adjust the split frequency between the single microphone noise estimate and the multi microphone noise estimate based on the updating model in equation (4).
  • f_pq represents the frequency where the magnitude-squared value of the updated coherence function in equation (4) for the pq-th microphone pair has some predetermined value.
  • a suitable value might be, for example, 0.5.
  • the split frequency is selected to be the lowest frequency among various microphone pairs where the magnitude squared value of coherence function has the predetermined value. This ensures that the appropriate noise estimate is selected for the speech and noise coherence properties experienced at different frequencies, meaning that problems caused by similarity and overlapping between speech and noise coherence properties can be consistently avoided for each channel.
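The split-frequency rule above (the lowest frequency, over all microphone pairs, at which the magnitude-squared coherence reaches the predetermined value) can be sketched as follows; the function name and example numbers are illustrative:

```python
import numpy as np

def select_cutoff(freqs, coherence_by_pair, threshold=0.5):
    """For each microphone pair, find the lowest frequency where the
    magnitude-squared coherence drops to the threshold; the split
    frequency is the lowest such frequency over all pairs."""
    candidates = []
    for gamma in coherence_by_pair:
        msc = np.abs(np.asarray(gamma)) ** 2   # magnitude-squared coherence
        below = np.nonzero(msc <= threshold)[0]
        if below.size:
            candidates.append(float(freqs[below[0]]))
    # If no pair ever drops below the threshold, fall back to the top bin.
    return min(candidates) if candidates else float(freqs[-1])

freqs = np.array([0.0, 250.0, 500.0, 750.0, 1000.0])
pair_a = [1.0, 0.9, 0.8, 0.6, 0.4]    # |coherence| per bin for one pair
pair_b = [1.0, 0.95, 0.6, 0.5, 0.3]   # a second, less coherent pair
cutoff = select_cutoff(freqs, [pair_a, pair_b])
```

Taking the minimum over pairs means the single-microphone (low-frequency) technique is used wherever any pair still shows speech-like coherence, which is the conservative choice the description argues for.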
  • noise reduction can then be achieved using any suitable noise reduction method, including Wiener filtering, spectral subtraction, etc.
  • the techniques described above have been tested via simulation using complex non-stationary subway scenario recordings and three microphones.
  • the recording length was 130 seconds.
  • the recording was processed using the adaptive cut-off frequency technique described above and a technique in which the cut-off frequency is fixed.
  • the results are shown in Figure 5 .
  • the lower plot 502 illustrates the technique described herein, and it can clearly be seen that it has been more effective in addressing the non-stationary noise issues than the fixed cut-off frequency technique shown in upper plot 501.
  • the processing was also more efficient.
  • the processing time using the non-adaptive technique was 62 seconds, compared with 35 seconds for the adaptive technique.


Claims (10)

  1. A noise estimator for generating an overall noise estimate for an audio signal, the noise estimator comprising microphones for capturing sounds, the sounds being represented by a plurality of audio signals comprising the audio signal, each of the plurality of audio signals being formed, at least partly, by a noise signal and comprising a plurality of spectral components, and the overall noise estimate comprising, for each spectral component in the audio signal, a respective spectral noise estimate, the noise estimator comprising:
    an estimator (304) configured to generate the overall noise estimate by:
    applying a first estimation technique to the audio signal to generate spectral noise estimates for spectral components of the audio signal that are below a cut-off frequency;
    applying a second, different estimation technique to the audio signal to generate, based on the plurality of audio signals, spectral noise estimates for spectral components of the audio signal that are above the cut-off frequency; and
    forming the overall noise estimate to comprise, for spectral components below the cut-off frequency, the spectral noise estimates generated using the first estimation technique and, for spectral components above the cut-off frequency, the spectral noise estimates generated using the second estimation technique; characterised by an adaptation unit (306) configured to adjust the cut-off frequency so as to account for changes in a coherence of the noise signal that are reflected in the audio signal, the adaptation unit being configured to select the cut-off frequency to be the lowest frequency at which one of the plurality of audio signals shows a predetermined degree of coherence with another of the plurality of audio signals.
  2. A noise estimator as claimed in claim 1, the estimator (308) being configured to apply:
    as the first estimation technique, a technique that is adapted to a coherence of the noise signal that is expected to predominate in the audio signal below the cut-off frequency; and
    as the second estimation technique, a technique that is adapted to a coherence of the noise signal that is expected to predominate in the audio signal above the cut-off frequency.
  3. A noise estimator as claimed in claim 1 or 2, the estimator being configured to generate the spectral noise estimates for a frequency above the cut-off frequency using an optimisation function that takes the plurality of audio signals as inputs.
  4. A noise estimator as claimed in any of claims 1 to 3, the estimator being configured to generate the spectral noise estimates for a frequency above the cut-off frequency by comparing each of the plurality of audio signals with every other of the plurality of audio signals.
  5. A noise estimator as claimed in any of claims 1 to 4, the estimator being configured to generate the spectral noise estimates for a frequency above the cut-off frequency in dependence on the coherence between each of the plurality of audio signals and every other of the plurality of audio signals.
  6. A noise estimator as claimed in any of claims 1 to 4, the estimator being configured to generate the spectral noise estimates above the cut-off frequency in dependence on a covariance between each of the plurality of audio signals and every other of the plurality of audio signals.
  7. A noise estimator as claimed in any preceding claim, the estimator (308) being configured to generate the spectral noise estimates below the cut-off frequency in dependence on a single audio signal that is representative of the noise signal.
  8. A noise estimator as claimed in any preceding claim, the estimator (308) being configured to generate the spectral noise estimates for a frequency below the cut-off frequency and/or the spectral noise estimates for a frequency above the cut-off frequency by applying the respective first or second estimation technique only to parts of the audio signal that are determined not to comprise speech.
  9. A method for generating an overall noise estimate of an audio signal using a noise estimator that comprises microphones for capturing sounds, the sounds being represented by a plurality of audio signals comprising the audio signal, each of the plurality of audio signals being formed, at least partly, by a noise signal and comprising a plurality of spectral components, and the overall noise estimate comprising, for each spectral component in the audio signal, a respective spectral noise estimate, the method comprising:
    applying (S202) a first estimation technique to the audio signal to generate spectral noise estimates for spectral components of the audio signal that are below a cut-off frequency;
    applying (S203) a second, different estimation technique to the audio signal to generate spectral noise estimates for spectral components of the audio signal that are above the cut-off frequency;
    forming (S204) the overall noise estimate to comprise, for spectral components below the cut-off frequency, the spectral noise estimates generated using the first estimation technique and, for spectral components above the cut-off frequency, the spectral noise estimates generated, based on the plurality of audio signals, using the second estimation technique; and
    adjusting (S205) the cut-off frequency so as to account for changes in a coherence of the noise signal that are reflected in the audio signal, the cut-off frequency being selected to be the lowest frequency at which one of the plurality of audio signals shows a predetermined degree of coherence with another of the plurality of audio signals.
  10. A non-transitory machine-readable storage medium having stored thereon processor-executable instructions implementing a method for generating an overall noise estimate of an audio signal using a noise estimator that comprises microphones for capturing sounds, the sounds being represented by a plurality of audio signals comprising the audio signal, each of the plurality of audio signals being formed, at least partly, by a noise signal and comprising a plurality of spectral components, and the overall noise estimate comprising, for each spectral component in the audio signal, a respective spectral noise estimate, the method comprising:
    applying (S202) a first estimation technique to the audio signal to generate spectral noise estimates for spectral components of the audio signal that are below a cut-off frequency;
    applying (S203) a second, different estimation technique to the audio signal to generate spectral noise estimates for spectral components of the audio signal that are above the cut-off frequency;
    forming (S204) the overall noise estimate to comprise, for spectral components below the cut-off frequency, the spectral noise estimates generated using the first estimation technique and, for spectral components above the cut-off frequency, the spectral noise estimates generated, based on the plurality of audio signals, using the second estimation technique; and
    adjusting (S407) the cut-off frequency so as to account for changes in a coherence of the noise signal that are reflected in the audio signal, the cut-off frequency being selected to be the lowest frequency at which one of the plurality of audio signals shows a predetermined degree of coherence with another of the plurality of audio signals.
EP16784821.7A 2016-10-12 2016-10-12 Apparatus and method for generating noise estimates Active EP3516653B1 (fr)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2016/074462 WO2018068846A1 (fr) 2016-10-12 2016-10-12 Apparatus and method for generating noise estimates

Publications (2)

Publication Number Publication Date
EP3516653A1 EP3516653A1 (fr) 2019-07-31
EP3516653B1 true EP3516653B1 (fr) 2021-08-11

Family

ID=57184415

Family Applications (1)

Application Number Title Priority Date Filing Date
EP16784821.7A Active EP3516653B1 (fr) 2016-10-12 2016-10-12 Appareil et procédé permettant de générer des estimations de bruit

Country Status (2)

Country Link
EP (1) EP3516653B1 (fr)
WO (1) WO2018068846A1 (fr)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4671303B2 (ja) * 2005-09-02 2011-04-13 Japan Advanced Institute of Science and Technology Post-filter for microphone array
US8428275B2 (en) * 2007-06-22 2013-04-23 Sanyo Electric Co., Ltd. Wind noise reduction device
US9131307B2 (en) * 2012-12-11 2015-09-08 JVC Kenwood Corporation Noise eliminating device, noise eliminating method, and noise eliminating program
KR101630155B1 (ko) * 2014-09-11 2016-06-15 Hyundai Motor Company Noise elimination apparatus, noise elimination method, speech recognition apparatus using the noise elimination apparatus, and vehicle equipped with the speech recognition apparatus

Also Published As

Publication number Publication date
EP3516653A1 (fr) 2019-07-31
WO2018068846A1 (fr) 2018-04-19

Similar Documents

Publication Publication Date Title
Thiergart et al. An informed parametric spatial filter based on instantaneous direction-of-arrival estimates
US9768829B2 (en) Methods for processing audio signals and circuit arrangements therefor
US8958572B1 (en) Adaptive noise cancellation for multi-microphone systems
US9185487B2 (en) System and method for providing noise suppression utilizing null processing noise subtraction
KR101726737B1 (ko) 다채널 음원 분리 장치 및 그 방법
US20160066087A1 (en) Joint noise suppression and acoustic echo cancellation
US11631421B2 (en) Apparatuses and methods for enhanced speech recognition in variable environments
US20100217590A1 (en) Speaker localization system and method
US8682006B1 (en) Noise suppression based on null coherence
US20170092256A1 (en) Adaptive block matrix using pre-whitening for adaptive beam forming
Braun et al. Dereverberation in noisy environments using reference signals and a maximum likelihood estimator
KR20130108063A (ko) 다중 마이크로폰의 견고한 잡음 억제
Kodrasi et al. Joint dereverberation and noise reduction based on acoustic multi-channel equalization
WO2009130513A1 (fr) Système de réduction du bruit à deux microphones
CN110211602B (zh) 智能语音增强通信方法及装置
Nelke et al. Dual microphone noise PSD estimation for mobile phones in hands-free position exploiting the coherence and speech presence probability
US20200286501A1 (en) Apparatus and a method for signal enhancement
Thiergart et al. An informed MMSE filter based on multiple instantaneous direction-of-arrival estimates
US11380312B1 (en) Residual echo suppression for keyword detection
US9875748B2 (en) Audio signal noise attenuation
Yousefian et al. Using power level difference for near field dual-microphone speech enhancement
EP3516653B1 (fr) Apparatus and method for generating noise estimates
Lee et al. Channel prediction-based noise reduction algorithm for dual-microphone mobile phones
Esch et al. Combined reduction of time varying harmonic and stationary noise using frequency warping
US11462231B1 (en) Spectral smoothing method for noise reduction

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: UNKNOWN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20190424

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)
GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: GRANT OF PATENT IS INTENDED

INTG Intention to grant announced

Effective date: 20210219

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE PATENT HAS BEEN GRANTED

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

REG Reference to a national code

Ref country code: CH

Ref legal event code: EP

REG Reference to a national code

Ref country code: DE

Ref legal event code: R096

Ref document number: 602016062070

Country of ref document: DE

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

Ref country code: AT

Ref legal event code: REF

Ref document number: 1420209

Country of ref document: AT

Kind code of ref document: T

Effective date: 20210915

REG Reference to a national code

Ref country code: LT

Ref legal event code: MG9D

REG Reference to a national code

Ref country code: NL

Ref legal event code: MP

Effective date: 20210811

REG Reference to a national code

Ref country code: AT

Ref legal event code: MK05

Ref document number: 1420209

Country of ref document: AT

Kind code of ref document: T

Effective date: 20210811

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: ES

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210811

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210811

Ref country code: HR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210811

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20211213

Ref country code: NO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20211111

Ref country code: LT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210811

Ref country code: AT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210811

Ref country code: BG

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20211111

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210811

Ref country code: RS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210811

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: PL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210811

Ref country code: LV

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210811

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20211112

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210811

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210811

REG Reference to a national code

Ref country code: DE

Ref legal event code: R097

Ref document number: 602016062070

Country of ref document: DE

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SM

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210811

Ref country code: SK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210811

Ref country code: RO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210811

Ref country code: EE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210811

Ref country code: CZ

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210811

Ref country code: AL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210811

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

REG Reference to a national code

Ref country code: BE

Ref legal event code: MM

Effective date: 20211031

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: MC

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210811

26N No opposition filed

Effective date: 20220512

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LU

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20211012

Ref country code: IT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210811

Ref country code: BE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20211031

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210811

Ref country code: LI

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20211031

Ref country code: CH

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20211031

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: FR

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20211031

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20211012

P01 Opt-out of the competence of the unified patent court (upc) registered

Effective date: 20230524

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: CY

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210811

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: HU

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; INVALID AB INITIO

Effective date: 20161012

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20230831

Year of fee payment: 8

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20230830

Year of fee payment: 8

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: MK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210811