EP1016072A1 - Method for noise suppression of a digital speech signal - Google Patents

Method for noise suppression of a digital speech signal

Info

Publication number
EP1016072A1
Authority
EP
European Patent Office
Prior art keywords
signal
speech signal
frame
noise
spectral
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
EP98943999A
Other languages
English (en)
French (fr)
Other versions
EP1016072B1 (de)
Inventor
Philip Lockwood
Stéphane LUBIARZ
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nortel Networks France SAS
Original Assignee
Matra Nortel Communications SAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matra Nortel Communications SAS filed Critical Matra Nortel Communications SAS
Publication of EP1016072A1 publication Critical patent/EP1016072A1/de
Application granted granted Critical
Publication of EP1016072B1 publication Critical patent/EP1016072B1/de
Anticipated expiration legal-status Critical
Expired - Lifetime legal-status Critical Current

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0232 Processing in the frequency domain
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0264 Noise filtering characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques

Definitions

  • the present invention relates to digital techniques for denoising speech signals. It relates more particularly to noise reduction by nonlinear spectral subtraction.
  • This technique makes it possible to obtain acceptable denoising for strongly voiced signals, but it severely distorts the speech signal. When the noise is relatively coherent, such as that caused by the contact of car tyres on the road or the rattling of an engine, the noise can be more easily predictable than the unvoiced speech signal. There is then a tendency to project the speech signal onto a part of the vector space of the noise, and the method neglects the speech signal, especially the unvoiced areas where predictability is reduced. In addition, predicting the speech signal from a reduced set of parameters does not capture all the intrinsic richness of speech. This illustrates the limits of techniques based solely on mathematical considerations that overlook the particular character of speech. Finally, other techniques are based on coherence criteria.
  • the coherence function is particularly well developed by J. A. Cadzow and O. M. Solomon ("Linear modeling and the coherence function", IEEE Trans. on Acoustics, Speech and Signal Processing, Vol. ASSP-35, No. 1, January 1987, pages 19-28), and its application to denoising was studied by R. Le Bouquin ("Enhancement of noisy speech signals: application to mobile radio communications", Speech Communication, Vol. 18, pages 3-19). This method relies on the fact that the speech signal has a significantly greater coherence than the noise, provided that several independent channels are used. The results seem quite encouraging, but unfortunately this technique requires several sound sources, which is not always available.
  • a main object of the present invention is to propose a new denoising technique which takes into account the characteristics of speech perception by the human ear, thus allowing effective denoising without deteriorating speech perception.
  • the invention thus proposes a method for denoising a digital speech signal processed by successive frames, in which: - spectral components of the speech signal are calculated on each frame;
  • spectral subtraction is carried out, comprising at least a first subtraction step in which, from each spectral component of the speech signal on the frame, a first quantity is subtracted whose parameters include an overestimate of the corresponding spectral component of the noise for said frame, so as to obtain spectral components of a first denoised signal; and a transformation to the time domain is applied to the result of the spectral subtraction to construct a denoised speech signal.
  • the spectral subtraction also comprises the following steps:
  • a second subtraction step in which, from each spectral component of the speech signal on the frame, a second quantity is subtracted whose parameters include the difference between the overestimate of the corresponding spectral component of the noise and the calculated masking curve.
  • the second subtracted quantity can in particular be limited to the fraction of the overestimate of the corresponding spectral component of the noise which exceeds the masking curve. This procedure is based on the observation that it is sufficient to remove the audible noise; conversely, there is no point in eliminating noise which is masked by speech. Overestimating the noise spectral envelope is generally desirable so that the increased estimate thus obtained is robust to sudden variations of the noise. However, this overestimation usually has the disadvantage of distorting the speech signal when it becomes too large: it affects the voiced character of the speech signal by suppressing part of its predictability. This drawback is very annoying in telephony conditions, because it is during the voiced areas that the speech signal is most energetic. By limiting the quantity subtracted when all or part of a frequency component of the overestimated noise turns out to be masked by speech, the invention greatly reduces this drawback.
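  • as an illustration of this masking-limited subtraction, the sketch below applies the rule described above to per-bin magnitude spectra; the function name, array layout and flooring rule are assumptions for illustration, not the patent's exact formulas.

```python
import numpy as np

def masked_spectral_subtraction(noisy_mag, noise_overestimate, masking_curve, floor=0.05):
    """Sketch of masking-limited spectral subtraction (per frequency bin).

    noisy_mag          : magnitude spectrum of the noisy frame
    noise_overestimate : overestimate of the corresponding noise spectral components
    masking_curve      : masking thresholds from a psychoacoustic model
    floor              : floor coefficient close to 0 (value is an assumption)
    """
    # Only the audible part of the overestimated noise is subtracted:
    # the fraction that exceeds the masking curve.
    audible_noise = np.maximum(noise_overestimate - masking_curve, 0.0)
    denoised = noisy_mag - audible_noise
    # Flooring prevents negative or very small values that cause musical noise.
    return np.maximum(denoised, floor * noise_overestimate)
```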
  • FIG. 1 is a block diagram of a denoising system implementing the present invention;
  • FIGS. 2 and 3 are flowcharts of procedures used by a voice activity detector of the system of FIG. 1;
  • FIG. 4 is a diagram representing the states of a voice activity detection automaton;
  • FIG. 5 is a graph illustrating the variations of a degree of vocal activity;
  • FIG. 6 is a block diagram of a noise overestimation module of the system of FIG. 1;
  • FIG. 7 is a graph illustrating the calculation of a masking curve;
  • FIG. 8 is a graph illustrating the use of the masking curves in the system of FIG. 1;
  • FIG. 9 is a block diagram of another denoising system implementing the present invention;
  • FIG. 10 is a graph illustrating a harmonic analysis method usable in a method according to the invention;
  • FIG. 11 partially shows a variant of the block diagram of FIG. 9.
  • the denoising system shown in FIG. 1 processes a digital speech signal s.
  • a windowing module 10 puts this signal s in the form of successive windows or frames, each consisting of a number N of digital signal samples. Conventionally, these frames can have mutual overlaps.
  • the signal frame is transformed into the frequency domain by a module 11 applying a conventional fast Fourier transform (TFR) algorithm to calculate the modulus of the signal spectrum.
  • TFR: fast Fourier transform
  • the full frequency resolution available at the output of the fast Fourier transform is not used as such; a lower resolution is used, determined by a number I of frequency bands covering the band [0, F/2] of the signal, F denoting the sampling frequency.
  • a module 12 calculates the respective averages of the spectral components S_{n,f} of the speech signal in each band i, for example with a uniform weighting, so as to obtain averaged components S_{n,i}.
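  • a minimal sketch of this band averaging, assuming the bands are given as bin-index edges (the band layout is not fixed by the text):

```python
import numpy as np

def band_averages(spectrum_mag, band_edges):
    """Sketch of module 12: average the magnitude spectrum over I frequency bands.

    spectrum_mag : magnitudes of the positive-frequency bins of one frame
    band_edges   : I + 1 bin indices delimiting bands covering [0, F/2]
                   (the band layout is an assumption, not fixed by the text)
    """
    return np.array([
        spectrum_mag[band_edges[i]:band_edges[i + 1]].mean()  # uniform weighting
        for i in range(len(band_edges) - 1)
    ])
```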
  • the averaged spectral components S_{n,i} are supplied to a voice activity detection module 15 and to a noise estimation module 16. These two modules 15 and 16 work jointly, in the sense that the degrees of vocal activity measured for the different bands by the module 15 are used by the module 16 to estimate the long-term energy of the noise in the different bands, while these long-term estimates B_{n,i} are used by the module 15 to carry out an a priori denoising.
  • the operation of the modules 15 and 16 can correspond to the flowcharts represented in FIGS. 2 and 3.
  • the module 15 first performs an a priori denoising of the speech signal in the different bands i for the signal frame n. This a priori denoising is carried out according to a spectral subtraction process based on noise estimates obtained during one or more preceding frames.
  • in step 17, the module 15 calculates, with the resolution of the bands i, the frequency response of an a priori denoising filter, where τ1 and τ2 are delays expressed in numbers of frames (τ1 ≥ 1, τ2 ≥ 0) and α_{n-τ1,i} is a noise overestimation coefficient, the determination of which will be explained below.
  • the a priori denoised components Ep_{n,i} are obtained by a maximum operation in which β_p is a floor coefficient close to 0, conventionally used to prevent the spectrum of the denoised signal from taking negative or too low values which would cause musical noise.
  • steps 17 to 20 therefore essentially consist in subtracting from the spectrum of the signal an a priori estimate of the noise spectrum, overestimated by the coefficient α_{n-τ1,i}.
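  • a hedged sketch of this a priori subtraction per band, with illustrative variable names and an assumed flooring rule:

```python
import numpy as np

def a_priori_denoising(S_band, B_prev, alpha_prev, beta_p=0.01):
    """Sketch of the a priori denoising of steps 17 to 20 (per band, one frame).

    S_band     : averaged spectral components of the current frame
    B_prev     : long-term noise estimates taken from an earlier frame (delay tau1)
    alpha_prev : noise overestimation coefficients for that earlier frame
    beta_p     : floor coefficient close to 0 (flooring rule and value are assumptions)
    """
    subtracted = S_band - alpha_prev * B_prev        # subtract the overestimated noise
    return np.maximum(subtracted, beta_p * S_band)   # floor avoids musical noise
```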
  • the module 15 then calculates, for each band i (0 ≤ i < I), a quantity ΔE_{n,i} representing the short-term variation of the energy of the denoised signal in the band i, as well as a long-term value E_{n,i} of the energy of the denoised signal in the band i.
  • the quantity ΔE_{n,i} can in particular be calculated by a simplified formula.
  • in step 25, the quantity ΔE_{n,i} is compared with a threshold. If the threshold is not reached, the counter b is incremented by one unit in step 26.
  • in step 27, the long-term estimator ba is compared with the value of the smoothed energy E_{n,i}. If ba ≥ E_{n,i}, the estimator ba is taken equal to the smoothed value E_{n,i} in step 28, and the counter b is reset to zero.
  • the quantity p, which is taken equal to the ratio ba/E_{n,i} (step 36), is then equal to 1.
  • if step 27 shows that ba < E_{n,i}, the counter b is compared with a limit value bmax in step 29. If b > bmax, the signal is considered too stationary to support vocal activity.
  • in that case, the long-term estimator ba is updated with the value of the internal estimator bi in step 35; otherwise, the long-term estimator ba remains unchanged. This prevents sudden variations due to a speech signal from triggering an update of the noise estimator.
  • after having obtained the quantities p, the module 15 proceeds to the voice activity decisions in step 37.
  • the module 15 first updates the state of the detection automaton according to the quantity p_0 calculated for the entire signal band.
  • the new state of the automaton depends on its previous state and on p_0, as shown in FIG. 4.
  • the module 15 also calculates the degrees of vocal activity in each band i ≥ 1.
  • this degree is preferably a non-binary parameter, that is to say a function varying continuously between 0 and 1 according to the values taken by the quantity p. This function has, for example, the shape shown in FIG. 5.
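  • since the exact curve of FIG. 5 is not reproduced here, the following is only an illustrative continuous mapping from the quantity p to a degree of vocal activity between 0 and 1:

```python
def vocal_activity_degree(p, p_low=1.0, p_high=1.5):
    """Illustrative non-binary degree of vocal activity as a function of p.

    The text only states that the degree varies continuously between 0 and 1
    with the quantity p; the linear ramp and the thresholds p_low / p_high
    are assumptions, not the curve of FIG. 5.
    """
    if p <= p_low:
        return 0.0
    if p >= p_high:
        return 1.0
    return (p - p_low) / (p_high - p_low)  # continuous transition between 0 and 1
```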
  • the module 16 calculates the noise estimates per band, which will be used in the denoising process, from the successive values of the averaged spectral components and of the degrees of vocal activity.
  • in step 42, the module 16 updates the noise estimates per band; one possible form of such an update is sketched below.
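  • the update formulas themselves are not reproduced in this extract; the sketch below shows one common form of such an update, an exponential smoothing whose forgetting factor is weighted by the degree of vocal activity (an assumption, not the patent's formulas):

```python
def update_noise_estimate(B_prev, S_band, gamma, lam=0.95):
    """Illustrative recursive noise update per band (not the patent's formulas,
    which are omitted from this extract).

    B_prev : previous long-term noise estimate for the band
    S_band : averaged spectral component of the current frame
    gamma  : degree of vocal activity in the band (0 = noise only, 1 = speech)
    lam    : forgetting factor of the exponential smoothing (assumed value)
    """
    # The estimate follows the observed spectrum only to the extent that the
    # band is judged to contain little or no vocal activity (gamma close to 0).
    effective_lam = lam + (1.0 - lam) * gamma
    return effective_lam * B_prev + (1.0 - effective_lam) * S_band
```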
  • the long-term noise estimates B_{n,i} are overestimated by a module 45 (FIG. 1) before proceeding to the denoising by nonlinear spectral subtraction.
  • the module 45 calculates the overestimation coefficient mentioned previously, as well as an overestimate of the noise which essentially combines the long-term estimate B_{n,i} with a measure ΔB^max_{n,i} of the variability of the noise.
  • in the embodiment shown, this combination is essentially a simple sum made by an adder 46. It could also be a weighted sum.
  • the measure ΔB^max_{n,i} of noise variability reflects the variance of the noise estimator. It is obtained as a function of the values of S_{n,i} and of B_{n,i} calculated for a certain number of previous frames over which the speech signal does not present any vocal activity in the band i.
  • it is a function of the differences S_{n-k,i} − B_{n-k,i} calculated for a number K of frames of silence (n−k < n). In the example shown, this function is simply the maximum (block 50). For each frame n, the degree of vocal activity is compared with a threshold (block 51).
  • the measure of variability ΔB^max can, as a variant, be obtained as a function of the spectral components S_{n,f} (and not of the averaged values S_{n,i}) and of the noise estimates. We then proceed in the same way, except for the contents of the FIFO.
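  • a sketch of blocks 50 and 51, assuming a FIFO of the differences observed on the last K silence frames (the values of K and of the activity threshold are illustrative):

```python
from collections import deque

class NoiseVariability:
    """Sketch of blocks 50 and 51: maximum of |S - B| over the last K silence frames."""

    def __init__(self, K=10, activity_threshold=0.1):
        self.fifo = deque(maxlen=K)              # differences on silence frames only
        self.activity_threshold = activity_threshold

    def update(self, S_band, B_band, gamma):
        # Block 51: only frames with (near-)zero vocal activity feed the FIFO.
        if gamma < self.activity_threshold:
            self.fifo.append(abs(S_band - B_band))
        # Block 50: the variability measure is simply the maximum over the FIFO.
        return max(self.fifo) if self.fifo else 0.0
```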
  • a first phase of the spectral subtraction is carried out by the module 55 shown in FIG. 1. This phase provides, with the resolution of the bands i, the frequency response of a first denoising filter.
  • this frequency response involves a coefficient which represents, like the coefficient β_p of formula (3), a floor conventionally used to avoid negative or too low values of the denoised signal.
  • the overestimation coefficient could be replaced in formula (7) by another coefficient equal to a function of this coefficient and of an estimate of the signal-to-noise ratio, this function being a decreasing function of the estimated signal-to-noise ratio (a sketch of such a function is given below).
  • this function remains equal to the overestimation coefficient for the lowest values of the signal-to-noise ratio: when the signal is very noisy, it is a priori not useful to reduce the overestimation factor.
  • this function decreases towards zero for the highest values of the signal-to-noise ratio. This makes it possible to protect the most energetic areas of the spectrum, where the speech signal is the most significant, the quantity subtracted from the signal then tending towards zero.
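  • an illustrative decreasing function of the estimated signal-to-noise ratio with the two limiting behaviours described above (the linear shape and the SNR limits are assumptions):

```python
def snr_dependent_overestimation(alpha, snr_db, snr_low=0.0, snr_high=20.0):
    """Illustrative decreasing function of the estimated signal-to-noise ratio.

    Returns the full overestimation coefficient `alpha` for the noisiest bands
    (low SNR) and decreases towards zero for the highest SNR values, so that
    the most energetic speech regions are protected; the linear shape and the
    SNR limits are assumptions.
    """
    if snr_db <= snr_low:
        return alpha
    if snr_db >= snr_high:
        return 0.0
    return alpha * (snr_high - snr_db) / (snr_high - snr_low)
```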
  • a second denoising phase is carried out by a module 56 for protecting the harmonics of the tonal frequency. This module calculates, with the resolution of the Fourier transform, a frequency response that protects those harmonics.
  • the module 57 can apply any known method of analysis of the speech signal of the frame to determine the period T, expressed as an integer or fractional number of samples, for example a linear prediction method.
  • the protection provided by the module 56 may consist in correcting the frequency response for each frequency f belonging to a band i and close to a harmonic of the estimated tonal frequency (a sketch of such a protection is given below).
  • this protection strategy is preferably applied for each of the frequencies closest to the harmonics of the tonal frequency f_p, that is to say for every integer harmonic rank η.
  • if Δf_p denotes the frequency resolution with which the analysis module 57 produces the estimated tonal frequency f_p, that is to say that the real tonal frequency lies between f_p − Δf_p/2 and f_p + Δf_p/2,
  • then the difference between the η-th harmonic of the real tonal frequency and its estimate η × f_p (condition (9)) can go up to ±η × Δf_p/2.
  • this difference can be greater than the spectral half-resolution Δf/2 of the Fourier transform.
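  • a sketch of the harmonic protection of module 56, assuming that protection simply forces the filter response to 1 at the discrete bin closest to each harmonic (the exact correction applied by the patent is not reproduced in this extract):

```python
import numpy as np

def protect_harmonics(H, f_pitch, freq_resolution, fs):
    """Sketch of module 56: keep the denoising filter transparent near pitch harmonics.

    H               : frequency response of the denoising filter (one value per bin)
    f_pitch         : estimated tonal (pitch) frequency in Hz
    freq_resolution : spectral resolution of the Fourier transform in Hz
    fs              : sampling frequency in Hz
    (forcing the response exactly to 1 is an illustrative choice of protection)
    """
    H = np.asarray(H, dtype=float).copy()
    if f_pitch <= 0:
        return H
    eta = 1
    while eta * f_pitch < fs / 2:                         # every harmonic below F/2
        k = int(round(eta * f_pitch / freq_resolution))   # closest discrete bin
        if k < len(H):
            H[k] = 1.0                                    # do not attenuate this bin
        eta += 1
    return H
```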
  • the spectral components of a denoised signal are then calculated by a multiplier 58, which multiplies the spectral components of the speech signal by the corrected frequency response.
  • this denoised signal is supplied to a module 60 which calculates, for each frame n, a masking curve by applying a psychoacoustic model of auditory perception by the human ear.
  • the masking phenomenon is a known principle of the functioning of the human ear. When two frequencies are heard simultaneously, one of them may no longer be heard. We then say that it is masked.
  • the masking threshold M_{n,q} for each Bark band q is obtained, according to formula (12), from the quantities C_{n,q} and R_q, where R_q depends on the more or less voiced character of the signal.
  • a parameter denotes the degree of voicing of the speech signal, varying between zero (no voicing) and one (fully voiced signal).
  • the parameter R_q can then be of a known form depending on this degree of voicing; an illustrative computation of the masking curve is sketched below.
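  • as an illustration, the sketch below computes a masking curve with a classic Johnston-style model (two-slope spreading function and a voicing-dependent offset); the slopes and offsets are assumptions standing in for the patent's own formulas, which are not reproduced in this extract:

```python
import numpy as np

def masking_curve(bark_energies, voicing):
    """Illustrative masking-curve computation in the spirit of module 60.

    bark_energies : energy of the denoised signal in each Bark band q
    voicing       : degree of voicing between 0 (unvoiced) and 1 (voiced)
    """
    n_q = len(bark_energies)
    spread = np.zeros(n_q)
    for q in range(n_q):                      # spread each masker over its neighbours
        for j in range(n_q):
            dz = q - j                        # distance in Bark from the masker band j
            att_db = -10.0 * dz if dz >= 0 else 25.0 * dz   # assumed two-slope spreading
            spread[q] += bark_energies[j] * 10.0 ** (att_db / 10.0)
    # Offset in dB interpolated between tone-like (voiced) and noise-like maskers.
    q_idx = np.arange(n_q)
    offset_db = voicing * (14.5 + q_idx) + (1.0 - voicing) * 5.5
    return spread * 10.0 ** (-offset_db / 10.0)
```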
  • the denoising system also includes a module 62 which corrects the frequency response of the noise reduction filter, as a function of the masking curve calculated by the module 60 and of the noise overestimates calculated by the module 45.
  • the module 62 decides the level of noise reduction which must really be reached. By comparing the envelope of the noise overestimate with the envelope formed by the masking thresholds M_{n,q}, it decides to denoise the signal only insofar as the overestimated noise exceeds the masking curve.
  • the new response H'_{n,f}, for a frequency f belonging to the band i defined by the module 12 and to the Bark band q, thus depends on the relative difference between the overestimate of the corresponding spectral component of the noise and the masking curve M_{n,q}, as follows:
  • the quantity subtracted from a spectral component in the overall spectral subtraction process having the frequency response H'_{n,f} is substantially equal to the minimum between, on the one hand, the quantity subtracted from this spectral component in the spectral subtraction process having the frequency response H_{n,f}, and, on the other hand, the fraction of the overestimate of the corresponding spectral component of the noise which exceeds the masking curve.
  • FIG. 8 illustrates the principle of the correction applied by the module 62. It schematically shows an example of a masking curve M_{n,q} calculated on the basis of the spectral components of the denoised signal.
  • a module 65 reconstructs the denoised signal in the time domain, by applying the inverse fast Fourier transform (TFRI) to the frequency samples delivered by the multiplier.
  • FIG. 9 shows a preferred embodiment of a denoising system implementing the invention.
  • this system includes a certain number of elements similar to corresponding elements of the system of FIG. 1, for which the same reference numerals have been used.
  • the modules 10, 11, 12, 15, 16, 45 and 55 provide in particular the quantities described above.
  • the finite frequency resolution of the fast Fourier transform 11 is a limitation of the system of FIG. 1.
  • the frequency subject to protection by the module 56 is not necessarily the precise tonal frequency f_p, but the frequency closest to it in the discrete spectrum. In some cases, frequencies relatively far from the true harmonics of the tonal frequency may then be protected.
  • the system of FIG. 9 overcomes this drawback thanks to an appropriate conditioning of the speech signal.
  • for this purpose, the sampling frequency of the signal is modified so that the period 1/f_p covers exactly an integer number of sample times of the conditioned signal.
  • harmonic analysis methods that can be implemented by the module 57 are capable of providing a fractional value of the delay T, expressed in number of samples at the initial sampling frequency F.
  • a new sampling frequency f_e is then chosen so that it is equal to an integer multiple of the estimated tonal frequency, i.e. f_e = p × f_p with p an integer.
  • this frequency f_e should be greater than F.
  • f_e is preferably chosen between F and 2F (1 ≤ K ≤ 2, where K = f_e/F denotes the ratio between the sampling frequencies), to facilitate the implementation of the conditioning.
  • N is usually a power of 2 for the implementation of the TFR. It is 256 in the example considered.
  • this choice is made by a module 70 according to the value of the delay T supplied by the harmonic analysis module 57; a sketch of such a choice is given below.
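  • a sketch of one possible selection rule consistent with these constraints (Table I itself is not reproduced in this extract):

```python
import math

def conditioning_parameters(T):
    """Sketch of module 70: choose the integer p and the ratio K from the pitch delay T.

    T is the (possibly fractional) pitch period in samples at the original
    sampling frequency F. Resampling at f_e = p * F / T makes the pitch period
    cover exactly p samples, and K = f_e / F = p / T. The rule below simply
    takes the smallest integer p giving K >= 1; it is an assumption consistent
    with 1 <= K <= 2, not the patent's Table I.
    """
    p = math.ceil(T)
    K = p / T
    if not (1.0 <= K <= 2.0):
        raise ValueError("unexpected pitch delay for the 1 <= K <= 2 constraint")
    return p, K
```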
  • the module 70 provides the ratio K between the sampling frequencies to three frequency change modules 71, 72, 73.
  • the module 71 is used to transform the values S_{n,i}, the noise estimates and overestimates, and the frequency response H_{n,f}, relating to the bands i defined by the module 12, into the modified frequency scale (sampling frequency f_e). This transformation consists simply in dilating the bands i by the factor K. The values thus transformed are supplied to the module 56 for protecting the harmonics.
  • the module 72 proceeds to the oversampling of the frame of N samples provided by the windowing module 10.
  • the oversampling by a rational factor K = K1/K2 consists in first performing an oversampling by the integer factor K1, then a subsampling by the integer factor K2; a sketch using a standard polyphase resampler is given below.
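  • a minimal sketch of this rational-factor resampling, using scipy's polyphase resampler purely for illustration (no implementation is prescribed by the patent):

```python
from scipy.signal import resample_poly

def condition_frame(frame, K1, K2):
    """Sketch of module 72: resample a frame by the rational factor K = K1/K2.

    resample_poly upsamples by the integer factor K1, low-pass filters, then
    downsamples by the integer factor K2, matching the two-step description
    in the text.
    """
    return resample_poly(frame, up=K1, down=K2)
```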
  • the conditioned signal frame supplied by the module 72 includes KN samples at the frequency f_e. These samples are sent to a module 75 which calculates their Fourier transform.
  • the two blocks therefore have an overlap of (2 − K) × 100%.
  • for each of the two blocks, a set of Fourier components S_{n,f} is obtained. These components are supplied to the multiplier 58, which multiplies them by the spectral response of the denoising filter.
  • the autocorrelations A(k) of the conditioned signal are calculated by a module 76.
  • a module 77 then calculates the normalized entropy of these autocorrelations.
  • the normalized entropy constitutes a measure of voicing that is very robust to noise and to variations of the tonal frequency; an illustrative computation is sketched below.
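  • an illustrative computation of such a normalized-entropy voicing measure (the exact formulas of modules 76 and 77 are not reproduced in this extract, so the normalizations below are assumptions):

```python
import numpy as np

def normalized_entropy(frame, max_lag):
    """Illustrative voicing measure based on the normalized entropy of autocorrelations."""
    x = frame - np.mean(frame)
    # Autocorrelations A(k) of the (conditioned) frame for k = 1 .. max_lag.
    A = np.array([np.sum(x[k:] * x[:len(x) - k]) for k in range(1, max_lag + 1)])
    A = np.abs(A) + 1e-12
    p = A / np.sum(A)                 # normalize to a probability-like vector
    H = -np.sum(p * np.log(p))        # entropy of the normalized autocorrelations
    return H / np.log(max_lag)        # normalized entropy between 0 and 1
```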
  • the correction module 62 operates in the same way as that of the system of FIG. 1, taking into account the overestimated noise rescaled by the frequency change module 71. It provides the frequency response of the final denoising filter, by which the spectral components of the conditioned signal are multiplied.
  • after the TFRI 65, a module 80 combines, for each frame, the two signal blocks resulting from the processing of the two overlapping blocks delivered by the TFR 75. This combination can consist of a Hamming-weighted sum of the samples, to form a denoised conditioned signal frame of KN samples.
  • a module 82 manages the windows formed by the module 10 and saved by the module 66, so that the number M of samples saved is equal to an integer multiple of the tonal period expressed in samples. This avoids the problems of phase discontinuity between the frames.
  • the management module 82 controls the windowing module 10 so that the overlap between the current frame and the next one corresponds to N − M. This overlap of N − M samples will be used in the overlap-add carried out by the module 66 during the processing of the next frame. From the value of T provided by the harmonic analysis module 57, the module 82 calculates the number M of samples to be saved.
  • the tonal frequency is thus estimated as an average over the frame.
  • the tonal frequency may however vary somewhat over this period. It is possible to take these variations into account in the context of the present invention, by conditioning the signal so as to artificially obtain a constant tonal frequency in the frame. For this, the harmonic analysis module 57 must provide the time intervals between the consecutive breaks of the speech signal attributable to closures of the glottis of the speaker occurring during the frame. Methods usable for detecting such micro-breaks are well known in the field of harmonic analysis of speech signals.
  • the quantity w_m is the cumulative sum of the a posteriori likelihood ratio of two distributions, corrected by the Kullback divergence. For a distribution of residuals having Gaussian statistics, this value w_m can be expressed in closed form.
  • FIG. 10 thus shows a possible example of evolution of the value w, showing the breaks R of the speech signal.
  • FIG. 11 shows the means used to calculate the conditioning of the signal in the latter case.
  • the largest value T of the time intervals t supplied by the module 57 for a frame is selected by the module 70 (block 91 in FIG. 11) to obtain a pair of parameters as indicated in Table I.
  • the module 56 for protecting the harmonics of the tonal frequency operates in the same way as above, using for condition (9) the spectral resolution Δf provided by block 91 and the tonal frequency defined according to the value of the integer delay p supplied by block 91.
  • This embodiment of the invention also involves an adaptation of the window management module 82.
  • the number M of samples of the denoised signal to be saved on the current frame here corresponds to an integer number of consecutive time intervals t between two glottal breaks (see FIG. 10). This arrangement avoids the problems of phase discontinuity between frames, while taking into account the possible variations of the time intervals t on a frame.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
EP98943999A 1997-09-18 1998-09-16 Verfahren und vorrichtung zur rauschunterdrückung eines digitalen sprachsignals Expired - Lifetime EP1016072B1 (de)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
FR9711643 1997-09-18
FR9711643A FR2768547B1 (fr) 1997-09-18 1997-09-18 Procede de debruitage d'un signal de parole numerique
PCT/FR1998/001980 WO1999014738A1 (fr) 1997-09-18 1998-09-16 Procede de debruitage d'un signal de parole numerique

Publications (2)

Publication Number Publication Date
EP1016072A1 true EP1016072A1 (de) 2000-07-05
EP1016072B1 EP1016072B1 (de) 2002-01-16

Family

ID=9511230

Family Applications (1)

Application Number Title Priority Date Filing Date
EP98943999A Expired - Lifetime EP1016072B1 (de) 1997-09-18 1998-09-16 Verfahren und vorrichtung zur rauschunterdrückung eines digitalen sprachsignals

Country Status (7)

Country Link
US (1) US6477489B1 (de)
EP (1) EP1016072B1 (de)
AU (1) AU9168998A (de)
CA (1) CA2304571A1 (de)
DE (1) DE69803203T2 (de)
FR (1) FR2768547B1 (de)
WO (1) WO1999014738A1 (de)

Families Citing this family (44)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6510408B1 (en) * 1997-07-01 2003-01-21 Patran Aps Method of noise reduction in speech signals and an apparatus for performing the method
US6549586B2 (en) * 1999-04-12 2003-04-15 Telefonaktiebolaget L M Ericsson System and method for dual microphone signal noise reduction using spectral subtraction
US6717991B1 (en) 1998-05-27 2004-04-06 Telefonaktiebolaget Lm Ericsson (Publ) System and method for dual microphone signal noise reduction using spectral subtraction
FR2797343B1 (fr) * 1999-08-04 2001-10-05 Matra Nortel Communications Procede et dispositif de detection d'activite vocale
JP3454206B2 (ja) * 1999-11-10 2003-10-06 三菱電機株式会社 雑音抑圧装置及び雑音抑圧方法
US6804640B1 (en) * 2000-02-29 2004-10-12 Nuance Communications Signal noise reduction using magnitude-domain spectral subtraction
US6766292B1 (en) * 2000-03-28 2004-07-20 Tellabs Operations, Inc. Relative noise ratio weighting techniques for adaptive noise cancellation
JP2002221988A (ja) * 2001-01-25 2002-08-09 Toshiba Corp 音声信号の雑音抑圧方法と装置及び音声認識装置
US20020150264A1 (en) * 2001-04-11 2002-10-17 Silvia Allegro Method for eliminating spurious signal components in an input signal of an auditory system, application of the method, and a hearing aid
US6985709B2 (en) * 2001-06-22 2006-01-10 Intel Corporation Noise dependent filter
DE10150519B4 (de) * 2001-10-12 2014-01-09 Hewlett-Packard Development Co., L.P. Verfahren und Anordnung zur Sprachverarbeitung
US7103539B2 (en) * 2001-11-08 2006-09-05 Global Ip Sound Europe Ab Enhanced coded speech
US20040078199A1 (en) * 2002-08-20 2004-04-22 Hanoh Kremer Method for auditory based noise reduction and an apparatus for auditory based noise reduction
US7398204B2 (en) * 2002-08-27 2008-07-08 Her Majesty In Right Of Canada As Represented By The Minister Of Industry Bit rate reduction in audio encoders by exploiting inharmonicity effects and auditory temporal masking
WO2004036549A1 (en) * 2002-10-14 2004-04-29 Koninklijke Philips Electronics N.V. Signal filtering
KR101141247B1 (ko) * 2003-10-10 2012-05-04 에이전시 포 사이언스, 테크놀로지 앤드 리서치 디지털 신호를 확장성 비트스트림으로 인코딩하는 방법;확장성 비트스트림을 디코딩하는 방법
US7725314B2 (en) * 2004-02-16 2010-05-25 Microsoft Corporation Method and apparatus for constructing a speech filter using estimates of clean speech and noise
US7729908B2 (en) * 2005-03-04 2010-06-01 Panasonic Corporation Joint signal and model based noise matching noise robustness method for automatic speech recognition
US20060206320A1 (en) * 2005-03-14 2006-09-14 Li Qi P Apparatus and method for noise reduction and speech enhancement with microphones and loudspeakers
KR100927897B1 (ko) * 2005-09-02 2009-11-23 닛본 덴끼 가부시끼가이샤 잡음억제방법과 장치, 및 컴퓨터프로그램
US8126706B2 (en) * 2005-12-09 2012-02-28 Acoustic Technologies, Inc. Music detector for echo cancellation and noise reduction
JP4592623B2 (ja) * 2006-03-14 2010-12-01 富士通株式会社 通信システム
US8949120B1 (en) 2006-05-25 2015-02-03 Audience, Inc. Adaptive noise cancelation
JP4757158B2 (ja) * 2006-09-20 2011-08-24 富士通株式会社 音信号処理方法、音信号処理装置及びコンピュータプログラム
US20080162119A1 (en) * 2007-01-03 2008-07-03 Lenhardt Martin L Discourse Non-Speech Sound Identification and Elimination
ES2391228T3 (es) 2007-02-26 2012-11-22 Dolby Laboratories Licensing Corporation Realce de voz en audio de entretenimiento
US8560320B2 (en) * 2007-03-19 2013-10-15 Dolby Laboratories Licensing Corporation Speech enhancement employing a perceptual model
JP5302968B2 (ja) * 2007-09-12 2013-10-02 ドルビー ラボラトリーズ ライセンシング コーポレイション 音声明瞭化を伴うスピーチ改善
US8538763B2 (en) * 2007-09-12 2013-09-17 Dolby Laboratories Licensing Corporation Speech enhancement with noise level estimation adjustment
EP2192579A4 (de) * 2007-09-19 2016-06-08 Nec Corp Rauschunterdrückungsvorrichtung sowie entsprechendes verfahren und programm
JP5056654B2 (ja) * 2008-07-29 2012-10-24 株式会社Jvcケンウッド 雑音抑制装置、及び雑音抑制方法
US20110257978A1 (en) * 2009-10-23 2011-10-20 Brainlike, Inc. Time Series Filtering, Data Reduction and Voice Recognition in Communication Device
US9838784B2 (en) 2009-12-02 2017-12-05 Knowles Electronics, Llc Directional audio capture
US8423357B2 (en) * 2010-06-18 2013-04-16 Alon Konchitsky System and method for biometric acoustic noise reduction
US9640194B1 (en) 2012-10-04 2017-05-02 Knowles Electronics, Llc Noise suppression for speech processing based on machine-learning mask estimation
US9536540B2 (en) * 2013-07-19 2017-01-03 Knowles Electronics, Llc Speech signal separation and synthesis based on auditory scene analysis and speech modeling
CN103824562B (zh) * 2014-02-10 2016-08-17 太原理工大学 基于心理声学模型的语音后置感知滤波器
DE102014009689A1 (de) * 2014-06-30 2015-12-31 Airbus Operations Gmbh Intelligentes Soundsystem/-modul zur Kabinenkommunikation
DE112015003945T5 (de) 2014-08-28 2017-05-11 Knowles Electronics, Llc Mehrquellen-Rauschunterdrückung
CN107112025A (zh) 2014-09-12 2017-08-29 美商楼氏电子有限公司 用于恢复语音分量的系统和方法
CN105869652B (zh) * 2015-01-21 2020-02-18 北京大学深圳研究院 心理声学模型计算方法和装置
US9820042B1 (en) 2016-05-02 2017-11-14 Knowles Electronics, Llc Stereo separation and directional suppression with omni-directional microphones
EP3566229B1 (de) * 2017-01-23 2020-11-25 Huawei Technologies Co., Ltd. Vorrichtung und verfahren zur verbesserung einer erwünschten komponente in einem signal
US11017798B2 (en) * 2017-12-29 2021-05-25 Harman Becker Automotive Systems Gmbh Dynamic noise suppression and operations for noisy speech signals

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH03117919A (ja) * 1989-09-30 1991-05-20 Sony Corp ディジタル信号符号化装置
AU633673B2 (en) 1990-01-18 1993-02-04 Matsushita Electric Industrial Co., Ltd. Signal processing device
DE69124005T2 (de) 1990-05-28 1997-07-31 Matsushita Electric Ind Co Ltd Sprachsignalverarbeitungsvorrichtung
US5450522A (en) * 1991-08-19 1995-09-12 U S West Advanced Technologies, Inc. Auditory model for parametrization of speech
US5469087A (en) 1992-06-25 1995-11-21 Noise Cancellation Technologies, Inc. Control system using harmonic filters
US5400409A (en) * 1992-12-23 1995-03-21 Daimler-Benz Ag Noise-reduction method for noise-affected voice channels
AU676714B2 (en) * 1993-02-12 1997-03-20 British Telecommunications Public Limited Company Noise reduction
US5623577A (en) * 1993-07-16 1997-04-22 Dolby Laboratories Licensing Corporation Computationally efficient adaptive bit allocation for encoding method and apparatus with allowance for decoder spectral distortions
JP3131542B2 (ja) * 1993-11-25 2001-02-05 シャープ株式会社 符号化復号化装置
US5555190A (en) 1995-07-12 1996-09-10 Micro Motion, Inc. Method and apparatus for adaptive line enhancement in Coriolis mass flow meter measurement
FR2739736B1 (fr) * 1995-10-05 1997-12-05 Jean Laroche Procede de reduction des pre-echos ou post-echos affectant des enregistrements audio
FI100840B (fi) * 1995-12-12 1998-02-27 Nokia Mobile Phones Ltd Kohinanvaimennin ja menetelmä taustakohinan vaimentamiseksi kohinaises ta puheesta sekä matkaviestin
US6144937A (en) * 1997-07-23 2000-11-07 Texas Instruments Incorporated Noise suppression of speech by signal processing including applying a transform to time domain input sequences of digital signals representing audio information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO9914738A1 *

Also Published As

Publication number Publication date
AU9168998A (en) 1999-04-05
FR2768547B1 (fr) 1999-11-19
US6477489B1 (en) 2002-11-05
DE69803203D1 (de) 2002-02-21
FR2768547A1 (fr) 1999-03-19
WO1999014738A1 (fr) 1999-03-25
DE69803203T2 (de) 2002-08-29
CA2304571A1 (fr) 1999-03-25
EP1016072B1 (de) 2002-01-16

Similar Documents

Publication Publication Date Title
EP1016072B1 (de) Verfahren und vorrichtung zur rauschunterdrückung eines digitalen sprachsignals
EP1789956B1 (de) Verfahren zum verarbeiten eines rauschbehafteten tonsignals und einrichtung zur implementierung des verfahrens
EP2002428B1 (de) Verfahren zur trainierten diskrimination und dämpfung von echos eines digitalsignals in einem decoder und entsprechende einrichtung
EP1830349B1 (de) Verfahren zur Geräuschdämpfung eines Audiosignals
CA2436318C (fr) Procede et dispositif de reduction de bruit
EP1016071B1 (de) Verfahren und vorrichtung zur sprachdetektion
FR2907586A1 (fr) Synthese de blocs perdus d'un signal audionumerique,avec correction de periode de pitch.
JP3960834B2 (ja) 音声強調装置及び音声強調方法
EP1016073B1 (de) Verfahren und vorrichtung zur rauschunterdrückung eines digitalen sprachsignals
EP0490740A1 (de) Verfahren und Einrichtung zum Bestimmen der Sprachgrundfrequenz in Vocodern mit sehr niedriger Datenrate
EP1021805B1 (de) Verfahren und vorrichtung zur verbesserung eines digitalen sprachsignals
EP3192073B1 (de) Unterscheidung und dämpfung von vorechos in einem digitalen audiosignal
EP2515300B1 (de) Verfahren und System für die Geräuschunterdrückung
FR2888704A1 (de)
EP4287648A1 (de) Elektronische vorrichtung und verarbeitungsverfahren, akustische vorrichtung und computerprogramm dafür
WO2006117453A1 (fr) Procede d’attenuation des pre- et post-echos d’un signal numerique audio et dispositif correspondant
FR2799601A1 (fr) Dispositif et procede d'annulation de bruit

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20000316

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): DE FR GB

17Q First examination report despatched

Effective date: 20001004

RIC1 Information provided on ipc code assigned before grant

Free format text: 7G 10L 21/02 A

RTI1 Title (correction)

Free format text: METHOD AND APPARATUS FOR SUPPRESSING NOISE IN A DIGITAL SPEECH SIGNAL

RIC1 Information provided on ipc code assigned before grant

Free format text: 7G 10L 21/02 A

RTI1 Title (correction)

Free format text: METHOD AND APPARATUS FOR SUPPRESSING NOISE IN A DIGITAL SPEECH SIGNAL

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

REG Reference to a national code

Ref country code: GB

Ref legal event code: IF02

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): DE FR GB

REF Corresponds to:

Ref document number: 69803203

Country of ref document: DE

Date of ref document: 20020221

GBT Gb: translation of ep patent filed (gb section 77(6)(a)/1977)

Effective date: 20020407

RAP2 Party data changed (patent owner data changed or rights of a patent transferred)

Owner name: NORTEL NETWORKS FRANCE

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed
REG Reference to a national code

Ref country code: FR

Ref legal event code: CD

Ref country code: FR

Ref legal event code: CA

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20031127

Year of fee payment: 6

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20050401

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20050817

Year of fee payment: 8

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: FR

Payment date: 20050902

Year of fee payment: 8

GBPC Gb: european patent ceased through non-payment of renewal fee

Effective date: 20060916

REG Reference to a national code

Ref country code: FR

Ref legal event code: ST

Effective date: 20070531

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20060916

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: FR

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20061002