EP1016072A1 - Method for denoising a digital speech signal - Google Patents
- Publication number
- EP1016072A1
- Authority
- EP
- European Patent Office
- Prior art keywords
- signal
- speech signal
- frame
- noise
- spectral
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L21/0232—Processing in the frequency domain
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0264—Noise filtering characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
Definitions
- The present invention relates to digital techniques for denoising speech signals. It relates more particularly to noise reduction by nonlinear spectral subtraction.
- This technique makes it possible to obtain acceptable denoising for strongly voiced signals, but totally distorts the speech signal. Faced with relatively coherent noise, such as that caused by the contact of car tires with the road or the rattling of an engine, the noise can be more easily predictable than the unvoiced speech signal. There is then a tendency to project the speech signal into part of the vector space of the noise. The method does not take the speech signal into account, especially in unvoiced speech segments where predictability is reduced. In addition, predicting the speech signal from a reduced set of parameters does not capture all the intrinsic richness of speech. This shows the limits of techniques based solely on mathematical considerations that overlook the particular character of speech. Finally, other techniques are based on coherence criteria.
- The coherence function is particularly well developed by J.A. Cadzow and O.M. Solomon ("Linear modeling and the coherence function", IEEE Trans. on Acoustics, Speech and Signal Processing, Vol. ASSP-35, No. 1, January 1987, pages 19-28), and its application to denoising was studied by R. Le Bouquin ("Enhancement of noisy speech signals: application to mobile radio communications", Speech Communication, Vol. 18, pages 3-19). This method is based on the fact that the speech signal has a significantly greater coherence than noise, provided that several independent channels are used. The results seem quite encouraging. Unfortunately, this technique requires several sound sources, which are not always available.
- A main object of the present invention is to propose a new denoising technique which takes into account the characteristics of speech perception by the human ear, thus allowing effective denoising without deteriorating speech perception.
- The invention thus proposes a method for denoising a digital speech signal processed by successive frames, in which:
- spectral components of the speech signal are calculated on each frame;
- a spectral subtraction is carried out, comprising at least a first subtraction step in which a first quantity, depending on parameters including the overestimate of the corresponding spectral component of the noise for said frame, is subtracted from each spectral component of the speech signal on the frame, so as to obtain spectral components of a first denoised signal; and
- a transformation to the time domain is applied to the result of the spectral subtraction to construct a denoised speech signal.
- The spectral subtraction also comprises the following steps:
- a second subtraction step in which a second quantity, depending on parameters including a difference between the overestimate of the corresponding spectral component of the noise and the calculated masking curve, is subtracted from each spectral component of the speech signal on the frame.
- The second subtracted quantity can in particular be limited to the fraction of the overestimate of the corresponding spectral component of the noise which exceeds the masking curve. This procedure is based on the observation that it is sufficient to denoise the audible noise frequencies. Conversely, there is no point in eliminating noise which is masked by speech. Overestimating the spectral envelope of the noise is generally desirable so that the overestimate thus obtained is robust to sudden variations in the noise. However, this overestimation usually has the disadvantage of distorting the speech signal when it becomes too large. It affects the voiced character of the speech signal by suppressing part of its predictability. This drawback is very troublesome in telephony conditions, because it is during voiced segments that the speech signal is most energetic. By limiting the quantity subtracted when all or part of a frequency component of the overestimated noise turns out to be masked by speech, the invention greatly reduces this drawback.
- FIG. 1 is a block diagram of a denoising system implementing the present invention;
- FIGS. 2 and 3 are flowcharts of procedures used by a voice activity detector of the system of FIG. 1;
- FIG. 4 is a diagram representing the states of a voice activity detection automaton;
- FIG. 5 is a graph illustrating the variations of a degree of vocal activity;
- FIG. 6 is a block diagram of a noise overestimation module of the system of FIG. 1;
- FIG. 7 is a graph illustrating the calculation of a masking curve;
- FIG. 8 is a graph illustrating the use of the masking curves in the system of FIG. 1;
- FIG. 9 is a block diagram of another denoising system implementing the present invention;
- FIG. 10 is a graph illustrating a harmonic analysis method usable in a method according to the invention; and
- FIG. 11 partially shows a variant of the block diagram of FIG. 9.
- The denoising system shown in FIG. 1 processes a digital speech signal s.
- A windowing module 10 puts this signal s in the form of successive windows or frames, each consisting of a number N of digital signal samples. Conventionally, these frames can overlap one another.
- The signal frame is transformed into the frequency domain by a module 11 applying a conventional fast Fourier transform (FFT) algorithm to calculate the modulus of the signal spectrum.
- The frequency resolution available at the output of the fast Fourier transform is not used; a lower resolution is used instead, determined by a number I of frequency bands covering the band $[0, F_e/2]$ of the signal, where $F_e$ is the sampling frequency.
- A module 12 calculates the respective averages $S_{n,i}$ of the spectral components $S_{n,f}$ of the speech signal in the bands, for example by a uniform weighting.
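The band averaging performed by module 12 can be sketched as follows. This is a minimal illustration, assuming uniform weighting inside each band and band limits given as FFT bin indices; the function name `band_averages` and the toy values are hypothetical and do not come from the patent.

```python
def band_averages(spectrum, edges):
    """Average spectral moduli over the bands [edges[i-1], edges[i]).

    spectrum -- moduli of the FFT bins for one frame
    edges    -- increasing bin indices delimiting the I bands (assumed layout)
    """
    averages = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        if hi <= lo:
            raise ValueError("band edges must be strictly increasing")
        band = spectrum[lo:hi]
        averages.append(sum(band) / (hi - lo))  # uniform weighting
    return averages


# Toy spectrum with 8 bins split into I = 3 non-uniform bands.
spectrum = [4.0, 2.0, 1.0, 3.0, 5.0, 0.0, 2.0, 2.0]
edges = [0, 2, 4, 8]
print(band_averages(spectrum, edges))  # [3.0, 2.0, 2.25]
```

Working at band resolution rather than bin resolution reduces the number of values the later noise estimation stages have to track.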
- The averaged spectral components $S_{n,i}$ are sent to a voice activity detection module 15 and to a noise estimation module 16. These two modules 15, 16 work jointly, in the sense that the degrees of vocal activity $\gamma_{n,i}$ measured for the different bands by module 15 are used by module 16 to estimate the long-term energy of the noise in the different bands, while these long-term estimates $\hat{B}_{n,i}$ are used by module 15 to carry out a priori denoising of the speech signal.
- The operation of modules 15 and 16 can correspond to the flowcharts shown in FIGS. 2 and 3.
- The module 15 proceeds to a priori denoising of the speech signal in the different bands i for the signal frame n. This a priori denoising is carried out according to a spectral subtraction process (steps 17 to 20).
- In step 17, the module 15 calculates, with the resolution of the bands i, the frequency response of the a priori denoising filter.
- In this frequency response, $\tau_1$ and $\tau_2$ are delays expressed in number of frames ($\tau_1 \ge 1$, $\tau_2 \ge 0$), and $\alpha'_{n,i}$ is a noise overestimation coefficient, the determination of which is explained below.
- The a priori denoised spectral components $\hat{E}p_{n,i}$ are obtained as a maximum in which $\beta p_i$ is a floor coefficient close to 0, conventionally used to prevent the spectrum of the denoised signal from taking negative or excessively low values, which would cause musical noise.
- Steps 17 to 20 therefore essentially consist in subtracting from the spectrum of the signal an estimate of the noise spectrum estimated a priori, increased by the coefficient $\alpha'_{n-\tau_1,i}$.
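A minimal sketch of spectral subtraction with a floor, in the spirit of steps 17 to 20, is given here. The exact formulas are not reproduced in this text, so the form used (subtract the overestimated noise of the previous frame, then clamp to a floor proportional to the noise estimate) is an assumption, as are the function name `a_priori_denoise`, the delays $\tau_1 = 1$ and $\tau_2 = 0$, and the numerical values.

```python
def a_priori_denoise(s, noise_prev, alpha_prev, beta_p=0.01):
    """Band-wise spectral subtraction with a spectral floor (assumed form).

    s          -- averaged spectral components of the current frame
    noise_prev -- long-term noise estimates from the previous frame
    alpha_prev -- overestimation coefficients from the previous frame
    beta_p     -- floor coefficient close to 0 (value chosen for illustration)
    """
    out = []
    for s_i, b_i, a_i in zip(s, noise_prev, alpha_prev):
        subtracted = s_i - a_i * b_i               # remove the overestimated noise
        out.append(max(subtracted, beta_p * b_i))  # floor avoids negative values
    return out


print(a_priori_denoise([10.0, 2.0], [2.0, 2.0], [1.5, 1.5], beta_p=0.1))
# [7.0, 0.2]
```

In the second band the raw subtraction would be negative, so the floor takes over; this is the mechanism the text credits with limiting musical noise.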
- The module 15 calculates, for each band i ($0 \le i \le I$), a quantity $\Delta E_{n,i}$ representing the short-term variation of the energy of the denoised signal in band i, as well as a long-term value $\bar{E}_{n,i}$ of the energy of the denoised signal in band i.
- The quantity $\Delta E_{n,i}$ can be calculated by a simplified formula.
- In step 25, the quantity $\Delta E_{n,i}$ is compared with a threshold $\varepsilon_1$. If the threshold $\varepsilon_1$ is not reached, the counter $b_i$ is incremented by one unit in step 26.
- In step 27, the long-term estimator $ba_i$ is compared with the smoothed energy value $\bar{E}_{n,i}$. If $ba_i \ge \bar{E}_{n,i}$, the estimator $ba_i$ is set equal to the smoothed value $\bar{E}_{n,i}$ in step 28, and the counter $b_i$ is reset to zero.
- The quantity $\rho_i$, which is taken equal to the ratio $ba_i / \bar{E}_{n,i}$ (step 36), is then equal to 1.
- If step 27 shows that $ba_i < \bar{E}_{n,i}$,
- the counter $b_i$ is compared with a limit value bmax in step 29. If $b_i$ > bmax, the signal is considered too stationary to contain vocal activity.
- Where appropriate, the long-term estimator $ba_i$ is updated with the value of the internal estimator $bi_i$ in step 35. Otherwise, the long-term estimator $ba_i$ remains unchanged. This prevents sudden variations due to a speech signal from causing an update of the noise estimator.
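The per-band logic of steps 25 to 36 can be sketched as below. Several details are not given in this text and are assumptions made for illustration: the form of the internal estimator (an exponential smoothing with coefficient `bm`), the rule gating the update of step 35 (a threshold `eps2` on the size of the correction), the handling of the "too stationary" case as a noise-only frame, and all numerical values. The function name `update_band` is hypothetical.

```python
def update_band(e_bar, delta_e, state, eps1=0.1, eps2=0.5, b_max=50, bm=0.95):
    """One update of the long-term noise envelope estimator for a single band.

    e_bar   -- smoothed (long-term) energy of the denoised signal, must be > 0
    delta_e -- short-term variation of that energy
    state   -- dict with 'ba' (long-term estimator) and 'b' (frame counter)
    Returns rho = ba / e_bar after the update (step 36).
    """
    if delta_e < eps1:                 # step 25: energy is barely changing
        state["b"] += 1                # step 26
    if state["ba"] >= e_bar or state["b"] > b_max:
        # Steps 27 to 29: the frame is treated as containing only noise.
        state["ba"] = e_bar            # step 28
        state["b"] = 0
    else:
        # Assumed internal estimator: slow smoothing towards e_bar.
        bi = (1.0 - bm) * e_bar + bm * state["ba"]
        # Assumed gating rule: accept only small corrections (step 35).
        if abs(bi - state["ba"]) < eps2:
            state["ba"] = bi
    return state["ba"] / e_bar


state = {"ba": 5.0, "b": 0}
print(update_band(4.0, 1.0, state))    # 1.0: the estimator follows the energy down
print(update_band(100.0, 1.0, state))  # small ratio: a sudden rise is not tracked
```

A ratio close to 1 indicates that the band energy is at the noise floor, while a small ratio indicates energy well above it, which is what the voice activity decisions then exploit.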
- After having obtained the quantities $\rho_i$, the module 15 proceeds to the voice activity decisions in step 37.
- The module 15 first updates the state of the detection automaton according to the quantity $\rho_0$ calculated for the entire signal band.
- The new state $\delta_n$ of the automaton depends on the previous state $\delta_{n-1}$ and on $\rho_0$, as shown in FIG. 4.
- The module 15 also calculates the degrees of vocal activity $\gamma_{n,i}$ in each band $i \ge 1$.
- This degree $\gamma_{n,i}$ is preferably a non-binary parameter, that is to say that the function giving $\gamma_{n,i}$ varies continuously between 0 and 1 according to the values taken by the quantity $\rho_i$. This function has for example the shape shown in FIG. 5.
- The module 16 calculates the noise estimates per band, which will be used in the denoising process, using the successive values of the components $S_{n,i}$ and of the degrees of vocal activity $\gamma_{n,i}$.
- In step 42, the module 16 updates the noise estimates per band according to the formulas:
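The update formulas themselves are not reproduced in this text. The sketch below shows one plausible way a non-binary degree of vocal activity can steer a recursive noise estimate; the two-stage form, the forgetting factor `lambda_b` and the function name `update_noise` are assumptions made for illustration only.

```python
def update_noise(noise_prev, s, gamma, lambda_b=0.9):
    """Update per-band long-term noise estimates (assumed recursive form).

    noise_prev -- previous noise estimates, one per band
    s          -- averaged spectral components of the current frame
    gamma      -- degrees of vocal activity in [0, 1], one per band
    lambda_b   -- forgetting factor in (0, 1), value chosen for illustration
    """
    out = []
    for b_prev, s_i, g_i in zip(noise_prev, s, gamma):
        # Candidate estimate obtained as if the band contained only noise.
        candidate = lambda_b * b_prev + (1.0 - lambda_b) * s_i
        # The more speech is detected, the more the previous estimate is kept.
        out.append(g_i * b_prev + (1.0 - g_i) * candidate)
    return out


print(update_noise([2.0, 2.0], [4.0, 4.0], [1.0, 0.0]))
```

With a degree of 1 the estimate is frozen, with a degree of 0 it is fully updated, and intermediate degrees blend the two, which is the benefit of a non-binary parameter.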
- The long-term noise estimates $\hat{B}_{n,i}$ are overestimated by a module 45 (FIG. 1) before proceeding to denoising by nonlinear spectral subtraction.
- Module 45 calculates the overestimation coefficient $\alpha'_{n,i}$ previously mentioned.
- This combination is essentially a simple sum performed by an adder 46. It could also be a weighted sum.
- The measure $\Delta B^{max}_{n,i}$ of the noise variability reflects the variance of the noise estimator. It is obtained as a function of the values of $S_{n,i}$ and of $\hat{B}_{n,i}$ calculated for a certain number of previous frames on which the speech signal presents no vocal activity in band i. It is a function of the differences $S_{n-k,i} - \hat{B}_{n-k,i}$ calculated for a number K of silence frames ($n-k \le n$). In the example shown, this function is simply the maximum (block 50). For each frame n, the degree of vocal activity $\gamma_{n,i}$ is compared to a threshold (block 51).
- The measure of variability $\Delta B^{max}_{n,i}$ can, as a variant, be obtained as a function of the values $S_{n,f}$ (and not $S_{n,i}$) and $\hat{B}_{n,i}$. One then proceeds in the same way, except for the contents of the FIFO.
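The overestimation described above (maximum of recent silence-frame differences, added to the long-term estimate) can be sketched for one band as follows. The class name `NoiseOverestimator`, the use of the absolute difference, the rule "load the FIFO only when the degree of vocal activity does not exceed the threshold", and the definition of the coefficient as the ratio of the overestimate to the current estimate are all assumptions made for illustration.

```python
from collections import deque


class NoiseOverestimator:
    """Per-band noise overestimate from a FIFO of silence-frame deviations."""

    def __init__(self, k=8, threshold=0.0):
        self.fifo = deque(maxlen=k)   # keeps the K most recent deviations
        self.threshold = threshold    # vocal activity threshold (assumed)

    def update(self, s, noise, gamma):
        """Return (overestimate, alpha) for one frame of one band."""
        if gamma <= self.threshold:   # frame regarded as silence in this band
            self.fifo.append(abs(s - noise))
        variability = max(self.fifo) if self.fifo else 0.0
        overestimate = noise + variability   # simple sum, as with an adder
        alpha = overestimate / noise if noise > 0 else 1.0
        return overestimate, alpha


est = NoiseOverestimator(k=2)
print(est.update(3.0, 2.0, gamma=0.0))   # (3.0, 1.5)
print(est.update(10.0, 2.0, gamma=1.0))  # speech frame: FIFO unchanged, (3.0, 1.5)
```

Using `deque(maxlen=k)` discards the oldest deviation automatically, so the measure follows the noise variability seen over the last K silence frames only.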
- A first phase of the spectral subtraction is carried out by the module 55 shown in FIG. 1. This phase provides, with the resolution of the bands i, the frequency response of a first denoising filter.
- The coefficient $\beta^1_i$ represents, like the coefficient $\beta p_i$ of formula (3), a floor conventionally used to avoid negative or excessively low values of the denoised signal.
- The overestimation coefficient $\alpha'_{n,i}$ could be replaced in formula (7) by another coefficient equal to a function of $\alpha'_{n,i}$ and of an estimate of the signal-to-noise ratio,
- this function decreasing with the estimated signal-to-noise ratio.
- This function is then equal to $\alpha'_{n,i}$ for the lowest values of the signal-to-noise ratio. Indeed, when the signal is very noisy, it is a priori not useful to reduce the overestimation factor.
- This function decreases towards zero for the highest values of the signal-to-noise ratio. This makes it possible to protect the most energetic areas of the spectrum, where the speech signal is the most significant, the quantity subtracted from the signal then tending towards zero.
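One possible shape for such a function is sketched below: it returns the overestimation coefficient unchanged at low signal-to-noise ratio and decreases linearly to zero at high ratio. The piecewise-linear shape, the breakpoints `snr_low` and `snr_high`, and the function name `snr_dependent_alpha` are assumptions; the text only constrains the two extremes and the decreasing behaviour.

```python
def snr_dependent_alpha(alpha, snr, snr_low=1.0, snr_high=10.0):
    """Overestimation coefficient reduced as the estimated SNR grows.

    alpha    -- overestimation coefficient for the band
    snr      -- estimated signal-to-noise ratio (linear scale, assumed)
    snr_low  -- below this ratio the coefficient is left unchanged (assumed)
    snr_high -- above this ratio nothing extra is subtracted (assumed)
    """
    if snr <= snr_low:
        return alpha                      # very noisy signal: keep full value
    if snr >= snr_high:
        return 0.0                        # energetic speech: protect it
    return alpha * (snr_high - snr) / (snr_high - snr_low)


for snr in (0.5, 5.5, 20.0):
    print(snr, snr_dependent_alpha(2.0, snr))
```

Any smooth decreasing function with the same limits would serve the same purpose; the linear ramp is only the simplest choice to write down.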
- A second denoising phase is carried out by a module 56 for protecting harmonics. This module calculates, with the resolution of the Fourier transform, the frequency response of a second denoising filter.
- The harmonic analysis module 57 can apply any known method of analysis of the speech signal of the frame to determine the pitch period $T_p$, expressed as an integer or fractional number of samples, for example a linear prediction method.
- The protection provided by the module 56 may consist in a correction carried out for each frequency f belonging to a band i.
- This protection strategy is preferably applied for each of the frequencies closest to the harmonics of $f_p$, that is to say for any integer $\eta$.
- Let $\delta f_p$ denote the frequency resolution with which the analysis module 57 produces the estimated pitch frequency $f_p$, that is to say that the real pitch frequency lies between $f_p - \delta f_p/2$ and $f_p + \delta f_p/2$.
- The difference between the $\eta$-th harmonic of the real pitch frequency and its estimate $\eta \times f_p$ (condition (9)) can then go up to $\pm \eta \times \delta f_p/2$.
- For high values of $\eta$, this difference can be greater than the spectral half-resolution $\Delta f/2$ of the Fourier transform.
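The harmonic proximity test can be sketched as follows. Only the condition "the bin lies within half a spectral resolution of a harmonic of the estimated pitch frequency" is modelled; any other condition of the original method is not reproduced in this text and is left out. Setting the protected response to 1 (no subtraction at that bin) is an assumption, as are the function name `protect_harmonics` and the layout of bins at frequencies `k * delta_f`.

```python
def protect_harmonics(h1, f_p, delta_f, protected_gain=1.0):
    """Raise the filter response at FFT bins close to pitch harmonics.

    h1             -- first-stage response, one value per FFT bin k
    f_p            -- estimated pitch frequency in Hz
    delta_f        -- spectral resolution of the Fourier transform in Hz
    protected_gain -- response applied at protected bins (assumed to be 1)
    """
    h2 = []
    for k, h in enumerate(h1):
        f = k * delta_f
        eta = round(f / f_p)                      # index of the nearest harmonic
        near = eta >= 1 and abs(f - eta * f_p) <= delta_f / 2
        h2.append(protected_gain if near else h)
    return h2


# 8 kHz sampling and N = 256 give delta_f = 31.25 Hz; pitch assumed at 100 Hz.
print(protect_harmonics([0.2] * 8, f_p=100.0, delta_f=31.25))
# [0.2, 0.2, 0.2, 1.0, 0.2, 0.2, 1.0, 0.2]
```

Bins 3 and 6 (93.75 Hz and 187.5 Hz) are the ones closest to the first two harmonics. Because the uncertainty on the harmonic position grows with $\eta$, a fixed tolerance of $\Delta f/2$ becomes less reliable at high harmonics, which is the limitation the text points out.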
- The spectral components $S^2_{n,f}$ of a denoised signal are calculated by a multiplier 58.
- This signal $S^2_{n,f}$ is supplied to a module 60 which calculates, for each frame n, a masking curve by applying a psychoacoustic model of auditory perception by the human ear.
- The masking phenomenon is a known principle of the functioning of the human ear. When two frequencies are heard simultaneously, one of them may no longer be audible. It is then said to be masked.
- The masking threshold is $M_{n,q} = C_{n,q} / R_q$ (12), where $R_q$ depends on the more or less voiced character of the signal.
- $\chi$ denotes a degree of voicing of the speech signal, varying between zero (no voicing) and 1 (strongly voiced signal).
- The parameter $\chi$ can be of a known form.
- The denoising system also includes a module 62 which corrects the frequency response of the denoising filter, as a function of the masking curve $M_{n,q}$ calculated by the module 60 and of the overestimates $\hat{B}'_{n,i}$ calculated by the module 45.
- The module 62 decides the level of noise reduction which must really be reached. By comparing the envelope of the noise overestimate with the envelope formed by the masking thresholds $M_{n,q}$, it is decided to denoise the signal only insofar as the overestimate exceeds the masking curve.
- The new response $H^3_{n,f}$, for a frequency f belonging to the band i defined by the module 12 and to the Bark band q, thus depends on the relative difference between the overestimate $\hat{B}'_{n,i}$ of the corresponding spectral component of the noise and the masking curve $M_{n,q}$.
- In other words, the quantity subtracted from a spectral component $S_{n,f}$ in the spectral subtraction process having the frequency response $H^3_{n,f}$ is substantially equal to the minimum between, on the one hand, the quantity subtracted from this spectral component in the spectral subtraction process having the frequency response $H^2_{n,f}$ and, on the other hand, the fraction of the noise overestimate which exceeds the masking curve.
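The "minimum of two quantities" rule stated above can be sketched directly for one spectral component. The sketch works on the subtracted quantities rather than on a closed-form expression of the corrected response, which is not reproduced in this text; the function name `masked_response` and the handling of a zero component are assumptions.

```python
def masked_response(s, h2, noise_over, mask):
    """Corrected filter response that only removes audible noise.

    s          -- spectral component of the noisy signal (modulus)
    h2         -- response of the preceding denoising stage, in [0, 1]
    noise_over -- overestimate of the noise component
    mask       -- masking threshold at this frequency
    """
    if s <= 0.0:
        return 1.0                                 # nothing to subtract
    subtracted_stage2 = (1.0 - h2) * s             # what the earlier stage removes
    audible_noise = max(noise_over - mask, 0.0)    # part of the noise above the mask
    subtracted = min(subtracted_stage2, audible_noise)
    return 1.0 - subtracted / s


print(masked_response(10.0, 0.6, noise_over=5.0, mask=3.0))  # partly masked noise
print(masked_response(10.0, 0.6, noise_over=5.0, mask=6.0))  # fully masked: 1.0
```

When the masking threshold lies above the noise overestimate, the component is left untouched, which avoids distorting speech to remove noise that would not be heard anyway.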
- FIG. 8 illustrates the principle of the correction applied by the module 62. It schematically shows an example of masking curve $M_{n,q}$ calculated on the basis of the spectral components $S^2_{n,f}$ of the denoised signal.
- A module 65 reconstructs the denoised signal in the time domain by applying the inverse fast Fourier transform (IFFT) to the frequency samples $S^3_{n,f}$ delivered by a multiplier.
- FIG. 9 shows a preferred embodiment of a denoising system implementing the invention.
- This system includes a number of elements similar to corresponding elements of the system of FIG. 1, for which the same reference numerals are used.
- Thus, the modules 10, 11, 12, 15, 16, 45 and 55 provide the same quantities as in the system of FIG. 1.
- The frequency resolution of the fast Fourier transform 11 is a limitation of the system of FIG. 1.
- The frequency subject to protection by the module 56 is not necessarily the precise pitch frequency $f_p$, but the frequency closest to it in the discrete spectrum. In some cases, harmonics relatively far from those of the pitch frequency may then be protected.
- The system of FIG. 9 overcomes this drawback by appropriately conditioning the speech signal.
- In this conditioning, the sampling frequency of the signal is modified so that the period $1/f_p$ covers exactly an integer number of sample times of the conditioned signal.
- Many harmonic analysis methods that can be implemented by the module 57 are capable of providing a fractional value of the delay $T_p$, expressed in number of samples at the initial sampling frequency $F_e$.
- A new sampling frequency $f_e$ is then chosen so that it is equal to an integer multiple of the estimated pitch frequency, i.e. $f_e = p \cdot f_p$ with p an integer.
- $f_e$ should be greater than $F_e$.
- In particular, $f_e$ can be required to lie between $F_e$ and $2F_e$ ($1 \le K \le 2$, where $K = f_e/F_e$), to facilitate the implementation of the conditioning.
- N is usually a power of 2 for the implementation of the FFT. It is 256 in the example considered.
- This choice is made by a module 70 according to the value of the delay $T_p$ supplied by the harmonic analysis module 57.
- The module 70 provides the ratio K between the sampling frequencies to three frequency change modules 71, 72, 73.
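The choice of the new sampling frequency can be sketched as follows. Since $f_p = F_e / T_p$, choosing $f_e = p \cdot f_p$ gives $K = f_e / F_e = p / T_p$. The selection rule used here (take p as the smallest power of two not below $T_p$, so that K stays in the range from 1 to 2 and p divides a power-of-two frame length N) is an assumption made for illustration, and the function name `choose_resampling` is hypothetical.

```python
def choose_resampling(t_p):
    """Pick an integer p and the ratio K = p / t_p with 1 <= K < 2.

    t_p -- pitch period in samples at the original sampling frequency,
           possibly fractional
    """
    if t_p < 1.0:
        raise ValueError("pitch period must be at least one sample")
    p = 1
    while p < t_p:          # smallest power of two not below t_p (assumed rule)
        p *= 2
    return p, p / t_p


p, k = choose_resampling(10.0)
print(p, k)  # 16 1.6
# After resampling by K, one pitch period spans exactly p samples.
```

With p samples per pitch period and a frame length that is a multiple of p, the harmonics of the pitch frequency fall exactly on bins of the discrete spectrum.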
- The module 71 is used to transform the values $S_{n,i}$, $\hat{B}_{n,i}$, $\alpha'_{n,i}$, $\hat{B}'_{n,i}$ and $H^1_{n,f}$, relating to the bands i defined by the module 12, into the modified frequency scale (sampling frequency $f_e$). This transformation consists simply in dilating the bands i by the factor K. The values thus transformed are supplied to the module 56 for protecting harmonics.
- The module 72 oversamples the frame of N samples provided by the windowing module 10.
- Oversampling by a rational factor K (K = K1/K2) consists in first performing an oversampling by the integer factor K1, then a subsampling by the integer factor K2.
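The two-step rational resampling can be sketched as follows. Linear interpolation stands in for the interpolation filter, which is a simplification: a practical implementation would normally use a proper low-pass filter. The function name `resample_rational` is hypothetical.

```python
def resample_rational(x, k1, k2):
    """Resample by K = k1 / k2: oversample by k1, then keep one sample in k2.

    x  -- list of input samples
    k1 -- integer oversampling factor
    k2 -- integer subsampling factor
    """
    if len(x) < 2:
        return list(x)
    up = []
    for a, b in zip(x[:-1], x[1:]):
        for j in range(k1):
            up.append(a + (b - a) * j / k1)   # linear interpolation (simplified)
    up.append(float(x[-1]))
    return up[::k2]                            # subsampling by k2


# K = 3/2: a ramp sampled every 2 units becomes a ramp sampled every 4/3 units.
print(resample_rational([0, 2, 4, 6], 3, 2))
```

Performing the integer oversampling first means the subsequent subsampling only ever picks values that already exist on the finer grid.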
- The conditioned signal frame supplied by the module 72 includes KN samples at the frequency $f_e$. These samples are sent to a module 75 which calculates their Fourier transform.
- The two blocks therefore have an overlap of (2-K)×100%.
- For each of the two blocks, a set of Fourier components $S_{n,f}$ is obtained. These components $S_{n,f}$ are supplied to the multiplier 58, which multiplies them by the spectral response of the denoising filter.
- The autocorrelations A(k) are calculated by a module 76, for example according to the formula:
- A module 77 then calculates the normalized entropy H.
- The normalized entropy H constitutes a measure of voicing that is very robust to noise and to variations in the pitch frequency.
- The correction module 62 operates in the same way as that of the system of FIG. 1, taking into account the overestimated noise $\hat{B}'_{n,i}$ rescaled by the frequency change module 71. It provides the frequency response $H^3_{n,f}$ of the final denoising filter, which is multiplied by the spectral components $S_{n,f}$ of the conditioned signal by a multiplier.
- Following the IFFT 65, a module 80 combines, for each frame, the two signal blocks resulting from the processing of the two overlapping blocks delivered by the FFT 75. This combination can consist of a Hamming-weighted sum of the samples, to form a denoised conditioned signal frame of KN samples.
- A module 82 manages the windows formed by the module 10 and saved by the module 66, so that the number M of samples saved is equal to an integer multiple of the pitch period. This avoids problems of phase discontinuity between frames.
- The management module 82 controls the windowing module 10 so that the overlap between the current frame and the next one corresponds to $N - M$ samples. This overlap of $N - M$ samples is needed in the overlap-add sum carried out by the module 66 during the processing of the next frame. From the value of $T_p$ provided by the harmonic analysis module 57, the module 82 calculates the number of samples to be saved.
- The pitch frequency is estimated as an average over the frame.
- The pitch frequency may vary somewhat over this period. These variations can be taken into account in the context of the present invention by conditioning the signal so as to artificially obtain a constant pitch frequency in the frame. For this, the harmonic analysis module 57 must provide the time intervals between consecutive breaks in the speech signal attributable to closures of the speaker's glottis occurring during the frame. Methods usable for detecting such micro-breaks are well known in the field of harmonic analysis of speech signals.
- $w_m$ is the cumulative sum of the posterior likelihood ratio of two distributions, corrected by the Kullback divergence. For a distribution of residuals having Gaussian statistics, this value $w_m$ is given by:
- FIG. 10 thus shows a possible example of the evolution of the value $w_m$, showing the breaks R of the speech signal.
- FIG. 11 shows the means used to calculate the conditioning of the signal in the latter case.
- The largest $T_p$ of the time intervals t supplied by the module 57 for a frame is selected by the module 70 (block 91 in FIG. 11) to obtain a pair p, $\alpha$ as indicated in Table I.
- The module 56 for protecting the harmonics of the pitch frequency operates in the same way as above, using for condition (9) the spectral resolution $\Delta f$ provided by block 91 and the pitch frequency defined according to the value of the integer delay p supplied by block 91.
- This embodiment of the invention also involves an adaptation of the window management module 82.
- The number M of samples of the denoised signal to be saved on the current frame here corresponds to an integer number of consecutive time intervals t between two glottal breaks (see FIG. 10). This arrangement avoids problems of phase discontinuity between frames, while taking into account the possible variations of the time intervals t within a frame.
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR9711643 | 1997-09-18 | ||
FR9711643A FR2768547B1 (fr) | 1997-09-18 | 1997-09-18 | Procede de debruitage d'un signal de parole numerique |
PCT/FR1998/001980 WO1999014738A1 (fr) | 1997-09-18 | 1998-09-16 | Procede de debruitage d'un signal de parole numerique |
Publications (2)
Publication Number | Publication Date |
---|---|
EP1016072A1 true EP1016072A1 (fr) | 2000-07-05 |
EP1016072B1 EP1016072B1 (fr) | 2002-01-16 |
Family
ID=9511230
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP98943999A Expired - Lifetime EP1016072B1 (fr) | 1997-09-18 | 1998-09-16 | Procede et dispositif de debruitage d'un signal de parole numerique |
Country Status (7)
Country | Link |
---|---|
US (1) | US6477489B1 (fr) |
EP (1) | EP1016072B1 (fr) |
AU (1) | AU9168998A (fr) |
CA (1) | CA2304571A1 (fr) |
DE (1) | DE69803203T2 (fr) |
FR (1) | FR2768547B1 (fr) |
WO (1) | WO1999014738A1 (fr) |
Families Citing this family (44)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6510408B1 (en) * | 1997-07-01 | 2003-01-21 | Patran Aps | Method of noise reduction in speech signals and an apparatus for performing the method |
US6549586B2 (en) | 1999-04-12 | 2003-04-15 | Telefonaktiebolaget L M Ericsson | System and method for dual microphone signal noise reduction using spectral subtraction |
US6717991B1 (en) | 1998-05-27 | 2004-04-06 | Telefonaktiebolaget Lm Ericsson (Publ) | System and method for dual microphone signal noise reduction using spectral subtraction |
FR2797343B1 (fr) * | 1999-08-04 | 2001-10-05 | Matra Nortel Communications | Procede et dispositif de detection d'activite vocale |
JP3454206B2 (ja) * | 1999-11-10 | 2003-10-06 | 三菱電機株式会社 | 雑音抑圧装置及び雑音抑圧方法 |
US6804640B1 (en) * | 2000-02-29 | 2004-10-12 | Nuance Communications | Signal noise reduction using magnitude-domain spectral subtraction |
US6766292B1 (en) * | 2000-03-28 | 2004-07-20 | Tellabs Operations, Inc. | Relative noise ratio weighting techniques for adaptive noise cancellation |
JP2002221988A (ja) * | 2001-01-25 | 2002-08-09 | Toshiba Corp | 音声信号の雑音抑圧方法と装置及び音声認識装置 |
AU4627801A (en) * | 2001-04-11 | 2001-07-09 | Phonak Ag | Method for the elimination of noise signal components in an input signal for an auditory system, use of said method and hearing aid |
US6985709B2 (en) * | 2001-06-22 | 2006-01-10 | Intel Corporation | Noise dependent filter |
DE10150519B4 (de) * | 2001-10-12 | 2014-01-09 | Hewlett-Packard Development Co., L.P. | Verfahren und Anordnung zur Sprachverarbeitung |
US7103539B2 (en) * | 2001-11-08 | 2006-09-05 | Global Ip Sound Europe Ab | Enhanced coded speech |
US20040078199A1 (en) * | 2002-08-20 | 2004-04-22 | Hanoh Kremer | Method for auditory based noise reduction and an apparatus for auditory based noise reduction |
US7398204B2 (en) * | 2002-08-27 | 2008-07-08 | Her Majesty In Right Of Canada As Represented By The Minister Of Industry | Bit rate reduction in audio encoders by exploiting inharmonicity effects and auditory temporal masking |
EP1554716A1 (fr) * | 2002-10-14 | 2005-07-20 | Koninklijke Philips Electronics N.V. | Filtrage de signaux |
KR101141247B1 (ko) * | 2003-10-10 | 2012-05-04 | 에이전시 포 사이언스, 테크놀로지 앤드 리서치 | 디지털 신호를 확장성 비트스트림으로 인코딩하는 방법;확장성 비트스트림을 디코딩하는 방법 |
US7725314B2 (en) * | 2004-02-16 | 2010-05-25 | Microsoft Corporation | Method and apparatus for constructing a speech filter using estimates of clean speech and noise |
US7729908B2 (en) * | 2005-03-04 | 2010-06-01 | Panasonic Corporation | Joint signal and model based noise matching noise robustness method for automatic speech recognition |
US20060206320A1 (en) * | 2005-03-14 | 2006-09-14 | Li Qi P | Apparatus and method for noise reduction and speech enhancement with microphones and loudspeakers |
KR100927897B1 (ko) * | 2005-09-02 | 2009-11-23 | 닛본 덴끼 가부시끼가이샤 | 잡음억제방법과 장치, 및 컴퓨터프로그램 |
US8126706B2 (en) * | 2005-12-09 | 2012-02-28 | Acoustic Technologies, Inc. | Music detector for echo cancellation and noise reduction |
JP4592623B2 (ja) * | 2006-03-14 | 2010-12-01 | 富士通株式会社 | 通信システム |
US8949120B1 (en) | 2006-05-25 | 2015-02-03 | Audience, Inc. | Adaptive noise cancelation |
JP4757158B2 (ja) * | 2006-09-20 | 2011-08-24 | 富士通株式会社 | 音信号処理方法、音信号処理装置及びコンピュータプログラム |
US20080162119A1 (en) * | 2007-01-03 | 2008-07-03 | Lenhardt Martin L | Discourse Non-Speech Sound Identification and Elimination |
BRPI0807703B1 (pt) | 2007-02-26 | 2020-09-24 | Dolby Laboratories Licensing Corporation | Método para aperfeiçoar a fala em áudio de entretenimento e meio de armazenamento não-transitório legível por computador |
EP2130019B1 (fr) * | 2007-03-19 | 2013-01-02 | Dolby Laboratories Licensing Corporation | Procédé d'amélioration de la qualité de la parole au moyen d'un modèle perceptuel |
BRPI0816792B1 (pt) * | 2007-09-12 | 2020-01-28 | Dolby Laboratories Licensing Corp | método para melhorar componentes de fala de um sinal de áudio composto de componentes de fala e ruído e aparelho para realizar o mesmo |
EP2191465B1 (fr) * | 2007-09-12 | 2011-03-09 | Dolby Laboratories Licensing Corporation | Amélioration de la qualité de la parole avec ajustement de l'évaluation des niveaux de bruit |
JP5483000B2 (ja) * | 2007-09-19 | 2014-05-07 | 日本電気株式会社 | 雑音抑圧装置、その方法及びプログラム |
JP5056654B2 (ja) * | 2008-07-29 | 2012-10-24 | 株式会社Jvcケンウッド | 雑音抑制装置、及び雑音抑制方法 |
US20110257978A1 (en) * | 2009-10-23 | 2011-10-20 | Brainlike, Inc. | Time Series Filtering, Data Reduction and Voice Recognition in Communication Device |
US9838784B2 (en) | 2009-12-02 | 2017-12-05 | Knowles Electronics, Llc | Directional audio capture |
US8423357B2 (en) * | 2010-06-18 | 2013-04-16 | Alon Konchitsky | System and method for biometric acoustic noise reduction |
US9640194B1 (en) | 2012-10-04 | 2017-05-02 | Knowles Electronics, Llc | Noise suppression for speech processing based on machine-learning mask estimation |
US9536540B2 (en) * | 2013-07-19 | 2017-01-03 | Knowles Electronics, Llc | Speech signal separation and synthesis based on auditory scene analysis and speech modeling |
CN103824562B (zh) * | 2014-02-10 | 2016-08-17 | 太原理工大学 | 基于心理声学模型的语音后置感知滤波器 |
DE102014009689A1 (de) * | 2014-06-30 | 2015-12-31 | Airbus Operations Gmbh | Intelligentes Soundsystem/-modul zur Kabinenkommunikation |
CN106797512B (zh) | 2014-08-28 | 2019-10-25 | 美商楼氏电子有限公司 | 多源噪声抑制的方法、系统和非瞬时计算机可读存储介质 |
US9978388B2 (en) | 2014-09-12 | 2018-05-22 | Knowles Electronics, Llc | Systems and methods for restoration of speech components |
CN105869652B (zh) * | 2015-01-21 | 2020-02-18 | 北京大学深圳研究院 | 心理声学模型计算方法和装置 |
US9820042B1 (en) | 2016-05-02 | 2017-11-14 | Knowles Electronics, Llc | Stereo separation and directional suppression with omni-directional microphones |
EP3566229B1 (fr) * | 2017-01-23 | 2020-11-25 | Huawei Technologies Co., Ltd. | Appareil et procédé permettant d'améliorer une composante souhaitée dans un signal |
US11017798B2 (en) * | 2017-12-29 | 2021-05-25 | Harman Becker Automotive Systems Gmbh | Dynamic noise suppression and operations for noisy speech signals |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH03117919A (ja) * | 1989-09-30 | 1991-05-20 | Sony Corp | ディジタル信号符号化装置 |
AU633673B2 (en) | 1990-01-18 | 1993-02-04 | Matsushita Electric Industrial Co., Ltd. | Signal processing device |
EP0459362B1 (fr) | 1990-05-28 | 1997-01-08 | Matsushita Electric Industrial Co., Ltd. | Processeur de signal de parole |
US5450522A (en) * | 1991-08-19 | 1995-09-12 | U S West Advanced Technologies, Inc. | Auditory model for parametrization of speech |
US5469087A (en) | 1992-06-25 | 1995-11-21 | Noise Cancellation Technologies, Inc. | Control system using harmonic filters |
US5400409A (en) * | 1992-12-23 | 1995-03-21 | Daimler-Benz Ag | Noise-reduction method for noise-affected voice channels |
JPH08506427A (ja) * | 1993-02-12 | 1996-07-09 | ブリテイッシュ・テレコミュニケーションズ・パブリック・リミテッド・カンパニー | 雑音減少 |
US5623577A (en) * | 1993-07-16 | 1997-04-22 | Dolby Laboratories Licensing Corporation | Computationally efficient adaptive bit allocation for encoding method and apparatus with allowance for decoder spectral distortions |
JP3131542B2 (ja) * | 1993-11-25 | 2001-02-05 | シャープ株式会社 | 符号化復号化装置 |
US5555190A (en) | 1995-07-12 | 1996-09-10 | Micro Motion, Inc. | Method and apparatus for adaptive line enhancement in Coriolis mass flow meter measurement |
FR2739736B1 (fr) * | 1995-10-05 | 1997-12-05 | Jean Laroche | Procede de reduction des pre-echos ou post-echos affectant des enregistrements audio |
FI100840B (fi) * | 1995-12-12 | 1998-02-27 | Nokia Mobile Phones Ltd | Kohinanvaimennin ja menetelmä taustakohinan vaimentamiseksi kohinaisesta puheesta sekä matkaviestin |
US6144937A (en) * | 1997-07-23 | 2000-11-07 | Texas Instruments Incorporated | Noise suppression of speech by signal processing including applying a transform to time domain input sequences of digital signals representing audio information |

Date | Country | Application number | Publication | Status |
---|---|---|---|---|
1997-09-18 | FR | FR9711643A | FR2768547B1 | Not active: Expired - Fee Related |
1998-09-16 | EP | EP98943999A | EP1016072B1 | Not active: Expired - Lifetime |
1998-09-16 | CA | CA002304571A | CA2304571A1 | Not active: Abandoned |
1998-09-16 | US | US09/509,145 | US6477489B1 | Not active: Expired - Fee Related |
1998-09-16 | DE | DE69803203T | DE69803203T2 | Not active: Expired - Fee Related |
1998-09-16 | AU | AU91689/98A | AU9168998A | Not active: Abandoned |
1998-09-16 | WO | PCT/FR1998/001980 | WO1999014738A1 | Active: IP Right Grant |
Non-Patent Citations (1)
Title |
---|
See references of WO9914738A1 * |
Also Published As
Publication number | Publication date |
---|---|
FR2768547A1 (fr) | 1999-03-19 |
WO1999014738A1 (fr) | 1999-03-25 |
AU9168998A (en) | 1999-04-05 |
CA2304571A1 (fr) | 1999-03-25 |
DE69803203T2 (de) | 2002-08-29 |
DE69803203D1 (de) | 2002-02-21 |
EP1016072B1 (fr) | 2002-01-16 |
US6477489B1 (en) | 2002-11-05 |
FR2768547B1 (fr) | 1999-11-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP1016072B1 (fr) | | Procede et dispositif de debruitage d'un signal de parole numerique |
EP1789956B1 (fr) | | Procede de traitement d'un signal sonore bruite et dispositif pour la mise en oeuvre du procede |
EP2002428B1 (fr) | | Procede de discrimination et d'attenuation fiabilisees des echos d'un signal numerique dans un decodeur et dispositif correspondant |
EP1830349B1 (fr) | | Procédé de débruitage d'un signal audio |
CA2436318C (fr) | | Procede et dispositif de reduction de bruit |
EP1016071B1 (fr) | | Procede et dispositif de detection d'activite vocale |
FR2907586A1 (fr) | | Synthese de blocs perdus d'un signal audionumerique, avec correction de periode de pitch. |
EP2936488B1 (fr) | | Atténuation efficace de pré-échos dans un signal audionumérique |
JP3960834B2 (ja) | | 音声強調装置及び音声強調方法 |
EP1016073B1 (fr) | | Procede et dispositif de debruitage d'un signal de parole numerique |
EP0490740A1 (fr) | | Procédé et dispositif pour l'évaluation de la périodicité et du voisement du signal de parole dans les vocodeurs à très bas débit. |
EP1021805B1 (fr) | | Procede et disposition de conditionnement d'un signal de parole numerique |
EP3192073B1 (fr) | | Discrimination et atténuation de pré-échos dans un signal audionumérique |
EP2515300B1 (fr) | | Procédé et système de réduction du bruit |
FR2888704A1 (fr) | | |
EP4287648A1 (fr) | | Dispositif électronique et procédé de traitement, appareil acoustique et programme d'ordinateur associés |
WO2006117453A1 (fr) | | Procede d’attenuation des pre- et post-echos d’un signal numerique audio et dispositif correspondant |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PUAI | Public reference made under Article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
| 17P | Request for examination filed | Effective date: 20000316 |
| AK | Designated contracting states | Kind code of ref document: A1 Designated state(s): DE FR GB |
| 17Q | First examination report despatched | Effective date: 20001004 |
| RIC1 | Information provided on IPC code assigned before grant | Free format text: 7G 10L 21/02 A |
| RTI1 | Title (correction) | Free format text: METHOD AND APPARATUS FOR SUPPRESSING NOISE IN A DIGITAL SPEECH SIGNAL |
| RIC1 | Information provided on IPC code assigned before grant | Free format text: 7G 10L 21/02 A |
| RTI1 | Title (correction) | Free format text: METHOD AND APPARATUS FOR SUPPRESSING NOISE IN A DIGITAL SPEECH SIGNAL |
| GRAG | Despatch of communication of intention to grant | Free format text: ORIGINAL CODE: EPIDOS AGRA |
| GRAG | Despatch of communication of intention to grant | Free format text: ORIGINAL CODE: EPIDOS AGRA |
| GRAH | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOS IGRA |
| GRAH | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOS IGRA |
| GRAA | (expected) grant | Free format text: ORIGINAL CODE: 0009210 |
| REG | Reference to a national code | Ref country code: GB Ref legal event code: IF02 |
| AK | Designated contracting states | Kind code of ref document: B1 Designated state(s): DE FR GB |
| REF | Corresponds to: | Ref document number: 69803203 Country of ref document: DE Date of ref document: 20020221 |
| GBT | GB: translation of EP patent filed (GB section 77(6)(a)/1977) | Effective date: 20020407 |
| RAP2 | Party data changed (patent owner data changed or rights of a patent transferred) | Owner name: NORTEL NETWORKS FRANCE |
| PLBE | No opposition filed within time limit | Free format text: ORIGINAL CODE: 0009261 |
| STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
| 26N | No opposition filed | |
| REG | Reference to a national code | Ref country code: FR Ref legal event code: CD Ref country code: FR Ref legal event code: CA |
| PGFP | Annual fee paid to national office [announced via postgrant information from national office to EPO] | Ref country code: DE Payment date: 20031127 Year of fee payment: 6 |
| PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Ref country code: DE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20050401 |
| PGFP | Annual fee paid to national office [announced via postgrant information from national office to EPO] | Ref country code: GB Payment date: 20050817 Year of fee payment: 8 |
| PGFP | Annual fee paid to national office [announced via postgrant information from national office to EPO] | Ref country code: FR Payment date: 20050902 Year of fee payment: 8 |
| GBPC | GB: European patent ceased through non-payment of renewal fee | Effective date: 20060916 |
| REG | Reference to a national code | Ref country code: FR Ref legal event code: ST Effective date: 20070531 |
| PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Ref country code: GB Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20060916 |
| PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Ref country code: FR Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20061002 |