DE4405723A1

DE4405723A1 - Method for noise reduction of a disturbed speech signal

Info

Publication number: DE4405723A1
Application number: DE4405723A
Authority: DE
Inventors: Klaus Dr Ing Linhard; Heinz Klemm
Original assignee: Daimler Benz AG
Current assignee: Mercedes Benz Group AG
Priority date: 1994-02-23
Filing date: 1994-02-23
Publication date: 1995-08-24
Also published as: EP0669606A2; DE59506864D1; ATE185014T1; ES2138669T3; EP0669606B1; EP0669606A3

Abstract

Median filtering is applied to the magnitude spectrum of the noisy signal before, and optionally also after, spectral subtraction of the noise. The transfer function required for the spectral subtraction is derived from a noise estimate obtained during speech pauses and may itself be subjected to median filtering. The filter may operate in the time direction over a window of 3 successive time segments, covering less than 50 ms in total. Alternatively, the filter operates in the frequency direction with a window of no more than 5 frequency values.

Description

The invention relates to a method for noise reduction of a disturbed speech signal by means of spectral subtraction.

Noise reduction by the method of spectral subtraction is used in automatic speech recognition or in hands-free systems to improve speech quality, e.g. when telephoning from a motor vehicle.

Noise reduction by spectral subtraction is distinguished by the fact that relatively stationary interference can typically be reduced by about 10 dB without requiring additional information about the interference. Only the disturbed speech channel is needed. The speech signal is divided into short overlapping time segments and processed segment by segment. In spectral subtraction, an estimate of the interference is determined during speech pauses, and this estimate is subtracted in magnitude in the spectral domain. Spectral subtraction can be realized in various ways but is usually implemented as a multiplicative filter in the frequency domain. This spectral subtraction has the undesirable side effects of a musical residual noise, the so-called "musical tones", and of speech distortion.

Usually, "musical tones" are suppressed by excessive attenuation. The excessive attenuation can be achieved by overestimating the interference with an overestimation factor or by choosing a special transfer characteristic. From the transfer characteristic, the values of the current transfer function are determined for each frequency. It is common to implement in the spectral subtraction filter a magnitude characteristic, which has higher attenuation than, for example, a characteristic according to the squared-error criterion. Specially designed characteristics are also possible. Depending on the characteristic used, an overestimation of the interference by a factor of 1 to 3 is usual. The excessive attenuation due to the characteristic and the overestimation factor does yield the desired suppression of "musical tones", but it also has the side effect of a sometimes considerable distortion of the speech.

Another common way of suppressing "musical tones" is masking by admitting a certain proportion (e.g. 20%) of the original noise as background noise ("spectral floor"). This makes the "musical tones" less audible, but the noise is then no longer completely suppressed.

The following holds for spectral subtraction:

$$\hat{S}_{i,l} = K_{i,l} \cdot Y_{i,l} \qquad (1)$$

with

$$Y_{i,l} = S_{i,l} + N_{i,l} \qquad (2)$$

and, for the example of a so-called magnitude characteristic as the transfer characteristic,

[equation (3): not present in this text]

as well as, for example, the selection of a minimum transfer value for the spectral floor

$$\min(K_{i,l}) = b \qquad (4)$$

with the quantities:
$\hat{S}$: estimated output signal
$K$: transfer function
$Y$: disturbed speech signal
$S$: speech signal
$N$: noise
$b$: residual background noise (spectral floor)
$a$: overestimation factor
$|\hat{N}|^2$: interference estimated during speech pauses
$i$: frequency index
$l$: time index of the segment.
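The relationships above can be sketched in a few lines of Python. Equation (3) does not appear in this text, so the magnitude characteristic used below, K = 1 - a·|N̂|/|Y|, is an assumption (a commonly used form of magnitude subtraction), not the patent's confirmed formula. The function names `magnitude_gain` and `apply_gain` are hypothetical.

```python
import math

def magnitude_gain(y_mag, noise_power, a=1.0, b=0.2):
    """Transfer values K_i for one segment.

    y_mag       -- |Y_i| for each frequency index i
    noise_power -- |N^_i|^2, the interference estimated during speech pauses
    a           -- overestimation factor
    b           -- spectral floor, the minimum transfer value of equation (4)
    """
    gains = []
    for y, n2 in zip(y_mag, noise_power):
        if y <= 0.0:
            gains.append(b)               # nothing to scale; use the floor
            continue
        k = 1.0 - a * math.sqrt(n2) / y   # ASSUMED form of equation (3)
        gains.append(max(k, b))           # equation (4): K never drops below b
    return gains

def apply_gain(y_mag, gains):
    """Equation (1), applied to magnitudes: |S^_i| = K_i * |Y_i|."""
    return [k * y for k, y in zip(gains, y_mag)]

y = [4.0, 1.0, 2.0]     # illustrative magnitudes, not measured data
n2 = [1.0, 1.0, 1.0]    # illustrative noise power estimate
k = magnitude_gain(y, n2)
s_hat = apply_gain(y, k)
```

In this toy input, the middle bin contains only noise, so its transfer value is clamped to the floor b = 0.2 rather than falling to zero.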

Methods for suppressing the "musical tones" by means of the characteristic, "overestimation" and "spectral floor" are known in many variations from numerous publications, e.g.

Boll, S.: Suppression of Noise in Speech Using the SABER Method, Proc. IEEE Int. Conf. on ASSP, 1978, pp. 600-609.

Boll, S.: Suppression of Acoustic Noise in Speech Using Spectral Subtraction, IEEE Trans. on ASSP, Vol. ASSP-27, No. 2, April 1979, pp. 113-120.

Berouti, M.; Schwartz, R.; Makhoul, J.: Enhancement of Speech Corrupted by Acoustic Noise, Proc. Int. Conf. on ASSP, 1979, pp. 208-211.

Vary, P.: Noise Suppression by Spectral Magnitude Estimation - Mechanism and Theoretical Limits, Signal Processing, Vol. 8, No. 4, 1986, pp. 387-400.

Xie, F.; Van Compernolle, D.: Speech Enhancement by Nonlinear Spectral Estimation - A Unifying Approach, Int. Conf. Eurospeech, 1993, pp. 617-620.

Beyond the methods discussed above, further special methods are known that are likewise used to reduce the "musical tones":

The amplitude values of temporally successive disturbed speech spectra are averaged (e.g. "magnitude averaging" in Boll). This does attenuate noise components, but because speech is highly non-stationary, a temporal smearing of the speech signal (an echo-like effect) occurs even with short averaging lengths. Boll further describes a "magnitude plus bandwidth measurement test", according to which spectral regions with a bandwidth below 300 Hz and an amplitude below a predetermined threshold are recognized as "residual noise".

These regions are then additionally attenuated. Boll proposes reducing the "residual noise" by using, from three temporally successive spectra of the filtered signal, the minimum value in each case as the output signal. Outputting the minimum spectral line of three temporally adjacent lines does lead to a marked reduction of the residual noise and thus of the "musical tones", but sudden short "noise bursts" occasionally occur at irregular intervals.
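For comparison with the median approach introduced later, the minimum-of-three rule described above can be sketched as follows. This is a minimal illustration of the rule as summarized in this text, not Boll's complete procedure. The function name `min_of_three` and the border handling (repeating the first and last segment) are assumptions.

```python
def min_of_three(frames):
    """Minimum over three temporally adjacent spectra.

    frames[l][i] is the filtered magnitude at time index l and
    frequency index i. At the borders the first/last segment is
    repeated (an assumption made only for this sketch).
    """
    out = []
    last = len(frames) - 1
    for l, frame in enumerate(frames):
        prev_f = frames[max(l - 1, 0)]
        next_f = frames[min(l + 1, last)]
        out.append([min(p, c, n) for p, c, n in zip(prev_f, frame, next_f)])
    return out

# One frequency bin over six segments: a spike, then a sustained rise.
frames = [[1.0], [5.0], [1.0], [4.0], [4.0], [4.0]]
result = min_of_three(frames)
```

The spike at index 1 is removed, but the onset of the sustained rise at index 3 is also suppressed, because the minimum always favors the smallest neighbor.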

A further method uses a so-called nonlinear spectral subtraction. Here the overestimation factor is computed as a function of the pause noise and the currently applied signal. Optimally tuning this control is difficult, however (Lockwood, P.; Boudy, J.: Experiments with a Nonlinear Spectral Subtraction (NSS), Hidden Markov Models and the projection, for robust speech recognition in cars, Speech Communication, No. 11, 1992, pp. 215-228).

The object of the present invention is to provide a method for noise reduction of a disturbed speech signal which enables a strong reduction of the noise, in particular also of the "musical tones", while maintaining high speech quality of the output signal.

Solutions to this object according to the invention, as well as advantageous embodiments and developments, are described in the patent claims.

Median filtering proves to be an advantageous method for further substantially improving spectral subtraction for the noise reduction of a disturbed speech signal. The median filtering can be applied to the magnitude spectrum of the disturbed input signal or of the noise-reduced output signal after spectral subtraction, as well as to the transfer function determined by applying a transfer characteristic, and it can be carried out in the time direction or in the frequency direction. The transfer function is represented by the time- and frequency-discrete values $K_{i,l}$ (e.g. equation (3)). A combination of several of these procedures can also be advantageous.

A preferred method thus provides for preserving, during speech pauses, the natural impression of a weak background noise by applying median filtering, preferably in the time direction, to the transfer function, and for achieving, during speech activity, a strong suppression of the "musical tones" by applying median filtering to the magnitude spectrum of the speech signal. Separate detection of speech pauses and speech activity is in any case provided and known for determining an average noise signal during speech pauses, so that no separate effort is required for it. The methods according to the invention are simple to implement.

The principle of median filtering as such is generally known (e.g. Mitra, S. K.: Handbook for Digital Signal Processing, John Wiley & Sons, 1993).

Fig. 1 shows an example of an input signal E and an output signal A filtered with a median filter of length 3. The median filter first sorts the values within the data window F and then outputs the middle value med. The median filter removes short signal spikes but preserves the remaining signal edges.
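The behavior described for Fig. 1 can be reproduced with a minimal sketch. The sample values below are illustrative and are not taken from the figure. The function name `median3` and the choice to pass the two edge samples through unchanged are assumptions.

```python
def median3(signal):
    """Length-3 median filter.

    The first and last sample are passed through unchanged, which is
    an assumption made for this sketch.
    """
    out = list(signal)
    for n in range(1, len(signal) - 1):
        window = sorted(signal[n - 1:n + 2])  # sort the values in window F
        out[n] = window[1]                    # output the middle value
    return out

e = [0, 0, 5, 0, 0, 3, 3, 3, 0, 0]   # input E: one spike, one wider pulse
a = median3(e)                       # output A
```

The single-sample spike is removed, while the three-sample pulse keeps both its edges and its height.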

For the example of applying a median filter of length 3 to a noise-reduced magnitude spectrum of a speech signal, filtering in the time direction gives

$$|\hat{S}_{i,l,m}| = \operatorname{med}\left(|\hat{S}_{i,l-1}|,\ |\hat{S}_{i,l}|,\ |\hat{S}_{i,l+1}|\right) \qquad (5)$$

or, with filtering in the frequency direction,

$$|\hat{S}_{i,l,m}| = \operatorname{med}\left(|\hat{S}_{i-1,l}|,\ |\hat{S}_{i,l}|,\ |\hat{S}_{i+1,l}|\right) \qquad (6)$$

Filtering the squared magnitude is in principle equivalent to filtering the magnitude.
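Equations (5) and (6) differ only in the index along which the window slides. The following sketch applies both to a small grid of magnitudes stored as `mag[l][i]` (time index first). The function names and the border handling by index clamping are assumptions, and the values are illustrative.

```python
from statistics import median

def median_time(mag, length=3):
    """Equation (5): median over neighboring segments l, per frequency i.

    Borders are handled by clamping the index (an assumption).
    """
    half = length // 2
    n_seg, n_freq = len(mag), len(mag[0])
    return [[median(mag[min(max(l + d, 0), n_seg - 1)][i]
                    for d in range(-half, half + 1))
             for i in range(n_freq)]
            for l in range(n_seg)]

def median_freq(mag, length=3):
    """Equation (6): median over neighboring frequencies i, per segment l."""
    half = length // 2
    n_freq = len(mag[0])
    return [[median(row[min(max(i + d, 0), n_freq - 1)]
                    for d in range(-half, half + 1))
             for i in range(n_freq)]
            for row in mag]

# A disturbance that is short in time (one segment) but two bins wide.
mag = [[1, 1, 1, 1],
       [1, 8, 9, 1],
       [1, 1, 1, 1]]
by_time = median_time(mag)
by_freq = median_freq(mag)
```

With a window of 3, the temporal filter removes this disturbance entirely, whereas the frequency filter leaves most of it in place because the disturbance is wider than a single bin.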

The effect of median filtering in reducing the "musical tones" is illustrated by plots of a typical temporal and spectral course of such "musical tones". Shown is the magnitude spectrum of an output signal obtained in a speech pause and estimated by means of spectral subtraction. Since no speech components are present in the speech pause, the "musical tones" in particular stand out clearly.

The following was used as the example of spectral subtraction: standard method with magnitude characteristic, 20% background noise (b = 0.2), no overestimation (a = 1.0).

The following was used as the noise example: vehicle interior noise at 140 km/h, 12 kHz sampling frequency, segment length 512 values, where the last 256 values of each segment are set to zero and the first 256 values of each segment are multiplied by a Hanning window; the segments are half overlapped, i.e. a new segment every 10.67 ms.
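The segment timing follows directly from these figures: 256 windowed samples with half overlap give a hop of 128 samples, and 128 / 12000 Hz is about 10.67 ms. The sketch below builds one such segment and checks the arithmetic. The helper names are hypothetical, and the periodic form of the Hann window is an assumption, since the text does not say which variant was used.

```python
import math

FS = 12_000           # sampling frequency in Hz
SEG_LEN = 512         # segment length in values
DATA_LEN = 256        # non-zero (windowed) values per segment
HOP = DATA_LEN // 2   # half-overlapped segments: 128 samples

def hann(n):
    """Hann ("Hanning") window of length n, periodic form (an assumption)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * k / n) for k in range(n)]

def make_segment(samples, start):
    """Window 256 samples beginning at `start`, then zero-pad to 512 values."""
    win = hann(DATA_LEN)
    data = [w * x for w, x in zip(win, samples[start:start + DATA_LEN])]
    return data + [0.0] * (SEG_LEN - len(data))

hop_ms = 1000.0 * HOP / FS   # time between successive segment starts
bin_hz = FS / SEG_LEN        # frequency spacing of a 512-point spectrum

segment = make_segment([1.0] * 1000, 0)
```

The zero padding doubles the transform length without adding signal content, so the spectrum is sampled on a finer frequency grid than 256 values alone would give.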

Fig. 2 first shows, over frequency (linear, 0 to 6 kHz), the spectrum for 4 temporally successive segments (time spacing 10.67 ms, index l) and then, over time (0 to 2.5 s), the signal course for 4 successive discrete frequencies (index i), representative of all 256 frequencies. A typical property of the "musical tones" is that the course over frequency exhibits relatively extended disturbances (broad pulses), whereas the course over time has a strongly impulsive character (narrow pulses). It is precisely this impulsive character in the time direction that makes median filtering particularly effective here. An impulsive disturbance is removed. For impulsive disturbances with broader pulses, a larger window length of the median filter is required. In contrast to linear filtering methods (smoothing filters, "linear smoothers"), no smearing of the signal course takes place. The plot in Fig. 3 of the signals filtered in the time direction with the 3-point median illustrates this property. The filtered signal shows a clearly smoother course over time. In the course over frequency, some of the (broader) pulses have likewise been removed by the filtering in the time direction.
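The contrast with a linear smoother can be made concrete on a single spectral line tracked over time. The sketch below compares a 3-point median with a 3-point moving average. Both function names and the sample values are illustrative assumptions, and the edge samples are simply passed through.

```python
def median3_track(x):
    """3-point median along time; edge samples are passed through."""
    out = list(x)
    for n in range(1, len(x) - 1):
        out[n] = sorted(x[n - 1:n + 2])[1]
    return out

def mean3_track(x):
    """3-point moving average (a simple linear smoother) for comparison."""
    out = list(x)
    for n in range(1, len(x) - 1):
        out[n] = (x[n - 1] + x[n] + x[n + 1]) / 3.0
    return out

# One spectral line over eight segments: a narrow pulse, then a step.
track = [1.0, 1.0, 7.0, 1.0, 1.0, 4.0, 4.0, 4.0]
med = median3_track(track)
avg = mean3_track(track)
```

The median removes the narrow pulse and leaves the step edge sharp. The moving average spreads the pulse over three segments and softens the step, which corresponds to the smearing mentioned above.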

During speech activity, applying the median filter in the time direction to the individual spectral lines improves speech quality, since impulsive disturbances of the speech spectrum are "repaired". The speech signal itself is changed only very slightly. Increasing the window length from 3 to 5 (in the time direction) does yield even better cancellation of the "musical tones", but a weak echo-like character of the speech already becomes audible.

Instead of being applied to the output signal, the median filter can also be applied to the input signal, before the spectral subtraction. In the ideal case, no "musical tones" can then arise that would otherwise have to be removed by post-filtering with the median filter. Median filtering of the input signal can be advantageous when "musical tones" affect the various processing steps implemented in the spectral subtraction filter (other than the characteristic function). Possible advantages or disadvantages of median filtering at the input or the output signal will not be discussed further here. In principle both options are available and, apart from special implementation cases, equivalent.

Instead of being applied to the magnitude spectrum of a speech signal, the median filter can also be applied to the transfer function K.

For the 3-point median:

$$|K_{i,l,m}| = \operatorname{med}\left(|K_{i,l-1}|,\ |K_{i,l}|,\ |K_{i,l+1}|\right) \qquad (7)$$

or

$$|K_{i,l,m}| = \operatorname{med}\left(|K_{i-1,l}|,\ |K_{i,l}|,\ |K_{i+1,l}|\right) \qquad (8)$$

Fig. 4 shows the transfer function K over time and over frequency. The same excerpt as in Fig. 2 is shown. The transfer function behaves similarly to the output signal in Fig. 2.

Fig. 5 shows the transfer function filtered in the time direction with the 3-point median. The same excerpt as in Fig. 3 is shown. Here too, median filtering in the time direction is extremely effective, for the same reasons as for the output signal.

The effective suppression of the "musical tones" by median filtering can be explained as follows:

An input signal with an impulsive disturbance causes a corresponding impulsive change in the transfer function. In the original noise this local pulse belongs to the natural noise and is therefore not perceived as particularly disturbing. The spectrum of the input signal is multiplied by the transfer function. The impulsive disturbance is thereby additionally amplified and is now audible as a "musical tone".

The pulse-suppressing property of median filtering has a particularly marked effect on the amplified impulsive disturbance and thus on the "musical tones". Median filtering has a repairing effect on the impulsive disturbance.

Median filtering of the magnitude spectrum of the input or output signal yields a greater gain in the suppression of impulsive disturbances than median filtering of the transfer values, but it can also lead to changes that are perceived as unnatural and are particularly noticeable in speech pauses, whereas median filtering of the transfer values in speech pauses essentially results in a pure attenuation of the signal, which thereby sounds quieter but natural. In the ideal case no "musical tones" arise. A preferred embodiment of the invention exploits this by carrying out the median filtering on the magnitude spectrum during speech activity and on the transfer values during speech pauses. The required speech/pause decision is available in spectral subtraction anyway, since the noise estimate is formed only during speech pauses.
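One way to picture the preferred embodiment is a per-segment switch driven by the speech/pause decision. The sketch below is an interpretation under stated assumptions, not the patent's implementation: during speech activity it filters the input magnitude spectrum (the text equally allows the output spectrum), it uses a 3-point temporal median with one segment of look-ahead, and all function and parameter names are hypothetical.

```python
from statistics import median

def med3_time(prev, cur, nxt):
    """3-point median across three successive segments, per frequency."""
    return [median(t) for t in zip(prev, cur, nxt)]

def process_segment(y_prev, y_cur, y_next,
                    k_prev, k_cur, k_next, speech_active):
    """Estimated output magnitudes for the middle segment.

    y_* -- input magnitude spectra |Y| of three successive segments
    k_* -- transfer values K of the same segments
    speech_active -- result of the speech/pause decision (assumed given)
    """
    if speech_active:
        # Speech activity: filter the magnitude spectrum, then apply K.
        y_med = med3_time(y_prev, y_cur, y_next)
        return [k * y for k, y in zip(k_cur, y_med)]
    # Speech pause: filter the transfer values, apply to the unfiltered input.
    k_med = med3_time(k_prev, k_cur, k_next)
    return [k * y for k, y in zip(k_med, y_cur)]

# Pause: an impulsive jump of K in the middle segment is suppressed.
pause_out = process_segment([2.0], [2.0], [2.0],
                            [0.2], [0.9], [0.2], speech_active=False)
# Speech: an impulsive jump of |Y| in the middle segment is suppressed.
speech_out = process_segment([2.0], [10.0], [2.0],
                             [0.5], [0.5], [0.5], speech_active=True)
```

In the pause case the output is the input scaled by the filtered gain, which matches the description of a pure attenuation that sounds quieter but natural.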

Instead of median filtering in the time direction as described, median filtering in the frequency direction according to equation (6) can also be carried out. The detailed explanations given apply analogously to filtering in the frequency direction. It turns out that, as the number of samples within a time segment decreases, median filtering in the frequency direction gains in advantage over filtering in the time direction, and vice versa.

In the described application of median filtering in the time direction with the exemplary values for sampling rate and window length, the window length is, as stated in the example, equal to the minimum median window length of 3. Larger window lengths in this case do lead to further suppression of the "musical tones", but possibly also to a flattening of the speech signal that is perceived as unnatural. The preferred window length is therefore 3, as given in the example. For temporally shorter segments, a larger window length may be appropriate for the median filtering. The time interval covered by the window of the temporal median filtering should, however, not exceed 50 ms.
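The 50 ms guideline can be checked numerically. The text does not define exactly how the covered interval is measured, so the sketch below assumes it runs from the start of the first segment to the end of the last segment's 256 data samples, i.e. (window length - 1) hops plus one data length. The function names are hypothetical.

```python
def window_span_ms(window_len, hop_ms, data_ms):
    """Time covered by `window_len` successive segments.

    ASSUMPTION: span = (window_len - 1) * hop + duration of one
    segment's non-zero data.
    """
    return (window_len - 1) * hop_ms + data_ms

def max_window_len(hop_ms, data_ms, limit_ms=50.0):
    """Largest odd window length (>= 3) within limit_ms, or None."""
    if hop_ms <= 0:
        raise ValueError("hop_ms must be positive")
    best = None
    n = 3
    while window_span_ms(n, hop_ms, data_ms) <= limit_ms:
        best = n
        n += 2   # median windows are kept odd so a middle value exists
    return best

HOP_MS = 1000.0 * 128 / 12000    # about 10.67 ms between segments
DATA_MS = 1000.0 * 256 / 12000   # about 21.33 ms of data per segment

span3 = window_span_ms(3, HOP_MS, DATA_MS)
span5 = window_span_ms(5, HOP_MS, DATA_MS)
```

Under this assumption a window of 3 covers about 42.7 ms and a window of 5 covers 64 ms, which is consistent with 3 being the preferred length for the example values. Halving the segment duration would allow a window of 7 within the same limit.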

For filtering in the frequency direction, the window length of the median filter is guided by the data segment length. In the numerically described example, the data segment length should be less than 64 and the median filter length no greater than 5.

Claims

1. A method for noise reduction of a disturbed speech signal with the aid of spectral subtraction, characterized in that the magnitude spectrum of the speech signal is subjected to median filtering.

2. The method according to claim 1, characterized in that the median filtering is applied to the magnitude spectrum of the disturbed input signal.

3. The method according to claim 1 or 2, characterized in that the median filtering is applied to the magnitude spectrum of the output signal of the spectral subtraction.

4. A method for noise reduction of a disturbed speech signal with the aid of spectral subtraction, in which a transfer function for the spectral subtraction is determined from a predefinable transfer characteristic, characterized in that the transfer function is subjected to median filtering.

5. A method for noise reduction of a disturbed speech signal using a combination of the preceding claims.

6. The method according to claim 5, characterized in that the median filtering is applied to the transfer values during speech pauses and to the magnitude spectrum of the speech signal during speech activity.

7. The method according to any one of claims 1 to 6, characterized in that the median filtering is applied in the time direction.

8. The method according to claim 7, characterized in that the window length of the median filter comprises three successive time segments.

9. The method according to claim 7 or claim 8, characterized in that the window length of the median filter is less than 50 ms.

10. The method according to any one of claims 1 to 6, characterized in that the median filtering is applied in the frequency direction.

11. The method according to claim 10, characterized in that the window length of the median filter comprises no more than 5 frequency values.