DE10137685C1

DE10137685C1 - Speech signal detection method for hearing aid provides evaluation index from correlation between instant amplitude signal and instant frequency signal

Info

Publication number: DE10137685C1
Application number: DE2001137685
Authority: DE
Inventors: Zlatan Ribic
Original assignee: TUERK and TUERK ELECTRONIC GmbH; Tuerk & Tuerk Electronic GmbH
Current assignee: Interton Electronic Hoergeraete 51469 Be De GmbH
Priority date: 2001-08-01
Filing date: 2001-08-01
Publication date: 2002-12-19
Anticipated expiration: 2021-08-02

Abstract

The speech signal detection method has an analytical signal extracted from an input signal (Sin) and used for calculation of an instant amplitude signal (IA) and an instant phase signal, which is processed to provide an instant frequency signal, e.g. an instant angular frequency signal (IW). An evaluation function providing an evaluation index (n) is calculated from the instant amplitude signal and the instant frequency signal. An Independent claim for a device for detection of the presence of speech signals is also included.

Description

The invention relates to a method for detecting the presence of speech signals.

In recent years the development of hearing aids has been perfected to such an extent that technical problems are almost ruled out or insignificant. What remains pressing, however, is the problem of processing the signals during amplification in such a way that the useful signals are transmitted with as little loss as possible and the interference signals are suppressed as far as possible.

The suppression of background noise is also an important topic in other applications, such as message transmission over telephone lines or by radio.

A simple approach is to use high-pass, low-pass or band-pass filters to attenuate certain frequency ranges in which a high proportion of interference signals is suspected. Because of the variety of possible interference signals, however, such methods are of only limited use; moreover, the useful signal, which is usually a speech signal, is also distorted and disturbed.

A further difficulty is that speech is an extremely complex signal. Various models of speech production are known, for example from J. L. FLANAGAN: "Speech Analysis, Synthesis and Perception", 2nd ed., Springer Verlag, New York 1972. There a basic signal is defined that consists either of a series of pulses, as is the case with vowels, or of noise, for example with consonants such as "S" or "SCH". The pulse series defines the pitch and is often referred to as F0 (zero formant). Such a signal usually has numerous harmonic components up to very high frequencies. Breathing produces additional noise. During articulation the signals generated in this way are filtered further. This changes the spectral shape, and speech results. On this basis, attempts have been made to develop noise suppression systems that rely on spectral analysis. However, since speech changes constantly, that is, amplitude, frequency and spectra are not constant, such methods have their limits. Additional difficulties arise, for example, from coarticulations, which represent a transition from one phoneme to another. In contrast, disturbances are usually comparatively simple signals, which incidentally also applies to music.

A fundamental account, which is still valid today, is given in J. S. LIM, A. V. OPPENHEIM: "Enhancement and Bandwidth Compression of Noisy Speech", Proceedings of the IEEE, Vol. 67, No. 12, December 1979. Furthermore, methods such as "beam forming" and "blind source separation" have recently gained importance. Such methods, however, always require more than one microphone. The present invention, by contrast, relates to methods that can also be applied to a signal obtained from a single microphone.

In practice, so-called "noise gates" are often used, which basically consist of one or more expanders connected in parallel. The input signal is amplified and fed in parallel to several filters, which divide it into several frequency bands. In each channel the amplitude is then determined by filtering the absolute value with a low-pass filter in order to obtain the average energy or amplitude of the signal. This is followed by a nonlinear transformation, which in digital signal processing can be implemented with a so-called "look-up table" or in another way, for example by a function given in closed form. The value obtained in this way is used to amplify or attenuate the signal of the respective channel; in the simplest case this is a multiplication. The signals obtained in this way in each channel are added to produce an output signal. An expansion of the signal can easily be carried out in this way: when the energy, that is, the amplitude of the signal, is low, the signal is reduced, whereas it is amplified at larger amplitudes. Low-amplitude disturbances are therefore suppressed in every frequency range. Such systems, however, work only when the disturbance is relatively constant. A further problem is that quiet speech signals are suppressed as well. In addition, artifacts are generated in pauses in speech, which are sometimes very annoying. All in all, such systems cannot offer a satisfactory solution for suppressing background noise.
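The per-channel processing of such a noise gate can be sketched as follows. This is a minimal single-band illustration in Python, not the circuit of any particular device; the one-pole low-pass filter, the smoothing coefficient `alpha`, the threshold and the attenuation value are assumptions chosen only for the example.

```python
def envelope(samples, alpha=0.05):
    """Rectify the signal and smooth it with a one-pole low-pass filter."""
    env, out = 0.0, []
    for x in samples:
        env += alpha * (abs(x) - env)   # low-pass of the absolute value
        out.append(env)
    return out


def expander_gain(level, threshold=0.1, attenuation=0.1):
    """Nonlinear transformation: attenuate below the threshold, pass above.

    Threshold and attenuation are illustrative values only.
    """
    return 1.0 if level >= threshold else attenuation


def noise_gate(samples, alpha=0.05, threshold=0.1, attenuation=0.1):
    """Multiply each sample by the gain derived from its smoothed level."""
    env = envelope(samples, alpha)
    return [x * expander_gain(e, threshold, attenuation)
            for x, e in zip(samples, env)]


quiet = [0.01 * (-1) ** i for i in range(200)]   # low-level disturbance
loud = [0.8 * (-1) ** i for i in range(200)]     # high-level signal
gated_quiet = noise_gate(quiet)
gated_loud = noise_gate(loud)
```

In this sketch the quiet input is attenuated throughout, while the loud input passes unchanged once its smoothed level has crossed the threshold. A quiet speech signal would be attenuated in exactly the same way, which is the weakness described above.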

EP 542 710 A (RIBIC) discloses a method for processing signals in which an analytical signal is obtained from an input signal. An analytical signal is a complex signal whose imaginary component is the Hilbert transform of its real component. The mathematical foundations are described in detail, for example, in R. B. RANDALL: "Frequency Analysis", BRÜEL & KJAER, 1987. The cited published application describes various possibilities and circuits for obtaining the Hilbert signals. With the present capabilities of digital signal processing it is relatively easy to implement a Hilbert transformer in order to obtain the real and the imaginary signal. Reference is made, for example, to S. L. HAHN: "Hilbert Transforms in Signal Processing", Artech House, 1996. Starting from the analytical signal, which consists of the two Hilbert signals, that is, the real part and the imaginary part, a so-called instant amplitude signal can be calculated according to the following formula (1):

IA = (Re² + Im²)^(1/2)    (1)

where Re denotes the real part and Im the imaginary part of the analytical signal.

The instant amplitude signal represents the instantaneous magnitude. The magnitude is the vector length of a complex signal, whereas the amplitude of the input signal in the time domain is the instantaneous value of the real part of the analytical signal. An instant phase signal is calculated analogously according to the following formula (2):

IFI = arctan(Im/Re)    (2)

where IFI is a value that can be regarded as the instantaneous phase of the signal.
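Formulas (1) and (2) can be evaluated sample by sample, as the following minimal Python sketch shows. It assumes that Re and Im are already available; the function names and the test values are hypothetical. Note that the sketch substitutes the four-quadrant function `atan2` for the plain arctangent of formula (2), so that the phase is also resolved when Re is negative or zero.

```python
import math


def instant_amplitude(re, im):
    """Formula (1): IA = (Re^2 + Im^2)^(1/2)."""
    return math.hypot(re, im)


def instant_phase(re, im):
    """Formula (2), evaluated with atan2 (an assumption of this sketch)
    so that all four quadrants are resolved."""
    return math.atan2(im, re)


# Illustrative analytical signal sample: Re = A*cos(phi), Im = A*sin(phi)
A, phi = 2.0, 0.75
re, im = A * math.cos(phi), A * math.sin(phi)
ia = instant_amplitude(re, im)   # recovers A
ifi = instant_phase(re, im)      # recovers phi
```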

The above-mentioned EP 542 711 A1 discloses a method with which audio signals can be processed in order to improve the function of hearing aids. An analytical signal is generated from an input signal, and an instant amplitude signal is calculated from it. This instant amplitude signal is used as a control variable to amplify the input signal or one of the Hilbert signals appropriately, so that signal compression is achieved. The instant amplitude signal is thus used only to process the input signal accordingly. However, since the delays of the instant amplitude signal and of the signal controlled by it do not match, a fully satisfactory solution cannot be achieved.

US 4,495,643 A (ORBAN) and US 6,205,225 B (ORBAN) also show methods that first generate an analytical signal by means of a Hilbert transformation. In those circuits, however, the Hilbert signals are filtered before further processing, so that a true instant amplitude signal cannot be obtained. Such methods can limit signal peaks, but they cannot effectively suppress background noise as a whole.

The object of the present invention is to provide a method and a device for detecting the presence of speech signals, so that background noise can subsequently be suppressed effectively. In particular, such a method should be easy to adjust and to adapt to a wide variety of ambient conditions; in the case of hearing aids, the specific hearing loss of the individual must also be taken into account.

According to the invention, the steps specified in claim 1 are carried out. The basic idea of the invention is that speech, as the useful signal, has certain harmonic structures that can serve to distinguish between speech signals and interference signals. It was found that speech signals exhibit a relatively high correlation between the instant amplitude signal formed according to the invention and the instant frequency signal. From this an evaluation function can be derived that outputs an evaluation index, which permits a statement about the presence of useful signals (speech) and interference signals. In the simplest case the evaluation index can take only two values, such as 0 and 1, which stand for interference signal and speech signal, respectively. In more refined embodiments it is possible to provide several output values for the evaluation index, or to define a continuous range of values, for example between 0 and 1, where the value of the evaluation index indicates the probability that speech signals or interference signals are present or, for mixed signals, is a measure of the proportion of the respective signal components. The time derivative of the phase signal IFI is exactly an angular frequency signal IW, which after division by 2π yields the actual frequency signal IFR.

A particularly simple implementation of the method results from claim 2. Because of the high correlation of IA and IW in speech signals, the ratio IA/IW of such signals will lie within a relatively narrow range. If the ratio is substantially smaller or substantially larger, it can be concluded that an interference signal dominates. The evaluation index n can be calculated analytically, for example according to the formula n = exp(-(k - IA/IW)²), with an empirically determined proportionality factor k, where n = 1 means exact proportionality, that is, a speech signal, and n << 1 means no speech signal.
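The example formula n = exp(-(k - IA/IW)²) can be tried out with a few lines of Python. The sketch below is only an illustration: the value of k is hypothetical (the text states that k is determined empirically), and returning 0 for a non-positive IW is an assumption of this sketch, not something the description prescribes.

```python
import math


def evaluation_index(ia, iw, k):
    """n = exp(-(k - IA/IW)^2); n equals 1 when IA/IW equals k."""
    if iw <= 0.0:
        # Assumption of this sketch: a non-positive instant angular
        # frequency is treated as "no speech".
        return 0.0
    return math.exp(-(k - ia / iw) ** 2)


k = 0.5                                            # hypothetical factor
n_match = evaluation_index(ia=1.0, iw=2.0, k=k)    # ratio 0.5 equals k
n_off = evaluation_index(ia=4.0, iw=1.0, k=k)      # ratio 4 is far from k
```

Here `n_match` is 1 because the ratio equals k exactly, while `n_off` is close to 0 because the ratio lies far outside the narrow range around k.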

A more general determination of the signal is given by claim 3. The nonlinear modifications of the instant amplitude signal and the instant frequency signal allow a sharper distinction.

The distinction can be sharpened further by carrying out the steps according to claim 4. This means that not only the ratio of IA to IW but also the ratio of the time derivatives of these signals is taken into account, since in speech signals the derivatives are correlated with one another as well. The evaluation index will then have a large value when both the signals themselves and their derivatives are correlated. For this purpose the corresponding partial evaluation indices are combined additively. One or the other partial evaluation function can be weighted more heavily in the sum, and the appropriate weights for the prevailing circumstances can easily be determined by experiment. In general it is advantageous to weight the differentiated signals more heavily, for example by multiplying the first partial evaluation index by w and the second partial evaluation index by (1 - w), where w can be, for example, between 0.2 and 0.4.
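The additive combination with the weights w and (1 - w) can be sketched as follows. This section does not give the form of the second partial evaluation index, so the sketch assumes that it uses the same exponential form as the example formula for n, applied to the ratio of the time derivatives; the factors `k1` and `k2`, the function names and all numerical values are hypothetical.

```python
import math


def partial_index(num, den, k):
    """Exponential form exp(-(k - ratio)^2); 0 if the ratio is undefined."""
    if den == 0.0:
        return 0.0
    return math.exp(-(k - num / den) ** 2)


def combined_index(ia, iw, d_ia, d_iw, k1, k2, w=0.3):
    """Weighted sum of the two partial evaluation indices."""
    n1 = partial_index(ia, iw, k1)       # from the signals themselves
    n2 = partial_index(d_ia, d_iw, k2)   # from their time derivatives (assumed form)
    return w * n1 + (1.0 - w) * n2


# Both ratios match their factors: the combined index is close to 1.
n_both = combined_index(1.0, 2.0, 0.2, 0.4, k1=0.5, k2=0.5)
# Only the signal ratio matches: the index drops to roughly w.
n_only_signals = combined_index(1.0, 2.0, 3.0, 0.5, k1=0.5, k2=0.5)
```

With w = 0.3 the derivative term carries the larger weight, so a mismatch of the derivatives lowers the index more than a mismatch of the signals themselves would.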

The method according to claim 4 can be developed further along the lines of the preferred embodiment of claim 3, as defined in claims 5 and 6.

A particularly preferred embodiment of the invention is given by claim 8. Surprisingly, it has been found that a characteristic map showing the probability density of the occurrence of particular combinations of instant amplitude signal and instant frequency signal has a characteristic shape for speech, in contrast to other signals. First, the map basically divides into a region of positive instant frequency values and a region of negative instant frequency. Only the first region is relevant for the assessment. Surprisingly, it has turned out that for speech signals there are two local maxima in the positive region, one of which is also the absolute maximum. The probability density thus has a two-humped structure. Interestingly, this structure is largely independent of the language spoken and of the speaker. On the basis of this finding, the presence or absence of speech can be inferred from the map.
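One simple way to estimate such a map from observed value pairs is a two-dimensional histogram restricted to positive instant frequency. The following Python sketch only illustrates that idea: the grid size, the value ranges, the normalization and the toy data are assumptions, and the sketch does not reproduce the measured maps of the figures or any criterion for recognizing the two maxima.

```python
def density_map(ia_values, iw_values, ia_max, iw_max, bins=8):
    """Relative frequency of (IA, IW) pairs on a bins x bins grid.

    Rows index the instant amplitude, columns the instant frequency.
    Pairs with non-positive or out-of-range values are ignored.
    """
    grid = [[0] * bins for _ in range(bins)]
    total = 0
    for ia, iw in zip(ia_values, iw_values):
        if iw <= 0.0 or iw >= iw_max or ia < 0.0 or ia >= ia_max:
            continue  # only the positive-frequency region is evaluated
        row = int(ia / ia_max * bins)
        col = int(iw / iw_max * bins)
        grid[row][col] += 1
        total += 1
    if total == 0:
        return [[0.0] * bins for _ in range(bins)]
    return [[count / total for count in row] for row in grid]


# Toy data (hypothetical): two clusters and one negative-frequency sample.
ia_values = [0.1, 0.1, 0.1, 0.9, 0.9, 0.5]
iw_values = [0.2, 0.2, 0.2, 0.8, 0.8, -0.3]
m = density_map(ia_values, iw_values, ia_max=1.0, iw_max=1.0, bins=4)
```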

To increase the selectivity it is advantageous if, according to claim 10, the input signal is first normalized in amplitude. This is done in a known manner with an AVC element having a relatively long time constant, with the result that the average level of the amplitude is essentially constant on average. This process is also called slow compression. In linguistic terms, the vowels tend to be suppressed while the consonants are amplified.
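The effect of such a slow level normalization can be illustrated with a short Python sketch. The realization of the AVC element is not specified here, so the first-order level tracker, the target level, the time constant (in samples) and the floor value below are all assumptions of this example.

```python
def slow_avc(samples, target=0.1, time_constant=2000, floor=1e-6):
    """Normalize the long-term level with a slowly adapting gain."""
    alpha = 1.0 / time_constant
    level = target            # start at the target, so the initial gain is 1
    out = []
    for x in samples:
        level += alpha * (abs(x) - level)        # slow estimate of the mean level
        out.append(x * target / max(level, floor))
    return out


# A steady signal with magnitude 0.5 is slowly brought to the target level.
steady = [0.5 * (-1) ** i for i in range(20000)]
normalized = slow_avc(steady)
```

Because the gain follows only the long-term level, short-term amplitude variations within the signal are preserved while the average level settles near the target.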

In the short term, however, there is a proportionality between the instant amplitude and the instant frequency, as described above. The proportionality factor, however, varies over time.

The present invention further relates to a device for detecting speech signals according to claim 11.

The present invention is explained in more detail below with reference to the exemplary embodiments shown in the figures.

The figures show:

Figs. 1 and 2 block diagrams of circuits used in the present invention,

Fig. 3 a block diagram of a simple device for suppressing background noise,

Figs. 4a, 4b, 4c diagrams explaining the expansion of signals,

Figs. 5, 6 and 7 block diagrams of further devices for suppressing background noise,

Fig. 8 various diagrams showing the effectiveness of the suppression of background noise,

Figs. 9 and 10 further block diagrams of devices for suppressing background noise,

Figs. 11 and 12 block diagrams of circuits for detecting speech signals,

Figs. 13 and 14 diagrams explaining the effect of circuits for detecting speech signals,

Fig. 15 a three-dimensional diagram in axonometric view, showing a characteristic map of the probability density of instant amplitude and instant frequency for speech,

Fig. 16 the characteristic map of Fig. 15 as a contour plot,

Fig. 17 a diagram analogous to Fig. 15 for interference signals, and

Fig. 18 the characteristic map of Fig. 17 as a contour plot.

Fig. 1 shows a general circuit in which an instant amplitude signal IA, an instant phase signal IFI and an instant angular frequency signal IW are obtained from an input signal. In a first block 1, the input signal S_in is converted into an analytical signal consisting of a real part Re and an imaginary part Im. Since the real part and the imaginary part have a constant phase difference of π/2, the imaginary part Im represents the Hilbert transform of the real part Re. The signals Re and Im are therefore also referred to as Hilbert signals. Ways of obtaining the analytical signal are described in EP 542 710 A. Essentially, the input signal S_in can be subjected to a Hilbert transform in order to obtain, for example, the imaginary part Im. Since the Hilbert transform involves a delay, the input signal S_in must likewise be delayed in order to obtain the real part Re. An alternative is to convert the input signal S_in into two Hilbert signals by means of two different all-pass filters. A further way of obtaining the analytical signal is to obtain a complex spectrum of the input signal S_in by a Fourier transform, shift all lines by π/2 and return the signal to the time domain by inverse transformation. With digital signal processing, obtaining such an analytical signal in a suitable way poses no problem. In blocks 2 and 3, the instant amplitude signal IA and the instant phase signal IFI, respectively, are obtained according to formulas (1) and (2) above. The instant angular frequency signal IW can be formed in block 4 by differentiating the instant phase signal IFI with respect to time. It should be noted that, apart from the delay caused by the Hilbert transform, the signals IA, IFI and IW are real-time parameters that are free of averaging or delays. The instant amplitude IA is always non-negative, whereas the instant angular frequency signal IW is not necessarily positive. Since the instant phase signal essentially defines an angle, it can be restricted by so-called wrapping to a range between 0 and 2π or between -π and π.
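The Fourier-transform route described for Fig. 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the circuit of the patent: formulas (1) and (2) are not reproduced in this section and are assumed here to be the magnitude and the angle of the analytical signal; the function names `analytic_signal` and `instant_parameters`, the plain O(N²) transform and the `sample_rate` parameter are hypothetical. Instead of shifting every spectral line by π/2 and using the result as imaginary part, the sketch uses the equivalent shortcut of doubling the positive-frequency lines and zeroing the negative-frequency lines.

```python
import cmath
import math


def analytic_signal(x):
    """Analytical signal of a real sequence via a plain O(N^2) DFT (assumed method).

    Doubling positive-frequency bins and zeroing negative-frequency bins is
    equivalent to forming Re + j*Im with Im the Hilbert transform of Re.
    """
    n = len(x)
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
        for f in range(n)
    ]
    for f in range(n):
        if 0 < f < n / 2:
            spectrum[f] *= 2      # positive frequencies
        elif f > n / 2:
            spectrum[f] = 0       # negative frequencies
        # DC (f == 0) and, for even n, the Nyquist bin stay unchanged
    return [
        sum(spectrum[f] * cmath.exp(2j * math.pi * f * t / n) for f in range(n)) / n
        for t in range(n)
    ]


def instant_parameters(x, sample_rate):
    """Return (IA, IFI, IW) for the real input sequence x."""
    z = analytic_signal(x)
    ia = [abs(v) for v in z]              # instant amplitude, always >= 0
    ifi = [cmath.phase(v) for v in z]     # instant phase, wrapped to [-pi, pi]
    iw = []
    for prev, cur in zip(ifi, ifi[1:]):
        # Phase step with the 2*pi jumps of the wrapped phase removed.
        step = (cur - prev + math.pi) % (2 * math.pi) - math.pi
        iw.append(step * sample_rate)     # instant angular frequency in rad/s
    return ia, ifi, iw
```

For a pure cosine of 8 Hz sampled at 64 Hz over 64 samples, this sketch yields an IA close to 1 and an IW close to 2π·8 rad/s for every sample.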

In Fig. 2, the individual processing steps of Fig. 1 are combined into a single block 5 in order to simplify the illustrations that follow.

Low-amplitude interference can be suppressed in general form by a circuit according to Fig. 3. The instant amplitude signal IA from block 5 is subjected to a non-linear modification in a look-up table, producing a modified instant amplitude signal IA_mod. Instead of a look-up table, a function given in closed form or the like can also be used to generate IA_mod. In a combining circuit 7, the output signal S_out is calculated from the modified instant amplitude signal IA_mod and the instant phase signal IFI according to formula (3) described above.

Figs. 4a, 4b and 4c show three different variants of how the instant amplitude signal IA can be converted by non-linear transformation into the modified instant amplitude signal IA_mod.

In all three diagrams, IA_mod is directly proportional to IA above a predetermined limit value IA_lim of the instant amplitude signal IA. Below this limit value IA_lim, IA_mod is smaller than proportionality would give. In the variant of Fig. 4a, the relationship between IA_mod and IA is given by straight curve sections 101, 102, 103, where curve section 101 represents the least attenuation, whereas curve section 103 means that IA_mod is set to zero for values below IA_lim. In the variant of Fig. 4b, there is a transition region immediately below IA_lim, followed by curve sections 104, 105, 106 that are parallel to the proportional region 100. In the variant of Fig. 4c, the proportional region 100 continues below IA_lim in curves 107, 108 that have a greater slope. The representations in Figs. 4a, 4b, 4c are schematic, and the curves shown can also be applied to a logarithmic representation of IA_mod or IA in order to arrive at a conventional dB scale.
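As an illustration of a closed-form alternative to the look-up table, the following sketch implements one straight-line characteristic in the spirit of Fig. 4a. The exact course of curve sections 101 to 103 is not specified in the text, so the parameter `knee` (the amplitude at and below which IA_mod becomes zero) and both function names are assumptions made for this example. The recombination assumes that formula (3), which is not reproduced in this section, has the same form as formulas (3a) and (3b) further below, i.e. S_out = IA_mod · cos(IFI).

```python
import math


def modify_amplitude(ia, ia_lim, knee):
    """Hypothetical straight-line characteristic for IA -> IA_mod.

    ia_lim: limit value above which IA_mod equals IA (proportional region).
    knee:   amplitude at and below which IA_mod is zero, 0 <= knee <= ia_lim.
            knee = 0 gives no attenuation, knee = ia_lim gives a hard gate.
    """
    if ia >= ia_lim:
        return ia
    if ia <= knee:
        return 0.0
    # Straight segment from (knee, 0) up to (ia_lim, ia_lim).
    return ia_lim * (ia - knee) / (ia_lim - knee)


def resynthesize(ia_mod, ifi):
    """Combine modified amplitude and instant phase, assuming S = IA_mod * cos(IFI)."""
    return [a * math.cos(p) for a, p in zip(ia_mod, ifi)]
```

With `ia_lim = 1.0` and `knee = 0.5`, an amplitude of 0.75 is reduced to 0.5, amplitudes up to 0.5 are removed entirely, and amplitudes of 1.0 or more pass unchanged.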

Fig. 5 shows an extended variant based on the solution of Fig. 3, in which the non-linear processing of the instant amplitude signal IA is varied as a function of an evaluation of the input signal S_in in block 8. The result of the evaluation block 8 is fed to the look-up table 9 as a control signal. As before, the output signal S_out is formed in block 7 from IA_mod and IFI.

Fig. 6 shows a three-channel solution in which the input signal S_in is divided by a high-pass filter 10, a band-pass filter 11 and a low-pass filter 12 into three different frequency bands, which are processed further in separate channels. Alternatively, three or more band-pass filters can be used to provide any number of channels. Signal processing circuits 13, 14, 15, each corresponding to the variant of Fig. 3 or of Fig. 5, are assigned to the respective channels. Amplifiers 16, 17, 18 amplify the signals of each channel, and an adder 19 sums the signals of the individual channels into an output signal S_out. The improvement of the circuit of Fig. 6 over the circuits described above is that selective control takes place in individual frequency ranges instead of broadband, frequency-independent control. In this way, the so-called breathing of the control, which is sometimes disturbing in practical operation, can be suppressed. In hearing aids, this also allows better adaptation to the specific hearing loss.

The circuit of Fig. 7 is designed specifically to suppress narrow-band interference. In blocks 20 and 21, the instant amplitude signal IA and the instant phase signal IFI, respectively, are first subjected to time averaging in the frequency domain and then integrated. With digital signal processing, this can be done simply by averaging the signals IA and IFI at the current time t and at the k immediately preceding times t-1, t-2, ..., t-k in order to obtain the integrated instant amplitude signal IA_int and the integrated instant phase signal IFI_int. In block 7a, these signals are combined according to the following formula (3a) into an integrated output signal S_int.

S_int = IA_int · cos(IFI_int) (3a)

In parallel, the instant amplitude signal IA and the instant phase signal IFI are delayed in blocks 22 and 23, respectively, by a period corresponding to the delay caused by the averaging in blocks 20 and 21. For the averaging described above, this delay is k/2. This yields the delayed instant amplitude signal IA_del and the delayed instant phase signal IFI_del.

In block 7b, these signals are combined according to the following formula (3b) into a delayed output signal S_del.

S_del = IA_del · cos(IFI_del) (3b)

In a subtraction element 24, the integrated output signal S_int is subtracted in whole or in part from the delayed output signal S_del in order to obtain the output signal S_out, which thus represents the dynamic component of the input signal S_in.
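The signal path of Fig. 7 can be sketched as follows. The helper names, the start-up handling of the averaging window and of the delay line, and the `fraction` parameter for partial subtraction are assumptions made for illustration. The phase is assumed to be supplied unwrapped, since averaging a wrapped phase would smear its 2π jumps, and k is assumed to be even so that the delay k/2 is a whole number of samples.

```python
import math


def moving_average(x, k):
    """Mean of the current sample and the k preceding ones (shorter at start-up)."""
    out = []
    for t in range(len(x)):
        window = x[max(0, t - k): t + 1]
        out.append(sum(window) / len(window))
    return out


def delay(x, d):
    """Delay by d samples, holding the first value during start-up."""
    return [x[max(0, t - d)] for t in range(len(x))]


def suppress_static(ia, ifi, k, fraction=1.0):
    """Return (S_del, S_int, S_out) for amplitude ia and unwrapped phase ifi."""
    ia_int = moving_average(ia, k)       # block 20
    ifi_int = moving_average(ifi, k)     # block 21
    ia_del = delay(ia, k // 2)           # block 22
    ifi_del = delay(ifi, k // 2)         # block 23
    s_int = [a * math.cos(p) for a, p in zip(ia_int, ifi_int)]   # formula (3a)
    s_del = [a * math.cos(p) for a, p in zip(ia_del, ifi_del)]   # formula (3b)
    # Subtraction element 24: subtract S_int in whole (1.0) or in part (< 1.0).
    s_out = [d - fraction * i for d, i in zip(s_del, s_int)]
    return s_del, s_int, s_out
```

For a tone with constant amplitude and steadily advancing phase, S_int equals S_del once the averaging window is filled, so S_out vanishes; only components that change within the window would remain.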

The circuit of Fig. 7 makes it possible, with very good results, to filter out relatively constant narrow-band interference, i.e. interference signals whose frequency and amplitude change only slowly. The interference signal may well be considerably larger than the useful signal.

It has been found that the length k of the time window over which the averaging takes place is critical for the quality of the signal processing. Digital signal processing according to the algorithm described above therefore has significant advantages over an analog circuit with filters, since the value of k can easily be adapted to the prevailing conditions.

Fig. 8 demonstrates the effectiveness of the circuit of Fig. 7. The diagrams of Fig. 8 each show signals over a period of 1.5 seconds. The upper diagram shows an input signal S_in composed of a useful signal, namely music, with an amplitude of about 0.1, and an interference signal, a swept tone, with an amplitude of 5. The interference is therefore about 34 dB larger than the useful signal. The middle diagram shows the integrated output signal S_int, which represents the isolated interference. The lower diagram shows the output signal S_out, which is formed from the difference between the delayed output signal S_del and the integrated output signal S_int.
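The stated level difference follows directly from the two amplitudes given for Fig. 8. As a worked check, using the usual amplitude-ratio definition of the decibel:

```latex
20 \log_{10}\!\left(\frac{5}{0.1}\right) = 20 \log_{10}(50) \approx 34\ \mathrm{dB}
```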

Fig. 9 shows a three-channel circuit that has two significant advantages over that of Fig. 7. First, as in Fig. 6, the input signal S_in is divided by a high-pass filter 10, a band-pass filter 11 and a low-pass filter 12 into three different frequency bands, which are processed further in separate channels. Blocks 25, 26, 27 in Fig. 9 each correspond to a circuit of Fig. 7 without the subtraction element 24. The delayed output signal S_del and the integrated output signal S_int for the respective frequency range are present at the outputs of these blocks 25, 26, 27. In this way, up to three mutually independent quasi-static interference signals can be filtered out optimally. The number of channels can obviously be chosen freely according to need and the available computing power.

A further difference between the variant of Fig. 9 and the solution described above is that the signals S_del and S_int are analyzed for each frequency range in evaluation elements 28, 29, 30. It has been found that the circuit of Fig. 7 gives very good results when considerable interference is actually present. With undisturbed input signals, however, a static component is likewise filtered out, which leads to unwanted distortion. The evaluation elements 28, 29, 30 attempt to determine the extent of the interference in order to avoid overcorrection. In the simplest case, essentially the ratio of the delayed output signal S_del to the integrated output signal S_int is determined for each frequency range. This can be done, for example, according to the following formula (6):

n = f(MAS_int/(MAS_del - MAS_int)) (6)

Here MAS_int, the averaged absolute value of S_int, represents the static component, and (MAS_del - MAS_int) the dynamic component in the respective channel, where MAS_del denotes the averaged absolute value of S_del. The evaluation function f, which is implemented by a look-up table, yields an evaluation index n between 0 and 1. The larger the static component MAS_int compared with the dynamic component (MAS_del - MAS_int), the closer n is to 1. Conversely, n is set to zero for values of the ratio MAS_int/(MAS_del - MAS_int) below a predetermined limit value. The evaluation function f can be optimized empirically.

With the evaluation index n determined in this way, which is denoted n_a for the first channel, an output signal can be calculated for each channel in the subtraction elements 31, 32, 33. Using the first channel as an example, the formula is:

S_out1 = S_del - n_a · S_int (7)

Formula (7) shows that the correction becomes larger as the static component becomes larger. In this way, distortion can be minimized and the creation of artifacts avoided.
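Formulas (6) and (7) can be sketched for one channel as follows. The text only states that f is an empirically optimized look-up table giving values between 0 and 1 and zero below a limit value; the linear ramp used here, the parameters `ratio_limit` and `ratio_full`, the handling of a non-positive dynamic component and all function names are therefore hypothetical stand-ins, not values from the patent.

```python
def mean_abs(x):
    """Averaged absolute value (MAS) of a block of samples."""
    return sum(abs(v) for v in x) / len(x)


def evaluation_index(s_del, s_int, ratio_limit=0.5, ratio_full=1.5):
    """Evaluation index n per formula (6), with a hypothetical ramp as f."""
    mas_del = mean_abs(s_del)
    mas_int = mean_abs(s_int)
    dynamic = mas_del - mas_int          # dynamic component
    if dynamic <= 0:
        return 1.0                       # assumption: treat as fully static
    ratio = mas_int / dynamic            # static versus dynamic component
    if ratio < ratio_limit:
        return 0.0                       # below the limit value: no correction
    return min(1.0, (ratio - ratio_limit) / (ratio_full - ratio_limit))


def correct_channel(s_del, s_int, n):
    """Channel output per formula (7): S_out = S_del - n * S_int."""
    return [d - n * i for d, i in zip(s_del, s_int)]
```

A block with a small static component therefore passes unchanged (n = 0), while a block dominated by the static component has S_int subtracted in full (n = 1).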

As in the circuit of Fig. 6, an adder 19 sums the signals of the individual channels into an output signal S_out.

Fig. 10 shows a circuit that largely corresponds to that of Fig. 9, so only the differences are explained. In this embodiment, the delayed output signals S_del and the integrated output signals S_int of all channels are summed in the adders 19a and 19b. This yields one delayed output signal S_del and one integrated output signal S_int for all channels. The calculation according to formula (6) is carried out in the evaluation element 28 as described above, and the output signal S_out is determined according to formula (7) in the subtraction element 31.

Such a circuit has a simpler structure and requires less computing power than that of Fig. 9, although the distinction between useful signal and interference is not as precise.

Alternatively or additionally, a method for recognizing speech can be carried out in the evaluation elements 28, 29, 30, as described below.

Fig. 11 shows, in general form, a circuit for detecting interference signals. In block 40, the amplitude is normalized, and the time constants used should be relatively long. The normalization compensates for the fact that the level of the input signals can vary greatly depending on the distance from the speaker, which would disturb the correlation described below. Because of the relatively long time constant, however, the dynamics of the speech are preserved, and interference in short pauses in speech is not overemphasized.

Block 5 supplies an instant amplitude signal IA, an instant phase signal IFI and an instant angular frequency signal IW, as described for Figs. 1 and 2. In block 41, the instant frequency signal IFR is calculated from the instant angular frequency signal IW by division by 2π. From this and the instant amplitude signal IA, an evaluation index is calculated by means of an evaluation function. The recognition of speech as a useful signal, and to a lesser extent of music, is based on the fact that a correlation between the instant amplitude signal IA and the instant frequency signal IFR can be observed when harmonic components are present. Surprisingly, such a correlation is sometimes also present to a lesser extent in non-harmonic signals, but never in noise.

In the simplest case, the ratio of the instant amplitude signal IA to the instant frequency signal IFR is used as the variable for the evaluation function. However, a non-linear transformation can also be carried out before the ratio is formed, in order to obtain a modified instant amplitude signal IA_mod and a modified instant frequency signal IFR_mod:

IA_mod = f(IA) (8)

IFR_mod = g(IFR) (9)

The functions f and g are non-linear, preferably monotonic functions such as ln(x) or x³, but are generally implemented by look-up tables that are determined empirically. For this purpose, signals containing interference of different types and different proportions of speech are produced and analyzed according to Fig. 11. By varying f and g, the correlation of the evaluation index n with the actual proportion of speech can be optimized. The input variable of the evaluation function in this case is IA_mod/IFR_mod.
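How the input variable IA_mod/IFR_mod could be formed per sample is sketched below. The choice f = ln(x) and g = x³ simply picks the two example functions named in the text; the function name `ratio_variable` and the handling of undefined samples (returning `None`) are assumptions. The empirically determined evaluation function that maps this variable to the index n is not specified in the text and is therefore not shown.

```python
import math


def ratio_variable(ia, iw, f=math.log, g=lambda x: x ** 3):
    """Per-sample input variable IA_mod / IFR_mod of the evaluation function."""
    out = []
    for a, w in zip(ia, iw):
        ifr = w / (2 * math.pi)          # block 41: IFR = IW / (2*pi)
        try:
            out.append(f(a) / g(ifr))    # formulas (8) and (9), then the ratio
        except (ValueError, ZeroDivisionError):
            out.append(None)             # undefined, e.g. ln(0) or IFR_mod == 0
    return out
```

Because f and g are passed in as callables, empirically determined look-up tables could be substituted for the two example functions without changing the rest of the sketch.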

Improved recognition is made possible by the circuit of Fig. 12, which takes into account not only the correlation between the instant amplitude signal IA and the instant frequency signal IFR, or a similar quantity, in order to obtain a first partial evaluation index n₁, but also the correlation between the time derivatives IA_diff and IFR_diff of these quantities, which are formed in the differentiators 4a, 4b.

In blocks 43 and 44, a non-linear transformation is carried out as described above in order to obtain a modified instant amplitude signal IA_mod and a modified instant frequency signal IFR_mod. Not shown, but possible, is a non-linear transformation of the time derivatives IA_diff and IFR_diff into modified derivatives IA_diffmod and IFR_diffmod.

A first partial evaluation index is obtained by multiplying the modified instant amplitude signal IA_mod and the modified instant frequency signal IFR_mod with each other in block 41a, which is essentially a multiplier. If the non-linear transformations in blocks 43 and 44 are carried out such that IA_mod and IFR_mod fluctuate around zero, the product is large when the correlation is high and small otherwise.

In block 41b, a second partial evaluation function is calculated in an analogous manner from the derivatives in order to obtain a second partial evaluation index n₂.

Block 41 operates as described above. In blocks 45 and 46, time averaging is carried out in order to compensate for slight phase differences between the signals that impair the correlation. This essentially amounts to a cross-correlation calculation.

In an adder 47, the outputs of blocks 45 and 46 are added, optionally with weighting, in order to obtain the final evaluation index n. An advantage of this variant is that blocks 41, 43 and 44 on the one hand and 42 on the other can be optimized separately from one another, which makes it easier to determine the functions and coefficients.
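The structure of Fig. 12 (multiply, average over time, add with weights) can be sketched as follows. Several points are assumptions made for illustration: subtracting the mean stands in for the non-linear transformations of blocks 43 and 44, so that the modified signals fluctuate around zero; the derivatives are simple first differences; the moving-average window k and the weights `w1` and `w2` are hypothetical parameters.

```python
def center(x):
    """Stand-in for blocks 43/44: make the signal fluctuate around zero."""
    mean = sum(x) / len(x)
    return [v - mean for v in x]


def differentiate(x):
    """First difference as a simple time derivative (differentiators 4a, 4b)."""
    return [0.0] + [b - a for a, b in zip(x, x[1:])]


def moving_average(x, k):
    """Time averaging over the current and the k preceding samples."""
    out = []
    for t in range(len(x)):
        window = x[max(0, t - k): t + 1]
        out.append(sum(window) / len(window))
    return out


def evaluation_index(ia, ifr, k, w1=1.0, w2=1.0):
    """Weighted sum of the two averaged products (partial indices n1 and n2)."""
    ia_mod, ifr_mod = center(ia), center(ifr)
    ia_diff, ifr_diff = differentiate(ia), differentiate(ifr)
    # Multiplier followed by time averaging approximates a cross-correlation.
    n1 = moving_average([a * b for a, b in zip(ia_mod, ifr_mod)], k)
    n2 = moving_average([a * b for a, b in zip(ia_diff, ifr_diff)], k)
    return [w1 * p + w2 * q for p, q in zip(n1, n2)]      # adder 47
```

When IA and IFR rise and fall together, both averaged products are positive and the index is large; when they move in opposite directions, the index becomes negative.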

Fig. 13 plots IA and IFR for a signal consisting primarily of speech. The instant amplitude signal IA is shown, strongly amplified, as a light curve in the upper region. Below it, the instant frequency signal IFR is plotted as a dark curve. The correlation between these signals is evident.

In Fig. 14, a speech signal, shown in the middle section, is superimposed with strong interference; the sum signal is shown as a light curve at the bottom. The evaluation index n, determined according to the circuit of Fig. 11, is plotted in the upper region.

It is immediately apparent that comparing the evaluation index n with the threshold value 3 shows almost with certainty whether speech is present or whether a speech signal is absent. In this way, the gain and the signal processing can be optimized. In the simplest case, the gain is reduced by a predetermined amount when no speech signal is present.

Fig. 15 shows, in a three-dimensional representation, the probability density of instant amplitude and instant angular frequency for a speech signal. The instant angular frequency takes both positive and negative values. The range between -200π and 200π is excluded from the representation because it is insignificant for the analysis, although signals in this frequency range can carry high energy, particularly for plosives, which may disturb the calculations.

Fig. 15 shows a two-hump structure that is typical of speech, that is, the characteristic map has two local maxima 50 and 51. This can also be seen in the contour-line representation of Fig. 16.

Figs. 17 and 18 correspond to Figs. 15 and 16, but for an interference signal without a speech component. Here only one hump is present in the positive range of the instant angular frequency.
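A map of the kind shown in Figs. 15 to 18 could be estimated from sample pairs of instant amplitude and instant angular frequency roughly as follows. This is a sketch under assumptions: the function names and bin edges are hypothetical, the result is a table of relative frequencies per cell (a density estimate only up to the bin area), and the band |IW| < 200π is skipped as in the figures.

```python
import bisect
import math


def find_bin(value, edges):
    """Index of the half-open bin [edges[i], edges[i+1]) holding value, or None."""
    i = bisect.bisect_right(edges, value) - 1
    return i if 0 <= i < len(edges) - 1 else None


def density_map(ia, iw, ia_edges, iw_edges, exclude=200 * math.pi):
    """Relative-frequency map of (IA, IW) pairs, skipping |IW| < exclude."""
    counts = [[0] * (len(iw_edges) - 1) for _ in range(len(ia_edges) - 1)]
    kept = 0
    for a, w in zip(ia, iw):
        if abs(w) < exclude:
            continue  # band excluded from the analysis
        i, j = find_bin(a, ia_edges), find_bin(w, iw_edges)
        if i is None or j is None:
            continue  # sample falls outside the map
        counts[i][j] += 1
        kept += 1
    if kept == 0:
        return [[0.0] * len(row) for row in counts]
    return [[c / kept for c in row] for row in counts]


# Made-up samples: the last one lies inside the excluded band.
m = density_map([0.1, 0.1, 0.9, 0.5], [1000.0, 1000.0, -1000.0, 100.0],
                ia_edges=[0.0, 0.5, 1.0], iw_edges=[-2000.0, 0.0, 2000.0])
```

With real data, a speech signal would be expected to produce two separate peaks in such a map, and an interference signal only one in the positive frequency range.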

Claims

1. Method for recognizing the presence of speech signals with the following steps:

- Obtaining an analytical signal from an input signal (S_in);
- Calculating an instant amplitude signal (IA) from the analytical signal;
- Calculating an instant phase signal (IFI) from the analytical signal;
- Calculating an instant frequency signal (IFR) from the time derivative of the instant phase signal (IFI);
- Calculating an evaluation function from the instant amplitude signal (IA) and the instant frequency signal (IFR) in order to obtain an evaluation index (n) which enables a statement about the presence of speech signals.
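The first four steps of claim 1 can be illustrated with a small pure-Python sketch. It is not taken from the patent: the analytical signal is obtained here with a naive discrete Fourier transform (one common way to realize a Hilbert transformer), the instant frequency is returned as an angular frequency in rad/s, and all function names and the test tone are assumptions.

```python
import cmath
import math


def analytic_signal(x):
    """Analytical signal of a real sequence via a naive O(N^2) DFT."""
    n = len(x)
    spec = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]
    for k in range(1, n):
        if n % 2 == 0 and k == n // 2:
            continue          # keep the Nyquist bin unchanged
        if k < (n + 1) // 2:
            spec[k] *= 2      # double positive frequencies
        else:
            spec[k] = 0       # remove negative frequencies
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def instant_signals(x, fs):
    """Return (IA, unwrapped phase IFI, angular frequency IFR in rad/s)."""
    z = analytic_signal(x)
    ia = [abs(v) for v in z]
    ifi = [cmath.phase(z[0])]
    for v in z[1:]:
        d = cmath.phase(v) - ifi[-1]
        d -= 2 * math.pi * round(d / (2 * math.pi))  # unwrap the phase step
        ifi.append(ifi[-1] + d)
    ifr = [(b - a) * fs for a, b in zip(ifi, ifi[1:])]  # finite-difference derivative
    return ia, ifi, ifr


# Hypothetical test tone: 1 kHz cosine, 64 samples at 8 kHz (8 full cycles).
fs = 8000
x = [math.cos(2 * math.pi * 1000 * t / fs) for t in range(64)]
ia, ifi, ifr = instant_signals(x, fs)
print(round(ifr[0] / (2 * math.pi)))  # 1000
```

For this pure tone the instant amplitude stays at 1 and the instant frequency stays at 1 kHz. A real-time device would need a causal approximation of the Hilbert transform rather than a block DFT.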

2. The method according to claim 1, characterized in that the evaluation function is calculated from the ratio of the instant amplitude signal (IA) to the instant frequency signal (IFR).

3. The method according to claim 1, characterized in that the calculation of the evaluation function is carried out by the following steps:

- Nonlinear modification of the instant amplitude signal (IA) to a modified instant amplitude signal (IA_mod);
- Nonlinear modification of the instant frequency signal (IFR) to a modified instant frequency signal (IFR_mod);
- Calculating the evaluation function from the ratio of the modified instant amplitude signal (IA_mod) to the modified instant frequency signal (IFR_mod) in order to obtain an evaluation index (n).
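The steps of claim 3 might be sketched as below. The claim does not name the nonlinear functions, so the logarithmic compression used here is purely a placeholder; the function name and the small constant `eps`, which guards against division by zero, are assumptions as well.

```python
import math


def evaluation_index(ia, ifr, eps=1e-12):
    """Ratio of a nonlinearly modified IA to a nonlinearly modified IFR.

    log1p is a placeholder nonlinearity; the claim leaves the functions open.
    """
    ia_mod = math.log1p(abs(ia))    # modified instant amplitude (IA_mod)
    ifr_mod = math.log1p(abs(ifr))  # modified instant frequency (IFR_mod)
    return ia_mod / (ifr_mod + eps)


# With this placeholder, n rises with amplitude and falls with frequency.
print(evaluation_index(2.0, 100.0) > evaluation_index(1.0, 100.0))   # True
print(evaluation_index(1.0, 100.0) > evaluation_index(1.0, 1000.0))  # True
```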

4. The method according to any one of claims 1 to 3, characterized in that a first partial evaluation index (n₁) is formed from the instant amplitude signal (IA) and the instant frequency signal (IFR), that a second partial evaluation index (n₂) is formed from a time derivative of the instant amplitude signal (IA) and a time derivative of the instant frequency signal (IFR), and that the first and the second partial evaluation index (n₁, n₂) are combined essentially additively in order to calculate the evaluation index (n).

5. The method according to claim 4, characterized in that the first partial evaluation function is calculated by the following steps:

- Nonlinear modification of the instant amplitude signal (IA) to a modified instant amplitude signal (IA_mod);
- Nonlinear modification of the instant frequency signal (IFR) to a modified instant frequency signal (IFR_mod);
- Calculating the first partial evaluation function from the ratio of the modified instant amplitude signal (IA_mod) to the modified instant frequency signal (IFR_mod) in order to obtain a first partial evaluation index (n₁).

6. The method according to claim 4 or 5, characterized in that the calculation of the second partial evaluation function is carried out by the following steps:

- Nonlinear modification of a time derivative of the instant amplitude signal (IA) to a modified differentiated instant amplitude signal (IA_diffmod);
- Nonlinear modification of a time derivative of the instant frequency signal (IFR) to a modified differentiated instant frequency signal (IFR_diffmod);
- Calculating the second partial evaluation function from the ratio of the modified differentiated instant amplitude signal (IA_diffmod) to the modified differentiated instant frequency signal (IFR_diffmod) in order to obtain a second partial evaluation index (n₂).
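The derivative-based second partial index of claim 6 could be approximated on sampled signals as in the following sketch. The time derivative is replaced by a first-order finite difference, and the square-root nonlinearity, the function names and `eps` are placeholders that do not come from the claim.

```python
import math


def time_derivative(samples, fs):
    """First-order finite difference, scaled to units per second."""
    return [(b - a) * fs for a, b in zip(samples, samples[1:])]


def second_partial_index(ia, ifr, fs, eps=1e-12):
    """Per-sample ratio IA_diffmod / IFR_diffmod (placeholder nonlinearity)."""
    out = []
    for d_ia, d_ifr in zip(time_derivative(ia, fs), time_derivative(ifr, fs)):
        ia_diffmod = math.sqrt(abs(d_ia))    # hypothetical nonlinear modification
        ifr_diffmod = math.sqrt(abs(d_ifr))  # hypothetical nonlinear modification
        out.append(ia_diffmod / (ifr_diffmod + eps))
    return out


# Made-up sequences sampled at 1 Hz to keep the arithmetic easy to follow.
n2 = second_partial_index([0.0, 1.0, 5.0], [0.0, 4.0, 5.0], fs=1.0)
```

In this example the differences are [1, 4] for IA and [4, 1] for IFR, so the two resulting ratios are approximately 0.5 and 2.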

7. The method according to any one of claims 4 to 6, characterized in that the second partial evaluation index (n₂) is weighted more strongly than the first partial evaluation index (n₁) in the additive combination.

8. The method according to claim 1, characterized in that a two-dimensional characteristic map is created in advance, in which the probability density for a useful signal, preferably speech, is specified as a function of the instant amplitude signal (IA) and the instant frequency signal (IFR), and that the evaluation function is determined on the basis of this map.

9. The method according to claim 8, characterized in that two non-contiguous regions are defined in the characteristic map which are assigned to the useful signal, and that the remainder of the map is assigned to the interference signal.

10. The method according to any one of claims 1 to 9, characterized in that the input signal (S_in) is first subjected to a normalization of the amplitude.

11. Device for recognizing the presence of speech signals for carrying out a method according to one of claims 1 to 10, in which an instant amplitude signal (IA) and an instant frequency signal (IFR) are calculated from an input signal, characterized in that the device contains an evaluation element which calculates, from the instant amplitude signal (IA) and the instant frequency signal (IFR), an evaluation index (n) that indicates the presence of a speech signal.