EP0110467A1 - Arrangement for the detection of speech intervals - Google Patents
- Publication number: EP0110467A1
- Application number: EP83201638A
- Authority: EP (European Patent Office)
- Prior art keywords: short, term mean, value, speech, arrangement according
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L25/87—Detection of discrete points within a voice signal
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L2025/783—Detection of presence or absence of voice signals based on threshold decision
- G10L2025/786—Adaptive threshold
Definitions
- the invention relates to an arrangement for recognizing speech pauses in a speech signal, which can be superimposed by interference signals.
- Such arrangements are e.g. the prerequisite for the suppression of interference signals when calling from an acoustically disturbed environment.
- characteristic parameters of the interference signal are measured and used to filter out the interference as completely as possible from the signal to be transmitted using adaptive filters.
- this pause detection does not take into account that, for example, unvoiced sounds cause a drop in power in the speech signal, so that the speech sections concerned are wrongly regarded as speech pauses. In the known arrangement such wrong decisions occur more frequently the more strongly the speech signal is overlaid with interference signals.
- the arrangement is also intended to enable speech pause recognition even if the average noise level changes only slowly.
- the averager M generates a so-called short-term mean value at all clock instants T(n), spaced mT apart, from the magnitudes of m consecutive samples.
- the arithmetic mean of the magnitudes of the samples is used as the mean value, since determining it requires less hardware than, e.g., forming the quadratic mean.
- Each short-term mean value G(n) is approximately a measure of the average power of the disturbed speech signal over a period of approximately 100 ms. This figure and the sampling frequency also fix the number m of samples required to determine one of the short-term mean values G(n). If, for example, the disturbed speech signal is sampled at 10 kHz, m must be about 1000.
- Each of the quantities G (1), G (2) ... thus results from approximately a thousand consecutive samples.
- the unit GL of FIG. 1 smoothes the sequence of the short-term mean values G (n). More about the purpose and manner of smoothing is given below.
- an estimated value P (n) for the average noise power is determined from the short-term average values by the block PA in FIG. 1. More details about the estimate P (n) are also given below.
- a comparator V in FIG. 1 compares a threshold S, which depends on the estimated value P(n), with the smoothed short-term mean values GG(n). If the smoothed short-term mean value GG(n) is less than the threshold S, a signal is forwarded to a unit EN. If the unit EN has received such a signal at, for example, two successive clock instants T(n-1) and T(n), it in turn indicates the presence of a speech pause by a signal of its own at a terminal A.
- diagram a) of FIG. 2 shows a possible output signal AM of the averager M, i.e. a possible sequence of the short-term mean values G(1), G(2) ...
- the output signal AM is normalized so that its absolute maximum assumes the value 1.
- the amplitude thresholds drawn in are the estimated value P(n) (lower threshold, broken line) and the threshold S (upper threshold, solid line).
- Diagram b) schematically shows the associated speech signal S with its true pauses P. If pauses were determined from the signal falling below the upper amplitude threshold in diagram a) (this pause determination is shown in diagram c)), a large number of wrong decisions would result, as a comparison of diagrams b) and c) shows. Shifting the upper threshold downward would indeed mean that the power drops in diagram c) that are not due to speech pauses would no longer be indicated, but the information about the pause lengths would then be considerably falsified.
- a smoothing of the output signal AM is provided before the decision on a pause, either with a linear digital filter, which derives a value GG(n) of the smoothed signal from three successive short-term mean values G(n), G(n-1) and G(n-2), or with a median filter.
- FIG. 3 shows how the output signal of the mean value generator M looks after smoothing with a linear digital filter.
- diagram b) the true speech sections and the real pauses of the speech signal are in turn plotted
- diagram c) shows the speech sections and speech pauses, as they result in a manner analogous to diagram c) in FIG. 1. Due to the linear smoothing, the number of wrong decisions has decreased considerably, as the comparison of FIGS. 2 and 3 shows. Even with smoothing with a median filter, the number of incorrect decisions is reduced, as can be seen from diagram c) in FIG. 4.
- a further measure to avoid misinterpreting shorter power drops in the disturbed speech signal as pauses is, e.g., to regard a power drop as a speech pause only when the signal falls below the upper amplitude threshold in FIG. 2, 3 or 4 twice.
- the amplitude thresholds drawn in FIGS. 2, 3 and 4 are, as already indicated above, determined by the unit PA in FIG. 1; first, the estimated value P(n) of the noise power is determined for each instant T(n).
- This variable is intended to be an approximate measure of the average power of the interference signal, the averaging time being of the order of one second.
- the arrangement according to the invention still delivers good results even if the above-mentioned average power of the interference signal changes only slowly, i.e. if it can be regarded as stationary over time intervals of the order of one to two seconds.
- the estimated value P(n) is redetermined as a linear combination of the previous estimated value P(n-1) and the short-term mean value G(n), according to an equation that is not reproduced here.
- the value of the constant a appearing in this equation is between zero and one.
- In order to recognize the longer speech pauses, it is continuously checked whether the magnitude of the difference between two successive short-term mean values falls below a threshold D. If, for example, the inequality |G(n) - G(n-1)| < D is satisfied K times in succession, this is regarded as a longer speech pause and the new estimated value P(n) is determined according to the equation given above.
- the threshold D is selected proportional to the short-term mean value G(n), so that the same decisions result if, for example, the level of all signals were doubled.
- the constant c is to be chosen so that, if increased unimpeded, the estimated value reaches the modulation limit in one to two seconds.
- the already existing estimate P(n-1) is above the current short-term mean value G(n)
- the new estimate P(n) is lowered relative to the existing one, according to an equation (not reproduced here) that represents the new estimated value as a linear combination of the previous estimated value and the current short-term mean value G(n).
- the threshold S, which is used for the pause decision, is proportional to the estimated value P(n).
- the relationship S = 1.1 P(n) is typical of the relationship between the threshold S and the estimated value P(n).
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Noise Elimination (AREA)
- Analogue/Digital Conversion (AREA)
- Telephone Function (AREA)
Abstract
The arrangement described for detecting pauses in a speech signal enables pause detection even when a slowly varying interference signal is superimposed on the speech signal. For pause detection, clock-bound short-term mean values are continuously determined from the samples of the disturbed speech signal; they are a measure of the average power of sections of the disturbed speech signal approximately 100 ms long. The sequence of these short-term mean values is then smoothed by linear filtering or by a median filter. In parallel with the smoothing, an estimate of the interference signal power averaged over a few seconds is obtained from the sequence of short-term mean values. If the smoothed short-term mean value is smaller, once or several times, than a threshold proportional to this estimate, a speech pause is decided.
Description
The invention relates to an arrangement for detecting speech pauses in a speech signal on which interference signals may be superimposed.
Such arrangements are, for example, the prerequisite for suppressing interference signals when telephoning from an acoustically disturbed environment. During the speech pause, characteristic parameters of the interference signal are measured and used to filter the interference out of the signal as completely as possible with adaptive filters before transmission.
From DE-AS 24 55 47, column 10, an analog arrangement for detecting speech pauses is known that operates as follows: the speech signal is divided into sections of equal length, and for each section a voltage value proportional to the mean loudness of the section is obtained by rectification and averaging. Finally, a further voltage value proportional to the mean loudness of the conversation is determined by averaging over several speech sections. A comparison of the two mean values decides whether or not a section belongs to a speech pause.
Among other things, this pause detection does not take into account that, for example, unvoiced sounds cause a drop in power in the speech signal, so that the speech sections concerned are wrongly regarded as speech pauses. In the known arrangement such wrong decisions occur more frequently the more strongly the speech signal is overlaid with interference signals.
It is therefore an object of the invention to provide an arrangement for detecting the pauses in a disturbed speech signal in which wrong decisions of the kind explained above are avoided. The arrangement should moreover permit speech pause detection even when the average noise power changes only slowly.
This object is achieved by the features specified in the characterizing part of claim 1. Advantageous embodiments are given in the dependent claims.
The invention will be explained in more detail with reference to the figures.
The figures show:
- Figure 1: a block diagram of the arrangement according to the invention
- Figures 2, 3 and 4: diagrams explaining the operation of the arrangement according to the invention
In the block diagram of Fig. 1, an analog-to-digital converter A/D obtains samples x(k) at sampling instants kT from the disturbed speech signal applied to a terminal E, where k is a natural number and 1/T is the sampling frequency. The samples are passed on to an averager M.
At all clock instants T(n), which are spaced mT apart, the averager M generates a so-called short-term mean value from the magnitudes of m consecutive samples.
The arithmetic mean of the magnitudes of the samples is used as the mean value, since determining it requires less hardware than, for example, forming the quadratic mean. Each short-term mean value G(n) is approximately a measure of the average power of the disturbed speech signal over a period of about 100 ms. This figure and the sampling frequency also fix the number m of samples required to determine one of the short-term mean values G(n). If, for example, the disturbed speech signal is sampled at 10 kHz, m must be about 1000. Each of the quantities G(1), G(2), ... thus results from about a thousand consecutive samples.
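The averaging step can be sketched as follows. This is a minimal Python illustration rather than the hardware described in the patent: the function name `short_term_means`, its parameters, and the decision to drop a trailing partial block are assumptions. Each G(n) is formed as the arithmetic mean of the magnitudes of m consecutive samples, with m following from the sampling frequency and the roughly 100 ms window (10 kHz × 0.1 s = 1000).

```python
def short_term_means(samples, sample_rate_hz=10_000, window_s=0.1):
    """Return G(1), G(2), ...: mean magnitude of each block of m samples.

    Hypothetical helper; names and defaults are illustrative only.
    """
    # Number of samples per short-term mean, e.g. 10 kHz * 100 ms = 1000.
    m = round(sample_rate_hz * window_s)
    # Assumption: an incomplete block at the end of the input is ignored.
    n_blocks = len(samples) // m
    return [
        sum(abs(x) for x in samples[i * m:(i + 1) * m]) / m
        for i in range(n_blocks)
    ]
```

With the defaults, 2000 input samples yield two short-term mean values, one per 100 ms block.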
The unit GL of Fig. 1 smooths the sequence of short-term mean values G(n). The purpose and manner of the smoothing are described further below.
In parallel with the smoothing, the block PA of Fig. 1 determines from the short-term mean values an estimate P(n) of the average noise power, i.e. of the average power of the interference signal. More details about the estimate P(n) are also given below. A comparator V in Fig. 1 compares a threshold S, which depends on the estimate P(n), with the smoothed short-term mean values GG(n). If the smoothed short-term mean value GG(n) is smaller than the threshold S, a signal is forwarded to a unit EN. If the unit EN has received such a signal at, for example, two successive clock instants T(n-1) and T(n), it in turn indicates the presence of a speech pause by a signal of its own at a terminal A.
Diagram a) of Fig. 2 shows a possible output signal AM of the averager M, i.e. a possible sequence of the short-term mean values G(1), G(2), ... In diagram a) the output signal AM is normalized so that its absolute maximum assumes the value 1. The amplitude thresholds drawn in are the estimate P(n) (lower threshold, broken line) and the threshold S (upper threshold, solid line). Diagram b) schematically shows the associated speech signal S with its true pauses P. If pauses were determined from the signal falling below the upper amplitude threshold in diagram a) (this pause determination is shown in diagram c)), a large number of wrong decisions would result, as a comparison of diagrams b) and c) shows. Shifting the upper threshold downward would indeed mean that the power drops in diagram c) that are not due to speech pauses would no longer be indicated, but the information about the pause lengths would then be considerably falsified.
In the arrangement according to the invention, the output signal AM is therefore smoothed before the decision on a pause, either with a linear digital filter, which derives a value GG(n) of the smoothed signal from three successive short-term mean values G(n), G(n-1) and G(n-2), or with a median filter.
For linear filtering, a filter with the coefficients 1/4, 1/2 and 1/4 has proven favourable.
For median filtering, five successive short-term mean values G(n) ... G(n-4), for example, are sorted by magnitude and the middle value is then read out as the output value GG(n) of the filter. Diagram a) of Fig. 3 shows the output signal of the averager M after smoothing with a linear digital filter. Diagram b) again schematically shows the true speech sections and the true pauses of the speech signal, and diagram c) shows the speech sections and speech pauses as they result analogously to diagram c) in Fig. 1. Linear smoothing considerably reduces the number of wrong decisions, as a comparison of Fig. 2 and Fig. 3 shows. Smoothing with a median filter also reduces the number of wrong decisions, as can be seen from diagram c) of Fig. 4.
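Both smoothing variants can be sketched in a few lines of Python. The function names are hypothetical, and two details are assumptions not fixed by the text: the coefficient 1/2 is applied to the middle value G(n-1), and no output is produced until enough past values exist to fill the filter window.

```python
from statistics import median


def smooth_linear(g):
    """Linear smoothing: GG(n) = G(n)/4 + G(n-1)/2 + G(n-2)/4."""
    # Output starts at the third value, once G(n-2) is available (assumption).
    return [0.25 * g[n] + 0.5 * g[n - 1] + 0.25 * g[n - 2]
            for n in range(2, len(g))]


def smooth_median(g, width=5):
    """Median smoothing: GG(n) is the middle value of G(n) ... G(n-width+1)."""
    # Output starts once a full window of `width` values is available.
    return [median(g[n - width + 1:n + 1])
            for n in range(width - 1, len(g))]
```

A single short dip such as `[1.0, 1.0, 0.2, 1.0, 1.0]` is removed completely by the five-point median, whereas the linear filter only attenuates it.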
A further measure to avoid misinterpreting shorter power drops in the disturbed speech signal as pauses is, for example, to regard a power drop as a speech pause only when the signal falls below the upper amplitude threshold in Fig. 2, 3 or 4 twice.
The amplitude thresholds drawn in Figs. 2, 3 and 4 are, as already indicated above, determined by the unit PA in Fig. 1; first, the estimate P(n) of the noise power is determined for each instant T(n). This quantity is intended to be an approximate measure of the average power of the interference signal, with an averaging time of the order of one second.
Because the estimate P(n) of the noise power is brought up to date during longer speech pauses (their detection is discussed below), the arrangement according to the invention still delivers good results when the above-mentioned average power of the interference signal changes only slowly, i.e. when it can be regarded as stationary over time intervals of the order of one to two seconds.
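If the instant T(n) falls within a longer speech pause, the estimate P(n) is redetermined as a linear combination of the preceding estimate P(n-1) and the short-term mean value G(n), according to an equation that is not reproduced here. The value of the constant a appearing in this equation is between zero and one.
To recognize the longer speech pauses, it is continuously checked whether the magnitude of the difference between two successive short-term mean values falls below a threshold D. If, for example, the inequality |G(n) - G(n-1)| < D is satisfied K times in succession, this is regarded as a longer speech pause and the new estimate P(n) is determined according to the equation given above. The threshold D is chosen proportional to the short-term mean value G(n), so that the same decisions result if, for example, the level of all signals were doubled.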
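The long-pause test and the resulting update of the noise estimate can be sketched as below. Since the update equation is not reproduced in the text, the form P(n) = a·P(n-1) + (1-a)·G(n) is an assumption; it is merely one linear combination consistent with a constant a between zero and one. The function name, the proportionality factor `d_rel` for D, and the default values of `a` and `k` are likewise hypothetical.

```python
def estimate_noise_in_long_pauses(g, p0, a=0.9, k=3, d_rel=0.1):
    """Update the noise estimate only while a longer pause is detected.

    g      -- short-term mean values G(0), G(1), ...
    p0     -- initial noise estimate
    a      -- weight of the previous estimate (assumed update form)
    k      -- number of successive small differences that marks a long pause
    d_rel  -- D is taken as d_rel * G(n), i.e. proportional to G(n)
    Returns the estimates P(1), P(2), ...
    """
    p = p0
    run = 0  # successive instants with |G(n) - G(n-1)| < D
    estimates = []
    for n in range(1, len(g)):
        d = d_rel * g[n]
        run = run + 1 if abs(g[n] - g[n - 1]) < d else 0
        if run >= k:
            # Assumed form of the linear combination; not taken from the source.
            p = a * p + (1 - a) * g[n]
        estimates.append(p)
    return estimates
```

Because D scales with G(n), doubling every input value leaves the sequence of pause decisions unchanged, which is the property the description asks for.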
Another way of obtaining as good an estimate P(n) as possible for a slowly varying noise power is to increase the existing estimate P(n-1) by a fixed amount c at each clock instant T(n) if the estimate P(n-1) is smaller than the short-term mean value G(n). Thus, each time the inequality P(n-1) < G(n) is satisfied, P(n) = P(n-1) + c is set.
The constant c is to be chosen so that, if increased unimpeded, the estimate reaches the modulation limit in one to two seconds. If, on the other hand, the existing estimate P(n-1) lies above the current short-term mean value G(n), the new estimate P(n) is lowered relative to the existing one, according to an equation (not reproduced here) that represents the new estimate as a linear combination of the previous estimate and the current short-term mean value G(n).
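This alternative tracking rule can be sketched as follows. The upward step P(n) = P(n-1) + c is taken from the text; the downward step is not reproduced there, so the form a·P(n-1) + (1-a)·G(n) used here is an assumption, as are the function name and the default value of `a`. As a rough guide under a further assumption, with one clock instant every 100 ms and signals normalized to a full scale of 1, reaching the limit in one to two seconds corresponds to c between about 0.05 and 0.1.

```python
def track_noise(g, p0, c, a=0.5):
    """Track the noise estimate with a fixed upward step and a weighted decay.

    g   -- short-term mean values G(1), G(2), ...
    p0  -- initial estimate, playing the role of P(0)
    c   -- fixed increment applied while the estimate is below G(n)
    a   -- weight of the previous estimate in the assumed lowering rule
    """
    p = p0
    estimates = []
    for g_n in g:
        if p < g_n:
            # From the text: P(n) = P(n-1) + c whenever P(n-1) < G(n).
            p = p + c
        else:
            # Assumed linear combination; the source equation is not available.
            p = a * p + (1 - a) * g_n
        estimates.append(p)
    return estimates
```

The slow upward step keeps speech bursts from inflating the estimate, while the faster downward move lets it settle back onto the noise floor.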
The threshold S used for the pause decision is proportional to the estimate P(n). A typical relationship between the threshold S and the estimate P(n) is S = 1.1 P(n).
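The comparator V and the decision unit EN can be sketched together as below. The function name and parameters are hypothetical; the factor 1.1 is the typical value given above, and requiring two successive below-threshold instants follows the example given for the unit EN.

```python
def detect_pauses(gg, p, factor=1.1, required=2):
    """Return one flag per clock instant: True where a speech pause is signalled.

    gg        -- smoothed short-term mean values GG(n)
    p         -- noise estimates P(n), aligned with gg
    factor    -- S = factor * P(n)
    required  -- successive instants with GG(n) < S needed to signal a pause
    """
    flags = []
    run = 0  # successive instants at which the comparator reported GG(n) < S
    for gg_n, p_n in zip(gg, p):
        run = run + 1 if gg_n < factor * p_n else 0
        flags.append(run >= required)
    return flags
```

An isolated dip below the threshold is ignored; only the second and later instants of a sustained dip are flagged as a pause.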
Claims (8)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE3243231 | 1982-11-23 | ||
DE19823243231 DE3243231A1 (en) | 1982-11-23 | 1982-11-23 | METHOD FOR DETECTING VOICE BREAKS |
Publications (3)
Publication Number | Publication Date |
---|---|
EP0110467A1 | 1984-06-13 |
EP0110467B1 | 1987-08-12 |
EP0110467B2 | 1991-06-19 |
Family
ID=6178780
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP83201638A (Expired - Lifetime; EP0110467B2) | 1982-11-23 | 1983-11-17 | Arrangement for the detection of speech intervals |
Country Status (6)
Country | Link |
---|---|
US (1) | US4700394A (en) |
EP (1) | EP0110467B2 (en) |
JP (1) | JPS59105695A (en) |
AU (1) | AU561076B2 (en) |
CA (1) | CA1203627A (en) |
DE (2) | DE3243231A1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0154020A1 (en) * | 1983-12-19 | 1985-09-11 | CSELT Centro Studi e Laboratori Telecomunicazioni S.p.A. | Device for speaker verification |
EP0167364A1 (en) * | 1984-07-06 | 1986-01-08 | AT&T Corp. | Speech-silence detection with subband coding |
EP0669606A2 (en) * | 1994-02-23 | 1995-08-30 | Daimler-Benz Aktiengesellschaft | Method for noise reduction in disturbed voice channels |
DE10120231A1 (en) * | 2001-04-19 | 2002-10-24 | Deutsche Telekom Ag | Single-channel noise reduction of speech signals whose noise changes more slowly than speech signals, by estimating non-steady noise using power calculation and time-delay stages |
Families Citing this family (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU583871B2 (en) * | 1984-12-31 | 1989-05-11 | Itt Industries, Inc. | Apparatus and method for automatic speech recognition |
JPH0748695B2 (en) * | 1986-05-23 | 1995-05-24 | 株式会社日立製作所 | Speech coding system |
DE3626862A1 (en) * | 1986-08-08 | 1988-02-11 | Philips Patentverwaltung | MULTI-STAGE TRANSMITTER ANTENNA COUPLING DEVICE |
DE3739681A1 (en) * | 1987-11-24 | 1989-06-08 | Philips Patentverwaltung | METHOD FOR DETERMINING START AND END POINT ISOLATED SPOKEN WORDS IN A VOICE SIGNAL AND ARRANGEMENT FOR IMPLEMENTING THE METHOD |
FR2631147B1 (en) * | 1988-05-04 | 1991-02-08 | Thomson Csf | METHOD AND DEVICE FOR DETECTING VOICE SIGNALS |
JP2573352B2 (en) * | 1989-04-10 | 1997-01-22 | 富士通株式会社 | Voice detection device |
US5305422A (en) * | 1992-02-28 | 1994-04-19 | Panasonic Technologies, Inc. | Method for determining boundaries of isolated words within a speech signal |
DE4220524A1 (en) * | 1992-06-23 | 1992-10-22 | Matzner Rolf Dipl Ing | Separate estimation of power in two superimposed stochastic processes - by sampling and filtering to identify inputs for processing to identify separate signal and noise components |
US5459814A (en) * | 1993-03-26 | 1995-10-17 | Hughes Aircraft Company | Voice activity detector for speech signals in variable background noise |
DE19730518C1 (en) * | 1997-07-16 | 1999-02-11 | Siemens Ag | Speech pause recognition method |
GB0103242D0 (en) * | 2001-02-09 | 2001-03-28 | Radioscape Ltd | Method of analysing a compressed signal for the presence or absence of information content |
US7535859B2 (en) * | 2003-10-16 | 2009-05-19 | Nxp B.V. | Voice activity detection with adaptive noise floor tracking |
US8543061B2 (en) | 2011-05-03 | 2013-09-24 | Suhami Associates Ltd | Cellphone managed hearing eyeglasses |
CN104658546B (en) * | 2013-11-19 | 2019-02-01 | 腾讯科技(深圳)有限公司 | Recording treating method and apparatus |
RU2691603C1 (en) * | 2018-08-22 | 2019-06-14 | Акционерное общество "Концерн "Созвездие" | Method of separating speech and pauses by analyzing values of interference correlation function and signal and interference mixture |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR2316814A1 (en) * | 1975-07-03 | 1977-01-28 | Telettra Lab Telefon | METHOD AND DEVICE FOR DETECTION OF THE PRESENCE AND / OR ABSENCE OF A SPEECH SIGNAL IN TELEPHONE LINES |
FR2451680A1 (en) * | 1979-03-12 | 1980-10-10 | Soumagne Joel | SPEECH / SILENCE DISCRIMINATOR FOR SPEECH INTERPOLATION |
US4357491A (en) * | 1980-09-16 | 1982-11-02 | Northern Telecom Limited | Method of and apparatus for detecting speech in a voice channel signal |
DE3235279A1 (en) * | 1981-09-25 | 1983-04-21 | Nissan Motor Co., Ltd., Yokohama, Kanagawa | VOICE RECOGNITION DEVICE |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4052568A (en) * | 1976-04-23 | 1977-10-04 | Communications Satellite Corporation | Digital voice switch |
US4025721A (en) * | 1976-05-04 | 1977-05-24 | Biocommunications Research Corporation | Method of and means for adaptively filtering near-stationary noise from speech |
US4028496A (en) * | 1976-08-17 | 1977-06-07 | Bell Telephone Laboratories, Incorporated | Digital speech detector |
JPS56104399A (en) * | 1980-01-23 | 1981-08-20 | Hitachi Ltd | Voice interval detection system |
JPS56135898A (en) * | 1980-03-26 | 1981-10-23 | Sanyo Electric Co | Voice recognition device |
CA1147071A (en) * | 1980-09-09 | 1983-05-24 | Northern Telecom Limited | Method of and apparatus for detecting speech in a voice channel signal |
US4531228A (en) * | 1981-10-20 | 1985-07-23 | Nissan Motor Company, Limited | Speech recognition system for an automotive vehicle |
- 1982-11-23: DE, application DE19823243231, publication DE3243231A1, status: Granted (active)
- 1983-11-17: CA, application CA000441366A, publication CA1203627A, status: Expired (not active)
- 1983-11-17: DE, application DE8383201638T, publication DE3373037D1, status: Expired (not active)
- 1983-11-17: EP, application EP83201638A, publication EP0110467B2, status: Expired - Lifetime (not active)
- 1983-11-17: US, application US06/552,998, publication US4700394A, status: Expired - Fee Related (not active)
- 1983-11-21: AU, application AU21545/83A, publication AU561076B2, status: Ceased (not active)
- 1983-11-22: JP, application JP58220467A, publication JPS59105695A, status: Pending (active)
Non-Patent Citations (2)
Title |
---|
COMSAT TECHNICAL REVIEW, Vol. 6, No. 1, Spring 1976, pp. 159-178, Washington, USA * |
IEEE TRANSACTIONS ON COMMUNICATIONS, Vol. COM-30, No. 4, April 1982, pp. 739-750, New York, USA * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0154020A1 (en) * | 1983-12-19 | 1985-09-11 | CSELT Centro Studi e Laboratori Telecomunicazioni S.p.A. | Device for speaker verification |
EP0167364A1 (en) * | 1984-07-06 | 1986-01-08 | AT&T Corp. | Speech-silence detection with subband coding |
EP0669606A2 (en) * | 1994-02-23 | 1995-08-30 | Daimler-Benz Aktiengesellschaft | Method for noise reduction in disturbed voice channels |
EP0669606A3 (en) * | 1994-02-23 | 1995-10-25 | Daimler Benz Ag | Method for noise reduction in disturbed voice channels. |
DE10120231A1 (en) * | 2001-04-19 | 2002-10-24 | Deutsche Telekom Ag | Single-channel noise reduction of speech signals whose noise changes more slowly than speech signals, by estimating non-steady noise using power calculation and time-delay stages |
Also Published As
Publication number | Publication date |
---|---|
AU561076B2 (en) | 1987-04-30 |
JPS59105695A (en) | 1984-06-19 |
DE3243231A1 (en) | 1984-05-24 |
CA1203627A (en) | 1986-04-22 |
AU2154583A (en) | 1984-05-31 |
EP0110467B2 (en) | 1991-06-19 |
DE3243231C2 (en) | 1987-07-02 |
EP0110467B1 (en) | 1987-08-12 |
US4700394A (en) | 1987-10-13 |
DE3373037D1 (en) | 1987-09-17 |
Similar Documents
Publication | Title |
---|---|
EP0110467B2 (en) | Arrangement for the detection of speech intervals |
EP0111947A1 (en) | Arrangement for the detection of silence in speech signals |
DE19736669C1 (en) | Beat detection method for time discrete audio signal |
EP1009463B1 (en) | Apparatus for switching the inspiration or expiration phase during cpap therapy |
DE3101851A1 (en) | METHOD FOR SCANNING LANGUAGE |
EP0584388A1 (en) | Method of producing a signal corresponding to a patient's minute-volume |
EP0560047B1 (en) | Safety device for power-closable openings |
DE2949752A1 (en) | METHOD FOR DETECTING PULLOUTS IN AN ELECTROFILTER |
DE69725970T2 (en) | METHOD FOR MONITORING LEVEL SWITCHES BY ACOUSTIC ANALYSIS |
CH648129A5 (en) | METHOD AND DEVICE FOR MONITORING THE CHANGE OF A SIGNAL. |
WO2000010633A1 (en) | Method and device for switching the inspiration or expiration phase during cpap therapy |
EP0775348B1 (en) | Method of detecting signals by means of fuzzy-logic classification |
DE19848586C2 (en) | Detector and method for detecting tones or other periodic signals |
DE19930458C2 (en) | Ringing frequency determination apparatus and method |
DE19854341A1 (en) | Method and circuit arrangement for speech level measurement in a speech signal processing system |
EP0203029B1 (en) | Method of producing a tripping signal in dependence upon the amplitude and the duration of an overcurrent |
DE69725964T2 (en) | Cardiac reaction detection in pacemaker-wearing patients |
DE3305045C2 (en) | Arrangement for determining the basic speech frequency |
DE3335343A1 (en) | METHOD FOR EXCITING ANALYSIS FOR AUTOMATIC VOICE RECOGNITION |
DE4315677C2 (en) | Circuit arrangement for determining the basic frequency from a signal which does not have a band-limited signal and contains harmonics and interference signals, in particular for determining the basic voice frequency from the voice and speech signal |
DE3617949C2 (en) | |
EP0159650B2 (en) | Evaluation device for received signals in an audio frequency ripple control receiver |
EP0161423A1 (en) | Method for determining the boundaries of a signal mixed with background noise |
DE1276740B (en) | Procedures and arrangements for improving the speech quality of channel vocoders |
DE3831047C2 (en) | |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
AK | Designated contracting states | Designated state(s): BE DE FR GB IT SE |
17P | Request for examination filed | Effective date: 19840718 |
GRAA | (expected) grant | Free format text: ORIGINAL CODE: 0009210 |
AK | Designated contracting states | Kind code of ref document: B1; Designated state(s): BE DE FR GB IT SE |
REF | Corresponds to: | Ref document number: 3373037; Country of ref document: DE; Date of ref document: 19870917 |
ITF | It: translation for a ep patent filed | Owner name: ING. C. GREGORJ S.P.A. |
ET | Fr: translation filed | |
PLBI | Opposition filed | Free format text: ORIGINAL CODE: 0009260 |
26 | Opposition filed | Opponent name: SIEMENS AKTIENGESELLSCHAFT, BERLIN UND MUENCHEN; Effective date: 19880502 |
PLAB | Opposition data, opponent's data or that of the opponent's representative modified | Free format text: ORIGINAL CODE: 0009299OPPO |
R26 | Opposition filed (corrected) | Opponent name: SIEMENS AKTIENGESELLSCHAFT, BERLIN UND MUENCHEN; Effective date: 19880809 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: BE; Payment date: 19891114; Year of fee payment: 7 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: FR; Payment date: 19891121; Year of fee payment: 7 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: SE; Payment date: 19891128; Year of fee payment: 7 |
ITTA | It: last paid annual fee | |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: GB; Payment date: 19891130; Year of fee payment: 7 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: DE; Payment date: 19900125; Year of fee payment: 7 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: GB; Effective date: 19901117 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: SE; Effective date: 19901118 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: BE; Effective date: 19901130 |
PUAH | Patent maintained in amended form | Free format text: ORIGINAL CODE: 0009272 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: PATENT MAINTAINED AS AMENDED |
BERE | Be: lapsed | Owner name: N.V. PHILIPS' GLOEILAMPENFABRIEKEN; Effective date: 19901130 |
27A | Patent maintained in amended form | Effective date: 19910619 |
AK | Designated contracting states | Kind code of ref document: B2; Designated state(s): BE DE FR GB IT SE |
GBPC | Gb: european patent ceased through non-payment of renewal fee | |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: FR; Effective date: 19910731 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: DE; Effective date: 19910801 |
REG | Reference to a national code | Ref country code: FR; Ref legal event code: ST |
EN3 | Fr: translation not filed ** decision concerning opposition | |
EUG | Se: european patent has lapsed | Ref document number: 83201638.0; Effective date: 19910705 |