EP0047589B1 - Method and apparatus for detecting speech signals in a transmission channel - Google Patents

Method and apparatus for detecting speech signals in a transmission channel

Publication number
EP0047589B1
Authority
EP
European Patent Office
Prior art keywords
sample
magnitude
signal
speech
producing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
EP81303695A
Other languages
English (en)
French (fr)
Other versions
EP0047589A1 (de)
Inventor
Fouad Daaboul
Tiu Le Van
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nortel Networks Ltd
Original Assignee
Northern Telecom Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northern Telecom Ltd
Publication of EP0047589A1
Application granted
Publication of EP0047589B1
Expired

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals

Definitions

  • This invention relates to a method of, and a speech detector for, detecting the presence of speech signals in a sampled voice channel signal.
  • Speech detectors are used in a variety of speech transmission systems in which speech transmission paths are established in response to the detection of speech activity on a voice channel.
  • One such system is a digital speech interpolation (DSI) transmission system, such as the system described and claimed in E.P.A. 47588 corresponding to Canadian Patent Application No. 359,965 filed September 9, 1980, entitled "Mitigation of Noise Signal Contrast in a Digital Speech Interpolation Transmission System", which conveniently embodies the speech detector of this invention.
  • a speech detector should ideally be highly sensitive to the presence of speech signals while at the same time remaining insensitive to non-speech signals such as noise.
  • a difficulty arises in distinguishing, quickly and accurately, between speech signals, particularly at low levels, and noise.
  • the speech detector should be able to detect speech signals at low levels in order to avoid excessive clipping of speech signals at the start of speech utterances, but at the same time should not respond to noise alone, even at relatively high levels, because this would undesirably increase the activity of the DSI transmission.
  • Patent 4,057,690 issued November 8, 1977 discloses an arrangement in which segments of the envelope of a voice channel signal are compared with one another over different time domains in order to distinguish between speech signals and noise.
  • these arrangements do not fully satisfy the requirements, of a speech detector in a DSI transmission system, of distinguishing between low levels of speech and noise and avoiding clipping of the speech signals at the start of speech utterances, and accordingly a need still exists for an improved speech detection arrangement which satisfies these requirements.
  • an object of this invention is to provide an improved method of, and speech detector for, detecting the presence of speech signals in a sampled voice channel signal.
  • the speech detection is effected in two separate parts, associated with the production of the first and second signals respectively.
  • the first threshold is set to be above anticipated noise levels, so that the first signal state is produced only at relatively high levels of speech signals, which exceed the first threshold level and accordingly cannot be noise.
  • the second threshold level is adaptively adjusted to be a little above the level of noise on the relevant channel. When the sample signal magnitude rises above this second threshold level, the second signal state is produced immediately. If, as at the start of a speech utterance, the signal magnitude continues to increase in successive samples, the second signal state continues to be produced for these samples. If, on the other hand, the signal magnitude falls again, the second signal state is no longer produced and the second threshold level is adaptively adjusted.
  • this arrangement provides a rapid detection of speech signals at low levels at the start of speech utterances.
  • the method preferably includes the steps of:- in response to the first signal state, producing a fourth signal state for a first predetermined number of consecutive samples commencing with the current sample; and in response to the second signal state, producing a fifth signal state for a second number of consecutive samples commencing with the current sample; wherein the signal representing presence of speech is produced in the presence of either the fourth signal state or the fifth signal state.
  • the second number of consecutive samples is desirably varied in dependence upon the reliability with which the second signal state is produced for each sample, in order that a speech indication signal is not produced for a long hangover period in response to a spurious noise signal which has resulted in the production of the second signal state.
  • the method preferably also includes the step of determining said second number in dependence upon previous sample magnitudes, said second number being increased by a predetermined amount, up to a maximum number, for each sample in respect of which the second signal state is produced, and being decreased by a predetermined amount at least for each sample whose magnitude is not greater than the magnitude of the preceding sample.
  • hangover period which is associated with the production of the second signal state is gradually increased, up to a maximum period, as the reliability of speech signal detection increases due to successive increases in the signal level in successive samples.
  • the hangover period associated with the production of the first signal state need not be variable because this first signal state is only produced for relatively high signal levels for which the reliability of the speech signal indication is very high.
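The variable hangover count described above can be sketched as a small update rule. This is only an illustration under stated assumptions: the maximum of 32 and the decrement of 1 appear in the text and claims, but the increment `step` is a hypothetical value (the text says only "a predetermined amount"), and leaving the count unchanged for a rising sample that does not produce the second signal state is also an assumption.

```python
def update_hangover_count(count, slope_speech, rising, step=4, maximum=32):
    """Return the new slope-detector hangover count C for one sample.

    count        -- current value of C
    slope_speech -- True if the second signal state was produced for this sample
    rising       -- True if this sample's magnitude exceeds the preceding one
    step         -- hypothetical increment; not specified in the text
    maximum      -- upper limit of C (32 in the claims)
    """
    if slope_speech:
        # Detection is becoming more reliable: lengthen the hangover.
        return min(count + step, maximum)
    if not rising:
        # Non-rising sample: shorten the hangover, never below zero.
        return max(count - 1, 0)
    # Assumption: a rising sample without detection leaves C unchanged.
    return count
```

With these assumed values a single spurious detection yields a hangover of only 4 samples, whereas eight consecutive detections are needed to reach the maximum of 32.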
  • the method preferably further includes the steps of:- whenever the magnitude of a sample exceeds that of the preceding sample, and in respect of the preceding sample the fifth signal state was produced but the second signal state was not produced, producing the second signal state for the current sample if its magnitude does not exceed the second threshold level but exceeds a third threshold level; and setting the third threshold level equal to the magnitude of the preceding sample whenever the second signal state was produced for the preceding sample and the magnitude of the current sample is not greater than the magnitude of the preceding sample.
  • each signal sample is constituted by an average of a plurality of individual samples of the voice channel signal, the method of doing this comprising the step of producing each signal sample by removing d.c. offsets from and averaging a plurality of individual samples of the voice channel signal.
  • the averaging is particularly easy to achieve in a DSI transmission system of the type described in our co-pending Patent Application No. EP-A-47588, already referred to, in which updating of the speech decision for each channel takes place only once every superframe, each superframe comprising a plurality of frames each including a sample of each voice signal channel.
  • the invention also extends to a speech detector comprising one or more read-only memories programmed and arranged to carry out the method recited above.
  • the present invention still further provides a method of detecting the presence of speech in a sampled voice channel signal, characterised by the steps of:- setting a threshold (TL), to a level which is greater than and is dependent upon the magnitude (T) of the current sample, whenever the magnitude (T) of the current sample is not greater than that (TP) of the preceding sample; and providing an indication of the presence of speech whenever the magnitude (T) of the current sample is greater than that (TP) of the preceding sample and exceeds said threshold level (TL).
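The two steps of this method can be sketched in a few lines. This is a minimal illustration, not the ROM-based implementation of the patent: the function name is hypothetical, the margin of 5 is taken from the "T+5" example mentioned for Figure 3, and treating the very first sample as non-rising is an assumption made to initialise the threshold.

```python
def slope_indications(magnitudes, margin=5):
    """Return one speech indication (True/False) per sample magnitude T.

    TL is reset to T + margin on every non-rising sample; speech is
    indicated when T is greater than the preceding sample TP and
    also exceeds TL.
    """
    indications = []
    previous = None   # TP, magnitude of the preceding sample
    threshold = 0     # TL, adaptive threshold
    for t in magnitudes:
        if previous is None or t <= previous:
            # Non-rising sample (or first sample, by assumption): adapt TL.
            threshold = t + margin
            indications.append(False)
        else:
            # Rising sample: speech only if it also exceeds TL.
            indications.append(t > threshold)
        previous = t
    return indications


print(slope_indications([10, 9, 11, 12, 16, 20, 15]))
# [False, False, False, False, True, True, False]
```

In the example, small rises from 9 to 12 stay below TL = 14 and are ignored, while the continued rise to 16 and 20 is indicated as speech.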
  • the speech detector described below with reference to Figures 1 to 3 is intended for use in a DSI transmission system of the type described in our co-pending Patent Application No. EP-A-47588 already referred to, in which once in each superframe a speech decision is updated for each of a plurality of voice signal channels in respect of each of which there is an individual sample contained in each of a plurality of frames forming the superframe.
  • the speech detector includes two independent parts, which are referred to herein as the level detector 601 and the slope detector 602, whose outputs are combined in an OR gate 603 to produce for each channel a speech decision which is stored in a 48-channel decision store 604, to the output of which a speech decision output line 110 is connected.
  • Each of the detectors 601 and 602 is supplied with a 7-bit average T, produced by the circuit described below with reference to Figure 4, on lines 115, and is enabled in the fourteenth frame of each superframe to up-date the speech decision for each channel.
  • each of the detectors 601 and 602 comprises a read-only memory.
  • the speech detector is required to be able to detect speech signals at low levels in order to avoid excessive clipping of speech signals at the start of speech utterances, but at the same time is required not to respond to relatively high levels of noise alone because this would undesirably increase the activity of the DSI transmission.
  • the speech detector is designed to exploit differences in the characteristics of noise and speech signals, namely that (a) speech signals usually have a higher level than noise, and (b) whereas noise is continuous, speech signals occur in bursts with the signal level progressively increasing at the start of each burst. It is to this end that the speech detector comprises the two detectors 601 and 602.
  • Each of the detectors 601 and 602 classifies each channel as being in one of three states, namely speech, hangover, and silence.
  • states are denoted by the value of an index, M for the level detector and K for the slope detector, each index having the value 0 for silence, 1 for speech, and 2 for hangover.
  • the hangover state is a temporary state which a channel is deemed to be in immediately following the speech state, and is provided to avoid speech clipping after intersyllabic pauses in speech.
  • a channel which previously was declared as being in the speech state, but in respect of which speech is no longer detected is deemed to be in the hangover state and an initial hangover count is set. If speech is still not detected in successive superframes, then this hangover count is decremented until it reaches zero, when the channel is declared silent.
  • the initial hangover count is fixed in the level detector but is variable in the slope detector, as is further explained below.
  • the level detector 601 consists of three parts, namely a comparator 605, a hangover and control unit 606, and a decision store 607.
  • the comparator 605 compares the average T with a fixed threshold TF which is above the highest possible noise level. The result of this comparison is supplied to the unit 606.
  • the unit 606 determines the state of the channel in dependence upon this comparison and the channel's previous state as stored in the store 607, and stores the current state of the channel, and any hangover count which is applicable, in the store 607.
  • the unit 606 supplies a logic 1 on the output line 608 if the channel is determined as being in either the speech or the hangover state.
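The level detector's three-state behaviour can be sketched as a small state machine for a single channel. The class and attribute names are hypothetical; the state values follow the index M given above (0 silence, 1 speech, 2 hangover), the default hangover of 32 superframes is the figure stated later in the description, and the threshold passed in the usage lines is an arbitrary illustrative value.

```python
SILENCE, SPEECH, HANGOVER = 0, 1, 2  # values of the index M


class LevelDetector:
    """Single-channel sketch of comparator 605, unit 606 and store 607."""

    def __init__(self, fixed_threshold, hangover=32):
        self.fixed_threshold = fixed_threshold  # TF
        self.hangover = hangover                # initial hangover count
        self.state = SILENCE
        self.count = 0

    def update(self, t):
        """Process one average T; return True for speech or hangover."""
        if t > self.fixed_threshold:
            self.state = SPEECH
        elif self.state == SPEECH:
            # Speech no longer detected: enter hangover, set initial count.
            self.state = HANGOVER
            self.count = self.hangover
        elif self.state == HANGOVER:
            # Still no speech: count down, then declare the channel silent.
            self.count -= 1
            if self.count <= 0:
                self.state = SILENCE
        return self.state != SILENCE


detector = LevelDetector(fixed_threshold=50, hangover=3)
print([detector.update(t) for t in [60, 10, 10, 10, 10]])
# [True, True, True, True, False]
```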
  • the slope detector 602 consists of a delay unit 609, comparators 610, a hangover, control, and threshold generator unit 611, and a decision and threshold store 612.
  • the delay unit 609 provides a delay of 1 superframe for the average T to provide a previous average TP via lines 613 to the comparators 610.
  • the comparators 610 compare the current average T with the previous average TP, a threshold TL, and a threshold TH and supply the comparison results to the unit 611.
  • the thresholds TL and TH are variable thresholds which are stored for each individual channel in the store 612.
  • the unit 611 determines the state of the channel in dependence upon the comparison results and the channel's previous state as stored in the store 612, generates new thresholds TL and TH if necessary, and stores the current state of the channel, together with any new hangover count and thresholds TL and TH, in the store 612.
  • the unit 611 supplies a logic 1 on the output line 614 if the channel is determined as being in either the speech or the hangover state.
  • the threshold TL is set to T+D, i.e. T+5 in Figure 3.
  • the previous value of K is then interrogated in a block 716, and because in the case of each of these points the previous value of K is zero, C is set to zero in a block 717 and K remains unchanged.
  • K = 0 (silence). It can be seen that the threshold TL is adaptively adjusted during this period, so that this threshold is generally a little above the level of noise present on the particular channel.
  • the interrogation 710 has a positive result
  • the subsequent interrogation 711 has a negative result
  • the resultant interrogation 712 has a positive result because now T>TL, so that K is set to 1 (speech) in block 718 in Figure 2.
  • the interrogation 710 and the resultant interrogation 711 both have positive results.
  • the thresholds TL and TH being reset and C being decreased by 1 to 26.
  • the threshold TL is reset and C is reduced by 1.
  • the level detector 601 provides a reliable detection of the presence of speech each time that the average T exceeds the fixed threshold TF, and that after each such detection the speech decision on the line 110 is maintained for a fixed hangover period of 32 superframes, to maintain the decision during intersyllabic pauses in speech.
  • the slope detector 602 provides a less reliable but much earlier detection of the start of speech bursts, as at the point 809, to produce the speech decision on the line 110 as quickly as possible and hence to avoid excessive clipping of speech signals at the start of speech bursts.
  • the hangover period of the slope detector is not immediately set to the maximum as in the level detector; but instead is increased only gradually to avoid excessively increasing the activity of the DSI transmission.
  • the average T at the point 809 could alternatively be due to noise transients instead of the start of speech, in which case the line 801 would not rise after this point.
  • the value T is itself an average taken over the duration of one superframe, and the threshold TL is adaptively adjusted to be above the average noise level of the channel, so that the slope detector is relatively insensitive to noise transients.
  • Figure 4 illustrates in the form of a block diagram a d.c. offset remover and averaging circuit which serves to produce a 7-bit offset removed average T for each channel on the lines 115, from 8-bit individual signal samples of the channels supplied thereto on lines 102.
  • the offset remover consists of an 8-bit subtractor 401, a 16-bit up/down counter 402, and a 48-channel by 16-bit store 403.
  • the averaging circuit consists of a 12-bit adder 404, a 48-channel by 12-bit store 405, a buffer 406 having a clear input CL, and a 48-channel by 7-bit store 407 having a write-enable input WE. Each of the stores is addressed in turn for each channel via an address bus which is not shown.
  • the offset remover serves to produce on lines 409 for each channel a 7-bit magnitude signal from which long-term d.c. offsets have been removed, and to this end the offset remover in operation reaches an equilibrium state in which for each channel a 16-bit offset value of the channel is stored in the store 403.
  • the stored offset value of the channel is loaded from the store 403 into the counter 402 and is available at the counter output.
  • the 8 most significant bits of the offset value are applied via lines 410 to the subtractor 401, which subtracts the offset value bits from the current sample of the channel to produce the 7-bit magnitude signal on the lines 409 and a sign bit on a further output line 411.
  • This line 411 is connected to an up/down counting control input U/D of the counter 402 and causes the count of the counter to be increased or decreased by 1 depending on the polarity of the sign bit on the line 411.
  • the counter 402 thus produces a new, modified, 16-bit offset value for the channel at its output, and this new value is written into the store 403 in place of the previous offset value for the channel. This sequence is repeated for subsequent channels in each frame.
  • the equilibrium state reached is such that for each channel the numbers of positive and negative sign bits produced on the line 411 are equal.
  • although the stored offset value of each channel varies, only the 8 most significant bits of this are subtracted from the channel information, and in fact 256 sign bits of one polarity are required in order to change the subtracted offset value bits by one step.
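The offset remover's feedback loop can be sketched for a single channel as follows. This is an illustration under stated assumptions, not the circuit itself: samples are treated as unsigned 8-bit linear values, a zero difference is counted as positive, and the counter is assumed to count up when the sample is at or above the offset (the text says only that the direction depends on the sign bit). The class name is hypothetical.

```python
class OffsetRemover:
    """Single-channel sketch of subtractor 401, counter 402 and store 403."""

    def __init__(self):
        self.offset = 0  # 16-bit offset value held for the channel

    def remove(self, sample):
        """Return (7-bit magnitude, sign bit) for one 8-bit sample."""
        # Only the 8 most significant bits of the offset are subtracted.
        difference = sample - (self.offset >> 8)
        negative = difference < 0
        # The sign bit steers the up/down counter by one step.
        if negative:
            self.offset = max(self.offset - 1, 0)
        else:
            self.offset = min(self.offset + 1, 0xFFFF)
        return min(abs(difference), 127), negative


remover = OffsetRemover()
for _ in range(30000):
    magnitude, negative = remover.remove(100)  # constant d.c. input
print(remover.offset >> 8, magnitude)
```

For a constant input the subtracted offset settles around the input level and the magnitude falls to 0 or 1, which corresponds to the equilibrium of equal numbers of positive and negative sign bits described above.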
  • the averaging circuit serves to produce, for each channel, the 7-bit average T on the lines 115.
  • the average T on the lines 115 is actually a fraction of 27/32 of the actual average of the signals on the lines 409.
  • this average T is updated in the thirteenth frame of each superframe by a signal applied via a line 414 to the input CL of the buffer 406 and the input WE of the store 407, to write a new average T into the store 407 and to clear the buffer 406.
  • the output of the adder 404 is stored in the store 405.
  • the adder output is equal to the sum of the 7-bit magnitude signal of the particular channel, present on the lines 409, and a 12-bit cumulative sum for the particular channel present on lines 412.
  • the cumulative sum for the channel is the previously stored sum for the channel which was stored in the store 405, which is clocked through the buffer 406 in each frame except the thirteenth frame of each superframe when, as described above, the buffer 406 is cleared to reduce the cumulative sum to zero.
  • the 12-bit cumulative sum produced at the output of the store 405 is equal to the sum of the offset-removed magnitude signals for that channel during the preceding 27 frames. Only the 7 most significant bits of this sum are written into the store 407 to achieve a division of the sum by a factor of 32; hence the average T is 27/32 of the actual average. This minor difference does not adversely affect the operation of the speech detector.
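The 27/32 scaling follows directly from keeping the 7 most significant bits of a 12-bit sum, which is a right shift by 5 bits. The following sketch, with a hypothetical function name, accumulates 27 magnitudes for one channel and applies that shift.

```python
FRAMES_PER_SUM = 27  # frames accumulated per superframe, as stated above


def superframe_average(magnitudes):
    """Return the 7-bit average T from 27 offset-removed 7-bit magnitudes."""
    if len(magnitudes) != FRAMES_PER_SUM:
        raise ValueError("expected exactly 27 magnitudes")
    total = 0
    for magnitude in magnitudes:
        # 12-bit cumulative sum; 27 * 127 = 3429 always fits in 12 bits.
        total = (total + (magnitude & 0x7F)) & 0xFFF
    # Keeping the 7 most significant of 12 bits divides the sum by 32.
    return total >> 5


print(superframe_average([64] * 27))   # 54, i.e. 27/32 of 64
print(superframe_average([127] * 27))  # 107
```

A true mean of 64 is therefore reported as 54, which is the minor scaling difference the text refers to.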
  • the speech detector of the invention can obviously be used in conjunction with other forms of such circuit or without any preceding offset remover and averaging circuit.
  • the speech detector can be used in other applications than that described, and can be provided in respect of any number of voice channel signals.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Time-Division Multiplex Systems (AREA)
  • Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)
  • Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)

Claims (13)

1. A method of detecting speech signals in a sampled voice channel signal, comprising producing a first signal state (M=1) whenever the magnitude (T) of the signal sample exceeds a first threshold level (TF), characterised by the steps of:
comparing the magnitude (T) of each sample with that (TP) of the preceding sample;
whenever the magnitude (T) of a sample is not greater than that (TP) of the preceding sample, setting a second threshold (TL) to a level which is greater than and is dependent upon the magnitude (T) of the current sample;
whenever the magnitude (T) of a sample is greater than that (TP) of the preceding sample, producing a second signal state (K=1) if the magnitude (T) of the current sample exceeds the second threshold level (TL); and
in response to each of the first and second signal states (M=1, K=1), producing a signal representing the presence of speech at least for the current sample.
2. A method as claimed in claim 1, including the additional steps of:
whenever the magnitude (T) of a sample does not exceed the first threshold level (TF) and the first signal state (M=1) was produced for the preceding sample, producing a third signal state (M=2) for a first predetermined number (H) of consecutive samples whose magnitude does not exceed the first threshold level, commencing with the current sample;
whenever the magnitude (T) of a sample is not greater than that (TP) of the preceding sample and the second signal state (K=1) was produced for the preceding sample, producing a fourth signal state (K=2) for a second number (C) of consecutive samples, commencing with the current sample; and
producing a signal representing the presence of speech also in response to each of the third and fourth signal states (M=2, K=2).
3. A method as claimed in claim 2, including the further step of determining the second number (C) in dependence upon previous sample magnitudes, the second number (C) being increased by a predetermined amount, up to a maximum number (C=32), for each sample for which the second signal state (K=1) is produced, and being decreased, down to a minimum number (C=0), for each other sample whose magnitude (T) is not greater than the magnitude (TP) of the preceding sample.
4. A method as claimed in claim 2 or 3, including the further steps of:
whenever the magnitude (T) of a sample exceeds that (TP) of the preceding sample, and for the preceding sample the fourth signal state (K=2) was produced but the second signal state (K=1) was not produced, producing the second signal state (K=1) for the current sample if its magnitude (T) exceeds a third threshold level (TH) but is below the second threshold level (TL); and
setting the third threshold level (TH) equal to the magnitude (TP) of the preceding sample whenever the second signal state (K=1) was produced for the preceding sample and the magnitude (T) of the current sample is not greater than the magnitude (TP) of the preceding sample.
5. A method as claimed in any of claims 1 to 4, wherein, whenever the second threshold level (TL) is set, it is set to be a predetermined amount greater than the magnitude (T) of the current sample.
6. A method as claimed in any of claims 1 to 5, wherein each signal sample is constituted by an average of a plurality of individual samples of the voice channel signal, the method further including the step of producing each signal sample by removing d.c. offsets from and averaging a plurality of individual samples of the voice channel signal.
7. A speech detector comprising one or more read-only memories programmed and arranged to carry out the method of any of claims 1 to 6.
8. A speech detector for detecting speech signals in a sampled voice channel signal, comprising means (605) for producing a first signal state (M=1) whenever the magnitude (T) of a signal sample exceeds a first threshold level (TF), characterised in that the speech detector includes:
means (611) for producing a second threshold (TL);
means (609) for delaying each sample until the arrival of the next sample;
means (610) for comparing the magnitude (T) of each sample with that (TP) of the preceding sample delayed by the delay means (609);
means (611), responsive to the comparing means (610) determining that the magnitude (T) of a sample is not greater than that (TP) of the preceding sample, for setting the second threshold (TL) to a level which is greater than and is dependent upon the magnitude (T) of the current sample;
means (611), responsive to the comparing means (610) determining that the magnitude (T) of a sample is greater than that (TP) of the preceding sample, for producing a second signal state (K=1) if the magnitude (T) of the current sample exceeds the second threshold level (TL); and
means (603), responsive to each of the first and second signal states (M=1, K=1), for producing a signal representing the presence of speech at least for the current sample.
9. A speech detector as claimed in claim 8, characterised by means (401 to 406) for producing each signal sample by removing d.c. offsets from and averaging a plurality of individual samples of the voice channel signal.
10. A method of detecting speech signals in a sampled voice channel signal, characterised by the steps of:
setting a threshold (TL) to a level which is greater than and is dependent upon the magnitude (T) of the current sample, whenever the magnitude (T) of the current sample is not greater than that (TP) of the preceding sample; and
providing an indication of the presence of speech whenever the magnitude (T) of the current sample is greater than that (TP) of the preceding sample and exceeds the threshold level (TL).
11. A method as claimed in claim 10, characterised by maintaining the indication for a number (C) of samples following each sample whose magnitude (T) is greater than that (TP) of the preceding sample.
12. A method as claimed in claim 11, characterised by determining the number (C) of samples for which the indication is maintained in dependence upon previous sample magnitudes, the number (C) being increased, up to a maximum number (C=32), for each sample whose magnitude (T) is greater than that (TP) of the preceding sample, and being decreased, down to a minimum number (C=0), for each sample whose magnitude (T) is not greater than that (TP) of the preceding sample.
13. A method as claimed in claim 10, 11 or 12, characterised by providing an indication of the presence of speech for each sample whose magnitude (T) exceeds a fixed threshold level (TF).
EP81303695A 1980-09-09 1981-08-13 Method and apparatus for detecting speech signals in a transmission channel Expired EP0047589B1 (de)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CA359968 1980-09-09
CA000359968A CA1147071A (en) 1980-09-09 1980-09-09 Method of and apparatus for detecting speech in a voice channel signal

Publications (2)

Publication Number Publication Date
EP0047589A1 EP0047589A1 (de) 1982-03-17
EP0047589B1 EP0047589B1 (de) 1984-06-13

Family

ID=4117844

Family Applications (1)

Application Number Title Priority Date Filing Date
EP81303695A Expired EP0047589B1 (de) 1980-09-09 1981-08-13 Method and apparatus for detecting speech signals in a transmission channel

Country Status (4)

Country Link
EP (1) EP0047589B1 (de)
JP (1) JPS5781733A (de)
CA (1) CA1147071A (de)
DE (1) DE3164171D1 (de)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA1137240A (en) * 1980-09-09 1982-12-07 Northern Telecom Limited Method of and apparatus for echo detection in voice channel signals
DE3276731D1 (en) * 1982-04-27 1987-08-13 Philips Nv Speech analysis system
DE3243231A1 (de) * 1982-11-23 1984-05-24 Philips Kommunikations Industrie AG, 8500 Nürnberg Verfahren zur erkennung von sprachpausen
DE3473373D1 (en) * 1983-10-13 1988-09-15 Texas Instruments Inc Speech analysis/synthesis with energy normalization
JPS619700A (ja) * 1984-06-25 1986-01-17 シャープ株式会社 音声の特徴抽出方式
GB2379148A (en) * 2001-08-21 2003-02-26 Mitel Knowledge Corp Voice activity detection

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3712959A (en) * 1969-07-14 1973-01-23 Communications Satellite Corp Method and apparatus for detecting speech signals in the presence of noise
US4052568A (en) * 1976-04-23 1977-10-04 Communications Satellite Corporation Digital voice switch

Also Published As

Publication number Publication date
CA1147071A (en) 1983-05-24
JPS5781733A (en) 1982-05-21
EP0047589A1 (de) 1982-03-17
JPH0311139B2 (de) 1991-02-15
DE3164171D1 (en) 1984-07-19

Similar Documents

Publication Publication Date Title
US4357491A (en) Method of and apparatus for detecting speech in a voice channel signal
US4052568A (en) Digital voice switch
US4945566A (en) Method of and apparatus for determining start-point and end-point of isolated utterances in a speech signal
US4410763A (en) Speech detector
US5617508A (en) Speech detection device for the detection of speech end points based on variance of frequency band limited energy
US4028496A (en) Digital speech detector
EP0077574B1 (de) Spracherkennungssystem für Kraftwagen
ES2211057T3 (es) Sistema y metodo para el ajuste del umbral de ruido usado para detectar actividad vocal en ambientes ruidosos no estacionario.
US4401849A (en) Speech detecting method
US6088670A (en) Voice detector
US5579431A (en) Speech detection in presence of noise by determining variance over time of frequency band limited energy
US3712959A (en) Method and apparatus for detecting speech signals in the presence of noise
US4700392A (en) Speech signal detector having adaptive threshold values
JPS6011849B2 (ja) オフセツト補償回路
US4008375A (en) Digital voice switch for single or multiple channel applications
US4001505A (en) Speech signal presence detector
US4543537A (en) Method of and arrangement for controlling the gain of an amplifier
CA1150413A (en) Speech endpoint detector
EP0047589B1 (de) Method and apparatus for detecting speech signals in a transmission channel
US4469916A (en) Method and apparatus for detecting signalling and data signals on a telephone channel
USRE32172E (en) Endpoint detector
EP0770254B1 (de) Übertragungssystem und -verfahren für die sprachkodierung mit verbesserter detektion der grundfrequenz
JPS5834986B2 (ja) 適応形音声検出回路
KR100273395B1 (ko) 음성인식시스템의음성구간검출방법
US20060077844A1 (en) Voice recording and playing equipment

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Designated state(s): DE FR GB IT NL SE

17P Request for examination filed

Effective date: 19820426

ITF It: translation for a ep patent filed
GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Designated state(s): DE FR GB IT NL SE

REF Corresponds to:

Ref document number: 3164171

Country of ref document: DE

Date of ref document: 19840719

ET Fr: translation filed
PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed
PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 19900731

Year of fee payment: 10

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: SE

Payment date: 19900807

Year of fee payment: 10

Ref country code: FR

Payment date: 19900807

Year of fee payment: 10

ITTA It: last paid annual fee
PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: NL

Payment date: 19900831

Year of fee payment: 10

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 19900927

Year of fee payment: 10

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Effective date: 19910813

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SE

Effective date: 19910814

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: NL

Effective date: 19920301

GBPC Gb: european patent ceased through non-payment of renewal fee
NLV4 Nl: lapsed or annulled due to non-payment of the annual fee
PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: FR

Effective date: 19920430

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DE

Effective date: 19920501

REG Reference to a national code

Ref country code: FR

Ref legal event code: ST

EUG Se: european patent has lapsed

Ref document number: 81303695.1

Effective date: 19920306