EP0548054B1 - Anordnung zur Feststellung der Anwesenheit von Sprachlauten - Google Patents

Anordnung zur Feststellung der Anwesenheit von Sprachlauten

Info

Publication number
EP0548054B1
Authority
EP
European Patent Office
Prior art keywords
speech
input signal
voice activity
signal
measure
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
EP93200015A
Other languages
English (en)
French (fr)
Other versions
EP0548054A2 (de)
EP0548054A3 (de)
Inventor
Daniel Kenneth Freeman
Ivan Boyd
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
LG Electronics Inc
Original Assignee
British Telecommunications PLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from GB888805795A (GB8805795D0)
Priority claimed from GB888813346A (GB8813346D0)
Priority claimed from GB888820105A (GB8820105D0)
Application filed by British Telecommunications PLC
Publication of EP0548054A2
Publication of EP0548054A3
Application granted
Publication of EP0548054B1
Anticipated expiration
Status: Expired - Lifetime (current)

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78: Detection of presence or absence of voice signals
    • G10L25/84: Detection of presence or absence of voice signals for discriminating voice from noise

Definitions

  • a voice activity detector is a device which is supplied with a signal with the object of detecting periods of speech, or periods containing only noise.
  • although the present invention is not limited thereto, one application of particular interest for such detectors is in mobile radio telephone systems, where knowledge of the presence or absence of speech can be exploited by a speech coder to improve the efficient utilisation of the radio spectrum, and where the noise level (from a vehicle-mounted unit) is likely to be high.
  • the basic problem in voice activity detection is to locate a measure which differs appreciably between speech and non-speech periods.
  • in apparatus which includes a speech coder, a number of parameters are readily available from one or other stage of the coder, and it is therefore desirable to economise on the processing needed by utilising some such parameter.
  • in many situations the main noise sources occur in known, defined regions of the frequency spectrum. For example, in a moving car much of the noise (e.g. engine noise) is concentrated in the low-frequency regions of the spectrum. Where such knowledge of the spectral position of the noise is available, it is desirable to base the decision as to whether speech is present or absent upon measurements taken from that portion of the spectrum which contains relatively little noise. It would, of course, be possible in practice to pre-filter the signal before analysing it to detect speech activity, but where the voice activity detector is associated with a speech coder, prefiltering would distort the voice signal to be coded.
  • one known voice activity detector compares the input signal with predetermined noise characteristics, by filtering the input signal through a pair of manually balanced bandpass filters (employing analogue components) to form two frequency-dependent energy segments.
  • This method is of limited usefulness for several reasons: firstly, such a crude arrangement ignores the fact that many types of noise can have an energy balance between the two bands similar to that of a speech signal; secondly, balancing the filters is laborious and requires manual detection of noise periods; and thirdly, such a device is unable to adjust to changing noise or spectral changes in the environment (or communications channel).
  • European patent application published as 0127718A and US patent 4672669 describe a voice activity detection apparatus in which a first test is made on signal amplitude and a second test is based on analysis of changes in the short-term signal spectrum. Specifically, the spectral analysis is performed by comparing the autocorrelation of the signal with that of an earlier portion of the signal deemed to be speech-free.
  • the invention provides a voice activity detection apparatus as set out in the claims below.
  • a frame of n signal samples (s_0, s_1, s_2, ..., s_{n-1}) will, when passed through a notional fourth-order finite impulse response (FIR) digital filter of impulse response (1, h_0, h_1, h_2, h_3), result in a filtered signal (ignoring samples carried over from previous frames)
  • s' = (s_0), (s_1 + h_0 s_0), (s_2 + h_0 s_1 + h_1 s_0), (s_3 + h_0 s_2 + h_1 s_1 + h_2 s_0), (s_4 + h_0 s_3 + h_1 s_2 + h_2 s_1 + h_3 s_0), (s_5 + h_0 s_4 + h_1 s_3 + h_2 s_2 + h_3 s_1), (s_6 + h_0 s_5 + ...), ...
  • the zero-order autocorrelation coefficient is the sum of each term squared, which may be normalised, i.e. divided by the total number of terms (for constant frame lengths it is easier to omit the division); that of the filtered signal, R'_0, is thus a measure of the power of the notional filtered signal s', in other words of that part of the signal s which falls within the passband of the notional filter:
  • R'_0 = (s_4 + h_0 s_3 + h_1 s_2 + h_2 s_1 + h_3 s_0)^2 + (s_5 + h_0 s_4 + h_1 s_3 + h_2 s_2 + h_3 s_1)^2 + ...
  • expanding, and collecting terms in the signal autocorrelation coefficients R_i, gives R'_0 = R_0 (1 + h_0^2 + h_1^2 + h_2^2 + h_3^2) + R_1 (2h_0 + 2h_0 h_1 + 2h_1 h_2 + 2h_2 h_3) + R_2 (2h_1 + 2h_1 h_3 + 2h_0 h_2) + R_3 (2h_2 + 2h_0 h_3) + R_4 (2h_3). So R'_0 can be obtained from a combination of the autocorrelation coefficients R_i, weighted by the bracketed constants which determine the frequency band to which the value of R'_0 is responsive.
  • the bracketed terms are (to within the factors of two) the autocorrelation coefficients of the impulse response of the notional filter, so that the expression above may be simplified to R'_0 = H_0 R_0 + 2 Σ_{i=1..N} H_i R_i (Equation 1), where N is the filter order and the H_i are the (un-normalised) autocorrelation coefficients of the impulse response of the filter.
  • the effect on the signal autocorrelation coefficients of filtering a signal may be simulated by producing a weighted sum of the autocorrelation coefficients of the (unfiltered) signal, using the impulse response that the required filter would have had.
  • a relatively simple algorithm involving a small number of multiplication operations may thus simulate the effect of a digital filter that would typically require a hundred times as many multiplications.
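
The computational saving described in the preceding items is easy to check numerically. The following sketch compares the power of a directly filtered frame with the weighted autocorrelation sum of Equation 1; the frame length, filter taps and random test frame are illustrative assumptions, not values taken from the patent.

```python
# Minimal numerical check of Equation 1: the power of a filtered frame can be
# obtained from the signal autocorrelation coefficients R_i weighted by the
# autocorrelation H_i of the filter impulse response.
# Frame length, filter taps and the random test frame are illustrative only.
import numpy as np

def autocorr(x, max_lag):
    """Un-normalised autocorrelation coefficients for lags 0..max_lag."""
    return np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(max_lag + 1)])

rng = np.random.default_rng(0)
n = 160                                     # samples per frame (e.g. 160)
s = rng.standard_normal(n)                  # one frame of signal samples
h = np.array([1.0, -0.6, 0.3, -0.1, 0.05])  # notional FIR filter (1, h0..h3)
N = len(h) - 1                              # filter order

# Route 1: filter the frame directly and sum the squared output samples
# (the full convolution tail is kept so that the two routes agree exactly).
power_direct = np.sum(np.convolve(s, h) ** 2)

# Route 2: Equation 1, using only N+1 autocorrelation values of each sequence.
R = autocorr(s, N)                          # R_0 .. R_N of the signal
H = autocorr(h, N)                          # H_0 .. H_N of the impulse response
power_eq1 = H[0] * R[0] + 2.0 * np.sum(H[1:] * R[1:])

print(power_direct, power_eq1)              # identical up to rounding error
```
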
  • This filtering operation may alternatively be viewed as a form of spectrum comparison, with the signal spectrum being matched against a reference spectrum (the inverse of the response of the notional filter). Since the notional filter in this application is selected so as to approximate the inverse of the noise spectrum, this operation may be viewed as a spectral comparison between speech and noise spectra, and the zeroth autocorrelation coefficient thus generated (i.e. the energy of the inverse filtered signal) as a measure of dissimilarity between the spectra.
  • the Itakura-Saito distortion measure is used in LPC to assess the match between the predictor filter and the input spectrum, and in one form is expressed as d = A_0 R_0 + 2 Σ_{i=1..N} A_i R_i, where A_0 etc. are the autocorrelation coefficients of the LPC parameter set.
  • since the LPC coefficients are the taps of an FIR filter having the inverse spectral response of the input signal (so that the LPC coefficient set is the impulse response of the inverse LPC filter), it will be apparent that the Itakura-Saito distortion measure is in fact merely a form of Equation 1, wherein the filter response H is the inverse of the spectral shape of an all-pole model of the input signal.
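
To make the correspondence concrete, the sketch below derives LPC coefficients with the Levinson-Durbin recursion and evaluates the Equation 1 form using the autocorrelation A_i of the coefficient set. The test frames, model order and helper code are illustrative assumptions rather than material from the patent; when the coefficients and the autocorrelation vector come from the same frame the measure reduces to the LPC residual energy, while coefficients from a different frame give a spectral-mismatch value.

```python
# Sketch: an Itakura-Saito-style measure in the form of Equation 1, built from
# the autocorrelation A_i of an LPC coefficient set.  Signal, model order and
# helper code are illustrative assumptions.
import numpy as np

def autocorr(x, max_lag):
    return np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(max_lag + 1)])

def levinson_durbin(R, order):
    """Inverse-filter taps (1, a_1..a_p) and residual energy from R_0..R_p."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = R[0]
    for i in range(1, order + 1):
        k = -(R[i] + np.dot(a[1:i], R[i - 1:0:-1])) / err
        a_prev = a.copy()
        a[1:i] = a_prev[1:i] + k * a_prev[i - 1:0:-1]
        a[i] = k
        err *= (1.0 - k * k)
    return a, err

def distortion(A, R):
    """Equation 1: A_0*R_0 + 2*sum_i A_i*R_i, with A the autocorr of the taps."""
    return A[0] * R[0] + 2.0 * np.sum(A[1:] * R[1:])

order = 8
rng = np.random.default_rng(1)
frame1 = rng.standard_normal(160)
frame2 = rng.standard_normal(160)

R1 = autocorr(frame1, order)
R2 = autocorr(frame2, order)
lpc1, resid1 = levinson_durbin(R1, order)
A1 = autocorr(lpc1, order)              # autocorrelation of the LPC taps

print(distortion(A1, R1), resid1)       # same frame: equals the residual energy
print(distortion(A1, R2) / R2[0])       # other frame: normalised mismatch value
```
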
  • a signal from a microphone (not shown) is received at an input 1 and converted to digital samples s at a suitable sampling rate by an analogue to digital converter 2.
  • An LPC analysis unit 3 (in a known type of LPC coder) then derives, for successive frames of n (e.g. 160) samples, a set of N (e.g. 8 or 12) LPC filter coefficients L_i which are transmitted to represent the input speech.
  • the speech signal s also enters a correlator unit 4 (normally part of the LPC coder 3, since the autocorrelation vector R_i of the speech is usually produced as a step in the LPC analysis, although it will be appreciated that a separate correlator could be provided).
  • the correlator 4 produces the autocorrelation vector R_i, including the zero-order correlation coefficient R_0 and at least two further autocorrelation coefficients R_1, R_2, R_3. These are then supplied to a multiplier unit 5.
  • a second input 11 is connected to a second microphone located distant from the speaker so as to receive only background noise.
  • the input from this microphone is converted to a digital input sample train by AD convertor 12 and LPC analysed by a second LPC analyser 13.
  • the "noise" LPC coefficients produced by analyser 13 are passed to correlator unit 14; the autocorrelation vector thus produced is multiplied term by term with the autocorrelation coefficients R_i of the input signal from the speech microphone in multiplier 5, and the weighted coefficients thus produced are combined in adder 6 according to Equation 1, so as to apply a filter having the inverse shape of the noise spectrum from the noise-only microphone (which in practice is the same as the shape of the noise spectrum at the signal-plus-noise microphone) and thus filter out most of the noise.
  • the resulting measure M is thresholded by thresholder 7 to produce a logic output 8 indicating the presence or absence of speech; if M is high, speech is deemed to be present.
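
The decision path just described (speech-microphone autocorrelations, an inverse filter modelled on the noise microphone, weighted combination per Equation 1, then a threshold) might be sketched roughly as follows. The function names, threshold value and test data are assumptions standing in for units 4 to 7 and 13 to 14 of the description.

```python
# Rough sketch of the two-microphone detector's decision path.
# Threshold, helper names and the test data are illustrative assumptions.
import numpy as np

def autocorr(x, max_lag):
    return np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(max_lag + 1)])

def speech_present(speech_frame, noise_lpc, threshold):
    """Apply Equation 1 with the inverse-noise filter and threshold the result.

    speech_frame : samples from the signal-plus-noise microphone (input 1)
    noise_lpc    : LPC taps (1, a_1..a_p) modelling the noise-only microphone
    """
    order = len(noise_lpc) - 1
    R = autocorr(speech_frame, order)        # correlator 4
    H = autocorr(noise_lpc, order)           # correlator 14 (inverse-noise filter)
    M = H[0] * R[0] + 2.0 * np.sum(H[1:] * R[1:])   # multiplier 5 and adder 6
    return M > threshold                     # thresholder 7 -> logic output 8

# Example call with made-up data: an 8th-order noise model and one 160-sample frame.
rng = np.random.default_rng(2)
frame = rng.standard_normal(160)
noise_lpc = np.array([1.0, -0.9, 0.4, -0.2, 0.1, -0.05, 0.02, -0.01, 0.005])
print(speech_present(frame, noise_lpc, threshold=200.0))
```
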
  • This voice activity detector does, however, require two microphones and two LPC analysers, which adds to the expense and complexity of the equipment necessary.
  • An alternative implementation of the first voice activity detector uses a corresponding measure formed using the autocorrelations from the noise microphone 11 and the LPC coefficients from the main microphone 1, meaning that an extra autocorrelator rather than an LPC analyser would be necessary.
  • Both implementations of the first voice activity detector are therefore able to operate within different environments having noise at different frequencies, or within a changing noise spectrum in a given environment.
  • in an alternative arrangement, a buffer 15 stores a set of LPC coefficients (or the autocorrelation vector of the set) derived from the microphone input 1 during a period identified as a "non-speech" (i.e. noise-only) period. These coefficients are then used to derive a measure using Equation 1, which of course also corresponds to the Itakura-Saito distortion measure, except that a single stored frame of LPC coefficients, corresponding to an approximation of the inverse noise spectrum, is used rather than the present frame of LPC coefficients.
  • the LPC coefficient vector L_i output by analyser 3 is also routed to a correlator 14, which produces the autocorrelation vector of the LPC coefficient vector.
  • the buffer memory 15 is controlled by the speech/non-speech output of thresholder 7, in such a way that during "speech" frames the buffer retains the "noise" autocorrelation coefficients, but during "noise" frames a new set of LPC coefficients may be used to update the buffer, for example by a multiple switch 16, via which outputs of the correlator 14, carrying each autocorrelation coefficient, are connected to the buffer 15. It will be appreciated that correlator 14 could be positioned after buffer 15. Further, the speech/non-speech decision for coefficient update need not be from output 8, but could be (and preferably is) otherwise derived.
  • the LPC coefficients stored in the buffer are updated from time to time, so that the apparatus is thus capable of tracking changes in the noise spectrum. It will be appreciated that such updating of the buffer may be necessary only occasionally, or may occur only once at the start of operation of the detector, if (as is often the case) the noise spectrum is relatively stationary over time, but in a mobile radio environment frequent updating is preferred.
  • the system initially employs Equation 1 with coefficient terms corresponding to a simple fixed high-pass filter, and then subsequently starts to adapt by switching over to using "noise period" LPC coefficients. If, for some reason, speech detection fails, the system may return to using the simple high-pass filter.
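
That start-up and update policy is essentially control logic; a minimal sketch is given below, in which the class name, the particular high-pass taps and the reset path are assumptions chosen only for illustration.

```python
# Sketch of the buffer-15 policy: start from a fixed high-pass inverse filter,
# refresh the stored coefficients only during frames classified as noise
# (switch 16), and fall back to the fixed filter if detection fails.
# Class name, taps and failure handling are illustrative assumptions.
import numpy as np

class NoiseModelBuffer:
    FIXED_HIGHPASS = np.array([1.0, -0.95])   # simple fixed high-pass start-up

    def __init__(self):
        self.coeffs = self.FIXED_HIGHPASS.copy()
        self.adapted = False

    def update(self, frame_is_noise, noise_lpc):
        """Refresh the stored 'noise period' coefficients only during noise frames."""
        if frame_is_noise:
            self.coeffs = np.asarray(noise_lpc, dtype=float).copy()
            self.adapted = True

    def reset(self):
        """Return to the simple fixed high-pass filter if detection fails."""
        self.coeffs = self.FIXED_HIGHPASS.copy()
        self.adapted = False

buf = NoiseModelBuffer()
buf.update(frame_is_noise=True, noise_lpc=[1.0, -0.8, 0.3])
print(buf.coeffs)     # stored "noise period" LPC coefficients
buf.reset()
print(buf.coeffs)     # back to the fixed high-pass taps
```
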
  • LPC analysis unit 13 is simply replaced by an adaptive filter (for example a transversal FIR or lattice filter), connected so as to whiten the noise input by modelling the inverse filter, and its coefficients are supplied as before to autocorrelator 14.
  • LPC analysis means 3 is replaced by such an adaptive filter, and buffer means 15 is omitted, but switch 16 operates to prevent the adaptive filter from adapting its coefficients during speech periods.
  • a voice activity detector in accordance with an embodiment of the present invention will now be described.
  • the LPC coefficient vector is simply the impulse response of an FIR filter which has a response approximating the inverse spectral shape of the input signal.
  • if the Itakura-Saito distortion measure between adjacent frames is formed, this is in fact equal to the power of the signal as filtered by the LPC filter of the previous frame. So if the spectra of adjacent frames differ little, a correspondingly small amount of the spectral power of a frame will escape the filtering and the measure will be low.
  • a large interframe spectral difference produces a high Itakura-Saito Distortion Measure, so that the measure reflects the spectral similarity of adjacent frames.
  • the Itakura-Saito Distortion Measure between adjacent frames of a noisy signal containing intermittent speech is higher during periods of speech than periods of noise; the degree of variation (as illustrated by the standard deviation) is also higher, and less intermittently variable.
  • the standard deviation of the standard deviation of M is also a reliable measure; the effect of taking each standard deviation is essentially to smooth the measure.
  • the measured parameter used to decide whether speech is present is preferably the standard deviation of the Itakura-Saito Distortion Measure, but other measures of variance and other spectral distortion measures (based for example on FFT analysis) could be employed.
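
A minimal sketch of that decision statistic follows: the interframe distortion measure is collected over a short window of frames and its standard deviation is compared with a threshold. The window length, the synthetic measure values and the threshold are assumptions, not figures from the patent.

```python
# Sketch: the standard deviation of the interframe distortion measure M, taken
# over a sliding window of frames, as the speech/no-speech statistic.
# Window length, threshold and the synthetic values of M are assumptions.
from collections import deque
import numpy as np

class VarianceDetector:
    def __init__(self, window=16, threshold=0.5):
        self.history = deque(maxlen=window)   # recent values of the measure M
        self.threshold = threshold

    def step(self, m):
        """Append the latest measure and return True if speech is indicated."""
        self.history.append(m)
        if len(self.history) < self.history.maxlen:
            return False                      # not enough frames yet
        return np.std(self.history) > self.threshold

det = VarianceDetector()
# Noise-like frames: the measure hovers near a constant level, low deviation.
for m in np.random.default_rng(3).normal(1.0, 0.05, 32):
    quiet = det.step(m)
# Speech-like frames: the measure jumps about, high deviation.
for m in np.random.default_rng(4).normal(3.0, 2.0, 32):
    active = det.step(m)
print(quiet, active)      # typically False, True
```
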
  • it is advantageous to employ an adaptive threshold in voice activity detection. Such a threshold must not be adjusted during speech periods, or the speech signal will be thresholded out. It is accordingly necessary to control the threshold adapter using a speech/non-speech control signal, and it is preferable that this control signal is independent of the output of the thresholder being adapted.
  • the threshold T is adaptively adjusted so as to keep the threshold level just above the level of the measure M when noise only is present. Since the measure will in general vary randomly when noise is present, the threshold is varied by determining an average level over a number of blocks, and setting the threshold at a level proportional to the average. In a noisy environment this is not usually sufficient, however, and so an assessment of the degree of variation of the parameter over several blocks is also taken into account.
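
One way such an adaptive threshold might look is sketched below: during frames flagged as noise by an independent control signal, the average level and the spread of the measure over recent blocks are tracked and the threshold is held just above them. The window length and margin factor are illustrative assumptions.

```python
# Sketch of an adaptive threshold: track the mean and spread of the measure M
# over recent noise-only blocks and keep the threshold just above that level.
# Window length, margin factor and starting value are illustrative assumptions.
from collections import deque
import numpy as np

class AdaptiveThreshold:
    def __init__(self, window=8, margin=3.0, initial=1.0):
        self.noise_history = deque(maxlen=window)
        self.margin = margin
        self.threshold = initial

    def update(self, m, frame_is_noise):
        """Adapt only when the (independent) control signal indicates noise."""
        if frame_is_noise:
            self.noise_history.append(m)
            if len(self.noise_history) >= 2:
                mean = np.mean(self.noise_history)
                spread = np.std(self.noise_history)
                self.threshold = mean + self.margin * spread
        return self.threshold

    def is_speech(self, m):
        return m > self.threshold

adapt = AdaptiveThreshold()
for m in [1.0, 1.1, 0.9, 1.05, 1.0]:           # measure during noise-only blocks
    adapt.update(m, frame_is_noise=True)
print(adapt.threshold, adapt.is_speech(5.0))   # threshold near 1.2; 5.0 -> speech
```
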
  • an input 1 receives a signal which is sampled and digitised by an analogue-to-digital converter (ADC) 2 and supplied to the input of an inverse filter analyser 3, which in practice is part of the speech coder with which the voice activity detector is to work, and which generates coefficients L_i (typically 8) of a filter corresponding to the inverse of the input signal spectrum.
  • the digitised signal is also supplied to an autocorrelator 4 (which is part of analyser 3), which generates the autocorrelation vector R_i of the input signal (or at least as many low-order terms as there are LPC coefficients). Operation of these parts of the apparatus is as described with reference to Figures 1 and 2.
  • the autocorrelation coefficients R_i are then averaged over several successive speech frames (typically 5-20 ms long) to improve their reliability. This may be achieved by storing each set of autocorrelation coefficients output by autocorrelator 4 in a buffer 4a, and employing an averager 4b to produce a weighted sum of the current autocorrelation coefficients R_i and those from previous frames stored in, and supplied from, buffer 4a.
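
A small sketch of this buffer-and-averager step is given below; the number of frames retained and the weighting are assumptions chosen for illustration.

```python
# Sketch of averager 4b: a weighted sum of the current autocorrelation vector
# R_i and the vectors of a few preceding frames held in buffer 4a.
# The number of frames and the weights are illustrative assumptions.
from collections import deque
import numpy as np

class AutocorrAverager:
    def __init__(self, n_frames=4, weights=(0.4, 0.3, 0.2, 0.1)):
        self.buffer = deque(maxlen=n_frames)     # buffer 4a
        self.weights = list(weights)             # most recent frame weighted most

    def smooth(self, R):
        self.buffer.appendleft(np.asarray(R, dtype=float))
        w = np.array(self.weights[:len(self.buffer)])
        w /= w.sum()                             # renormalise over available frames
        return sum(wi * Ri for wi, Ri in zip(w, self.buffer))

avg = AutocorrAverager()
print(avg.smooth([10.0, 2.0, 1.0]))    # first frame: returned unchanged
print(avg.smooth([12.0, 1.0, 0.5]))    # blend of the two frames seen so far
```
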
  • the measure M thus formed is then thresholded by thresholder 7 against a threshold level, and the logical result provides an indication of the presence or absence of speech at output 8.
  • so that the inverse filter coefficients L_i correspond to a fair estimate of the inverse of the noise spectrum, it is desirable to update these coefficients during periods of noise (and, of course, not to update them during periods of speech). It is, however, preferable that the speech/non-speech decision on which the updating is based does not depend upon the result of the updating, or else a single wrongly identified frame of signal may cause the voice activity detector subsequently to go "out of lock" and wrongly identify following frames.
  • a control signal generating circuit 20 (effectively a separate voice activity detector) therefore forms an independent control signal indicating the presence or absence of speech, to control inverse filter analyser 3 (or buffer 8) so that the inverse filter autocorrelation coefficients A_i used to form the measure M are updated only during "noise" periods.
  • the control signal generator circuit 20 includes an LPC analyser 21 (which again may be part of a speech coder and, specifically, may be performed by analyser 3), which produces a set of LPC coefficients M_i corresponding to the input signal, and an autocorrelator 21a (which may be performed by autocorrelator 3a) which derives the autocorrelation coefficients B_i of M_i.
  • a measure of the spectral similarity between the input speech frame and the preceding speech frame is thus calculated; this may be the Itakura-Saito distortion measure between R_i of the present frame and B_i of the preceding frame, as disclosed above, or it may instead be derived by calculating the Itakura-Saito distortion measure for R_i and B_i of the present frame, and subtracting (in subtractor 25) the corresponding measure for the previous frame stored in buffer 24, to generate a spectral difference signal (in either case, the measure is preferably energy-normalised by dividing by R_0).
  • the buffer 24 is then, of course, updated.
  • a voiced speech detection circuit is also provided, comprising a pitch analyser 27 (which in practice may operate as part of a speech coder and, in particular, may measure the long-term predictor lag value produced in a multipulse LPC coder).
  • the pitch analyser 27 produces a logic signal which is "true" when voiced speech is detected; this signal, together with the thresholded measure derived from thresholder 26 (which will generally be "true" when unvoiced speech is present), is supplied to the inputs of a NOR gate 28 to generate a signal which is "false" when speech is present and "true" when noise is present.
  • This signal is supplied to buffer 8 (or to inverse filter analyser 3) so that the inverse filter coefficients L_i are updated only during noise periods.
  • Threshold adapter 29 is also connected to receive the non-speech control signal output of control signal generator circuit 20. The output of the threshold adapter 29 is supplied to thresholder 7. The threshold adapter operates to increment or decrement the threshold in steps which are a proportion of the instant threshold value, until the threshold approximates the noise power level (which may conveniently be derived from, for example, weighting and adding circuits 22, 23). When the input signal is very low, it may be desirable for the threshold to be set automatically to a fixed, low level, since at low signal levels the signal quantisation produced by ADC 2 can give unreliable results.
  • hangover generating means 30 operates to measure the duration of indications of speech from thresholder 7 and, when the presence of speech has been indicated for a period in excess of a predetermined time constant, holds the output high for a short "hangover" period. In this way, clipping of the middle of low-level speech bursts is avoided, and appropriate selection of the time constant prevents the hangover generator 30 from being triggered by short spikes of noise falsely indicated as speech.
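
The hangover behaviour can be sketched as a small state machine: speech must persist for a minimum number of frames before the hangover is armed, and once armed the output is held high for a few frames after the raw indication drops. The frame counts below are assumptions chosen for illustration.

```python
# Sketch of hangover generator 30: only speech bursts longer than `min_burst`
# frames arm the hangover; the output is then held high for `hangover` frames
# after the raw indication from thresholder 7 goes low.  Counts are assumptions.
class HangoverGenerator:
    def __init__(self, min_burst=4, hangover=8):
        self.min_burst = min_burst   # frames of speech needed to arm the hangover
        self.hangover = hangover     # frames to hold the output high afterwards
        self.burst_len = 0
        self.hold = 0

    def step(self, raw_speech_flag):
        if raw_speech_flag:
            self.burst_len += 1
            if self.burst_len >= self.min_burst:
                self.hold = self.hangover          # (re-)arm the hangover
            return True
        self.burst_len = 0
        if self.hold > 0:
            self.hold -= 1
            return True                            # bridge short dips in the measure
        return False

hg = HangoverGenerator()
raw = [1] * 6 + [0] * 3 + [1] * 2 + [0] * 12 + [1] * 2 + [0] * 2
out = [hg.step(bool(x)) for x in raw]
print(out)   # the dip after the long burst is bridged; the short spike gets no hangover
```
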
  • the voice detection apparatus may be implemented as part of an LPC codec.
  • where autocorrelation coefficients of the signal, or related measures (partial correlation, or "parcor", coefficients), are transmitted to a distant station, the voice detection may take place remotely from the codec.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
  • Telephone Function (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Noise Elimination (AREA)
  • Geophysics And Detection Of Objects (AREA)
  • Measuring Pulse, Heart Rate, Blood Pressure Or Blood Flow (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
  • Signal Processing Not Specific To The Method Of Recording And Reproducing (AREA)
  • Indexing, Searching, Synchronizing, And The Amount Of Synchronization Travel Of Record Carriers (AREA)
  • Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)

Claims (4)

  1. Apparatus for detecting the presence of speech sounds, comprising
    (i) a first speech detector (3 - 6, 14) which operates by generating a measure of the spectral similarity between a portion of an input signal and a portion of the input signal assumed to be free of speech, in order to produce an output signal indicating the presence or absence of speech sounds in the input signal;
    (ii) a store (15) for storing data derived from the speech-free portion; and
    (iii) an auxiliary speech detector (20);
    characterised in that the updating of the store (15) is controlled exclusively by the auxiliary speech detector (20), the auxiliary speech detector (20) operating by generating a measure of the spectral similarity between a current portion of the input signal and an earlier portion of the input signal.
  2. Apparatus for detecting the presence of speech sounds, comprising
    (i) means (1) for receiving an input signal;
    (ii) a store (15) for storing a noise-representative signal which represents an estimated noise component of the input signal;
    (iii) means (3-6, 14) for periodically generating, from the input signal and the noise-representative signal, a measure of the spectral similarity between a portion of the input signal and the estimated noise component;
    (iv) means (7) for comparing the measure with a threshold value to produce an output indicating the presence or absence of speech sounds;
    (v) an auxiliary speech detector (20); and
    (vi) store-updating means for updating the store from the input signal;
    characterised in that the auxiliary speech detector is operable, in dependence on a measure of the spectral similarity between a current portion of the input signal and a preceding portion of the input signal, to generate a control signal indicating the presence or absence of speech sounds, and in that the store-updating means is operable to update the store from the input signal only when the control signal indicates the absence of speech sounds.
  3. Apparatus according to claim 2, further comprising means for adjusting the threshold value during periods in which the control signal indicates the absence of speech sounds.
  4. Apparatus according to claim 2 or 3, in which the auxiliary speech detector further comprises means (27) for detecting voiced speech sounds, comprising pitch analysis means for generating a signal which indicates the presence of voiced speech and on which the control signal generated by the auxiliary speech detector (20) also depends.
EP93200015A 1988-03-11 1989-03-10 Anordnung zur Feststellung der Anwesenheit von Sprachlauten Expired - Lifetime EP0548054B1 (de)

Applications Claiming Priority (7)

Application Number Priority Date Filing Date Title
GB888805795A GB8805795D0 (en) 1988-03-11 1988-03-11 Voice activity detector
GB8805795 1988-03-11
GB8813346 1988-06-06
GB888813346A GB8813346D0 (en) 1988-06-06 1988-06-06 Voice activity detection
GB8820105 1988-08-24
GB888820105A GB8820105D0 (en) 1988-08-24 1988-08-24 Voice activity detection
EP89302422A EP0335521B1 (de) 1988-03-11 1989-03-10 Detektion für die Anwesenheit eines Sprachsignals

Related Parent Applications (2)

Application Number Title Priority Date Filing Date
EP89302422A Division EP0335521B1 (de) 1988-03-11 1989-03-10 Detektion für die Anwesenheit eines Sprachsignals
EP89302422.4 Division 1989-03-10

Publications (3)

Publication Number Publication Date
EP0548054A2 EP0548054A2 (de) 1993-06-23
EP0548054A3 EP0548054A3 (de) 1994-01-12
EP0548054B1 true EP0548054B1 (de) 2002-12-11

Family

ID=27263821

Family Applications (2)

Application Number Title Priority Date Filing Date
EP93200015A Expired - Lifetime EP0548054B1 (de) 1988-03-11 1989-03-10 Anordnung zur Feststellung der Anwesenheit von Sprachlauten
EP89302422A Expired - Lifetime EP0335521B1 (de) 1988-03-11 1989-03-10 Detektion für die Anwesenheit eines Sprachsignals

Family Applications After (1)

Application Number Title Priority Date Filing Date
EP89302422A Expired - Lifetime EP0335521B1 (de) 1988-03-11 1989-03-10 Detektion für die Anwesenheit eines Sprachsignals

Country Status (16)

Country Link
EP (2) EP0548054B1 (de)
JP (2) JP3321156B2 (de)
KR (1) KR0161258B1 (de)
AU (1) AU608432B2 (de)
BR (1) BR8907308A (de)
CA (1) CA1335003C (de)
DE (2) DE68929442T2 (de)
DK (1) DK175478B1 (de)
ES (2) ES2188588T3 (de)
FI (2) FI110726B (de)
HK (1) HK135896A (de)
IE (1) IE61863B1 (de)
NO (2) NO304858B1 (de)
NZ (1) NZ228290A (de)
PT (1) PT89978B (de)
WO (1) WO1989008910A1 (de)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102576528A (zh) * 2009-10-19 2012-07-11 瑞典爱立信有限公司 用于语音活动检测的检测器和方法
CN108985277A (zh) * 2018-08-24 2018-12-11 广东石油化工学院 一种功率信号中背景噪声滤除方法及系统

Families Citing this family (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2643593B2 (ja) * 1989-11-28 1997-08-20 日本電気株式会社 音声・モデム信号識別回路
CA2040025A1 (en) * 1990-04-09 1991-10-10 Hideki Satoh Speech detection apparatus with influence of input level and noise reduced
US5241692A (en) * 1991-02-19 1993-08-31 Motorola, Inc. Interference reduction system for a speech recognition device
FR2697101B1 (fr) * 1992-10-21 1994-11-25 Sextant Avionique Procédé de détection de la parole.
SE470577B (sv) * 1993-01-29 1994-09-19 Ericsson Telefon Ab L M Förfarande och anordning för kodning och/eller avkodning av bakgrundsljud
JPH06332492A (ja) * 1993-05-19 1994-12-02 Matsushita Electric Ind Co Ltd 音声検出方法および検出装置
SE501305C2 (sv) * 1993-05-26 1995-01-09 Ericsson Telefon Ab L M Förfarande och anordning för diskriminering mellan stationära och icke stationära signaler
EP0633658A3 (de) * 1993-07-06 1996-01-17 Hughes Aircraft Co Stimmenaktivierte übertragungsgekoppelte automatische Verstärkungsregelungsschaltung.
IN184794B (de) * 1993-09-14 2000-09-30 British Telecomm
SE501981C2 (sv) * 1993-11-02 1995-07-03 Ericsson Telefon Ab L M Förfarande och anordning för diskriminering mellan stationära och icke stationära signaler
US5742734A (en) * 1994-08-10 1998-04-21 Qualcomm Incorporated Encoding rate selection in a variable rate vocoder
FR2727236B1 (fr) * 1994-11-22 1996-12-27 Alcatel Mobile Comm France Detection d'activite vocale
GB2317084B (en) * 1995-04-28 2000-01-19 Northern Telecom Ltd Methods and apparatus for distinguishing speech intervals from noise intervals in audio signals
GB2306010A (en) * 1995-10-04 1997-04-23 Univ Wales Medicine A method of classifying signals
FR2739995B1 (fr) * 1995-10-13 1997-12-12 Massaloux Dominique Procede et dispositif de creation d'un bruit de confort dans un systeme de transmission numerique de parole
US5794199A (en) * 1996-01-29 1998-08-11 Texas Instruments Incorporated Method and system for improved discontinuous speech transmission
DE69716266T2 (de) 1996-07-03 2003-06-12 British Telecommunications P.L.C., London Sprachaktivitätsdetektor
US6618701B2 (en) 1999-04-19 2003-09-09 Motorola, Inc. Method and system for noise suppression using external voice activity detection
DE10052626A1 (de) * 2000-10-24 2002-05-02 Alcatel Sa Adaptiver Geräuschpegelschätzer
CN1617606A (zh) * 2003-11-12 2005-05-18 皇家飞利浦电子股份有限公司 一种在语音信道传输非语音数据的方法及装置
US7155388B2 (en) * 2004-06-30 2006-12-26 Motorola, Inc. Method and apparatus for characterizing inhalation noise and calculating parameters based on the characterization
US7139701B2 (en) * 2004-06-30 2006-11-21 Motorola, Inc. Method for detecting and attenuating inhalation noise in a communication system
FI20045315A (fi) * 2004-08-30 2006-03-01 Nokia Corp Ääniaktiivisuuden havaitseminen äänisignaalissa
US8708702B2 (en) * 2004-09-16 2014-04-29 Lena Foundation Systems and methods for learning using contextual feedback
US8775168B2 (en) 2006-08-10 2014-07-08 Stmicroelectronics Asia Pacific Pte, Ltd. Yule walker based low-complexity voice activity detector in noise suppression systems
US8175871B2 (en) 2007-09-28 2012-05-08 Qualcomm Incorporated Apparatus and method of noise and echo reduction in multiple microphone audio systems
US8954324B2 (en) 2007-09-28 2015-02-10 Qualcomm Incorporated Multiple microphone voice activity detector
US8223988B2 (en) 2008-01-29 2012-07-17 Qualcomm Incorporated Enhanced blind source separation algorithm for highly correlated mixtures
US8275136B2 (en) 2008-04-25 2012-09-25 Nokia Corporation Electronic device speech enhancement
US8244528B2 (en) 2008-04-25 2012-08-14 Nokia Corporation Method and apparatus for voice activity determination
US8611556B2 (en) 2008-04-25 2013-12-17 Nokia Corporation Calibrating multiple microphones
ES2371619B1 (es) * 2009-10-08 2012-08-08 Telefónica, S.A. Procedimiento de detección de segmentos de voz.

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3509281A (en) * 1966-09-29 1970-04-28 Ibm Voicing detection system
US4052568A (en) * 1976-04-23 1977-10-04 Communications Satellite Corporation Digital voice switch
US4358738A (en) * 1976-06-07 1982-11-09 Kahn Leonard R Signal presence determination method for use in a contaminated medium
JPS5636246A (en) * 1979-08-31 1981-04-09 Nec Corp Stereo signal demodulating circuit
JPS59115625A (ja) * 1982-12-22 1984-07-04 Nec Corp 音声検出器
EP0127718B1 (de) * 1983-06-07 1987-03-18 International Business Machines Corporation Verfahren zur Aktivitätsdetektion in einem Sprachübertragungssystem
JPS6196817A (ja) * 1984-10-17 1986-05-15 Sharp Corp フイルタ−

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102576528A (zh) * 2009-10-19 2012-07-11 瑞典爱立信有限公司 用于语音活动检测的检测器和方法
CN108985277A (zh) * 2018-08-24 2018-12-11 广东石油化工学院 一种功率信号中背景噪声滤除方法及系统
CN108985277B (zh) * 2018-08-24 2020-11-10 广东石油化工学院 一种功率信号中背景噪声滤除方法及系统

Also Published As

Publication number Publication date
NZ228290A (en) 1992-01-29
PT89978A (pt) 1989-11-10
EP0548054A2 (de) 1993-06-23
FI20010933A (fi) 2001-05-04
FI115328B (fi) 2005-04-15
DK215690A (da) 1990-09-07
FI110726B (fi) 2003-03-14
EP0335521B1 (de) 1993-11-24
JP3423906B2 (ja) 2003-07-07
JPH03504283A (ja) 1991-09-19
WO1989008910A1 (en) 1989-09-21
NO903936L (no) 1990-11-09
FI904410A0 (fi) 1990-09-07
KR0161258B1 (ko) 1999-03-20
DK175478B1 (da) 2004-11-08
KR900700993A (ko) 1990-08-17
BR8907308A (pt) 1991-03-19
DK215690D0 (da) 1990-09-07
JP3321156B2 (ja) 2002-09-03
NO982568L (no) 1990-11-09
NO304858B1 (no) 1999-02-22
NO903936D0 (no) 1990-09-10
IE61863B1 (en) 1994-11-30
DE68910859T2 (de) 1994-12-08
ES2047664T3 (es) 1994-03-01
IE890774L (en) 1989-09-11
PT89978B (pt) 1995-03-01
DE68929442D1 (de) 2003-01-23
JP2000148172A (ja) 2000-05-26
EP0335521A1 (de) 1989-10-04
CA1335003C (en) 1995-03-28
AU608432B2 (en) 1991-03-28
EP0548054A3 (de) 1994-01-12
NO982568D0 (no) 1998-06-04
AU3355489A (en) 1989-10-05
DE68929442T2 (de) 2003-10-02
DE68910859D1 (de) 1994-01-05
NO316610B1 (no) 2004-03-08
ES2188588T3 (es) 2003-07-01
HK135896A (en) 1996-08-02

Similar Documents

Publication Publication Date Title
EP0548054B1 (de) Anordnung zur Feststellung der Anwesenheit von Sprachlauten
US5276765A (en) Voice activity detection
US4630304A (en) Automatic background noise estimator for a noise suppression system
US5749067A (en) Voice activity detector
US5579435A (en) Discriminating between stationary and non-stationary signals
US5970441A (en) Detection of periodicity information from an audio signal
JPH09212195A (ja) 音声活性検出装置及び移動局並びに音声活性検出方法
KR20010075343A (ko) 저비트율 스피치 코더용 노이즈 억제 방법 및 그 장치
US5579432A (en) Discriminating between stationary and non-stationary signals
EP0634041B1 (de) Verfahren und vorrichtung zur kodierung/dekodierung von hintergrundgeräuschen
JPH08221097A (ja) 音声成分の検出法
WO1988007738A1 (en) An adaptive multivariate estimating apparatus
US20030125937A1 (en) Vector estimation system, method and associated encoder
NZ286953A (en) Speech encoder/decoder: discriminating between speech and background sound

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AC Divisional application: reference to earlier application

Ref document number: 335521

Country of ref document: EP

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE CH DE ES FR GB GR IT LI LU NL SE

17P Request for examination filed

Effective date: 19930817

PUAL Search report despatched

Free format text: ORIGINAL CODE: 0009013

AK Designated contracting states

Kind code of ref document: A3

Designated state(s): AT BE CH DE ES FR GB GR IT LI LU NL SE

17Q First examination report despatched

Effective date: 19970716

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AC Divisional application: reference to earlier application

Ref document number: 335521

Country of ref document: EP

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AT BE CH DE ES FR GB GR IT LI LU NL SE

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20021211

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20021211

Ref country code: CH

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20021211

Ref country code: AT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20021211

REF Corresponds to:

Ref document number: 229683

Country of ref document: AT

Date of ref document: 20021215

Kind code of ref document: T

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

RIC1 Information provided on ipc code assigned before grant

Free format text: 7G 10L 11/02 A

REG Reference to a national code

Ref country code: CH

Ref legal event code: EP

REF Corresponds to:

Ref document number: 68929442

Country of ref document: DE

Date of ref document: 20030123

REG Reference to a national code

Ref country code: SE

Ref legal event code: TRGR

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

REG Reference to a national code

Ref country code: ES

Ref legal event code: FG2A

Ref document number: 2188588

Country of ref document: ES

Kind code of ref document: T3

ET Fr: translation filed
PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

NLS Nl: assignments of ep-patents

Owner name: LG ELECTRONICS INC.

26N No opposition filed

Effective date: 20030912

REG Reference to a national code

Ref country code: GB

Ref legal event code: 732E

REG Reference to a national code

Ref country code: FR

Ref legal event code: TP

REG Reference to a national code

Ref country code: HK

Ref legal event code: WD

Ref document number: 1013496

Country of ref document: HK

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: SE

Payment date: 20080306

Year of fee payment: 20

Ref country code: NL

Payment date: 20080303

Year of fee payment: 20

Ref country code: GB

Payment date: 20080305

Year of fee payment: 20

Ref country code: LU

Payment date: 20080313

Year of fee payment: 20

Ref country code: IT

Payment date: 20080327

Year of fee payment: 20

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: FR

Payment date: 20080311

Year of fee payment: 20

Ref country code: ES

Payment date: 20080418

Year of fee payment: 20

Ref country code: DE

Payment date: 20080306

Year of fee payment: 20

BE20 Be: patent expired

Owner name: *LG ELECTRONICS INC.

Effective date: 20090310

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: BE

Payment date: 20081013

Year of fee payment: 20

REG Reference to a national code

Ref country code: GB

Ref legal event code: PE20

Expiry date: 20090309

NLV7 Nl: ceased due to reaching the maximum lifetime of a patent

Effective date: 20090310

REG Reference to a national code

Ref country code: ES

Ref legal event code: FD2A

Effective date: 20090311

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: NL

Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION

Effective date: 20090310

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION

Effective date: 20090309

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: ES

Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION

Effective date: 20090311