EP0335521A1 - Detection of the presence of a speech signal - Google Patents
Detection of the presence of a speech signal
- Publication number
- EP0335521A1 (application number EP89302422A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- measure
- speech
- signal
- noise
- voice activity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L25/84—Detection of presence or absence of voice signals for discriminating voice from noise
Definitions
- a voice activity detector is a device which is supplied with a signal with the object of detecting periods of speech, or periods containing only noise.
- Although the present invention is not limited thereto, one application of particular interest for such detectors is in mobile radio telephone systems, where knowledge of the presence or absence of speech can be exploited by a speech coder to improve the efficient utilisation of radio spectrum, and where the noise level (from a vehicle-mounted unit) is also likely to be high.
- The task of voice activity detection is to locate a measure which differs appreciably between speech and non-speech periods.
- In apparatus which includes a speech coder, a number of parameters are readily available from one or other stage of the coder, and it is therefore desirable to economise on processing by utilising some such parameter.
- the main noise sources occur in known defined areas of the frequency spectrum. For example, in a moving car much of the noise (eg, engine noise) is concentrated in the low frequency regions of the spectrum. Where such knowledge of the spectral position of noise is available, it is desirable to base the decision as to whether speech is present or absent upon measurements taken from that portion of the spectrum which contains relatively little noise. It would, of course, be possible in practice to pre-filter the signal before analysing to detect speech activity, but where the voice activity detector follows the output of a speech coder, prefiltering would distort the voice signal to be coded.
- voice activity detection apparatus comprising means for receiving an input signal, means for estimating the noise signal component of the input signal, means for continually forming a measure M of the spectral similarity between a portion of the input signal and the noise signal, and means for comparing a parameter derived from the measure M with a threshold value T to produce an output to indicate the presence or absence of speech in dependence upon whether or not that value is exceeded.
- voice activity detection apparatus comprising: means for continually forming a spectral distortion measure of the similarity between a portion of the input signal and earlier portions of the input signal; and means for comparing the degree of variation between successive values of the measure with a threshold value to produce an output indicating the presence or absence of speech in dependence upon whether or not that value is exceeded.
- the measure is the Itakura-Saito Distortion Measure.
- a frame of n signal samples (s0, s1, s2, s3, s4 ... sn-1) will, when passed through a notional fourth order finite impulse response (FIR) digital filter of impulse response (1, h0, h1, h2, h3), result in a filtered signal (ignoring samples from previous frames)
- s′ = (s0), (s1 + h0s0), (s2 + h0s1 + h1s0), (s3 + h0s2 + h1s1 + h2s0), (s4 + h0s3 + h1s2 + h2s1 + h3s0), (s5 + h0s4 + h1s3 + h2s2 + h3s1), (s6 + h0s5 + h1s4 + h2s3 + h3s2), (s7 ...
- FIR: finite impulse response
- R′0 can be obtained from a combination of the autocorrelation coefficients R i , weighted by the bracketed constants which determine the frequency band to which the value of R′0 is responsive.
- the bracketed terms are the autocorrelation coefficients of the impulse response of the notional filter, so that the expression above may be simplified to $$R'_0 = R_0 H_0 + 2\sum_{i=1}^{N} R_i H_i \qquad (1)$$ where N is the filter order and H i are the (un-normalised) autocorrelation coefficients of the impulse response of the filter.
- the effect on the signal autocorrelation coefficients of filtering a signal may be simulated by producing a weighted sum of the autocorrelation coefficients of the (unfiltered) signal, using the impulse response that the required filter would have had.
- a relatively simple algorithm involving a small number of multiplication operations, may simulate the effect of a digital filter requiring typically a hundred times this number of multiplication operations.
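The equivalence between direct filtering and a weighted sum of autocorrelation coefficients can be checked numerically. The following is a minimal Python sketch, not part of the patent: the function names and the sample values are illustrative assumptions. It treats a single isolated frame and includes the tail of the filter output, in which case the two energies agree to rounding error; with running frames the match is only approximate because of edge effects.

```python
def autocorr(x, max_lag):
    """Un-normalised autocorrelation R[0..max_lag] of a finite sequence."""
    n = len(x)
    return [sum(x[k] * x[k + lag] for k in range(n - lag))
            for lag in range(max_lag + 1)]


def fir_filter(x, h):
    """Direct FIR filtering (full convolution, zero initial state)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def filtered_energy_from_autocorr(R, H):
    """Weighted sum R0*H0 + 2 * sum_{i=1..N} R_i*H_i."""
    return R[0] * H[0] + 2.0 * sum(r * c for r, c in zip(R[1:], H[1:]))


# Illustrative frame and fourth order impulse response (1, h0, h1, h2, h3).
frame = [1.0, -2.0, 3.0, 0.5, -1.5, 2.0, 0.25, -0.75]
h = [1.0, -0.9, 0.2, 0.1, -0.05]
N = len(h) - 1  # filter order

# Energy obtained by actually filtering the frame.
direct = sum(v * v for v in fir_filter(frame, h))
# Energy obtained from N + 1 multiplications on autocorrelation terms.
simulated = filtered_energy_from_autocorr(autocorr(frame, N), autocorr(h, N))
print(direct, simulated)
```

When the signal autocorrelation is already produced by the coder, only the N + 1 multiply-accumulate steps in `filtered_energy_from_autocorr` remain per frame, which is the saving the text refers to.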
- This filtering operation may alternatively be viewed as a form of spectrum comparison, with the signal spectrum being matched against a reference spectrum (the inverse of the response of the notional filter). Since the notional filter in this application is selected so as to approximate the inverse of the noise spectrum, this operation may be viewed as a spectral comparison between speech and noise spectra, and the zeroth autocorrelation coefficient thus generated (i.e. the energy of the inverse filtered signal) as a measure of dissimilarity between the spectra.
- the Itakura-Saito distortion measure is used in LPC to assess the match between the predictor filter and the input spectrum, and in one form is expressed as $$M = R_0 A_0 + 2\sum_{i=1}^{N} R_i A_i$$ where R i are the autocorrelation coefficients of the input signal and A0 etc are the autocorrelation coefficients of the LPC parameter set.
- the LPC coefficients are the taps of an FIR filter having the inverse spectral response of the input signal so that the LPC coefficient set is the impulse response of the inverse LPC filter, it will be apparent that the Itakura-Saito Distortion Measure is in fact merely a form of equation 1, wherein the filter response H is the inverse of the spectral shape of an all-pole model of the input signal.
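The relationship between the LPC inverse filter and the weighted-sum form can be illustrated with a short Python sketch. It is an illustration only: the Levinson-Durbin routine, the function names and the sample frame are assumptions, not taken from the patent. For a frame analysed with its own LPC coefficients, the value R0·A0 + 2·Σ Ri·Ai equals the residual (prediction error) energy returned by the recursion.

```python
def autocorr(x, max_lag):
    """Un-normalised autocorrelation R[0..max_lag] of a finite sequence."""
    n = len(x)
    return [sum(x[k] * x[k + lag] for k in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(R, order):
    """Return (a, err): inverse-filter taps with a[0] == 1, and residual energy."""
    a = [1.0] + [0.0] * order
    err = R[0]
    for i in range(1, order + 1):
        acc = R[i] + sum(a[j] * R[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def weighted_sum_measure(R, a):
    """R0*A0 + 2*sum(R_i*A_i), with A the autocorrelation of the taps a."""
    A = autocorr(a, len(a) - 1)
    return R[0] * A[0] + 2.0 * sum(r * c for r, c in zip(R[1:], A[1:]))


# Illustrative frame; a real coder would use e.g. 160 samples and order 8.
frame = [0.3, 1.2, -0.7, 0.5, 2.0, -1.1, 0.4, 0.9, -0.6, 1.5, -0.2, 0.8]
ORDER = 3
R = autocorr(frame, ORDER)
a, residual = levinson_durbin(R, ORDER)
M = weighted_sum_measure(R, a)
print(M, residual)  # equal up to rounding error
```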
- a signal from a microphone is received at an input 1 and converted to digital samples s at a suitable sampling rate by an analogue to digital converter 2.
- An LPC analysis unit 3 (in a known type of LPC coder) then derives, for successive frames of n (eg 160) samples, a set of N (eg 8 or 12) LPC filter coefficients L i which are transmitted to represent the input speech.
- the speech signal s also enters a correlator unit 4 (normally part of the LPC coder 3 since the autocorrelation vector R i of the speech is also usually produced as a step in the LPC analysis although it will be appreciated that a separate correlator could be provided).
- the correlator 4 produces the autocorrelation vector R i , including the zero order correlation coefficient R0 and at least 2 further autocorrelation coefficients R1, R2, R3. These are then supplied to a multiplier unit 5.
- a second input 11 is connected to a second microphone located distant from the speaker so as to receive only background noise.
- the input from this microphone is converted to a digital input sample train by analogue to digital converter 12 and LPC analysed by a second LPC analyser 13.
- the "noise" LPC coefficients produced from analyser 13 are passed to correlator unit 14, and the autocorrelation vector thus produced is multiplied term by term with the autocorrelation coefficients R i of the input signal from the speech microphone in multiplier 5 and the weighted coefficients thus produced are combined in adder 6 according to Equation 1, so as to apply a filter having the inverse shape of the noise spectrum from the noise-only microphone (which in practice is the same as the shape of the noise spectrum in the signal-plus-noise microphone) and thus filter out most of the noise.
- the resulting measure M is thresholded by thresholder 7 to produce a logic output 8 indicating the presence or absence of speech; if M is high, speech is deemed to be present.
- This embodiment does, however, require two microphones and two LPC analysers, which adds to the expense and complexity of the equipment necessary.
- another embodiment uses a corresponding measure formed using the autocorrelations from the noise microphone 11 and the LPC coefficients from the main microphone 1, so that an extra autocorrelator rather than an LPC analyser is necessary.
- a buffer 15 which stores a set of LPC coefficients (or the autocorrelation vector of the set) derived from the microphone input 1 in a period identified as being a "non speech" (ie noise only) period. These coefficients are then used to derive a measure using equation 1, which also of course corresponds to the Itakura-Saito Distortion Measure, except that a single stored frame of LPC coefficients corresponding to an approximation of the inverse noise spectrum is used, rather than the present frame of LPC coefficients.
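A minimal Python sketch of this stored-noise-frame arrangement follows. Everything in it is an assumption made for illustration: the synthetic low-frequency noise, the sinusoid standing in for speech, the LPC order and the function names. It stores the autocorrelation of the LPC coefficients of one noise-only frame and then evaluates R0·A0 + 2·Σ Ri·Ai for later frames; the frame containing the extra component gives a much larger value.

```python
import math
import random


def autocorr(x, max_lag):
    """Un-normalised autocorrelation R[0..max_lag] of a finite sequence."""
    n = len(x)
    return [sum(x[k] * x[k + lag] for k in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc(R, order):
    """Levinson-Durbin recursion; returns inverse-filter taps with a[0] == 1."""
    a = [1.0] + [0.0] * order
    err = R[0]
    for i in range(1, order + 1):
        k = -(R[i] + sum(a[j] * R[i - j] for j in range(1, i))) / err
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def measure(frame, noise_A):
    """Weighted sum of the frame autocorrelation against the stored noise terms."""
    R = autocorr(frame, len(noise_A) - 1)
    return R[0] * noise_A[0] + 2.0 * sum(r * c for r, c in zip(R[1:], noise_A[1:]))


def low_frequency_noise(n, rng):
    """Hypothetical stand-in for vehicle noise: first-order low-pass noise."""
    state, out = 0.0, []
    for _ in range(n):
        state = 0.9 * state + rng.gauss(0.0, 1.0)
        out.append(state)
    return out


def demo(order=4, frame_len=160):
    rng = random.Random(0)
    noise_ref = low_frequency_noise(frame_len, rng)   # frame known to be noise only
    noise_now = low_frequency_noise(frame_len, rng)   # later noise-only frame
    # Later frame with an added higher-frequency component standing in for speech.
    speech_like = [v + 4.0 * math.sin(2.0 * math.pi * 0.3 * i)
                   for i, v in enumerate(noise_now)]
    # Contents of the buffer: autocorrelation of the noise-period LPC taps.
    noise_A = autocorr(lpc(autocorr(noise_ref, order), order), order)
    return measure(noise_now, noise_A), measure(speech_like, noise_A)


m_noise, m_speech = demo()
print(m_noise, m_speech)
```

A threshold placed between the two printed values would separate the frames in this toy case; real signals need the threshold adaptation and the independent update control discussed later in the text.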
- the LPC coefficient vector L i output by analyser 3 is also routed to a correlator 14, which produces the autocorrelation vector of the LPC coefficient vector.
- the buffer memory 15 is controlled by the speech/non-speech output of thresholder 7, in such a way that during "speech" frames the buffer retains the "noise" autocorrelation coefficients, but during "noise" frames a new set of LPC coefficients may be used to update the buffer, for example by a multiple switch 16, via which outputs of the correlator 14, carrying each autocorrelation coefficient, are connected to the buffer 15. It will be appreciated that correlator 14 could be positioned after buffer 15. Further, the speech/non-speech decision for coefficient update need not be from output 8, but could be (and preferably is) otherwise derived.
- the LPC coefficients stored in the buffer are updated from time to time, so that the apparatus is thus capable of tracking changes in the noise spectrum. It will be appreciated that such updating of the buffer may be necessary only occasionally, or may occur only once at the start of operation of the detector, if (as is often the case) the noise spectrum is relatively stationary over time, but in a mobile radio environment frequent updating is preferred.
- the system initially employs equation 1 with coefficient terms corresponding to a simple fixed high pass filter, and then subsequently starts to adapt by switching over to using "noise period" LPC coefficients. If, for some reason, speech detection fails, the system may return to using the simple high pass filter.
- LPC analysis unit 13 is simply replaced by an adaptive filter (for example a transversal FIR or lattice filter), connected so as to whiten the noise input by modelling the inverse filter, and its coefficients are supplied as before to autocorrelator 14.
- LPC analysis means 3 is replaced by such an adaptive filter, and buffer means 15 is omitted, but switch 16 operates to prevent the adaptive filter from adapting its coefficients during speech periods.
- the LPC coefficient vector is simply the impulse response of an FIR filter which has a response approximating the inverse spectral shape of the input signal.
- When the Itakura-Saito Distortion Measure between adjacent frames is formed, it is in fact equal to the power of the signal as filtered by the LPC filter of the previous frame. So if the spectra of adjacent frames differ little, a correspondingly small amount of the spectral power of a frame will escape filtering and the measure will be low.
- a large interframe spectral difference produces a high Itakura-Saito Distortion Measure, so that the measure reflects the spectral similarity of adjacent frames.
- the Itakura-Saito Distortion Measure between adjacent frames of a noisy signal containing intermittent speech is higher during periods of speech than periods of noise; the degree of variation (as illustrated by the standard deviation) is higher, and less intermittently variable.
- the standard deviation of the standard deviation of M is also a reliable measure; the effect of taking each standard deviation is essentially to smooth the measure.
- the measured parameter used to decide whether speech is present is preferably the standard deviation of the Itakura-Saito Distortion Measure, but other measures of variance and other spectral distortion measures (based for example on FFT analysis) could be employed.
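As an illustration of using the degree of variation rather than the measure itself, the sketch below computes a running standard deviation over the most recent values of M. The window length, the function name and the sample values are assumptions for illustration only.

```python
from collections import deque
from statistics import pstdev


def rolling_std(values, window):
    """Standard deviation of the most recent `window` values, one result per frame."""
    recent = deque(maxlen=window)
    out = []
    for v in values:
        recent.append(v)
        out.append(pstdev(recent) if len(recent) > 1 else 0.0)
    return out


# Hypothetical per-frame measures: steady during noise, erratic during speech.
noise_like = [0.10, 0.11, 0.10, 0.11, 0.10]
speech_like = [0.10, 0.90, 0.20, 1.40, 0.30]

print(rolling_std(noise_like, 3)[-1], rolling_std(speech_like, 3)[-1])

# Applying the function twice gives the "standard deviation of the standard
# deviation", which smooths the parameter further.
smoothed = rolling_std(rolling_std(speech_like, 3), 3)
```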
- An adaptive threshold may be employed in voice activity detection. Such thresholds must not be adjusted during speech periods or the speech signal will be thresholded out. It is accordingly necessary to control the threshold adapter using a speech/non-speech control signal, and it is preferable that this control signal should be independent of the output of the threshold adapter.
- the threshold T is adaptively adjusted so as to keep the threshold level just above the level of the measure M when noise only is present. Since the measure will in general vary randomly when noise is present, the threshold is varied by determining an average level over a number of blocks, and setting the threshold at a level proportional to this average. In a noisy environment this is not usually sufficient, however, and so an assessment of the degree of variation of the parameter over several blocks is also taken into account.
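One way to combine an average level with the degree of variation is sketched below. The particular formula and the constants `scale` and `spread_weight` are hypothetical choices for illustration; the text only states that both quantities are taken into account.

```python
from statistics import mean, pstdev


def threshold_from_noise_blocks(noise_measures, scale=1.2, spread_weight=2.0):
    """Threshold just above the noise-only level of M.

    noise_measures: values of M from recent blocks judged to contain noise only.
    scale:          proportionality factor applied to the average (assumed value).
    spread_weight:  extra margin per unit of standard deviation (assumed value).
    """
    return scale * mean(noise_measures) + spread_weight * pstdev(noise_measures)


steady = threshold_from_noise_blocks([1.0, 1.0, 1.0])
variable = threshold_from_noise_blocks([0.5, 1.0, 1.5])
print(steady, variable)  # the more variable noise gets the higher threshold
```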
- an input 1 receives a signal which is sampled and digitised by analogue to digital converter (ADC) 2, and supplied to the input of an inverse filter analyser 3, which in practice is part of a speech coder with which the voice activity detector is to work, and which generates coefficients L i (typically 8) of a filter corresponding to the inverse of the input signal spectrum.
- ADC: analogue to digital converter
- the digitised signal is also supplied to an autocorrelator 4 (which is part of analyser 3), which generates the autocorrelation vector R i of the input signal (or at least as many low order terms as there are LPC coefficients). Operation of these parts of the apparatus is as described with reference to Figures 1 and 2.
- the autocorrelation coefficients R i are then averaged over several successive speech frames (typically 5-20 ms long) to improve their reliability. This may be achieved by storing each set of autocorrelation coefficients output by autocorrelator 4 in a buffer 4a, and employing an averager 4b to produce a weighted sum of the current autocorrelation coefficients R i and those from previous frames stored in and supplied from buffer 4a.
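The buffer and averager arrangement can be sketched as follows. The class name, the weights and the division by the sum of the weights in use are assumptions made for illustration; the text specifies only a weighted sum of current and stored coefficients.

```python
from collections import deque


class AutocorrAverager:
    """Weighted average of the current and previous autocorrelation vectors."""

    def __init__(self, weights):
        # weights[0] applies to the current frame, weights[1] to the previous one, ...
        self.weights = list(weights)
        self.history = deque(maxlen=len(self.weights))  # plays the role of buffer 4a

    def update(self, R):
        """Store the new vector and return the averaged vector (averager 4b)."""
        self.history.appendleft(list(R))
        used = self.weights[:len(self.history)]
        total = sum(used)
        return [sum(w * past[i] for w, past in zip(used, self.history)) / total
                for i in range(len(R))]


averager = AutocorrAverager([0.5, 0.5])
first = averager.update([2.0, 1.0])    # only one frame available so far
second = averager.update([4.0, 3.0])   # mean of the two frames
third = averager.update([0.0, 0.0])    # the oldest frame has dropped out
print(first, second, third)
```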
- the averaged autocorrelation coefficients Ra i thus derived are supplied to weighting and adding means 5,6 which receives also the autocorrelation vector A i of stored noise-period inverse filter coefficients L i from an autocorrelator 14 via buffer 15, and forms from Ra i and A i a measure M preferably defined as:
- This measure is then thresholded by thresholder 7 against a threshold level, and the logical result provides an indication of the presence or absence of speech at output 8.
- So that the inverse filter coefficients L i correspond to a fair estimate of the inverse of the noise spectrum, it is desirable to update these coefficients during periods of noise (and, of course, not to update during periods of speech). It is, however, preferable that the speech/non-speech decision on which the updating is based does not depend upon the result of the updating, or else a single wrongly identified frame of signal may result in the voice activity detector subsequently going "out of lock" and wrongly identifying following frames.
- a control signal generating circuit 20, effectively a separate voice activity detector, forms an independent control signal indicating the presence or absence of speech to control inverse filter analyser 3 (or buffer 8) so that the inverse filter autocorrelation coefficients A i used to form the measure M are only updated during "noise" periods.
- the control signal generator circuit 20 includes LPC analyser 21 (which again may be part of a speech coder and, specifically, may be performed by analyser 3), which produces a set of LPC coefficients M i corresponding to the input signal and an autocorrelator 21a (which may be performed by autocorrelator 3a) which derives the autocorrelation coefficients B i of M i .
- a measure of the spectral similarity between the input speech frame and the preceding speech frame is thus calculated; this may be the Itakura-Saito distortion measure between R i of the present frame and B i of the preceding frame, as disclosed above, or it may instead be derived by calculating the Itakura-Saito distortion measure for R i and B i of the present frame, and subtracting (in subtractor 25) the corresponding measure for the previous frame stored in buffer 24, to generate a spectral difference signal (in either case, the measure is preferably energy-normalised by dividing by R0).
- the buffer 24 is then, of course, updated.
- a voiced speech detection circuit comprising a pitch analyser 27 (which in practice may operate as part of a speech coder, and in particular may measure the long term predictor lag value produced in a multipulse LPC coder).
- the pitch analyser 27 produces a logic signal which is "true" when voiced speech is detected, and this signal, together with the thresholded measure derived from thresholder 26 (which will generally be "true" when unvoiced speech is present) are supplied to the inputs of a NOR gate 28 to generate a signal which is "false" when speech is present and "true" when noise is present.
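The gating logic can be written out as a small Python sketch. The function names and the coefficient values are hypothetical; only the NOR combination of the two detector outputs comes from the text.

```python
def noise_only(voiced_detected, unvoiced_detected):
    """NOR of the two detector outputs: True only when neither indicates speech."""
    return not (voiced_detected or unvoiced_detected)


def maybe_update(stored_coeffs, new_coeffs, voiced_detected, unvoiced_detected):
    """Replace the stored noise coefficients only during noise-only frames."""
    if noise_only(voiced_detected, unvoiced_detected):
        return list(new_coeffs)
    return stored_coeffs


stored = [1.0, -0.8]
stored = maybe_update(stored, [1.0, -0.7], voiced_detected=True, unvoiced_detected=False)
kept_during_speech = stored                      # still [1.0, -0.8]
stored = maybe_update(stored, [1.0, -0.7], voiced_detected=False, unvoiced_detected=False)
updated_during_noise = stored                    # now [1.0, -0.7]
```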
- This signal is supplied to buffer 8 (or to inverse filter analyser 3) so that inverse filter coefficients L i are only updated during noise periods.
- Threshold adapter 29 is also connected to receive the non-speech signal control output of control signal generator circuit 20. The output of the threshold adapter 29 is supplied to thresholder 7. The threshold adapter operates to increment or decrement the threshold in steps which are a proportion of the instant threshold value, until the threshold approximates the noise power level (which may conveniently be derived from, for example, weighting and adding circuits 22, 23). When the input signal is very low, it may be desirable that the threshold is automatically set to a fixed, low, level since at the low signal levels the effect of signal quantisation produced by ADC 2 can produce unreliable results.
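A minimal sketch of such a proportional-step adapter is given below. The step size, the fixed floor and the function name are assumptions for illustration; the text does not give numerical values.

```python
def adapt_threshold(threshold, noise_level, is_noise, step=0.05, floor=1e-3):
    """Move the threshold one proportional step towards the noise level.

    is_noise: independent control signal; the threshold is frozen during speech.
    step:     fraction of the current threshold used as the step (assumed value).
    floor:    fixed low level used when the signal is very small (assumed value).
    """
    if not is_noise:
        return threshold
    if threshold < noise_level:
        threshold *= 1.0 + step
    elif threshold > noise_level:
        threshold *= 1.0 - step
    return max(threshold, floor)


# Starting well below the noise level, the threshold climbs and then hovers
# within one step of it.
t = 1.0
for _ in range(200):
    t = adapt_threshold(t, noise_level=3.0, is_noise=True)
print(t)
```

Because each step is a proportion of the current value, the adapter settles at the same relative accuracy whether the noise level is large or small.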
- hangover generating means 30 which operates to measure the duration of indications of speech after thresholder 7 and, when the presence of speech has been indicated for a period in excess of a predetermined time constant, the output is held high for a short "hangover" period. In this way, clipping of the middle of low-level speech bursts is avoided, and appropriate selection of the time constant prevents triggering of the hangover generator 30 by short spikes of noise which are falsely indicated as speech.
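The hangover behaviour can be sketched on a sequence of per-frame decisions. The frame counts `min_burst` and `hangover` and the function name are hypothetical; the text speaks only of a predetermined time constant and a short hangover period.

```python
def apply_hangover(decisions, min_burst=3, hangover=4):
    """Extend speech indications by `hangover` frames after sufficiently long bursts.

    decisions: per-frame booleans from the thresholder (True means speech).
    min_burst: a burst must exceed this many consecutive frames to arm the hangover.
    hangover:  number of frames for which the output is held high afterwards.
    """
    out = []
    run = 0    # consecutive frames currently indicated as speech
    hold = 0   # hangover frames still to be output
    for is_speech in decisions:
        if is_speech:
            run += 1
            if run > min_burst:
                hold = hangover        # burst long enough: arm the hangover
            out.append(True)
        else:
            run = 0
            if hold > 0:
                hold -= 1
                out.append(True)       # held high during the hangover period
            else:
                out.append(False)
    return out


long_burst = apply_hangover([True] * 4 + [False] * 6, min_burst=3, hangover=2)
short_spike = apply_hangover([True, False, False], min_burst=3, hangover=2)
print(long_burst, short_spike)
```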
- DSP: Digital Signal Processing
- the voice detection apparatus may be implemented as part of an LPC codec.
- autocorrelation coefficients of the signal, or related measures such as partial correlation ("parcor") coefficients
- the voice detection may take place distantly from the codec.
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Telephonic Communication Services (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
- Telephone Function (AREA)
- Mobile Radio Communication Systems (AREA)
- Noise Elimination (AREA)
- Geophysics And Detection Of Objects (AREA)
- Measuring Pulse, Heart Rate, Blood Pressure Or Blood Flow (AREA)
- Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
- Signal Processing Not Specific To The Method Of Recording And Reproducing (AREA)
- Indexing, Searching, Synchronizing, And The Amount Of Synchronization Travel Of Record Carriers (AREA)
- Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AT93200015T ATE229683T1 | 1988-03-11 | 1989-03-10 | Arrangement for detecting the presence of speech sounds |
EP93200015A EP0548054B1 | 1988-03-11 | 1989-03-10 | Device for detecting the presence of a speech signal |
AT89302422T ATE97757T1 | 1988-03-11 | 1989-03-10 | Detection of the presence of a speech signal |
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB888805795A GB8805795D0 (en) | 1988-03-11 | 1988-03-11 | Voice activity detector |
GB8805795 | 1988-03-11 | ||
GB8813346 | 1988-06-06 | ||
GB888813346A GB8813346D0 (en) | 1988-06-06 | 1988-06-06 | Voice activity detection |
GB8820105 | 1988-08-24 | ||
GB888820105A GB8820105D0 (en) | 1988-08-24 | 1988-08-24 | Voice activity detection |
Related Child Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP93200015.1 Division-Into | 1989-03-10 | ||
EP93200015A Division EP0548054B1 | 1988-03-11 | 1989-03-10 | Device for detecting the presence of a speech signal |
Publications (2)
Publication Number | Publication Date |
---|---|
EP0335521A1 | 1989-10-04 |
EP0335521B1 | 1993-11-24 |
Family
ID=27263821
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP93200015A Expired - Lifetime EP0548054B1 | 1988-03-11 | 1989-03-10 | Device for detecting the presence of a speech signal |
EP89302422A Expired - Lifetime EP0335521B1 | 1988-03-11 | 1989-03-10 | Detection of the presence of a speech signal |
Family Applications Before (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP93200015A Expired - Lifetime EP0548054B1 | 1988-03-11 | 1989-03-10 | Device for detecting the presence of a speech signal |
Country Status (16)
Country | Link |
---|---|
EP (2) | EP0548054B1 (fr) |
JP (2) | JP3321156B2 (fr) |
KR (1) | KR0161258B1 (fr) |
AU (1) | AU608432B2 (fr) |
BR (1) | BR8907308A (fr) |
CA (1) | CA1335003C (fr) |
DE (2) | DE68929442T2 (fr) |
DK (1) | DK175478B1 (fr) |
ES (2) | ES2188588T3 (fr) |
FI (2) | FI110726B (fr) |
HK (1) | HK135896A (fr) |
IE (1) | IE61863B1 (fr) |
NO (2) | NO304858B1 (fr) |
NZ (1) | NZ228290A (fr) |
PT (1) | PT89978B (fr) |
WO (1) | WO1989008910A1 (fr) |
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0435458A1 * | 1989-11-28 | 1991-07-03 | Nec Corporation | Discriminator between speech and other data transmitted in the voice band |
EP0451796A1 * | 1990-04-09 | 1991-10-16 | Kabushiki Kaisha Toshiba | Speech detection apparatus with reduced influence of input level and noise |
FR2697101A1 * | 1992-10-21 | 1994-04-22 | Sextant Avionique | Speech detection method. |
EP0625774A2 * | 1993-05-19 | 1994-11-23 | Matsushita Electric Industrial Co., Ltd. | Method and apparatus for speech detection |
WO1994028542A1 * | 1993-05-26 | 1994-12-08 | Telefonaktiebolaget Lm Ericsson | Discrimination between stationary and non-stationary signals |
EP0633658A2 * | 1993-07-06 | 1995-01-11 | Hughes Aircraft Company | Voice-activated, transmission-coupled automatic gain control circuit |
WO1995012879A1 * | 1993-11-02 | 1995-05-11 | Telefonaktiebolaget Lm Ericsson | Discriminating between stationary and non-stationary signals |
FR2727236A1 * | 1994-11-22 | 1996-05-24 | Alcatel Mobile Comm France | Voice activity detection |
WO1996034382A1 * | 1995-04-28 | 1996-10-31 | Northern Telecom Limited | Methods and apparatus for distinguishing speech intervals from noise intervals in audio signals |
EP0768770A1 * | 1995-10-13 | 1997-04-16 | France Telecom | Method and device for generating comfort noise in a digital speech transmission system |
GB2306010A (en) * | 1995-10-04 | 1997-04-23 | Univ Wales Medicine | A method of classifying signals |
US5632004A (en) * | 1993-01-29 | 1997-05-20 | Telefonaktiebolaget Lm Ericsson | Method and apparatus for encoding/decoding of background sounds |
WO1998001847A1 * | 1996-07-03 | 1998-01-15 | British Telecommunications Public Limited Company | Voice activity detector |
EP0786760A3 * | 1996-01-29 | 1998-09-16 | Texas Instruments Incorporated | Speech coding |
WO2000063887A1 * | 1999-04-19 | 2000-10-26 | Motorola Inc. | Noise suppression using external voice activity detection |
WO2005048619A1 * | 2003-11-12 | 2005-05-26 | Koninklijke Philips Electronics N.V. | Method and apparatus for transferring non-voice data over a telephone channel |
EP1703493A2 * | 1994-08-10 | 2006-09-20 | Qualcomm Incorporated | Method and apparatus for selecting an encoding rate in a variable rate vocoder |
EP1887559A3 * | 2006-08-10 | 2009-01-14 | STMicroelectronics Asia Pacific Pte Ltd. | Low-complexity voice activity detector based on Yule-Walker |
CN101010722B * | 2004-08-30 | 2012-04-11 | 诺基亚西门子网络公司 | Device and method for detecting voice activity in a speech signal |
US8244528B2 (en) | 2008-04-25 | 2012-08-14 | Nokia Corporation | Method and apparatus for voice activity determination |
US8275136B2 (en) | 2008-04-25 | 2012-09-25 | Nokia Corporation | Electronic device speech enhancement |
US8611556B2 (en) | 2008-04-25 | 2013-12-17 | Nokia Corporation | Calibrating multiple microphones |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5241692A (en) * | 1991-02-19 | 1993-08-31 | Motorola, Inc. | Interference reduction system for a speech recognition device |
IN184794B (fr) * | 1993-09-14 | 2000-09-30 | British Telecomm | |
DE10052626A1 * | 2000-10-24 | 2002-05-02 | Alcatel Sa | Adaptive noise level estimator |
US7155388B2 (en) * | 2004-06-30 | 2006-12-26 | Motorola, Inc. | Method and apparatus for characterizing inhalation noise and calculating parameters based on the characterization |
US7139701B2 (en) * | 2004-06-30 | 2006-11-21 | Motorola, Inc. | Method for detecting and attenuating inhalation noise in a communication system |
US8708702B2 (en) * | 2004-09-16 | 2014-04-29 | Lena Foundation | Systems and methods for learning using contextual feedback |
US8175871B2 (en) | 2007-09-28 | 2012-05-08 | Qualcomm Incorporated | Apparatus and method of noise and echo reduction in multiple microphone audio systems |
US8954324B2 (en) | 2007-09-28 | 2015-02-10 | Qualcomm Incorporated | Multiple microphone voice activity detector |
US8223988B2 (en) | 2008-01-29 | 2012-07-17 | Qualcomm Incorporated | Enhanced blind source separation algorithm for highly correlated mixtures |
ES2371619B1 * | 2009-10-08 | 2012-08-08 | Telefónica, S.A. | Method for detecting speech segments. |
EP2491549A4 | 2009-10-19 | 2013-10-30 | Ericsson Telefon Ab L M | Detector and method for voice activity detection |
CN108985277B * | 2018-08-24 | 2020-11-10 | 广东石油化工学院 | Method and system for filtering background noise from a power signal |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4052568A (en) * | 1976-04-23 | 1977-10-04 | Communications Satellite Corporation | Digital voice switch |
GB2061676A (en) * | 1979-08-31 | 1981-05-13 | Marconi Co Ltd | Voice detector |
US4358738A (en) * | 1976-06-07 | 1982-11-09 | Kahn Leonard R | Signal presence determination method for use in a contaminated medium |
EP0127718A1 * | 1983-06-07 | 1984-12-12 | International Business Machines Corporation | Method for detecting activity in a voice transmission system |
EP0178933A2 * | 1984-10-17 | 1986-04-23 | Sharp Kabushiki Kaisha | Autocorrelation filter |
US4688256A (en) * | 1982-12-22 | 1987-08-18 | Nec Corporation | Speech detector capable of avoiding an interruption by monitoring a variation of a spectrum of an input signal |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3509281A (en) * | 1966-09-29 | 1970-04-28 | Ibm | Voicing detection system |
-
1989
- 1989-03-10 IE IE77489A patent/IE61863B1/en not_active IP Right Cessation
- 1989-03-10 PT PT89978A patent/PT89978B/pt not_active IP Right Cessation
- 1989-03-10 NZ NZ228290A patent/NZ228290A/en unknown
- 1989-03-10 ES ES93200015T patent/ES2188588T3/es not_active Expired - Lifetime
- 1989-03-10 DE DE68929442T patent/DE68929442T2/de not_active Expired - Lifetime
- 1989-03-10 AU AU33554/89A patent/AU608432B2/en not_active Expired
- 1989-03-10 KR KR1019890702099A patent/KR0161258B1/ko not_active IP Right Cessation
- 1989-03-10 DE DE68910859T patent/DE68910859T2/de not_active Expired - Lifetime
- 1989-03-10 JP JP50377289A patent/JP3321156B2/ja not_active Expired - Lifetime
- 1989-03-10 ES ES89302422T patent/ES2047664T3/es not_active Expired - Lifetime
- 1989-03-10 EP EP93200015A patent/EP0548054B1/fr not_active Expired - Lifetime
- 1989-03-10 BR BR898907308A patent/BR8907308A/pt not_active IP Right Cessation
- 1989-03-10 WO PCT/GB1989/000247 patent/WO1989008910A1/fr active IP Right Grant
- 1989-03-10 EP EP89302422A patent/EP0335521B1/fr not_active Expired - Lifetime
- 1989-03-10 CA CA000593386A patent/CA1335003C/fr not_active Expired - Lifetime
-
1990
- 1990-09-07 DK DK199002156A patent/DK175478B1/da not_active IP Right Cessation
- 1990-09-07 FI FI904410A patent/FI110726B/fi not_active IP Right Cessation
- 1990-09-10 NO NO903936A patent/NO304858B1/no not_active IP Right Cessation
-
1996
- 1996-07-25 HK HK135896A patent/HK135896A/xx not_active IP Right Cessation
-
1998
- 1998-06-04 NO NO982568A patent/NO316610B1/no not_active IP Right Cessation
-
1999
- 1999-11-18 JP JP32819899A patent/JP3423906B2/ja not_active Expired - Lifetime
-
2001
- 2001-05-04 FI FI20010933A patent/FI115328B/fi not_active IP Right Cessation
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4052568A (en) * | 1976-04-23 | 1977-10-04 | Communications Satellite Corporation | Digital voice switch |
US4358738A (en) * | 1976-06-07 | 1982-11-09 | Kahn Leonard R | Signal presence determination method for use in a contaminated medium |
GB2061676A (en) * | 1979-08-31 | 1981-05-13 | Marconi Co Ltd | Voice detector |
US4688256A (en) * | 1982-12-22 | 1987-08-18 | Nec Corporation | Speech detector capable of avoiding an interruption by monitoring a variation of a spectrum of an input signal |
EP0127718A1 (en) * | 1983-06-07 | 1984-12-12 | International Business Machines Corporation | Method for activity detection in a voice transmission system |
EP0178933A2 (en) * | 1984-10-17 | 1986-04-23 | Sharp Kabushiki Kaisha | Autocorrelation filter |
Non-Patent Citations (5)
Title |
---|
1977 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, and SIGNAL PROCESSING, Hartford, Connecticut, 9th-11th May 1977, pages 425-428, IEEE, New York, US; R.J. McAULAY: "Optimum speech classification and its application to adaptive noise cancellation" * |
IBM TECHNICAL DISCLOSURE BULLETIN, vol. 22, no. 7, December 1979, pages 2624-2625, New York, US; R.J. JOHNSON et al.: "Speech detector" * |
ICASSP'81 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, Atlanta, 30th March - 1st April 1981, vol. 3, pages 1082-1085, IEEE, New York, US; C.K. UN et al.: "Improving LPC analysis of noisy speech by autocorrelation subtraction method" * |
IEEE TRANSACTIONS ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, vol. ASSP-25, no. 4, August 1977, pages 338-343, New York, US; L.R. RABINER et al.: "Application of an LPC distance measure to the voiced-unvoiced-silence detection problem" * |
IEEE TRANSACTIONS ON COMMUNICATIONS, vol. COM-26, no. 1, January 1978, pages 140-145, IEEE, New York, US; P.G. DRAGO et al.: "Digital dynamic speech detectors" * |
Cited By (45)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0435458A1 (en) * | 1989-11-28 | 1991-07-03 | Nec Corporation | Discriminator between speech and other data transmitted in the voice band |
EP0451796A1 (en) * | 1990-04-09 | 1991-10-16 | Kabushiki Kaisha Toshiba | Speech detection apparatus with reduced influence of input level and noise |
US5293588A (en) * | 1990-04-09 | 1994-03-08 | Kabushiki Kaisha Toshiba | Speech detection apparatus not affected by input energy or background noise levels |
FR2697101A1 (en) * | 1992-10-21 | 1994-04-22 | Sextant Avionique | Speech detection method. |
EP0594480A1 (en) * | 1992-10-21 | 1994-04-27 | Sextant Avionique | Speech detection method |
US5572623A (en) * | 1992-10-21 | 1996-11-05 | Sextant Avionique | Method of speech detection |
US5632004A (en) * | 1993-01-29 | 1997-05-20 | Telefonaktiebolaget Lm Ericsson | Method and apparatus for encoding/decoding of background sounds |
EP0625774A2 (en) * | 1993-05-19 | 1994-11-23 | Matsushita Electric Industrial Co., Ltd. | Method and apparatus for speech detection |
EP0625774A3 (en) * | 1993-05-19 | 1996-10-30 | Matsushita Electric Ind Co Ltd | Method and apparatus for speech detection. |
US5611019A (en) * | 1993-05-19 | 1997-03-11 | Matsushita Electric Industrial Co., Ltd. | Method and an apparatus for speech detection for determining whether an input signal is speech or nonspeech |
WO1994028542A1 (en) * | 1993-05-26 | 1994-12-08 | Telefonaktiebolaget Lm Ericsson | Discriminating between stationary and non-stationary signals |
AU681551B2 (en) * | 1993-05-26 | 1997-08-28 | Telefonaktiebolaget Lm Ericsson (Publ) | Discriminating between stationary and non-stationary signals |
US5579432A (en) * | 1993-05-26 | 1996-11-26 | Telefonaktiebolaget Lm Ericsson | Discriminating between stationary and non-stationary signals |
AU670383B2 (en) * | 1993-05-26 | 1996-07-11 | Telefonaktiebolaget Lm Ericsson (Publ) | Discriminating between stationary and non-stationary signals |
EP0633658A2 (en) * | 1993-07-06 | 1995-01-11 | Hughes Aircraft Company | Voice-activated, transmission-coupled automatic gain control circuit |
EP0633658A3 (en) * | 1993-07-06 | 1996-01-17 | Hughes Aircraft Co | Voice-activated, transmission-coupled automatic gain control circuit. |
AU672934B2 (en) * | 1993-11-02 | 1996-10-17 | Telefonaktiebolaget Lm Ericsson (Publ) | Discriminating between stationary and non-stationary signals |
US5579435A (en) * | 1993-11-02 | 1996-11-26 | Telefonaktiebolaget Lm Ericsson | Discriminating between stationary and non-stationary signals |
WO1995012879A1 (en) * | 1993-11-02 | 1995-05-11 | Telefonaktiebolaget Lm Ericsson | Discriminating between stationary and non-stationary signals |
EP1703493A2 (en) * | 1994-08-10 | 2006-09-20 | Qualcomm Incorporated | Method and apparatus for selecting an encoding rate in a variable rate vocoder |
EP1703493A3 (en) * | 1994-08-10 | 2007-02-14 | Qualcomm Incorporated | Method and apparatus for selecting an encoding rate in a variable rate vocoder |
EP0714088A1 (en) * | 1994-11-22 | 1996-05-29 | Alcatel Mobile Phones | Voice activity detection |
FR2727236A1 (en) * | 1994-11-22 | 1996-05-24 | Alcatel Mobile Comm France | Voice activity detection |
US5732141A (en) * | 1994-11-22 | 1998-03-24 | Alcatel Mobile Phones | Detecting voice activity |
WO1996034382A1 (en) * | 1995-04-28 | 1996-10-31 | Northern Telecom Limited | Methods and apparatus for distinguishing speech intervals from noise intervals in audio signals |
US5774847A (en) * | 1995-04-28 | 1998-06-30 | Northern Telecom Limited | Methods and apparatus for distinguishing stationary signals from non-stationary signals |
GB2317084B (en) * | 1995-04-28 | 2000-01-19 | Northern Telecom Ltd | Methods and apparatus for distinguishing speech intervals from noise intervals in audio signals |
GB2317084A (en) * | 1995-04-28 | 1998-03-11 | Northern Telecom Ltd | Methods and apparatus for distinguishing speech intervals from noise intervals in audio signals |
GB2306010A (en) * | 1995-10-04 | 1997-04-23 | Univ Wales Medicine | A method of classifying signals |
US5812965A (en) * | 1995-10-13 | 1998-09-22 | France Telecom | Process and device for creating comfort noise in a digital speech transmission system |
EP0768770A1 (en) * | 1995-10-13 | 1997-04-16 | France Telecom | Process and device for creating comfort noise in a digital speech transmission system |
FR2739995A1 (en) * | 1995-10-13 | 1997-04-18 | Massaloux Dominique | Process and device for creating comfort noise in a digital speech transmission system |
EP0786760A3 (en) * | 1996-01-29 | 1998-09-16 | Texas Instruments Incorporated | Speech coding |
US6427134B1 (en) | 1996-07-03 | 2002-07-30 | British Telecommunications Public Limited Company | Voice activity detector for calculating spectral irregularity measure on the basis of spectral difference measurements |
WO1998001847A1 (en) * | 1996-07-03 | 1998-01-15 | British Telecommunications Public Limited Company | Voice activity detector |
US6618701B2 (en) | 1999-04-19 | 2003-09-09 | Motorola, Inc. | Method and system for noise suppression using external voice activity detection |
WO2000063887A1 (en) * | 1999-04-19 | 2000-10-26 | Motorola Inc. | Noise suppression using external voice activity detection |
WO2005048619A1 (en) * | 2003-11-12 | 2005-05-26 | Koninklijke Philips Electronics N.V. | Method and apparatus for transferring non-voice data over a telephone channel |
CN101010722B (en) * | 2004-08-30 | 2012-04-11 | 诺基亚西门子网络公司 | Device and method for detecting voice activity in a speech signal |
EP1887559A3 (en) * | 2006-08-10 | 2009-01-14 | STMicroelectronics Asia Pacific Pte Ltd. | Yule-Walker based low-complexity voice activity detector |
US8775168B2 (en) | 2006-08-10 | 2014-07-08 | Stmicroelectronics Asia Pacific Pte, Ltd. | Yule walker based low-complexity voice activity detector in noise suppression systems |
US8244528B2 (en) | 2008-04-25 | 2012-08-14 | Nokia Corporation | Method and apparatus for voice activity determination |
US8275136B2 (en) | 2008-04-25 | 2012-09-25 | Nokia Corporation | Electronic device speech enhancement |
US8611556B2 (en) | 2008-04-25 | 2013-12-17 | Nokia Corporation | Calibrating multiple microphones |
US8682662B2 (en) | 2008-04-25 | 2014-03-25 | Nokia Corporation | Method and apparatus for voice activity determination |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP0335521B1 (en) | Detection of the presence of a speech signal | |
US5276765A (en) | Voice activity detection | |
US4630304A (en) | Automatic background noise estimator for a noise suppression system | |
US5579435A (en) | Discriminating between stationary and non-stationary signals | |
US5091948A (en) | Speaker recognition with glottal pulse-shapes | |
KR950000842B1 (en) | Pitch detector | |
CA1123955A (en) | Speech analysis and synthesis apparatus | |
US5970441A (en) | Detection of periodicity information from an audio signal | |
JPH09212195A (en) | Voice activity detection device, mobile station, and voice activity detection method | |
US5579432A (en) | Discriminating between stationary and non-stationary signals | |
US5632004A (en) | Method and apparatus for encoding/decoding of background sounds | |
JPH08221097A (en) | Method for detecting speech components | |
US4972490A (en) | Distance measurement control of a multiple detector system | |
Vahatalo et al. | Voice activity detection for GSM adaptive multi-rate codec | |
JP2002258881A (en) | Speech detection device and speech detection program | |
JPH01502858A (en) | Apparatus and method for detecting the presence of a fundamental frequency in a speech frame | |
NZ286953A (en) | Speech encoder/decoder: discriminating between speech and background sound | |
Prasad et al. | A 2.4 Kilobits Per Second Linear Prediction Vocoder | |
JPH0827637B2 (en) | Speech/silence determination circuit |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AT BE CH DE ES FR GB GR IT LI LU NL SE |
|
17P | Request for examination filed |
Effective date: 19900306 |
|
17Q | First examination report despatched |
Effective date: 19910201 |
|
GRAA | (expected) grant |
Free format text: ORIGINAL CODE: 0009210 |
|
AK | Designated contracting states |
Kind code of ref document: B1 Designated state(s): AT BE CH DE ES FR GB GR IT LI LU NL SE |
|
REF | Corresponds to: |
Ref document number: 97757 Country of ref document: AT Date of ref document: 19931215 Kind code of ref document: T |
|
ITF | It: translation for a ep patent filed | ||
REF | Corresponds to: |
Ref document number: 68910859 Country of ref document: DE Date of ref document: 19940105 |
|
REG | Reference to a national code |
Ref country code: ES Ref legal event code: FG2A Ref document number: 2047664 Country of ref document: ES Kind code of ref document: T3 |
|
ET | Fr: translation filed | ||
REG | Reference to a national code |
Ref country code: GR Ref legal event code: FG4A Free format text: 3010629 |
|
EPTA | Lu: last paid annual fee | ||
PLBE | No opposition filed within time limit |
Free format text: ORIGINAL CODE: 0009261 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
|
26N | No opposition filed | ||
EAL | Se: european patent in force in sweden |
Ref document number: 89302422.4 |
|
REG | Reference to a national code |
Ref country code: GB Ref legal event code: IF02 |
|
NLS | Nl: assignments of ep-patents |
Owner name: LG ELECTRONICS INC. |
|
REG | Reference to a national code |
Ref country code: GB Ref legal event code: 732E |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: TP |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: NV Representative's name: SAEGER & PARTNER Ref country code: CH Ref legal event code: PUE Owner name: LG ELECTRONICS INC. Free format text: BRITISH TELECOMMUNICATIONS PUBLIC LIMITED COMPANY#81 NEWGATE STREET#LONDON EC1A 7AJ (GB) -TRANSFER TO- LG ELECTRONICS INC.#LG TWIN TOWERS 20, YEOUIDO-DONG YEONGDEUNGPO-GU#SEOUL, 150-721 (KR) |
|
REG | Reference to a national code |
Ref country code: ES Ref legal event code: PC2A |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: CH Payment date: 20080228 Year of fee payment: 20 |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: PCAR Free format text: MANFRED SAEGER;POSTFACH 5;7304 MAIENFELD (CH) |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: LU Payment date: 20080313 Year of fee payment: 20 Ref country code: IT Payment date: 20080327 Year of fee payment: 20 Ref country code: SE Payment date: 20080306 Year of fee payment: 20 Ref country code: GB Payment date: 20080305 Year of fee payment: 20 Ref country code: NL Payment date: 20080203 Year of fee payment: 20 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: AT Payment date: 20080313 Year of fee payment: 20 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: FR Payment date: 20080311 Year of fee payment: 20 Ref country code: ES Payment date: 20080418 Year of fee payment: 20 Ref country code: DE Payment date: 20080306 Year of fee payment: 20 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: BE Payment date: 20080916 Year of fee payment: 20 Ref country code: GR Payment date: 20080215 Year of fee payment: 20 |
|
BE20 | Be: patent expired |
Owner name: *LG ELECTRONICS INC. Effective date: 20090310 |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: PL |
|
REG | Reference to a national code |
Ref country code: GB Ref legal event code: PE20 Expiry date: 20090309 |
|
NLV7 | Nl: ceased due to reaching the maximum lifetime of a patent |
Effective date: 20090310 |
|
REG | Reference to a national code |
Ref country code: ES Ref legal event code: FD2A Effective date: 20090311 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: NL Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION Effective date: 20090310 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: GB Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION Effective date: 20090309 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: ES Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION Effective date: 20090311 |