EP1557820A1 - Voice activity detection operating with compressed speech signal parameters - Google Patents
Voice activity detection operating with compressed speech signal parameters
- Publication number
- EP1557820A1
- Authority
- EP
- European Patent Office
- Prior art keywords
- voice
- voice activity
- analysis
- speech
- detector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
Definitions
- the present invention refers to digital radio communication systems, in particular mobile communication systems, and more specifically it concerns a method of and device for voice activity detection in received speech signals in one such system.
- the method and the device are intended for use in connection with voice quality enhancement.
- VADs Voice activity detectors
- a class of VADs performs detection through an energetic analysis and a spectral analysis of the input signal, the analysis results being combined to provide the classification of an analysed speech segment.
- An algorithm for classifying a speech segment as voiced speech, unvoiced speech or silence based on energetic and spectral analyses is disclosed in "Application of an LPC Distance Measure to the Voiced-Unvoiced-Silence Detection Problem", by L.R. Rabiner and M.R. Sambur, IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. ASSP-25, No. 4, August 1977, pages 338 - 343.
- VADs are typically used at the mobile terminals, in association with the speech coder, to drive a discontinuous transmission in which coded speech signals are transmitted during active speech periods whereas in silence periods the speech transmitter is inhibited and the so-called comfort noise is transmitted. This helps in saving power.
- VQE Voice Quality Enhancement
- the Applicant integrates the VQE function in the units (such as the Transcoding and Rate Adapting Units, or TRAU, of the GSM system) adapting the speech signals from the requirements of the radio part of the system to the requirements of the control part and, if necessary, of the fixed telephone network, and vice versa.
- the VAD is intended to drive the noise suppression in the uplink direction of communication.
- the Applicant wishes that the addition of a VAD-driven VQE in a TRAU does not entail changes in the hardware of the TRAU itself.
- the conventional approach which is substantially as disclosed in the above-mentioned document by L.R. Rabiner and M.R. Sambur, entails that the speech signal is decoded and the spectral information (here the linear prediction coefficients LPCs) is recovered from the linear frames resulting from speech decoding. Yet, the spectral analysis of the linear signal to recover the LPCs is a heavy processing task. Taking into account that the TRAU processor generally operates in parallel on a plurality of channels, real time execution of the complete VAD algorithm on the same channels could compel to use a dedicated processor or a more powerful and hence more expensive processor than the one that would be used for the TRAU. Both solutions are in contrast with the goal of keeping the TRAU hardware unchanged.
- a method of detecting voice activity in a received speech signal in a radio communication system in which speech signals are transmitted in digitally coded form, and a signal representative of the presence or absence of voice activity is generated by submitting the received speech signals to an energetic analysis and a spectral analysis, said spectral analysis being performed directly on coded speech signals.
- the invention also concerns a device for carrying out the method, comprising means for performing an energetic analysis and a spectral analysis on the received speech signal, in which said spectral analysis means are connected directly with a detector input where said coded speech signals are present.
- the voice activity detector drives a noise reduction operation, within a voice quality enhancement function performed on speech signals propagating in the uplink communication direction in a mobile communication system and embodied in units, like the so-called TRAU (Transcoding and Rate Adapting Unit), which adapt the uplink directed speech signals to the requirements of the control part of the mobile system and possibly of the fixed telephone network and adapt downlink directed speech signals to the requirements of the radio part of the mobile system.
- TRAU Transcoding and Rate Adapting Unit
- the invention provides also a method of voice quality enhancement in a mobile communication system, in which a voice quality enhancement including a noise reduction operation is performed at least for speech signals propagating in uplink direction, in which said noise reduction operation is driven by a signal representative of the presence or absence of voice activity generated by a method of and device for voice activity detection as defined above.
- the preferred embodiment disclosed here concerns a VAD intended to drive the noise reduction feature in a voice quality enhancement performed in the uplink direction of communication in a mobile communication system, in case the VQE function is incorporated into the units performing transcoding and/or rate adaptation in the control part of such a system.
- a Transcoding and Rate Adapting Unit (TRAU) 1 of a mobile communication system, for instance a GSM system.
- the TRAU is connected to the Mobile Switching Centre (MSC) 2 and the Base Station Controller (BSC) 3 through interfaces A and Asub and embodies a VAD-driven Voice Quality Enhancement function performed in block 4 labelled VAD & VQE.
- MSC Mobile Switching Centre
- BSC Base Station Controller
- a VQE includes the well-known features of Acoustic Echo Cancellation, Noise Reduction and Acoustic Level Control.
- all of said features are provided for the uplink direction of communication only and the VAD drives the Noise Reduction (NR) feature.
- NR Noise Reduction
- the Acoustic Level Control is performed, which is not concerned by the present invention.
- the drawing only shows the units that, in TRAU 1, are directly concerned with the transcoding function, namely a speech coder 5 and a speech decoder 6 on the Asub-interface side, and an A/µ law expander 7 and an A/µ law compander 8 on the A-interface side.
- the TRAU receives A-law PCM signals from MSC through a line 10, sends the expanded signals to the In_Down input of VAD&VQE block 4 through line 11.
- the signals outgoing from the Out_Down output of block 4 are fed to coder 5 through line 12, are coded according to the desired coding technique (full-rate, enhanced full-rate, half-rate or adaptive multi-rate) and the coded signals are then forwarded to base station controller 3 through line 13.
- the coded signals arriving from base station controller 3 through line 14 are fed to both decoder 6 and VAD&VQE block 4.
- the decoded signals are fed to the In_Down input of VAD&VQE block 4 through line 15.
- the decoded signals having undergone voice quality enhancement are fed from Out_Up output of VAD&VQE to A/µ law compander 8 through line 16 and hence to MSC through line 17.
- the coded speech signals include spectral information, such as the LPCs or a representation thereof.
- VAD&VQE is decomposed into its constituent blocks, namely VAD 40 and VQE 50.
- VAD 40 has been schematised by a spectral analyser 41 determining the LPC coefficients, an energy analyser 42 and Joint Processing Means including a Hard Decision Unit 43 and a Soft Decision Unit 45.
- Said Joint Processing Means are adapted to combine the results of the two analyses and to emit on line 44 a signal indicating the nature of the received speech frame (the so-called VAD flag), which is an input to the Noise Reduction feature of VQE 50.
- LPC analyser 41 is directly fed with the coded speech signal frames present on line 14, whereas energy estimator 42 is fed with the decoded signal outgoing from decoder 6 through line 15.
- the LPC analysis of course depends on the manner in which the LPCs are represented in the coded signal.
- the energy evaluation and the decision may be performed according to any technique known in the art, for instance as disclosed in the above-mentioned paper of L.R. Rabiner et al.
- Performing the LPC analysis directly on the coded signal affords a number of advantages in terms of processing power requirements.
- there is no need to dedicate processing power to the reconstruction of the LPC coefficients from the decoded signal: it is sufficient to extract them from the relevant information included in the coded speech signal, which is available on the same board.
- the amount of information to be processed is reduced to one fifth or less: indeed, at most 244 bits are to be processed to obtain the LPC coefficients from the coded signal, whereas 1280 bits are to be processed when the linear signal is used.
- the same processor used on the TRAU board for performing all TRAU functions and for managing the so-called tandem free operation i.e. for dispensing with the transcoding in case of communication between two mobile terminals
- a plurality of speech channels in parallel for instance 12 channels
- the same processor used on the TRAU board for performing all TRAU functions and for managing the so-called tandem free operation can perform in real time, for the same channels, also the voice activity detection by exploiting both the spectral and the energy information, and the subsequent VQE.
- the resulting detection is more accurate than when only the energy information is exploited and hence also the noise suppression operation is more accurate.
- the LPC information in the coded signal is updated every 5 ms.
- the energy information is computed on the same interval of 5 ms and then the two contributions are jointly processed to take a decision on the nature of the audio segment.
- This is denoted by the presence of said Joint Processing Means 43 and 45.
- the high rate of decisions available at the output of said Hard Decision Unit 43 is not necessary and often has a negative impact on audition. Therefore these "hard" decisions are softened through a smoothing process which aims to redefine as "voice" any isolated segments of noise among a group of segments of voice and to redefine as "noise" any isolated segments of voice among a group of segments of noise. This is the aim of Soft Decision Unit 45, located immediately after said Hard Decision Unit 43.
Description
- The present invention refers to digital radio communication systems, in particular mobile communication systems, and more specifically it concerns a method of and device for voice activity detection in received speech signals in one such system.
- Preferably, but not exclusively, the method and the device are intended for use in connection with voice quality enhancement.
- Voice activity detectors (VADs) are devices that are supplied with a signal to detect therein periods of speech and periods of silence, where only noise is present. Possibly, the VADs are also arranged to distinguish among voiced/unvoiced sounds in speech periods.
- A class of VADs performs detection through an energetic analysis and a spectral analysis of the input signal, the analysis results being combined to provide the classification of an analysed speech segment. An algorithm for classifying a speech segment as voiced speech, unvoiced speech or silence based on energetic and spectral analyses is disclosed in "Application of an LPC Distance Measure to the Voiced-Unvoiced-Silence Detection Problem", by L.R. Rabiner and M.R. Sambur, IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. ASSP-25, No. 4, August 1977, pages 338 - 343.
- In mobile communication systems, VADs are typically used at the mobile terminals, in association with the speech coder, to drive a discontinuous transmission in which coded speech signals are transmitted during active speech periods whereas in silence periods the speech transmitter is inhibited and the so-called comfort noise is transmitted. This helps in saving power.
- It has now been found that it is advantageous to use a VAD also in the control part of a mobile communication system, in particular for improving the Noise Reduction (NR) feature of the so-called Voice Quality Enhancement (VQE) function. An example of VAD-assisted noise reduction for the uplink direction of communication of a digital radio communication system is disclosed in EP-A 1 017 042.
- The Applicant, as well as other manufacturers, integrate the VQE function in the units (such as the Transcoding and Rate Adapting Units, or TRAU, of the GSM system) adapting the speech signals from the requirements of the radio part of the system to the requirements of the control part and, if necessary, of the fixed telephone network, and vice versa. In the Applicant's VQE, the VAD is intended to drive the noise suppression in the uplink direction of communication. For several reasons, including cost and size of the apparatus, the Applicant wishes that the addition of a VAD-driven VQE in a TRAU does not entail changes in the hardware of the TRAU itself.
- If the detection exploits both the spectrum and the energy characteristics of the received signal, the conventional approach, which is substantially as disclosed in the above-mentioned document by L.R. Rabiner and M.R. Sambur, entails that the speech signal is decoded and the spectral information (here the linear prediction coefficients LPCs) is recovered from the linear frames resulting from speech decoding. Yet, the spectral analysis of the linear signal to recover the LPCs is a heavy processing task. Taking into account that the TRAU processor generally operates in parallel on a plurality of channels, real time execution of the complete VAD algorithm on the same channels could compel to use a dedicated processor or a more powerful and hence more expensive processor than the one that would be used for the TRAU. Both solutions are in contrast with the goal of keeping the TRAU hardware unchanged.
- To avoid the need for a VAD-dedicated or a more powerful processor, the spectral analysis could be dispensed with and the VAD could perform only the energetic analysis. Such a solution is disclosed in EP-A 1 017 042. The document also teaches that the energy estimation can be performed directly on the compressed signal, in order to relieve the speech decoder of the relevant processing tasks and to speed up the actual speech decoding.
- Yet, by performing only the energetic analysis, only one feature of the received signal is exploited, and the detection, and hence the operation of the VAD-driven VQE, is less effective.
- Thus, it is an object of the invention to provide a method and a device for voice activity detection, in particular intended to drive a voice quality enhancement integrated in a unit performing speech rate and coding adaptation in mobile communication systems, which method and device allow performing both the energetic and the spectral analysis by using the same processor as required for performing said adaptation.
- According to the invention, there is provided a method of detecting voice activity in a received speech signal in a radio communication system in which speech signals are transmitted in digitally coded form, and a signal representative of the presence or absence of voice activity is generated by submitting the received speech signals to an energetic analysis and a spectral analysis, said spectral analysis being performed directly on coded speech signals.
- The invention also concerns a device for carrying out the method, comprising means for performing an energetic analysis and a spectral analysis on the received speech signal, in which said spectral analysis means are connected directly with a detector input where said coded speech signals are present.
- In the preferred application, the voice activity detector drives a noise reduction operation, within a voice quality enhancement function performed on speech signals propagating in the uplink communication direction in a mobile communication system and embodied in units, like the so-called TRAU (Transcoding and Rate Adapting Unit), which adapt the uplink directed speech signals to the requirements of the control part of the mobile system and possibly of the fixed telephone network and adapt downlink directed speech signals to the requirements of the radio part of the mobile system.
- Therefore, the invention provides also a method of voice quality enhancement in a mobile communication system, in which a voice quality enhancement including a noise reduction operation is performed at least for speech signals propagating in uplink direction, in which said noise reduction operation is driven by a signal representative of the presence or absence of voice activity generated by a method of and device for voice activity detection as defined above.
- A preferred embodiment of the invention, given by way of non-limiting example, will now be described with reference to the accompanying drawings, in which:
- Fig. 1 is a schematic block diagram of a TRAU embodying a VQE unit and of its connections inside the mobile communications system; and
- Fig. 2 is a schematic block diagram of the invention.
- The preferred embodiment disclosed here concerns a VAD intended to drive the noise reduction feature in a voice quality enhancement performed in the uplink direction of communication in a mobile communication system, in case the VQE function is incorporated into the units performing transcoding and/or rate adaptation in the control part of such a system.
- Referring to Fig. 1, there is schematically shown a Transcoding and Rate Adapting Unit (TRAU) 1 of a mobile communication system, for instance a GSM system. The TRAU is connected to the Mobile Switching Centre (MSC) 2 and the Base Station Controller (BSC) 3 through interfaces A and Asub and embodies a VAD-driven Voice Quality Enhancement function performed in block 4 labelled VAD & VQE.
- In the most general case, a VQE includes the well-known features of Acoustic Echo Cancellation, Noise Reduction and Acoustic Level Control. In the preferred application of the invention, all of said features are provided for the uplink direction of communication only and the VAD drives the Noise Reduction (NR) feature. In downlink direction, only the Acoustic Level Control is performed, which is not concerned by the present invention.
- The drawing only shows the units that, in TRAU 1, are directly concerned with the transcoding function, namely a speech coder 5 and a speech decoder 6 on the Asub-interface side, and an A/µ law expander 7 and an A/µ law compander 8 on the A-interface side.
- In downlink direction, the TRAU receives A-law PCM signals from MSC through a line 10, sends the expanded signals to the In_Down input of VAD&VQE block 4 through line 11. The signals outgoing from the Out_Down output of block 4 are fed to coder 5 through line 12, are coded according to the desired coding technique (full-rate, enhanced full-rate, half-rate or adaptive multi-rate) and the coded signals are then forwarded to base station controller 3 through line 13.
- In uplink direction, the coded signals arriving from base station controller 3 through line 14 are fed to both decoder 6 and VAD&VQE block 4. The decoded signals are fed to the In_Down input of VAD&VQE block 4 through line 15. The decoded signals having undergone voice quality enhancement are fed from Out_Up output of VAD&VQE to A/µ law compander 8 through line 16 and hence to MSC through line 17.
- It is not necessary to provide here details on the organisation of the coded speech signals in a mobile communication system, which depends on the kind of system and on the chosen coding rate. On the other hand, for any given system and rate, such organisation is well known to the skilled in the art and can be found in the relevant standards. It is sufficient here to recall that the coded speech signals include spectral information, such as the LPCs or a representation thereof.
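The uplink routing of Fig. 1 can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the function name `uplink_step` and the three callables are hypothetical stand-ins for decoder 6, VAD&VQE block 4 and compander 8, and nothing here reflects an actual TRAU implementation. The one point it shows is that block 4 receives both the coded frame (line 14) and the decoded samples (line 15).

```python
from typing import Callable, List

Samples = List[int]


def uplink_step(
    coded_frame: bytes,
    decode: Callable[[bytes], Samples],
    vad_vqe: Callable[[bytes, Samples], Samples],
    compand: Callable[[Samples], bytes],
) -> bytes:
    """Route one uplink frame through the blocks named in Fig. 1.

    All three callables are hypothetical placeholders, not real codec APIs.
    """
    decoded = decode(coded_frame)             # decoder 6, output on line 15
    enhanced = vad_vqe(coded_frame, decoded)  # block 4 sees line 14 and line 15
    return compand(enhanced)                  # compander 8, output on line 17
```

Passing the coded frame alongside the decoded samples is what lets the spectral analysis work on the coded parameters while the energy analysis works on the decoded signal.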
- In Fig. 2 block VAD&VQE is decomposed into its constituent blocks, namely VAD 40 and VQE 50. VAD 40 has been schematised by a spectral analyser 41 determining the LPC coefficients, an energy analyser 42 and Joint Processing Means including a Hard Decision Unit 43 and a Soft Decision Unit 45. Said Joint Processing Means are adapted to combine the results of the two analyses and to emit on line 44 a signal indicating the nature of the received speech frame (the so-called VAD flag), which is an input to the Noise Reduction feature of VQE 50.
- According to the invention, LPC analyser 41 is directly fed with the coded speech signal frames present on line 14, whereas energy estimator 42 is fed with the decoded signal outgoing from decoder 6 through line 15. The LPC analysis of course depends on the manner in which the LPCs are represented in the coded signal. The energy evaluation and the decision may be performed according to any technique known in the art, for instance as disclosed in the above-mentioned paper of L.R. Rabiner et al.
- Performing the LPC analysis directly on the coded signal affords a number of advantages in terms of processing power requirements. In particular, there is no need to dedicate processing power to the reconstruction of the LPC coefficients from the decoded signal: it is sufficient to extract them from the relevant information included in the coded speech signal, which is available on the same board. Besides the greater processing simplicity, the amount of information to be processed is also reduced to one fifth or less: indeed, at most 244 bits are to be processed to obtain the LPC coefficients from the coded signal, whereas 1280 bits are to be processed when the linear signal is used.
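The reduction quoted above can be checked with a few lines of Python. The two bit counts come from the text; the breakdown of 1280 bits as 160 samples of 8 bits (a 20 ms frame sampled at 8 kHz) is an assumption that happens to reproduce the figure and is not stated in the source.

```python
CODED_BITS_MAX = 244  # upper bound quoted in the text for the coded signal
LINEAR_BITS = 1280    # figure quoted in the text for the linear signal

# Assumed breakdown (not in the source): 20 ms frame at 8 kHz, 8 bits per sample.
ASSUMED_SAMPLES_PER_FRAME = 160
ASSUMED_BITS_PER_SAMPLE = 8


def reduction_ratio(coded_bits: int, linear_bits: int) -> float:
    """Fraction of the linear-signal bit count that remains to be processed."""
    return coded_bits / linear_bits


if __name__ == "__main__":
    ratio = reduction_ratio(CODED_BITS_MAX, LINEAR_BITS)
    print(f"ratio = {ratio:.3f}")  # ratio = 0.191, just under one fifth
    # Checks that the assumed breakdown matches the quoted figure.
    print(ASSUMED_SAMPLES_PER_FRAME * ASSUMED_BITS_PER_SAMPLE == LINEAR_BITS)
```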
- Under such conditions, the same processor used on the TRAU board for performing all TRAU functions and for managing the so-called tandem free operation (i.e. for dispensing with the transcoding in case of communication between two mobile terminals), for a plurality of speech channels in parallel (for instance 12 channels), can perform in real time, for the same channels, also the voice activity detection by exploiting both the spectral and the energy information, and the subsequent VQE. The resulting detection is more accurate than when only the energy information is exploited and hence also the noise suppression operation is more accurate.
- It is to be appreciated that, according to the existing GSM standards, the LPC information in the coded signal is updated every 5 ms. The energy information is computed on the same interval of 5 ms and then the two contributions are jointly processed to take a decision on the nature of the audio segment. This is denoted by the presence of said Joint Processing Means 43 and 45. The high rate of decisions available at the output of said Hard Decision Unit 43 is not necessary and often has a negative impact on audition. Therefore these "hard" decisions are softened through a smoothing process which aims to redefine as "voice" any isolated segments of noise among a group of segments of voice and to redefine as "noise" any isolated segments of voice among a group of segments of noise. This is the aim of Soft Decision Unit 45, located immediately after said Hard Decision Unit 43.
- It is clear that the above description is given only by way of non-limiting example and that variations and modifications are possible without departing from the scope of the invention. In particular, even if reference has been made to a TRAU unit in a GSM system, what has been said can be applied also to mobile communication systems operating according to other standards. In such case, the energy analysis and spectral analysis should be adapted to the specific requirements of that system. The invention could be used also in other radio communication systems in which a digital coding of speech is adopted and the digitally coded speech signals include spectral information. Moreover, also the energy analysis could be performed on the coded signal, as disclosed in the above-mentioned EP-A 1 017 042.
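The two-stage decision described above (a hard decision per 5 ms segment, followed by smoothing) can be sketched in Python as follows. The description leaves the energy evaluation and the decision rule open ("any technique known in the art"), so everything specific here is an assumption made for illustration: the function names, the thresholds, the AND rule combining energy with an LPC distance from a noise reference, and the `max_isolated` and `min_group` parameters that define what counts as an "isolated" segment and a "group".

```python
import math
from typing import List, Sequence


def segment_energy_db(samples: Sequence[float]) -> float:
    """Mean-square energy of one non-empty segment, in dB re unit amplitude."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(max(mean_square, 1e-12))  # floor avoids log(0)


def hard_decisions(
    energies_db: Sequence[float],
    lpc_distances: Sequence[float],
    energy_thr_db: float = -50.0,  # hypothetical threshold
    distance_thr: float = 1.0,     # hypothetical threshold
) -> List[bool]:
    """One voice (True) / noise (False) flag per segment.

    Illustrative rule only: voice when both the energy and the spectral
    distance from a noise reference exceed their thresholds.
    """
    return [
        e > energy_thr_db and d > distance_thr
        for e, d in zip(energies_db, lpc_distances)
    ]


def soft_decisions(
    flags: Sequence[bool], max_isolated: int = 1, min_group: int = 2
) -> List[bool]:
    """Flip short interior runs that sit between two longer opposite runs."""
    runs: List[List] = []  # [value, length] pairs
    for f in flags:
        if runs and runs[-1][0] == f:
            runs[-1][1] += 1
        else:
            runs.append([f, 1])
    out: List[bool] = []
    for idx, (value, length) in enumerate(runs):
        isolated = (
            0 < idx < len(runs) - 1
            and length <= max_isolated
            and runs[idx - 1][1] >= min_group
            and runs[idx + 1][1] >= min_group
        )
        out.extend([(not value) if isolated else value] * length)
    return out


if __name__ == "__main__":
    T, F = True, False
    hard = [T, T, T, F, T, T, F, F, F, T, F, F]
    print(soft_decisions(hard))  # six True values followed by six False values
```

Requiring longer runs on both sides keeps a rapidly alternating sequence unchanged, which is one possible reading of "isolated segments among a group"; other smoothing rules (for example a hangover counter) would serve the same purpose.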
Claims (10)
- A method of detecting voice activity in a received speech signal in a radio communication system in which speech signals are transmitted in digitally coded form, and a signal representative of the presence or absence of voice activity is generated by submitting the received speech signals to an energetic analysis and a spectral analysis, characterised in that said spectral analysis is performed directly on coded speech signals.
- A method as claimed in claim 1, wherein spectral information in said coded signals is periodically updated, characterised in that speech signal analysis comprises a joint processing of said spectral information with energy information, and generating said signal representative of the presence or absence of voice activity from said joint processing through a softening decision step adapted to smooth the decision's rate.
- A method as claimed in claim 1 or 2, characterised in that said radio communication system is a mobile communication system and said signal representative of the presence or absence of voice activity is used to drive a noise reduction feature in a voice quality enhancement operation performed on signals propagating in uplink direction.
- A voice activity detector for detecting voice activity in a received speech signal in a radio communication system in which digitally coded speech signals are transmitted, the detector (40) comprising means (41, 42, 43, 45) for performing an energetic analysis and a spectral analysis on the received speech signal and for generating a signal representative of the presence or absence of voice activity based upon the results of said analyses, characterised in that said spectral analysis means (41) are connected directly with a detector input (14) where said coded speech signals are present.
- A voice activity detector as claimed in claim 4, wherein spectral information in said coded signal is periodically updated, characterised in that joint processing means (43, 45) for both spectral and energetic information are connected between said spectral (41) and energetic (42) analysis means and the output of the voice activity detector.
- A voice activity detector as claimed in claim 5, characterised in that said Joint Processing Means are including: an Hard Decision Unit (43) connected to said means (42) for performing an energetic analysis and to said means (41) for performing a spectral analysis, adapted to joint processing the two input segments and to output an hard noise or voice decision; a Soft decision Unit (45) connected at the output of said Hard decision Unit (43) and adapted to perform a smoothing process in order to redefine as "voice" eventual isolated segments of noise among a group of segments of voice and to redefine as "noise" eventual isolated segments of voice among a group of segments of noise.
- A voice activity detector as claimed in any of claims 4 to 6 for use in a mobile communication system, characterised in that said detector (40) is located upstream of means (50) performing a voice quality enhancement in the uplink direction of communication, and said signal representative of the presence or absence of voice activity drives noise reduction means in said means (50) performing voice quality enhancement.
- A voice activity detector as claimed in claim 7, characterised in that said detector (40) is part, together with said voice quality enhancement means (50), of a unit (1) performing an adaptation of the uplink directed speech signals to the requirements of the control part of the mobile system and possibly of a fixed network and an adaptation of the downlink directed speech signals to the requirements of the radio part of the mobile system.
- A voice activity detector as claimed in claim 8, characterised in that it is implemented by the same processor that would be provided for performing speech signal adaptation and voice quality enhancement in parallel for a plurality of speech channels.
- A method of voice quality enhancement in a mobile communication system, in which a voice quality enhancement including a noise reduction operation is performed at least for speech signals propagating in uplink direction, characterised in that said noise reduction operation is driven by a signal representative of the presence or absence of voice activity generated by a method of voice activity detection as claimed in any of claims 1 to 3 and/or by a voice activity detector as claimed in any of claims 4 to 8.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AT04425031T ATE343196T1 (en) | 2004-01-22 | 2004-01-22 | VOICE ACTIVITY DETECTION USING COMPRESSED VOICE SIGNAL PARAMETERS |
DE602004002845T DE602004002845T2 (en) | 2004-01-22 | 2004-01-22 | Voice activity detection using compressed speech signal parameters |
EP04425031A EP1557820B1 (en) | 2004-01-22 | 2004-01-22 | Voice activity detection operating with compressed speech signal parameters |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP04425031A EP1557820B1 (en) | 2004-01-22 | 2004-01-22 | Voice activity detection operating with compressed speech signal parameters |
Publications (2)
Publication Number | Publication Date |
---|---|
EP1557820A1 (en) | 2005-07-27 |
EP1557820B1 (en) | 2006-10-18 |
Family
ID=34626566
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP04425031A Expired - Lifetime EP1557820B1 (en) | 2004-01-22 | 2004-01-22 | Voice activity detection operating with compressed speech signal parameters |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP1557820B1 (en) |
AT (1) | ATE343196T1 (en) |
DE (1) | DE602004002845T2 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107331393B (en) * | 2017-08-15 | 2020-05-12 | 成都启英泰伦科技有限公司 | Self-adaptive voice activity detection method |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5579435A (en) * | 1993-11-02 | 1996-11-26 | Telefonaktiebolaget Lm Ericsson | Discriminating between stationary and non-stationary signals |
US5732390A (en) * | 1993-06-29 | 1998-03-24 | Sony Corp | Speech signal transmitting and receiving apparatus with noise sensitive volume control |
EP1017042A1 (en) * | 1994-01-28 | 2000-07-05 | AT&T Corp. | Voice activity detection driven noise remediator |
2004
- 2004-01-22: AT application AT04425031T, published as ATE343196T1 (en), not active (IP Right Cessation)
- 2004-01-22: EP application EP04425031A, published as EP1557820B1 (en), not active (Expired - Lifetime)
- 2004-01-22: DE application DE602004002845T, published as DE602004002845T2 (en), not active (Expired - Fee Related)
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5732390A (en) * | 1993-06-29 | 1998-03-24 | Sony Corp | Speech signal transmitting and receiving apparatus with noise sensitive volume control |
US5579435A (en) * | 1993-11-02 | 1996-11-26 | Telefonaktiebolaget Lm Ericsson | Discriminating between stationary and non-stationary signals |
EP1017042A1 (en) * | 1994-01-28 | 2000-07-05 | AT&T Corp. | Voice activity detection driven noise remediator |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017044245A1 (en) * | 2015-09-10 | 2017-03-16 | Qualcomm Incorporated | Audio signal classification and post-processing following a decoder |
CN107949881A (en) * | 2015-09-10 | 2018-04-20 | 高通股份有限公司 | Audio signal classification and post processing after decoder |
US9972334B2 (en) | 2015-09-10 | 2018-05-15 | Qualcomm Incorporated | Decoder audio classification |
CN107949881B (en) * | 2015-09-10 | 2019-05-31 | 高通股份有限公司 | Audio signal classification and post-processing after decoder |
Also Published As
Publication number | Publication date |
---|---|
ATE343196T1 (en) | 2006-11-15 |
EP1557820B1 (en) | 2006-10-18 |
DE602004002845D1 (en) | 2006-11-30 |
DE602004002845T2 (en) | 2007-06-06 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CA1231473A (en) | Voice activity detection process and means for implementing said process | |
US6424938B1 (en) | Complex signal activity detection for improved speech/noise classification of an audio signal | |
JP4870313B2 (en) | Frame Erasure Compensation Method for Variable Rate Speech Encoder | |
EP1279167B1 (en) | Method and apparatus for predictively quantizing voiced speech | |
EP1982324B1 (en) | A voice detector and a method for suppressing sub-bands in a voice detector | |
EP0786760B1 (en) | Speech coding | |
US9373342B2 (en) | System and method for speech enhancement on compressed speech | |
KR20010024869A (en) | A decoding method and system comprising an adaptive postfilter | |
JPH09152894A (en) | Sound and silence discriminator | |
EP1212749B1 (en) | Method and apparatus for interleaving line spectral information quantization methods in a speech coder | |
CN1046366C (en) | Discriminating between stationary and non-stationary signals | |
US8144862B2 (en) | Method and apparatus for the detection and suppression of echo in packet based communication networks using frame energy estimation | |
JP2000172283A (en) | System and method for detecting sound | |
SE470577B (en) | Method and apparatus for encoding and / or decoding background noise | |
EP1557820B1 (en) | Voice activity detection operating with compressed speech signal parameters | |
KR100641673B1 (en) | Pitch quantization for distributed speech recognition | |
JP3055608B2 (en) | Voice coding method and apparatus | |
JPH07135490A (en) | Voice detector and vocoder having voice detector | |
JPH07281689A (en) | Audio signal transmission device | |
KR20100022894A (en) | A voiced/unvoiced decision method for the SMV of 3GPP2 using Gaussian mixture model | |
JPH08293817A (en) | Sound signal detection circuit and traveling communication terminal equipment | |
KR20100116102A (en) | Method and apparatus for transmitting signal in a communication system |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT RO SE SI SK TR |
|
AX | Request for extension of the european patent |
Extension state: AL LT LV MK |
|
17P | Request for examination filed |
Effective date: 20060123 |
|
AKX | Designation fees paid |
Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT RO SE SI SK TR |
|
GRAP | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
|
RAP1 | Party data changed (applicant data changed or rights of an application transferred) |
Owner name: SIEMENS S.P.A. |
|
GRAS | Grant fee paid |
Free format text: ORIGINAL CODE: EPIDOSNIGR3 |
|
GRAA | (expected) grant |
Free format text: ORIGINAL CODE: 0009210 |
|
AK | Designated contracting states |
Kind code of ref document: B1 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT RO SE SI SK TR |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: IT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; WARNING: LAPSES OF ITALIAN PATENTS WITH EFFECTIVE DATE BEFORE 2007 MAY HAVE OCCURRED AT ANY TIME BEFORE 2007. THE CORRECT EFFECTIVE DATE MAY BE DIFFERENT FROM THE ONE RECORDED. Effective date: 20061018 Ref country code: SI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20061018 Ref country code: LI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20061018 Ref country code: AT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20061018 Ref country code: RO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20061018 Ref country code: FI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20061018 Ref country code: BE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20061018 Ref country code: NL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20061018 Ref country code: CH Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20061018 Ref country code: CZ Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20061018 Ref country code: SK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20061018 |
|
REG | Reference to a national code |
Ref country code: GB Ref legal event code: FG4D |
|
REG | Reference to a national code |
Ref country code: IE Ref legal event code: FG4D Ref country code: CH Ref legal event code: EP |
|
REF | Corresponds to: |
Ref document number: 602004002845 Country of ref document: DE Date of ref document: 20061130 Kind code of ref document: P |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: DK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20070118 Ref country code: SE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20070118 Ref country code: BG Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20070118 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: IE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20070122 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: ES Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20070129 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: MC Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20070131 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: PT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20070319 |
|
NLV1 | Nl: lapsed or annulled due to failure to fulfill the requirements of art. 29p and 29m of the patents act | ||
REG | Reference to a national code |
Ref country code: CH Ref legal event code: PL |
|
EN | Fr: translation not filed | ||
PLBE | No opposition filed within time limit |
Free format text: ORIGINAL CODE: 0009261 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
|
26N | No opposition filed |
Effective date: 20070719 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: FR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20070601 Ref country code: GR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20070119 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: DE Payment date: 20080122 Year of fee payment: 5 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: FR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20061018 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: EE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20061018 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: LU Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20070122 Ref country code: CY Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20061018 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: HU Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20070419 Ref country code: TR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20061018 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: DE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20090801 |
|
REG | Reference to a national code |
Ref country code: GB Ref legal event code: 732E Free format text: REGISTERED BETWEEN 20120719 AND 20120725 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: GB Payment date: 20170119 Year of fee payment: 14 |
|
GBPC | Gb: european patent ceased through non-payment of renewal fee |
Effective date: 20180122 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: GB Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20180122 |