EP1557820A1 - Voice activity detection operating with compressed speech signal parameters


Publication number
EP1557820A1
EP1557820A1 EP04425031A EP04425031A EP1557820A1 EP 1557820 A1 EP1557820 A1 EP 1557820A1 EP 04425031 A EP04425031 A EP 04425031A EP 04425031 A EP04425031 A EP 04425031A EP 1557820 A1 EP1557820 A1 EP 1557820A1
Authority
EP
European Patent Office
Prior art keywords
voice
voice activity
analysis
speech
detector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
EP04425031A
Other languages
German (de)
French (fr)
Other versions
EP1557820B1 (en)
Inventor
Matteo Aldrovandi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Siemens SpA
Original Assignee
Siemens Mobile Communications SpA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Siemens Mobile Communications SpA
Priority to AT04425031T (ATE343196T1)
Priority to DE602004002845T (DE602004002845T2)
Priority to EP04425031A (EP1557820B1)
Publication of EP1557820A1
Application granted
Publication of EP1557820B1
Anticipated expiration
Expired - Lifetime

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis

Abstract

There is provided a voice activity detector (VAD) (40) for assisting the voice quality enhancement in the uplink direction of a mobile communication system in which the voice quality enhancement means (50) are embodied in the transcoding and rate adapting unit (TRAU) (1). The VAD (40) comprises means (41, 42) for performing both a spectral analysis and an energetic analysis on a received speech signal and means (43, 45) for processing the results of said analyses and deciding on the nature of the audio segment. The VAD performs the spectral analysis directly on the coded signal.

Description

    Field of the invention
  • The present invention refers to digital radio communication systems, in particular mobile communication systems, and more specifically it concerns a method of and device for voice activity detection in received speech signals in one such system.
  • Preferably, but not exclusively, the method and the device are intended for use in connection with voice quality enhancement.
    Background of the invention
  • Voice activity detectors (VADs) are devices that are supplied with a signal and detect therein periods of speech and periods of silence, where only noise is present. Optionally, VADs are also arranged to distinguish between voiced and unvoiced sounds in speech periods.
  • A class of VADs performs detection through an energetic analysis and a spectral analysis of the input signal, the analysis results being combined to provide the classification of an analysed speech segment. An algorithm for classifying a speech segment as voiced speech, unvoiced speech or silence based on energetic and spectral analyses is disclosed in "Application of an LPC Distance Measure to the Voiced-Unvoiced-Silence Detection Problem", by L.R. Rabiner and M.R. Sambur, IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. ASSP-25, No. 4, August 1977, pages 338 - 343.
  • In mobile communication systems, VADs are typically used at the mobile terminals, in association with the speech coder, to drive a discontinuous transmission in which coded speech signals are transmitted during active speech periods whereas in silence periods the speech transmitter is inhibited and the so-called comfort noise is transmitted. This helps in saving power.
  • It has now been found that it is advantageous to use a VAD also in the control part of a mobile communication system, in particular for improving the Noise Reduction (NR) feature of the so-called Voice Quality Enhancement (VQE) function. An example of VAD-assisted noise reduction for the uplink direction of communication of a digital radio communication system is disclosed in EP-A 1 017 042.
  • The Applicant, like other manufacturers, integrates the VQE function in the units (such as the Transcoding and Rate Adapting Units, or TRAU, of the GSM system) that adapt the speech signals from the requirements of the radio part of the system to the requirements of the control part and, if necessary, of the fixed telephone network, and vice versa. In the Applicant's VQE, the VAD is intended to drive the noise suppression in the uplink direction of communication. For several reasons, including cost and size of the apparatus, the Applicant wishes that the addition of a VAD-driven VQE to a TRAU should not entail changes in the hardware of the TRAU itself.
  • If the detection exploits both the spectrum and the energy characteristics of the received signal, the conventional approach, which is substantially as disclosed in the above-mentioned document by L.R. Rabiner and M.R. Sambur, entails that the speech signal is decoded and the spectral information (here the linear prediction coefficients, LPCs) is recovered from the linear frames resulting from speech decoding. Yet the spectral analysis of the linear signal to recover the LPCs is a heavy processing task. Taking into account that the TRAU processor generally operates in parallel on a plurality of channels, real-time execution of the complete VAD algorithm on the same channels could require either a dedicated processor or a more powerful, and hence more expensive, processor than the one that would otherwise be used for the TRAU. Both solutions conflict with the goal of keeping the TRAU hardware unchanged.
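To make the cost of the conventional approach concrete, the following is a minimal sketch of how LPCs are commonly recovered from a decoded (linear) frame: an autocorrelation pass followed by the Levinson-Durbin recursion. This is a textbook technique, not a procedure taken from the patent; the function names and the choice of predictor order are assumptions made for illustration.

```python
from typing import List


def autocorrelation(frame: List[float], max_lag: int) -> List[float]:
    """Autocorrelation r[0..max_lag] of one linear (decoded) frame.

    Costs roughly len(frame) * (max_lag + 1) multiply-adds per frame.
    """
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r: List[float], order: int) -> List[float]:
    """Solve the LPC normal equations by the Levinson-Durbin recursion.

    Returns a[0..order] with a[0] == 1, where the prediction error is
    e[n] = sum(a[k] * x[n - k] for k in 0..order).
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    if err == 0.0:  # all-zero frame: nothing to predict
        return a
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient of stage i
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
        if err <= 0.0:  # numerically degenerate input, stop early
            break
    return a


# Illustrative use on a synthetic frame (order 2 is an arbitrary choice).
frame = [0.9 ** n for n in range(160)]
lpc = levinson_durbin(autocorrelation(frame, 2), 2)
```

Both steps have to run for every frame of every channel, which is the per-channel load the invention avoids by reading the spectral parameters from the coded frame instead.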
  • To avoid the need for a VAD-dedicated or a more powerful processor, the spectral analysis could be dispensed with and the VAD could perform only the energetic analysis. Such a solution is disclosed in EP-A 1 017 042. The document also teaches that the energy estimation can be performed directly on the compressed signal, in order to relieve the speech decoder of the relevant processing tasks and to speed up the actual speech decoding.
  • Yet, by performing only the energetic analysis, only one feature of the received signal is exploited, and the detection, and hence the operation of the VAD-driven VQE, is less effective.
    Object of the invention
  • Thus, it is an object of the invention to provide a method and a device for voice activity detection, in particular intended to drive a voice quality enhancement integrated in a unit performing speech rate and coding adaptation in mobile communication systems, which method and device allow performing both the energetic and the spectral analysis by using the same processor as required for performing said adaptation.
    Summary of the invention
  • According to the invention, there is provided a method of detecting voice activity in a received speech signal in a radio communication system in which speech signals are transmitted in digitally coded form, and a signal representative of the presence or absence of voice activity is generated by submitting the received speech signals to an energetic analysis and a spectral analysis, said spectral analysis being performed directly on coded speech signals.
  • The invention also concerns a device for carrying out the method, comprising means for performing an energetic analysis and a spectral analysis on the received speech signal, in which said spectral analysis means are connected directly with a detector input where said coded speech signals are present.
  • In the preferred application, the voice activity detector drives a noise reduction operation, within a voice quality enhancement function performed on speech signals propagating in the uplink communication direction in a mobile communication system and embodied in units, like the so-called TRAU (Transcoding and Rate Adapting Unit), which adapt the uplink directed speech signals to the requirements of the control part of the mobile system and possibly of the fixed telephone network and adapt downlink directed speech signals to the requirements of the radio part of the mobile system.
  • Therefore, the invention provides also a method of voice quality enhancement in a mobile communication system, in which a voice quality enhancement including a noise reduction operation is performed at least for speech signals propagating in uplink direction, in which said noise reduction operation is driven by a signal representative of the presence or absence of voice activity generated by a method of and device for voice activity detection as defined above.
    Brief description of the drawings
  • A preferred embodiment of the invention, given by way of non-limiting example, will now be described with reference to the accompanying drawings, in which:
    • Fig. 1 is a schematic block diagram of a TRAU embodying a VQE unit and of its connections inside the mobile communications system; and
    • Fig. 2 is a schematic block diagram of the invention.
    Description of the preferred embodiment
  • The preferred embodiment disclosed here concerns a VAD intended to drive the noise reduction feature in a voice quality enhancement performed in the uplink direction of communication in a mobile communication system, in case the VQE function is incorporated into the units performing transcoding and/or rate adaptation in the control part of such a system.
  • Referring to Fig. 1, there is schematically shown a Transcoding and Rate Adapting Unit (TRAU) 1 of a mobile communication system, for instance a GSM system. The TRAU is connected to the Mobile Switching Centre (MSC) 2 and the Base Station Controller (BSC) 3 through interfaces A and Asub and embodies a VAD-driven Voice Quality Enhancement function performed in block 4 labelled VAD & VQE.
  • In the most general case, a VQE includes the well-known features of Acoustic Echo Cancellation, Noise Reduction and Acoustic Level Control. In the preferred application of the invention, all of said features are provided for the uplink direction of communication only and the VAD drives the Noise Reduction (NR) feature. In the downlink direction, only the Acoustic Level Control is performed, with which the present invention is not concerned.
  • The drawing only shows the units that, in TRAU 1, are directly concerned with the transcoding function, namely a speech coder 5 and a speech decoder 6 on the Asub-interface side, and an A/µ law expander 7 and an A/µ law compander 8 on the A-interface side.
  • In the downlink direction, the TRAU receives A-law PCM signals from the MSC through line 10 and sends the expanded signals to the In_Down input of VAD&VQE block 4 through line 11. The signals leaving the Out_Down output of block 4 are fed to coder 5 through line 12 and coded according to the desired coding technique (full-rate, enhanced full-rate, half-rate or adaptive multi-rate); the coded signals are then forwarded to base station controller 3 through line 13.
  • In the uplink direction, the coded signals arriving from base station controller 3 through line 14 are fed to both decoder 6 and VAD&VQE block 4. The decoded signals are fed to the In_Up input of VAD&VQE block 4 through line 15. The decoded signals, having undergone voice quality enhancement, are fed from the Out_Up output of VAD&VQE block 4 to A/µ law compander 8 through line 16 and thence to the MSC through line 17.
  • It is not necessary to provide here details on the organisation of the coded speech signals in a mobile communication system, which depends on the kind of system and on the chosen coding rate. On the other hand, for any given system and rate, such organisation is well known to those skilled in the art and can be found in the relevant standards. It is sufficient here to recall that the coded speech signals include spectral information, such as the LPCs or a representation thereof.
  • In Fig. 2, block VAD&VQE is decomposed into its constituent blocks, namely VAD 40 and VQE 50. VAD 40 is shown schematically as a spectral analyser 41 determining the LPC coefficients, an energy analyser 42 and Joint Processing Means including a Hard Decision Unit 43 and a Soft Decision Unit 45. Said Joint Processing Means are adapted to combine the results of the two analyses and to emit on line 44 a signal indicating the nature of the received speech frame (the so-called VAD flag), which is an input to the Noise Reduction feature of VQE 50.
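The block structure of Fig. 2 can be sketched as a small pipeline. The class below is purely illustrative: the class name, the callable signatures and the toy stand-ins are all assumptions, and only the block numbers and their connections come from the description.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class VoiceActivityDetector:
    """Hypothetical wiring of VAD 40; block numbers follow Fig. 2."""

    spectral_analyser: Callable[[bytes], float]          # block 41, coded frames (line 14)
    energy_analyser: Callable[[Sequence[float]], float]  # block 42, decoded frames (line 15)
    hard_decision: Callable[[float, float], bool]        # block 43
    soft_decision: Callable[[List[bool]], List[bool]]    # block 45

    def vad_flags(
        self,
        coded_frames: Sequence[bytes],
        decoded_frames: Sequence[Sequence[float]],
    ) -> List[bool]:
        """Return one VAD flag per frame (True = voice), as emitted on line 44."""
        hard = [
            self.hard_decision(self.spectral_analyser(c), self.energy_analyser(d))
            for c, d in zip(coded_frames, decoded_frames)
        ]
        return self.soft_decision(hard)


# Toy stand-ins, chosen only so that the example runs end to end.
vad = VoiceActivityDetector(
    spectral_analyser=lambda coded: coded[0] / 255.0,
    energy_analyser=lambda pcm: sum(s * s for s in pcm) / len(pcm),
    hard_decision=lambda spectral, energy: spectral > 0.5 and energy > 0.01,
    soft_decision=lambda flags: list(flags),
)
flags = vad.vad_flags([b"\xff", b"\x00"], [[0.5, -0.5], [0.0, 0.0]])
```

The point of the sketch is the data flow: the spectral branch never sees decoded samples, while the energy branch never sees the coded frame.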
  • According to the invention, LPC analyser 41 is directly fed with the coded speech signal frames present on line 14, whereas energy estimator 42 is fed with the decoded signal outgoing from decoder 6 through line 15. The LPC analysis of course depends on the manner in which the LPCs are represented in the coded signal. The energy evaluation and the decision may be performed according to any technique known in the art, for instance as disclosed in the above-mentioned paper of L.R. Rabiner et al.
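Since the description leaves the energy evaluation and the decision rule open, the following is only one possible sketch. The mean-square energy in dB is standard; the Euclidean distance to a noise template is a simplified stand-in (it is not the LPC distance measure of Rabiner and Sambur), and both thresholds are arbitrary assumptions.

```python
import math
from typing import Sequence


def frame_energy_db(samples: Sequence[float], floor: float = 1e-12) -> float:
    """Mean-square energy of a decoded frame in dB, floored to avoid log(0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(max(mean_square, floor))


def spectral_distance(lpc: Sequence[float], noise_lpc: Sequence[float]) -> float:
    """Euclidean distance between the frame's LPCs and a noise template.

    Simplified stand-in for a proper LPC distance measure.
    """
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(lpc, noise_lpc)))


def hard_decision(
    energy_db: float,
    distance: float,
    energy_threshold_db: float = -40.0,  # assumed value, not from the patent
    distance_threshold: float = 0.5,     # assumed value, not from the patent
) -> bool:
    """Declare voice only when both the energy and the spectral cue agree."""
    return energy_db > energy_threshold_db and distance > distance_threshold
```

Requiring both cues to agree is a conservative design choice; an implementation could equally weight the two contributions or adapt the thresholds to the measured noise floor.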
  • Performing the LPC analysis directly on the coded signal affords a number of advantages in terms of processing power requirements. In particular, there is no need to dedicate processing power to the reconstruction of the LPC coefficients from the decoded signal: it is sufficient to extract them from the relevant information included in the coded speech signal, which is available on the same board. Besides the greater processing simplicity, the amount of information to be processed is reduced to one fifth or less: indeed, at most 244 bits are to be processed to obtain the LPC coefficients from the coded signal, whereas 1280 bits are to be processed when the linear signal is used.
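The "one fifth or less" figure can be checked from the two bit counts quoted above. The snippet uses only those two numbers; the constant names are illustrative.

```python
CODED_BITS_MAX = 244  # bits processed when LPCs are taken from the coded signal
LINEAR_BITS = 1280    # bits processed when the linear (decoded) signal is analysed

ratio = CODED_BITS_MAX / LINEAR_BITS
print(f"coded/linear = {ratio:.3f}")  # 0.191, i.e. slightly below one fifth
```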
  • Under such conditions, the same processor used on the TRAU board for performing all TRAU functions and for managing the so-called tandem free operation (i.e. for dispensing with the transcoding in case of communication between two mobile terminals), for a plurality of speech channels in parallel (for instance 12 channels), can also perform in real time, for the same channels, the voice activity detection exploiting both the spectral and the energy information, and the subsequent VQE. The resulting detection is more accurate than when only the energy information is exploited, and hence the noise suppression operation is more accurate as well.
  • It is to be appreciated that, according to the existing GSM standards, the LPC information in the coded signal is updated every 5 ms. The energy information is computed on the same interval of 5 ms and the two contributions are then jointly processed to take a decision on the nature of the audio segment. This is denoted by the presence of said Joint Processing Means 43 and 45. For voice quality enhancement, the high rate of decisions available at the output of said Hard Decision Unit 43 is not necessary and often has a negative impact on the perceived audio quality. Therefore these "hard" decisions are softened through a smoothing process which aims to redefine as "voice" any isolated segments of noise among a group of segments of voice, and to redefine as "noise" any isolated segments of voice among a group of segments of noise. This is the task of Soft Decision Unit 45, located immediately after said Hard Decision Unit 43.
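The smoothing performed by the Soft Decision Unit can be sketched as follows. The patent does not specify the window or the rule, so this minimal version assumes that "isolated" means a single 5 ms segment whose two neighbours agree with each other; the function name is hypothetical.

```python
from typing import List


def soften(hard: List[bool]) -> List[bool]:
    """Flip each isolated decision to match its two agreeing neighbours.

    True means "voice", False means "noise". Decisions are compared against
    the original hard sequence, so one flip never triggers another.
    """
    soft = list(hard)
    for i in range(1, len(hard) - 1):
        if hard[i - 1] == hard[i + 1] != hard[i]:
            soft[i] = hard[i - 1]
    return soft


# An isolated "noise" segment inside voice is relabelled as voice.
smoothed = soften([True, True, False, True, True])
```

A run of two or more equal decisions is left untouched under this assumption; a real implementation might use a longer window or a hangover counter instead.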
  • It is clear that the above description is given only by way of non-limiting example and that variations and modifications are possible without departing from the scope of the invention. In particular, even if reference has been made to a TRAU unit in a GSM system, what has been said can also be applied to mobile communication systems operating according to other standards. In such a case, the energy analysis and spectral analysis should be adapted to the specific requirements of that system. The invention could also be used in other radio communication systems in which a digital coding of speech is adopted and the digitally coded speech signals include spectral information. Moreover, the energy analysis could also be performed on the coded signal, as disclosed in the above-mentioned EP-A 1 017 042.

Claims (10)

  1. A method of detecting voice activity in a received speech signal in a radio communication system in which speech signals are transmitted in digitally coded form, and a signal representative of the presence or absence of voice activity is generated by submitting the received speech signals to an energetic analysis and a spectral analysis, characterised in that said spectral analysis is performed directly on coded speech signals.
  2. A method as claimed in claim 1, wherein spectral information in said coded signals is periodically updated, characterised in that speech signal analysis comprises a joint processing of said spectral information with energy information, and generating said signal representative of the presence or absence of voice activity from said joint processing through a softening decision step adapted to smooth the decision's rate.
  3. A method as claimed in claim 1 or 2, characterised in that said radio communication system is a mobile communication system and said signal representative of the presence or absence of voice activity is used to drive a noise reduction feature in a voice quality enhancement operation performed on signals propagating in uplink direction.
  4. A voice activity detector for detecting voice activity in a received speech signal in a radio communication system in which digitally coded speech signals are transmitted, the detector (40) comprising means (41, 42, 43, 45) for performing an energetic analysis and a spectral analysis on the received speech signal and for generating a signal representative of the presence or absence of voice activity based upon the results of said analyses, characterised in that said spectral analysis means (41) are connected directly with a detector input (14) where said coded speech signals are present.
  5. A voice activity detector as claimed in claim 4, wherein spectral information in said coded signal is periodically updated, characterised in that joint processing means (43, 45) for both spectral and energetic information are connected between said spectral (41) and energetic (42) analysis means and the output of the voice activity detector.
  6. A voice activity detector as claimed in claim 5, characterised in that said Joint Processing Means include:
    a Hard Decision Unit (43) connected to said means (42) for performing an energetic analysis and to said means (41) for performing a spectral analysis, adapted to jointly process the two input segments and to output a hard noise or voice decision;
    a Soft Decision Unit (45) connected at the output of said Hard Decision Unit (43) and adapted to perform a smoothing process in order to redefine as "voice" any isolated segments of noise among a group of segments of voice and to redefine as "noise" any isolated segments of voice among a group of segments of noise.
  7. A voice activity detector as claimed in any of claims 4 to 6 for use in a mobile communication system, characterised in that said detector (40) is located upstream of means (50) performing a voice quality enhancement in the uplink direction of communication, and said signal representative of the presence or absence of voice activity drives noise reduction means in said means (50) performing voice quality enhancement.
  8. A voice activity detector as claimed in claim 7, characterised in that said detector (40) is part, together with said voice quality enhancement means (50), of a unit (2) performing an adaptation of the uplink directed speech signals to the requirements of the control part of the mobile system and possibly of a fixed network and an adaptation of the downlink directed speech signals to the requirements of the radio part of the mobile system.
  9. A voice activity detector as claimed in claim 8, characterised in that it is implemented by the same processor that would be provided for performing speech signal adaptation and voice quality enhancement in parallel for a plurality of speech channels.
  10. A method of voice quality enhancement in a mobile communication system, in which a voice quality enhancement including a noise reduction operation is performed at least for speech signals propagating in uplink direction, characterised in that said noise reduction operation is driven by a signal representative of the presence or absence of voice activity generated by a method of voice activity detection as claimed in any of claims 1 to 3 and/or by a voice activity detector as claimed in any of claims 4 to 8.
EP04425031A 2004-01-22 2004-01-22 Voice activity detection operating with compressed speech signal parameters Expired - Lifetime EP1557820B1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
AT04425031T ATE343196T1 (en) 2004-01-22 2004-01-22 VOICE ACTIVITY DETECTION USING COMPRESSED VOICE SIGNAL PARAMETERS
DE602004002845T DE602004002845T2 (en) 2004-01-22 2004-01-22 Voice activity detection using compressed speech signal parameters
EP04425031A EP1557820B1 (en) 2004-01-22 2004-01-22 Voice activity detection operating with compressed speech signal parameters

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
EP04425031A EP1557820B1 (en) 2004-01-22 2004-01-22 Voice activity detection operating with compressed speech signal parameters

Publications (2)

Publication Number Publication Date
EP1557820A1 (en) 2005-07-27
EP1557820B1 (en) 2006-10-18

Family

ID=34626566

Family Applications (1)

Application Number Title Priority Date Filing Date
EP04425031A Expired - Lifetime EP1557820B1 (en) 2004-01-22 2004-01-22 Voice activity detection operating with compressed speech signal parameters

Country Status (3)

Country Link
EP (1) EP1557820B1 (en)
AT (1) ATE343196T1 (en)
DE (1) DE602004002845T2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107331393B (en) * 2017-08-15 2020-05-12 成都启英泰伦科技有限公司 Self-adaptive voice activity detection method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5579435A (en) * 1993-11-02 1996-11-26 Telefonaktiebolaget Lm Ericsson Discriminating between stationary and non-stationary signals
US5732390A (en) * 1993-06-29 1998-03-24 Sony Corp Speech signal transmitting and receiving apparatus with noise sensitive volume control
EP1017042A1 (en) * 1994-01-28 2000-07-05 AT&T Corp. Voice activity detection driven noise remediator

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017044245A1 (en) * 2015-09-10 2017-03-16 Qualcomm Incorporated Audio signal classification and post-processing following a decoder
CN107949881A (en) * 2015-09-10 2018-04-20 高通股份有限公司 Audio signal classification and post processing after decoder
US9972334B2 (en) 2015-09-10 2018-05-15 Qualcomm Incorporated Decoder audio classification
CN107949881B (en) * 2015-09-10 2019-05-31 高通股份有限公司 Audio signal classification and post-processing after decoder

Also Published As

Publication number Publication date
ATE343196T1 (en) 2006-11-15
EP1557820B1 (en) 2006-10-18
DE602004002845D1 (en) 2006-11-30
DE602004002845T2 (en) 2007-06-06

Similar Documents

Publication Publication Date Title
CA1231473A (en) Voice activity detection process and means for implementing said process
US6424938B1 (en) Complex signal activity detection for improved speech/noise classification of an audio signal
JP4870313B2 (en) Frame Erasure Compensation Method for Variable Rate Speech Encoder
EP1279167B1 (en) Method and apparatus for predictively quantizing voiced speech
EP1982324B1 (en) A voice detector and a method for suppressing sub-bands in a voice detector
EP0786760B1 (en) Speech coding
US9373342B2 (en) System and method for speech enhancement on compressed speech
KR20010024869A (en) A decoding method and system comprising an adaptive postfilter
JPH09152894A (en) Sound and silence discriminator
EP1212749B1 (en) Method and apparatus for interleaving line spectral information quantization methods in a speech coder
CN1046366C (en) Discriminating between stationary and non-stationary signals
US8144862B2 (en) Method and apparatus for the detection and suppression of echo in packet based communication networks using frame energy estimation
JP2000172283A (en) System and method for detecting sound
SE470577B (en) Method and apparatus for encoding and / or decoding background noise
EP1557820B1 (en) Voice activity detection operating with compressed speech signal parameters
KR100641673B1 (en) Pitch quantization for distributed speech recognition
JP3055608B2 (en) Voice coding method and apparatus
JPH07135490A (en) Voice detector and vocoder having voice detector
JPH07281689A (en) Audio signal transmission device
KR20100022894A (en) A voiced/unvoiced decision method for the smv of 3gpp2 using gaussian mixture model
JPH08293817A (en) Sound signal detection circuit and traveling communication terminal equipment
KR20100116102A (en) Method and apparatus for transmitting signal in a communication system

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT RO SE SI SK TR

AX Request for extension of the european patent

Extension state: AL LT LV MK

17P Request for examination filed

Effective date: 20060123

AKX Designation fees paid

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT RO SE SI SK TR

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: SIEMENS S.P.A.

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT RO SE SI SK TR

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT;WARNING: LAPSES OF ITALIAN PATENTS WITH EFFECTIVE DATE BEFORE 2007 MAY HAVE OCCURRED AT ANY TIME BEFORE 2007. THE CORRECT EFFECTIVE DATE MAY BE DIFFERENT FROM THE ONE RECORDED.

Effective date: 20061018

Ref country code: SI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061018

Ref country code: LI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061018

Ref country code: AT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061018

Ref country code: RO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061018

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061018

Ref country code: BE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061018

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061018

Ref country code: CH

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061018

Ref country code: CZ

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061018

Ref country code: SK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061018

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

Ref country code: CH

Ref legal event code: EP

REF Corresponds to:

Ref document number: 602004002845

Country of ref document: DE

Date of ref document: 20061130

Kind code of ref document: P

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070118

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070118

Ref country code: BG

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070118

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20070122

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: ES

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070129

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: MC

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20070131

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070319

NLV1 Nl: lapsed or annulled due to failure to fulfill the requirements of art. 29p and 29m of the patents act

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

EN Fr: translation not filed

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed

Effective date: 20070719

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: FR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070601

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070119

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20080122

Year of fee payment: 5

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: FR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061018

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: EE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061018

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LU

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20070122

Ref country code: CY

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061018

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: HU

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20070419

Ref country code: TR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20061018

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20090801

REG Reference to a national code

Ref country code: GB

Ref legal event code: 732E

Free format text: REGISTERED BETWEEN 20120719 AND 20120725

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20170119

Year of fee payment: 14

GBPC Gb: european patent ceased through non-payment of renewal fee

Effective date: 20180122

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20180122