AU2002233983A1 - Method and apparatus for robust speech classification

Info

Publication number
AU2002233983A1
AU2002233983A1 AU2002233983A AU3398302A AU2002233983A1 AU 2002233983 A1 AU2002233983 A1 AU 2002233983A1 AU 2002233983 A AU2002233983 A AU 2002233983A AU 3398302 A AU3398302 A AU 3398302A AU 2002233983 A1 AU2002233983 A1 AU 2002233983A1
Authority
AU
Australia
Prior art keywords
speech
classification
parameters
classifier
bit rate
Legal status
Abandoned
Application number
AU2002233983A
Inventor
Pengjun Huang
Current Assignee
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date
2000-12-08
Filing date
2001-12-04
Publication date
2002-06-18
Application filed by Qualcomm Inc

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/93 Discriminating between voiced and unvoiced parts of speech signals
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes
    • G10L19/22 Mode decision, i.e. based on audio signal content versus external parameters
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10 Digital recording or reproducing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/022 Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring
    • G10L19/025 Detection of transients or attacks for time/frequency resolution switching
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Exchange Systems With Centralized Control (AREA)
  • Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
  • Machine Translation (AREA)

Abstract

A speech classification technique for robust classification of varying modes of speech to enable maximum performance of multi-mode variable bit rate encoding techniques. A speech classifier accurately classifies a high percentage of speech segments for encoding at minimal bit rates, meeting lower bit rate requirements. Highly accurate speech classification produces a lower average encoded bit rate, and higher quality decoded speech. The speech classifier considers a maximum number of parameters for each frame of speech, producing numerous and accurate speech mode classifications for each frame. The speech classifier correctly classifies numerous modes of speech under varying environmental conditions. The speech classifier inputs classification parameters from external components, generates internal classification parameters from the input parameters, sets a Normalized Auto-correlation Coefficient Function threshold and selects a parameter analyzer according to the signal environment, and then analyzes the parameters to produce a speech mode classification.
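
The flow described in the abstract (input parameters, derived internal parameters, an environment-dependent NACF threshold and analyzer, then a mode decision) can be pictured with a minimal sketch. Everything concrete here is an assumption for illustration only: the parameter set, the mode names, the SNR split between "clean" and "noisy", and every numeric threshold are hypothetical placeholders and are not taken from the patent text.

```python
from dataclasses import dataclass

# Hypothetical SNR boundary between a "clean" and a "noisy" signal environment.
CLEAN_SNR_DB = 25.0


@dataclass(frozen=True)
class FrameParams:
    """Per-frame parameters assumed to arrive from external components."""
    vad: bool            # voice activity decision
    snr_db: float        # signal-to-noise estimate for the current environment
    nacf: float          # normalized auto-correlation coefficient at the pitch lag
    energy: float        # energy of the current frame
    prev_energy: float   # energy of the previous frame


def derive_internal(p: FrameParams) -> dict:
    """Generate internal parameters from the input parameters (one example only)."""
    return {"energy_ratio": p.energy / max(p.prev_energy, 1e-9)}


def _analyze(p: FrameParams, internal: dict, voiced_th: float, onset_ratio: float) -> str:
    """Shared decision logic; the two analyzers differ only in onset sensitivity."""
    if not p.vad:
        return "silence"
    if p.nacf < voiced_th:
        return "unvoiced"                 # weak periodicity
    if internal["energy_ratio"] >= onset_ratio:
        return "transient"                # periodic frame with a sharp energy rise
    return "voiced"


def analyze_clean(p: FrameParams, internal: dict, voiced_th: float) -> str:
    return _analyze(p, internal, voiced_th, onset_ratio=4.0)   # placeholder value


def analyze_noisy(p: FrameParams, internal: dict, voiced_th: float) -> str:
    return _analyze(p, internal, voiced_th, onset_ratio=6.0)   # placeholder value


def classify_frame(p: FrameParams) -> str:
    """Set the NACF threshold and pick an analyzer from the signal environment."""
    internal = derive_internal(p)
    clean = p.snr_db >= CLEAN_SNR_DB
    voiced_th = 0.75 if clean else 0.65   # placeholder thresholds
    analyzer = analyze_clean if clean else analyze_noisy
    return analyzer(p, internal, voiced_th)
```

In this sketch the same frame can receive a different label depending on the environment: a frame with `nacf=0.7` is labeled unvoiced at 30 dB SNR but voiced at 10 dB, because the assumed threshold is relaxed in noise.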
Application AU2002233983A, priority date 2000-12-08, filing date 2001-12-04: Method and apparatus for robust speech classification. Status: Abandoned. Publication: AU2002233983A1 (en)

Applications Claiming Priority (3)

Application Number Publication Priority Date Filing Date Title
US09/733,740 2000-12-08
US09/733,740 US7472059B2 (en) 2000-12-08 2000-12-08 Method and apparatus for robust speech classification
PCT/US2001/046971 WO2002047068A2 (en) 2000-12-08 2001-12-04 Method and apparatus for robust speech classification

Publications (1)

Publication Number Publication Date
AU2002233983A1 (en) 2002-06-18

Family

ID=24948935

Family Applications (1)

Application Number Legal Status Publication Priority Date Filing Date Title
AU2002233983A Abandoned AU2002233983A1 (en) 2000-12-08 2001-12-04 Method and apparatus for robust speech classification

Country Status (13)

Country Link
US (1) US7472059B2 (en)
EP (1) EP1340223B1 (en)
JP (2) JP4550360B2 (en)
KR (2) KR100908219B1 (en)
CN (2) CN100350453C (en)
AT (1) ATE341808T1 (en)
AU (1) AU2002233983A1 (en)
BR (2) BRPI0116002B1 (en)
DE (1) DE60123651T2 (en)
ES (1) ES2276845T3 (en)
HK (1) HK1067444A1 (en)
TW (1) TW535141B (en)
WO (1) WO2002047068A2 (en)

Families Citing this family (67)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6691084B2 (en) * 1998-12-21 2004-02-10 Qualcomm Incorporated Multiple mode variable rate speech coding
GB0003903D0 (en) * 2000-02-18 2000-04-05 Canon Kk Improved speech recognition accuracy in a multimodal input system
US8090577B2 (en) 2002-08-08 2012-01-03 Qualcomm Incorporated Bandwidth-adaptive quantization
US7657427B2 (en) * 2002-10-11 2010-02-02 Nokia Corporation Methods and devices for source controlled variable bit-rate wideband speech coding
US7023880B2 (en) * 2002-10-28 2006-04-04 Qualcomm Incorporated Re-formatting variable-rate vocoder frames for inter-system transmissions
US7698132B2 (en) * 2002-12-17 2010-04-13 Qualcomm Incorporated Sub-sampled excitation waveform codebooks
US7613606B2 (en) * 2003-10-02 2009-11-03 Nokia Corporation Speech codecs
US7472057B2 (en) * 2003-10-17 2008-12-30 Broadcom Corporation Detector for use in voice communications systems
KR20050045764A (en) * 2003-11-12 2005-05-17 삼성전자주식회사 Apparatus and method for recording and playing voice in the wireless telephone
US7630902B2 (en) * 2004-09-17 2009-12-08 Digital Rise Technology Co., Ltd. Apparatus and methods for digital audio coding using codebook application ranges
EP1861846B1 (en) * 2005-03-24 2011-09-07 Mindspeed Technologies, Inc. Adaptive voice mode extension for a voice activity detector
US20060262851A1 (en) 2005-05-19 2006-11-23 Celtro Ltd. Method and system for efficient transmission of communication traffic
KR100744352B1 (en) * 2005-08-01 2007-07-30 삼성전자주식회사 Method of voiced/unvoiced classification based on harmonic to residual ratio analysis and the apparatus thereof
US20070033042A1 (en) * 2005-08-03 2007-02-08 International Business Machines Corporation Speech detection fusing multi-class acoustic-phonetic, and energy features
US7962340B2 (en) * 2005-08-22 2011-06-14 Nuance Communications, Inc. Methods and apparatus for buffering data for use in accordance with a speech recognition system
KR100735343B1 (en) * 2006-04-11 2007-07-04 삼성전자주식회사 Apparatus and method for extracting pitch information of a speech signal
EP2033489B1 (en) 2006-06-14 2015-10-28 Personics Holdings, LLC. Earguard monitoring system
EP2044804A4 (en) 2006-07-08 2013-12-18 Personics Holdings Inc Personal audio assistant device and method
US8239190B2 (en) * 2006-08-22 2012-08-07 Qualcomm Incorporated Time-warping frames of wideband vocoder
CA2663904C (en) * 2006-10-10 2014-05-27 Qualcomm Incorporated Method and apparatus for encoding and decoding audio signals
CA2672165C (en) * 2006-12-12 2014-07-29 Ralf Geiger Encoder, decoder and methods for encoding and decoding data segments representing a time-domain data stream
US11750965B2 (en) 2007-03-07 2023-09-05 Staton Techiya, Llc Acoustic dampening compensation system
WO2008126347A1 (en) * 2007-03-16 2008-10-23 Panasonic Corporation Voice analysis device, voice analysis method, voice analysis program, and system integration circuit
US8111839B2 (en) 2007-04-09 2012-02-07 Personics Holdings Inc. Always on headwear recording system
US11856375B2 (en) 2007-05-04 2023-12-26 Staton Techiya Llc Method and device for in-ear echo suppression
US11683643B2 (en) 2007-05-04 2023-06-20 Staton Techiya Llc Method and device for in ear canal echo suppression
US8502648B2 (en) 2007-08-16 2013-08-06 Broadcom Corporation Remote-control device with directional audio system
PT2186090T (en) 2007-08-27 2017-03-07 ERICSSON TELEFON AB L M (publ) Transient detector and method for supporting encoding of an audio signal
US8768690B2 (en) 2008-06-20 2014-07-01 Qualcomm Incorporated Coding scheme selection for low-bit-rate applications
US20090319261A1 (en) * 2008-06-20 2009-12-24 Qualcomm Incorporated Coding of transitional speech frames for low-bit-rate applications
US20090319263A1 (en) * 2008-06-20 2009-12-24 Qualcomm Incorporated Coding of transitional speech frames for low-bit-rate applications
KR20100006492A (en) 2008-07-09 2010-01-19 삼성전자주식회사 Method and apparatus for deciding encoding mode
US8380498B2 (en) * 2008-09-06 2013-02-19 GH Innovation, Inc. Temporal envelope coding of energy attack signal by using attack point location
US8600067B2 (en) 2008-09-19 2013-12-03 Personics Holdings Inc. Acoustic sealing analysis system
US9129291B2 (en) 2008-09-22 2015-09-08 Personics Holdings, Llc Personalized sound management and method
FR2944640A1 (en) * 2009-04-17 2010-10-22 France Telecom METHOD AND DEVICE FOR OBJECTIVE EVALUATION OF THE VOICE QUALITY OF A SPEECH SIGNAL TAKING INTO ACCOUNT THE CLASSIFICATION OF THE BACKGROUND NOISE CONTAINED IN THE SIGNAL.
US9838784B2 (en) 2009-12-02 2017-12-05 Knowles Electronics, Llc Directional audio capture
US8473287B2 (en) 2010-04-19 2013-06-25 Audience, Inc. Method for jointly optimizing noise reduction and voice quality in a mono or multi-microphone system
US8538035B2 (en) 2010-04-29 2013-09-17 Audience, Inc. Multi-microphone robust noise suppression
US8781137B1 (en) 2010-04-27 2014-07-15 Audience, Inc. Wind noise detection and suppression
WO2011145249A1 (en) 2010-05-17 2011-11-24 パナソニック株式会社 Audio classification device, method, program and integrated circuit
US8447596B2 (en) 2010-07-12 2013-05-21 Audience, Inc. Monaural noise suppression based on computational auditory scene analysis
US8311817B2 (en) * 2010-11-04 2012-11-13 Audience, Inc. Systems and methods for enhancing voice quality in mobile device
JP2012203351A (en) * 2011-03-28 2012-10-22 Yamaha Corp Consonant identification apparatus and program
US8990074B2 (en) * 2011-05-24 2015-03-24 Qualcomm Incorporated Noise-robust speech coding mode classification
WO2013075753A1 (en) * 2011-11-25 2013-05-30 Huawei Technologies Co., Ltd. An apparatus and a method for encoding an input signal
US8731911B2 (en) * 2011-12-09 2014-05-20 Microsoft Corporation Harmonicity-based single-channel speech quality estimation
US20150039300A1 (en) * 2012-03-14 2015-02-05 Panasonic Corporation Vehicle-mounted communication device
CN103903633B (en) * 2012-12-27 2017-04-12 华为技术有限公司 Method and apparatus for detecting voice signal
US9536540B2 (en) 2013-07-19 2017-01-03 Knowles Electronics, Llc Speech signal separation and synthesis based on auditory scene analysis and speech modeling
US9167082B2 (en) 2013-09-22 2015-10-20 Steven Wayne Goldstein Methods and systems for voice augmented caller ID / ring tone alias
US10043534B2 (en) 2013-12-23 2018-08-07 Staton Techiya, Llc Method and device for spectral expansion for an audio signal
EP2922056A1 (en) * 2014-03-19 2015-09-23 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus, method and corresponding computer program for generating an error concealment signal using power compensation
CN105374367B (en) 2014-07-29 2019-04-05 华为技术有限公司 Abnormal frame detection method and device
DE112015004185T5 (en) 2014-09-12 2017-06-01 Knowles Electronics, Llc Systems and methods for recovering speech components
US9886963B2 (en) * 2015-04-05 2018-02-06 Qualcomm Incorporated Encoder selection
KR102446392B1 (en) * 2015-09-23 2022-09-23 삼성전자주식회사 Electronic device and method for recognizing voice of speech
US10616693B2 (en) 2016-01-22 2020-04-07 Staton Techiya Llc System and method for efficiency among devices
US9820042B1 (en) 2016-05-02 2017-11-14 Knowles Electronics, Llc Stereo separation and directional suppression with omni-directional microphones
EP3324407A1 (en) * 2016-11-17 2018-05-23 Fraunhofer Gesellschaft zur Förderung der Angewand Apparatus and method for decomposing an audio signal using a ratio as a separation characteristic
EP3324406A1 (en) 2016-11-17 2018-05-23 Fraunhofer Gesellschaft zur Förderung der Angewand Apparatus and method for decomposing an audio signal using a variable threshold
WO2018118744A1 (en) * 2016-12-19 2018-06-28 Knowles Electronics, Llc Methods and systems for reducing false alarms in keyword detection
KR20180111271A (en) * 2017-03-31 2018-10-11 삼성전자주식회사 Method and device for removing noise using neural network model
US10951994B2 (en) 2018-04-04 2021-03-16 Staton Techiya, Llc Method to acquire preferred dynamic range function for speech enhancement
CN109545192B (en) * 2018-12-18 2022-03-08 百度在线网络技术(北京)有限公司 Method and apparatus for generating a model
WO2020223797A1 (en) * 2019-05-07 2020-11-12 Voiceage Corporation Methods and devices for detecting an attack in a sound signal to be coded and for coding the detected attack
CN110310668A (en) * 2019-05-21 2019-10-08 深圳壹账通智能科技有限公司 Mute detection method, system, equipment and computer readable storage medium

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US574906A (en) * 1897-01-12 Chain
US4281218A (en) * 1979-10-26 1981-07-28 Bell Telephone Laboratories, Incorporated Speech-nonspeech detector-classifier
JPS58143394A (en) * 1982-02-19 1983-08-25 株式会社日立製作所 Detection/classification system for voice section
CA2040025A1 (en) 1990-04-09 1991-10-10 Hideki Satoh Speech detection apparatus with influence of input level and noise reduced
US5680508A (en) * 1991-05-03 1997-10-21 Itt Corporation Enhancement of speech coding in background noise for low-rate speech coder
BR9206143A (en) * 1991-06-11 1995-01-03 Qualcomm Inc Vocal end compression processes and for variable rate encoding of input frames, apparatus to compress an acoustic signal into variable rate data, prognostic encoder triggered by variable rate code (CELP) and decoder to decode encoded frames
FR2684226B1 (en) * 1991-11-22 1993-12-24 Thomson Csf ROUTE DECISION METHOD AND DEVICE FOR VERY LOW FLOW VOCODER.
JP3277398B2 (en) 1992-04-15 2002-04-22 ソニー株式会社 Voiced sound discrimination method
US5734789A (en) * 1992-06-01 1998-03-31 Hughes Electronics Voiced, unvoiced or noise modes in a CELP vocoder
IN184794B (en) * 1993-09-14 2000-09-30 British Telecomm
US5784532A (en) 1994-02-16 1998-07-21 Qualcomm Incorporated Application specific integrated circuit (ASIC) for performing rapid speech compression in a mobile telephone system
TW271524B (en) * 1994-08-05 1996-03-01 Qualcomm Inc
GB2317084B (en) 1995-04-28 2000-01-19 Northern Telecom Ltd Methods and apparatus for distinguishing speech intervals from noise intervals in audio signals
JPH09152894A (en) 1995-11-30 1997-06-10 Denso Corp Sound and silence discriminator
DE69831991T2 (en) * 1997-03-25 2006-07-27 Koninklijke Philips Electronics N.V. Method and device for speech detection
JP3273599B2 (en) * 1998-06-19 2002-04-08 沖電気工業株式会社 Speech coding rate selector and speech coding device
JP2000010577A (en) 1998-06-19 2000-01-14 Sony Corp Voiced sound/voiceless sound judging device
US6640208B1 (en) * 2000-09-12 2003-10-28 Motorola, Inc. Voiced/unvoiced speech classifier

Also Published As

Publication number Publication date
KR20090026805A (en) 2009-03-13
CN101131817B (en) 2013-11-06
CN101131817A (en) 2008-02-27
DE60123651T2 (en) 2007-10-04
WO2002047068A3 (en) 2002-08-22
CN100350453C (en) 2007-11-21
HK1067444A1 (en) 2005-04-08
JP2004515809A (en) 2004-05-27
JP4550360B2 (en) 2010-09-22
BR0116002A (en) 2006-05-09
EP1340223A2 (en) 2003-09-03
ES2276845T3 (en) 2007-07-01
WO2002047068A2 (en) 2002-06-13
KR20030061839A (en) 2003-07-22
ATE341808T1 (en) 2006-10-15
EP1340223B1 (en) 2006-10-04
DE60123651D1 (en) 2006-11-16
CN1543639A (en) 2004-11-03
US20020111798A1 (en) 2002-08-15
US7472059B2 (en) 2008-12-30
JP5425682B2 (en) 2014-02-26
TW535141B (en) 2003-06-01
KR100895589B1 (en) 2009-05-06
KR100908219B1 (en) 2009-07-20
BRPI0116002B1 (en) 2018-04-03
JP2010176145A (en) 2010-08-12

Similar Documents

Publication Publication Date Title
AU2002233983A1 (en) Method and apparatus for robust speech classification
CN1185626C (en) System and method for modifying speech signals
CN1241169C (en) Low bit-rate coding of unvoiced segments of speech
JP5543405B2 (en) Predictive speech coder using coding scheme patterns to reduce sensitivity to frame errors
CN1302459C (en) A low-bit-rate coding method and apparatus for unvoiced speech
CN101763856B (en) Signal classifying method, classifying device and coding system
US20060171419A1 (en) Method for discontinuous transmission and accurate reproduction of background noise information
JP2002534720A (en) Adaptive Window for Analytical CELP Speech Coding by Synthesis
CN101615910B (en) Method, device and equipment of compression coding and compression coding method
WO2004034379A3 (en) Methods and devices for source controlled variable bit-rate wideband speech coding
AU1593800A (en) Complex signal activity detection for improved speech/noise classification of an audio signal
WO2005020210A3 (en) Method and apparatus for adaptive variable bit rate audio encoding
CA2494956A1 (en) Bandwidth-adaptive quantization
WO2005022343A3 (en) System and methods for incrementally augmenting a classifier
US10504540B2 (en) Signal classifying method and device, and audio encoding method and device using same
CN103915097B (en) Voice signal processing method, device and system
Shahbazi et al. Data transmission over GSM adaptive multi rate voice channel using speech-like symbols
WO2004090864A3 (en) Method and apparatus for the encoding and decoding of speech
CN101847410A (en) Fast quantization method for compressing digital audio signals
CN109089253A (en) A kind of audio compression Transmission system based on low-power consumption bluetooth
Boloursaz et al. Secure data communication through GSM adaptive multi rate voice channel
KR100651731B1 (en) Apparatus and method for variable frame speech encoding/decoding
Kazemi et al. A lower capacity bound of secure end to end data transmission via GSM network
US20050102136A1 (en) Speech codecs
CN1262991C (en) Method and apparatus for tracking the phase of a quasi-periodic signal