AU2001294974A1 - Perceptual harmonic cepstral coefficients as the front-end for speech recognition - Google Patents

Perceptual harmonic cepstral coefficients as the front-end for speech recognition

Info

Publication number
AU2001294974A1
AU2001294974A1 AU2001294974A AU9497401A AU2001294974A1 AU 2001294974 A1 AU2001294974 A1 AU 2001294974A1 AU 2001294974 A AU2001294974 A AU 2001294974A AU 9497401 A AU9497401 A AU 9497401A AU 2001294974 A1 AU2001294974 A1 AU 2001294974A1
Authority
AU
Australia
Prior art keywords
speech recognition
cepstral coefficients
perceptual
perceptual harmonic
harmonic cepstral
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
AU2001294974A
Inventor
Liang Gu
Kenneth Rose
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of California
Original Assignee
University of California
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of California filed Critical University of California
Publication of AU2001294974A1 publication Critical patent/AU2001294974A1/en
Abandoned legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/93Discriminating between voiced and unvoiced parts of speech signals
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/02Feature extraction for speech recognition; Selection of recognition unit
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/06Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/90Pitch determination of speech signals
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/93Discriminating between voiced and unvoiced parts of speech signals
    • G10L2025/935Mixed voiced class; Transitions
AU2001294974A 2000-10-02 2001-10-02 Perceptual harmonic cepstral coefficients as the front-end for speech recognition Abandoned AU2001294974A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US23728500P 2000-10-02 2000-10-02
US60237285 2000-10-02
PCT/US2001/030909 WO2002029782A1 (en) 2000-10-02 2001-10-02 Perceptual harmonic cepstral coefficients as the front-end for speech recognition

Publications (1)

Publication Number Publication Date
AU2001294974A1 true AU2001294974A1 (en) 2002-04-15

Family

ID=22893097

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2001294974A Abandoned AU2001294974A1 (en) 2000-10-02 2001-10-02 Perceptual harmonic cepstral coefficients as the front-end for speech recognition

Country Status (3)

Country Link
US (2) US7337107B2 (en)
AU (1) AU2001294974A1 (en)
WO (1) WO2002029782A1 (en)

Families Citing this family (58)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7337107B2 (en) * 2000-10-02 2008-02-26 The Regents Of The University Of California Perceptual harmonic cepstral coefficients as the front-end for speech recognition
US20030004720A1 (en) * 2001-01-30 2003-01-02 Harinath Garudadri System and method for computing and transmitting parameters in a distributed voice recognition system
US7177304B1 (en) * 2002-01-03 2007-02-13 Cisco Technology, Inc. Devices, softwares and methods for prioritizing between voice data packets for discard decision purposes
SG120121A1 (en) * 2003-09-26 2006-03-28 St Microelectronics Asia Pitch detection of speech signals
JP4318119B2 (en) * 2004-06-18 2009-08-19 国立大学法人京都大学 Acoustic signal processing method, acoustic signal processing apparatus, acoustic signal processing system, and computer program
US20070299658A1 (en) * 2004-07-13 2007-12-27 Matsushita Electric Industrial Co., Ltd. Pitch Frequency Estimation Device, and Pich Frequency Estimation Method
KR100619893B1 (en) * 2004-07-23 2006-09-19 엘지전자 주식회사 A method and a apparatus of advanced low bit rate linear prediction coding with plp coefficient for mobile phone
US20060025991A1 (en) * 2004-07-23 2006-02-02 Lg Electronics Inc. Voice coding apparatus and method using PLP in mobile communications terminal
CN1776807A (en) * 2004-11-15 2006-05-24 松下电器产业株式会社 Sound identifying system and safety device having same
KR20060066483A (en) * 2004-12-13 2006-06-16 엘지전자 주식회사 Method for extracting feature vectors for voice recognition
JP4407538B2 (en) 2005-03-03 2010-02-03 ヤマハ株式会社 Microphone array signal processing apparatus and microphone array system
US7603275B2 (en) * 2005-10-31 2009-10-13 Hitachi, Ltd. System, method and computer program product for verifying an identity using voiced to unvoiced classifiers
US7788101B2 (en) * 2005-10-31 2010-08-31 Hitachi, Ltd. Adaptation method for inter-person biometrics variability
KR100653643B1 (en) * 2006-01-26 2006-12-05 삼성전자주식회사 Method and apparatus for detecting pitch by subharmonic-to-harmonic ratio
KR100790110B1 (en) * 2006-03-18 2008-01-02 삼성전자주식회사 Apparatus and method of voice signal codec based on morphological approach
KR100762596B1 (en) * 2006-04-05 2007-10-01 삼성전자주식회사 Speech signal pre-processing system and speech signal feature information extracting method
CN101622663B (en) * 2007-03-02 2012-06-20 松下电器产业株式会社 Encoding device and encoding method
US7877252B2 (en) * 2007-05-18 2011-01-25 Stmicroelectronics S.R.L. Automatic speech recognition method and apparatus, using non-linear envelope detection of signal power spectra
JP5089295B2 (en) * 2007-08-31 2012-12-05 インターナショナル・ビジネス・マシーンズ・コーポレーション Speech processing system, method and program
US8688441B2 (en) * 2007-11-29 2014-04-01 Motorola Mobility Llc Method and apparatus to facilitate provision and use of an energy value to determine a spectral envelope shape for out-of-signal bandwidth content
US8433582B2 (en) * 2008-02-01 2013-04-30 Motorola Mobility Llc Method and apparatus for estimating high-band energy in a bandwidth extension system
US20090201983A1 (en) * 2008-02-07 2009-08-13 Motorola, Inc. Method and apparatus for estimating high-band energy in a bandwidth extension system
US8463412B2 (en) * 2008-08-21 2013-06-11 Motorola Mobility Llc Method and apparatus to facilitate determining signal bounding frequencies
US8155967B2 (en) * 2008-12-08 2012-04-10 Begel Daniel M Method and system to identify, quantify, and display acoustic transformational structures in speech
US8463599B2 (en) * 2009-02-04 2013-06-11 Motorola Mobility Llc Bandwidth extension method and apparatus for a modified discrete cosine transform audio coder
US9055374B2 (en) * 2009-06-24 2015-06-09 Arizona Board Of Regents For And On Behalf Of Arizona State University Method and system for determining an auditory pattern of an audio segment
JP5316896B2 (en) * 2010-03-17 2013-10-16 ソニー株式会社 Encoding device, encoding method, decoding device, decoding method, and program
JP5183828B2 (en) * 2010-09-21 2013-04-17 三菱電機株式会社 Noise suppressor
US9208799B2 (en) * 2010-11-10 2015-12-08 Koninklijke Philips N.V. Method and device for estimating a pattern in a signal
US8849663B2 (en) 2011-03-21 2014-09-30 The Intellisis Corporation Systems and methods for segmenting and/or classifying an audio signal from transformed audio information
US9142220B2 (en) 2011-03-25 2015-09-22 The Intellisis Corporation Systems and methods for reconstructing an audio signal from transformed audio information
US9183850B2 (en) 2011-08-08 2015-11-10 The Intellisis Corporation System and method for tracking sound pitch across an audio signal
US8620646B2 (en) * 2011-08-08 2013-12-31 The Intellisis Corporation System and method for tracking sound pitch across an audio signal using harmonic envelope
US8548803B2 (en) 2011-08-08 2013-10-01 The Intellisis Corporation System and method of processing a sound signal including transforming the sound signal into a frequency-chirp domain
US20140329511A1 (en) * 2011-12-20 2014-11-06 Nokia Corporation Audio conferencing
US9173025B2 (en) 2012-02-08 2015-10-27 Dolby Laboratories Licensing Corporation Combined suppression of noise, echo, and out-of-location signals
US8712076B2 (en) 2012-02-08 2014-04-29 Dolby Laboratories Licensing Corporation Post-processing including median filtering of noise suppression gains
US9384759B2 (en) * 2012-03-05 2016-07-05 Malaspina Labs (Barbados) Inc. Voice activity detection and pitch estimation
US9076446B2 (en) * 2012-03-22 2015-07-07 Qiguang Lin Method and apparatus for robust speaker and speech recognition
EP2828855B1 (en) 2012-03-23 2016-04-27 Dolby Laboratories Licensing Corporation Determining a harmonicity measure for voice processing
CN103325384A (en) 2012-03-23 2013-09-25 杜比实验室特许公司 Harmonicity estimation, audio classification, pitch definition and noise estimation
CN103426441B (en) * 2012-05-18 2016-03-02 华为技术有限公司 Detect the method and apparatus of the correctness of pitch period
CN103928029B (en) 2013-01-11 2017-02-08 华为技术有限公司 Audio signal coding method, audio signal decoding method, audio signal coding apparatus, and audio signal decoding apparatus
US9953635B2 (en) 2013-04-11 2018-04-24 Cetin CETINTURK Relative excitation features for speech recognition
US9058820B1 (en) 2013-05-21 2015-06-16 The Intellisis Corporation Identifying speech portions of a sound model using various statistics thereof
US9484044B1 (en) 2013-07-17 2016-11-01 Knuedge Incorporated Voice enhancement and/or speech features extraction on noisy audio signals using successively refined transforms
US9530434B1 (en) 2013-07-18 2016-12-27 Knuedge Incorporated Reducing octave errors during pitch determination for noisy audio signals
US9208794B1 (en) 2013-08-07 2015-12-08 The Intellisis Corporation Providing sound models of an input signal using continuous and/or linear fitting
EP3696816B1 (en) 2014-05-01 2021-05-12 Nippon Telegraph and Telephone Corporation Periodic-combined-envelope-sequence generation device, periodic-combined-envelope-sequence generation method, periodic-combined-envelope-sequence generation program and recording medium
US9842611B2 (en) 2015-02-06 2017-12-12 Knuedge Incorporated Estimating pitch using peak-to-peak distances
US9870785B2 (en) 2015-02-06 2018-01-16 Knuedge Incorporated Determining features of harmonic signals
US9922668B2 (en) 2015-02-06 2018-03-20 Knuedge Incorporated Estimating fractional chirp rate with multiple frequency representations
US9997161B2 (en) 2015-09-11 2018-06-12 Microsoft Technology Licensing, Llc Automatic speech recognition confidence classifier
US10706852B2 (en) 2015-11-13 2020-07-07 Microsoft Technology Licensing, Llc Confidence features for automated speech recognition arbitration
US10468050B2 (en) * 2017-03-29 2019-11-05 Microsoft Technology Licensing, Llc Voice synthesized participatory rhyming chat bot
CN108022588B (en) * 2017-11-13 2022-03-29 河海大学 Robust speech recognition method based on dual-feature model
US11138334B1 (en) * 2018-10-17 2021-10-05 Medallia, Inc. Use of ASR confidence to improve reliability of automatic audio redaction
CN112863517B (en) * 2021-01-19 2023-01-06 苏州大学 Speech recognition method based on perceptual spectrum convergence rate

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3649765A (en) * 1969-10-29 1972-03-14 Bell Telephone Labor Inc Speech analyzer-synthesizer system employing improved formant extractor
US4885790A (en) * 1985-03-18 1989-12-05 Massachusetts Institute Of Technology Processing of acoustic waveforms
US5596680A (en) * 1992-12-31 1997-01-21 Apple Computer, Inc. Method and apparatus for detecting speech activity using cepstrum vectors
JP2812184B2 (en) * 1994-02-23 1998-10-22 日本電気株式会社 Complex Cepstrum Analyzer for Speech
AU696092B2 (en) * 1995-01-12 1998-09-03 Digital Voice Systems, Inc. Estimation of excitation parameters
US5774837A (en) * 1995-09-13 1998-06-30 Voxware, Inc. Speech coding system and method using voicing probability determination
US6233550B1 (en) * 1997-08-29 2001-05-15 The Regents Of The University Of California Method and apparatus for hybrid coding of speech at 4kbps
US6963833B1 (en) * 1999-10-26 2005-11-08 Sasken Communication Technologies Limited Modifications in the multi-band excitation (MBE) model for generating high quality speech at low bit rates
US6377916B1 (en) * 1999-11-29 2002-04-23 Digital Voice Systems, Inc. Multiband harmonic transform coder
US7337107B2 (en) * 2000-10-02 2008-02-26 The Regents Of The University Of California Perceptual harmonic cepstral coefficients as the front-end for speech recognition

Also Published As

Publication number Publication date
WO2002029782A1 (en) 2002-04-11
US7337107B2 (en) 2008-02-26
US20040128130A1 (en) 2004-07-01
US7756700B2 (en) 2010-07-13
US20080162122A1 (en) 2008-07-03

Similar Documents

Publication Publication Date Title
AU2001294974A1 (en) Perceptual harmonic cepstral coefficients as the front-end for speech recognition
AU7348000A (en) Voice recognition for internet navigation
AU2003295682A1 (en) Multilingual speech recognition
AU1149501A (en) Speech recognition
AU2002218916A1 (en) Hierarchical language models for speech recognition
AU5451800A (en) Phrase-based dialogue modeling with particular application to creating recognition grammars for voice-controlled user interfaces
AU2001275319A1 (en) Load-adjusted speech recognition
AU2002236034A1 (en) Spoken language interface
AU2001279172A1 (en) Computer-implemented speech recognition system training
AU2003235868A1 (en) Speech recognition device
AU3165000A (en) Client-server speech recognition
AU2002363991A1 (en) Distributed speech recognition with configurable front-end
AU5647899A (en) Speech recognizer
EP1251489A3 (en) Training the parameters of a speech recognition system for the recognition of pronunciation variations
AU2002325930A1 (en) Method for automatic speech recognition
AU1157401A (en) Speech recognition
AU2000276400A1 (en) Search method based on single triphone tree for large vocabulary continuous speech recognizer
AU2001262407A1 (en) Dynamic language models for speech recognition
AU2001238103A1 (en) Electrolaryngeal speech enhancement for telephony
AU2002231046A1 (en) Context-responsive spoken language instruction
AU2003273357A1 (en) Speech recognition system
AU2002302651A1 (en) Voice recognition method
AU2002354792A1 (en) Grammars for speech recognition
AU3381299A (en) Multiple stage speech recognizer
AU2001228894A1 (en) The speech recognition(setting) of on-line-network-game