DE60000074T2 - Linear predictive cepstral features organized in hierarchical subbands for HMM-based speech recognition - Google Patents

Linear predictive cepstral features organized in hierarchical subbands for HMM-based speech recognition

Info

Publication number
DE60000074T2
Authority
DE
Germany
Prior art keywords
hmm
speech recognition
linear predictive
based speech
cepstral features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
DE60000074T
Other languages
English (en)
Other versions
DE60000074D1 (de)
Inventor
Rathinavelu Chengalvarayan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia of America Corp
Original Assignee
Lucent Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lucent Technologies Inc
Application granted
Publication of DE60000074D1
Publication of DE60000074T2
Anticipated expiration
Expired - Fee Related

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/14 Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
    • G10L15/142 Hidden Markov Models [HMMs]
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/24 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
DE60000074T 1999-03-12 2000-03-07 Linear predictive cepstral features organized in hierarchical subbands for HMM-based speech recognition Expired - Fee Related DE60000074T2 (de)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US09/266,958 US6292776B1 (en) 1999-03-12 1999-03-12 Hierarchial subband linear predictive cepstral features for HMM-based speech recognition

Publications (2)

Publication Number Publication Date
DE60000074D1 DE60000074D1 (de) 2002-03-28
DE60000074T2 DE60000074T2 (de) 2002-08-29

Family

ID=23016697

Family Applications (1)

Application Number Title Priority Date Filing Date
DE60000074T Expired - Fee Related DE60000074T2 (de) 1999-03-12 2000-03-07 Linear predictive cepstral features organized in hierarchical subbands for HMM-based speech recognition

Country Status (5)

Country Link
US (1) US6292776B1 (de)
EP (1) EP1041540B1 (de)
JP (1) JP3810608B2 (de)
CA (1) CA2299051C (de)
DE (1) DE60000074T2 (de)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102006014507A1 (de) * 2006-03-19 2007-09-20 Technische Universität Dresden Method and device for the classification and assessment of musical instruments

Families Citing this family (44)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FI19992350A (fi) * 1999-10-29 2001-04-30 Nokia Mobile Phones Ltd Improved speech recognition
US20020065649A1 (en) * 2000-08-25 2002-05-30 Yoon Kim Mel-frequency linear prediction speech recognition apparatus and method
US6754626B2 (en) * 2001-03-01 2004-06-22 International Business Machines Corporation Creating a hierarchical tree of language models for a dialog system based on prompt and dialog context
JP3564501B2 (ja) 2001-03-22 2004-09-15 学校法人明治大学 Infant voice analysis system
US7623114B2 (en) 2001-10-09 2009-11-24 Immersion Corporation Haptic feedback sensations based on audio output from computer devices
US6703550B2 (en) * 2001-10-10 2004-03-09 Immersion Corporation Sound data output and manipulation using haptic feedback
WO2004004320A1 (en) * 2002-07-01 2004-01-08 The Regents Of The University Of California Digital processing of video images
JP4517163B2 (ja) * 2004-03-12 2010-08-04 株式会社国際電気通信基礎技術研究所 Frequency characteristic equalization device
US7765333B2 (en) 2004-07-15 2010-07-27 Immersion Corporation System and method for ordering haptic effects
US20060017691A1 (en) * 2004-07-23 2006-01-26 Juan Manuel Cruz-Hernandez System and method for controlling audio output associated with haptic effects
CN1296887C (zh) * 2004-09-29 2007-01-24 上海交通大学 Training method for an embedded automatic speech recognition system
US7676362B2 (en) * 2004-12-31 2010-03-09 Motorola, Inc. Method and apparatus for enhancing loudness of a speech signal
US8280730B2 (en) 2005-05-25 2012-10-02 Motorola Mobility Llc Method and apparatus of increasing speech intelligibility in noisy environments
US20070055519A1 (en) * 2005-09-02 2007-03-08 Microsoft Corporation Robust bandwith extension of narrowband signals
US8700791B2 (en) 2005-10-19 2014-04-15 Immersion Corporation Synchronization of haptic effect data in a media transport stream
US7970613B2 (en) 2005-11-12 2011-06-28 Sony Computer Entertainment Inc. Method and system for Gaussian probability data bit reduction and computation
US8010358B2 (en) * 2006-02-21 2011-08-30 Sony Computer Entertainment Inc. Voice recognition with parallel gender and age normalization
US7778831B2 (en) 2006-02-21 2010-08-17 Sony Computer Entertainment Inc. Voice recognition with dynamic filter bank adjustment based on speaker categorization determined from runtime pitch
US7979146B2 (en) 2006-04-13 2011-07-12 Immersion Corporation System and method for automatically producing haptic events from a digital audio signal
US8000825B2 (en) * 2006-04-13 2011-08-16 Immersion Corporation System and method for automatically producing haptic events from a digital audio file
US8378964B2 (en) 2006-04-13 2013-02-19 Immersion Corporation System and method for automatically producing haptic events from a digital audio signal
US20070250311A1 (en) * 2006-04-25 2007-10-25 Glen Shires Method and apparatus for automatic adjustment of play speed of audio data
US20080003550A1 (en) * 2006-06-30 2008-01-03 George Betsis Systems and method for recognizing meanings in sounds made by infants
US7873209B2 (en) 2007-01-31 2011-01-18 Microsoft Corporation Segment-discriminating minimum classification error pattern recognition
JP4762176B2 (ja) * 2007-03-05 2011-08-31 日本放送協会 Speech recognition device and speech recognition program
US7764802B2 (en) * 2007-03-09 2010-07-27 Srs Labs, Inc. Frequency-warped audio equalizer
US9019087B2 (en) 2007-10-16 2015-04-28 Immersion Corporation Synchronization of haptic effect data in a media stream
DE102007056221B4 (de) 2007-11-27 2009-07-09 Siemens Ag Österreich Method for speech recognition
CN101546556B (zh) * 2008-03-28 2011-03-23 展讯通信(上海)有限公司 Classification system for audio content identification
KR101287892B1 (ko) * 2008-08-11 2013-07-22 임머숀 코퍼레이션 Haptic-enabled peripheral device for music games
US8200489B1 (en) * 2009-01-29 2012-06-12 The United States Of America As Represented By The Secretary Of The Navy Multi-resolution hidden markov model using class specific features
US8442833B2 (en) * 2009-02-17 2013-05-14 Sony Computer Entertainment Inc. Speech processing with source location estimation using signals from two or more microphones
US8788256B2 (en) 2009-02-17 2014-07-22 Sony Computer Entertainment Inc. Multiple language voice recognition
US8442829B2 (en) * 2009-02-17 2013-05-14 Sony Computer Entertainment Inc. Automatic computation streaming partition for voice recognition on multiple processors with limited memory
KR101008264B1 (ko) 2009-02-27 2011-01-13 전자부품연구원 Method for selecting the order of linear prediction coefficients and signal processing apparatus using the same
US9798653B1 (en) * 2010-05-05 2017-10-24 Nuance Communications, Inc. Methods, apparatus and data structure for cross-language speech adaptation
CN101944359B (zh) * 2010-07-23 2012-04-25 杭州网豆数字技术有限公司 Speech recognition method for a specific group of people
US8639508B2 (en) * 2011-02-14 2014-01-28 General Motors Llc User-specific confidence thresholds for speech recognition
US8719019B2 (en) * 2011-04-25 2014-05-06 Microsoft Corporation Speaker identification
CN102254554B (zh) * 2011-07-18 2012-08-08 中国科学院自动化研究所 Method for hierarchical modeling and prediction of stress in Mandarin
WO2013124862A1 (en) * 2012-02-21 2013-08-29 Tata Consultancy Services Limited Modified mel filter bank structure using spectral characteristics for sound analysis
US9153235B2 (en) 2012-04-09 2015-10-06 Sony Computer Entertainment Inc. Text dependent speaker recognition with long-term feature based on functional data analysis
PL403724A1 (pl) 2013-05-01 2014-11-10 Akademia Górniczo-Hutnicza im. Stanisława Staszica w Krakowie Speech recognition system and method of using dynamic models and Bayesian networks
US9653094B2 (en) 2015-04-24 2017-05-16 Cyber Resonance Corporation Methods and systems for performing signal analysis to identify content types

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5271088A (en) * 1991-05-13 1993-12-14 Itt Corporation Automated sorting of voice messages through speaker spotting
US5590242A (en) * 1994-03-24 1996-12-31 Lucent Technologies Inc. Signal bias removal for robust telephone speech recognition
US5749066A (en) * 1995-04-24 1998-05-05 Ericsson Messaging Systems Inc. Method and apparatus for developing a neural network for phoneme recognition
US5806022A (en) * 1995-12-20 1998-09-08 At&T Corp. Method and system for performing speech recognition
US5765124A (en) * 1995-12-29 1998-06-09 Lucent Technologies Inc. Time-varying feature space preprocessing procedure for telephone based speech recognition
FR2748342B1 (fr) * 1996-05-06 1998-07-17 France Telecom Method and device for equalization filtering of a speech signal, using a statistical model of that signal
US6064958A (en) * 1996-09-20 2000-05-16 Nippon Telegraph And Telephone Corporation Pattern recognition scheme using probabilistic models based on mixtures distribution of discrete distribution
US5930753A (en) * 1997-03-20 1999-07-27 At&T Corp Combining frequency warping and spectral shaping in HMM based speech recognition
FR2766604B1 (fr) * 1997-07-22 1999-10-01 France Telecom Method and device for blind equalization of the effects of a transmission channel on a digital speech signal
US6112175A (en) * 1998-03-02 2000-08-29 Lucent Technologies Inc. Speaker adaptation using discriminative linear regression on time-varying mean parameters in trended HMM

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102006014507A1 (de) * 2006-03-19 2007-09-20 Technische Universität Dresden Method and device for the classification and assessment of musical instruments
DE102006014507B4 (de) * 2006-03-19 2009-05-07 Technische Universität Dresden Method and device for the classification and assessment of musical instruments of the same instrument groups

Also Published As

Publication number Publication date
CA2299051A1 (en) 2000-09-12
US6292776B1 (en) 2001-09-18
JP3810608B2 (ja) 2006-08-16
JP2000267692A (ja) 2000-09-29
EP1041540A1 (de) 2000-10-04
CA2299051C (en) 2004-04-13
EP1041540B1 (de) 2002-02-20
DE60000074D1 (de) 2002-03-28

Similar Documents

Publication Publication Date Title
DE60000074D1 (de) Linear predictive cepstral features organized in hierarchical subbands for HMM-based speech recognition
DE69827988D1 (de) Language models for speech recognition
DE69838305D1 (de) Orthogonalization search for CELP-based speech coding
DE60115738D1 (de) Language models for speech recognition
DE69831114D1 (de) Integration of multiple models for speech recognition in different environments
DE60109105D1 (de) Hierarchical dictionaries for speech recognition
DE60126882D1 (de) Hierarchical dictionaries for speech recognition
DE69421354T2 (de) Data compression for speech recognition
DE69829235D1 (de) Registration for speech recognition
DE60229095D1 (de) Pronunciations in multiple languages for speech recognition
DE69618503T2 (de) Speech recognition for tonal languages
HK1048187A1 (en) Variable bit-rate celp coding of speech with phonetic classification.
DE69614789T2 (de) User-selectable multiple threshold criteria for speech recognition
DE60000138D1 (de) Generation of multiple pronunciations of a proper name for speech recognition
IL146985A0 (en) Automatic dynamic speech recognition vocabulary based on external sources of information
DE69922104D1 (de) Speech recognizer with vocabulary adaptable by spelled word input
DE60016722D1 (de) Two-pass speech recognition with restriction of the active vocabulary
NO974097D0 (no) Speech recognition
GB2333877B (en) Method of evaluating an utterance in a speech recognition system
NO972026L (no) Speech recognition
DE60018886D1 (de) Adaptive wavelet extraction for speech recognition
IL132449A0 (en) A vocoder-based voice recognizer
DE10191732T1 (de) Selective speaker adaptation for a vehicle-mounted speech recognition system
DE60115317D1 (de) Cooling structure for the control unit of a vehicle
DE60336102D1 (de) Automatic segmentation in speech synthesis

Legal Events

Date Code Title Description
8364 No opposition during term of opposition
8339 Ceased/non-payment of the annual fee