ATE394773T1 - Method of speech recognition using time-dependent interpolation and hidden dynamic value classes - Google Patents

Method of speech recognition using time-dependent interpolation and hidden dynamic value classes

Info

Publication number
ATE394773T1
Authority
AT
Austria
Prior art keywords
production
hidden
time
value
acoustics
Prior art date
Application number
AT03014848T
Other languages
German (de)
English (en)
Inventor
Li Deng
Jian-Lai Zhou
Frank Torsten Seide
Asela Gunawardana
Hagai Attias
Alejandro Acero
Xuedong D Huang
Original Assignee
Microsoft Corp
Priority date
2002-07-23
Filing date
2003-06-30
Publication date
2008-05-15
Application filed by Microsoft Corp
Application granted
Publication of ATE394773T1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/12 Speech classification or search using dynamic programming techniques, e.g. dynamic time warping [DTW]
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G10L2015/025 Phonemes, fenemes or fenones being the recognition units

Landscapes

  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Image Analysis (AREA)
  • Machine Translation (AREA)
  • Noise Elimination (AREA)
  • Complex Calculations (AREA)
AT03014848T 2002-07-23 2003-06-30 Method of speech recognition using time-dependent interpolation and hidden dynamic value classes ATE394773T1 (de)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US39816602P 2002-07-23 2002-07-23
US40597102P 2002-08-26 2002-08-26
US10/267,522 US7050975B2 (en) 2002-07-23 2002-10-09 Method of speech recognition using time-dependent interpolation and hidden dynamic value classes

Publications (1)

Publication Number Publication Date
ATE394773T1 (de) 2008-05-15

Family

ID=30003734

Family Applications (1)

Application Number Publication Priority Date Filing Date Title
AT03014848T ATE394773T1 (de) 2002-07-23 2003-06-30 Method of speech recognition using time-dependent interpolation and hidden dynamic value classes

Country Status (5)

Country Link
US (2) US7050975B2
EP (1) EP1385147B1
JP (1) JP4515054B2
AT (1) ATE394773T1
DE (1) DE60320719D1

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7209881B2 (en) * 2001-12-20 2007-04-24 Matsushita Electric Industrial Co., Ltd. Preparing acoustic models by sufficient statistics and noise-superimposed speech data
US7174292B2 (en) * 2002-05-20 2007-02-06 Microsoft Corporation Method of determining uncertainty associated with acoustic distortion-based noise reduction
US7103540B2 (en) * 2002-05-20 2006-09-05 Microsoft Corporation Method of pattern recognition using noise reduction uncertainty
US7050975B2 (en) * 2002-07-23 2006-05-23 Microsoft Corporation Method of speech recognition using time-dependent interpolation and hidden dynamic value classes
FR2846458B1 (fr) * 2002-10-25 2005-02-25 France Telecom Method for automatic processing of a speech signal
US9117460B2 (en) * 2004-05-12 2015-08-25 Core Wireless Licensing S.A.R.L. Detection of end of utterance in speech recognition system
US7409346B2 (en) * 2004-11-05 2008-08-05 Microsoft Corporation Two-stage implementation for phonetic recognition using a bi-directional target-filtering model of speech coarticulation and reduction
US7565284B2 (en) * 2004-11-05 2009-07-21 Microsoft Corporation Acoustic models with structured hidden dynamics with integration over many possible hidden trajectories
US7519531B2 (en) * 2005-03-30 2009-04-14 Microsoft Corporation Speaker adaptive learning of resonance targets in a hidden trajectory model of speech coarticulation
US7805301B2 (en) * 2005-07-01 2010-09-28 Microsoft Corporation Covariance estimation for pattern recognition
US7653535B2 (en) 2005-12-15 2010-01-26 Microsoft Corporation Learning statistically characterized resonance targets in a hidden trajectory model
US8010356B2 (en) * 2006-02-17 2011-08-30 Microsoft Corporation Parameter learning in a hidden trajectory model
US7877256B2 (en) * 2006-02-17 2011-01-25 Microsoft Corporation Time synchronous decoding for long-span hidden trajectory model
US7805308B2 (en) * 2007-01-19 2010-09-28 Microsoft Corporation Hidden trajectory modeling with differential cepstra for speech recognition
US9020816B2 (en) * 2008-08-14 2015-04-28 21Ct, Inc. Hidden markov model for speech processing with training method
US9009039B2 (en) * 2009-06-12 2015-04-14 Microsoft Technology Licensing, Llc Noise adaptive training for speech recognition
EP2539888B1 (en) 2010-02-22 2015-05-20 Nuance Communications, Inc. Online maximum-likelihood mean and variance normalization for speech recognition
TWI442384B (zh) * 2011-07-26 2014-06-21 Ind Tech Res Inst Microphone-array-based speech recognition system and method
JP6301664B2 (ja) 2014-01-31 2018-03-28 株式会社東芝 Conversion device, pattern recognition system, conversion method, and program
US9953646B2 (en) 2014-09-02 2018-04-24 Belleau Technologies Method and system for dynamic speech recognition and tracking of prewritten script
US10354642B2 (en) * 2017-03-03 2019-07-16 Microsoft Technology Licensing, Llc Hyperarticulation detection in repetitive voice queries using pairwise comparison for improved speech recognition
JP6599914B2 (ja) * 2017-03-09 2019-10-30 株式会社東芝 Speech recognition device, speech recognition method, and program

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4980917A (en) * 1987-11-18 1990-12-25 Emerson & Stern Associates, Inc. Method and apparatus for determining articulatory parameters from speech data
JP2986345B2 (ja) * 1993-10-18 1999-12-06 インターナショナル・ビジネス・マシーンズ・コーポレイション Speech recording indexing apparatus and method
GB2290684A (en) * 1994-06-22 1996-01-03 Ibm Speech synthesis using hidden Markov model to determine speech unit durations
JPH0895592A (ja) * 1994-09-21 1996-04-12 Nippon Telegr & Teleph Corp <Ntt> Pattern recognition method
JPH0822296A (ja) * 1994-07-07 1996-01-23 Nippon Telegr & Teleph Corp <Ntt> Pattern recognition method
US5937384A (en) * 1996-05-01 1999-08-10 Microsoft Corporation Method and system for speech recognition using continuous density hidden Markov models
US7050975B2 (en) * 2002-07-23 2006-05-23 Microsoft Corporation Method of speech recognition using time-dependent interpolation and hidden dynamic value classes

Also Published As

Publication number Publication date
US20060085191A1 (en) 2006-04-20
JP4515054B2 (ja) 2010-07-28
JP2004054298A (ja) 2004-02-19
US7206741B2 (en) 2007-04-17
EP1385147B1 (en) 2008-05-07
US7050975B2 (en) 2006-05-23
EP1385147A2 (en) 2004-01-28
EP1385147A3 (en) 2005-04-20
DE60320719D1 (de) 2008-06-19
US20040019483A1 (en) 2004-01-29

Similar Documents

Publication Publication Date Title
ATE394773T1 (de) Method of speech recognition using time-dependent interpolation and hidden dynamic value classes
WO2004100638A3 (en) Source-dependent text-to-speech system
US10176797B2 (en) Voice synthesis method, voice synthesis device, medium for storing voice synthesis program
ATE297588T1 (de) Adaptation of the phonetic context to improve speech recognition
WO2002073595A1 (en) Prosody generating device, prosody generating method, and program
US20010021906A1 (en) Intonation control method for text-to-speech conversion
DE602004015973D1 (de) Phonetic-based speech recognition system and method
SE9502202L (sv) Method for speech-to-text conversion
KR20150024180A (ko) Pronunciation correction apparatus and method
KR20030046444A (ko) Emotion detection method, sensibility generation method, and device and software therefor
EP1465154A3 (en) Method of speech recognition using variational inference with switching state space models
EP1675101A3 (en) Singing voice-synthesizing method and apparatus and storage medium
ATE456417T1 (de) Method for stabilizing the welding arc
DE602005024497D1 (de) Hidden conditional random field models for phonetic classification and speech recognition
EP1378885A3 (en) Word-spotting apparatus, word-spotting method, and word-spotting program
ATE366912T1 (de) Method and device for speech output, data carrier with speech data
JP2008241890A (ja) Voice dialogue apparatus and method
EP0982684A4 (en) MOTION IMAGE GENERATION DEVICE AND LEARNING DEVICE VIA IMAGE CONTROL NETWORK
Venn Proliferations and Limitations: Berio’s Reworking of the Sequenzas
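JP4232254B2 (ja) Speech synthesis apparatus, rule-based speech synthesis method, and storage medium
ATE453101T1 (de) Method for displaying place names
ATE382179T1 (de) Method for dialogue control and dialogue system operating according to it
JP3709436B2 (ja) Apparatus for creating fine-segment acoustic models for speech recognition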
Ceyssens et al. A strategy for pitch conversion and its evaluation
ATE331275T1 (de) Method and device for generating voice announcements

Legal Events

Date Code Title Description
RER Ceased as to paragraph 5 lit. 3 law introducing patent treaties