DE60320719D1 - Method of speech recognition using time-dependent interpolation and hidden dynamic value classes

Publication number
DE60320719D1
Authority
DE
Germany
Prior art keywords
production
hidden
time
speech recognition
value
Legal status
Expired - Lifetime
Application number
DE60320719T
Inventor
Li Deng
Jian-Lai Zhou
Frank Torsten Seide
Asela J R Gunawardana
Hagai Attias
Alejandro Acero
Xuedong D Huang
Current Assignee
Microsoft Corp
Original Assignee
Microsoft Corp
Priority date
2002-07-23
Filing date
2003-06-30
Publication date
2008-06-19
Application filed by Microsoft Corp

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/12 Speech classification or search using dynamic programming techniques, e.g. dynamic time warping [DTW]
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G10L2015/025 Phonemes, fenemes or fenones being the recognition units

Landscapes

  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Image Analysis (AREA)
  • Machine Translation (AREA)
  • Noise Elimination (AREA)
  • Complex Calculations (AREA)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US39816602P 2002-07-23 2002-07-23
US40597102P 2002-08-26 2002-08-26
US10/267,522 US7050975B2 (en) 2002-07-23 2002-10-09 Method of speech recognition using time-dependent interpolation and hidden dynamic value classes

Publications (1)

Publication Number Publication Date
DE60320719D1 (de) 2008-06-19

Family

ID=30003734

Family Applications (1)

Application Number Title Priority Date Filing Date
DE60320719T Expired - Lifetime DE60320719D1 (de) 2002-07-23 2003-06-30 Method of speech recognition using time-dependent interpolation and hidden dynamic value classes

Country Status (5)

Country Link
US (2) US7050975B2 (en)
EP (1) EP1385147B1 (en)
JP (1) JP4515054B2 (ja)
AT (1) ATE394773T1 (de)
DE (1) DE60320719D1 (de)

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7209881B2 (en) * 2001-12-20 2007-04-24 Matsushita Electric Industrial Co., Ltd. Preparing acoustic models by sufficient statistics and noise-superimposed speech data
US7174292B2 (en) * 2002-05-20 2007-02-06 Microsoft Corporation Method of determining uncertainty associated with acoustic distortion-based noise reduction
US7103540B2 (en) * 2002-05-20 2006-09-05 Microsoft Corporation Method of pattern recognition using noise reduction uncertainty
US7050975B2 (en) * 2002-07-23 2006-05-23 Microsoft Corporation Method of speech recognition using time-dependent interpolation and hidden dynamic value classes
FR2846458B1 (fr) * 2002-10-25 2005-02-25 France Telecom Method for automatic processing of a speech signal.
US9117460B2 (en) * 2004-05-12 2015-08-25 Core Wireless Licensing S.A.R.L. Detection of end of utterance in speech recognition system
US7409346B2 (en) * 2004-11-05 2008-08-05 Microsoft Corporation Two-stage implementation for phonetic recognition using a bi-directional target-filtering model of speech coarticulation and reduction
US7565284B2 (en) * 2004-11-05 2009-07-21 Microsoft Corporation Acoustic models with structured hidden dynamics with integration over many possible hidden trajectories
US7519531B2 (en) * 2005-03-30 2009-04-14 Microsoft Corporation Speaker adaptive learning of resonance targets in a hidden trajectory model of speech coarticulation
US7805301B2 (en) * 2005-07-01 2010-09-28 Microsoft Corporation Covariance estimation for pattern recognition
US7653535B2 (en) 2005-12-15 2010-01-26 Microsoft Corporation Learning statistically characterized resonance targets in a hidden trajectory model
US8010356B2 (en) * 2006-02-17 2011-08-30 Microsoft Corporation Parameter learning in a hidden trajectory model
US7877256B2 (en) * 2006-02-17 2011-01-25 Microsoft Corporation Time synchronous decoding for long-span hidden trajectory model
US7805308B2 (en) * 2007-01-19 2010-09-28 Microsoft Corporation Hidden trajectory modeling with differential cepstra for speech recognition
US9020816B2 (en) * 2008-08-14 2015-04-28 21Ct, Inc. Hidden markov model for speech processing with training method
US9009039B2 (en) * 2009-06-12 2015-04-14 Microsoft Technology Licensing, Llc Noise adaptive training for speech recognition
EP2539888B1 (en) 2010-02-22 2015-05-20 Nuance Communications, Inc. Online maximum-likelihood mean and variance normalization for speech recognition
TWI442384B (zh) * 2011-07-26 2014-06-21 Ind Tech Res Inst Microphone-array-based speech recognition system and method
JP6301664B2 (ja) 2014-01-31 2018-03-28 株式会社東芝 Conversion device, pattern recognition system, conversion method, and program
US9953646B2 (en) 2014-09-02 2018-04-24 Belleau Technologies Method and system for dynamic speech recognition and tracking of prewritten script
US10354642B2 (en) * 2017-03-03 2019-07-16 Microsoft Technology Licensing, Llc Hyperarticulation detection in repetitive voice queries using pairwise comparison for improved speech recognition
JP6599914B2 (ja) * 2017-03-09 2019-10-30 株式会社東芝 Speech recognition device, speech recognition method, and program

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4980917A (en) * 1987-11-18 1990-12-25 Emerson & Stern Associates, Inc. Method and apparatus for determining articulatory parameters from speech data
JP2986345B2 (ja) * 1993-10-18 1999-12-06 インターナショナル・ビジネス・マシーンズ・コーポレイション Apparatus and method for indexing speech recordings
GB2290684A (en) * 1994-06-22 1996-01-03 Ibm Speech synthesis using hidden Markov model to determine speech unit durations
JPH0895592A (ja) * 1994-09-21 1996-04-12 Nippon Telegr & Teleph Corp <Ntt> Pattern recognition method
JPH0822296A (ja) * 1994-07-07 1996-01-23 Nippon Telegr & Teleph Corp <Ntt> Pattern recognition method
US5937384A (en) * 1996-05-01 1999-08-10 Microsoft Corporation Method and system for speech recognition using continuous density hidden Markov models
US7050975B2 (en) * 2002-07-23 2006-05-23 Microsoft Corporation Method of speech recognition using time-dependent interpolation and hidden dynamic value classes

Also Published As

Publication number Publication date
US20060085191A1 (en) 2006-04-20
JP4515054B2 (ja) 2010-07-28
JP2004054298A (ja) 2004-02-19
US7206741B2 (en) 2007-04-17
EP1385147B1 (en) 2008-05-07
ATE394773T1 (de) 2008-05-15
US7050975B2 (en) 2006-05-23
EP1385147A2 (en) 2004-01-28
EP1385147A3 (en) 2005-04-20
US20040019483A1 (en) 2004-01-29

Similar Documents

Publication Publication Date Title
DE60320719D1 (de) Method of speech recognition using time-dependent interpolation and hidden dynamic value classes
WO2004100638A3 (en) Source-dependent text-to-speech system
US10176797B2 (en) Voice synthesis method, voice synthesis device, medium for storing voice synthesis program
ATE297588T1 (de) Adapting the phonetic context to improve speech recognition
ATE445896T1 (de) Method of speech recognition using variational inference with switching state space models
EP1629464A4 (en) LANGUAGE RECOGNITION SYSTEM AND PHONETIC BASIC PROCEDURE
EP1811497A3 (en) Apparatus and method for voice conversion
EP1675101A3 (en) Singing voice-synthesizing method and apparatus and storage medium
WO2008142836A1 (ja) Voice quality conversion device and voice quality conversion method
WO2004063902A3 (en) Speech training method with color instruction
WO2004090834A3 (en) Adaptive engine logic used in training academic proficiency
DE60213195D1 (de) Method, system and computer program for speech/speaker recognition using an emotional state change for unsupervised adaptation of the recognition method
SE9502202L (sv) Method for speech-to-text conversion
ATE387703T1 (de) Selection of a piece of music based on metadata and an external tempo input
ATE456417T1 (de) Method for stabilizing the welding arc
DE60300027D1 (de) Method of manufacturing an article by diffusion bonding and superplastic forming
ATE487212T1 (de) Hidden conditional random field models for phonetic classification and speech recognition
EP1378885A3 (en) Word-spotting apparatus, word-spotting method, and word-spotting program
ATE366912T1 (de) Method and device for speech output, data carrier with speech data
Venn Proliferations and Limitations: Berio’s Reworking of the Sequenzas
ATE453101T1 (de) Method for displaying place names
ATE382179T1 (de) Method for dialog control and dialog system operating according to it
DE50206293D1 (de) Conveyor belt and method for producing such a belt
ATE331275T1 (de) Method and device for generating voice announcements
ATE427624T1 (de) Generation of test sequences for speech quality assessment

Legal Events

Date Code Title Description
8364 No opposition during term of opposition