DE69707876D1 - Verfahren und vorrichtung fuer dynamisch eingestelltes training zur spracherkennung - Google Patents

Verfahren und vorrichtung fuer dynamisch eingestelltes training zur spracherkennung

Info

Publication number
DE69707876D1
DE69707876D1 DE69707876T DE69707876T DE69707876D1 DE 69707876 D1 DE69707876 D1 DE 69707876D1 DE 69707876 T DE69707876 T DE 69707876T DE 69707876 T DE69707876 T DE 69707876T DE 69707876 D1 DE69707876 D1 DE 69707876D1
Authority
DE
Germany
Prior art keywords
voice recognition
dynamically set
set training
training
dynamically
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
DE69707876T
Other languages
English (en)
Other versions
DE69707876T2 (de
Inventor
Hsiao-Wuen Hon
D Huang
Mei-Yuh Hwang
Li Jiang
Yun-Cheng Ju
V Mahajan
J Rozak
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Corp
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Corp filed Critical Microsoft Corp
Publication of DE69707876D1 publication Critical patent/DE69707876D1/de
Application granted granted Critical
Publication of DE69707876T2 publication Critical patent/DE69707876T2/de
Anticipated expiration legal-status Critical
Expired - Lifetime legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/06Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063Training
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/06Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063Training
    • G10L2015/0635Training updating or merging of old and new templates; Mean values; Weighting

Landscapes

  • Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)
  • Electrically Operated Instructional Devices (AREA)
DE69707876T 1996-06-28 1997-06-27 Verfahren und vorrichtung fuer dynamisch eingestelltes training zur spracherkennung Expired - Lifetime DE69707876T2 (de)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US08/673,435 US5963903A (en) 1996-06-28 1996-06-28 Method and system for dynamically adjusted training for speech recognition
PCT/US1997/011683 WO1998000834A1 (en) 1996-06-28 1997-06-27 Method and system for dynamically adjusted training for speech recognition

Publications (2)

Publication Number Publication Date
DE69707876D1 true DE69707876D1 (de) 2001-12-06
DE69707876T2 DE69707876T2 (de) 2002-04-11

Family

ID=24702640

Family Applications (1)

Application Number Title Priority Date Filing Date
DE69707876T Expired - Lifetime DE69707876T2 (de) 1996-06-28 1997-06-27 Verfahren und vorrichtung fuer dynamisch eingestelltes training zur spracherkennung

Country Status (6)

Country Link
US (1) US5963903A (de)
EP (1) EP0907949B1 (de)
JP (1) JP3672573B2 (de)
CN (1) CN1165887C (de)
DE (1) DE69707876T2 (de)
WO (1) WO1998000834A1 (de)

Families Citing this family (64)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6807537B1 (en) * 1997-12-04 2004-10-19 Microsoft Corporation Mixtures of Bayesian networks
US6418431B1 (en) * 1998-03-30 2002-07-09 Microsoft Corporation Information retrieval and speech recognition based on language models
US6085160A (en) * 1998-07-10 2000-07-04 Lernout & Hauspie Speech Products N.V. Language independent speech recognition
US6260014B1 (en) * 1998-09-14 2001-07-10 International Business Machines Corporation Specific task composite acoustic models
AU777693B2 (en) * 1999-03-05 2004-10-28 Canon Kabushiki Kaisha Database annotation and retrieval
ATE306116T1 (de) * 1999-07-08 2005-10-15 Koninkl Philips Electronics Nv Spracherkennungseinrichtung mit transfermitteln
US6836761B1 (en) * 1999-10-21 2004-12-28 Yamaha Corporation Voice converter for assimilation by frame synthesis with temporal alignment
US6789060B1 (en) * 1999-11-01 2004-09-07 Gene J. Wolfe Network based speech transcription that maintains dynamic templates
US6546387B1 (en) * 1999-11-15 2003-04-08 Transcom Software Inc. Computer network information management system and method using intelligent software agents
WO2001065541A1 (fr) * 2000-02-28 2001-09-07 Sony Corporation Dispositif de reconnaissance de la parole, procede de reconnaissance de la parole et support d'enregistrement
WO2001084535A2 (en) * 2000-05-02 2001-11-08 Dragon Systems, Inc. Error correction in speech recognition
US7031908B1 (en) * 2000-06-01 2006-04-18 Microsoft Corporation Creating a language model for a language processing system
US6865528B1 (en) * 2000-06-01 2005-03-08 Microsoft Corporation Use of a unified language model
US20050135180A1 (en) * 2000-06-30 2005-06-23 Micron Technology, Inc. Interface command architecture for synchronous flash memory
US7216077B1 (en) * 2000-09-26 2007-05-08 International Business Machines Corporation Lattice-based unsupervised maximum likelihood linear regression for speaker adaptation
US6832189B1 (en) 2000-11-15 2004-12-14 International Business Machines Corporation Integration of speech recognition and stenographic services for improved ASR training
US6928409B2 (en) * 2001-05-31 2005-08-09 Freescale Semiconductor, Inc. Speech recognition using polynomial expansion and hidden markov models
EP1288911B1 (de) * 2001-08-08 2005-06-29 Nippon Telegraph and Telephone Corporation Anhebungsdetektion zur automatischen Sprachzusammenfassung
JP4947861B2 (ja) * 2001-09-25 2012-06-06 キヤノン株式会社 自然言語処理装置およびその制御方法ならびにプログラム
US20030088416A1 (en) * 2001-11-06 2003-05-08 D.S.P.C. Technologies Ltd. HMM-based text-to-phoneme parser and method for training same
US20030120493A1 (en) * 2001-12-21 2003-06-26 Gupta Sunil K. Method and system for updating and customizing recognition vocabulary
GB2385698B (en) * 2002-02-26 2005-06-15 Canon Kk Speech processing apparatus and method
JP2004053742A (ja) * 2002-07-17 2004-02-19 Matsushita Electric Ind Co Ltd 音声認識装置
US8321427B2 (en) 2002-10-31 2012-11-27 Promptu Systems Corporation Method and apparatus for generation and augmentation of search terms from external and internal sources
JP3667332B2 (ja) * 2002-11-21 2005-07-06 松下電器産業株式会社 標準モデル作成装置及び標準モデル作成方法
JP2004191705A (ja) * 2002-12-12 2004-07-08 Renesas Technology Corp 音声認識装置
US7146319B2 (en) * 2003-03-31 2006-12-05 Novauris Technologies Ltd. Phonetically based speech recognition system and method
US7302389B2 (en) * 2003-05-14 2007-11-27 Lucent Technologies Inc. Automatic assessment of phonological processes
US20040230431A1 (en) * 2003-05-14 2004-11-18 Gupta Sunil K. Automatic assessment of phonological processes for speech therapy and language instruction
US7373294B2 (en) * 2003-05-15 2008-05-13 Lucent Technologies Inc. Intonation transformation for speech therapy and the like
US8301436B2 (en) * 2003-05-29 2012-10-30 Microsoft Corporation Semantic object synchronous understanding for highly interactive interface
US7200559B2 (en) 2003-05-29 2007-04-03 Microsoft Corporation Semantic object synchronous understanding implemented with speech application language tags
US20040243412A1 (en) * 2003-05-29 2004-12-02 Gupta Sunil K. Adaptation of speech models in speech recognition
US8019602B2 (en) * 2004-01-20 2011-09-13 Microsoft Corporation Automatic speech recognition learning using user corrections
US7809567B2 (en) * 2004-07-23 2010-10-05 Microsoft Corporation Speech recognition application or server using iterative recognition constraints
CN1296887C (zh) * 2004-09-29 2007-01-24 上海交通大学 用于嵌入式自动语音识别系统的训练方法
GB0426347D0 (en) * 2004-12-01 2005-01-05 Ibm Methods, apparatus and computer programs for automatic speech recognition
US7475016B2 (en) * 2004-12-15 2009-01-06 International Business Machines Corporation Speech segment clustering and ranking
GB2428853A (en) 2005-07-22 2007-02-07 Novauris Technologies Ltd Speech recognition application specific dictionary
US7970613B2 (en) 2005-11-12 2011-06-28 Sony Computer Entertainment Inc. Method and system for Gaussian probability data bit reduction and computation
CN101385073A (zh) * 2006-02-14 2009-03-11 知识风险基金21有限责任公司 具有不依赖于说话者的语音识别的通信设备
US8010358B2 (en) * 2006-02-21 2011-08-30 Sony Computer Entertainment Inc. Voice recognition with parallel gender and age normalization
US7778831B2 (en) * 2006-02-21 2010-08-17 Sony Computer Entertainment Inc. Voice recognition with dynamic filter bank adjustment based on speaker categorization determined from runtime pitch
US8135590B2 (en) * 2007-01-11 2012-03-13 Microsoft Corporation Position-dependent phonetic models for reliable pronunciation identification
CN101465123B (zh) * 2007-12-20 2011-07-06 株式会社东芝 说话人认证的验证方法和装置以及说话人认证系统
CN101286317B (zh) * 2008-05-30 2011-07-27 同济大学 语音识别装置、模型训练方法、及交通信息服务平台
US8155961B2 (en) * 2008-12-09 2012-04-10 Nokia Corporation Adaptation of automatic speech recognition acoustic models
US8442829B2 (en) 2009-02-17 2013-05-14 Sony Computer Entertainment Inc. Automatic computation streaming partition for voice recognition on multiple processors with limited memory
US8788256B2 (en) 2009-02-17 2014-07-22 Sony Computer Entertainment Inc. Multiple language voice recognition
US8442833B2 (en) 2009-02-17 2013-05-14 Sony Computer Entertainment Inc. Speech processing with source location estimation using signals from two or more microphones
CN101577118B (zh) * 2009-06-12 2011-05-04 北京大学 面向智能服务机器人的语音交互系统的实现方法
JP5471858B2 (ja) * 2009-07-02 2014-04-16 ヤマハ株式会社 歌唱合成用データベース生成装置、およびピッチカーブ生成装置
GB2489489B (en) * 2011-03-30 2013-08-21 Toshiba Res Europ Ltd A speech processing system and method
US9153235B2 (en) 2012-04-09 2015-10-06 Sony Computer Entertainment Inc. Text dependent speaker recognition with long-term feature based on functional data analysis
US9972306B2 (en) * 2012-08-07 2018-05-15 Interactive Intelligence Group, Inc. Method and system for acoustic data selection for training the parameters of an acoustic model
WO2014052326A2 (en) * 2012-09-25 2014-04-03 Nvoq Incorporated Apparatus and methods for managing resources for a system using voice recognition
TWI508033B (zh) * 2013-04-26 2015-11-11 Wistron Corp 語言學習方法與裝置以及電腦可讀記錄媒體
US20160063990A1 (en) * 2014-08-26 2016-03-03 Honeywell International Inc. Methods and apparatus for interpreting clipped speech using speech recognition
US9953646B2 (en) 2014-09-02 2018-04-24 Belleau Technologies Method and system for dynamic speech recognition and tracking of prewritten script
WO2016114428A1 (ko) 2015-01-16 2016-07-21 삼성전자 주식회사 문법 모델을 이용하여 음성인식을 수행하는 방법 및 디바이스
US10229672B1 (en) 2015-12-31 2019-03-12 Google Llc Training acoustic models using connectionist temporal classification
GB2552723A (en) * 2016-08-03 2018-02-07 Cirrus Logic Int Semiconductor Ltd Speaker recognition
CN110428819B (zh) * 2019-05-21 2020-11-24 腾讯科技(深圳)有限公司 解码网络生成方法、语音识别方法、装置、设备及介质
CN111105799B (zh) * 2019-12-09 2023-07-07 国网浙江省电力有限公司杭州供电公司 基于发音量化和电力专用词库的离线语音识别装置及方法

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4783803A (en) * 1985-11-12 1988-11-08 Dragon Systems, Inc. Speech recognition apparatus and method
US4866778A (en) * 1986-08-11 1989-09-12 Dragon Systems, Inc. Interactive speech recognition apparatus
JPH01102599A (ja) * 1987-10-12 1989-04-20 Internatl Business Mach Corp <Ibm> 音声認識方法
US5027406A (en) * 1988-12-06 1991-06-25 Dragon Systems, Inc. Method for interactive speech recognition and training
US5182773A (en) * 1991-03-22 1993-01-26 International Business Machines Corporation Speaker-independent label coding apparatus
US5465318A (en) * 1991-03-28 1995-11-07 Kurzweil Applied Intelligence, Inc. Method for generating a speech recognition model for a non-vocabulary utterance
US5390278A (en) * 1991-10-08 1995-02-14 Bell Canada Phoneme based speech recognition
US5278942A (en) * 1991-12-05 1994-01-11 International Business Machines Corporation Speech coding apparatus having speaker dependent prototypes generated from nonuser reference data
US5293584A (en) * 1992-05-21 1994-03-08 International Business Machines Corporation Speech recognition system for natural language translation
US5682464A (en) * 1992-06-29 1997-10-28 Kurzweil Applied Intelligence, Inc. Word model candidate preselection for speech recognition using precomputed matrix of thresholded distance values
US5455889A (en) * 1993-02-08 1995-10-03 International Business Machines Corporation Labelling speech using context-dependent acoustic prototypes
IT1270919B (it) * 1993-05-05 1997-05-16 Cselt Centro Studi Lab Telecom Sistema per il riconoscimento di parole isolate indipendente dal parlatore mediante reti neurali
GB2290684A (en) * 1994-06-22 1996-01-03 Ibm Speech synthesis using hidden Markov model to determine speech unit durations
JP3581401B2 (ja) * 1994-10-07 2004-10-27 キヤノン株式会社 音声認識方法
US5715367A (en) * 1995-01-23 1998-02-03 Dragon Systems, Inc. Apparatuses and methods for developing and using models for speech recognition

Also Published As

Publication number Publication date
CN1223739A (zh) 1999-07-21
JP2000514206A (ja) 2000-10-24
JP3672573B2 (ja) 2005-07-20
DE69707876T2 (de) 2002-04-11
WO1998000834A1 (en) 1998-01-08
EP0907949A1 (de) 1999-04-14
US5963903A (en) 1999-10-05
EP0907949B1 (de) 2001-10-31
CN1165887C (zh) 2004-09-08

Similar Documents

Publication Publication Date Title
DE69707876D1 (de) Verfahren und vorrichtung fuer dynamisch eingestelltes training zur spracherkennung
DE69717899D1 (de) Verfahren und Vorrichtung zur Spracherkennung
DE69828141D1 (de) Verfahren und Vorrichtung zur Spracherkennung
DE69806557D1 (de) Verfahren und Vorrichtung zur Spracherkennung
DE69519840D1 (de) Einrichtung und Verfahren zur Spracherkennung
DE69518705D1 (de) Verfahren und Vorrichtung zur Spracherkennung
DE69524829D1 (de) Verfahren und Vorrichtung zur Spracherkennung
DE69726235D1 (de) Verfahren und Vorrichtung zur Spracherkennung
DE59707384D1 (de) Verfahren und Vorrichtung zur Spracherkennung
DE69730930D1 (de) Verfahren und Gerät zur Zeichenerkennung
DE69923253D1 (de) Verfahren und Vorrichtung zur Spracherkennung
DE69420400D1 (de) Verfahren und gerät zur sprechererkennung
DE69732156D1 (de) Verfahren und Gerät zur Zeichenerkennung
DE69830017D1 (de) Verfahren und Vorrichtung zur Spracherkennung
DE69817844D1 (de) Verfahren und vorrichtung zur spracherkennungscomputereingabe
DE69831991D1 (de) Verfahren und Vorrichtung zur Sprachdetektion
DE69524677D1 (de) Gerät und Verfahren zur Bilderkennung
DE69725106D1 (de) Verfahren und Vorrichtung zur Spracherkennung mit Rauschadaptierung
DE69625950D1 (de) Verfahren und Vorrichtung zur Spracherkennung und Übersetzungssystem
DE69324629D1 (de) Verfahren und Vorrichtung zur Spracherkennung
DE69727895D1 (de) Verfahren und Vorrichtung zur Sprachkodierung
DE69421324D1 (de) Verfahren und Vorrichtung zur Sprachkommunikation
DE69631728D1 (de) Verfahren und Vorrichtung zur Sprachkodierung
DE69428475D1 (de) Verfahren und Gerät zur automatischen Spracherkennung
DE69710525D1 (de) Verfahren und Vorrichtung zur Sprachsynthese

Legal Events

Date Code Title Description
8364 No opposition during term of opposition