WO2009025356A1 - Speech recognition device and speech recognition method - Google Patents
Speech recognition device and speech recognition method
- Publication number
- WO2009025356A1 (PCT/JP2008/065008; JP2008065008W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- word
- unit
- score
- tone
- search unit
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1807—Speech classification or search using natural language modelling using prosody or stress
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G10L2015/025—Phonemes, fenemes or fenones being the recognition units
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/15—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being formant information
Abstract
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2009529074A JP5282737B2 (ja) | 2007-08-22 | 2008-08-22 | Speech recognition device and speech recognition method |
US12/672,015 US8315870B2 (en) | 2007-08-22 | 2008-08-22 | Rescoring speech recognition hypothesis using prosodic likelihood |
CN2008801035918A CN101785051B (zh) | 2007-08-22 | 2008-08-22 | Speech recognition device and speech recognition method |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2007-215958 | 2007-08-22 | ||
JP2007215958 | 2007-08-22 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2009025356A1 (ja) | 2009-02-26 |
Family
ID=40378256
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2008/065008 WO2009025356A1 (ja) | 2007-08-22 | 2008-08-22 | Speech recognition device and speech recognition method |
Country Status (4)
Country | Link |
---|---|
US (1) | US8315870B2 (ja) |
JP (1) | JP5282737B2 (ja) |
CN (1) | CN101785051B (ja) |
WO (1) | WO2009025356A1 (ja) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2233110A1 (en) | 2009-03-24 | 2010-09-29 | orangedental GmbH & Co. KG | Methods and apparatus to determine distances for use in dentistry |
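CN102254556A (zh) * | 2010-05-17 | 2011-11-23 | 阿瓦雅公司 | Estimating a listener's ability to understand a speaker based on a comparison of the listener's and speaker's speaking styles |
CN102938252A (zh) * | 2012-11-23 | 2013-02-20 | 中国科学院自动化研究所 | System and method for Chinese tone recognition combining prosodic and articulatory features |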
Families Citing this family (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102237081B (zh) * | 2010-04-30 | 2013-04-24 | 国际商业机器公司 | Speech prosody evaluation method and system |
US10002608B2 (en) * | 2010-09-17 | 2018-06-19 | Nuance Communications, Inc. | System and method for using prosody for voice-enabled search |
US8401853B2 (en) | 2010-09-22 | 2013-03-19 | At&T Intellectual Property I, L.P. | System and method for enhancing voice-enabled search based on automated demographic identification |
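JP5179559B2 (ja) * | 2010-11-12 | 2013-04-10 | シャープ株式会社 | Control device for controlling an image processing system, image forming apparatus, image reading apparatus, control method, image processing program, and computer-readable recording medium |
JP5716595B2 (ja) * | 2011-01-28 | 2015-05-13 | 富士通株式会社 | Voice correction device, voice correction method, and voice correction program |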
US9317605B1 (en) | 2012-03-21 | 2016-04-19 | Google Inc. | Presenting forked auto-completions |
TWI557722B (zh) * | 2012-11-15 | 2016-11-11 | 緯創資通股份有限公司 | Method and system for filtering out voice interference, and computer-readable recording medium |
WO2014167570A1 (en) * | 2013-04-10 | 2014-10-16 | Technologies For Voice Interface | System and method for extracting and using prosody features |
US9251202B1 (en) * | 2013-06-25 | 2016-02-02 | Google Inc. | Corpus specific queries for corpora from search query |
US9646606B2 (en) | 2013-07-03 | 2017-05-09 | Google Inc. | Speech recognition using domain knowledge |
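CN103474061A (zh) * | 2013-09-12 | 2013-12-25 | 河海大学 | Automatic Chinese dialect identification method based on classifier fusion |
CN105632499B (zh) * | 2014-10-31 | 2019-12-10 | 株式会社东芝 | Method and apparatus for optimizing speech recognition results |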
US9824684B2 (en) * | 2014-11-13 | 2017-11-21 | Microsoft Technology Licensing, Llc | Prediction-based sequence recognition |
CN104464751B (zh) * | 2014-11-21 | 2018-01-16 | 科大讯飞股份有限公司 | Method and device for detecting pronunciation prosody problems |
US9953644B2 (en) | 2014-12-01 | 2018-04-24 | At&T Intellectual Property I, L.P. | Targeted clarification questions in speech recognition with concept presence score and concept correctness score |
WO2016103358A1 (ja) * | 2014-12-24 | 2016-06-30 | 三菱電機株式会社 | Speech recognition device and speech recognition method |
US9754580B2 (en) | 2015-10-12 | 2017-09-05 | Technologies For Voice Interface | System and method for extracting and using prosody features |
CN105869624B (zh) | 2016-03-29 | 2019-05-10 | 腾讯科技(深圳)有限公司 | Method and device for constructing a speech decoding network in digital speech recognition |
US10607601B2 (en) * | 2017-05-11 | 2020-03-31 | International Business Machines Corporation | Speech recognition by selecting and refining hot words |
TW201921336A (zh) * | 2017-06-15 | 2019-06-01 | 大陸商北京嘀嘀無限科技發展有限公司 | Systems and methods for speech recognition |
CN109145281B (zh) * | 2017-06-15 | 2020-12-25 | 北京嘀嘀无限科技发展有限公司 | Speech recognition method, device, and storage medium |
EP3823306B1 (en) * | 2019-11-15 | 2022-08-24 | Sivantos Pte. Ltd. | A hearing system comprising a hearing instrument and a method for operating the hearing instrument |
CN111862954B (zh) * | 2020-05-29 | 2024-03-01 | 北京捷通华声科技股份有限公司 | Method and device for obtaining a speech recognition model |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS63165900A (ja) * | 1986-12-27 | 1988-07-09 | 沖電気工業株式会社 | Conversational speech recognition system |
JPH04128899A (ja) * | 1990-09-20 | 1992-04-30 | Fujitsu Ltd | Speech recognition device |
JPH07261778A (ja) * | 1994-03-22 | 1995-10-13 | Canon Inc | Speech information processing method and apparatus |
JP2001282282A (ja) * | 2000-03-31 | 2001-10-12 | Canon Inc | Speech information processing method and apparatus, and storage medium |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0758839B2 (ja) | 1987-09-05 | 1995-06-21 | ティーディーケイ株式会社 | Electronic component insertion head |
JP2946219B2 (ja) | 1989-11-22 | 1999-09-06 | 九州日立マクセル株式会社 | Printing plate for screen printing |
SE514684C2 (sv) * | 1995-06-16 | 2001-04-02 | Telia Ab | Method for speech-to-text conversion |
US5806031A (en) * | 1996-04-25 | 1998-09-08 | Motorola | Method and recognizer for recognizing tonal acoustic sound signals |
JP3006677B2 (ja) * | 1996-10-28 | 2000-02-07 | 日本電気株式会社 | Speech recognition device |
US6253178B1 (en) * | 1997-09-22 | 2001-06-26 | Nortel Networks Limited | Search and rescoring method for a speech recognition system |
JP2003514260A (ja) * | 1999-11-11 | 2003-04-15 | コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ | Tonal features for speech recognition |
US7043430B1 (en) * | 1999-11-23 | 2006-05-09 | Infotalk Corporation Limitied | System and method for speech recognition using tonal modeling |
CN1180398C (zh) * | 2000-05-26 | 2004-12-15 | 封家麒 | Speech recognition method and system |
US6510410B1 (en) * | 2000-07-28 | 2003-01-21 | International Business Machines Corporation | Method and apparatus for recognizing tone languages using pitch information |
AU2000276402A1 (en) * | 2000-09-30 | 2002-04-15 | Intel Corporation | Method, apparatus, and system for bottom-up tone integration to chinese continuous speech recognition system |
JP4353202B2 (ja) * | 2006-05-25 | 2009-10-28 | ソニー株式会社 | Prosody identification device and method, and speech recognition device and method |
2008
- 2008-08-22: CN application CN2008801035918A, patent CN101785051B (zh), not active (Expired - Fee Related)
- 2008-08-22: US application US12/672,015, patent US8315870B2 (en), active
- 2008-08-22: JP application JP2009529074A, patent JP5282737B2 (ja), active
- 2008-08-22: WO application PCT/JP2008/065008, publication WO2009025356A1 (ja), active (Application Filing)
Non-Patent Citations (2)
Title |
---|
ONODERA S. ET AL: "Multipath hoshiki o mochiita zatsuon kankyoka deno tango onsei ninshiki -accent joho no riyo-", THE ACOUSTICAL SOCIETY OF JAPAN 2004 SHUNKI KENKYU HAPPYOKAI KOEN RONBUNSHU, 17 March 2004 (2004-03-17), pages 161 - 162 * |
ZHAO LI ET AL: "3 Jigen viterbi-ho o mochiita onso joho to oncho joho no togo ni yoru chugokugo renzoku onsei ninshiki", THE JOURNAL OF THE ACOUSTICAL SOCIETY OF JAPAN, vol. 54, no. 7, 1 July 1998 (1998-07-01), pages 497 - 505 * |
Also Published As
Publication number | Publication date |
---|---|
CN101785051A (zh) | 2010-07-21 |
JP5282737B2 (ja) | 2013-09-04 |
US20110196678A1 (en) | 2011-08-11 |
US8315870B2 (en) | 2012-11-20 |
JPWO2009025356A1 (ja) | 2010-11-25 |
CN101785051B (zh) | 2012-09-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2009025356A1 (ja) | Speech recognition device and speech recognition method | |
TW200638337A (en) | Using a spoken utterance for disambiguation of spelling inputs into a speech recognition system | |
ATE524777T1 (de) | Automatic updating of a language model | |
ATE395685T1 (de) | Speech recognition by word-in-phrase command | |
WO2006023631A3 (en) | Document transcription system training | |
WO2008073850A3 (en) | Method and apparatus for reading education | |
WO2008087934A1 (ja) | Extended recognition dictionary learning device and speech recognition system | |
ATE362633T1 (de) | Learning the pronunciation of new words using a pronunciation graph | |
WO2007118020A3 (en) | Method and system for managing pronunciation dictionaries in a speech application | |
ATE404967T1 (de) | Text-to-speech system and method, and computer program therefor | |
TW200601263A (en) | Apparatus and method for synthesized audible response to an utterance in speaker-independent voice recognition | |
EP4318463A3 (en) | Multi-modal input on an electronic device | |
WO2007034478A3 (en) | System and method for correcting speech | |
WO2009006081A3 (en) | Pronunciation correction of text-to-speech systems between different spoken languages | |
WO2008142836A1 (ja) | Voice quality conversion device and voice quality conversion method | |
WO2007117814A3 (en) | Voice signal perturbation for speech recognition | |
AU2001250579A1 (en) | Discriminatively trained mixture models in continuous speech recognition | |
ATE405920T1 (de) | Generating a speech recognition grammar for alphanumeric expressions | |
WO2009035825A3 (en) | Automatic reading tutoring | |
ATE514162T1 (de) | Dynamic generation of contexts for speech recognition | |
DE602004024172D1 (de) | Automatic generation of a word pronunciation for speech recognition | |
Jang | Speech rhythm metrics for automatic scoring of English speech by Korean EFL learners | |
JP2004271895A (ja) | Multilingual speech recognition system and pronunciation learning system | |
JP2012255867A (ja) | Speech recognition device | |
Elmahdy et al. | A baseline speech recognition system for levantine colloquial arabic |
Legal Events
Code | Title | Description |
---|---|---|
WWE | WIPO information: entry into national phase | Ref document number: 200880103591.8; Country of ref document: CN |
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 08827744; Country of ref document: EP; Kind code of ref document: A1 |
WWE | WIPO information: entry into national phase | Ref document number: 2009529074; Country of ref document: JP |
NENP | Non-entry into the national phase | Ref country code: DE |
WWE | WIPO information: entry into national phase | Ref document number: 12672015; Country of ref document: US |
122 | EP: PCT application non-entry in European phase | Ref document number: 08827744; Country of ref document: EP; Kind code of ref document: A1 |