JP4657736B2 - System and method for automatic speech recognition learning using user corrections
- Publication number
- JP4657736B2 (application number JP2005010922A)
- Authority
- JP
- Japan
- Prior art keywords
- pronunciation
- word
- user
- lexicon
- corrected word
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B01—PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
- B01D—SEPARATION
- B01D35/00—Filtering devices having features not specifically covered by groups B01D24/00 - B01D33/00, or for applications not specifically covered by groups B01D24/00 - B01D33/00; Auxiliary devices for filtration; Filter housing constructions
- B01D35/30—Filter housing constructions
- B01D35/306—Filter mounting adapter
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B01—PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
- B01D—SEPARATION
- B01D35/00—Filtering devices having features not specifically covered by groups B01D24/00 - B01D33/00, or for applications not specifically covered by groups B01D24/00 - B01D33/00; Auxiliary devices for filtration; Filter housing constructions
- B01D35/14—Safety devices specially adapted for filtration; Devices for indicating clogging
- B01D35/153—Anti-leakage or anti-return valves
-
- F—MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
- F16—ENGINEERING ELEMENTS AND UNITS; GENERAL MEASURES FOR PRODUCING AND MAINTAINING EFFECTIVE FUNCTIONING OF MACHINES OR INSTALLATIONS; THERMAL INSULATION IN GENERAL
- F16K—VALVES; TAPS; COCKS; ACTUATING-FLOATS; DEVICES FOR VENTING OR AERATING
- F16K15/00—Check valves
- F16K15/02—Check valves with guided rigid valve members
- F16K15/06—Check valves with guided rigid valve members with guided stems
-
- F—MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
- F16—ENGINEERING ELEMENTS AND UNITS; GENERAL MEASURES FOR PRODUCING AND MAINTAINING EFFECTIVE FUNCTIONING OF MACHINES OR INSTALLATIONS; THERMAL INSULATION IN GENERAL
- F16K—VALVES; TAPS; COCKS; ACTUATING-FLOATS; DEVICES FOR VENTING OR AERATING
- F16K27/00—Construction of housing; Use of materials therefor
- F16K27/02—Construction of housing; Use of materials therefor of lift valves
- F16K27/0209—Check valves or pivoted valves
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B01—PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
- B01D—SEPARATION
- B01D2201/00—Details relating to filtering apparatus
- B01D2201/16—Valves
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B01—PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
- B01D—SEPARATION
- B01D2201/00—Details relating to filtering apparatus
- B01D2201/29—Filter cartridge constructions
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
- G10L2015/0631—Creating reference templates; Clustering
Landscapes
- Engineering & Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Mechanical Engineering (AREA)
- Chemical & Material Sciences (AREA)
- Chemical Kinetics & Catalysis (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Machine Translation (AREA)
- Electrically Operated Instructional Devices (AREA)
Description
C(pron)=1/[d/f/log(len1+len2)]
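In the expression above, d is the distance between the recognized pronunciation and the best match in the lexicon, f is the frequency with which the same recognized pronunciation is pronounced, and p(d, AM) is the probability that a pronunciation with such a distance d and acoustic model (AM) score is the correct pronunciation. len1 and len2 are the lengths, in phonemes, of the new pronunciation and the closest pronunciation, respectively. p(d, AM) is learned through training.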
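As a minimal sketch of how the confidence formula above might be evaluated, the following Python function computes C(pron) = 1/[d/f/log(len1+len2)], which simplifies algebraically to f·log(len1+len2)/d. The function name, the use of the natural logarithm (the text does not state a base), and the handling of d = 0 are assumptions made for illustration and are not taken from the patent.

```python
import math


def pronunciation_confidence(d: float, f: int, len1: int, len2: int) -> float:
    """Evaluate C(pron) = 1 / [d / f / log(len1 + len2)].

    d    : distance between the recognized pronunciation and the best lexicon match
    f    : how often the same recognized pronunciation has been observed
    len1 : phoneme length of the new pronunciation
    len2 : phoneme length of the closest existing pronunciation
    """
    if f <= 0:
        raise ValueError("f must be a positive count")
    if len1 + len2 < 2:
        # log(1) = 0 would make the original expression divide by zero.
        raise ValueError("len1 + len2 must be at least 2")
    if d <= 0:
        # Assumption: zero distance means an exact match, so confidence is unbounded.
        return math.inf
    # 1 / (d / f / log(L)) is the same as f * log(L) / d.
    return f * math.log(len1 + len2) / d
```

Under these assumptions the score grows with the number of observations f and with longer pronunciations, and shrinks as the distance d to the nearest known pronunciation increases.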
130 System memory
134 Operating system
135 Application programs
136 Other program modules
137 Program data
140 Non-removable non-volatile memory interface
144 Operating system
145 Application programs
146 Other program modules
147 Program data
150 Removable non-volatile memory interface
160 User input interface
161 Pointing device
162 Keyboard
163 Microphone
170 Network interface
171 Local area network
172 Modem
173 Wide area network
180 Remote computer
185 Remote application programs
190 Video interface
191 Monitor
195 Output peripheral interface
196 Printer
197 Speakers
202 Processor
204 Memory
208 Communication interface
214 Application(s)
216 Object store
Claims (15)
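- A method of learning with an automatic speech recognition system,
the method comprising:
detecting a change to dictated text;
inferring whether the change is a correction or an edit; and
if the change is inferred to be a correction, selectively learning from the nature of the correction without additional user interaction,
wherein selectively learning from the nature of the correction further comprises:
determining whether the corrected word exists in the user's lexicon;
if the corrected word exists in the user's lexicon, determining whether the user's pronunciation is a pronunciation known to the system; and
if the user's pronunciation is a pronunciation known to the system, increasing a probability associated with the existing pronunciation,
and wherein determining whether the user's pronunciation is a pronunciation known to the system further comprises:
performing a forced alignment of the waveform based on context words associated with the corrected word;
analyzing the forced alignment to identify the portion of the waveform that is the pronunciation of the corrected word; and
computing a distance between the pronunciation of the corrected word and the pronunciation of an existing word in the lexicon that corresponds to the corrected word.
- The method of claim 1, wherein computing the distance further comprises generating a confidence score based on the distance between the pronunciation of the existing word and the pronunciation of the corrected word.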
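- The method of claim 2, wherein the confidence score is computed using the function 1/[d/f/log(len1+len2)], where d is the distance between one of the pronunciations of the existing word in the lexicon and the pronunciation of the corrected word, f is the frequency with which the pronunciation of the corrected word is pronounced, and len1 and len2 are values representing lengths in phonemes.
- The method of claim 1, wherein computing the distance further comprises generating a confidence score based on an acoustic model score of the pronunciation of the existing word and the pronunciation of the corrected word.
- The method of claim 2 or 4, further comprising comparing the confidence score with a threshold.
- The method of claim 5, further comprising determining whether a new pronunciation has occurred a preselected number of times.
- The method of claim 1, further comprising building a lattice based on possible pronunciations of the corrected word and the recognition result.
- The method of claim 1, wherein the inferring comprises detecting whether the user selected from an alternate list to make the change.
- The method of claim 1, wherein the inferring comprises measuring the time between dictation and the change.
- The method of claim 1, wherein the inferring comprises comparing a speech recognition engine score of the dictated text with a speech recognition engine score of the changed text.
- The method of claim 1, wherein the inferring comprises detecting the number of words changed.
- The method of claim 1, wherein selectively learning from the nature of the correction comprises adding the corrected word to the lexicon if the corrected word does not exist in the user's lexicon.
- The method of claim 1, wherein selectively learning from the nature of the correction comprises adding at least one word pair to the user's lexicon.
- The method of claim 13, wherein the at least one word pair is added to the user's lexicon temporarily.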
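- A system for learning with an automatic speech recognition system,
the system comprising:
means for detecting a change to dictated text;
means for inferring whether the change is a correction or an edit; and
means for selectively learning from the nature of the correction, without additional user interaction, if the change is inferred to be a correction,
wherein the means for selectively learning from the nature of the correction further comprises:
means for determining whether the corrected word exists in the user's lexicon;
means for determining, if the corrected word exists in the user's lexicon, whether the user's pronunciation is a pronunciation known to the system; and
means for increasing a probability associated with the existing pronunciation if the user's pronunciation is a pronunciation known to the system,
and wherein the means for determining whether the user's pronunciation is a pronunciation known to the system further comprises:
means for performing a forced alignment of the waveform based on context words associated with the corrected word;
means for analyzing the forced alignment to identify the portion of the waveform that is the pronunciation of the corrected word; and
means for computing a distance between the pronunciation of the corrected word and the pronunciation of an existing word in the lexicon that corresponds to the corrected word.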
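The claimed flow (infer whether a change is a correction, then learn selectively from it) can be sketched roughly as follows. This is an illustrative sketch only: the names `looks_like_correction`, `phoneme_distance`, and `Lexicon`, the threshold values, the use of Levenshtein distance over phoneme symbols, and the unnormalized weights standing in for probabilities are all assumptions, not details given in the claims. Forced alignment is assumed to have already produced the phoneme sequence for the corrected word.

```python
from dataclasses import dataclass, field
from typing import Dict, Sequence, Tuple

Pron = Tuple[str, ...]  # a pronunciation as a sequence of phoneme symbols


def looks_like_correction(used_alternate_list: bool,
                          seconds_since_dictation: float,
                          words_changed: int,
                          max_delay: float = 30.0,
                          max_words: int = 3) -> bool:
    """Heuristic guess at correction vs. edit; thresholds are hypothetical."""
    if used_alternate_list:
        return True  # picking from the alternate list strongly suggests a correction
    return seconds_since_dictation <= max_delay and words_changed <= max_words


def phoneme_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance over phonemes (an assumed distance measure)."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, 1):
        cur = [i]
        for j, pb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (pa != pb)))   # substitution
        prev = cur
    return prev[-1]


@dataclass
class Lexicon:
    # word -> {pronunciation: weight}; weights stand in for probabilities
    entries: Dict[str, Dict[Pron, float]] = field(default_factory=dict)

    def learn(self, word: str, heard: Pron, boost: float = 0.1) -> str:
        prons = self.entries.get(word)
        if prons is None:
            # Corrected word is not in the lexicon: add it.
            self.entries[word] = {heard: 1.0}
            return "added_word"
        nearest = min(prons, key=lambda p: phoneme_distance(p, heard))
        if phoneme_distance(nearest, heard) == 0:
            # Known pronunciation: raise the weight of the existing entry.
            prons[nearest] += boost
            return "boosted_pronunciation"
        # Unknown pronunciation: left for confidence scoring and thresholding.
        return "new_pronunciation_candidate"
```

A caller would invoke `Lexicon.learn` only when `looks_like_correction` returns true, and would pass any `new_pronunciation_candidate` on to a confidence check before adding it to the lexicon.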
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/761,451 US8019602B2 (en) | 2004-01-20 | 2004-01-20 | Automatic speech recognition learning using user corrections |
Publications (2)
Publication Number | Publication Date |
---|---|
JP2005208643A (ja) | 2005-08-04 |
JP4657736B2 (ja) | 2011-03-23 |
Family
ID=34634575
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2005010922A (Expired - Fee Related; granted as JP4657736B2) | 2004-01-20 | 2005-01-18 | System and method for automatic speech recognition learning using user corrections |
Country Status (6)
Country | Link |
---|---|
US (2) | US8019602B2 (ja) |
EP (1) | EP1557822B1 (ja) |
JP (1) | JP4657736B2 (ja) |
KR (1) | KR101183344B1 (ja) |
CN (1) | CN1645477B (ja) |
AT (1) | ATE511177T1 (ja) |
2004
- 2004-01-20: US application US10/761,451, patent US8019602B2, not active (Expired - Fee Related)
2005
- 2005-01-12: EP application EP05100140A, patent EP1557822B1, not active (Not-in-force)
- 2005-01-12: AT application AT05100140T, patent ATE511177T1, not active (IP Right Cessation)
- 2005-01-18: JP application JP2005010922A, patent JP4657736B2, not active (Expired - Fee Related)
- 2005-01-20: CN application CN2005100059379A, patent CN1645477B, not active (Expired - Fee Related)
- 2005-01-20: KR application KR1020050005345A, patent KR101183344B1, active (IP Right Grant)
2010
- 2010-09-17: US application US12/884,434, patent US8280733B2, not active (Expired - Fee Related)
Also Published As
Publication number | Publication date |
---|---|
CN1645477A (zh) | 2005-07-27 |
US8280733B2 (en) | 2012-10-02 |
US8019602B2 (en) | 2011-09-13 |
EP1557822A1 (en) | 2005-07-27 |
US20110015927A1 (en) | 2011-01-20 |
KR101183344B1 (ko) | 2012-09-14 |
CN1645477B (zh) | 2012-01-11 |
ATE511177T1 (de) | 2011-06-15 |
US20050159949A1 (en) | 2005-07-21 |
JP2005208643A (ja) | 2005-08-04 |
KR20050076697A (ko) | 2005-07-26 |
EP1557822B1 (en) | 2011-05-25 |
Legal Events
Code | Title | Description |
---|---|---|
A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 20080111 |
A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131; Effective date: 20100827 |
A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20101126 |
TRDD | Decision of grant or rejection written | |
A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01; Effective date: 20101217 |
A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01 |
A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61; Effective date: 20101222 |
FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20140107; Year of fee payment: 3 |
R150 | Certificate of patent or registration of utility model | Ref document number: 4657736; Country of ref document: JP; Free format text: JAPANESE INTERMEDIATE CODE: R150 |
R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
S111 | Request for change of ownership or part of ownership | Free format text: JAPANESE INTERMEDIATE CODE: R313113 |
R350 | Written notification of registration of transfer | Free format text: JAPANESE INTERMEDIATE CODE: R350 |
R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
LAPS | Cancellation because of no payment of annual fees | |