JPH0432900A - Sound recognizing device - Google Patents
Sound recognizing device
- Publication number
- JPH0432900A
- Authority
- JP
- Japan
- Prior art keywords
- candidate
- recognition
- vocalization
- recognition result
- threshold
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
Description
DETAILED DESCRIPTION OF THE INVENTION
Technical Field
The present invention relates to a speech recognition device and, more particularly, to a method for judging recognition results in a speech recognition device.
Prior Art
Speech recognition is widely used for voice control of machines and equipment, voice data entry, and similar tasks. In these applications a misrecognition appears as an erroneous control input or an erroneous data entry, so its consequences can be extremely serious. In particular, when the operation of a machine is commanded by voice, abnormal operation caused by misrecognition must be prevented. Conventionally, when the operation commanded by the first-ranked candidate is dangerous, systems have either examined the commands of the other high-ranked candidates and adopted a candidate with a harmless command as the recognition result, or invalidated the voice input to prevent abnormal operation of the machine. Alternatively, when outputting the first-ranked candidate as it is and commanding the machine could cause abnormal operation, systems have asked the user for confirmation or invalidated the first-ranked candidate, depending on its recognition reliability.
With such conventional techniques, however, even when the first-ranked candidate obtained by matching the input speech is correct, the recognition result is invalidated if its recognition reliability is below the second (lower) threshold, and the user has to repeat the same utterance.
If the low recognition reliability was caused by a sudden noise or a momentary disturbance of the voice (a cough, a voice crack, and so on), restating the command is likely to yield a valid recognition result on the second utterance. If, however, the cause is steady ambient noise or a change in the speaker's voice over time, no valid recognition result is obtained no matter how many times the same utterance is repeated.
Object
The present invention has been made in view of the above circumstances. Its object is to provide a speech recognition device that prevents abnormal machine operation while still yielding a valid recognition result for input that is restated.
Configuration
To achieve the above object, the present invention provides a speech recognition device comprising: a matching unit that matches an input speech pattern against registered speech patterns and computes a first-ranked candidate and the recognition reliability of that candidate; a recognition result judgment unit that adopts the first-ranked candidate as the recognition result when the recognition reliability is greater than a first threshold, displays a prompt asking for confirmation of the first-ranked candidate when the recognition reliability is greater than a second threshold and smaller than the first threshold, and does not adopt the first-ranked candidate as the recognition result when the recognition reliability is smaller than the second threshold; and a candidate storage unit that stores the first-ranked candidate of the previous utterance. The device is characterized in that, when the first-ranked candidate of the current utterance equals the first-ranked candidate of the previous utterance, one or both of the first and second thresholds are lowered before the recognition result is judged. The invention is described below on the basis of an embodiment.
FIG. 1 is a block diagram illustrating one embodiment of the present invention. In the figure, 1 is a speech input unit, 2 is a speech pattern conversion unit, 3 is a standard pattern storage unit, 4 is a matching unit, 5 is a candidate storage unit, and 6 is a recognition result judgment unit. A speech signal input from the speech input unit 1, such as a microphone, is converted into a speech pattern by the speech pattern conversion unit 2. Various conversion methods effective for speech recognition are known; for example, the outputs of a bank of 15-channel band-pass filters taken at a frame period of 10 ms may be used. In the standard pattern storage unit 3, utterances of the vocabulary to be recognized are converted into speech patterns in advance and stored as standard patterns. The matching unit 4 matches the input speech pattern X against the standard patterns Y1, Y2, ..., YM (where M is the number of registered standard patterns). DP matching, for example, may be used as the matching method. If, say, Yj has the smallest distance to X among all the standard patterns, the word name, command name, or the like corresponding to Yj becomes the first-ranked candidate.
Letting this distance be D1, the recognition reliability L may be set to L = 1/D1. Alternatively, using a normalization coefficient Zj set for each word, L = Zj/D1 may be used.
FIG. 2 is a flowchart explaining the operation of the recognition result judgment unit 6. In the flowchart, L is the recognition reliability, j is the first-ranked candidate, j-old is the previous first-ranked candidate, T1j and T2j are the thresholds set for each word, and α and β are positive constants. The recognition result judgment unit 6 uses two thresholds T1j and T2j, set for each standard pattern or for each word or command. When L > T1j, the first-ranked candidate is output as the recognition result. When T2j < L < T1j, a display prompts the user to confirm whether the first-ranked candidate may be output as the recognition result, and the candidate is output only when the user indicates that it may. The indication may be given, for example, by entering yes/no by voice, or by treating the absence of a press of a cancel button within 3 seconds as an OK.
When L < T2j, the first-ranked candidate is rejected, the recognition result is invalidated, and the device waits for the next speech input. The candidate storage unit 5 stores the first-ranked candidate. When the next utterance is matched, if the word name or command name corresponding to its first-ranked candidate Yj equals the one stored in the candidate storage unit 5, T1j and T2j are lowered before the recognition result judgment unit 6 performs its processing.
T1j and T2j may be changed as

T1j = T1j − αL′ (1)
T2j = T2j − βL′ (2)

where α and β are positive constants and L′ is the previous recognition reliability.
Effects
As is clear from the above description, according to the present invention the two thresholds T1j and T2j are lowered when the previous first-ranked candidate and the current first-ranked candidate are equal. Therefore, when the previous and current first-ranked candidates are equal, and this candidate is thus likely to be the correct recognition result, the probability that it is output increases even if the current recognition reliability is low. This eliminates the drawback that input remains impossible no matter how many times the same utterance is repeated.
FIG. 1 is a block diagram illustrating one embodiment of the speech recognition device according to the present invention, and FIG. 2 is a flowchart explaining the operation of the recognition result judgment unit shown in FIG. 1.
1: speech input unit, 2: speech pattern conversion unit, 3: standard pattern storage unit, 4: matching unit, 5: candidate storage unit, 6: recognition result judgment unit.
[FIG. 1: block diagram (image not reproduced)]
[FIG. 2: flowchart (image not reproduced); legible labels: recognition result, recognition result invalid]
Claims (1)
1. A speech recognition device comprising: a matching unit that matches an input speech pattern against registered speech patterns and computes a first-ranked candidate and the recognition reliability of that candidate; a recognition result judgment unit that adopts the first-ranked candidate as the recognition result when the recognition reliability is greater than a first threshold, displays a prompt asking for confirmation of the first-ranked candidate when the recognition reliability is greater than a second threshold and smaller than the first threshold, and does not adopt the first-ranked candidate as the recognition result when the recognition reliability is smaller than the second threshold; and a candidate storage unit that stores the first-ranked candidate of the previous utterance, wherein, when the first-ranked candidate of the current utterance equals the first-ranked candidate of the previous utterance, the recognition result is judged with one or both of the first and second thresholds lowered.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2138879A JPH0432900A (en) | 1990-05-29 | 1990-05-29 | Sound recognizing device |
Publications (1)
Publication Number | Publication Date |
---|---|
JPH0432900A (en) | 1992-02-04 |
Family
ID=15232250
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2138879A Pending JPH0432900A (en) | 1990-05-29 | 1990-05-29 | Sound recognizing device |
Country Status (1)
Country | Link |
---|---|
JP (1) | JPH0432900A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2002311991A (en) * | 2001-04-19 | 2002-10-25 | Alpine Electronics Inc | Speech recognition method |
JP4565768B2 (en) * | 2001-04-19 | 2010-10-20 | アルパイン株式会社 | Voice recognition device |
JP2006251800A (en) * | 2005-03-07 | 2006-09-21 | Samsung Electronics Co Ltd | User adaptive speech recognition method and apparatus |
JP4709663B2 (en) * | 2005-03-07 | 2011-06-22 | 三星電子株式会社 | User adaptive speech recognition method and speech recognition apparatus |
JP2007322647A (en) * | 2006-05-31 | 2007-12-13 | Funai Electric Co Ltd | Electronic equipment |
US7908146B2 (en) | 2006-05-31 | 2011-03-15 | Funai Electric Co., Ltd. | Digital television receiver controlled by speech recognition |
JP2008046299A (en) * | 2006-08-14 | 2008-02-28 | Nissan Motor Co Ltd | Speech recognition apparatus |
JP2019015950A (en) * | 2017-07-05 | 2019-01-31 | パナソニックIpマネジメント株式会社 | Voice recognition method, program, voice recognition device, and robot |