JPH09212189A - Method and device for speech recognition - Google Patents

Method and device for speech recognition

Info

Publication number
JPH09212189A
Authority
JP
Japan
Prior art keywords
candidate
similarity
function
threshold value
difference
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP8040385A
Other languages
Japanese (ja)
Inventor
Hiroyuki Kuno
裕之 久野
Megumi Takano
恵 高野
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ricoh Elemex Corp
Original Assignee
Ricoh Elemex Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1996-02-02
Filing date
1996-02-02
Publication date
1997-08-15
Application filed by Ricoh Elemex Corp
Priority to JP8040385A
Publication of JPH09212189A

Abstract

PROBLEM TO BE SOLVED: To improve the precision of speech recognition by providing a threshold (a so-called moving threshold) that finely reflects differences in the similarity of the 1st candidate. SOLUTION: For example, the distance of the 1st candidate is taken as a variable, and the distance difference between the 1st and 2nd candidates is defined as a function of that distance. The function can be given an increasing pattern, in which the required distance difference grows as the distance of the 1st candidate grows. Substituting the concrete distance value (a) of the 1st candidate into this function (the threshold function) yields a function threshold f(a) specific to the distance value (a). When the actual distance difference between the 1st and 2nd candidates lies in the lower region (B) relative to the function threshold f(a), the 1st candidate is judged to be misrecognized (rejected); when it lies in the upper region (C), the 1st candidate is judged to be recognized correctly.

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a speech recognition method and device that recognize input speech as matching one of a plurality of predetermined speech candidates, based on a comparison of the similarity between the input speech and those candidates.

[0002]

[Prior Art] Conventional speech recognition devices of this type compute, as a measure of similarity, a pattern-matching distance between the actual input speech and each of a plurality of candidates, take the candidate with the smallest distance (that is, the highest similarity) as the first candidate, and then determine whether this first candidate can be accepted as the recognition result. For the reject decision, in which the first candidate is judged unacceptable as the recognition result, the following methods are known, used either individually or in suitable combination.

[0003]

(1) Reject if the distance a of the first candidate is larger than a threshold.
(2) Reject if the difference (a - b) between the distance a of the first candidate and the distance b of the second candidate is smaller than a threshold.
(3) Reject if the ratio (a/b) of the distance a of the first candidate to the distance b of the second candidate is smaller than a threshold.

Method (1) covers the case where the first candidate cannot be taken as the recognition result because its similarity is low in absolute terms. Methods (2) and (3) cover the case where the first candidate cannot be asserted as the recognition result because its similarity is close to that of the second candidate. A combination of two of these methods is disclosed in, for example, JP-A-61-114299, and a related technique in, for example, JP-B-3-44316.

[0004]

[Problem to Be Solved by the Invention] However, given the demand to reliably eliminate misrecognition, these methods still leave room for improvement. In particular, recognition accuracy depends heavily on how the threshold used as the decision criterion is chosen. A strict threshold raises recognition accuracy but also raises the proportion of first candidates that are rejected, which makes the device inconvenient for the user. A lenient threshold lowers the rejection rate but presents the user with speech candidates they did not intend, which reduces reliability.

[0005]

An object of the present invention is to provide a threshold (a moving threshold, so to speak) that is not uniform but instead finely reflects differences in the similarity of the first candidate (for example, whether its distance is large or small). This eliminates misrecognition as far as possible while preventing results that should be judged correct from being needlessly rejected (that is, it rescues them).

[0006]

[Means for Solving the Problem, and Operation and Effects] The speech recognition method according to the present invention presupposes comparing input speech with a plurality of predetermined candidates and recognizing, by calculating their similarities, which candidate the input speech corresponds to. It is characterized in that the similarity difference or similarity ratio between the first candidate and the second candidate, ranked in descending order of similarity, is defined in advance as a function of the similarity of the first candidate; the concrete similarity value of the first candidate is applied to this function to obtain the corresponding function value, which serves as a function threshold; and whether to adopt the first candidate as the recognition result is determined with reference to this function threshold.

[0007]

A speech recognition device according to the present invention, suitable for carrying out this method, includes the following elements:

candidate storage means for storing a plurality of candidates;
speech analysis means for analyzing the features of the input speech;
similarity calculation means for calculating the similarity between the analysis result and each of the plurality of candidates;
relative value calculation means for calculating the similarity difference or similarity ratio between the first candidate and the second candidate, ranked in descending order of similarity;
reference function storage means for storing, as a reference function for judging whether a recognition is acceptable, a relationship defined in advance that gives the similarity difference or similarity ratio between the first and second candidates as a function of the similarity of the first candidate;
function threshold determination means for applying the similarity of the first candidate to the reference function as the variable value and taking the corresponding function value as a threshold; and
suitability determination means for determining whether the first candidate is appropriate as the recognition result, based on how the similarity difference or similarity ratio between the first and second candidates compares with the function threshold.

[0008]

FIG. 2(b) shows a comparative example to the present invention. Distance (to the input speech) is used as the similarity of the first candidate. If the distance of the first candidate is larger than a threshold P, the first candidate is judged to be a misrecognition (region A'). If it is smaller than P, the difference between the distances of the first and second candidates is then compared with a uniformly fixed threshold Q. In the region below Q (B'), the first candidate is judged to be a misrecognition because its precedence over the second candidate is not clear. In the region above Q (C'), the first candidate is judged to be a correct recognition.
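The fixed-threshold comparative example can be sketched in a few lines of Python. This is only an illustration: the function name `classify_fixed`, the numeric values of P and Q, and the choice to place a difference exactly equal to Q in region B' are assumptions, not taken from the patent.

```python
def classify_fixed(a: float, b: float, p: float, q: float) -> str:
    """Return the region label of the comparative example in FIG. 2(b).

    a: distance of the first candidate (smaller means more similar)
    b: distance of the second candidate
    p: fixed threshold on a
    q: fixed threshold on the distance difference b - a
    """
    if a > p:
        return "A'"  # first candidate is too far from the input: misrecognition
    if (b - a) > q:
        return "C'"  # clear margin over the second candidate: correct recognition
    return "B'"      # margin too small: misrecognition


# Hypothetical thresholds and distances, chosen only for illustration.
P, Q = 10.0, 2.0
print(classify_fixed(12.0, 20.0, P, Q))  # A'
print(classify_fixed(3.0, 4.0, P, Q))    # B'
print(classify_fixed(3.0, 6.0, P, Q))    # C'
```

Because Q is the same for every value of a, a very close first candidate and a barely acceptable one must clear the same margin.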

[0009]

Here, when the distance value of the first candidate is small (its similarity to the input speech is high), the first candidate may in fact be a correct recognition (marked with a circle) even below the threshold Q. Conversely, when the distance value of the first candidate is large (its similarity to the input speech is low), the first candidate may in fact be a misrecognition (marked with a cross) even above the threshold Q. In other words, results that should not be judged misrecognitions (circles) are rejected, and results that should be rejected (crosses) are accepted as correct, which lowers recognition accuracy.

[0010]

In contrast, the present invention defines the threshold as a function (the reference function) rather than as a uniformly fixed value, as in the example of FIG. 2(a). That is, the distance of the first candidate, for example, is taken as the variable, and the distance difference between the first and second candidates is treated as a function of that distance. The function can be given an increasing pattern, for example, in which the required distance difference grows as the distance of the first candidate grows. Substituting the concrete distance value a of the first candidate into this function (which can be called a threshold function) yields a threshold f(a) specific to that distance value. Because it is given by a function, it can be called a function threshold. If the actual distance difference between the first and second candidates lies in the region below f(a) (region B), the first candidate is judged to be a misrecognition and rejected; if it lies in the region above f(a) (region C), the first candidate is judged to be a correct recognition.
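As a minimal sketch of this moving threshold, the Python below uses a linear, increasing f(a). The slope and intercept, the function names, and the sample distances are hypothetical values picked for illustration; the patent does not give concrete numbers.

```python
def function_threshold(a: float, slope: float = 0.4, intercept: float = 0.5) -> float:
    """f(a): required distance difference, growing with the first-candidate distance a.

    The linear form and its coefficients are assumptions for illustration.
    """
    return slope * a + intercept


def classify_moving(a: float, b: float) -> str:
    """Return "C" (correct recognition) or "B" (reject) as in FIG. 2(a)."""
    return "C" if (b - a) > function_threshold(a) else "B"


# A close first candidate needs only a small margin: f(1.0) is about 0.9.
print(classify_moving(1.0, 2.5))   # C  (difference 1.5)

# A distant first candidate needs a large margin: f(8.0) is about 3.7.
print(classify_moving(8.0, 10.5))  # B  (difference 2.5)
```

With a fixed threshold of, say, 2.0 on the difference, the two decisions would come out the other way round: the first case would be rejected and the second accepted.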

[0011]

As a result, when the distance of the first candidate is small, the threshold on the distance difference between the first and second candidates can be relaxed, which prevents results that should properly be judged correct (circles) from being rejected. This is, so to speak, a rescue from rejection. When the distance of the first candidate is large, the threshold on the distance difference can be made stricter, which prevents results that should properly be rejected (crosses) from being judged correct. Together, these raise recognition accuracy while keeping the number of rejections from increasing needlessly.

[0012]

Note that in FIG. 2(a), as in FIG. 2(b), a cutoff threshold P can be set for the distance of the first candidate. A first candidate in region (A), outside the cutoff threshold P, is judged to be a misrecognition without consulting the function threshold, and for the region within P the decision using the function threshold is made as the next step. However, the order of the steps may be reversed, so that the decision by the cutoff threshold P follows the decision by the function threshold, or the cutoff threshold P may be omitted altogether so that the function threshold alone decides.
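One way to express the optional cutoff is sketched below. The function name `accept`, the `cutoff` parameter, and the example threshold function are assumptions for illustration. Since acceptance requires both checks to pass, the final decision in this sketch does not depend on which check runs first.

```python
from typing import Callable, Optional


def accept(a: float, b: float, f: Callable[[float], float],
           cutoff: Optional[float] = None) -> bool:
    """Decide whether to adopt the first candidate.

    a, b   : distances of the first and second candidates
    f      : threshold function giving the required difference b - a
    cutoff : optional cutoff threshold P on a; None disables the cutoff
    """
    passes_cutoff = cutoff is None or a <= cutoff
    passes_function = (b - a) > f(a)
    # Both conditions must hold, so evaluating them in either order
    # gives the same result.
    return passes_cutoff and passes_function


def f(a: float) -> float:
    """Hypothetical increasing threshold function."""
    return 0.4 * a + 0.5


print(accept(12.0, 20.0, f, cutoff=10.0))  # False: region (A), beyond the cutoff
print(accept(12.0, 20.0, f))               # True: no cutoff, function threshold only
print(accept(3.0, 6.0, f, cutoff=10.0))    # True: passes both checks
```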

[0013]

The function (reference function) that defines the threshold is not limited to a linear function (a simple sloped straight line). It may be, for example, a staircase as in FIG. 4(a), a curve as in FIG. 4(b) and (c), or a polyline as in FIG. 4(d) and (e), chosen to suit the intended recognition criterion.
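The alternative shapes might be written as follows. The breakpoints, levels, and the quadratic form of the curve are invented for illustration, and clamping the polyline at its end points is likewise an assumption; FIG. 4 only indicates the general shapes.

```python
import bisect
from typing import List, Sequence, Tuple


def step_threshold(a: float, edges: Sequence[float], levels: Sequence[float]) -> float:
    """Staircase function: levels[i] applies while a is below edges[i].

    levels must have exactly one more entry than edges.
    """
    return levels[bisect.bisect_right(edges, a)]


def polyline_threshold(a: float, points: List[Tuple[float, float]]) -> float:
    """Piecewise-linear function through (a, threshold) points sorted by a.

    Values outside the first and last point are clamped to the end values.
    """
    if a <= points[0][0]:
        return points[0][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if a <= x1:
            return y0 + (y1 - y0) * (a - x0) / (x1 - x0)
    return points[-1][1]


def curved_threshold(a: float, scale: float = 0.05) -> float:
    """Smooth increasing curve; a quadratic is used purely as an example."""
    return scale * a * a


print(step_threshold(1.0, [2.0, 5.0], [0.5, 1.5, 3.0]))               # 0.5
print(step_threshold(7.0, [2.0, 5.0], [0.5, 1.5, 3.0]))               # 3.0
print(polyline_threshold(2.0, [(0.0, 0.5), (4.0, 1.0), (10.0, 4.0)]))  # 0.75
print(polyline_threshold(7.0, [(0.0, 0.5), (4.0, 1.0), (10.0, 4.0)]))  # 2.5
```

Any of these can be passed wherever a threshold function of the first-candidate distance is expected.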

[0014]

Furthermore, the function that defines the threshold is not limited to a similarity difference such as the distance difference between the first and second candidates. With the distance of the first candidate as the variable, it can also be defined in terms of a similarity ratio, such as the ratio (a/b) of the distance a of the first candidate to the distance b of the second candidate. This too reflects how pronounced the similarity of the first candidate is relative to that of the second candidate. With the distance ratio (a/b) in particular, a smaller value means that the similarity of the first candidate is higher relative to that of the second candidate, which is the opposite of the distance difference (b - a). Accordingly, to obtain the same effect as described for FIG. 2(a), a function based on the distance ratio (a/b) would have the opposite slope to one based on the distance difference (b - a), that is, it would fall to the right in the figure.
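A ratio-based variant could look like the sketch below. The patent states only that the function slopes downward; the specific decreasing form, its coefficients, the lower bound, and the rule of accepting when a/b lies below the curve are assumptions made here so that a smaller ratio (a more distinct first candidate) is what passes.

```python
def ratio_threshold(a: float, start: float = 0.9, slope: float = 0.05,
                    floor: float = 0.3) -> float:
    """g(a): largest acceptable ratio a/b, decreasing as a grows.

    The linear decrease and the floor are hypothetical choices.
    """
    return max(floor, start - slope * a)


def accept_by_ratio(a: float, b: float) -> bool:
    """Adopt the first candidate when a/b falls below the function threshold g(a)."""
    if b <= 0:
        raise ValueError("second-candidate distance must be positive")
    return (a / b) < ratio_threshold(a)


# Same ratio of 0.8 in both cases, but the distant first candidate
# faces a stricter limit and is rejected.
print(accept_by_ratio(1.0, 1.25))  # True
print(accept_by_ratio(8.0, 10.0))  # False
```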

[0015]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the example shown in the drawings. FIG. 1 is a block diagram showing an example of the device configuration. Speech input from a speech input means such as a microphone 1 is passed to a speech analysis unit 2, which extracts its feature pattern, and the pattern is temporarily stored in an input speech buffer 3. A standard speech pattern dictionary unit 4 serves as the candidate storage means and stores, for example, a standard speech pattern for each word. A CPU 5 matches the feature pattern of the input speech against each standard speech pattern (the plurality of candidates) and performs a distance calculation, one example of a similarity calculation, to obtain the distance between each candidate and the feature pattern of the input speech. The results are temporarily stored in a candidate rank and distance buffer 6. The candidates are ranked first, second, and so on in order of increasing distance (decreasing similarity).
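The matching and ranking step can be illustrated as follows. The patent does not specify the feature representation or the distance measure, so the fixed-length feature vectors and the Euclidean distance used here are assumptions, as are the dictionary contents and the function name `rank_candidates`.

```python
import math
from typing import Dict, List, Sequence, Tuple


def rank_candidates(feature: Sequence[float],
                    dictionary: Dict[str, Sequence[float]]) -> List[Tuple[str, float]]:
    """Return (word, distance) pairs sorted by increasing distance.

    The first entry is the first candidate, the second entry the second
    candidate, and so on. Euclidean distance stands in for whatever
    pattern-matching distance the device actually uses.
    """
    scored = [(word, math.dist(feature, pattern)) for word, pattern in dictionary.items()]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical two-dimensional standard patterns, one per word.
standard_patterns = {
    "pause": (5.0, 12.0),
    "start": (3.0, 4.0),
    "stop": (6.0, 8.0),
}
ranked = rank_candidates((0.0, 0.0), standard_patterns)
print([word for word, _ in ranked])  # ['start', 'stop', 'pause']
```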

[0016]

A reference function memory 7, serving as the reference function storage means, stores the reference function (also called the threshold function) that, as illustrated in FIG. 2(a), defines the distance difference between the first and second candidates as a function of the distance of the first candidate (the variable). The function is either defined in advance as an equation or stored as a table, such as a conversion table or correspondence map, that relates the distance of the first candidate to the distance difference between the first and second candidates. If the cutoff threshold P for the distance of the first candidate in FIG. 2(a) is also to be stored, a cutoff threshold memory 8 is added for that purpose.
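The table form of the reference function might be sketched like this: the function is sampled once on a regular grid of first-candidate distances, and at run time the nearest grid entry is read out. The grid spacing, the range, the nearest-entry rule, and the example equation are all assumptions for illustration.

```python
from typing import Callable, List


def build_table(f: Callable[[float], float], a_max: float, step: float) -> List[float]:
    """Sample f at a = 0, step, 2*step, ... up to a_max."""
    count = int(a_max / step) + 1
    return [f(i * step) for i in range(count)]


def lookup(table: List[float], a: float, step: float) -> float:
    """Read the threshold stored for the grid point nearest to a.

    Distances beyond the ends of the table use the first or last entry.
    """
    index = int(round(a / step))
    index = max(0, min(index, len(table) - 1))
    return table[index]


# Hypothetical equation form, stored as a table with a grid spacing of 0.5.
table = build_table(lambda a: 0.4 * a + 0.5, a_max=10.0, step=0.5)
print(len(table))                # 21
print(lookup(table, 3.2, 0.5))   # value stored for a = 3.0
print(lookup(table, 50.0, 0.5))  # clamped to the entry for a = 10.0
```

A table of this kind trades a small quantization error for not having to evaluate the equation at run time.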

[0017]

The CPU 5 reads the distance value a of the first candidate from the candidate rank and distance buffer 6, applies it to the reference function of FIG. 2(a) to obtain the function threshold f(a), and temporarily stores f(a) in a function threshold buffer 9. Using the data in the candidate rank and distance buffer 6, the CPU 5 also calculates the distance difference (b - a) between the distance a of the first candidate and the distance b of the second candidate, and determines whether the first candidate may be taken as the recognition result by checking whether (b - a) is larger than the function threshold f(a). In other words, it judges whether the first candidate is a correct or an incorrect recognition, and outputs the result from an I/F unit 10. In this embodiment, the CPU 5 serves as the similarity calculation means, the relative value calculation means, the function threshold determination means, and the suitability determination means.

[0018]

Next, the flow of speech recognition described above is explained with reference to FIG. 3. Speech input is captured in step S1, and its feature pattern is extracted in S2. In S3, the distance between that feature pattern and each standard pattern (candidate) is calculated.

[0019]

In S4, the distance of the first candidate is compared with the cutoff threshold P of FIG. 2(a). If it is larger, the similarity of the first candidate is deemed insufficient: the candidate is rejected in S8, and the corresponding processing is performed in S9 (for example, a message telling the user that the speech was hard to recognize, or asking for the input to be repeated).

[0020]

If the distance value of the first candidate passes the cutoff threshold of S4, then in S5 the actual distance value a of the first candidate is substituted into the reference function of FIG. 2(a) (or the stored correspondence is read out) to obtain the function threshold f(a) for the distance difference between the first and second candidates. In S6 it is determined whether the actual distance difference (b - a) between the first and second candidates is greater than or equal to the function threshold f(a). If so, S7 concludes that the first candidate is adopted as the recognition result (the first candidate is a correct recognition); otherwise the flow proceeds to S8 and S9 above. The above description is only an example, and the present invention is of course not limited to it.
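Steps S4 to S9 of the flowchart can be put together in one short sketch. It starts from distances that are assumed to have been calculated already in S3, and the function names, the threshold function, the cutoff value, and the message texts are all hypothetical.

```python
from typing import Callable, Dict, Optional


def decide(distances: Dict[str, float], f: Callable[[float], float],
           cutoff: float) -> Optional[str]:
    """Return the adopted first candidate, or None if it is rejected."""
    ranked = sorted(distances.items(), key=lambda item: item[1])
    if len(ranked) < 2:
        raise ValueError("at least two candidates are required")
    (first, a), (_, b) = ranked[0], ranked[1]

    if a > cutoff:          # S4: first candidate beyond the cutoff threshold P
        return None         # S8: reject
    fa = f(a)               # S5: function threshold for this distance
    if (b - a) >= fa:       # S6: compare the actual distance difference
        return first        # S7: adopt the first candidate
    return None             # S8: reject


def respond(result: Optional[str]) -> str:
    """S9-style handling: report the result or ask for the input again."""
    if result is None:
        return "Could not recognize the input; please speak again."
    return f"Recognized: {result}"


def threshold_fn(a: float) -> float:
    """Hypothetical increasing reference function."""
    return 0.4 * a + 0.5


print(respond(decide({"start": 3.0, "stop": 6.0, "pause": 9.0}, threshold_fn, 10.0)))
# Recognized: start
print(respond(decide({"start": 8.0, "stop": 9.0}, threshold_fn, 10.0)))
# Could not recognize the input; please speak again.
```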

[Brief description of drawings]

FIG. 1 is a block diagram showing an example of the device configuration of the present invention.

FIG. 2 is an explanatory diagram comparing an example of the present invention with a comparative example.

FIG. 3 is a flowchart showing an example of the flow of speech recognition according to the present invention.

FIG. 4 is a conceptual diagram showing modified examples of the threshold function (reference function) of the present invention.

[Explanation of symbols]

1 Microphone
2 Speech analysis unit
3 Input speech buffer
4 Standard speech pattern dictionary unit
5 CPU
7 Reference function memory
9 Function threshold buffer

Claims (5)

[Claims]

1. A speech recognition method for comparing input speech with a plurality of predetermined candidates and recognizing, by calculating their similarities, which of the plurality of candidates the speech corresponds to, characterized in that: a similarity difference or similarity ratio between a first candidate and a second candidate, ranked in descending order of similarity, is defined in advance as a function of the similarity of the first candidate; a concrete value of the similarity of the first candidate is applied to the function and the corresponding function value is taken as a function threshold; and whether to adopt the first candidate as the recognition result is determined with reference to the function threshold.
2. A speech recognition device for comparing input speech with a plurality of predetermined candidates and recognizing, by calculating their similarities, which of the plurality of candidates the speech corresponds to, characterized by comprising: candidate storage means for storing the plurality of candidates; speech analysis means for analyzing the features of the speech; similarity calculation means for calculating the similarity between the analysis result and each of the plurality of candidates; relative value calculation means for calculating a similarity difference or similarity ratio between a first candidate and a second candidate, ranked in descending order of similarity; reference function storage means for storing, as a reference function for judging whether a recognition is acceptable, a relationship defined in advance that gives the similarity difference or similarity ratio between the first and second candidates as a function of the similarity of the first candidate; function threshold determination means for applying the similarity of the first candidate to the reference function as the variable value and taking the corresponding function value as a threshold; and suitability determination means for determining whether the first candidate is appropriate as the recognition result, based on how the similarity difference or similarity ratio between the first and second candidates compares with the function threshold.
3. The speech recognition device according to claim 2, wherein the similarity difference between the first and second candidates is defined as a function of the similarity of the first candidate and stored by the reference function storage means; and, when the function threshold determination means applies a concrete similarity value of the first candidate to the reference function and takes the resulting function value as the function threshold, the suitability determination means determines that the first candidate can be adopted as the recognition result if the similarity difference between the first and second candidates is larger than the function threshold, and that the first candidate cannot be adopted as the recognition result if the similarity difference is smaller than the function threshold.
4. The speech recognition device according to claim 3, wherein the reference function, which takes the similarity of the first candidate as its variable and is defined in terms of the similarity difference between the first and second candidates, is set in a pattern in which the function threshold rises overall as the similarity of the first candidate moves toward the lower side.
5. The speech recognition device according to claim 3 or 4, further comprising cutoff determination means that compares the similarity of the first candidate with a predetermined fixed cutoff threshold and, if the similarity of the first candidate falls short of the cutoff threshold, issues a reject determination that the first candidate cannot be adopted as the recognition result, wherein, for a first candidate that has not received the reject determination from the cutoff determination means, the suitability of taking that first candidate as the recognition result is determined on the basis of the function threshold.
JP8040385A 1996-02-02 1996-02-02 Method and device for speech recognition Pending JPH09212189A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP8040385A JPH09212189A (en) 1996-02-02 1996-02-02 Method and device for speech recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP8040385A JPH09212189A (en) 1996-02-02 1996-02-02 Method and device for speech recognition

Publications (1)

Publication Number Publication Date
JPH09212189A true JPH09212189A (en) 1997-08-15

Family

ID=12579194

Family Applications (1)

Application Number Title Priority Date Filing Date
JP8040385A Pending JPH09212189A (en) 1996-02-02 1996-02-02 Method and device for speech recognition

Country Status (1)

Country Link
JP (1) JPH09212189A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007083637A1 (en) * 2006-01-17 2007-07-26 Pioneer Corporation Voice recognizer, voice recognition method, voice recognition program, and recording medium
WO2009147927A1 (en) * 2008-06-06 2009-12-10 株式会社レイトロン Audio recognition device, audio recognition method, and electronic device
JP4734771B2 (en) * 2001-06-12 2011-07-27 ソニー株式会社 Information extraction apparatus and method
JP2016062059A (en) * 2014-09-22 2016-04-25 富士通株式会社 Voice recognition unit, voice recognition method and program

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4734771B2 (en) * 2001-06-12 2011-07-27 ソニー株式会社 Information extraction apparatus and method
WO2007083637A1 (en) * 2006-01-17 2007-07-26 Pioneer Corporation Voice recognizer, voice recognition method, voice recognition program, and recording medium
WO2009147927A1 (en) * 2008-06-06 2009-12-10 株式会社レイトロン Audio recognition device, audio recognition method, and electronic device
JPWO2009147927A1 (en) * 2008-06-06 2011-10-27 株式会社レイトロン Voice recognition apparatus, voice recognition method, and electronic device
JP5467043B2 (en) * 2008-06-06 2014-04-09 株式会社レイトロン Voice recognition apparatus, voice recognition method, and electronic apparatus
JP2016062059A (en) * 2014-09-22 2016-04-25 富士通株式会社 Voice recognition unit, voice recognition method and program

Similar Documents

Publication Publication Date Title
US6751595B2 (en) Multi-stage large vocabulary speech recognition system and method
US7013276B2 (en) Method of assessing degree of acoustic confusability, and system therefor
US8374869B2 (en) Utterance verification method and apparatus for isolated word N-best recognition result
JP3337233B2 (en) Audio encoding method and apparatus
JPH09212189A (en) Method and device for speech recognition
JP2000250593A (en) Device and method for speaker recognition
CN113851150A (en) Method for selecting among multiple sets of voice recognition results by using confidence score
JPWO2018134916A1 (en) Voice recognition device
KR20040010860A (en) Surrounding-condition-adaptive voice recognition device including multiple recognition module and the method thereof
JP2975772B2 (en) Voice recognition device
CN112017641A (en) Voice processing method, device and storage medium
JPS6325366B2 (en)
JPH05108091A (en) Speech recognition device
JP2834880B2 (en) Voice recognition device
JPS6332596A (en) Voice recognition equipment
JPH11249688A (en) Device and method for recognizing voice
JP3100208B2 (en) Voice recognition device
JPS6147999A (en) Voice recognition system
JPS632100A (en) Voice recognition equipment
JP6451171B2 (en) Speech recognition apparatus, speech recognition method, and program
JPS58159598A (en) Monosyllabic voice recognition system
JPH01154098A (en) Voice recognition apparatus
JP2005265874A (en) Elementary speech unit connection type speech synthesizer
JPH06318097A (en) Speech recognizing device
JPS61208095A (en) Voice recognition system