JPH0331274B2


Info

Publication number: JPH0331274B2
Application number: JP58168795A
Authority: JP
Inventors: Yasuo Sato; Takayuki Fujimoto
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1983-09-13
Filing date: 1983-09-13
Publication date: 1991-05-02
Also published as: JPS6060696A

Description

[Detailed Description of the Invention]

(A) Technical Field of the Invention

The present invention relates to a speech recognition device. In particular, it relates to a speech recognition device that, when a standard feature pattern is modified or added, judges the validity of the already registered standard feature patterns or of the newly registered standard feature pattern on the basis of their similarity to the standard feature patterns of other dictionary items, and can thereby improve the quality of the dictionary.

(B) Prior Art and Problems

In speech recognition generally, improving the recognition rate depends on which feature parameters are extracted from the speech information and used for matching. It is equally important, given the feature extraction defined by the system, to prepare in the dictionary the best possible standard feature parameters to represent each item. However good the feature extraction and matching methods are, the recognition rate will not improve if the standard feature patterns registered in the dictionary include many defective standard feature patterns, such as noise-added patterns and unclearly uttered patterns, or many erroneous standard feature patterns caused by utterance errors, such as uttering "i" when "a" should be registered.

Standard feature patterns are stored in the dictionary as digital information. There are many of them, they cannot be seen the way mechanical parts can, and not all of them are used equally often. Once registered, therefore, the defective and erroneous standard feature patterns described above are not easy to detect.

Conventionally, every standard feature pattern, once registered, has been treated as correct. When a recognition error occurred, it was generally regarded as unavoidable: either the input speech to be recognized was poor, or the limit of recognition had been reached. Learning schemes have also been proposed that improve the quality of the dictionary by so-called averaging of the input feature pattern extracted from the misrecognized input speech with the already registered standard feature patterns. With such schemes, however, poor input speech used for learning degrades the quality of the dictionary instead.

(C) Object and Structure of the Invention

The present invention aims to solve the above problems. Its object is to prevent the registration of defective feature patterns caused by utterance errors, noise, and the like in registration mode or practice mode, thereby improving the quality of the dictionary and the speech recognition rate. To that end, the speech recognition device of the present invention performs speech recognition by matching an input feature pattern, obtained by acoustic analysis of unknown input speech, against standard feature patterns stored in advance for each item in a dictionary, and is characterized by comprising: a pattern addition determination unit that determines whether the similarity between the input feature pattern extracted from speech input for the addition or modification of a standard feature pattern and the most similar standard feature pattern among the dictionary items other than that of the input feature pattern is greater, by a predetermined value or more, than the similarity between a standard feature pattern belonging to the same dictionary item as the input feature pattern and that most similar standard feature pattern; a pattern modification/addition unit that uses the input feature pattern for the addition or modification of a standard feature pattern when the difference between the two similarities is smaller than the predetermined value; and a re-registration unit that re-registers the standard feature patterns of the two dictionary items when the difference between the two similarities is greater than the predetermined value.

The speech recognition device according to the other invention performs speech recognition by matching an input feature pattern, obtained by acoustic analysis of unknown input speech, against standard feature patterns stored in advance for each item in a dictionary, and is characterized by comprising: an error detection unit that performs provisional recognition on speech input for the addition or modification of a standard feature pattern and detects a recognition error; a pattern addition determination unit that determines whether the similarity between the input feature pattern that caused the recognition error and the dictionary item given as the erroneous recognition result is greater than a predetermined value; and a pattern modification/addition unit that modifies or adds a standard feature pattern related to the input feature pattern when the pattern addition determination unit judges the similarity to be smaller than the predetermined value. The inventions are described below with reference to the drawings.

(D) Embodiments of the Invention

FIG. 1 is a diagram for explaining the relationship between the distribution of speech patterns and the standard feature patterns.

In FIG. 1, the regions enclosed by the solid lines A, B, and C show the distributions of actual speech patterns in the pattern space. A₁ and A₂ are the standard feature patterns registered for word A (here and below, "word" includes monosyllables), B₁ and B₂ are the standard feature patterns for word B, and C₁ is the standard feature pattern for word C. A single word item is sometimes covered by one standard feature pattern, as with C, but usually several standard feature patterns are prepared for one item, as with A and B, so as to cover the distribution range of the speech patterns to be recognized. For example, when the input feature pattern X of an unknown input utterance is extracted, the matching distance between X and each of the standard feature patterns A₁, A₂, B₁, ... is computed, and the item to which the standard feature pattern with the smallest distance belongs is taken as the recognition result.
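The recognition rule described for FIG. 1 can be sketched as a nearest-pattern search. This is a minimal illustration under stated assumptions, not the patent's implementation: patterns are treated as fixed-length vectors, the matching distance is taken to be Euclidean, and the names `recognize` and `dictionary` are hypothetical.

```python
import math

def recognize(x, dictionary):
    """Return (item, distance) for the standard pattern nearest to x.

    dictionary maps an item name to a list of standard feature patterns.
    Assumption: each pattern is a fixed-length tuple of floats and the
    matching distance is Euclidean.
    """
    best_item, best_dist = None, math.inf
    for item, patterns in dictionary.items():
        for p in patterns:
            d = math.dist(x, p)
            if d < best_dist:
                best_item, best_dist = item, d
    return best_item, best_dist

# Toy layout loosely following FIG. 1: two patterns each for A and B, one for C.
dictionary = {
    "A": [(0.0, 0.0), (1.0, 0.0)],
    "B": [(5.0, 5.0), (6.0, 5.0)],
    "C": [(0.0, 8.0)],
}
item, dist = recognize((0.9, 0.2), dictionary)
```

Registering several patterns per item, as for A and B, simply adds more candidates to the inner loop, so each item covers a wider region of the pattern space.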

FIGS. 2 and 3 are diagrams for explaining an outline of the processing according to the present invention, and FIG. 4 shows the configuration of one embodiment of the present invention.

As the explanation of FIG. 1 shows, if the standard feature patterns registered in the dictionary include defective or erroneous standard feature patterns that lie outside the distribution of speech patterns, the recognition rate deteriorates. The present invention prevents the registration of such invalid standard feature patterns as follows.

For example, as shown in FIG. 2, suppose that standard feature pattern A₁ has already been registered for the dictionary item name "Shibuya" and standard feature pattern B₁ for the dictionary item name "Hibiya". In this state, an utterance of the item name "Shibuya" is input in order to modify or add a standard feature pattern for "Shibuya", and the resulting input feature pattern is A₂. First, the standard feature pattern most similar to the input feature pattern A₂ is searched for among the dictionary items other than "Shibuya". Suppose this is the standard feature pattern B₁ of the item name "Hibiya". Next, to check the validity of the feature patterns A₁, A₂, and B₁, the difference between the similarity of standard feature patterns A₁ and B₁ and the similarity of standard feature pattern B₁ and input feature pattern A₂ is computed. If the similarity between B₁ and A₂ exceeds the similarity between A₁ and B₁ by a predetermined threshold or more, it is highly likely that at least one of the patterns A₁, B₁, and A₂ lies outside the normal speech pattern distribution. In that case, the user is asked to utter the item names "Shibuya" and "Hibiya" again, and the standard feature patterns A₁ and B₁ are registered anew. This re-registration corrects any registration mistake, such as an utterance error, made at the time of the initial registration. The threshold used in the similarity comparison may be set uniformly in advance for the whole system, but it is preferable to set it in advance for each pair of dictionary items being compared, taking into account factors such as the phonemes they have in common.
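The FIG. 2 check can be sketched as follows. The text does not define the similarity measure, so this sketch assumes similarity is the negated Euclidean distance between fixed-length vectors. It also assumes that, when the item has several registered patterns, the one closest to B₁ is the one compared. The function names and the single system-wide `threshold` are hypothetical.

```python
import math

def similarity(p, q):
    # Assumption: similarity is the negated Euclidean distance,
    # so a larger value means the two patterns are more alike.
    return -math.dist(p, q)

def check_addition(x, item, dictionary, threshold):
    """Return ("accept", None) or ("reregister", other_item).

    x is the input feature pattern uttered for `item`. The dictionary must
    contain at least one other item with at least one pattern.
    """
    # Most similar standard pattern among the other dictionary items (B1).
    other_item, b = max(
        ((name, p) for name, pats in dictionary.items() if name != item for p in pats),
        key=lambda pair: similarity(x, pair[1]),
    )
    # Assumption: among the item's own patterns, use the one closest to b (A1).
    a = max(dictionary[item], key=lambda p: similarity(p, b))
    excess = similarity(b, x) - similarity(a, b)
    if excess >= threshold:
        # At least one of A1, B1, A2 is suspect: re-register both items.
        return "reregister", other_item
    return "accept", None

dictionary = {"Shibuya": [(0.0, 0.0)], "Hibiya": [(4.0, 0.0)]}
ok = check_addition((1.0, 0.0), "Shibuya", dictionary, threshold=2.0)
bad = check_addition((3.5, 0.0), "Shibuya", dictionary, threshold=2.0)
```

A per-pair threshold, which the text prefers, could replace the single `threshold` argument with a lookup keyed by the two item names.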

To keep defective standard feature patterns from being registered again, particularly at re-registration, the standard feature pattern to be re-registered may be determined, for example, as follows. Several utterances are input for the registration of each standard feature pattern of a dictionary item, and an average pattern is then selected or computed, for example as described for FIG. 3(a) to (c).

For example, suppose the input feature patterns extracted from four utterances are P₁, P₂, P₃, and P₄. In the case of FIG. 3(a), the centroid of P₁, P₂, P₃, and P₄ in the pattern space is roughly computed, and the pattern closest to the centroid, P₃, is selected as the standard feature pattern to be registered. In the case of FIG. 3(b), the mean of the four feature patterns P₁, P₂, P₃, and P₄ is computed, and the resulting average pattern P_n is registered as the standard feature pattern. In the case of FIG. 3(c), pattern P₂, which lies far from the other patterns, is removed, and the average pattern P′_n of the remaining patterns P₁, P₃, and P₄ is computed and registered. Selecting or creating one standard feature pattern from several utterances in this way yields a good standard feature pattern at re-registration.
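The three strategies of FIG. 3 can be sketched as below, again treating patterns as fixed-length vectors with Euclidean distance (an assumption). The outlier rule for case (c) is also an assumption: the text only says that a pattern far from the others is removed, so this sketch drops any pattern whose mean distance to the other patterns exceeds a hypothetical `max_mean_dist`.

```python
import math

def mean_pattern(patterns):
    """FIG. 3(b): component-wise mean of the patterns."""
    n = len(patterns)
    return tuple(sum(component) / n for component in zip(*patterns))

def closest_to_centroid(patterns):
    """FIG. 3(a): the uttered pattern nearest to the centroid."""
    centroid = mean_pattern(patterns)
    return min(patterns, key=lambda p: math.dist(p, centroid))

def trimmed_mean_pattern(patterns, max_mean_dist):
    """FIG. 3(c): mean after dropping patterns far from all the others.

    Assumption: "far" means the mean distance to the other patterns
    exceeds max_mean_dist. Needs at least two patterns.
    """
    kept = []
    for i, p in enumerate(patterns):
        others = patterns[:i] + patterns[i + 1:]
        mean_d = sum(math.dist(p, q) for q in others) / len(others)
        if mean_d <= max_mean_dist:
            kept.append(p)
    # Fall back to all patterns if the rule would discard everything.
    return mean_pattern(kept or patterns)

# Four utterances; the second one lies far from the rest, like P2 in FIG. 3(c).
patterns = [(0.0, 0.0), (10.0, 10.0), (1.0, 1.0), (2.0, 0.0)]
selected = closest_to_centroid(patterns)
averaged = mean_pattern(patterns)
trimmed = trimmed_mean_pattern(patterns, max_mean_dist=8.0)
```

Case (a) always registers a pattern that was actually uttered, while (b) and (c) register a synthetic average; (c) trades a little extra computation for robustness to a single bad utterance.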

FIG. 4 is a block diagram showing the configuration of one embodiment of the present invention. In the figure, reference numeral 1 denotes a microphone, 2 an acoustic analysis unit, 3 a pattern extraction unit, 4 a switching unit, 5 a pattern addition determination unit, 6 a pattern modification/addition unit, 7 a dictionary unit, 8 a re-registration unit, 9 a display unit, and 10 a matching determination unit.

The speech signal input from the microphone 1 is frequency-analyzed in the acoustic analysis unit 2. The acoustic analysis unit 2 has, for example, a bank of band-pass filters and a parameter extraction circuit. It extracts feature quantities (parameters) of the input speech, for example the moment M₁ corresponding to the first formant frequency, the moment M₂ corresponding to the second formant frequency, and the low-band and high-band power. It then determines the sample points for these feature quantities and obtains time-series information on the feature quantities.

The parameter time-series information obtained in the acoustic analysis unit 2 is input to the pattern extraction unit 3, which extracts from it an input feature pattern representing the features of the input speech. The switching unit 4 switches between registration and matching of pattern information in response to a switching instruction, for example from a keyboard.

The pattern addition determination unit 5 is activated when, for example, a standard feature pattern is to be further modified or added after at least one pattern has been registered for each dictionary item. For the input feature pattern for addition or modification extracted by the pattern extraction unit 3, the pattern addition determination unit 5 performs the validity check described with reference to FIG. 2. If the already registered standard feature patterns are judged valid in light of their similarity to the input feature pattern, the input feature pattern is passed to the pattern modification/addition unit 6. The pattern modification/addition unit 6 either modifies a standard feature pattern by so-called averaging of the received input feature pattern with an already registered standard feature pattern of the same item, or additionally registers the input feature pattern as a new standard feature pattern.
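The text does not spell out the "so-called averaging" performed by the pattern modification/addition unit 6. One plausible reading, offered only as an assumption, is a running mean that weights the stored pattern by the number of utterances it already represents. The function name and the stored `count` are hypothetical.

```python
def update_by_averaging(standard, count, x):
    """Fold input pattern x into a standard pattern built from `count` utterances.

    Assumption: patterns are equal-length tuples and the stored pattern is
    the plain mean of the utterances seen so far.
    Returns the updated pattern and the new utterance count.
    """
    new_count = count + 1
    merged = tuple((s * count + v) / new_count for s, v in zip(standard, x))
    return merged, new_count

# A pattern registered from one utterance, updated with a second utterance.
pattern, n = update_by_averaging((0.0, 2.0), 1, (2.0, 4.0))
```

With this weighting, a single later utterance moves a well-established pattern less than it moves a freshly registered one.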

If the pattern addition determination unit 5 judges that an already registered standard feature pattern may be incorrect, the re-registration unit 8 is called. The re-registration unit 8 displays the names of the suspect dictionary items on the display unit 9, for example a CRT display, and instructs the user to input the registration speech again. When the pattern extraction unit 3 has extracted the input feature patterns for re-registration, for example from several utterances, the re-registration unit 8 performs processing such as that described for FIG. 3 to determine the standard feature pattern to be re-registered, and re-registers it in the dictionary unit 7 via the pattern modification/addition unit 6. The dictionary unit 7 is an external storage device, for example a magnetic disk device, and stores and holds the names of the items to be recognized together with their standard feature patterns.

When an unknown input utterance is to be recognized, the output of the pattern extraction unit 3 is supplied to the matching determination unit 10. The matching determination unit 10 reads out the contents of the dictionary unit 7 in sequence, matches the input feature pattern against the standard feature patterns by, for example, the well-known dynamic programming (DP) matching, and outputs the recognition result.
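DP matching aligns two feature time series of different lengths before accumulating their frame distances. The sketch below is a basic dynamic time warping recurrence, given as an illustration only: the text names DP matching but not its variant, so the symmetric three-way step, the Euclidean frame distance, and the absence of slope constraints or length normalization are all assumptions.

```python
import math

def dp_matching_distance(seq_a, seq_b):
    """Accumulated frame distance along the best time alignment.

    seq_a and seq_b are non-empty sequences of feature frames, each frame a
    tuple of floats of the same dimension.
    """
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = best accumulated distance aligning the first i frames
    # of seq_a with the first j frames of seq_b.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # seq_b frame j is stretched
                cost[i][j - 1],      # seq_a frame i is stretched
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]

# The same contour uttered at two speeds aligns with zero distance.
fast = [(0.0,), (1.0,), (2.0,)]
slow = [(0.0,), (0.0,), (1.0,), (2.0,)]
same = dp_matching_distance(fast, slow)
```

In a recognizer of the kind described, this distance would take the place of the plain vector distance when the input feature pattern is compared with each standard feature pattern.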

FIG. 5 is a diagram for explaining an outline of the processing of the second invention, and FIG. 6 shows the configuration of one embodiment of the second invention.

For example, as shown in FIG. 5, suppose that standard feature pattern A₁ has been registered for the dictionary item "Shibuya" and standard feature patterns B₁ and B₂ for the dictionary item "Hibiya". If, in registration mode or practice mode, the utterance "Shibuya" corresponding to, say, pattern A₂ is input, it is misrecognized as the item "Hibiya". Conventionally, it has been common in this situation to additionally register pattern A₂ under the item "Shibuya" or to average it with the already registered standard feature pattern A₁. However, it cannot be asserted that the utterance "Shibuya" underlying pattern A₂ was uttered correctly. If there was an utterance error or the like, reflecting pattern A₂ in the standard feature patterns degrades the quality of the dictionary instead. The present invention therefore checks the validity of pattern A₂ as follows, by considering its similarity to the already registered standard feature patterns.

First, provisional recognition is performed on the input feature pattern A₂. If a recognition error occurs, the similarity between A₂ and the dictionary item that was the result of the misrecognition, for example "Hibiya", is computed. The similarity used is, for example, the average distance or the minimum distance to the standard feature patterns B₁ and B₂ contained in the dictionary item "Hibiya". If this similarity is greater than the threshold predetermined for each dictionary item, that is, if the average or minimum distance is smaller than a predetermined value, pattern A₂ is judged likely to be erroneous and is rejected without being used to add or modify a standard feature pattern. In this way, feature patterns of speech affected by utterance errors and the like are prevented from being reflected in the dictionary.
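The rejection rule for the second invention can be sketched directly in terms of distance, since the text equates "similarity greater than the threshold" with "distance smaller than a predetermined value". Patterns are again assumed to be fixed-length vectors with Euclidean distance, and `item_distance`, `should_reject`, and `max_distance` are hypothetical names.

```python
import math

def item_distance(x, patterns, mode="min"):
    """Minimum or average distance from x to an item's standard patterns."""
    dists = [math.dist(x, p) for p in patterns]
    return min(dists) if mode == "min" else sum(dists) / len(dists)

def should_reject(x, wrong_item, dictionary, max_distance, mode="min"):
    """True when x lies so close to the misrecognized item that it is suspect.

    max_distance plays the role of the per-item threshold in the text
    (assumption: one value is passed in by the caller).
    """
    return item_distance(x, dictionary[wrong_item], mode) < max_distance

dictionary = {"Shibuya": [(0.0, 0.0)], "Hibiya": [(4.0, 0.0), (6.0, 0.0)]}
# An utterance meant as "Shibuya" that lands almost on a "Hibiya" pattern.
suspect = should_reject((3.8, 0.0), "Hibiya", dictionary, max_distance=1.0)
# A borderline utterance that is misrecognized but not close to "Hibiya".
usable = should_reject((2.4, 0.0), "Hibiya", dictionary, max_distance=1.0)
```

An utterance that falls well inside the region of the wrong item is more likely a slip of the tongue than a legitimate variant, which is why it is discarded rather than learned.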

FIG. 6 is a block diagram showing the configuration of one embodiment of the second invention. In the figure, reference numerals 1 to 7 and 10 correspond to those in FIG. 4, 11 denotes an error detection unit, and 12 denotes a keyboard.

When speech is input from the microphone 1 in registration mode or practice mode, it is acoustically analyzed and the pattern extraction unit 3 extracts a feature pattern. This input feature pattern is passed via the switching unit 4 to the error detection unit 11 and is also supplied to the matching determination unit 10. As in normal recognition, the matching determination unit 10 reads the standard feature patterns from the dictionary unit 7 in sequence and performs provisional recognition. The recognition result is shown on a display or the like (not shown). The error detection unit 11 detects an error indication entered in response to this display, for example from the keyboard 12. If there is no error indication and the recognition result is correct, processing continues as in the conventional approach.

When an error indication is detected, the misrecognized item name and the input feature pattern are passed to the pattern addition determination unit 5. As described with reference to FIG. 5, the pattern addition determination unit 5 computes the similarity between the input feature pattern and the dictionary item that was wrongly given as the recognition result, compares it with a predetermined threshold, and thereby checks the validity of the input feature pattern. If the similarity is judged greater than the threshold, the input feature pattern is likely to be defective, so it is neither added to the dictionary unit 7 nor used to modify an existing standard feature pattern by averaging. Only when the similarity is smaller than the threshold does the pattern modification/addition unit 6 additionally register the input feature pattern in the dictionary unit 7 or modify an existing standard feature pattern, for example by averaging.

(E) Effects of the Invention

As described above, the present invention prevents invalid speech patterns, for example those caused by utterance errors, from entering the dictionary. This improves the quality of the dictionary and hence the recognition rate.

[Brief Description of the Drawings]

FIG. 1 is a diagram for explaining the relationship between the distribution of speech patterns and the standard feature patterns. FIGS. 2 and 3 are diagrams for explaining an outline of the processing according to the present invention. FIG. 4 shows the configuration of one embodiment of the present invention. FIG. 5 is a diagram for explaining an outline of the processing of the second invention. FIG. 6 shows the configuration of one embodiment of the second invention.

In the figures, 1 denotes a microphone, 2 an acoustic analysis unit, 3 a pattern extraction unit, 4 a switching unit, 5 a pattern addition determination unit, 6 a pattern modification/addition unit, 7 a dictionary unit, 8 a re-registration unit, 10 a matching determination unit, and 11 an error detection unit.

Claims

[Scope of Claims]

1. A speech recognition device that performs speech recognition by matching an input feature pattern, obtained by acoustic analysis of unknown input speech, against standard feature patterns stored in advance for each item in a dictionary, the device comprising: a pattern addition determination unit that determines whether the similarity between the input feature pattern extracted from speech input for the addition or modification of a standard feature pattern and the most similar standard feature pattern among the dictionary items other than that of the input feature pattern is greater, by a predetermined value or more, than the similarity between a standard feature pattern belonging to the same dictionary item as the input feature pattern and that most similar standard feature pattern; a pattern modification/addition unit that uses the input feature pattern for the addition or modification of a standard feature pattern when the difference between the two similarities is smaller than the predetermined value; and a re-registration unit that re-registers the standard feature patterns of the two dictionary items when the difference between the two similarities is greater than the predetermined value.

2. The speech recognition device according to claim 1, wherein the re-registration unit takes a plurality of utterances for each dictionary item and registers, as the standard feature pattern, the pattern closest to the centroid of the resulting input feature patterns.

3. The speech recognition device according to claim 1, wherein the re-registration unit takes a plurality of utterances for each dictionary item and registers, as the standard feature pattern, the average pattern of the resulting input feature patterns.

4. The speech recognition device according to claim 1, wherein the re-registration unit takes a plurality of utterances for each dictionary item and registers, as the standard feature pattern, the average pattern of the resulting input feature patterns excluding any pattern that lies far from the other patterns.

5. A speech recognition device that performs speech recognition by matching an input feature pattern, obtained by acoustic analysis of unknown input speech, against standard feature patterns stored in advance for each item in a dictionary, the device comprising: an error detection unit that performs provisional recognition on speech input for the addition or modification of a standard feature pattern and detects a recognition error; a pattern addition determination unit that determines whether the similarity between the input feature pattern that caused the recognition error and the dictionary item given as the erroneous recognition result is greater than a predetermined value; and a pattern modification/addition unit that modifies or adds a standard feature pattern related to the input feature pattern when the pattern addition determination unit judges the similarity to be smaller than the predetermined value.

6. The speech recognition device according to claim 5, wherein the average distance or the minimum distance to the standard feature patterns in the dictionary item is used as the similarity.