JPS6259999A

JPS6259999A - Speaker-independent speech recognition device

Info

Publication number: JPS6259999A
Application number: JP60199564A
Authority: JP
Inventors: 陽一山田
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1985-09-11
Filing date: 1985-09-11
Publication date: 1987-03-16
Also published as: JPH0446438B2

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION (Field of Industrial Application) The present invention relates to a speaker-independent speech recognition device, and in particular to a speaker-independent speech recognition device having a function for adding standard patterns to the recognition dictionary used for pattern matching, with the aim of improving the recognition rate.

(Prior Art) In conventional speech recognition methods for unspecified speakers, it is common to input an analog speech signal, A/D-convert it, convert it into some feature quantity by means of, for example, band-pass filters or LPC analysis, calculate the similarity between that feature quantity and each standard pattern, and take the category attribute of the standard pattern with the maximum similarity as the recognized category.
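The matching step described above can be sketched as follows. This is a minimal illustration in Python, not the implementation of the publication: it assumes feature vectors are plain lists of floats and uses the negated Euclidean distance as the similarity measure (the description later names Euclidean distance only as one example). The names `StandardPattern`, `similarity`, and `recognize` are hypothetical.

```python
import math
from typing import List, NamedTuple, Sequence


class StandardPattern(NamedTuple):
    category: str          # category attribute of this standard pattern
    features: List[float]  # feature vector (e.g. filter-bank or LPC output)


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed similarity: negated Euclidean distance (closer = more similar)."""
    return -math.dist(a, b)


def recognize(input_features: Sequence[float],
              dictionary: Sequence[StandardPattern]) -> str:
    """Return the category of the standard pattern most similar to the input."""
    best = max(dictionary, key=lambda sp: similarity(input_features, sp.features))
    return best.category
```

Any other similarity measure could be substituted in `similarity` without changing `recognize`, as long as a larger value means a closer match.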

The standard patterns are feature quantities, prepared in advance for each category, that are considered typical of the recognition targets. They are generally created as follows.

For each recognition target category, processing such as clustering and averaging is applied to the speech patterns (the feature quantities described above) of multiple speakers prepared in advance (generally about 1,000), producing multiple standard patterns per category (about 20 in practice). The speech patterns used to create the standard patterns are hereinafter called learning patterns.

(Problems to Be Solved by the Invention) The learning patterns used to create the standard patterns are drawn from a strictly limited range of speakers and efficiently cover the feature distribution of those speakers. Consequently, when the learning patterns themselves are recognized using the standard patterns, the probability of misrecognition is extremely small. Moreover, because clustering is performed efficiently within each category, the multiple standard patterns are arranged evenly in the feature space, so that when learning patterns are correctly recognized, the frequency with which each standard pattern attains the maximum similarity among all standard patterns is uniform across the standard patterns.

However, when recognition is performed for unspecified speakers, the number of speakers to be recognized far exceeds the number of speakers making up the learning patterns, and the spread of the feature distribution is correspondingly larger than the spread within the learning patterns. In other words, the feature distribution of the learning patterns differs from that of the unlearned patterns encountered when recognizing unspecified speakers. Therefore, when recognition processing is performed on unlearned patterns, input patterns appear that lie in regions of the feature space outside the range of the feature distribution of the learning patterns, and such patterns cause misrecognition.

The present invention eliminates the above drawbacks of the prior art and provides a speaker-independent speech recognition device that can improve recognition performance by efficiently adding standard patterns so that unlearned input patterns that would otherwise be misrecognized are recognized correctly.

(Means for Solving the Problems) The present invention concerns a speaker-independent speech recognition device that performs speech recognition by calculating the similarity between an input speech pattern and a plurality of standard patterns created in advance and determining, as the recognized category, the category attribute of the standard pattern having the maximum similarity to the input speech pattern. To solve the problems of the prior art described above, the device comprises first storage means, category selection means, second storage means, and additional standard pattern creation means.

The first storage means corresponds to the misrecognition pattern storage section in the embodiment described later. It stores the speech patterns that are misrecognized when the speech recognition processing is performed on speech patterns (non-learning patterns) other than the speech patterns (learning patterns) used to create the standard patterns. The category selection means selects categories for which the number of misrecognitions in that processing, that is, in the recognition of the non-learning patterns, is large compared with the other recognition target categories.

The second storage means corresponds to the learning pattern storage section for creating additional standard patterns in the embodiment. It receives from the first storage means, and stores, the speech patterns belonging to the categories selected by the category selection means, that is, the categories with low recognition performance. The additional standard pattern creation means first calculates, for each category, the similarity between one pattern stored in the second storage means and every other pattern in the same category, and does so for all patterns in the category. Next, it counts, for each category and each pattern, the number of times the calculated similarity exceeds a predetermined threshold. Finally, it takes as the standard pattern to be added for the category the pattern corresponding to the average of the feature vector of the pattern with the largest count in the category and the feature vectors of all patterns in the same category whose similarity to that pattern exceeds the predetermined threshold.

(Operation) The technical means of the present invention operate as follows. First, the first storage means collects and stores the patterns that were misrecognized in the recognition processing performed on the non-learning patterns. The category selection means selects, on the basis of the per-category misrecognition rate in the recognition processing of the non-learning patterns, the categories with low recognition performance, that is, the categories for which additional standard patterns should be created.

The second storage means receives from the first storage means, and stores, the patterns whose category attribute matches a category selected by the category selection means. The additional standard pattern creation means creates an additional standard pattern by performing the following operation for each category selected by the category selection means. For a given category, it receives the corresponding patterns from the second storage means, takes one pattern belonging to the category, and calculates the similarity between that pattern and every other pattern. It then counts the number of times the calculated similarity exceeds a predetermined threshold. This operation is performed for every pattern in the category.

It then finds the pattern with the largest count, calculates the pattern corresponding to the average of the feature vector of that pattern and the feature vectors of all patterns whose similarity to it exceeds the threshold, and takes the result as the additional standard pattern for the category.

Recognition performance is therefore improved while the increase in the number of standard patterns is kept to a minimum.

(Embodiment) FIG. 1 is a block diagram showing the configuration of a speaker-independent speech recognition device according to one embodiment of the present invention, and FIG. 2 is a flowchart showing the operation of the device. As shown in FIG. 1, the speech recognition device of this embodiment comprises a similarity calculation section 1, a control section 2, a standard pattern storage section 3, a misrecognition pattern storage section 4, a category selection section 5, a learning pattern storage section 6 for creating additional standard patterns, and an additional standard pattern creation section 7.

In operation, the control section 2 outputs a recognition start command signal a to the similarity calculation section 1. On receiving the recognition start command signal a, the similarity calculation section 1 sequentially inputs input patterns b from that time onward (step ■ in FIG. 2). The correct category name is input at the same time. Here, an input pattern b is the input speech signal after A/D conversion and conversion into a predetermined feature quantity.

The similarity calculation section 1 then calculates the similarity between the input pattern b and all standard patterns stored in the standard pattern storage section 3 (step ■), outputs the recognition result c to the control section 2, and, only in the case of misrecognition, outputs the misrecognized pattern d to the misrecognition pattern storage section 4 (steps ■, ■). It repeats these operations until input of the input patterns b to be recognized has finished (step ■). To confirm the effect of the present invention, it is desirable that at least 10,000 input patterns be input. After input of the input patterns b has finished, the similarity calculation section 1 outputs a recognition end signal e to the control section 2. From the recognition result c received from the similarity calculation section 1 for each input speech pattern, the control section 2 calculates the misrecognition rate for each recognition target category, and it outputs to the category selection section 5 the per-category misrecognition rates f as of the time the recognition end signal e is received.
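The evaluation pass just described can be sketched as follows. This is a minimal Python illustration under stated assumptions: patterns are `(category, feature vector)` pairs, recognition picks the nearest standard pattern by Euclidean distance, and the function names `nearest_category` and `evaluate` are hypothetical rather than taken from the publication.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Pattern = Tuple[str, List[float]]  # (category name, feature vector)


def nearest_category(features: Sequence[float],
                     dictionary: Sequence[Pattern]) -> str:
    """Category of the closest standard pattern (Euclidean distance assumed)."""
    return min(dictionary, key=lambda sp: math.dist(features, sp[1]))[0]


def evaluate(inputs: Sequence[Pattern], dictionary: Sequence[Pattern]
             ) -> Tuple[List[Pattern], Dict[str, float]]:
    """Run recognition over labelled non-learning patterns.

    Returns the misrecognized patterns (the role of storage section 4) and
    the misrecognition rate per correct category (the role of signal f).
    """
    misrecognized: List[Pattern] = []
    totals: Dict[str, int] = defaultdict(int)
    errors: Dict[str, int] = defaultdict(int)
    for correct, features in inputs:
        totals[correct] += 1
        if nearest_category(features, dictionary) != correct:
            errors[correct] += 1
            misrecognized.append((correct, features))
    rates = {cat: errors[cat] / totals[cat] for cat in totals}
    return misrecognized, rates
```

Each misrecognized pattern is stored together with its correct category name, since the later filtering step needs that name.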

After receiving the misrecognition rates f from the control section 2, the category selection section 5 determines the names and the number of the categories whose misrecognition rate is large compared with those of the other recognition target categories (step ■). In this embodiment, categories whose misrecognition rate is about 3% or more above the average misrecognition rate over all recognition target categories are selected as the categories for which additional standard patterns should be created. The category selection section 5 then takes one of the selected categories as the additional category name (step ■) and performs the following operation.
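The selection rule of this embodiment can be sketched as follows. This is a minimal Python illustration; it assumes the "about 3%" margin means three percentage points above the mean misrecognition rate, which is one possible reading of the text. The function name `select_categories` and the parameter `margin` are hypothetical.

```python
from typing import Dict, List


def select_categories(rates: Dict[str, float], margin: float = 0.03) -> List[str]:
    """Pick categories whose misrecognition rate exceeds the mean by `margin`.

    margin = 0.03 reads the "about 3%" of the embodiment as three percentage
    points above the mean; this reading is an assumption.
    """
    if not rates:
        return []
    mean_rate = sum(rates.values()) / len(rates)
    return [cat for cat, rate in rates.items() if rate >= mean_rate + margin]
```

With rates of 10%, 1%, and 2% for three categories, the mean is about 4.3%, so only the first category clears the margin and would receive an additional standard pattern.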

The category selection section 5 sequentially inputs the misrecognized patterns from the misrecognition pattern storage section 4 (step ■), outputs to the learning pattern storage section 6 for creating additional standard patterns (step ［相］) only those patterns whose original category name matches the additional category name (step ■), and checks whether pattern input has finished (step ■). After pattern input has finished, the category selection section 5 inputs an additional standard pattern creation start command signal g to the additional standard pattern creation section 7.

The additional standard pattern creation section 7 creates an additional standard pattern by the method described in detail below.

Let GPNO be the number of learning patterns for creating additional standard patterns stored in the learning pattern storage section 6, and let DNO be the number of feature dimensions of those learning patterns. The similarity between patterns is defined by the Euclidean distance between their feature vectors. (Other possible measures of similarity include the angle between vectors and the city-block distance; the Euclidean distance is used here as an example.) The larger the Euclidean distance, the lower the similarity, and the smaller the Euclidean distance, the higher the similarity.

For one learning pattern for creating additional standard patterns, the Euclidean distance to each of the other such learning patterns is calculated, and the number of times the distance is smaller than a predetermined threshold RTH is counted (step ＠). This processing is performed for all learning patterns for creating additional standard patterns, and the count obtained for the i-th pattern is denoted RPH_i (i = 1 to GPNO). Once this has been done for all of the learning patterns (step ■), the pattern with the largest RPH_i is found, the average vector of that pattern and all other learning patterns whose Euclidean distance to it is smaller than the threshold RTH is calculated (step Ｏ), and the average vector is output to the standard pattern storage section 3 as additional standard pattern h (step ■), while an additional standard pattern creation end signal is output to the category selection section 5.
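The creation procedure above can be sketched as follows for a single category. This is a minimal Python illustration, assuming feature vectors are lists of floats of equal length; the function name `make_additional_pattern` and the tie-breaking rule (the first pattern with the largest count wins) are assumptions, since the text does not say how ties are resolved.

```python
import math
from typing import List, Sequence


def make_additional_pattern(patterns: Sequence[Sequence[float]],
                            rth: float) -> List[float]:
    """Build one additional standard pattern for a single category.

    patterns: misrecognized feature vectors of the category (GPNO vectors,
              each of DNO dimensions); must be non-empty.
    rth:      distance threshold RTH.
    """
    n = len(patterns)
    # neighbors[i] = indices j != i whose Euclidean distance to i is below RTH
    neighbors = [
        [j for j in range(n)
         if j != i and math.dist(patterns[i], patterns[j]) < rth]
        for i in range(n)
    ]
    # Pattern with the most close neighbors (ties: first one found).
    center = max(range(n), key=lambda i: len(neighbors[i]))
    members = [center] + neighbors[center]
    dims = len(patterns[center])
    # Average vector of the center pattern and all of its close neighbors.
    return [sum(patterns[m][k] for m in members) / len(members)
            for k in range(dims)]
```

Patterns farther than RTH from the chosen center contribute nothing to the average, so isolated misrecognized patterns do not pull the new standard pattern away from the dense region.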

So that the above operation is carried out for every category for which an additional standard pattern should be created, the category selection section 5 outputs the additional standard pattern creation start command signal g to the additional standard pattern creation section 7, which performs the above creation operation in accordance with this command (step Ｏ).

FIG. 3 conceptually shows the distribution in the feature space of the learning patterns for creating additional standard patterns for one category. The operation of the additional standard pattern creation section 7 is explained qualitatively below with reference to FIG. 3.

In FIG. 3, with the learning patterns for creating additional standard patterns denoted A to L, the pattern for which the number of times the Euclidean distance falls below the threshold RTH is greatest is A. The patterns whose Euclidean distance to A is smaller than the threshold RTH are B, C, D, E, and F, so the average vector of the six patterns A, B, C, D, E, and F is calculated and taken as additional standard pattern S.

In FIG. 3, patterns such as G, J, and L lie in regions of the feature space far from the other misrecognized patterns. By excluding these patterns from the averaging, the additional standard pattern is created only from patterns located in the region of the feature space where misrecognized patterns occur most frequently. The increase in the number of standard patterns is thereby kept to a minimum, and an efficient improvement in recognition performance can be expected.

(Effects of the Invention) As described in detail above, in the present invention an additional standard pattern is created only for categories in which misrecognition is likely to occur, by averaging only those misrecognized patterns that lie in the region of the feature space where misrecognized patterns occur most frequently. The increase in the number of standard patterns is therefore kept to a minimum while recognition performance is improved efficiently.

[BRIEF DESCRIPTION OF THE DRAWINGS] FIG. 1 is a block diagram of an embodiment of a speaker-independent speech recognition device according to the present invention, FIG. 2 is a flowchart showing the operation of the embodiment, and FIG. 3 is an explanatory diagram of the additional standard pattern creation method according to the present invention. 1: similarity calculation section; 2: control section; 3: standard pattern storage section; 4: misrecognition pattern storage section; 5: category selection section; 6: learning pattern storage section for creating additional standard patterns; 7: additional standard pattern creation section.

Claims

[Claims] A speaker-independent speech recognition device that performs speech recognition by calculating the similarity between an input speech pattern and a plurality of standard patterns created in advance and determining, as the recognized category, the category attribute of the standard pattern having the maximum similarity to the input speech pattern, the device comprising: first storage means for storing speech patterns that are misrecognized when the speech recognition processing is performed on speech patterns other than the speech patterns used to create the standard patterns; category selection means for selecting categories for which the number of misrecognitions in the speech recognition processing is large compared with other recognition target categories; second storage means for receiving speech patterns from the first storage means and storing them; and additional standard pattern creation means for calculating, for each category, the similarity between one pattern stored in the second storage means and every other pattern, for all patterns in the same category, counting for each category and each pattern the number of times the calculated similarity exceeds a predetermined threshold, and creating, as the standard pattern to be added for the category, a pattern corresponding to the average of the feature vector of the pattern with the largest count in the category and the feature vectors of all patterns in the same category whose similarity to that pattern exceeds the predetermined threshold.