JP2007065491A

JP2007065491A - Pattern model generating device, pattern model evaluating device, and pattern recognizing device

Info

Publication number: JP2007065491A
Application number: JP2005253819A
Authority: JP
Inventors: Makoto Shosakai; 誠庄境; Toshihide Nakino; 豪秀奈木野
Original assignee: Asahi Kasei Corp
Current assignee: Asahi Kasei Corp
Priority date: 2005-09-01
Filing date: 2005-09-01
Publication date: 2007-03-15
Anticipated expiration: 2025-09-01
Also published as: JP4763387B2

Abstract

PROBLEM TO BE SOLVED: To provide a pattern model generating device capable of easily generating a new pattern model that improves the recognition probability for signals that cause the recognition probability of a pattern model to decrease; a pattern model evaluating device suited to evaluating the recognition probability for such signals; and a pattern recognizing device that performs pattern recognition using the pattern model with the improved recognition probability. SOLUTION: A data space consisting of high-order signal pattern models and high-order noise pattern models, each made up of high-dimensional elements, is mapped to a data space consisting of low-order signal vectors and low-order noise vectors while the positional relationships among the high-order pattern models are approximated, and the low-order vector space is sectioned and visualized. The pattern model for pattern recognition is then generated using the signal data and noise data corresponding to low-order vectors located a specified distance or more from the center. COPYRIGHT: (C)2007,JPO&INPIT

Description

The present invention relates to a pattern model used for pattern recognition of noise-mixed signals, in which various signals are mixed with various kinds of noise. In particular, it relates to: a pattern model generation apparatus, pattern model generation program, and pattern model generation method that can easily generate a new pattern model that improves the recognition probability for noise-mixed signals that cause the recognition probability of a pattern model to decrease; a pattern model evaluation apparatus, pattern model evaluation program, and pattern model evaluation method suitable for evaluating the recognition probability for such noise-mixed signals; and a pattern recognition apparatus, pattern recognition program, and pattern recognition method that perform pattern recognition using the new pattern model.

In general, pattern recognition consists of two parts: a feature analysis unit that converts the signal to be recognized into a sequence of feature parameters, and a feature matching unit that compares the feature parameter sequence obtained by the feature analysis unit with previously prepared information about feature parameters and takes the match with the highest similarity as the recognition result. Speech recognition is used as the example below. Speech recognition consists of two parts: an acoustic analysis unit that converts a speech sample uttered by a speaker into a sequence of feature parameters, and a speech matching unit that compares the feature parameter sequence obtained by the acoustic analysis unit with information about the feature parameters of vocabulary words stored in advance in a storage device such as a memory or hard disk, and takes the speech with the highest similarity as the recognition result.

Cepstrum analysis and linear prediction analysis are known acoustic analysis methods for converting a speech sample into a sequence of feature parameters. They are described in detail in "Chapter 3: Signal Processing and Analysis Methods for Speech Recognition" of Non-Patent Document 1.
Within speech recognition, the technique of recognizing the speech of unspecified speakers is generally called unspecified-speaker speech recognition. In unspecified-speaker speech recognition, information about the feature parameters of vocabulary words is stored in the storage device in advance, so the user does not have to register the words to be recognized, as is required in specific-speaker speech recognition.

The hidden Markov model (HMM) method is generally used both to create the information about the feature parameters of vocabulary words and to match that information against the feature parameter sequence converted from the input speech. In the HMM method, speech units such as syllables, demisyllables, phonemes, biphones, and triphones are modeled by HMMs. These models are generally called acoustic models.

The method for creating an acoustic model is described in detail in "Chapter 6: Theory and Implementation of Hidden Markov Models" of Non-Patent Document 1.
In addition, using the Viterbi algorithm described in the same "Chapter 6: Theory and Implementation of Hidden Markov Models" of Non-Patent Document 1, a person skilled in the art can easily construct an unspecified-speaker speech recognition apparatus.

Conventionally, the performance of speech recognition has mostly been evaluated by trial and error. For example, the collected speech data is divided as appropriate into a learning data group and an evaluation data group, and an HMM acoustic model is generated from the learning data group. The recognition performance of the acoustic model is then evaluated on the evaluation data group by calculating the average, maximum, minimum, and so on of the individual recognition results. There have also been attempts at a more objective evaluation in which the division is performed several times so that the combination of learning data group and evaluation data group changes each time, and the performance evaluation results for these divisions are averaged. Even so, performance evaluation of speech recognition has been empirical and could not be called systematic.
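The repeated-split evaluation described above can be sketched as follows. This is a minimal illustration, not the procedure of any cited work: `repeated_split_scores`, `toy_train_and_score`, the 25% evaluation fraction, and the one-dimensional threshold classifier standing in for HMM training are all assumptions made for the example.

```python
import random
from statistics import mean


def repeated_split_scores(samples, train_and_score, n_rounds=5,
                          eval_fraction=0.25, seed=0):
    """Split the data several times and summarize the per-split scores."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_rounds):
        shuffled = samples[:]
        rng.shuffle(shuffled)
        n_eval = max(1, int(len(shuffled) * eval_fraction))
        eval_group, learn_group = shuffled[:n_eval], shuffled[n_eval:]
        scores.append(train_and_score(learn_group, eval_group))
    # Average, maximum, and minimum of the recognition results.
    return {"mean": mean(scores), "max": max(scores), "min": min(scores)}


def toy_train_and_score(learn_group, eval_group):
    """Hypothetical stand-in for HMM training: a 1-D threshold classifier."""
    lows = [x for x, label in learn_group if label == 0]
    highs = [x for x, label in learn_group if label == 1]
    threshold = (mean(lows) + mean(highs)) / 2
    correct = sum((x > threshold) == (label == 1) for x, label in eval_group)
    return correct / len(eval_group)


samples = [(i / 10, 0) for i in range(8)] + [(1 + i / 10, 1) for i in range(8)]
summary = repeated_split_scores(samples, toy_train_and_score)
print(summary)
```

Reporting the minimum alongside the mean matters here because a performance guarantee has to rest on the worst split, not the average one.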

As a technology that provides a voice input interface, speech recognition is increasingly expected to be put to practical use in in-vehicle devices such as car navigation systems, which require hands-free operation, and in robots with spoken dialogue functions. At present, however, it cannot be said to achieve a recognition probability of 100%. Depending on the user's voice quality and manner of speaking, the recognition probability often falls below 50%. Until now, recognition performance has usually been evaluated using the average recognition probability as the measure. To evaluate performance as an input interface, however, it is necessary to evaluate the minimum recognition probability and to guarantee performance in the following form: because the minimum is, for example, 90% or more, a recognition performance of 90% or more can be achieved for all users.

In addition to guaranteeing minimum performance, it is desirable to generate a pattern model that delivers sufficient recognition performance even for data that cause the recognition probability to decrease.
For example, techniques for generating a pattern model from speech data mixed with noise include the noise adaptation systems for speech models described in Patent Document 1 and Patent Document 2, and the acoustic model creation method described in Patent Document 3.

The prior art described in Patent Document 1 first clusters noise by SNR (signal-to-noise ratio), then clusters the various types of noise into a tree structure for each SNR, and creates a speech HMM (pattern model) for each cluster from noise-mixed speech data in which the noise is mixed at a predetermined SNR. From the resulting set, the speech HMM whose SNR and noise type match is selected and used for speech recognition, thereby building a speech recognition system that is robust to noise.
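Mixing noise into a signal at a predetermined SNR can be illustrated as below. This is a sketch under stated assumptions, not the procedure of Patent Document 1: the function names `power` and `mix_at_snr` are hypothetical, SNR is taken as the ratio of mean signal power to mean noise power in decibels, and the signal and noise are assumed to be equal-length sample lists.

```python
import math


def power(samples):
    """Mean power of a sample sequence."""
    return sum(x * x for x in samples) / len(samples)


def mix_at_snr(signal, noise, snr_db):
    """Scale the noise so signal power / noise power equals snr_db, then add."""
    if len(signal) != len(noise):
        raise ValueError("signal and noise must have the same length")
    noise_power = power(noise)
    if noise_power == 0:
        raise ValueError("noise has zero power")
    # Required noise power is signal power divided by the linear SNR.
    gain = math.sqrt(power(signal) / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(signal, noise)]


signal = [math.sin(0.1 * i) for i in range(200)]
noise = [((i * 37) % 11 - 5) / 5 for i in range(200)]  # toy deterministic noise
mixed = mix_at_snr(signal, noise, snr_db=10)
```

In practice the noise segment would be cut or looped to the length of the utterance before mixing; that step is omitted here.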

The prior art described in Patent Document 2, an improvement on Patent Document 1 by the same applicant, adds to the technique of Patent Document 1 a linear transformation of the selected speech HMM so that the likelihood given by the selected speech HMM becomes even larger, thereby building a speech recognition system that is robust to noise.
The prior art described in Patent Document 3 first clusters noise by a statistical method, mixes speech and noise at a predetermined SNR for each noise cluster, performs noise removal processing, and then creates a speech HMM (pattern model) of the noise-mixed speech for each noise cluster, thereby building a speech recognition system that is robust to noise.

Patent Document 1: JP 2004-117624 A
Patent Document 2: JP 2004-279466 A
Patent Document 3: JP 2004-206063 A
Non-Patent Document 1: Lawrence Rabiner and Biing-Hwang Juang, "Fundamentals of Speech Recognition," Prentice Hall Signal Processing Series, 1993

However, in the prior art of Patent Documents 1 to 3, a speech HMM of noise-mixed speech is created for each noise cluster, so the optimal speech HMM must be selected from the set of noise-mixed speech HMMs. As the scale of the noise clusters grows, creating the set of noise-mixed speech HMMs and selecting the optimal one take a great deal of time and effort.

In addition, no systematic method has so far been proposed for deriving the combinations of learning data group and evaluation data group that cause the recognition probability to decrease, which has made it difficult to guarantee the minimum recognition performance of a pattern model. None of the prior art in Patent Documents 1 to 3 discloses such a systematic method.

The present invention was made in view of these unsolved problems of the prior art. Its object is to provide: a pattern model generation apparatus, pattern model generation program, and pattern model generation method that can easily generate a new pattern model that improves the recognition probability for noise-mixed signals that cause the recognition probability of a pattern model to decrease; a pattern model evaluation apparatus, pattern model evaluation program, and pattern model evaluation method suitable for evaluating the recognition probability for such noise-mixed signals; and a pattern recognition apparatus, pattern recognition program, and pattern recognition method that perform pattern recognition using the new pattern model.

To achieve the above object, the pattern model generation apparatus according to claim 1 of the present invention is
a pattern model generation apparatus that generates a pattern model used for pattern recognition of recognition target data, comprising:
signal data storage means for storing a plurality of previously acquired signal data, not mixed with noise, relating to a plurality of targets;
low-order signal vector space storage means for storing a low-order signal vector space which, as a substitute for a data space composed of a plurality of high-order signal pattern models made up of elements of four or more dimensions and generated from the signal data, is obtained by mapping that data space to a data space composed of low-order signal vectors of three or fewer dimensions while approximating the positional relationships among the high-order signal pattern models;
noise data storage means for storing a plurality of types of noise data not mixed with signals;
low-order noise vector space storage means for storing a low-order noise vector space which, as a substitute for a data space composed of a plurality of high-order noise pattern models made up of elements of four or more dimensions and generated from the noise data, is obtained by mapping that data space to a data space composed of low-order noise vectors of three or fewer dimensions while approximating the positional relationships among the high-order noise pattern models;
low-order signal vector selection means for selecting low-order signal vectors for pattern model generation from among those low-order signal vectors in the low-order signal vector space that are located a predetermined distance or more from the center of gravity of all the low-order signal vectors;
low-order noise vector selection means for selecting low-order noise vectors for pattern model generation from among those low-order noise vectors in the low-order noise vector space that are located a predetermined distance or more from the center of gravity of all the low-order noise vectors;
noise-mixed signal data generation means for generating, based on the signal data corresponding to the low-order signal vectors selected by the low-order signal vector selection means and the noise data corresponding to the low-order noise vectors selected by the low-order noise vector selection means, noise-mixed signal data in which the noise data is mixed into each of the signal data; and
pattern model generation means for generating a pattern model by training a probabilistic model using the noise-mixed signal data as learning data.

With this configuration, the signal data storage means can store a plurality of previously acquired signal data, not mixed with noise, relating to a plurality of targets. The low-order signal vector space storage means can store the low-order signal vector space obtained, as a substitute for the data space composed of the plurality of high-order signal pattern models made up of elements of four or more dimensions and generated from the signal data, by mapping that data space to a data space composed of low-order signal vectors of three or fewer dimensions while approximating the positional relationships among the high-order signal pattern models. The noise data storage means can store a plurality of types of noise data not mixed with signals. The low-order noise vector space storage means can store the low-order noise vector space obtained, as a substitute for the data space composed of the plurality of high-order noise pattern models made up of elements of four or more dimensions and generated from the noise data, by mapping that data space to a data space composed of low-order noise vectors of three or fewer dimensions while approximating the positional relationships among the high-order noise pattern models.

Likewise, the low-order signal vector selection means can select low-order signal vectors for pattern model generation from among those low-order signal vectors in the low-order signal vector space that are located a predetermined distance or more from the center of gravity of all the low-order signal vectors. The low-order noise vector selection means can select low-order noise vectors for pattern model generation from among those low-order noise vectors in the low-order noise vector space that are located a predetermined distance or more from the center of gravity of all the low-order noise vectors. The noise-mixed signal data generation means can generate, based on the signal data corresponding to the selected low-order signal vectors and the noise data corresponding to the selected low-order noise vectors, noise-mixed signal data in which the noise data is mixed into each of the signal data. The pattern model generation means can generate a pattern model by training a probabilistic model using the noise-mixed signal data as learning data.

In the low-order signal vector space and the low-order noise vector space, the density of vectors tends to decrease with distance from the center of gravity. Noise-mixed signal data generated from the signal data and noise data corresponding to vectors in low-density regions (the outer periphery of the distribution and its vicinity) tends to yield a low recognition probability with, for example, a pattern model for unspecified targets generated using all the data in each space. Because noise-mixed signal data can be generated using data located a predetermined distance or more from the center of gravity (the range of positions where the density is low), and a pattern model can be generated using that noise-mixed signal data as learning data, it is possible to generate a pattern model that improves the recognition probability for signal data that lower the recognition probability of a pattern model.
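The selection of vectors lying a predetermined distance or more from the center of gravity can be sketched as follows. The function name `select_outer_vectors`, the use of Euclidean distance in the low-order space, and the sample coordinates are assumptions made for illustration; the text above does not fix how the threshold is chosen.

```python
import math


def select_outer_vectors(vectors, min_distance):
    """Return the center of gravity and the indices of vectors at least
    min_distance away from it (Euclidean distance, assumed here)."""
    dim = len(vectors[0])
    center = [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
    selected = [
        index for index, v in enumerate(vectors)
        if math.dist(v, center) >= min_distance
    ]
    return center, selected


# Hypothetical 2-D low-order vectors, one per target (e.g. per speaker).
low_order_vectors = [(0, 0), (1, 0), (0, 1), (1, 1), (5, 5)]
center, outer = select_outer_vectors(low_order_vectors, min_distance=1.5)
print(center, outer)
```

The returned indices identify which stored signal data (or noise data) would be taken as material for the noise-mixed learning data; the same function applies unchanged to the low-order noise vector space.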

Furthermore, a pattern model generated using data located a predetermined distance or more from the center of gravity also tends to give a relatively good recognition probability for noise-mixed signal data generated from data within the predetermined distance of the center of gravity. Generating the pattern model from data located the predetermined distance or more from the center of gravity therefore makes it possible to generate a pattern model that gives an evenly good recognition probability across signal data of unspecified targets in general.
Here, a pattern model is, for example when the given data is speech data, a model of the signal patterns against which the speech data is matched, and is expressed using a statistical model such as an HMM or a neural network, or a time series of feature parameters.

Examples of signal data include time-series data such as: acoustic data such as human speech; call data of wildlife such as wild birds, insects, frogs, bats, and other animals; image data; infrared sensor data; acceleration sensor data; azimuth sensor data; pressure sensor data; vibration sensor data from piezoelectric elements, vibrometers, and the like, and all other sensor data; physical data on the charging status of batteries such as lithium-ion secondary batteries and fuel cells; biological signal data such as electrocardiograms, electromyograms, blood pressure, and body weight; microarray data for genetic analysis; weather data such as temperature, humidity, and atmospheric pressure; environmental data such as oxygen concentration and nitrogen oxide concentration; and economic trend data such as stock prices and commodity prices.

Noise data includes noise data for acoustics, such as household noise in homes, factory noise, and traffic noise; noise data for electrical and electronic circuit noise; noise data for vibration noise; and so on.
The pattern model is defined as a model containing elements of four or more dimensions because, in pattern recognition such as speech recognition, high recognition performance cannot be obtained unless feature parameters of at least four dimensions are used, and because, in speech recognition, no feature parameters of three or fewer dimensions capable of practically effective recognition performance have been found so far.

Signal data relating to a plurality of targets refers, for example, to a set consisting of the data itself as measured from the targets, feature quantities extracted from that data, pattern models generated from those feature quantities, and the like, together with a text file describing their content. An example is a set consisting of speech data uttered by a plurality of speakers, feature quantities extracted from the speech data, pattern models generated from those feature quantities, and the like, together with a text file describing the content of the utterances.

The distance between high-order signal pattern models, or between high-order noise pattern models, indicates the similarity between pattern models generated from the signal data or noise data of specific targets. Examples include the Euclidean distance and the Mahalanobis generalized distance, in which the distance for measuring similarity is the inner product of two vectors and the angle formed by the two vectors is evaluated as the similarity. In the present invention, other usable distances include the Bhattacharyya distance, squared Euclidean distance, cosine distance, Pearson correlation, Chebyshev distance, city-block distance (or Manhattan distance), Minkowski sum, Kullback information, Bayes distance, Chernoff distance, the Euclidean distance based on the mean vectors of the normal distributions of pattern models configured as HMMs, the Euclidean distance based on those mean vectors normalized by the standard deviations of the normal distributions, and the Bhattacharyya distance based on the normal distributions of pattern models configured as HMMs. In other words, although the term "distance" is used, any measure that indicates similarity may be used.
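Two of the listed measures can be sketched for a pair of normal distributions. This is an illustration under assumptions, not the specification's own formula: each pattern model is reduced to a single normal distribution with diagonal covariance, and the function names `mean_euclidean` and `bhattacharyya_diag` are hypothetical. The Bhattacharyya distance uses the standard closed form for Gaussians, summed over dimensions.

```python
import math


def mean_euclidean(mean_a, mean_b):
    """Euclidean distance between the mean vectors of two normal distributions."""
    return math.dist(mean_a, mean_b)


def bhattacharyya_diag(mean_a, var_a, mean_b, var_b):
    """Bhattacharyya distance between two diagonal-covariance Gaussians."""
    total = 0.0
    for ma, va, mb, vb in zip(mean_a, var_a, mean_b, var_b):
        v = (va + vb) / 2.0                      # averaged variance
        total += (ma - mb) ** 2 / (8.0 * v)      # mean-separation term
        total += 0.5 * math.log(v / math.sqrt(va * vb))  # variance-mismatch term
    return total


print(mean_euclidean([0.0, 0.0], [3.0, 4.0]))
print(bhattacharyya_diag([0.0], [1.0], [2.0], [1.0]))
```

Unlike the Euclidean distance on means, the Bhattacharyya distance also grows when the two distributions have the same mean but different variances.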

The process of mapping a data space of four or more dimensions to a low-dimensional data space of three or fewer dimensions while approximating the distance relationships among the pattern models is, for example, a process that maps all pattern models into a low-dimensional space (for example, a two- or three-dimensional space) so that two pattern models separated by a small distance are placed close together and two pattern models separated by a large distance are placed far apart.
For example, when the Euclidean distance is used as the distance between pattern models, pattern models that are close in Euclidean distance in the mapped low-dimensional space can be regarded as more similar to each other than those that are far apart.
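A distance-preserving mapping of this kind can be sketched by minimizing the Sammon stress, which penalizes differences between the original pairwise distances and the distances in the low-dimensional space. The sketch below is an assumption-laden illustration: it uses plain gradient descent with step halving rather than Sammon's original update rule, the names `sammon_stress` and `sammon_map` are hypothetical, and all pairwise target distances are assumed to be strictly positive.

```python
import math
import random


def sammon_stress(target, points):
    """Sammon stress between target distances and distances among points."""
    n = len(points)
    total, scale = 0.0, 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            total += (target[i][j] - d) ** 2 / target[i][j]
            scale += target[i][j]
    return total / scale


def sammon_map(target, dim=2, steps=200, rate=0.5, seed=0):
    """Place n points in `dim` dimensions so their distances approximate target."""
    rng = random.Random(seed)
    n = len(target)
    points = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n)]
    scale = sum(target[i][j] for i in range(n) for j in range(i + 1, n))
    stress = sammon_stress(target, points)
    for _ in range(steps):
        grads = []
        for i in range(n):
            g = [0.0] * dim
            for j in range(n):
                if i == j:
                    continue
                d = max(math.dist(points[i], points[j]), 1e-12)
                w = (target[i][j] - d) / (target[i][j] * d)
                for k in range(dim):
                    g[k] += -2.0 / scale * w * (points[i][k] - points[j][k])
            grads.append(g)
        trial = [[p[k] - rate * g[k] for k in range(dim)]
                 for p, g in zip(points, grads)]
        trial_stress = sammon_stress(target, trial)
        if trial_stress <= stress:      # accept only non-worsening steps
            points, stress = trial, trial_stress
        else:                           # otherwise shrink the step size
            rate *= 0.5
    return points, stress


# Hypothetical 4-D model parameters; any pattern-model distance could be used.
high = [(0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (3, 3, 3, 3)]
target = [[math.dist(a, b) for b in high] for a in high]
points, stress = sammon_map(target)
```

Because the mapping takes only a distance matrix as input, the pattern models never need to be expressed as vectors themselves; only their pairwise distances are required.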

Known methods for converting a high-order pattern model into a lower-order vector include the Sammon method (see J. W. Sammon, "A nonlinear mapping for data structure analysis," IEEE Trans. Computers, vol. C-18, no. 5, pp. 401-409, May 1969), discriminant analysis (see R. A. Fisher, "The use of multiple measurements in taxonomic problems," Ann. Eugenics, vol. 7, no. Part II, pp. 179-188, 1936), the Aladjem method (see M. Aladjem, "Multiclass discriminant mappings," Signal Process., vol. 35, pp. 1-18, 1994), neural network methods (see J. Mao et al., "Artificial neural networks for feature extraction and multivariate data projection," IEEE Trans. Neural Networks, vol. 6, no. 2, pp. 296-317, 1995), graph-based methods (see Y. Mori et al., "Comparison of low-dimensional mapping techniques based on discriminatory information," Proc. 2nd International ICSC Symposium on Advances in Intelligent Data Analysis (AIDA'2001), CD-ROM Paper-no. 1724-166, Bangor, United Kingdom, 2001), projection pursuit (see J. H. Friedman et al., "A projection pursuit algorithm for exploratory data analysis," IEEE Trans. Comput., vol. C-23, no. 9, pp. 881-889, 1974), and the SOM method (see T. Kohonen, "Self-Organizing Maps," Springer Series in Information Sciences, vol. 30, Berlin, 1995).

更に、請求項２に係る発明は、請求項１記載のパターンモデル生成装置において、
前記高次信号パターンモデル、前記高次ノイズパターンモデルおよび前記パターンモデルは、ＨＭＭ（Hidden Markov Model）によって構成されることを特徴としている。
このような構成であれば、前記各パターンモデルは、ＨＭＭ（Hidden Markov Model）によって構成することが可能であるので、時間的概念を伴う信号データに対して適切なパターンモデルを生成することが可能である。 Furthermore, the invention according to claim 2 is the pattern model generation device according to claim 1,
The high-order signal pattern model, the high-order noise pattern model, and the pattern model are each constituted by an HMM (Hidden Markov Model).
With such a configuration, each pattern model can be constituted by an HMM (Hidden Markov Model), so an appropriate pattern model can be generated for signal data that involves a temporal dimension.

ここで、ＨＭＭは時系列信号のパターンモデルであり、複数の定常信号源の間を遷移することで、非定常な時系列信号をモデル化することが可能である。例えば、音声は話すスピードによりその時間的長さが変わり、発話内容により、周波数上で特徴的な形状（スペクトル包絡という）を示すが、その形状は発声する人、環境、内容等に依存し、揺らぎが生じる。ＨＭＭはそのような揺らぎを吸収することができる統計的モデルである。ＨＭＭは、どのような単位で定義されても良く（例えば単語や音素）、各ＨＭＭ（ここで「各」というのは例えば単語であれば複数の単語が存在し、音素においても複数の音素が存在するため。）は、複数の状態からなり、各状態は統計的に学習された、状態遷移確率と出力確率（正規分布、混合正規分布等の確率分布）とで構成されている。遷移確率は音声の時間伸縮の揺らぎを、出力確率はスペクトルの揺らぎを吸収する。 Here, an HMM is a pattern model of a time-series signal; by transitioning among a plurality of stationary signal sources, it can model a non-stationary time-series signal. For example, the duration of speech varies with speaking rate, and speech exhibits a characteristic shape in the frequency domain (called the spectral envelope) that depends on the utterance content; this shape fluctuates depending on the speaker, the environment, the content, and so on. The HMM is a statistical model that can absorb such fluctuations. An HMM may be defined for any unit (for example, a word or a phoneme). Each HMM ("each" because, for example, there are a plurality of words when the unit is a word, and likewise a plurality of phonemes when the unit is a phoneme) consists of a plurality of states, and each state is characterized by statistically trained state transition probabilities and an output probability (a probability distribution such as a normal distribution or a mixture of normal distributions). The transition probabilities absorb fluctuations in the temporal stretching and compression of speech, and the output probabilities absorb fluctuations in the spectrum.
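To make the roles of the state transition probabilities and output probabilities concrete, here is a minimal sketch of the standard forward algorithm, which scores an observation sequence against one HMM. For brevity it assumes discrete output symbols, whereas the text describes normal or mixed normal output distributions; the function and parameter names are illustrative only.

```python
def forward_likelihood(initial, transition, emission, observations):
    """Probability of an observation sequence under one HMM (forward algorithm).

    initial[i]       : probability of starting in state i
    transition[i][j] : probability of moving from state i to state j
    emission[i][o]   : probability that state i outputs discrete symbol o
                       (a simplification of the Gaussian outputs in the text)
    """
    if not observations:
        return 1.0  # the empty sequence is certain
    n_states = len(initial)
    # alpha[i] = P(observations so far, current state = i)
    alpha = [initial[i] * emission[i][observations[0]] for i in range(n_states)]
    for symbol in observations[1:]:
        alpha = [
            sum(alpha[i] * transition[i][j] for i in range(n_states))
            * emission[j][symbol]
            for j in range(n_states)
        ]
    return sum(alpha)
```

In a recognizer, one such likelihood would be computed per word or phoneme HMM and the highest-scoring model chosen. Self-transitions in `transition` let a state persist for a variable number of frames, which is how differences in speaking rate are absorbed.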

更に、請求項３に係る発明は、請求項１又は請求項２記載のパターンモデル生成装置において、
前記低次信号ベクトル空間および前記低次ノイズベクトル空間は２次元または３次元のデータ空間であって、
前記低次信号ベクトル空間を構成する複数の低次信号ベクトルを、全低次信号ベクトルの重心を中心とし且つ前記重心と当該重心から最も離れた位置の低次信号ベクトルとの距離を半径とした１つの外円または外球と、前記重心を中心とし且つ前記外円または外球よりも小さな半径のｎ個の内円または内球（ｎは１以上の整数）とにより区分すると共に、前記外円および内円または外球および内球からなる複数の同心円同士または同心球同士の各曲線間または各曲面間に形成される環状または球面状の領域を、半径方向に伸びる線または面によって複数に区分する低次信号ベクトル空間区分手段と、
前記低次ノイズベクトル空間を構成する複数の低次ノイズベクトルを、前記低次ノイズベクトルの重心を中心とし且つ前記重心と当該重心から最も離れた位置の低次ノイズベクトルとの距離を半径とした１つの外円または外球と、前記重心を中心とし且つ前記外円または外球よりも小さな半径のｎ個の内円または内球（ｎは１以上の整数）とにより区分すると共に、前記外円および内円または外球および内球からなる複数の同心円同士または同心球同士の各曲線間または各曲面間に形成される環状または球面状の領域を、半径方向に伸びる線または面によって複数に区分する低次ノイズベクトル空間区分手段と、
前記低次信号ベクトル空間を構成する各低次信号ベクトルを、前記区分結果と共に表示する低次信号ベクトル空間表示手段と、
前記低次ノイズベクトル空間を構成する各低次ノイズベクトルを、前記区分結果と共に表示する低次ノイズベクトル空間表示手段と、を備え、
前記低次信号ベクトル選択手段は、前記表示された低次信号ベクトル空間における、前記外円または外球と半径が前記所定距離以上の内円または内球とからなる複数の同心円同士または同心球同士の各曲線間または各曲面間に形成される環状または球面状の領域に含まれる低次信号ベクトルの中からパターンモデル生成用の低次信号ベクトルを選択するようになっており、
前記低次ノイズベクトル選択手段は、前記表示された低次ノイズベクトル空間における、前記外円または外球と半径が前記所定距離以上の内円または内球とからなる複数の同心円同士または同心球同士の各曲線間または各曲面間に形成される環状または球面状の領域に含まれる低次ノイズベクトルの中からパターンモデル生成用の低次ノイズベクトルを選択するようになっていることを特徴としている。 Furthermore, the invention according to claim 3 is the pattern model generation device according to claim 1 or 2,
The low-order signal vector space and the low-order noise vector space are two-dimensional or three-dimensional data spaces,
Low-order signal vector space partitioning means for partitioning the plurality of low-order signal vectors constituting the low-order signal vector space by one outer circle or outer sphere, centered on the centroid of all the low-order signal vectors and having as its radius the distance between the centroid and the low-order signal vector farthest from the centroid, and by n inner circles or inner spheres (n is an integer of 1 or more) centered on the centroid and having radii smaller than that of the outer circle or outer sphere, and for further dividing each annular or spherical region formed between the curves or curved surfaces of the concentric circles or concentric spheres, consisting of the outer circle and inner circles or the outer sphere and inner spheres, into a plurality of regions by lines or surfaces extending in the radial direction;
Low-order noise vector space partitioning means for partitioning the plurality of low-order noise vectors constituting the low-order noise vector space by one outer circle or outer sphere, centered on the centroid of the low-order noise vectors and having as its radius the distance between the centroid and the low-order noise vector farthest from the centroid, and by n inner circles or inner spheres (n is an integer of 1 or more) centered on the centroid and having radii smaller than that of the outer circle or outer sphere, and for further dividing each annular or spherical region formed between the curves or curved surfaces of the concentric circles or concentric spheres, consisting of the outer circle and inner circles or the outer sphere and inner spheres, into a plurality of regions by lines or surfaces extending in the radial direction;
Low-order signal vector space display means for displaying each low-order signal vector constituting the low-order signal vector space together with the partitioning result; and
Low-order noise vector space display means for displaying each low-order noise vector constituting the low-order noise vector space together with the partitioning result,
wherein the low-order signal vector selection means selects low-order signal vectors for pattern model generation from among the low-order signal vectors included in the annular or spherical regions formed between the curves or curved surfaces of the concentric circles or concentric spheres consisting of the outer circle or outer sphere and the inner circles or inner spheres whose radii are equal to or greater than the predetermined distance in the displayed low-order signal vector space, and
the low-order noise vector selection means selects low-order noise vectors for pattern model generation from among the low-order noise vectors included in the annular or spherical regions formed between the curves or curved surfaces of the concentric circles or concentric spheres consisting of the outer circle or outer sphere and the inner circles or inner spheres whose radii are equal to or greater than the predetermined distance in the displayed low-order noise vector space.

このような構成であれば、前記低次信号ベクトル空間および前記低次ノイズベクトル空間が２次元のデータ空間のときは、低次信号ベクトル空間区分手段によって、前記低次信号ベクトル空間を構成する複数の低次信号ベクトルを、全低次信号ベクトルの重心を中心とし且つ前記重心と当該重心から最も離れた位置の低次信号ベクトルとの距離を半径とした１つの外円と、前記重心を中心とし且つ前記外円よりも小さな半径のｎ個の内円（ｎは１以上の整数）とにより区分すると共に、前記外円および内円からなる複数の同心円同士のに形成される環状の領域を、半径方向に伸びる線によって複数に区分することが可能である。 With such a configuration, when the low-order signal vector space and the low-order noise vector space are two-dimensional data spaces, the low-order signal vector space partitioning means can partition the plurality of low-order signal vectors constituting the low-order signal vector space by one outer circle, centered on the centroid of all the low-order signal vectors and having as its radius the distance between the centroid and the low-order signal vector farthest from the centroid, and by n inner circles (n is an integer of 1 or more) centered on the centroid and having radii smaller than that of the outer circle, and can further divide each annular region formed between the concentric circles consisting of the outer circle and the inner circles into a plurality of regions by lines extending in the radial direction.

また、低次ノイズベクトル空間区分手段によって、前記低次ノイズベクトル空間を構成する複数の低次ノイズベクトルを、前記低次ノイズベクトルの重心を中心とし且つ前記重心と当該重心から最も離れた位置の低次ノイズベクトルとの距離を半径とした１つの外円と、前記重心を中心とし且つ前記外円よりも小さな半径のｎ個の内円（ｎは１以上の整数）とにより区分すると共に、前記外円および内円からなる複数の同心円同士の各曲線間に形成される環状の領域を、半径方向に伸びる線によって複数に区分することが可能である。 Further, by the low-order noise vector space segmentation means, a plurality of low-order noise vectors constituting the low-order noise vector space are centered on the centroid of the low-order noise vector and located at a position farthest from the centroid. Partitioning into one outer circle whose radius is the distance from the low-order noise vector and n inner circles (n is an integer of 1 or more) centered on the center of gravity and having a smaller radius than the outer circle; An annular region formed between the curved lines of a plurality of concentric circles composed of the outer circle and the inner circle can be divided into a plurality by a line extending in the radial direction.

また、低次信号ベクトル空間表示手段によって、前記低次信号ベクトル空間を構成する各低次信号ベクトルを、前記区分結果と共に表示することが可能であり、低次ノイズベクトル空間表示手段によって、前記低次ノイズベクトル空間を構成する各低次ノイズベクトルを、前記区分結果と共に表示することが可能である。
また、低次信号ベクトル選択手段は、前記表示された低次信号ベクトル空間における、前記外円と半径が前記所定距離以上の内円とからなる複数の同心円同士の各曲線間に形成される環状の領域に含まれる低次信号ベクトルの中からパターンモデル生成用の低次信号ベクトルを選択することが可能であり、低次ノイズベクトル選択手段は、前記表示された低次ノイズベクトル空間における、前記外円と半径が前記所定距離以上の内円とからなる複数の同心円同士の各曲線間に形成される環状の領域に含まれる低次ノイズベクトルの中からパターンモデル生成用の低次ノイズベクトルを選択することが可能である。 Each low-order signal vector constituting the low-order signal vector space can be displayed together with the classification result by the low-order signal vector space display means, and the low-order noise vector space display means can display the low-order signal vector space. Each low-order noise vector constituting the next-order noise vector space can be displayed together with the classification result.
Further, the low-order signal vector selection means can select low-order signal vectors for pattern model generation from among the low-order signal vectors included in the annular regions formed between the curves of the concentric circles consisting of the outer circle and the inner circles whose radii are equal to or greater than the predetermined distance in the displayed low-order signal vector space, and the low-order noise vector selection means can likewise select low-order noise vectors for pattern model generation from among the low-order noise vectors included in the corresponding annular regions of the displayed low-order noise vector space.

一方、前記低次信号ベクトル空間および前記低次ノイズベクトル空間が３次元のデータ空間のときは、低次信号ベクトル空間区分手段によって、前記低次信号ベクトル空間を構成する複数の低次信号ベクトルを、全低次信号ベクトルの重心を中心とし且つ前記重心と当該重心から最も離れた位置の低次信号ベクトルとの距離を半径とした１つの外球と、前記重心を中心とし且つ前記外球よりも小さな半径のｎ個の内球（ｎは１以上の整数）とにより区分すると共に、前記外球および内球からなる複数の同心球同士の各曲面間に形成される球面状の領域を、半径方向に伸びる面によって複数に区分することが可能である。 On the other hand, when the low-order signal vector space and the low-order noise vector space are three-dimensional data spaces, a plurality of low-order signal vectors constituting the low-order signal vector space are obtained by low-order signal vector space partitioning means. One outer sphere centered at the center of gravity of all the low-order signal vectors and having a radius as a distance between the center of gravity and the low-order signal vector at a position farthest from the center of gravity, and from the outer sphere centered on the center of gravity and Is divided into n inner spheres (n is an integer of 1 or more) having a small radius, and a spherical region formed between the curved surfaces of a plurality of concentric spheres composed of the outer sphere and the inner sphere, It is possible to divide into a plurality by the surface extending in the radial direction.

また、低次ノイズベクトル空間区分手段によって、前記低次ノイズベクトル空間を構成する複数の低次ノイズベクトルを、前記低次ノイズベクトルの重心を中心とし且つ前記重心と当該重心から最も離れた位置の低次ノイズベクトルとの距離を半径とした１つの外球と、前記重心を中心とし且つ前記外球よりも小さな半径のｎ個の内球（ｎは１以上の整数）とにより区分すると共に、前記外球および内球からなる複数の同心球同士の各曲面間に形成される球面状の領域を、半径方向に伸びる面によって複数に区分することが可能である。 Further, by the low-order noise vector space segmentation means, a plurality of low-order noise vectors constituting the low-order noise vector space are centered on the centroid of the low-order noise vector and located at a position farthest from the centroid. Partitioning into one outer sphere whose radius is a distance from a low-order noise vector and n inner spheres (n is an integer of 1 or more) centered on the center of gravity and having a smaller radius than the outer sphere; A spherical region formed between the curved surfaces of a plurality of concentric spheres composed of the outer sphere and the inner sphere can be divided into a plurality by a surface extending in the radial direction.

また、低次信号ベクトル空間表示手段によって、前記低次信号ベクトル空間を構成する各低次信号ベクトルを、前記区分結果と共に表示することが可能であり、低次ノイズベクトル空間表示手段によって、前記低次ノイズベクトル空間を構成する各低次ノイズベクトルを、前記区分結果と共に表示することが可能である。
また、低次信号ベクトル選択手段は、前記表示された低次信号ベクトル空間における、前記外球と半径が前記所定距離以上の内球とからなる複数の同心球同士の各曲面間に形成される球面状の領域に含まれる低次信号ベクトルの中からパターンモデル生成用の低次信号ベクトルを選択することが可能であり、低次ノイズベクトル選択手段は、前記表示された低次ノイズベクトル空間における、前記外球と半径が前記所定距離以上の内球とからなる複数の同心球同士の各曲面間に形成される球面状の領域に含まれる低次ノイズベクトルの中からパターンモデル生成用の低次ノイズベクトルを選択することが可能である。 Each low-order signal vector constituting the low-order signal vector space can be displayed together with the classification result by the low-order signal vector space display means, and the low-order noise vector space display means can display the low-order signal vector space. Each low-order noise vector constituting the next-order noise vector space can be displayed together with the classification result.
Further, the low-order signal vector selection means can select low-order signal vectors for pattern model generation from among the low-order signal vectors included in the spherical regions formed between the curved surfaces of the concentric spheres consisting of the outer sphere and the inner spheres whose radii are equal to or greater than the predetermined distance in the displayed low-order signal vector space, and the low-order noise vector selection means can likewise select low-order noise vectors for pattern model generation from among the low-order noise vectors included in the corresponding spherical regions of the displayed low-order noise vector space.

従って、外円（または外球）とｎ個の内円（または内球）とからなる複数の同心円同士（または同心球同士）の各曲線間（または各曲面間）に形成される環状（または球面状）の領域を、半径方向に伸びる線（または面）によって複数に区分し、このように区分された低次ベクトル空間を表示するようにしたので、各パターンモデルの分布を視覚的に簡易に捉えることが可能となると共に、当該表示された低次ベクトル空間における所定距離以上の半径を有する同心円同士（または同心球同士）で形成される環状（または球面状）の領域から、任意の低次ベクトル（例えば、半径方向に伸びる線（または面）によって複数に区分された各領域毎の低次ベクトルなど）を簡易に選択することが可能となる。更に、この領域の選択の仕方を工夫して、例えば、パターン認識の内容や認識対象に応じて低次ベクトルを取捨選択することで、特定の認識対象に特化したパターンモデルや認識性能がより高くなるパターンモデルなどを簡易に生成することが可能となる。 Accordingly, the annular (or spherical) regions formed between the curves (or curved surfaces) of the concentric circles (or concentric spheres), consisting of the outer circle (or outer sphere) and the n inner circles (or inner spheres), are divided into a plurality of regions by lines (or surfaces) extending in the radial direction, and the low-order vector space partitioned in this way is displayed. This makes it possible to grasp the distribution of the pattern models visually and easily, and to easily select arbitrary low-order vectors (for example, low-order vectors from each of the regions divided by the radially extending lines (or surfaces)) from the annular (or spherical) regions formed by concentric circles (or concentric spheres) having radii equal to or greater than the predetermined distance in the displayed low-order vector space. Furthermore, by devising how these regions are selected, for example by choosing low-order vectors according to the content of the pattern recognition and the recognition target, it becomes possible to easily generate a pattern model specialized for a specific recognition target, a pattern model with higher recognition performance, and the like.
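The two-dimensional partitioning described above can be sketched as follows. This is an illustrative reading only: expressing the inner-circle radii as fractions of the outer radius, and giving each ring its own sector count, are assumptions that the text leaves open, as are all function and parameter names.

```python
import math


def partition_2d(points, inner_fractions, sectors_per_ring):
    """Assign each 2-D low-order vector to a (ring, sector) cell.

    inner_fractions  : radii of the n inner circles as fractions of the outer
                       radius (an assumed parameterisation)
    sectors_per_ring : number of radial sectors for each ring, innermost
                       first; needs len(inner_fractions) + 1 entries
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n  # centroid of all vectors
    cy = sum(y for _, y in points) / n
    radii = [math.hypot(x - cx, y - cy) for x, y in points]
    outer = max(radii)  # outer circle passes through the farthest vector
    bounds = [f * outer for f in sorted(inner_fractions)]
    cells = []
    for (x, y), r in zip(points, radii):
        ring = sum(1 for b in bounds if r > b)  # 0 = innermost disc
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
        k = sectors_per_ring[ring]
        sector = min(int(angle / (2 * math.pi) * k), k - 1)
        cells.append((ring, sector))
    return cells
```

For example, `partition_2d(vectors, [0.5], [1, 4])` uses one inner circle at half the outer radius, leaves the inner disc undivided, and splits the outer ring into four sectors. A three-dimensional variant would replace the angular sectors with regions bounded by radial surfaces.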

更に、請求項４に係る発明は、請求項１乃至請求項３のいずれか１項に記載のパターンモデル生成装置において、
前記信号データから特徴量を抽出し、当該抽出した特徴量に基づき前記高次信号パターンモデルを生成する高次信号パターンモデル生成手段と、
前記ノイズデータから特徴量を抽出し、当該抽出した特徴量に基づき前記高次ノイズパターンモデルを生成する高次ノイズパターンモデル生成手段と、
前記複数の高次信号パターンモデルから構成されるデータ空間の代替として、当該データ空間における高次信号パターンモデル相互間の位置関係を近似した状態で３次元以下の低次信号ベクトルから構成されるデータ空間に写像してなる前記低次信号ベクトル空間を生成する低次信号ベクトル空間生成手段と、
前記複数の高次ノイズパターンモデルから構成されるデータ空間の代替として、当該データ空間における高次ノイズパターンモデル相互間の位置関係を近似した状態で３次元以下の低次ノイズベクトルから構成されるデータ空間に写像してなる前記低次ノイズベクトル空間を生成する低次ノイズベクトル空間生成手段と、を備えることを特徴としている。 Furthermore, the invention according to claim 4 is the pattern model generation device according to any one of claims 1 to 3,
A high-order signal pattern model generating means for extracting a feature quantity from the signal data and generating the high-order signal pattern model based on the extracted feature quantity;
A high-order noise pattern model generating means for extracting a feature value from the noise data and generating the high-order noise pattern model based on the extracted feature value;
Low-order signal vector space generation means for generating the low-order signal vector space by mapping, as a substitute for the data space composed of the plurality of high-order signal pattern models, onto a data space composed of low-order signal vectors of three or fewer dimensions while approximating the positional relationships among the high-order signal pattern models in that data space; and
Low-order noise vector space generation means for generating the low-order noise vector space by mapping, as a substitute for the data space composed of the plurality of high-order noise pattern models, onto a data space composed of low-order noise vectors of three or fewer dimensions while approximating the positional relationships among the high-order noise pattern models in that data space.

このような構成であれば、高次信号パターンモデル生成手段によって、前記信号データから特徴量を抽出し、当該抽出した特徴量に基づき前記高次信号パターンモデルを生成することが可能であり、高次ノイズパターンモデル生成手段によって、前記ノイズデータから特徴量を抽出し、当該抽出した特徴量に基づき前記高次ノイズパターンモデルを生成することが可能である。 With such a configuration, it is possible to extract a feature quantity from the signal data by a high-order signal pattern model generation unit, and to generate the high-order signal pattern model based on the extracted feature quantity. It is possible to extract a feature amount from the noise data by the next noise pattern model generation means and generate the higher order noise pattern model based on the extracted feature amount.

また、低次信号ベクトル空間生成手段によって、前記複数の高次信号パターンモデルから構成されるデータ空間の代替として、当該データ空間における高次信号パターンモデル相互間の位置関係を近似した状態で３次元以下の低次信号ベクトルから構成されるデータ空間に写像してなる前記低次信号ベクトル空間を生成することが可能であり、低次ノイズベクトル空間生成手段によって、前記複数の高次ノイズパターンモデルから構成されるデータ空間の代替として、当該データ空間における高次ノイズパターンモデル相互間の位置関係を近似した状態で３次元以下の低次ノイズベクトルから構成されるデータ空間に写像してなる前記低次ノイズベクトル空間を生成することが可能である。
従って、各高次パターンモデルおよび各低次ベクトル空間を生成することができるので、例えば、追加されたデータに対する各高次パターンモデルの生成処理や、各データ空間の変更処理や更新処理等を簡易に行うことが可能である。 Further, as a substitute for the data space composed of the plurality of higher-order signal pattern models by the low-order signal vector space generating means, the three-dimensional state is obtained by approximating the positional relationship between the higher-order signal pattern models in the data space. It is possible to generate the low-order signal vector space mapped to the data space composed of the following low-order signal vectors, and from the plurality of high-order noise pattern models by the low-order noise vector space generation means As an alternative to the configured data space, the lower order obtained by mapping to a data space composed of low-order noise vectors of three dimensions or less in a state in which the positional relationship between the higher-order noise pattern models in the data space is approximated It is possible to generate a noise vector space.
Accordingly, each high-order pattern model and each low-order vector space can be generated, so that, for example, generating high-order pattern models for added data and changing or updating each data space can be performed easily.

一方、上記目的を達成するために、請求項５記載のパターンモデル評価装置は、
認識対象となる認識対象データのパターン認識に用いられるパターンモデルの認識性能を評価するパターンモデル評価装置であって、
前記複数対象に係る複数のノイズ未混合の予め取得された信号データを記憶する信号データ記憶手段と、
前記信号データに基づき生成された４次元以上の高次元の要素からなる複数の高次信号パターンモデルから構成されるデータ空間の代替として、当該データ空間における各高次信号パターンモデル相互間の位置関係を近似した状態で３次元以下の低次信号ベクトルから構成されるデータ空間に写像してなる低次信号ベクトル空間を記憶する低次信号ベクトル空間記憶手段と、
複数種類の信号未混合のノイズデータを記憶するノイズデータ記憶手段と、
前記ノイズデータに基づき生成された４次元以上の高次元の要素からなる複数の高次ノイズパターンモデルから構成されるデータ空間の代替として、当該データ空間における各高次ノイズパターンモデル相互間の位置関係を近似した状態で３次元以下の低次ノイズベクトルから構成されるデータ空間に写像してなる低次ノイズベクトル空間を記憶する低次ノイズベクトル空間記憶手段と、
前記低次信号ベクトル空間を構成する複数の低次信号ベクトルのうち、全低次信号ベクトルの重心から所定距離以上離れた位置にある、または前記重心を中心として一様にある低次信号ベクトルの中から評価用の低次信号ベクトルを選択する評価信号ベクトル選択手段と、
前記低次ノイズベクトル空間を構成する複数の低次ノイズベクトルのうち、全低次ノイズベクトルの重心から所定距離以上離れた位置にある、または前記重心を中心として一様にある低次ノイズベクトルの中から評価用の低次ノイズベクトルを選択する評価ノイズベクトル選択手段と、
前記評価信号ベクトル選択手段によって選択された低次信号ベクトルに対応する信号データと、前記評価ノイズベクトル選択手段によって選択された低次ノイズベクトルに対応するノイズデータとに基づき、各信号データに各ノイズデータを混合した評価用混合信号データを生成する評価用混合信号データ生成手段と、
前記評価用混合信号データ生成手段において生成された評価用混合信号データと、前記パターンモデルとに基づき、当該パターンモデルの性能を評価する性能評価手段と、
前記性能評価手段の評価結果を出力する評価結果出力手段と、を備えることを特徴としている。 On the other hand, in order to achieve the above object, the pattern model evaluation apparatus according to claim 5 is:
A pattern model evaluation device for evaluating the recognition performance of a pattern model used for pattern recognition of recognition target data to be recognized,
Signal data storage means for storing a plurality of noise-unmixed pre-acquired signal data related to the plurality of objects;
As an alternative to a data space composed of a plurality of higher-order signal pattern models composed of four-dimensional or higher-dimensional elements generated based on the signal data, the positional relationship between the higher-order signal pattern models in the data space Low-order signal vector space storage means for storing a low-order signal vector space that is mapped to a data space composed of low-order signal vectors of three dimensions or less in an approximate state,
Noise data storage means for storing a plurality of types of signal-unmixed noise data;
Low-order noise vector space storage means for storing a low-order noise vector space obtained by mapping, as a substitute for a data space composed of a plurality of high-order noise pattern models composed of four-dimensional or higher-dimensional elements generated based on the noise data, onto a data space composed of low-order noise vectors of three or fewer dimensions while approximating the positional relationships among the high-order noise pattern models in that data space;
Evaluation signal vector selection means for selecting low-order signal vectors for evaluation from among those low-order signal vectors constituting the low-order signal vector space that are located at a predetermined distance or more from the centroid of all the low-order signal vectors, or that are distributed uniformly around the centroid;
Evaluation noise vector selection means for selecting low-order noise vectors for evaluation from among those low-order noise vectors constituting the low-order noise vector space that are located at a predetermined distance or more from the centroid of all the low-order noise vectors, or that are distributed uniformly around the centroid;
Evaluation mixed signal data generation means for generating evaluation mixed signal data in which each noise data item is mixed into each signal data item, based on the signal data corresponding to the low-order signal vectors selected by the evaluation signal vector selection means and the noise data corresponding to the low-order noise vectors selected by the evaluation noise vector selection means;
Performance evaluation means for evaluating the performance of the pattern model based on the mixed signal data for evaluation generated by the mixed signal data generation means for evaluation and the pattern model;
And an evaluation result output means for outputting an evaluation result of the performance evaluation means.

また、評価信号ベクトル選択手段によって、前記低次信号ベクトル空間を構成する複数の低次信号ベクトルのうち、全低次信号ベクトルの重心から所定距離以上離れた位置にある、または前記重心を中心として一様にある低次信号ベクトルの中から評価用の低次信号ベクトルを選択することが可能であり、評価ノイズベクトル選択手段によって、前記低次ノイズベクトル空間を構成する複数の低次ノイズベクトルのうち、全低次ノイズベクトルの重心から所定距離以上離れた位置にある、または前記重心を中心として一様にある低次ノイズベクトルの中から評価用の低次ノイズベクトルを選択することが可能である。 Further, the evaluation signal vector selection means is located at a position more than a predetermined distance from the center of gravity of all the low-order signal vectors among the plurality of low-order signal vectors constituting the low-order signal vector space, or centered on the center of gravity. It is possible to select a low-order signal vector for evaluation from uniform low-order signal vectors, and a plurality of low-order noise vectors constituting the low-order noise vector space are selected by the evaluation noise vector selection means. Among them, it is possible to select a low-order noise vector for evaluation from low-order noise vectors that are located at a predetermined distance or more away from the center of gravity of all the low-order noise vectors or that are uniformly centered on the center of gravity. is there.

また、評価用混合信号データ生成手段によって、前記評価信号ベクトル選択手段によって選択された低次信号ベクトルに対応する信号データと、前記評価ノイズベクトル選択手段によって選択された低次ノイズベクトルに対応するノイズデータとに基づき、各信号データに各ノイズデータを混合した評価用混合信号データを生成することが可能であり、性能評価手段によって、前記評価用混合信号データ生成手段において生成された評価用混合信号データと、前記パターンモデルとに基づき、当該パターンモデルの性能を評価することが可能であり、評価結果出力手段によって、前記性能評価手段の評価結果を出力することが可能である。 Further, the signal data corresponding to the low-order signal vector selected by the evaluation signal vector selection means by the evaluation mixed signal data generation means and the noise corresponding to the low-order noise vector selected by the evaluation noise vector selection means Based on the data, it is possible to generate mixed signal data for evaluation in which each noise data is mixed with each signal data, and the mixed signal for evaluation generated in the mixed signal data generation unit for evaluation generated by the performance evaluation unit The performance of the pattern model can be evaluated based on the data and the pattern model, and the evaluation result of the performance evaluation unit can be output by the evaluation result output unit.
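The mixing and evaluation steps described above might look like the following sketch. The text says only that each noise data item is mixed into each signal data item; mixing at a fixed signal-to-noise ratio, looping the noise to the signal length, and the `recognize` callable standing in for the pattern model under test are all assumptions introduced here for illustration.

```python
import math


def mix_at_snr(signal, noise, snr_db):
    """Add noise to a signal, scaled so the mixture has the requested SNR in dB."""
    n = len(signal)
    looped = [noise[i % len(noise)] for i in range(n)]  # repeat noise to fit
    p_signal = sum(s * s for s in signal) / n
    p_noise = sum(v * v for v in looped) / n
    if p_noise == 0.0:
        return list(signal)  # silent noise: nothing to mix
    gain = math.sqrt(p_signal / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * v for s, v in zip(signal, looped)]


def recognition_rate(recognize, labelled_signals, noises, snr_db):
    """Fraction of correct results over every signal/noise pairing.

    recognize        : hypothetical callable mapping samples to a label
    labelled_signals : list of (label, samples) pairs selected for evaluation
    noises           : list of noise sample sequences selected for evaluation
    """
    correct = 0
    total = 0
    for label, signal in labelled_signals:
        for noise in noises:
            mixed = mix_at_snr(signal, noise, snr_db)
            correct += recognize(mixed) == label
            total += 1
    return correct / total
```

Reporting the rate separately per noise type or per region of the low-order space, rather than as a single figure, would be a natural extension when the goal is to locate where the model fails.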

低次信号ベクトル空間および低次ノイズベクトル空間においては、各ベクトルが重心から離れれば離れるほど各ベクトルの存在する密度が低くなる傾向にあり、且つ密度の低いところ（分布の外側周辺およびその近傍）にあるベクトルに対応した信号データおよびノイズデータから生成されるノイズ混合信号データは、例えば、各空間における全データを用いて生成した不特定対象に対するパターンモデルに対して低い認識確率を生じさせる傾向にある。従って、重心から所定距離以上離れた位置にあるデータを評価用データとして選択し、当該選択したデータを用いてノイズ混合信号データを生成し、このノイズ混合信号データによって、パターンモデルの評価を行うことで、低い認識確率を生じさせるデータに対する認識確率を正確に且つ体系的に評価することが可能となる。 In the low-order signal vector space and the low-order noise vector space, the density of vectors tends to decrease with distance from the centroid, and noise-mixed signal data generated from the signal data and noise data corresponding to vectors in low-density locations (the outer periphery of the distribution and its vicinity) tends to yield a low recognition probability with, for example, a pattern model for unspecified targets generated using all the data in each space. Accordingly, by selecting data located at the predetermined distance or more from the centroid as evaluation data, generating noise-mixed signal data from the selected data, and evaluating the pattern model with this noise-mixed signal data, the recognition probability for data that causes a low recognition probability can be evaluated accurately and systematically.
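Selecting the vectors that lie at least a predetermined distance from the centroid reduces to a simple filter, sketched below for 2-D or 3-D vectors. The function name is illustrative, and how the threshold is chosen is not specified in this passage.

```python
import math


def select_far_from_centroid(vectors, min_distance):
    """Indices of low-order vectors at least min_distance from the centroid.

    vectors      : list of 2-D or 3-D coordinate tuples
    min_distance : the "predetermined distance" (its value is left open here)
    """
    n = len(vectors)
    dims = len(vectors[0])
    center = tuple(sum(v[d] for v in vectors) / n for d in range(dims))
    return [i for i, v in enumerate(vectors) if math.dist(v, center) >= min_distance]
```

The returned indices can then be used to look up the signal data or noise data from which each low-order vector was derived.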

更に、請求項６に係る発明は、請求項５記載のパターンモデル評価装置において、
前記低次信号ベクトル空間および前記低次ノイズベクトル空間は２次元または３次元のデータ空間であって、
前記低次信号ベクトル空間を構成する複数の低次信号ベクトルを、全低次信号ベクトルの重心を中心とし且つ前記重心と当該重心から最も離れた位置の低次信号ベクトルとの距離を半径とした１つの外円または外球と、前記重心を中心とし且つ前記外円または外球よりも小さな半径のｎ個の内円または内球（ｎは１以上の整数）とにより区分すると共に、前記外円および内円または外球および内球からなる複数の同心円同士または同心球同士の各曲線間または各曲面間に形成される環状または球面状の領域を、半径方向に伸びる線または面によって複数に区分する低次信号ベクトル空間区分手段と、
前記低次ノイズベクトル空間を構成する複数の低次ノイズベクトルを、前記低次ノイズベクトルの重心を中心とし且つ前記重心と当該重心から最も離れた位置の低次ノイズベクトルとの距離を半径とした１つの外円または外球と、前記重心を中心とし且つ前記外円または外球よりも小さな半径のｎ個の内円または内球（ｎは１以上の整数）とにより区分すると共に、前記外円および内円または外球および内球からなる複数の同心円同士または同心球同士の各曲線間または各曲面間に形成される環状または球面状の領域を、半径方向に伸びる線または面によって複数に区分する低次ノイズベクトル空間区分手段と、
前記低次信号ベクトル空間を構成する各低次信号ベクトルを、前記区分結果と共に表示する低次信号ベクトル空間表示手段と、
前記低次ノイズベクトル空間を構成する各低次ノイズベクトルを、前記区分結果と共に表示する低次ノイズベクトル空間表示手段と、を備え、
前記評価信号ベクトル選択手段および前記評価ノイズベクトル選択手段は、前記表示された前記低次信号ベクトル空間および前記低次ノイズベクトル空間の前記環状の領域または前記球面状の領域における前記半径方向に伸びる線または面によって複数に区分された各領域から評価用の低次信号ベクトルおよび低次ノイズベクトルを選択するようになっており、
前記評価用混合信号データ生成手段は、前記評価信号ベクトル選択手段および前記評価ノイズベクトル選択手段によって選択された低次信号ベクトルおよび低次ノイズベクトルにそれぞれ対応する信号データおよびノイズデータに基づき、各信号データに各ノイズデータを混合した評価用混合信号データを生成するようになっていることを特徴としている。 Furthermore, the invention according to claim 6 is the pattern model evaluation apparatus according to claim 5,
The low-order signal vector space and the low-order noise vector space are two-dimensional or three-dimensional data spaces,
Low-order signal vector space partitioning means for partitioning the plurality of low-order signal vectors constituting the low-order signal vector space by one outer circle or outer sphere, centered on the centroid of all the low-order signal vectors and having as its radius the distance between the centroid and the low-order signal vector farthest from the centroid, and by n inner circles or inner spheres (n is an integer of 1 or more) centered on the centroid and having radii smaller than that of the outer circle or outer sphere, and for further dividing each annular or spherical region formed between the curves or curved surfaces of the concentric circles or concentric spheres, consisting of the outer circle and inner circles or the outer sphere and inner spheres, into a plurality of regions by lines or surfaces extending in the radial direction;
Low-order noise vector space partitioning means for partitioning the plurality of low-order noise vectors constituting the low-order noise vector space by one outer circle or outer sphere, centered on the centroid of the low-order noise vectors and having as its radius the distance between the centroid and the low-order noise vector farthest from the centroid, and by n inner circles or inner spheres (n is an integer of 1 or more) centered on the centroid and having radii smaller than that of the outer circle or outer sphere, and for further dividing each annular or spherical region formed between the curves or curved surfaces of the concentric circles or concentric spheres, consisting of the outer circle and inner circles or the outer sphere and inner spheres, into a plurality of regions by lines or surfaces extending in the radial direction;
Low-order signal vector space display means for displaying each low-order signal vector constituting the low-order signal vector space together with the partitioning result; and
Low-order noise vector space display means for displaying each low-order noise vector constituting the low-order noise vector space together with the partitioning result,
wherein the evaluation signal vector selection means and the evaluation noise vector selection means select low-order signal vectors and low-order noise vectors for evaluation from each of the regions into which the annular or spherical regions of the displayed low-order signal vector space and low-order noise vector space are divided by the lines or surfaces extending in the radial direction, and
the evaluation mixed signal data generation means generates evaluation mixed signal data in which each noise data item is mixed into each signal data item, based on the signal data and noise data respectively corresponding to the low-order signal vectors and low-order noise vectors selected by the evaluation signal vector selection means and the evaluation noise vector selection means.

In addition, the low-order signal vector space display means can display each low-order signal vector constituting the low-order signal vector space together with the classification result, and the low-order noise vector space display means can display each low-order noise vector constituting the low-order noise vector space together with the classification result.
The evaluation signal vector selection means and the evaluation noise vector selection means can select low-order signal vectors and low-order noise vectors for evaluation from each of the areas into which the annular regions of the displayed low-order signal vector space and low-order noise vector space are divided by the lines extending in the radial direction. The evaluation mixed signal data generation means can then generate mixed signal data for evaluation, in which each noise data item is mixed with each signal data item, based on the signal data and noise data corresponding respectively to the selected low-order signal vectors and low-order noise vectors.

In addition, the low-order signal vector space display means can display each low-order signal vector constituting the low-order signal vector space together with the classification result, and the low-order noise vector space display means can display each low-order noise vector constituting the low-order noise vector space together with the classification result.
The evaluation signal vector selection means and the evaluation noise vector selection means can select low-order signal vectors and low-order noise vectors for evaluation from each of the areas into which the spherical regions of the displayed low-order signal vector space and low-order noise vector space are divided by the surfaces extending in the radial direction. The evaluation mixed signal data generation means can then generate mixed signal data for evaluation, in which each noise data item is mixed with each signal data item, based on the signal data and noise data corresponding respectively to the selected low-order signal vectors and low-order noise vectors.

Therefore, for example, by making the radius of an inner circle (or inner sphere) at least a predetermined distance from the center of gravity, every low-order vector contained in the annular (or spherical) region formed between the outer circle (or outer sphere) and the next inner circle (or inner sphere) lies at least that predetermined distance from the center of gravity. Because this annular (or spherical) region is divided by lines (or surfaces) extending in the radial direction, the low-order vectors in it are classified more finely. Furthermore, because the low-order vectors for evaluation are selected from the low-order vector space divided in this way, it is possible, for example, to select as evaluation data those data items that cause a low recognition probability for a pattern model for unspecified objects and that were not used to generate the pattern model. Generating noise mixed signal data from the evaluation data selected in this way and evaluating the pattern model with it makes it possible to evaluate, more accurately and systematically, the recognition probability for noise mixed signal data that causes a low recognition probability.
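The division described above can be illustrated with a minimal sketch for the two-dimensional case. It assumes the low-order vectors are already available as (x, y) pairs; the function names, the list of circle radii, and the number of radial divisions per ring are hypothetical choices made for illustration and are not taken from the specification.

```python
import math

def centroid(points):
    """Center of gravity of a list of 2-D low-order vectors."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def region_of(point, center, radii, sectors_per_ring):
    """Return (ring, sector) for a point, or None if it lies outside the outer circle.

    radii: increasing radii of the concentric circles (assumed values).
           Ring 0 is the innermost disc; ring k is the annulus between
           radii[k - 1] and radii[k].
    sectors_per_ring[k]: number of areas ring k is cut into by radial lines.
    """
    dx, dy = point[0] - center[0], point[1] - center[1]
    r = math.hypot(dx, dy)
    angle = math.atan2(dy, dx) % (2 * math.pi)  # angle in [0, 2*pi)
    for ring, outer in enumerate(radii):
        if r <= outer:
            n = sectors_per_ring[ring]
            sector = min(int(angle * n / (2 * math.pi)), n - 1)
            return ring, sector
    return None
```

With radii [1.0, 3.0] and sectors_per_ring [1, 4], every vector assigned to ring 1 lies more than 1.0 from the center of gravity, so choosing the inner radius is one way to enforce the "predetermined distance" condition.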

Meanwhile, to achieve the above object, the pattern recognition device according to claim 7 is characterized by comprising:
The pattern model generation device according to any one of claims 1 to 4;
Data acquisition means for acquiring recognition target data of the recognition target;
Pattern recognition means for performing pattern recognition of the acquired recognition target data based on the recognition target data acquired by the data acquisition means and the pattern model generated by the pattern model generation device; and
Recognition result output means for outputting the recognition result of the pattern recognition means.

With this configuration, the data acquisition means can acquire recognition target data of the recognition target, the pattern recognition means can perform pattern recognition of the acquired recognition target data based on that data and the pattern model generated by the pattern model generation device, and the recognition result output means can output the recognition result of the pattern recognition means.
Therefore, because pattern recognition of the recognition target data is performed with a pattern model that can improve the recognition probability for signal data and noise data that lower the recognition probability of a pattern model, it is possible to perform pattern recognition in which the recognition probability is uniformly good across the recognition target data.

Furthermore, the invention according to claim 8 is the pattern recognition device according to claim 8,
Characterized by comprising the pattern model evaluation device according to claim 5 or claim 6.
With this configuration, the performance of the pattern model can be evaluated whenever new signal data or noise data is added. If the evaluation shows that recognition performance has dropped, the pattern model can be regenerated, so pattern recognition performance can be maintained and improved relatively easily.

Meanwhile, to achieve the above object, the pattern model generation program according to claim 9 is
A pattern model generation program for generating a pattern model used for pattern recognition of recognition target data, characterized by including a program for causing a computer to function as:
Low-order signal vector selection means for selecting low-order signal vectors for pattern model generation from among the low-order signal vectors located at least a predetermined distance from the center of gravity of all the low-order signal vectors, out of a plurality of low-order signal vectors constituting a low-order signal vector space, the low-order signal vector space being generated as a substitute for a data space composed of a plurality of high-order signal pattern models, which consist of elements of four or more dimensions and are generated from a plurality of noise-free signal data items relating to a plurality of objects, by mapping that data space onto a data space composed of low-order signal vectors of three or fewer dimensions while approximating the positional relationships among the high-order signal pattern models;
Low-order noise vector selection means for selecting low-order noise vectors for pattern model generation from among the low-order noise vectors located at least a predetermined distance from the center of gravity of all the low-order noise vectors, out of a plurality of low-order noise vectors constituting a low-order noise vector space, the low-order noise vector space being generated as a substitute for a data space composed of a plurality of high-order noise pattern models, which consist of elements of four or more dimensions and are generated from a plurality of types of signal-free noise data, by mapping that data space onto a data space composed of low-order noise vectors of three or fewer dimensions while approximating the positional relationships among the high-order noise pattern models;
Noise mixed signal data generation means for generating noise mixed signal data, in which the noise data is mixed with each signal data item, based on the signal data corresponding to the low-order signal vectors selected by the low-order signal vector selection means and the noise data corresponding to the low-order noise vectors selected by the low-order noise vector selection means; and
Pattern model generation means for generating a pattern model by training a probability model with the noise mixed signal data as learning data.
With this configuration, when a computer reads the program and executes processing according to it, the same operation and effects as those of the pattern model generation device according to claim 1 are obtained.
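The selection and mixing steps of this claim can be sketched as follows. This is only an illustration under stated assumptions: the low-order vectors are taken as already computed, mixing is assumed to be plain sample-by-sample addition (the claim does not fix the mixing method), and training the probability model is left out. All function and parameter names are hypothetical.

```python
import math

def far_from_centroid(vectors, min_distance):
    """Indices of low-order vectors at least min_distance from the center of gravity."""
    dim = len(vectors[0])
    center = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return [i for i, v in enumerate(vectors) if math.dist(v, center) >= min_distance]

def mix(signal, noise):
    """Assumed mixing rule: add noise sample by sample, repeating the noise if it is shorter."""
    return [s + noise[i % len(noise)] for i, s in enumerate(signal)]

def build_training_set(sig_vectors, sig_data, noise_vectors, noise_data, d_sig, d_noise):
    """Mix every selected signal with every selected noise.

    sig_data[i] is the signal data behind sig_vectors[i]; likewise for noise.
    d_sig and d_noise play the role of the predetermined distances.
    """
    sig_idx = far_from_centroid(sig_vectors, d_sig)
    noise_idx = far_from_centroid(noise_vectors, d_noise)
    return [mix(sig_data[i], noise_data[j]) for i in sig_idx for j in noise_idx]
```

The returned list would then serve as the learning data for the probability model; which model is trained and how is outside the scope of this sketch.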

To achieve the above object, the pattern model evaluation program according to claim 10 is
A pattern model evaluation program for evaluating the recognition performance of a pattern model used for pattern recognition of recognition target data, characterized by including a program for causing a computer to function as:
Evaluation signal vector selection means for selecting low-order signal vectors for evaluation from among the low-order signal vectors that are located at least a predetermined distance from the center of gravity of all the low-order signal vectors, or that are distributed uniformly about that center of gravity, out of a plurality of low-order signal vectors constituting a low-order signal vector space, the low-order signal vector space being generated as a substitute for a data space composed of a plurality of high-order signal pattern models, which consist of elements of four or more dimensions and are generated from a plurality of noise-free signal data items relating to a plurality of objects, by mapping that data space onto a data space composed of low-order signal vectors of three or fewer dimensions while approximating the positional relationships among the high-order signal pattern models;
Evaluation noise vector selection means for selecting low-order noise vectors for evaluation from among the low-order noise vectors that are located at least a predetermined distance from the center of gravity of all the low-order noise vectors, or that are distributed uniformly about that center of gravity, out of a plurality of low-order noise vectors constituting a low-order noise vector space, the low-order noise vector space being generated as a substitute for a data space composed of a plurality of high-order noise pattern models, which consist of elements of four or more dimensions and are generated from a plurality of types of signal-free noise data, by mapping that data space onto a data space composed of low-order noise vectors of three or fewer dimensions while approximating the positional relationships among the high-order noise pattern models;
Evaluation mixed signal data generation means for generating mixed signal data for evaluation, in which each noise data item is mixed with each signal data item, based on the signal data corresponding to the low-order signal vectors selected by the evaluation signal vector selection means and the noise data corresponding to the low-order noise vectors selected by the evaluation noise vector selection means;
Performance evaluation means for evaluating the performance of the pattern model based on the mixed signal data for evaluation generated by the evaluation mixed signal data generation means and on the pattern model; and
Evaluation result output means for outputting the evaluation result of the performance evaluation means.
With this configuration, when a computer reads the program and executes processing according to it, the same operation and effects as those of the pattern model evaluation device according to claim 5 are obtained.
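One possible reading of the performance evaluation step is a recognition rate computed separately for each region from which the evaluation data were drawn. The sketch below assumes each evaluation sample carries a region key and an expected label, and that the pattern model is wrapped in a callable named recognize; these are hypothetical conventions, not details given in the claim.

```python
from collections import defaultdict

def rates_by_region(recognize, samples):
    """Recognition rate per region.

    recognize: callable mapping mixed signal data to a predicted label
               (stands in for recognition with the pattern model).
    samples:   iterable of (region, mixed_signal, expected_label) triples.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for region, data, label in samples:
        totals[region] += 1
        if recognize(data) == label:
            hits[region] += 1
    return {region: hits[region] / totals[region] for region in totals}
```

Reporting the rate per region rather than one overall figure makes it visible which parts of the low-order vector space the pattern model handles poorly.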

To achieve the above object, the pattern recognition program according to claim 11 is characterized by including a program for causing a computer to function as:
Low-order signal vector selection means for selecting low-order signal vectors for pattern model generation from among the low-order signal vectors located at least a predetermined distance from the center of gravity of all the low-order signal vectors, out of a plurality of low-order signal vectors constituting a low-order signal vector space, the low-order signal vector space being generated as a substitute for a data space composed of a plurality of high-order signal pattern models, which consist of elements of four or more dimensions and are generated from a plurality of noise-free signal data items relating to a plurality of objects, by mapping that data space onto a data space composed of low-order signal vectors of three or fewer dimensions while approximating the positional relationships among the high-order signal pattern models;
Low-order noise vector selection means for selecting low-order noise vectors for pattern model generation from among the low-order noise vectors located at least a predetermined distance from the center of gravity of all the low-order noise vectors, out of a plurality of low-order noise vectors constituting a low-order noise vector space, the low-order noise vector space being generated as a substitute for a data space composed of a plurality of high-order noise pattern models, which consist of elements of four or more dimensions and are generated from a plurality of types of signal-free noise data, by mapping that data space onto a data space composed of low-order noise vectors of three or fewer dimensions while approximating the positional relationships among the high-order noise pattern models;
Noise mixed signal data generation means for generating noise mixed signal data, in which the noise data is mixed with each signal data item, based on the signal data corresponding to the low-order signal vectors selected by the low-order signal vector selection means and the noise data corresponding to the low-order noise vectors selected by the low-order noise vector selection means;
Pattern model generation means for generating a pattern model by training a probability model with the noise mixed signal data as learning data;
Data acquisition means for acquiring recognition target data of the recognition target;
Pattern recognition means for performing pattern recognition of the acquired recognition target data based on the recognition target data acquired by the data acquisition means and the pattern model generated by the pattern model generation device; and
Recognition result output means for outputting the recognition result of the pattern recognition means.
With this configuration, when a computer reads the program and executes processing according to it, the same operation and effects as those of the pattern recognition device according to claim 7 are obtained.

To achieve the above object, the pattern model generation method according to claim 12 is
A pattern model generation method for generating a pattern model used for pattern recognition of recognition target data, characterized by including:
A low-order signal vector selection step of selecting low-order signal vectors for pattern model generation from among the low-order signal vectors located at least a predetermined distance from the center of gravity of all the low-order signal vectors, out of a plurality of low-order signal vectors constituting a low-order signal vector space, the low-order signal vector space being generated as a substitute for a data space composed of a plurality of high-order signal pattern models, which consist of elements of four or more dimensions and are generated from a plurality of noise-free signal data items relating to a plurality of objects, by mapping that data space onto a data space composed of low-order signal vectors of three or fewer dimensions while approximating the positional relationships among the high-order signal pattern models;
A low-order noise vector selection step of selecting low-order noise vectors for pattern model generation from among the low-order noise vectors located at least a predetermined distance from the center of gravity of all the low-order noise vectors, out of a plurality of low-order noise vectors constituting a low-order noise vector space, the low-order noise vector space being generated as a substitute for a data space composed of a plurality of high-order noise pattern models, which consist of elements of four or more dimensions and are generated from a plurality of types of signal-free noise data, by mapping that data space onto a data space composed of low-order noise vectors of three or fewer dimensions while approximating the positional relationships among the high-order noise pattern models;
A noise mixed signal data generation step of generating noise mixed signal data, in which the noise data is mixed with each signal data item, based on the signal data corresponding to the low-order signal vectors selected in the low-order signal vector selection step and the noise data corresponding to the low-order noise vectors selected in the low-order noise vector selection step; and
A pattern model generation step of generating a pattern model by training a probability model with the noise mixed signal data as learning data.
This provides the same operation and effects as the pattern model generation device according to claim 1.

To achieve the above object, the pattern model evaluation method according to claim 13 is
A pattern model evaluation method for evaluating the recognition performance of a pattern model used for pattern recognition of recognition target data, characterized by including:
An evaluation signal vector selection step of selecting low-order signal vectors for evaluation from among the low-order signal vectors that are located at least a predetermined distance from the center of gravity of all the low-order signal vectors, or that are distributed uniformly about that center of gravity, out of a plurality of low-order signal vectors constituting a low-order signal vector space, the low-order signal vector space being generated as a substitute for a data space composed of a plurality of high-order signal pattern models, which consist of elements of four or more dimensions and are generated from a plurality of noise-free signal data items relating to a plurality of objects, by mapping that data space onto a data space composed of low-order signal vectors of three or fewer dimensions while approximating the positional relationships among the high-order signal pattern models;
An evaluation noise vector selection step of selecting low-order noise vectors for evaluation from among the low-order noise vectors that are located at least a predetermined distance from the center of gravity of all the low-order noise vectors, or that are distributed uniformly about that center of gravity, out of a plurality of low-order noise vectors constituting a low-order noise vector space, the low-order noise vector space being generated as a substitute for a data space composed of a plurality of high-order noise pattern models, which consist of elements of four or more dimensions and are generated from a plurality of types of signal-free noise data, by mapping that data space onto a data space composed of low-order noise vectors of three or fewer dimensions while approximating the positional relationships among the high-order noise pattern models;
An evaluation mixed signal data generation step of generating mixed signal data for evaluation, in which each noise data item is mixed with each signal data item, based on the signal data corresponding to the low-order signal vectors selected in the evaluation signal vector selection step and the noise data corresponding to the low-order noise vectors selected in the evaluation noise vector selection step;
A performance evaluation step of evaluating the performance of the pattern model based on the mixed signal data for evaluation generated in the evaluation mixed signal data generation step and on the pattern model; and
An evaluation result output step of outputting the evaluation result of the performance evaluation step.
This provides the same operation and effects as the pattern model evaluation device according to claim 5.

To achieve the above object, the pattern recognition method according to claim 14 is characterized by including:
A low-order signal vector selection step of selecting low-order signal vectors for pattern model generation from among the low-order signal vectors located at least a predetermined distance from the center of gravity of all the low-order signal vectors, out of a plurality of low-order signal vectors constituting a low-order signal vector space, the low-order signal vector space being generated as a substitute for a data space composed of a plurality of high-order signal pattern models, which consist of elements of four or more dimensions and are generated from a plurality of noise-free signal data items relating to a plurality of objects, by mapping that data space onto a data space composed of low-order signal vectors of three or fewer dimensions while approximating the positional relationships among the high-order signal pattern models;
A low-order noise vector selection step of selecting low-order noise vectors for pattern model generation from among the low-order noise vectors located at least a predetermined distance from the center of gravity of all the low-order noise vectors, out of a plurality of low-order noise vectors constituting a low-order noise vector space, the low-order noise vector space being generated as a substitute for a data space composed of a plurality of high-order noise pattern models, which consist of elements of four or more dimensions and are generated from a plurality of types of signal-free noise data, by mapping that data space onto a data space composed of low-order noise vectors of three or fewer dimensions while approximating the positional relationships among the high-order noise pattern models;
A noise mixed signal data generation step of generating noise mixed signal data, in which the noise data is mixed with each signal data item, based on the signal data corresponding to the low-order signal vectors selected in the low-order signal vector selection step and the noise data corresponding to the low-order noise vectors selected in the low-order noise vector selection step;
A pattern model generation step of generating a pattern model by training a probability model with the noise mixed signal data as learning data;
A data acquisition step of acquiring recognition target data of the recognition target;
A pattern recognition step of performing pattern recognition of the acquired recognition target data based on the recognition target data acquired in the data acquisition step and the pattern model generated in the pattern model generation step; and
A recognition result output step of outputting the recognition result of the pattern recognition step.
This provides the same operation and effects as the pattern recognition device according to claim 7.

As described above, the pattern model generation device according to claim 1 or claim 2 of the present invention can generate a pattern model that improves the recognition probability for signal data that lowers the recognition probability of a pattern model.
In addition to the effects of claim 1 or claim 2, the pattern model generation device according to claim 3 makes the distribution of the pattern models easy to grasp visually. Moreover, by choosing regions appropriately, for example by selecting data according to the content of the pattern recognition or the recognition target, it can generate pattern models specialized for a specific recognition target or pattern models with higher recognition performance.

In addition to the effects of any one of claims 1 to 3, the pattern model generation device according to claim 4 makes it easy to generate the high-order pattern models for added data and to change or update each data space.
The pattern model evaluation device according to claim 5 or claim 6 can accurately and systematically evaluate the recognition probability for data that causes a low recognition probability.

The pattern recognition device according to claim 7 or claim 8 can perform pattern recognition in which the recognition probability is uniformly good across the signal data.
The pattern model generation program according to claim 9 can generate a pattern model that improves the recognition probability for signal data that lowers the recognition probability of a pattern model.

According to the pattern model evaluation program of claim 10, the recognition probability for data that causes a low recognition probability can be evaluated accurately and systematically.
According to the pattern recognition program of claim 11, pattern recognition can be performed with a uniformly good recognition probability across all signal data.

According to the pattern model generation method of claim 12, it is possible to generate a pattern model that improves the recognition probability for signal data that lowers the recognition probability of a pattern model.
According to the pattern model evaluation method of claim 13, the recognition probability for data that causes a low recognition probability can be evaluated accurately and systematically.
According to the pattern recognition method of claim 14, pattern recognition can be performed with a uniformly good recognition probability across all signal data.

Embodiments of the present invention are described below with reference to the drawings. FIGS. 1 to 11 show embodiments of the pattern model generation device, pattern model generation program, and pattern model generation method; the pattern model evaluation device, pattern model evaluation program, and pattern model evaluation method; and the pattern recognition device, pattern recognition program, and pattern recognition method according to the present invention.

First, the functional configuration of the pattern recognition device according to the present invention is described with reference to FIG. 1, which is a block diagram of that configuration.
The pattern recognition device 100 includes: a data storage unit 10 that stores signal data relating to a plurality of targets and a plurality of types of noise data; a low-order vector space storage unit 12 that stores a low-order signal vector space and a low-order noise vector space; a high-order pattern model generation unit 14 that generates high-order signal pattern models from the signal data and high-order noise pattern models from the noise data; a high-order pattern model storage unit 16 that stores the high-order signal pattern models and the high-order noise pattern models; and a low-order vector space generation unit 18 that generates a low-order vector space by mapping the data space formed by the high-order pattern models onto a data space formed by low-order vectors.

The data storage unit 10 has a function of storing, in a predetermined area of the storage device 70 described later, signal data relating to a plurality of targets (for example, speech signal data uttered by a plurality of speakers, sensor output signal data from sensors such as a plurality of infrared sensors, and call signal data from a plurality of wild animals), together with noise data such as household noise, factory noise, and traffic noise, and noise data such as electrical or electronic circuit noise and vibration noise. In this embodiment, it also has a function of storing the signal data relating to the plurality of targets in groups based on a plurality of specific conditions. For example, speech data acquired from an unspecified large number of speakers is grouped and stored according to specific conditions such as "speaker type" (speaker name, male/female gender, child/adult/elderly age bracket, and so on), "utterance vocabulary" (digits, sentences, words, and so on), and "utterance style" (speaking rate, speaking volume, dialect-derived characteristics, and so on). Each item of signal data is clean data with no noise mixed in, and each item of noise data is clean noise data containing no signal other than the noise component.

The low-order vector space storage unit 12 has a function of storing, in a predetermined area of the storage device 70 described later, low-order signal vector space data and low-order noise vector space data received from an external device via a network or the like or via a storage medium, as well as low-order signal vector space data and low-order noise vector space data generated by the low-order vector space generation unit 18. For low-order vector space data received from an external device or the like, the signal data and noise data relating to the plurality of targets that were used to generate it are also acquired.

The high-order pattern model generation unit 14 first has a function of extracting high-dimensional feature quantities (also called feature parameters) of four or more dimensions, by analysis processing such as cepstrum analysis or linear prediction analysis, from the signal data or noise data selected (for example, according to a selection instruction from the user) from among the per-group signal data or noise data stored by the data storage unit 10 in a predetermined area of the storage device 70 described later. It further has a function of training HMMs with a known method such as the EM algorithm, using the extracted high-dimensional feature quantities as training data, and generating a pattern model that consists of the trained HMMs and contains high-dimensional elements. For signal data relating to a plurality of targets, a pattern model is generated, for example, for each person or individual making up each group; for noise data, one is generated for each type of noise making up each group.
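As a rough illustration of the feature extraction step only (HMM training is not shown), the following sketch computes a real cepstrum for one frame using the Python standard library. The function names, the naive DFT, the default of 12 coefficients, and the log floor are assumptions made for this example; the text above only requires cepstrum-type analysis yielding four or more dimensions.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (or its inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def real_cepstrum(frame, floor=1e-12):
    """Real cepstrum: inverse DFT of the log magnitude spectrum.

    `floor` (an assumption) avoids taking the log of zero.
    """
    log_mag = [math.log(max(abs(c), floor)) for c in dft(frame)]
    return [c.real for c in dft(log_mag, inverse=True)]


def cepstral_features(frame, order=12):
    """Keep the first `order` cepstral coefficients as the feature vector."""
    if order < 4:
        raise ValueError("the description calls for four or more dimensions")
    return real_cepstrum(frame)[:order]
```

In practice a fast FFT routine and a mel filter bank would normally replace the naive transform; the sketch only shows where the high-dimensional feature vector comes from.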

The high-order pattern model storage unit 16 has a function of storing, in a predetermined area of the storage device 70 described later, high-order signal pattern models and high-order noise pattern models received from an external device via a network or the like or via a storage medium, as well as high-order signal pattern models and high-order noise pattern models generated by the high-order pattern model generation unit 14. For high-order pattern models received from an external device or the like, the signal data and noise data relating to the plurality of targets that were used to generate them are also acquired.

The low-order vector space generation unit 18 has a function of generating a low-order signal vector space using the known Sammon method: as a substitute for the data space formed by the plurality of high-order signal pattern models, it maps that space onto a data space formed by signal vectors with low-dimensional elements of three or fewer dimensions, while approximately preserving the distance relationships among the pattern models. In the same way, it uses the known Sammon method to generate a low-order noise vector space by mapping the data space formed by the plurality of high-order noise pattern models onto a data space formed by noise vectors with low-dimensional elements of three or fewer dimensions, again while approximately preserving the distance relationships among the pattern models.

Here, the Sammon method optimizes the mapped position coordinates in the low-dimensional space by steepest descent so as to minimize the difference between the sum of the mutual distances of the vector information (high-order pattern models) in the high-dimensional space and the sum of the mutual Euclidean distances of the mapped position coordinates (low-order vectors) in the low-dimensional space. In other words, the high-order signal pattern models and high-order noise pattern models are converted into, for example, three-dimensional or two-dimensional low-order vectors while approximately preserving the distance relationships among the pattern models, and are mapped to coordinate points in the low-dimensional space. In this embodiment, the generated low-order signal vector space and low-order noise vector space can be displayed by the information display unit 36 described later.
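The mapping can be sketched as plain gradient descent on the commonly used Sammon stress, E = (1/c) Σ (D_ij − d_ij)² / D_ij with c = Σ D_ij, where D_ij is the distance between two high-order pattern models and d_ij the Euclidean distance between their low-order vectors. This stress formula, the fixed learning rate, the random initialization, and the function names are assumptions of the example, not details taken from the description; how the distance between two HMM-based pattern models is computed is left to the caller, who supplies the matrix `D`.

```python
import math
import random


def sammon_stress(D, Y):
    """Sammon stress between target distances D and the layout Y."""
    n = len(D)
    c = sum(D[i][j] for i in range(n) for j in range(i + 1, n))
    err = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(Y[i], Y[j])
            err += (D[i][j] - d) ** 2 / D[i][j]
    return err / c


def sammon_map(D, dim=2, steps=300, lr=0.3, seed=0):
    """Map n items with pairwise distances D (all off-diagonal D[i][j] > 0)
    to `dim`-dimensional vectors by steepest descent on the stress."""
    rng = random.Random(seed)
    n = len(D)
    c = sum(D[i][j] for i in range(n) for j in range(i + 1, n))
    Y = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n)]
    for _ in range(steps):
        grads = []
        for i in range(n):
            g = [0.0] * dim
            for j in range(n):
                if i == j:
                    continue
                d = max(math.dist(Y[i], Y[j]), 1e-12)  # guard against d == 0
                coef = -2.0 / c * (D[i][j] - d) / (D[i][j] * d)
                for k in range(dim):
                    g[k] += coef * (Y[i][k] - Y[j][k])
            grads.append(g)
        # Move every point against its gradient (simultaneous update).
        Y = [[y - lr * gk for y, gk in zip(Y[i], grads[i])] for i in range(n)]
    return Y
```

Calling `sammon_map(D, dim=2)` returns one 2-D coordinate per pattern model; `dim=3` gives the three-dimensional variant mentioned above.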

The pattern recognition device 100 further includes: a low-order vector space partitioning unit 20 that partitions a low-order vector space formed by a plurality of low-order vectors into a plurality of regions; a learning data selection unit 22 that selects low-order vectors for pattern model generation from the partitioned low-order vector space and acquires the data corresponding to the selected low-order vectors; a noise-mixed signal data generation unit 24 that generates noise-mixed signal data from the signal data and noise data acquired by the learning data selection unit 22; a pattern model generation unit 26 that generates a pattern model for pattern recognition from the generated noise-mixed signal data; and a pattern model storage unit 28 that stores the generated pattern model.

The low-order vector space partitioning unit 20 has a function of partitioning a two-dimensional (or three-dimensional) low-order signal vector space formed by a plurality of low-order signal vectors using an outer circle (or outer sphere), which is centered on the centroid of all low-order signal vectors and whose radius is the distance from that center to the low-order vector farthest from it, and n inner circles (or inner spheres), which are centered on the same centroid and have radii shorter than that of the outer circle (or outer sphere). It also has a function of dividing each annular (or spherical-shell) region formed between adjacent curves (or surfaces) of the concentric circles (or concentric spheres) consisting of the outer circle (or outer sphere) and the n inner circles (or inner spheres) into a plurality of regions by lines (or planes) extending in the radial direction. In the same way as for the low-order signal vector space, it also has a function of partitioning a low-order noise vector space formed by a plurality of low-order noise vectors into a plurality of regions.
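For the two-dimensional case, the partitioning might be sketched as follows. Equally spaced inner-circle radii, equal-angle radial lines, leaving the innermost disc undivided, and the function name `partition` are all assumptions of this example; the description does not fix how the radii or the radial lines are chosen.

```python
import math


def partition(vectors, n_inner=2, n_sectors=4):
    """Assign each 2-D low-order vector a (ring, sector) region.

    Ring 0 is the innermost disc and ring `n_inner` is the outermost annulus.
    The innermost disc is left undivided (sector 0), an assumption.
    """
    cx = sum(x for x, _ in vectors) / len(vectors)
    cy = sum(y for _, y in vectors) / len(vectors)
    radii = [math.hypot(x - cx, y - cy) for x, y in vectors]
    outer = max(radii)  # radius of the outer circle
    # Radii of the n inner circles, equally spaced (assumption).
    bounds = [outer * (k + 1) / (n_inner + 1) for k in range(n_inner)]
    regions = []
    for (x, y), r in zip(vectors, radii):
        ring = sum(r > b for b in bounds)
        if ring == 0:
            sector = 0
        else:
            angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
            sector = min(int(angle / (2 * math.pi / n_sectors)), n_sectors - 1)
        regions.append((ring, sector))
    return (cx, cy), outer, regions
```

The same idea extends to three dimensions by replacing the circles with spheres and the angular sectors with a division of the spherical shell, which is not shown here.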

When a selection instruction is received from the user via the input device 74 described later, the learning data selection unit 22, assuming the low-order signal vector space and low-order noise vector space are two-dimensional (or three-dimensional), has a function of selecting the region corresponding to the selection instruction from the annular (or spherical-shell) region formed between the outer circle (or outer sphere) and the inner circle (or inner sphere) immediately inside it, and acquiring, from the contents of the data storage unit 10, the data corresponding to the low-order signal vectors or low-order noise vectors belonging to the selected region. When there is no selection instruction, it has a function of selecting predetermined regions within that annular (or spherical-shell) region according to a preset selection method (for example, selecting all regions) and acquiring from the data storage unit 10 the data corresponding to the low-order signal vectors or low-order noise vectors belonging to the selected regions. Low-order vectors themselves may also be selected directly instead of regions.
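A minimal sketch of this selection for the two-dimensional case is shown below. It picks the identifiers of the low-order vectors lying outside the inner circle adjacent to the outer circle, optionally restricted to chosen angular sectors, and then looks up the corresponding data. The identifiers, the `inner_ratio` parameter, the equal-angle sectors, and the dictionary standing in for the data storage unit 10 are hypothetical.

```python
import math


def select_outer_ring(low_vectors, inner_ratio=0.5, sectors=None, n_sectors=4):
    """Return ids of vectors in the outermost ring.

    low_vectors: dict mapping an id to an (x, y) low-order vector.
    inner_ratio: radius of the adjacent inner circle as a fraction of the
                 outer radius (assumption).
    sectors:     set of sector indices to keep, or None for all sectors
                 (the "select all regions" default).
    """
    ids = list(low_vectors)
    cx = sum(low_vectors[k][0] for k in ids) / len(ids)
    cy = sum(low_vectors[k][1] for k in ids) / len(ids)
    radius = {k: math.hypot(low_vectors[k][0] - cx, low_vectors[k][1] - cy) for k in ids}
    threshold = inner_ratio * max(radius.values())
    selected = []
    for k in ids:
        if radius[k] <= threshold:
            continue  # inside the inner circle: not in the outermost ring
        if sectors is not None:
            x, y = low_vectors[k]
            angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
            s = min(int(angle / (2 * math.pi / n_sectors)), n_sectors - 1)
            if s not in sectors:
                continue
        selected.append(k)
    return selected


def fetch_data(selected_ids, store):
    """Look up the stored signal or noise data for the selected vectors."""
    return [store[k] for k in selected_ids]
```

Passing `sectors={0}` corresponds to a user picking one region of the ring, while `sectors=None` corresponds to the automatic choice of every region in the ring.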

The terms in parentheses above indicate the shapes that apply when the low-order vector space is three-dimensional. The same applies to the descriptions of the low-order vector space below.
The noise-mixed signal data generation unit 24 has a function of generating noise-mixed signal data by mixing noise data into each item of signal data at a predetermined SNR, based on the signal data and noise data acquired by the learning data selection unit 22.
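One common way to mix noise at a given SNR is to scale the noise so that the ratio of mean signal power to mean noise power matches the target, then add it sample by sample. The sketch below does this; the function name, the tiling of short noise recordings, and the use of mean-square power over the whole signal are assumptions of the example rather than details from the description.

```python
import math


def mix_at_snr(signal, noise, snr_db):
    """Add `noise` to `signal` so the result has the requested SNR in dB."""
    if not signal or not noise:
        raise ValueError("signal and noise must be non-empty")
    # Repeat the noise if it is shorter than the signal (assumption).
    used = [noise[i % len(noise)] for i in range(len(signal))]
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in used) / len(used)
    if p_noise == 0.0:
        raise ValueError("noise has zero power")
    # Choose gain so that p_signal / (gain^2 * p_noise) == 10^(snr_db / 10).
    gain = math.sqrt(p_signal / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(signal, used)]
```

For example, `mix_at_snr(clean, babble, 10)` would yield a signal in which the clean component is 10 dB stronger than the added noise, where `clean` and `babble` are placeholder sample lists.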

The pattern model generation unit 26 has a function of generating a pattern model for pattern recognition by training an HMM in its initial state, or an already trained HMM, using the noise-mixed signal data generated by the noise-mixed signal data generation unit 24 as training data. In this embodiment, the training method is the same as that used for the high-order signal pattern models and high-order noise pattern models described above.
The pattern model storage unit 28 has a function of storing, in a predetermined area of the storage device 70 described later, the pattern model generated by the pattern model generation unit 26, as well as pattern models received from an external device via a network or the like or via a storage medium.

The pattern recognition device 100 further includes: an evaluation data selection unit 30 that selects evaluation regions from the low-order vector space that corresponds to a pattern model specified by the user via the input device 74 described later and has been partitioned into a plurality of regions by the low-order vector space partitioning unit 20, and acquires from the data storage unit 10 the data corresponding to the low-order vectors belonging to the selected regions; an evaluation mixed-signal data generation unit 32 that generates evaluation mixed-signal data from the acquired evaluation data; a pattern model evaluation unit 34 that evaluates the recognition performance of the pattern model on the basis of the generated evaluation mixed-signal data and the pattern model; and an information display unit 36 that displays the evaluation results, the partitioned low-order vector space, the recognition results of the pattern recognition processing unit 40 described later, and so on.

The evaluation data selection unit 30 has a function of selecting low-order signal vectors and low-order noise vectors for evaluation from the partitioned low-order signal vector space and the partitioned low-order noise vector space that correspond to the pattern model specified by the user. When a selection instruction is received from the user, the low-order vectors belonging to the regions indicated by the instruction are selected; when there is no selection instruction, the low-order signal vectors and low-order noise vectors for evaluation are selected according to a preset selection method. Once the low-order signal vectors and low-order noise vectors for evaluation have been selected, the unit also has a function of acquiring the corresponding signal data and noise data from the contents of the data storage unit 10.

The evaluation mixed-signal data generation unit 32 has a function of generating evaluation mixed-signal data by mixing noise data into each item of signal data at a predetermined SNR, based on the evaluation signal data and noise data acquired by the evaluation data selection unit 30.
The pattern model evaluation unit 34 extracts feature quantities from each item of evaluation mixed-signal data generated by the evaluation mixed-signal data generation unit 32 and performs pattern recognition on each item based on the extracted feature quantities and the pattern model specified by the user. For example, it extracts feature quantities by applying cepstrum analysis, MFCC analysis, or the like to the evaluation mixed-signal data frame by frame (for example, with a frame length of 20 ms and a frame shift of 10 ms), calculates the likelihood for each pattern model from the extracted feature quantities and the specified pattern model, and performs pattern recognition based on the calculated likelihoods. In pattern recognition, for example, the label corresponding to the pattern model with the maximum likelihood is selected as the recognition result. That is, if the signal data is speech data, then among the vocabulary items corresponding to the pattern models for speech recognition, the item corresponding to the pattern model with the highest calculated likelihood is selected as the recognition result. The pattern model evaluation unit 34 also has a function of evaluating the recognition performance of the pattern model specified by the user based on the results of this pattern recognition processing.
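The framing, maximum-likelihood decision, and scoring described above can be sketched as three small helpers. The likelihood computation itself (scoring feature sequences against HMMs) is outside the sketch and is represented by a dictionary of precomputed log-likelihoods; the function names, the log-likelihood representation, and the use of the fraction of correct results as the recognition rate are assumptions of the example.

```python
def frame_signal(samples, sample_rate, frame_ms=20, shift_ms=10):
    """Split samples into overlapping frames (20 ms long, 10 ms shift by default)."""
    frame_len = int(sample_rate * frame_ms / 1000)
    shift = int(sample_rate * shift_ms / 1000)
    return [samples[s:s + frame_len]
            for s in range(0, len(samples) - frame_len + 1, shift)]


def recognize(log_likelihoods):
    """Return the label whose pattern model gives the highest log-likelihood."""
    return max(log_likelihoods, key=log_likelihoods.get)


def recognition_rate(results):
    """Fraction of (recognized, correct) pairs that match."""
    return sum(rec == ref for rec, ref in results) / len(results)
```

Running `recognize` on each item of evaluation mixed-signal data and feeding the paired results to `recognition_rate` gives one number per evaluated pattern model that can then be displayed or compared.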

The information display unit 36 has a function of displaying the low-order vector space partitioned by the low-order vector space partitioning unit 20, as coordinate points in a two-dimensional space if it is two-dimensional or in a three-dimensional space if it is three-dimensional, together with the dividing lines that separate the regions. It also has a function of displaying the evaluation results of the pattern model evaluation unit 34 and the recognition results of the pattern recognition processing unit 40. As evaluation results, it displays, for example, the recognition probability for each item of evaluation mixed-signal data, or comments rating that recognition probability as good or poor. As recognition results, in the case of speech recognition using speech pattern models, for example, it displays the recognized vocabulary item.

The pattern recognition device 100 further includes a data acquisition unit 38 that acquires the signal data to be recognized and a pattern recognition processing unit 40 that performs pattern recognition on the acquired signal data.
The data acquisition unit 38 has a function of acquiring the signal data to be recognized. For example, when acquiring speech data, it acquires, via a microphone, an A/D converter, and the like, the analog speech input from the microphone after conversion into digital data by the A/D converter. When acquiring data to be recognized via the Internet or the like, it acquires the data via a network card or the like. When acquiring signal data of detection results from a sensor such as an infrared sensor, it acquires the sensor output signal after conversion into digital data by an A/D converter.

The pattern recognition processing unit 40 has a function of processing the signal data acquired by the data acquisition unit 38 in the same way as the pattern model evaluation unit 34: it extracts feature quantities by applying cepstrum analysis, MFCC analysis, or the like frame by frame (for example, with a frame length of 20 ms and a frame shift of 10 ms), calculates the likelihood for each pattern model from the extracted feature quantities and the pattern models corresponding to the signal data, and performs pattern recognition based on the calculated likelihoods. The recognition result is displayed by the information display unit 36.

The pattern recognition device 100 further includes a computer system for implementing the control of the above units in software. As shown in FIG. 2, its hardware configuration is as follows: a CPU (Central Processing Unit) 60, which performs the various control and arithmetic operations; a RAM (Random Access Memory) 62, which constitutes the main storage; and a ROM (Read Only Memory) 64, which is a read-only storage device, are interconnected by various internal and external buses 68 such as a PCI (Peripheral Component Interconnect) bus and an ISA (Industry Standard Architecture) bus. Connected to the bus 68 via an input/output interface (I/F) 66 are an external storage device (secondary storage) 70 such as an HDD (Hard Disk Drive), an output device 72 such as a printer, CRT, or LCD monitor, an input device 74 such as an operation panel, mouse, keyboard, or scanner, and a network L for communicating with external devices (not shown).

When the power is turned on, a system program such as the BIOS stored in the ROM 64 or the like loads into the RAM 62 the various dedicated computer programs stored in advance in the ROM 64, or those installed in the storage device 70 via a storage medium such as a CD-ROM, DVD-ROM, or flexible disk (FD) or via a communication network L such as the Internet. The CPU 60 then performs the predetermined control and arithmetic processing, making full use of the various resources, in accordance with the instructions written in the programs loaded into the RAM 62, so that the functions of the units described above are implemented in software.

Next, an example of the flow of processing for generating a pattern model for pattern recognition using the pattern recognition device 100 configured as above is described with reference to FIG. 3, which is a flowchart showing an example of that processing. As shown in the flowchart of FIG. 3, the pattern recognition device 100 first proceeds to step S100, where the low-order vector space partitioning unit 20 determines whether a pattern model generation instruction has been received from the user via the input device 74. If it determines that a generation instruction has been received (Yes), processing proceeds to step S102; otherwise (No), the determination is repeated until a generation instruction is received.

In step S102, the low-order vector space partitioning unit 20 reads the low-order signal vector space data corresponding to the user's generation instruction from a predetermined area of the storage device 70 and stores the read data in a predetermined area of the RAM 62, thereby acquiring the low-order signal vector space data, and processing proceeds to step S104.
In step S104, the low-order vector space partitioning unit 20 reads the low-order noise vector space data corresponding to the user's generation instruction from a predetermined area of the storage device 70 and stores the read data in a predetermined area of the RAM 62, thereby acquiring the low-order noise vector space data, and processing proceeds to step S106.

In step S106, the low-order vector space partitioning unit 20 divides the low-order signal vectors forming the low-order signal vector space into a plurality of regions based on the data acquired in step S102, and divides the low-order noise vectors forming the low-order noise vector space into a plurality of regions based on the data acquired in step S104. Processing then proceeds to step S108.

In step S108, the low-order vector space partitioning unit 20 determines whether the manual selection mode is selected. If it determines that the manual selection mode is selected (Yes), it passes the partitioned low-order vector space data to the information display unit 36 and processing proceeds to step S110; otherwise (No), processing proceeds to step S122. The manual selection mode is a mode in which the user can select any region, via the input device 74, from a specific region of the partitioned low-order signal vector space and low-order noise vector space. Assuming the low-order signal vector space and low-order noise vector space are two-dimensional (or three-dimensional), the specific region is, as described above, the annular (or spherical-shell) region formed between the outer circle (or outer sphere) and the inner circle (or inner sphere) immediately inside it, and this region is divided into a plurality of regions by lines (or planes) extending in the radial direction. There is also an automatic selection mode as the counterpart to the manual selection mode; when it is set, predetermined regions are selected automatically from the specific region according to a preset selection method.

ステップＳ１１０に移行した場合は、情報表示部３６において、低次ベクトル空間区分部２０からの低次信号ベクトル空間および低次ノイズベクトル空間の各データに基づき、各低次ベクトルおよび区分結果をＣＲＴ又はＬＣＤモニター等から構成される出力装置７２に表示出力してステップＳ１１２に移行する。
When the process proceeds to step S110, the information display unit 36 displays each low-order vector and the partitioning result on the output device 72, which comprises a CRT, LCD monitor or the like, based on the low-order signal vector space data and low-order noise vector space data from the low-order vector space partitioning unit 20, and the process proceeds to step S112.
ステップＳ１１２では、学習用データ選択部２２において、ユーザによって、区分後の低次信号ベクトル空間および低次ノイズベクトル空間の前述した特定領域から所定の領域が選択されたか否かを判定し、選択されたと判定した場合（Ｙｅｓ）は、ステップＳ１１４に移行し、そうでない場合（Ｎｏ）は、選択されるまで判定処理を繰り返す。 In step S112, the learning data selection unit 22 determines whether the user has selected a predetermined region from the above-described specific regions of the partitioned low-order signal vector space and low-order noise vector space. If a region has been selected (Yes), the process proceeds to step S114; otherwise (No), the determination is repeated until a region is selected.

ステップＳ１１４に移行した場合は、学習用データ選択部２２において、ステップＳ１１２で選択された領域に属する低次信号ベクトルおよび低次ノイズベクトルに対応する信号データおよびノイズデータを、記憶装置７０の所定領域から読み出し、当該読み出したデータをＲＡＭ６２の所定領域に格納することで、これら信号データおよびノイズデータを取得してステップＳ１１６に移行する。 When the process proceeds to step S114, the learning data selection unit 22 causes the signal data and noise data corresponding to the low-order signal vector and the low-order noise vector belonging to the area selected in step S112 to be stored in a predetermined area of the storage device 70. , And the read data is stored in a predetermined area of the RAM 62 to acquire the signal data and noise data, and the process proceeds to step S116.

ステップＳ１１６では、ノイズ混合信号データ生成部２４において、ステップＳ１１４で取得した信号データおよびノイズデータに基づき、各信号データにノイズを予め設定されたＳＮＲで混合してノイズ混合信号データを生成してステップＳ１１８に移行する。ここで、本実施の形態において、ＳＮＲは、ユーザが任意の値を設定したり、いくつかの候補の中から選択したりできる構成となっている。 In step S116, the noise mixed signal data generation unit 24 generates noise mixed signal data by mixing noise with each signal data at a preset SNR based on the signal data and noise data acquired in step S114. The process proceeds to S118. Here, in the present embodiment, the SNR is configured such that the user can set an arbitrary value or select from several candidates.
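As an illustration of the mixing in step S116, the following minimal sketch scales a noise waveform so that the mixture has a requested SNR in decibels. The function name `mix_at_snr`, the use of NumPy, the mean-power definition of SNR, and the tiling of short noise to the signal length are all assumptions made for this example; the description only states that noise is mixed at a preset SNR.

```python
import numpy as np

def mix_at_snr(signal, noise, snr_db):
    """Return signal + scaled noise so that the power ratio equals snr_db (dB)."""
    signal = np.asarray(signal, dtype=float)
    noise = np.asarray(noise, dtype=float)
    # Assumption: repeat or trim the noise so that it covers the whole signal.
    reps = int(np.ceil(len(signal) / len(noise)))
    noise = np.tile(noise, reps)[: len(signal)]
    p_signal = np.mean(signal ** 2)
    p_noise = np.mean(noise ** 2)
    # Choose gain g so that 10*log10(p_signal / (g^2 * p_noise)) == snr_db.
    gain = np.sqrt(p_signal / (p_noise * 10.0 ** (snr_db / 10.0)))
    return signal + gain * noise

# Hypothetical data: a sine wave as "clean" signal, white noise as the noise source.
t = np.linspace(0.0, 1.0, 8000, endpoint=False)
clean = np.sin(2 * np.pi * 440.0 * t)
noise = np.random.default_rng(0).normal(size=3000)
mixed = mix_at_snr(clean, noise, snr_db=20.0)
```

A user-selectable SNR, as described for this embodiment, would simply correspond to passing a different `snr_db` value.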

ステップＳ１１８では、パターンモデル生成部２６において、ステップＳ１１６で生成されたノイズ混合信号データを解析して４次元以上の高次元（例えば、１０〜４０次元）の特徴量を抽出し、当該抽出した高次元の特徴量を用いてＨＭＭを学習することでパターン認識用のパターンモデルを生成してステップＳ１２０に移行する。
In step S118, the pattern model generation unit 26 analyzes the noise mixed signal data generated in step S116 to extract high-dimensional feature quantities of four or more dimensions (for example, 10 to 40 dimensions), and trains an HMM on the extracted feature quantities to generate a pattern model for pattern recognition. The process then proceeds to step S120.
ステップＳ１２０では、パターンモデル記憶部２８において、ステップＳ１１８で生成したパターンモデルを認識対象毎に対応付けて記憶装置７０の所定領域に記憶してステップＳ１００に移行する。 In step S120, the pattern model storage unit 28 stores the pattern model generated in step S118, in association with each recognition target, in a predetermined area of the storage device 70, and the process proceeds to step S100.

一方、ステップＳ１０８において、手動選択モードではなく自動選択モードが設定されておりステップＳ１２２に移行した場合は、学習用データ選択部２２において、予め設定された選択方法に従って、前述した特定領域から所定の領域を選択すると共に、当該選択した領域に属する低次信号ベクトルおよび低次ノイズベクトルに対応する信号データおよびノイズデータを、記憶装置７０の所定領域から読み出し、当該読み出したデータをＲＡＭ６２の所定領域に格納することで、これら信号データおよびノイズデータを取得してステップＳ１１６に移行する。 On the other hand, when the automatic selection mode rather than the manual selection mode is set in step S108 and the process proceeds to step S122, the learning data selection unit 22 selects a predetermined region from the specific region described above according to a preset selection method, reads the signal data and noise data corresponding to the low-order signal vectors and low-order noise vectors belonging to the selected region from a predetermined area of the storage device 70, and stores the read data in a predetermined area of the RAM 62, thereby acquiring the signal data and noise data. The process then proceeds to step S116.

更に、パターン認識装置１００における低次ベクトル空間データの生成処理の流れの一例を、図４のフローチャートに基づき説明する。ここで、図４は、低次ベクトル空間データの生成処理の一例を示すフローチャートである。
図４のフローチャートに示すように、パターン認識装置１００は、まずステップＳ２００に移行して、高次パターンモデル生成部１４において、低次ベクトル空間データの生成指示があったか否かを判定し、あったと判定した場合（Ｙｅｓ）は、ステップＳ２０２に移行し、そうでない場合（Ｎｏ）は、指示があるまで判定処理を繰り返す。 Furthermore, an example of the flow of low-order vector space data generation processing in the pattern recognition apparatus 100 will be described based on the flowchart of FIG. Here, FIG. 4 is a flowchart showing an example of a low-order vector space data generation process.
As shown in the flowchart of FIG. 4, the pattern recognition apparatus 100 first proceeds to step S200, where the high-order pattern model generation unit 14 determines whether an instruction to generate low-order vector space data has been given. If so (Yes), the process proceeds to step S202; otherwise (No), the determination is repeated until an instruction is given.

ステップＳ２０２に移行した場合は、高次パターンモデル生成部１４において、低次ベクトル空間データの生成指示は、信号データに対するものであるか否かを判定し、信号データに対するものであると判定した場合（Ｙｅｓ）は、ステップＳ２０４に移行し、そうでない場合（Ｎｏ）は、ステップＳ２１６に移行する。
ステップＳ２０４に移行した場合は、高次パターンモデル生成部１４において、生成指示に対応した信号データを、記憶装置７０の所定領域から読み出し、当該読み出した信号データを、ＲＡＭ６２の所定領域に格納することで、当該信号データを取得してステップＳ２０６に移行する。 When the process proceeds to step S202, the high-order pattern model generation unit 14 determines whether the low-order vector space data generation instruction is for the signal data, and if it is determined that the instruction is for the signal data (Yes) moves to step S204, otherwise (No) moves to step S216.
When the process proceeds to step S204, the high-order pattern model generation unit 14 reads the signal data corresponding to the generation instruction from the predetermined area of the storage device 70, and stores the read signal data in the predetermined area of the RAM 62. Thus, the signal data is acquired and the process proceeds to step S206.

ステップＳ２０６では、高次パターンモデル生成部１４において、ステップＳ２０４で取得した信号データを解析して４次元以上の高次元の特徴量を抽出してステップＳ２０８に移行する。
ステップＳ２０８では、高次パターンモデル生成部１４において、ステップＳ２０６で抽出した高次元の特徴量を用いてＨＭＭを学習し高次信号パターンモデルを生成してステップＳ２１０に移行する。 In step S206, the high-order pattern model generation unit 14 analyzes the signal data acquired in step S204, extracts a high-dimensional feature quantity of four or more dimensions, and proceeds to step S208.
In step S208, the high-order pattern model generation unit 14 learns the HMM using the high-dimensional feature value extracted in step S206 to generate a high-order signal pattern model, and the process proceeds to step S210.

ステップＳ２１０では、高次パターンモデル記憶部１６において、ステップＳ２０８で生成した高次信号パターンモデルを、記憶装置７０の所定領域に記憶してステップＳ２１２に移行する。
ステップＳ２１２では、低次ベクトル空間生成部１８において、ステップＳ２１０において記憶された高次信号パターンモデルから構成されるデータ空間を、Ｓａｍｍｏｎ法を用いて３次元以下の信号ベクトルから構成されるデータ空間へと写像することで低次信号ベクトル空間データを生成してステップＳ２１４に移行する。 In step S210, the high-order pattern model storage unit 16 stores the high-order signal pattern model generated in step S208 in a predetermined area of the storage device 70, and the process proceeds to step S212.
In step S212, the low-order vector space generation unit 18 uses the Sammon method to map the data space composed of the high-order signal pattern models stored in step S210 onto a data space composed of signal vectors of three or fewer dimensions, thereby generating low-order signal vector space data, and the process proceeds to step S214.

ステップＳ２１４では、低次ベクトル空間記憶部１２において、ステップＳ２１２で生成された低次信号ベクトル空間データを、記憶装置７０の所定領域に記憶してステップＳ２００に移行する。
In step S214, the low-order vector space storage unit 12 stores the low-order signal vector space data generated in step S212 in a predetermined area of the storage device 70, and the process proceeds to step S200.
一方、ステップＳ２０２において、信号データではなくステップＳ２１６に移行した場合は、高次パターンモデル生成部１４において、生成指示に対応したノイズデータを、記憶装置７０の所定領域から読み出し、当該読み出したノイズデータを、ＲＡＭ６２の所定領域に格納することで、当該ノイズデータを取得してステップＳ２１８に移行する。 On the other hand, when the generation instruction is determined in step S202 to be for noise data rather than signal data and the process proceeds to step S216, the high-order pattern model generation unit 14 reads the noise data corresponding to the generation instruction from a predetermined area of the storage device 70 and stores it in a predetermined area of the RAM 62, thereby acquiring the noise data, and the process proceeds to step S218.

ステップＳ２１８では、高次パターンモデル生成部１４において、ステップＳ２１６で取得したノイズデータを解析して４次元以上の高次元の特徴量を抽出してステップＳ２２０に移行する。
ステップＳ２２０では、高次パターンモデル生成部１４において、ステップＳ２１８で抽出した高次元の特徴量を用いてＨＭＭを学習し高次ノイズパターンモデルを生成してステップＳ２２２に移行する。
ステップＳ２２２では、高次パターンモデル記憶部１６において、ステップＳ２２０で生成した高次ノイズパターンモデルを、記憶装置７０の所定領域に記憶してステップＳ２２４に移行する。 In step S218, the high-order pattern model generation unit 14 analyzes the noise data acquired in step S216, extracts high-dimensional feature quantities of four or more dimensions, and proceeds to step S220.
In step S220, the high-order pattern model generation unit 14 learns the HMM using the high-dimensional feature value extracted in step S218, generates a high-order noise pattern model, and proceeds to step S222.
In step S222, the high-order pattern model storage unit 16 stores the high-order noise pattern model generated in step S220 in a predetermined area of the storage device 70, and the process proceeds to step S224.

ステップＳ２２４では、低次ベクトル空間生成部１８において、ステップＳ２２２において記憶された高次ノイズパターンモデルから構成されるデータ空間を、Ｓａｍｍｏｎ法を用いて３次元以下の低次元のノイズベクトルから構成されるデータ空間へと写像することで低次ノイズベクトル空間データを生成してステップＳ２２６に移行する。
ステップＳ２２６では、低次ベクトル空間記憶部１２において、ステップＳ２２４で生成された低次ノイズベクトル空間データを、記憶装置７０の所定領域に記憶してステップＳ２００に移行する。 In step S224, in the low-order vector space generation unit 18, the data space composed of the high-order noise pattern model stored in step S222 is composed of low-dimensional noise vectors of three dimensions or less using the Sammon method. By mapping to the data space, low-order noise vector space data is generated and the process proceeds to step S226.
In step S226, the low-order vector space storage unit 12 stores the low-order noise vector space data generated in step S224 in a predetermined area of the storage device 70, and the process proceeds to step S200.
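The mapping in steps S212 and S224 can be sketched as follows. This is a minimal illustration that assumes the pairwise distances between the high-order pattern models are already available as a matrix (how those distances are computed is outside this passage), and it uses plain gradient descent on Sammon's stress rather than Sammon's original update rule. The names `sammon_map`, `sammon_stress` and `pairwise_dist`, the learning rate, and the iteration count are hypothetical choices for this example.

```python
import numpy as np

def pairwise_dist(points):
    """Euclidean distance matrix between the rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def sammon_stress(dist_high, points_low):
    """Sammon's stress: sum((d* - d)^2 / d*) / sum(d*) over all pairs i < j."""
    iu = np.triu_indices(len(dist_high), k=1)
    d_high = dist_high[iu]
    d_low = pairwise_dist(points_low)[iu]
    return float(((d_high - d_low) ** 2 / d_high).sum() / d_high.sum())

def sammon_map(dist_high, dim=2, n_iter=1000, lr=0.2, seed=0):
    """Map items with given pairwise distances to `dim`-dimensional vectors."""
    n = len(dist_high)
    rng = np.random.default_rng(seed)
    y = rng.normal(scale=1e-2, size=(n, dim))     # small random start
    c = dist_high[np.triu_indices(n, k=1)].sum()  # normalizing constant
    d_star = dist_high + np.eye(n)                # dummy 1s avoid 0/0 on the diagonal
    for _ in range(n_iter):
        diff = y[:, None, :] - y[None, :, :]
        d = np.sqrt((diff ** 2).sum(axis=-1)) + np.eye(n)
        # dE/dy_i = -(2/c) * sum_j (d*_ij - d_ij) / (d*_ij * d_ij) * (y_i - y_j)
        w = (d_star - d) / (d_star * np.maximum(d, 1e-12))
        np.fill_diagonal(w, 0.0)
        grad = (-2.0 / c) * (w[:, :, None] * diff).sum(axis=1)
        y -= lr * grad
    return y
```

With `dim=2` or `dim=3` the returned rows play the role of the low-order vectors that are later displayed and partitioned; distinct items are assumed to have non-zero mutual distance.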

更に、パターン認識装置１００におけるパターン認識用のパターンモデルの評価処理の流れの一例を、図５のフローチャートに基づき説明する。ここで、図５は、パターン認識用のパターンモデルの評価処理の一例を示すフローチャートである。
図５のフローチャートに示すように、パターン認識装置１００は、まずステップＳ３００に移行し、低次ベクトル空間区分部２０において、入力装置７４を介したユーザからのパターンモデルの評価指示があったか否かを判定し、評価指示があったと判定した場合（Ｙｅｓ）は、ステップＳ３０２に移行し、そうでない場合（Ｎｏ）は、評価指示があるまで判定処理を繰り返す。 Further, an example of the flow of the pattern recognition process for pattern recognition in the pattern recognition apparatus 100 will be described based on the flowchart of FIG. Here, FIG. 5 is a flowchart showing an example of pattern pattern evaluation processing for pattern recognition.
As shown in the flowchart of FIG. 5, the pattern recognition apparatus 100 first proceeds to step S300, where the low-order vector space partitioning unit 20 determines whether a pattern model evaluation instruction has been given by the user via the input device 74. If so (Yes), the process proceeds to step S302; otherwise (No), the determination is repeated until an evaluation instruction is given.

ステップＳ３０２に移行した場合は、低次ベクトル空間区分部２０において、記憶装置７０の所定領域から、上記評価指示において指定されるパターンモデルに対応した低次信号ベクトル空間データを読み出し、当該読み出したデータをＲＡＭ６２の所定領域に格納することで、低次信号ベクトル空間データを取得してステップＳ３０４に移行する。
ステップＳ３０４では、低次ベクトル空間区分部２０において、記憶装置７０の所定領域から、評価指示において指定されるパターンモデルに対応した低次ノイズベクトル空間データを読み出し、当該読み出したデータをＲＡＭ６２の所定領域に格納することで、低次ノイズベクトル空間データを取得してステップＳ３０６に移行する。 When the process proceeds to step S302, the low-order vector space partition unit 20 reads low-order signal vector space data corresponding to the pattern model specified in the evaluation instruction from a predetermined area of the storage device 70, and the read data Is stored in a predetermined area of the RAM 62, low-order signal vector space data is acquired, and the process proceeds to step S304.
In step S304, the low-order vector space partitioning unit 20 reads the low-order noise vector space data corresponding to the pattern model specified in the evaluation instruction from a predetermined area of the storage device 70 and stores the read data in a predetermined area of the RAM 62, thereby acquiring the low-order noise vector space data, and the process proceeds to step S306.

ステップＳ３０６では、低次ベクトル空間区分部２０において、ステップＳ３０２で取得した低次信号ベクトル空間データに基づき、当該空間を構成する低次信号ベクトルを複数の領域に区分すると共に、ステップＳ３０４で取得した低次ノイズベクトル空間データに基づき、当該空間を構成する低次ノイズベクトルを複数の領域に区分してステップＳ３０８に移行する。 In step S306, the low-order vector space partitioning unit 20 partitions the low-order signal vectors constituting the space into a plurality of regions based on the low-order signal vector space data acquired in step S302 and acquired in step S304. Based on the low-order noise vector space data, the low-order noise vector constituting the space is divided into a plurality of regions, and the process proceeds to step S308.

ステップＳ３０８では、低次ベクトル空間区分部２０において、手動選択モードが選択されているか否かを判定し、手動選択モードが選択されていると判定した場合（Ｙｅｓ）は、情報表示部３６に区分後の低次ベクトル空間データを与えてステップＳ３１０に移行し、そうでない場合（Ｎｏ）は、ステップＳ３２２に移行する。
In step S308, the low-order vector space partitioning unit 20 determines whether the manual selection mode is selected. If so (Yes), it passes the partitioned low-order vector space data to the information display unit 36 and the process proceeds to step S310; otherwise (No), the process proceeds to step S322.
ステップＳ３１０に移行した場合は、情報表示部３６において、低次ベクトル空間区分部２０からの低次信号ベクトル空間および低次ノイズベクトル空間の各データに基づき、各低次ベクトルおよび区分結果をＣＲＴ又はＬＣＤモニター等から構成される出力装置７２に表示出力してステップＳ３１２に移行する。 When the process proceeds to step S310, the information display unit 36 displays each low-order vector and the partitioning result on the output device 72, which comprises a CRT, LCD monitor or the like, based on the low-order signal vector space data and low-order noise vector space data from the low-order vector space partitioning unit 20, and the process proceeds to step S312.

ステップＳ３１２では、評価用データ選択部３０において、ユーザによって、区分後の低次信号ベクトル空間および低次ノイズベクトル空間の前述した特定領域から所定の領域が選択されたか否かを判定し、選択されたと判定した場合（Ｙｅｓ）は、ステップＳ３１４に移行し、そうでない場合（Ｎｏ）は、選択されるまで判定処理を繰り返す。
In step S312, the evaluation data selection unit 30 determines whether the user has selected a predetermined region from the above-described specific regions of the partitioned low-order signal vector space and low-order noise vector space. If a region has been selected (Yes), the process proceeds to step S314; otherwise (No), the determination is repeated until a region is selected.
ステップＳ３１４に移行した場合は、評価用データ選択部３０において、ステップＳ３１２で選択された領域に属する低次信号ベクトルおよび低次ノイズベクトルに対応する信号データおよびノイズデータを、記憶装置７０の所定領域から読み出し、当該読み出したデータをＲＡＭ６２の所定領域に格納することで、これら信号データおよびノイズデータを取得してステップＳ３１６に移行する。 When the process proceeds to step S314, the evaluation data selection unit 30 reads the signal data and noise data corresponding to the low-order signal vectors and low-order noise vectors belonging to the region selected in step S312 from a predetermined area of the storage device 70, and stores the read data in a predetermined area of the RAM 62, thereby acquiring the signal data and noise data. The process then proceeds to step S316.

ステップＳ３１６では、評価用混合信号データ生成部３２において、ステップＳ３１４で取得した信号データおよびノイズデータに基づき、各信号データにノイズを予め設定されたＳＮＲで混合して評価用混合信号データを生成してステップＳ３１８に移行する。ここで、本実施の形態において、ＳＮＲは、ユーザが任意の値を設定したり、いくつかの候補の中から選択したりできる構成となっている。 In step S316, the evaluation mixed signal data generation unit 32 mixes noise into each piece of signal data at a preset SNR, based on the signal data and noise data acquired in step S314, to generate evaluation mixed signal data, and the process proceeds to step S318. In the present embodiment, the user can set the SNR to an arbitrary value or select it from several candidates.

ステップＳ３１８では、パターンモデル評価部３４において、ステップＳ３１６で生成された評価用混合信号データを解析して高次元の特徴量を抽出し、当該抽出した特徴量と、評価指示において指定されたパターンモデルとに基づきパターン認識を行い、当該認識結果に基づきパターンモデルを評価してステップＳ３２０に移行する。 In step S318, the pattern model evaluation unit 34 analyzes the evaluation mixed signal data generated in step S316 to extract high-dimensional feature quantities, performs pattern recognition based on the extracted feature quantities and the pattern model specified in the evaluation instruction, and evaluates the pattern model based on the recognition result. The process then proceeds to step S320.
ステップＳ３２０では、情報表示部３６において、ステップＳ３１８の評価結果を出力装置７２のモニターに表示してステップＳ３００に移行する。 In step S320, the information display unit 36 displays the evaluation result of step S318 on the monitor of the output device 72, and the process proceeds to step S300.

一方、ステップＳ３０８において、手動選択モードではなく自動選択モードが設定されておりステップＳ３２２に移行した場合は、評価用データ選択部３０において、予め設定された選択方法に従って、前述した特定領域から所定の低次ベクトルを選択すると共に、当該選択した低次ベクトルに対応する信号データおよびノイズデータを記憶装置７０から読み出し、当該読み出したデータをＲＡＭ６２の所定領域に格納することで、これら信号データおよびノイズデータを取得してステップＳ３１６に移行する。 On the other hand, if the automatic selection mode is set instead of the manual selection mode in step S308, and the process proceeds to step S322, the evaluation data selection unit 30 performs predetermined processing from the specific area described above according to a preset selection method. While selecting a low-order vector, reading out the signal data and noise data corresponding to the selected low-order vector from the storage device 70, and storing the read data in a predetermined area of the RAM 62, the signal data and the noise data And the process proceeds to step S316.

更に、パターン認識装置１００におけるパターン認識処理の流れの一例を、図６のフローチャートに基づき説明する。ここで、図６は、パターン認識処理の一例を示すフローチャートである。
図６のフローチャートに示すように、パターン認識装置１００は、まずステップＳ４００に移行し、データ取得部３８において、入力装置７４を介したユーザからのパターン認識指示があったか否かを判定し、認識指示があったと判定した場合（Ｙｅｓ）は、ステップＳ４０２に移行し、そうでない場合（Ｎｏ）は、認識指示があるまで判定処理を繰り返す。 Furthermore, an example of the flow of pattern recognition processing in the pattern recognition apparatus 100 will be described based on the flowchart of FIG. Here, FIG. 6 is a flowchart showing an example of pattern recognition processing.
As shown in the flowchart of FIG. 6, the pattern recognition apparatus 100 first proceeds to step S400, where the data acquisition unit 38 determines whether a pattern recognition instruction has been given by the user via the input device 74. If so (Yes), the process proceeds to step S402; otherwise (No), the determination is repeated until a recognition instruction is given.

ステップＳ４０２に移行した場合は、信号データ取得部３８において、認識対象の信号データを取得したか否かを判定し、取得したと判定した場合（Ｙｅｓ）は、Ａ／Ｄ変換器によりデジタルデータに変換してステップＳ４０４に移行し、そうでない場合（Ｎｏ）は、取得するか、認識処理の終了指示があるまで判定処理を繰り返す。
ステップＳ４０４に移行した場合は、パターン認識処理部４０において、ステップＳ４０２で取得した信号データを解析して高次元の特徴量を抽出してステップＳ４０６に移行する。 When the process proceeds to step S402, the signal data acquisition unit 38 determines whether or not the signal data to be recognized has been acquired. If it is determined that the signal data has been acquired (Yes), the A / D converter converts the signal data into digital data. After conversion, the process proceeds to step S404. If not (No), the determination process is repeated until acquisition or an instruction to end the recognition process is given.
When the process proceeds to step S404, the pattern recognition processing unit 40 analyzes the signal data acquired in step S402 to extract high-dimensional feature values, and the process proceeds to step S406.

ステップＳ４０６では、パターン認識処理部４０において、ステップＳ４０４で抽出した高次元の特徴量と、認識指示において指定されたパターンモデルとに基づきパターン認識処理を行い、情報表示部に認識結果の情報および認識結果の表示指示を与えてステップＳ４０８に移行する。 In step S406, the pattern recognition processing unit 40 performs pattern recognition processing based on the high-dimensional feature quantities extracted in step S404 and the pattern model specified in the recognition instruction, gives the information display unit the recognition result information and an instruction to display it, and the process proceeds to step S408.
ステップＳ４０８では、情報表示部３６において、パターン認識処理部４０からの認識結果の情報および表示指示に応じて、出力装置７２を構成するＣＲＴ又はＬＣＤモニターに認識結果を表示してステップＳ４１０に移行する。 In step S408, the information display unit 36 displays the recognition result on the CRT or LCD monitor constituting the output device 72, in accordance with the recognition result information and display instruction from the pattern recognition processing unit 40, and the process proceeds to step S410.
ステップＳ４１０では、データ取得部３８において、入力装置７４を介したユーザからの認識処理の終了指示があったか否かを判定し、あったと判定した場合（Ｙｅｓ）は、ステップＳ４００に移行し、そうでない場合（Ｎｏ）は、ステップＳ４０２に移行する。 In step S410, the data acquisition unit 38 determines whether an instruction to end the recognition processing has been given by the user via the input device 74. If so (Yes), the process proceeds to step S400; otherwise (No), the process proceeds to step S402.
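As a rough illustration of the recognition in step S406, the sketch below scores a feature sequence against several HMM pattern models and returns the label of the best-scoring one. It is only a sketch under stated assumptions: each state emits from a single diagonal-covariance Gaussian, scoring uses the forward algorithm in the log domain, and the winner is the model with the highest log-likelihood. The description does not specify these details here, and the names `hmm_loglik` and `recognize` are hypothetical.

```python
import numpy as np

def log_gauss(x, mean, var):
    """Log density of x under each state's diagonal Gaussian; shape (n_states,)."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var).sum(axis=1)

def hmm_loglik(feats, log_pi, log_A, means, variances):
    """Forward algorithm in the log domain: log p(feats | model)."""
    alpha = log_pi + log_gauss(feats[0], means, variances)
    for x in feats[1:]:
        # alpha_new[j] = logsum_i(alpha[i] + log_A[i, j]) + log b_j(x)
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0)
        alpha = alpha + log_gauss(x, means, variances)
    return float(np.logaddexp.reduce(alpha, axis=0))

def recognize(feats, models):
    """Return the label of the model with the highest log-likelihood."""
    scores = {name: hmm_loglik(feats, *params) for name, params in models.items()}
    return max(scores, key=scores.get)

# Hypothetical two-state models over one-dimensional features.
log_pi = np.log(np.array([0.6, 0.4]))
log_A = np.log(np.array([[0.7, 0.3], [0.2, 0.8]]))
var = np.ones((2, 1))
models = {
    "a": (log_pi, log_A, np.array([[0.0], [1.0]]), var),
    "b": (log_pi, log_A, np.array([[5.0], [6.0]]), var),
}
result = recognize(np.array([[0.1], [0.9], [0.2]]), models)
```

In the apparatus described here the features would be the high-dimensional feature quantities from step S404 rather than the one-dimensional toy values used above.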

次に、図７〜図１１に基づき、本実施の形態の動作を説明する。
Next, the operation of the present embodiment will be described with reference to FIGS. 7 to 11.
ここで、図７は、音声信号データに対する２次元の低次信号ベクトル空間の一例を示す図である。また、図８は、領域区分後の図７の低次信号ベクトル空間の一例を示す図である。また、図９は、ノイズデータに対する２次元の低次ノイズベクトル空間の一例を示す図である。また、図１０は、領域区分後の図９の低次ノイズベクトル空間の一例を示す図である。また、図１１は、本発明の手法を用いて実データに基づき生成したパターンモデルの評価結果を示す図である。 Here, FIG. 7 is a diagram illustrating an example of a two-dimensional low-order signal vector space for audio signal data. FIG. 8 is a diagram illustrating an example of the low-order signal vector space of FIG. 7 after region partitioning. FIG. 9 is a diagram illustrating an example of a two-dimensional low-order noise vector space for noise data. FIG. 10 is a diagram illustrating an example of the low-order noise vector space of FIG. 9 after region partitioning. FIG. 11 is a diagram showing the evaluation results of pattern models generated from real data using the method of the present invention.

以下、音声信号データに対する低次信号ベクトル空間および雑音データに対する低次ノイズベクトル空間が既に生成されているものとして、音声認識用のパターンモデルの生成処理および当該生成したパターンモデルの評価処理について具体的な動作を説明する。
パターン認識装置１００は、入力装置７４を介して、ユーザからの音声信号データに対するパターンモデルの生成指示が入力されると（ステップＳ１００の「Ｙｅｓ」の分岐）、当該生成指示に対応する音声データの低次信号ベクトル空間データを記憶装置７０から読み出し、当該読み出したデータをＲＡＭ６２の所定領域に格納すると共に（ステップＳ１０２）、同じく記憶装置７０から音声データに対する雑音データの低次ノイズベクトル空間データを読み出し、当該読み出したデータをＲＡＭ６２の所定領域に格納する（ステップＳ１０４）。本実施の形態において、音声データの低次信号ベクトル空間は、２次元の低次信号ベクトルから構成された空間であり、各低次信号ベクトルを座標点としてこれを可視化して表示すると、例えば、図７に示すように、各低次信号ベクトルが２次元空間上の座標点として平面表示される。ここで、図７に示す、低次信号ベクトル空間は、音声データとして、実際に１４５人の男性が日本語１７５単語を複数の発話様式で発声した音声データを用いて作成されており、この音声データから５６１個の特定話者に対する高次信号パターンモデルを生成し、この５６１個の高次信号パターンモデルから構成されるデータ空間を２次元の低次信号ベクトル空間に写像して得られたものである。 Hereinafter, it is assumed that the low-order signal vector space for the speech signal data and the low-order noise vector space for the noise data have already been generated. The operation will be described.
When a pattern model generation instruction for audio signal data is input by the user via the input device 74 (the “Yes” branch of step S100), the pattern recognition apparatus 100 reads the low-order signal vector space data of the speech data corresponding to the generation instruction from the storage device 70 and stores it in a predetermined area of the RAM 62 (step S102), and likewise reads the low-order noise vector space data of the noise data for that speech data from the storage device 70 and stores it in a predetermined area of the RAM 62 (step S104). In the present embodiment, the low-order signal vector space of the speech data is a space composed of two-dimensional low-order signal vectors. When each low-order signal vector is visualized as a coordinate point, the vectors are displayed as points on a plane, for example as shown in FIG. 7. The low-order signal vector space shown in FIG. 7 was created from speech data in which 145 men actually uttered 175 Japanese words in a plurality of speaking styles: 561 speaker-specific high-order signal pattern models were generated from this speech data, and the data space composed of these 561 high-order signal pattern models was mapped onto a two-dimensional low-order signal vector space.

低次ベクトル空間区分部２０においては、２次元の低次信号ベクトル空間データを、例えば、図８に示すように、当該低次信号ベクトル空間を構成する全低次信号ベクトルの重心を中心とし、この中心とそこから最も離れた位置にある低次信号ベクトルまでの距離を半径とする外円と、当該外円の半径よりも短い半径の１つの内円とにより区分し、更に、外円と内円との曲線間に形成される環状の領域を半径方向に伸びる線によって８つの領域に均等に区分する（ステップＳ１０６）。なお、図８は、１００人の男性が日本の１０００都市名を発声した音声データから作成された１００個の特定話者に対する高次信号パターンモデルから作成された低次信号ベクトル空間を区分した例である。 The low-order vector space partitioning unit 20 partitions the two-dimensional low-order signal vector space data, for example as shown in FIG. 8, by an outer circle, centered on the center of gravity of all the low-order signal vectors constituting the space and having as its radius the distance from that center to the low-order signal vector farthest from it, and by one inner circle having a shorter radius than the outer circle. It further divides the annular region formed between the outer and inner circles evenly into eight regions by lines extending in the radial direction (step S106). Note that FIG. 8 shows an example of partitioning a low-order signal vector space created from 100 speaker-specific high-order signal pattern models, which were in turn created from speech data in which 100 men uttered the names of 1000 Japanese cities.

一方、本実施の形態において、雑音データの低次ノイズベクトル空間は、音声データの低次信号ベクトル空間と同様に、２次元の低次ノイズベクトルから構成された空間であり、各低次ノイズベクトルを座標点としてこれを可視化して表示すると、図９に示すように、各低次ノイズベクトルが２次元空間上の座標点として平面表示される。ここで、図９に示す低次ノイズベクトル空間は、風呂の浴槽のお湯が泡立つ音、ペンで机を叩く音などを含む２０００種類の雑音データから実際に高次ノイズパターンモデルを生成し、当該生成した高次ノイズパターンモデルによって構成されるデータ空間を２次元の低次ノイズベクトル空間に写像して得られたものである。 On the other hand, in the present embodiment, the low-order noise vector space of the noise data is a space composed of a two-dimensional low-order noise vector, like the low-order signal vector space of the voice data, and each low-order noise vector When this is visualized as a coordinate point and displayed, each low-order noise vector is planarly displayed as a coordinate point in a two-dimensional space, as shown in FIG. Here, the low-order noise vector space shown in FIG. 9 actually generates a high-order noise pattern model from 2000 kinds of noise data including the sound of hot water in a bath tub and the sound of hitting a desk with a pen. This is obtained by mapping a data space constituted by the generated higher-order noise pattern model to a two-dimensional lower-order noise vector space.

低次ベクトル空間区分部２０においては、図１０に示すように、当該２次元の低次ノイズベクトル空間を構成する全低次ノイズベクトルの重心を中心とし、この中心とそこから最も離れた位置にある低次ノイズベクトルまでの距離を半径とする外円と、当該外円の半径よりも短い半径の１つの内円とにより区分し、更に、外円と内円との曲線間に形成される環状の領域を半径方向に伸びる線によって８つの領域に均等に区分する（ステップＳ１０６）。 As shown in FIG. 10, the low-order vector space partitioning unit 20 partitions the two-dimensional low-order noise vector space by an outer circle, centered on the center of gravity of all the low-order noise vectors constituting the space and having as its radius the distance from that center to the low-order noise vector farthest from it, and by one inner circle having a shorter radius than the outer circle. It further divides the annular region formed between the outer and inner circles evenly into eight regions by lines extending in the radial direction (step S106).

なお、低次信号ベクトル空間および低次ノイズベクトル空間は共に、重心から最も離れた位置にある低次ベクトルの重心からの距離を１とし、これに対して内円の半径を重心からの距離の比が０．６となる位置とした。
ここでは、自動選択モードが設定されているとして、パターンモデル生成用の音声信号データおよびノイズデータが予め設定された選択方法に従って取得される場合を説明する。 In both the low-order signal vector space and the low-order noise vector space, the distance from the center of gravity of the low-order vector that is farthest from the center of gravity is 1, and the radius of the inner circle is the distance from the center of gravity. The position was such that the ratio was 0.6.
Here, a case will be described in which the audio signal data and noise data for pattern model generation are acquired according to a preset selection method, assuming that the automatic selection mode is set.
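The partitioning described for step S106 (center of gravity as the center, outer radius normalized to 1, inner circle at a ratio of 0.6, ring split evenly into eight sectors) can be sketched as below. The function name `partition_ring`, the angle at which the first sector starts, the treatment of points lying exactly on the inner circle, and the label -1 for vectors inside the inner circle are assumptions made for this example.

```python
import numpy as np

def partition_ring(vectors, inner_ratio=0.6, n_sectors=8):
    """Label 2-D low-order vectors by ring sector; -1 means inside the inner circle."""
    vectors = np.asarray(vectors, dtype=float)
    center = vectors.mean(axis=0)                 # center of gravity
    rel = vectors - center
    radius = np.hypot(rel[:, 0], rel[:, 1])
    ratio = radius / radius.max()                 # farthest vector has ratio 1
    angle = np.arctan2(rel[:, 1], rel[:, 0]) % (2 * np.pi)
    width = 2 * np.pi / n_sectors
    # Assumption: sector 0 starts at angle 0 and indices increase counterclockwise.
    sector = np.minimum((angle / width).astype(int), n_sectors - 1)
    # Assumption: points exactly on the inner circle count as part of the ring.
    region = np.where(ratio >= inner_ratio, sector, -1)
    return ratio, region

# Hypothetical data: eight vectors on the unit circle plus two near the center.
theta = np.pi / 8 + np.arange(8) * np.pi / 4
ring = np.column_stack([np.cos(theta), np.sin(theta)])
inner = np.array([[0.2, 0.1], [-0.2, -0.1]])
ratio, region = partition_ring(np.vstack([ring, inner]))
```

Here `region` holds one sector index per ring vector, and the two central vectors are marked -1 because their distance ratio is below 0.6.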

本実施の形態においては、環状の領域を区分してなる８つの領域における各領域の中央付近に位置する男女それぞれ１つずつの低次信号ベクトルに対応する話者を評価用の話者として選択する設定となっている。また、生成用の音声信号データは、この男女１６人の評価用話者を除いた８つの領域に属する全ての低次信号ベクトルに対応した話者に対する音声信号データを選択する設定となっている。つまり、重心からの距離の比が０．６〜１．０の範囲にある環状の領域から低次信号ベクトルが選択されることになる。このように環状の領域から低次信号ベクトルを選択する理由は、不特定話者の音声認識において、一般に重心から遠い位置にある低次ベクトルに対応するパターンモデルは認識性能が低く、逆に、重心に近い位置にある低次ベクトルに対応するパターンモデルほど認識性能が高くなる傾向にあることから、重心から遠い位置にある比較的認識性能が低くなるパターンモデルを選択するためである。 In this embodiment, a speaker corresponding to one lower-order signal vector for each man and woman located near the center of each region in the eight regions obtained by dividing the annular region is selected as a speaker for evaluation. It is set to be. The generation voice signal data is set to select voice signal data for speakers corresponding to all the low-order signal vectors belonging to the eight regions except for the 16 male and female evaluation speakers. . That is, a low-order signal vector is selected from an annular region in which the ratio of the distance from the center of gravity is in the range of 0.6 to 1.0. The reason for selecting the low-order signal vector from the annular region in this way is that, in speech recognition of an unspecified speaker, a pattern model generally corresponding to a low-order vector located far from the center of gravity has low recognition performance. This is because a pattern model corresponding to a low-order vector located near the center of gravity tends to have a higher recognition performance, so that a pattern model that is relatively far from the center of gravity and has a relatively low recognition performance is selected.

一方、本実施の形態においては、図１０に示すように、低次ノイズベクトル空間の環状の領域を区分してなる８つの領域を、それぞれ、パターンモデル生成用の領域である「ＴＲ＿１、ＴＲ＿２、ＴＲ＿３、ＴＲ＿４」の４つの領域と、評価用の領域である「ＥＶ＿１＿２、ＥＶ＿２＿３、ＥＶ＿３＿４、ＥＶ＿４＿１」の４つの領域とに分けている。そして、パターンモデル生成用のノイズデータとしては、「ＴＲ＿１、ＴＲ＿２、ＴＲ＿３、ＴＲ＿４」の各ゾーンに属する全低次ノイズベクトルに対応するノイズデータを選択する設定となっている。つまり、低次信号ベクトルと同様に、重心からの距離の比が０．６〜１．０の範囲にある環状の領域から低次ノイズベクトルが選択されることになる。なお、環状の領域から選択する理由は、上記音声信号データの選択理由と同じである。 On the other hand, in the present embodiment, as shown in FIG. 10, eight regions obtained by partitioning the annular region of the low-order noise vector space are divided into “TR_1, TR_2, It is divided into four regions of “TR_3, TR_4” and four regions of “EV_1_2, EV_2_3, EV_3_4, EV_4_1” which are evaluation regions. The noise data for pattern model generation is set to select noise data corresponding to all low-order noise vectors belonging to each zone of “TR_1, TR_2, TR_3, TR_4”. That is, similarly to the low-order signal vector, the low-order noise vector is selected from an annular region in which the ratio of the distance from the center of gravity is in the range of 0.6 to 1.0. The reason for selecting from the annular area is the same as the reason for selecting the audio signal data.

従って、図８に例示する低次信号ベクトル空間からは、上記環状の領域における８つの区分領域を選択し、当該選択した８つの区分領域に属する上記男性８人、女性８人の合計１６人の評価用話者を除いた残りの全ての話者に対する音声信号データを記憶装置７０から読み出してＲＡＭ６２の所定領域に格納する。ここで、音声データは、前述したようにノイズの混合されていないクリーンな音声データとなっている。更に、図１０に示す低次ノイズベクトル空間からは、上記環状の領域の８つの区分領域におけるパターンモデル生成用の領域である「ＴＲ＿１、ＴＲ＿２、ＴＲ＿３、ＴＲ＿４」の４つの領域を選択し、これら４つの領域に属する全ての低次ノイズベクトルに対応するノイズデータを記憶装置７０から読み出してＲＡＭ６２の所定領域に格納する（ステップＳ１２２）。 Accordingly, from the low-order signal vector space illustrated in FIG. 8, the eight partitioned regions of the annular region are selected, and the audio signal data for all speakers belonging to these eight regions, except the 16 evaluation speakers (eight men and eight women), is read from the storage device 70 and stored in a predetermined area of the RAM 62. Here, as described above, the speech data is clean speech data with no noise mixed in. Furthermore, from the low-order noise vector space shown in FIG. 10, the four regions “TR_1, TR_2, TR_3, TR_4”, which are the regions for pattern model generation among the eight partitioned regions of the annular region, are selected, and the noise data corresponding to all the low-order noise vectors belonging to these four regions is read from the storage device 70 and stored in a predetermined area of the RAM 62 (step S122).
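The automatic selection of step S122 amounts to filtering vectors by region label and removing the evaluation items. The following sketch assumes that region labels are already available as integers (0 to 7 for the ring sectors, -1 for the inner circle); the function name `select_training`, the sample labels, and the pairing of even sector indices with TR_1 to TR_4 are hypothetical, since the actual layout is defined by FIG. 10.

```python
import numpy as np

def select_training(region, train_regions, excluded=()):
    """Indices whose region label is in train_regions, minus the excluded indices."""
    region = np.asarray(region)
    mask = np.isin(region, list(train_regions))
    mask[np.asarray(list(excluded), dtype=int)] = False
    return np.flatnonzero(mask)

# Hypothetical noise labels: sector index inside the ring, -1 for the inner circle.
noise_region = np.array([0, 1, 2, 3, 4, 5, 6, 7, -1, 0, 2])
# Assumption: even sectors stand in for TR_1..TR_4, odd sectors for the EV_* regions.
train_noise = select_training(noise_region, train_regions={0, 2, 4, 6})

# Hypothetical speaker labels; speakers 1 and 5 play the role of evaluation speakers.
speaker_region = np.array([0, 0, 1, -1, 2, 2, 3])
train_speakers = select_training(speaker_region, range(8), excluded=[1, 5])
```

The returned index arrays would then be used to look up the corresponding noise data and speech data in storage.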

When the audio signal data and noise data for pattern model generation have been selected and acquired in this way, the noise mixed signal data generation unit 24 concatenates the acquired noise data and mixes the concatenated noise into the acquired clean audio data at a predetermined SNR (for example, 20 dB) to generate noise mixed signal data (step S116).
When the noise mixed signal data has been generated, the pattern model generation unit 26 uses each piece of noise mixed signal data as training data and trains an initial-state HMM prepared for speech recognition with the known EM algorithm, generating a speaker-independent pattern model for speech recognition (step S118). In other words, because the pattern model for speech recognition is generated from audio signal data and noise data corresponding to pattern models with low recognition performance, the resulting pattern model is specialized for the noise mixed signal data that lowers recognition performance.
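The mixing step (concatenate the noise, then add it to clean speech at a fixed SNR) can be illustrated as follows. This is a sketch under assumptions the text does not state: signals are plain lists of samples, the concatenated noise is looped or truncated to the length of the utterance, and the SNR is measured over the whole utterance. The names `mean_power` and `mix_at_snr` are hypothetical.

```python
import math


def mean_power(samples):
    """Average power of a sample sequence."""
    return sum(s * s for s in samples) / len(samples)


def mix_at_snr(speech, noise_clips, snr_db=20.0):
    """Concatenate noise clips and add them to speech at the given SNR (dB)."""
    noise = [s for clip in noise_clips for s in clip]  # concatenation
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")
    # Assumption: loop the noise so that it covers the whole utterance.
    repeats = -(-len(speech) // len(noise))  # ceiling division
    noise = (noise * repeats)[: len(speech)]
    noise_power = mean_power(noise)
    if noise_power == 0:
        raise ValueError("noise is silent; SNR is undefined")
    # Scale the noise so that 10*log10(P_speech / P_noise) equals snr_db.
    gain = math.sqrt(mean_power(speech) / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]
```

The gain follows directly from the definition of SNR in decibels: the scaled noise power must equal the speech power divided by 10^(SNR/10).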

Furthermore, in the present embodiment, the following seven types of noise mixed signal data are generated for the evaluation process described later, and the speech recognition pattern model generated above is retrained with the EM algorithm using each type as training data, producing seven pattern models for evaluation.
(1) Noise mixed signal data obtained by concatenating the noise data corresponding to all low-order noise vectors belonging to regions TR_1 and TR_2 and mixing it at an SNR of 20 dB into the audio signal data selected above for pattern model generation.

(2) Noise mixed signal data obtained by concatenating the noise data corresponding to all low-order noise vectors belonging to regions TR_2 and TR_3 and mixing it at an SNR of 20 dB into the audio signal data selected above for pattern model generation.
(3) Noise mixed signal data obtained by concatenating the noise data corresponding to all low-order noise vectors belonging to regions TR_3 and TR_4 and mixing it at an SNR of 20 dB into the audio signal data selected above for pattern model generation.
(4) Noise mixed signal data obtained by concatenating the noise data corresponding to all low-order noise vectors belonging to regions TR_4 and TR_1 and mixing it at an SNR of 20 dB into the audio signal data selected above for pattern model generation.

(5) Noise mixed signal data obtained by concatenating the noise data corresponding to all low-order noise vectors belonging to the four regions “TR_1, TR_2, TR_3, TR_4” (that is, excluding the four evaluation regions) and mixing it at an SNR of 20 dB into the audio signal data selected above for pattern model generation.
(6) Noise mixed signal data obtained by concatenating the noise data corresponding to all low-order noise vectors in the low-order noise vector space and mixing it at an SNR of 20 dB into the audio signal data selected above for pattern model generation.
(7) Noise mixed signal data obtained by mixing conventionally used exhibition noise at an SNR of 20 dB into the audio signal data selected above for pattern model generation.
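The seven training conditions can be written down as a small lookup table. The sketch below uses the model labels that FIG. 11 assigns to conditions (1) to (7); the dictionary layout, the function `noise_ids_for`, and the identifier `"exhibition"` are hypothetical and only serve to make the selection rule explicit.

```python
# Model label -> noise regions used for retraining, following items (1)-(7).
TRAIN_CONDITIONS = {
    "TR_1_2": {"TR_1", "TR_2"},                 # (1)
    "TR_2_3": {"TR_2", "TR_3"},                 # (2)
    "TR_3_4": {"TR_3", "TR_4"},                 # (3)
    "TR_4_1": {"TR_4", "TR_1"},                 # (4)
    "Peri": {"TR_1", "TR_2", "TR_3", "TR_4"},   # (5)
    "All": None,                                # (6) every low-order noise vector
    "Base": None,                               # (7) exhibition noise only
}


def noise_ids_for(condition, zone_of, exhibition_id="exhibition"):
    """Return the noise identifiers to concatenate for one condition.

    zone_of maps a noise identifier to its region label.
    """
    if condition == "Base":
        return [exhibition_id]
    zones = TRAIN_CONDITIONS[condition]
    if zones is None:  # "All": use the whole noise space
        return sorted(zone_of)
    return sorted(name for name, zone in zone_of.items() if zone in zones)
```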

The seven pattern models for speech recognition, generated by retraining with the noise mixed signal data (1) to (7) above as training data, are stored in a predetermined area of the storage device 70 by the pattern model storage unit 28 (step S120).
Next, the actual operation of the evaluation process for the seven generated speech recognition pattern models is described in detail.

When an instruction to evaluate the generated speech recognition pattern model is input by the user via the input device 74 (the “Yes” branch of step S300), the pattern recognition device 100 reads the low-order signal vector space data and low-order noise vector space data corresponding to this pattern model from the storage device 70 and stores them in a predetermined area of the RAM 62 (steps S302 and S304). It then partitions the low-order signal vector space data and low-order noise vector space data as shown in FIGS. 8 and 10, in the same way as in the pattern model generation process.
Here too, the automatic selection mode is set, and the audio data and noise data for pattern model evaluation are selected according to a preset selection method.

As described above, the settings select a total of 16 speakers, eight men and eight women, as evaluation speakers from the annular region illustrated in FIG. 8. The evaluation data selection unit 30 therefore selects the low-order signal vectors corresponding to the 16 men and women excluded when the pattern model was generated, reads the audio signal data corresponding to the selected low-order signal vectors from the storage device 70, and stores it in a predetermined area of the RAM 62.

Further, for each of the four evaluation regions “EV_1_2, EV_2_3, EV_3_4, EV_4_1”, the most centrally located low-order noise vector is selected, and the noise data corresponding to the four selected low-order noise vectors is chosen as the evaluation noise data to be mixed into the evaluation audio signal data for the seven speech recognition pattern models. In the example of FIG. 10, the low-order noise vectors for four types of noise are selected: a bicycle bell, a mobile phone ringtone, adhesive tape being cut, and a camera shutter. The noise data corresponding to the selected low-order noise vectors is read from the storage device 70 and stored in a predetermined area of the RAM 62 (step S322).
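Picking one representative noise per evaluation region could look like the sketch below. The text only says that the "most central" vector of each region is chosen; this example assumes that means the vector closest to the mean position of the vectors in that region, which is one plausible reading rather than a confirmed detail. The function name `most_central` is hypothetical.

```python
import math


def most_central(vectors_by_region):
    """For each region, return the name of the vector nearest the region mean.

    vectors_by_region maps a region label to a list of (name, (x, y)) pairs.
    """
    picks = {}
    for region, items in vectors_by_region.items():
        # Mean position of the vectors in this region (assumed notion of "center").
        cx = sum(xy[0] for _, xy in items) / len(items)
        cy = sum(xy[1] for _, xy in items) / len(items)
        name, _ = min(items, key=lambda it: math.hypot(it[1][0] - cx, it[1][1] - cy))
        picks[region] = name
    return picks
```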

Then, the evaluation mixed signal data generation unit 32 mixes each of the four selected and acquired types of noise data separately, at an SNR of 20 dB, into the selected and acquired audio signal data (audio signal data of the names of 1000 Japanese cities, each uttered by the 16 men and women), generating four groups of evaluation mixed signal data (step S316). In the present embodiment, the noise was mixed so that it falls within the speech interval of the city-name audio signal data.

When the four groups of evaluation mixed signal data have been generated in this way, the pattern model evaluation unit 34 runs speech recognition with each of the seven speech recognition pattern models, for each group, taking each piece of evaluation mixed signal data as the recognition target. Based on the recognition results for each pattern model, it calculates the recognition probability for each group of evaluation mixed signal data and, for each pattern model, the average of the recognition probabilities over the four types of evaluation noise data (step S318). The recognition probability for each evaluation noise and the average recognition probability calculated in this way for each of the seven pattern models are displayed by the information display unit 36, giving the evaluation results shown in FIG. 11.
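The tabulation behind FIG. 11 amounts to a per-model, per-noise accuracy plus a row average. A minimal sketch, assuming the recognizer output is available as (reference word, recognized word) pairs and that the recognition probability is the percentage of exact matches; the function name `recognition_rates` and the data layout are hypothetical.

```python
def recognition_rates(results):
    """Compute recognition rates (%) per model and evaluation noise.

    results[model][noise] is a list of (reference, hypothesis) word pairs.
    Returns table[model][noise] plus table[model]["Ave."].
    """
    table = {}
    for model, by_noise in results.items():
        rates = {
            noise: 100.0 * sum(ref == hyp for ref, hyp in pairs) / len(pairs)
            for noise, pairs in by_noise.items()
        }
        # Average over the evaluation noises, computed before adding the key.
        average = sum(rates.values()) / len(rates)
        rates["Ave."] = average
        table[model] = rates
    return table
```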

In FIG. 11, the “EV_” labels indicate the type of noise: EV_1_2 is the bicycle bell, EV_2_3 the mobile phone ringtone, EV_3_4 the sound of adhesive tape being cut, and EV_4_1 the camera shutter. “Base” denotes the pattern model corresponding to (7) above, “TR_1_2, TR_2_3, TR_3_4, TR_4_1” denote the pattern models corresponding to (1) to (4) above, respectively, “Peri” denotes the pattern model corresponding to (5) above, and “All” denotes the pattern model corresponding to (6) above. “Ave.” is the average recognition probability of each pattern model over the four types of evaluation mixed signal data.

As shown in FIG. 11, the four pattern models “TR_1_2, TR_2_3, TR_3_4, TR_4_1”, each corresponding to pattern model generation regions, all achieve higher speech recognition probabilities than the pattern model corresponding to “Base” on the evaluation mixed signal data for the noises selected from the four evaluation regions “EV_1_2, EV_2_3, EV_3_4, EV_4_1”. This indicates that a pattern model generated from signal data and noise data selected from the annular region in which the ratio of the distance from the center of gravity is in the range of 0.6 to 1.0 (that is, positions relatively far from the center of gravity) is more effective for speech recognition of the evaluation mixed signal data than a pattern model generated from noise mixed signal data containing conventional exhibition noise.

Further, as shown in FIG. 11, the pattern model corresponding to TR_1_2 has a higher recognition probability than the other pattern models on the evaluation mixed signal data corresponding to EV_1_2. Likewise, the pattern models corresponding to TR_3_4 and TR_4_1 show higher recognition probabilities than the other pattern models on the evaluation mixed signal data corresponding to EV_3_4 and EV_4_1, respectively. The pattern model corresponding to TR_2_3 also shows a relatively high recognition probability of 87.9% on the evaluation mixed signal data corresponding to EV_2_3. This indicates that, when recognizing evaluation mixed signal data corresponding to an evaluation region sandwiched between two pattern model generation regions, it is effective to use a pattern model generated with the noise data of those two regions.

Further, as shown in FIG. 11, on the evaluation mixed signal data corresponding to each of “EV_1_2, EV_2_3, EV_3_4, EV_4_1”, the pattern model corresponding to “Peri” has recognition probabilities that are roughly the average of those of the pattern models corresponding to “TR_1_2, TR_2_3, TR_3_4, TR_4_1”. In addition, the average recognition probability of the “Peri” pattern model (the “Ave.” column in FIG. 11) is higher than that of the “All” pattern model. This indicates that a pattern model generated from audio signal data and noise data corresponding to low-order vectors selected from the annular region in which the ratio of the distance from the center of gravity is in the range of 0.6 to 1.0 is more effective for speech recognition of the evaluation mixed signal data than a pattern model using all the noise data. It also shows that mixing every type of noise data into a single pattern model does not necessarily give good results.

Note that a pattern model generated from signal data and noise data corresponding to low-order vectors selected from the annular region in which the ratio of the distance from the center of gravity is in the range of 0.6 to 1.0 also has a relatively good recognition probability for noise mixed signal data generated from the audio signal data of speakers and the noise data corresponding to low-order vectors in the inner-circle region, where the ratio is less than 0.6. Therefore, generating a pattern model from audio signal data and noise data corresponding to low-order vectors selected from an annular region at least a predetermined distance from the center of gravity yields a pattern model with good recognition performance overall. The reason for this is explained below with reference to FIGS. 12 to 15.

FIG. 12 shows an example of speaking styles. FIG. 13 shows an example of a visualized low-order vector space of pattern models for speech recognition. FIG. 14 shows an example of partitioning the low-order vector space of FIG. 13. FIG. 15 shows the recognition probabilities of the pattern models generated based on the partitioning of FIG. 14. FIGS. 12 to 15 were generated from actual experiments.

First, speech data was recorded from 145 male speakers, each of whom uttered multiple word lists of 175 words in multiple speaking styles as shown in FIG. 12. An HMM was trained on all of the recorded speech data to generate a male speaker-independent pattern model for speech recognition having high-dimensional elements. Next, using this pattern model as the initial model, a pattern model was generated for each combination of speaker and the speaking style instructed at recording time by training the initial model on that speaker's speech data through concatenated training. These pattern models are hereinafter referred to as speaker/speaking-style acoustic models.

Then, the data space composed of these high-dimensional speaker/speaking-style acoustic models was mapped to a data space composed of two-dimensional vectors using the Sammon method, as above, and each low-order vector was displayed as a coordinate point in two-dimensional space. The result is shown in FIG. 13, which also shows the axes of speaking rate and speaking volume identified by visual inspection. As above, the visualized low-order vector space was partitioned by an outer circle, whose radius is the distance from the center of gravity to the farthest low-order vector, and one inner circle of smaller radius. The annular region formed between these two concentric circles was further divided into four zones by lines extending in the radial direction. The result is shown in FIG. 14: the low-order vectors of FIG. 13 are divided into five regions, zone 1 to zone 5.
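The Sammon method searches for low-dimensional coordinates that keep the pairwise distances of the original models as intact as possible, by minimizing the Sammon stress. The sketch below only evaluates that stress for a given 2-D layout. It assumes a precomputed symmetric matrix of distances between the high-order models (how that distance is defined is not covered in this passage), and the function name `sammon_stress` is hypothetical.

```python
import math


def sammon_stress(high_dist, low_points):
    """Sammon stress of a low-dimensional layout.

    high_dist[i][j] is the distance between high-order models i and j;
    low_points[i] is the mapped low-order vector of model i.
    """
    n = len(low_points)
    total = 0.0   # sum of the original distances (normalizer)
    error = 0.0   # sum of squared distance errors, weighted by 1/original
    for i in range(n):
        for j in range(i + 1, n):
            d_high = high_dist[i][j]
            if d_high == 0:
                continue  # identical models carry no distance information
            d_low = math.dist(low_points[i], low_points[j])
            total += d_high
            error += (d_high - d_low) ** 2 / d_high
    return error / total if total else 0.0
```

A stress of 0 means the layout reproduces every pairwise distance exactly; the division by the original distance gives small distances more weight, which tends to preserve local neighborhoods.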

Once the low-order vector space of the speaker/speaking-style acoustic models had been divided into five zones, a total of seven pattern models were generated: five pattern models corresponding to the individual zones 1 to 5, one corresponding to zones 2 to 5 combined, and one corresponding to all zones. Each was generated from the speech data (excluding evaluation data) used to create the speaker/speaking-style acoustic models whose low-order vectors belong to the zones in question. These pattern models are hereinafter referred to as zone acoustic models. The zone acoustic models were created in the same way as the speaker/speaking-style acoustic models.

Once the zone acoustic models had been generated, word recognition was performed with the seven zone acoustic models on the speech data corresponding to the low-order vectors labeled as evaluation data in the low-order vector space of FIG. 14 (1-A to 5-D in the figure), and the recognition probabilities were evaluated. The evaluation data was chosen so that the distances from the center of gravity of the low-order vector space varied. Exhibition noise was mixed into the evaluation data at an SNR of 20 dB. The word recognition results are shown in FIG. 15.

Close examination of FIGS. 13 to 15 reveals the following seven points, A to G.
A. Zone 1 is occupied almost entirely by plots of the normal speaking style (◇).
B. Zone 2 is dominated by plots of the soft-voice speaking style (△).
C. Zone 3, dominated by mora-emphasis (×) plots, is separated from the other speaking styles by a pronounced gap.
D. Zone 4 is dominated by loud (□) and Lombard (■) plots and lies on the opposite side of zone 1 from zone 2, which is dominated by soft-voice (△) plots.
E. Zone 5, dominated by fast (○) and high-pitched (●) plots, lies on the opposite side of zone 1 from zone 3, which is dominated by mora-emphasis plots.
F. Recognition performance decreases toward the periphery of the low-order vector space.
G. The zone acoustic model for the combined zones 2 to 5 has almost the same recognition probability as the zone acoustic model for all zones combined.

In other words, because the partitioned low-order vector space has the properties described in A to F above and the zone acoustic model for the combined zones 2 to 5 has the property described in G, generating a pattern model from audio signal data and noise data corresponding to low-order vectors selected from an annular region at least a predetermined distance from the center of gravity yields a pattern model with good recognition performance overall.

The method of generating the low-order vector space, the method of visualizing it, and related details are described more fully in a paper published by the present inventors (M. Shozakai and G. Nagino, "Analysis of Speaking Styles by Two-Dimensional Visualization of Aggregate of Acoustic Models," Proc. ICSLP, vol. 1, pp. 717-720, Jeju, Korea, Oct. 2004).

As described above, the pattern recognition device 100 of the present invention handles two spaces. One is a two-dimensional (or three-dimensional) low-order signal vector space, obtained by mapping a data space composed of high-order signal pattern models containing high-dimensional elements to a data space composed of low-order signal vectors containing low-dimensional elements while approximating the distance relationships among the high-order signal pattern models. The other is a two-dimensional (or three-dimensional) low-order noise vector space, obtained in the same way from a data space composed of high-order noise pattern models. The device can partition each space by an outer circle (or outer sphere), whose radius is the distance from the center of gravity to the farthest low-order vector, and an inner circle (or inner sphere) of smaller radius, and can further divide the annular (or spherical-shell) region formed between the curves (or surfaces) of these concentric circles (or concentric spheres) into multiple regions by lines (or planes) extending in the radial direction. Further, taking the distance from the center to the farthest low-order vector as 1, it can select the signal data and noise data for generating a pattern model from the annular region in which the ratio to this distance is in the range of 0.6 to 1.0, and generate a pattern model for recognition using the selected signal data and noise data. Because signal data and noise data located at least a predetermined distance from the center have unique characteristics, it is therefore possible to generate a pattern model that gives uniformly good recognition probabilities for both ordinary data and such unique data.

The pattern recognition device 100 of the present invention can also partition the two-dimensional (or three-dimensional) low-order signal vector space and low-order noise vector space described above, each by an outer circle (or outer sphere) whose radius is the distance from the center of gravity to the farthest low-order vector and an inner circle (or inner sphere) of smaller radius, and can further divide the annular (or spherical-shell) region formed between the curves (or surfaces) of these concentric circles (or concentric spheres) into multiple regions by lines (or planes) extending in the radial direction. Further, taking the distance from the center to the farthest low-order vector as 1, it can select the signal data and noise data for evaluating a pattern model from the annular region in which the ratio to this distance is in the range of 0.6 to 1.0, and evaluate the pattern model for recognition using evaluation mixed signal data generated from the selected signal data and noise data. Signal data and noise data located at least a predetermined distance from the center have unique characteristics, and such unique data is a major factor in degrading the recognition performance of a pattern model. Evaluating the pattern model with this data therefore makes it possible to assess the minimum performance of the pattern model simply and systematically.

In the above embodiment, the data storage unit 10 corresponds to the signal data storage means and the noise data storage means of claim 1 or 5, and the low-order vector space storage unit 12 corresponds to the low-order signal vector space storage means and the low-order noise vector space storage means of claim 1 or 5.
The high-order pattern model generation unit 14 corresponds to the high-order signal pattern model generation means and the high-order noise pattern model generation means of claim 4, and the low-order vector space generation unit 18 corresponds to the low-order signal vector space generation means and the low-order noise vector space generation means of claim 4.

The low-order vector space partitioning unit 20 corresponds to the low-order signal vector space partitioning means and the low-order noise vector space partitioning means of claim 3 or 6. The learning data selection unit 22 corresponds to the low-order signal vector selection means and the low-order noise vector selection means of any one of claims 1, 3, 9, and 11. The noise mixed signal data generation unit 24 corresponds to the noise mixed signal data generation means of any one of claims 1, 9, and 11, and the pattern model generation unit 26 corresponds to the pattern model generation means of any one of claims 1, 9, and 11.

The evaluation data selection unit 30 corresponds to the evaluation signal vector selection means and the evaluation noise vector selection means of any one of claims 5, 6, and 10. The evaluation mixed signal data generation unit 32 corresponds to the evaluation mixed signal data generation means of any one of claims 5, 6, and 10, and the pattern model evaluation unit 34 corresponds to the performance evaluation means of claim 5 or 10.
The data acquisition unit 38 corresponds to the data acquisition means of claim 7 or 11, and the pattern recognition processing unit 40 corresponds to the pattern recognition means of claim 7 or 11.

The information display unit 36 corresponds to the evaluation result output means of claim 5 or 10 and to the recognition result output means of claim 7 or 11.
In the above embodiment, an example was described in which, with the evaluation process in mind, a pattern model for speech recognition is generated using part of the signal data and part of the noise data in the annular portion. However, the invention is not limited to this, and the pattern model may be generated using all the audio signal data and all the noise data corresponding to the low-order vectors belonging to the annular region.

In the above embodiment, a single apparatus provides the function of generating a pattern model, the function of evaluating a pattern model, and the function of performing pattern recognition. However, the invention is not limited to this configuration. The functions may be separated: a pattern model generation device may be configured with only the functions needed to generate a pattern model; a pattern model evaluation device may be configured with only the functions needed to evaluate a pattern model; and a pattern recognition device may be configured from the pattern model generated by the pattern model generation device together with the functions needed for pattern recognition. The functions may also be combined freely to configure various devices, such as a device having the pattern model generation and evaluation functions, or a device having the pattern model evaluation and pattern recognition functions.

The above embodiment described the case in which a low-order signal vector for evaluation is selected from among those low-order signal vectors, out of the plurality constituting the low-order signal vector space, that are located at a predetermined distance or more from the center of gravity of all the low-order signal vectors, and the case in which a low-order noise vector for evaluation is selected from among those low-order noise vectors, out of the plurality constituting the low-order noise vector space, that are located at a predetermined distance or more from the center of gravity of all the low-order noise vectors. However, the invention is not limited to this. As shown in FIG. 16, when the space is visualized in two (or three) dimensions, the plurality of annular (or spherical) regions formed between the outer circle (or outer sphere) and the plurality of inner circles (or inner spheres) are divided into a plurality of regions by lines (or planes) extending in the radial direction, and the low-order signal vector for evaluation or the low-order noise vector for evaluation may be selected from the neighborhood of the vertices of each of those regions.
In this case, evaluation is not limited to the minimum recognition performance in the real environment; recognition performance can be evaluated comprehensively, including its maximum in the real environment. Specifically, signal data corresponding to an evaluation low-order signal vector located nearer the center of gravity of all the low-order signal vectors yields a higher recognition probability, and signal data corresponding to an evaluation low-order signal vector located nearer the curve (or curved surface) of the outer circle (or outer sphere) yields lower recognition performance. Likewise, noise data corresponding to an evaluation low-order noise vector located nearer the center of gravity of all the low-order noise vectors yields a higher recognition probability, and noise data corresponding to an evaluation low-order noise vector located nearer the curve (or curved surface) of the outer circle (or outer sphere) yields a lower recognition probability.
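The vertex-based selection described above can be sketched for the two-dimensional case as follows. This is only an illustrative sketch under stated assumptions: the circles are assumed to be equally spaced in radius, every ring is assumed to be cut by the same radial lines, and the "neighborhood" of a vertex is reduced to the single nearest vector. The names `region_vertices` and `nearest_index` and the sample data are hypothetical.

```python
import math

def region_vertices(center, outer_radius, n_inner, n_sectors):
    """Points where the concentric circles meet the radial dividing lines.

    Assumption: n_inner inner circles plus the outer circle are equally
    spaced in radius, and n_sectors radial lines are equally spaced in angle.
    """
    cx, cy = center
    radii = [outer_radius * k / (n_inner + 1) for k in range(1, n_inner + 2)]
    angles = [2 * math.pi * s / n_sectors for s in range(n_sectors)]
    return [(cx + r * math.cos(a), cy + r * math.sin(a))
            for r in radii for a in angles]

def nearest_index(points, target):
    """Index of the point closest to target (Euclidean distance)."""
    return min(range(len(points)), key=lambda i: math.dist(points[i], target))

# Hypothetical low-order vectors and partition parameters.
vectors = [(0.0, 0.0), (1.9, 0.1), (0.0, 1.1)]
vertices = region_vertices(center=(0.0, 0.0), outer_radius=2.0,
                           n_inner=1, n_sectors=4)
# One evaluation vector per vertex; duplicates are collapsed.
evaluation_ids = sorted({nearest_index(vectors, v) for v in vertices})
```

Because the vertices cover both the inner and the outer circles, the selected evaluation vectors span positions near the center of gravity as well as positions near the outer circle.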

In the above embodiment, the low-order signal vector space generation means and the low-order noise vector space generation means require a large amount of processing and storage capacity. However, the low-order signal vector space and the low-order noise vector space are generated independently of each other; the data for a low-order signal vector selected from the plurality of low-order signal vectors constituting the low-order signal vector space and for a low-order noise vector selected from the plurality of low-order noise vectors constituting the low-order noise vector space are mixed at a predetermined SNR to generate noise mixed signal data; and a probability model can be learned using the noise mixed signal data as learning data. Therefore, even when the number of types of signal data increases and the low-order signal vector space must be regenerated, the low-order noise vector space need not be regenerated. Conversely, it goes without saying that even when the number of types of noise data increases and the low-order noise vector space must be regenerated, the low-order signal vector space need not be regenerated.
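Mixing at a predetermined SNR can be illustrated with the sketch below. The embodiment does not state how the SNR is computed, so this assumes the conventional definition, 10·log10 of the ratio of mean signal power to mean noise power, and equal-length sample sequences. The function name `mix_at_snr` and the sample values are hypothetical.

```python
import math

def mix_at_snr(signal, noise, snr_db):
    """Add noise to signal after scaling the noise to reach snr_db.

    Assumption: SNR is the ratio of mean signal power to mean noise
    power, expressed in decibels.
    """
    if len(signal) != len(noise):
        raise ValueError("signal and noise must have the same length")
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_noise == 0:
        raise ValueError("noise has zero power")
    # Gain that makes p_signal / (gain**2 * p_noise) equal 10**(snr_db / 10).
    gain = math.sqrt(p_signal / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(signal, noise)]

# Hypothetical sample sequences standing in for signal data and noise data.
signal = [1.0, -1.0, 1.0, -1.0]
noise = [0.5, 0.5, -0.5, -0.5]
mixed = mix_at_snr(signal, noise, snr_db=20.0)
```

Since the signal data and the noise data are kept separate until this step, either side can be replaced or extended without touching the other, which mirrors the independence of the two low-order vector spaces described above.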

Through the above embodiment, the present invention makes it possible to solve the problem of the prior art of Patent Documents 1 to 3, namely that as the scale of the noise clusters grows, creating a set of noise-mixed speech HMMs and selecting the optimal noise-mixed speech HMM from that set take a great deal of time and labor.
Furthermore, through the above embodiment, the present invention discloses a systematic technique, which the prior art could not achieve, for deriving combinations of learning data groups and evaluation data groups that cause the recognition probability to decrease.

FIG. 1 is a block diagram showing the functional configuration of the pattern recognition apparatus according to the present invention.
FIG. 2 is a diagram showing the hardware configuration of the pattern recognition apparatus 100.
FIG. 3 is a flowchart showing an example of the process of generating a pattern model for pattern recognition.
FIG. 4 is a flowchart showing an example of the process of generating low-order vector space data.
FIG. 5 is a flowchart showing an example of the process of evaluating a pattern model for pattern recognition.
FIG. 6 is a flowchart showing an example of the pattern recognition process.
FIG. 7 is a diagram showing an example of a two-dimensional low-order signal vector space for speech signal data.
FIG. 8 is a diagram showing an example of the low-order signal vector space of FIG. 7 after region partitioning.
FIG. 9 is a diagram showing an example of a two-dimensional low-order noise vector space for noise data.
FIG. 10 is a diagram showing an example of the low-order noise vector space of FIG. 9 after region partitioning.
FIG. 11 is a diagram showing evaluation results for a pattern model generated from real data using the technique of the present invention.
FIG. 12 is a diagram showing an example of speaking styles.
FIG. 13 is a diagram showing an example of a visualized low-order vector space of pattern models for speech recognition.
FIG. 14 is a diagram showing an example of partitioning the low-order vector space of FIG. 13.
FIG. 15 is a diagram showing the recognition probabilities for the pattern models generated based on the partitioning of FIG. 14.
FIG. 16 is a diagram showing, for two-dimensional visualization, the vertices of each region obtained when the plurality of annular regions formed between the outer circle and the plurality of inner circles are divided by lines extending in the radial direction.

Explanation of symbols

100 Pattern recognition apparatus
10 Data storage unit
12 Low-order vector space storage unit
14 High-order pattern model generation unit
16 High-order pattern model storage unit
18 Low-order vector space generation unit
20 Low-order vector space partitioning unit
22 Learning data selection unit
24 Noise mixed signal data generation unit
26 Pattern model generation unit
28 Pattern model storage unit
30 Evaluation data selection unit
32 Evaluation mixed signal data generation unit
34 Pattern model evaluation unit
36 Information display unit
38 Data acquisition unit
40 Pattern recognition processing unit

Claims

1. A pattern model generation device that generates a pattern model used for pattern recognition of recognition target data, comprising:
signal data storage means for storing a plurality of previously acquired, noise-unmixed signal data related to a plurality of objects;
low-order signal vector space storage means for storing a low-order signal vector space that is generated as a substitute for a data space composed of a plurality of higher-order signal pattern models, each composed of four-dimensional or higher-dimensional elements and generated based on the signal data, by mapping that data space to a data space composed of low-order signal vectors of three dimensions or less while approximating the positional relationship between the higher-order signal pattern models;
noise data storage means for storing a plurality of types of signal-unmixed noise data;
low-order noise vector space storage means for storing a low-order noise vector space that is generated as a substitute for a data space composed of a plurality of higher-order noise pattern models, each composed of four-dimensional or higher-dimensional elements and generated based on the noise data, by mapping that data space to a data space composed of low-order noise vectors of three dimensions or less while approximating the positional relationship between the higher-order noise pattern models;
low-order signal vector selection means for selecting a low-order signal vector for pattern model generation from among the low-order signal vectors located at a predetermined distance or more from the center of gravity of all the low-order signal vectors, out of the plurality of low-order signal vectors constituting the low-order signal vector space;
low-order noise vector selection means for selecting a low-order noise vector for pattern model generation from among the low-order noise vectors located at a predetermined distance or more from the center of gravity of all the low-order noise vectors, out of the plurality of low-order noise vectors constituting the low-order noise vector space;
noise mixed signal data generation means for generating, based on the signal data corresponding to the low-order signal vector selected by the low-order signal vector selection means and the noise data corresponding to the low-order noise vector selected by the low-order noise vector selection means, noise mixed signal data in which the noise data is mixed into each signal data; and
pattern model generation means for generating a pattern model by learning a probability model using the noise mixed signal data as learning data.

2. The pattern model generation apparatus according to claim 1, wherein the high-order signal pattern model, the high-order noise pattern model, and the pattern model are configured by an HMM (Hidden Markov Model).

3. The pattern model generation apparatus according to claim 1 or 2, wherein the low-order signal vector space and the low-order noise vector space are two-dimensional or three-dimensional data spaces, the apparatus further comprising:
low-order signal vector space partitioning means for partitioning the plurality of low-order signal vectors constituting the low-order signal vector space by dividing an outer circle or outer sphere, which is centered on the center of gravity of all the low-order signal vectors and whose radius is the distance between the center of gravity and the low-order signal vector farthest from it, with n inner circles or inner spheres (n is an integer of 1 or more) that are centered on the center of gravity and have a smaller radius than the outer circle or outer sphere, and by further dividing the annular or spherical regions formed between the curves or curved surfaces of the resulting concentric circles or concentric spheres into a plurality of regions by lines or planes extending in the radial direction;
low-order noise vector space partitioning means for partitioning the plurality of low-order noise vectors constituting the low-order noise vector space by dividing an outer circle or outer sphere, which is centered on the center of gravity of all the low-order noise vectors and whose radius is the distance between the center of gravity and the low-order noise vector farthest from it, with n inner circles or inner spheres (n is an integer of 1 or more) that are centered on the center of gravity and have a smaller radius than the outer circle or outer sphere, and by further dividing the annular or spherical regions formed between the curves or curved surfaces of the resulting concentric circles or concentric spheres into a plurality of regions by lines or planes extending in the radial direction;
low-order signal vector space display means for displaying each low-order signal vector constituting the low-order signal vector space together with the partitioning result; and
low-order noise vector space display means for displaying each low-order noise vector constituting the low-order noise vector space together with the partitioning result,
wherein the low-order signal vector selection means selects a low-order signal vector for pattern model generation from among the low-order signal vectors included, in the displayed low-order signal vector space, in the annular or spherical regions formed between the curves or curved surfaces of the concentric circles or concentric spheres consisting of the outer circle or outer sphere and an inner circle or inner sphere whose radius is equal to or greater than the predetermined distance, and
the low-order noise vector selection means selects a low-order noise vector for pattern model generation from among the low-order noise vectors included, in the displayed low-order noise vector space, in the annular or spherical regions formed between the curves or curved surfaces of the concentric circles or concentric spheres consisting of the outer circle or outer sphere and an inner circle or inner sphere whose radius is equal to or greater than the predetermined distance.

4. The pattern model generation device according to claim 1, further comprising:
high-order signal pattern model generation means for extracting a feature quantity from the signal data and generating the high-order signal pattern model based on the extracted feature quantity;
high-order noise pattern model generation means for extracting a feature quantity from the noise data and generating the high-order noise pattern model based on the extracted feature quantity;
low-order signal vector space generation means for generating the low-order signal vector space, as a substitute for the data space composed of the plurality of higher-order signal pattern models, by mapping that data space to a data space composed of low-order signal vectors of three dimensions or less while approximating the positional relationship between the higher-order signal pattern models; and
low-order noise vector space generation means for generating the low-order noise vector space, as a substitute for the data space composed of the plurality of higher-order noise pattern models, by mapping that data space to a data space composed of low-order noise vectors of three dimensions or less while approximating the positional relationship between the higher-order noise pattern models.

5. A pattern model evaluation device for evaluating the recognition performance of a pattern model used for pattern recognition of recognition target data, comprising:
signal data storage means for storing a plurality of previously acquired, noise-unmixed signal data related to a plurality of objects;
low-order signal vector space storage means for storing a low-order signal vector space that is generated as a substitute for a data space composed of a plurality of higher-order signal pattern models, each composed of four-dimensional or higher-dimensional elements and generated based on the signal data, by mapping that data space to a data space composed of low-order signal vectors of three dimensions or less while approximating the positional relationship between the higher-order signal pattern models;
noise data storage means for storing a plurality of types of signal-unmixed noise data;
low-order noise vector space storage means for storing a low-order noise vector space that is generated as a substitute for a data space composed of a plurality of higher-order noise pattern models, each composed of four-dimensional or higher-dimensional elements and generated based on the noise data, by mapping that data space to a data space composed of low-order noise vectors of three dimensions or less while approximating the positional relationship between the higher-order noise pattern models;
evaluation signal vector selection means for selecting a low-order signal vector for evaluation from among the low-order signal vectors that are located at a predetermined distance or more from the center of gravity of all the low-order signal vectors, or that are distributed evenly around the center of gravity, out of the plurality of low-order signal vectors constituting the low-order signal vector space;
evaluation noise vector selection means for selecting a low-order noise vector for evaluation from among the low-order noise vectors that are located at a predetermined distance or more from the center of gravity of all the low-order noise vectors, or that are distributed evenly around the center of gravity, out of the plurality of low-order noise vectors constituting the low-order noise vector space;
evaluation mixed signal data generation means for generating, based on the signal data corresponding to the low-order signal vector selected by the evaluation signal vector selection means and the noise data corresponding to the low-order noise vector selected by the evaluation noise vector selection means, evaluation mixed signal data in which the noise data is mixed into each signal data;
performance evaluation means for evaluating the performance of the pattern model based on the evaluation mixed signal data generated by the evaluation mixed signal data generation means and on the pattern model; and
evaluation result output means for outputting an evaluation result of the performance evaluation means.

6. The pattern model evaluation apparatus according to claim 5, wherein the low-order signal vector space and the low-order noise vector space are two-dimensional or three-dimensional data spaces, the apparatus further comprising:
low-order signal vector space partitioning means for partitioning the plurality of low-order signal vectors constituting the low-order signal vector space by dividing an outer circle or outer sphere, which is centered on the center of gravity of all the low-order signal vectors and whose radius is the distance between the center of gravity and the low-order signal vector farthest from it, with n inner circles or inner spheres (n is an integer of 1 or more) that are centered on the center of gravity and have a smaller radius than the outer circle or outer sphere, and by further dividing the annular or spherical regions formed between the curves or curved surfaces of the resulting concentric circles or concentric spheres into a plurality of regions by lines or planes extending in the radial direction;
low-order noise vector space partitioning means for partitioning the plurality of low-order noise vectors constituting the low-order noise vector space by dividing an outer circle or outer sphere, which is centered on the center of gravity of all the low-order noise vectors and whose radius is the distance between the center of gravity and the low-order noise vector farthest from it, with n inner circles or inner spheres (n is an integer of 1 or more) that are centered on the center of gravity and have a smaller radius than the outer circle or outer sphere, and by further dividing the annular or spherical regions formed between the curves or curved surfaces of the resulting concentric circles or concentric spheres into a plurality of regions by lines or planes extending in the radial direction;
low-order signal vector space display means for displaying each low-order signal vector constituting the low-order signal vector space together with the partitioning result; and
low-order noise vector space display means for displaying each low-order noise vector constituting the low-order noise vector space together with the partitioning result,
wherein the evaluation signal vector selection means and the evaluation noise vector selection means select a low-order signal vector and a low-order noise vector for evaluation from each of the regions into which the annular or spherical regions of the displayed low-order signal vector space and low-order noise vector space are divided by the lines or planes extending in the radial direction, and
the evaluation mixed signal data generation means generates, based on the signal data and the noise data respectively corresponding to the low-order signal vector and the low-order noise vector selected by the evaluation signal vector selection means and the evaluation noise vector selection means, evaluation mixed signal data in which each noise data is mixed into each signal data.

7. A pattern recognition device comprising:
the pattern model generation device according to any one of claims 1 to 4;
data acquisition means for acquiring recognition target data;
pattern recognition means for performing pattern recognition of the recognition target data acquired by the data acquisition means, based on that data and on the pattern model generated by the pattern model generation device; and
recognition result output means for outputting a recognition result of the pattern recognition means.

8. The pattern recognition apparatus according to claim 7, comprising the pattern model evaluation apparatus according to claim 5.

9. A pattern model generation program for generating a pattern model used for pattern recognition of recognition target data, the program causing a computer to function as:
low-order signal vector selection means for selecting a low-order signal vector for pattern model generation from among the low-order signal vectors located at a predetermined distance or more from the center of gravity of all the low-order signal vectors, out of a plurality of low-order signal vectors constituting a low-order signal vector space that is generated as a substitute for a data space composed of a plurality of higher-order signal pattern models, each composed of four-dimensional or higher-dimensional elements and generated based on a plurality of noise-unmixed signal data related to a plurality of objects, by mapping that data space to a data space composed of low-order signal vectors of three dimensions or less while approximating the positional relationship between the higher-order signal pattern models;
low-order noise vector selection means for selecting a low-order noise vector for pattern model generation from among the low-order noise vectors located at a predetermined distance or more from the center of gravity of all the low-order noise vectors, out of a plurality of low-order noise vectors constituting a low-order noise vector space that is generated as a substitute for a data space composed of a plurality of higher-order noise pattern models, each composed of four-dimensional or higher-dimensional elements and generated based on a plurality of types of signal-unmixed noise data, by mapping that data space to a data space composed of low-order noise vectors of three dimensions or less while approximating the positional relationship between the higher-order noise pattern models;
noise mixed signal data generation means for generating, based on the signal data corresponding to the low-order signal vector selected by the low-order signal vector selection means and the noise data corresponding to the low-order noise vector selected by the low-order noise vector selection means, noise mixed signal data in which the noise data is mixed into each signal data; and
pattern model generation means for generating a pattern model by learning a probability model using the noise mixed signal data as learning data.

10. A pattern model evaluation program for evaluating the recognition performance of a pattern model used for pattern recognition of recognition target data, the program causing a computer to function as:
evaluation signal vector selection means for selecting a low-order signal vector for evaluation from among the low-order signal vectors that are located at a predetermined distance or more from the center of gravity of all the low-order signal vectors, or that are distributed evenly around the center of gravity, out of a plurality of low-order signal vectors constituting a low-order signal vector space that is generated as a substitute for a data space composed of a plurality of higher-order signal pattern models, each composed of four-dimensional or higher-dimensional elements and generated based on a plurality of noise-unmixed signal data related to a plurality of objects, by mapping that data space to a data space composed of low-order signal vectors of three dimensions or less while approximating the positional relationship between the higher-order signal pattern models;
evaluation noise vector selection means for selecting a low-order noise vector for evaluation from among the low-order noise vectors that are located at a predetermined distance or more from the center of gravity of all the low-order noise vectors, or that are distributed evenly around the center of gravity, out of a plurality of low-order noise vectors constituting a low-order noise vector space that is generated as a substitute for a data space composed of a plurality of higher-order noise pattern models, each composed of four-dimensional or higher-dimensional elements and generated based on a plurality of types of signal-unmixed noise data, by mapping that data space to a data space composed of low-order noise vectors of three dimensions or less while approximating the positional relationship between the higher-order noise pattern models;
evaluation mixed signal data generation means for generating, based on the signal data corresponding to the low-order signal vector selected by the evaluation signal vector selection means and the noise data corresponding to the low-order noise vector selected by the evaluation noise vector selection means, evaluation mixed signal data in which the noise data is mixed into each signal data;
performance evaluation means for evaluating the performance of the pattern model based on the evaluation mixed signal data generated by the evaluation mixed signal data generation means and on the pattern model; and
evaluation result output means for outputting an evaluation result of the performance evaluation means.

11. A pattern recognition program causing a computer to function as:
low-order signal vector selection means for selecting a low-order signal vector for pattern model generation from among the low-order signal vectors located at a predetermined distance or more from the center of gravity of all the low-order signal vectors, out of a plurality of low-order signal vectors constituting a low-order signal vector space that is generated as a substitute for a data space composed of a plurality of higher-order signal pattern models, each composed of four-dimensional or higher-dimensional elements and generated based on a plurality of noise-unmixed signal data related to a plurality of objects, by mapping that data space to a data space composed of low-order signal vectors of three dimensions or less while approximating the positional relationship between the higher-order signal pattern models;
low-order noise vector selection means for selecting a low-order noise vector for pattern model generation from among the low-order noise vectors located at a predetermined distance or more from the center of gravity of all the low-order noise vectors, out of a plurality of low-order noise vectors constituting a low-order noise vector space that is generated as a substitute for a data space composed of a plurality of higher-order noise pattern models, each composed of four-dimensional or higher-dimensional elements and generated based on a plurality of types of signal-unmixed noise data, by mapping that data space to a data space composed of low-order noise vectors of three dimensions or less while approximating the positional relationship between the higher-order noise pattern models;
noise mixed signal data generation means for generating, based on the signal data corresponding to the low-order signal vector selected by the low-order signal vector selection means and the noise data corresponding to the low-order noise vector selected by the low-order noise vector selection means, noise mixed signal data in which the noise data is mixed into each signal data;
pattern model generation means for generating a pattern model by learning a probability model using the noise mixed signal data as learning data;
data acquisition means for acquiring recognition target data;
pattern recognition means for performing pattern recognition of the recognition target data acquired by the data acquisition means, based on that data and on the pattern model generated in the pattern model generation device; and
recognition result output means for outputting a recognition result of the pattern recognition means.

A pattern model generation method for generating a pattern model used for pattern recognition of recognition target data to be recognized,
A low-order signal vector selection step of selecting, from among a plurality of low-order signal vectors constituting a low-order signal vector space, a low-order signal vector for pattern model generation from those low-order signal vectors located at least a predetermined distance from the center of gravity of all the low-order signal vectors, the low-order signal vector space being obtained by mapping a data space composed of a plurality of high-order signal pattern models, each composed of elements of four or more dimensions and generated based on a plurality of noise-unmixed signal data related to a plurality of objects, to a data space composed of low-order signal vectors of three or fewer dimensions in a state in which the positional relationship between the high-order signal pattern models in the original data space is approximated;
A low-order noise vector selection step of selecting, from among a plurality of low-order noise vectors constituting a low-order noise vector space, a low-order noise vector for pattern model generation from those low-order noise vectors located at least a predetermined distance from the center of gravity of all the low-order noise vectors, the low-order noise vector space being obtained by mapping a data space composed of a plurality of high-order noise pattern models, each composed of elements of four or more dimensions and generated based on a plurality of types of signal-unmixed noise data, to a data space composed of low-order noise vectors of three or fewer dimensions in a state in which the positional relationship between the high-order noise pattern models in the original data space is approximated;
A noise mixed signal data generation step of generating noise mixed signal data in which the noise data is mixed into each piece of signal data, based on the signal data corresponding to the low-order signal vector selected in the low-order signal vector selection step and the noise data corresponding to the low-order noise vector selected in the low-order noise vector selection step;
And a pattern model generation step of generating a pattern model obtained by learning a probability model using the noise mixed signal data as learning data.
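
The selection and mixing steps of the method above can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes the low-order vectors have already been computed, and the function names (`select_outer_vectors`, `mix_noise`, `build_training_set`), the `noise_gain` parameter, and the simple additive mixing with tiled noise are all hypothetical choices that the claims do not specify.

```python
import numpy as np

def select_outer_vectors(low_order_vectors, min_distance):
    """Return indices of vectors at least min_distance from the centroid."""
    vectors = np.asarray(low_order_vectors, dtype=float)
    centroid = vectors.mean(axis=0)  # center of gravity of all vectors
    distances = np.linalg.norm(vectors - centroid, axis=1)
    return np.flatnonzero(distances >= min_distance)

def mix_noise(signal, noise, noise_gain=1.0):
    """Add noise to a signal sample by sample (assumed additive mixing).

    The noise is repeated or truncated to the signal length.
    """
    signal = np.asarray(signal, dtype=float)
    noise = np.asarray(noise, dtype=float)
    reps = -(-len(signal) // len(noise))  # ceiling division
    return signal + noise_gain * np.tile(noise, reps)[:len(signal)]

def build_training_set(signals, noises, signal_vecs, noise_vecs, min_distance):
    """Mix every selected noise into every selected signal."""
    sig_idx = select_outer_vectors(signal_vecs, min_distance)
    noi_idx = select_outer_vectors(noise_vecs, min_distance)
    return [mix_noise(signals[i], noises[j]) for i in sig_idx for j in noi_idx]
```

The resulting list would then serve as the learning data for the probability model in the pattern model generation step.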

A pattern model evaluation method for evaluating the recognition performance of a pattern model used for pattern recognition of recognition target data to be recognized,
An evaluation signal vector selection step of selecting, from among a plurality of low-order signal vectors constituting a low-order signal vector space, a low-order signal vector for evaluation from those low-order signal vectors that are located at least a predetermined distance from the center of gravity of all the low-order signal vectors or that are distributed uniformly around the center of gravity, the low-order signal vector space being obtained by mapping a data space composed of a plurality of high-order signal pattern models, each composed of elements of four or more dimensions and generated based on a plurality of noise-unmixed signal data related to a plurality of objects, to a data space composed of low-order signal vectors of three or fewer dimensions in a state in which the positional relationship between the high-order signal pattern models in the original data space is approximated;
An evaluation noise vector selection step of selecting, from among a plurality of low-order noise vectors constituting a low-order noise vector space, a low-order noise vector for evaluation from those low-order noise vectors that are located at least a predetermined distance from the center of gravity of all the low-order noise vectors or that are distributed uniformly around the center of gravity, the low-order noise vector space being obtained by mapping a data space composed of a plurality of high-order noise pattern models, each composed of elements of four or more dimensions and generated based on a plurality of types of signal-unmixed noise data, to a data space composed of low-order noise vectors of three or fewer dimensions in a state in which the positional relationship between the high-order noise pattern models in the original data space is approximated;
An evaluation mixed signal data generation step of generating mixed signal data for evaluation in which the noise data is mixed into each piece of signal data, based on the signal data corresponding to the low-order signal vector selected in the evaluation signal vector selection step and the noise data corresponding to the low-order noise vector selected in the evaluation noise vector selection step;
A performance evaluation step of evaluating the performance of the pattern model based on the mixed signal data for evaluation generated in the evaluation mixed signal data generation step and the pattern model;
An evaluation result output step of outputting an evaluation result of the performance evaluation step.
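
The evaluation method above can be illustrated with the following sketch. It rests on two assumptions that the claims do not state: "uniformly around the center of gravity" is read here as drawing the same number of vectors from each of several equal-width distance rings, and performance is measured as the fraction of correctly recognized items. The names `select_uniform_by_distance`, `recognition_rate`, `n_bins`, `per_bin`, and `seed` are hypothetical.

```python
import numpy as np

def select_uniform_by_distance(low_order_vectors, n_bins, per_bin, seed=0):
    """Pick up to per_bin vectors from each distance ring around the centroid."""
    vectors = np.asarray(low_order_vectors, dtype=float)
    centroid = vectors.mean(axis=0)
    distances = np.linalg.norm(vectors - centroid, axis=1)
    # Equal-width rings between distance 0 and the largest distance.
    edges = np.linspace(0.0, distances.max(), n_bins + 1)
    bin_ids = np.clip(np.digitize(distances, edges[1:-1]), 0, n_bins - 1)
    rng = np.random.default_rng(seed)
    chosen = []
    for b in range(n_bins):
        members = np.flatnonzero(bin_ids == b)
        if len(members) > 0:
            take = min(per_bin, len(members))
            chosen.extend(rng.choice(members, size=take, replace=False))
    return np.sort(np.array(chosen, dtype=int))

def recognition_rate(recognize, eval_data, labels):
    """Fraction of evaluation items whose recognized label is correct."""
    correct = sum(1 for x, y in zip(eval_data, labels) if recognize(x) == y)
    return correct / len(labels)
```

Here `recognize` stands for any callable that applies the pattern model under evaluation to one item of mixed signal data for evaluation and returns a label.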

A low-order signal vector selection step of selecting, from among a plurality of low-order signal vectors constituting a low-order signal vector space, a low-order signal vector for pattern model generation from those low-order signal vectors located at least a predetermined distance from the center of gravity of all the low-order signal vectors, the low-order signal vector space being obtained by mapping a data space composed of a plurality of high-order signal pattern models, each composed of elements of four or more dimensions and generated based on a plurality of noise-unmixed signal data related to a plurality of objects, to a data space composed of low-order signal vectors of three or fewer dimensions in a state in which the positional relationship between the high-order signal pattern models in the original data space is approximated;
A low-order noise vector selection step of selecting, from among a plurality of low-order noise vectors constituting a low-order noise vector space, a low-order noise vector for pattern model generation from those low-order noise vectors located at least a predetermined distance from the center of gravity of all the low-order noise vectors, the low-order noise vector space being obtained by mapping a data space composed of a plurality of high-order noise pattern models, each composed of elements of four or more dimensions and generated based on a plurality of types of signal-unmixed noise data, to a data space composed of low-order noise vectors of three or fewer dimensions in a state in which the positional relationship between the high-order noise pattern models in the original data space is approximated;
A noise mixed signal data generation step of generating noise mixed signal data in which the noise data is mixed into each piece of signal data, based on the signal data corresponding to the low-order signal vector selected in the low-order signal vector selection step and the noise data corresponding to the low-order noise vector selected in the low-order noise vector selection step;
A pattern model generation step of generating a pattern model obtained by learning a probability model using the noise mixed signal data as learning data;
A data acquisition step of acquiring recognition target data of the recognition target;
A pattern recognition step for performing pattern recognition of the acquired recognition target data based on the recognition target data acquired in the data acquisition step and the pattern model generated in the pattern model generation step;
And a recognition result output step of outputting a recognition result of the pattern recognition step.
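
The pattern model generation and pattern recognition steps above can be sketched as follows. This is only an illustration under stated assumptions: a single diagonal-covariance Gaussian per class is substituted for the probability model, whose type the claims leave open, and the inputs are assumed to be fixed-length feature vectors already extracted from noise mixed signal data. The function names `train_gaussian`, `log_likelihood`, and `recognize` are hypothetical.

```python
import numpy as np

def train_gaussian(features):
    """Fit a diagonal Gaussian (mean, variance) to training feature vectors."""
    x = np.asarray(features, dtype=float)
    # A small floor keeps the variance strictly positive.
    return x.mean(axis=0), x.var(axis=0) + 1e-6

def log_likelihood(model, x):
    """Log density of one feature vector under a diagonal Gaussian."""
    mean, var = model
    x = np.asarray(x, dtype=float)
    return float(-0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var))

def recognize(models, x):
    """Return the label whose model gives x the highest log-likelihood."""
    return max(models, key=lambda label: log_likelihood(models[label], x))
```

In this sketch `models` is a dictionary mapping each recognition label to the model trained on that label's learning data, and the value returned by `recognize` corresponds to the recognition result that the final step outputs.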