JP3464148B2

JP3464148B2 - Expression method of identification space in recognition device, template evaluation method and learning device, recording medium

Info

Publication number: JP3464148B2
Application number: JP17630698A
Authority: JP
Inventors: 貴彦新村
Original assignee: 株式会社エヌ・ティ・ティ・データ
Priority date: 1998-06-23
Filing date: 1998-06-23
Publication date: 2003-11-05
Anticipated expiration: 2018-06-23
Also published as: JP2000011093A

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to recognition technology that uses templates, and in particular to an improved method for properly evaluating templates in order to advance learning.

[0002]

[Prior Art] In a character recognition device, for example, the recognition rate improves when the templates are arranged so that the features of different character categories are easy to discriminate and the features of each character category are well represented. For this reason, the templates are evaluated each time character recognition learning is performed, and if the evaluation result is insufficient, the templates are updated while predetermined learning parameters are changed. Specifically, a template is updated by moving it, within the identification space formed by the multidimensional learning data, in the direction that raises the recognition rate.

[0003] Conventionally, the direction in which to move a template has been determined indirectly, by changing the learning parameters by trial and error and checking how much the recognition rate has risen. If the relationship between the templates and the distribution of the multidimensional learning data (the category features) could be visualized, it would show directly in which direction a template should be moved, and would therefore be an effective tool. This, however, requires a two-dimensional distribution map that expresses the features of the identification space. In the field of character recognition the identification space can reach several thousand dimensions, and compressing such a multidimensional space directly into two dimensions yields an impractical distribution map with large variance and heavily overlapping distributions.

[0004] Attempts have therefore been made to create the distribution map after compressing the space to a predetermined number of dimensions. Known techniques of this kind include principal component analysis (see Yanai, "Handbook of Multivariate Analysis", Gendai Sugakusha, 1986), Sammon's map (see John W. Sammon, "Nonlinear Mapping for Data Structure Analysis", IEEE Trans., Vol. C-18, No. 5, May 1969), and the Fisher ratio (see Lachenbruch, "Discriminant Analysis", Gendai Sugakusha). Principal component analysis transforms the features of a multidimensional space into an identification space spanned by a small number of eigenvectors, and Sammon's map nonlinearly maps the data into another space. In principal component analysis, however, the contribution ratio of each individual eigenvector can be low, so that the overlap on the two-dimensional distribution map becomes large. Sammon's map can transform to low dimensions, but it is awkward to use because its parameters must be tuned by trial and error.

[0005] By contrast, the Fisher ratio (hereinafter abbreviated as the F ratio) involves no variable transformation, so it is easier to handle than the other techniques and is used relatively often in this field. The F ratio expresses the distribution state of the multidimensional learning data by the ratio in equation (1) below.

F ratio = Vb / Vw ... (1)

The numerator Vb is the between-class (inter-category) variance and the denominator Vw is the within-class (intra-category) variance; they are given by equations (2) and (3) below, respectively.

[0006]

[Equation 1]

[0007] Here, g is the number of character categories, Ei is the mean of character category i, Eu is the mean of E1 to Eg, ni is the number of data items in character category i, and xij is the j-th data item of character category i.
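The image containing equations (2) and (3) is not reproduced in this text. One form that is consistent with the symbol definitions above is sketched here as an assumption only; the normalization used in the original equations may differ.

```latex
% Assumed reconstruction, not taken from the original equation image.
E_i = \frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij}, \qquad
E_u = \frac{1}{g}\sum_{i=1}^{g} E_i

V_b = \frac{1}{g}\sum_{i=1}^{g}\left(E_i - E_u\right)^2, \qquad
V_w = \frac{1}{g}\sum_{i=1}^{g}\frac{1}{n_i}\sum_{j=1}^{n_i}\left(x_{ij} - E_i\right)^2
```

Under this form, the F ratio Vb / Vw of equation (1) grows when the category means spread apart and shrinks when the samples scatter widely around their own category mean.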

[0008] As these equations show, the F ratio becomes larger, and the distributions become easier to discriminate, as the within-class variance Vw becomes smaller (the learning data within a category is less spread out) and as the between-class variance Vb becomes larger (the two categories are farther apart). That is, when the distributions of character categories C1 to C4 overlap as in FIG. 9(a), the F ratio is small and the distributions are hard to discriminate. Conversely, the smaller the overlap among character categories C1 to C4, as in FIG. 9(b), the larger the F ratio and the easier the discrimination.
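The behaviour described here can be checked numerically. The following is a minimal sketch for scalar data, assuming that Vb and Vw are plain averages of squared deviations (the exact normalization in the original equations is not available); the function name `f_ratio` and the sample values are illustrative, not from the patent.

```python
from statistics import fmean


def f_ratio(categories):
    """Return Vb / Vw for a list of categories, each a list of scalar samples."""
    means = [fmean(c) for c in categories]            # Ei for each category
    grand = fmean(means)                              # Eu, the mean of the Ei
    # Between-class variance: spread of the category means around Eu.
    vb = fmean([(m - grand) ** 2 for m in means])
    # Within-class variance: average spread of samples around their own mean.
    vw = fmean([fmean([(x - m) ** 2 for x in c])
                for c, m in zip(categories, means)])
    return vb / vw


overlapping = [[0.0, 1.0, 2.0], [1.0, 2.0, 3.0]]      # categories overlap, as in FIG. 9(a)
separated = [[0.0, 1.0, 2.0], [10.0, 11.0, 12.0]]     # categories far apart, as in FIG. 9(b)

print(f_ratio(overlapping))   # small F ratio
print(f_ratio(separated))     # much larger F ratio
```

Both data sets have the same within-class spread, so the difference in F ratio comes entirely from how far apart the category means are.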

[0009]

[Problems to Be Solved by the Invention] The conventional F ratio looks only at the overall distribution state within the identification space; it does not take into account the variance within each individual category or the variance between any two categories. In practice, however, even for the same categories there are cases, as in FIG. 10(a), where the distributions overlap little as a whole but individual character categories among C1 to C3 overlap, and cases, as in FIG. 10(b), where the individual character categories C1 to C3 overlap little. Furthermore, even for the same F ratio, the way character categories C1 to C3 overlap can differ, as in FIG. 10(b) and FIG. 10(c). These factors make the distribution map of the templates and the learning data hard to read and make it difficult to evaluate and update the templates.

[0010] An object of the present invention is therefore to provide a method of representing an identification space that makes the features of a multidimensional identification space clearly visible. Another object of the present invention is to provide a template evaluation method for quickly obtaining high-performance templates that raise the recognition rate. Another object of the present invention is to provide a template learning device suited to carrying out the above template evaluation method. Another object of the present invention is to provide a recording medium suited to carrying out the above dimension compression method and template evaluation method on a computer device.

[0011]

[Means for Solving the Problems] To solve the above problems, the present invention includes: a step of identifying a plurality of two-axis combinations from a multidimensional identification space representing the features of a plurality of categories; a step of deriving, for each two-axis combination, a first value representing the ratio of the inter-category variance to the intra-category variance over all categories, and a second value representing the minimum, over all combinations of two categories, of the ratio of the inter-category variance to the intra-category variance for that pair; a step of selecting the two-axis combination whose distance from a target point, at which the first value and the second value are both maximal, is smallest, and forming a plane from it; and a step of reflecting the data on the selected two axes onto that plane. In this way the multidimensional identification space of a recognition device can be represented properly in two dimensions.

[0012] In the template evaluation method that solves the other problem above, multidimensional learning data representing the features of a plurality of categories, and template data for generating templates that represent the features of the identification space formed by that multidimensional learning data, are held. The method is characterized in that the two-axis combination for which the overlap of the category feature distributions is smallest is selected from the identification space, the learning data and template data on the selected two axes are displayed as a distribution on the plane formed by those two axes, and the arrangement of the templates in the identification space is evaluated on the basis of this distribution display.

[0013] Specifically, the step of selecting the two axes is a process of deriving, for each two-axis combination, a first value representing the ratio of the inter-category variance to the intra-category variance over all categories, and a second value representing the minimum, over all combinations of two categories, of the ratio of the inter-category variance to the intra-category variance for that pair, and then selecting the two-axis combination whose distance from the target point, at which the first value and the second value are both maximal, is smallest. The identification space from which the two axes are selected is preferably a space obtained by dimensionally compressing the initial multidimensional space through removal of superfluous feature components.
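The selection rule can be written compactly. The notation below is introduced here for illustration and does not appear in the patent: a denotes one two-axis combination, F(a) the first value, F_{pq}(a) the ratio computed using only categories p and q, and F_m(a) the second value.

```latex
% Illustrative notation only; the symbols a, F_{pq}, a^{*} are assumptions.
F(a) = \frac{V_b(a)}{V_w(a)}, \qquad
F_m(a) = \min_{1 \le p < q \le g} F_{pq}(a)

(F_M,\; F_{mM}) = \Bigl(\max_{a} F(a),\; \max_{a} F_m(a)\Bigr)

a^{*} = \arg\min_{a} \sqrt{\bigl(F_M - F(a)\bigr)^2 + \bigl(F_{mM} - F_m(a)\bigr)^2}
```

The target point (F_M, F_mM) generally does not correspond to any single axis combination, since the maxima of F and F_m may be attained by different combinations; the rule picks the combination that comes closest to attaining both at once.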

[0014] The template learning device of the present invention that solves the other problem above has: data holding means for holding multidimensional learning data representing the features of a plurality of categories and template data for generating templates that represent the features of the identification space formed by that multidimensional learning data; axis selection means for selecting, from the identification space, the two-axis combination for which the overlap of the category distributions is smallest; and means for forming a plane from the two axes selected by the axis selection means and displaying on that plane the distribution of the learning data and template data on those two axes. The device is characterized in that it learns the templates in the identification space on the basis of this distribution display. With the template learning device configured in this way, learning (updating) can proceed while the direction of template movement within the identification space is properly evaluated, so high-performance templates can be created quickly and easily.

[0015] The recording media of the present invention that solve the other problem above are a first recording medium suited to carrying out the identification space representation method and a second recording medium suited to carrying out the template evaluation method. The first recording medium is a computer-readable recording medium on which is recorded a program for causing a computer device to execute at least the following processes.
(1-1) A process of identifying a plurality of two-axis combinations from a multidimensional identification space representing the features of a plurality of categories.
(1-2) A process of deriving, for each two-axis combination, a first value representing the ratio of the inter-category variance to the intra-category variance over all categories, and a second value representing the minimum, over all combinations of two categories, of the ratio of the inter-category variance to the intra-category variance for that pair.
(1-3) A process of selecting the two-axis combination whose distance from the target point, at which the first value and the second value are both maximal, is smallest, and forming a plane from it.

[0016] The second recording medium is a computer-readable recording medium on which is recorded a program for causing a computer device, which holds multidimensional learning data representing the features of a plurality of categories and template data for generating templates that represent the features of the identification space formed by that multidimensional learning data, to execute at least the following processes.
(2-1) A process of selecting, from the identification space, the two-axis combination for which the overlap of the category feature distributions is smallest.
(2-2) A process of displaying, on the plane formed by the selected two axes, the distribution of the learning data and template data on those two axes.
(2-3) A process of moving the templates in the identification space on the basis of predetermined learning parameters input in response to this distribution display.

[0017]

[Embodiments of the Invention] An embodiment in which the present invention is applied to a template learning device that learns (evaluates and updates) templates used for character recognition is described below. FIG. 1 is a block diagram of the template learning device according to this embodiment. The template learning device 1 is realized by a computer device equipped with a data storage device 20, a data input device 30, and a display device 40. It has the following functional blocks, which are formed when the computer device reads a predetermined program and executes it in cooperation with its own OS (operating system): an input/output interface 11, a canonical analysis unit 12, a template creation unit 13, a template evaluation unit 14, an identification space analysis unit 15, a variance calculation unit 16, and a weight calculation unit 17.

[0018] The program is normally stored in an internal or external storage device (not shown) of the computer device and is read and executed as needed. It may instead be recorded on a recording medium separable from the computer device, for example a portable recording medium such as a CD-ROM or FD, or on a program server connected via a communication line, and be read at the time of use, installed in the internal or external storage device, and executed as needed.

[0019] The data storage device 20 stores multidimensional learning data representing the features of a plurality of character categories, and template data for generating templates that represent the features of the identification space formed by that multidimensional learning data. The data input device 30 is used to input learning parameters for changing the template patterns and their direction of movement, and the display device 40 displays the distribution maps described later.

[0020] The input/output interface 11 controls the transfer of data and the like between the data storage device 20, the data input device 30, and the display device 40 on the one hand and the internal functional blocks 12 and 14 on the other. The canonical analysis unit 12 removes superfluous features from the identification space formed by the multidimensional learning data and thereby compresses its dimensions. The template creation unit 13 creates and updates templates for character recognition through pattern recognition learning based on the learning parameters. The template evaluation unit 14 evaluates the created or updated templates using the functions of the identification space analysis unit 15, the variance calculation unit 16, and the weight calculation unit 17. The identification space analysis unit 15 identifies the axes of the identification space and the two-axis combinations; the variance calculation unit 16 calculates the F ratios described above and compares the results; and the weight calculation unit 17 performs weight calculations based on, for example, the Euclidean distance.

[0021] The template learning device 1 operates as follows. First, an outline of the procedure for the identification space representation method, which can be applied to template learning, is described with reference to FIG. 2.

[0022] First, the multidimensional learning data is acquired from the data storage device 20 (step S101). Canonical discriminant analysis is then applied to the initial identification space formed by this multidimensional learning data, removing superfluous feature components and compressing the dimensions (step S102). This is performed by the canonical analysis unit 12. For example, for 74 katakana categories, comprising イ to ン together with voiced sounds, semi-voiced sounds, the voiced sound mark, the semi-voiced sound mark and so on, the initial identification space is expressed in 1536 dimensions, but the inventor has confirmed that the features of the space are almost entirely preserved even when it is compressed to 54 to 44 dimensions. The present invention exploits this fact and compresses the identification space to, for example, 44 dimensions for the subsequent processing.
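The patent does not spell out how the canonical discriminant analysis is computed. In its textbook form, given here as background and not as the inventor's stated procedure, it seeks directions w that maximize the ratio of between-class to within-class scatter, which leads to a generalized eigenvalue problem. The symbols S_b, S_w, W, and d are notation introduced for this sketch.

```latex
% Textbook formulation; S_b, S_w, W, d are notation assumed for illustration.
J(w) = \frac{w^{\top} S_b\, w}{w^{\top} S_w\, w}, \qquad
S_b\, w_k = \lambda_k\, S_w\, w_k, \qquad
\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d

y = W^{\top} x, \qquad W = \bigl[\, w_1 \;\; w_2 \;\; \cdots \;\; w_d \,\bigr]
```

Here x is a sample in the initial space (1536 dimensions in the example) and y is its compressed representation with d components (for example d = 44). Because S_b has rank at most g - 1 for g categories, at most 73 discriminant axes carry information for the 74 katakana categories, which is consistent with compressing to 54 to 44 dimensions with little loss.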

[0023] Next, from the axes of the compressed space, the two-axis combination for which the overlap of the category feature distributions is smallest is selected (step S103). The learning data and template data on the selected two axes are extracted from the data storage device 20 (step S104), reflected onto the plane formed by those two axes, and displayed as a two-dimensional distribution on the display device 40 (step S105). This yields a distribution map with little overlap between distributions.
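Steps S104 and S105 amount to keeping two coordinates of every sample. A minimal sketch follows, assuming the data is held as a dictionary from category name to a list of coordinate tuples; the names `project`, `learning_data`, `templates`, and `selected_axes`, and all values, are hypothetical.

```python
def project(samples, axes):
    """Keep only the two selected coordinates of each multidimensional sample."""
    a, b = axes
    return [(x[a], x[b]) for x in samples]


# Hypothetical 4-dimensional learning data and templates for two categories.
learning_data = {
    "C1": [(0.1, 5.0, 2.0, 0.3), (0.2, 5.2, 2.1, 0.1)],
    "C2": [(0.1, 9.0, 7.0, 0.2), (0.3, 9.1, 7.2, 0.4)],
}
templates = {
    "C1": [(0.15, 5.1, 2.05, 0.2)],
    "C2": [(0.2, 9.05, 7.1, 0.3)],
}
selected_axes = (1, 2)   # assumed outcome of step S103 (zero-based axis indices)

# Step S104: extract the coordinates on the selected axes for both data kinds.
plane = {
    name: {
        "learning": project(learning_data[name], selected_axes),
        "template": project(templates[name], selected_axes),
    }
    for name in learning_data
}

# Step S105 would hand these 2-D points to a plotting routine; here they are printed.
for name, points in plane.items():
    print(name, points)
```

Learning data and templates are projected with the same axes so that both can be overlaid on one plane and compared directly.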

[0024] Specifically, the two-axis selection in step S103 follows the procedure of FIG. 3. The identification space analysis unit 15 identifies the identification space that has undergone canonical discriminant analysis and then selects two of its dimensional axes (steps S201, S202). On the selected two axes, the ratio of the inter-category variance to the intra-category variance over all categories (the F ratio) is derived using equations (1) to (3) above (step S207), and the minimum value (Fm) of the ratio of the inter-category variance to the intra-category variance for a combination of two character categories (referred to as the combination F ratio) is derived over all such combinations (steps S203 to S206). The F ratio and Fm are derived by the variance calculation unit 16, and the results are saved in a working memory area (not shown) as the "array" for that two-axis combination (step S208). The processing from step S202 onward is repeated for all two-axis combinations (step S209: Yes) to find a two-axis combination for which both the F ratio and Fm are high. Here, the largest F ratio is denoted FM and the largest Fm is denoted FmM, and the target point (FM, FmM) is identified (step S210).
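The loop of FIG. 3 up to step S210 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: variances on a two-axis plane are taken as averages of squared Euclidean distances, and the names `score_axis_pairs`, `target_point`, and `around`, as well as the toy data, are hypothetical.

```python
from itertools import combinations


def mean_point(points):
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))


def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def f_ratio(categories):
    """Vb / Vw for categories given as lists of 2-D points (assumed normalization)."""
    means = [mean_point(c) for c in categories]
    grand = mean_point(means)
    vb = sum(sq_dist(m, grand) for m in means) / len(means)
    vw = sum(sum(sq_dist(x, m) for x in c) / len(c)
             for c, m in zip(categories, means)) / len(categories)
    return vb / vw


def score_axis_pairs(data, n_axes):
    """Return {(axis_a, axis_b): (F ratio, Fm)} for every two-axis combination."""
    table = {}
    for a, b in combinations(range(n_axes), 2):               # S201, S202
        plane = [[(x[a], x[b]) for x in cat] for cat in data]
        f_all = f_ratio(plane)                                # S207: all categories
        fm = min(f_ratio([p, q])                              # S203 to S206
                 for p, q in combinations(plane, 2))
        table[(a, b)] = (f_all, fm)                           # S208: store the "array"
    return table


def target_point(table):
    """S210: (FM, FmM), the largest F ratio and the largest Fm over all pairs."""
    return (max(f for f, _ in table.values()),
            max(fm for _, fm in table.values()))


def around(center):
    """Two samples placed symmetrically around a category center (toy data)."""
    return [tuple(c + 1 for c in center), tuple(c - 1 for c in center)]


data = [around((0, 0, 0)), around((10, 0, 1)), around((0, 10, 2))]
table = score_axis_pairs(data, n_axes=3)
print(table)
print(target_point(table))
```

In this toy data the three categories are well separated on axes 0 and 1 but nearly coincide on axis 2, so any plane that includes axis 2 receives a small Fm even though its overall F ratio stays moderately high.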

[0025] The weight calculation unit 17 then calculates the Euclidean distance between the target point (FM, FmM) and the point (F ratio, Fm) of each two-axis combination (step S211) and sorts these distances (step S212). The two-axis combination with the smallest distance, that is, the one whose (F ratio, Fm) point is closest to the target point (FM, FmM), is selected as the two axes for which the overlap of the category feature distributions is smallest (step S213).
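Steps S211 to S213 can be sketched as a sort by distance to the target point. The table values and axis labels below are invented for illustration and are not taken from FIG. 4 or FIG. 5; `select_axis_pair` is a hypothetical name.

```python
import math


def select_axis_pair(table):
    """table maps an axis pair to its (F ratio, Fm) point."""
    target = (max(f for f, _ in table.values()),      # FM
              max(fm for _, fm in table.values()))    # FmM
    # Steps S211 and S212: distance of every point to the target, sorted ascending.
    ranked = sorted(table, key=lambda pair: math.dist(table[pair], target))
    return ranked[0], ranked, target                  # S213: the closest pair is selected


# Hypothetical (F ratio, Fm) values for three axis combinations.
table = {
    ("u", "v"): (4.0, 0.2),   # largest F ratio, but one category pair overlaps badly
    ("u", "w"): (3.5, 0.9),   # slightly lower F ratio, no badly overlapping pair
    ("v", "w"): (2.0, 1.0),   # largest Fm, but low overall F ratio
}
best, ranked, target = select_axis_pair(table)
print(best, target)
```

A rule that looked only at the F ratio would pick ("u", "v") here, while the distance-to-target rule picks ("u", "w"), which gives up a little overall separation to avoid a badly overlapping category pair.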

[0026] Next, the procedure for representing a two-dimensional distribution with the above representation method and training the templates on the basis of that distribution is described using a concrete example, in which templates are evaluated with a katakana category set. As noted above, katakana character categories can in practice be compressed to about 44 dimensions. Katakana category sets can be combined in many ways, but if the evaluation uses a category set whose combination gives a relatively low recognition rate, templates can be created that also achieve a high matching rate for other combinations of character categories or category sets. Experiments by the inventor have confirmed that the set of five katakana categories 「シンラジソ」 (シ, ン, ラ, ジ, ソ) is the category set with the lowest recognition rate. In this example, therefore, the katakana character categories are compressed to 44 dimensions, and the F ratio, the combination F ratios, Fm and so on are obtained for the pair categories of the 「シンラジソ」 category set (「シン」, 「シラ」, 「シジ」, ...).

[0027] In this case the number of dimensions (axes) is 44, the number of axis combinations is C(44, 2) = 946, the number of pair-category patterns is C(5, 2) = 10, and the number of data items per category is 50; the arrays are as shown in FIG. 4. In practice there is one array for every combination of two of the 44 axes, but to illustrate the features of the present invention the example in FIG. 4 is limited to the "1-2 axes" and "1-3 axes" combinations.
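The counts quoted here follow directly from the binomial coefficient and can be reproduced with the standard library. The variable names are illustrative.

```python
import math
from itertools import combinations

n_axes = 44
category_set = "シンラジソ"          # the five katakana categories of the example

axis_pairs = math.comb(n_axes, 2)                   # two-axis combinations
category_pairs = math.comb(len(category_set), 2)    # pair categories per axis combination
pair_names = ["".join(p) for p in combinations(category_set, 2)]

print(axis_pairs)                   # 946
print(category_pairs)               # 10
print(pair_names[:3])               # ['シン', 'シラ', 'シジ']
print(axis_pairs * category_pairs)  # combination F ratios evaluated in total
```

Each of the 946 arrays therefore holds one F ratio over all five categories plus ten combination F ratios, from which Fm is taken as the minimum.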

[0028] Referring to FIG. 4, for each axis combination the combination F ratios vary from one pair category to another. For the "1-2 axes" combination, for example, the F ratio over all categories is 3.677, whereas the combination F ratios range from 0.304 for 「シン」 to 10.883 for 「ンラ」. The same holds for the "1-3 axes" combination. As noted earlier, a larger F ratio makes discrimination easier, so an axis combination with a small F ratio is unsuitable for creating a distribution map. Because the conventional technique judged the distribution state solely from the F ratio over all categories, it could suggest that choosing "1-2 axes" or "1-3 axes" makes little difference. In the present invention, by contrast, the lowest of the combination F ratios for an axis combination is taken as the Fm representing that combination. On this basis Fm is 0.304 for "1-2 axes" and 0.018 for "1-3 axes", so a large difference between the two emerges.

[0029] FIG. 5 is a graph of measured results showing the relationship between the F ratio over all categories and Fm, obtained in this way, for all two-axis combinations. In FIG. 5 the horizontal axis is Fm and the vertical axis is the F ratio over all categories, and the point marked X is the target point (FM, FmM) identified in step S210. According to FIG. 5, the two-axis combination whose point is closest to the target point X is "2-7 axes". The conventional technique, which simply selects the combination with the largest F ratio, would give "2-24 axes" in the example of FIG. 7.

[0030] FIG. 6 is a two-dimensional distribution map of the template data viewed on the "2-7 axes", FIG. 7 is a two-dimensional distribution map of the learning data viewed on the "2-7 axes", and FIG. 8 is a two-dimensional distribution map of the learning data viewed on the "2-24 axes" selected by the conventional technique. These maps are displayed on the display device 40 as needed, so the operator can evaluate the templates while viewing them.

[0031] As is clear from FIGS. 6 to 8, the technique of the present invention gives less overlap between distributions than the conventional technique and is therefore better suited to comparison with the spread of the template data, that is, to template evaluation. The direction of template movement can thus be fine-tuned while being checked on two-dimensional distribution maps such as FIGS. 6 and 7, and high-performance templates can be created quickly.

[0032] The templates whose data distribution is shown in FIG. 6 are only an example; any templates may be chosen. For instance, a plurality of templates may be prepared in advance for different learning parameters, and each template compared with the two-dimensional distribution map of the learning data, in order to examine which template is appropriate and which learning parameter values are best.

[0033] As an application of the method of the present invention, several combinations of axes close to the target point X (FM, FmM) may be prepared as candidates, and a plurality of two-dimensional distribution diagrams created so that the operator can choose freely among them. Furthermore, although the weight calculation unit 17 computes the Euclidean distance in this embodiment, a weight such as the reciprocal of the variance may be applied instead.
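The candidate selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the weight calculation unit 17: the function name rank_axis_pairs, the input layout, and the example scores are hypothetical, and the inverse-variance option assumes that the weight is applied to each squared coordinate difference, which the text does not specify.

```python
import math
from statistics import pvariance


def rank_axis_pairs(scores, use_inverse_variance=False):
    """Sort axis pairs by their distance to the target point X (FM, FmM).

    scores maps an axis pair (i, j) to its (F, Fm) values.
    FM and FmM are taken as the maxima of F and Fm over all pairs.
    """
    f_values = [f for f, _ in scores.values()]
    fm_values = [fm for _, fm in scores.values()]
    target = (max(f_values), max(fm_values))

    if use_inverse_variance:
        # Assumption: weight each coordinate by the reciprocal of its
        # variance across the candidates (fails if a variance is zero).
        weights = (1.0 / pvariance(f_values), 1.0 / pvariance(fm_values))
    else:
        # Plain Euclidean distance.
        weights = (1.0, 1.0)

    def distance(pair):
        f, fm = scores[pair]
        return math.sqrt(weights[0] * (f - target[0]) ** 2
                         + weights[1] * (fm - target[1]) ** 2)

    return sorted(scores, key=distance)


# Made-up scores for illustration only; they are not read from FIG. 5.
example_scores = {(2, 7): (9.0, 4.0), (2, 24): (10.0, 1.0), (1, 3): (5.0, 3.5)}
# Keep the two nearest pairs as candidates to show to the operator.
candidates = rank_axis_pairs(example_scores)[:2]
```

Returning a ranked list rather than a single pair matches the idea of preparing several candidates and letting the operator choose among the resulting diagrams.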

[0034] As another application of the present invention, the learning direction (moving direction) of the template can be determined by replacing some of the plurality of categories with other similar categories and comparing the combination F ratios of a plurality of two-axis combinations that include the same axis. For example, the categories "ミ" (mi), "ン" (n), "ソ" (so), and "ツ" (tsu) closely resemble the category "シ" (shi) in their features (such categories are called opposing categories), so misrecognition occurs easily during character recognition. Accordingly, using a plurality of other category sets in which one character category of the above-mentioned category set "シンラジソ" has been replaced with an opposing category, the movement of the template can be examined by comparing the combination F ratio and Fm while changing the axis paired with axis 2, for example "2-7 axis", "2-20 axis", "2-10 axis", and so on.
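The comparison across axis pairs that share one axis can be sketched as below. This is only an illustration under assumptions: the F ratio is taken here as between-category scatter divided by within-category scatter on the selected plane, the combination F ratio as the same quantity for one pair of categories, and Fm as the smallest combination F ratio. The function names, the data layout, and the 0-based axis indices are hypothetical.

```python
from itertools import combinations


def f_ratio(groups):
    """Between-category scatter / within-category scatter (assumed definition).

    groups maps a category name to a list of points on the chosen plane.
    Fails with a division by zero if every category has zero spread.
    """
    points = [p for pts in groups.values() for p in pts]
    dims = len(points[0])
    grand = [sum(p[d] for p in points) / len(points) for d in range(dims)]
    between = within = 0.0
    for pts in groups.values():
        mean = [sum(p[d] for p in pts) / len(pts) for d in range(dims)]
        between += len(pts) * sum((mean[d] - grand[d]) ** 2 for d in range(dims))
        within += sum((p[d] - mean[d]) ** 2 for p in pts for d in range(dims))
    return between / within


def sweep_partner_axes(data, fixed_axis, partner_axes):
    """For each pair (fixed_axis, k), return (F over all categories, Fm).

    data maps a category name to a list of feature vectors.
    """
    results = {}
    for k in partner_axes:
        # Project every feature vector onto the plane of the two axes.
        plane = {c: [(v[fixed_axis], v[k]) for v in vecs]
                 for c, vecs in data.items()}
        # Combination F ratio for every pair of categories.
        pairwise = [f_ratio({a: plane[a], b: plane[b]})
                    for a, b in combinations(plane, 2)]
        results[(fixed_axis, k)] = (f_ratio(plane), min(pairwise))
    return results
```

Running the sweep once per category set (the original set and each set with one category swapped for an opposing category) gives the Fm values that the paragraph proposes to compare.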

[0035] The embodiment described above applies the present invention to the learning of templates for character recognition, but the invention is equally applicable to recognition processing other than character recognition.

【００３６】[0036]

[Effects of the Invention] As is apparent from the above description, the present invention makes it possible to learn a template used in a recognition device while evaluating its moving direction with a two-dimensional distribution, and thus to obtain a well-performing template quickly.

[Brief description of drawings]

FIG. 1 is a configuration diagram showing an embodiment of a template learning device to which the present invention is applied.

FIG. 2 is an explanatory diagram outlining the procedure by which the template learning device of the present embodiment expresses an identification space.

FIG. 3 is a diagram showing the procedure, in the present embodiment, for selecting from the axes of the compressed dimensions the combination of two axes that minimizes the overlap of category feature distributions.

FIG. 4 is a table showing an example of the F ratio, the combination F ratio, and Fm for the 1-2 axis and the 1-3 axis.

FIG. 5 is a graph showing measured results of the relationship between the F ratio for all categories and Fm, for every combination of two axes.

FIG. 6 is a two-dimensional distribution diagram showing the distribution of the template data viewed from the 2-7 axis.

FIG. 7 is a two-dimensional distribution diagram showing the distribution of the learning data viewed from the 2-7 axis.

FIG. 8 is a two-dimensional distribution diagram showing the distribution of the learning data viewed from the 2-24 axis.

FIG. 9(a) is an explanatory diagram showing a state in which distributions overlap in the multidimensional learning data, and FIG. 9(b) is an explanatory diagram showing a state in which the overlap is small.

FIG. 10(a) is an explanatory diagram showing a distribution state in which the F ratio is small, FIG. 10(b) a distribution state in which the F ratio is large, and FIG. 10(c) a distribution state that has the same F ratio as in (b) but differs from it.

[Explanation of symbols]

1 Template learning device
11 Input/output interface
12 Canonical analysis unit
13 Template creation unit
14 Template evaluation unit
15 Identification space analysis unit
16 Variance calculation unit
17 Weight calculation unit
20 Data storage device
30 Data input device
40 Display device

Continuation of front page
(58) Fields surveyed (Int. Cl.⁷, DB name): G06K 9/00 - 9/82, G06N 3/00, G06T 7/00 - 7/60

Claims

(57) [Claims]

1. A template evaluation method in a recognition device, characterized by: holding multidimensional learning data representing features of a plurality of categories and template data for generating a template representative of features of an identification space formed by the multidimensional learning data; selecting a combination of two axes that minimizes the overlap of distributions of category features; and displaying the distributions of the learning data and the template data on the two axes on a plane formed by the selected two axes, whereby the arrangement of templates in the identification space is evaluated.

2. The template evaluation method according to claim 1, wherein the step of selecting the two axes includes: deriving, for each combination of two axes, a first value representing the ratio of inter-category variance to intra-category variance over all categories, and a second value representing the minimum, taken over all combinations of categories, of the ratio of inter-category variance to intra-category variance for one combination of categories; and selecting the combination of two axes that minimizes the distance from a target point at which the first value and the second value are both maximum.

3. The template evaluation method according to claim 1 or 2, wherein the identification space is a dimension-compressed space obtained by removing surplus feature components from an initial multidimensional space.

4. The template evaluation method according to claim 1, wherein the learning direction of the template is determined by replacing a part of the plurality of categories with another similar category and comparing the second values relating to a plurality of two-axis combinations that include the same axis.

5. The template evaluation method according to any one of claims 1 to 4, wherein the plurality of categories is the set of categories having the lowest recognition rate among the categories recognized using the template.

6. A template learning device comprising: data holding means for holding multidimensional learning data representing features of a plurality of categories and template data for generating a template representative of features of an identification space formed by the multidimensional learning data; axis selection means for selecting, from the identification space, a combination of two axes that minimizes the overlap of category distributions; and means for forming a plane from the two axes selected by the axis selection means and displaying on the plane the distributions of the learning data and the template data on the two axes, wherein the template in the identification space is learned based on the distribution display.

7. The template learning device according to claim 6, wherein the axis selection means comprises: a variance calculation unit that derives, for each combination of two axes, a first value representing the ratio of inter-category variance to intra-category variance over all categories, and a second value representing the minimum, taken over all combinations, of the ratio of inter-category variance to intra-category variance for a combination of two categories; and a distance calculation unit that selects the combination of two axes that minimizes the distance from a target point at which the first value and the second value are both maximum.

8. The template learning device according to claim 6 or 7, further comprising a canonical discriminator that removes surplus feature components from the initial multidimensional space formed by the multidimensional learning data to compress its dimensions, wherein the dimension-compressed space is used as the identification space.

9. A computer-readable recording medium on which is recorded a program for causing a computer device, which holds multidimensional learning data representing features of a plurality of categories and template data for generating a template representative of features of an identification space formed by the multidimensional learning data, to execute: a process of selecting, from the identification space, a combination of two axes that minimizes the overlap of category feature distributions; a process of displaying the distributions of the learning data and the template data on the two axes on a plane formed by the two axes; and a process of moving the template in the identification space based on a predetermined learning parameter input in response to the distribution display.