JPS60142788A - Feature amount evaluating method in graphic recognition - Google Patents

Feature amount evaluating method in graphic recognition

Info

Publication number
JPS60142788A
JPS60142788A
Authority
JP
Japan
Prior art keywords
category
categories
distribution
discrete
feature
Prior art date
Legal status
Granted
Application number
JP58248152A
Other languages
Japanese (ja)
Other versions
JPH027115B2 (en)
Inventor
Miyahiko Orita (折田 三弥彦)
Yoshiki Kobayashi (小林 芳樹)
Yutaka Kubo (久保 裕)
Current Assignee
Hitachi Ltd
Original Assignee
Hitachi Ltd
Priority date
Filing date
Publication date
Application filed by Hitachi Ltd filed Critical Hitachi Ltd
Priority to JP58248152A (JPS60142788A)
Priority to US06/687,757 (US4658429A)
Publication of JPS60142788A
Publication of JPH027115B2
Status: Granted


Abstract

PURPOSE: To attain identification by evaluating and selecting proper feature quantities for plural categories, applying to them the evaluation measure with the best dividing performance (the discrete distribution number) whatever the distribution state of the object categories.

CONSTITUTION: The evaluation follows the internal operation of a discrete-distribution-number calculating section 9 for feature quantity F2. For the feature distribution data 14 of each category, stored in a form compressed to a mean value and a standard deviation, a category-combination management section 15 generates combinations of two categories from Ca-Cf without duplication and transfers the distribution data (mean and standard deviation) of the two categories to an inter-distribution distance calculating section 16. On each start command from a discrete decision section, the management section 15 repeats the operation until all combinations of categories have been output. In this way proper feature quantities are evaluated and selected, and identification is attained for the given plural categories.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of Application of the Invention]

The present invention relates to figure recognition, and more particularly to a method for evaluating the feature quantities of figures to be recognized.

[Background of the Invention]

To identify and recognize objects, a recognition dictionary is created and recognition is performed through a decision-tree structure; a recognition target is then identified according to the recognition dictionary prepared in advance.

The recognition dictionary is created by extracting various feature quantities for each recognition target and then analyzing the distribution data of each category. To save analysis effort and to keep the analyst's subjectivity out of the result, it is desirable to create the recognition dictionary automatically by some standardized method. As the structure of the recognition dictionary, the one that realizes identification by a decision tree with emphasis on processing speed (the decision-tree structure) is the most promising. When creating a decision-tree dictionary, however, the question that always arises is which feature quantity to use at each node of the tree. That is, in order to divide the category candidates at each node into several small groups of candidates, the various feature quantities must be evaluated and ranked (the ranking of feature quantities matters for dictionaries of other structures as well, but a decision-tree dictionary is assumed here).
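For concreteness, a node of the decision-tree dictionary discussed here can be pictured as follows. This is a minimal sketch; the representation and names are our illustration, not the patent's.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class Node:
    """One node of a decision-tree recognition dictionary: it measures a
    single feature quantity and routes the input pattern to a smaller
    group of category candidates."""
    candidates: List[str]          # category codes still possible at this node
    feature: Optional[str] = None  # feature measured at this node, e.g. "F1"
    children: Dict[str, "Node"] = field(default_factory=dict)  # range label -> subtree
```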

Conventional feature-ranking methods include the separability method (for example, Junpei Tsujiuchi, "Applied Image Analysis", Kyoritsu Shuppan), the stability-coefficient method (for example, Japanese Patent Application Laid-Open No. 57-25078), and the variance-ratio method (for example, Otsu, "Mathematical Studies on Feature Extraction in Pattern Recognition", Electrotechnical Laboratory Research Report No. 818). Each is briefly explained below.

Suppose that the category group we are trying to divide (identify) is Ca-Cd, as shown in Fig. 1, and that the feature quantities prepared for it are F1-F3. Suppose further that data are collected a number of times for each of F1, F2, and F3 and that the distributions shown in Fig. 1 are obtained. Here a feature quantity is, in the case of a circular part for example, its outer perimeter or the like. Because of variations in the amount of light falling on the circular object, that is, in brightness and so on, distribution curves such as those shown in Fig. 1(a)-(c) are obtained.

(1) Separability method: An evaluation value called the separability is computed for each of F1-F3, and the feature quantity with the largest separability is used for the category group of Fig. 1. The separability is expressed by equation (1):

S_k = (μ_{k+1} − μ_k) / (σ_k + σ_{k+1}) ...... (1)

where μ_k is the mean of category Ck for feature Fi; μ_{k+1} is the mean of the category Ck+1 adjacent to Ck (the category with the next larger mean); σ_k is the standard deviation of category Ck for feature Fi; and σ_{k+1} is the standard deviation of the adjacent category Ck+1.

(2) Stability-coefficient method: The stability coefficient is computed for each of F1-F3, and feature quantities with larger values are given priority. The stability coefficient is expressed by equation (2), in which μ_k, μ_{k+1}, σ_k, and σ_{k+1} are all the same as for the separability.

(3) Variance-ratio method: The variance ratio is computed for each of F1-F3, and feature quantities with larger values are given priority. The variance ratio is

F = { Σ_k (μ_k − μ)² / K } / { Σ_k σ_k² / K } ...... (3)

where μ_k is the mean of category k, σ_k² is the square of the standard deviation of category k, K is the number of categories being divided, and μ is the mean over all categories.

When features F1-F3 are evaluated by the above methods, the results are as shown in the figure. According to them, although F3 can divide the categories into four (F3 is the most effective), methods (1), (2), and (3) all give priority to (select) F1 or F2. This is because (1) and (2) become large when a single category lies extremely far from the distribution group of the other categories, as Cd does for F1, and (3) becomes large when all the categories split broadly into two groups, as for F2.
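As a rough illustration of how these conventional scores behave on distributions summarized by mean and standard deviation, here is a minimal sketch of the separability of equation (1) and the variance ratio of equation (3). The function names, the aggregation of the per-pair separability by its maximum, and the toy numbers are our assumptions, not the patent's.

```python
# Minimal sketch of two of the conventional ranking scores; each category
# is summarized as a (mean, standard deviation) pair.

def separability(cats):
    """Equation (1): (mu_{k+1} - mu_k) / (sigma_k + sigma_{k+1}) between
    categories adjacent in mean value; aggregated here by the maximum."""
    cats = sorted(cats)  # order categories by mean
    return max((m2 - m1) / (s1 + s2)
               for (m1, s1), (m2, s2) in zip(cats, cats[1:]))

def variance_ratio(cats):
    """Equation (3): between-category variance of the means divided by
    the mean within-category variance."""
    k = len(cats)
    mu = sum(m for m, _ in cats) / k
    between = sum((m - mu) ** 2 for m, _ in cats) / k
    within = sum(s ** 2 for _, s in cats) / k
    return between / within

# Toy data in the spirit of Fig. 1: on F1 one category (like Cd) lies far
# from the rest, which inflates both scores without helping to divide the
# remaining three categories from one another.
f1 = [(10.0, 1.0), (11.0, 1.0), (12.0, 1.0), (40.0, 1.0)]
print(separability(f1), variance_ratio(f1))
```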

When a decision-tree dictionary is created with each of these methods, the results are as shown in Fig. 2(a), (b), and (c). With the separability and the stability coefficient, ranking the features first for Ca, Cb, Cc, and Cd makes F1 dominant. On the F1 axis the only safe split is between Cc and Cd, so the categories can be divided into the groups {Ca, Cb, Cc} and {Cd}. Next, ranking the features for Ca, Cb, and Cc makes F2 dominant. Continuing in the same way yields the tree shown in the figure (and similarly for the variance ratio). Since the recognition processing time is proportional to the number of feature quantities used to output one recognition result, that number must be kept as small as possible. It is therefore better to recognize with F3 alone, as in Fig. 2(c). In other words, because the conventional methods do not properly represent the dividing performance of a feature quantity, they cannot rank features stably; the result is an inefficient decision-tree dictionary and, consequently, a long recognition processing time.

[Object of the Invention]

It is an object of the present invention to provide, as a solution to the feature-ranking problem that arises when creating a decision-tree recognition dictionary, a method of evaluating and determining the feature quantity with the best dividing performance for the target categories, whatever their distribution state.

[Summary of the Invention]

The present invention obtains, for a plurality of categories corresponding to figures to be identified, a frequency distribution of each of a plurality of feature quantities; for each feature quantity it calculates the discrete distribution number, that is, the number of combinations of two categories whose distributions do not mutually interfere; it then selects a feature quantity whose discrete distribution number is as large as possible and adopts it as the feature quantity for figure identification.

[Embodiments of the Invention]

First, the underlying idea. The worth of a feature quantity lies in how many of the categories under recognition it can divide when it is used. This is explained with Fig. 3.

Suppose that the categories 2 to be identified at an arbitrary node 1 of the decision tree are Ca-Cg, and that using a certain feature quantity 3 divides them into three groups: one group of Ca, Cb, and Cc (4), one group of Cd and Ce (5), and one group of Ce, Cf, and Cg (6).

The content of the division is as follows.

(1) Ca was distinguished (divided) from Cd, Ce, Cf, and Cg.

(2) Cb was distinguished (divided) from Cd, Ce, Cf, and Cg.

(3) Cc was distinguished (divided) from Ce, Cf, and Cg.

(4) Cd was distinguished (divided) from Cf and Cg (it was also divided from Ca and Cb, but those pairs duplicate (1) and (2)).

(5) Ce was not newly distinguished (it was divided from Ca, Cb, and Cc, but those pairs duplicate the ones above).

(6) Cf and Cg were not distinguished from each other (they were divided from Ca, Cb, Cc, and Cd, but those pairs duplicate the ones above).

That is, the division of categories by a feature quantity is composed of many combinations of divisions of two categories. We therefore decided to rank the feature quantities by the total number D of combinations of two categories that are divided. Computing this total for the example of Fig. 1 gives D = 3 for F1, D = 4 for F2, and D = 6 for F3 (the maximum); that is, F3 is the most effective feature quantity. This is not an event peculiar to the distribution example of Fig. 1: however intricately the target categories are distributed, the dividing performance and this total number of combinations (hereinafter called the discrete distribution number) correspond properly. The present invention was born from this new observation.
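The counting itself reduces to testing every unordered pair of categories for overlap. A minimal sketch under a simplifying assumption, namely that each category's observations on one feature axis are summarized by an interval [lo, hi]; the names and numbers are ours:

```python
from itertools import combinations

def count_discrete_pairs(ranges):
    """Count the unordered pairs of categories whose feature-value
    intervals do not interfere (do not overlap)."""
    return sum(1 for (lo1, hi1), (lo2, hi2) in combinations(ranges, 2)
               if hi1 < lo2 or hi2 < lo1)

# Toy intervals in the spirit of Fig. 1: on F3 all four categories
# separate, so D = C(4,2) = 6, the maximum; each overlapping pair of
# categories would remove one combination from the count.
f3 = [(0, 1), (2, 3), (4, 5), (6, 7)]
print(count_discrete_pairs(f3))  # -> 6
```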

An embodiment of the present invention will now be described with reference to Figs. 4 to 6.

Consider the example of recognizing the binary patterns of categories Ca-Cf shown in Fig. 4(a), (b), and (c). Feature quantities F1-Fk are computed a number of times for Ca-Cf, and the frequency distributions of Fig. 4(b) and (c) are obtained. For convenience, F1 is taken here as the perimeter of the outline of the binary pattern and F2 as the perimeter of the hole; F3-Fk are not discussed. The feature-ranking method is explained with this example. That is, Fig. 4(a) shows the figures to be identified (something like circular parts with a hole); of the many conceivable feature quantities, the outline perimeter and the hole perimeter are taken as F1 and F2, respectively.

Fig. 5 outlines the feature-ranking method, which consists of the following sections: a divided-category code-set storage section 7, which stores the code set of the categories to be divided; a discrete-distribution-number calculation section 8 for feature F1, which calculates the discrete distribution number of those categories on the F1 axis (likewise, 9-11 are the discrete-distribution-number calculation sections on the other feature axes); a discrete-distribution-number comparison section 12, which arranges the discrete distribution numbers of the feature axes in descending order and outputs the sequence of feature names (codes) in that order; and a feature-name-sequence storage section 13, which stores the ranking result.

The operation of each section is as follows. For the categories recorded in the code-set storage section 7 (initially Ca-Cf), the discrete-distribution-number calculation sections 8-11 for features F1-Fk each calculate a discrete distribution number and output its value to the comparison section 12. The comparison section 12 arranges the discrete distribution numbers in descending order, creates the feature-name sequence in the same order, and stores it in the feature-name-sequence storage section 13. Here the discrete distribution number is a newly devised concept: the total number of combinations of two categories whose frequency distributions (probability density distributions) do not interfere with each other. How the discrete distribution number is obtained is explained with Fig. 6. Each discrete-distribution-number calculation section 8, 9, 10, 11, ... consists of the following sections.
A feature-distribution-data storage section 14 stores the feature distribution data of each category (the mean and standard deviation of the feature for each category). A category-combination management section 15 generates, without duplication, combinations of two categories from the categories (Ca-Cf) held in the code-set storage section 7 and outputs the distribution data of those two categories. An inter-distribution distance calculation section 16 obtains the distance between the distributions of the two categories from the distribution data output by the management section 15. An appearance-range parameter storage section 17 stores the parameter used in obtaining the inter-distribution distance. A discrete decision section 18 judges from the result of the distance calculation section 16 whether the two categories are discrete. A discrete-decision parameter storage section 19 stores the threshold Dw that the decision section 18 uses for the discreteness judgment. A discrete-distribution-number counter 20 counts, one by one, the number of times the decision section judges two categories to be discrete.

The operation of each section is as follows (the explanation follows the internal operation of the discrete-distribution-number calculation section 9 for feature F2). For the feature distribution data 14 of each category, stored in a form compressed to a mean and a standard deviation, the category-combination management section 15 generates a combination of two categories from Ca-Cf without duplication (for example, Cf and Cc of Fig. 4) and transfers the distribution data (means and standard deviations) of the two categories to the inter-distribution distance calculation section 16. Thereafter, on each start command from the discrete decision section, the combination management section 15 repeats this operation until all combinations of categories have been output.

The inter-distribution distance calculation section 16 computes, from the means and standard deviations of the two categories transferred from the combination management section, the inter-distribution distance expressed by the following equation:

D = (θ_k1 − θ_l2) / (θ_k2 − θ_l1) ...... (4)

where θ_k1 and θ_k2 are the lower and upper limits of the appearance range of the category with the larger mean of the two (θ_c1 and θ_c2 in Fig. 4(c)), and θ_l1 and θ_l2 are the lower and upper limits of the appearance range of the category with the smaller mean (θ_f1 and θ_f2 in Fig. 4(c)). The lower and upper limits θ_1 and θ_2 of the appearance range of each category are

θ_1, θ_2 = mean ± α × standard deviation ...... (5)

where α is supplied from the appearance-range parameter storage section 17. When a feature quantity is normally distributed, setting α = 3 covers 99.7% of the appearances of a category.

Next, in the discrete decision section 18, if the inter-distribution distance output by the distance calculation section 16 is larger than the threshold Dw supplied from the discrete-decision parameter storage section 19, the two categories are regarded as discrete: 1 is added to the discrete-distribution-number counter 20, the category-combination management section 15 is started again, and the same processing is performed for the next combination of categories. If the inter-distribution distance is smaller than the threshold Dw, the two categories are regarded as not discrete, that is, not divisible; no further processing is performed on them, the management section 15 is restarted, and the same processing is performed for the next combination of categories.
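Putting the sections of Fig. 6 together in code: the sketch below computes the appearance ranges of equation (5), the inter-distribution distance of equation (4) read as the gap between the two ranges normalized by their total span, and the counting loop of sections 15, 18, and 20. The identifiers and this exact reading of equation (4) are our assumptions.

```python
from itertools import combinations

def appearance_range(mean, sd, alpha):
    """Equation (5): theta_1, theta_2 = mean -/+ alpha * standard deviation."""
    return mean - alpha * sd, mean + alpha * sd

def distribution_distance(cat_a, cat_b, alpha):
    """Equation (4), read as the gap between the two appearance ranges
    normalized by their total span; positive iff the ranges are disjoint."""
    lo, hi = sorted((cat_a, cat_b))  # hi is the larger-mean category
    th_l1, th_l2 = appearance_range(*lo, alpha)
    th_k1, th_k2 = appearance_range(*hi, alpha)
    return (th_k1 - th_l2) / (th_k2 - th_l1)

def discrete_distribution_number(cats, alpha=3.0, dw=0.0):
    """Sections 15, 16, 18 and 20 of Fig. 6 in one loop: count the
    category pairs whose inter-distribution distance exceeds Dw."""
    return sum(1 for a, b in combinations(cats, 2)
               if distribution_distance(a, b, alpha) > dw)
```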

When the example of Fig. 4 is processed with α = 3 and Dw = 0, the discrete category combinations for F1 are (Ca, Cd), (Ca, Ce), (Ca, Cf), (Cb, Cd), (Cb, Ce), (Cb, Cf), (Cc, Cd), (Cc, Ce), and (Cc, Cf), so the discrete distribution number is 9; for F2 it is 12. F2 is therefore ranked ahead of F1; that is, it is better to use feature quantity F2 than F1.
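The comparison section 12 is then just a descending sort by this count. A usage sketch with made-up (mean, standard deviation) data standing in for Fig. 4 (the patent's actual measurements are not reproduced in the text), reusing discrete_distribution_number from the sketch above:

```python
# Made-up distribution data for two feature axes; six categories each.
features = {
    "F1": [(10, 1), (11, 1), (12, 1), (30, 1), (33, 1), (36, 1)],
    "F2": [(5, 0.5), (8, 0.5), (11, 0.5), (14, 0.5), (17, 0.5), (17.5, 0.5)],
}
ranking = sorted(features,
                 key=lambda f: discrete_distribution_number(features[f]),
                 reverse=True)
print(ranking)  # -> ['F2', 'F1']: F2 separates more category pairs here
```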

In this embodiment, the discreteness judgment between categories needed to obtain the discrete distribution number is made with the inter-distribution distance described above. This has the following effects.

(1) The inter-distribution distance is obtained by an extremely simple calculation from the means and standard deviations of the two categories and the appearance-range parameter α.

(2) By suitably varying the appearance-range parameter α, which determines the width of a category's appearance range, and the discrete-decision parameter Dw, which is used as the threshold of the discreteness judgment, the risk accepted when dividing can easily be adjusted to the purpose, so a flexible system can be realized. For example, when one category contains only completely identical parts, as with industrial parts, the scatter of the computed feature values is related only to random errors such as unevenness of illumination and lenses, so data sampled several times are almost normally distributed. In such a case α = 3 and Dw = 0 may be used (provided a risk rate of 0.3% is acceptable). In cases with large deformation, however, such as sorting fruit or fish species, the sampled data are not necessarily normally distributed and may not contain enough of the deformed variants, so it is better to make α and Dw larger (for example, α = 4 and Dw = 0.3).

The embodiment above described a ranking method for obtaining feature quantities effective for classification, where there are several recognition targets as in Fig. 4 and the recognizer outputs which of them a given input pattern is. By slightly changing the operation of the category-combination management section 15 of Fig. 6, however, a ranking method can be realized that obtains feature quantities effective for distinguishing one good-product category from several defective-product categories, as in defect inspection.

Suppose now, as in Fig. 1, that category Cb is the good product and the others are defective. Taking tablet inspection as an example, Ca is a chip, Cc is a speck (a tablet with a small particle adhering to it), and Cd is sticking (two tablets stuck to each other); F1 is taken as the area, F2 as the perimeter, and F3 as the shape factor (perimeter² / area). In this case the recognition output is one of only two results, good or defective, so a good feature quantity is one that can distinguish the good Cb from the defective Ca, Cc, and Cd; that is, the more categories that are discrete from the good product (Cb), the better. The ranking method is therefore easily realized by changing the operation of the category-combination management section so that it generates, in turn, the combinations of the good product (Cb) with each of the other categories (Ca, Cc, Cd) and outputs the distribution data of the two. For example, in Fig. 1 the discrete distribution number of F1 is 1, that of F2 is 2, and that of F3 is 3, so F3 can be judged the most effective. By this method, a feature-ranking method for creating a recognition dictionary is easily provided for defect inspection, in which many defective categories exist for one good category, and the feature ranking that is output properly reflects the quality of the dividing performance.
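The change is confined to pair generation: instead of emitting all C(m, 2) combinations, the management section pairs the single good category with each defective category. A sketch of that variant, reusing distribution_distance from the earlier sketch; the tablet numbers are invented:

```python
def discrete_from_good(good, defectives, alpha=3.0, dw=0.0):
    """Variant of the category-combination management section 15: pair
    the good category only with each defective category and count the
    pairs judged discrete."""
    return sum(1 for d in defectives
               if distribution_distance(good, d, alpha) > dw)

# Invented tablet-inspection data: Cb is the good product; the others
# stand for chipping, speck and sticking.
cb = (50, 1)
defects = [(40, 1), (58, 1), (75, 1)]
print(discrete_from_good(cb, defects))  # -> 3: all defect classes separable
```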

According to the present invention, which uses the discrete distribution number as the feature-evaluation measure for ranking, the number of stages needed to obtain one recognition result can be reduced compared with selecting the features of a decision-tree recognition dictionary by the conventional separability method and the like, so the recognition time is shortened substantially. Consider, for example, creating a decision-tree dictionary for the example of Fig. 1. The conventional methods assign to each node the feature with the largest separability or similar score, so two or more feature quantities are used to produce one recognition output, as in Fig. 2; with the discrete-distribution-number method of the present invention, the number of features used is one, as shown in the same figure. Since the computation time of a feature quantity is very large, about 20 milliseconds per feature, the recognition processing time can be regarded as determined almost entirely by the number of features; this method therefore makes it possible to output a recognition result in half or less of the processing time of the conventional methods.

[Effects of the Invention]

According to the present invention, the optimum feature quantities can be evaluated and selected for a given plurality of categories, and identification can be performed.

[Brief Description of the Drawings]

Fig. 1 shows examples of feature distributions and the results of conventional feature ranking; Fig. 2 shows examples of decision-tree recognition dictionaries created with the conventional methods and with the method of the present invention; Fig. 3 illustrates the observation from which the present invention arose; Fig. 4 shows the feature distributions used to explain the invention in detail; Fig. 5 is a conceptual diagram of an embodiment of the invention; Fig. 6 is a conceptual diagram of the core of the embodiment.

7: divided-category code set; 8-11: discrete-distribution-number calculation sections for the features; 12: discrete-distribution-number comparison section; 13: feature-ranking sequence; 14: distribution data of each category; 15: category-combination management section; 16: inter-distribution distance calculation section; 17: appearance-range parameter storage section; 18: discrete decision section; 19: discrete-decision parameter storage section.

Claims (1)

1. A feature-quantity evaluating method in figure recognition, the feature quantities serving as identification indices in pattern recognition, characterized by: obtaining a frequency distribution of each of a plurality of feature quantities (n) for a plurality of categories (m) corresponding to figures to be identified; calculating, for each of the feature quantities (n), a discrete distribution number that is the number of combinations of two categories whose distributions, among those obtained for the categories (m), do not mutually interfere; and selecting, from among the feature quantities (n), a feature quantity whose calculated discrete distribution number is as large as possible.
JP58248152A 1983-12-29 1983-12-29 Feature amount evaluating method in graphic recognition Granted JPS60142788A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP58248152A JPS60142788A (en) 1983-12-29 1983-12-29 Feature amount evaluating method in graphic recognition
US06/687,757 US4658429A (en) 1983-12-29 1984-12-31 System and method for preparing a recognition dictionary

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP58248152A JPS60142788A (en) 1983-12-29 1983-12-29 Feature amount evaluating method in graphic recognition

Publications (2)

Publication Number Publication Date
JPS60142788A 1985-07-27
JPH027115B2 JPH027115B2 (en) 1990-02-15

Family

ID=17173989

Family Applications (1)

Application Number Title Priority Date Filing Date
JP58248152A Granted JPS60142788A (en) 1983-12-29 1983-12-29 Feature amount evaluating method in graphic recognition

Country Status (1)

Country Link
JP (1) JPS60142788A (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0593924U (en) * 1992-05-23 1993-12-21 株式会社東洋工機 Air outlet device


Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS592187A (en) * 1982-06-29 1984-01-07 Fujitsu Ltd Object recognizing device

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10160508A (en) * 1996-11-26 1998-06-19 Omron Corp Situation discriminating apparatus
WO2008026414A1 (en) * 2006-08-31 2008-03-06 Osaka Prefecture University Public Corporation Image recognition method, image recognition device, and image recognition program
JP4883649B2 (en) * 2006-08-31 2012-02-22 公立大学法人大阪府立大学 Image recognition method, image recognition apparatus, and image recognition program
US8199973B2 (en) 2006-08-31 2012-06-12 Osaka Prefecture University Public Corporation Image recognition method, image recognition device, and image recognition program
JP2010134927A (en) * 2008-12-03 2010-06-17 Ind Technol Res Inst Monitoring method and monitoring device using hierarchical appearance model

Also Published As

Publication number Publication date
JPH027115B2 (en) 1990-02-15
