JPS60142788A - Feature amount evaluating method in graphic recognition - Google Patents

Feature amount evaluating method in graphic recognition

Info

Publication number
JPS60142788A
JPS60142788A
Authority
JP
Japan
Prior art keywords
category
categories
distribution
discrete
feature
Prior art date
Legal status
Granted
Application number
JP58248152A
Other languages
Japanese (ja)
Other versions
JPH027115B2 (en)
Inventor
Miyahiko Orita (折田 三弥彦)
Yoshiki Kobayashi (小林 芳樹)
Yutaka Kubo (久保 裕)
Current Assignee
Hitachi Ltd
Original Assignee
Hitachi Ltd
Priority date
Filing date
Publication date
Application filed by Hitachi Ltd filed Critical Hitachi Ltd
Priority to JP58248152A (JPS60142788A)
Priority to US06/687,757 (US4658429A)
Publication of JPS60142788A
Publication of JPH027115B2
Status: Granted


Abstract

PURPOSE: To attain identification by evaluating and selecting proper feature quantities for plural categories, applying to them the evaluation measure with the best dividing performance (the discrete distribution number) whatever the distribution state of the object categories.

CONSTITUTION: The evaluation follows the internal operation of a discrete-distribution-number calculating section 9 for feature quantity F2. For the feature distribution data 14 of each category, stored in a form compressed to a mean value and a standard deviation, a category-combination management section 15 generates combinations of two categories from Ca-Cf without duplication and transfers the distribution data (mean and standard deviation) of the two categories to an inter-distribution distance calculating section 16. On each start command from a discrete decision section, the management section 15 repeats the operation until all combinations of categories have been output. In this way proper feature quantities are evaluated and selected, and identification is attained for the given plural categories.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of Application of the Invention]

The present invention relates to figure recognition, and more particularly to a method for evaluating the feature quantities of figures to be recognized.

[Background of the Invention]

To identify and recognize objects, a recognition dictionary is created and recognition is performed through a decision-tree structure; a recognition target is then identified according to the recognition dictionary prepared in advance.

The recognition dictionary is created by extracting various feature quantities for each recognition target and then analyzing the distribution data of each category. To save analysis effort and to keep the analyst's subjectivity out of the result, it is desirable to create the recognition dictionary automatically by some standardized method. As the structure of the recognition dictionary, the one that realizes identification by a decision tree with emphasis on processing speed (the decision-tree structure) is the most promising. When creating a decision-tree dictionary, however, the question that always arises is which feature quantity to use at each node of the tree. That is, in order to divide the category candidates at each node into several small groups of candidates, the various feature quantities must be evaluated and ranked (the ranking of feature quantities matters for dictionaries of other structures as well, but a decision-tree dictionary is assumed here).
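For concreteness, a node of the decision-tree dictionary discussed here can be pictured as follows. This is a minimal sketch; the representation and names are our illustration, not the patent's.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class Node:
    """One node of a decision-tree recognition dictionary: it measures a
    single feature quantity and routes the input pattern to a smaller
    group of category candidates."""
    candidates: List[str]          # category codes still possible at this node
    feature: Optional[str] = None  # feature measured at this node, e.g. "F1"
    children: Dict[str, "Node"] = field(default_factory=dict)  # range label -> subtree
```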

Conventional feature-ranking methods include the separability method (for example, Junpei Tsujiuchi, "Applied Image Analysis", Kyoritsu Shuppan), the stability-coefficient method (for example, Japanese Patent Application Laid-Open No. 57-25078), and the variance-ratio method (for example, Otsu, "Mathematical Studies on Feature Extraction in Pattern Recognition", Electrotechnical Laboratory Research Report No. 818). Each is briefly explained below.

Suppose that the category group we are trying to divide (identify) is Ca-Cd, as shown in Fig. 1, and that the feature quantities prepared for it are F1-F3. Suppose further that data are collected a number of times for each of F1, F2, and F3 and that the distributions shown in Fig. 1 are obtained. Here a feature quantity is, in the case of a circular part for example, its outer perimeter or the like. Because of variations in the amount of light falling on the circular object, that is, in brightness and so on, distribution curves such as those shown in Fig. 1(a)-(c) are obtained.

(1) Separability method: An evaluation value called the separability is computed for each of F1-F3, and the feature quantity with the largest separability is used for the category group of Fig. 1. The separability is expressed by equation (1):

S_k = (μ_{k+1} − μ_k) / (σ_k + σ_{k+1}) ...... (1)

where μ_k is the mean of category Ck for feature Fi; μ_{k+1} is the mean of the category Ck+1 adjacent to Ck (the category with the next larger mean); σ_k is the standard deviation of category Ck for feature Fi; and σ_{k+1} is the standard deviation of the adjacent category Ck+1.

(2) Stability-coefficient method: The stability coefficient is computed for each of F1-F3, and feature quantities with larger values are given priority. The stability coefficient is expressed by equation (2), in which μ_k, μ_{k+1}, σ_k, and σ_{k+1} are all the same as for the separability.

(3) Variance-ratio method: The variance ratio is computed for each of F1-F3, and feature quantities with larger values are given priority. The variance ratio is

F = { Σ_k (μ_k − μ)² / K } / { Σ_k σ_k² / K } ...... (3)

where μ_k is the mean of category k, σ_k² is the square of the standard deviation of category k, K is the number of categories being divided, and μ is the mean over all categories.

When features F1-F3 are evaluated by the above methods, the results are as shown in the figure. According to them, although F3 can divide the categories into four (F3 is the most effective), methods (1), (2), and (3) all give priority to (select) F1 or F2. This is because (1) and (2) become large when a single category lies extremely far from the distribution group of the other categories, as Cd does for F1, and (3) becomes large when all the categories split broadly into two groups, as for F2.
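As a rough illustration of how these conventional scores behave on distributions summarized by mean and standard deviation, here is a minimal sketch of the separability of equation (1) and the variance ratio of equation (3). The function names, the aggregation of the per-pair separability by its maximum, and the toy numbers are our assumptions, not the patent's.

```python
# Minimal sketch of two of the conventional ranking scores; each category
# is summarized as a (mean, standard deviation) pair.

def separability(cats):
    """Equation (1): (mu_{k+1} - mu_k) / (sigma_k + sigma_{k+1}) between
    categories adjacent in mean value; aggregated here by the maximum."""
    cats = sorted(cats)  # order categories by mean
    return max((m2 - m1) / (s1 + s2)
               for (m1, s1), (m2, s2) in zip(cats, cats[1:]))

def variance_ratio(cats):
    """Equation (3): between-category variance of the means divided by
    the mean within-category variance."""
    k = len(cats)
    mu = sum(m for m, _ in cats) / k
    between = sum((m - mu) ** 2 for m, _ in cats) / k
    within = sum(s ** 2 for _, s in cats) / k
    return between / within

# Toy data in the spirit of Fig. 1: on F1 one category (like Cd) lies far
# from the rest, which inflates both scores without helping to divide the
# remaining three categories from one another.
f1 = [(10.0, 1.0), (11.0, 1.0), (12.0, 1.0), (40.0, 1.0)]
print(separability(f1), variance_ratio(f1))
```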

When a decision-tree dictionary is created with each of these methods, the results are as shown in Fig. 2(a), (b), and (c). With the separability and the stability coefficient, ranking the features first for Ca, Cb, Cc, and Cd makes F1 dominant. On the F1 axis the only safe split is between Cc and Cd, so the categories can be divided into the groups {Ca, Cb, Cc} and {Cd}. Next, ranking the features for Ca, Cb, and Cc makes F2 dominant. Continuing in the same way yields the tree shown in the figure (and similarly for the variance ratio). Since the recognition processing time is proportional to the number of feature quantities used to output one recognition result, that number must be kept as small as possible. It is therefore better to recognize with F3 alone, as in Fig. 2(c). In other words, because the conventional methods do not properly represent the dividing performance of a feature quantity, they cannot rank features stably; the result is an inefficient decision-tree dictionary and, consequently, a long recognition processing time.

[Object of the Invention]

It is an object of the present invention to provide, as a solution to the feature-ranking problem that arises when creating a decision-tree recognition dictionary, a method of evaluating and determining the feature quantity with the best dividing performance for the target categories, whatever their distribution state.

[Summary of the Invention]

The present invention obtains, for a plurality of categories corresponding to figures to be identified, a frequency distribution of each of a plurality of feature quantities; for each feature quantity it calculates the discrete distribution number, that is, the number of combinations of two categories whose distributions do not mutually interfere; it then selects a feature quantity whose discrete distribution number is as large as possible and adopts it as the feature quantity for figure identification.

[Embodiments of the Invention]

First, the underlying idea. The worth of a feature quantity lies in how many of the categories under recognition it can divide when it is used. This is explained with Fig. 3.

Suppose that the categories 2 to be identified at an arbitrary node 1 of the decision tree are Ca-Cg, and that using a certain feature quantity 3 divides them into three groups: one group of Ca, Cb, and Cc (4), one group of Cd and Ce (5), and one group of Ce, Cf, and Cg (6).

The content of the division is as follows.

(1) Ca was distinguished (divided) from Cd, Ce, Cf, and Cg.

(2) Cb was distinguished (divided) from Cd, Ce, Cf, and Cg.

(3) Cc was distinguished (divided) from Ce, Cf, and Cg.

(4) Cd was distinguished (divided) from Cf and Cg (it was also divided from Ca and Cb, but those pairs duplicate (1) and (2)).

(5) Ce was not newly distinguished (it was divided from Ca, Cb, and Cc, but those pairs duplicate the ones above).

(6) Cf and Cg were not distinguished from each other (they were divided from Ca, Cb, Cc, and Cd, but those pairs duplicate the ones above).

That is, the division of categories by a feature quantity is composed of many combinations of divisions of two categories. We therefore decided to rank the feature quantities by the total number D of combinations of two categories that are divided. Computing this total for the example of Fig. 1 gives D = 3 for F1, D = 4 for F2, and D = 6 for F3 (the maximum); that is, F3 is the most effective feature quantity. This is not an event peculiar to the distribution example of Fig. 1: however intricately the target categories are distributed, the dividing performance and this total number of combinations (hereinafter called the discrete distribution number) correspond properly. The present invention was born from this new observation.
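The counting itself reduces to testing every unordered pair of categories for overlap. A minimal sketch under a simplifying assumption, namely that each category's observations on one feature axis are summarized by an interval [lo, hi]; the names and numbers are ours:

```python
from itertools import combinations

def count_discrete_pairs(ranges):
    """Count the unordered pairs of categories whose feature-value
    intervals do not interfere (do not overlap)."""
    return sum(1 for (lo1, hi1), (lo2, hi2) in combinations(ranges, 2)
               if hi1 < lo2 or hi2 < lo1)

# Toy intervals in the spirit of Fig. 1: on F3 all four categories
# separate, so D = C(4,2) = 6, the maximum; each overlapping pair of
# categories would remove one combination from the count.
f3 = [(0, 1), (2, 3), (4, 5), (6, 7)]
print(count_discrete_pairs(f3))  # -> 6
```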

An embodiment of the present invention will now be described with reference to Figs. 4 to 6.

Consider the example of recognizing the binary patterns of categories Ca-Cf shown in Fig. 4(a), (b), and (c). Feature quantities F1-Fk are computed a number of times for Ca-Cf, and the frequency distributions of Fig. 4(b) and (c) are obtained. For convenience, F1 is taken here as the perimeter of the outline of the binary pattern and F2 as the perimeter of the hole; F3-Fk are not discussed. The feature-ranking method is explained with this example. That is, Fig. 4(a) shows the figures to be identified (something like circular parts with a hole); of the many conceivable feature quantities, the outline perimeter and the hole perimeter are taken as F1 and F2, respectively.

Fig. 5 outlines the feature-ranking method, which consists of the following sections: a divided-category code-set storage section 7, which stores the code set of the categories to be divided; a discrete-distribution-number calculation section 8 for feature F1, which calculates the discrete distribution number of those categories on the F1 axis (likewise, 9-11 are the discrete-distribution-number calculation sections on the other feature axes); a discrete-distribution-number comparison section 12, which arranges the discrete distribution numbers of the feature axes in descending order and outputs the sequence of feature names (codes) in that order; and a feature-name-sequence storage section 13, which stores the ranking result.

The operation of each section is as follows. For the categories recorded in the code-set storage section 7 (initially Ca-Cf), the discrete-distribution-number calculation sections 8-11 for features F1-Fk each calculate a discrete distribution number and output its value to the comparison section 12. The comparison section 12 arranges the discrete distribution numbers in descending order, creates the feature-name sequence in the same order, and stores it in the feature-name-sequence storage section 13. Here the discrete distribution number is a newly devised concept: the total number of combinations of two categories whose frequency distributions (probability density distributions) do not interfere with each other. How the discrete distribution number is obtained is explained with Fig. 6. Each discrete-distribution-number calculation section 8, 9, 10, 11, ... consists of the following sections.
A feature-distribution-data storage section 14 stores the feature distribution data of each category (the mean and standard deviation of the feature for each category). A category-combination management section 15 generates, without duplication, combinations of two categories from the categories (Ca-Cf) held in the code-set storage section 7 and outputs the distribution data of those two categories. An inter-distribution distance calculation section 16 obtains the distance between the distributions of the two categories from the distribution data output by the management section 15. An appearance-range parameter storage section 17 stores the parameter used in obtaining the inter-distribution distance. A discrete decision section 18 judges from the result of the distance calculation section 16 whether the two categories are discrete. A discrete-decision parameter storage section 19 stores the threshold Dw that the decision section 18 uses for the discreteness judgment. A discrete-distribution-number counter 20 counts, one by one, the number of times the decision section judges two categories to be discrete.

The operation of each section is as follows (the explanation follows the internal operation of the discrete-distribution-number calculation section 9 for feature F2). For the feature distribution data 14 of each category, stored in a form compressed to a mean and a standard deviation, the category-combination management section 15 generates a combination of two categories from Ca-Cf without duplication (for example, Cf and Cc of Fig. 4) and transfers the distribution data (means and standard deviations) of the two categories to the inter-distribution distance calculation section 16. Thereafter, on each start command from the discrete decision section, the combination management section 15 repeats this operation until all combinations of categories have been output.

The inter-distribution distance calculation section 16 computes, from the means and standard deviations of the two categories transferred from the combination management section, the inter-distribution distance expressed by the following equation:

D = (θ_k1 − θ_l2) / (θ_k2 − θ_l1) ...... (4)

where θ_k1 and θ_k2 are the lower and upper limits of the appearance range of the category with the larger mean of the two (θ_c1 and θ_c2 in Fig. 4(c)), and θ_l1 and θ_l2 are the lower and upper limits of the appearance range of the category with the smaller mean (θ_f1 and θ_f2 in Fig. 4(c)). The lower and upper limits θ_1 and θ_2 of the appearance range of each category are

θ_1, θ_2 = mean ± α × standard deviation ...... (5)

where α is supplied from the appearance-range parameter storage section 17. When a feature quantity is normally distributed, setting α = 3 covers 99.7% of the appearances of a category.

Next, in the discrete decision section 18, if the inter-distribution distance output by the distance calculation section 16 is larger than the threshold Dw supplied from the discrete-decision parameter storage section 19, the two categories are regarded as discrete: 1 is added to the discrete-distribution-number counter 20, the category-combination management section 15 is started again, and the same processing is performed for the next combination of categories. If the inter-distribution distance is smaller than the threshold Dw, the two categories are regarded as not discrete, that is, not divisible; no further processing is performed on them, the management section 15 is restarted, and the same processing is performed for the next combination of categories.
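Putting the sections of Fig. 6 together in code: the sketch below computes the appearance ranges of equation (5), the inter-distribution distance of equation (4) read as the gap between the two ranges normalized by their total span, and the counting loop of sections 15, 18, and 20. The identifiers and this exact reading of equation (4) are our assumptions.

```python
from itertools import combinations

def appearance_range(mean, sd, alpha):
    """Equation (5): theta_1, theta_2 = mean -/+ alpha * standard deviation."""
    return mean - alpha * sd, mean + alpha * sd

def distribution_distance(cat_a, cat_b, alpha):
    """Equation (4), read as the gap between the two appearance ranges
    normalized by their total span; positive iff the ranges are disjoint."""
    lo, hi = sorted((cat_a, cat_b))  # hi is the larger-mean category
    th_l1, th_l2 = appearance_range(*lo, alpha)
    th_k1, th_k2 = appearance_range(*hi, alpha)
    return (th_k1 - th_l2) / (th_k2 - th_l1)

def discrete_distribution_number(cats, alpha=3.0, dw=0.0):
    """Sections 15, 16, 18 and 20 of Fig. 6 in one loop: count the
    category pairs whose inter-distribution distance exceeds Dw."""
    return sum(1 for a, b in combinations(cats, 2)
               if distribution_distance(a, b, alpha) > dw)
```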

When the example of Fig. 4 is processed with α = 3 and Dw = 0, the discrete category combinations for F1 are (Ca, Cd), (Ca, Ce), (Ca, Cf), (Cb, Cd), (Cb, Ce), (Cb, Cf), (Cc, Cd), (Cc, Ce), and (Cc, Cf), so the discrete distribution number is 9; for F2 it is 12. F2 is therefore ranked ahead of F1; that is, it is better to use feature quantity F2 than F1.
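The comparison section 12 is then just a descending sort by this count. A usage sketch with made-up (mean, standard deviation) data standing in for Fig. 4 (the patent's actual measurements are not reproduced in the text), reusing discrete_distribution_number from the sketch above:

```python
# Made-up distribution data for two feature axes; six categories each.
features = {
    "F1": [(10, 1), (11, 1), (12, 1), (30, 1), (33, 1), (36, 1)],
    "F2": [(5, 0.5), (8, 0.5), (11, 0.5), (14, 0.5), (17, 0.5), (17.5, 0.5)],
}
ranking = sorted(features,
                 key=lambda f: discrete_distribution_number(features[f]),
                 reverse=True)
print(ranking)  # -> ['F2', 'F1']: F2 separates more category pairs here
```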

In this embodiment, the discreteness judgment between categories needed to obtain the discrete distribution number is made with the inter-distribution distance described above. This has the following effects.

(1) The inter-distribution distance is obtained by an extremely simple calculation from the means and standard deviations of the two categories and the appearance-range parameter α.

(2) By suitably varying the appearance-range parameter α, which determines the width of a category's appearance range, and the discrete-decision parameter Dw, which is used as the threshold of the discreteness judgment, the risk accepted when dividing can easily be adjusted to the purpose, so a flexible system can be realized. For example, when one category contains only completely identical parts, as with industrial parts, the scatter of the computed feature values is related only to random errors such as unevenness of illumination and lenses, so data sampled several times are almost normally distributed. In such a case α = 3 and Dw = 0 may be used (provided a risk rate of 0.3% is acceptable). In cases with large deformation, however, such as sorting fruit or fish species, the sampled data are not necessarily normally distributed and may not contain enough of the deformed variants, so it is better to make α and Dw larger (for example, α = 4 and Dw = 0.3).

The embodiment above described a ranking method for obtaining feature quantities effective for classification, where there are several recognition targets as in Fig. 4 and the recognizer outputs which of them a given input pattern is. By slightly changing the operation of the category-combination management section 15 of Fig. 6, however, a ranking method can be realized that obtains feature quantities effective for distinguishing one good-product category from several defective-product categories, as in defect inspection.

Suppose now, as in Fig. 1, that category Cb is the good product and the others are defective. Taking tablet inspection as an example, Ca is a chip, Cc is a speck (a tablet with a small particle adhering to it), and Cd is sticking (two tablets stuck to each other); F1 is taken as the area, F2 as the perimeter, and F3 as the shape factor (perimeter² / area). In this case the recognition output is one of only two results, good or defective, so a good feature quantity is one that can distinguish the good Cb from the defective Ca, Cc, and Cd; that is, the more categories that are discrete from the good product (Cb), the better. The ranking method is therefore easily realized by changing the operation of the category-combination management section so that it generates, in turn, the combinations of the good product (Cb) with each of the other categories (Ca, Cc, Cd) and outputs the distribution data of the two. For example, in Fig. 1 the discrete distribution number of F1 is 1, that of F2 is 2, and that of F3 is 3, so F3 can be judged the most effective. By this method, a feature-ranking method for creating a recognition dictionary is easily provided for defect inspection, in which many defective categories exist for one good category, and the feature ranking that is output properly reflects the quality of the dividing performance.
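The change is confined to pair generation: instead of emitting all C(m, 2) combinations, the management section pairs the single good category with each defective category. A sketch of that variant, reusing distribution_distance from the earlier sketch; the tablet numbers are invented:

```python
def discrete_from_good(good, defectives, alpha=3.0, dw=0.0):
    """Variant of the category-combination management section 15: pair
    the good category only with each defective category and count the
    pairs judged discrete."""
    return sum(1 for d in defectives
               if distribution_distance(good, d, alpha) > dw)

# Invented tablet-inspection data: Cb is the good product; the others
# stand for chipping, speck and sticking.
cb = (50, 1)
defects = [(40, 1), (58, 1), (75, 1)]
print(discrete_from_good(cb, defects))  # -> 3: all defect classes separable
```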

According to the present invention, which uses the discrete distribution number as the feature-evaluation measure for ranking, the number of stages needed to obtain one recognition result can be reduced compared with selecting the features of a decision-tree recognition dictionary by the conventional separability method and the like, so the recognition time is shortened substantially. Consider, for example, creating a decision-tree dictionary for the example of Fig. 1. The conventional methods assign to each node the feature with the largest separability or similar score, so two or more feature quantities are used to produce one recognition output, as in Fig. 2; with the discrete-distribution-number method of the present invention, the number of features used is one, as shown in the same figure. Since the computation time of a feature quantity is very large, about 20 milliseconds per feature, the recognition processing time can be regarded as determined almost entirely by the number of features; this method therefore makes it possible to output a recognition result in half or less of the processing time of the conventional methods.

[Effects of the Invention]

According to the present invention, the optimum feature quantities can be evaluated and selected for a given plurality of categories, and identification can be performed.

[Brief Description of the Drawings]

Fig. 1 shows examples of feature distributions and the results of conventional feature ranking; Fig. 2 shows examples of decision-tree recognition dictionaries created with the conventional methods and with the method of the present invention; Fig. 3 illustrates the observation from which the present invention arose; Fig. 4 shows the feature distributions used to explain the invention in detail; Fig. 5 is a conceptual diagram of an embodiment of the invention; Fig. 6 is a conceptual diagram of the core of the embodiment.

7: divided-category code set; 8-11: discrete-distribution-number calculation sections for the features; 12: discrete-distribution-number comparison section; 13: feature-ranking sequence; 14: distribution data of each category; 15: category-combination management section; 16: inter-distribution distance calculation section; 17: appearance-range parameter storage section; 18: discrete decision section; 19: discrete-decision parameter storage section.

Claims (1)

1. A feature-quantity evaluating method in figure recognition, the feature quantities serving as identification indices in pattern recognition, characterized by: obtaining a frequency distribution of each of a plurality of feature quantities (n) for a plurality of categories (m) corresponding to figures to be identified; calculating, for each of the feature quantities (n), a discrete distribution number that is the number of combinations of two categories whose distributions, among those obtained for the categories (m), do not mutually interfere; and selecting, from among the feature quantities (n), a feature quantity whose calculated discrete distribution number is as large as possible.
JP58248152A 1983-12-29 1983-12-29 Feature amount evaluating method in graphic recognition Granted JPS60142788A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP58248152A JPS60142788A (en) 1983-12-29 1983-12-29 Feature amount evaluating method in graphic recognition
US06/687,757 US4658429A (en) 1983-12-29 1984-12-31 System and method for preparing a recognition dictionary

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP58248152A JPS60142788A (en) 1983-12-29 1983-12-29 Feature amount evaluating method in graphic recognition

Publications (2)

Publication Number Publication Date
JPS60142788A 1985-07-27
JPH027115B2 JPH027115B2 (en) 1990-02-15

Family

ID=17173989

Family Applications (1)

Application Number Title Priority Date Filing Date
JP58248152A Granted JPS60142788A (en) 1983-12-29 1983-12-29 Feature amount evaluating method in graphic recognition

Country Status (1)

Country Link
JP (1) JPS60142788A (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0593924U (en) * 1992-05-23 1993-12-21 株式会社東洋工機 Air outlet device


Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS592187A (en) * 1982-06-29 1984-01-07 Fujitsu Ltd Object recognizing device

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10160508A (en) * 1996-11-26 1998-06-19 Omron Corp Situation discriminating apparatus
WO2008026414A1 (en) * 2006-08-31 2008-03-06 Osaka Prefecture University Public Corporation Image recognition method, image recognition device, and image recognition program
JP4883649B2 (en) * 2006-08-31 2012-02-22 公立大学法人大阪府立大学 Image recognition method, image recognition apparatus, and image recognition program
US8199973B2 (en) 2006-08-31 2012-06-12 Osaka Prefecture University Public Corporation Image recognition method, image recognition device, and image recognition program
JP2010134927A (en) * 2008-12-03 2010-06-17 Ind Technol Res Inst Monitoring method and monitoring device using hierarchical appearance model

Also Published As

Publication number Publication date
JPH027115B2 (en) 1990-02-15
