JPH08161172A - Knowledge correction type learning system

Info

Publication number: JPH08161172A
Application number: JP6304774A
Authority: JP
Inventors: Megumi Ishii; 恵石井; Yasuhiro Akiba; 泰弘秋葉; Arumoarimu Fusein; アルモアリムフセイン; Shigeo Kaneda; 重郎金田
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1994-12-08
Filing date: 1994-12-08
Publication date: 1996-06-21

Abstract

PURPOSE: To provide a knowledge correction type learning system capable of acquiring good knowledge from a small number of cases by also using existing knowledge (rules, decision trees, etc.). CONSTITUTION: An existing knowledge conversion means 11 generates provisional cases 23 by converting existing knowledge 21 into a unit rule expression consisting only of a product of conditions. The provisional cases 23 and real cases 22 are input to a weight changing means 12, which generates learning cases 24 by assigning a weight to each of the provisional cases 23 and the real cases 22. The learning cases 24 are input to an inductive learning means 13, which generates corrected knowledge 25 expressed as rules or a decision tree. A parameter determining means 14 varies the weights used by the weight changing means 12 and selects, by a cross-validation method, the parameter that gives the best discrimination performance on unknown cases.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a knowledge correction type learning system that acquires knowledge from observed cases while also using existing knowledge, and more specifically to a knowledge correction type learning system that creates rules or decision trees from cases by concept learning.

[0002]

[Prior Art] Learning rules or decision trees that determine the class of future unknown cases from past observed cases is called "concept learning", and various methods have been proposed. Among them, Quinlan's "ID3" is the most representative. A notable feature of ID3 is that its learning time is roughly proportional to the number of cases used for learning (learning cases), and this speed makes it known as a highly practical method. ID3 is briefly described below. For details of ID3, see, for example, "Knowledge Acquisition and Learning Series, Vols. 1-8", translated by the Electrotechnical Laboratory Artificial Intelligence Research Group, Kyoritsu Shuppan.

[0003] ID3 takes as input a set of cases, for example a set of eight cases such as [short, blond, blue; +]. In this case, "short", "blond", and "blue" are attribute values. The example has three attributes: "height", "hair color", and "eye color". "-" and "+" are the classes determined by these attributes. The eight cases are hereinafter called "case set C".

[0004] Now suppose that, after case set C has been given, a new case not contained in it arrives: [short, red, brown; ?]. What class should be estimated for it? The purpose of concept learning is to generate, from known learning cases (case set C above), rules (a decision tree) for determining the class of such unknown cases.

[0005] In ID3, the rules that determine the class are expressed as a decision tree such as the one in FIG. 2(b). The tree is read as follows. An unknown case is first tested on "hair color". If the hair color is "black", the class is immediately determined to be "-". If the hair color is "red", the class is determined to be "+". If the hair color is "blond", the case is further tested on "eye color": if the eye color is "blue" the class is "+", and if it is "brown" the class is "-". In this way, a decision tree determines the class by querying the attribute values of the unknown case in order, starting from the top node.
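
The top-down lookup described for FIG. 2(b) can be sketched in Python as follows. The nested-tuple encoding of the tree and the names `TREE` and `classify` are illustrative assumptions, not part of the patent; only the branch structure is taken from the text.

```python
# Decision tree of FIG. 2(b) as described in the text.
# Internal node: (attribute_name, {attribute_value: subtree}); leaf: class label.
TREE = ("hair", {
    "black": "-",
    "red": "+",
    "blond": ("eyes", {"blue": "+", "brown": "-"}),
})


def classify(tree, case):
    """Walk from the root, asking for one attribute value per node."""
    while not isinstance(tree, str):
        attribute, branches = tree
        tree = branches[case[attribute]]
    return tree


# The unknown case [short, red, brown; ?] from the text.
unknown = {"height": "short", "hair": "red", "eyes": "brown"}
print(classify(TREE, unknown))  # prints "+"
```

The unknown case is classified by the "red" branch alone, so its height and eye color are never consulted.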

[0006] Next, the method of creating this decision tree from case set C is described. ID3 first builds, for case set C, one tree per attribute showing how the cases split when tested on that attribute. FIG. 2(a) shows the split on the attribute "hair color". The eight cases fall into groups of three, one, and four. For "black" and "red", all cases in the group have the same class. For the "blond" branch, however, the class is not unique. ID3 judges the quality of a split on an attribute by its entropy gain.

[0007] The entropy of case set C is obtained from the following formula. With N_p the number of cases belonging to class "+" and N_n the number of cases belonging to class "-", the entropy E is given by

[Equation 1]
$$E = -\frac{N_p}{N_p + N_n}\log_2\frac{N_p}{N_p + N_n} - \frac{N_n}{N_p + N_n}\log_2\frac{N_n}{N_p + N_n}$$

where log is to base 2.

[0008] First, before branching, case set C contains eight cases in which the two classes occur three times and five times, so its entropy is as follows.

[0009]

[Equation 2]
$$-\frac{3}{8}\log_2\frac{3}{8} - \frac{5}{8}\log_2\frac{5}{8} = 0.954 \text{ bits}$$

After the split, on the other hand, the entropy of each branch's case set must be computed and weighted by the fraction of cases that go down that branch. In this example, the branches for hair color "black" (3 of 8 cases), "red" (1 of 8 cases), and "blond" (4 of 8 cases) have entropies of 0 bits, 0 bits, and 1 bit respectively, so the expected entropy after the split is

[Equation 3]
$$\frac{3}{8}\times 0 + \frac{1}{8}\times 0 + \frac{4}{8}\times 1 = 0.5 \text{ bits}$$

The entropy gain of "hair color" is therefore 0.954 - 0.5 = 0.454 bits. The entropy gain when splitting on "height", computed in the same way (details omitted), is 0.003 bits, and that of "eye color" is 0.347 bits.
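
The arithmetic for the "hair color" split can be checked with a short Python sketch. The function names are assumptions made for illustration. The per-branch class counts are inferred from the text: "black" is three "-" cases, "red" is one "+" case, and "blond" must be two "+" and two "-" cases because four cases with an entropy of 1 bit require an even split.

```python
from math import log2


def entropy(n_pos, n_neg):
    """Entropy in bits of a set with n_pos '+' cases and n_neg '-' cases."""
    total = n_pos + n_neg
    result = 0.0
    for n in (n_pos, n_neg):
        if n:  # a class with no cases contributes 0
            p = n / total
            result -= p * log2(p)
    return result


def expected_entropy(branches):
    """Entropy after a split, weighting each branch by its share of the cases."""
    total = sum(p + n for p, n in branches)
    return sum((p + n) / total * entropy(p, n) for p, n in branches)


# (N_p, N_n) for the hair-color branches: black, red, blond
hair_branches = [(0, 3), (1, 0), (2, 2)]

before = entropy(3, 5)
after = expected_entropy(hair_branches)
gain = before - after
print(f"{before:.3f} {after:.3f} {gain:.3f}")  # prints "0.954 0.500 0.454"
```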

[0010] When building the tree, ID3 gives priority to the attribute with the largest entropy gain. In this example, "hair color" therefore becomes the first test attribute. For each branch in which the split leaves a unique class, processing stops. For each branch in which the class is not unique, the attribute for the next test is chosen by the same entropy gain calculation. This selection of test attributes is repeated until the class is unique in every branch. The decision tree generated from case set C is the one in FIG. 2(b). For unknown cases, the class is determined using this decision tree.
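
The recursive procedure can be sketched as below. This is a minimal illustration, not the patent's implementation. The eight cases are an assumption: the listing of case set C is not reproduced in this text, so the sketch uses a height/hair/eyes set of the kind commonly used to introduce ID3, chosen because it reproduces the group sizes and, to within rounding, the gains quoted above. All function names are hypothetical.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ("height", "hair", "eyes")

# ASSUMPTION: stand-in for case set C (consistent with the quoted counts and gains).
CASES = [
    ("short", "blond", "blue", "+"),
    ("tall", "blond", "brown", "-"),
    ("tall", "red", "blue", "+"),
    ("short", "black", "blue", "-"),
    ("tall", "black", "blue", "-"),
    ("tall", "blond", "blue", "+"),
    ("tall", "black", "brown", "-"),
    ("short", "blond", "brown", "-"),
]


def entropy(cases):
    counts = Counter(case[-1] for case in cases)
    return -sum(c / len(cases) * log2(c / len(cases)) for c in counts.values())


def split(cases, index):
    """Group cases by the value of the attribute at position index."""
    groups = {}
    for case in cases:
        groups.setdefault(case[index], []).append(case)
    return groups


def gain(cases, index):
    groups = split(cases, index)
    after = sum(len(g) / len(cases) * entropy(g) for g in groups.values())
    return entropy(cases) - after


def id3(cases, indices):
    classes = {case[-1] for case in cases}
    if len(classes) == 1:           # class is unique: stop on this branch
        return classes.pop()
    if not indices:                 # no attribute left: fall back to the majority class
        return Counter(case[-1] for case in cases).most_common(1)[0][0]
    best = max(indices, key=lambda i: gain(cases, i))
    rest = [i for i in indices if i != best]
    branches = {value: id3(group, rest) for value, group in split(cases, best).items()}
    return (ATTRIBUTES[best], branches)


tree = id3(CASES, [0, 1, 2])
print(tree)
```

On this assumed case set the root test is "hair", and only the "blond" branch needs a second test on "eyes", which is the shape described for FIG. 2(b).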

[0011]

[Problems to be Solved by the Invention] Given a sufficient number of cases, ID3 can be used as a knowledge acquisition support method for expert systems. In real applications, however, the number of expert-handled cases that can be collected is limited, and a decision tree generated only from the available cases often performs inadequately. To avoid this problem, concept learning has conventionally been combined with human assistance: knowledge is first learned from the cases, and the learned knowledge is then corrected by hand. This conventional approach does make it possible to acquire knowledge from few cases, but the final knowledge is hand-made. Problems such as contradictions between rules and missing conditions therefore arise, and assuring the quality of the knowledge remained an issue.

[0012] To solve the above problems, a method should be used in which rules are first written by hand and the final rules are then learned automatically from these rules together with the cases actually collected. This yields more accurate rules from the knowledge "in the expert's head" and the available cases. Moreover, because the final rules are machine-generated, contradictions between rules and missing conditions can be prevented.

[0013] Several learning methods capable of such correction-type learning have been proposed in earlier machine learning research. However, all of them, such as the AQ algorithm, are learning methods with long processing times. This processing cost appears to be the main reason correction-type learning has not been applied in conventional expert system construction support. If correction-type learning were possible with ID3, a fast learning method whose learning time is roughly proportional to the number of cases, it would be of great practical value. However, no version of ID3 that corrects existing rules is known.

[0014] The present invention has been made in view of the above, and its object is to provide a knowledge correction type learning system that can acquire good knowledge from a small number of cases by also using existing knowledge (rules, decision trees).

[0015]

[Means for Solving the Problems] To achieve the above object, the knowledge correction type learning system of the present invention, which acquires knowledge from observed cases while also using existing knowledge, comprises: existing knowledge conversion means for generating provisional cases by converting existing knowledge expressed as rules or a decision tree into a unit rule expression consisting only of products of conditions; weight changing means for receiving the provisional cases and observed real cases and generating learning cases by assigning a weight to each of the provisional cases and the real cases; inductive learning means for receiving the learning cases and generating corrected knowledge expressed as rules or a decision tree; and parameter determining means for varying the weights, as parameters, in the weight changing means and selecting, by a cross-validation method, the parameter that gives the best discrimination performance on unknown cases.

[0016]

[Operation] In the knowledge correction type learning system of the present invention, existing knowledge expressed as rules or a decision tree is converted into a unit rule expression consisting only of products of conditions to generate provisional cases; a weight is assigned to each of the provisional cases and the real cases to generate learning cases; the learning cases are input to generate corrected knowledge expressed as rules or a decision tree; and the weights in the weight changing means are varied so that the parameter giving the best discrimination performance on unknown cases is selected by a cross-validation method.

[0017]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings.

[0018] FIG. 1 is a block diagram showing the configuration of a knowledge correction type learning system according to an embodiment of the present invention. The knowledge correction type learning system 1 shown in the figure comprises: existing knowledge conversion means 11, which generates provisional cases 23 by converting existing knowledge 21 expressed as rules or a decision tree into a unit rule expression consisting only of products of conditions; weight changing means 12, which receives the provisional cases 23 and real cases 22 and generates learning cases 24 by assigning a weight to each; inductive learning means 13, which receives the learning cases 24 and generates corrected knowledge 25 expressed as rules or a decision tree; and parameter determining means 14, which varies the weights in the weight changing means 12 and selects, by a cross-validation method, the parameter that gives the best discrimination performance on unknown cases. The embodiment is described in detail below.

[0019] The existing knowledge conversion means 11 converts existing knowledge given as rules or a decision tree into cases (provisional cases). A case is a vector consisting of attribute values and a class name and, with i as the case number, is expressed in the following form, where V_{i,j} is the value of the j-th attribute and Class_i is the class.

[0020] [V_{i,1}, V_{i,2}, V_{i,3}, V_{i,4}, ...; Class_i]
In the vector notation used in the prior art section, a case was written as, for example, [tall, blond, blue; +].

[0021] Now consider the following case.

[0022] [?, blond, blue; +]
This case states that the class is "+" whatever the value of the first attribute. It is clearly identical to the rule
if (hair color = blond) & (eye color = blue) then (class = +).
That is, a case whose attribute values include "?" (don't care) is in substance identical to a rule. Moreover, as FIG. 2(b) shows, this case corresponds to the path from the root of the decision tree to a leaf that selects blond and blue as attribute values. If existing knowledge is converted into such cases containing "?" (hereinafter called provisional cases), the resulting set of provisional cases effectively reflects the information held by the existing knowledge.
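
The equivalence between a case containing "?" and a rule can be made concrete with a small sketch. The tuple layout (height, hair color, eye color, class) and the names `covers` and `predict` are assumptions chosen for illustration.

```python
DONT_CARE = "?"


def covers(conditions, attribute_values):
    """True if every non-'?' condition equals the corresponding attribute value."""
    return all(c == DONT_CARE or c == v for c, v in zip(conditions, attribute_values))


def predict(provisional_case, attribute_values):
    """Read a provisional case as a rule: return its class if it covers the case."""
    *conditions, cls = provisional_case
    return cls if covers(conditions, attribute_values) else None


# [?, blond, blue; +], i.e. if hair color = blond & eye color = blue then class = +
rule = ("?", "blond", "blue", "+")

print(predict(rule, ("tall", "blond", "blue")))   # prints "+"
print(predict(rule, ("short", "blond", "blue")))  # prints "+"
print(predict(rule, ("tall", "red", "blue")))     # prints "None"
```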

[0023] Although not described in detail in this specification, ID3 is able to operate as a concept learning means without any problem even when its input contains cases with "?" such as the above. Such cases can therefore be fed to ID3 as they are. For details of this capability, see J. Ross Quinlan, "Programs for Machine Learning", Morgan Kaufmann.

[0024] From the above, the processing procedure of the existing knowledge conversion means 11 is as follows.

[0025] First, the case where the existing knowledge 21 is a decision tree is described.
(1) For each path from the root node of the decision tree to a leaf node, a provisional case is generated. The provisional case is a case vector that has, for each attribute appearing as a condition on that path, the attribute value on the path, and "?" for every other attribute.

[0026] (2) If the generated provisional cases are to reflect the existing decision tree more accurately, values are also assigned to attributes that appear somewhere in the decision tree. Specifically, when an attribute appears somewhere in the decision tree but no value for it appears on the path in question, an attribute value is assigned at random.

[0027] Step (2) may seem wasteful at first sight, but it is necessary if running ID3 on the generated set of provisional cases is to reproduce the same decision tree as the original existing knowledge.
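
Steps (1) and (2) for a decision tree can be sketched as follows, using the tree of FIG. 2(b) as the existing knowledge. The tree encoding, the value lists in `VALUES`, and all function names are assumptions for illustration; the patent does not prescribe a data structure.

```python
import random

ATTRIBUTES = ("height", "hair", "eyes")
VALUES = {"height": ("short", "tall"),
          "hair": ("black", "red", "blond"),
          "eyes": ("blue", "brown")}
TREE = ("hair", {"black": "-", "red": "+",
                 "blond": ("eyes", {"blue": "+", "brown": "-"})})


def paths(tree, conditions=()):
    """Yield (conditions, class) for each root-to-leaf path."""
    if isinstance(tree, str):
        yield dict(conditions), tree
        return
    attribute, branches = tree
    for value, subtree in branches.items():
        yield from paths(subtree, conditions + ((attribute, value),))


def attributes_in(tree):
    """Attributes tested anywhere in the tree."""
    if isinstance(tree, str):
        return set()
    attribute, branches = tree
    used = {attribute}
    for subtree in branches.values():
        used |= attributes_in(subtree)
    return used


def provisional_cases(tree, fill_random=False, rng=None):
    rng = rng or random.Random(0)
    used = attributes_in(tree)
    cases = []
    for conditions, cls in paths(tree):
        row = []
        for attribute in ATTRIBUTES:
            if attribute in conditions:
                row.append(conditions[attribute])           # step (1): value on the path
            elif fill_random and attribute in used:
                row.append(rng.choice(VALUES[attribute]))   # step (2): random value
            else:
                row.append("?")                             # don't care
        cases.append((*row, cls))
    return cases


for case in provisional_cases(TREE):
    print(case)
```

With `fill_random=True`, the "black" and "red" paths receive a random eye color because "eyes" is tested elsewhere in the tree, while "height" stays "?" because the tree never tests it.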

[0028] Next, the case where the existing knowledge 21 is in rule form is described.
(1) If a rule contains an OR condition such as (hair color = blond or black), it is first converted into rules without OR. It is clear that those skilled in the art can easily realize this conversion with conventional techniques. The result is a set of rules in product form, each consisting only of a product of conditions such as (hair color = blond).

[0029] (2) Next, these rules are converted into cases in the attribute-value representation described earlier. To reflect the information in the rules sufficiently, values are assigned at random to attributes that appear somewhere in the rule set. However, if an assigned random value causes the case to match another rule and thereby receive a different class, that random value is not allowed.
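
A possible reading of steps (1) and (2) for rule-form knowledge is sketched below. The rule representation (a dict from attribute to a set of allowed values, plus a class), the two example rules, and every function name are hypothetical. Instead of drawing and retrying random values, the sketch enumerates the candidate values, discards those that would match a rule of a different class, and picks at random among the rest, which has the same effect as the rejection described in the text.

```python
import random
from itertools import product

ATTRIBUTES = ("height", "hair", "eyes")
VALUES = {"height": ("short", "tall"),
          "hair": ("black", "red", "blond"),
          "eyes": ("blue", "brown")}


def expand_or(rule):
    """Step (1): split a rule with OR conditions into product-only unit rules."""
    conditions, cls = rule
    names = list(conditions)
    for combo in product(*(sorted(conditions[name]) for name in names)):
        yield dict(zip(names, combo)), cls


def matches(conditions, row):
    return all(row.get(a) == v for a, v in conditions.items())


def to_case(conditions, cls, unit_rules, used, rng):
    """Step (2): fill attributes used elsewhere with an allowed random value."""
    free = [a for a in ATTRIBUTES if a in used and a not in conditions]
    allowed = []
    for combo in product(*(VALUES[a] for a in free)):
        row = {**conditions, **dict(zip(free, combo))}
        # Reject values that make the case match a rule of a different class.
        if not any(k != cls and matches(c, row) for c, k in unit_rules):
            allowed.append(row)
    row = rng.choice(allowed) if allowed else conditions
    return tuple(row.get(a, "?") for a in ATTRIBUTES) + (cls,)


def rules_to_cases(rules, seed=0):
    rng = random.Random(seed)
    unit_rules = [unit for rule in rules for unit in expand_or(rule)]
    used = {a for conditions, _ in unit_rules for a in conditions}
    return [to_case(c, k, unit_rules, used, rng) for c, k in unit_rules]


# Hypothetical existing knowledge: two rules, the first with an OR condition.
RULES = [
    ({"hair": {"blond", "black"}}, "-"),
    ({"eyes": {"blue"}}, "+"),
]
for case in rules_to_cases(RULES):
    print(case)
```

In this example the rejection leaves exactly one admissible value per rule: the "-" cases cannot take blue eyes, and the "+" case cannot take blond or black hair.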

[0030] The existing knowledge is converted by the existing knowledge conversion means 11 described above, and the output is stored as the provisional cases 23. At this time, the existing knowledge conversion means 11 attaches to each case an attribute indicating whether the case is a provisional case or a real case. For example, if the first attribute is used for this distinction, a case whose first attribute is 0 is a provisional case and a case whose first attribute is 1 is a real case.

[0031] Next, the weight changing means 12 is described. The weight changing means 12 assigns to the provisional cases 23 and the real cases 22 the weights specified by the parameter determining means 14 described later.

[0032] If the weight of a case is represented by a second attribute added to the case, a case takes the following form. For example, [1, 2.5, tall, blond, blue; +] is a real case with a weight of 2.5, and [0, 2.0, short, blond, ?; -] is a provisional case, that is, part of the existing knowledge. Instead of giving the weight as a number, a weight of 2 may also be expressed (for integer weights only) by actually copying the case so that two cases exist. The specific configuration of the weight changing means 12 will be clear to those skilled in the art from the above description and is therefore omitted from this specification. Through the weight changing means 12, the provisional cases and real cases become the learning cases 24.
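
Since the configuration of the weight changing means is left to the reader, the following is only one possible sketch. It uses the layout from the text (first attribute: 0 for provisional, 1 for real; second attribute: weight); the function names are assumptions.

```python
PROVISIONAL, REAL = 0, 1


def tag(cases, flag):
    """Prepend the provisional/real flag as the first attribute."""
    return [(flag, *case) for case in cases]


def apply_weights(tagged_cases, weights):
    """Insert the weight for each case's flag as the second attribute."""
    return [(flag, weights[flag], *rest) for flag, *rest in tagged_cases]


def duplicate(weighted_cases):
    """Alternative for integer weights: replace weight w by w copies of the case."""
    copies = []
    for flag, weight, *rest in weighted_cases:
        if weight != int(weight):
            raise ValueError("duplication only works for integer weights")
        copies.extend([(flag, *rest)] * int(weight))
    return copies


provisional = tag([("short", "blond", "?", "-")], PROVISIONAL)
real = tag([("tall", "blond", "blue", "+")], REAL)

learning_cases = apply_weights(provisional + real, {PROVISIONAL: 2.0, REAL: 2.5})
for case in learning_cases:
    print(case)
```

Because the flag stays on every case, the weights can be reassigned repeatedly without losing track of which cases came from the existing knowledge.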

[0033] The inductive learning means 13 is ID3. It is as described in the prior art section and has no special function compared with the prior art. The flag indicating whether a case is real or provisional is, as described later, referred to only when the weights of the learning cases are modified; it is irrelevant to the inductive learning means 13, and this first attribute is not input to the inductive learning means 13. As for case weights, the above-mentioned book by J. Ross Quinlan, "Programs for Machine Learning", Morgan Kaufmann, shows how to process cases that carry weights, and the program accompanying the book in fact discloses a processing method that uses weights internally. See Quinlan's book for details. The decision tree generated by the inductive learning means 13 becomes the corrected knowledge 25.

[0034] The parameter determining means 14 realizes one of the most characteristic functions of the present invention. According to the theory of probability and statistics, existing knowledge corresponds to a prior probability, and the posterior probability is produced from the real cases and this prior probability. Accordingly, let P be the prior probability of a class under a certain condition and Q the distribution (probability) of the cases. The posterior probability f in the sense of statistical analysis is then given by
f = αP + (1 - α)Q.
Here α (0 ≦ α ≦ 1) expresses the importance of the prior probability: the larger α is, the more the prior probability should be trusted. For example, when the cases contain almost no noise (cases with incorrect attribute values or classes) and the prior knowledge is not trustworthy, α should be kept small. Conversely, when the cases are noisy and the prior knowledge is believed to be fairly accurate, the prior probability P should be emphasized over the case probability Q, so α should be brought close to 1.

[0035] Ideally the value of α would be known in advance. In practice, however, α depends on the nature of the cases, the accuracy of the existing knowledge, and so on, and is difficult to know beforehand. The present invention therefore provides a mechanism that determines the value of α experimentally, as described below.

[0036] As the counterpart of α, the present invention assigns weights to the cases used for learning. That is, weight information is attached even to a single case. With a weight of 2, for example, ID3 processes the case as if two identical cases existed. The weights can be assigned in the following two ways.

[0037] (1) Fix the weight of either the provisional cases or the real cases and vary the weight of the other.

[0038] (2) Vary the weights of both the provisional cases and the real cases.

[0039] In practice, the quantity
"weight of provisional cases" / ("weight of provisional cases" + "weight of real cases")
can be regarded as corresponding to α above. Even when this value is the same, the actual performance of the learning means may change depending on the absolute values of the weights. Method (1) alone may therefore fail to raise the performance of the corrected knowledge sufficiently, so method (2) should also be available. In the following description, however, the weight of the provisional cases generated from the existing knowledge is set to 1 for simplicity. Those skilled in the art will readily be able to infer how to handle the case in which the weight of the provisional cases is also varied.
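
The relation between the two weights and α, and the mixture f = αP + (1 - α)Q, can be illustrated numerically. The function names and the probability values are hypothetical and serve only to show the arithmetic.

```python
def alpha(provisional_weight, real_weight):
    """Share of the total weight carried by the provisional cases."""
    return provisional_weight / (provisional_weight + real_weight)


def blended(prior_p, case_q, a):
    """f = a * P + (1 - a) * Q for a prior probability P and a case-based probability Q."""
    return a * prior_p + (1 - a) * case_q


# Provisional weight fixed at 1, real-case weight varied.
print(alpha(1, 1))   # prints "0.5"
print(alpha(1, 3))   # prints "0.25"

# Same ratio, different absolute weights: alpha is unchanged.
print(alpha(2, 6))   # prints "0.25"

# Hypothetical P = 0.9 and Q = 0.5 combined with alpha = 0.25 (about 0.6).
print(round(blended(0.9, 0.5, alpha(1, 3)), 3))
```

The third call shows why the ratio alone does not determine the learner's behaviour: weights (1, 3) and (2, 6) give the same α even though the absolute weights differ.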

[0040] The following description assumes that the weight of the provisional cases is fixed and the weight of the real cases is varied. In the present invention, this weight of the real cases, which is the parameter, is determined by the "cross-validation method".

[0041] The cross-validation method is as follows.
STEP 1: A predetermined number of cases are extracted at random from the learning cases 24 to form a test case set.

[0042] STEP 2: Corrected knowledge is created by the inductive learning means 13 from the learning cases 24 excluding the test case set. The test case set is then classified with this corrected knowledge, and the accuracy is computed and recorded.

[0043] STEP 3: STEP 1 and STEP 2 are repeated a predetermined number of times (for example, 10 times), and the average accuracy is computed.

[0044] The method of computing the accuracy is well known to those skilled in the art: the class of each test case is determined by following the generated decision tree from the root.

[0045] In parameter determination by the cross-validation method, STEP 1 to STEP 3 above are executed for different real-case weights, and the weight that gives the highest accuracy is used. The attribute indicating whether a case is real or provisional was attached to the learning cases described earlier so that provisional and real cases remain distinguishable while the weights of the learning cases are changed repeatedly.
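
STEP 1 to STEP 3 and the selection of the real-case weight can be sketched as below. This is a sketch under stated assumptions, not the patent's implementation: the case layout `(flag, attributes, class)`, the function names, the example data, and in particular the learner `learn_by_hair`, a weighted-vote stand-in used so the example runs without a weighted ID3, are all hypothetical. The provisional weight is fixed at 1 as in the text.

```python
import random
from collections import defaultdict


def learn_by_hair(weighted_cases):
    """Stand-in for ID3: weighted vote per hair color, overall vote as fallback."""
    by_hair = defaultdict(lambda: defaultdict(float))
    overall = defaultdict(float)
    for weight, attrs, cls in weighted_cases:
        by_hair[attrs[1]][cls] += weight
        overall[cls] += weight

    def classifier(attrs):
        votes = by_hair.get(attrs[1]) or overall
        return max(votes, key=votes.get) if votes else None
    return classifier


def average_accuracy(cases, real_weight, learn, test_size, repeats, rng):
    total = 0.0
    for _ in range(repeats):                                   # STEP 3: repeat
        shuffled = cases[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:test_size], shuffled[test_size:]   # STEP 1
        weighted = [(real_weight if flag == 1 else 1.0, attrs, cls)
                    for flag, attrs, cls in train]
        classifier = learn(weighted)                           # STEP 2: learn, then test
        correct = sum(classifier(attrs) == cls for _, attrs, cls in test)
        total += correct / len(test)
    return total / repeats                                     # STEP 3: average


def select_real_weight(cases, candidates, learn, test_size=2, repeats=10, seed=0):
    scores = {}
    for weight in candidates:
        rng = random.Random(seed)      # same random splits for every candidate
        scores[weight] = average_accuracy(cases, weight, learn, test_size, repeats, rng)
    return max(scores, key=scores.get), scores


# Hypothetical learning cases: flag 0 = provisional, flag 1 = real.
CASES = [
    (0, ("?", "black", "?"), "-"), (0, ("?", "red", "?"), "+"),
    (0, ("?", "blond", "blue"), "+"), (0, ("?", "blond", "brown"), "-"),
    (1, ("tall", "blond", "blue"), "+"), (1, ("short", "black", "blue"), "-"),
    (1, ("tall", "red", "blue"), "+"), (1, ("short", "blond", "brown"), "-"),
]
best, scores = select_real_weight(CASES, [0.5, 1.0, 2.0, 4.0], learn_by_hair)
print(best, scores)
```

Reseeding the generator for each candidate weight means every weight is judged on the same random splits, so differences in average accuracy come from the weight rather than from the draw.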

[0046] An embodiment of the present invention has been described above. FIG. 3 shows the result of applying the present invention to part of the Irvine database (the data set called Soybean Large), benchmark data for learning methods provided by the University of California. In this example, two sets of 200 learning data each were taken from the 683 cases; one set (200 cases) was used as the real cases, and a decision tree generated from the other set (200 cases) was used as the existing knowledge. Learning was performed on the provisional cases generated from this existing knowledge together with the 200 real cases. The accuracy of the decision tree generated by the learning system was evaluated on the 283 cases not used for learning. This procedure was adopted to avoid introducing an arbitrary human bias into the existing knowledge.

【００４７】また、図３では、実事例の重みのみではな
く、仮事例の重みも変化させている。詳細は省略する
が、仮事例の２倍弱の重みを実事例に与えた所で、最も
正答率（８７．２４％（１００％−１２．７６％））が
向上していることに注意されたい。但し、２００個の学
習データ２組を、全てＩＤ３に投入した場合（８８．４
６％）に比べると、本発明の正答率（実際には、エラー
率で表示している。もちろん、１００％から正答率の％
を減じたものである）は、若干劣っている。しかし、２
００組の学習データを１組のみ利用した時、即ち、既存
知識なしに観測された実事例のみで学習を行った場合
（８１．８１％）に比べると、その知識の性能は、はる
かに優れたものであることが示されている。Further, in FIG. 3, not only the weight of the actual cases but also the weight of the tentative cases is varied. Details are omitted, but note that the correct answer rate is highest (87.24%, that is, 100% - 12.76%) where the actual cases are given a weight slightly less than twice that of the tentative cases. Compared with the case where both sets of 200 learning cases are fed in their entirety to ID3 (88.46%), the correct answer rate of the present invention is slightly lower. (The figure actually plots the error rate, which is of course 100% minus the correct answer rate.) However, compared with the case where only one set of 200 learning cases is used, that is, where learning is performed only from the observed actual cases without existing knowledge (81.81%), the acquired knowledge is shown to perform far better.
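As an illustration of how case weights could enter an ID3-style learner, the sketch below replaces case counts with weight sums when computing entropy and information gain. This is an assumption made for illustration, not a statement of how the weights are applied in the embodiment; the function names, the attribute name `leaves` and the class labels are hypothetical. Tentative cases (仮事例) and actual cases (実事例) are merged into weighted learning cases first.

```python
import math
from collections import defaultdict

def make_learning_cases(tentative, actual, w_tentative, w_actual):
    """Attach one weight to every tentative case and another to every actual case."""
    return ([(attrs, label, w_tentative) for attrs, label in tentative] +
            [(attrs, label, w_actual) for attrs, label in actual])

def weighted_entropy(cases):
    """Entropy of the class label, counting each case by its weight."""
    totals = defaultdict(float)
    for _attrs, label, weight in cases:
        totals[label] += weight
    total = sum(totals.values())
    return -sum((w / total) * math.log2(w / total) for w in totals.values() if w > 0)

def weighted_gain(cases, attribute):
    """Information gain of splitting on `attribute`, using weight sums instead of counts."""
    groups = defaultdict(list)
    for case in cases:
        groups[case[0][attribute]].append(case)
    total = sum(weight for _attrs, _label, weight in cases)
    remainder = sum(
        sum(w for _a, _l, w in group) / total * weighted_entropy(group)
        for group in groups.values()
    )
    return weighted_entropy(cases) - remainder

# Hypothetical toy data: one tentative case and one actual case.
tentative = [({"leaves": "abnormal"}, "diseased")]
actual = [({"leaves": "normal"}, "healthy")]
learning_cases = make_learning_cases(tentative, actual, w_tentative=1.0, w_actual=2.0)
print(weighted_gain(learning_cases, "leaves"))
```

With weights in this form, giving the actual cases roughly twice the weight of the tentative cases simply means each actual case counts about twice as much in every split decision.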

【００４８】上述したように、本発明の知識修正型学習
システムは、既存知識変換手段１１により既存知識を仮
事例として反映している点、およびクロスバリデーショ
ン法により未知事例に対する判別性能の向上を図ってい
る点に特徴がある。As described above, the knowledge correction type learning system of the present invention is characterized in that the existing knowledge conversion means 11 reflects the existing knowledge in the form of tentative cases, and in that the cross validation method is used to improve the discrimination performance for unknown cases.
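The conversion of existing knowledge into tentative cases (仮事例) can be pictured with the following minimal sketch, which turns each root-to-leaf path of a decision tree into a unit rule made up only of a product (conjunction) of conditions, and then writes each rule as a case. The nested-dictionary tree format, the attribute names and class labels, and the use of `None` for attributes a rule does not mention are all assumptions made for illustration.

```python
def tree_to_unit_rules(tree, conditions=()):
    """Yield (conditions, class_label) for every root-to-leaf path of the tree."""
    if not isinstance(tree, dict):            # a leaf holds the class label
        yield dict(conditions), tree
        return
    attribute = tree["attribute"]
    for value, subtree in tree["branches"].items():
        # Extend the conjunction with one more condition and descend.
        yield from tree_to_unit_rules(subtree, conditions + ((attribute, value),))

def rule_to_tentative_case(rule, all_attributes):
    """Express a unit rule as a case; unmentioned attributes become None (assumed 'don't care')."""
    conds, label = rule
    return {attr: conds.get(attr) for attr in all_attributes}, label

# Hypothetical existing knowledge in decision tree form.
tree = {
    "attribute": "leaves",
    "branches": {
        "normal": "healthy",
        "abnormal": {
            "attribute": "stem",
            "branches": {"normal": "leaf-spot", "abnormal": "stem-rot"},
        },
    },
}

rules = list(tree_to_unit_rules(tree))
tentative_cases = [rule_to_tentative_case(r, ["leaves", "stem"]) for r in rules]
for case in tentative_cases:
    print(case)
```

Each leaf yields exactly one unit rule, so the three leaves of this toy tree give three tentative cases.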

【００４９】また、仮事例は既存知識を事例として表現
したものであるため、既存知識を学習に併用できる。但
し、この際、仮事例と実事例のどちらをどの程度重く見
るかは問題である。即ち、既存知識は確率統計理論での
先験的確率に相当する訳であり、事後確率（先験的確率
と事例から生成される最終的な確率）生成のもととなる
実際の観測実事例を入手した場合に、どの程度、先験的
確率を信用してよいかは本発明の適用対象により変化す
る。そこで、クロスバリデーション法により最も未知事
例に対する判別性能を上げるための仮事例、実事例の重
みを定めている。Further, because the tentative cases express the existing knowledge in the form of cases, the existing knowledge can be used together with the actual cases in learning. The question is then how heavily to weight the tentative cases relative to the actual cases. The existing knowledge corresponds to the a priori probability of probability and statistics theory. Once the actually observed cases that give rise to the posterior probability (the final probability produced from the a priori probability and the cases) have been obtained, how far the a priori probability can be trusted depends on the application to which the present invention is applied. The cross validation method is therefore used to determine the weights of the tentative cases and the actual cases that maximize the discrimination performance for unknown cases.
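The selection of the weight by cross validation can be sketched as below. Several points are assumptions made for illustration: the folds are taken over the actual cases (実事例) only, the tentative cases (仮事例) join every training fold with weight 1.0, and a deliberately simple one-attribute lookup learner stands in for ID3. All names and the toy data are hypothetical; in the toy data the existing knowledge is intentionally misleading so that the effect of the weight is visible.

```python
from collections import defaultdict

def train_stump(cases, attributes):
    """Stand-in learner (NOT ID3): choose the single attribute whose
    weight-majority lookup table has the lowest weighted training error."""
    best = None
    for attr in attributes:
        votes = defaultdict(lambda: defaultdict(float))
        for attrs, label, weight in cases:
            votes[attrs[attr]][label] += weight
        table = {value: max(c, key=c.get) for value, c in votes.items()}
        error = sum(w for a, l, w in cases if table[a[attr]] != l)
        if best is None or error < best[0]:
            best = (error, attr, table)
    _, attr, table = best
    overall = defaultdict(float)
    for _a, label, weight in cases:
        overall[label] += weight
    default = max(overall, key=overall.get)      # fallback for unseen attribute values
    return lambda attrs: table.get(attrs[attr], default)

def cv_accuracy(actual, tentative, w_actual, w_tentative, attributes, k=5):
    """k-fold cross validation over the actual cases; tentative cases join every training fold."""
    correct = 0
    for fold in range(k):
        test = actual[fold::k]
        train = [c for i, c in enumerate(actual) if i % k != fold]
        learning = ([(a, l, w_tentative) for a, l in tentative] +
                    [(a, l, w_actual) for a, l in train])
        classify = train_stump(learning, attributes)
        correct += sum(classify(a) == l for a, l in test)
    return correct / len(actual)

def select_actual_weight(actual, tentative, candidates, attributes, k=5):
    """Return the candidate actual-case weight with the best cross-validated accuracy."""
    return max(candidates,
               key=lambda w: cv_accuracy(actual, tentative, w, 1.0, attributes, k))

attributes = ["x", "y"]
# Actual cases: the class follows attribute x.
actual = [({"x": i % 2, "y": (i // 2) % 2}, "pos" if i % 2 else "neg") for i in range(20)]
# Tentative cases: the (misleading) existing knowledge says the class follows y.
tentative = [({"x": x, "y": y}, "pos" if y else "neg") for y in (0, 1) for x in (0, 1)]

best = select_actual_weight(actual, tentative, [0.1, 2.0], attributes)
print(best)
```

When the actual cases are weighted too lightly, the learner follows the existing knowledge even where the observations contradict it; cross validation exposes this as lower accuracy on the held-out folds and therefore selects the heavier weight.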

【００５０】[0050]

【発明の効果】以上述べたように、本発明によれば、従
来は既存知識の併用が困難であった高速な概念学習手法
においても、既存知識を利用した高精度の学習が可能で
ある。また、既存知識の存在により、観測事例のみから
作成されるよりも遥かに優れた知識を獲得できる。As described above, according to the present invention, high-accuracy learning that makes use of existing knowledge becomes possible even with high-speed concept learning methods, with which it has conventionally been difficult to use existing knowledge. In addition, thanks to the existing knowledge, knowledge far superior to that created from the observed cases alone can be acquired.

【００５１】また、本発明によれば、最終的な知識が機
械生成されるので、事例からの学習結果を人手修正する
従来手法の場合に比べて、誤りや条件抜けが生じない。
また、ＩＤ３の持つ高速性をそのまま生かした、高速学
習が実現できる。Further, according to the present invention, the final knowledge is generated by machine, so the errors and omitted conditions that can arise in the conventional approach of manually correcting the results learned from cases do not occur. In addition, high-speed learning is achieved that takes full advantage of the speed of ID3.

[Brief description of drawings]

【図１】本発明の一実施例に係る知識修正型学習システ
ムの構成を示すブロック図である。FIG. 1 is a block diagram showing a configuration of a knowledge correction type learning system according to an embodiment of the present invention.

【図２】ＩＤ３を説明するための図である。FIG. 2 is a diagram for explaining ID3.

【図３】クロスバリデーション法の結果を示す図であ
る。FIG. 3 is a diagram showing a result of a cross validation method.

[Explanation of symbols]

11 既存知識変換手段 (Existing knowledge conversion means)
12 重み変更手段 (Weight changing means)
13 帰納的学習手段 (Inductive learning means)
14 パラメータ決定手段 (Parameter determining means)
21 既存知識 (Existing knowledge)
22 実事例 (Actual cases)
23 仮事例 (Tentative cases)
24 学習事例 (Learning cases)
25 修正済知識 (Modified knowledge)

フロントページの続き (72)発明者金田重郎東京都千代田区内幸町１丁目１番６号日本電信電話株式会社内
Continuation of the front page: (72) Inventor Shigeo Kaneda, c/o Nippon Telegraph and Telephone Corporation, 1-1-6 Uchisaiwaicho, Chiyoda-ku, Tokyo

Claims

[Claims]

1. A knowledge correction type learning system that acquires knowledge by using existing knowledge together, comprising: existing knowledge conversion means for generating tentative cases by converting existing knowledge expressed by rules or decision trees into a unit rule expression consisting only of products of conditions; weight changing means for receiving the tentative cases and observed actual cases as input and generating learning cases by giving a weight to each of the tentative cases and the actual cases; inductive learning means for receiving the learning cases as input and generating modified knowledge represented by rules or a decision tree; and parameter determining means for changing the weights used as parameters in the weight changing means and selecting, by a cross validation method, the parameter that maximizes the discrimination performance for unknown cases.