JPH0895791A

JPH0895791A - Correction type knowledge learning device

Info

Publication number: JPH0895791A
Application number: JP6232837A
Authority: JP
Inventors: Shigeo Kaneda; 重郎金田; Yasuhiro Akiba; 泰弘秋葉; Megumi Ishii; 恵石井
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1994-09-28
Filing date: 1994-09-28
Publication date: 1996-04-12

Abstract

PURPOSE: To provide a correction type knowledge learning device that, taking background knowledge as a premise, generates knowledge capable of explaining an instance set that partly includes instances not matching the background knowledge. CONSTITUTION: Background knowledge 3 gives, in the form of rules, a decision tree, or the like, the conditions for determining the class of each instance in an actual instance set 2 from its attribute values. An instance converting means 4 combines temporary instances reflecting the background knowledge 3 with the actual instances to form a training instance set. A decision tree generating means 5 takes the training instance set as input and, when the instance set is divided by gathering the instances that share the same value of each attribute, selects as the split attribute the attribute that gives the maximum decrease in the index showing the randomness of the instance set; it extracts the split attribute as a node recursively and thereby generates the decision tree.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a knowledge learning device that learns, from the problem-solving cases of human experts, the general knowledge lying behind those cases, and in particular to a modified knowledge learning device that combines human-created knowledge with cases to acquire higher-quality knowledge.

[0002]

[Prior Art] Collecting multiple problem-solving cases of human experts and extracting from this case set the knowledge that lies behind it is being actively studied as a powerful method of creating knowledge for expert systems. In this specification, a case is defined as follows. Let A₁, A₂, ..., Aⱼ, ..., Aₙ be attribute values and let C be the class determined by those attribute values; a case is then written (A₁, A₂, ..., Aₙ : C). The task of the knowledge learning device is to estimate the unknown function f(A₁, A₂, ..., Aₙ) = C that determines the class from the attribute values.

[0003] Various methods have been proposed for constructing knowledge learning devices; the best known is ID3 by J. Ross Quinlan. ID3 generates a decision tree as the form of the unknown function f above. A brief description of ID3 follows. For details, see Chapter 5 of "Introduction to Knowledge Acquisition: Inductive Learning and Applications" (Knowledge Acquisition and Learning Series 1, Kyoritsu Shuppan Co., Ltd.).

[0004] First, consider the following case set S. Each case in S consists of three attributes A₀, A₁, A₂ (each attribute value being binary, 0 or 1) and a class C (also binary, 0 or 1). That is, a case has the form (A₀, A₁, A₂ : C).

[0005]
(0, 0, 1 : 1)
(1, 0, 0 : 0)
(1, 1, 1 : 1)
(1, 1, 0 : 1)
Now select the second attribute A₁ and split the set S by it. The cases are then divided into two groups.

[0006]
Group 0 (A₁ = 0)    Group 1 (A₁ = 1)
(0, 0, 1 : 1)       (1, 1, 1 : 1)
(1, 0, 0 : 0)       (1, 1, 0 : 1)
Note that in this example the split can be made on any of the three attributes. ID3 computes the difference in entropy before and after the split (the entropy gain). The entropy gain is defined as "entropy of the cases before the split" minus "expected entropy of the cases after the split". The expected entropy after the split is the sum of the entropies of the groups produced by the split, each weighted by the frequency of its group (the number of cases that reach the group divided by the total number of cases before the split). Concretely, the entropy gain is computed as follows. The case set before the split contains three cases of class 1 and one case of class 0, so its entropy is

[Equation 1] −(1/4) log(1/4) − (3/4) log(3/4) = 0.500 + 0.311 = 0.811

where the base of the logarithm is 2.

[0007] After the split, on the other hand, the entropy of group 0 (A₁ = 0) is

[Equation 2] −(1/2) log(1/2) − (1/2) log(1/2) = 1

and the entropy of group 1 (A₁ = 1) is

[Equation 3] −(2/2) log(2/2) − (0/2) log(0/2) = 0

(taking 0 log 0 = 0). The expected entropy after the split is therefore (2/4)·1 + (2/4)·0 = 0.5, and the entropy gain obtained by splitting the set S on the second attribute (A₁) is 0.811 − 0.5 = 0.311.

[0008] Next, select the first attribute (A₀) and split the set by it. Since the entropy of group 0 (A₀ = 0) is 0, the expected entropy after this split is

[Equation 4] (3/4) × (−(1/3) log(1/3) − (2/3) log(2/3)) = 0.689

Splitting the set on the third attribute (A₂) gives the following groups.

[0009]
Group 0 (A₂ = 0)    Group 1 (A₂ = 1)
(1, 0, 0 : 0)       (0, 0, 1 : 1)
(1, 1, 0 : 1)       (1, 1, 1 : 1)
Since the entropy of group 1 (A₂ = 1) is 0, the expected entropy after this split is

[Equation 5] (2/4) × (−(1/2) log(1/2) − (1/2) log(1/2)) = 0.5

In this example, therefore, both the second attribute A₁ and the third attribute A₂ give the maximum entropy gain. Here the split is made on the second attribute A₁. This attribute A₁ is called the split attribute, and its values A₁ = 0 and A₁ = 1 are called the attribute values of the respective branches.
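The entropy figures worked out above can be reproduced with a few lines of Python. This is only an illustrative sketch: the representation of a case as a `(values, cls)` pair and the function names `entropy`, `expected_entropy`, and `entropy_gain` are assumptions introduced here, not part of the specification.

```python
from collections import Counter
from math import log2

# Example case set S from the text, each case written as ((A0, A1, A2), C).
S = [((0, 0, 1), 1), ((1, 0, 0), 0), ((1, 1, 1), 1), ((1, 1, 0), 1)]

def entropy(cases):
    """Base-2 entropy of the class labels in a non-empty list of cases."""
    counts = Counter(cls for _, cls in cases)
    total = len(cases)
    return -sum((n / total) * log2(n / total) for n in counts.values())

def expected_entropy(cases, attr):
    """Entropy of each group after splitting on attr, weighted by group frequency."""
    groups = {}
    for values, cls in cases:
        groups.setdefault(values[attr], []).append((values, cls))
    return sum(len(g) / len(cases) * entropy(g) for g in groups.values())

def entropy_gain(cases, attr):
    """Entropy before the split minus expected entropy after the split."""
    return entropy(cases) - expected_entropy(cases, attr)

# Gains for A0, A1, A2 in that order, rounded to three decimals.
gains = [round(entropy_gain(S, a), 3) for a in range(3)]
print(gains)
```

Under these assumptions the gains for A₁ and A₂ tie at 0.311, while A₀ gives a smaller gain (0.811 − 0.689), in agreement with the hand calculation.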

[0010] After splitting on the second attribute A₁, group 1 contains only class "1", so every case that flows down this branch is regarded as class "1". The two cases belonging to group 0,
(0, 0, 1 : 1)
(1, 0, 0 : 0)
clearly fall into groups with a unique class each when split on the third attribute A₂. For this group, therefore, the split on the third attribute gives the largest entropy gain.

[0011] The above procedure can be expressed as a decision tree, as shown in FIG. 2. ID3 uses the generated decision tree to classify cases whose class is unknown. As a result, writing f for the class value, a decision tree equivalent to the logical expression f = A₁ + A₂ (where + denotes logical OR) has been obtained.

[0012] The above is an outline of the inductive learning algorithm ID3. Stated more precisely, it is the following recursive procedure.

[0013]
(STEP 1) Let A be the attribute set of the cases and S the given case set.
(STEP 2) If the case set is empty or all its cases belong to a single class, stop. Otherwise, for each attribute in the attribute set A, compute the entropy gain obtained by splitting the case set S on that attribute, select the one attribute with the largest entropy gain, and call it a.
(STEP 3) Delete attribute a from the attribute set A and take the result as the new A. Then split the case set S into groups by attribute a, and call the case sets of the groups S₀, S₁, .... Execute STEP 2 and STEP 3 recursively on each of S₀, S₁, ....
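The recursive procedure of STEP 1 to STEP 3 might be sketched in Python as follows. Several details are assumptions made for illustration and are not stated in the text: cases are `(values, cls)` pairs, a tree is either a class label or a tuple `(attribute index, {value: subtree})`, ties in entropy gain are broken by attribute order, and a majority vote is used if the attributes run out before the classes are separated.

```python
from collections import Counter
from math import log2

def entropy(cases):
    counts = Counter(cls for _, cls in cases)
    return -sum(n / len(cases) * log2(n / len(cases)) for n in counts.values())

def split(cases, attr):
    """Group cases by their value of attribute attr."""
    groups = {}
    for values, cls in cases:
        groups.setdefault(values[attr], []).append((values, cls))
    return groups

def gain(cases, attr):
    groups = split(cases, attr)
    after = sum(len(g) / len(cases) * entropy(g) for g in groups.values())
    return entropy(cases) - after

def id3(cases, attrs):
    """Return a class label, None for an empty set, or (attr, {value: subtree})."""
    classes = {cls for _, cls in cases}
    if not cases:
        return None                       # STEP 2: empty case set
    if len(classes) == 1:
        return classes.pop()              # STEP 2: single class
    if not attrs:
        # Assumption beyond the text: no attribute left, use the majority class.
        return Counter(cls for _, cls in cases).most_common(1)[0][0]
    a = max(attrs, key=lambda x: gain(cases, x))   # STEP 2: largest gain
    rest = [x for x in attrs if x != a]            # STEP 3: remove a from A
    return (a, {v: id3(g, rest) for v, g in split(cases, a).items()})

def classify(tree, values):
    """Follow the branches of the tree until a leaf (class label) is reached."""
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[values[attr]]
    return tree

S = [((0, 0, 1), 1), ((1, 0, 0), 0), ((1, 1, 1), 1), ((1, 1, 0), 1)]
tree = id3(S, [0, 1, 2])
print(tree)
```

With this tie-breaking rule the root split falls on A₁, as in the text. Within the A₁ = 0 group, A₀ and A₂ have equal gain, so the sketch may pick A₀ where the text picks A₂; either tree classifies the four example cases correctly.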

[0014] In the above, entropy was used as the index of the randomness of the case set before and after a split, but the index is not limited to entropy. Any index that expresses randomness and can be ordered (totally ordered) will do. For example, the logarithm in the entropy may be taken to base 10, or the index may simply be the number of classes in the group minus 1. The performance of the generated knowledge (its ability to estimate unknown cases) does, however, depend on the index chosen. The entropy gain of ID3 is recognized as a simple index that performs well.
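The two alternative indices mentioned above can be tried on the example case set S with the sketch below. The function names are hypothetical, and applying the same frequency weighting to the "number of classes minus 1" index is an assumption; the text defines the weighting only for entropy.

```python
from collections import Counter
from math import log

S = [((0, 0, 1), 1), ((1, 0, 0), 0), ((1, 1, 1), 1), ((1, 1, 0), 1)]

def entropy(cases, base=2):
    """Entropy of the class labels with a configurable logarithm base."""
    counts = Counter(cls for _, cls in cases)
    n = len(cases)
    return -sum(c / n * log(c / n, base) for c in counts.values())

def class_count_minus_one(cases):
    """Number of distinct classes in the group, minus 1."""
    return len({cls for _, cls in cases}) - 1

def decrease(cases, attr, index):
    """Index before the split minus the frequency-weighted index after it."""
    groups = {}
    for values, cls in cases:
        groups.setdefault(values[attr], []).append((values, cls))
    after = sum(len(g) / len(cases) * index(g) for g in groups.values())
    return index(cases) - after

by_entropy10 = [decrease(S, a, lambda g: entropy(g, 10)) for a in range(3)]
by_count = [decrease(S, a, class_count_minus_one) for a in range(3)]
print(by_entropy10, by_count)
```

For this small example both indices rank the attributes in the same way as base-2 entropy: A₁ and A₂ tie for the largest decrease and A₀ is smaller. That agreement is not guaranteed for other case sets.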

[0015] FIG. 3 illustrates the above procedure intuitively (this example is entirely different from that of FIG. 2). FIG. 3 shows cases marked ○ (class 1) and × (class 0) distributed in the multidimensional space spanned by the attribute values (two-dimensional in this example). Splitting on an attribute value as described above amounts to cutting this multidimensional space with a single hyperplane. The upper part of FIG. 3 shows the result of a first cut by one hyperplane. Only three × cases lie on the right side, so the class there is uniquely "0". Maximizing the entropy gain means choosing the hyperplane so that the randomness of the classes becomes as small as possible, even if the classes are not yet separated cleanly at that point.

[0016] In the lower part of FIG. 3, the split is repeated for the left half, which was not separated cleanly. As a result, the multidimensional space is divided into three regions, and the class is completely unique within each region.

[0017]

[Problems to Be Solved by the Invention] ID3 as described above is known as an excellent method, and implementations in expert system construction support tools have been reported. In real applications, however, there are few reports of its use. The main reason is that, in practice, not enough cases can be collected for ID3 to generate knowledge (a decision tree). In other words, there are too few cases to create knowledge (a decision tree) that is good enough for practical use.

[0018] What is possible in practice, and what actually happens, is that a human generates knowledge in his or her head from a small number of cases. This has been the job of the knowledge engineer (KE) or the domain expert. Humans, however, cannot generate perfect knowledge either. When human-created knowledge is put into operation, cases arise (even if only a few) that do not match it. What is needed is to tune the existing knowledge to the new (exceptional) cases. Yet almost no knowledge learning algorithms of this modifying type are known.

[0019] It must be noted here that generating "modified knowledge" from "human-created knowledge plus newly occurring exceptional cases" alone leaves the accuracy in doubt, whatever method is used. The reason is that it is inherently impossible to tell which parts of the human-created knowledge (the background knowledge) are reliable and which parts are uncertain. To borrow the terminology of statistical analysis: when the prior probability is not the true prior probability, one cannot know how far the prior may be revised unless something like a "reliability" of that prior is available. The present invention uses, as this reliability, the actual cases (real cases) from which the background knowledge was generated. That is, the real cases are used as long as splitting can be carried out with the cases actually observed, and the background knowledge is used when no further split can be made from the real cases alone.

[0020] The present invention has been made in view of the above, and its object is to provide a modified knowledge learning device that, taking background knowledge as a premise, generates knowledge capable of explaining a case set that partly includes cases not matching the background knowledge.

[0021]

[Means for Solving the Problems] To achieve the above object, the modified knowledge learning device of the present invention comprises: a real case set containing a plurality of real cases, each consisting of a set of attribute values and one class determined for that set of attribute values; background knowledge that gives, in the form of rules, a decision tree, or the like, the conditions for determining the class of a case from its attribute values; case conversion means that combines temporary cases reflecting the background knowledge with the real cases to form a training case set; and decision tree generation means that takes the training case set as input and, when the case set is split by gathering together the cases that share the same value of each attribute, takes as the split attribute the attribute giving the maximum decrease in an index of the randomness of the case set, extracts the split attribute as a node, and does so recursively to generate a decision tree.

[0022]

[Operation] Because the modified knowledge learning device of the present invention determines the split attribute using not only the real cases but also the temporary cases generated from the background knowledge, knowledge matching the real cases is generated in regions where many real cases exist, and knowledge matching the background knowledge is generated in regions where real cases are few and only temporary cases exist.

[0023]

[Embodiments] Embodiments of the present invention are described below with reference to the drawings.

[0024] FIG. 1 is a block diagram showing the configuration of a modified knowledge learning device according to one embodiment of the present invention. The modified knowledge learning device 1 shown in the figure comprises: a real case set 2 containing a plurality of real cases, each consisting of a set of attribute values and one class determined for that set of attribute values; background knowledge 3 that gives, in the form of rules, a decision tree, or the like, the conditions for determining the class of a case from its attribute values; case conversion means 4 that combines temporary cases reflecting the background knowledge with the real cases to form a training case set; and decision tree generation means 5 that takes the training case set as input and, when the case set is split by gathering together the cases that share the same value of each attribute, takes as the split attribute the attribute giving the maximum decrease in the index of the randomness of the case set, extracts the split attribute as a node, and does so recursively to generate a decision tree.

[0025] The real case set 2 consists of cases observed in the real world (hereinafter called real cases). A case is expressed in the form (A₁, A₂, ..., Aₙ : C); concretely it may be represented as a bit string or as a list structure. The choice is left to those skilled in the art, but the representation must carry at least the attribute values and the class. For example, a real case set may look like the following, where a case has the form (A₀, A₁, A₂ : C), the attributes being numbered from A₀ as in the examples that follow.

[0026]
(1, 0, 0 : 0)
(1, 0, 1 : 0)
(0, 1, 0 : 1)
These are a subset of the following case set; only the three cases above are assumed to have been observed.

[0027]
(0, 0, 0 : 1)
(0, 0, 1 : 1)
(0, 1, 0 : 1)
(0, 1, 1 : 1)
(1, 0, 0 : 0)
(1, 0, 1 : 0)
(1, 1, 0 : 1)
(1, 1, 1 : 1)
These eight cases are instances of the target concept f = (not A₀) or A₁, where "not" denotes negation and "or" denotes logical OR.
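As a quick check, the eight cases can be enumerated from the target concept in Python. The names `target`, `all_cases`, and `observed` are illustrative only.

```python
from itertools import product

def target(a0, a1, a2):
    """Target concept f = (not A0) or A1; A2 does not affect the class."""
    return int((not a0) or a1)

# All eight combinations of the three binary attributes, with their classes.
all_cases = [((a0, a1, a2), target(a0, a1, a2))
             for a0, a1, a2 in product((0, 1), repeat=3)]

# The three cases assumed to have been observed.
observed = [((1, 0, 0), 0), ((1, 0, 1), 0), ((0, 1, 0), 1)]

for values, cls in all_cases:
    print(values, ":", cls)
```

Each of the three observed cases appears in the enumerated list with the same class, so they form a subset of the full case set as stated.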

[0028] The background knowledge 3 is a set of rules, a decision tree, or the like for determining the class of each case belonging to the real case set 2 from its attribute values. The expression f = (not A₀) or A₁ above can be regarded as knowledge expressed as a logical formula. The background knowledge may take any form; what matters is that, given a subset of the attribute values, the class can be computed from the background knowledge. Here "computable" allows for the class not being determined: if the class cannot be identified while attribute values outside the subset remain unspecified, the result is represented explicitly as "class undetermined". It is not permitted, however, for the attribute values to yield more than one class once all of them have been specified. Various methods of representing knowledge and computing the class have already been proposed in the field of knowledge processing and can be realized with conventional techniques, so they are not described further in this specification.

[0029] The case conversion means 4 combines cases corresponding to the background knowledge (hereinafter called "temporary cases") with the real cases to form a training case set. It is one of the components that characterize the present invention and is described in detail below. Because the decision tree generation means 5 described later generates a decision tree from cases, temporary cases are virtual cases generated in order to bring the background knowledge in as cases. Concretely, they have the following form.

[0030]
(*, *, *, ... : C₀)
(*, *, *, ... : C₁)
...
(*, *, *, ... : Cₙ₋₁)
Here "*" indicates that the value of the attribute has not been determined, and C₀, C₁, ..., Cₙ₋₁ enumerate the classes. The above expresses that the class is not determined while none of the attribute values has been determined. The case conversion means 4 generates the training case set for learning from the cases in the real case set 2 and these temporary cases. For the concrete example above, the temporary cases are
(*, *, * : 0)
(*, *, * : 1)
and the training case set is therefore
real (1, 0, 0 : 0)
real (1, 0, 1 : 0)
real (0, 1, 0 : 1)
temp (*, *, * : 0)
temp (*, *, * : 1)
The labels "real" and "temp" in front of each case distinguish real cases from temporary cases.
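The construction of the training case set can be sketched as follows. The tuple layout `(tag, values, cls)`, the string tags, and the function names are assumptions chosen for illustration; the specification leaves the data representation open.

```python
WILDCARD = "*"  # marks an attribute whose value is not yet determined

def make_temporary_cases(num_attrs, classes):
    """One all-wildcard temporary case per class."""
    return [("temp", (WILDCARD,) * num_attrs, c) for c in classes]

def make_training_set(real_cases, classes):
    """Tag the real cases and append the temporary cases."""
    num_attrs = len(real_cases[0][0])
    tagged_real = [("real", values, cls) for values, cls in real_cases]
    return tagged_real + make_temporary_cases(num_attrs, classes)

real_cases = [((1, 0, 0), 0), ((1, 0, 1), 0), ((0, 1, 0), 1)]
training = make_training_set(real_cases, classes=[0, 1])

for tag, values, cls in training:
    print(tag, values, ":", cls)
```

The result holds the three real cases followed by one temporary case for each of the classes 0 and 1, matching the five-element training case set listed above.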

[0031] The decision tree generation means 5 takes the training case set as input and recursively generates a decision tree, taking as each node the attribute that gives the maximum decrease in the index of randomness when the case set is split on it. Some care is needed, however, in handling temporary and real cases. A temporary case inherently stands for multiple cases, so treating it with the same weight as a real case is problematic.

[0032] In the present invention, therefore, weights given in advance are assigned to temporary cases and to real cases, and the decision tree is generated by treating each case as if as many cases as its weight were present. The simplest example is to give real cases an extremely large weight. In that case, the split attribute is determined from the real cases as long as the input case set contains real cases, and from the temporary cases once no real cases remain. This real-case-priority strategy is described in detail below as the first embodiment.
[0033] [Embodiment 1 of the decision tree generation means 5] With the real-case-priority strategy, the decision tree generation means 5 operates in the following steps.

[0034]
(STEP 1) To generate a node of the decision tree, let A be the attribute set of the cases given at the time of splitting at that node, and let S_real and S_art be the case sets, where S_real is the set of real cases and S_art is the set of temporary cases.
(STEP 2) If the case sets S_real and S_art satisfy the stopping conditions described later, stop. Otherwise, for each attribute in the attribute set A, compute the entropy gain obtained by splitting the case set S_real on that attribute, select the one attribute with the largest entropy gain, and call it a. If the case set S_real is empty, however, compute the entropy gain for each attribute in A on the case set S_art instead, and select the one attribute with the largest entropy gain as a. Computing the entropy gain requires determining the class membership of the temporary cases in each group after the split; the details are given later.
(STEP 3) Delete attribute a from the attribute set A and take the result as the new A. Then split the case sets S_real and S_art into groups by attribute a, and call the case sets of the groups S₀, S₁, .... For the case set S_art, rewrite the "*" of attribute a to the attribute value of the branch.
(STEP 4) Execute STEP 2 and STEP 3 recursively on each of S₀, S₁, .... However, when the class of the real cases is unique, form the set after deleting the temporary cases that do not match that class.

[0035] The processing stops under the following conditions.
(1) Both case sets S_real and S_art are empty.
(2) The class of the case set S_real is single and S_art is empty. In this case, the class of the branch is the class of S_real.
(3) The classes of S_real and S_art have each been determined as a single class, and the two classes are the same. In this case, the class of the branch is the class of S_real.
(4) The classes of S_real and S_art have each been determined as a single class, and the two classes conflict. In this case, the class of the branch is the class of S_real.
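The four stopping conditions can be written as a single predicate. This is a minimal sketch: the function name `stop_condition`, the `(stop, class)` return shape, and the use of `None` for the class under condition (1) are assumptions. The situation where S_real is empty and S_art has a single class is not among the four listed conditions, so the sketch reports "do not stop" for it and leaves the handling to the caller.

```python
def stop_condition(s_real, s_art):
    """Return (stop, cls) for lists of (values, cls) cases.

    cls is the class assigned to the branch, or None where the text gives none.
    """
    real_classes = {cls for _, cls in s_real}
    art_classes = {cls for _, cls in s_art}
    if not s_real and not s_art:
        return True, None                        # condition (1)
    if len(real_classes) == 1 and not s_art:
        return True, next(iter(real_classes))    # condition (2)
    if len(real_classes) == 1 and len(art_classes) == 1:
        # conditions (3) and (4): the class of S_real is used either way
        return True, next(iter(real_classes))
    return False, None

# Condition (4): real and temporary classes conflict, the real class wins.
print(stop_condition([((0, 0, 1), 1)], [((0, 0, "*"), 0)]))
```

Conditions (3) and (4) share one branch because both assign the class of S_real, whether or not the temporary cases agree.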

[0036] Conditions (1) and (2) are the same as in the prior-art ID3. Condition (3) means that processing stops when the real cases and the temporary cases all belong to the same class. Condition (4) means that when the real cases and the temporary cases each have a single class but the two conflict, the class of the real cases takes priority. It follows from these conditions that even when the real cases are exhausted (or their class has converged to a single value), branching continues as long as the class of the temporary cases has not been settled.

[0037] To determine the class membership of the temporary cases in each group after a split, it basically suffices to generate, for each branch, cases in which the "*" of the split attribute in each temporary case is fixed to the attribute value of that branch. Each temporary case generated in this way must then be checked against the background knowledge to see whether its class can be determined. Since not all attribute values are known, the class cannot always be determined. At the least, however, a temporary case whose class can no longer occur, whatever values the still-unknown attributes turn out to have, should be deleted. If the class cannot be uniquely determined from the attribute values substituted for "*" so far, the temporary case is kept as long as its class remains possible.

[0038] Various methods of computing the class are conceivable, depending on how the background knowledge is represented. For example, a rule interpreter, a decision tree interpreter, or a program that evaluates logical functions can be used. These knowledge representation methods and ways of implementing the interpreters should be readily apparent to those skilled in the art.

[0039] When, in the entropy gain computation on the real cases, more than one attribute shows the maximum entropy gain, one possible approach is to examine the entropy gain of the temporary cases and select the attribute that maximizes it.

[0040] Concretely, suppose the temporary cases
temp (*, *, * : 0)
temp (*, *, * : 1)
are branched on the first attribute. Copies are first made as follows.

[0041]
Group 0 (A₀ = 0)
temp (0, *, * : 0)
temp (0, *, * : 1)
Group 1 (A₀ = 1)
temp (1, *, * : 0)
temp (1, *, * : 1)
Suppose now that the background knowledge is, for example, f = not A₀. Then in each of groups 0 and 1, one case is inconsistent with the background knowledge. The groups after branching are therefore as follows.

【００４２】
Group 0 (A₀ = 0)
  provisional (0, *, *: 1)
Group 1 (A₀ = 1)
  provisional (1, *, *: 0)

Each group contains a single class, so splitting stops here, at least as far as the provisional cases are concerned. If, however, the background knowledge is f = A₁, the value of A₁ is still undetermined and the class cannot be decided. Both "0" and "1" remain possible, so the groups after the split are as follows.

【００４３】
Group 0 (A₀ = 0)
  provisional (0, *, *: 0)
  provisional (0, *, *: 1)
Group 1 (A₀ = 1)
  provisional (1, *, *: 0)
  provisional (1, *, *: 1)

This concludes the description of the embodiment of the present invention.
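The split-and-prune handling of provisional cases described above can be sketched in Python. This is a minimal illustration and not the patented implementation: the names `possible_classes` and `split_and_prune`, the tuple representation of cases, the binary attribute domain, and the brute-force enumeration of the unknown "*" values are all assumptions made for the example.

```python
from itertools import product

STAR = "*"  # placeholder for an unknown attribute value

def possible_classes(attrs, background, values=(0, 1)):
    """Return every class the background knowledge can yield for some
    completion of the '*' entries (brute force, illustration only)."""
    unknown = [i for i, v in enumerate(attrs) if v == STAR]
    found = set()
    for fill in product(values, repeat=len(unknown)):
        full = list(attrs)
        for i, v in zip(unknown, fill):
            full[i] = v
        found.add(background(tuple(full)))
    return found

def split_and_prune(provisional, attr, background, values=(0, 1)):
    """Copy each provisional case into every branch of `attr`, fix the
    branch value, and drop cases whose class can no longer occur."""
    groups = {}
    for v in values:
        kept = []
        for attrs, cls in provisional:
            if attrs[attr] not in (v, STAR):
                continue
            fixed = attrs[:attr] + (v,) + attrs[attr + 1:]
            if cls in possible_classes(fixed, background, values):
                kept.append((fixed, cls))
        groups[v] = kept
    return groups

# The two provisional cases from the text: (*, *, *: 0) and (*, *, *: 1).
star = (STAR, STAR, STAR)
provisional = [(star, 0), (star, 1)]

pruned = split_and_prune(provisional, 0, lambda a: int(not a[0]))  # f = not A0
unpruned = split_and_prune(provisional, 0, lambda a: a[1])         # f = A1
```

With f = not A₀ each group keeps one provisional case, while with f = A₁ both cases survive in both groups, matching the two group listings in the text.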

【００４４】Some processing examples follow to illustrate the effect of the present invention. First, suppose the following cases are given.

  real (1, 0, 0: 0)
  real (1, 0, 1: 0)
  real (0, 1, 0: 1)

The attributes are A₀, A₁, and A₂, and a human has conjectured f = A₁ as background knowledge. This background knowledge is not perfect, however: the correct background knowledge is actually f = (not A₀) or A₁. Note that the three real cases above can still be explained by the incorrect background knowledge, so no contradiction arises for the moment.

【００４５】Next, suppose a new case, real (0, 0, 1: 1), is input. It contradicts the background knowledge f = A₁. The following training cases are therefore created for the knowledge learning system of the present invention.

【００４６】
  real (1, 0, 0: 0)
  real (1, 0, 1: 0)
  real (0, 1, 0: 1)
  real (0, 0, 1: 1)
  provisional (*, *, *: 0)
  provisional (*, *, *: 1)

When the knowledge learning system of the present invention processes these cases, the entropy gain calculation (which first examines only the real cases) finds that attribute A₀ has the largest entropy gain, so the following split is made.

【００４７】
Group 0 (A₀ = 0)
  real (0, 1, 0: 1)
  real (0, 0, 1: 1)
  provisional (0, *, *: 0)
  provisional (0, *, *: 1)
Group 1 (A₀ = 1)
  real (1, 0, 0: 0)
  real (1, 0, 1: 0)
  provisional (1, *, *: 0)
  provisional (1, *, *: 1)

The class of the real cases in each group is now unique. The class of the provisional cases cannot be determined from the background knowledge f = A₁. However, since the class of the real cases has been settled, provisional cases whose class contradicts it must be pruned. (Here the class is binary, so the class of the provisional cases ends up uniquely determined even though the background knowledge cannot determine it; this is an exceptional situation.) The cases in each group therefore become as follows.

【００４８】
Group 0 (A₀ = 0)
  real (0, 1, 0: 1)
  real (0, 0, 1: 1)
  provisional (0, *, *: 1)
Group 1 (A₀ = 1)
  real (1, 0, 0: 0)
  real (1, 0, 1: 0)
  provisional (1, *, *: 0)

In each group, the real cases and the provisional case share a single class, so processing stops at this stage. The concept actually generated is f = not A₀.
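The attribute choice in this example can be checked with a short Python sketch that computes the entropy gain of each attribute over the four real cases only. The function names, the tuple representation of cases, and the use of base-2 logarithms are assumptions made for the illustration; the source does not specify them.

```python
import math
from collections import Counter

def entropy(classes):
    """Shannon entropy (base 2, an assumption) of a list of class labels."""
    n = len(classes)
    return -sum((k / n) * math.log2(k / n) for k in Counter(classes).values())

def gain(cases, attr):
    """Entropy gain obtained by splitting `cases` on attribute index `attr`."""
    before = entropy([cls for _, cls in cases])
    after = 0.0
    for value in {attrs[attr] for attrs, _ in cases}:
        subset = [cls for attrs, cls in cases if attrs[attr] == value]
        after += len(subset) / len(cases) * entropy(subset)
    return before - after

# The four real cases of the example, written as ((A0, A1, A2), class).
real = [((1, 0, 0), 0), ((1, 0, 1), 0), ((0, 1, 0), 1), ((0, 0, 1), 1)]

gains = [gain(real, attr) for attr in range(3)]
best = max(range(3), key=lambda attr: gains[attr])
```

Here `best` is 0: splitting on A₀ separates the classes completely, A₁ gives a smaller gain, and A₂ gives none, which agrees with the split chosen in the text.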

【００４９】As another example, suppose that for the cases

  real (0, 0, 1: 1)
  real (0, 1, 1: 1)
  real (1, 0, 0: 0)

f = A₂ has been declared as background knowledge. (This background knowledge is entirely wrong, but it is what learning from these cases yields, even when ID3 is used.)

【００５０】Suppose the case real (1, 1, 0: 1) is now input. It does not agree with the background knowledge.

【００５１】The entropy gain is computed for the training case set

  real (0, 0, 1: 1)
  real (0, 1, 1: 1)
  real (1, 0, 0: 0)
  real (1, 1, 0: 1)
  provisional (*, *, *: 0)
  provisional (*, *, *: 1)

In this case, the entropy gain over the real cases is the same whichever attribute is used. Splitting on A₂, however, separates the provisional cases cleanly by class, so A₂ becomes the split attribute. For the group A₂ = 1, the real cases and the provisional case have the same class, so processing stops. For the group A₂ = 0,

  real (1, 0, 0: 0)
  real (1, 1, 0: 1)
  provisional (*, *, 0: 0)

the real cases do not share a single class, so splitting continues; A₁ is selected on the basis of entropy gain and the group is split further. A tree equivalent to the following logical expression is finally obtained.

【００５２】
  f = A₁ + A₂

Here "+" denotes logical OR. This does not match the target concept. Note, however, that the human-supplied knowledge f = A₂ is preserved. In this way, the present invention can learn while retaining as much of the background knowledge as possible.
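The tie-breaking step in this second example can be illustrated with the following Python sketch. The text does not spell out how the entropy gain of the provisional cases is computed; the sketch assumes it is the entropy of the provisional classes before the split minus the size-weighted entropy after each branch has been pruned against the background knowledge. All function names and the case representation are hypothetical.

```python
import math
from collections import Counter
from itertools import product

STAR = "*"

def entropy(classes):
    n = len(classes)
    if n == 0:
        return 0.0
    return -sum((k / n) * math.log2(k / n) for k in Counter(classes).values())

def real_gain(cases, attr, values=(0, 1)):
    """Entropy gain over real cases (no '*' values) for attribute `attr`."""
    groups = [[c for a, c in cases if a[attr] == v] for v in values]
    after = sum(len(g) / len(cases) * entropy(g) for g in groups)
    return entropy([c for _, c in cases]) - after

def possible_classes(attrs, background, values=(0, 1)):
    """Classes reachable for some completion of the '*' entries."""
    unknown = [i for i, v in enumerate(attrs) if v == STAR]
    found = set()
    for fill in product(values, repeat=len(unknown)):
        full = list(attrs)
        for i, v in zip(unknown, fill):
            full[i] = v
        found.add(background(tuple(full)))
    return found

def provisional_gain(provisional, attr, background, values=(0, 1)):
    """Assumed definition: gain over provisional cases after pruning."""
    groups = []
    for v in values:
        kept = []
        for attrs, cls in provisional:
            if attrs[attr] in (v, STAR):
                fixed = attrs[:attr] + (v,) + attrs[attr + 1:]
                if cls in possible_classes(fixed, background, values):
                    kept.append(cls)
        groups.append(kept)
    total = sum(len(g) for g in groups)
    after = sum(len(g) / total * entropy(g) for g in groups) if total else 0.0
    return entropy([c for _, c in provisional]) - after

def choose_attribute(real, provisional, attrs, background):
    """Pick the best attribute on real cases; break ties with provisional gain."""
    gains = {a: real_gain(real, a) for a in attrs}
    top = max(gains.values())
    tied = [a for a in attrs if math.isclose(gains[a], top, abs_tol=1e-12)]
    return max(tied, key=lambda a: provisional_gain(provisional, a, background))

real = [((0, 0, 1), 1), ((0, 1, 1), 1), ((1, 0, 0), 0), ((1, 1, 0), 1)]
star = (STAR, STAR, STAR)
provisional = [(star, 0), (star, 1)]
chosen = choose_attribute(real, provisional, [0, 1, 2], lambda a: a[2])  # f = A2
```

All three attributes tie on the real cases, and `chosen` is 2 (attribute A₂), because only a split on A₂ lets the background knowledge f = A₂ separate the provisional cases by class.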

【００５３】The computational cost of ID3 is first order (linear) in the number of cases, and ID3 is known to be fast. The method of the present invention does nothing to impair this property of ID3, so learning can be carried out with little computation.

【００５４】[Embodiment 2 of decision tree generating means 5] In the present invention, the provisional cases supplied at the start of decision tree generation use "*" as their attribute values, so in a sense they correspond to all cases that could be input to the background knowledge (that is, all cases that could be distributed in the multidimensional space). Accordingly, the provisional cases contained in each case set considered for splitting correspond to the points that can lie in the region of the multidimensional space delimited up to that stage. This raises the following questions.

【００５５】
(1) A single real case is one case, whereas a single provisional case actually represents many cases. Is it then acceptable simply to delete a provisional case just because the background knowledge fixes its class to a single value and that class conflicts with a real case?
(2) Real cases also contain noise and are not always true. If only one of the real cases specifies a different class and the remaining real cases assert the same class as the provisional case, might it not be acceptable at this stage to take the class of the provisional case as the class of that real case set?

To handle these situations, it may be advantageous to compute the entropy gain over both real and provisional cases together, rather than computing the entropy gain of the provisional cases only when the real cases alone cannot settle the choice.

【００５６】It is therefore conceivable to attach weights to the real cases, the provisional cases, or both, and to compute the entropy gain as though each case were present as many times as its weight. Various weighting schemes are possible; the following two are representative.

【００５７】
(a) Give each provisional case a numerical value fixed in advance. The value may be the same for all provisional cases, or the user may be asked beforehand to specify it explicitly as a confidence level for each provisional case.
(b) Divide the weight of a provisional case each time an attribute is selected and the case set is split. Suppose, for example, that an attribute whose value is given as "*" becomes the split attribute. The weight of the provisional case can then be divided by the number of branches: with three branches, each branch receives one third of the weight.

【００５８】For scheme (a) above, in which the provisional cases and the real cases are each given weights, the decision tree generating means 5 operates in the following steps. The configuration other than the decision tree generating means 5 is the same as in the embodiment described above.

【００５９】
(STEP 1) To generate a node of the decision tree, let A be the attribute set of the cases given when the node is split, and let S_real and S_art be the case sets, where S_real is the set of real cases and S_art is the set of provisional cases. The union of these two case sets is the case set S. The provisional cases and the real cases are each assigned the weights given in advance.
(STEP 2) If the case set S satisfies a stopping condition described below, stop. The class determined by S is then the majority class among the cases in S. Otherwise, for each attribute in A, compute the entropy gain obtained by splitting S on that attribute, and select the one attribute with the largest entropy gain; call it a.
(STEP 3) Delete attribute a from A and take the result as the new A. Split S into groups by attribute a, and call the case sets of the groups S₀, S₁, .... For the cases in S_art, rewrite the "*" of attribute a to the attribute value of the branch.
(STEP 4) Recursively execute STEP 2 and STEP 3 for each of S₀, S₁, ....

【００６０】Various stopping conditions for STEP 2 have been developed, and those well known to persons skilled in the art can be applied. Examples include the following.
(1) The case set S is empty.
(2) The case set S contains a single class. In this case, the class of the branch is the class of the case set S_real.
(3) The number of cases in S (a weighted case counts as its weight's worth of cases) falls below a predetermined limit.
(4) Whichever attribute is used to split S, the gain obtained is below a predetermined lower bound on the entropy gain, so the value of branching is doubtful.
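STEP 1 to STEP 4 with fixed case weights, together with stopping conditions (1), (2), and (4), can be sketched in Python as below. This is a rough illustration under several assumptions that the source does not state: cases are `(attributes, class, weight)` tuples, a provisional case with "*" is copied into every branch with its weight unchanged, the post-split entropy is normalized by the total weight after copying, logarithms are base 2, and no pruning against background knowledge is performed. The names `build`, `split`, and `min_gain` are hypothetical.

```python
import math
from collections import defaultdict

STAR = "*"  # unknown attribute value in a provisional case

def class_weights(cases):
    totals = defaultdict(float)
    for _, cls, weight in cases:
        totals[cls] += weight
    return totals

def entropy(cases):
    """Weighted entropy: weights stand in for case counts."""
    totals = class_weights(cases)
    n = sum(totals.values())
    if n == 0:
        return 0.0
    return -sum((k / n) * math.log2(k / n) for k in totals.values() if k > 0)

def majority(cases):
    """Weighted majority class, or None for an empty case set."""
    totals = class_weights(cases)
    return max(totals, key=totals.get) if totals else None

def split(cases, attr, values):
    """STEP 3: group by `attr`; '*' is rewritten to each branch value."""
    groups = {v: [] for v in values}
    for attrs, cls, weight in cases:
        for v in values:
            if attrs[attr] in (v, STAR):
                fixed = attrs[:attr] + (v,) + attrs[attr + 1:]
                groups[v].append((fixed, cls, weight))
    return groups

def gain(cases, attr, values):
    groups = split(cases, attr, values).values()
    total = sum(w for g in groups for _, _, w in g)
    if total == 0:
        return 0.0
    after = sum(sum(w for _, _, w in g) / total * entropy(g) for g in groups)
    return entropy(cases) - after

def build(cases, attrs_left, values=(0, 1), min_gain=1e-9):
    """STEP 2 and STEP 4: stop or pick the best attribute and recurse."""
    classes = {cls for _, cls, _ in cases}
    if len(classes) <= 1 or not attrs_left:      # conditions (1) and (2)
        return majority(cases)
    best = max(attrs_left, key=lambda a: gain(cases, a, values))
    if gain(cases, best, values) <= min_gain:    # condition (4)
        return majority(cases)
    rest = [a for a in attrs_left if a != best]
    children = split(cases, best, values)
    return (best, {v: build(g, rest, values, min_gain) for v, g in children.items()})

# First worked example: real cases weigh 1.0, provisional cases 0.5 (assumed).
star = (STAR, STAR, STAR)
cases = [((1, 0, 0), 0, 1.0), ((1, 0, 1), 0, 1.0), ((0, 1, 0), 1, 1.0),
         ((0, 0, 1), 1, 1.0), (star, 0, 0.5), (star, 1, 0.5)]
tree = build(cases, [0, 1, 2])
```

With these assumed weights the tree splits once on A₀ and then stops because no further split yields a positive gain; the leaves are decided by weighted majority, so the result again corresponds to f = not A₀.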

【００６１】[Embodiment 3 of decision tree generating means 5] The configuration of the decision tree generating means 5 for the case where the provisional cases are given fixed weights has been described above. Next, the configuration of the decision tree generating means 5 for scheme (b) above, in which the weight of a provisional case is reduced at each branch, is described. It is almost identical to scheme (a); the only difference is the weight adjustment made when branching. Specifically, it operates in the following steps. The configuration other than the decision tree generating means 5 is the same as in the embodiment described above.

【００６２】
(STEP 1) To generate a node of the decision tree, let A be the attribute set of the cases given when the node is split, and let S_real and S_art be the case sets, where S_real is the set of real cases and S_art is the set of provisional cases. The union of these two case sets is the case set S. The provisional cases and the real cases are each assigned the weights given in advance.
(STEP 2) If the case set S satisfies the stopping conditions given for scheme (a), stop. The class determined by S is then the majority class among the cases in S. Otherwise, for each attribute in A, compute the entropy gain obtained by splitting S on that attribute, and select the one attribute with the largest entropy gain; call it a. When the case set is split on an attribute, a provisional case whose value for that attribute is "*" is sent down every branch, each copy carrying the weight divided by the number of branches.
(STEP 3) Delete attribute a from A and take the result as the new A. Split S into groups by attribute a, and call the case sets of the groups S₀, S₁, .... When the case set is split on an attribute, a provisional case whose value for that attribute is "*" is sent down every branch, each copy carrying the weight divided by the number of branches. For the cases in S_art, rewrite the "*" of attribute a to the attribute value of the branch.
(STEP 4) Recursively execute STEP 2 and STEP 3 for each of S₀, S₁, ....
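The weight adjustment that distinguishes scheme (b) can be sketched as a small Python function. It is only an illustration: the function name, the `(attributes, class, weight)` tuple layout, and the assumption that `values` lists every possible value of the attribute are not taken from the source.

```python
STAR = "*"  # unknown attribute value in a provisional case

def split_dividing_weight(cases, attr, values):
    """Split cases on `attr`. A case whose value is '*' goes down every
    branch with its weight divided by the number of branches; any other
    case goes down its own branch with its weight unchanged."""
    groups = {v: [] for v in values}
    for attrs, cls, weight in cases:
        if attrs[attr] == STAR:
            share = weight / len(values)
            for v in values:
                fixed = attrs[:attr] + (v,) + attrs[attr + 1:]
                groups[v].append((fixed, cls, share))
        else:
            groups[attrs[attr]].append((attrs, cls, weight))
    return groups

# A three-valued attribute, as in the "three branches" remark in the text.
cases = [((STAR, STAR), 0, 1.0), ((2, 0), 1, 1.0)]
groups = split_dividing_weight(cases, 0, (0, 1, 2))
```

Each branch receives a copy of the provisional case with weight one third, so the total weight of the case set is the same before and after the split.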

【００６３】In Embodiments 2 and 3 of the decision tree generating means 5, the "weight" assigned to a case may not be an integer. Note that this poses no obstacle to computing the entropy gain. Computing the entropy of a case set requires evaluating

  −(k/n) log (k/n)

where n is the number of cases in the whole case set (the sum of the weights) and k is the number of cases belonging to a given class (the sum of their weights). This calculation does not require k or n to be integers. The entropy gain can therefore be computed without difficulty even for a case set with real-valued weights.
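A brief Python sketch shows the entropy term evaluated with non-integer weights. The function name and the choice of base-2 logarithms are assumptions; the source writes only "log".

```python
import math

def weighted_entropy(class_weights):
    """Sum of -(k/n) log2(k/n) over classes, where k is the total weight of
    one class and n the total weight of the case set. Neither has to be
    an integer."""
    n = sum(class_weights.values())
    if n == 0:
        return 0.0
    return -sum((k / n) * math.log2(k / n)
                for k in class_weights.values() if k > 0)

# Two real cases of class 1 plus provisional cases of weight 0.5 per class.
mixed = weighted_entropy({0: 0.5, 1: 2.5})
balanced = weighted_entropy({0: 1.5, 1: 1.5})
```

An evenly balanced set gives an entropy of 1 bit whatever the scale of the weights, and the lopsided set gives a smaller value, exactly as with integer counts.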

【００６４】[0064]

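[Effects of the Invention] As described above, according to the present invention, knowledge that cannot be learned from cases alone can be supplied as background knowledge, and new knowledge consistent with both the cases and the background knowledge can be learned. Knowledge can also be learned from a small number of cases, even when learning from the cases alone would not be possible.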
Knowledge that cannot be learned only from cases can be given as background knowledge, and knowledge that matches both cases and background knowledge can be newly learned. Further, even if the learning cannot be done only from the cases, the knowledge learning can be done from the few cases.

【００６５】Furthermore, according to the present invention, fast learning whose cost is proportional to the number of cases is possible, and even when the real cases and the background knowledge contradict each other, knowledge that combines both sources of information according to predetermined weights can be learned.

[Brief description of drawings]

【図１】FIG. 1 is a block diagram showing the configuration of a correction type knowledge learning device according to an embodiment of the present invention.

【図２】FIG. 2 is a diagram showing an example of the prior art expressed in decision tree form.

【図３】FIG. 3 is a diagram illustrating region division.

[Explanation of symbols]

1 Correction type knowledge learning device
2 Real case set
3 Background knowledge
4 Case conversion means
5 Decision tree generating means

Claims

[Claims]

1. A real case set including a plurality of real cases, each consisting of a set of attribute values and one class determined in correspondence with that set of attribute values; background knowledge that gives conditions for determining the class of a case from its attribute values, in the form of rules, a decision tree, or the like; case conversion means for forming a training case set by combining provisional cases reflecting the background knowledge with the real cases; and decision tree generating means for receiving the training case set as input and, when the case set is divided by grouping together cases having the same value for each attribute, taking as the division attribute the attribute that yields the largest decrease in an index indicating the randomness of the case set, and recursively extracting division attributes as nodes to generate a decision tree.