JPH0298775A

JPH0298775A - Method and device for generating decision list

Info

Publication number: JPH0298775A
Application number: JP63251334A
Authority: JP
Inventors: Kenji Yamanishi; 健司山西
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1988-10-04
Filing date: 1988-10-04
Publication date: 1990-04-11
Anticipated expiration: 2012-02-12
Also published as: JP2581196B2

Abstract

PURPOSE: To classify and predict, with high accuracy, data that an information source outputs frequently, by selecting decisions in order of maximum information gain so that data are classified first by the decisions with the highest discriminating ability. CONSTITUTION: With the number of literals composing one conjunction fixed, conjunctions with high information gain on the observed data are appended one after another as the decisions of a decision list. A stopping rule based on the misclassification rate is provided; appending stops when the rule is satisfied, and the decision list obtained at that point is output. In this way, unknown data generated by the same information source as the observed data can be classified and predicted with high accuracy.

Description

[Detailed Description of the Invention] (Industrial Field of Application) This invention relates to a method for generating a decision list from multiple noisy observations, each given as a set of {0, 1}-valued attributes together with a class.

(Prior Art) One way to derive and express the structural relationship between attributes and classes from observed data, each given as a set of {0, 1}-valued attributes together with a class, is to generate classification rules in the form of a decision list.

The concept of a decision list is described in the paper "Learning decision lists" by R. L. Rivest, published in 1987 in the U.S. journal Machine Learning, vol. 2, pp. 229-246. As far as binary data are concerned, it is known to be the most expressive of the currently known representations of classification rules. That paper describes a method for constructing, from noise-free observed data, a decision list that is consistent with all of the observed data.

(Problem to Be Solved by the Invention) The decision-list generation method given in the above paper constructs, from noise-free observed data, a decision list that is consistent with all of the observed data. In practice, however, observed data generally contain noise. What is needed is therefore a method that yields a decision list classifying data from the same information source as correctly as possible, even if it cannot account for every observation. No means of generating such a decision list has existed.

An object of the present invention is to provide a method for automatically generating, from noisy observed data, a decision list that minimizes the prediction error on unknown data.

(Means for Solving the Problem) One decision-list generation method according to the present invention is characterized by comprising: a step of, with the number of literals composing one conjunction fixed, successively selecting and appending, as the conjunction that defines each decision of the decision list, the conjunction that maximizes the information gain on the observed data; and a step of taking, as the final decision list, the decision list obtained when the appending of decisions is terminated by a predetermined stopping rule.

Alternatively, the above invention can be modified as follows. That is, a decision-list generation method characterized by comprising: a step of first generating a decision list by the above method; and a step of computing, for each member of the series of decision lists obtained by successively removing decision terms from the end of that decision list, the description length of the observed data when described using that decision list, and taking as the final decision list the one that minimizes the description length within the series. (The two methods above are referred to as [Invention 1].)

Another decision-list generation method according to the present invention is characterized by comprising: a step of, with the maximum number of literals composing one conjunction (denoted k) fixed, successively selecting and appending, as the conjunction that defines each decision of the decision list, the conjunction that maximizes the information gain on the observed data; a step of obtaining, for multiple values of k, the decision list obtained when the appending of decisions is terminated by a predetermined stopping rule; and a step of computing and comparing the description lengths of the observed data described using the decision lists generated by repeating the above generation process for each k, and taking as the final decision list the one that minimizes the description length.

Alternatively, the above invention can be modified as follows. That is, a decision-list generation method characterized by comprising: a step of obtaining a decision list for each fixed k by the above method; a step of computing, for each member of the series of decision lists obtained by successively removing decision terms from the end of that decision list, the description length of the observed data when described using that decision list, and taking the decision list that minimizes the description length within the series as the best decision list for that k; and a step of obtaining the best decision lists for multiple values of k and taking as the final decision list the one among them that minimizes the description length. (Hereinafter the two methods above are referred to as "Invention 2".)

One decision-list generation device according to the present invention is characterized by comprising: means for storing observed data; and means for, with the number of literals composing one conjunction fixed, successively selecting and appending, as the conjunction that defines each decision of the decision list, the conjunction that maximizes the information gain on the observed data, and outputting as the final decision list the decision list obtained when the appending of decisions is terminated by a predetermined stopping rule.

Alternatively, the above invention can be modified as follows. That is, a decision-list generation device characterized by comprising: means for first generating a decision list with the above device; means for generating the series of decision lists obtained by successively removing decision terms from the end of that decision list; means for computing, for each member of the series, the description length of the observed data when described using that decision list; means for storing the description lengths and the series of decision lists; and means for outputting, as the final decision list, the decision list that minimizes the description length within the series. (The two devices above are referred to as "Invention 3".)

Another decision-list generation device according to the present invention is characterized by comprising: means for storing observed data; means for obtaining, for multiple values of k, the decision list that results when, with the maximum number of literals composing one conjunction (denoted k) fixed, the conjunction maximizing the information gain on the observed data is successively selected and appended as the conjunction defining each decision, and the appending of decisions is terminated by a predetermined stopping rule; means for storing the decision list obtained for each k; means for computing the description length of the observed data described using the decision list generated by repeating the above generation process for each k; means for storing the description lengths; and means for outputting, as the final decision list, the decision list that minimizes the description length among them.

Alternatively, the above invention can be modified as follows. That is, a decision-list generation device characterized by comprising: means for obtaining a decision list for a fixed k with the above device; means for generating the series of decision lists obtained by successively removing decision terms from the end of that decision list; means for computing, for each member of the series, the description length of the observed data when described using that decision list; means for storing the description lengths and the series of decision lists; means for obtaining, for each k, the decision list that minimizes the description length within the series as the best decision list; means for storing the best decision lists for multiple values of k; and means for comparing the description lengths of the best decision lists for the multiple values of k and outputting the minimizing one as the final decision list. (The two devices above are referred to as "Invention 4".)

(Operation) First, decision lists are explained. Let n be the number of variables taking values in {0, 1}, and let k be a positive integer not exceeding n. Let $L_n = \{x_1, \bar{x}_1, \ldots, x_n, \bar{x}_n\}$, where $\bar{x}_i = 1 - x_i$ ($i = 1, \ldots, n$). An element of $L_n$ is called a literal, and a conjunction of literals is called a term.

k-DL(n) denotes the set of lists of pairs, each pair consisting of an element $t_j$ ($j = 1, \ldots, r$) of $T^n_k$ and a value $v_j$ ($j = 1, \ldots, r$) in {0, 1}, written as in (1) below. Here $T^n_k$ denotes the set of all terms expressed with at most k distinct literals of $L_n$ (a term never contains $x_i$ and $\bar{x}_i$ together), and the last function $t_r$ is the constant function that returns true.
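The size of $T^n_k$ can be checked with a short sketch. It assumes, consistently with the counts quoted later in this description (#T = 131 for n = 5, k = 3; 473 for n = 6, k = 4; 665 for n = 6, k = 5), that the empty conjunction (the constant term true) is included. The function names and the dictionary representation of a term are illustrative choices, not part of the patent.

```python
from itertools import combinations, product
from math import comb


def count_terms(n: int, k: int) -> int:
    """Count conjunctions of at most k literals over n variables.

    A term picks j distinct variables and a polarity for each, so there are
    C(n, j) * 2**j terms of size j; j = 0 is the constant term true.
    """
    return sum(comb(n, j) * 2 ** j for j in range(k + 1))


def enumerate_terms(n: int, k: int):
    """Yield every term as a dict {variable index (1-based): required value}."""
    for j in range(k + 1):
        for idxs in combinations(range(1, n + 1), j):
            for vals in product((0, 1), repeat=j):
                yield dict(zip(idxs, vals))


print(count_terms(5, 3), count_terms(6, 4), count_terms(6, 5))
```

Because a variable index appears at most once per term, no term can contain both a literal and its negation, which matches the restriction stated above.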

$(t_1, v_1), \ldots, (t_r, v_r)$  (1)

For any $x \in X_n$ ($X_n$ denotes the set of all n-bit Boolean vectors), a decision list yields the value $v_j$ for the first j such that $t_j(x) = 1$. The pair $(t_j, v_j)$ is called the j-th decision (or simply a decision). For example, the following is an element of 3-DL(4).

$(x_1 x_2, 1), (x_2 x_3 x_4, 0), (x_3 x_4, 1), (\mathrm{true}, 0)$  (2)

[The overbars marking negated literals, and some variable indices, are not legible in the source text; the indices shown are a best reading.] The decision process by which this decision list classifies data is shown in FIG. 7.
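The evaluation rule just described (return the value of the first decision whose term is true) can be sketched as follows. The example list is hypothetical: it borrows the variable indices of (2), but the polarities of the literals are invented for illustration, since the negation marks of the original are lost. The representation of a term as a dictionary is likewise an assumption.

```python
def term_true(term, x):
    """A term is a dict {1-based variable index: required value}; {} is true."""
    return all(x[i - 1] == val for i, val in term.items())


def classify(decision_list, x):
    """Return v_j for the first j whose term t_j is true on x."""
    for term, value in decision_list:
        if term_true(term, x):
            return value
    raise ValueError("a decision list must end with the constant term true")


# Hypothetical element of 3-DL(4); polarities are illustrative only.
example = [
    ({1: 1, 2: 0}, 1),        # if x1 and not x2 then 1
    ({2: 1, 3: 1, 4: 0}, 0),  # else if x2 and x3 and not x4 then 0
    ({3: 1, 4: 1}, 1),        # else if x3 and x4 then 1
    ({}, 0),                  # else 0
]

print(classify(example, (0, 1, 1, 0)))
```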

As shown in FIG. 7, a decision list is a generalization of concept classification rules of the "if-then-else if-then-..." type. The problem is what method or device should be used to generate a decision list so that, from multiple observed data described by n attributes and one class, the resulting list predicts data generated by the same information source as accurately as possible. The present invention answers this problem. The observed data are assumed to contain noise.

One way to answer this problem is as follows. Decisions with high information gain on the observed data (measured by the decrease in entropy, misclassification rate, Gini index, or the like) are added in order as decisions of the decision list. A stopping rule based on the misclassification rate is provided; when it is satisfied, the addition of decisions stops and the decision list obtained at that point is output. Alternatively, for each member of the series of decision lists obtained by successively removing decision terms from the end of the decision list so obtained, the description length of the observed data described using that decision list (the code length needed to encode everything in a self-contained manner) is computed, and the decision list that minimizes the description length within the series is taken as the final decision list.

Decisions are picked in order of maximum information gain because classifying data first by the decisions with the highest discriminating ability gives more accurate classification and prediction for the data that the information source produces frequently. The decision list that minimizes the description length is chosen because, when the observed data are encoded in a self-contained manner using a model, a model (here, a decision list) that compresses the whole data more is, asymptotically, a model that predicts unknown observations from the same information source more accurately. The theoretical support for this, in particular as applied to statistical models, is given as the MDL criterion in the paper "Stochastic complexity and modeling" by J. Rissanen, published in 1986 in the U.S. journal Annals of Statistics, vol. 14, pp. 1080-1100. The novelty of the present invention, however, lies in applying the MDL criterion to the optimal modeling of logical concept learning, such as decision lists used as classification rules.

The above is the principle of the method of Invention 1; Invention 3 realizes the same principle in the form of a device. The information gain mentioned above is defined precisely below. First, the notation is fixed as follows.

[Notation] All logarithms below are to base 2.

・$k, n \in \mathbb{N}$ (the set of natural numbers) are fixed.

・$S \triangleq \{\langle i, x, c \rangle\}$ (training data), where $i$ (object number) $\in \mathbb{N}$, $x$ (attribute values) $\in \{0, 1\}^n$ and $c$ (class) $\in \{0, 1\}$: the set of training data. (The attribute values are the data values of the attribute variables $X_1, \ldots, X_n$.)

・$S_A(t) = \{\langle i, x, c \rangle \in S : t(x) = 1\}$, $t \in T^n_k$

・$S_B(t) = \{\langle i, x, c \rangle \in S : t(x) = 0\}$, $t \in T^n_k$

・$A(t) = \# S_A(t)$ (hereinafter $\# W$ denotes the number of elements of a set $W$), $B(t) = \# S_B(t)$

・$A^+(t) = \# \{\langle i, x, c \rangle \in S_A(t) : c = 1\}$, $t \in T^n_k$

・$A^-(t) = \# \{\langle i, x, c \rangle \in S_A(t) : c = 0\}$, $t \in T^n_k$

・$B^+(t) = \# \{\langle i, x, c \rangle \in S_B(t) : c = 1\}$, $t \in T^n_k$

・$B^-(t) = \# \{\langle i, x, c \rangle \in S_B(t) : c = 0\}$, $t \in T^n_k$

Here $A(t) = A^+(t) + A^-(t)$ and $B(t) = B^+(t) + B^-(t)$.
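As a concrete reading of these definitions, the following sketch counts $A^+(t)$, $A^-(t)$, $B^+(t)$ and $B^-(t)$ for one term over a small training set. The triple layout `(i, x, c)` follows the notation above; the function name, the dictionary form of a term and the sample data are illustrative assumptions.

```python
def split_counts(data, term):
    """Return (A+, A-, B+, B-) for a term over data given as (i, x, c) triples.

    `term` is a dict {1-based variable index: required value}. A record is
    counted under A when the term is true on x, and under B otherwise.
    """
    a_pos = a_neg = b_pos = b_neg = 0
    for _i, x, c in data:
        covered = all(x[j - 1] == val for j, val in term.items())
        if covered and c == 1:
            a_pos += 1
        elif covered:
            a_neg += 1
        elif c == 1:
            b_pos += 1
        else:
            b_neg += 1
    return a_pos, a_neg, b_pos, b_neg


sample = [(1, (1, 1), 1), (2, (1, 0), 0), (3, (1, 1), 0), (4, (0, 1), 1)]
print(split_counts(sample, {1: 1, 2: 1}))
```

By construction the four counts sum to the size of the data, so $A(t) + B(t) = \# S$.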

・$\Delta I(t) \triangleq I_B(t') - I(t)$  (4)

is called the information gain of the decision $(t, v)$.

($t'$ denotes the term of the decision that appears immediately before $t$ in the decision list.) The following choices of $I_A(t)$ and $I_B(t)$ can be considered.
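The specific formulas offered for $I_A(t)$ and $I_B(t)$ did not survive in this text, but the description names entropy, misclassification rate and the Gini index as the underlying measures. The sketch below gives the standard textbook forms of these three measures for a two-class subset with `pos` records of class 1 and `neg` of class 0. These are assumptions; the patent's own definitions may differ, for instance in normalization or weighting by subset size.

```python
from math import log2


def entropy(pos: int, neg: int) -> float:
    """Binary entropy in bits of the class distribution (0 for a pure subset)."""
    total = pos + neg
    if pos == 0 or neg == 0:
        return 0.0
    p = pos / total
    return -p * log2(p) - (1 - p) * log2(1 - p)


def misclassification(pos: int, neg: int) -> float:
    """Error rate when every record is assigned the majority class."""
    total = pos + neg
    return min(pos, neg) / total if total else 0.0


def gini(pos: int, neg: int) -> float:
    """Gini index 1 - p^2 - (1 - p)^2."""
    total = pos + neg
    if total == 0:
        return 0.0
    p = pos / total
    return 1.0 - p * p - (1 - p) * (1 - p)


print(entropy(1, 1), misclassification(3, 1), gini(1, 1))
```

All three vanish on a pure subset and peak when the classes are balanced, which is what makes their decrease usable as a gain measure.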

[Equations (5) to (7), which define these candidate measures, are not legible in the source text.]

Next, the description length of all the observed data using a decision list is defined precisely. The description length of the data is given as the sum of the description length of the decision list itself and the description length of the exceptions among the data classified by the decision list.

For example, for the decision list (2), $\# T^5_3 = 131$ (the term corresponding to the constant function true is also counted, as 0). The term of the first decision of (2) is therefore selected with probability 1/131, and encoding it in a self-contained manner requires a code length of log(131) bits. Huffman coding is used as the coding method here. For each decision it must also be stated whether 1 or 0 is assigned to the data that make the term true, which takes 1 bit. The first decision of (2) therefore needs log(131) + 1 bits in all. For the next decision, the term is selected from the 130 terms of $T^5_3$ that remain after removing the term used in the first decision, so log(130) + 1 bits are needed in the same way. Computing the description length of each decision in this way and summing gives the description length of the decision list itself. For (2) the total is (log(131) + 1) + (log(130) + 1) + (log(129) + 1) + (log(128) + 1) bits.

As for the description length of the exception data, suppose for example that one decision has decision value 1, that seven data make its term true, and that five of them have class 1 and two have class 0. Suppose that writing down the classes of these data in ascending order of object number gives 1101101. The code length needed to encode this sequence in a self-contained manner is then $\log(7 + 1) + \log {}_7C_2$ bits, or, if it is known from the start that the number of exceptions never exceeds $\lceil (7 + 1)/2 \rceil$, $\log(\lceil (7 + 1)/2 \rceil + 1) + \log {}_7C_2$ bits.
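The arithmetic of this worked example can be reproduced with a short sketch. The function name `model_bits` is an illustrative assumption; the formulas are the ones stated above (log of the number of remaining candidate terms plus one bit per decision, and log(7 + 1) + log C(7, 2) for the exception sequence).

```python
from math import comb, log2


def model_bits(num_terms: int, num_decisions: int) -> float:
    """Bits for the decision list itself.

    The j-th decision (j = 0, 1, ...) selects its term among the
    num_terms - j terms not yet used and spends one more bit on its value.
    """
    return sum(log2(num_terms - j) + 1 for j in range(num_decisions))


# Decision list (2): four decisions drawn from 131 candidate terms.
list_bits = model_bits(131, 4)

# Exception sequence 1101101: seven covered data, two exceptions.
exception_bits = log2(7 + 1) + log2(comb(7, 2))
bounded_exception_bits = log2(4 + 1) + log2(comb(7, 2))  # ceil((7 + 1) / 2) = 4

print(round(list_bits, 3), round(exception_bits, 3), round(bounded_exception_bits, 3))
```

Under these formulas the list itself costs roughly 32 bits and the exception sequence roughly 7.4 bits, or slightly less when the tighter bound on the number of exceptions is used.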

In general, let N be the length of the sequence, let $b = b_1$ or $b_2$, where $b_1 = N$ and $b_2 = \lceil (N + 1)/2 \rceil$, and let h be the number of exceptions. The description length needed to describe the exceptions is then given by

$\log(b + 1) + \log {}_N C_h$ bits  (8)

or

$L_N(h) + \log {}_N C_h$ bits.  (9)

Here $L_N(h)$ denotes the code length of a self-contained encoding of a natural number contained in {0, 1, ..., N}, and satisfies the following.

$L_N(0) = 1$

$L_N(k) = 1 + \log k + \log \log k + \cdots + C_N$  ($k > 1$)  (10)

The sum above is taken over the positive terms only, and $C_N$ is the real number satisfying $\sum_{k=0}^{N} 2^{-L_N(k)} = 1$.
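Formulas (8) and (10) can be sketched as follows. Two points are assumptions made here, because the source text is damaged at this spot: the normalizing sum is taken over k = 0, ..., N, and (10) is applied from k = 1 upward. The function names are illustrative.

```python
from math import ceil, comb, log2


def exception_bits(n: int, h: int, half_bound: bool = False) -> float:
    """Formula (8): log(b + 1) + log C(N, h), with b = N or ceil((N + 1) / 2)."""
    b = ceil((n + 1) / 2) if half_bound else n
    return log2(b + 1) + log2(comb(n, h))


def iterated_log_sum(k: int) -> float:
    """log k + log log k + ..., keeping only the positive terms (base 2)."""
    total = 0.0
    value = log2(k)
    while value > 0:
        total += value
        value = log2(value)
    return total


def normalizer(n: int) -> float:
    """C_N chosen so that sum over k = 0..N of 2**(-L_N(k)) equals 1.

    With L_N(0) = 1 this reduces to C_N = log2(sum_{k=1..N} 2**(-ils(k))).
    """
    return log2(sum(2 ** -iterated_log_sum(k) for k in range(1, n + 1)))


def integer_code_bits(k: int, n: int) -> float:
    """L_N(k) as in (10)."""
    if k == 0:
        return 1.0
    return 1.0 + iterated_log_sum(k) + normalizer(n)


print(exception_bits(7, 2), integer_code_bits(2, 7))
```

Replacing `log2(b + 1)` by `integer_code_bits(h, n)` gives formula (9), which spends fewer bits on small exception counts than a uniform code over {0, ..., b}.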

・The sum of the description length of the decision list itself and the description length of the exception data, both computed by the method given above, is the description length of all the training data when the decision list is used.

Using the notions of information gain and description length described above, a decision list can be optimized for given observed data. As an example, consider a set of observed data with six attributes [the data table is not reproduced in this text]. Among these data, the pairs 3 and 9, 4 and 11, 12 and 20, and 19 and 24 are contradictory. According to Invention 1, the decision list for a fixed value of k is first built by adding decisions in order of maximum information gain; the procedure stops according to a predetermined stopping rule, and the decision list obtained at that point is taken as the result. This yields the following decision lists (for k = 4 and k = 5, written DL*(4) and DL*(5) respectively).

Here entropy is used for the information gain. As the conditions for stopping the addition of decisions, with v the decision value of the preceding decision and t the term that maximizes the information gain, the following conditions i), ii) and iii) are adopted.

i) If $\min(A^+(t), A^-(t))/A(t) > \alpha$ (= 0.25), and either $B^+(t') < B^-(t')$ and $v = 1$, or $B^+(t') > B^-(t')$ and $v = 0$, then $(\mathrm{true}, 1 - v)$ is appended on the right as a decision, the list is output, and the procedure stops.

ii) If $\min(B^+(t), B^-(t))/B(t) \le \beta$ (= 0.29), and either $B^+(t) < B^-(t)$ and $v = 1$, or $B^+(t) > B^-(t)$ and $v = 0$, then $(\mathrm{true}, 1 - v)$ is appended on the right as a decision, the list is output, and the procedure stops.

iii) If no observed data remain undecided, $(\mathrm{true}, 1 - v)$ is appended on the right as a decision, the list is output, and the procedure stops.
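The three stopping conditions can be read as a single predicate over the class counts. The sketch below is one such reading under stated assumptions: the counts for the covered part (A) and the uncovered part (B) are taken for the same, most recently selected term, and the emptiness test is performed first so that no ratio is computed on an empty set. The function name and argument order are illustrative.

```python
def should_stop(a_pos, a_neg, b_pos, b_neg, v, alpha=0.25, beta=0.29):
    """Return True when (true, 1 - v) should be appended and the list output.

    a_pos / a_neg: class counts of the data covered by the latest term.
    b_pos / b_neg: class counts of the data still undecided.
    v: decision value of the preceding decision.
    """
    a_total = a_pos + a_neg
    b_total = b_pos + b_neg
    if b_total == 0:                     # condition iii): nothing left to decide
        return True
    # The default value 1 - v agrees with the majority class of the remainder.
    default_fits = (b_pos < b_neg and v == 1) or (b_pos > b_neg and v == 0)
    if a_total and min(a_pos, a_neg) / a_total > alpha and default_fits:
        return True                      # condition i): latest term is too impure
    if min(b_pos, b_neg) / b_total <= beta and default_fits:
        return True                      # condition ii): remainder is pure enough
    return False


print(should_stop(5, 1, 1, 9, v=1))
```

Read this way, both thresholds serve the same purpose: the list is closed with a default decision only when that default would classify most of the remaining data correctly.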

DL*(4) is as follows:

$(x_2 x_3 x_5 x_6, 1), (x_1 x_2 x_4, 0), (x_1 x_2 x_4 x_6, 1), (x_2 x_3 x_5 x_6, 1), (\mathrm{true}, 0)$

DL*(5) is as follows:

$(x_2 x_3 x_5 x_6, 1), (x_1 x_2 x_4, 0), (x_1 x_2 x_4 x_5 x_6, 1), (x_1 x_2 x_3 x_5 x_6, 1), (x_1 x_3 x_6, 1), (\mathrm{true}, 0)$

[The overbars marking negated literals, and several variable indices, are not legible in the source text; the indices shown are a best reading.]

According to Invention 3, a device that has means for storing the observed data, and means for adding decisions in order of maximum information gain and outputting, as the final decision list, the decision list obtained when the conditions of the predetermined stopping rule are met, generates DL*(4) and DL*(5) for k = 4 and k = 5 respectively.

Note that the decision-list generation methods and devices described so far can generate a k-DL(n) only for a fixed k. In general, the larger the value of k, the greater the expressive power and the finer the partition of the multidimensional data space. However, because observed data generally contain noise, an unnecessarily large k produces a decision list that is oversensitive to statistical fluctuation, and the classification and prediction error on unknown data then becomes larger rather than smaller. There should therefore exist an optimal value of k that minimizes the prediction error on unknown data generated by the same information source, and a method and device for generating the decision list for such a k are needed.

To generate a decision list that is optimal in this sense, one can consider a method and device that generate decision lists for various values of k by the method of Invention 1 or of Invention 3, compute and compare the description lengths of the observed data described using them, and take the decision list that minimizes the description length as the final decision list. The rationale for this also rests on the MDL criterion presented in the paper by J. Rissanen cited above. The above is the principle of the method of Invention 2; Invention 4 realizes the same principle in the form of a device.

For instance, in the example above, $\# T^6_4 = 473$ and $\# T^6_5 = 665$, and the pairs (number of objects decided, number of exceptions) for the successive decisions are

DL*(4): (6, 0), (6, 1), (8, 2), (7, 2), (5, 1)

DL*(5): (6, 0), (6, 1), (6, 1), (5, 1), (2, 0), (7, [illegible])

If (7) with $b = b_1$ is used to compute the description length of the exceptions, the description lengths are obtained as follows.

For DL*(4):
description length of the model = 48.398 bits
description length of the exceptions = 24.750 bits
L(4) = 73.148 bits

For DL*(5):
description length of the model = 55.230 bits
description length of the exceptions = 20.621 bits
L(5) = 75.851 bits

Therefore, under the MDL criterion, DL*(4) can be said to be a more suitable model than DL*(5). According to Invention 2, decision lists are generated once for k = 4 and k = 5 by the method of Invention 1, the description length for each k is computed, the two are compared, and DL*(4) is generated as the decision list giving the smaller description length. According to Invention 4, a device that has means for generating decision lists once for k = 4 and k = 5 with the device of Invention 3, means for computing the description length for each k, means for storing these description lengths and the decision list for each k, and means for comparing the description lengths and outputting the minimizing decision list can generate DL*(4).

(Embodiments) Next, the present invention is described in detail with reference to the drawings.

In what follows, the notation is that of the (Operation) section.

FIG. 1 is a flowchart explaining an embodiment of Invention 1.

At Start, the following are given: a set S, initialized to the entire set of observed data, each datum consisting of an object number, n binary attributes, and a binary class; a set T, initialized to T_{n,k} for fixed n and k; an initial term t″ = 0 (the term 0 is not contained in T_{n,k}, but is defined as the term that is false for all data; a decision using this term is omitted at output time); and an initial decision value v = 0.

In step 11, with positive real numbers α and β not exceeding 1 given in advance, the following conditions are set as the stopping rule:

i) S = the empty set;
ii) min(A⁺(t″), A⁻(t″)) / A(t″) > α, and either B⁺(t″) > B⁻(t″) and v = 1, or B⁺(t″) < B⁻(t″) and v = 0;
iii) min(B⁺(t″), B⁻(t″)) / B(t″) ≤ β, and either B⁺(t″) > B⁻(t″) and v = 1, or B⁺(t″) < B⁻(t″) and v = 0.

The conditions are checked in the order i), ii), iii) to see whether any one of them holds. If at least one holds, the process proceeds to step 12, where (true, 1-v) is appended on the right as the final decision, the result is output together with the decisions appended so far, and the process ends. If none holds, the process proceeds to step 13, where the information gain defined in (4) is regarded as a function of the term t and the term in T that maximizes it is determined (call it t*). If several terms attain the maximum information gain, t* is chosen at random from among them. Next, in step 14, v is set to 1 if A⁺(t*) > A⁻(t*) and to 0 if A⁺(t*) < A⁻(t*), and (t*, v) is appended on the right as a decision of the decision list. Next, in step 15, the set obtained by removing from S the data for which t* is true becomes the new S, the set obtained by removing t* from T becomes the new T, t* becomes the new t″, and the process returns to step 11.
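The loop of steps 11 to 15 can be sketched in Python as follows. This is only a minimal sketch under stated assumptions, not the patented procedure itself. Formula (4) is defined outside this passage, so the function `gain` substitutes the ordinary entropy-based information gain. The stopping rule is reduced to "S is empty or T is exhausted" (conditions ii and iii are left out), ties are broken by taking the first maximizer instead of a random one, and v is left unchanged when A⁺(t*) = A⁻(t*). All function names are hypothetical.

```python
import math
from itertools import combinations, product

def all_terms(n, k):
    """Enumerate conjunctions of 1..k literals over n binary attributes.
    A term is a tuple of (attribute_index, required_value) pairs."""
    terms = []
    for size in range(1, k + 1):
        for idxs in combinations(range(n), size):
            for vals in product((0, 1), repeat=size):
                terms.append(tuple(zip(idxs, vals)))
    return terms

def satisfies(x, term):
    return all(x[i] == val for i, val in term)

def entropy(pos, neg):
    total = pos + neg
    if pos == 0 or neg == 0:
        return 0.0
    p = pos / total
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def gain(S, term):
    """ASSUMPTION: entropy-based gain, standing in for formula (4)."""
    inside = [c for x, c in S if satisfies(x, term)]
    outside = [c for x, c in S if not satisfies(x, term)]
    pos = sum(c for _, c in S)
    after = 0.0
    for part in (inside, outside):
        p = sum(part)
        after += len(part) / len(S) * entropy(p, len(part) - p)
    return entropy(pos, len(S) - pos) - after

def build_decision_list(data, n, k):
    S = list(data)            # remaining observed data: (attributes, class)
    T = all_terms(n, k)       # remaining candidate terms
    decisions, v = [], 0      # initial decision value v = 0
    while S and T:            # simplified stopping rule (assumption)
        t_star = max(T, key=lambda t: gain(S, t))            # step 13
        covered = [c for x, c in S if satisfies(x, t_star)]
        a_pos, a_neg = sum(covered), len(covered) - sum(covered)
        if a_pos > a_neg:                                    # step 14
            v = 1
        elif a_pos < a_neg:
            v = 0
        decisions.append((t_star, v))
        S = [(x, c) for x, c in S if not satisfies(x, t_star)]  # step 15
        T.remove(t_star)
    decisions.append(("true", 1 - v))                        # step 12
    return decisions

def classify(decisions, x):
    """Return the value of the first decision whose term x satisfies."""
    for term, label in decisions:
        if term == "true" or satisfies(x, term):
            return label

# Toy data in which the class equals attribute 0.
data = [((1, 0), 1), ((1, 1), 1), ((0, 0), 0), ((0, 1), 0)]
dl = build_decision_list(data, n=2, k=1)
```

On this toy data the first decision tests attribute 0, which is the only attribute that carries information about the class.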

FIG. 2 is a flowchart illustrating another embodiment of invention 1. Steps 21 to 25 perform the same functions as steps 11 to 15 of FIG. 1, respectively, and the output of step 22 is the same decision list as the output of FIG. 1. In FIG. 2, for the decision list output by step 22 (denoted DLmax(k)), step 26 additionally calculates the description amount of the observed data described using DLmax(k), by the method given in the (Operation) section, as the sum of the description amount of the decision list itself and the description amount of the data that are exceptions to its decisions; this value is denoted L_1. The description amount need not be the simple sum of these two quantities; it may instead be calculated as a weighted sum, λ × (description amount of the decision list) + (1-λ) × (description amount of the exception data), with 0 < λ < 1. The same applies to the description-amount calculations below. Next, in step 27, DLmax(k) is pruned starting from its terminal (final) decision. Specifically, with V(1) = DLmax(k), for j ≥ 1, when the two rightmost decisions of V(j) are (t, v), (true, 1-v), V(j+1) is the list obtained by replacing them with (true, 1-v*). Here, with t′ denoting the term used in the decision immediately preceding t, v* is given as follows.

Next, in step 28, the description amount of the observed data described using V(j+1) is calculated by the method given in the (Operation) section and denoted L_{j+1}. The process then proceeds to step 29, which determines whether any decision remains to be pruned from V(j+1). If none remains, the process proceeds to step 31, where the values of all the description amounts L_i are compared, the V(i) whose description amount is smallest is output, and the process ends. If several V(i) attain the minimum, all of them are output and the process ends. If step 29 finds that a decision remains to be pruned from V(j+1), the process proceeds to step 30, sets j to j+1, and returns to step 27.
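The pruning loop of steps 27 to 31 can be sketched as below. This is a hedged illustration only. The rule for the replacement value v* is not reproduced in this text, so the sketch takes it from a caller-supplied function `default_label` (hypothetical). The description amount of the (Operation) section is likewise not available here, so `toy_amount` is a made-up cost (list length plus twice the number of misclassified data) used purely to exercise the selection logic. As in the text, every list that attains the minimum is returned.

```python
def classify(decisions, x):
    """Return the value of the first decision whose term x satisfies."""
    for term, label in decisions:
        if term == "true" or all(x[i] == val for i, val in term):
            return label
    raise ValueError("decision list has no default decision")

def prune_series(decisions, default_label):
    """Build V(1), V(2), ...: each step replaces the two rightmost
    decisions by a single default decision ("true", label)."""
    series = [list(decisions)]
    current = list(decisions)
    while len(current) >= 2:
        prefix = current[:-2]
        # default_label stands in for the v* rule (assumption)
        current = prefix + [("true", default_label(prefix))]
        series.append(current)
    return series

def select_min(series, description_amount):
    """Return every list in the series whose amount is minimal."""
    amounts = [description_amount(V) for V in series]
    best = min(amounts)
    return [V for V, amount in zip(series, amounts) if amount == best]

# Toy data in which the class equals attribute 0.
data = [((1, 0), 1), ((1, 1), 1), ((0, 0), 0), ((0, 1), 0)]

def toy_amount(V):
    # Made-up cost, NOT the description amount defined in the patent.
    errors = sum(classify(V, x) != c for x, c in data)
    return len(V) + 2 * errors

# A terms is a tuple of (attribute_index, required_value) pairs.
full = [(((0, 1),), 1), (((1, 1),), 1), ("true", 0)]
series = prune_series(full, default_label=lambda prefix: 0)
best = select_min(series, toy_amount)
```

Here the unpruned list carries a superfluous second decision that misclassifies one datum, and the fully pruned list misclassifies two, so the intermediate list is the one selected.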

FIG. 3 is a flowchart illustrating an embodiment of invention 2.

At Start, the following are given: a set S, initialized to the entire set of observed data, each datum consisting of an object number, n binary attributes, and a binary class; for fixed n and for each of several values of k, a set T(k) initialized to T_{n,k} (k is the maximum number of literals in the conjunction used in one decision, and M is the total number of values that k takes); an initial term t″ = 0; and an initial decision value v = 0.
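To give a feel for how the candidate set grows with k, the following sketch counts the terms for given n and k. It assumes, since T_{n,k} is defined outside this passage, that the set consists of all non-empty conjunctions of at most k literals over n binary attributes, with each attribute appearing at most once per conjunction. The function name `num_terms` is hypothetical.

```python
from math import comb

def num_terms(n, k):
    """Count conjunctions of 1..k literals over n binary attributes:
    choose i attributes, then a required value (0 or 1) for each."""
    return sum(comb(n, i) * 2 ** i for i in range(1, k + 1))

sizes = {k: num_terms(10, k) for k in (1, 2, 3)}
```

Under this assumption, ten attributes give 20 candidate terms for k = 1, 200 for k = 2, and 1160 for k = 3, which is why each additional value of k makes the generation step noticeably more expensive.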

First, a decision list is generated for each k by the method of invention 1. Let k_1 < … < k_M be the values of k considered, and let step 31 be the step that generates the decision list for k_i. Step 31 is the whole of FIG. 1 or the whole of FIG. 2. Next, for each generated decision list, the description amount of the data described using that list is calculated; the calculation step for k_i is step 32. This step is omitted when the method of FIG. 2 is used as step 31. Next, in step 33, the description amounts calculated for the individual values of k are compared and the minimum is found. Here too, the description amount need not be the simple sum of the description amount of the decision list itself and the description amount of the data that are exceptions to its decisions; it may instead be calculated as a weighted sum, λ × (description amount of the decision list) + (1-λ) × (description amount of the exception data), with 0 < λ < 1. Next, in step 34, the decision list corresponding to the value of k that attains this minimum is output, and the process ends.
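The comparison in steps 33 and 34 amounts to an argmin over k of the (optionally weighted) description amount. The sketch below illustrates this with invented numbers; the function names and the figures in `amounts_by_k` are hypothetical and are not results from the patent.

```python
def weighted_amount(list_amount, exception_amount, lam=0.5):
    """lam * (amount of the list) + (1 - lam) * (amount of exceptions)."""
    if not 0 < lam < 1:
        raise ValueError("lam must satisfy 0 < lam < 1")
    return lam * list_amount + (1 - lam) * exception_amount

def select_k(amounts_by_k, lam=0.5):
    """amounts_by_k maps k to (list_amount, exception_amount);
    return the k whose weighted description amount is smallest."""
    return min(amounts_by_k,
               key=lambda k: weighted_amount(*amounts_by_k[k], lam))

# Illustrative, made-up amounts: larger k gives a longer list
# but fewer exceptions.
amounts_by_k = {1: (10, 40), 2: (25, 12), 3: (60, 5)}
chosen_equal = select_k(amounts_by_k, lam=0.5)
chosen_small_lam = select_k(amounts_by_k, lam=0.1)
chosen_large_lam = select_k(amounts_by_k, lam=0.9)
```

With lam = 0.5 the ranking is the same as for the simple sum. A small lam discounts the cost of the list and favors larger k, while a large lam favors shorter lists.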

FIG. 4 is a block diagram showing the apparatus of invention 3. The entire set of observed data, each datum consisting of an object number, n binary attributes, and a binary class, is taken as input and first stored in storage device 41. On command of a control signal generated by the control signal generator, part of the observed data is sent from the storage device to the maximum-information-gain term determination circuit 42. The data supplied first is the whole of the observed data. Circuit 42 holds T = T_{n,k} (k fixed) and t″ = 0 as its initial-state parameters and checks the following conditions in the order i), ii), iii):

i) S = the empty set;
ii) min(A⁺(t″), A⁻(t″)) / A(t″) > α, and either B⁺(t″) > B⁻(t″) and v = 0, or B⁺(t″) < B⁻(t″) and v = 1;
iii) min(B⁺(t″), B⁻(t″)) / B(t″) ≤ β, and either B⁺(t″) > B⁻(t″) and v = 0, or B⁺(t″) < B⁻(t″) and v = 1.

If at least one of them holds, circuit 42 sends the signal P and t″ to the decision addition circuit 43, which is cascaded with 42. If none holds, then for the fixed value of k it calculates the information gain given by (4) for each term in T on the data supplied from the storage device, determines the term t* that maximizes it, changes its state to t″ = t* and T = T - {t*}, and sends t* to the decision addition circuit 43.

The decision addition circuit 43 takes as input the output of circuit 42 and the data supplied from the storage device (the same data supplied to 42). When t* arrives from 42, it calculates A⁺(t*) and A⁻(t*), sets v = 1 if A⁺(t*) > A⁻(t*) and v = 0 if A⁺(t*) < A⁻(t*), and appends (t*, v) on the right as a decision of the decision list. It then sends to the control signal generator 44 a signal that causes 44 to generate a control signal commanding storage device 41 to supply S - S_A(t*) to 42. Generator 44 sends that control signal to 41, and 41, on receiving the command from 44, supplies S - S_A(t*) to 42. When the new data arrives from storage device 41 in response to the command from 44, circuit 42 repeats the same operation.

When the signal P arrives at 43 from 42, circuit 43 sets v = 1 if A⁺(t″) > A⁻(t″) and v = 0 if A⁺(t″) < A⁻(t″), appends (true, 1-v) on the right of the decision list as the last decision, outputs the decision list obtained from the decisions appended so far as the output of the whole circuit, and ends the operation of the whole circuit.
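The stopping test performed by circuit 42 can be written as a small predicate over the counts. The sketch below follows conditions i) to iii) as listed in the FIG. 4 description, under these assumptions: A = A⁺ + A⁻ and B = B⁺ + B⁻; a ratio with a zero denominator is treated as "condition not met" (this matters for the initial term, which no datum satisfies); and the "and ... or ..." wording is grouped as shown in the code. The function name and parameter names are hypothetical.

```python
def should_stop(s_size, a_pos, a_neg, b_pos, b_neg, v, alpha, beta):
    """Return True if condition i), ii) or iii) holds, checked in order.

    s_size        : number of data remaining in S
    a_pos, a_neg  : the counts A+ and A- for the current term
    b_pos, b_neg  : the counts B+ and B- for the current term
    v             : current decision value (0 or 1)
    alpha, beta   : thresholds, positive reals not exceeding 1
    """
    if s_size == 0:                                   # i) S is empty
        return True
    # Shared side condition of ii) and iii), grouped by assumption.
    side = (b_pos > b_neg and v == 0) or (b_pos < b_neg and v == 1)
    a_total = a_pos + a_neg                           # assumed A = A+ + A-
    if a_total > 0 and min(a_pos, a_neg) / a_total > alpha and side:
        return True                                   # ii)
    b_total = b_pos + b_neg                           # assumed B = B+ + B-
    if b_total > 0 and min(b_pos, b_neg) / b_total <= beta and side:
        return True                                   # iii)
    return False

stop_empty = should_stop(0, 0, 0, 0, 0, 0, alpha=0.3, beta=0.2)
stop_iii = should_stop(10, 5, 0, 9, 1, 0, alpha=0.3, beta=0.2)
keep_going = should_stop(10, 5, 0, 9, 1, 1, alpha=0.3, beta=0.2)
stop_ii = should_stop(10, 3, 2, 6, 4, 0, alpha=0.3, beta=0.2)
```

Passing the counts in as plain integers keeps the predicate independent of how A and B are computed, which is specified in the (Operation) section and not in this passage.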

FIG. 5 is a block diagram showing another embodiment of the decision list generating apparatus of invention 3. Circuits 51, 52, and 54 perform the same input/output functions as circuits 41, 42, and 44 of FIG. 4, respectively. Circuit 53, however, outputs not only the same decision list as the output of FIG. 1 but also the values of A⁺(t) and A⁻(t) for each of the terms used in the decisions. The pruning and description-amount calculation circuit 55 takes the output of circuit 53 as input, prunes it step by step starting from the terminal decision, and calculates the description amount obtained when the data are described using each of the pruned lists. The description amount need not be the simple sum of the description amount of the decision list itself and the description amount of the data that are exceptions to its decisions; it may instead be calculated as a weighted sum, λ × (description amount of the decision list) + (1-λ) × (description amount of the exception data), with 0 < λ < 1. Specifically, when the two rightmost decisions are (t, v), (true, 1-v), they are replaced with (true, 1-v*), and this operation is repeated until no decision remains to be pruned. Here, with t′ denoting the term used in the decision immediately preceding t, v* is given as follows.

Circuit 55 sends each decision list obtained by pruning, together with the description amount calculated for it, to the second storage device 56, which stores this input. Circuit 55 also sends just the description amounts of the pruned decision lists to the description-amount comparison circuit 57. Circuit 57 compares the description amounts of the pruned decision lists, finds the minimum, and sends to the second control signal generation circuit 58 a signal commanding it to issue a control signal that instructs the second storage device 56 to output the decision list with the minimum description amount. On this command from 57, circuit 58 sends that control signal to the second storage device 56. On the command from 58, the second storage device 56 outputs the decision list that minimizes the description amount.

FIG. 6 is a block diagram showing an embodiment of the decision list generating apparatus of invention 4. The observed data are the input to the whole apparatus, and this input is first fed to the DL*(k) generation circuit 61. For a fixed k, circuit 61 generates a decision list using the same circuit as the apparatus of invention 3 (the decision list generated for a fixed k is denoted DL*(k)), and does so for several values of k as directed by control signals from the first control signal generator 62. Circuit 61 sends DL*(k) for each k, together with the values of A⁺(t) and A⁻(t) for each of the terms used in the decisions, in order to the description-amount calculation circuit 63, and sends DL*(k) for each k to storage device 64.

Circuit 63 takes the output of 61 as input, calculates the description amount for each, and sends these description amounts to the description-amount comparison circuit 65 and to storage device 64. The description amount need not be the simple sum of the description amount of the decision list itself and the description amount of the data that are exceptions to its decisions; it may instead be calculated as a weighted sum, λ × (description amount of the decision list) + (1-λ) × (description amount of the exception data), with 0 < λ < 1. Storage device 64 takes the outputs of 61 and 63 as input and stores them. Circuit 65 takes the output of 63 as input, compares the description amounts of DL*(k) for the individual values of k, finds the minimum, and sends to the second control signal generation circuit 66 a signal commanding it to issue a control signal that instructs storage device 64 to output the decision list with the minimum description amount. On this command from 65, circuit 66 sends that control signal to storage device 64.

On the command from 66, storage device 64 outputs the decision list that minimizes the description amount.

(Effects of the Invention) According to inventions 1 and 3, for a fixed value of k (the maximum number of literals appearing in the conjunction that defines a decision), a decision list can be generated that is theoretically guaranteed to classify, as accurately as possible, unknown data arising from the same information source as the observed data. According to inventions 2 and 4, moreover, from the wider set of decision lists obtained by varying the value of k, a decision list can be selected that is theoretically guaranteed to classify such unknown data as accurately as possible.

[Brief explanation of drawings]

FIG. 1 is a flowchart showing the decision list generation method of invention 1; FIG. 2 is a flowchart of a variant of the method shown in FIG. 1, likewise a decision list generation method of invention 1; FIG. 3 is a flowchart showing the decision list generation method of invention 2; FIG. 4 is a block diagram showing the decision list generation apparatus of invention 3; FIG. 5 is a block diagram of a variant of the apparatus shown in FIG. 4, likewise a decision list generation apparatus of invention 3; FIG. 6 is a block diagram showing the decision list generation apparatus of invention 4; FIG. 5 is a block diagram of a variant of the apparatus shown in FIG. 6, likewise a decision list generation apparatus of invention 4; and FIGS. 7 and 8 are diagrams for explaining the present invention.

In the figures: 41: storage device; 42: maximum-information-gain term determination circuit; 43: decision addition circuit; 44: control signal generator; 51: first storage circuit; 52: maximum-information-gain term determination circuit; 53: decision addition circuit; 54: first control signal generator; 55: pruning and description-amount calculation circuit; 56: second storage circuit; 57: description-amount comparison circuit; 58: second control signal generator; 61: DL*(k) generation circuit; 62: first control signal generator; 63: description-amount calculation circuit; 64: storage device; 65: description-amount comparison circuit; 66: second control signal generator.

Claims

[Claims]

1. A method for generating a decision list from a plurality of noisy observed data, each given as a pair consisting of a plurality of attributes taking values in {0, 1} and a class, the method comprising: a step of, with the number of literals constituting one conjunction fixed, sequentially selecting and adding, as the conjunction that defines a decision of the decision list, the conjunction that maximizes the information gain for the observed data; and a step of taking, as the final decision list, the decision list obtained when the addition of decisions is ended according to a predetermined stopping rule.

2. A method for generating a decision list, comprising: a step of first generating a decision list by the method of claim 1; a step of calculating, for each element of the series of decision lists obtained by successively removing decisions from the end of that decision list, the description amount of the observed data described using that element; and a step of taking, as the final decision list, the decision list in the series that minimizes the description amount.

3. A method for generating a decision list from a plurality of noisy observed data, each given as a pair consisting of a plurality of attributes taking values in {0, 1} and a class, the method comprising: a step of obtaining, for a plurality of values of k, the decision list that results when, with the maximum number (k) of literals constituting one conjunction fixed, the conjunction that maximizes the information gain for the observed data is sequentially selected and added as the conjunction that defines a decision of the decision list, and the addition of decisions is ended according to a predetermined stopping rule; a step of calculating and comparing, for each k, the description amount of the observed data described using the decision list generated by the above process; and a step of taking, as the final decision list, the decision list among them that minimizes the description amount.

4. A method for generating a decision list, comprising: a step of obtaining a decision list for each fixed k by the method of claim 1; a step of calculating, for each element of the series of decision lists obtained by successively removing decisions from the end of that decision list, the description amount of the observed data described using that element; a step of obtaining, as the best decision list for each k, the decision list in the series that minimizes the description amount; and a step of obtaining the best decision list for a plurality of values of k and taking, as the final decision list, the one among them that minimizes the description amount.

5. An apparatus for generating a decision list from a plurality of noisy observed data, each given as a pair consisting of a plurality of attributes taking values in {0, 1} and a class, the apparatus comprising: means for storing the observed data; means for, with the number of literals constituting one conjunction fixed, sequentially selecting and adding, as the conjunction that defines a decision of the decision list, the conjunction that maximizes the information gain for the observed data; and means for outputting, as the final decision list, the decision list obtained when the addition of decisions is ended according to a predetermined stopping rule.

6. An apparatus for generating a decision list, comprising: means for first generating a decision list by the apparatus of claim 5; means for generating the series of decision lists obtained by successively removing decisions from the end of that decision list; means for calculating, for each element of the series, the description amount of the observed data described using that element; means for storing the description amounts and the series of decision lists; and means for outputting, as the final decision list, the decision list that minimizes the description amount.

7. An apparatus for generating a decision list from a plurality of noisy observed data, each given as a pair consisting of a plurality of attributes taking values in {0, 1} and a class, the apparatus comprising: means for storing the observed data; means for obtaining, for a plurality of values of k, the decision list that results when, with the maximum number (k) of literals constituting one conjunction fixed, the conjunction that maximizes the information gain for the observed data is sequentially selected and added as the conjunction that defines a decision of the decision list, and the addition of decisions is ended according to a predetermined stopping rule; means for storing the decision lists obtained for the plurality of values of k; means for calculating, for each k, the description amount of the observed data described using the decision list generated by the above process; and means for outputting, as the final decision list, the decision list that minimizes the description amount.

8. An apparatus for generating a decision list, comprising: means for obtaining a decision list for a fixed k by the apparatus of claim 5; means for generating the series of decision lists obtained by successively removing decisions from the end of that decision list; means for calculating, for each element of the series, the description amount of the observed data described using that element; means for storing the description amounts; means for storing the series of decision lists; means for obtaining, as the best decision list for each k, the decision list in the series that minimizes the description amount; means for storing the best decision lists for a plurality of values of k; and means for comparing the description amounts of the best decision lists and outputting the one with the smallest description amount as the final decision list.