JP3945144B2 - Inductive reasoning system

Info

Publication number: JP3945144B2
Application number: JP2000324077A
Authority: JP
Inventors: 和也千葉
Original assignee: Fuji Xerox Co Ltd; Fujifilm Business Innovation Corp
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2000-10-24
Filing date: 2000-10-24
Publication date: 2007-07-18
Anticipated expiration: 2020-10-24
Also published as: JP2002132504A

Description

【０００１】
【発明の属する技術分野】
この発明は、帰納推論システムに関し、特に、数値情報を１つ含む目標概念について、その目標概念のサンプルである例を入力し、目標概念を説明する優先順位付きのルールを出力する一階述語論理に基づく帰納推論システムに関する。
【０００２】
【従来の技術】
一階述語論理に基づく帰納推論システムであり、文献（S. Muggleton. Inductive logic programming. New Generation Computing, 8(4):295-318, 1991.）により提案された帰納論理プログラミング(Inductive Logic Programming, 以下、ILPと称する)は、目標概念について、その目標概念のサンプルである例や背景知識などを入力し、目標概念を説明するルールを出力するものである。
【０００３】
ILPは、もともとは記号で表される概念の学習を意図してつくられたが、実際の応用の場面では、数値情報を含む概念が多くあり、これらをうまく学習するということは大きな課題である。
【０００４】
この課題に対して、数値を、記号としてあつかうアプローチ（例えば文献 B. Dolsak and S. Muggleton. The application of Inductive Logic Programming to finite element mesh design. In S. Muggleton, editor, Inductive Logic Programming, pages 453-472. Academic Press, London, 1992.）が試みられた。つまり、１，２，３等をａ，ｂ，ｃと同様の単なる記号と見なして学習をさせたのである。しかし、そのため、数値間の大小関係などは考慮されなかった。他にも、ILPを用いて数値情報を含む概念を学習する試みがいくつかあるが、たいていこのようなアプローチであり、せいぜい不等号を導入する程度にとどまっていた。
【０００５】
また、現在ILPシステムとして広く使われているものに、Progolがある。Progolについては文献（ S. Muggleton. Inverse entailment and Progol. New Generation Computing, 13:245-286, 1995.）に記載されている。Progolについても、例に含まれる数値情報を利用してルールを形成させようとする場合には、等しいことを表す２引数の述語、等しくないことを表す２引数の述語、右辺が左辺より大きいことを表す２引数の述語およびその逆、右辺が左辺より大きいか等しいことを表す２引数の述語およびその逆、が利用できるだけである。
【０００６】
また、Progolがルールの探索を行う際に利用するルールの評価関数は、ルールのカバーする（ルールを用いて導出できる）正例の数と、負例の数、およびルールの長さを用いている。このため、例に含まれる数値情報が利用されず、良い、（すなわち与えられた例を良く説明する、または未知例についての予測精度が高い）ルールが得られない場合もあるという問題点があった。
【０００７】
【発明が解決しようとする課題】
上述のように、従来の帰納推論システムにおいては、数値情報の学習を行うことが困難であり、その結果、例に含まれる数値情報が利用されず、良いルールが得られない場合があった。
【０００８】
そこで、この発明は、数値情報を利用し、良好なルールを得ることのできる帰納推論システムを提供することを目的とする。
【０００９】
【課題を解決するための手段】
上述した目的を達成するため、請求項１の発明は、数値情報を１つ含む目標概念について、該目標概念のサンプルである例を入力し、該目標概念を説明するルールを出力する一階述語論理に基づく帰納推論システムであって、前記例を記憶する記憶手段と、前記記憶手段に記憶した例に基づいてノードを生成し、該生成したノードのカバーする例の数値部分の標準偏差を用いて、予め定めた評価関数により該ノードの評価値を算出する推論手段と、前記推論手段が算出した評価値の高いノードの順に該ノードのカバーする例の数値部分の平均値を含むルールを出力し、該出力したルールに対応する例を前記記憶手段から削除する出力手段とを具備し、前記出力手段による出力の順をルールの優先順位とすることを特徴とする。
【００１３】
【発明の実施の形態】
以下、この発明に係る帰納推論システムの一実施の形態について、添付図面を参照して詳細に説明する。
【００１４】
図１は、帰納推論システムの概略構成を示すブロック図である。
同図に示すように、帰納推論システムは、入力部１と帰納推論部２、出力部３を具備して構成される。
【００１５】
入力部１は、ユーザインターフェースの部分であり、例、モード宣言、背景知識などを帰納推論部２に入力する。帰納推論部２は、ルールのトップダウン探索を行い、目標概念を説明する優先順位付きのルール集合を生成し、出力部３に送る。出力部１０３は、このシステムからの出力をユーザに提示する部分である。
【００１６】
また、帰納推論部２は、オリジナル例集合保持部２１と例集合保持部２２、モード宣言保持部２３、背景知識保持部２４、ノードバッファ２５を具備して構成される。
【００１７】
オリジナル例集合保持部２１は、例の集合を探索手続き中そのまま保持しておく役割を持ち、例集合保持部２２は、例の集合を探索手続き中その変化（減少）に応じて保持しておく役割を持つ。
【００１８】
モード宣言保持部２３は、モード宣言を保持しておく役割を持つ。モード宣言は、基本的にProgolのもの(modes)と同じであり、目標概念を定義する述語、ルールの生成に使われる述語やその変数の入出力関係が記述されている。
【００１９】
背景知識保持部２４は、背景知識を保持しておく役割を持つ。背景知識は、基本的にProgolのもの(background knowledge)と同じであり、述語の定義や、それぞれの例の具体的性質などが記述されている。
【００２０】
ノードバッファ２５は、探索木の各々のノードを保持する役割を持ち、ノード２６の列からなる。
【００２１】
ここで、ノード２６について説明する。
図２は、ノード２６を示した図である。
【００２２】
ノード２６は、述語集合、ｆ、σ、ｃ、ｓの５つ組である。その他に、展開（後述する）されたかどうかという情報を保持している。
【００２３】
述語集合は、ルールに対応している。すなわち、述語集合中の述語を右辺に置き、左辺に目標概念に対応する述語を置くことでルールが作られる。例えば、図２のノード２６において、述語集合は｛important(X)｝であり、目標概念に対応する述語は｛mesh(X,N)｝なので、対応するルールは、｛mesh(X,N)←important(X)｝となる。ここで、目標概念中の数値情報に対応する部分は、任意の変数（この例ではN）で置き換える。
【００２４】
ｆは、このノードがその時点でカバーする例の数を表す。つまり、このノードの述語集合に対応するルールによって導出できる、その時点での例集合中の例の数を表す。ノード２６においては、カバーする例の数は１１である。これは、ルールを支持する例の個数についての指標であり、多いほど良いルールであるといえる。
【００２５】
σは、このノードがカバーするオリジナル例集合中の例の数値部分の数値の値の標準偏差を表す。例えば、ノード２６が、mesh(a3,8),mesh(a9,3),...の１１の例をカバーした場合、例の数値部分は８，３，７，２，２，５，２，７，３，２，２であり、これらの標準偏差は２．３８６である。ここで標準偏差を利用したのは、この標準偏差の小さい場合は、カバーする例の数値部分にばらつきがなく、例をうまく説明するルールと考えられるからである。
【００２６】
cは、このノードの述語集合の要素の数を表す。例えば、ノード２６においては、述語集合の要素は１つであるので。cは１である。これは、ノードに対応するルールの長さを表す指標であり、少ないほど良いルールであるといえる。
【００２７】
最後に、sは、このノードのスコア、つまりルールの評価関数の値を表す。スコアｓは、基本的にはｆ、σ、ｃを用いて定める。またｆの代わりにオリジナル例集合中の例の数Ｆを用いても良い。例えばノード２６においては、ｓは−５．８４２である。
【００２８】
次に、帰納推論部２の動作について説明する。
図３は、帰納推論部２の探索手続きの流れを示すフローチャートである。
【００２９】
帰納推論部２は、まず、入力部１から受け取った例を、オリジナル例集合保持部２１に入れる（ステップ１０１）。そして、この例集合をコピーし、例集合保持部２２に入れ（ステップ１０２）、ノードバッファ２５を空にする（ステップ１０３）。
【００３０】
次に、述語集合を空集合としたノードを生成してノードバッファ２５の先頭に入れる（ステップ１０４）。このとき、ノードバッファ２５は、図４に示すように、述語集合が空集合（Empty）のノード２６−０が先頭に入れられている状態となる。
【００３１】
続いて、ノードバッファ２５の先頭から順に調べ、まだ展開されていないノードを展開手続きによって展開し、得られたノードをノードバッファ２５に加える（ステップ１０５）。このときノードバッファ２５は、スコアｓの大きい順に常になっているようにする。スコアｓが同じノードについては任意の順に並べて良い。
【００３２】
ここでの展開手続きは、Progolのもの（refinement operator)と同じであり、概略のみを説明する。
展開手続きは、元となるノード（２６等）の述語集合について、次のいづれかの操作を施して得られる述語集合のすべてを出力するものである。
１．モード宣言で指定された述語を１つ付け加える。
２．述語集合中の違う名前の２つの変数を同一のものにする。
３．述語集合中のある変数を定数に置き換える。
【００３３】
この展開手続きは、ノードバッファ２５にあらかじめユーザが指定した数（例えば、ここでは１００とする）を越える数のノードが加えられるまでの間（ステップ１０６でＮＯ）、繰り返され、ノードバッファ２５にあらかじめユーザが指定した数（例えば、ここでは１００とする）を越える数のノードが加えられると（ステップ１０６でＹＥＳ）、展開手続きを終了する。展開手続きが終了した際のノードバッファ２６の状態は、図５に示すように、ノード２６−１〜ノード２６−８等のノードが１００個以上保持されている状態となる（展開手続の関係で１００個以上となる場合があるため）。
【００３４】
次に、ノードバッファ２５の先頭から調べていき、ｆが０でないノード２６（何らかの例をカバーするノード）を出力として出力手続きによりルールを生成し、出力部３に送り、このノードをノードバッファ２５から取り除く（ステップ１０７）。なお、それ以前のｆが０のノードは単に捨てられる。出力手続きが終了した際のノードバッファ２６の状態は、図６に示すように、ノード２６−１が出力された状態（図５の場合と比較して）となる。
【００３５】
出力手続きは、ノードからルールを生成する手続きで、まず、ノードの述語集合中の述語を右辺に置き、ノードのカバーする例の数値部分の平均値を数値情報を表す引数の部分に入れた目標概念を表す述語を左辺に置くものである。このとき数値情報が整数であることをユーザが指定した場合は、その平均値は、四捨五入して整数に置き換える。
【００３６】
続いて、出力手続きで出力したノードがカバーした例を、例集合保持部２２から取り出す（ステップ１０８）。そして、例集合保持部２２に保持している例集合が空になれば（ステップ１０９でＹＥＳ）、手続きを終了する。
【００３７】
一方、例集合保持部２２に保持している例集合が空でない場合には（ステップ１０９でＮＯ）、ノードバッファ２５を調べ、ｆが０であるノードは、この先展開をしても何らかの例をカバーする可能性がないため、すべて取り除き（ステップ１１０）、ステップ１０５に戻り、同様の処理を繰り返す。
【００３８】
なお、探索手続きは、ループするごとに少なくとも１つの例を例集合から取り出すことと、述語集合が空のノードはすべての例をカバーすることから、この手続きはどのような場合でもいつか必ず終了することがいえる。
【００３９】
ところで、この帰納推論システムにより得られるルールには、優先順序がある。つまり、最初に出力されたルールほど、評価関数であるスコアが高いため、優先度の高いルールとして扱うことにする。Progolなどの従来の帰納推論システムは、出力するルールの集合は、そのルール間に優先順序はなく、同等のレベルのものとして扱われていたが、この帰納推論システムでは、ルールを用いて未知例の予測をする場合には、優先度の高いルールから順に適用して導出を試みる。
【００４０】
また、この帰納推論システムにより得られるルールを利用する際は、１つのルールで導出に成功すれば、それより優先度の低いルールは使用しないこととする。すなわち、「数値としての答えは１つしか出ない」といったことを実行する。因みに、従来のものでは、１つの例に対していくつかのルールで導出が可能になっている、すなわち「数値としての答えが２つ以上出る」場合があった。
【００４１】
次に、この帰納推論システムの動作の具体例を示す。
【００４２】
ここでは、前述した文献（ B. Dolsak and S. Muggleton. The application of Inductive Logic Programming to finite element mesh design. In S. Muggleton, editor, Inductive Logic Programming, pages 453-472. Academic Press, London, 1992.）と同じ例を用いて説明する。これは、いわゆる学習研究のために作った「toy dataset」ではなく、実世界で使われているデータセットである。ここでは、よいメッシュの例のデータを用いて、構造から望ましいメッシュを生成するためのルールを獲得するために、学習の目標となる述語を、mesh(X, N)とした。ここで、XはエッジのIDであり、Nはそのエッジの分割数（そのエッジに沿って存在する要素の数）である。ここで、目標概念meshは、数値情報(N)を含んでいる。Nは整数とする。
【００４３】
なお、ここでは、ノード２６のスコアｓは、s＝Ｆ/75−10σ×σ−0.05ｃで求めることとした。
【００４４】
まず、例、モード宣言、背景知識を帰納推論部２に入力する。ここでは、次の７５個の例が入力される。
【００４５】
mesh(a2, 1).
mesh(a3, 8).
mesh(a4, 1).
mesh(a5, 1).
mesh(a6, 2).
mesh(a7, 1).
mesh(a8, 1).
mesh(a9, 3).
mesh(a10, 1).
mesh(a11, 3).
mesh(a12, 1).
mesh(a13, 1).
mesh(a14, 1).
mesh(a15, 4).
mesh(a16, 1).
mesh(a17, 2).
mesh(a18, 1).
mesh(a19, 4).
mesh(a20, 1).
mesh(a21, 1).
mesh(a22, 2).
mesh(a23, 2).
mesh(a24, 1).
mesh(a25, 2).
mesh(a26, 1).
mesh(a27, 1).
mesh(a28, 2).
mesh(a29, 1).
mesh(a30, 1).
mesh(a31, 3).
mesh(a32, 2).
mesh(a33, 2).
mesh(a35, 1).
mesh(a36, 12).
mesh(a37, 12).
mesh(a38, 12).
mesh(a39, 5).
mesh(a40, 2).
mesh(a41, 1).
mesh(a42, 5).
mesh(b1, 9).
mesh(b2, 1).
mesh(b3, 2).
mesh(b4, 7).
mesh(b5, 1).
mesh(b6, 1).
mesh(b7, 1).
mesh(b8, 9).
mesh(b9, 1).
mesh(b10, 2).
mesh(b11, 7).
mesh(b12, 1).
mesh(b13, 1).
mesh(b14, 1).
mesh(c1, 1).
mesh(c2, 2).
mesh(c3, 1).
mesh(c4, 1).
mesh(c5, 3).
mesh(c6, 2).
mesh(c7, 2).
mesh(c8, 3).
mesh(c9, 1).
mesh(c10, 2).
mesh(c11, 1).
mesh(c12, 2).
mesh(c13, 1).
mesh(c14, 2).
mesh(c15, 8).
mesh(c16, 8).
mesh(c17, 8).
mesh(c18, 8).
mesh(c19, 8).
mesh(c20, 8).
mesh(c21, 8).
【００４６】
モード宣言は、次のものが入力される。ここでは、Progolと同じ文法形式で示した。
【００４７】
:- modeh(1,mesh(+edge,+nums))?
:- modeh(1,mesh(+edge,#nums))?
:- modeb(1,important_long(+edge))?
:- modeb(1,important(+edge))?
:- modeb(1,important_short(+edge))?
:- modeb(1,circuit(+edge))?
:- modeb(1,half_circuit(+edge))?
:- modeb(1,short_for_hole(+edge))?
:- modeb(1,long_for_hole(+edge))?
:- modeb(1,circuit_hole(+edge))?
:- modeb(1,half_circuit_hole(+edge))?
:- modeb(1,not_important(+edge))?
:- modeb(1,free(+edge))?
:- modeb(1,one_side_fixed(+edge))?
:- modeb(1,two_side_fixed(+edge))?
:- modeb(1,fixed(+edge))?
:- modeb(1,not_loaded(+edge))?
:- modeb(1,one_side_loaded(+edge))?
:- modeb(1,cont_loaded(+edge))?
:- modeb(*,neighbour_xy(+edge,-edge))?
:- modeb(*,neighbour_yz(+edge,-edge))?
:- modeb(*,neighbour_zx(+edge,-edge))?
:- modeb(2,opposite(+edge,-edge))?
:- modeb(2,same(+edge,-edge))?
【００４８】
ここで、modehは目標概念を定義する述語を指定する。modebは、ルールの生成に使われるルールの右辺を作る述語を指定する。+edge,-edgeは、それぞれ入力変数、出力変数であることを指定している。
【００４９】
背景知識は、次のものが入力される。これらは、文献に記されているものと同じである。ただし、２引数の述語については述語名のみ変更されている。
【００５０】
important_long(a1).
important_long(a34).
important_long(b1).
important_long(b8).
important(a3).
important_short(a9).
important_short(a11).
important_short(a13).
important_short(a15).
important_short(a19).
important_short(a22).
important_short(a23).
important(a39).
important(b4).
important(b11).
important(c2).
important(c5).
important(c6).
circuit(c18).
circuit(c19).
circuit(c20).
half_circuit(a36).
half_circuit(a37).
important(c8).
important(c10).
important(c12).
important(c14).
important_short(a6).
not_important(a12).
not_important(a14).
not_important(a20).
not_important(a21).
not_important(a24).
not_important(a27).
not_important(a29).
important_short(a25).
important_short(a26).
important_short(a28).
important_short(a31).
important_short(a33).
important_short(a35).
important_short(a40).
important_short(b3).
important_short(b10).
important_short(c4).
important_short(c7).
important_short(c13).
circuit(c15).
circuit(c16).
circuit(c17).
short_for_hole(a16).
short_for_hole(a18).
long_for_hole(a17).
circuit_hole(c21).
half_circuit_hole(a38).
half_circuit_hole(a42).
not_important(a2).
not_important(a4).
not_important(a5).
not_important(a7).
not_important(a8).
not_important(a10).
not_important(a30).
not_important(a32).
not_important(a41).
not_important(b2).
not_important(b5).
not_important(b6).
not_important(b7).
not_important(b9).
not_important(b12).
not_important(b13).
not_important(b14).
not_important(c1).
not_important(c3).
not_important(c9).
not_important(c11).
free(a39).
free(a40).
free(b2).
free(b3).
free(b7).
free(b9).
free(b10).
free(b14).
free(c6).
free(c7).
free(c11).
free(c12).
free(c13).
free(c17).
free(c18).
one_side_fixed(a34).
fixed(a13).
fixed(a14).
fixed(a15).
fixed(a16).
fixed(a17).
fixed(a18).
fixed(a19).
fixed(a20).
fixed(a21).
fixed(a22).
one_side_fixed(a35).
one_side_fixed(a41).
one_side_fixed(b1).
one_side_fixed(b4).
one_side_fixed(b8).
one_side_fixed(b11).
one_side_fixed(c2).
one_side_fixed(c3).
one_side_fixed(c5).
one_side_fixed(c8).
one_side_fixed(c10).
one_side_fixed(c14).
two_side_fixed(a36).
two_side_fixed(a37).
two_side_fixed(a38).
two_side_fixed(a42).
fixed(a23).
fixed(a24).
fixed(a25).
fixed(a26).
fixed(a27).
fixed(a28).
fixed(a29).
fixed(a30).
fixed(a31).
fixed(a32).
two_side_fixed(b5).
two_side_fixed(b6).
two_side_fixed(b12).
two_side_fixed(b13).
fixed(a1).
fixed(a2).
fixed(a3).
fixed(a4).
fixed(a5).
fixed(a6).
fixed(a7).
fixed(a8).
fixed(a9).
fixed(a10).
fixed(a11).
fixed(a12).
fixed(a33).
fixed(c1).
fixed(c4).
fixed(c9).
fixed(c15).
fixed(c16).
fixed(c19).
fixed(c20).
fixed(c21).
not_loaded(a1).
not_loaded(a2).
not_loaded(a3).
not_loaded(a4).
not_loaded(a5).
not_loaded(a6).
not_loaded(a7).
not_loaded(a23).
not_loaded(a24).
not_loaded(a25).
not_loaded(a26).
not_loaded(a27).
not_loaded(a28).
not_loaded(a29).
not_loaded(a33).
not_loaded(a36).
not_loaded(a42).
not_loaded(b1).
not_loaded(b2).
not_loaded(b5).
not_loaded(b6).
not_loaded(b7).
not_loaded(b8).
not_loaded(b9).
not_loaded(b12).
not_loaded(b13).
not_loaded(b14).
not_loaded(c1).
not_loaded(c2).
not_loaded(c3).
not_loaded(c4).
not_loaded(c5).
not_loaded(c6).
not_loaded(c7).
not_loaded(c8).
not_loaded(c9).
not_loaded(c15).
not_loaded(c20).
not_loaded(c21).
one_side_loaded(a34).
one_side_loaded(a35).
one_side_loaded(a40).
one_side_loaded(a41).
one_side_loaded(b3).
one_side_loaded(b4).
one_side_loaded(b10).
one_side_loaded(b11).
cont_loaded(a8).
cont_loaded(a9).
cont_loaded(a10).
cont_loaded(a11).
cont_loaded(a12).
cont_loaded(a13).
cont_loaded(a14).
cont_loaded(a15).
cont_loaded(a16).
cont_loaded(a17).
cont_loaded(a18).
cont_loaded(a19).
cont_loaded(a20).
cont_loaded(a21).
cont_loaded(a22).
cont_loaded(a30).
cont_loaded(a31).
cont_loaded(a32).
cont_loaded(a37).
cont_loaded(a38).
cont_loaded(a39).
cont_loaded(c10).
cont_loaded(c11).
cont_loaded(c12).
cont_loaded(c13).
cont_loaded(c14).
cont_loaded(c16).
cont_loaded(c17).
cont_loaded(c18).
cont_loaded(c19).
neighbour_xy(a34,a35).
neighbour_xy(a35,a26).
neighbour_xy(a26,a36).
neighbour_xy(a36,a4).
neighbour_xy(a4,a34).
neighbour_xy(b1,b13).
neighbour_xy(b13,b8).
neighbour_xy(b8,b7).
neighbour_xy(b7,b1).
neighbour_xy(b4,b6).
neighbour_xy(b6,b11).
neighbour_xy(b10,b14).
neighbour_xy(b14,b3).
neighbour_xy(c15,c9).
neighbour_xy(c16,c9).
neighbour_xy(c17,c11).
neighbour_xy(c12,c17).
neighbour_yz(c20,c2).
neighbour_yz(c21,c4).
neighbour_zx(a1,a2).
neighbour_zx(a2,a3).
neighbour_zx(a3,a4).
neighbour_zx(a4,a5).
neighbour_zx(a5,a6).
neighbour_zx(a6,a7).
neighbour_zx(a7,a8).
neighbour_zx(a8,a9).
neighbour_zx(a9,a10).
neighbour_zx(a10,a11).
neighbour_zx(a11,a12).
neighbour_zx(a12,a13).
neighbour_zx(a13,a14).
neighbour_zx(a14,a15).
neighbour_zx(b5,b1).
neighbour_zx(b8,b9).
neighbour_zx(b9,b10).
neighbour_zx(b10,b11).
neighbour_zx(b11,b12).
neighbour_zx(b12,b8).
neighbour_zx(c1,c2).
neighbour_zx(c2,c3).
neighbour_zx(c3,c4).
neighbour_zx(c4,c5).
neighbour_zx(c5,c6).
neighbour_zx(c6,c7).
neighbour_zx(c7,c8).
neighbour_zx(c8,c9).
neighbour_zx(c9,c10).
neighbour_zx(c10,c11).
neighbour_zx(c11,c12).
neighbour_xy(c18,c12).
neighbour_xy(c13,c18).
neighbour_xy(c19,c1).
neighbour_xy(c20,c1).
neighbour_xy(c21,c3).
neighbour_yz(a39,a41).
neighbour_yz(a40,a39).
neighbour_yz(a35,a40).
neighbour_yz(a25,a35).
neighbour_yz(a42,a25).
neighbour_yz(a24,a42).
neighbour_yz(b5,b6).
neighbour_yz(b6,b12).
neighbour_yz(b12,b13).
neighbour_yz(b13,b5).
neighbour_yz(b2,b7).
neighbour_yz(b7,b9).
neighbour_yz(b9,b14).
neighbour_yz(b14,b2).
neighbour_yz(c15,c8).
neighbour_yz(c16,c10).
neighbour_yz(c19,c14).
opposite(b11,b8).
opposite(c6,c12).
opposite(c2,c14).
opposite(c10,c14).
opposite(c15,c16).
opposite(c16,c17).
opposite(c17,c18).
opposite(c18,c19).
opposite(c19,c20).
opposite(c20,c21).
neighbour_zx(a15,a16).
neighbour_zx(a16,a17).
neighbour_zx(a17,a18).
neighbour_zx(a18,a19).
neighbour_zx(a19,a20).
neighbour_zx(a20,a21).
neighbour_zx(a21,a22).
neighbour_zx(a22,a23).
neighbour_zx(a23,a24).
neighbour_zx(a24,a1).
neighbour_zx(a25,a26).
neighbour_zx(a26,a27).
neighbour_zx(a27,a28).
neighbour_zx(a28,a29).
neighbour_zx(a29,a30).
neighbour_zx(a30,a31).
neighbour_zx(a31,a32).
neighbour_zx(a32,a33).
neighbour_zx(a33,a25).
neighbour_zx(b1,b2).
neighbour_zx(b2,b3).
neighbour_zx(b3,b4).
neighbour_zx(b4,b5).
same(a33,a23).
same(a36,a37).
same(a38,a37).
same(a39,a42).
same(b1,b8).
same(c6,c12).
same(c2,c14).
same(c10,c14).
same(c15,c16).
same(c16,c17).
neighbour_zx(c12,c13).
neighbour_zx(c13,c14).
neighbour_zx(c14,c1).
opposite(a11,a3).
opposite(a9,a3).
opposite(a31,a25).
opposite(a13,a1).
opposite(a15,a1).
opposite(a17,a1).
opposite(a19,a1).
opposite(a22,a1).
opposite(a23,a1).
opposite(a32,a22).
opposite(a33,a23).
opposite(a34,a37).
opposite(a36,a37).
opposite(a38,a37).
opposite(a39,a42).
opposite(b1,b8).
opposite(b3,b1).
opposite(b4,b1).
opposite(b10,b8).
same(c17,c18).
same(c18,c19).
same(c19,c20).
same(c20,c21).
【００５１】
次に、探索手続きに入る。
まず、準備として、まず入力部１から受け取った例を、オリジナル例集合保持部２１に入れる（ステップ２０１）。この例集合をコピーし、例集合保持部に入れ（ステップ１０２）、ノードバッファ２５を空にする（ステップ１０３）。
【００５２】
次に、述語集合を空集合としたノードを生成してノードバッファ２５の先頭に入れる（ステップ１０４）。このとき、ノードバッファ２５は、図４に示すような状態となる。ノード２６−０の述語集合は空集合なので、すべての例をカバーする。ここでは７５個の例を入れたので、ｆは７５となっている。
【００５３】
続いて、ステップ１０３の展開手続きに入るが、ノードバッファ２５の先頭のノード２６−０は、未展開であり、これを展開手続きによって展開し、得られたノード２６−１をノードバッファに加える。展開手続きでは、元が空集合なため、モード宣言で指定された述語を１つ付け加えることだけが行われる。
【００５４】
１回の展開では、ノードの数がユーザが指定したパラメータ（１００）に達しなかったため、ここではさらに展開が繰り返し行われる。この展開手続き中では、例えば、ノード｛important(X)｝から、述語を１つ付け加えることにより、ノード｛important(X), one_side_loaded(X)｝のようなｃ＝２のノードも得られる。ノードの数がユーザが指定したパラメータ（１００）を越えたならば（ステップ１０６でＹＥＳ）、展開手続きを終了する。このときのノードバッファ２５の様子を図５に示す。
【００５５】
次に、ステップ１０７の出力手続きに入る。ここではノードバッファ２５の先頭から調べていくが、図５の先頭のノード２６−１のｆが０でないため、これを出力として、出力手続きを呼び出す。
【００５６】
出力手続きは、ノード２６−１からルールを生成する。
まず、ノード２６−１の述語集合中｛not_important(X)｝の述語not_important(X)を右辺に置く。次に、ノード２６−１のカバーする例の数値部分の平均値を求める。ここでは、ノード２６−１がカバーする例は、mesh(a2,1),mesh(a4,1),...の21個であり、数値部分は1,1,1,...であり、これらを平均すると1.048である。
【００５７】
このとき数値情報が整数であることが指定されているので、平均値は、四捨五入して整数１に置き換えられる。結局左辺に置かれる述語はmesh(X,1)となり、出力手続きは、ルールmesh(X,1)←not_important(X).を生成する。
【００５８】
そして、ノード２６−１は取り除かれるので（ステップ１０７）、ノードバッファ２５は、図６の状態になる。
【００５９】
次に、ノード２６−１がカバーした例２１個を、例集合保持部２１から取り出す（ステップ１０８）。そして、例集合が空ではないので（ステップ１０９でＮＯ）、ステップ１１０に進む。
【００６０】
ここで、ノードバッファ２５を調べ、ｆが０であるノードを取り除く（ステップ１１０）。その後、ステップ１０５に戻る。
【００６１】
次にステップ１０５に再び入る。
同様に、ノードバッファ２５の先頭から順に調べ、まだ展開されていないノードを展開手続きによって展開し、得られたノードをノードバッファ２５に加える（ステップ１０５）。これを繰り返し（ステップ１０６でＮＯ）、さらに１００を越える数のノードが加えられたらこのステップを終了する（ステップ１０６でＹＥＳ）。
【００６２】
次に、出力手続きに入る。ここではノードバッファ２５の先頭から調べていくが、図６の先頭のノード２６−１のｆが０でないため、これを出力として出力手続きを呼び出す。
【００６３】
出力手続きは、先ほどと同様の方法で、mesh(X,8)←circuit(X).を生成する（ステップ１０７）。また、ノード２６−２は取り除かれる。
【００６４】
続いて、ノード２６−２がカバーした例６個を、例集合保持部２１から取り出す（ステップ１０８）。そして、例集合が空ではないので（ステップ１０９でＮＯ）、次に進み、以下同様に、例集合が空になるまでループを繰り返し、ルールを出力していく。そして、例集合が空になれば（ステップ１０９でＹＥＳ）、手続きは終了する。
【００６５】
出力された優先度付きルールは、次のものである。上のものほど優先度が高い。
【００６６】
mesh(X,1)←not_important(X).
mesh(X,8)←circuit(X).
mesh(X,9)←important_long(X).
mesh(X,12)←half_circuit(X).
：
【００６７】
ここで、このルールを利用する場合は、優先度の高いルールから順に適用して導出を試みる。例えば、エッジa2の分割数を求めたい場合は、ゴール←mesh(a2,N).の導出を、上のルールから順番に試みる。ここであるルールで導出が成功し、代入N=1が得られたとしたら、エッジa2の分割数は１と求められたことになる。そして、前述したように、それ以降のルールによる導出は行わない。
【００６８】
なお、この実施例では、目標述語中の数値情報に対応しない変数をXで示したが、これは便宜的なもので、変数記号であれば何でも良い。また、目標述語中の数値情報に対応しない変数が２つ以上あった場合も、本システムは同様に動作可能である。この場合は、目標述語は例えばtarget(X,Y,N)のように表される。
【００６９】
【発明の効果】
以上説明したように、この発明によれば、従来利用されていなかった学習の目標となる述語のうちの数値情報を数値として扱い、オリジナル例集合中の例の数値部分の値の標準偏差を利用するように構成したので、カバーする例をうまく説明するルールを出力することができた。
【００７０】
実際このことにより、従来技術に比べて、良い（すなわち未知例についての予測精度が高い）ルールが得られるという効果があることは、次の比較実験の結果が示している。
【００７１】
比較実験では、本発明のシステムでは、実施例で説明した例を用いた。数値情報を利用しないILPシステムとしては、Progolを使用し、比較のため、同じ例、モード宣言、背景知識を与えた。ただし、Progolには、例（正例）のほかに、負例を同数与えた。負例は、正例と異なる例（例えば、mesh(a29,8), mesh(a19,7)）をランダムに生成して与えた。Progolには、正例のみからルールを生成する機能もあるが、これは利用しなかった。それは、Progolにより得られるルールには優先順序がないため、１つの例に対していくつかのルールで導出が可能となり、つまり、「数値としての答えがいくつか出て」、そのどれかが正しければその未知例の予測に成功したことになり、本システムに比べて有利となるためである。この効果は、負例を同数入れることにより相殺される。
【００７２】
実験では、両方式とも、leave/1によるクロスバリデーション（１つの例を取りだし残りを訓練例として学習させ、得られたルールでテストを行うことを、全ての例について行う）を行った。その結果、この発明のシステムでは、予測精度８０．０％、Progolでは７３．３％であり、本システムの優位性が示された。なおProgolのパラメータはデフォルトのものを使用した。
【００７３】
一般に、学習において、未知例の予測をする場合、そのデータセットにもよるが、実世界のデータセットにおいては、予測精度９０％程度が上限であるのが普通である。その点（理想精度までの差を１６．７ポイントから１０ポイントまで約半減させた）を考えると、この予測精度の差６．７ポイントは、大幅な性能向上ということができる。
【００７４】
また、上述の実施例においては予測精度８０．０％が得られているが、アルゴリズムや評価関数の改良を行えば、机上での検討結果ではあるが、予測精度８４．０％まで性能を向上させることも可能であることがわかっており、この発明を実施した場合には、大幅な予測精度の向上が得られるという効果がある。
【図面の簡単な説明】
【図１】帰納推論システムの概略構成を示すブロック図である。
【図２】ノード２６を示した図である。
【図３】帰納推論部２の探索手続きの流れを示すフローチャートである。
【図４】ノードバッファ２５の状態を示した図（１）である。
【図５】ノードバッファ２５の状態を示した図（２）である。
【図６】ノードバッファ２５の状態を示した図（３）である。
【符号の説明】
１入力部
２帰納推論部
３出力部
２１オリジナル例集合保持部
２２例集合保持部
２３モード宣言保持部
２４背景知識保持部
２５ノードバッファ
２６、２６−０〜２６−８ノード
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to an inductive reasoning system and, in particular, to an inductive reasoning system based on first-order predicate logic that, for a target concept containing one piece of numerical information, takes as input examples that are samples of the target concept and outputs prioritized rules that explain the target concept.
[0002]
[Prior art]
Inductive Logic Programming (hereinafter referred to as ILP), an inductive reasoning approach based on first-order predicate logic proposed in the literature (S. Muggleton. Inductive logic programming. New Generation Computing, 8(4):295-318, 1991.), takes as input examples that are samples of a target concept, together with background knowledge and the like, and outputs rules that explain the target concept.
[0003]
ILP was originally created with the intention of learning concepts represented by symbols, but in practical applications many concepts include numerical information, and learning these well is a major challenge.
[0004]
To address this problem, an approach that treats numerical values as symbols has been tried (for example, B. Dolsak and S. Muggleton. The application of Inductive Logic Programming to finite element mesh design. In S. Muggleton, editor, Inductive Logic Programming, pages 453-472. Academic Press, London, 1992.). That is, learning was performed by regarding 1, 2, 3, and so on as mere symbols, just like a, b, c. As a result, however, relationships such as the ordering between numerical values were not taken into account. There have been several other attempts to learn concepts containing numerical information with ILP, but most take this kind of approach and at best go as far as introducing inequality signs.
[0005]
Progol is an ILP system that is widely used at present. Progol is described in the literature (S. Muggleton. Inverse entailment and Progol. New Generation Computing, 13:245-286, 1995.). Even with Progol, when rules are to be formed using the numerical information contained in the examples, all that is available are a two-argument predicate expressing equality, a two-argument predicate expressing inequality, a two-argument predicate expressing that the right-hand side is greater than the left-hand side and its converse, and a two-argument predicate expressing that the right-hand side is greater than or equal to the left-hand side and its converse.
[0006]
The rule evaluation function that Progol uses when searching for rules is based on the number of positive examples covered by the rule (that is, derivable using the rule), the number of negative examples covered, and the length of the rule. Consequently, the numerical information contained in the examples is not used, and there is the problem that a good rule (one that explains the given examples well, or that predicts unknown examples with high accuracy) may not be obtained.
[0007]
[Problems to be solved by the invention]
As described above, in the conventional inductive inference system, it is difficult to learn numerical information, and as a result, the numerical information included in the example is not used and a good rule may not be obtained.
[0008]
Therefore, an object of the present invention is to provide an inductive inference system that can obtain good rules by using numerical information.
[0009]
[Means for Solving the Problems]
To achieve the above object, the invention of claim 1 is an inductive reasoning system based on first-order predicate logic that, for a target concept containing one piece of numerical information, takes as input examples that are samples of the target concept and outputs rules that explain the target concept, the system comprising: storage means for storing the examples; inference means for generating nodes based on the examples stored in the storage means and calculating an evaluation value for each generated node with a predetermined evaluation function that uses the standard deviation of the numerical parts of the examples covered by that node; and output means for outputting, in descending order of the evaluation value calculated by the inference means, a rule containing the mean of the numerical parts of the examples covered by the node, and deleting the examples corresponding to the output rule from the storage means, wherein the order of output by the output means is used as the priority order of the rules.
[0013]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, an embodiment of an inductive inference system according to the present invention will be described in detail with reference to the accompanying drawings.
[0014]
FIG. 1 is a block diagram showing a schematic configuration of an inductive inference system.
As shown in the figure, the inductive reasoning system includes an input unit 1, an inductive reasoning unit 2, and an output unit 3.
[0015]
The input unit 1 is the user-interface part and inputs examples, mode declarations, background knowledge, and the like to the inductive reasoning unit 2. The inductive reasoning unit 2 performs a top-down search for rules, generates a prioritized rule set that explains the target concept, and sends it to the output unit 3. The output unit 3 presents the output of the system to the user.
[0016]
The inductive reasoning unit 2 includes an original example set holding unit 21, an example set holding unit 22, a mode declaration holding unit 23, a background knowledge holding unit 24, and a node buffer 25.
[0017]
The original example set holding unit 21 holds the example set unchanged throughout the search procedure, whereas the example set holding unit 22 holds the example set as it changes (shrinks) during the search procedure.
[0018]
The mode declaration holding unit 23 has a role of holding a mode declaration. The mode declaration is basically the same as that of Progol (modes), and describes the predicates that define the target concept, the predicates used to generate rules, and the input / output relationships of the variables.
[0019]
The background knowledge holding unit 24 has a role of holding background knowledge. The background knowledge is basically the same as that of Progol (background knowledge), which describes the definition of predicates and the specific properties of each example.
[0020]
The node buffer 25 holds the nodes of the search tree and consists of a sequence of nodes 26.
[0021]
Here, the node 26 will be described.
FIG. 2 is a diagram showing the node 26.
[0022]
The node 26 is a five-tuple consisting of a predicate set, f, σ, c, and s. It additionally holds information on whether it has been expanded (expansion is described later).
[0023]
A predicate set corresponds to a rule. That is, a rule is formed by placing the predicates in the predicate set on the right-hand side and the predicate corresponding to the target concept on the left-hand side. For example, in node 26 of FIG. 2, the predicate set is {important(X)} and the predicate corresponding to the target concept is {mesh(X,N)}, so the corresponding rule is {mesh(X,N)←important(X)}. Here, the part of the target concept that corresponds to the numerical information is replaced with an arbitrary variable (N in this example).
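The mapping from a node's predicate set to a rule can be illustrated with a short Python sketch. The function name `rule_from_predicates` and the string representation of predicates are assumptions made for illustration only; the source does not prescribe a data format.

```python
def rule_from_predicates(head_name, head_args, body):
    """Build the rule text 'head←body.' from a target predicate and a predicate set.

    head_name / head_args describe the target concept (e.g. mesh, [X, N]);
    body is the node's predicate set, given here as a list of strings.
    """
    head = f"{head_name}({','.join(head_args)})"
    if not body:
        # An empty predicate set gives a rule with no conditions.
        return f"{head}."
    return f"{head}←{', '.join(body)}."


# Node 26 of FIG. 2: predicate set {important(X)}, target concept mesh(X,N).
rule = rule_from_predicates("mesh", ["X", "N"], ["important(X)"])
print(rule)  # mesh(X,N)←important(X).
```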
[0024]
f represents the number of examples this node covers at that point in time, that is, the number of examples in the current example set that can be derived by the rule corresponding to the node's predicate set. For node 26, the number of covered examples is 11. This is a measure of how many examples support the rule; the larger it is, the better the rule.
[0025]
σ represents the standard deviation of the numerical parts of the examples in the original example set that this node covers. For example, if node 26 covers the 11 examples mesh(a3,8), mesh(a9,3), ..., the numerical parts are 8, 3, 7, 2, 2, 5, 2, 7, 3, 2, 2, and their standard deviation is 2.386. The standard deviation is used because, when it is small, the numerical parts of the covered examples show little variation, and the rule can be regarded as explaining the examples well.
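The quoted figure can be checked with a few lines of Python. The text does not say which form of the standard deviation is meant; the arithmetic below suggests that 2.386 corresponds to the sample standard deviation (divisor n − 1), while the population form (divisor n) gives a smaller value.

```python
import statistics

# Numerical parts of the 11 examples covered by node 26, as listed in the text.
values = [8, 3, 7, 2, 2, 5, 2, 7, 3, 2, 2]

sample_sigma = statistics.stdev(values)       # divisor n - 1
population_sigma = statistics.pstdev(values)  # divisor n

print(round(sample_sigma, 3))      # 2.386
print(round(population_sigma, 3))  # 2.275
```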
[0026]
c represents the number of elements in this node's predicate set. For example, in node 26 the predicate set has one element, so c is 1. This is a measure of the length of the rule corresponding to the node; the smaller it is, the better the rule.
[0027]
Finally, s represents the score of this node, that is, the value of the rule evaluation function. The score s is basically determined using f, σ, and c. The number F of examples in the original example set may be used instead of f. For example, in the node 26, s is −5.842.
[0028]
Next, the operation of the inductive reasoning unit 2 will be described.
FIG. 3 is a flowchart showing the flow of the search procedure of the inductive reasoning unit 2.
[0029]
The inductive reasoning unit 2 first places the example received from the input unit 1 into the original example set holding unit 21 (step 101). Then, this example set is copied, put in the example set holding unit 22 (step 102), and the node buffer 25 is emptied (step 103).
[0030]
Next, a node having the predicate set as an empty set is generated and placed at the head of the node buffer 25 (step 104). At this time, as shown in FIG. 4, the node buffer 25 is in a state where a node 26-0 whose predicate set is an empty set (Empty) is placed at the head.
[0031]
Subsequently, the node buffer 25 is examined in order from the top, nodes that have not been expanded are expanded by the expansion procedure, and the obtained nodes are added to the node buffer 25 (step 105). At this time, the node buffer 25 is always arranged in descending order of the score s. Nodes having the same score s may be arranged in an arbitrary order.
[0032]
The expansion procedure here is the same as that of Progol (refinement operator), and only the outline will be described.
The expansion procedure outputs all of the predicate sets obtained by performing any one of the following operations on the predicate set of the original node (26 etc.).
1. Add one predicate specified in the mode declaration.
2. Make two variables with different names in the predicate set the same.
3. Replace a variable in the predicate set with a constant.
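As a rough illustration of the expansion step and of keeping the node buffer ordered by score, the following Python sketch implements only operation 1 (adding one predicate); operations 2 and 3 are omitted. The function names, the string representation of predicates, and the scores are hypothetical and are not taken from the source.

```python
def expand_add_predicate(predicates, mode_predicates):
    """Operation 1 only: return every predicate set obtained by adding one new predicate."""
    return [predicates | {p} for p in mode_predicates if p not in predicates]


def insert_by_score(buffer, node, score):
    """Insert (score, node) so that the buffer stays in descending score order.

    Nodes with equal scores may appear in any order; here a new node is
    placed after existing nodes with the same score.
    """
    i = 0
    while i < len(buffer) and buffer[i][0] >= score:
        i += 1
    buffer.insert(i, (score, node))


mode_predicates = ["important(X)", "circuit(X)", "one_side_loaded(X)"]
children = expand_add_predicate(frozenset({"important(X)"}), mode_predicates)

# Hypothetical scores, used only to show the ordering of the buffer.
buffer = []
for s, name in [(-0.1, "n1"), (0.35, "n2"), (0.15, "n3")]:
    insert_by_score(buffer, name, s)
order = [name for _, name in buffer]
print(order)  # ['n2', 'n3', 'n1']
```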
[0033]
This expansion procedure is repeated (NO in step 106) until more nodes than a number specified in advance by the user (for example, 100 here) have been added to the node buffer 25; once more than that number have been added (YES in step 106), the expansion procedure ends. When the expansion procedure has ended, the node buffer 25 holds 100 or more nodes, such as nodes 26-1 to 26-8, as shown in FIG. 5 (the count can exceed 100 because of how the expansion procedure adds nodes).
[0034]
Next, the node buffer 25 is examined from the head; the first node 26 whose f is not 0 (a node that covers some example) is taken as the output, a rule is generated from it by the output procedure and sent to the output unit 3, and the node is removed from the node buffer 25 (step 107). Any nodes with f = 0 that precede it are simply discarded. When the output procedure has finished, the node buffer 25 is in the state shown in FIG. 6, in which the node 26-1 has been output (compare FIG. 5).
[0035]
The output procedure generates a rule from a node. It places the predicates in the node's predicate set on the right-hand side and, on the left-hand side, places the predicate representing the target concept with the mean of the numerical parts of the examples covered by the node inserted in the argument that represents the numerical information. If the user has specified that the numerical information is an integer, the mean is rounded to the nearest integer.
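A minimal Python sketch of the output procedure follows. It assumes that the rounding is round-half-up, which differs from Python's built-in `round` (round-half-to-even); the helper names are hypothetical. The eleven values are those listed for node 26 in paragraph [0025].

```python
import math


def round_half_up(x):
    """Round to the nearest integer, with halves rounded up (assumed rounding mode)."""
    return math.floor(x + 0.5)


def output_rule(predicates, values, integer=True):
    """Put the mean of the covered numerical parts into the head of the rule."""
    mean = sum(values) / len(values)
    n = round_half_up(mean) if integer else mean
    return f"mesh(X,{n})←{', '.join(predicates)}."


values = [8, 3, 7, 2, 2, 5, 2, 7, 3, 2, 2]  # mean is 43/11, about 3.909
rule = output_rule(["important(X)"], values)
print(rule)  # mesh(X,4)←important(X).
```

The difference from the built-in only shows at exact halves: `round_half_up(2.5)` gives 3, whereas `round(2.5)` gives 2.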
[0036]
Subsequently, the examples covered by the node output by the output procedure are removed from the example set holding unit 22 (step 108). If the example set held in the example set holding unit 22 is then empty (YES in step 109), the procedure ends.
[0037]
On the other hand, if the example set held in the example set holding unit 22 is not empty (NO in step 109), the node buffer 25 is examined and every node whose f is 0 is removed, since such a node cannot cover any example even if it is expanded further (step 110). The process then returns to step 105, and the same processing is repeated.
[0038]
Note that this search procedure always terminates: each pass through the loop removes at least one example from the example set, and a node whose predicate set is empty covers all examples.
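The loop of steps 101 to 110 can be sketched in Python under strong simplifying assumptions: all body predicates are unary on the edge, expansion only adds one predicate, the mean is taken over the examples still in the working set, and the score function `toy_score` and the five-example dataset are invented for illustration. It is a sketch of the control flow, not the patented implementation.

```python
import statistics


def covered(preds, edges, background):
    """Edges in `edges` that satisfy every unary predicate in `preds`."""
    return [e for e in edges if all(e in background[p] for p in preds)]


def toy_score(preds, original, remaining, background):
    """Hypothetical score built from f, sigma, and c (coefficients are illustrative)."""
    f = len(covered(preds, remaining, background))
    nums = [original[e] for e in covered(preds, original, background)]
    sigma = statistics.stdev(nums) if len(nums) >= 2 else 0.0
    return f / len(original) - sigma * sigma - 0.05 * len(preds)


def search(examples, background, score, max_nodes=100):
    original = dict(examples)      # step 101: kept unchanged
    remaining = dict(examples)     # step 102: shrinks as rules are output
    buffer = [frozenset()]         # steps 103-104: start from the empty predicate set
    expanded, rules = set(), []

    def key(n):
        return score(n, original, remaining, background)

    while remaining:               # step 109: stop when no examples are left
        while len(buffer) <= max_nodes:          # steps 105-106
            buffer.sort(key=key, reverse=True)   # best score first
            node = next((n for n in buffer if n not in expanded), None)
            if node is None:
                break              # nothing left to expand in this toy setting
            expanded.add(node)
            for p in background:
                child = node | {p}
                if child not in buffer:
                    buffer.append(child)
        buffer.sort(key=key, reverse=True)
        # Step 107: the first node that still covers something becomes a rule.
        best = next(n for n in buffer if covered(n, remaining, background))
        hit = covered(best, remaining, background)
        rules.append((sorted(best), statistics.mean(remaining[e] for e in hit)))
        for e in hit:              # step 108: remove the covered examples
            del remaining[e]
        # Step 110: drop the output node and every node that now covers nothing.
        buffer = [n for n in buffer
                  if n != best and covered(n, remaining, background)]
    return rules


examples = {"e1": 1, "e2": 1, "e3": 8, "e4": 8, "e5": 3}
background = {"not_important": {"e1", "e2"},
              "circuit": {"e3", "e4"},
              "important": {"e5"}}
learned = search(examples, background, toy_score)
print(learned)
```

Because the empty predicate set covers every remaining example, some node always has nonzero f, and each pass removes at least one example, which is the termination argument given above.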
[0039]
The rules obtained by this inductive inference system have a priority order. A rule that is output earlier has a higher score under the evaluation function, so it is treated as a rule with higher priority. In a conventional inductive inference system such as Progol, by contrast, the output rules have no priority order and are all treated as being at the same level. When making a prediction, derivation is attempted by applying the rules in descending order of priority.
[0040]
Also, when the rules obtained by this inductive inference system are used, once derivation succeeds with one rule, the rules with lower priority are not used. That is, exactly one numerical answer is produced. In the prior art, by contrast, derivation may succeed with several rules for a single example, so that two or more numerical answers are given.
[0041]
Next, a specific example of the operation of this inductive inference system is shown.
[0042]
Here, the same example as in the above-mentioned literature (B. Dolsak and S. Muggleton. The application of Inductive Logic Programming to finite element mesh design. In S. Muggleton, editor, Inductive Logic Programming, pages 453-472. Academic Press, London, 1992.) is used. This is not a so-called "toy dataset" created for learning research, but a dataset used in the real world. To acquire rules for generating a desired mesh from a structure, using data from good meshes as examples, mesh(X, N) is used as the predicate to be learned. Here, X is an edge ID, and N is the number of divisions of the edge (the number of elements along the edge). The target concept mesh thus includes numerical information (N), and N is an integer.
[0043]
Here, the score s of the node 26 is determined by s = f/75 − 10σ² − 0.05c.
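The score can be illustrated by the following Python sketch. It reads the formula as f/75 − 10σ² − 0.05c, where f is the number of examples the node covers, σ is the standard deviation of their numerical parts, and c is the number of predicates in the node's predicate set. This reading, the use of the population standard deviation, and the name node_score are assumptions.

```python
import statistics

TOTAL_EXAMPLES = 75  # number of examples input in this worked example


def node_score(covered_values, num_predicates, total=TOTAL_EXAMPLES):
    """Return f/total - 10*sigma^2 - 0.05*c for one node (assumed reading)."""
    f = len(covered_values)
    # pstdev needs at least one value; a node with f = 0 gets sigma = 0 here.
    sigma = statistics.pstdev(covered_values) if f > 0 else 0.0
    return f / total - 10 * sigma * sigma - 0.05 * num_predicates


# Hypothetical node: six covered examples, all with value 8, one predicate.
print(node_score([8] * 6, 1))
```

Under this reading, a node is rewarded for covering many examples and penalized strongly for covering examples whose numerical parts disagree, and mildly for each additional predicate.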
[0044]
First, examples, mode declarations, and background knowledge are input to the inductive reasoning unit 2. Here, the following 75 examples are input.
[0045]
mesh(a2, 1).
mesh(a3, 8).
mesh(a4, 1).
mesh(a5, 1).
mesh(a6, 2).
mesh(a7, 1).
mesh(a8, 1).
mesh(a9, 3).
mesh(a10, 1).
mesh(a11, 3).
mesh(a12, 1).
mesh(a13, 1).
mesh(a14, 1).
mesh(a15, 4).
mesh(a16, 1).
mesh(a17, 2).
mesh(a18, 1).
mesh(a19, 4).
mesh(a20, 1).
mesh(a21, 1).
mesh(a22, 2).
mesh(a23, 2).
mesh(a24, 1).
mesh(a25, 2).
mesh(a26, 1).
mesh(a27, 1).
mesh(a28, 2).
mesh(a29, 1).
mesh(a30, 1).
mesh(a31, 3).
mesh(a32, 2).
mesh(a33, 2).
mesh(a35, 1).
mesh(a36, 12).
mesh(a37, 12).
mesh(a38, 12).
mesh(a39, 5).
mesh(a40, 2).
mesh(a41, 1).
mesh(a42, 5).
mesh(b1, 9).
mesh(b2, 1).
mesh(b3, 2).
mesh(b4, 7).
mesh(b5, 1).
mesh(b6, 1).
mesh(b7, 1).
mesh(b8, 9).
mesh(b9, 1).
mesh(b10, 2).
mesh(b11, 7).
mesh(b12, 1).
mesh(b13, 1).
mesh(b14, 1).
mesh(c1, 1).
mesh(c2, 2).
mesh(c3, 1).
mesh(c4, 1).
mesh(c5, 3).
mesh(c6, 2).
mesh(c7, 2).
mesh(c8, 3).
mesh(c9, 1).
mesh(c10, 2).
mesh(c11, 1).
mesh(c12, 2).
mesh(c13, 1).
mesh(c14, 2).
mesh(c15, 8).
mesh(c16, 8).
mesh(c17, 8).
mesh(c18, 8).
mesh(c19, 8).
mesh(c20, 8).
mesh(c21, 8).
[0046]
The mode declarations are entered as follows. They are written in the same syntax as Progol.
[0047]
:- modeh(1, mesh(+edge, +nums))?
:- modeh(1, mesh(+edge, #nums))?
:- modeb(1, important_long(+edge))?
:- modeb(1, important(+edge))?
:- modeb(1, important_short(+edge))?
:- modeb(1, circuit(+edge))?
:- modeb(1, half_circuit(+edge))?
:- modeb(1, short_for_hole(+edge))?
:- modeb(1, long_for_hole(+edge))?
:- modeb(1, circuit_hole(+edge))?
:- modeb(1, half_circuit_hole(+edge))?
:- modeb(1, not_important(+edge))?
:- modeb(1, free(+edge))?
:- modeb(1, one_side_fixed(+edge))?
:- modeb(1, two_side_fixed(+edge))?
:- modeb(1, fixed(+edge))?
:- modeb(1, not_loaded(+edge))?
:- modeb(1, one_side_loaded(+edge))?
:- modeb(1, cont_loaded(+edge))?
:- modeb(*, neighbour_xy(+edge, -edge))?
:- modeb(*, neighbour_yz(+edge, -edge))?
:- modeb(*, neighbour_zx(+edge, -edge))?
:- modeb(2, opposite(+edge, -edge))?
:- modeb(2, same(+edge, -edge))?
[0048]
Here, modeh specifies the predicate that defines the target concept, and modeb specifies a predicate that may be used to form the right side of a generated rule. +edge and -edge specify an input variable and an output variable, respectively.
[0049]
The following is input as background knowledge. It is the same as that described in the literature, except that the names of the two-argument predicates have been changed.
[0050]
important_long(a1).
important_long(a34).
important_long(b1).
important_long(b8).
important(a3).
important_short(a9).
important_short(a11).
important_short(a13).
important_short(a15).
important_short(a19).
important_short(a22).
important_short(a23).
important(a39).
important(b4).
important(b11).
important(c2).
important(c5).
important(c6).
circuit(c18).
circuit(c19).
circuit(c20).
half_circuit(a36).
half_circuit(a37).
important(c8).
important(c10).
important(c12).
important(c14).
important_short(a6).
not_important(a12).
not_important(a14).
not_important(a20).
not_important(a21).
not_important(a24).
not_important(a27).
not_important(a29).
important_short(a25).
important_short(a26).
important_short(a28).
important_short(a31).
important_short(a33).
important_short(a35).
important_short(a40).
important_short(b3).
important_short(b10).
important_short(c4).
important_short(c7).
important_short(c13).
circuit(c15).
circuit(c16).
circuit(c17).
short_for_hole(a16).
short_for_hole(a18).
long_for_hole(a17).
circuit_hole(c21).
half_circuit_hole(a38).
half_circuit_hole(a42).
not_important(a2).
not_important(a4).
not_important(a5).
not_important(a7).
not_important(a8).
not_important(a10).
not_important(a30).
not_important(a32).
not_important(a41).
not_important(b2).
not_important(b5).
not_important(b6).
not_important(b7).
not_important(b9).
not_important(b12).
not_important(b13).
not_important(b14).
not_important(c1).
not_important(c3).
not_important(c9).
not_important(c11).
free(a39).
free(a40).
free(b2).
free(b3).
free(b7).
free(b9).
free(b10).
free(b14).
free(c6).
free(c7).
free(c11).
free(c12).
free(c13).
free(c17).
free(c18).
one_side_fixed(a34).
fixed(a13).
fixed(a14).
fixed(a15).
fixed(a16).
fixed(a17).
fixed(a18).
fixed(a19).
fixed(a20).
fixed(a21).
fixed(a22).
one_side_fixed(a35).
one_side_fixed(a41).
one_side_fixed(b1).
one_side_fixed(b4).
one_side_fixed(b8).
one_side_fixed(b11).
one_side_fixed(c2).
one_side_fixed(c3).
one_side_fixed(c5).
one_side_fixed(c8).
one_side_fixed(c10).
one_side_fixed(c14).
two_side_fixed(a36).
two_side_fixed(a37).
two_side_fixed(a38).
two_side_fixed(a42).
fixed(a23).
fixed(a24).
fixed(a25).
fixed(a26).
fixed(a27).
fixed(a28).
fixed(a29).
fixed(a30).
fixed(a31).
fixed(a32).
two_side_fixed(b5).
two_side_fixed(b6).
two_side_fixed(b12).
two_side_fixed(b13).
fixed(a1).
fixed(a2).
fixed(a3).
fixed(a4).
fixed(a5).
fixed(a6).
fixed(a7).
fixed(a8).
fixed(a9).
fixed(a10).
fixed(a11).
fixed(a12).
fixed(a33).
fixed(c1).
fixed(c4).
fixed(c9).
fixed(c15).
fixed(c16).
fixed(c19).
fixed(c20).
fixed(c21).
not_loaded(a1).
not_loaded(a2).
not_loaded(a3).
not_loaded(a4).
not_loaded(a5).
not_loaded(a6).
not_loaded(a7).
not_loaded(a23).
not_loaded(a24).
not_loaded(a25).
not_loaded(a26).
not_loaded(a27).
not_loaded(a28).
not_loaded(a29).
not_loaded(a33).
not_loaded(a36).
not_loaded(a42).
not_loaded(b1).
not_loaded(b2).
not_loaded(b5).
not_loaded(b6).
not_loaded(b7).
not_loaded(b8).
not_loaded(b9).
not_loaded(b12).
not_loaded(b13).
not_loaded(b14).
not_loaded(c1).
not_loaded(c2).
not_loaded(c3).
not_loaded(c4).
not_loaded(c5).
not_loaded(c6).
not_loaded(c7).
not_loaded(c8).
not_loaded(c9).
not_loaded(c15).
not_loaded(c20).
not_loaded(c21).
one_side_loaded(a34).
one_side_loaded(a35).
one_side_loaded(a40).
one_side_loaded(a41).
one_side_loaded(b3).
one_side_loaded(b4).
one_side_loaded(b10).
one_side_loaded(b11).
cont_loaded(a8).
cont_loaded(a9).
cont_loaded(a10).
cont_loaded(a11).
cont_loaded(a12).
cont_loaded(a13).
cont_loaded(a14).
cont_loaded(a15).
cont_loaded(a16).
cont_loaded(a17).
cont_loaded(a18).
cont_loaded(a19).
cont_loaded(a20).
cont_loaded(a21).
cont_loaded(a22).
cont_loaded(a30).
cont_loaded(a31).
cont_loaded(a32).
cont_loaded(a37).
cont_loaded(a38).
cont_loaded(a39).
cont_loaded(c10).
cont_loaded(c11).
cont_loaded(c12).
cont_loaded(c13).
cont_loaded(c14).
cont_loaded(c16).
cont_loaded(c17).
cont_loaded(c18).
cont_loaded(c19).
neighbour_xy(a34, a35).
neighbour_xy(a35, a26).
neighbour_xy(a26, a36).
neighbour_xy(a36, a4).
neighbour_xy(a4, a34).
neighbour_xy(b1, b13).
neighbour_xy(b13, b8).
neighbour_xy(b8, b7).
neighbour_xy(b7, b1).
neighbour_xy(b4, b6).
neighbour_xy(b6, b11).
neighbour_xy(b10, b14).
neighbour_xy(b14, b3).
neighbour_xy(c15, c9).
neighbour_xy(c16, c9).
neighbour_xy(c17, c11).
neighbour_xy(c12, c17).
neighbour_yz(c20, c2).
neighbour_yz(c21, c4).
neighbour_zx(a1, a2).
neighbour_zx(a2, a3).
neighbour_zx(a3, a4).
neighbour_zx(a4, a5).
neighbour_zx(a5, a6).
neighbour_zx(a6, a7).
neighbour_zx(a7, a8).
neighbour_zx(a8, a9).
neighbour_zx(a9, a10).
neighbour_zx(a10, a11).
neighbour_zx(a11, a12).
neighbour_zx(a12, a13).
neighbour_zx(a13, a14).
neighbour_zx(a14, a15).
neighbour_zx(b5, b1).
neighbour_zx(b8, b9).
neighbour_zx(b9, b10).
neighbour_zx(b10, b11).
neighbour_zx(b11, b12).
neighbour_zx(b12, b8).
neighbour_zx(c1, c2).
neighbour_zx(c2, c3).
neighbour_zx(c3, c4).
neighbour_zx(c4, c5).
neighbour_zx(c5, c6).
neighbour_zx(c6, c7).
neighbour_zx(c7, c8).
neighbour_zx(c8, c9).
neighbour_zx(c9, c10).
neighbour_zx(c10, c11).
neighbour_zx(c11, c12).
neighbour_xy(c18, c12).
neighbour_xy(c13, c18).
neighbour_xy(c19, c1).
neighbour_xy(c20, c1).
neighbour_xy(c21, c3).
neighbour_yz(a39, a41).
neighbour_yz(a40, a39).
neighbour_yz(a35, a40).
neighbour_yz(a25, a35).
neighbour_yz(a42, a25).
neighbour_yz(a24, a42).
neighbour_yz(b5, b6).
neighbour_yz(b6, b12).
neighbour_yz(b12, b13).
neighbour_yz(b13, b5).
neighbour_yz(b2, b7).
neighbour_yz(b7, b9).
neighbour_yz(b9, b14).
neighbour_yz(b14, b2).
neighbour_yz(c15, c8).
neighbour_yz(c16, c10).
neighbour_yz(c19, c14).
opposite(b11, b8).
opposite(c6, c12).
opposite(c2, c14).
opposite(c10, c14).
opposite(c15, c16).
opposite(c16, c17).
opposite(c17, c18).
opposite(c18, c19).
opposite(c19, c20).
opposite(c20, c21).
neighbour_zx(a15, a16).
neighbour_zx(a16, a17).
neighbour_zx(a17, a18).
neighbour_zx(a18, a19).
neighbour_zx(a19, a20).
neighbour_zx(a20, a21).
neighbour_zx(a21, a22).
neighbour_zx(a22, a23).
neighbour_zx(a23, a24).
neighbour_zx(a24, a1).
neighbour_zx(a25, a26).
neighbour_zx(a26, a27).
neighbour_zx(a27, a28).
neighbour_zx(a28, a29).
neighbour_zx(a29, a30).
neighbour_zx(a30, a31).
neighbour_zx(a31, a32).
neighbour_zx(a32, a33).
neighbour_zx(a33, a25).
neighbour_zx(b1, b2).
neighbour_zx(b2, b3).
neighbour_zx(b3, b4).
neighbour_zx(b4, b5).
same(a33, a23).
same(a36, a37).
same(a38, a37).
same(a39, a42).
same(b1, b8).
same(c6, c12).
same(c2, c14).
same(c10, c14).
same(c15, c16).
same(c16, c17).
neighbour_zx(c12, c13).
neighbour_zx(c13, c14).
neighbour_zx(c14, c1).
opposite(a11, a3).
opposite(a9, a3).
opposite(a31, a25).
opposite(a13, a1).
opposite(a15, a1).
opposite(a17, a1).
opposite(a19, a1).
opposite(a22, a1).
opposite(a23, a1).
opposite(a32, a22).
opposite(a33, a23).
opposite(a34, a37).
opposite(a36, a37).
opposite(a38, a37).
opposite(a39, a42).
opposite(b1, b8).
opposite(b3, b1).
opposite(b4, b1).
opposite(b10, b8).
same(c17, c18).
same(c18, c19).
same(c19, c20).
same(c20, c21).
[0051]
Next, the search procedure is entered.
First, as preparation, the examples received from the input unit 1 are put into the original example set holding unit 21 (step 101). This example set is copied into the example set holding unit 22 (step 102), and the node buffer 25 is emptied (step 103).
[0052]
Next, a node whose predicate set is the empty set is generated and placed at the head of the node buffer 25 (step 104). At this time, the node buffer 25 is in the state shown in FIG. 4. Since the predicate set of the node 26-0 is the empty set, it covers all examples. Since there are 75 examples here, f is 75.
[0053]
Subsequently, the expansion procedure of step 105 is entered. The first node 26-0 of the node buffer 25 is unexpanded, so it is expanded by the expansion procedure, and the obtained nodes, such as the node 26-1, are added to the node buffer 25. Because the predicate set of the node 26-0 is the empty set, each node obtained in this expansion has exactly one predicate specified in the mode declarations.
[0054]
Since a single expansion does not bring the number of nodes to the parameter (100) designated by the user, expansion is repeated. In this repetition, for example, adding one predicate to the node {important(X)} yields a node with c = 2 such as {important(X), one_side_loaded(X)}. When the number of nodes exceeds the parameter (100) designated by the user (YES in step 106), the expansion procedure is terminated. The state of the node buffer 25 at this time is shown in FIG. 5.
[0055]
Next, the output procedure of step 107 is entered. The node buffer 25 is examined from the beginning. Since f of the first node 26-1 in FIG. 5 is not 0, the output procedure is called with this node as the output node.
[0056]
The output procedure generates a rule from the node 26-1.
First, the predicate not_important(X) in the predicate set {not_important(X)} of the node 26-1 is placed on the right side. Next, the average of the numerical values in the examples covered by the node 26-1 is obtained. The node 26-1 covers 21 examples (mesh(a2, 1), mesh(a4, 1), ...), whose numerical parts are 1, 1, 1, ..., and their average is 1.048.
[0057]
Since the numerical information is specified to be an integer, the average is rounded off to the integer 1. As a result, the predicate placed on the left side is mesh(X, 1), and the output procedure generates the rule mesh(X, 1) ← not_important(X).
[0058]
Since the node 26-1 is then removed (step 107), the node buffer 25 is in the state shown in FIG. 6.
[0059]
Next, the 21 examples covered by the node 26-1 are removed from the example set holding unit 22 (step 108). Since the example set is not empty (NO in step 109), the process proceeds to step 110.
[0060]
Here, the node buffer 25 is examined, and the node where f is 0 is removed (step 110). Thereafter, the process returns to step 105.
[0061]
Next, step 105 is entered again.
Similarly, the node buffer 25 is examined in order from the top, nodes that have not been expanded are expanded by the expansion procedure, and the obtained nodes are added to the node buffer 25 (step 105). This process is repeated (NO in step 106), and when more than 100 nodes are added, this step is terminated (YES in step 106).
[0062]
Next, the output procedure is entered. The node buffer 25 is again examined from the beginning. Since f of the first node 26-2 in FIG. 6 is not 0, the output procedure is called with this node as the output node.
[0063]
By the same method as before, the output procedure generates mesh(X, 8) ← circuit(X) (step 107). The node 26-2 is then removed.
[0064]
Subsequently, the six examples covered by the node 26-2 are removed from the example set holding unit 22 (step 108). Since the example set is not empty (NO in step 109), processing continues in the same manner: the loop is repeated, outputting rules, until the example set becomes empty. When the example set becomes empty (YES in step 109), the procedure ends.
[0065]
The rules are output with priorities as follows. A rule listed higher has higher priority.
[0066]
mesh(X, 1) ← not_important(X).
mesh(X, 8) ← circuit(X).
mesh(X, 9) ← important_long(X).
mesh(X, 12) ← half_circuit(X).
:
[0067]
When these rules are used, derivation is attempted by applying them in descending order of priority. For example, to obtain the number of divisions of the edge a2, derivation of the goal ← mesh(a2, N) is tried with the rules in order from the top. Derivation succeeds with the first rule and yields the substitution N = 1, so the number of divisions of the edge a2 is obtained as 1. As described above, derivation with the subsequent rules is not performed.
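The priority-ordered use of the rules can be illustrated by the following Python sketch. The four rules are those listed above, and the facts are a small excerpt of the background knowledge given in this description. The names RULES, FACTS and predict are hypothetical, and the sketch handles only rules whose right side is a single one-argument predicate.

```python
# Rules in priority order: (value of N, predicate on the right side).
RULES = [
    (1, "not_important"),
    (8, "circuit"),
    (9, "important_long"),
    (12, "half_circuit"),
]

# Excerpt of the background knowledge: predicate name -> edges for which it holds.
FACTS = {
    "not_important": {"a2", "a4", "a5"},
    "circuit": {"c15", "c16", "c17"},
    "important_long": {"b1", "b8"},
    "half_circuit": {"a36", "a37"},
}


def predict(edge, rules=RULES, facts=FACTS):
    """Return N from the first rule whose right side holds for edge, else None."""
    for value, predicate in rules:
        if edge in facts.get(predicate, set()):
            return value  # stop here: lower-priority rules are not tried
    return None


print(predict("a2"))   # goal <- mesh(a2, N)
print(predict("c16"))  # goal <- mesh(c16, N)
```

Returning at the first successful rule is what guarantees a single numerical answer per edge.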
[0068]
In this embodiment, the variable in the target predicate that does not correspond to the numerical information is written X, but this is only for convenience and any variable symbol may be used. The present system also operates in the same manner when the target predicate has two or more variables that do not correspond to the numerical information. In that case, the target predicate is expressed as, for example, target(X, Y, N).
[0069]
[Effect of the invention]
As described above, according to the present invention, numerical information in the predicate to be learned, which has not conventionally been exploited, is treated as a numerical value, and the standard deviation of the numerical parts of the examples in the original example set is used. As a result, rules that explain the covered examples well can be output.
[0070]
In fact, the following comparative experiment shows that better rules (that is, rules with higher prediction accuracy on unknown examples) are obtained than with the prior art.
[0071]
In the comparative experiment, the system of the present invention was run on the example described in the embodiment. Progol was used as an ILP system that does not use numerical information, and was given the same examples, mode declarations, and background knowledge for comparison. In addition to the examples (positive examples), however, Progol was given the same number of negative examples. The negative examples were produced by randomly generating examples that differ from the positive examples (for example, mesh(a29,8), mesh(a19,7)). Progol can also generate rules from positive examples alone, but that capability was not used, for the following reason. The rules obtained by Progol have no priority order, so derivation may succeed with several rules for a single example, that is, several numerical answers may be given. If any one of them is correct, the unknown example counts as successfully predicted, which gives Progol an advantage over the present system. Including the same number of negative examples offsets this advantage.
[0072]
In the experiment, both systems were evaluated by leave-one-out cross-validation (one example is removed, the rest are used as training examples, the removed example is tested, and this is repeated for every example). As a result, the prediction accuracy was 80.0% for the system of the present invention and 73.3% for Progol, indicating the superiority of the present system. The default parameters were used for Progol.
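The leave-one-out protocol can be illustrated by the following Python sketch. The learner used here (always predict the most common training value) is only a stand-in that keeps the sketch runnable; it is neither the system of the present invention nor Progol, and all function names and the sample data are hypothetical.

```python
from collections import Counter


def leave_one_out_accuracy(examples, learn, predict):
    """examples: list of (edge, value); learn(train) -> model; predict(model, edge) -> value."""
    correct = 0
    for i, (edge, value) in enumerate(examples):
        train = examples[:i] + examples[i + 1:]  # hold out exactly one example
        model = learn(train)
        if predict(model, edge) == value:
            correct += 1
    return correct / len(examples)


def learn_majority(train):
    """Stand-in learner: remember the most common value in the training set."""
    return Counter(value for _, value in train).most_common(1)[0][0]


def predict_majority(model, edge):
    """Stand-in predictor: ignore the edge and return the remembered value."""
    return model


data = [("a2", 1), ("a4", 1), ("a5", 1), ("c15", 8)]
print(leave_one_out_accuracy(data, learn_majority, predict_majority))
```

Any learner and predictor with the same two signatures could be substituted for the stand-ins.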
[0073]
In general, when unknown examples are predicted by learning, the attainable prediction accuracy on a real-world dataset is usually at most about 90%, although this depends on the dataset. Taking this into account (the gap to that ceiling shrank from 16.7 points to 10 points, a reduction of about 40%), the 6.7-point difference in prediction accuracy can be regarded as a significant performance improvement.
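The figures in this comparison can be checked with the short Python sketch below. It assumes that the 75 examples were each tested once, as the leave-one-out description implies. On that assumption 80.0% and 73.3% correspond to 60 and 55 correct predictions; these counts are inferred, not stated in the description, and the names are hypothetical.

```python
TESTS = 75      # one test per example under leave-one-out (assumed)
CEILING = 90.0  # approximate upper limit of accuracy mentioned in the text


def accuracy(correct, tests=TESTS):
    """Prediction accuracy in percent, rounded to one decimal place."""
    return round(100 * correct / tests, 1)


system = accuracy(60)  # inferred count for the present system
progol = accuracy(55)  # inferred count for Progol

gap_progol = round(CEILING - progol, 1)
gap_system = round(CEILING - system, 1)
difference = round(system - progol, 1)

print(system, progol, gap_progol, gap_system, difference)
```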
[0074]
Moreover, although a prediction accuracy of 80.0% was obtained in the embodiment described above, a desk study has found that improving the algorithm and the evaluation function can raise the prediction accuracy to 84.0%. Implementing the present invention therefore has the effect of yielding a significant improvement in prediction accuracy.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a schematic configuration of the inductive inference system.
FIG. 2 is a diagram showing a node 26.
FIG. 3 is a flowchart showing the flow of the search procedure of the inductive reasoning unit 2.
FIG. 4 is a diagram (1) showing a state of the node buffer 25.
FIG. 5 is a diagram (2) showing a state of the node buffer 25.
FIG. 6 is a diagram (3) showing a state of the node buffer 25.
[Explanation of symbols]
1 Input unit
2 Inductive reasoning unit
3 Output unit
21 Original example set holding unit
22 Example set holding unit
23 Mode declaration holding unit
24 Background knowledge holding unit
25 Node buffer
26, 26-0 to 26-8 Nodes

Claims

An inductive inference system based on first-order predicate logic that, for a target concept including one piece of numerical information, receives as input examples that are samples of the target concept and outputs rules that explain the target concept, the system comprising:
storage means for storing the examples;
inference means for generating nodes based on the examples stored in the storage means, and calculating an evaluation value of each node by a predetermined evaluation function using the standard deviation of the numerical parts of the examples covered by the generated node; and
output means for outputting, in descending order of the evaluation values calculated by the inference means, rules each including the average value of the numerical parts of the examples covered by the corresponding node, and for deleting the examples corresponding to each output rule from the storage means,
wherein the order of output by the output means is the priority order of the rules.