JP3919863B2 - Database system

Info

Publication number: JP3919863B2
Application number: JP01757697A
Authority: JP
Inventors: 敏雄茂出木
Original assignee: Dai Nippon Printing Co Ltd
Current assignee: Dai Nippon Printing Co Ltd
Priority date: 1997-01-14
Filing date: 1997-01-14
Publication date: 2007-05-30
Anticipated expiration: 2017-01-14
Also published as: JPH10198704A

Description

【０００１】
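【Technical Field of the Invention】
The present invention relates to a database system, and in particular to a database system having a search function that narrows down candidates step by step by accepting a plurality of search conditions in sequence and retaining only the data that satisfy all of them.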
【０００２】
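【Prior Art】
In today's information society, database systems play an increasingly important role, and they have been built for information in every field. Building a high-quality database system requires high-quality data, but providing a high-quality search environment is even more important. However good the stored data, a database serves no purpose if the data a user wants cannot be found.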
【０００３】
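Searches in a database system are usually performed with keywords. If the administrator associates keywords with each item of data in advance, the system can present the matching data when a user enters a particular keyword. One very common keyword-based technique is to supply a plurality of keywords in sequence and keep as candidates only the data related to all of them, thereby narrowing down the candidates step by step.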
【０００４】
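Recently, as the spread of the Internet shows, accessing computers at remote sites has become relatively easy, and database systems are expanding worldwide over communication lines. It has therefore become possible to build a global database system by connecting local databases built in individual regions through communication lines. To provide a high-quality search environment in such a distributed database system, a search technique that uses a node set has been proposed. A node set consisting of a plurality of nodes is defined, and predetermined data are associated with each node. When a particular node of interest is designated in this node set, the data associated with that node can be provided. If a keyword is assigned to each node, the search is basically a keyword search. However, if the nodes are also related to one another by a predetermined link set, designating one keyword (node) makes it possible to find other keywords (nodes) related to it through the link set, so the scope of the search extends beyond the local database to the global database connected by the link set.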
【０００５】
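【Problem to Be Solved by the Invention】
In a conventional database system, the administrator basically configures the environment in advance, and general users perform searches within that environment. Because the administrator normally configures it with a typical user in mind, it does not necessarily meet the needs of each individual user. For example, which keywords are associated with a given item of data is left to the administrator's discretion, so the keywords are not necessarily appropriate for a particular user.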
【０００６】
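In the distributed database system described above in particular, which related nodes are extracted when a user searches for nodes related to a particular node depends on the links between nodes, that is, on how the link set is configured. Because the link set is configured by the system administrator, the links do not necessarily reflect the user's intentions, and a node the user needs may fail to be extracted as a related node simply because a link happened not to be defined. In particular, with a search technique that narrows down candidates step by step using a plurality of keywords, if one of the keywords used happens not to be associated with the target data, that data drops out of the candidates partway through and is never retrieved.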
【０００７】
Accordingly, an object of the present invention is to provide a database system capable of more flexible narrowing-down searches.
【０００８】
【Means for Solving the Problem】
【０００９】
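(1) A first aspect of the present invention is a database system having a narrowing-down search function, comprising:
data storage means for storing the data to be provided;
search means for repeatedly executing, based on a plurality of search conditions given in sequence by an operator, a search process that extracts, as a related set, the data matching a given search condition from the data stored in the data storage means;
candidate presentation means for obtaining, each time the search means executes a search process, a candidate set made up of certain data based on the result of that search process, and presenting information indicating the candidate set to the operator; and
data providing means for providing the data adopted by the operator from the candidate set,
wherein the candidate presentation means
allows a total of (n+1) status levels, from a lowest status S1 to a highest status S(n+1), to be defined for each item of data (where n is a preset natural number of 2 or more), sets the status of all data to an initial status Sj (where j is a natural number satisfying 1 < j ≤ n+1) at the start of a series of repeated search processes, and,
each time a search process is executed, performs a status transition that promotes by u levels the data extracted as the related set by that search process (where u is a predefined natural number satisfying 0 < u < n, which may be common to all data or differ for each item of data, and promotion is capped at the highest status S(n+1)) and demotes to the lowest status S1 the data not extracted as the related set (the lowest status S1 being the lower bound), and presents as the candidate set the data that have reached the highest status S(n+1).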
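The status transitions of the first aspect can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the class name `StatusLadder`, its methods, and the sample items `d1` to `d4` are hypothetical, and the statuses S1 to S(n+1) are represented by the integers 1 to n+1.

```python
from dataclasses import dataclass, field


@dataclass
class StatusLadder:
    """Hypothetical helper: status k (1..n+1) stands for S1..S(n+1)."""
    n: int          # statuses run from S1 to S(n+1); n >= 2
    initial: int    # initial status index j, with 1 < j <= n + 1
    status: dict = field(default_factory=dict)

    def reset(self, items):
        """Start a series of searches: every item gets the initial status."""
        self.status = {item: self.initial for item in items}

    def apply_search(self, related, u=1):
        """Apply one search result and return the new candidate set.

        Items in `related` are promoted by u levels (capped at S(n+1));
        all other items fall to S1. `u` may be an int or a per-item dict.
        """
        top = self.n + 1
        for item in self.status:
            if item in related:
                step = u[item] if isinstance(u, dict) else u
                self.status[item] = min(top, self.status[item] + step)
            else:
                self.status[item] = 1
        return sorted(i for i, s in self.status.items() if s == top)


ladder = StatusLadder(n=2, initial=3)
ladder.reset(["d1", "d2", "d3", "d4"])
print(ladder.apply_search({"d1", "d2", "d3"}))  # ['d1', 'd2', 'd3']
print(ladder.apply_search({"d2", "d3", "d4"}))  # ['d2', 'd3']
print(ladder.apply_search({"d3", "d4"}))        # ['d3', 'd4']
```

In this run, item `d4` misses the first condition and falls to S1, but it returns to the candidate set after matching the next two conditions, whereas a strict AND search would have dropped it for good.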
【００１０】
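(2) A second aspect of the present invention is a database system in which a node set consisting of a plurality of nodes is defined, predetermined data are associated with each node, and, when a particular node of interest is designated, the data associated with that node are provided, the system comprising:
data storage means for storing the data to be provided in association with the nodes;
link storage means for storing a link set consisting of links that indicate relations between nodes;
node-of-interest setting means for setting a particular node of interest based on an instruction from an operator;
search means for executing a search process that uses the link set to find related nodes that are related to the node of interest under a given condition, and extracts the collection of these related nodes as a related set;
candidate presentation means for presenting all or some of the nodes belonging to the related set to the operator as candidate nodes;
candidate adoption means for having the operator adopt a particular candidate node from among the candidate nodes presented by the candidate presentation means;
update means for updating the setting of the node-of-interest setting means so that the adopted node becomes the new node of interest; and
data providing means for extracting the data associated with the node of interest from the data storage means and providing them to the operator,
wherein a series of repeated search processes can be executed while the node of interest is updated one after another based on the operator's adoptions, and the candidate presentation means
allows a total of (n+1) status levels, from a lowest status S1 to a highest status S(n+1), to be defined for each node (where n is a preset natural number of 2 or more), sets the status of all nodes to an initial status Sj (where j is a natural number satisfying 1 < j ≤ n+1) at the start of a series of repeated search processes, and,
each time a search process is executed, performs a status transition that promotes by u levels the nodes extracted as the related set by that search process (where u is a predefined natural number satisfying 0 < u < n, which may be common to all nodes or differ for each node, and promotion is capped at the highest status S(n+1)) and demotes to the lowest status S1 the nodes not extracted as the related set (the lowest status S1 being the lower bound), and presents as the candidate set the nodes that have reached the highest status S(n+1).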
【００１１】
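(3) A third aspect of the present invention is the database system according to the second aspect, wherein
a node that has become the node of interest during a series of repeated search processes is moved to a special status called an excluded status, and no transition from the excluded status to any other status is made during that series of repeated search processes.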
【００１２】
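(4) A fourth aspect of the present invention is the database system according to the second aspect, further comprising
learning means which, when a node that has fallen to the lowest status S1 during a series of repeated search processes (a fallen node) is presented as a candidate node and is adopted, defines a new link between the fallen node and the node that was the node of interest at the time the fallen node fell to the lowest status S1, and adds this new link to the link set.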
【００１３】
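(5) A fifth aspect of the present invention is the database system according to the second aspect, wherein
the value of the promotion step u is determined independently for each related node, based on the degree of relatedness between the node of interest and that related node.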
【００１４】
【Embodiments of the Invention】
The present invention is described below based on the illustrated embodiments.
【００１５】
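§1. Basic concept of the database system according to this embodiment
To explain the basic concept of this embodiment, we first describe how data are accessed in the basic database system shown in FIG. 1. This system consists of a database 1 holding a large amount of data and a node/link set 2 for accessing it. The node/link set 2 is made up of many nodes and links; in the illustrated example, nine nodes N1 to N9 and seven links L1 to L7 are defined. Keywords K1 to K9 are defined for nodes N1 to N9 respectively, and each link between nodes ties the corresponding keywords together. In this sense, the node/link set 2 can also be called a "keyword network".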
【００１６】
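As illustrated, links are not defined between every pair of nodes but only between nodes that are related to each other (that is, nodes whose keywords are related). For example, link L1 between nodes N1 and N2 indicates that keywords K1 and K2 are related in some way. Likewise, link L3 between nodes N2 and N4 indicates that keywords K2 and K4 are related, and link L4 between nodes N4 and N5 indicates that keywords K4 and K5 are related. Through these links, nodes N1 and N5 are indirectly connected, so giving keyword K1 can yield keyword K5 as a related keyword. Each link is weighted, and the weight indicates the degree of relatedness between the nodes.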
【００１７】
どのノードにどのようなキーワードを定義し、どのノード間にどのようなリンクを定義するかは、第一義的には、このデータベースシステムの管理者に課せられた仕事であり、一般利用者がこのシステムを利用する時点において、既に、何らかのノード・リンク集合体２が構築されていることになる。もっとも、後述するように、このノード・リンク集合体２は、利用者の利用行為によって学習を行う機能を有しており、利用者が利用してゆくに従って、各リンクの重みづけが修正されたり、これまでリンクが定義されていなかったノード間に新たなリンクが定義されたりすることになる。したがって、たとえ管理者が何らリンクの定義を行わなかったとしても、利用者が利用するに従って、ノード・リンク集合体２は徐々に形成されてゆくことになる。
【００１８】
オペレータ３は、データベース１内の特定のデータを利用したいと考えた場合、まず、このノード・リンク集合体２に対してアクセスを行う。オペレータ３から見ると、このノード・リンク集合体２は、データベース１へアクセスするためのフロントエンドプロセッサとして機能していることになる。オペレータ３は、まず、ノード・リンク集合体２内の特定のノードを着目ノードとして指定する。ノードの指定は、キーワードの入力によって行うことができる。たとえば、オペレータ３が特定のキーワードＫ１をノード・リンク集合体２に対して与えたとすれば、このキーワードＫ１に対応するノードＮ１が着目ノードとして指定されることになる。本願の図では、説明の便宜上、その時点での着目ノードを、黒丸のノード点の周囲に円を描くことにより表示することにする。図１において、ノードＮ１の周囲に円が描かれているのは、このノードＮ１が現時点での着目ノードであることを示すものである。
【００１９】
ノード・リンク集合体２内の個々のノードには、それぞれデータベース１内の特定のデータが対応づけられている。別言すれば、１つのノードを特定すれば、このノードに対応づけられたデータをデータベース１から抽出してくることができる。したがって、オペレータ３が、現時点の着目ノードＮ１に関するデータを閲覧したい旨の指示を与えれば、ノードＮ１に対応づけられたデータが、データベース１から抽出され、オペレータ３に提供されることになる。ノードとデータベース１内の各データとの対応づけは、たとえば、個々のノードにデータベース１内の特定のアドレス情報をもたせておけばよい。着目ノードＮ１に関するデータ閲覧の指示が与えられたら、ノードＮ１のもつアドレス情報に基づいて、データベース１をアクセスして所定のデータを読出し、これをオペレータ３に提供すればよい。
【００２０】
このように、オペレータ３が、ノード・リンク集合体２に対してキーワードＫ１を与え、データ閲覧の指示を与えれば、データベース１からノードＮ１に対応づけられたデータが読み出され、オペレータ３に提供されることになる。しかしながら、このような検索処理は、所定のキーワードＫ１を入力したときに、予めこのキーワードＫ１に関連づけられていたデータを提示する、という従来の一般的なデータベースシステムにおける検索処理に過ぎない。もちろん、本実施形態に係るデータベースシステムでは、このような従来の一般的な検索処理を行うことも可能であるが、本実施形態の主眼は、オペレータが１つのキーワードをシステムに与えた場合に、このキーワードに関連する別なキーワードを検索できるようにし、より自由度の高い検索作業を実現することにある。
【００２１】
たとえば、オペレータが所定のキーワードＫ１をノード・リンク集合体２に与えると、図１に示されているように、ノードＮ１が着目ノードとして指定されることになる。上述したように、オペレータは、必要があれば、「着目ノードＮ１に対応づけられたデータの閲覧」を行う旨の指示を与えることが可能である。ただ、ここでは、オペレータが、このノードＮ１に対応づけられたデータには直接的には興味はないが、キーワードＫ１に関連した別なデータを探しているものとしよう。この場合、「ノードＮ１を着目ノードとした検索処理」を行う旨の指示をノード・リンク集合体２に対して与えればよい。ノード・リンク集合体２は、このような検索指示が与えられると、定義されているリンク集合体を参照しながら、着目ノードＮ１に関連する別なノードを検索する処理を実行する。たとえば、着目ノードＮ１に対して、リンクによって直接的あるいは間接的に連結されているノードすべてを関連ノードとして検索する処理を行った場合、図１の例の場合、着目ノードＮ１に対して、４つのノードＮ２，Ｎ３，Ｎ４，Ｎ５が関連ノードとして検索されることになる。ただ、実用上は、個々のリンクの重みを考慮して、ある程度以上の関連性をもったノードのみを関連ノードとして抽出するようにするのが好ましい。たとえば、図１の例では、リンクＬ４の重みが小さい場合には、着目ノードＮ１とノードＮ５との間の関連性は低いものと判断し、ノードＮ５は関連ノードから外されることになる。また、後の§４．３で詳述するように、１つのリンクで結合されたノード間の信号伝達をホップ数Ｈ＝１と定義し、所定のホップ数以下で連結されているノードのみを関連ノードとするという条件を課すのが好ましい。たとえば、ホップ数Ｈ＝２以下との条件を課した場合、図１の例では各リンクの重みにかかわらず、着目ノードＮ１に対してホップ数Ｈ＝３の位置にあるノードＮ５は関連ノードとしては抽出されない。
【００２２】
ここでは、ノードＮ１を着目ノードとした検索処理を行うことにより、３つのノードＮ２，Ｎ３，Ｎ４のみが抽出されたものと仮定しよう。図２は、このような検索処理の結果を示す図である。３つのノードＮ２，Ｎ３，Ｎ４が白抜きのノード点として描かれているのは、この３つのノードが関連ノードとして抽出されたノードであることを示すためである。以下、検索により抽出された関連ノードについては、白抜きのノード点で示すことにする。図に黒いノード点として示されているノードＮ５は、今回の検索では関連ノードとしては抽出されなかったことになる。
【００２３】
こうして、関連ノードＮ２，Ｎ３，Ｎ４が検索されたら、これらの「関連ノード」を「候補ノード」としてオペレータに提示する。具体的には、個々の候補ノードＮ２，Ｎ３，Ｎ４に定義された各キーワードＫ２，Ｋ３，Ｋ４をディスプレイなどに表示することにより、候補ノードの提示を行えばよい。ここで、「関連ノード」を「候補ノード」と呼ぶのは、これらのノードの中から、オペレータが新たな着目ノードの採択を行うからである。図２に示すように、現時点の着目ノードは、ノードＮ１であるが、オペレータは、候補ノードＮ２，Ｎ３，Ｎ４のうちのいずれかを、新たな着目ノードとして指定することになる。たとえば、オペレータが、候補ノードＮ４を新たな着目ノードとして採択したとすれば、図３に示すように、ノードＮ４が新たな着目ノードになり、その他のノードはいずれも通常のノードに戻る。採択すべき候補ノードの指定は、たとえば、ディスプレイ画面上に提示されているキーワードＫ２，Ｋ３，Ｋ４のうち、キーワードＫ４を選択する操作により行うことができる。
【００２４】
なお、後の§７で述べる「ＡＮＤ検索」における説明の便宜を考えて、本願明細書では、所定の着目ノードについての検索処理により、この着目ノードとある程度以上の関連性がある、と判断されて抽出されたノードを「関連ノード」と呼び、この「関連ノード」の中からオペレータに対して「新たな着目ノードの候補」として提示するノードを「候補ノード」と呼ぶことにする。後述する§７で述べる「ＡＮＤ検索」では、抽出された「関連ノード」のうちの一部のみが「候補ノード」として提示されることになるが、ここで述べる通常の検索では、抽出された「関連ノード」のすべてがそのまま「候補ノード」として提示されることになるので、とりあえずは、「関連ノード」＝「候補ノード」と考えておいてかまわない。したがって、§６までの説明においては、特に問題が生じない場合には、「関連ノード」と「候補ノード」とを同義語として取り扱うことにする。
【００２５】
以上の検索処理をオペレータ側の操作として見ると、次のようになる。まず、オペレータは、所定のキーワードＫ１を入力するとともに、検索指示を与える。すると、リンク集合体を参照した検索処理が行われ、ディスプレイ画面上に、キーワードＫ１にある程度の関連性をもった別なキーワードＫ２，Ｋ３，Ｋ４が表示される（候補ノードＮ２，Ｎ３，Ｎ４の提示）。オペレータは、これらの新たなキーワードの中から、自分の探しているデータに関連すると思われるキーワードを選択する（候補ノードの１つを新たな着目ノードとして採択）。これにより、ノードＮ４が新たな着目ノードになる。
【００２６】
このように、検索指示を与えることにより、オペレータは関連するノードを次々と渡り歩くことができ、キーワードネットワーク中を移動することができるようになる。しかも、必要があれば、現在の着目ノードに対応づけられたデータをいつでも閲覧することが可能になる。たとえば、図３に示すように、ノードＮ４が新たな着目ノードとなった時点で、このノードＮ４に対応づけられているデータを閲覧したい場合には、その時点で閲覧指示を与えればよい。すると、データベース１からノードＮ４に対応するデータが読み出され、オペレータに提示されることになる。また、「ノードＮ４を着目ノードとする検索処理」を行う旨の指示を与えれば、今度は、ノードＮ４に対してある程度の関連性を有する関連ノードが候補ノードとして抽出されることになる。たとえば、前回の「ノードＮ１を着目ノードとする検索処理」では関連性が低く、候補ノードからは漏れたノードＮ５が、今回の検索処理では候補ノードとして検索されることになる。
【００２７】
一般に、データベースシステムから必要とするデータを検索しようとする場合に、適切なキーワードが直ちに頭に思い浮かべばよいのであるが、必ずしも最適なキーワードが頭に思い浮かぶとは限らない。このような場合、本実施形態に係るシステムでは、非常に自由度の高い柔軟な検索が可能になる。すなわち、とりあえず頭に浮かんだキーワードＫ１を入力して検索指示を与えれば、これに関連したキーワードＫ２，Ｋ３，Ｋ４が候補としてディスプレイ上に自動的に提示されるので、オペレータは、検索対象となるデータをアクセスするのに、より適したキーワードを採択することができる。ここで、より適当なキーワードとしてＫ４（ノードＮ４）を採択し、再び検索指示を与えれば、たとえば、新たなキーワードＫ５が提示されることになる。ここで、このキーワードＫ５が、正に検索対象となるデータをアクセスするための最適のキーワードであったとすれば、このキーワードＫ５（ノードＮ５）を採択した上で、閲覧指示を与えれば、データベース１から目的のデータを得ることができる。
【００２８】
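§2. Learning in the node/link set
One feature of the database system according to this embodiment is that the node/link set 2 has a learning function. As noted above, the node/link set 2 shown in FIG. 1 is initially built by the administrator of the database system, but it gradually changes shape as users use it. A single database 1 shared by all users is therefore sufficient, but it is preferable to prepare a separate, independent node/link set 2 for each user. With a separate node/link set 2 per user, even if all of them start out as the same set built by the administrator, each user's set can evolve into a form that is easy for that user to use.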
【００２９】
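To change the node/link set 2 into an easy-to-use form in this way, learning in this embodiment follows this basic policy: when a search on a node of interest extracts a plurality of candidate nodes and the operator adopts one of them, the weights of the links on the path from the node of interest to the adopted node are increased. In the earlier example, three candidate nodes N2, N3, and N4 were found for the node of interest N1 as shown in FIG. 2, the operator adopted node N4, and node N4 became the new node of interest as shown in FIG. 3. In this case, the weights of links L1 and L3 on the path from N1 to N4 are increased. Links L1 and L3 are drawn with thick lines in FIG. 3 to show that this weight-increasing learning has taken place.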
【００３０】
学習によって、リンクの重みづけを逆に減少させることも可能である。たとえば、上述の例の場合、着目ノードＮ１に対して３つの候補ノードＮ２，Ｎ３，Ｎ４が提示されたにもかかわらず、オペレータはノードＮ４を採択し、ノードＮ３は採択から漏れたことになる。別言すれば、リンクＬ２はノードの採択に何ら関与しなかったことになる。このような場合、リンクＬ２の重みづけを減少させる修正を行うと、利用者の利用形態に沿った形での学習が可能になる。たとえば、この利用者が、「ノードＮ１を着目ノードとする検索を行い、候補ノードのうちのノードＮ４を採択した」という事象が５回行われたとしよう。この場合、５回の学習のいずれにおいても、リンクＬ１，Ｌ３の重みづけを増加させ、リンクＬ２の重みづけを減少させる、という修正が行われることになる。その結果、リンクＬ１，Ｌ３の重みは非常に大きくなり、逆に、リンクＬ２の重みは非常に小さくなる。したがって、たとえば、「ノードＮ１を着目ノードとする６回目の検索」が行われた時点では、リンクＬ２が示す関連性（ノードＮ２とノードＮ３との関連性）はかなり小さくなり、もはやノードＮ３は候補ノードとしては抽出されなくなる。
【００３１】
このように、重みづけを増加させる学習とともに、重みづけを減少させる学習を行うようにすれば、過去の履歴を見た限りでは今後も採択される可能性が低いノードは、将来の検索時には候補ノードとして抽出されないようにすることができ、候補ノードを絞り込むことができるようになる。全ノード数が膨大な数になる実際のデータベースシステムでは、このように、オペレータに提示する候補ノードの数をある程度絞り込むことが、使い勝手を良くするために重要である。
【００３２】
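To both increase and decrease link weights, learning can follow this rule. When a search on a given node of interest extracts several candidate nodes and one of them is adopted, all paths from the node of interest to the individual candidate nodes are treated as learning target paths. Among the links on these paths, the weights of the links on the path from the node of interest to the adopted node are increased, and the weights of the other links are decreased. In short, this adjustment increases the weights of the links on the path from the node of interest to the adopted node relative to the weights of the other links.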
【００３３】
このように、重みづけを増減修正して学習を行う場合、前述の例では、次のような重みづけの修正が行われることになる。すなわち、図２に示すように、着目ノードＮ１についての検索により、３つの候補ノードＮ２，Ｎ３，Ｎ４が抽出されたとすると、着目ノードＮ１から各候補ノードＮ２，Ｎ３，Ｎ４に至るパス上の全リンクが学習対象パス上のリンクになる。具体的には、リンクＬ１，Ｌ２，Ｌ３が学習対象になる。オペレータがノードＮ４を採択したとすると、着目ノードＮ１から採択ノードＮ４に至るパス上のリンクＬ１，Ｌ３の重みづけを増加するとともに、それ以外の学習対象パス上のリンクＬ２の重みづけを減少する修正が行われることになる。このとき、リンクＬ４は、ノードＮ５が候補ノードになっていないため、学習対象にはなっておらず、重みづけはもとのまま変わらない。
【００３４】
要するに、上述したリンクの重みづけ学習の概念は、折角候補ノードとしてオペレータに提示されていたにもかかわらず、採択されなかったノードについては、そのノードへ至るリンクの重みづけを減少させ、逆に、採択されたノードへ至るリンクの重みづけを増加させることにある。そして重要な点は、「複数の候補ノードの中からひとつを採択する」というオペレータ（利用者）の行為に基づいて、すべての学習が行われる点である。したがって、このデータベースシステムが多数の利用者によって利用されている場合、個々の利用者ごとに異なる態様で学習が進んでゆくことになる。前述したように、ノード・リンク集合体２は個々の利用者ごとに別個独立したものが用意されるので、利用すればするほど、各利用者ごとのノード・リンク集合体２は、その利用者による使い勝手に合わせた学習が進むことになる。
【００３５】
なお、ここではリンクの重みづけだけを学習の対象として説明したが、本実施形態では、ノードにも重みづけを定義し、ノードの重みづけも学習の対象としている。このノードの重みづけの取扱いについては後述する。
【００３６】
§3. Generation of new links
上述した§２では、リンクの重みづけについて学習が行われることを説明した。しかしながら、このような既存のリンクの重みづけを修正するだけでは、個々の利用者の使い勝手に合わせた柔軟な検索処理を実現することは困難である。たとえば、図１に例示したノード・リンク集合体２では、ノードＮ１に対して、ノードＮ２，Ｎ３，Ｎ４，Ｎ５がリンクによって結合されている。したがって、繰り返し検索処理を行ってゆけば、ノードＮ１からノードＮ２，Ｎ３，Ｎ４，Ｎ５へ到達することは可能である。実際、上述の例の場合、ノードＮ１を着目ノードとした第１回目の検索により候補ノードＮ２，Ｎ３，Ｎ４への到達が実現できており、続いて、ノードＮ４を採択し、この採択ノードＮ４を新たな着目ノードとした第２回目の検索を行えば、候補ノードＮ５へ到達することが可能である。
【００３７】
しかしながら、ノードＮ１〜Ｎ５と、ノードＮ６〜Ｎ９との間には、何らリンクが張られていないため、既存のリンクを利用して候補ノードの検索を行う限り、ノードＮ１〜Ｎ５から、ノードＮ６〜Ｎ９へ至る検索を行うことはできない。
【００３８】
既に述べたように、ノード・リンク集合体２を構築するのは、第一義的にはこのシステムの管理者である。したがって、ノードＮ１〜Ｎ５と、ノードＮ６〜Ｎ９との間に、何らリンクが定義されていなかったとしたら、キーワードＫ１〜Ｋ５と、キーワードＫ６〜Ｋ９との間には、何ら関連性はないとの判断が管理者によってなされていたことになる。しかしながら、このような管理者の判断は普遍的なものではなく、特定の利用者にとってみれば、たとえば、キーワードＫ４とキーワードＫ７とは、密接な関連性を有するとの認識がなされているかもしれない。このような場合、たとえば、図４に示すように、ノードＮ４とノードＮ７との間に破線で示すような一時的なリンクＬ８を発生させ、既存のリンクＬ１〜Ｌ７と、一時的に発生させたリンクＬ８との双方を用いた検索を行うことにより、より自由度の高い検索が可能になる。すなわち、この一時的なリンクＬ８を付加した状態において、ノードＮ４を着目ノードとする検索を行えば、たとえば、図４に白抜きのノード点として示したノードＮ１，Ｎ２，Ｎ３，Ｎ５，Ｎ６，Ｎ７，Ｎ８を候補ノードとして抽出することが可能になる。
【００３９】
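In this specification, the existing links described so far are called "static links", and links generated temporarily at search time are called "dynamic links". In the example of FIG. 4, links L1 to L7 drawn as solid lines are static links, and link L8 drawn as a broken line is a dynamic link. Generating dynamic links at search time makes it possible to search nodes that are not fully connected to the node of interest by existing static links: a dynamic link is temporarily defined across the unconnected portion, which extends the candidates to such unconnected nodes.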
【００４０】
いま、図４に示すように、ノードＮ４を着目ノードとする検索において、ダイナミックリンクＬ８を一時的に定義することにより（ダイナミックリンクの定義方法については後述する）、ノードＮ１，Ｎ２，Ｎ３，Ｎ５，Ｎ６，Ｎ７，Ｎ８が候補ノードとして抽出されたものとしよう。このとき、オペレータに対しては、キーワードＫ１，Ｋ２，Ｋ３，Ｋ５，Ｋ６，Ｋ７，Ｋ８が個々の候補ノードを示す情報として提示されることになる。そして、これらの候補ノードの中から、オペレータがノードＮ６を新たな着目ノードとして採択したとする。既に述べたように、オペレータは、この新たな着目ノードＮ６に対応づけられたデータの閲覧を行うこともできるし、このノードＮ６を着目ノードとした新たな検索を行うこともできる。このように、ダイナミックリンクの定義により、検索範囲の自由度はかなり広がることになる。
【００４１】
このように、一時的に定義したダイナミックリンクＬ８を用いた検索によって候補ノードＮ６が抽出され、この候補ノードＮ６が採択された場合（別言すれば、ダイナミックリンクＬ８が、着目ノードＮ４から採択ノードＮ６へ至るパス上のリンクになった場合）、このダイナミックリンクＬ８を、新たにスタティックリンクＬ８として追加する処理を行うようにする。すなわち、一時的なダイナミックリンクが、恒久的なスタティックリンクに昇格したことになる。図５は、採択ノードＮ６を新たな着目ノードにするとともに、ダイナミックリンクＬ８をスタティックリンクＬ８に昇格させた状態を示している。このとき、図に太線で示してあるように、着目ノードＮ４から採択ノードＮ６へ至るパス上のリンクＬ８，Ｌ５の重みづけは増加させられる（リンクＬ８の増加前の元の重みづけは、ダイナミックリンクＬ８を定義するときに決めておくようにする）。一方、学習対象となった他のリンク（着目ノードＮ４から各候補ノードへ至るパス上のリンクＬ１，Ｌ２，Ｌ３，Ｌ４，Ｌ６）の重みづけについては、減少させる学習が行われる。
【００４２】
結局、この例では、一時的に定義したダイナミックリンクＬ８が、スタティックリンクＬ８に昇格し、ノード・リンク集合体２の新たなメンバーとして追加されたことになる。もっとも、一時的に定義されたダイナミックリンクは、スタティックリンクに昇格することなしに消滅してしまうこともある。たとえば、図４に示すように、いくつかの候補ノードが提示された状態において、オペレータがノードＮ６を採択する代わりに、ノードＮ５を採択したような場合、ダイナミックリンクＬ８は着目ノードから採択ノードに至るパス上のリンクにはならなかったので、スタティックリンクに昇格することなしに消滅する。要するに、一時的に定義したダイナミックリンクは、オペレータの採択行為にパスとして関与した場合には、スタティックリンクとして残ることになるが、採択行為に関与しなかった場合には、そのまま消滅してしまうことになる。
【００４３】
このように、利用者が必要とするダイナミックリンクをスタティックリンクに昇格させてノード・リンク集合体２に追加する処理を行ってゆけば、データベースシステムの管理者から提供されなかったリンクが徐々に増えてゆくことになり、利用者にとって使い勝手の良いリンク集合体が形成されてゆくことになる。また、このようなリンクの追加手法を利用すれば、システムの管理者が、当初に全くリンクの定義を行わなかったとしても（すなわち、当初はスタティックリンクが全く存在しなかったとしても）、利用者がこのシステムを利用してゆく過程により、徐々にスタティックリンクが形成されてゆくことになる。したがって、ここで述べた新たなリンクの追加手法は、非常に有効な手法である。
【００４４】
ところで、これまでの説明では、ダイナミックリンクの定義方法については何ら触れていなかったが、図４に破線で示すようなダイナミックリンクＬ８を一時的に定義するためには、何らかの基準を設定しておく必要がある。図示の例では、ノードＮ４−Ｎ７間にダイナミックリンクＬ８を定義しているが、ノードＮ４−Ｎ６間、ノードＮ４−Ｎ８間、ノードＮ４−Ｎ９間にもダイナミックリンクを定義する余地はある。また、ノードＮ４−Ｎ３間や、ノードＮ４−Ｎ１間にもダイナミックリンクを定義する余地があり、スタティックリンクによって直接接続されていないノード間であれば、どのノード間にもダイナミックリンクを定義する余地はある。しかしながら、「リンク」というものが両ノード間の何らかの関連性を示すものである以上、何ら関連性をもたないノード間にダイナミックリンクを定義することは好ましくない。
【００４５】
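In this embodiment, therefore, for nodes not directly connected by a static link, the relatedness of the keywords defined for the two nodes is evaluated concretely at search time, and a dynamic link is defined between them if the evaluation result satisfies a given condition. In the example of FIG. 4, the relatedness of keyword K4 defined for node N4 and keyword K7 defined for node N7 was evaluated and satisfied the condition, so dynamic link L8 was defined between nodes N4 and N7.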
【００４６】
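One way to evaluate the relatedness of two keywords is to quantify the degree to which their character strings match. For example, the keywords 「高血圧」 (hypertension) and 「血圧値」 (blood pressure value) share the two characters 「血圧」 out of three, so the degree of match can be quantified as 2/3. Alternatively, a thesaurus dictionary that quantifies similarity can be prepared and used to evaluate the relatedness of two keywords. For example, with a thesaurus that defines the similarity between 「高血圧」 and "high pressure" as 100, the match between those two keywords can be evaluated quantitatively. A dynamic link is defined between the two nodes when this evaluation value is at or above a given threshold. The evaluation value can also be used directly as the weight of the dynamic link.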
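The character-match evaluation above can be sketched in a few lines. The text does not fix the exact formula, so this sketch assumes one plausible reading: the length of the longest shared substring divided by the length of the longer keyword. The function names and the threshold of 1/2 are hypothetical.

```python
from difflib import SequenceMatcher
from fractions import Fraction


def match_degree(k1: str, k2: str) -> Fraction:
    """Longest shared substring length over the longer keyword's length."""
    if not k1 or not k2:
        return Fraction(0)
    matcher = SequenceMatcher(None, k1, k2, autojunk=False)
    longest = matcher.find_longest_match(0, len(k1), 0, len(k2))
    return Fraction(longest.size, max(len(k1), len(k2)))


def may_define_dynamic_link(k1: str, k2: str, threshold=Fraction(1, 2)) -> bool:
    """True when the match degree reaches the (assumed) threshold."""
    return match_degree(k1, k2) >= threshold


print(match_degree("高血圧", "血圧値"))  # 2/3
```

Under this assumed formula, the example pair from the text yields 2/3, and the same value could be reused as the initial weight of the dynamic link.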
【００４７】
なお、ノード間の関連性評価をできるだけ合理的に行うようにするには、１つのノードについて複数の等価キーワードを定義しておくとよい。たとえば、図４に示す例では、ノードＮ４にはキーワードＫ４が定義され、ノードＮ７にはキーワードＫ７が定義されていると述べた。この場合、図６に示すように、キーワードＫ４を１つの代表キーワードＫ４０と複数の等価キーワードＫ４１〜Ｋ４５によって構成し、キーワードＫ７を１つの代表キーワードＫ７０と複数の等価キーワードＫ７１〜Ｋ７５によって構成しておき、いずれかのキーワード同士についての評価結果が一定の基準以上であった場合に、両ノード間にダイナミックリンクを定義するようにするとよい。図６に示す例では、等価キーワードＫ４２と等価キーワードＫ７４との関連性の評価結果が基準以上であったため、ノードＮ４−Ｎ７間にダイナミックリンクが定義されることになる。
【００４８】
この場合、個々の等価キーワードは、いずれも代表キーワードと等価なキーワードであり、代表キーワードの代わりに用いることができるキーワードである。たとえば、「高血圧」という代表キーワードに対して、「血圧が高い」，「血圧異常」，「高血圧症」といった等価キーワードを定義しておけば、いずれかの等価キーワードについての関連性が評価されればよいので、より合理的な評価結果を得ることができる。すなわち、本来は関連するノードであるにもかかわらず、キーワードの文字による表現形式が異なっていたために「関連性なし」との評価結果が出されるような不合理を解消することができる。
【００４９】
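§4. Application to a distributed database system
So far, the basic concept of the database system according to this embodiment, the learning method of the node/link set, and the method of generating new links have been explained with simple examples. This section gives a more concrete description of an embodiment applied to a distributed database system.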
【００５０】
＜4.1: Definition of class links in a distributed system＞
ここ数年、複数のコンピュータをネットワークで相互接続して利用する環境が一般化してきており、各コンピュータごとに構築されたデータベースを、別のコンピュータからも利用できるような分散型データベースシステムが普及してきている。このような分散型データベースシステムでは、個々のローカルなデータベースは「クラス」という概念で取り扱われる。ここでは、便宜上、図７に示すように、３つのクラスＡ，Ｂ，Ｃからなる非常に簡単な分散型データベースシステムを例にとり、以下の説明を行うことにする。
【００５１】
図７では、個々のクラスごとに円周が描かれており、この円周上にノードＮ１〜Ｎ７が示されている。ここで、各円周は個々のクラスのまとまりを示し、各円周上のノードは、その特定のクラスに所属するノードを示している。たとえば、ノードＮ１，Ｎ２は、クラスＢに所属するノードであり、ノードＮ３，Ｎ４は、クラスＡに所属するノードであり、ノードＮ５，Ｎ６，Ｎ７は、クラスＣに所属するノードである。通常、各クラスごとのデータベースは、それぞれ空間的に離れた場所に設けられ、相互に通信回線などで接続されることになる。図示の例においても、クラスＡ，Ｂ，Ｃは、それぞれ空間的に離れているものとする。なお、図示された円周は、各ノードの所属を示すためのものであって、ノード間のリンクを示すものではない。したがって、図７に示す状態では、各ノードＮ１〜Ｎ７間には、まだ何らリンクの定義はなされていない。また、図示の例では、合計７つのノードだけが示されているが、実際には、各クラスに多数のノードが存在する。
【００５２】
このような分散型データベースシステムにおいて、各ノード間にリンクの定義を行うために、本実施形態では、予め個々のクラス間に関連づけを定義するようにしている。ここでは、このクラス間の関連づけを「クラスリンク」と呼ぶことにする。これまで、§１〜§３で述べてきたリンク（スタティックリンクおよびダイナミックリンク）は、ノードとノードとの関連づけを示すノード間のリンク（一般に、インスタンスリンクと呼ばれているリンク）であるが、ここで定義するクラスリンクは、クラスとクラスとの関連づけを示すクラス間のリンクである。
【００５３】
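FIG. 8 shows an example of class link definitions. The thick straight lines and circles in the figure represent class links. Specifically, class link AB is defined between classes A and B, and class link AC is defined between classes A and C. Class link AA is also defined between class A and itself, and class link BB between class B and itself. The class links AB and AC, drawn as straight lines, indicate the degree of relatedness between different classes and are called "remote links" here. The class links AA and BB, drawn as circles, indicate the degree of relatedness of a class with itself and are called "local links" here.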
【００５４】
ここで、混乱を避けるために、本明細書において用いられている「リンク」に関する用語を整理しておくと、次のようになる。
【００５５】
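(1) Instance link (a link indicating the relatedness between two nodes; a generic term for the static links and dynamic links below)
(1)-①: Static link (a permanent link built into the link set; its weight is learned as users use the system. Where no confusion arises, this specification may simply write "link")
(1)-②: Dynamic link (a temporary link defined at search time when the keywords defined for two nodes are related; it is promoted to a static link if it is used in adopting a node, and disappears otherwise)
(2) Class link (a link indicating the relatedness between two classes; a generic term for the remote links and local links below. In the embodiment described here, a weight is defined for each class link, but the weight is not learned)
(2)-①: Remote link (a link indicating the relatedness between different classes; drawn as thick straight lines in FIG. 8)
(2)-②: Local link (a link indicating the relatedness of a class with itself; drawn as thick circles in FIG. 8).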
【００５６】
結局、図８に示す例では、クラスＡ−Ｂ間、クラスＡ−Ｃ間、クラスＡ−Ａ間、クラスＢ−Ｂ間には、それぞれクラスリンクが定義されているが、クラスＢ−Ｃ間、クラスＣ−Ｃ間には、クラスリンクは定義されていない。どのようなクラスリンクを定義するか、そして、各クラスリンクにどのような重みづけを定義するかは、このデータベースシステムの管理者の裁量によって決められる。ただ、実用上は、クラスリンクの定義を行うにあたっては、単に検索の便宜だけを考慮すればよいのではなく、個々のクラスに対応するデータベースの利用条件、利用契約の内容、利用料金、アクセス時間などを考慮しなければならない。したがって、データベースシステムの管理上の制約から、あえてクラスリンクを定義しなかったり、非常に小さな重みづけをもったクラスリンクを定義せざるを得ない場合もある。特に、医療症例のデータベースなどでは、患者のプライバシー保護の観点から、ごく限られたクラスリンクのみしか定義できない場合もあろう。
【００５７】
また、この実施形態の例では、個々のクラスリンクに対応させて、それぞれシソーラス辞書を用意するようにしている。たとえば、図８に示す例では、リモートリンクＡＢに対応させてシソーラス辞書Ｔａｂが用意され、リモートリンクＡＣに対応させてシソーラス辞書Ｔａｃが用意され、ローカルリンクＡＡに対応させてシソーラス辞書Ｔａａが用意され、ローカルリンクＢＢに対応させてシソーラス辞書Ｔｂｂが用意されている。これらのシソーラス辞書は、ダイナミックリンクを定義する際に利用されるが、その利用態様については後述する。ただ、このように、個々のクラスリンクごとに独自のシソーラス辞書を用意しておくことは非常に有意義である。たとえば、クラスＡが日本語のデータベースであり、クラスＢが英語のデータベースであり、クラスＣが仏語のデータベースであったような場合、シソーラス辞書Ｔａｂとしては「英和／和英シソーラス」を用い、シソーラス辞書Ｔａｃとしては「仏和／和仏シソーラス」を用いるようにすれば合理的である。
【００５８】
＜4.2: Definition of static links in a distributed system＞
データベースシステムの管理者は、図８に示すようなクラスリンクの定義を行った後、個々のノード間にスタティックリンクの定義を行う。すなわち、個々のノードに対応づけられたキーワードを参照し、相互に関連性のあるキーワードが対応づけられたノード間に、所定の重みづけをもったスタティックリンクを張る作業を行う。このスタティックリンクを定義する際には、上述したクラスリンクの条件に従うようにする。すなわち、クラスリンクが定義されているクラス間については、スタティックリンクを定義することができるが、クラスリンクが定義されていないクラス間については、原則として、スタティックリンクを定義できないことにする。図９は、図７に示された各ノードについて定義されたスタティックリンクの具体例を示す図である。たとえば、スタティックリンクＬ１は、ノードＮ１−Ｎ２間のリンクであるが、これはクラスＢに関してローカルリンクＢＢが定義されているために許可されたリンクである。同様に、スタティックリンクＬ２は、リモートリンクＡＢにより許可されたリンクであり、スタティックリンクＬ３は、ローカルリンクＡＡにより許可されたリンクであり、スタティックリンクＬ４は、リモートリンクＡＣにより許可されたリンクである。これに対して、クラスＢ−Ｃ間にはリモートリンクは定義されていないので、たとえば、ノードＮ１−Ｎ７間にはスタティックリンクは定義されていない。また、クラスＣについては、ローカルリンクが定義されていないので、クラスＣに所属するノード間相互には、本来、スタティックリンクは定義できないが、ここでは例外的に、管理者の意向により、ノードＮ６−Ｎ７間にスタティックリンクＬ５が定義されている。このように、本実施形態では、原則的にはクラスリンクによって示される条件に基づいてスタティックリンクを定義するのが好ましいが、管理者が特に例外的な措置が必要であると判断した場合には、原則に反して、適宜スタティックリンクの定義を行えるようにしてある。
【００５９】
＜4.3: Search processing in a distributed system＞
さて、ここでは、図９に示すように、３つのクラスＡ，Ｂ，Ｃからなる分散型データベースシステムにおいて、７個のノードＮ１〜Ｎ７と５個のスタティックリンクＬ１〜Ｌ５が定義されている場合について、検索処理および学習処理がどのように行われるかを具体的に説明する。
【００６０】
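First, suppose that node N1 is chosen as the initial node of interest, as shown in FIG. 9. The operator designates it by entering the keyword K1 corresponding to node N1. Once the node of interest is determined, a search process is executed for it. In this embodiment, the search for a particular node of interest is performed by propagating a signal with a given signal value from the node of interest to other nodes along the static links (or, as described later, dynamic links). For this purpose, a signal transfer coefficient representing the weight is defined for each static link (and dynamic link). Suppose that the coefficients are defined as shown in FIG. 10. The signal transfer coefficients in this embodiment are all expressed as percentages; in the illustrated example, they are link L1: 25%, link L2: 50%, link L3: 30%, link L4: 60%, and link L5: 80%. The coefficient of each static link is initially defined by the system administrator, but, as described later, it is modified by learning as users use the system.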
【００６１】
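The search process with node N1 as the node of interest extracts, as candidate nodes, other nodes that have at least a certain degree of relatedness to N1. To extract them, a signal with an initial value of 100 is propagated from N1 along the links, and each time the signal passes through a link, its value is multiplied by that link's signal transfer coefficient. For example, in FIG. 10, the transfer from node N1 to node N2 computes 100 × 25%, so the signal reaching N2 is attenuated to 25. Likewise, the transfer from N1 to N3 computes 100 × 50%, so the signal reaching N3 is attenuated to 50. The signal at N3 propagates on to N4; this transfer computes 50 × 30%, so the signal reaching N4 is attenuated to 15. When this signal of value 15 is passed from N4 to N5, 15 × 60% is computed, and the signal reaching N5 is attenuated to 9.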
【００６２】
図１０の下欄に示す図表には、着目ノードＮ１に信号値１００の信号を与えたときに、各ノードに伝達される信号の信号値が示されている。このような信号伝達の様子は、抵抗素子で連結された電子回路を電流が流れてゆくさまに似ている。すなわち、個々のリンクを所定の抵抗値をもった抵抗素子（信号伝達係数の小さなリンクほど抵抗値は大きい）と考え、信号値を電圧値と考えれば、信号の減衰は抵抗素子による電圧降下に相当するものになる。
【００６３】
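If a larger signal transfer coefficient is defined for a link with a larger weight (that is, a link whose two nodes are more closely related), a signal passing through such a link is attenuated less and arrives with a larger value. A node that receives a larger signal value can therefore be regarded as more closely related to the node of interest. In this embodiment, priorities are assigned to the nodes based on the signal values that reach them. The priorities ①, ②, and ③ in the table at the bottom of FIG. 10 were assigned in this way. Node N5 would rank ④, but in this example it is treated as "below the condition" and is not given a priority.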
【００６４】
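In this example, the lower limit for a valid signal value is set to 10, and nodes whose received signal value is 10 or less are excluded from consideration as "below the condition". Such nodes are treated as if no signal had reached them. In the example of FIG. 10, node N5 received a signal of value 9 but is treated as having received none, and even if other nodes were linked downstream of N5 (there are none in FIG. 10), no signal would be propagated beyond N5. As a result, N5 is treated the same as nodes N6 and N7, which received no signal at all.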
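The signal propagation of FIG. 10 can be reproduced with a short sketch. The function name `propagate` and the data layout are hypothetical. The text does not say how multiple paths to the same node are combined, so this sketch assumes that the strongest signal wins and that coefficients do not exceed 100%, which makes a best-first traversal valid.

```python
import heapq

# Static links of FIG. 10: (node, node) -> signal transfer coefficient in percent
FIG10_LINKS = {
    ("N1", "N2"): 25,  # L1
    ("N1", "N3"): 50,  # L2
    ("N3", "N4"): 30,  # L3
    ("N4", "N5"): 60,  # L4
    ("N6", "N7"): 80,  # L5
}


def propagate(links, focus, initial=100.0, floor=10.0):
    """Return (node, signal) pairs for nodes whose signal exceeds `floor`,
    strongest first. Signals of `floor` or less are treated as no signal
    and are not propagated further."""
    adjacent = {}
    for (a, b), pct in links.items():
        adjacent.setdefault(a, []).append((b, pct))
        adjacent.setdefault(b, []).append((a, pct))
    best = {}
    heap = [(-initial, focus)]          # max-heap via negated signal values
    while heap:
        negated, node = heapq.heappop(heap)
        if node in best:
            continue                    # already settled with a stronger signal
        best[node] = -negated
        for nxt, pct in adjacent.get(node, []):
            value = -negated * pct / 100
            if nxt not in best and value > floor:
                heapq.heappush(heap, (-value, nxt))
    del best[focus]
    return sorted(best.items(), key=lambda kv: -kv[1])


print(propagate(FIG10_LINKS, "N1"))  # [('N3', 50.0), ('N2', 25.0), ('N4', 15.0)]
```

The result matches the priorities described for FIG. 10: N3, N2, N4 in that order, with N5 (signal value 9) excluded together with the unreachable N6 and N7.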
【００６５】
かくして、ノードＮ１を着目ノードとする検索処理では、有効な信号値が得られたノードは、ノードＮ２，Ｎ３，Ｎ４だけとなり、これら３つのノードが関連ノードとして抽出され、候補ノードとしてオペレータに提示されることになる。図１０において、白抜きのノード点で示されているノードが、これら候補ノードである。ノードＮ５は、わずかな関連性を有しているものの、関連性が条件以下であるために候補としては抽出されなかったことになる。
【００６６】
オペレータは、提示された候補ノードＮ２，Ｎ３，Ｎ４の中から、新たな着目ノードを採択することになる。このとき、オペレータに対しては、優先順位に基づいて各候補ノードの提示を行うようにする。たとえば、図１０の例の場合、優先順位▲１▼，▲２▼，▲３▼に従って、候補ノードＮ３，Ｎ２，Ｎ４の順で提示が行われることになる（実際には、各ノードに対応づけられたキーワードが、優先順に従ってディスプレイに表示される。ディスプレイの一画面中に、全キーワードを表示しきれない場合には、優先順に画面を切り替えながら表示される）。このように、優先順位に基づいて候補ノードの提示を行うようにすれば、採択ノード（新たな着目ノード）を決定する際に、関連性の度合いを考慮することが可能になる。すなわち、オペレータは、優先的に表示されている候補ノードほど、関連の度合いが高いことを認識することができ、優先的に採択することが可能になる。
【００６７】
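In the above example, whether a node is extracted as a related node (candidate node) is decided by whether its received signal value satisfies the valid-signal condition. Besides such a signal-value condition, a condition based on the hop count H can also be set. Signal transfer between two nodes joined by a single link is defined as hop count H = 1, and signal propagation is stopped when H exceeds a given upper limit. In the example of FIG. 10, transfer to nodes N2 and N3, which are directly linked to the node of interest N1, corresponds to H = 1, transfer to node N4 corresponds to H = 2, and transfer to node N5 corresponds to H = 3. If the upper limit is set to H = 2, any transfer with a hop count of 3 or more is stopped. In FIG. 10, the signal is propagated as far as node N4 but not to node N5 downstream.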
【００６８】
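In practice, it is preferable to combine the signal-value condition and the hop-count condition with AND. That is, only nodes that can be reached from the node of interest within the given number of hops and whose received signal value is at or above the given level are extracted as candidate nodes. Concretely, the hop-count condition first limits the search range (the range over which signal transfer is computed), signal transfer is computed only for nodes connected within the given number of hops, and only nodes that finally receive a signal at or above the given value become candidate nodes. Restricting candidates in this way to nodes with at least a certain degree of relatedness also shortens search time, which is a benefit in system performance, and is important for the usability of the search function. A real database system contains an enormous number of nodes, so presenting even weakly related nodes as candidates would lengthen the search wait and produce too many candidates, making the system harder to use.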
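The AND of the two conditions can be sketched as a level-by-level propagation that stops after a fixed number of hops. The function name `search`, the link coefficients in `EXAMPLE_LINKS`, and the rule that the strongest signal per node wins are assumptions chosen for illustration.

```python
# Hypothetical coefficients in percent, chosen so that the hop limit matters
EXAMPLE_LINKS = {
    ("N1", "N2"): 5,
    ("N1", "N3"): 70,
    ("N3", "N4"): 50,
    ("N4", "N5"): 60,
}


def search(links, focus, max_hops=2, initial=100.0, floor=10.0):
    """Candidates reachable within `max_hops` links AND with signal > `floor`."""
    adjacent = {}
    for (a, b), pct in links.items():
        adjacent.setdefault(a, []).append((b, pct))
        adjacent.setdefault(b, []).append((a, pct))
    best = {focus: initial}
    frontier = {focus: initial}
    for _ in range(max_hops):               # one iteration per hop
        reached = {}
        for node, signal in frontier.items():
            for nxt, pct in adjacent.get(node, []):
                value = signal * pct / 100
                if value > floor and value > best.get(nxt, 0.0):
                    best[nxt] = value
                    reached[nxt] = value
        frontier = reached                  # only newly improved nodes continue
    best.pop(focus)
    return sorted(best.items(), key=lambda kv: -kv[1])


print(search(EXAMPLE_LINKS, "N1", max_hops=2))  # [('N3', 70.0), ('N4', 35.0)]
print(search(EXAMPLE_LINKS, "N1", max_hops=3))  # [('N3', 70.0), ('N4', 35.0), ('N5', 21.0)]
```

With these coefficients, node N5 would pass the signal-value condition (21 > 10) but is cut off by the hop limit H = 2, while node N2 is within one hop but fails the signal-value condition.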
【００６９】
＜4.4: Search processing that takes class link weights into account＞
上述した検索は、スタティックリンクの重みづけ（信号伝達係数）を考慮した検索であるが、更に、クラスリンクの重みづけを考慮した検索処理を行うことも可能である。図１１は、スタティックリンクの重みづけとクラスリンクの重みづけとの双方を考慮した検索処理の一例を示す図である。ここで、各スタティックリンクＬ１〜Ｌ５の下に示されたパーセント値は、各スタティックリンクの信号伝達係数であり、図１０に示した値と全く同じである。一方、各クラスリンクＡＡ，ＢＢ，ＡＢ，ＡＣの下に示されたパーセント値は、各クラスリンクの信号伝達係数である。
【００７０】
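When a signal is propagated from the node of interest, the product of the static link's signal transfer coefficient and the class link's signal transfer coefficient is used. As in the example of FIG. 10, consider propagating a signal with an initial value of 100 from the node of interest N1 along the links. In FIG. 11, the transfer from node N1 to node N2 computes 100 × 25% (link L1) × 20% (link BB), so the signal reaching N2 is attenuated to 5. Likewise, the transfer from N1 to N3 computes 100 × 50% (link L2) × 80% (link AB), so the signal reaching N3 is attenuated to 40. The signal at N3 propagates on to N4; this transfer computes 40 × 30% (link L3) × 90% (link AA), so the signal reaching N4 is attenuated to 10.8. When this signal of value 10.8 is passed from N4 to N5, 10.8 × 60% (link L4) × 10% (link AC) is computed, and the signal reaching N5 is attenuated to 0.648.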
【００７１】
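The table at the bottom of FIG. 11 shows the signal value reaching each node when a signal of value 100 is applied to the node of interest N1. Here too, the lower limit for a valid signal value is set to 10, and nodes receiving a signal value of 10 or less are excluded as "below the condition". As a result, only nodes N3 and N4 are extracted as candidate nodes, with priorities assigned in that order.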
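The FIG. 11 arithmetic can be checked with a small sketch that multiplies the static-link and class-link coefficients along a path. The coefficient values are those given in the text; the function name `signals_along` and the dictionary layout are hypothetical.

```python
NODE_CLASS = {"N1": "B", "N2": "B", "N3": "A", "N4": "A", "N5": "C"}

STATIC_PCT = {                       # static link coefficients in percent
    frozenset({"N1", "N2"}): 25,     # L1
    frozenset({"N1", "N3"}): 50,     # L2
    frozenset({"N3", "N4"}): 30,     # L3
    frozenset({"N4", "N5"}): 60,     # L4
}

CLASS_PCT = {                        # class link coefficients in percent
    frozenset({"A"}): 90,            # local link AA
    frozenset({"B"}): 20,            # local link BB
    frozenset({"A", "B"}): 80,       # remote link AB
    frozenset({"A", "C"}): 10,       # remote link AC
}


def signals_along(path, initial=100.0):
    """Signal value at each node after the first one along `path`."""
    signal, reached = initial, {}
    for a, b in zip(path, path[1:]):
        static = STATIC_PCT[frozenset({a, b})]
        cls = CLASS_PCT[frozenset({NODE_CLASS[a], NODE_CLASS[b]})]
        signal = signal * static * cls / 10000
        reached[b] = signal
    return reached


signals = {**signals_along(["N1", "N2"]),
           **signals_along(["N1", "N3", "N4", "N5"])}
candidates = sorted((n for n, s in signals.items() if s > 10),
                    key=lambda n: -signals[n])
print(candidates)  # ['N3', 'N4']
```

Because the class link coefficients enter as a plain multiplier, an administrator can suppress a whole class (here class C, via link AC at 10%) without touching any per-user static link weights.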
【００７２】
このように、スタティックリンクの重みづけとクラスリンクの重みづけとの双方を考慮した検索が行われるように構成しておけば、システムの管理者により、個々の検索処理の傾向を統括的に制御することが可能になる。たとえば、特定のクラスへアクセスするための通信回線が非常に混雑する傾向にある場合、このクラスに関するクラスリンクの重みづけを小さくするような修正を行えば、このクラスに所属するノードが候補として抽出されることを抑制することが可能になる。
【００７３】
既に述べたように、スタティックリンクの重みづけ要素は学習対象になり、個々の利用者ごとに異なる重みづけをもったリンク集合体が形成されることになる。これに対して、クラスリンクの重みづけを、システムの管理者によってのみ設定できるようにしておけば、個々の利用者の学習内容を尊重しつつ、システム全体としての検索処理を管理者によって統括管理することが可能になる。
【００７４】
＜4.5: Learning processing in a distributed system＞
続いて、分散型システムにおける具体的な学習処理について説明する。前述したように、クラスリンクの重みづけは、学習対象にならないため、ここでは、クラスリンクに一様に１００％の重みづけがなされている簡単な場合を考える。すなわち、図１１に示す例ではなく、図１０に示す例についての学習処理を考える。
【００７５】
いま、図１０に示すように、ノードＮ１を着目ノードとした検索を行った結果、３つのノードＮ２，Ｎ３，Ｎ４が候補ノードとして提示され、オペレータが、この３つの候補ノードのうちのノードＮ４を採択したものとしよう。これにより、ノードＮ４が新たな着目ノードになる。そして、このノードＮ４の採択行為により、学習が行われることになる。学習は、学習対象となるスタティックリンクの重みづけを修正することにより行われる。具体的には、スタティックリンクの信号伝達係数を増減する修正を行うことになる。
【００７６】
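The basic policy of the learning process is as follows. First, all paths from the node of interest to each candidate node are extracted as learning target paths. Then the signal transfer coefficients of the links on the path from the node of interest to the adopted node are increased relative to those of the other links. In the embodiment described here, the coefficients of the links on the path from the node of interest to the adopted node are increased, and the coefficients of the other links on learning target paths are decreased. Among all static links, therefore:
① for links on learning target paths that lie on the path from the node of interest to the adopted node, the signal transfer coefficient is increased;
② for links on learning target paths other than those in ①, the signal transfer coefficient is decreased;
③ for links not on any learning target path, no modification is made.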
【００７７】
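In this example, the percentage value of the signal transfer coefficient is increased by 20 for the links under ① and decreased by 20 for the links under ②. The learning target paths here are all paths from the node of interest N1 to the candidate nodes N2, N3, and N4 (drawn as open circles in FIG. 10); specifically, static links L1, L2, and L3 are the learning targets. Of these, links L2 and L3 on the path from N1 to the adopted node N4 are increased by 20, so their coefficients after learning are 50% + 20% = 70% and 30% + 20% = 50% respectively. The remaining learning-target link L1 is decreased by 20, so its coefficient after learning is 25% − 20% = 5%. Links L4 and L5, which are not on any learning target path, are not learned and keep their original coefficients. FIG. 12 shows the state after this learning process.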
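The update rule for link coefficients can be sketched as follows. The function name `learn` and its parameters are hypothetical; the step of 20 percentage points follows the example, while the clamping bounds of 1% and 150% are included only as illustrative defaults.

```python
def learn(coeff, target_links, adopted_path_links, step=20, lower=1, upper=150):
    """Return updated coefficients (percent) after one adoption.

    Links on the path to the adopted node gain `step`; other links on
    learning target paths lose `step`; all remaining links are unchanged.
    """
    updated = dict(coeff)
    for link in target_links:
        delta = step if link in adopted_path_links else -step
        updated[link] = max(lower, min(upper, updated[link] + delta))
    return updated


before = {"L1": 25, "L2": 50, "L3": 30, "L4": 60, "L5": 80}
after = learn(before,
              target_links={"L1", "L2", "L3"},      # paths to candidates N2, N3, N4
              adopted_path_links={"L2", "L3"})      # path N1 -> N3 -> N4
print(after)  # {'L1': 5, 'L2': 70, 'L3': 50, 'L4': 60, 'L5': 80}
```

Returning a new dictionary instead of mutating the input keeps the coefficients before and after learning available side by side, which is convenient when comparing search results.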
【００７８】
さて、このような学習により、検索処理にどのような変化が生じるかを見てみよう。図１３は、前述のような学習により、新たな信号伝達係数が定義された状態において、再び、ノードＮ１を着目ノードとして指定し、検索を行った場合の検索結果を示す図である。図１３に示す検索行為自体は、図１０に示す検索と全く同じであるが、各リンクの信号伝達係数が修正されているため、異なった検索結果が得られている。すなわち、やはり着目ノードＮ１から初期信号値１００をもった信号を、各リンクに沿って伝達させると、ノードＮ１からノードＮ２への信号伝達では、信号値１００×５％なる乗算が行われ、ノードＮ２に到達した信号の信号値は５（条件以下）に減衰することになる。一方、ノードＮ１からノードＮ３への信号伝達では、信号値１００×７０％なる乗算が行われ、ノードＮ３には信号値が７０の信号が得られる。更に、ノードＮ３からノードＮ４への伝達では、信号値７０×５０％なる乗算が行われ、ノードＮ４には信号値３５の信号が得られる。更に、この信号値３５の信号がノードＮ４からノードＮ５へ伝達される際に、信号値３５×６０％なる乗算が行われ、ノードＮ５には信号値２１の信号が得られる。
【００７９】
図１３の下欄に示す図表には、着目ノードＮ１に信号値１００の信号を与えたときに、各ノードに伝達される信号の信号値が示されている。この図表を、図１０の下欄に示す図表と比較すると、候補ノードとして抽出されるノードの組み合わせや優先順位に変化が生じていることがわかる。すなわち、学習前の図１０に示す検索では、ノードＮ２が候補ノードとなっていたのに対し、学習後の図１３に示す検索では、ノードＮ２の代わりにノードＮ５が候補ノードになっている。これは、学習により、ノードＮ１−Ｎ４間の関連の程度が大きくなったことを意味している。
【００８０】
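If the operator again adopts node N4 after the search shown in FIG. 13, the signal transfer coefficients of links L2 and L3 increase further, while the coefficient of link L4 decreases. Link L4 was not a learning target in the previous search, but it is one this time because node N5 has become a candidate node. However, since link L4 is not on the path from the node of interest N1 to the adopted node N4, its coefficient is decreased by the learning. Link L1 is not modified this time, because node N2 is no longer a candidate node and L1 is therefore not on any learning target path.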
【００８１】
このように、利用者が所定の着目ノードについての検索を行い、この検索によって提示された候補ノードの中から新たな着目ノードを採択するたびに、学習対象パス上のリンクについての学習が行われる。すなわち、着目ノードから採択ノードに至るパスとして利用されたリンクに対しては重みづけの増加修正が行われ、それ以外の学習対象パス上のリンクに対しては重みづけの減少修正が行われる。このようにして、利用頻度の高いパスについての重みづけを増加させる学習が行われると、個々の利用者にとって使い勝手のよいリンク集合体が構築されてゆくことになる。
【００８２】
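In this embodiment, upper and lower limits are set for the signal transfer coefficient, and no increase beyond the upper limit or decrease beyond the lower limit is made. For example, with an upper limit of 150% and a lower limit of 1%, increases stop at 150% and decreases stop at 1%. Alternatively, a static link whose coefficient would fall to 1% or below may be deleted.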
【００８３】
また、この実施形態では、リンクの重みづけに関して、方向性は定義していない。たとえば、リンクＬ２の重みづけを示す信号伝達係数は、ノードＮ１からノードＮ３へ信号が伝達する場合と、逆に、ノードＮ３からノードＮ１へ信号が伝達する場合と、の双方に共通して利用される。通常は、ノードＮ１から見たノードＮ３の関連性が高い場合、逆に、ノードＮ３から見たノードＮ１の関連性も高いのが普通である。したがって、リンクの重みに特に方向性を定義しなくても支障はない。しかしながら、第１のノードから見た第２のノードの関連性が、第２のノードから見た第１のノードの関連性と必ずしも同じにならない場合には、リンクの重みづけに方向性をもたせることもできる。この場合は、たとえば、リンクＬ２の重みづけとして、ノードＮ１からノードＮ３へ向かう方向についての信号伝達係数と、ノードＮ３からノードＮ１へ向かう方向についての信号伝達係数とを、それぞれ別個に定義すればよい。
【００８４】
＜4.6: Presenting candidate nodes with node weights taken into account＞
これまで、スタティックリンクの重みづけを修正する学習処理を説明してきたが、個々のノードにも重みづけを定義して学習処理の対象にすれば、個々の候補ノードをオペレータに提示する際に、各候補ノードの重みを考慮した優先順位で提示を行うことが可能になる。
【００８５】
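A concrete example follows. A frequency coefficient representing the weight is defined for every node, and in the initial state before any learning, the frequency coefficients of all nodes are set to 100%. The frequency coefficient has no effect on signal propagation, but when the node is extracted as a candidate node, it is used as a parameter for determining the priority. With such coefficients defined, the priorities of the candidate nodes in the search of FIG. 10 are determined as shown in the table of FIG. 14. That is, the priority of a node is based on the product of the signal value that reached the node and the node's frequency coefficient. In the example of FIG. 14, no learning has yet taken place, so all frequency coefficients are 100% and the product equals the original signal value.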
【００８６】
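Now suppose that the search of FIG. 10 presents the three candidate nodes N3, N2, and N4 to the operator in this order, and that the operator adopts node N4. As already described, the signal transfer coefficients of the links are modified by learning; when node weights are defined, they are modified by learning as well. Specifically:
① for the adopted node, the frequency coefficient is increased;
② for nodes on learning target paths other than the node in ①, the frequency coefficient is decreased;
③ for nodes not on any learning target path, no modification is made.
The frequency coefficient representing a node's weight is thus a parameter indicating how often the node has been adopted in the past.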
【００８７】
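Consider a concrete example in which the frequency coefficient of the adopted node is multiplied by 1.5 and the frequency coefficients of the other nodes on learning target paths are multiplied by 0.7. If candidate node N4 is adopted in the search of FIG. 10, the frequency coefficient of the adopted node N4 becomes 100% × 1.5 = 150%, and the frequency coefficients of the other nodes on learning target paths, N2 and N3, become 100% × 0.7 = 70%. Nodes N5, N6, and N7 are not on any learning target path, so their frequency coefficients are not modified.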
【００８８】
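Adopting candidate node N4 triggers learning for the nodes and, as described earlier, learning for the links, after which the signal transfer coefficients of the links take the values shown at the top of FIG. 13. If node N1 is searched again after this learning, the frequency coefficients defined for the nodes have no effect on signal propagation, so the signal values obtained at nodes N1 to N5 are those shown at the bottom of FIG. 13. However, the priority with which the candidate nodes N3, N4, and N5 are presented to the operator is determined by the product of the signal value and the frequency coefficient. These products are shown in the table of FIG. 15, which is how the priorities of the candidate nodes are determined for the search of FIG. 13.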
【００８９】
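Comparing the priorities in FIG. 15 with those at the bottom of FIG. 13 shows that priorities ① and ② have been swapped. Taking node weights into account raises the priority of node N4, the adopted node, above that of node N3, which is used merely as a pass-through node, and thus further improves usability for the user.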
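The effect of the frequency coefficients on the presentation order can be reproduced with a short sketch. The function names are hypothetical, and the multipliers 1.5 and 0.7 are written as percentages (150 and 70) so that the arithmetic stays exact; the signal values 70, 35, and 21 are those obtained for N3, N4, and N5 after link learning.

```python
def update_frequency(freq, adopted, target_nodes, up_pct=150, down_pct=70):
    """Scale frequency coefficients (percent) of nodes on learning target paths."""
    updated = dict(freq)
    for node in target_nodes:
        factor = up_pct if node == adopted else down_pct
        updated[node] = updated[node] * factor / 100
    return updated


def rank(signals, freq):
    """Order candidates by signal value times frequency coefficient."""
    scores = {node: s * freq[node] / 100 for node, s in signals.items()}
    return sorted(scores.items(), key=lambda kv: -kv[1])


freq = {f"N{i}": 100 for i in range(1, 8)}            # initial state: all 100%
freq = update_frequency(freq, adopted="N4", target_nodes={"N2", "N3", "N4"})

signals = {"N3": 70.0, "N4": 35.0, "N5": 21.0}        # signal values after link learning
print(rank(signals, freq))  # [('N4', 52.5), ('N3', 49.0), ('N5', 21.0)]
```

By signal value alone the order would be N3, N4, N5; the products 49, 52.5, and 21 put the previously adopted node N4 first.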
【００９０】
＜4.7: Search processing using dynamic links＞
分散型データベースシステムについて、これまで説明を行ってきた検索例は、既存のスタティックリンクを利用した検索であった。ここでは、§３で説明したダイナミックリンクを利用した検索例を述べることにする。
【００９１】
たとえば、図１０に示す例において、ノードＮ１−Ｎ３間のスタティックリンクＬ２が定義されていなかった場合を考える。スタティックリンクのみを利用した検索では、この状態で、ノードＮ１を着目ノードとする検索を行った場合、候補ノードとして、ノードＮ２だけが検索されることになる。既に述べたように、スタティックリンクは、第一義的にはシステムの管理者によって設定されたリンクであり、必ずしも個々の利用者にとって有益なノード間の関連づけがなされているとは限らない。そこで、次のような手法を用いて、キーワード間に関連性のある特定のノード間に、一時的なダイナミックリンクを定義し、検索を行うようにする。
【００９２】
ダイナミックリンクを定義する基本的な手法は、§３において述べたとおりである。ここでは、図１６に示すように、既存のリンクが存在しないために不連結部分を構成しているノードＮ１−Ｎ３間に、ダイナミックリンクＬ２を発生させるための具体的な条件について説明する。不連結部分のノードＮ１−Ｎ３間をダイナミックリンクで連結すべきか否かの判断は、両ノードＮ１，Ｎ３に定義されているキーワードＫ１，Ｋ３の関連性の評価結果に基づいてなされる。すなわち、この評価結果が所定の条件を満足していれば、この不連結部分は、ダイナミックリンクＬ２によって一時的に連結されることになる。具体的には、たとえば、各キーワードＫ１，Ｋ３として、それぞれいくつかの等価キーワードが定義されていた場合には、キーワードＫ１を構成するいずれかの等価キーワードが、キーワードＫ３を構成するいずれかの等価キーワードと一致すれば、条件を満足する評価結果が得られたと判断してよい。もちろん、キーワードの完全一致を必須条件にする必要はなく、たとえば、「高血圧」と「血圧値」のように、文字列の２／３が一致している場合には、条件を満足するという判断を行うようにしてもかまわない。
【００９３】
本実施形態では、２つのキーワードの関連性の評価を行う上で、クラスリンクとともに用意されたシソーラス辞書を用いるようにしている。図８に示すように、リモートリンクＡＢについては、シソーラス辞書Ｔａｂが用意されている。したがって、ノードＮ１−Ｎ３間の関連性を評価する際には、このシソーラス辞書Ｔａｂを参照した評価を行うようにする。たとえば、シソーラス辞書Ｔａｂ内に、「高血圧」，「high pressure 」，「血圧異常」，「動脈硬化」という単語がいずれも類義語として定義されていたとすれば、クラスＡ内のノードとクラスＢ内のノードとの関連性を評価する上では、これらの類義語は同一のキーワードと見做した取り扱いができる。
【００９４】
結局、図１６に示されている不連結部分のノードＮ１−Ｎ３間にダイナミックリンクＬ２が定義されるためには、ノードＮ１のキーワードＫ１と、ノードＮ３のキーワードＫ３との関連性を、シソーラス辞書Ｔａｂを参照した上で評価し、評価結果が所定の条件を満足している必要がある。
【００９５】
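Note that in Fig. 16 the pair N1-N3 is not the only unconnected portion involving node N1. The only node joined to N1 by an existing static link is N2, so the pairs N1-N4, N1-N5, N1-N6, and N1-N7 are also unconnected portions involving N1. In this embodiment, however, the creation of dynamic links is restricted by the condition that "a dynamic link can be defined only between classes for which a class link is defined." In the example above, a dynamic link may therefore be defined between N1 and N3 and between N1 and N4, provided the keyword evaluation satisfies the condition (because class link AB exists between classes A and B), whereas a dynamic link may not be defined between N1 and N7, as shown in Fig. 17 (because no class link exists between classes B and C). For the same reason, no dynamic link may be defined for the pairs N1-N5, N1-N6, N2-N5, N2-N6, N2-N7, N5-N6, or N5-N7.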
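The class-link restriction above amounts to a simple membership test. The sketch below assumes the class membership implied by the text (N1 and N2 in class B, N3 and N4 in class A, N5 to N7 in class C) and the class links AB, AC, AA, and BB mentioned for Fig. 8; the names `CLASS_OF`, `CLASS_LINKS`, and `dynamic_link_allowed` are hypothetical.

```python
# Class membership as implied by the text (an assumption for this sketch).
CLASS_OF = {"N1": "B", "N2": "B", "N3": "A", "N4": "A",
            "N5": "C", "N6": "C", "N7": "C"}

# Remote links AB and AC, local links AA and BB. A local link is stored
# as a one-element set because both endpoints are the same class.
CLASS_LINKS = {frozenset("AB"), frozenset("AC"), frozenset("A"), frozenset("B")}


def dynamic_link_allowed(a, b):
    """True if a class link exists between the classes of nodes a and b."""
    return frozenset((CLASS_OF[a], CLASS_OF[b])) in CLASS_LINKS


print(dynamic_link_allowed("N1", "N3"), dynamic_link_allowed("N1", "N7"))
```

Passing this test is only a precondition; the keyword relevance evaluation must still satisfy its own condition before a dynamic link is actually defined.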
【００９６】
検索時に一時的に定義したダイナミックリンクには、所定の重みづけ、すなわち信号伝達係数が与えられる。したがって、図１６に示すダイナミックリンクＬ２についても、何らかの信号伝達係数が与えられ、信号伝達機能に関しては、他のスタティックリンクと全く同様の機能を果たすことになる。定義したダイナミックリンクに与える信号伝達係数は、たとえば、「一律５０％にする」というように定めておいてもよいが、キーワードの関連性の評価値に応じた係数を与えるようにしてもよい。たとえば、図１６に示すダイナミックリンクＬ２の信号伝達係数は、「キーワードＫ１とＫ３とが完全一致の場合には１００％とし、キーワードを構成する文字列の２／３が一致していたような場合には６６％とする」というように定めてもよい。また、等価キーワードを利用する場合は、個々の等価キーワードにそれぞれ重みづけを定めておき、一致した等価キーワードの重みづけに応じて、信号伝達係数の値を定めるようにしてもよい。
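One possible way to turn the keyword evaluation into a signal transfer coefficient is sketched below. It is an assumption, not the patented method: the match ratio is taken as the length of the longest common run of characters divided by the longer keyword length, and thesaurus synonyms are treated as an exact match. The names `longest_common_run`, `dynamic_coefficient`, and `THESAURUS_AB` are hypothetical.

```python
# Synonym groups for remote link AB, using words from the text's example.
THESAURUS_AB = [{"高血圧", "血圧異常", "動脈硬化"}]


def longest_common_run(a, b):
    """Length of the longest contiguous run of characters shared by a and b."""
    best = 0
    for i in range(len(a)):
        for j in range(len(b)):
            k = 0
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            best = max(best, k)
    return best


def dynamic_coefficient(k1, k3, thesaurus=(), threshold=2 / 3):
    """Return a coefficient for a dynamic link, or None if no link is defined."""
    if k1 == k3 or any(k1 in group and k3 in group for group in thesaurus):
        return 1.0                      # exact match or synonyms: 100%
    ratio = longest_common_run(k1, k3) / max(len(k1), len(k3))
    return ratio if ratio >= threshold else None


print(dynamic_coefficient("高血圧", "血圧値"))   # two of three characters match
```

For 「高血圧」 and 「血圧値」 the ratio is 2/3, which corresponds to the 66% example in the paragraph above.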
【００９７】
このようにして一時的に発生させたダイナミックリンクの学習時の取り扱いは、既に§３で述べたとおりである。すなわち、そのダイナミックリンクが、着目ノードから採択ノードへ至るパスとして利用された場合には、スタティックリンクに昇格し、かつ、信号伝達係数を増加させる修正が行われる。これに対し、着目ノードから採択ノードへ至るパスとして利用されなかった場合には、そのまま消滅させる。たとえば、図１６に示すように、ノードＮ１を着目ノードとする検索時に、ダイナミックリンクＬ２が定義された場合を考える。この場合、たとえば候補ノードであるノードＮ４が採択された場合には、ダイナミックリンクＬ２はスタティックリンクＬ２として残ることになり、信号伝達係数も増加修正される。ところが、別な候補ノードであるノードＮ２が採択された場合には、そのまま消滅してしまうことになる。
【００９８】
結局、上述した特徴をもったダイナミックリンクを、検索時に一時的に定義することにより、検索対象となるノードの範囲を広げることができるようになる。また、利用者の採択行為に関与したダイナミックリンクについては、これをスタティックリンクとして残すようにすることにより、個々の利用者にとって利用価値のあるリンクを新たに追加することができるようになる。
【００９９】
§5. Operating Procedure of the Database System of This Embodiment
続いて、本実施形態に係るデータベースシステムの動作手順を流れ図に基づいて説明する。
【０１００】
＜5.1: Procedure for Using the Database System of This Embodiment＞
図１８は、このデータベースシステムの利用手順を、利用者の操作を中心にして示した流れ図である。利用者は、まずステップＳ１において、初期着目ノードを決定する。初期着目ノードの決定方法は、どのような方法を採ってもかまわない。たとえば、何らかのキーワードを入力し、この入力したキーワードと同一あるいは関連性のあるキーワードが定義されているノードを候補として表示させ（実際には、キーワードを表示する）、表示されたキーワードの中から、オペレータによって特定のノードを採択させれば、初期着目ノードを決定することができる。予めクラスを指定し、このクラス内のノードの中から初期着目ノードを選択させるような方法を採れば、候補をより絞り込んだ初期着目ノードの決定が可能である。
【０１０１】
続くステップＳ２では、この着目ノードに対応づけられたデータを利用するか否かが判断される。オペレータが、現在の着目ノードについて、データを利用（閲覧）したいと考えた場合には、その旨の指示を与えればよい。オペレータからデータ利用の指示が与えられると、ステップＳ２からステップＳ３へと進み、着目ノードに対応するデータが提供されることになる。すなわち、図１に示すシステムでは、ノード・リンク集合体２内の着目ノードに定義されたキーワードに基づいて、データベース１内から対応するデータが抽出され、オペレータ３に提示される。必要があれば、繰り返してデータ利用が可能である。
【０１０２】
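In the following step S4, it is decided whether to run a search on this focus node. If the operator does not run a search, the procedure for using the system ends for the time being. If the operator later accesses the database system again to look into something else, the procedure simply starts again from step S1. When the operator wants to search on the current focus node, the operator gives an instruction to that effect. On receiving the search instruction, the process moves from step S4 to step S5 and the search processing is performed. The detailed procedure of the search processing in step S5 is described later with reference to the flowchart of Fig. 19. The search extracts related nodes judged to have a certain degree of relevance to the focus node, and these related nodes are presented as candidate nodes.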
【０１０３】
候補ノードの提示は、実際には、各候補ノードに定義されたキーワードをディスプレイ上に表示することによって行われる。オペレータは、このキーワードの表示を見ながら、探し求めているデータに最も関連があると思われるキーワードを採択する。すなわち、ステップＳ６における候補ノードの採択処理が行われる。続いて、オペレータの採択行為に基づいて、ステップＳ７において学習処理が行われる。このステップＳ７の学習処理の詳細な手順は、図２０の流れ図を参照しながら後に述べることにする。
【０１０４】
こうして学習処理が完了すると、ステップＳ８において、採択ノードを新たな着目ノードとする更新処理が行われ、ステップＳ２からの手順へ戻ることになる。
【０１０５】
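In short, by repeating the procedure of the flowchart in Fig. 18, the operator can run search after search and move the focus node from one node to another. Whenever needed, the operator can give an instruction in step S2 to use the data (view it, print it, and so on) and thereby use the data associated with the current focus node. What characterizes this search method is that it searches for related keywords rather than for the data directly. As already noted, presenting a particular node to the operator actually means displaying the keyword defined for that node. From the operator's point of view, moving the focus node from node to node is therefore an operation of moving from keyword to keyword. The method is effective when it is unclear which keyword should be used to find the data ultimately being sought. In other words, the operator supplies a keyword that seems relevant for the time being as the keyword that determines the initial focus node in step S1, then repeatedly runs the search processing of step S5 and, in step S6, adopts the most relevant-looking keyword among the newly presented ones (adoption of a node).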
【０１０６】
具体的には、たとえば、ある特定の患者の症例に類似した過去の症例データを探す場合を考えてみよう。いま、オペレータが、ある特定の病院の症例データベースをクラス指定して、「高血圧」というキーワードを入力し、検索を行ったとする。その結果、候補ノードを示すキーワードとして、「血圧異常」、「若年性高血圧」、「老年性高血圧」、「高血圧性網膜症」、…などが候補として提示されたとしよう（システム内部の処理としては、「高血圧」というキーワードが定義されたノードが着目ノードとなり、「血圧異常」、「若年性高血圧」、「老年性高血圧」…などのキーワードが定義されたノードが候補ノードとして提示されたことになる）。ここで、たとえば、当該患者が老人であったため、「老年性高血圧」というキーワードを採択したとする（システム内部の処理としては、「老年性高血圧」なるキーワードが定義された候補ノードが採択され、新たな着目ノードになる）。オペレータは、この時点で必要があれば、「老年性高血圧」というキーワードに対応するデータを閲覧することができる（着目ノードに対応するデータの提供）。
【０１０７】
ここでは、この「老年性高血圧」というキーワードに基づいて更に検索を行った結果、候補ノードを示すキーワードとして、「動脈硬化」、「眼底出血」、「腎機能障害」、「心不全」、…などが候補として提示されたものとしよう。このとき、当該患者の腎機能に異常が見られたのなら、「腎機能障害」というキーワードを採択することができ、必要があれば、このキーワードに対応するデータ（たとえば、腎機能障害をもつ患者の症例データ）を閲覧することができる。
【０１０８】
このように、本実施形態に係るデータベースシステムでは、とりあえず何らかのキーワードを与えて検索を行うと、与えられたキーワードに関連した別なキーワードが候補としてシステム側から提示されることになるので、オペレータ側で適切なキーワードが思い浮かばないような場合であっても、より柔軟な検索を行うことが可能になる。しかも、システム自体が学習機能を有しているため、利用すればする程、採択される可能性の高いキーワードが優先的に提示されるようになるので、使い勝手は益々向上するようになる。
【０１０９】
なお、キーワードとしては、いわゆる文字列だけでなく、画像を用いることも可能である。たとえば、あるノードに対応づけられたデータを、簡単なアイコンで表現するようにし、このアイコンをそのノードのキーワードとして定義しておけば、検索結果をディスプレイ上に表示する際に、キーワードとしてのアイコンを並べ、オペレータに提示することができる。この場合、オペレータの採択行為は、アイコンをマウスポインタなどでクリックする簡単な操作で行うことができる。
【０１１０】
なお、図１８に示す流れ図では、ステップＳ６においてノードの採択が行われた後、無条件でステップＳ７の学習処理が行われることになっているが、このようにステップＳ７において無条件で学習処理を行った場合、不本意な学習が行われることもありうる。たとえば、オペレータがステップＳ６においてノード採択を行ったが、この採択行為が不本意なものであったため、この採択ノードについてはデータ利用も新たな検索処理も行わなかったような場合を考えよう。このような場合、採択行為自体がオペレータの意図に反するものであり、このような不本意な採択行為によりステップＳ７の学習処理が行われてしまうことは好ましくない。このような弊害に対処するためには、ステップＳ６における採択行為があった後、直ちに学習処理を行わずに、その採択ノードを新たな着目ノードとするデータ利用（ステップＳ３）あるいは検索処理（ステップＳ５）が行われた時点で、はじめて学習処理を行うようにするとよい。
【０１１１】
＜5.2: Search Processing Procedure＞
図１９は、図１８に示す流れ図におけるステップＳ５の検索処理の詳細な手順を示す流れ図である。ここでは、図１６に示すようなスタティックリンクが定義されている具体的な例について、この検索処理の手順を説明する。まず、ステップＳ１１において、ホップ数Ｈを初期値１に設定する。続いて、ステップＳ１２において、起点ノードを１つ抽出する。最初は、着目ノードＮ１が起点ノードとして抽出される。次のステップＳ１３では、この起点ノードＮ１に対する対象ノードが１つ抽出される。理論的には、このステップＳ１３において、起点ノード以外のすべてのノードが対象ノードとして順番に抽出されることになる。図１６に示す例の場合、起点ノードＮ１以外のすべてのノードＮ２〜Ｎ７が対象ノードとして順に抽出されることになる。ステップＳ１３では、１つの対象ノードのみが抽出されるので、ここでは、番号順に従って、ノードＮ２が対象ノードとして抽出されたものとしよう。
【０１１２】
こうして、起点ノードＮ１および対象ノードＮ２が抽出されたら、ステップＳ１４〜Ｓ１７において、この両ノード間について種々の判断がなされる。まず、ステップＳ１４において、両ノード間にクラスリンクがあるか否かが判断される。もし、両ノード間にクラスリンクがなければ、ステップＳ１９へと進むことになる。図１６に示す例では、ノードＮ１−Ｎ２間には（正確に言えば、ノードＮ１が所属するクラスＢと、ノードＮ２が所属するクラスＢとの間には）、クラスリンク（ローカルリンク）ＢＢが存在するので、ステップＳ１５へと進むことになる。ステップＳ１５では、両ノード間にスタティックリンクがあるか否かが判断される。図１６に示す例では、ノードＮ１−Ｎ２間には、スタティックリンクＬ１が存在するので、ステップＳ１５からステップＳ１９へと進むことになる。
【０１１３】
続いて、ステップＳ１９からステップＳ１３へと戻り、今度は、対象ノードＮ３が抽出され、起点ノードＮ１と対象ノードＮ３とに関して、種々の判断がなされることになる。まず、ステップＳ１４では、ノードＮ１−Ｎ３間には（正確に言えば、ノードＮ１が所属するクラスＢと、ノードＮ３が所属するクラスＡとの間には）、クラスリンク（リモートリンク）ＡＢが存在するので、ステップＳ１５へと進むことになる。ところが、ノードＮ１−Ｎ３間にはスタティックリンクは存在しないので、更にステップＳ１６へと進み、両ノード間にはまだダイナミックリンクも存在しないので、更にステップＳ１７へと進み、両ノードのキーワードの関連性が評価される。ここで、ノードＮ１に定義されたキーワードＫ１と、ノードＮ３に定義されたキーワードＫ３との関連性の評価結果が、所定の条件を満足していたとすると、ステップＳ１８へと進み、ノードＮ１−Ｎ３間に、図１６に破線で示されているダイナミックリンクＬ２が定義されることになる。
【０１１４】
結局、ステップＳ１４〜Ｓ１７の判断処理は、両ノード間にダイナミックリンクを定義すべきか否かを判断するための処理になっている。ダイナミックリンクを定義するための条件の１つは、既に述べたように、「両ノードが所属するクラス間にクラスリンクが存在すること」であり、この条件を満たさない場合は、ステップＳ１４からステップＳ１９へジャンプすることになる。また、ダイナミックリンクは、既存のスタティックリンクが存在しない不連結部分のノードに対して定義されるものであるから、両ノード間に既にスタティックリンクが存在する場合には、ステップＳ１５からステップＳ１９へジャンプすることになる。同様に、既にダイナミックリンクが定義されていた場合には（たとえば、後の手順において、ノードＮ３を起点ノード、ノードＮ１を対象ノードとする処理の場合、既に両ノード間には、ダイナミックリンクＬ２が定義されていることになる）、ステップＳ１６からステップＳ１９へジャンプすることになる。かくして、ステップＳ１７において、両ノードのキーワードが関連性の条件を満たす場合には、両ノード間にはスタティックリンクもダイナミックリンクもまだ定義されていないので、ステップＳ１８において、ダイナミックリンクの定義を行うことになる。
【０１１５】
なお、図１９に示す流れ図では、ステップＳ１３において対象ノードを１つずつ抽出しては、この抽出した対象ノードに対してクラスリンクが存在するか否かをステップＳ１４で判断しているが、実際の演算処理を行う上では、まず、対象となるクラスを１つ抽出し、この抽出した対象クラスに対してクラスリンクが存在するか否かを先に判断し、クラスリンクが存在する場合には、当該対象クラスの中から対象ノードを１つずつ抽出してステップＳ１５以下の処理へと進むようにし、クラスリンクが存在しない場合には、当該クラスの中からは対象ノードの抽出を一切行わないようにすれば、効率的な処理が可能になる。
【０１１６】
こうして、最初の起点ノードＮ１に関して、全対象ノードＮ２〜Ｎ７を抽出した処理が完了したら、ステップＳ１９からステップＳ２０へと進み、全起点ノードが抽出されたか否かが判断される。「起点ノード」とは、信号伝達の起点となるノードを意味し、当初の起点ノードは着目ノードたるノードＮ１のみであり、ステップＳ２０では、全起点ノードが抽出済みと判断され、ステップＳ２１へと進むことになる。
【０１１７】
ステップＳ２１では、起点ノードから１ホップ分信号を伝達させ、信号値が所定レベル以上のノードを新たな起点ノードとする処理が行われる。図１６に示す例において、起点ノードＮ１から１ホップ分の信号伝達を行えば、既存のスタティックリンクＬ１を介してノードＮ１からノードＮ２への信号伝達と、定義されたばかりのダイナミックリンクＬ２を介してノードＮ１からノードＮ３への信号伝達と、の２通りの信号伝達が行われることになる。既に述べたように、起点ノードの信号の信号値は、リンクのもつ信号伝達係数によって減衰することになる。そこで、信号伝達の行われたノードのうち、信号値が所定レベル以上（§４で述べた例では、信号値１０以上）のノードだけを新たな起点ノードとする。また、信号値が１０未満になってしまったノードがあった場合、そのノードへの信号伝達はなかったものとして取り扱うようにする。
【０１１８】
続くステップＳ２２では、ホップ数Ｈが上限値Ｈmax になったか否かが判断される。たとえば、Ｈmax ＝２に設定してあったとすると、現時点ではＨ＝１であり、ステップＳ２２からステップＳ２３へと進むことになり、ホップ数Ｈを１だけ増加させ、ステップＳ１２からの処理が繰り返し行われることになる。
【０１１９】
図１６に示す例の場合、初代の起点ノードＮ１から１ホップ分の信号伝達を受けたノードＮ２，Ｎ３が、二代目の起点ノードになる。ステップＳ１２では、この２つの起点ノードのうち、起点ノードＮ２が抽出されたものとする。続くステップＳ１３では、起点ノードＮ２に対して、対象ノードを１つ定め、ステップＳ１４〜Ｓ１７の判断処理で、ダイナミックリンクが定義できるか否かが判断されることになる。ここで、起点ノードＮ２との間には、いずれの対象ノードについてもダイナミックリンクは定義されなかったとする（図示の例では、リモートリンクＡＢの存在により、ノードＮ２−Ｎ４間には、ダイナミックリンクが定義される可能性はあるが、ここでは、キーワードＫ２−Ｋ４間には十分な関連性がなかったとする）。すると、ステップＳ２０からステップＳ１２へと戻り、今度は起点ノードＮ３が抽出され、同様にダイナミックリンクの定義可能性が判断されるが、やはり新たなダイナミックリンクの定義は行われなかったとしよう。
【０１２０】
かくして、全起点ノードＮ２，Ｎ３の抽出が完了したので、ステップＳ２０からステップＳ２１へと進むことになる。ステップＳ２１では、２つの起点ノードＮ２，Ｎ３から１ホップ分信号を伝達させ（既に信号伝達に用いられたリンクへは、再度の信号伝達は行わないので、ノードＮ２やＮ３からノードＮ１へ戻るような信号伝達は行われない）、信号値が所定レベル以上のノードを新たな起点ノードとする処理が行われる。図１６に示す例では、起点ノードＮ２からは、もはや信号伝達は行われない。一方、もうひとつの起点ノードＮ３から１ホップ分の信号伝達を行えば、既存のスタティックリンクＬ３を介してノードＮ３からノードＮ４への信号伝達が行われることになる。ここで、ノードＮ４に到達した信号の信号値が所定のレベル以上であれば、ノードＮ４が新たな起点ノードになる。
【０１２１】
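In the following step S22, it is again decided whether the hop count H has reached the upper limit Hmax. Since Hmax = 2 was set here, H has reached the limit and the process moves to step S24. In step S24, all nodes on the paths (the routes along which the signal flowed) are extracted as related nodes. In the example of Fig. 16, the signal reached nodes N2, N3, and N4 along links L1, L2, and L3, so N2, N3, and N4 are extracted as related nodes. As already described, these related nodes are presented to the operator as candidate nodes. Even where a signal arrives, if its value is below the predetermined level the node is treated as not having received a signal, and so it is not extracted as a related node.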
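The signal propagation part of the search procedure (steps S21 to S24) can be sketched as below. Several things here are assumptions made for illustration: the coefficients of links L1 to L3, the initial signal value of 100, the rule that a node already reached is not overwritten, and the function name `search`. The dynamic-link decision of steps S14 to S18 is left out; dynamic link L2 is assumed to have been defined already.

```python
def search(links, focus, initial=100.0, threshold=10.0, h_max=2):
    """Propagate a signal from `focus` and return {related node: signal value}.

    links maps frozenset({a, b}) to a signal transfer coefficient.
    """
    signal = {focus: initial}
    used = set()                   # links that have already carried a signal
    origins = [focus]
    for _hop in range(h_max):      # steps S21 to S23
        reached = []
        for origin in origins:
            for link, coeff in links.items():
                if origin not in link or link in used:
                    continue
                (target,) = link - {origin}
                value = signal[origin] * coeff
                if value < threshold or target in signal:
                    continue       # too weak, or already reached (assumption)
                used.add(link)
                signal[target] = value
                reached.append(target)
        origins = reached          # new origin nodes for the next hop
    del signal[focus]
    return signal                  # step S24: nodes on the signal paths


LINKS = {
    frozenset({"N1", "N2"}): 0.50,   # static link L1 (hypothetical value)
    frozenset({"N1", "N3"}): 0.66,   # dynamic link L2 (hypothetical value)
    frozenset({"N3", "N4"}): 0.80,   # static link L3 (hypothetical value)
}
related = search(LINKS, "N1")
print(related)
```

With Hmax = 2 the related nodes are N2, N3, and N4, as in the Fig. 16 walkthrough; with `h_max=1` only N2 and N3 would be returned.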
【０１２２】
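＜5.3: Learning Processing Procedure＞
Fig. 20 is a flowchart showing the detailed procedure of the learning processing of step S7 in the flowchart of Fig. 18. First, in step S31, all paths along which the signal flowed during the search processing are extracted as learning-target paths. (Even where a signal arrives, if its value is below the predetermined level the node is treated as not having received a signal, so the path to that node does not become a learning-target path.) In the Fig. 16 example above, when a search with node N1 as the focus node returns candidate nodes N2, N3, and N4, links L1, L2, and L3 are extracted as learning-target paths.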
【０１２３】
続いて、ステップＳ３２において、学習対象パスのうち、着目ノードから採択ノードに至るパス上のリンクについては、
▲１▼スタティックリンクの場合には、信号伝達係数を増加させ、
▲２▼ダイナミックリンクの場合には、信号伝達係数を増加させるとともに、このダイナミックリンクをスタティックリンクに昇格させる、
という処理が行われる。上述した図１６に示す例において、候補ノードＮ４が採択された場合には、着目ノードＮ１から採択ノードＮ４に至るパス上のリンクのうち、ダイナミックリンクＬ２については、信号伝達係数を増加させるとともに、スタティックリンクへ昇格させる学習が行われ、スタティックリンクＬ３については、信号伝達係数を増加させる学習が行われることになる。
【０１２４】
一方、ステップＳ３３において、学習対象パスのうち、上記以外のパス上のリンクについては、
▲１▼スタティックリンクの場合には、信号伝達係数を減少させ、
▲２▼ダイナミックリンクの場合には、リンク自体を消滅させる、
という処理が行われる。上述した図１６に示す例の場合、スタティックリンクＬ１について、信号伝達係数を減少させる学習が行われることになる。
【０１２５】
最後に、ステップＳ３４において、ノードの頻度係数に関する学習が行われる。すなわち、学習対象パス上の全ノードのうち、
▲１▼採択ノードについては、頻度係数を増加させ、
▲２▼それ以外のノードについては、頻度係数を減少させる、
という処理が行われる。上述した図１６に示す例の場合、採択ノードＮ４についての頻度係数を増加させる学習が行われ、それ以外の候補ノードＮ２，Ｎ３についての頻度係数を減少させる学習が行われることになる。
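The link-related part of the learning procedure (steps S32 and S33) might be sketched as follows. The adjustment factors 1.2 and 0.8, the initial coefficients, and the name `learn_links` are assumptions chosen for illustration; the node-side update of step S34 follows the same increase and decrease pattern and is omitted here.

```python
def learn_links(links, target_links, adopted_path, up=1.2, down=0.8):
    """Return the link table after one adoption.

    links: {name: {"kind": "static" or "dynamic", "coeff": float}}
    target_links: names of links on the learning-target paths (step S31)
    adopted_path: names of links on the path from focus to adopted node
    """
    result = {name: dict(link) for name, link in links.items()}
    for name in target_links:
        link = result[name]
        if name in adopted_path:            # step S32
            link["coeff"] *= up
            link["kind"] = "static"         # a dynamic link is promoted
        elif link["kind"] == "static":      # step S33, case (1)
            link["coeff"] *= down
        else:                               # step S33, case (2)
            del result[name]                # unused dynamic link disappears
    return result


LINKS = {
    "L1": {"kind": "static", "coeff": 0.50},    # N1-N2
    "L2": {"kind": "dynamic", "coeff": 0.66},   # N1-N3
    "L3": {"kind": "static", "coeff": 0.80},    # N3-N4
}
adopt_n4 = learn_links(LINKS, ["L1", "L2", "L3"], adopted_path={"L2", "L3"})
adopt_n2 = learn_links(LINKS, ["L1", "L2", "L3"], adopted_path={"L1"})
print(adopt_n4)
print(adopt_n2)
```

When N4 is adopted, L2 survives as a static link; when N2 is adopted instead, L2 is removed, which mirrors the two outcomes described for Fig. 16.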
【０１２６】
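§6. Specific Configuration of the Database System of This Embodiment
＜6.1: Basic System＞
Fig. 21 is a block diagram showing the specific configuration of a database system according to one embodiment of the present invention. In practice the essence of the database system of this embodiment is realized in software, but for convenience the system is described here as a set of functional elements. Each block in Fig. 21 represents one functional element of the system and is in practice built in software. The blocks in Fig. 21 therefore do not correspond directly to specific hardware components of the database system. For example, the hardware (storage device) that stores the keywords defined for the nodes, the display device that presents information to the operator, and the input device through which the operator enters instructions are not drawn as separate functional elements, but as hardware components they are naturally part of the system.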
【０１２７】
このデータベースシステムでは、複数のノードからなるノード集合体が定義され、データは個々のノードに対応づけて格納される。データ格納手段１０は、提供対象となるデータを、個々のノードに対応づけて格納する記憶手段である。データ提供手段２０は、データ格納手段１０から、所定の着目ノードに対応づけられたデータを抽出し、これをオペレータ１００に提供する機能を有する。着目ノードは、着目ノード設定手段３０によって設定される。すなわち、オペレータ１００が、着目ノード設定手段３０に対して特定のノードを着目ノードとして設定する指示を与えると、指定されたノードが着目ノードとして設定され、オペレータ１００に対しては着目ノードがいずれのノードであるかを示すノード情報（実際には、着目ノードに定義されたキーワード）が提示される。オペレータ１００が現時点での着目ノードに対応づけられたデータを利用したいと考えた場合は、データ提供手段２０に対してデータの提供要求を行えばよい。データ提供手段２０は、この提供要求に応じて、着目ノード設定手段３０に設定されている現時点の着目ノードに対応づけられたデータをデータ格納手段１０から抽出し、これをオペレータ１００に対して提供データとして提供する（たとえば、ディスプレイ上に表示する）。
【０１２８】
検索手段４０は、オペレータからの検索要求に応じて、着目ノード設定手段３０に設定されている現時点での着目ノードに関連する候補ノードを、リンク格納手段５０内に格納されているリンク集合体を参照して検索する機能を有する。リンク格納手段５０内には、ノードの集合およびノード間の関連の程度を示すリンク（スタティックリンク）の集合からなるノード・リンク集合体が格納されている。オペレータ１００が、現時点での着目ノードについて検索要求を出すと、検索手段４０は、リンク格納手段５０内のリンク集合体を利用して、着目ノード設定手段３０に設定されている着目ノードに対して、所定の条件下で関連する関連ノードを検索する処理を行い、この関連ノードの集合体を関連集合として抽出する。この検索処理の具体的な手法は、§４．３あるいは§５．２において既に述べたとおりである。また、検索手段４０は、リンク格納手段５０内のスタティックリンクとは別に、ダイナミックリンクを発生させる機能を有し、この際に、シソーラス辞書を参照してノード間の関連性を評価する機能を有する（§４．７参照）。
【０１２９】
検索手段４０による検索処理で、関連ノードの集合体が抽出されると、候補提示手段６０は、この関連集合に所属するノードの全部もしくは一部を候補ノードとしてオペレータ１００に提示する。これまで述べてきた例では、関連集合に所属する関連ノードの全部がそのまま候補ノードとしてオペレータ１００に提示されてきたが、必要に応じて、この候補提示手段６０においてふるい分けを行い、特定の条件を満足する一部の関連ノードのみを候補ノードとしてオペレータ１００に提示することもできる。このようなふるい分けの具体例は、§７の「ＡＮＤ検索」において述べることにする。候補提示手段６０によるオペレータ１００への候補提示は、たとえば、各候補ノードに定義されているキーワードをディスプレイ上に表示することにより行うことができる。
【０１３０】
候補採択手段７０は、候補提示手段６０によって提示されている候補ノードの中から、オペレータ１００に特定の候補ノードを採択させる機能を有する。オペレータ１００が特定の候補ノードを採択する旨の採択指示を与えると、候補採択手段７０は、指定された候補ノードを採択ノードとして更新手段８０および学習手段９０に伝達する。更新手段８０は、採択ノードが新たな着目ノードとなるように、着目ノード設定手段３０の設定を更新する処理を行う。なお、本明細書で述べる実施形態では、採択ノードは１つだけに限定しているが、複数の候補ノードを採択することを認めることも可能である（この場合、複数の採択ノードがそれぞれ新たな着目ノードとなる）。一方、学習手段９０は、着目ノードから各関連ノードへ至るパス上のリンクに対して、関連の程度についての修正を行う。すなわち、リンク格納手段５０内のリンク集合体に対して、リンクの重みづけを修正する処理を行う。
【０１３１】
かくして、オペレータ１００は、このシステムを利用する場合、まず、着目ノード設定手段３０に対して初期着目ノードの設定指示を与えて、初期着目ノードの設定を行い、以後、必要に応じて、データ提供手段２０に対するデータの提供要求もしくは検索手段４０に対する検索要求を行い、候補提示手段６０によって候補ノードの提示があった場合には、候補採択手段７０に対して採択指示を与えることにより、提示された候補の中から新たな着目ノードを決定する処理を行うことになる。オペレータ１００が新たな着目ノードの採択を行うたびに、リンク格納手段５０内に格納されているリンク集合体が修正され、旧着目ノードから新着目ノードへ至るパス上のリンクの重みづけが増加される。
【０１３２】
なお、リンク格納手段５０内には、個々のオペレータ（利用者）ごとに別個のリンク集合体が用意され、学習手段９０は、候補ノードの採択を行ったオペレータに対応するリンク集合体に対して修正を行うことになる。したがって、学習は個々のオペレータごとに別個独立して行われることになり、このシステムを利用すればするほど、各オペレータにとっての使い勝手は向上してゆくことになる。
【０１３３】
＜6.2: Definition and Modification of Signal Transfer Coefficients＞
既に説明したように、リンク格納手段５０に格納されているリンク集合体を構成する個々のリンク（スタティックリンク）には、重みづけを示す値として、それぞれ所定の信号伝達係数が定義されている。検索手段４０は、着目ノードに対して関連ノードを検索する場合、所定の信号値をもった信号を着目ノードからリンクに沿って他のノードへ伝達させる処理を行う。この信号伝達過程において、各リンクに定義された信号伝達係数に基づいて、伝達される信号値を増減させ、所定レベル以上の信号値をもった信号が伝達されたノードだけが関連ノードとして抽出されることになる。信号値が所定のレベル未満となったノードについては、信号伝達が行われなかったものとして取り扱い、更に下流側への信号伝達処理は行われないことになる。また、１つのリンクで結合されたノード間の信号伝達をホップ数１と定義し、ホップ数が所定の上限値を越えた場合にも、信号の伝達処理は中止される。したがって、たとえ信号値が所定のレベル以上であったとしても、着目ノードからのホップ数が所定の上限値を越えたノードに対しては、信号伝達処理は行われないことになる。
【０１３４】
このように、着目ノードからの信号伝達処理によって、所定のレベル以上の信号値をもった信号伝達が行われたノードだけが、検索手段４０によって関連ノードとして抽出され、この関連ノードが候補提示手段６０によって候補ノードとしてオペレータ１００に提示されることになる。なお、候補提示手段６０によって候補ノードの提示を行う際には、優先順位に基づく提示が行われる。すなわち、候補提示手段６０は、各候補ノードに伝達された信号の信号値に応じて各候補ノードに優先順位を定義し、この優先順位に基づいて候補ノードの提示を行う機能を有する。具体的には、優先順位の高い候補ノードに定義されているキーワードを、ディスプレイ上で優先的に表示（たとえば、一覧リストの上位側に表示）すればよい。結局、このような優先表示は、現着目ノードに対する検索において、過去に採択された可能性の高い候補ノードほど優先的に表示するためのものであり、採択可能性の高い候補ほど、再び採択されやすいような提示が行われることになる。
【０１３５】
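The learning means 90 modifies the signal transfer coefficient of each link on the basis of the adoption made by operator 100. That is, it increases the signal transfer coefficients of the links on the path from the focus node to the adopted node relative to those of the other links. More specifically, the learning means 90 defines all paths from the focus node to the related nodes as learning-target paths; for the links on these learning-target paths, it increases the signal transfer coefficient of the links on the path from the focus node to the adopted node and decreases the signal transfer coefficient of the remaining links.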
【０１３６】
＜6.3: Search Using Dynamic Links＞
検索手段４０は、リンク格納手段５０内のリンク集合体を構成するリンク（スタティックリンク）を用いた検索だけでなく、必要に応じて、一時的に発生させた別なリンク（ダイナミックリンク）を用いた検索を行う機能を有する（§３あるいは§４．７参照）。すなわち、着目ノードとの間が既存のスタティックリンクによって完全に連結されているノードについては、この既存のスタティックリンクを用いた検索を行い、着目ノードとの間が既存のスタティックリンクによって完全には連結されておらず、不連結部分が存在するノードについては、不連結部分のノード間についての関連性を評価し、評価結果が所定の条件を満たす場合には、不連結部分に一時的にダイナミックリンクを定義し、スタティックリンクおよびダイナミックリンクの双方を用いた検索を行う。
【０１３７】
不連結部分のノード間の関連性の評価は、個々のノードに定義されたキーワードの関連の度合いに基づいて決定される。たとえば、第１のノードと第２のノードとの間にはスタティックリンクが定義されておらず、不連結部分を構成していた場合、第１のノードに定義されている第１のキーワードと、第２のノードに定義されている第２のキーワードとの関連の度合い（たとえば、文字列の一致度）が評価され、評価結果が所定の条件を満たす場合（たとえば、相互の文字列が所定の割合以上で一致していた場合）に、両ノード間に一時的なダイナミックリンクが定義され、このダイナミックリンクを介しての信号伝達処理が行われる。なお、この一時的に定義するダイナミックリンクについての信号伝達係数は、キーワードの関連性の評価結果に基づいて決定すればよい。たとえば、両キーワードの文字列の一致度が高ければ高いほど、大きな信号伝達係数が定義されることになる。また、各ノードに、複数の等価キーワードを定義しておくようにし、いずれの等価キーワードも評価対象となるようにしておけば、より柔軟な関連性評価が可能になる。
【０１３８】
学習手段９０による学習処理を行う際には、リンク格納手段５０内に格納されているリンク集合体を構成する既存のスタティックリンクと、一時的に発生させたダイナミックリンクとでは、それぞれ別個の取り扱いがなされる。すなわち、スタティックリンクに関する学習は上述したとおりであるが、ダイナミックリンクについては、当該ダイナミックリンクを用いた検索によって抽出された関連ノードが、候補採択手段によって採択された場合には、当該ダイナミックリンクがスタティックリンクとしてリンク集合体に追加され、新たな構成要素となるような修正が行われる。この際、当該ダイナミックリンクに対して定義されていた信号伝達係数を増加修正し、修正後の信号伝達係数が、昇格したスタティックリンクの信号伝達係数として与えられることになる。一方、当該ダイナミックリンクを用いた検索によって抽出された関連ノードが、候補採択手段によって採択されなかった場合には、当該ダイナミックリンクはそのまま消滅させられる。
【０１３９】
＜6.4: Definition of Class Links＞
分散型データベースシステムとして利用する場合には、§４．１で述べたように、各ノードがいずれかのクラスに所属するように、個々のノードを複数のクラスに分類し、各クラス間の関連の程度を示すクラスリンクを定義するのが好ましい。図８に示す例では、互いに異なるクラス間の関連の程度を示すリモートリンクＡＢ，ＡＣと、自己と自己との間の関連の程度を示すローカルリンクＡＡ，ＢＢと、の二種類のクラスリンクが定義されている。検索時にダイナミックリンクを定義するか否かの判断を行う際に、不連結部分のノード間の関連性を評価することになるが、この関連性の評価を行うときに、個々のノードが所属するクラス間に定義されたクラスリンクを参考にすることができる。たとえば、前述した実施形態のように、クラスリンクが存在しないクラス間には、ダイナミックリンクの定義を許可しないようにすれば、クラスリンクの定義によって、ダイナミックリンクの発生態様を制御することができるようになる。
【０１４０】
また、個々のクラスリンクに対応させてそれぞれシソーラス辞書を用意しておき、不連結部分のノード間の関連性を評価する際に、個々のノードが所属するクラス間に定義されたクラスリンクに対応するシソーラス辞書を用いて、キーワードの関連性評価を行うようにすることもできる。用意するシソーラス辞書の内容によって、キーワードの関連性の評価結果が変わってくることになり、関連性の評価方法の方向づけを定めることができる。
【０１４１】
更に、各クラスリンクにも信号伝達係数を定義しておき、検索手段が、各ノード間に信号伝達をさせる際に、ノード間に定義されたリンクの信号伝達係数と、クラス間に定義されたクラスリンクの信号伝達係数と、の双方を考慮して信号値の増減を行うようにすれば、§４．４において述べたように、信号伝達の傾向をクラスリンクによって統括的に制御することが可能になる。
【０１４２】
＜6.5: Addition of Frequency Coefficient Storage Means＞
図２２は、上述した図２１に示すデータベースシステムに、更に、頻度係数格納手段１１０を付加した実施形態のブロック図である。頻度係数格納手段１１０は、各ノードについての採択頻度を示す頻度係数を格納する手段である。オペレータ１００が、候補提示手段６０によって提示された候補ノードの中から、採択ノードを決定すると、学習手段９０は、採択されたノードについての頻度係数を、他のノードについての頻度係数に対して相対的に増加させる修正を行う。各ノードについての頻度係数は、ノードの重みづけを示す係数であり、オペレータ１００による採択頻度の高いノードほど、頻度係数の値が大きくなり、重みが増すことになる。このようなノードの重みづけは、信号伝達過程それ自身には影響を与えることはないが、§４．６で述べたように、候補ノードを提示する際の優先順位決定に利用される。すなわち、この実施形態における候補提示手段６０は、各候補ノードに伝達された信号の信号値と、各候補ノードについての頻度係数と、の双方に応じて各候補ノードに優先順位を定義し、この優先順位に基づいて候補ノードの提示を行う機能を有する。
【０１４３】
なお、頻度係数格納手段１１０内には、個々のオペレータごとに別個の頻度係数を格納するようにし、学習手段９０は、ノードの採択を行ったオペレータに対応する頻度係数に対して修正を行うようにする。したがって、リンクについての信号伝達係数と同様に、学習は個々のオペレータごとに別個独立して行われることになり、このシステムを利用すればするほど、各オペレータにとっての使い勝手は向上してゆくことになる。
【０１４４】
§7. AND Search
本発明に係るデータベースシステムの特徴は、その特有の検索方法にある。既に述べてきたように、このデータベースシステムを利用して所望のデータを検索するには、とりあえず関連があると思われるキーワードを入力し、このキーワード（着目ノード）に対する検索処理を行うことにより、関連性のあるキーワード（候補ノード）の提示が行われることになる。オペレータは、この提示されたキーワードの中から、目的とするデータに、より関連性があると思われるキーワードを採択し、更に関連のあるキーワードを検索することができる。
【０１４５】
図２３〜図２８は、このような本発明独特の検索操作を説明する概念図である。本発明に係るデータベースシステムでは、図２３に示すようなノード集合体が定義される。図において多数の黒丸は個々のノードを示しており個々のノードには、それぞれ所定のキーワードが定義されるとともに、特定のデータが対応づけられている。利用者が探し求めている目的のデータは、このいずれかのノードに対応づけられていることになり、オペレータの行う検索操作は、目的のデータが対応づけられたノードを見付けるための操作ということになる。
【０１４６】
いま、オペレータが目的のデータに関連性があると思われるキーワードを入力し、このキーワードに対応した着目ノードＮ１を指定したとする。図２４は、ノード集合体内の１つのノードが着目ノードＮ１として指定された状態を示している。もちろん、この着目ノードＮ１が目的とするノードであった場合には、データ提供手段２０に対して提供要求（閲覧指示）を与えれば、着目ノードＮ１に対応づけられた目的のデータがデータ格納手段１０から抽出されて提示されることになる。オペレータが、検索手段４０に対して検索要求を与えると、この着目ノードＮ１に対する検索が行われ、着目ノードＮ１に対して、ある程度の関連性をもつノードからなる関連集合Ｇ１が抽出される。図２５は、このようにして検索された関連集合Ｇ１を示している。こうして検索された関連集合Ｇ１内の関連ノードは、候補提示手段６０によって、候補ノードとしてオペレータに提示される。候補ノードの提示順は、各ノードの優先順位に従うことになる。
【０１４７】
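The operator adopts one of the presented candidate nodes. Fig. 26 shows the state in which node N2 in related set G1 has been adopted. As shown in Fig. 27, the adopted node N2 becomes the new focus node. At this point, learning increases the weighting of the links on the path from node N1 to node N2, so that if a search with N1 as the focus node is run again in the future, node N2 is given a higher priority when the nodes in related set G1 are presented as candidate nodes.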
【０１４８】
オペレータは、必要に応じて、この新たな着目ノードＮ２に対応づけられたデータの提供要求を行い、閲覧をすることが可能である。また、この新たな着目ノードＮ２に対して、再び検索要求が行われると、ノードＮ２に対して、ある程度の関連性をもつノードからなる関連集合Ｇ２が抽出される。図２８は、このようにして検索された関連集合Ｇ２を示している。こうして検索された関連集合Ｇ２内の関連ノードは、候補提示手段６０によって、候補ノードとしてオペレータに提示される。なお、前述した学習により、ノードＮ１とノードＮ２との間のパス上のリンクは、重みづけが増加されているため、通常、関連集合Ｇ２内に元の着目ノードＮ１が含まれることになる。このように、元の着目ノードが、後の検索において再び候補ノードとして提示されることが好ましくない場合には、一連の検索操作において、過去の着目ノードについては候補ノードから外すような取捨選択機能を候補提示手段６０に付加しておけばよい。
【０１４９】
これまで述べてきた例では、候補提示手段６０は、検索手段４０によって抽出された関連ノードすべてをそのまま候補ノードとしてオペレータ１００に提示していた。しかしながら、上述したように、候補提示手段６０に取捨選択機能を設けておくと、検索手段４０によって抽出された関連ノードのうちの一部のみを候補ノードとして提示することが可能になる。本実施形態に係るデータベースシステムは、このような候補提示手段６０による取捨選択機能を利用し、以下に述べるＡＮＤ検索を行うようにしている。
【０１５０】
ここで述べるＡＮＤ検索は、データベースシステムを用いて検索操作を行ってゆく過程で、候補を徐々に絞り込んでゆく上で有効な手段である。たとえば、上述した一般的な検索操作の例の場合、図２５に示す関連集合Ｇ１と、図２８に示す関連集合Ｇ２とは、集合的には別個独立した集合であり、２回の検索操作を行ったにもかかわらず、候補の絞り込みは行われていない。候補を絞り込むための検索方法としては、通常、候補を決定するための条件を複数設定し、これらの条件の論理積を満たすような候補のみを抽出するＡＮＤ検索が行われる。ここで述べる検索手法は、この一般的なＡＮＤ検索を本発明に係るデータベースシステムの検索に適用したものである。
【０１５１】
図２９は、前述した図２１に示すデータベースシステムに、更に、母集合定義手段１２０を付加した実施形態のブロック図である。母集合定義手段１２０は、特定のノード集合体を示す母集合を定義する機能を有し、この実施形態における候補提示手段６０は、検索手段４０によって抽出された関連集合と、母集合定義手段１２０に定義されている母集合との論理積集合に所属するノードのみを候補ノードとして提示する機能を有する。また、候補提示手段６０によって、この論理積集合が候補ノードとして提示されると、母集合定義手段１２０は、この候補ノードの集合体を、新たな母集合とする再定義を行う機能を有する。
【０１５２】
以下、このＡＮＤ検索による検索操作を、具体例を挙げて説明する。いま、図３０に示すようなノード集合体の中のノードＮ１が着目ノードとして指定されている状態を考える（ノード集合体には、図２３と同様に多数のノードが含まれているが、以下の説明では、図が繁雑になるのを避けるため、各図において注目すべきノードについてのみ、黒丸で示す）。この初期状態では、ノード集合体全体を母集合Ｍ１と定義することにする。したがって、この時点では、母集合定義手段１２０には、ノード集合体全体に相当する母集合Ｍ１が定義されていることになる。
【０１５３】
ここで、この着目ノードＮ１に対して、検索要求を行うと、図３１に示すように、関連ノードの集合体が関連集合Ｇ１として抽出されることは既に述べたとおりである。検索手段４０が候補提示手段６０に対して与える関連集合Ｇ１は、このように抽出された関連ノードすべてを含むものである。これに対し、候補提示手段６０は、この関連集合Ｇ１と、母集合Ｍ１との論理積集合「（Ｇ１）ＡＮＤ（Ｍ１）」に含まれるノードだけを候補ノードとして提示する。図３１に示す例の場合、この時点における母集合Ｍ１は、ノード集合体全体であるので、候補提示手段６０は、関連集合Ｇ１に含まれる関連ノードすべてを、そのまま候補ノードとして提示することになる。すなわち、図３１にハッチングを施して示す領域内のノードが候補ノードとして提示される。
【０１５４】
こうして候補ノードの提示が行われると、母集合定義手段１２０は、この候補ノードの集合体を、新たな母集合とする再定義を行う。したがって、この場合、図３１にハッチングを施して示した関連集合Ｇ１が新たな母集合Ｍ２として定義されることになる。
【０１５５】
さて、ここで、候補ノードの中のノードＮ２が採択されたとすると、このノードＮ２が新たな着目ノードとなる。いま、オペレータがこの新たな着目ノードＮ２に対して、再び検索要求を与えたとする。しかも、「ＡＮＤ検索」のモードでの検索要求が与えられたとする（オペレータが検索手段４０に対して検索要求を与える際に、「通常検索」モードか、「ＡＮＤ検索」モードかの指定を行えるようにしておけばよい）。検索手段４０は、着目ノードＮ２に対する関連ノードをすべて含む関連集合Ｇ２を抽出する検索処理を実行する。図３２は、このようにして抽出された関連集合Ｇ２を示す。
【０１５６】
候補提示手段６０は、この関連集合Ｇ２と、母集合Ｍ２との論理積集合「（Ｇ２）ＡＮＤ（Ｍ２）」に含まれるノード（図３２にハッチングを施して示した領域に含まれるノード）だけを候補ノードとして提示する。結果的に、この時点で提示される候補ノードは、１回目の検索時の関連集合Ｇ１と２回目の検索時の関連集合Ｇ２との論理積集合に含まれるノードのみになっており、２回の検索による絞り込みが行われたことになる。このような候補ノードの提示が行われると、母集合定義手段１２０は、この候補ノードの集合体を、新たな母集合とする再定義を行う。したがって、この場合、図３２にハッチングを施して示した領域に含まれるノードの集合が新たな母集合Ｍ３として定義される（図３３）。
【０１５７】
ここで更に、候補ノードの中のノードＮ３が採択されたとすると、このノードＮ３が新たな着目ノードとなる。そして、オペレータがこの新たな着目ノードＮ３に対して、再び「ＡＮＤ検索」モードで検索要求を与えたとしよう。この場合、検索手段４０は、着目ノードＮ３に対する関連ノードをすべて含む関連集合Ｇ３を抽出する検索処理を実行する。図３４は、このようにして抽出された関連集合Ｇ３を示す。
【０１５８】
候補提示手段６０は、この関連集合Ｇ３と、母集合Ｍ３との論理積集合「（Ｇ３）ＡＮＤ（Ｍ３）」に含まれるノード（図３４にハッチングを施して示した領域に含まれるノード）だけを候補ノードとして提示する。結果的に、この時点で提示される候補ノードは、図３５に示すように、１回目の検索時の関連集合Ｇ１と、２回目の検索時の関連集合Ｇ２と、３回目の検索時の関連集合Ｇ３との論理積集合に含まれるノードのみになっており、３回の検索による絞り込みが行われたことになる。このような候補ノードの提示が行われると、母集合定義手段１２０は、この候補ノードの集合体を、新たな母集合とする再定義を行う。したがって、この場合、図３５にハッチングを施して示した領域に含まれるノードの集合が新たな母集合Ｍ４として定義される。
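The interplay between the candidate presentation means 60 and the population set definition means 120 in AND search reduces to repeated set intersection. The sketch below uses made-up node names and related sets; the class name `CandidatePresenter` is hypothetical and the related sets are passed in directly instead of being produced by a real search.

```python
class CandidatePresenter:
    """Keeps the population set and narrows it on every AND search."""

    def __init__(self, all_nodes):
        self.population = set(all_nodes)      # M1: the whole node set

    def present(self, related, and_mode=True):
        related = set(related)
        candidates = related & self.population if and_mode else related
        self.population = candidates          # candidates become the new population
        return candidates


nodes = {"a", "b", "c", "d", "e", "f", "g"}   # hypothetical node set
presenter = CandidatePresenter(nodes)

m2 = presenter.present({"a", "b", "c", "d"})  # G1 AND M1
m3 = presenter.present({"b", "c", "d", "e"})  # G2 AND M2
m4 = presenter.present({"c", "d", "f"})       # G3 AND M3
print(m2, m3, m4)
```

After three searches the presented candidates equal (G1) AND (G2) AND (G3), which is the narrowing shown in Fig. 35.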
【０１５９】
このようなＡＮＤ検索は、本実施形態に係るデータベースシステムを利用して絞り込みを行う場合に非常に有効である。たとえば、第１回目の検索時に、「高血圧」なるキーワードが定義されたノードを着目ノードとする検索を行い、その結果、候補ノードを示すキーワードとして、「血圧異常」、「動脈硬化」、「高血圧性網膜症」、…などが候補として提示されたとする。ここで、「動脈硬化」なるキーワードが定義されたノードを採択して新たな着目ノードとし、第２回目の検索を「ＡＮＤ検索」のモードで行えば、「高血圧」なるキーワードと、「動脈硬化」なるキーワードとの双方に関連した新たなキーワードが候補として提示されることになる。このように、ＡＮＤ検索を利用すれば、候補を次第に絞り込んでゆくことができ、目的のデータを探し出すための有効な検索手法として活用することが可能である。
【０１６０】
§8. Negative Search Processing
これまで述べてきた検索処理は、「あるキーワードに関連した事項」を見つける検索処理であり、たとえば、「高血圧」なるキーワードが定義されたノードを着目ノードとした検索を行えば、「血圧異常」、「動脈硬化」、「高血圧性網膜症」、…などのキーワードが定義された関連ノードが見つけられる。このような検索処理を「正の検索処理」と呼ぶことにすれば、ここで説明する検索処理は、「あるキーワードに関連しない事項」を見つける検索処理であり、「負の検索処理」と呼ぶべきものである。
【０１６１】
＜8.1: Basic Concept of Negative Search Processing＞
既に述べたように、図２３に示すノード集合体の中から、図２４に示すように着目ノードＮ１を指定し、正の検索処理を実行すると、この着目ノードＮ１に対して関連性をもつノードからなる関連集合Ｇ１が抽出され、この関連集合Ｇ１内のノードが候補ノード（白抜きのノード点で示す）として提示される。したがって、提示された候補ノードは、「着目ノードＮ１に関連したノード」ということになる。これに対して、図３６に示すように、関連集合Ｇ１の補集合Ｇ１^＊を候補ノード（白抜きのノード点で示す）として提示すれば、提示されたノードは、「着目ノードＮ１に関連しないノード」ということになる。これらのノードは、「関連しない」という特徴を「関連性」のひとつと考えれば、「関連しないという関連性をもつノード」ということになる。そこで、ここでは、いわゆる「一般的な意味での関連性」を「正の関連」と呼び、「関連しないという関連性」を「負の関連」と呼ぶことにする。したがって、図２５において白抜きのノード点で示されたノードは、着目ノードＮ１に対して「正の関連」を有する「正の関連ノード」であり、関連集合Ｇ１は「正の関連集合」ということになる。これに対し、図３６において白抜きのノード点で示されたノードは、着目ノードＮ１に対して「負の関連」を有する「負の関連ノード」であり、関連集合Ｇ１^＊は「負の関連集合」ということになる。
【０１６２】
図２１，図２２，図２９に示す検索手段４０はいずれも、「正の関連集合」を抽出する「正の検索処理」と「負の関連集合」を抽出する「負の検索処理」との２通りの検索処理を行う機能を有している。このような「負の検索処理」は、特に、§７で述べたＡＮＤ検索と組み合わせると効果的である。ここでは、前述のＡＮＤ検索で述べた具体例に、「負の検索処理」を組み合わせてみよう。
【０１６３】
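First, a node N1 in a node set like that of Fig. 30 is designated as the focus node, and "positive search processing" is run as the first search; the "positive related set G1" shown in Fig. 31 is extracted and presented as candidate nodes. Next, node N2 is adopted from these candidates and designated as the new focus node, and "positive search processing" is run in "AND search" mode as the second search; as shown in Fig. 32, the "positive related set G2" is extracted and its intersection with population set M2 (the hatched portion) is presented as candidate nodes. Then, as shown in Fig. 33, node N3 is adopted from these candidates and this adopted node N3 is designated as the new focus node, and "positive search processing" is again run in "AND search" mode as the third search; as shown in Fig. 34, the "positive related set G3" is extracted and its intersection with population set M3 (the hatched portion of Fig. 34) is presented as candidate nodes. As already described, these three searches result in the nodes contained in the intersection "(G1) AND (G2) AND (G3)" being presented as candidate nodes, as shown in Fig. 35.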
【０１６４】
ところが、図３３に示す状態において、ノードＮ３を新たな着目ノードとして採択し、この新たな着目ノードＮ３について、第３回目の検索として、「ＡＮＤ検索」のモードで「負の検索処理」を行えば、図３７に示すように、「負の関連集合Ｇ３^＊」が抽出され、母集合Ｍ３と負の関連集合Ｇ３^＊との論理積部分（図３７のハッチング部分）が候補ノードとして提示されることになる。結局、このような３回の検索処理により、図３８に示すように、「（Ｇ１）ＡＮＤ（Ｇ２）ＡＮＤ（Ｇ３^＊）」なる論理積集合に含まれるノードが候補ノードとして提示されることになる。
【０１６５】
このような「負の検索処理」を組み合わせたＡＮＤ検索は、柔軟な絞り込み検索を行う場合に非常に有効である。たとえば、何らかの循環器疾患をもつ患者に対して、過去の類似症例データを検索する場合を考えよう。まず、「循環器疾患」なるキーワードが定義されたノードを着目ノードとして、第１回目の検索を「正の検索処理」で行い、いくつかの候補ノードが提示されたとする。ここでは、当該患者の血圧が高かったため、この候補の中から、「高血圧」なるキーワードが定義されたノードを採択し、第２回目の検索として、「ＡＮＤ検索」のモードで「正の検索処理」を行ったとしよう。その結果、候補ノードを示すキーワードとして、「血圧異常」、「動脈硬化」、「高血圧性網膜症」、…などが候補として提示されたとする。ここで、当該患者には「動脈硬化」という症状は見られなかったとすると、第３回目の検索としては、「動脈硬化」なるキーワードが定義されたノードを採択し、この「動脈硬化」なるキーワードが定義されたノードを着目ノードに指定し、「ＡＮＤ検索」のモードで「負の検索処理」を行えばよい。すると、この第３回目の検索結果として提示される候補ノードは、「動脈硬化」なるキーワードに関連のない（負の関連をもつ）ノードに絞り込まれることになる。
【０１６６】
本実施形態に係るシステムでは、検索手段４０に対して検索要求を与えるときに、「通常検索」のモード（抽出された関連集合に所属する関連ノードをそのまま候補ノードとして提示するモード）か、「ＡＮＤ検索」のモード（前回の検索時の候補ノードを母集合とし、抽出された関連集合と母集合との論理積集合に所属するノードのみを候補ノードとして提示するモード）か、を示す第１の選択と、「正の検索処理」か「負の検索処理」かを示す第２の選択と、の２とおりの選択を行うようにし、これらの組み合わせにより、柔軟な検索処理を実現できるようにしている。
【０１６７】
＜8.2: Sign-Aware Link Definition＞
負の検索処理を行う具体的な方法のひとつとして、まず、正の検索処理を行って正の関連集合Ｇを抽出し、その補集合として、負の関連集合Ｇ^＊を抽出する、という方法が考えられる。しかしながら、このような検索方法は、負の関連集合Ｇ^＊を直接的に求める方法ではなく、正の関連集合Ｇを利用して間接的に求める方法に過ぎない。このような間接的な方法を採ると、リンクの重みに基づいて候補ノードについての優先順位を定義することができなくなる。
【０１６８】
前述したように、個々のリンクに信号伝達係数を定義し、着目ノードから信号伝達を行うようにし、所定レベル以上の信号値が得られたノードを関連ノードとして抽出するという手法を採れば、個々の関連ノードに信号値の大きさに基づく優先順位を定義することができ、候補ノードとしての提示を行う際に、この優先順位を考慮した提示を行うことができる。しかしながら、上述した間接的な方法により負の検索処理を行った場合、リンクの重みに基づいて負の関連ノードに優先順位を定義することはできなくなる。たとえば、図３６に示す例の場合、負の関連集合Ｇ１^＊を求める際に、まず、正の関連集合Ｇ１を求め、その補集合として間接的に求める手法を採った場合を考えよう。この場合、着目ノードＮ１から信号伝達があったノードは、正の関連集合Ｇ１内のノードのみである。したがって、正の関連集合Ｇ１内のノードについては、伝達された信号の信号値に基づいて優先順位を定義することができるが、その補集合として得られた負の関連集合Ｇ１^＊内のノードについては、信号伝達に基づく優先順位を定義することはできない。
【０１６９】
負の検索処理によって得られた候補ノードは、着目ノードに対して負の関連性（関連性のないという関連性）をもつノードであるが、この負の関連性についても、正の関連性と同様に定量的な取扱いができると便利である。すなわち、同じ負の関連性であっても、その程度によって区別し、より強い負の関連性を示す候補ノードを優先的に提示できた方が好ましい。このように、負の関連性を正の関連性と同様に定量的に取り扱うことができるように、本願発明者は、正リンクと負リンクとの２とおりのリンクを定義することを着想した。すなわち、正の関連性を有するノード間には正の信号伝達係数をもった正リンクを定義し、負の関連性を有するノード間には負の信号伝達係数をもった負リンクを定義するのである。
【０１７０】
図３９は、このような正リンクと負リンクとの双方を定義したノード・リンク集合体を示す概念図である。図で「＋」符号が付されたリンク（実線で示す）は、正の信号伝達係数を有する正リンクであり、図で「−」符号が付されたリンク（破線で示す）は、負の信号伝達係数を有する負リンクである。たとえば、ノードＡとノードＢとの間のリンク（以下、リンクＡＢと表現する）は正リンクであり、ノードＡとノードＢとが正の関連性（いわゆる通常の関連性）をもつノードであることを示している。これに対して、たとえばリンクＢＤは負リンクであり、ノードＢとノードＤとが負の関連性（関連がないという関連性）をもつノードであることを示している。正リンクはいずれも正の信号伝達係数を有し、負リンクはいずれも負の信号伝達係数を有している（図では、単に正負の符号のみを示してあるが、実際にはそれぞれ所定の絶対値をもった信号伝達係数が定義されている）。
【０１７１】
さて、このような正リンクと負リンクとの双方を定義したノード・リンク集合体を用いた検索処理は次のようにして行われる。まず、正の検索処理、すなわち、着目ノードに対して正の関連性を有するノードを検索する処理を行う場合には、正の信号値をもった信号を着目ノードからリンクに沿って他のノードへ伝達させ、正の信号値をもった信号が得られたノードを正の関連ノードとして抽出すればよい。図４０は、着目ノードＢについて、信号伝達をホップ数Ｈ＝１に限定した正の検索処理を行った状態を示す概念図である。各ノードに括弧書きで付した符号は、各ノードに得られる信号の信号値の符号を示しており、「０」なる符号が付されたノードは、信号伝達がなかったノードを示している。また、太線の矢印は、信号伝達経路を示すものであり、実線の矢印は正リンクに沿った信号伝達経路、破線の矢印は負リンクに沿った信号伝達経路を示している。
【０１７２】
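Positive links BA, BE, and BJ have positive signal transfer coefficients, so when a signal with a positive value is applied to focus node B, a positive-valued signal is transmitted to each of nodes A, E, and J. Negative links BD and BG, by contrast, have negative signal transfer coefficients, so the same positive input transmits a negative-valued signal to each of nodes D and G. No signal reaches the remaining nodes C, F, H, I, and K under the condition of hop count H = 1. Extracting nodes A, E, and J, which obtained positive-valued signals, as positive related nodes and presenting them as candidate nodes completes the positive search processing. In practice, the value of the signal applied to focus node B is multiplied by the signal transfer coefficient of each link, so the signal is attenuated, and any node that does not obtain a signal value at or above the predetermined level is treated as not having received a signal. For ease of explanation, however, this attenuation is ignored here.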
【０１７３】
このように、負リンクなるものを定義したとしても、着目ノードに正の信号値をもった信号を与え、正の信号値をもった信号が得られたノードのみを抽出するようにすれば、正の検索処理には何ら支障は生じない。すなわち、負リンクで結ばれたノードが候補ノードとして抽出されることはない。そして、この正の検索処理によって提示された候補ノードの中から１つのノードを採択すれば、学習が行われる点は既に述べたとおりである。たとえば、図４０に示す例において、候補ノードＪが採択されたとすれば、図２０の流れ図に示すとおり、学習対象パス（リンクＢＡ，ＢＥ，ＢＪ）のうち、採択ノードＪへ至るリンクＢＪの信号伝達係数を増加させ、それ以外のリンクＢＡ，ＢＥの信号伝達係数を減少させる学習が行われるとともに、採択ノードＪの頻度係数を増加させ、それ以外の候補ノードＡ，Ｅの頻度係数を減少させる学習が行われることになる。
【０１７４】
なお、信号伝達のためのホップ数Ｈを２以上にした場合も、上述の例と同様に、信号伝達係数の符号を考慮した乗算を行うようにすればよい。図４１は、着目ノードＢについて、信号伝達をホップ数Ｈ＝２に限定した正の検索処理を行った状態の一部（ノードＢ→Ａ→Ｆなるパスと、ノードＢ→Ｄ→Ｋなるパスと、ノードＢ→Ｇ→Ｈなるパスと、の３通りのみ例示し、他のパスについては省略してある）を示す概念図である。正リンクＢＡを介した信号伝達によりノードＡには正の信号が得られ、この正の信号が更に正リンクＡＦを介してノードＦに伝達されるため、ノードＦにも正の信号が得られている。一方、負リンクＢＤを介した信号伝達によりノードＤには負の信号が得られ、この負の信号が更に正リンクＤＫを介してノードＫに伝達されるため、ノードＫにも負の信号が得られている。これに対し、負リンクＢＧを介した信号伝達によりノードＧには負の信号が得られるが、この負の信号が更に負リンクＧＨを介してノードＨに伝達されるため、ノードＨには正の信号が得られている。このように、ホップ数Ｈが２以上の場合、負リンクを偶数回通って信号伝達が行われたノードには、正の信号が得られることになり、正の関連ノードとして抽出されることになる。結局、図４１に示す例では、ノードＡ，Ｆの他、ノードＨも正の関連ノードとして抽出されることになる。これは、ノードＨが、「着目ノードＢに対して負の関連性をもっているノードＧ」に対して負の関連性をもっているためであり、いわば「関連性のないノードに対して関連性がない」という二重否定の原理により、ノードＨが着目ノードＢに対して関連性を有すると期待されるためである。
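The sign behavior over two hops can be checked with a small sketch. The magnitudes of the coefficients are hypothetical (the figure shows only signs), the links are stored as directed pairs pointing away from focus node B purely to keep the code short, and the threshold check is ignored as in the text. The function name `propagate` is an assumption.

```python
# Only the three paths of Fig. 41 are modeled; magnitudes are hypothetical.
LINKS = {
    ("B", "A"): 0.8, ("A", "F"): 0.5,     # positive, then positive
    ("B", "D"): -0.6, ("D", "K"): 0.5,    # negative, then positive
    ("B", "G"): -0.4, ("G", "H"): -0.5,   # negative, then negative
}


def propagate(links, focus, start, hops):
    """Multiply the signal by each link coefficient, one hop at a time."""
    signal = {focus: start}
    frontier = [focus]
    for _ in range(hops):
        reached = []
        for (src, dst), coeff in links.items():
            if src in frontier and dst not in signal:
                signal[dst] = signal[src] * coeff
                reached.append(dst)
        frontier = reached
    del signal[focus]
    return signal


signal = propagate(LINKS, "B", start=100.0, hops=2)
positive_related = {node for node, value in signal.items() if value > 0}
print(signal, positive_related)
```

Node H ends up with a positive signal because the path to it crosses two negative links, while K stays negative after crossing one.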
【０１７５】
続いて、負の検索処理、すなわち、着目ノードに対して負の関連性を有するノードを検索する場合を考える。この場合は、負の信号値をもった信号を着目ノードからリンクに沿って他のノードへ伝達させ、正の信号値をもった信号が得られたノードおよび信号伝達のなかったノードを負の関連ノードとして抽出すればよい。図４２は、着目ノードＢについて、信号伝達をホップ数Ｈ＝１に限定した負の検索処理を行った状態を示す概念図である。
【０１７６】
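Positive links BA, BE, and BJ have positive signal transfer coefficients, so when a signal with a negative value is applied to focus node B, a negative-valued signal is transmitted to each of nodes A, E, and J. Negative links BD and BG have negative signal transfer coefficients, so the same negative input transmits a positive-valued signal to each of nodes D and G. No signal reaches the remaining nodes C, F, H, I, and K under the condition H = 1. Extracting nodes D and G, which obtained positive-valued signals, together with nodes C, F, H, I, and K, which received no signal, as negative related nodes and presenting them as candidate nodes completes the negative search processing. Comparing the set of nodes drawn as open circles in Fig. 40 (the candidates from the positive search) with the set drawn as open circles in Fig. 42 (the candidates from the negative search) shows that the two are complements of each other.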
【０１７７】
このように、負リンクなるものを定義し、着目ノードに負の信号値をもった信号を与え、正の信号値をもった信号が得られたノードおよび信号伝達のなかったノードを抽出するようにすれば、正の検索処理によって得られるノード集合（正の関連集合）の補集合（負の関連集合）が得られることがわかる。これは、ホップ数Ｈを２以上にした場合も全く同様である。
【０１７８】
ここで重要な点は、このような手法による負の検索処理は、負の関連集合を直接的に求めているという点である。たとえば、図４２の例の場合、候補ノードとして抽出されたノードＤ，Ｇには、実際に信号伝達が行われており、ノードＤ，Ｇについてはそれぞれ所定の信号値が得られることになる。それ以外の候補ノードＣ，Ｆ，Ｈ，Ｉ，Ｋには、信号伝達がなかったため信号値は零になるが、いずれにせよ、負の関連集合として抽出された候補ノードには、信号値に基づいて優先順位を定義することが可能になる。たとえば、図４２において、負リンクＢＤの信号伝達係数の絶対値が、負リンクＢＧの信号伝達係数の絶対値よりも大きければ、ノードＤに得られる信号値はノードＧに得られる信号値よりも大きくなり、結局、候補ノードの第１順位にはノードＤ、第２順位にはノードＧ、第３順位にはノードＣ，Ｆ，Ｈ，Ｉ，Ｋという優先順位を定義することができる。もちろん、§４．６で述べたように、各ノードに頻度係数を定義するようにすれば、更に、この頻度係数を考慮に入れた優先順位の定義も可能である。
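The ranking of negative related nodes described above can be sketched for the H = 1 case. The coefficient magnitudes are hypothetical, chosen so that the absolute value for link BD exceeds that for link BG; the names `negative_search`, `NODES`, and `LINKS_FROM_B` are assumptions, and attenuation thresholds are ignored.

```python
NODES = list("ABCDEFGHIJK")
# Links from focus node B; the signs follow Fig. 39, the magnitudes are made up.
LINKS_FROM_B = {"A": 0.8, "E": 0.7, "J": 0.6, "D": -0.6, "G": -0.4}


def negative_search(nodes, links_from_focus, focus, start=-100.0):
    """Return negative related nodes, highest signal value first (H = 1)."""
    signal = {n: 0.0 for n in nodes if n != focus}     # 0 = no signal arrived
    for node, coeff in links_from_focus.items():
        signal[node] = start * coeff
    candidates = [n for n in signal if signal[n] >= 0]  # positive or unreached
    return sorted(candidates, key=lambda n: -signal[n])


ranked = negative_search(NODES, LINKS_FROM_B, "B")
print(ranked)
```

Nodes D and G come first because a signal actually reached them, followed by the unreached nodes C, F, H, I, and K, which all tie at a signal value of zero.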
【０１７９】
以上のように、符号を考慮したリンク定義を行っておけば、着目ノードに与える信号の信号値を正にすれば正の検索処理を行うことができ、負にすれば負の検索処理を行うことができるようになる。
【０１８０】
＜8.3: Learning from Negative Search Processing＞
図２１，図２２，図２９に示すシステムは、学習手段９０を有しており、オペレータが候補ノードの中から新たな着目ノードを採択する行為を行うことにより、リンク格納手段５０内のリンク集合体に対する学習処理が行われることは既に述べたとおりである。また、図２２に示すシステムでは、頻度係数格納手段１１０内に個々のノードの採択頻度が格納されており、オペレータの採択行為により、この採択頻度に対する学習処理が行われることも既に述べたとおりである。
【０１８１】
このように、正の検索処理に基づいて採択が行われた場合の学習処理の手順は、図２０の流れ図に示したとおりであるが、負の検索処理に基づいて採択が行われた場合も、基本的には、この図２０の流れ図に示す学習手順を共通して適用することができる。たとえば、図４２に示す負の検索処理において、候補ノードの中から、ノードＤが新たな着目ノードとして採択された場合を考えよう。この場合、図２０の流れ図に示すとおり、学習対象パス（関連ノードを抽出するための有効な信号伝達のあったパス、すなわち、リンクＢＤ，ＢＧ）のうち、採択ノードＤへ至るリンクＢＤの信号伝達係数を増加させ（符号は負のままで絶対値を増加させる）、それ以外のリンクＢＧの信号伝達係数を減少させる（符号は負のままで絶対値を減少させる）学習が行われるとともに、採択ノードＤの頻度係数を増加させ、それ以外の学習対象パス上のノードＧの頻度係数を減少させる学習が行われる。
【０１８２】
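Positive and negative search processing do differ, however, in which nodes are extracted as (positive or negative) related nodes. Every positive related node extracted by positive search processing lies on a path that carried a valid signal, that is, a learning-target path (nodes A, E, and J in the example of Fig. 40). The negative related nodes extracted by negative search processing include not only nodes on paths that carried a valid signal (nodes D and G in the example of Fig. 42) but also nodes that received no signal (nodes C, F, H, I, and K in Fig. 42). When learning the frequency coefficients, it is therefore preferable to include the nodes that received no signal in the learning. Accordingly, when node D is adopted in Fig. 42, the frequency coefficient of adopted node D is increased and the frequency coefficients of all the remaining candidate nodes that were not adopted, G, C, F, H, I, and K, are decreased. Of course, when a node that received no signal is adopted, the same processing applies: the frequency coefficient of the adopted node is increased and those of all the remaining candidates are decreased. For example, when node I is adopted in Fig. 42, the frequency coefficient of adopted node I is increased and the frequency coefficients of the remaining candidate nodes C, D, G, F, H, and K are decreased.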
【０１８３】
本実施形態では、負の検索処理の結果、信号伝達のなかったノードが採択された場合には、新たな負リンクの定義という付加的な学習処理を実行するようにしている。たとえば、図４２において、ノードＩが採択された場合を考える。図４２に示す状態では、着目ノードＢと採択ノードＩとの間には、直接的なリンクは存在しない。この例で、ノードＩが着目ノードＢに対する候補ノードとして提示された理由は、ホップ数Ｈ＝１という所定の条件の下では、ノードＩに信号が伝達されなかったためであり、着目ノードＢとノードＩとの間に、積極的な負リンクが定義されていたためではない。しかしながら、「着目ノードＢについての負の検索処理を行った結果、オペレータにより、ノードＩが採択された」という事実は、このオペレータが、着目ノードＢと採択ノードＩとの間に負の関連性を認識していることを示している。既に述べたように、図４２に示す状態では、候補ノードとしての優先順位は、ノードＤ，Ｇの方がノードＩよりも高い。ところが、オペレータが優先順位の高いノードＤ，Ｇを採択せずに、敢えてノードＩを採択したことにより、将来、ノードＢを着目ノードとする負の検索が再び行われた場合に、ノードＩの優先順位を昇格させて提示するのが好ましい。
【０１８４】
このような観点から、負の検索処理において、信号伝達がなかったノードが採択された場合には、学習手段９０によって、着目ノードと採択ノードとの間に、負の関連を示す負リンクを新たに定義し、この負リンクをリンク格納手段５０内のリンク集合体に追加する処理を行うようにしている。
【０１８５】
たとえば、図４２に示す状態において、信号伝達がなかったノードＩが新たな着目ノードとして採択された場合には、図４３に示すように、着目ノードＢと採択ノードＩとの間には、オペレータによって負の関連が示されたことになる。そこで、図４４に示すように、着目ノードＢと採択ノードＩとの間に、新たに負リンクＢＩを定義するのである。この負リンクＢＩには負の信号伝達係数が定義されることになるが、その絶対値については、たとえば「７０％にする」というように予め設定しておけばよい。
【０１８６】
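Fig. 45 is a conceptual diagram of the state in which negative search processing with node B as the focus node is run again after the learning shown in Fig. 44. Compared with Fig. 42, the post-learning search of Fig. 45 adds signal transmission through the new negative link BI, so node I now obtains a positive signal value. In other words, the priority of candidate node I is higher than in the search of Fig. 42. If the operator adopts candidate node I again, the learning processing of the Fig. 20 flowchart increases the absolute value of the signal transfer coefficient of negative link BI on the path from focus node B to adopted node I (in Fig. 46 this learning result is indicated by the sign "−−"). This relearning raises the absolute value of the signal transfer coefficient of negative link BI, so that when negative search processing with node B as the focus node is run again in the future, the priority of candidate node I rises further and it is presented to the operator preferentially.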
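The additional learning step for an adopted node that received no signal could look like the sketch below. The initial absolute value 0.7 follows the "70%" example in the text; the growth factor 1.2, the existing coefficients, and the name `learn_negative_adoption` are assumptions. The decrease applied to the other learning-target links is omitted to keep the example short.

```python
def learn_negative_adoption(links_from_focus, adopted, new_abs=0.7, up=1.2):
    """Return the focus node's links after an adoption in a negative search."""
    links = dict(links_from_focus)
    if adopted in links:
        links[adopted] *= up          # keep the sign, enlarge the absolute value
    else:
        links[adopted] = -new_abs     # define a new negative link (Fig. 44)
    return links


links_b = {"A": 0.8, "E": 0.7, "J": 0.6, "D": -0.6, "G": -0.4}  # hypothetical
first = learn_negative_adoption(links_b, "I")    # I had no link to B
second = learn_negative_adoption(first, "I")     # I adopted again (Fig. 46)
print(first["I"], second["I"])
```

After the first adoption a negative search from B sends a positive signal to I; after the second adoption that signal grows, so I moves further up the candidate list.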
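[0187]
FIGS. 47 and 48 show the learning process described above in the example combined with the AND search. Suppose that, in combination with the AND search, the nodes in the hatched region of FIG. 37 are presented as candidate nodes and that node N4 is adopted from among them, as shown in FIG. 47. Since the adopted node N4 was obtained by a negative search process with node N3 as the node of interest, the operator has recognized a negative relationship between node N3 and node N4. If the adopted node N4 became a candidate as a node that received no signal, a new negative link is defined between the node of interest N3 and the adopted node N4, as shown in FIG. 48.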
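[0188]
§9. Link reconstruction
One feature of the database system according to this embodiment is that the learning means 90 trains the link aggregate in the link storage means 50. The learning described so far, however, consisted of corrections that increase or decrease the signal transmission coefficient given to each static link. The link aggregate is of course also trained by defining a dynamic link, promoting it to a static link, and adding it as a new member of the link aggregate, as described in §3, §4.7, and §6.3, and by adding a new negative link after a negative search process, as described in §8.3. The embodiments so far, however, have not addressed link reconstruction, in which static links that have already been defined are consolidated or merged. This section describes a database system that, as an incidental learning process, can consolidate existing static links.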
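[0189]
FIG. 49 is a block diagram of an embodiment in which link reconstruction means 130 is added to the database system shown in FIG. 21. Based on the corrections that the learning means 90 has made to the signal transmission coefficients, the link reconstruction means 130 performs link reconstruction processing on the link aggregate in the link storage means 50, such as deleting existing links and adding new ones. Two specific link reconstruction processes are described below as examples.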
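[0190]
<9.1: First example of link reconstruction processing>
Suppose, as shown in the left column of FIG. 50, that for three nodes N1, N2, and N3 the signal transmission coefficient of the negative link L1 between nodes N1 and N2 is at or above a "predetermined criterion", the signal transmission coefficient of the negative link L2 between nodes N2 and N3 is at or above a "predetermined criterion", and the frequency coefficient of the intermediate node N2 between links L1 and L2 is at or below a "predetermined criterion". In this case the negative links L1 and L2 have presumably been involved frequently in the operator's past adoptions, whereas the intermediate node N2 itself has rarely been adopted. In other words, on the frequently used path from node N1 to node N3, the intermediate node N2 serves merely as a pass-through node. In such a case, as shown in the right column of FIG. 50, the negative links L1 and L2 may be deleted and a new positive bypass link L3 defined between nodes N1 and N3. If the signal transmission coefficient of the new positive link L3 is set to the product of the coefficients of the negative links L1 and L2 (which is naturally positive, being a product of two negative coefficients), the attenuation of the signal value transmitted between nodes N1 and N3 remains the same as before the reconstruction. Moreover, node N3 has a negative relationship with node N2, which in turn has a negative relationship with node N1, so node N3 can be expected to have a positive relationship with node N1, and connecting nodes N1 and N3 directly by the positive link L3 is appropriate.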
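[0191]
In short, for a particular node N2, when the frequency coefficient of node N2 is at or below a "predetermined criterion", a first negative link L1 whose signal transmission coefficient is at or above a "predetermined criterion" exists between node N2 and a first other node N1, and a second negative link L2 whose signal transmission coefficient is at or above a "predetermined criterion" exists between node N2 and a second other node N3, link reconstruction may be performed by deleting the first negative link L1 and the second negative link L2 and adding a new positive link L3 between the first node N1 and the second node N3.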
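[0192]
As the "predetermined criterion" used to decide whether to perform such link reconstruction, a specific reference value may be set in advance. Alternatively, instead of setting a specific reference value, the relative magnitudes of the signal transmission coefficients and the frequency coefficient may be compared. In the example of FIG. 50, it could be agreed that the link reconstruction is performed when the signal transmission coefficient of negative link L1 is at least twice the frequency coefficient of node N2 and the signal transmission coefficient of negative link L2 is also at least twice the frequency coefficient of node N2. In that case the "predetermined criterion" for the frequency coefficient of node N2 is given by the signal transmission coefficient values of negative links L1 and L2, and conversely the "predetermined criterion" for the signal transmission coefficients of L1 and L2 is given by the frequency coefficient value of node N2.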
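[0193]
<9.2: Second example of link reconstruction processing>
Suppose, as shown in the left column of FIG. 51, that for three nodes N1, N2, and N3 the signal transmission coefficient of the negative link L1 between nodes N1 and N2 is at or above a "predetermined criterion", the signal transmission coefficient of the positive link L2 between nodes N2 and N3 is at or above a "predetermined criterion", and the frequency coefficient of the intermediate node N2 between links L1 and L2 is at or below a "predetermined criterion". In this case the negative link L1 and the positive link L2 have presumably been involved frequently in the operator's past adoptions, whereas the intermediate node N2 itself has rarely been adopted. In other words, on the frequently used path from node N1 to node N3, node N2 serves merely as a pass-through node. In such a case, as shown in the right column of FIG. 51, the negative link L1 and the positive link L2 may be deleted and a new negative bypass link L3 defined between nodes N1 and N3. If the signal transmission coefficient of the new negative link L3 is set to the product of the coefficients of the negative link L1 and the positive link L2 (which is naturally negative, being the product of a positive and a negative coefficient), the attenuation of the signal value transmitted between nodes N1 and N3 remains the same as before the reconstruction. Moreover, node N3 has a positive relationship with node N2, which in turn has a negative relationship with node N1, so node N3 can be expected to have a negative relationship with node N1, and connecting nodes N1 and N3 directly by the negative link L3 is appropriate.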
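[0194]
In short, for a particular node N2, when the frequency coefficient of node N2 is at or below a "predetermined criterion", a negative link L1 whose signal transmission coefficient is at or above a "predetermined criterion" exists between node N2 and a first other node N1, and a positive link L2 whose signal transmission coefficient is at or above a "predetermined criterion" exists between node N2 and a second other node N3, link reconstruction may be performed by deleting the negative link L1 and the positive link L2 and adding a new negative link L3 between the first node N1 and the second node N3.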
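[0195]
As the "predetermined criterion" used to decide whether to perform such link reconstruction, a specific reference value may be set in advance. Alternatively, instead of setting a specific reference value, the relative magnitudes of the signal transmission coefficients and the frequency coefficient may be compared. In the example of FIG. 51, it could be agreed that the link reconstruction is performed when the signal transmission coefficient of negative link L1 is at least twice the frequency coefficient of node N2 and the signal transmission coefficient of positive link L2 is also at least twice the frequency coefficient of node N2. In that case the "predetermined criterion" for the frequency coefficient of node N2 is given by the signal transmission coefficient values of negative link L1 and positive link L2, and conversely the "predetermined criterion" for the signal transmission coefficients of L1 and L2 is given by the frequency coefficient value of node N2.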
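[0196]
<9.3: Concrete example of link reconstruction processing>
Finally, an example is given in which the link reconstruction processing described above is applied to the link aggregate shown in FIG. 39. Suppose that, in FIG. 39, the absolute values of the signal transmission coefficients of negative link BG and negative link GH are compared with the frequency coefficient of node G and each of the former is judged to be at least twice the latter. In this case negative links BG and GH are used comparatively often in node adoption while node G itself is rarely adopted, so node G presumably serves merely as a pass-through node. Negative links BG and GH may then be deleted and a new positive link BH defined between nodes B and H, as shown in FIG. 52. The signal transmission coefficient of positive link BH may be set to the product of the coefficients of the deleted negative links BG and GH.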
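[0197]
Similarly, suppose that, in FIG. 39, the absolute values of the signal transmission coefficients of negative link HG and positive link GJ are compared with the frequency coefficient of node G and each of the former is judged to be at least twice the latter. In this case negative link HG and positive link GJ are used comparatively often in node adoption while node G itself is rarely adopted, so node G presumably serves merely as a pass-through node. Negative link HG and positive link GJ may then be deleted and a new negative link HJ defined between nodes H and J, as shown in FIG. 53. The signal transmission coefficient of negative link HJ may be set to the product of the coefficients of the deleted negative link HG and the deleted positive link GJ.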
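Both reconstruction examples follow the same pattern and might be sketched as one function. The sketch assumes coefficients expressed as fractions of 1, uses the relative criterion ("at least twice the frequency coefficient") from the text, and invents the numeric values; `bypass` is a hypothetical name. It does not check the signs of the two links, so it would also merge two positive links, a case this section does not discuss.

```python
def bypass(links, freq, n1, mid, n3, ratio=2.0):
    """Replace the links n1-mid and mid-n3 by one bypass link n1-n3.

    links: {frozenset({a, b}): signed coefficient}, modified in place
    freq:  {node: frequency coefficient}
    Returns the coefficient of the new link, or None if nothing was changed.
    """
    k1, k2 = frozenset((n1, mid)), frozenset((mid, n3))
    if k1 not in links or k2 not in links:
        return None
    c1, c2 = links[k1], links[k2]
    # Both links must be strong relative to how rarely `mid` is adopted.
    if min(abs(c1), abs(c2)) < ratio * freq[mid]:
        return None
    del links[k1], links[k2]
    links[frozenset((n1, n3))] = c1 * c2  # the product keeps the end-to-end attenuation
    return c1 * c2


fig39 = {frozenset("BG"): -0.8, frozenset("GH"): -0.6, frozenset("GJ"): 0.5}
freq = {"G": 0.2}
first = bypass(dict(fig39), freq, "B", "G", "H")   # negative x negative
second = bypass(dict(fig39), freq, "H", "G", "J")  # negative x positive
print(first > 0, second < 0)  # positive link BH, negative link HJ
```

Each call works on its own copy of the link dictionary, mirroring the way the two examples in the text start from the same FIG. 39 state.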
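[0198]
§10. Loser-revival AND search
Various functions of the database system according to one embodiment of the present invention have been described so far. The essence of the present invention lies in adding to such a database system the function described below, called "loser-revival AND search".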
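[0199]
<10.1: Basic concept of the loser-revival AND search>
The "loser-revival AND search" described here is a development of the "AND search" described in §7. Suppose that the target node (the node associated with the data being sought) in a node aggregate such as that of FIG. 54 is node N0, and that the operator designates node N1 as the first node of interest in order to find N0 (in practice, by specifying the keyword defined for node N1). Suppose further that the first search, on node N1, retrieves the related set G1 shown in FIG. 55 and that the nodes in G1 are presented as candidate nodes. The target node N0 is then missing from the related set G1. The operator designated node N1 by entering a keyword that seemed suitable for finding node N0, but the search result in FIG. 55 runs counter to that expectation. As already mentioned, the links expressing the relationships between the nodes of this node aggregate are primarily defined by the system administrator and therefore do not necessarily match the expectations of each user. As in the example of FIG. 55, it can thus happen that the target node N0 is missed by the search even though node N1 was designated precisely in order to find N0.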
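[0200]
Once the target node N0 has been missed by the first search, it is never presented as a candidate node again, even if the subsequent series of repeated searches is carried out in the "AND search" mode described in §7. The related set G1 shown in FIG. 55 becomes the new population set M2, and node N0 does not belong to M2. Therefore, even if node N2 in M2 is adopted as the new node of interest for the second search and a related set G2 containing N0 is obtained, as shown in FIG. 56, the candidate nodes presented in the second search are only those in the hatched part "(M2) AND (G2)" of FIG. 56. This hatched part becomes the new population set M3, as shown in FIG. 57. Likewise, even if node N3 in M3 is adopted as the new node of interest for the third search and a related set G3 containing N0 is obtained, as shown in FIG. 58, the candidates presented in the third search are only those in the hatched part "(M3) AND (G3)" of FIG. 58, which becomes the new population set M4, as shown in FIG. 59. And even if node N4 in M4 is then adopted for the fourth search and a related set G4 containing N0 is obtained, as shown in FIG. 60, the candidates presented in the fourth search are only those in the hatched part "(M4) AND (G4)" of FIG. 60.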
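[0201]
In the end, after this series of four search processes, only the part corresponding to the logical product (intersection) of all four related sets G1 to G4 (the hatched part of FIG. 61) remains as the final candidates, and the target node N0 is left out. As FIG. 61 makes clear, node N0 is contained in each of the related sets G2, G3, and G4, yet it drops out of the candidates merely because it happened not to be contained in G1. The "loser-revival AND search" that forms the essence of the present invention is a device for eliminating this drawback: during a series of repeated AND searches, a node that has once dropped out of the candidates is brought back as a candidate, in the same way as a loser is given a second chance in a tournament with a repechage round.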
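[0202]
In the example above, the final candidates left after four AND searches are the intersection "(G1) AND (G2) AND (G3) AND (G4)", shown as the hatched region in FIG. 61. If the related set G1 obtained in the first search is excluded from the intersection, the final candidates become the intersection "(G2) AND (G3) AND (G4)", shown as the hatched region in FIG. 62, and the target node N0 is presented as a candidate node. In other words, node N0, which dropped out of the candidates in the first search, was extracted as a related node in the second to fourth searches and is therefore revived as a final candidate of the fourth search.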
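[0203]
FIG. 63 is a conceptual diagram of a series of five search processes in this "loser-revival AND search" mode. Items (1) to (5) in the figure denote the first to fifth search processes, and the bottom row shows the set of candidate nodes presented after each search process. In the first search, the search on the node of interest N1 extracts the related set G1, and G1 is presented as the set of candidate nodes as it is. In the second search, the search on the new node of interest N2 adopted from G1 extracts the related set G2, and the intersection of G1 and G2 is presented as the set of candidate nodes. In the third search, the search on the new node of interest N3 adopted from G2 extracts the related set G3, and the intersection of G1, G2, and G3 is presented as the set of candidate nodes. Up to this point the procedure is the same as the AND search described in §7.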
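[0204]
In the "loser-revival AND search" illustrated here, the candidate nodes are determined under the rule that "search results from three or more searches ago are not taken into account". In the fourth search, the search on the new node of interest N4 adopted from G3 extracts the related set G4; at this point the search three searches ago (the first search) is disregarded, and the intersection of G2, G3, and G4 is presented as the set of candidate nodes. A node that dropped out in the first search because it was not contained in G1 is thus given a chance to return to the candidates in the fourth search. In the fifth search, the search on the new node of interest N5 adopted from G4 extracts the related set G5; the searches three or more searches ago (the second search and earlier) are disregarded, and the intersection of G3, G4, and G5 is presented as the set of candidate nodes. Nodes that dropped out because they were not contained in G1 or G2 are thus also given a chance of revival.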
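[0205]
FIG. 64 is a conceptual diagram showing how the set of candidate nodes is determined at the i-th search (where i ≥ 3) of the "loser-revival AND search" under the rule that "search results from three or more searches ago are not taken into account". In the i-th search, the search on the new node of interest Ni adopted from the related set G(i-1) extracts the related set Gi; the searches three or more searches ago (the (i-3)-th search and earlier) are disregarded, and the intersection of G(i-2), G(i-1), and Gi is presented as the set of candidate nodes.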
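[0206]
The "loser-revival AND search" according to the present invention is not limited to this rule; in general it can be extended to the rule that "search results from n or more searches ago are not taken into account" (n being a natural number of 2 or more). The example above corresponds to n = 3. With n = 2, in FIG. 63 the intersection of G2 and G3 would be presented as candidate nodes in the third search, the intersection of G3 and G4 in the fourth, and the intersection of G4 and G5 in the fifth, so the condition for revival is relaxed. Conversely, with n = 4, the intersection of G1, G2, G3, and G4 would be presented in the fourth search of FIG. 63, with no revival permitted yet, and only in the fifth search, where the intersection of G2, G3, G4, and G5 is presented, would revival finally be permitted.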
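[0207]
Stated more generally: in the system shown in FIG. 21, a natural number n of 2 or more is set in advance, and when the search means 40 has executed the i-th search process, the candidate presentation means 60 proceeds as follows. If i ≤ n, it takes as the candidate set the intersection of the i related sets extracted by the first to i-th search processes (this is the "AND search" described in §7). If i > n, it takes as the candidate set the intersection of the n related sets extracted by the (i-n+1)-th to i-th search processes. It then presents the nodes belonging to the candidate set as candidate nodes.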
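The general rule amounts to intersecting only the most recent n related sets. A minimal sketch, assuming the related sets are kept in a list in search order (the function name and the node labels other than N0 are hypothetical):

```python
def candidate_set(related_sets, n):
    """Intersect the last min(i, n) related sets, where i = len(related_sets)."""
    if n < 2 or not related_sets:
        raise ValueError("need n >= 2 and at least one related set")
    window = related_sets[-n:]  # the whole history while i <= n
    return set.intersection(*window)


# N0 is missing from G1 but present in G2, G3 and G4.
history = [
    {"a", "b"},        # G1
    {"a", "b", "N0"},  # G2
    {"a", "N0"},       # G3
    {"a", "N0", "c"},  # G4
]
print(candidate_set(history[:3], n=3))  # third search: plain AND search, N0 absent
print(candidate_set(history, n=3))      # fourth search: G1 is dropped, N0 revived
```

With n = 4 the fourth call would still intersect all four sets, which corresponds to the stricter revival condition described in the text.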
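[0208]
<10.2: Method of defining a status for each node>
This section describes a concrete method for realizing the "loser-revival AND search" described above. In this method a status is defined for each node while a "loser-revival AND search" is performed, and whether each node should be a candidate node is decided as these statuses change.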
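[0209]
In terms of the general formulation above, a total of (n+1) status levels, from the lowest status S1 to the highest status S(n+1), can be defined for each node, and at the start of a series of repeated search processes the status of every node is set to the second-highest status Sn. With n = 3, for example, four status levels S1, S2, S3, and S4 are defined. S1 is the lowest status and S4 the highest, and at the start of a series of repeated search processes every node is set to the second-highest status S3. The embodiment described here additionally defines a further status, the exclusion status S0. The exclusion status S0 is a special status used to prevent a node that has already served as the node of interest from being presented as a candidate node again. FIG. 65 lists these statuses S1 to S4 and S0.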
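[0210]
These statuses are defined only while a series of repeated search processes is executed in the "loser-revival AND search" mode, and at the start of such a series all nodes are always set to the second-highest status S3 (the status one level below the highest). The second-highest status S3 is thus the initial status of every node.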
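[0211]
During a series of repeated search processes in the "loser-revival AND search" mode, a status transition is performed at every search according to the following transition rules (a) to (c).
(a) A node extracted into the related set by the search process (a related node) is promoted by one status level. The highest status S4 is the upper limit, however: a node already at S4 stays at S4. A node in the exclusion status S0 also stays at S0.
(b) A node not extracted into the related set by the search process (a node that did not become a related node) falls to the lowest status S1. The lowest status S1 is the lower limit: a node already at S1 stays at S1. A node in the exclusion status S0 stays at S0.
(c) The node of interest of the search process is moved to the exclusion status S0.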
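[0212]
After this status transition, the candidate presentation means 60 extracts the nodes in the highest status S4 as candidate nodes and presents them to the operator. FIG. 66 lists these status transition rules.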
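Transition rules (a) to (c) and the candidate extraction can be written down compactly. The sketch below assumes statuses are stored as integers 0 to n+1 in a dictionary; the names `transition` and `candidates` are hypothetical.

```python
EXCLUDED = 0  # exclusion status S0


def transition(status, focus, related, n=3):
    """Apply rules (a) to (c) once and return the new status table."""
    top = n + 1
    new_status = {}
    for node, level in status.items():
        if node == focus or level == EXCLUDED:
            new_status[node] = EXCLUDED             # rule (c); S0 is never left
        elif node in related:
            new_status[node] = min(level + 1, top)  # rule (a): promote, capped at S(n+1)
        else:
            new_status[node] = 1                    # rule (b): fall to S1
    return new_status


def candidates(status, n=3):
    """Nodes in the highest status S(n+1)."""
    return sorted(node for node, level in status.items() if level == n + 1)


start = dict.fromkeys("ABCDEFGHIJK", 3)  # every node starts at S3
after_first = transition(start, focus="E", related=set("BCDFGH"))
print(candidates(after_first))  # ['B', 'C', 'D', 'F', 'G', 'H']
```

Because every node starts one level below the top, the first search behaves exactly like an ordinary search: all related nodes become candidates immediately.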
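[0213]
That this status-based method actually realizes the "loser-revival AND search" is explained with the concrete example of FIG. 67. In this example a series of five repeated searches is performed on a node aggregate of eleven nodes A to K. Items (1) to (6) in the left column show the states before the first to sixth searches (the sixth search has not yet been performed in this figure). The digit 0 to 4 written to the right of each node name A to K gives the status of that node at that time.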
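[0214]
The nodes A to K listed in the first row (1) are the nodes before the first search, all set to the initial status S3 (the labels A3 to K3 indicate that every node is in status S3). Suppose that node E is designated as the first node of interest and that the first search, on node E, extracts nodes B, C, D, F, G, and H as related nodes. In the figure, arrows connect the node of interest to each related node found. Applying the transition rules in this first search, rule (a) promotes the related nodes B, C, D, F, G, and H by one level, from S3 to S4. Rule (b) drops the nodes A, I, J, and K, which did not become related nodes, to the lowest status S1. Rule (c) moves the node of interest E to the exclusion status S0. Node E keeps the exclusion status S0 throughout the remaining searches and is therefore never presented as a candidate node.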
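[0215]
The nodes A to K listed in the second row (2) are the nodes after the first search. The boxed nodes are those in the highest status S4 and are presented as candidate nodes. Suppose that node D is designated from among these candidates as the new node of interest and that the second search, on node D, extracts nodes B, C, E, G, H, and I as related nodes, as indicated by the arrows. Applying the transition rules, rule (a) promotes the related nodes B, C, E, G, H, and I by one level. Related nodes B, C, G, and H are already at the highest status S4 and stay there, and related node E stays in the exclusion status S0, so the only actual transition is the promotion of node I from S1 to S2. Rule (b) drops the non-related nodes A, F, J, and K to the lowest status S1; nodes A, J, and K are already at S1, so only node F actually falls to S1. Rule (c) moves the node of interest D to the exclusion status S0.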
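[0216]
The nodes A to K listed in the third row (3) are the nodes after the second search. Again the boxed nodes in the highest status S4 are presented as candidate nodes. Suppose that node C is designated from among them as the new node of interest and that the third search, on node C, extracts nodes D, E, G, H, I, and J as related nodes, as indicated by the arrows. Applying the transition rules, rule (a) promotes the related nodes D, E, G, H, I, and J by one level. Related nodes G and H are already at S4 and stay there, and related nodes D and E stay in the exclusion status S0, so the only actual transitions are the promotion of node I from S2 to S3 and of node J from S1 to S2. Rule (b) drops the non-related nodes A, B, F, and K to the lowest status S1; nodes A, F, and K are already at S1, so only node B actually falls to S1. Rule (c) moves the node of interest C to the exclusion status S0.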
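[0217]
The nodes A to K listed in the fourth row (4) are the nodes after the third search. Again the boxed nodes in the highest status S4 are presented as candidate nodes. Suppose that node H is designated from among them as the new node of interest and that the fourth search, on node H, extracts nodes C, D, E, G, I, and J as related nodes, as indicated by the arrows. Applying the transition rules, rule (a) promotes the related nodes C, D, E, G, I, and J by one level. Related node G is already at S4 and stays there, and related nodes C, D, and E stay in the exclusion status S0, so the only actual transitions are the promotion of node I from S3 to S4 and of node J from S2 to S3. Rule (b) drops the non-related nodes A, B, F, and K to the lowest status S1; all of them are already at S1, so no actual fall occurs. Rule (c) moves the node of interest H to the exclusion status S0.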
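[0218]
The nodes A to K listed in the fifth row (5) are the nodes after the fourth search. Again the boxed nodes in the highest status S4 are presented as candidate nodes. Suppose that node I is designated from among them as the new node of interest and that the fifth search, on node I, extracts nodes C, D, E, H, J, and K as related nodes, as indicated by the arrows. Applying the transition rules, rule (a) promotes the related nodes C, D, E, H, J, and K by one level. Related nodes C, D, E, and H stay in the exclusion status S0, so the only actual transitions are the promotion of node J from S3 to S4 and of node K from S1 to S2. Rule (b) drops the non-related nodes A, B, F, and G to the lowest status S1; nodes A, B, and F are already at S1, so only node G actually falls to S1. Rule (c) moves the node of interest I to the exclusion status S0. The nodes A to K listed in the sixth row (6) are the nodes after this fifth search.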
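[0219]
The noteworthy point in this series of repeated search processes is that node I, a fall experience node that dropped out of the candidates and fell to the lowest status S1 because it did not become a related node in the first search process, became a related node in each of the second to fourth search processes and was therefore revived as a candidate node after the fourth search process. Similarly, node J, which also fell to S1 in the first search process, became a related node in each of the third to fifth search processes and was revived as a candidate node after the fifth. In short, with n = 3, a node that becomes a related node in three consecutive searches is revived as a candidate even if it had earlier fallen to the lowest status S1 by failing to become a related node. For an arbitrary n (a natural number of 2 or more), a node that becomes a related node in n consecutive searches is revived as a candidate even if it had earlier fallen to S1. A node that has once been the node of interest moves to the exclusion status S0 and is never presented as a candidate node. That is, a node the operator has already adopted is not presented as a candidate again, which avoids duplicate adoption.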
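The walkthrough of FIG. 67 can be replayed mechanically. The sketch below encodes the five searches exactly as listed in the text (node of interest and related nodes) and applies the transition rules with n = 3; the function and variable names are hypothetical.

```python
N = 3
SEARCHES = [  # (node of interest, related nodes) for searches 1 to 5
    ("E", "BCDFGH"),
    ("D", "BCEGHI"),
    ("C", "DEGHIJ"),
    ("H", "CDEGIJ"),
    ("I", "CDEHJK"),
]


def step(status, focus, related):
    """One status transition according to rules (a) to (c)."""
    result = {}
    for node, level in status.items():
        if node == focus or level == 0:
            result[node] = 0                      # exclusion status S0
        elif node in related:
            result[node] = min(level + 1, N + 1)  # promotion by one level
        else:
            result[node] = 1                      # fall to S1
    return result


status = {node: N for node in "ABCDEFGHIJK"}  # initial status S3
shown = []
for focus, related in SEARCHES:
    status = step(status, focus, related)
    shown.append("".join(sorted(n for n, s in status.items() if s == N + 1)))

print(shown)   # ['BCDFGH', 'BCGH', 'GH', 'GI', 'J']
print(status)  # final row (6): J is at S4, K at S2, C, D, E, H, I at S0
```

Node I first appears among the candidates after the fourth search and node J after the fifth, which matches the revivals described above.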
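[0220]
<10.3: A more general method using status definitions>
In the method above, with n = 3 for example, four status levels S1 to S4 are defined, and even a node that has fallen to the lowest status S1 is revived as a candidate if it is subsequently extracted into the related set three times in a row. This is because transition rule (a) promotes a node extracted into the related set by one level, so that three consecutive extractions yield three promotions, from the lowest status S1 to the highest status S4. In the general case with (n+1) status levels S1 to S(n+1), n consecutive extractions yield n promotions, taking the node from the lowest status S1 to the highest status S(n+1), where it is presented as a candidate.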
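[0221]
The method above uniformly promotes every node by one level at a time, but in practicing the invention the number of promotion levels need not be fixed at "one level" for all nodes. The number of levels per promotion can in general be u (where u is a natural number with 0 < u < n); the value of u can be set separately and independently for each node, and even for the same node a different u can be set for each search process. With the promotion step generalized to u, transition rule (a) becomes:
(a) A node extracted into the related set by the search process (a related node) is promoted by u levels, based on the promotion step u defined in advance for each individual node (u being a natural number with 0 < u < n). The highest status S(n+1) is the upper limit: if a promotion by u levels would exceed the highest status, the node is promoted only up to the highest status. A node in the exclusion status S0 stays at S0.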
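[0222]
With rule (a) generalized in this way, the example of §10.2 is the special case in which the promotion step u is uniformly 1 for all nodes. If u is not fixed uniformly but can be set separately for each node, more flexible revival conditions can be set. For instance, each time a search process is performed, the degree of association between the node of interest and each related node can be examined and u set so that nodes with a higher degree of association receive a larger u at that time; nodes with a higher degree of association then satisfy the revival condition sooner.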
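[0223]
Suppose, for example, that n = 300, so that 301 status levels are defined in total, and that the promotion step u is set for each node within the range u = 50 to 150, with u = 100 as the standard value. A standard node with u = 100 that has fallen to the lowest status satisfies the revival condition and is presented as a candidate once it is selected into the related set three times in a row, whereas a weakly associated node with u = 50 does not satisfy the revival condition unless it is selected six times in a row. Conversely, a strongly associated node with u = 150 satisfies the revival condition after being selected only twice in a row. By setting the promotion step u as a separate, independent variable for each node in this way, a different revival condition can be imposed on each node and the timing of revival can be varied from node to node.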
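The arithmetic behind this example can be checked with a small helper. It assumes, for simplicity, that a node keeps the same promotion step u in every search, which the text does not require; `hits_to_revive` is a hypothetical name.

```python
def hits_to_revive(n, u):
    """Consecutive extractions needed to climb from S1 to S(n+1) with step u."""
    if not 0 < u < n:
        raise ValueError("u must satisfy 0 < u < n")
    return -(-n // u)  # ceiling of n / u, since the last promotion is capped at S(n+1)


for u in (50, 100, 150):
    print(u, hits_to_revive(300, u))  # 50 -> 6, 100 -> 3, 150 -> 2
```

With u = 1 the helper returns n, which is the uniform case of §10.2.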
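[0224]
The value of the promotion step u can be determined from the signal value obtained at each related node during the search process. As already described, a search process is carried out by transmitting a signal with a given signal value from the node of interest along the links, and nodes at which a signal value at or above a given level is obtained are extracted as related nodes. The signal value obtained at each related node indicates its degree of association with the node of interest: the larger the signal value, the stronger the association. Accordingly, in a given search process, a large promotion step u may be set for the status transition of a strongly associated node that obtained a large signal value, and a small u for a weakly associated node that obtained a small signal value. Since the signal value obtained at each related node changes from one search process to the next, the promotion step u defined for each node is also a variable that changes with each search process.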
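[0225]
As for the initial status at the start of a series of repeated search processes, the embodiment above sets all nodes to the second-highest status Sn (the n-th status from the bottom), but the initial status need not be Sn. For example, with n = 300, 301 status levels, and a standard promotion step of u = 100, it is preferable to set the initial status to about S200, the 200th level from the bottom, rather than to the second-highest status S300. In some cases the highest status can also be used as the initial status. In general, the initial status Sj in the present invention may be any status that can be set as the j-th status from the bottom for a natural number j with 1 < j ≤ n+1.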
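[0226]
<10.4: Learning in the loser-revival AND search>
In the "loser-revival AND search" described above, it is preferable for the learning means 90 to perform learning when a revived node is adopted by the operator. That is, when a fall experience node (a node that has fallen to the lowest status S1 during the series of repeated search processes) is presented as a candidate node and adopted as the new node of interest, a new link is defined between that fall experience node and the node that was the node of interest when it fell to S1, and the new link is added to the link aggregate.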
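[0227]
In the example of FIG. 67 this learning process works as follows. Nodes I and J are both fall experience nodes that fell to the lowest status S1 in the first search process. Node I was nevertheless presented as a candidate node after the fourth search process and adopted as the node of interest for the fifth. In this case a new link is defined between the fall experience node I and node E, which was the node of interest in the first search, when node I fell. This rests on the view that node I should really have been presented as a candidate node at the first search with node E as the node of interest. Once this new link has been added by learning, when a search process with node E as the node of interest is executed again, node I is extracted as a related node at that first search and no longer drops out of the candidates. Similarly, if node J is adopted as the new node of interest after the fifth search process in the example of FIG. 67, learning defines a new link between node J and node E.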
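One way to support this learning is to remember, for each node, which node of interest was active when it fell to S1. The sketch below makes several assumptions not stated in the text: only the most recent fall of a node is kept, links are stored as unordered pairs without a coefficient, and all function names are hypothetical.

```python
def record_falls(fell_at, before, after, focus):
    """Remember the node of interest for every node that just dropped to S1."""
    for node, new_level in after.items():
        if new_level == 1 and before[node] > 1:
            fell_at[node] = focus  # assumption: a later fall overwrites an earlier one


def learn_from_revival(links, fell_at, adopted):
    """Add a link between an adopted fall experience node and the recorded node."""
    if adopted not in fell_at:
        return None  # the adopted node never fell, nothing to learn
    new_link = frozenset((fell_at[adopted], adopted))
    links.add(new_link)
    return new_link


fell_at, links = {}, set()
before = {"B": 3, "I": 3, "J": 3}  # excerpt of row (1) of FIG. 67
after = {"B": 4, "I": 1, "J": 1}   # excerpt of row (2), first search on node E
record_falls(fell_at, before, after, focus="E")
print(learn_from_revival(links, fell_at, adopted="I"))  # the new link E-I
```

The coefficient to assign to the new link is not specified in this passage, so the sketch records only that the link exists.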
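[0228]
The same learning process can be explained with the example of FIG. 62. There, as already described, node N0, which was missed in the first search, was revived as a candidate after four repeated searches. Consider the case where this node N0 (a fall experience node) is then adopted as the new node of interest, as shown in FIG. 68. Node N0 was reached only after four searches, but from the user's standpoint it would have been preferable for N0 to be presented as a candidate at the first search with node N1 as the node of interest. Learning therefore defines a new link L between node N0 and node N1, as shown in FIG. 69. After this learning, when a search process with node N1 as the node of interest is executed again, the link L causes node N0 to be extracted as a related node, and N0 is presented as a candidate node within the related set G1, as shown in FIG. 70.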
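[0229]
<10.5: Other embodiments of the loser-revival AND search>
The embodiments of the "loser-revival AND search" described above concern the "positive search process", but they apply equally when combined with the "negative search process" described in §8. When a node that fell to the lowest status in a "negative search process" is revived as a candidate and adopted as the new node of interest, the new link defined by the learning process should be a negative link.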
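[0230]
The embodiments above apply the "loser-revival AND search" according to the present invention to a database system using a node/link aggregate, but its basic idea is not limited to such systems: if the nodes in the examples above are regarded as the data themselves, it is widely applicable to database systems in general.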
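[0231]
[Effects of the Invention]
As described above, with the database system according to the present invention, when a search narrows down candidates by giving a plurality of search conditions in sequence, data that was once excluded from the candidates for failing a particular search condition can be revived as a candidate, so a more flexible narrowing search becomes possible.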
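[Brief description of the drawings]
FIG. 1 is a block diagram showing the configuration of a basic database system used to explain the basic concept of this embodiment.
FIG. 2 shows an example of a search result in the database system of FIG. 1.
FIG. 3 shows the state in which a new node of interest N4 has been adopted based on the search result of FIG. 2.
FIG. 4 shows a search process in the database system of FIG. 1 in which a dynamic link L8 is temporarily defined.
FIG. 5 shows the state in which a new node of interest N6 has been adopted based on the search result of FIG. 4 and the dynamic link L8 has been promoted to a static link.
FIG. 6 shows the evaluation method used when a plurality of equivalent keywords are defined for one node in the keyword evaluation for defining a dynamic link.
FIG. 7 shows an example of a simple distributed database system consisting of three classes A, B, and C.
FIG. 8 shows an example in which class links and thesaurus dictionaries are defined in the distributed database system of FIG. 7.
FIG. 9 shows the state in which static links L1 to L5 are defined in the distributed database system of FIG. 8.
FIG. 10 shows the result of a search with node N1 as the node of interest using the static links of FIG. 9.
FIG. 11 shows the result of the same search as in FIG. 10 performed with the class link weights (signal transmission coefficients) taken into account.
FIG. 12 shows the state after a learning process that corrects the static link weights (signal transmission coefficients) based on the search result of FIG. 10.
FIG. 13 shows the result of performing the same search as in FIG. 10 again after the learning process of FIG. 12.
FIG. 14 is a table showing how priorities are determined when candidate nodes are presented with node weights taken into account after the search process of FIG. 10.
FIG. 15 is a table showing how priorities are determined when candidate nodes are presented with node weights taken into account after the search process of FIG. 13.
FIG. 16 shows an example of a search process using both static links and dynamic links in the distributed database system of FIG. 8.
FIG. 17 shows an example in which the definition of dynamic links is restricted by class links.
FIG. 18 is a flowchart showing the overall procedure for using the database system according to this embodiment.
FIG. 19 is a flowchart showing in detail the search process of step S4 in the flowchart of FIG. 18.
FIG. 20 is a flowchart showing in detail the learning process of step S7 in the flowchart of FIG. 18.
FIG. 21 is a block diagram showing a concrete configuration of the database system according to this embodiment.
FIG. 22 is a block diagram of an embodiment in which frequency coefficient storage means 110 is added to the database system of FIG. 21.
FIG. 23 is a conceptual diagram showing the node aggregate to be searched in the database system according to this embodiment.
FIG. 24 is a conceptual diagram showing the state in which one node N1 in the node aggregate of FIG. 23 is designated as the node of interest.
FIG. 25 is a conceptual diagram showing the positive related set G1 retrieved by a positive search process on the node of interest N1 in the state of FIG. 24.
FIG. 26 is a conceptual diagram showing the state in which one node N2 is adopted from the positive related set G1 of FIG. 25.
FIG. 27 is a conceptual diagram showing the state in which the adopted node N2 of FIG. 26 has become the new node of interest.
FIG. 28 is a conceptual diagram showing the positive related set G2 retrieved by a positive search process on the new node of interest N2 of FIG. 27.
FIG. 29 is a block diagram of an embodiment in which population definition means 120 is added to the database system of FIG. 21.
FIG. 30 is a conceptual diagram showing the state in which the node of interest N1 is designated in the node aggregate searched by the database system of FIG. 29.
FIG. 31 is a conceptual diagram showing the positive related set G1 retrieved by a positive search process on the node of interest N1 in the state of FIG. 30.
FIG. 32 is a conceptual diagram showing the positive related set G2 retrieved by a positive search process in "AND search" mode on the new node of interest N2, after node N2 in the positive related set G1 of FIG. 31 has been adopted as the new node of interest.
FIG. 33 is a conceptual diagram showing the set M3 of candidate nodes presented as the result of the positive search in "AND search" mode of FIG. 32.
FIG. 34 is a conceptual diagram showing the positive related set G3 retrieved by a positive search process in "AND search" mode on the new node of interest N3, adopted from the set of candidate nodes of FIG. 33.
FIG. 35 is a conceptual diagram showing how candidates corresponding to the logical product of the search conditions are extracted by three positive search processes in "AND search" mode.
FIG. 36 is a conceptual diagram showing the negative related set G1^* retrieved by a negative search process on the node of interest N1 in the state of FIG. 24.
FIG. 37 is a conceptual diagram showing the negative related set G3^* retrieved by a negative search process in "AND search" mode on the new node of interest N3, adopted from the set of candidate nodes of FIG. 33.
FIG. 38 is a conceptual diagram showing how candidates corresponding to the logical product of the search conditions are extracted by three positive and negative search processes in "AND search" mode.
FIG. 39 is a conceptual diagram showing an example of a node/link aggregate defined using both positive links and negative links.
FIG. 40 is a conceptual diagram showing the state in which node B is designated as the node of interest in the node/link aggregate of FIG. 39 and a positive search process limited to hop count H = 1 is performed.
FIG. 41 is a conceptual diagram showing part of the state in which node B is designated as the node of interest in the node/link aggregate of FIG. 39 and a positive search process limited to hop count H = 2 is performed.
FIG. 42 is a conceptual diagram showing the state in which node B is designated as the node of interest in the node/link aggregate of FIG. 39 and a negative search process limited to hop count H = 1 is performed.
FIG. 43 is a conceptual diagram showing the state in which node I is adopted from the candidate nodes presented by the negative search process of FIG. 42.
FIG. 44 is a conceptual diagram showing the learning process resulting from the adoption of candidate node I shown in FIG. 43.
FIG. 45 is a conceptual diagram showing the state in which node B is designated as the node of interest in the node/link aggregate after the learning process of FIG. 44 and a negative search process limited to hop count H = 1 is performed.
FIG. 46 is a conceptual diagram showing the learning process performed when node I is adopted from the candidate nodes presented by the negative search process of FIG. 45.
FIG. 47 is a conceptual diagram showing the state in which node N4 is adopted from the candidate nodes presented by the negative search process of FIG. 37.
FIG. 48 is a conceptual diagram showing the learning process in which a new negative link is defined between the node of interest N3 and the adopted node N4 upon the adoption of node N4 shown in FIG. 47.
FIG. 49 is a block diagram of an embodiment in which link reconstruction means 130 is added to the database system of FIG. 21.
FIG. 50 shows a first example of the link reconstruction processing executed in the database system of FIG. 49.
FIG. 51 shows a second example of the link reconstruction processing executed in the database system of FIG. 49.
FIG. 52 shows an example in which the link reconstruction processing of FIG. 50 is applied to the node/link aggregate of FIG. 39.
FIG. 53 shows an example in which the link reconstruction processing of FIG. 51 is applied to the node/link aggregate of FIG. 39.
FIG. 54 is a conceptual diagram showing the state in which the node of interest N1 is designated in a node aggregate containing the target node N0.
FIG. 55 is a conceptual diagram showing the state in which node N0 is missing from the related set G1 extracted by the search process on the node of interest N1 in the state of FIG. 54.
FIG. 56 is a conceptual diagram showing the related set G2 retrieved by a search process in "AND search" mode on the new node of interest N2, after node N2 in the related set G1 of FIG. 55 has been adopted as the new node of interest.
FIG. 57 is a conceptual diagram showing the set M3 of candidate nodes presented as the result of the search in "AND search" mode of FIG. 56.
FIG. 58 is a conceptual diagram showing the related set G3 retrieved by a search process in "AND search" mode on the new node of interest N3, adopted from the set of candidate nodes of FIG. 57.
FIG. 59 is a conceptual diagram showing the set M4 of candidate nodes presented as the result of the search in "AND search" mode of FIG. 58.
FIG. 60 is a conceptual diagram showing the related set G4 retrieved by a search process in "AND search" mode on the new node of interest N4, adopted from the set of candidate nodes of FIG. 59.
FIG. 61 is a conceptual diagram showing how candidates corresponding to the logical product of the search conditions are extracted by four search processes in "AND search" mode.
FIG. 62 is a conceptual diagram showing how the target node N0 is extracted as a candidate node by withdrawing the first search condition in the four search processes in "AND search" mode of FIG. 61.
FIG. 63 is a conceptual diagram showing a series of five search processes in the "loser-revival AND search" mode according to the present invention.
FIG. 64 is a conceptual diagram showing, in general form, a series of search processes in the "loser-revival AND search" mode according to the present invention.
FIG. 65 is a table showing the statuses defined for each node during searching in the "loser-revival AND search" mode according to the present invention.
FIG. 66 shows the status transitions at each search in the "loser-revival AND search" mode according to the present invention.
FIG. 67 shows a concrete example of search processing in the "loser-revival AND search" mode according to the present invention.
FIG. 68 shows the first half of the learning process after search processing in the "loser-revival AND search" mode according to the present invention.
FIG. 69 shows the second half of the learning process after search processing in the "loser-revival AND search" mode according to the present invention.
FIG. 70 shows the effect of the learning process after search processing in the "loser-revival AND search" mode according to the present invention.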
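[Explanation of symbols]
1 … Database
2 … Node/link aggregate
3 … Operator
10 … Data storage means
20 … Data providing means
30 … Focused node setting means
40 … Search means
50 … Link storage means
60 … Candidate presentation means
70 … Candidate selection means
80 … Updating means
90 … Learning means
100 … Operator
110 … Frequency coefficient storage means
120 … Population definition means
130 … Link reconstruction means
A, B, C … Classes
AA, BB … Class links (local links)
AB, AC … Class links (remote links)
A to K … Nodes
G1 to G5 … Positive related sets (sets of nodes related to the node of interest)
G1^*, G3^* … Negative related sets (sets of nodes not related to the node of interest)
H … Hop count
Hmax … Upper limit of the hop count
K1 to K9 … Keywords
K40, K70 … Representative keywords
K41 to K45, K71 to K75 … Equivalent keywords
L, L1 to L8 … Instance links (static links and dynamic links)
M1 to M4 … Population sets
N0, N1 to N9, N12 … Nodes
Taa, Tbb, Tab, Tac … Thesaurus dictionaries
[0001]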
BACKGROUND OF THE INVENTION
The present invention relates to a database system, and more particularly to a database system having a search function for gradually narrowing down candidates by sequentially giving a plurality of search conditions and leaving only data satisfying all of the plurality of search conditions.
[0002]
[Prior art]
In the modern information society, the role played by database systems has become increasingly important, and database systems have been constructed for information in all fields. In order to build a high-quality database system, it is important to record high-quality data, but it is also important to provide a high-quality search environment. No matter how high-quality data the database is, if the data requested by the user cannot be found, no purpose can be achieved.
[0003]
A search operation in a database system is usually performed using keywords. If the administrator of the database system associates keywords with the individual data in advance, the corresponding data can be presented when the user inputs a specific keyword. One very commonly used search operation of this kind gradually narrows down the candidates by giving a plurality of keywords in sequence and leaving as candidates only the data related to all of those keywords.
[0004]
Recently, as seen in the spread of the Internet, it has become relatively easy to access computers installed at remote locations, and database systems are also spreading on a worldwide scale via communication lines. For this reason, it has become possible to construct a global database system by connecting local databases constructed in individual regions through communication lines. In such a distributed database system, in order to provide a high-quality search environment, a search method using a node aggregate has been proposed. That is, a node aggregate composed of a plurality of nodes is defined, and predetermined data is associated with each node. When a specific node of interest is designated from the node aggregate, data associated with the node of interest is provided. If a keyword is given to each node, data retrieval is basically performed by keyword. However, if the nodes are related to one another by a predetermined link aggregate, then when one keyword (node) is specified, another keyword (node) related to it through the link aggregate can be found, so the range of the search extends not only to the local database but also to the global database related through the link aggregate.
[0005]
[Problems to be solved by the invention]
In a conventional database system, an administrator basically sets up the environment in advance, and general users perform search operations in this environment. The environment set up by the administrator is usually designed for the convenience of a standard user, and therefore does not necessarily meet the needs of individual users. For example, which keywords are associated with a given piece of data is decided at the discretion of the administrator who sets up the environment, and the keywords associated are not necessarily appropriate for a specific user.
[0006]
In particular, in the distributed database system described above, when a user searches for nodes related to a specific node, which related nodes are extracted depends on the links between the nodes, that is, on the contents of the link aggregate. However, since the link aggregate is set up by the system administrator, the links do not always reflect the user's intention, and a node required by the user may fail to be extracted as a related node simply because a link happened not to be defined. In particular, with a search method that gradually narrows down candidates using a plurality of keywords, if a keyword used for the search happens not to be associated with the target data, the target data drops out of the candidates partway through the narrowing process, and as a result it may never be retrieved.
[0007]
Therefore, an object of the present invention is to provide a database system capable of performing a more flexible narrowing search.
[0008]
[Means for Solving the Problems]
[0009]
  (1) A first aspect of the present invention is a database system with a narrowing search function, comprising:
  data storage means for storing data to be provided;
  search means for repeatedly executing, based on a plurality of search conditions given sequentially by an operator, a search process that extracts, as a related set, the data matching a given search condition from the data stored in the data storage means;
  candidate presentation means for obtaining, each time the search means executes a search process, a candidate set of data based on the result of that search process and presenting information indicating the candidate set to the operator; and
  data providing means for providing the data adopted by the operator from the candidate set,
  wherein the candidate presentation means
  allows a total of (n + 1) status levels, from the lowest status S1 to the highest status S(n + 1), to be defined for each piece of data (where n is a preset natural number of 2 or more), sets the status of all data to an initial status Sj (where j is a natural number with 1 < j ≤ n + 1) at the start of a series of repeated search processes,
  and, each time the search process is executed, performs a status transition in which data extracted as the related set by that search process is promoted by u levels (where u is a predefined natural number with 0 < u < n, which may be common to all data or differ for each piece of data) and data not extracted as the related set falls to the lowest status S1 (the lowest status S1 being the lower limit), and presents the data that has reached the highest status S(n + 1) as the candidate set.
[0010]
  (2) A second aspect of the present invention is a database system that defines a node aggregate consisting of a plurality of nodes, associates predetermined data with each node, and has a function of providing the data associated with a node of interest when a specific node of interest is designated, the system comprising:
  Data storage means for storing data to be provided in association with nodes;
  Link storage means for storing a link aggregate consisting of a set of links indicating associations between nodes;
  Based on an instruction from the operator, focused node setting means for setting a specific focused node;
  Search means for searching related nodes related to the node of interest under a predetermined condition using a link aggregate, and executing a search process for extracting the aggregate of the related nodes as a related set;
  Candidate presentation means for presenting all or part of the nodes belonging to the related set to the operator as candidate nodes;
  Candidate selection means for allowing the operator to select a specific candidate node from among the candidate nodes presented by the candidate presentation means;
  Updating means for updating the setting of the focused node setting means so that the adopted node becomes a new focused node;
  Data providing means for extracting data associated with the node of interest from the data storage means and providing it to the operator;
  wherein
  a series of repeated search processes can be executed while the node of interest is updated one after another based on the operator's adoptions,
  a total of (n + 1) status levels, from the lowest status S1 to the highest status S(n + 1), can be defined for each node (where n is a preset natural number of 2 or more), and the status of all nodes is set to an initial status Sj (where j is a natural number with 1 < j ≤ n + 1) at the start of a series of repeated search processes,
  and, each time the search process is executed, a status transition is performed in which a node extracted as the related set by that search process is promoted by u levels (where u is a predefined natural number with 0 < u < n, which may be common to all nodes or differ for each node) and a node not extracted as the related set falls to the lowest status S1 (the lowest status S1 being the lower limit), and the nodes that have reached the highest status S(n + 1) are presented as the candidate set.
[0011]
  (3) A third aspect of the present invention is the database system according to the second aspect, wherein,
  during a series of repeated search processes, a node that has become the node of interest is moved to a special status, the exclusion status, and no transition from the exclusion status to any other status is made for the rest of that series.
[0012]
  (4) A fourth aspect of the present invention is the database system according to the second aspect,
  further comprising learning means which, when a fall experience node that has fallen to the lowest status S1 during a series of repeated search processes is presented as a candidate node and is adopted, defines a new link between that fall experience node and the node that was the node of interest at the time the fall experience node fell to the lowest status S1, and adds the new link to the link aggregate.
[0013]
  (5) A fifth aspect of the present invention is the database system according to the second aspect, wherein
  the value of the promotion step u for each related node is determined separately, based on the degree of association between the node of interest and that related node.
[0014]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, the present invention will be described based on the illustrated embodiments.
[0015]
§1. Basic concept of database system according to this embodiment
First, in order to explain the basic concept of the present embodiment, a data access method for a basic database system as shown in FIG. 1 will be described. The database system shown in FIG. 1 is composed of a database 1 containing a large number of data and a node / link aggregate 2 for accessing the database 1. The node / link aggregate 2 includes a large number of nodes and links. In the illustrated example, nine nodes N1 to N9 and seven links L1 to L7 are defined. Predetermined keywords K1 to K9 are defined in each of the nodes N1 to N9, and each link connecting the nodes fulfills a function of connecting individual keywords. In this sense, the node / link aggregate 2 can also be called a “keyword network”.
[0016]
As shown in the figure, links are not defined between every pair of nodes, but only between nodes that are related to each other (that is, between nodes whose keywords are related to each other). For example, the link L1 defined between the node N1 and the node N2 indicates that the keyword K1 and the keyword K2 have some relationship. Similarly, the link L3 defined between the node N2 and the node N4 indicates that the keyword K2 and the keyword K4 have some relationship, and the link L4 defined between the node N4 and the node N5 indicates that the keyword K4 and the keyword K5 have some relationship. The node N1 and the node N5 are thus indirectly connected through these links, so that giving the keyword K1 makes it possible to obtain the keyword K5 as a related keyword. Each link is weighted, and the weight indicates the degree of association between the nodes it connects.
[0017]
Which keywords are defined for which nodes and what links are defined between which nodes is primarily a task imposed on the administrator of this database system. At the time of using this system, some node / link aggregate 2 has already been constructed. However, as will be described later, this node / link aggregate 2 has a function of learning by a user's usage act, and the weight of each link is corrected as the user uses it. A new link may be defined between nodes for which no link has been defined. Therefore, even if the administrator does not define any link, the node / link aggregate 2 is gradually formed as the user uses it.
[0018]
When the operator 3 wants to use specific data in the database 1, the operator 3 first accesses the node / link aggregate 2. From the viewpoint of the operator 3, the node / link aggregate 2 functions as a front-end processor for accessing the database 1. First, the operator 3 designates a specific node in the node / link aggregate 2 as a target node. A node can be specified by inputting a keyword. For example, if the operator 3 gives a specific keyword K1 to the node / link aggregate 2, the node N1 corresponding to the keyword K1 is designated as the node of interest. In the figure of the present application, for the convenience of explanation, the node of interest at that time is displayed by drawing a circle around the black circle node point. In FIG. 1, the circle drawn around the node N1 indicates that this node N1 is the current node of interest.
[0019]
Specific data in the database 1 is associated with each node in the node / link aggregate 2. In other words, if one node is specified, data associated with this node can be extracted from the database 1. Therefore, if the operator 3 gives an instruction to view the data related to the current node of interest N1, the data associated with the node N1 is extracted from the database 1 and provided to the operator 3. The correspondence between the nodes and each data in the database 1 may be, for example, by giving specific address information in the database 1 to each node. When an instruction to browse data regarding the node of interest N1 is given, the database 1 is accessed based on the address information of the node N1 to read predetermined data, which is provided to the operator 3.
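The relationship between keywords, nodes, and address information can be pictured with the following minimal Python sketch. The class `Node`, the functions `designate` and `browse`, the integer addresses, and the sample records are all hypothetical names and values chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Node:
    keyword: str   # keyword defined for the node
    address: int   # hypothetical address of the associated data in database 1


# Hypothetical contents of database 1, indexed by address.
database = {100: "data for K1", 101: "data for K2"}

# A small part of the node / link aggregate 2 (nodes only).
nodes = {"N1": Node("K1", 100), "N2": Node("K2", 101)}


def designate(keyword):
    """Return the id of the node whose keyword matches the operator's input."""
    for node_id, node in nodes.items():
        if node.keyword == keyword:
            return node_id
    raise KeyError(keyword)


def browse(node_id):
    """Read the data associated with a node by following its address."""
    return database[nodes[node_id].address]


node_of_interest = designate("K1")
data = browse(node_of_interest)
```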
[0020]
Thus, if the operator 3 gives the keyword K1 to the node / link aggregate 2 and gives an instruction to browse the data, the data associated with the node N1 is read from the database 1 and provided to the operator 3. However, such a search process is no different from the search process of a conventional general database system, in which data previously associated with the keyword K1 is presented when the keyword K1 is input. The database system according to the present embodiment can of course perform such conventional search processing, but the main point of the present embodiment is that, when the operator gives one keyword to the system, other keywords related to that keyword can be retrieved, which realizes a search operation with a higher degree of freedom.
[0021]
For example, when the operator gives a predetermined keyword K1 to the node / link aggregate 2, the node N1 is designated as the node of interest, as shown in FIG. 1. As described above, the operator can give an instruction to “browse data associated with the node of interest N1” if necessary. Here, however, it is assumed that the operator is not directly interested in the data associated with the node N1, but is looking for other data related to the keyword K1. In this case, it is only necessary to instruct the node / link aggregate 2 to perform a “search process with the node N1 as the target node”. When such a search instruction is given, the node / link aggregate 2 retrieves other nodes related to the node of interest N1 by referring to the defined link aggregate. For example, if every node connected to the target node N1 directly or indirectly by links is retrieved as a related node, then in the example of FIG. 1 the four nodes N2, N3, N4, and N5 are retrieved as related nodes. In practice, however, it is preferable to take the weight of each link into account and to extract only nodes having a certain degree of relationship as related nodes. For example, in the example of FIG. 1, when the weight of the link L4 is small, the relationship between the target node N1 and the node N5 is judged to be low, and the node N5 is excluded from the related nodes. In addition, as detailed later in §4.3, it is preferable to define signal transmission between nodes connected by one link as the number of hops H = 1, and to impose the condition that only nodes connected within a predetermined number of hops are treated as related nodes. For example, when the condition that the number of hops H is 2 or less is imposed, in the example of FIG. 1 the node N5, which is located at hop number H = 3 with respect to the node of interest N1, is not extracted as a related node regardless of the weight of each link.
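The two conditions just described (a hop limit and a minimum degree of relationship) can be sketched in Python as follows. The text here does not say how link weights along a path are combined, so this sketch assumes, purely for illustration, that weights lie between 0 and 1 and that the strength of a path is the product of its link weights. The function name `search_related`, the parameter names, and the weight values are likewise hypothetical.

```python
def search_related(links, target, max_hops=2, min_strength=0.0):
    """Return {node: strength} for nodes reachable from target within max_hops.

    links maps a link id to (node_a, node_b, weight).
    Path strength = product of link weights (an assumption of this sketch).
    """
    neighbours = {}
    for a, b, weight in links.values():
        neighbours.setdefault(a, []).append((b, weight))
        neighbours.setdefault(b, []).append((a, weight))

    best = {target: 1.0}        # strongest path found so far to each node
    frontier = {target: 1.0}    # nodes whose strength improved in the last hop
    for _ in range(max_hops):
        next_frontier = {}
        for node, strength in frontier.items():
            for other, weight in neighbours.get(node, []):
                candidate = strength * weight
                if candidate > best.get(other, 0.0):
                    best[other] = candidate
                    next_frontier[other] = candidate
        frontier = next_frontier
    return {n: s for n, s in best.items() if n != target and s >= min_strength}


# Links L1 to L4 of FIG. 1 with hypothetical weights (L4 deliberately small).
links = {
    "L1": ("N1", "N2", 0.8),
    "L2": ("N2", "N3", 0.6),
    "L3": ("N2", "N4", 0.7),
    "L4": ("N4", "N5", 0.2),
}
within_two_hops = search_related(links, "N1", max_hops=2)
weight_filtered = search_related(links, "N1", max_hops=3, min_strength=0.2)
```

In both calls the node N5 is left out: in the first because it lies three hops away, in the second because the small weight of L4 pulls its strength below the threshold.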
[0022]
Here, it is assumed that only the three nodes N2, N3, and N4 are extracted by performing the search process using the node N1 as the target node. FIG. 2 is a diagram showing the result of such a search process. The reason why the three nodes N2, N3, and N4 are drawn as white node points is to indicate that these three nodes are nodes extracted as related nodes. Hereinafter, related nodes extracted by the search are indicated by white node points. The node N5 shown as a black node point in the figure is not extracted as a related node in this search.
[0023]
When the related nodes N2, N3, and N4 are searched in this way, these “related nodes” are presented to the operator as “candidate nodes”. Specifically, the candidate nodes may be presented by displaying each keyword K2, K3, K4 defined in each candidate node N2, N3, N4 on a display or the like. Here, the “related node” is referred to as a “candidate node” because the operator selects a new node of interest from these nodes. As shown in FIG. 2, the current target node is the node N1, but the operator designates one of the candidate nodes N2, N3, and N4 as a new target node. For example, if the operator adopts candidate node N4 as a new node of interest, as shown in FIG. 3, node N4 becomes the new node of interest, and all other nodes return to normal nodes. The designation of the candidate node to be adopted can be performed, for example, by an operation of selecting the keyword K4 from the keywords K2, K3, K4 presented on the display screen.
[0024]
Note that, for convenience in explaining the “AND search” described later in §7, this specification calls the nodes that a search process for a given target node extracts as having a certain degree of relevance to that target node “related nodes”, and calls those of the “related nodes” that are presented to the operator as candidates for a new node of interest “candidate nodes”. In the “AND search” of §7, only some of the extracted “related nodes” are presented as “candidate nodes”, but in the normal search described here all of the extracted “related nodes” are presented as “candidate nodes” as they are, so for the time being “related nodes” = “candidate nodes” may be assumed. Therefore, in the description up to §6, “related node” and “candidate node” are treated as synonyms unless this causes a problem.
[0025]
Seen from the operator's side, the above search processing proceeds as follows. First, the operator inputs a predetermined keyword K1 and gives a search instruction. A search process referring to the link aggregate is then performed, and other keywords K2, K3, K4 having a certain degree of relevance to the keyword K1 are displayed on the display screen (presentation of the candidate nodes N2, N3, N4). From these new keywords, the operator selects the one that seems related to the data he or she is looking for (one candidate node is adopted as a new node of interest). In this example, the node N4 thereby becomes the new target node.
[0026]
In this way, by giving search instructions, the operator can move from one related node to the next and travel through the keyword network. Moreover, data associated with the current node of interest can be browsed at any time if necessary. For example, when the node N4 has become the new node of interest as shown in FIG. 3 and the operator wishes to browse the data associated with the node N4, a browsing instruction may be given at that point. Data corresponding to the node N4 is then read from the database 1 and presented to the operator. In addition, if an instruction to perform a “search process with node N4 as the target node” is given, related nodes having a certain degree of relevance to the node N4 are extracted as candidate nodes. For example, the node N5, which was left out of the candidate nodes in the previous “search process using the node N1 as the target node” because of its low relevance, is retrieved as a candidate node in this search process.
[0027]
In general, when searching a database system for necessary data, an appropriate keyword sometimes comes to mind immediately, but the optimum keyword does not always do so. In such a case, the system according to the present embodiment enables a flexible search with a very high degree of freedom. That is, if the operator inputs whatever keyword K1 comes to mind and gives a search instruction, the related keywords K2, K3, and K4 are automatically presented as candidates on the display, so the operator can adopt a more suitable keyword for accessing the data being sought. If K4 (node N4) is adopted as a more appropriate keyword and a search instruction is given again, a new keyword K5, for example, will be presented. If this keyword K5 is exactly the optimum keyword for accessing the data being sought, the target data can be obtained from the database 1 by adopting the keyword K5 (node N5) and then giving a browsing instruction.
[0028]
§2. Learning node-link aggregates
One of the features of the database system according to the present embodiment is that the node / link aggregate 2 has a learning function. As described above, the node / link aggregate 2 shown in FIG. 1 is primarily constructed by the administrator of this database system, but it changes as users use the system. Accordingly, the database 1 may be prepared in common for all users, but it is preferable to prepare a separate node / link aggregate 2 for each user. If a separate node / link aggregate 2 is prepared for each user in this way, then even if all the node / link aggregates 2 are initially constructed identically by the system administrator, the node / link aggregate 2 for each user changes, as that user uses the system, into a form that is easy for that user to use.
[0029]
In order to change the node / link aggregate 2 into a form that is easy to use in this way, the present embodiment performs learning according to the following basic policy. That is, when a plurality of candidate nodes are extracted by a search for a certain target node and the operator adopts a desired candidate node from among them, the weights of the links on the path from the target node to the adopted node are increased. In the example above, as shown in FIG. 2, three candidate nodes N2, N3, and N4 were retrieved for the target node N1, the operator adopted the node N4 from among them, and as a result the adopted node N4 became the new node of interest, as shown in FIG. 3. In this case, the weights of the links L1 and L3 on the path from the target node N1 to the adopted node N4 are increased. In FIG. 3, the links L1 and L3 are drawn with bold lines to indicate that this weight-increasing learning has been performed.
[0030]
It is also possible to reduce link weights by learning. In the example above, although the three candidate nodes N2, N3, and N4 were presented for the node of interest N1, the operator adopted the node N4, and the node N3 was left out of the adoption. In other words, the link L2 was not involved in the adoption of the node. In such a case, correcting the weight of the link L2 downward makes it possible to learn in a way that reflects how the user actually uses the system. For example, suppose that the user has performed the event “a search is performed with node N1 as the target node and node N4 among the candidate nodes is adopted” five times. In each of the five learning steps, the weights of the links L1 and L3 are increased and the weight of the link L2 is decreased. As a result, the weights of the links L1 and L3 become very large, and conversely the weight of the link L2 becomes very small. Therefore, when a sixth “search using node N1 as the target node” is performed, the relationship indicated by the link L2 (the relationship between node N2 and node N3) has become so weak that node N3 is no longer extracted as a candidate node.
[0031]
If learning that increases weights and learning that decreases weights are both performed in this way, nodes that, judging from past history, are unlikely to be adopted in the future are no longer extracted as candidate nodes in later searches, and the candidate nodes can be narrowed down. In an actual database system in which the total number of nodes is enormous, keeping the number of candidate nodes presented to the operator reasonably small is important for usability.
[0032]
In order to perform both increasing learning and decreasing learning on link weights, learning should be performed according to the following criteria. That is, when several candidate nodes are extracted by a search for a predetermined target node and one node is adopted from among them, all the paths from the target node to the individual candidate nodes are treated as learning target paths. Then, among the links on the learning target paths, the weights of the links on the path from the target node to the adopted node are increased, and the weights of the other links are decreased. In short, this increase / decrease correction can be described as a “correction in which the weights of the links on the path from the target node to the adopted node are increased relative to the weights of the other links”.
[0033]
When learning is performed by increasing and decreasing weights as described above, the following weight correction is made in the example above. That is, as shown in FIG. 2, if three candidate nodes N2, N3, and N4 are extracted by the search for the target node N1, all the links on the paths from the target node N1 to the candidate nodes N2, N3, and N4 become links on the learning target paths. Specifically, the links L1, L2, and L3 are learning targets. If the operator adopts the node N4, a correction is made that increases the weights of the links L1 and L3 on the path from the target node N1 to the adopted node N4 and decreases the weight of the link L2, the remaining link on the learning target paths. The link L4 is not a learning target at this time because the node N5 is not a candidate node, and its weight remains unchanged.
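The correction for this example can be sketched in Python as below. The specification does not state here by how much each weight is changed, so the fixed integer step, the lower bound of 0, the function name `learn_weights`, and the initial weights are assumptions made only for illustration.

```python
def learn_weights(weights, paths, adopted, step=1):
    """Raise weights on the path to the adopted node, lower the other targets.

    paths maps each candidate node to the list of link ids on the path from
    the target node to that candidate. Links on no path are left untouched.
    """
    learning_targets = {link for path in paths.values() for link in path}
    reinforced = set(paths[adopted])
    for link in learning_targets:
        if link in reinforced:
            weights[link] += step
        else:
            # Flooring at 0 is an assumption of this sketch.
            weights[link] = max(0, weights[link] - step)
    return weights


# FIG. 2: target node N1, candidate nodes N2, N3, N4 (hypothetical weights).
weights = {"L1": 5, "L2": 5, "L3": 5, "L4": 5}
paths = {"N2": ["L1"], "N3": ["L1", "L2"], "N4": ["L1", "L3"]}
learn_weights(weights, paths, adopted="N4")
```

After the call, L1 and L3 have been increased, L2 has been decreased, and L4 keeps its original weight because N5 was not a candidate node.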
[0034]
In short, the idea behind the link weight learning described above is to reduce the weights of the links leading to nodes that were presented to the operator as candidate nodes but were not adopted, and conversely to increase the weights of the links leading to the adopted node. An important point is that all learning is based on the operator's (user's) action of “adopting one of a plurality of candidate nodes”. Therefore, when this database system is used by a large number of users, learning proceeds differently for each individual user. Since, as described above, the node / link aggregate 2 is prepared separately for each user, the node / link aggregate 2 for each user learns, as that user uses the system, in a way that suits that user.
[0035]
Although only the link weighting has been described as a learning target here, in this embodiment, the weighting is also defined for the node, and the node weighting is also the learning target. The handling of the node weight will be described later.
[0036]
§3. Occurrence of a new link
§2 above explained that learning is performed on link weights. However, merely correcting the weights of existing links makes it difficult to realize a flexible search process tailored to the convenience of individual users. For example, in the node / link aggregate 2 illustrated in FIG. 1, the nodes N2, N3, N4, and N5 are connected to the node N1 by links. Therefore, by repeating the search process, it is possible to reach the nodes N2, N3, N4, and N5 from the node N1. Indeed, in the example above, the candidate nodes N2, N3, and N4 are reached by the first search using the node N1 as the target node, and if the node N4 is then adopted and a second search is performed using the adopted node N4 as a new target node, the candidate node N5 can be reached.
[0037]
However, since no links are defined between the nodes N1 to N5 and the nodes N6 to N9, a search cannot reach the nodes N6 to N9 from the nodes N1 to N5 as long as candidate nodes are retrieved using only the existing links.
[0038]
As already described, the node / link aggregate 2 is primarily constructed by the system administrator. Therefore, the fact that no link is defined between the nodes N1 to N5 and the nodes N6 to N9 means that the administrator judged that there is no relationship between the keywords K1 to K5 and the keywords K6 to K9. However, such a judgment by the administrator is not universal. A specific user may well recognize, for example, that the keyword K4 and the keyword K7 are closely related. In such a case, as shown in FIG. 4 for example, a temporary link L8 indicated by a broken line is generated between the node N4 and the node N7, and a search using both the existing links L1 to L7 and the temporarily generated link L8 makes a search with a higher degree of freedom possible. In other words, if a search using the node N4 as the target node is performed with the temporary link L8 added, the nodes N1, N2, N3, N5, N6, N7, and N8, shown as white node points in FIG. 4, can for example be extracted as candidate nodes.
[0039]
In the present specification, the existing links described so far are referred to as “static links”, and links that are temporarily generated during a search are referred to as “dynamic links” to distinguish the two. In the example shown in FIG. 4, the links L1 to L7 indicated by solid lines are static links, and the link L8 indicated by a broken line is a dynamic link. If a dynamic link is generated at the time of a search, even a node that is not fully connected to the node of interest by existing static links (a node whose path has an unconnected portion) can be searched by temporarily defining a dynamic link across the unconnected portion, so the candidates can be extended to such disconnected nodes.
[0040]
Now, as shown in FIG. 4, in the search using the node N4 as the target node, by dynamically defining the dynamic link L8 (the dynamic link definition method will be described later), the nodes N1, N2, N3, N5 , N6, N7, and N8 are extracted as candidate nodes. At this time, keywords K1, K2, K3, K5, K6, K7, and K8 are presented to the operator as information indicating individual candidate nodes. Then, it is assumed that the operator has selected the node N6 as a new node of interest from among these candidate nodes. As described above, the operator can browse data associated with the new target node N6, or can perform a new search using the node N6 as the target node. As described above, the definition of the dynamic link greatly expands the degree of freedom of the search range.
[0041]
When the candidate node N6 is extracted by a search using the temporarily defined dynamic link L8 and this candidate node N6 is adopted as described above (in other words, when the dynamic link L8 is a link on the path from the node of interest N4 to the adopted node N6), the dynamic link L8 is newly added as a static link L8. That is, the temporary dynamic link is promoted to a permanent static link. FIG. 5 shows the state in which the adopted node N6 has become the new target node and the dynamic link L8 has been promoted to the static link L8. At this time, as indicated by the bold lines in the figure, the weights of the links L8 and L5 on the path from the target node N4 to the adopted node N6 are increased (the original weight of the link L8 before the increase is the one determined when the dynamic link L8 was defined). On the other hand, learning is performed to reduce the weights of the other learning target links (the links L1, L2, L3, L4, and L6 on the paths from the node of interest N4 to the individual candidate nodes).
[0042]
In the end, in this example, the temporarily defined dynamic link L8 is promoted to the static link L8 and added as a new member of the node / link aggregate 2. However, a temporarily defined dynamic link may also disappear without being promoted to a static link. For example, if the operator adopts the node N5 instead of the node N6 while the candidate nodes are presented as shown in FIG. 4, the dynamic link L8 does not lie on the path from the target node to the adopted node, so it disappears without being promoted to a static link. In short, a temporarily defined dynamic link remains as a static link if it is involved, as part of the path, in the operator's act of adoption, and disappears if it is not.
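The promotion and disappearance of dynamic links can be sketched in Python as follows. The function name `settle_dynamic_links` and the dictionary layout are hypothetical; the endpoints of L5 are inferred from the statement that the path from N4 to N6 consists of L8 and L5, and only the links needed for the example are listed.

```python
def settle_dynamic_links(static_links, dynamic_links, adopted_path):
    """Promote dynamic links on the adopted path; discard all the others.

    Returns the ids of the links that were promoted to static links.
    """
    promoted = []
    for link_id, endpoints in dynamic_links.items():
        if link_id in adopted_path:
            static_links[link_id] = endpoints   # becomes a permanent link
            promoted.append(link_id)
    dynamic_links.clear()                        # temporary links do not persist
    return promoted


def example(adopted_path):
    """Run the FIG. 4 situation for one adoption and return the result."""
    static_links = {
        "L1": ("N1", "N2"), "L2": ("N2", "N3"), "L3": ("N2", "N4"),
        "L4": ("N4", "N5"), "L5": ("N7", "N6"),
    }
    dynamic_links = {"L8": ("N4", "N7")}
    promoted = settle_dynamic_links(static_links, dynamic_links, adopted_path)
    return promoted, static_links, dynamic_links


adopt_n6 = example(["L8", "L5"])   # operator adopts N6: L8 is on the path
adopt_n5 = example(["L4"])         # operator adopts N5: L8 is not on the path
```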
[0043]
If the dynamic links that the user needs are promoted to static links and added to the node / link aggregate 2 in this way, links that were not provided by the database system administrator gradually accumulate, and as a result a link aggregate that is convenient for the user is formed. Moreover, with this method of adding links, even if the system administrator defined no links at all at first (that is, even if no static links initially existed), static links are gradually formed as the user uses the system. The new link addition method described here is therefore very effective.
[0044]
The explanation so far has not mentioned how dynamic links are defined, but some criterion is needed in order to temporarily define a dynamic link such as the link L8 shown by the broken line in FIG. 4. In the illustrated example, the dynamic link L8 is defined between the nodes N4-N7, but a dynamic link could equally be defined between the nodes N4-N6, between the nodes N4-N8, or between the nodes N4-N9. A dynamic link could also be defined between the nodes N4-N3 or between the nodes N4-N1; indeed, a dynamic link could be defined between any pair of nodes that are not directly connected by a static link. However, since a “link” indicates some relationship between two nodes, it is not preferable to define a dynamic link between nodes that have no relationship at all.
[0045]
Therefore, in this embodiment, at the time of a search, the relevance of the keywords defined for two nodes that are not directly connected by a static link is evaluated, and when the evaluation result satisfies a predetermined condition, a dynamic link is defined between the two nodes. In the example shown in FIG. 4, the relationship between the keyword K4 defined for the node N4 and the keyword K7 defined for the node N7 was evaluated, and because the evaluation result satisfied the predetermined condition, the dynamic link L8 was defined between the nodes N4-N7.
[0046]
One method of evaluating the relationship between two keywords is to evaluate quantitatively the degree of matching between the character strings that make up the two keywords. For example, the keyword “high blood pressure” and the keyword “blood pressure value” (each three characters long in the original Japanese) match in the two characters meaning “blood pressure” out of three, so a quantitative evaluation such as a degree of coincidence of “2/3” is possible. Alternatively, if a thesaurus dictionary in which similarity is defined quantitatively is prepared, the relationship between the two keywords can be evaluated quantitatively from it. For example, using a thesaurus in which the similarity between “high blood pressure” and “high pressure” is defined as 100, the degree of coincidence between the keyword “high blood pressure” and the keyword “high pressure” can be evaluated quantitatively. When such an evaluation value is equal to or higher than a certain standard, a dynamic link may be defined between the two nodes. The evaluation value can also be used as it is as the weight of the dynamic link.
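The character-string matching evaluation can be sketched in Python as follows. The Japanese strings are assumed renderings of “high blood pressure” and “blood pressure value”, and dividing the length of the longest common run of characters by the length of the longer keyword is only one possible reading of the “2/3” example. The function name `match_degree` and the threshold of 1/2 are hypothetical.

```python
from difflib import SequenceMatcher
from fractions import Fraction


def match_degree(a, b):
    """Longest common run of characters divided by the longer keyword length."""
    if not a or not b:
        return Fraction(0)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    longest = matcher.find_longest_match(0, len(a), 0, len(b))
    return Fraction(longest.size, max(len(a), len(b)))


# Assumed Japanese renderings of the two keywords in the example.
degree = match_degree("高血圧", "血圧値")

# Hypothetical standard: define a dynamic link when the degree is at least 1/2,
# and reuse the evaluation value as the weight of the dynamic link.
define_link = degree >= Fraction(1, 2)
```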
[0047]
In addition, in order to perform the evaluation of the relevance between nodes as rationally as possible, it is preferable to define a plurality of equivalent keywords for one node. For example, in the example shown in FIG. 4, it has been described that the keyword K4 is defined for the node N4 and the keyword K7 is defined for the node N7. In this case, as shown in FIG. 6, the keyword K4 is composed of one representative keyword K40 and a plurality of equivalent keywords K41 to K45, and the keyword K7 is composed of one representative keyword K70 and a plurality of equivalent keywords K71 to K75. In addition, when the evaluation result for any one of the keywords exceeds a certain standard, a dynamic link may be defined between both nodes. In the example shown in FIG. 6, since the evaluation result of the relationship between the equivalent keyword K42 and the equivalent keyword K74 is equal to or higher than the standard, a dynamic link is defined between the nodes N4 and N7.
[0048]
Here, each equivalent keyword is a keyword equivalent to the representative keyword and can be used in place of it. For example, if equivalent keywords such as “high blood pressure”, “blood pressure abnormality”, or “hypertension” are defined for the representative keyword “hypertension”, relevance is evaluated for every one of the equivalent keywords, so a more rational evaluation result can be obtained. That is, it is possible to avoid the unreasonable outcome in which two nodes that are in fact related are evaluated as having “no relevance” merely because their keywords are written differently.
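Evaluating every pair of keywords, representative or equivalent, and defining a dynamic link when any pair meets the standard can be sketched as follows. The word-overlap evaluator `word_overlap`, the function `dynamic_link_weight`, the threshold, and the English keyword lists are all hypothetical stand-ins; any quantitative evaluation, such as character matching or a thesaurus lookup, could be substituted.

```python
def word_overlap(a, b):
    """Hypothetical evaluator: shared words divided by all distinct words."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def dynamic_link_weight(keywords_a, keywords_b, threshold=0.5,
                        evaluate=word_overlap):
    """Return the best pairwise evaluation if it meets the threshold, else None.

    Each list holds the representative keyword followed by equivalent keywords.
    """
    best = max(
        (evaluate(a, b) for a in keywords_a for b in keywords_b),
        default=0.0,
    )
    return best if best >= threshold else None


# Hypothetical keyword sets for nodes N4 and N7.
n4_keywords = ["hypertension", "high blood pressure", "blood pressure abnormality"]
n7_keywords = ["sphygmomanometer", "blood pressure value"]

weight = dynamic_link_weight(n4_keywords, n7_keywords)
```

Here the representative keywords alone share no words, but an equivalent keyword on each side does, so a dynamic link would be defined with the evaluation value as its weight.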
[0049]
§4. Example of application to a distributed database system
So far, the basic concept of the database system according to the present embodiment, the node / link aggregate learning method, and the new link generation method have been described with reference to simple examples. Here, the embodiment applied to the distributed database system will be described more specifically.
[0050]
<4.1 Definition of class link in distributed system>
Over the past few years, environments in which multiple computers are connected to each other via a network have become commonplace, and distributed database systems, in which a database built on one computer can be used from another computer, have become widespread. In such a distributed database system, each local database is handled through the concept of a “class”. Here, for the sake of convenience, the following description uses as an example a very simple distributed database system consisting of three classes A, B, and C, as shown in FIG. 7.
[0051]
In FIG. 7, a circumference is drawn for each class, and nodes N1 to N7 are shown on this circumference. Here, each circumference indicates a group of individual classes, and nodes on each circumference indicate nodes belonging to the specific class. For example, nodes N1 and N2 are nodes belonging to class B, nodes N3 and N4 are nodes belonging to class A, and nodes N5, N6 and N7 are nodes belonging to class C. Usually, the database for each class is provided in a spatially separated place and is connected to each other via a communication line or the like. In the illustrated example, it is assumed that classes A, B, and C are spatially separated from each other. The circumference shown in the figure is for indicating the affiliation of each node, and does not indicate a link between the nodes. Therefore, in the state shown in FIG. 7, no link is defined between the nodes N1 to N7. In the illustrated example, only a total of seven nodes are shown, but in reality, there are a large number of nodes in each class.
[0052]
In such a distributed database system, in order to define links between nodes, in this embodiment, associations are defined between individual classes in advance. Here, this association between classes is referred to as “class link”. So far, the links (static links and dynamic links) described in §1 to §3 are links between nodes (generally called instance links) indicating the association between nodes. The class link defined here is a link between classes indicating association between classes.
[0053]
FIG. 8 is a diagram illustrating an example of class link definitions. The straight lines and circles drawn with bold lines in the figure indicate class links. Specifically, a class link AB is defined between the classes A and B, and a class link AC is defined between the classes A and C. In addition, a class link AA is defined between the class A and itself, and a class link BB is defined between the class B and itself. The class links AB and AC, indicated by straight lines, show the degree of association between different classes and are referred to herein as “remote links”. The class links AA and BB, indicated by circles, show the degree of association of a class with itself and are referred to herein as “local links”.
[0054]
Here, in order to avoid confusion, terms related to “link” used in this specification are summarized as follows.
[0055]
(1) Instance links (links indicating the relationship between nodes; the generic term for the static links and dynamic links below)
(1)-1: Static link (a permanent link constructed as part of the link aggregate; its weight is learned as the user uses the system. Where not otherwise specified, it may simply be called a “link”)
(1)-2: Dynamic link (a temporary link defined at search time when the keywords defined for two nodes are related; it is promoted to a static link if it is used for node adoption, and disappears if it is not used)
(2) Class links (links indicating the relationship between classes; the generic term for the remote links and local links below. In the embodiment described here, a weight is defined for each class link but is not learned)
(2)-1: Remote link (a link indicating the relationship between different classes; shown as a thick straight line in FIG. 8)
(2)-2: Local link (a link indicating the relationship of a class with itself; shown as a thick circle in FIG. 8)
[0056]
In summary, in the example shown in FIG. 8, class links are defined between classes A and B, between classes A and C, from class A to itself, and from class B to itself, but no class link is defined between classes B and C or from class C to itself. Which class links are defined, and what weight is given to each, is left to the discretion of the administrator of the database system. In practice, however, class links are not defined with search convenience alone in mind; the usage conditions, usage contract details, usage fees, access time, and so on of the database for each class are also taken into account. Consequently, management constraints on the database system may mean that a class link is left undefined, or that only a class link with a very small weight is defined. In particular, in a medical case database or the like, only very limited class links may be defined in order to protect patient privacy.
[0057]
In this embodiment, a thesaurus dictionary is prepared for each class link. For example, in the example shown in FIG. 8, a thesaurus dictionary Tab is prepared for the remote link AB, a thesaurus dictionary Tac for the remote link AC, a thesaurus dictionary Taa for the local link AA, and a thesaurus dictionary Tbb for the local link BB. These thesaurus dictionaries are used when defining dynamic links, as described later. Preparing a separate thesaurus dictionary for each class link is highly significant. For example, if class A is a Japanese database, class B is an English database, and class C is a French database, it is reasonable to use an “English-Japanese / Japanese-English thesaurus” as the thesaurus dictionary Tab and a “French-Japanese / Japanese-French thesaurus” as the thesaurus dictionary Tac.
[0058]
<4.2: Definition of Static Link in Distributed System>
After defining class links as shown in FIG. 8, the administrator of the database system defines static links between individual nodes. That is, referring to the keywords associated with each node, the administrator establishes a static link with a predetermined weight between nodes whose keywords are related to each other. Static links are defined in accordance with the class link conditions described above: a static link can be defined between classes for which a class link is defined, but in principle cannot be defined between classes for which no class link is defined. FIG. 9 is a diagram showing a specific example of the static link defined for each node shown in FIG. For example, the static link L1 connects nodes N1 and N2, and is permitted because the local link BB is defined for class B. Similarly, the static link L2 is permitted by the remote link AB, the static link L3 by the local link AA, and the static link L4 by the remote link AC. On the other hand, since no remote link is defined between classes B and C, no static link is defined between, for example, nodes N1 and N7. Also, since no local link is defined for class C, a static link cannot ordinarily be defined between nodes belonging to class C. In this case, however, the static link L5 is exceptionally defined between nodes N6 and N7 at the administrator's discretion. As described above, in the present embodiment it is preferable in principle to define static links according to the conditions indicated by the class links, but when the administrator judges that an exceptional measure is necessary, static links may be defined contrary to this principle as appropriate.
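The permission rule described above can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the names `NODE_CLASS`, `CLASS_LINKS`, and `static_link_permitted`, and the `admin_override` flag standing in for the administrator's exceptional measure, are all assumptions introduced here.

```python
# Hypothetical encoding of the class membership and class links of FIG. 8.
NODE_CLASS = {"N1": "B", "N2": "B", "N3": "A", "N4": "A",
              "N5": "C", "N6": "C", "N7": "C"}

# A class link is stored as an unordered set of class names.
# A local link (a class linked to itself) is a one-element set.
CLASS_LINKS = {
    frozenset({"A", "B"}),  # remote link AB
    frozenset({"A", "C"}),  # remote link AC
    frozenset({"A"}),       # local link AA
    frozenset({"B"}),       # local link BB
}

def static_link_permitted(node_a, node_b, admin_override=False):
    """Return True if a static link between the two nodes is allowed.

    admin_override models the administrator's exceptional measure
    (an assumption of this sketch, e.g. for link L5 between N6 and N7).
    """
    class_pair = frozenset({NODE_CLASS[node_a], NODE_CLASS[node_b]})
    return admin_override or class_pair in CLASS_LINKS
```

With this encoding, the pairs N1-N2, N1-N3, N3-N4, and N4-N5 are permitted, while N1-N7 and N6-N7 are permitted only when `admin_override` is set.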
[0059]
<4.3: Search processing in a distributed system>
Next, for the distributed database system shown in FIG. 9, which is composed of three classes A, B, and C and in which seven nodes N1 to N7 and five static links L1 to L5 are defined, the search processing and learning processing will be described in detail.
[0060]
First, as shown in FIG. 9, assume that node N1 is selected as the first node of interest. The first node of interest is designated by the operator inputting the keyword K1 corresponding to node N1. Once the node of interest is determined, search processing for it is executed. In this embodiment, search processing for a specific node of interest is performed by transmitting a signal having a predetermined signal value from the node of interest to other nodes along static links (or dynamic links, as described later). For this purpose, a signal transmission coefficient indicating the weight is defined for each static link (and dynamic link). Here, it is assumed that a signal transmission coefficient is defined for each link as shown in FIG. 10. The signal transmission coefficients in this embodiment are all expressed as percentages; in the illustrated example they are defined as link L1: 25%, link L2: 50%, link L3: 30%, link L4: 60%, and link L5: 80%. The value of the signal transmission coefficient of each static link is initially defined by the system administrator but, as described later, is corrected by learning as the user uses the system.
[0061]
The search process using node N1 as the node of interest extracts, as candidate nodes, other nodes having at least a certain degree of relationship with node N1. To extract such candidate nodes, a signal having an initial signal value of 100 is transmitted from the node of interest N1 along each link. Each time the signal passes through a link, its signal value is multiplied by the signal transmission coefficient defined for that link. For example, in FIG. 10, in signal transmission from node N1 to node N2, the multiplication 100 × 25% is performed, and the signal reaching node N2 is attenuated to a signal value of 25. Similarly, in signal transmission from node N1 to node N3, the multiplication 100 × 50% is performed, and the signal reaching node N3 is attenuated to a signal value of 50. The signal that has reached node N3 is further transmitted to node N4; in this transmission the multiplication 50 × 30% is performed, and the signal reaching node N4 is attenuated to a signal value of 15. Further, when the signal with signal value 15 is transmitted from node N4 to node N5, the multiplication 15 × 60% is performed, and the signal reaching node N5 is attenuated to a signal value of 9.
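The attenuation described above can be reproduced with a short sketch. The function name `propagate` and the dictionary layout are assumptions made for illustration. The rule of keeping only the strongest signal when a node can be reached by more than one path is also an assumption, since the example of FIG. 10 contains no such case.

```python
from collections import deque

# Signal transmission coefficients in percent, as listed for FIG. 10.
# The node pairs follow the static links L1 to L5 of FIG. 9.
LINKS = {
    ("N1", "N2"): 25,  # L1
    ("N1", "N3"): 50,  # L2
    ("N3", "N4"): 30,  # L3
    ("N4", "N5"): 60,  # L4
    ("N6", "N7"): 80,  # L5
}

def propagate(start, links, initial=100.0):
    """Transmit a signal from `start` along undirected links.

    Returns a dict mapping each reached node to its signal value.
    """
    neighbors = {}
    for (a, b), pct in links.items():
        neighbors.setdefault(a, []).append((b, pct))
        neighbors.setdefault(b, []).append((a, pct))

    signal = {start: initial}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt, pct in neighbors.get(node, []):
            value = signal[node] * pct / 100
            # Assumption: a node keeps only the strongest signal it receives.
            if value > signal.get(nxt, 0.0):
                signal[nxt] = value
                queue.append(nxt)
    return signal

print(propagate("N1", LINKS))
```

Nodes N6 and N7 do not appear in the result because no link connects them to the node of interest N1.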
[0062]
The chart shown in the lower column of FIG. 10 shows signal values of signals transmitted to each node when a signal having a signal value of 100 is given to the node of interest N1. Such a state of signal transmission is similar to a current flowing through an electronic circuit connected by a resistance element. That is, if each link is considered to be a resistance element having a predetermined resistance value (a link having a smaller signal transmission coefficient has a higher resistance value) and the signal value is considered to be a voltage value, the signal attenuation is caused by a voltage drop due to the resistance element. It will be equivalent.
[0063]
A link with a large weight (that is, a link connecting two nodes that are closely related) is given a larger signal transmission coefficient. Signal attenuation across such a link is therefore smaller, and the signal value at the node where the signal arrives is larger. It can thus be said that a node with a larger signal value is more relevant to the node of interest. In this embodiment, therefore, a priority order is defined for the nodes based on the signal value of the arriving signal. The priorities (1), (2), and (3) in the chart shown in the lower column of FIG. 10 are the priorities thus defined. Node N5 would have priority (4), but in this example it is treated as “below the condition” and no priority is defined for it.
[0064]
In the example shown here, the lower limit of the effective signal value is set to 10, and a node at which the transmitted signal has a signal value of 10 or less is excluded from consideration as “below the condition”. A node whose signal value is below the condition is handled in the same way as a node to which no signal was transmitted. Therefore, in the example of FIG. 10, node N5 is treated as having received no signal even though a signal with signal value 9 was transmitted to it, and even if another node were connected by a link further downstream of node N5 (no such node exists in the example of FIG. 10), signal transmission downstream of node N5 would not be performed. In the end, in this example, node N5 is handled in the same way as nodes N6 and N7, to which no signal was transmitted.
[0065]
Thus, in the search process using node N1 as the target node, valid signal values are obtained only for nodes N2, N3, and N4, and these three nodes are extracted as related nodes and presented to the operator as candidate nodes. In FIG. 10, the nodes drawn as white node points are these candidate nodes. Node N5 has a slight relationship, but it is not extracted as a candidate because that relationship is below the condition.
[0066]
The operator adopts a new node of interest from among the presented candidate nodes N2, N3, and N4. At this time, the candidate nodes are presented to the operator in priority order. For example, in the case of FIG. 10, the candidate nodes are presented in the order N3, N2, N4 in accordance with priorities (1), (2), and (3). (In practice, the keywords assigned to each node are displayed on the display in priority order; if not all the keywords fit on one screen, they are displayed by switching screens in priority order.) Presenting candidate nodes by priority allows the degree of relevance to be taken into account when deciding on the adopted node (the new node of interest). That is, the operator can recognize that a candidate node displayed earlier has a higher degree of association and can select it preferentially.
[0067]
In the above example, whether a node is extracted as a related node (candidate node) is determined by whether the signal value of the transmitted signal satisfies the condition for an effective signal value. In addition to setting a condition on the signal value in this way, it is also possible to set a condition on the number of hops H. That is, signal transmission between nodes connected by a single link is defined as H = 1, and the signal transmission process is stopped when H exceeds a predetermined upper limit. For example, in FIG. 10, signal transmission to nodes N2 and N3, which are directly linked to the node of interest N1, corresponds to H = 1, signal transmission to node N4 corresponds to H = 2, and signal transmission to node N5 corresponds to H = 3. Therefore, if the upper limit of H is set to 2, signal transmission with H of 3 or more is stopped. In the example of FIG. 10, signal transmission to node N4 is performed, but signal transmission to the downstream node N5 is not.
[0068]
In practice, it is preferable to apply the signal-value condition and the hop-count condition together as an AND condition. That is, only nodes that can be reached from the node of interest within the predetermined number of hops, and at which the transmitted signal has a signal value satisfying the predetermined condition, are extracted as candidate nodes. Specifically, the search range (the range over which the signal transmission calculation is performed) is first limited by the hop-count condition, the signal transmission calculation is performed only for nodes reachable within the predetermined number of hops, and only nodes that finally obtain a signal value satisfying the predetermined condition are taken as candidate nodes. Imposing conditions on candidate extraction in this way, so that only nodes with a certain degree of relevance become candidates, shortens the search time and so benefits system performance, and is also important for improving the usability of the search function. Since an actual database system contains an enormous number of nodes, presenting even nodes of low relevance as candidates would lengthen the search waiting time and produce too many candidates, reducing usability.
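As an illustration of combining the two conditions, the following sketch expands the search one hop at a time and drops any node whose signal value is 10 or less. The function name `search` and its parameters `min_signal` and `max_hops` are hypothetical, and keeping only the strongest signal per node is an assumption of the sketch rather than something stated in the embodiment.

```python
# Signal transmission coefficients in percent, as listed for FIG. 10.
LINKS = {("N1", "N2"): 25, ("N1", "N3"): 50, ("N3", "N4"): 30,
         ("N4", "N5"): 60, ("N6", "N7"): 80}

def search(start, links, initial=100.0, min_signal=10.0, max_hops=None):
    """Return candidate nodes as (node, signal) pairs, strongest first."""
    neighbors = {}
    for (a, b), pct in links.items():
        neighbors.setdefault(a, []).append((b, pct))
        neighbors.setdefault(b, []).append((a, pct))

    signal = {start: initial}
    frontier, hops = [start], 0
    # Hop-count condition: stop expanding once max_hops is reached.
    while frontier and (max_hops is None or hops < max_hops):
        hops += 1
        next_frontier = []
        for node in frontier:
            for nxt, pct in neighbors.get(node, []):
                value = signal[node] * pct / 100
                # Signal-value condition: a value of min_signal or less is
                # "below the condition", so nothing is sent further downstream.
                if value <= min_signal:
                    continue
                # Assumption: a node keeps only the strongest signal it receives.
                if value > signal.get(nxt, 0.0):
                    signal[nxt] = value
                    next_frontier.append(nxt)
        frontier = next_frontier

    candidates = [(n, v) for n, v in signal.items() if n != start]
    return sorted(candidates, key=lambda item: item[1], reverse=True)

print(search("N1", LINKS))              # signal-value condition only
print(search("N1", LINKS, max_hops=1))  # both conditions, H limited to 1
```

Sorting by signal value yields the presentation order N3, N2, N4 of FIG. 10, and with `min_signal=0.0` and `max_hops=2` node N5 is still excluded, this time by the hop-count condition alone.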
[0069]
<4.4: Search Processing Considering Class Link Weighting>
The search described above is a search that takes into account the weight (signal transmission coefficient) of the static link, but it is also possible to perform a search process that takes into account the weight of the class link. FIG. 11 is a diagram illustrating an example of a search process that considers both the weighting of static links and the weighting of class links. Here, the percentage value shown below each static link L1-L5 is the signal transmission coefficient of each static link, and is exactly the same as the value shown in FIG. On the other hand, the percentage value shown under each class link AA, BB, AB, AC is a signal transmission coefficient of each class link.
[0070]
Here, when signal transmission is performed starting from the node of interest, the product of the signal transmission coefficient of the static link and the signal transmission coefficient of the class link is used. For example, as in the example shown in FIG. 10, consider a case where a signal having an initial signal value of 100 is transmitted from the node of interest N1 along each link. In FIG. 11, in signal transmission from node N1 to node N2, the multiplication 100 × 25% (link L1) × 20% (link BB) is performed, and the signal reaching node N2 is attenuated to a signal value of 5. Similarly, in signal transmission from node N1 to node N3, the multiplication 100 × 50% (link L2) × 80% (link AB) is performed, and the signal reaching node N3 is attenuated to a signal value of 40. The signal that has reached node N3 is further transmitted to node N4; in this transmission the multiplication 40 × 30% (link L3) × 90% (link AA) is performed, and the signal reaching node N4 is attenuated to a signal value of 10.8. Further, when the signal with signal value 10.8 is transmitted from node N4 to node N5, the multiplication 10.8 × 60% (link L4) × 10% (link AC) is performed, and the signal reaching node N5 is attenuated to a signal value of 0.648.
[0071]
The chart shown in the lower column of FIG. 11 shows signal values of signals transmitted to each node when a signal having a signal value of 100 is given to the target node N1. Also in this example, the lower limit condition of the effective signal value is set to 10, and a node whose signal value of the transmitted signal is 10 or less is excluded from consideration as “below the condition”. As a result, the nodes extracted as candidate nodes are only the node N3 and the node N4, and the priority order is defined in this order.
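The figures in this chart can be checked with a small sketch that multiplies each signal by both coefficients. The names `NODE_CLASS`, `STATIC_PCT`, `CLASS_PCT`, `step`, and `signals_along` are illustrative assumptions. The exceptional link L5 is left out because the text does not say which class-link coefficient would apply to it.

```python
# Hypothetical encoding of FIG. 11; the names below are illustrative only.
NODE_CLASS = {"N1": "B", "N2": "B", "N3": "A", "N4": "A", "N5": "C"}

# Static-link coefficients in percent (L1 to L4).
STATIC_PCT = {("N1", "N2"): 25, ("N1", "N3"): 50,
              ("N3", "N4"): 30, ("N4", "N5"): 60}

# Class-link coefficients in percent (BB, AB, AA, AC).
CLASS_PCT = {frozenset({"B"}): 20, frozenset({"A", "B"}): 80,
             frozenset({"A"}): 90, frozenset({"A", "C"}): 10}

def step(value, a, b):
    """Attenuate `value` by the static-link and class-link coefficients."""
    class_pair = frozenset({NODE_CLASS[a], NODE_CLASS[b]})
    return value * STATIC_PCT[(a, b)] / 100 * CLASS_PCT[class_pair] / 100

def signals_along(path, initial=100.0):
    """Signal value at every node along a path that starts at path[0]."""
    values = {path[0]: initial}
    for a, b in zip(path, path[1:]):
        values[b] = step(values[a], a, b)
    return values

signals = {**signals_along(["N1", "N2"]),
           **signals_along(["N1", "N3", "N4", "N5"])}

# Nodes whose signal is 10 or less are "below the condition".
candidates = sorted((n for n, v in signals.items() if n != "N1" and v > 10),
                    key=signals.get, reverse=True)
print(candidates)
```

Node N4 remains a candidate only narrowly (10.8 against a lower limit of 10), which shows how strongly the class link weights set by the administrator can shape the result.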
[0072]
In this way, when a search takes into account both the static link weights and the class link weights, the system administrator can control the tendencies of individual search processes in an integrated manner. For example, if the communication line used to access a specific class tends to be heavily congested, reducing the class link weights for that class makes nodes belonging to it less likely to be extracted as candidates.
[0073]
As already described, the weight of a static link is a learning target, and a link aggregate with different weights is formed for each user. If, on the other hand, the class link weights can be set only by the system administrator, the administrator can manage search processing for the entire system while respecting what each user's learning has produced.
[0074]
<4.5: Learning processing in a distributed system>
Next, specific learning processing in the distributed system will be described. As described above, since the weight of the class link is not a learning target, here, a simple case in which 100% of the weight is uniformly applied to the class link is considered. That is, consider the learning process for the example shown in FIG. 10 instead of the example shown in FIG.
[0075]
Now, as shown in FIG. 10, assume that a search using node N1 as the target node has presented the three nodes N2, N3, and N4 as candidate nodes, and that the operator has adopted node N4 from among them. Node N4 thereby becomes the new target node, and learning is performed in response to this adoption of node N4. Learning is performed by correcting the weights of the static links to be learned; specifically, the signal transmission coefficients of the static links are increased or decreased.
[0076]
The basic policy of the learning process is as follows. First, all paths from the target node to each candidate node are extracted as learning target paths. Then the signal transmission coefficients of the links on the path from the node of interest to the adopted node are increased relative to the signal transmission coefficients of the other links. In particular, in the embodiment described here, the signal transmission coefficients of the links on the path from the node of interest to the adopted node are increased, and the signal transmission coefficients of the links on the other learning target paths are decreased. In summary, among all static links, the following processing is performed:
(1) For links on a learning target path that lie on the path from the target node to the adopted node, the signal transmission coefficient is increased.
(2) For links on a learning target path other than those in (1), the signal transmission coefficient is decreased.
(3) For links that are not on any learning target path, no correction is made.
[0077]
In this example, the percentage value of the signal transmission coefficient is increased by 20 for the links in (1) and decreased by 20 for the links in (2). As shown in FIG. 10, the learning target paths in this example are all paths from the target node N1 to the candidate nodes N2, N3, and N4 indicated by white node points; specifically, the static links L1, L2, and L3 are the learning targets. Of these, the links L2 and L3 on the path from the node of interest N1 to the adopted node N4 are corrected by increasing the signal transmission coefficient by 20, so that their learned signal transmission coefficients become 50% + 20% = 70% and 30% + 20% = 50%, respectively. For the remaining learning target link L1, the signal transmission coefficient is decreased by 20, so that its learned signal transmission coefficient becomes 25% - 20% = 5%. The links L4 and L5, which are not on a learning target path, are not learned, and their signal transmission coefficients remain unchanged. FIG. 12 is a diagram showing the state after a specific learning process has been performed in accordance with this basic policy.
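The correction rule can be sketched as follows. The data layout is hypothetical: `COEFF` keys each link by its unordered pair of end nodes, and `PARENT` records, for each node reached in the search of FIG. 10, the node from which its signal arrived. The function names `path_links` and `learn` are likewise assumptions of this sketch.

```python
# Coefficients in percent before learning (FIG. 10), keyed by end nodes.
COEFF = {
    frozenset({"N1", "N2"}): 25,  # L1
    frozenset({"N1", "N3"}): 50,  # L2
    frozenset({"N3", "N4"}): 30,  # L3
    frozenset({"N4", "N5"}): 60,  # L4
    frozenset({"N6", "N7"}): 80,  # L5
}

# Node -> the node its signal came from when N1 is the node of interest.
PARENT = {"N2": "N1", "N3": "N1", "N4": "N3", "N5": "N4"}

def path_links(parent, node):
    """Links on the path from the node of interest to `node`."""
    links = set()
    while node in parent:
        links.add(frozenset({parent[node], node}))
        node = parent[node]
    return links

def learn(coeff, parent, candidates, adopted, step=20):
    """Return corrected coefficients after `adopted` is chosen."""
    target_links = set()
    for candidate in candidates:
        target_links |= path_links(parent, candidate)
    reinforced = path_links(parent, adopted)

    updated = dict(coeff)
    for link in target_links:
        # Rule (1): raise links toward the adopted node.
        # Rule (2): lower the other learning-target links.
        updated[link] += step if link in reinforced else -step
    # Rule (3): links outside target_links are left untouched.
    return updated

after = learn(COEFF, PARENT, candidates=["N2", "N3", "N4"], adopted="N4")
```

In this sketch `after` holds 5, 70, and 50 for the links L1, L2, and L3, while L4 and L5 keep their original values of 60 and 80.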
[0078]
Next, consider how such learning changes the search process. FIG. 13 is a diagram illustrating the search result when node N1 is again designated as the target node and a search is performed with the new signal transmission coefficients defined by the learning described above. The search operation shown in FIG. 13 is exactly the same as that shown in FIG. 10, but a different search result is obtained because the signal transmission coefficients of the links have been corrected. That is, when a signal having an initial signal value of 100 is transmitted from the node of interest N1 along each link, the multiplication 100 × 5% is performed in signal transmission from node N1 to node N2, and the signal reaching node N2 is attenuated to a signal value of 5 (below the condition). In signal transmission from node N1 to node N3, on the other hand, the multiplication 100 × 70% is performed, and a signal with a signal value of 70 is obtained at node N3. In transmission from node N3 to node N4, the multiplication 70 × 50% is performed, and a signal with a signal value of 35 is obtained at node N4. Further, when the signal with signal value 35 is transmitted from node N4 to node N5, the multiplication 35 × 60% is performed, and a signal with a signal value of 21 is obtained at node N5.
[0079]
The chart shown in the lower column of FIG. 13 shows signal values of signals transmitted to each node when a signal having a signal value of 100 is given to the node of interest N1. When this chart is compared with the chart shown in the lower column of FIG. 10, it can be seen that there is a change in the combination and priority order of nodes extracted as candidate nodes. That is, in the search shown in FIG. 10 before learning, the node N2 is a candidate node, whereas in the search shown in FIG. 13 after learning, the node N5 is a candidate node instead of the node N2. This means that the degree of association between the nodes N1 and N4 is increased by learning.
[0080]
If the operator again adopts node N4 after the search shown in FIG. 13, the signal transmission coefficients of links L2 and L3 are further increased and, conversely, the signal transmission coefficients of links L1 and L4 are decreased. Link L4 was not a learning target in the previous search, but in this search node N5 is a candidate node, so link L4 also becomes a learning target. Since link L4 is not on the path from the target node N1 to the adopted node N4, however, it is learned in the direction of reducing its signal transmission coefficient.
[0081]
In this way, every time a user performs a search for a given node of interest and adopts a new node of interest from among the candidate nodes presented, learning is performed on the links on the learning target paths. That is, the weights of the links used as the path from the target node to the adopted node are increased, and the weights of the other links on the learning target paths are decreased. Learning that increases the weights of frequently used paths in this way builds a link aggregate that is easy for each user to use.
[0082]
In the present embodiment, predetermined upper and lower limits are set for the signal transmission coefficient so that no increase beyond the upper limit or decrease beyond the lower limit is made. For example, if the upper limit is set to 150% and the lower limit to 1%, increases are applied only up to 150% and decreases only down to 1%. Of course, a static link whose signal transmission coefficient has fallen to 1% or less may instead be judged to have been extinguished.
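A minimal sketch of the limits, assuming the example values of 150% and 1%. The function names `corrected` and `is_extinguished` are hypothetical, and treating extinction as a separate check is one possible reading of the text.

```python
def corrected(pct, delta, lower=1, upper=150):
    """Apply a learning correction, keeping the result within the limits."""
    return max(lower, min(upper, pct + delta))

def is_extinguished(pct, lower=1):
    """Optional rule: a link at or below the lower limit may be dropped."""
    return pct <= lower

print(corrected(50, +20))   # ordinary increase
print(corrected(140, +20))  # capped at the upper limit
print(corrected(5, -20))    # floored at the lower limit
```

For instance, link L1, which stands at 5% after the first learning step, would be held at 1% rather than becoming negative if it were decreased by 20 again.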
[0083]
In this embodiment, no direction is defined for link weights. For example, the signal transmission coefficient indicating the weight of link L2 is used in common both when a signal is transmitted from node N1 to node N3 and when a signal is transmitted from node N3 to node N1. Ordinarily, when node N3 is highly relevant as viewed from node N1, node N1 is also highly relevant as viewed from node N3, so leaving the link without a direction causes no problem. However, where the relevance of a second node as viewed from a first node is not necessarily the same as the relevance of the first node as viewed from the second, the link weight may be given a direction. In that case, for example, the signal transmission coefficient in the direction from node N1 to node N3 and the signal transmission coefficient in the direction from node N3 to node N1 may each be defined separately as the weight of link L2.
[0084]
<4.6: Presenting Candidate Nodes Considering Node Weighting>
So far, the learning process for correcting the weights of static links has been described. If a weight is also defined for each node and made a target of the learning process, candidate nodes can be presented to the operator in a priority order that takes the weight of each candidate node into account.
[0085]
Let me illustrate this with a specific example. Here, it is assumed that the frequency coefficient indicating weighting is defined for all nodes, and the frequency coefficients of all nodes are set to 100% in an initial state where learning is not performed. The frequency coefficient defined for each node does not affect the signal transmission process, but is used as a parameter for determining priority when the node is extracted as a candidate node. When such a frequency coefficient is defined, the priority order of candidate nodes is determined by the method shown in the chart of FIG. 14 in the search process shown in FIG. That is, the priority order of a particular node is determined based on the product of the signal value of the signal transmitted to that node and the frequency coefficient of that node. In the example shown in FIG. 14, since learning has not yet been performed, the frequency coefficient of all nodes is 100%, and the product obtained by multiplying the frequency coefficient is the same value as the original signal value.
[0086]
Now, consider a case where the search shown in FIG. 10 has presented the three candidate nodes N3, N2, and N4 to the operator in this order, and the operator has adopted node N4. In this case, learning that corrects the signal transmission coefficient of each link is performed as described above; when weights are also defined for nodes, learning that corrects the node weights is performed as well. Specifically, the following learning process may be performed:
(1) For the adopted node, the frequency coefficient is increased.
(2) For nodes on a learning target path other than the node in (1), the frequency coefficient is decreased.
(3) For nodes not on any learning target path, no correction is made.
In short, the frequency coefficient indicating the weight of a node is a parameter indicating how often the node has been adopted in the past.
[0087]
Here, consider a specific example in which the frequency coefficient of the adopted node is increased by multiplying it by 1.5, and the frequency coefficients of the other nodes on the learning target paths are decreased by multiplying them by 0.7. Then, when candidate node N4 is adopted in the search shown in FIG. 10, the frequency coefficient of the adopted node N4 is corrected to 100% × 1.5 = 150%, while the frequency coefficients of the other nodes N2 and N3 on the learning target paths are corrected to 100% × 0.7 = 70%. Since nodes N5, N6, and N7 are not on a learning target path, their frequency coefficients are not corrected.
[0088]
As a result of the selection of the candidate node N4, learning for the node is performed, and learning for the link is also performed as described above, and the signal transmission coefficient of the link becomes a value as shown in the upper column of FIG. Considering the search results when the search is performed with the node N1 as the target node again after such learning is performed, the frequency coefficient defined for each node has no effect on the signal transmission process. Therefore, the signal values of the signals obtained at the nodes N1 to N5 are as shown in the lower column of FIG. However, the priority order when the candidate nodes N3, N4, and N5 are presented to the operator is determined based on the product of the signal value and the frequency coefficient. This product is as shown in the chart of FIG. 15. Eventually, in the search process shown in FIG. 13, the priority order of the candidate nodes is determined by the method shown in the chart of FIG.
[0089]
When the priority order shown in FIG. 15 is compared with that shown in the lower column of FIG. 13, it can be seen that priorities (1) and (2) have been switched. That is, when the priority order is determined in consideration of the node weights, the adopted node N4 is ranked above node N3, which served merely as a passing node, so that the result of learning is reflected more strongly in the order of presentation.
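The switch in priority can be checked with a short sketch that applies the frequency-coefficient learning and then ranks candidates by the product of signal value and frequency coefficient. The function names `learn_frequency` and `prioritize` are assumptions. The multipliers 1.5 and 0.7 are written as 150 and 70 percent to keep the arithmetic exact, and the node of interest N1 is left uncorrected because the text lists only N2 and N3 as the other nodes on the learning target paths.

```python
def learn_frequency(freq, adopted, other_path_nodes, up_pct=150, down_pct=70):
    """Return frequency coefficients (percent) after one adoption."""
    updated = dict(freq)
    updated[adopted] = updated[adopted] * up_pct / 100
    for node in other_path_nodes:
        updated[node] = updated[node] * down_pct / 100
    return updated

def prioritize(signal, freq):
    """Rank candidates by signal value times frequency coefficient."""
    score = {node: signal[node] * freq[node] / 100 for node in signal}
    return sorted(score.items(), key=lambda item: item[1], reverse=True)

# All nodes start at 100% before any learning.
freq = {node: 100.0 for node in ("N1", "N2", "N3", "N4", "N5", "N6", "N7")}
freq = learn_frequency(freq, adopted="N4", other_path_nodes=["N2", "N3"])

# Signal values obtained in the search after link learning (FIG. 13).
signals_after_learning = {"N3": 70.0, "N4": 35.0, "N5": 21.0}
ranking = prioritize(signals_after_learning, freq)
print(ranking)
```

By signal value alone the order would be N3, N4, N5, whereas the products 49 for N3 and 52.5 for N4 place the adopted node N4 first.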
[0090]
<4.7: Search processing using dynamic links>
The search example that has been described so far for the distributed database system is a search using an existing static link. Here, a search example using the dynamic link described in §3 will be described.
[0091]
For example, consider the case where the static link L2 between the nodes N1 and N3 is not defined in the example shown in FIG. In a search using only a static link, if a search using the node N1 as a target node is performed in this state, only the node N2 is searched as a candidate node. As already described, a static link is primarily a link set by a system administrator, and is not necessarily associated with nodes that are useful to individual users. Therefore, by using the following method, a temporary dynamic link is defined between specific nodes having a relationship between keywords, and a search is performed.
[0092]
The basic method for defining dynamic links is as described in §3. Here, the specific condition for generating the dynamic link L2 between the nodes N1 and N3, which form an unconnected portion because no existing link is present, will be described with reference to FIG. 16. Whether the unconnected portion between the nodes N1 and N3 should be connected by a dynamic link is determined from the result of evaluating the relevance between the keywords K1 and K3 defined for the nodes N1 and N3, respectively. That is, if this evaluation result satisfies a predetermined condition, the unconnected portion is temporarily connected by the dynamic link L2. Specifically, for example, when several equivalent keywords are defined as each of the keywords K1 and K3, the condition may be regarded as satisfied if any equivalent keyword constituting the keyword K1 matches any equivalent keyword constituting the keyword K3. Of course, a complete match of the keywords need not be an indispensable condition. For example, the condition may also be regarded as satisfied when 2/3 of the character strings match, as with “high blood pressure” and “blood pressure value”.
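One way to read the matching condition above is sketched below. The text does not specify how the matched fraction of a character string is measured; this sketch assumes it is the length of the longest common substring divided by the length of the longer keyword. The function names and the default threshold of 2/3 are illustrative assumptions.

```python
def common_substring_ratio(a, b):
    """Longest common substring length divided by the longer keyword's length."""
    if not a or not b:
        return 0.0
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                cur[j] = prev[j - 1] + 1   # extend the run ending at the previous characters
                best = max(best, cur[j])
        prev = cur
    return best / max(len(a), len(b))


def keywords_related(equivalents_1, equivalents_3, min_ratio=2 / 3):
    """True if any equivalent keyword of one node matches any of the other's
    completely or to at least min_ratio of its characters."""
    return any(
        common_substring_ratio(k1, k3) >= min_ratio
        for k1 in equivalents_1
        for k3 in equivalents_3
    )
```

For three-character keywords that share two consecutive characters, `common_substring_ratio` yields exactly 2/3, which corresponds to the partial-match case mentioned in the text.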
[0093]
In this embodiment, the thesaurus dictionary prepared for each class link is used to evaluate the relevance of two keywords. As shown in FIG. 8, a thesaurus dictionary Tab is prepared for the remote link AB. Therefore, the relevance between the nodes N1 and N3 is evaluated with reference to the thesaurus dictionary Tab. For example, if the words “high blood pressure”, “high pressure”, “blood pressure abnormality”, and “arteriosclerosis” are all defined as synonyms in the thesaurus dictionary Tab, these synonyms can be treated as the same keyword when evaluating the relevance between a node in class A and a node in class B.
[0094]
In short, for the dynamic link L2 to be defined between the nodes N1 and N3 of the unconnected portion shown in FIG. 16, the relevance between the keyword K1 of the node N1 and the keyword K3 of the node N3 must be evaluated with reference to the thesaurus dictionary Tab, and the evaluation result must satisfy a predetermined condition.
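The use of the thesaurus dictionary can be illustrated with a small sketch in which every synonym is mapped to one representative word before comparison. The data structure and the function names are assumptions; the synonym group is the one given as an example in the text.

```python
def build_synonym_index(synonym_groups):
    """Map every word to a representative word of its synonym group."""
    index = {}
    for group in synonym_groups:
        representative = group[0]
        for word in group:
            index[word] = representative
    return index


def related_via_thesaurus(keyword_1, keyword_3, index):
    """Treat two keywords as the same if the thesaurus lists them as synonyms."""
    return index.get(keyword_1, keyword_1) == index.get(keyword_3, keyword_3)


# Thesaurus dictionary Tab for the remote link AB (example words from the text).
tab = build_synonym_index([
    ["high blood pressure", "high pressure",
     "blood pressure abnormality", "arteriosclerosis"],
])

print(related_via_thesaurus("high pressure", "arteriosclerosis", tab))
print(related_via_thesaurus("high pressure", "heart failure", tab))
```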
[0095]
In FIG. 16, the portion between the nodes N1 and N3 is not the only unconnected portion related to the node N1. Since the node N2 is the only node connected to the node N1 by an existing static link, the portions between the node N1 and each of the other nodes, that is, between the nodes N1 and N4, N1 and N5, N1 and N6, and N1 and N7, are also unconnected portions related to the node N1. However, in this embodiment, the generation of dynamic links is limited by imposing the condition that “dynamic links can be defined only between classes for which a class link is defined”. Therefore, in the above example, because the class link AB exists between the classes A and B, a dynamic link may be defined between the nodes N1 and N3 and between the nodes N1 and N4, among the unconnected portions related to the node N1, provided that the keyword evaluation result satisfies the condition. By contrast, as shown in FIG. 17, for example, a dynamic link may not be defined between the nodes N1 and N7, because no class link exists between the classes B and C. For the same reason, dynamic links may not be defined between the nodes N1 and N5, N1 and N6, N2 and N5, N2 and N6, N2 and N7, N5 and N6, or N5 and N7.
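The class-link restriction can be expressed as a simple lookup. The sketch below includes only the class memberships and class links that the text states explicitly (N1 and N2 in class B, N3 and N4 in class A, N7 in class C, remote link AB, local link BB); the classes of N5 and N6 and any other class links are omitted because this section does not state them. All identifiers are illustrative.

```python
# Class membership of the nodes named in the text (N5 and N6 omitted: not stated here).
NODE_CLASS = {"N1": "B", "N2": "B", "N3": "A", "N4": "A", "N7": "C"}

# Class links named in the text: remote link AB and local link BB.
CLASS_LINKS = {frozenset({"A", "B"}), frozenset({"B"})}


def class_link_exists(node_a, node_b):
    """True if a class link is defined between the classes of the two nodes."""
    pair = frozenset({NODE_CLASS[node_a], NODE_CLASS[node_b]})
    return pair in CLASS_LINKS


print(class_link_exists("N1", "N3"))  # classes B and A
print(class_link_exists("N1", "N7"))  # classes B and C
```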
[0096]
A dynamic link temporarily defined at the time of a search is given a predetermined weight, that is, a signal transmission coefficient. Therefore, some signal transmission coefficient is given to the dynamic link L2 shown in FIG. 16, and its signal transmission function is exactly the same as that of the static links. The signal transmission coefficient given to a defined dynamic link may be fixed, for example, at “50% uniformly”, or a coefficient corresponding to the evaluation value of the keyword relevance may be given. For example, the signal transmission coefficient of the dynamic link L2 shown in FIG. 16 may be set to “100% when the keywords K1 and K3 match completely, and 66% when 2/3 of the character strings constituting the keywords match”. In addition, when equivalent keywords are used, a weight may be set for each equivalent keyword, and the value of the signal transmission coefficient may be determined according to the weight of the matched equivalent keyword.
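The two ways of assigning a coefficient described above can be sketched as a single helper. The function name and its parameters are assumptions; truncating the percentage to an integer is chosen only so that a 2/3 match yields the 66% mentioned in the text.

```python
def dynamic_link_coefficient(match_ratio, uniform=None):
    """Signal transmission coefficient (in percent) for a new dynamic link.

    If `uniform` is given, every dynamic link receives that fixed value;
    otherwise the coefficient follows the keyword match ratio (0.0 to 1.0).
    """
    if uniform is not None:
        return uniform
    return int(match_ratio * 100)


print(dynamic_link_coefficient(1.0))              # complete match
print(dynamic_link_coefficient(2 / 3))            # 2/3 of the characters match
print(dynamic_link_coefficient(0.9, uniform=50))  # fixed coefficient
```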
[0097]
The handling, at the time of learning, of a dynamic link temporarily generated in this way is as already described in §3. That is, when the dynamic link was used as a path from the target node to the adopted node, it is promoted to a static link and its signal transmission coefficient is corrected upward. On the other hand, if it was not used as such a path, it simply disappears. For example, as shown in FIG. 16, consider a case where the dynamic link L2 is defined at the time of a search using the node N1 as the target node. In this case, when the candidate node N4 is adopted, for example, the dynamic link L2 remains as the static link L2, and its signal transmission coefficient is corrected upward. However, if the node N2, which is another candidate node, is adopted, the dynamic link L2 simply disappears.
[0098]
In short, by temporarily defining dynamic links having the above-described features at the time of a search, the range of nodes to be searched can be expanded. Moreover, by leaving a dynamic link that was involved in the user's selection as a static link, a link that is useful for each user can be newly added.
[0099]
§5. Operation procedure of database system according to this embodiment
Subsequently, an operation procedure of the database system according to the present embodiment will be described based on a flowchart.
[0100]
<5.1 Usage Procedure of Database System According to this Embodiment>
FIG. 18 is a flowchart showing the procedure for using this database system, focusing on user operations. First, in step S1, the user determines an initial target node. Any method may be used to determine the initial target node. For example, when the operator enters some keyword, the system may display candidate nodes having keywords identical or related to the entered keyword (actually, it displays their keywords), and the operator may adopt a specific node from among the displayed keywords, thereby determining the initial target node. If a class is specified in advance and the initial target node is selected from the nodes in that class, the initial target node can be determined from a narrower set of candidates.
[0101]
In the subsequent step S2, it is determined whether or not to use data associated with the node of interest. If the operator wants to use (view) the current node of interest, an instruction to that effect may be given. When an instruction to use data is given from the operator, the process proceeds from step S2 to step S3, and data corresponding to the node of interest is provided. That is, in the system shown in FIG. 1, corresponding data is extracted from the database 1 based on the keyword defined for the node of interest in the node / link aggregate 2 and presented to the operator 3. If necessary, data can be used repeatedly.
[0102]
In a succeeding step S4, it is determined whether or not a search for the node of interest is performed. If the operator does not perform a search, the system use procedure is temporarily terminated. If the operator accesses the database system again to check another content, the procedure from step S1 may be started again. If the operator wants to search for the current node of interest, an instruction to that effect is given. When a search instruction is given from the operator, the process proceeds from step S4 to step S5, and a search process is performed. The detailed procedure of the search process in step S5 will be described later with reference to the flowchart of FIG. By this search processing, related nodes determined to have a certain degree of relationship with the node of interest are extracted, and these related nodes are presented as candidate nodes.
[0103]
The candidate nodes are actually presented by displaying the keywords defined for each candidate node on the display. The operator selects a keyword that seems to be most relevant to the data being searched for while viewing the display of the keyword. That is, candidate node selection processing in step S6 is performed. Subsequently, a learning process is performed in step S7 based on the operator's selection action. The detailed procedure of the learning process in step S7 will be described later with reference to the flowchart of FIG.
[0104]
When the learning process is completed in this way, in step S8, an update process using the selected node as a new target node is performed, and the process returns to the procedure from step S2.
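The loop of steps S1 to S8 can be sketched as follows. To keep the example runnable without user input, the operator's decisions are supplied as a scripted list of actions; this scripting, the function name `run_session`, and the stub callables in the demonstration are all assumptions made for illustration.

```python
def run_session(initial_node, actions, search, learn, provide):
    """Drive the S1-S8 loop from a scripted list of operator actions.

    actions: sequence of ("use",), ("search", adopted_node) or ("quit",) tuples.
    """
    target = initial_node                    # S1: initial node of interest
    log = []
    for action in actions:
        kind = action[0]
        if kind == "use":                    # S2 -> S3: provide data for the node
            log.append(("data", provide(target)))
        elif kind == "search":               # S4 -> S5: search and present candidates
            candidates = search(target)
            adopted = action[1]              # S6: operator adopts one candidate
            if adopted not in candidates:
                raise ValueError(f"{adopted} was not presented as a candidate")
            learn(target, adopted)           # S7: learning based on the adoption
            target = adopted                 # S8: adopted node becomes node of interest
            log.append(("target", target))
        else:                                # operator ends the session
            break
    return target, log


# Stub components standing in for the real search, learning, and data provision.
neighbours = {"N1": ["N2", "N3"], "N3": ["N4"]}
learned = []
final, history = run_session(
    "N1",
    [("search", "N3"), ("use",), ("search", "N4"), ("quit",)],
    search=lambda node: neighbours.get(node, []),
    learn=lambda old, new: learned.append((old, new)),
    provide=lambda node: f"data for {node}",
)
print(final, history, learned)
```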
[0105]
In short, by repeating the procedure of the flowchart shown in FIG. 18, the operator can perform searches repeatedly and change the node of interest one after another. Whenever necessary, the operator can give an instruction to use data (viewing, printing, and so on) in step S2 and use the data corresponding to the current node of interest. The feature of this search method is that it searches not for data directly but for related keywords. As already described, presenting a specific node to the operator actually means displaying the keyword defined for that node on the display. Therefore, from the operator's viewpoint, changing the node of interest amounts to changing the keyword. Such a search method is effective when it is unclear which keyword should be used to find the data ultimately sought. In other words, the operator first gives a keyword that seems relevant in order to determine the initial target node in step S1, then has the search process of step S5 executed, and repeats the task of selecting, in step S6, the most relevant keyword (adopting a node) from the newly presented keywords.
[0106]
Specifically, for example, consider searching for past case data similar to the case of a specific patient. Assume that the operator designates the case database of a specific hospital as a class, inputs the keyword “high blood pressure”, and performs a search. As a result, suppose that keywords such as “blood pressure abnormality”, “juvenile hypertension”, “senile hypertension”, and “hypertensive retinopathy” are presented as candidates (inside the system, the node for which the keyword “high blood pressure” is defined becomes the target node, and the nodes for which keywords such as “blood pressure abnormality”, “juvenile hypertension”, and “senile hypertension” are defined are presented as candidate nodes). Here, for example, because the patient is an elderly person, the operator adopts the keyword “senile hypertension” (inside the system, the candidate node for which the keyword “senile hypertension” is defined is adopted and becomes the new node of interest). If necessary, the operator can browse the data corresponding to the keyword “senile hypertension” (the data corresponding to the node of interest is provided).
[0107]
Suppose that a further search based on the keyword “senile hypertension” presents keywords such as “arteriosclerosis”, “fundus hemorrhage”, “renal dysfunction”, and “heart failure” as candidates. At this time, if the patient has abnormal renal function, the operator can adopt the keyword “renal dysfunction” and, if necessary, browse the data corresponding to this keyword (for example, case data of patients with renal dysfunction).
[0108]
As described above, in the database system according to the present embodiment, when a search is performed with a provisional keyword, the system presents other keywords related to the given keyword as candidates. The operator can therefore search flexibly even without coming up with the right keyword at the outset. In addition, since the system itself has a learning function, the more it is used, the more preferentially it presents keywords that are likely to be adopted, and usability improves further.
[0109]
Not only character strings but also images can be used as keywords. For example, if the data associated with a certain node is represented by a simple icon and this icon is defined as the keyword for that node, the icons serving as keywords can be arranged and presented to the operator when the search results are shown on the display. In this case, the operator's selection can be performed by the simple operation of clicking an icon with a mouse pointer or the like.
[0110]
In the flowchart shown in FIG. 18, the learning process of step S7 is performed unconditionally after a node is adopted in step S6. Performing the learning process unconditionally in this way may cause unintended learning. For example, consider a case where the operator adopted a node in step S6 but, because the adoption was not what the operator intended, neither used the data nor performed a new search for the adopted node. In such a case, the adoption itself was contrary to the operator's intention, and it is not preferable for the learning process of step S7 to be performed on the basis of such an adoption. To deal with this adverse effect, the learning process may be deferred after the adoption in step S6 and performed only when data use (step S3) or the search process (step S5) is carried out with the adopted node as the new node of interest.
[0111]
<5.2: Search Processing Procedure>
FIG. 19 is a flowchart showing a detailed procedure of the search process in step S5 in the flowchart shown in FIG. Here, the procedure of this search process will be described for a specific example in which a static link as shown in FIG. 16 is defined. First, in step S11, the hop count H is set to an initial value 1. Subsequently, in step S12, one starting node is extracted. Initially, the node of interest N1 is extracted as the starting node. In the next step S13, one target node for this starting node N1 is extracted. Theoretically, in this step S13, all nodes other than the origin node are extracted in order as target nodes. In the example shown in FIG. 16, all the nodes N2 to N7 other than the starting node N1 are extracted in order as target nodes. Since only one target node is extracted in step S13, it is assumed here that the node N2 is extracted as the target node according to the numerical order.
[0112]
When the starting node N1 and the target node N2 have thus been extracted, various determinations are made for these two nodes in steps S14 to S17. First, in step S14, it is determined whether or not a class link exists between the two nodes. If there is no class link between them, the process proceeds to step S19. In the example shown in FIG. 16, a class link (local link) BB exists between the nodes N1 and N2 (to be exact, between class B, to which the node N1 belongs, and class B, to which the node N2 belongs), so the process proceeds to step S15. In step S15, it is determined whether a static link exists between the two nodes. In the example shown in FIG. 16, the static link L1 exists between the nodes N1 and N2, so the process proceeds from step S15 to step S19.
[0113]
Subsequently, the process returns from step S19 to step S13, the target node N3 is extracted this time, and the same determinations are made for the starting node N1 and the target node N3. First, in step S14, a class link (remote link) AB exists between the nodes N1 and N3 (to be precise, between class B, to which the node N1 belongs, and class A, to which the node N3 belongs), so the process proceeds to step S15. Since there is no static link between the nodes N1 and N3, the process proceeds further to step S16, and since there is no dynamic link between the two nodes either, it proceeds to step S17, where the relevance of the keywords of the two nodes is evaluated. Here, if the result of evaluating the relevance between the keyword K1 defined for the node N1 and the keyword K3 defined for the node N3 satisfies the predetermined condition, the process proceeds to step S18, and the dynamic link L2 indicated by a broken line in FIG. 16 is defined between the nodes N1 and N3.
[0114]
In short, the determinations of steps S14 to S17 decide whether a dynamic link should be defined between the two nodes. As described above, one of the conditions for defining a dynamic link is that “a class link exists between the classes to which both nodes belong”. If this condition is not satisfied, the process jumps from step S14 to step S19. In addition, since a dynamic link is defined only for an unconnected portion where no existing static link is present, the process jumps from step S15 to step S19 if a static link already exists between the two nodes. Similarly, when a dynamic link has already been defined (for example, when a later pass takes the node N3 as the starting node and the node N1 as the target node, the dynamic link L2 has already been defined between the two nodes), the process jumps from step S16 to step S19. Thus, when the keywords of the two nodes satisfy the relevance condition in step S17, neither a static link nor a dynamic link has yet been defined between the two nodes, so a dynamic link is defined in step S18.
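The chain of determinations in steps S14 to S18 can be sketched as a single function. The class-link test and the keyword relevance test are passed in as callables so that the sketch stays independent of how they are implemented; the function name and the representation of links as unordered node pairs are assumptions.

```python
def maybe_define_dynamic_link(start, target, class_link_exists,
                              static_links, dynamic_links, keywords_related):
    """Define a dynamic link between start and target if steps S14-S17 all pass.

    static_links / dynamic_links: sets of frozenset({node_a, node_b}).
    Returns True if a new dynamic link was added (step S18).
    """
    pair = frozenset({start, target})
    if not class_link_exists(start, target):   # S14: no class link -> S19
        return False
    if pair in static_links:                   # S15: static link already exists -> S19
        return False
    if pair in dynamic_links:                  # S16: dynamic link already defined -> S19
        return False
    if not keywords_related(start, target):    # S17: keywords not sufficiently related
        return False
    dynamic_links.add(pair)                    # S18: define the dynamic link
    return True
```

Because links are stored as unordered pairs, the later pass with N3 as the starting node and N1 as the target node finds the link already defined and skips it, as described above.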
[0115]
In the flowchart shown in FIG. 19, the target nodes are extracted one by one in step S13, and step S14 determines whether a class link exists for the extracted target node. In the actual arithmetic processing, however, it is more efficient to first extract one target class and determine whether a class link exists for that class. If a class link exists, the target nodes are extracted one by one from the target class and the processing of step S15 and subsequent steps is performed; if no class link exists, no target node is extracted from that class.
[0116]
Thus, when the process of extracting all the target nodes N2 to N7 has been completed for the first starting node N1, the process proceeds from step S19 to step S20, where it is determined whether all starting nodes have been extracted. A “starting node” is a node that serves as a starting point of signal transmission, and initially the only starting node is the node N1, which is the node of interest. Accordingly, step S20 determines that all starting nodes have been extracted, and the process proceeds to step S21.
[0117]
In step S21, a signal for one hop is transmitted from the origin node, and a process in which a node having a signal value equal to or higher than a predetermined level is set as a new origin node is performed. In the example shown in FIG. 16, if signal transmission for one hop is performed from the origin node N1, signal transmission from the node N1 to the node N2 via the existing static link L1 and via the dynamic link L2 just defined Two types of signal transmission are performed: signal transmission from the node N1 to the node N3. As already described, the signal value of the signal at the origin node is attenuated by the signal transmission coefficient of the link. Therefore, among the nodes to which signal transmission has been performed, only a node having a signal value of a predetermined level or higher (in the example described in §4, a signal value of 10 or higher) is set as a new starting node. Further, when there is a node whose signal value is less than 10, it is handled as if there was no signal transmission to that node.
[0118]
In the subsequent step S22, it is determined whether or not the hop count H has reached the upper limit value Hmax. For example, if Hmax = 2 is set, H = 1 at this point, so the process proceeds from step S22 to step S23, the hop count H is increased by 1, and the process from step S12 is repeated.
[0119]
In the example shown in FIG. 16, the nodes N2 and N3, which received the signal transmitted for one hop from the first origin node N1, become the origin nodes of the second round. Assume that, in step S12, the origin node N2 is extracted first from these two origin nodes. In the subsequent step S13, one target node is extracted for the origin node N2, and whether a dynamic link can be defined is determined in steps S14 to S17. Here, it is assumed that no dynamic link is defined between the origin node N2 and any target node (in the illustrated example, a dynamic link could be defined between the nodes N2 and N4 because the remote link AB exists, but it is assumed here that the keywords K2 and K4 are not sufficiently related). Then, the process returns from step S20 to step S12. This time the origin node N3 is extracted, and the possibility of defining a dynamic link is determined in the same manner; it is assumed, however, that no new dynamic link is defined.
[0120]
Since the extraction of all the origin nodes N2 and N3 is thus completed, the process proceeds from step S20 to step S21. In step S21, a signal is transmitted for one hop from each of the two origin nodes N2 and N3 (no signal is transmitted again over a link that has already been used for signal transmission, so no signal is sent back from the node N2 or N3 to the node N1), and each node whose signal value is equal to or higher than the predetermined level is set as a new origin node. In the example shown in FIG. 16, no further signal transmission takes place from the origin node N2. From the other origin node N3, on the other hand, a signal is transmitted for one hop to the node N4 via the existing static link L3. If the signal value of the signal reaching the node N4 is equal to or higher than the predetermined level, the node N4 becomes a new origin node.
[0121]
In the subsequent step S22, it is determined whether or not the hop count H has reached the upper limit value Hmax. Here, since Hmax = 2 has been set, the hop count H has reached the upper limit value, and the process proceeds to step S24. In step S24, all nodes on the paths (the routes along which the signal flowed) are extracted as related nodes. In the example shown in FIG. 16, signals were transmitted to the nodes N2, N3, and N4 using the links L1, L2, and L3 as paths, so the nodes N2, N3, and N4 are extracted as related nodes. As described above, these related nodes are presented to the operator as candidate nodes. Even if a signal was transmitted to a node, a node whose signal value did not reach the predetermined level is treated as not having received the signal, and is therefore not extracted as a related node.
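The hop-limited signal transmission of steps S21 to S24 can be sketched as below, with dynamic-link definition left out (any dynamic links are assumed to be already included in `links`). The threshold of 10 follows the example value mentioned for §4; the initial signal value of 100, the coefficient values, the rule of keeping the larger value when a node is reached twice, and the absence of self-loops are assumptions.

```python
def search_related(node_of_interest, links, initial_signal=100.0,
                   threshold=10.0, h_max=2):
    """Return (related_nodes, signal_values) after at most h_max hops.

    links: dict mapping frozenset({node_a, node_b}) -> coefficient in [0, 1].
    """
    signal = {node_of_interest: initial_signal}
    used = set()                        # links already used for signal transmission
    origins = [node_of_interest]
    related = []
    for _hop in range(h_max):           # S22/S23: stop when the hop limit is reached
        new_origins = []
        for origin in origins:          # S21: transmit one hop from each origin node
            for link, coeff in links.items():
                if origin not in link or link in used:
                    continue
                (other,) = link - {origin}
                value = signal[origin] * coeff
                if value < threshold:   # treated as if no signal was transmitted
                    continue
                used.add(link)
                if other not in signal:
                    signal[other] = value
                    related.append(other)
                    new_origins.append(other)
                else:                   # assumption: keep the larger signal value
                    signal[other] = max(signal[other], value)
        origins = new_origins
    return related, signal              # S24: nodes on the signal paths


# Links of the FIG. 16 example with hypothetical coefficients.
links = {
    frozenset({"N1", "N2"}): 0.5,   # static link L1
    frozenset({"N1", "N3"}): 0.5,   # dynamic link L2
    frozenset({"N3", "N4"}): 0.6,   # static link L3
}
related, signal = search_related("N1", links)
print(related, signal)
```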
[0122]
<5.3: Procedure of learning process>
FIG. 20 is a flowchart showing a detailed procedure of the learning process in step S7 in the flowchart shown in FIG. First, in step S31, all paths along which a signal flowed during the search process are extracted as learning target paths (even if a signal was transmitted, a node whose signal value did not reach the predetermined level is treated as not having received the signal, so the path to that node is not a learning target path). In the example shown in FIG. 16 described above, when the candidate nodes N2, N3, and N4 are found by the search using the node N1 as the node of interest, the links L1, L2, and L3 are extracted as learning target paths.
[0123]
Subsequently, in step S32, the following process is performed for the links on the path from the node of interest to the adopted node among the learning target paths:
(1) In the case of a static link, the signal transmission coefficient is increased.
(2) In the case of a dynamic link, the signal transmission coefficient is increased and the dynamic link is promoted to a static link.
In the example shown in FIG. 16 described above, when the candidate node N4 is adopted, the links on the path from the node of interest N1 to the adopted node N4 are learned as follows: for the dynamic link L2, the signal transmission coefficient is increased and the link is promoted to a static link, and for the static link L3, the signal transmission coefficient is increased.
[0124]
On the other hand, in step S33, the following process is performed for the links on the other paths among the learning target paths:
(1) In the case of a static link, the signal transmission coefficient is decreased.
(2) In the case of a dynamic link, the link itself is deleted.
In the example shown in FIG. 16 described above, learning that decreases the signal transmission coefficient is performed for the static link L1.
[0125]
Finally, in step S34, learning regarding the frequency coefficients of the nodes is performed. That is, the following process is performed for all the nodes on the learning target paths:
(1) For the adopted node, the frequency coefficient is increased.
(2) For the other nodes, the frequency coefficient is decreased.
In the example shown in FIG. 16 described above, learning that increases the frequency coefficient is performed for the adopted node N4, and learning that decreases the frequency coefficient is performed for the other candidate nodes N2 and N3.
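Steps S32 to S34 can be summarized in a short sketch. This section does not state by how much the coefficients are changed, so the multiplicative factors `up` and `down`, the cap at 1.0, the default frequency coefficient of 1.0, and all identifiers are assumptions made for illustration.

```python
def apply_learning(coeff, dynamic, learning_links, adopted_path,
                   freq, path_nodes, adopted, up=1.1, down=0.9):
    """Update link coefficients and node frequency coefficients in place.

    coeff: dict link name -> signal transmission coefficient (0..1)
    dynamic: set of link names that are currently dynamic links
    learning_links: links on the learning target paths (step S31)
    adopted_path: links on the path from the node of interest to the adopted node
    """
    for link in learning_links:
        if link in adopted_path:                 # S32: raise the coefficient
            coeff[link] = min(1.0, coeff[link] * up)
            dynamic.discard(link)                # a dynamic link becomes static
        elif link in dynamic:                    # S33 (2): unused dynamic link is deleted
            dynamic.discard(link)
            del coeff[link]
        else:                                    # S33 (1): lower the coefficient
            coeff[link] *= down
    for node in path_nodes:                      # S34: frequency coefficients
        factor = up if node == adopted else down
        freq[node] = freq.get(node, 1.0) * factor


# FIG. 16 example with hypothetical coefficients: N4 is adopted via L2 and L3.
coeff = {"L1": 0.5, "L2": 0.5, "L3": 0.6}
dynamic = {"L2"}
freq = {}
apply_learning(coeff, dynamic, ["L1", "L2", "L3"], {"L2", "L3"},
               freq, ["N2", "N3", "N4"], adopted="N4")
print(coeff, dynamic, freq)
```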
[0126]
§6. Specific configuration of database system according to this embodiment
<6.1: Basic system>
FIG. 21 is a block diagram showing a specific configuration of the database system according to the present embodiment. The essence of the database system according to the present embodiment is actually realized by software, but here, for convenience, the system will be described as a set of functional elements. The individual blocks shown in FIG. 21 represent the individual functional elements constituting this system and are actually built in software. Therefore, the individual blocks shown in FIG. 21 do not correspond directly to specific hardware components of the database system. For example, the hardware (storage device) for storing the keywords defined for the individual nodes, the display device for presenting various information to the operator, and the input device for entering instructions from the operator are not shown as independent functional elements, but they are naturally included in the present system as hardware components.
[0127]
In this database system, a node aggregate composed of a plurality of nodes is defined, and data is stored in association with the individual nodes. The data storage means 10 stores the data to be provided in association with the individual nodes. The data providing means 20 has a function of extracting the data associated with a given node of interest from the data storage means 10 and providing the extracted data to the operator 100. The node of interest is set by the node-of-interest setting means 30. That is, when the operator 100 instructs the node-of-interest setting means 30 to set a specific node as the node of interest, the designated node is set as the node of interest, and node information indicating the node of interest (actually, the keyword defined for the node of interest) is presented to the operator 100. When the operator 100 wishes to use the data associated with the current node of interest, the operator requests the data providing means 20 to provide the data. In response to this request, the data providing means 20 extracts the data associated with the current node of interest set in the node-of-interest setting means 30 from the data storage means 10 and provides it to the operator 100 (for example, by displaying it on a display).
[0128]
In response to a search request from the operator, the search means 40 has a function of searching for candidate nodes related to the current node of interest set in the node-of-interest setting means 30 by referring to the link aggregate stored in the link storage means 50. The link storage means 50 stores a node / link aggregate consisting of a set of nodes and a set of links (static links) indicating the degree of association between the nodes. When the operator 100 issues a search request for the current node of interest, the search means 40 uses the link aggregate in the link storage means 50 to search for nodes related, under a predetermined condition, to the node of interest set in the node-of-interest setting means 30, and extracts the aggregate of these related nodes as a related set. The specific method of this search process is as already described in §4.3 and §5.2. Further, the search means 40 has a function of generating dynamic links separately from the static links in the link storage means 50. For this purpose, the search means 40 has a function of referring to the thesaurus dictionary and evaluating the relevance between nodes (see §4.7).
[0129]
When a set of related nodes has been extracted by the search process of the search means 40, the candidate presenting means 60 presents all or some of the nodes belonging to this related set to the operator 100 as candidate nodes. In the examples described so far, all of the related nodes belonging to the related set have been presented to the operator 100 as candidate nodes as they are. If necessary, however, the candidate presenting means 60 may perform sieving and present only those related nodes that satisfy a specific condition to the operator 100 as candidate nodes. A specific example of such sieving will be described under “AND Search” in §7. The candidate presenting means 60 can present candidates to the operator 100, for example, by displaying the keyword defined for each candidate node on the display.
[0130]
The candidate adopting means 70 has a function of letting the operator 100 adopt a specific candidate node from among the candidate nodes presented by the candidate presenting means 60. When the operator 100 gives an instruction to adopt a specific candidate node, the candidate adopting means 70 passes the designated candidate node to the updating means 80 and the learning means 90 as the adopted node. The updating means 80 updates the setting of the node-of-interest setting means 30 so that the adopted node becomes the new node of interest. In the embodiment described in this specification, the number of adopted nodes is limited to one, but it is also possible to allow a plurality of candidate nodes to be adopted (in this case, each of the adopted nodes becomes a new node of interest). The learning means 90, on the other hand, corrects the degree of association of the links on the paths from the node of interest to the related nodes. That is, it corrects the link weights in the link aggregate stored in the link storage means 50.
[0131]
Thus, when using this system, the operator 100 first sets the initial node of interest by giving an instruction to the node-of-interest setting means 30, and thereafter, as necessary, issues a data provision request to the data providing means 20 or a search request to the search means 40. When candidate nodes are presented by the candidate presenting means 60, the operator determines a new node of interest from among the presented candidates by giving an adoption instruction to the candidate adopting means 70. Each time the operator 100 adopts a new node of interest, the link aggregate stored in the link storage means 50 is corrected, and the weights of the links on the path from the old node of interest to the new node of interest are increased.
[0132]
The link storage means 50 holds a separate link aggregate for each operator (user), and the learning means 90 corrects the link aggregate corresponding to the operator who adopted the candidate node. Learning is therefore performed independently for each operator, and the more the system is used, the more convenient it becomes for that operator.
[0133]
<6.2: Definition and modification of signal transmission coefficient>
As already described, a predetermined signal transmission coefficient is defined as a weight for each link (static link) of the link aggregate stored in the link storage means 50. When searching for nodes related to the node of interest, the search means 40 transmits a signal having a predetermined signal value from the node of interest to other nodes along the links. During this transmission the signal value is increased or decreased according to the signal transmission coefficient defined for each link, and only nodes that receive a signal whose value is at or above a predetermined level are extracted as related nodes. A node that receives a signal value below the predetermined level is treated as if no signal had been transmitted to it, and no further transmission is performed downstream of it. Signal transmission between two nodes connected by one link is counted as one hop, and transmission also stops when the number of hops exceeds a predetermined upper limit. Therefore, even if the signal value is at or above the predetermined level, no signal is transmitted to a node whose hop count from the target node exceeds the upper limit.
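The signal transmission search described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the embodiment: the function name `propagate`, the dictionary layout of `links`, the rule of keeping the largest signal value that reaches a node, and all sample coefficients are hypothetical.

```python
from collections import deque

def propagate(links, start, initial=1.0, threshold=0.2, max_hops=3):
    """Return {node: signal value} for nodes reached with value >= threshold.

    links: {node: {neighbor: coefficient}} (hypothetical layout).
    """
    best = {start: initial}
    queue = deque([(start, initial, 0)])
    while queue:
        node, value, hops = queue.popleft()
        if hops == max_hops:           # hop limit reached: stop transmitting
            continue
        for neighbor, coeff in links.get(node, {}).items():
            passed = value * coeff     # the coefficient scales the signal
            if passed < threshold:     # treated as "no signal transmitted"
                continue
            # Assumption: keep only the strongest signal seen at each node.
            if passed > best.get(neighbor, 0.0):
                best[neighbor] = passed
                queue.append((neighbor, passed, hops + 1))
    best.pop(start)                    # the node of interest is not a candidate
    return best

# Hypothetical link aggregate with made-up coefficients.
links = {
    "A": {"B": 0.8, "C": 0.1},
    "B": {"D": 0.5},
    "D": {"E": 0.9},
}
related = propagate(links, "A", threshold=0.2, max_hops=2)
```

With these sample values, node C is dropped because its signal falls below the threshold, and node E is dropped because it lies three hops from the node of interest while the limit is two.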
[0134]
In this way, the search means 40 extracts as related nodes only those nodes to which the signal transmission from the node of interest has delivered a signal value at or above the predetermined level, and the candidate presenting means 60 presents these related nodes to the operator 100 as candidate nodes. The candidate presenting means 60 presents the candidate nodes according to a priority order. That is, it defines a priority for each candidate node according to the signal value of the signal transmitted to that node and presents the candidates on the basis of this priority. Specifically, the keywords defined for candidate nodes with a high priority may be displayed preferentially (for example, nearer the top of a list). In effect, this priority display favors candidate nodes that were frequently adopted in past searches for the current node of interest, so that candidates likely to be adopted again are easy to adopt.
[0135]
The learning means 90 corrects the signal transmission coefficient of each link on the basis of the adoption made by the operator 100. That is, it raises the signal transmission coefficients of the links on the path from the node of interest to the adopted node relative to those of the other links. More specifically, the learning means 90 treats all paths from the target node to each related node as learning target paths; among the links on these paths, it increases the signal transmission coefficient of each link on the path from the target node to the adopted node and decreases the signal transmission coefficient of every other link on the learning target paths.
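The correction rule above can be illustrated with a short sketch. The text does not say by how much the coefficients are changed, so the multiplicative factors `up` and `down`, the function name `learn`, and the data layout are assumptions chosen only for illustration.

```python
def learn(coeffs, learning_paths, adopted, up=1.1, down=0.95):
    """Return corrected coefficients after one adoption.

    coeffs: {(src, dst): coefficient}
    learning_paths: {related_node: [links on the path from the target node]}
    up / down: hypothetical correction factors (not given in the text).
    """
    rewarded = set(learning_paths[adopted])   # links leading to the adopted node
    penalized = set()
    for path in learning_paths.values():
        penalized.update(path)
    penalized -= rewarded                     # other links on learning target paths
    updated = dict(coeffs)
    for link in rewarded:
        updated[link] = coeffs[link] * up
    for link in penalized:
        updated[link] = coeffs[link] * down
    return updated

coeffs = {("N1", "X"): 0.5, ("X", "N2"): 0.5, ("N1", "Y"): 0.5, ("P", "Q"): 0.5}
learning_paths = {
    "X": [("N1", "X")],
    "N2": [("N1", "X"), ("X", "N2")],
    "Y": [("N1", "Y")],
}
updated = learn(coeffs, learning_paths, adopted="N2")
```

In this example the links N1-X and X-N2 are strengthened, the link N1-Y is weakened, and the link P-Q, which lies on no learning target path, is left unchanged.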
[0136]
<6.3: Search using dynamic links>
The search means 40 can search not only with the links (static links) that constitute the link aggregate in the link storage means 50 but also with other links (dynamic links) that are generated temporarily as needed (see §3 or 4.7). That is, for a node that is fully connected to the target node by existing static links, the search uses those static links. For a node that is not fully connected to the target node by existing static links, so that an unconnected portion exists, the relevance between the nodes on either side of the unconnected portion is evaluated; if the evaluation result satisfies a specified condition, a dynamic link is temporarily defined across the unconnected portion, and the search uses both static and dynamic links.
[0137]
The relevance between the nodes of an unconnected portion is evaluated on the basis of the degree of association between the keywords defined for those nodes. For example, when no static link is defined between a first node and a second node, so that they form an unconnected portion, the degree of association between the first keyword defined for the first node and the second keyword defined for the second node (for example, the degree to which their character strings match) is evaluated. If the evaluation result satisfies a predetermined condition (for example, the character strings match at or above a predetermined ratio), a temporary dynamic link is defined between the two nodes and signal transmission is performed via this dynamic link. The signal transmission coefficient of the temporarily defined dynamic link may be determined from the result of the keyword relevance evaluation; for example, the higher the degree of matching between the character strings of the two keywords, the larger the coefficient. Further, if a plurality of equivalent keywords are defined for each node and all of them are evaluated, the relevance evaluation becomes more flexible.
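As one possible illustration of the keyword evaluation, the sketch below scores string similarity with `difflib.SequenceMatcher` from the Python standard library. The text does not specify a matching measure, so the choice of this measure, the threshold `min_ratio`, and the decision to use the score directly as the signal transmission coefficient are all assumptions.

```python
from difflib import SequenceMatcher

def keyword_relevance(keywords_a, keywords_b):
    """Best string-similarity ratio over all pairs of equivalent keywords."""
    return max(
        SequenceMatcher(None, a, b).ratio()
        for a in keywords_a
        for b in keywords_b
    )

def try_dynamic_link(keywords_a, keywords_b, min_ratio=0.6):
    """Return a coefficient for a temporary dynamic link, or None.

    Assumption: the similarity score itself serves as the coefficient,
    so a closer match yields a larger coefficient.
    """
    score = keyword_relevance(keywords_a, keywords_b)
    return score if score >= min_ratio else None

# Each node may carry several equivalent keywords.
coeff_same = try_dynamic_link(["high blood pressure", "hypertension"], ["hypertension"])
coeff_none = try_dynamic_link(["abc"], ["xyz"])
```

Here the first pair yields a link because one of the equivalent keywords matches exactly, while the second pair shares no characters and yields no link.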
[0138]
In the learning process performed by the learning means 90, the existing static links constituting the link aggregate stored in the link storage means 50 and the temporarily generated dynamic links are handled differently. Learning for static links is as described above. For a dynamic link, when a related node extracted by a search that used the dynamic link is adopted through the candidate adopting means, the dynamic link is added to the link aggregate as a static link, so that it becomes a new member of the aggregate. At that time, the signal transmission coefficient defined for the dynamic link is increased, and the corrected value is given as the signal transmission coefficient of the promoted static link. Conversely, when the related node extracted by the search that used the dynamic link is not adopted, the dynamic link is simply deleted.
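The different treatment of dynamic links can be sketched as below. The function name `settle_dynamic_links`, the `boost` factor, and the dictionary layout are hypothetical; the text states only that the coefficient is increased on promotion, not by how much.

```python
def settle_dynamic_links(static_links, dynamic_links, used_for_adopted, boost=1.2):
    """Promote or discard temporary links after the operator's adoption.

    static_links, dynamic_links: {(a, b): coefficient}
    used_for_adopted: set of dynamic links on the path to the adopted node
    boost: hypothetical factor applied when a dynamic link is promoted
    """
    result = dict(static_links)
    for link, coeff in dynamic_links.items():
        if link in used_for_adopted:
            result[link] = coeff * boost   # promoted to a static link
        # otherwise the temporary link is simply not carried over
    return result

static_links = {("A", "B"): 0.5}
dynamic_links = {("B", "C"): 0.4, ("B", "D"): 0.3}
new_static = settle_dynamic_links(static_links, dynamic_links, {("B", "C")})
```

After this call the link B-C has joined the static links with a raised coefficient, and the link B-D, which did not lead to the adopted node, has disappeared.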
[0139]
<6.4: Definition of class link>
When the system is used as a distributed database system, it is preferable, as described in §4.1, to classify the nodes into a plurality of classes so that each node belongs to one of the classes, and to define class links indicating the degree of association between classes. In the example shown in FIG. 8, two types of class links are defined: remote links AB and AC, which indicate the degree of association between different classes, and local links AA and BB, which indicate the degree of association of a class with itself. When deciding whether to define a dynamic link during a search, the relevance between the nodes of the unconnected portion is evaluated, and this evaluation can refer to the class link defined between the classes to which the nodes belong. For example, if, as in the embodiment described above, dynamic links are not permitted between classes for which no class link exists, the way dynamic links are generated can be controlled through the definition of class links.
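The use of class links to permit or forbid dynamic links might look like the following sketch. Treating class links as undirected pairs, the function name `dynamic_link_allowed`, and the node names are assumptions; only the class links AB, AC, AA, and BB are taken from the FIG. 8 example mentioned above.

```python
def dynamic_link_allowed(class_links, class_of, node_a, node_b):
    """True if a class link exists between the classes of the two nodes."""
    pair = frozenset((class_of[node_a], class_of[node_b]))
    return pair in class_links

# Class links of the FIG. 8 example, modelled as undirected pairs (assumption).
class_links = {
    frozenset({"A", "B"}),   # remote link AB
    frozenset({"A", "C"}),   # remote link AC
    frozenset({"A"}),        # local link AA
    frozenset({"B"}),        # local link BB
}
# Hypothetical nodes and the classes they belong to.
class_of = {"n1": "A", "n2": "B", "n3": "C", "n4": "C"}

allowed_ab = dynamic_link_allowed(class_links, class_of, "n1", "n2")
allowed_bc = dynamic_link_allowed(class_links, class_of, "n2", "n3")
allowed_cc = dynamic_link_allowed(class_links, class_of, "n3", "n4")
```

Under these definitions a dynamic link may be generated between classes A and B, but not between classes B and C or within class C, because no class link is defined for those pairs.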
[0140]
It is also possible to prepare a thesaurus dictionary for each class link and, when evaluating the relevance between the nodes of an unconnected portion, to evaluate the keyword relevance with the thesaurus dictionary that corresponds to the class link defined between the classes to which the nodes belong. The result of the keyword relevance evaluation depends on the contents of the thesaurus dictionary prepared, so the dictionary can be used to steer how relevance is evaluated.
[0141]
Furthermore, a signal transmission coefficient may be defined for each class link. If, when the search means transmits a signal between nodes, the signal value is increased or decreased in consideration of both the signal transmission coefficient of the link defined between the nodes and the signal transmission coefficient of the class link defined between the classes to which the nodes belong, as described in §4.4, the tendency of signal transmission can be controlled globally through the class links.
[0142]
<6.5: Addition of frequency coefficient storage means>
FIG. 22 is a block diagram of an embodiment in which a frequency coefficient storage means 110 is added to the database system shown in FIG. 21. The frequency coefficient storage means 110 stores, for each node, a frequency coefficient indicating how often the node has been adopted. When the operator 100 determines an adopted node from among the candidate nodes presented by the candidate presenting means 60, the learning means 90 increases the frequency coefficient of the adopted node relative to the frequency coefficients of the other nodes. The frequency coefficient is a weight for the node: the more often the operator 100 adopts a node, the larger its frequency coefficient and the greater its weight. This node weighting does not affect the signal transmission process itself but, as described in §4.6, is used to determine priority when candidate nodes are presented. That is, the candidate presenting means 60 in this embodiment defines a priority for each candidate node according to both the signal value of the signal transmitted to the candidate node and the frequency coefficient of the candidate node, and presents the candidate nodes on the basis of this priority.
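A priority order that takes both quantities into account could be computed as in the sketch below. The text says only that both the signal value and the frequency coefficient are considered; multiplying them is an assumption made here for illustration, as are the function name `rank_candidates` and the sample values.

```python
def rank_candidates(signal_values, frequency, default_frequency=1.0):
    """Order candidate nodes by signal value weighted by frequency coefficient.

    Assumption: the two quantities are combined by multiplication.
    """
    def score(node):
        return signal_values[node] * frequency.get(node, default_frequency)
    return sorted(signal_values, key=score, reverse=True)

signal_values = {"X": 0.6, "Y": 0.5, "Z": 0.4}   # hypothetical signal values
frequency = {"Y": 2.0}                            # Y has been adopted often
order = rank_candidates(signal_values, frequency)
```

In this example node Y is listed first even though node X received a stronger signal, because the frequency coefficient of Y raises its priority.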
[0143]
The frequency coefficient storage means 110 stores separate frequency coefficients for each operator, and the learning means 90 corrects the frequency coefficients of the operator who adopted the node. Therefore, as with the signal transmission coefficients of the links, learning is performed independently for each operator, and the more the system is used, the more convenient it becomes for that operator.
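Keeping the frequency coefficients separate for each operator can be sketched with a nested mapping. The class name `FrequencyStore`, the initial value of 1.0, and the additive `step` are hypothetical choices, not details given in the text.

```python
from collections import defaultdict

class FrequencyStore:
    """Per-operator frequency coefficients (illustrative sketch)."""

    def __init__(self):
        # operator -> node -> coefficient, starting from an assumed 1.0
        self._coeffs = defaultdict(lambda: defaultdict(lambda: 1.0))

    def adopt(self, operator, node, step=0.1):
        # Raising only the adopted node increases it relative to the others.
        self._coeffs[operator][node] += step

    def get(self, operator, node):
        return self._coeffs[operator][node]

store = FrequencyStore()
store.adopt("operator_1", "N2")
```

After this adoption the coefficient of node N2 is raised for `operator_1` only; the coefficients seen by any other operator are unaffected.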
[0144]
§7. AND search
The database system according to the present invention is characterized by its unique search method. As already described, to find desired data with this database system, the operator first inputs a keyword that seems relevant, and a search is performed for this keyword (target node), whereupon related keywords (candidate nodes) are presented. From the presented keywords the operator can select one that seems more relevant to the target data and search again for keywords that are more relevant still.
[0145]
FIGS. 23 to 28 are conceptual diagrams explaining this search operation unique to the present invention. In the database system according to the present invention, a node aggregate as shown in FIG. 23 is defined. In the figure, the many black circles represent individual nodes. A predetermined keyword is defined for each node, and each node is associated with specific data. The target data that the user is looking for is associated with one of these nodes, and the search operation performed by the operator amounts to finding the node associated with the target data.
[0146]
Now assume that the operator inputs a keyword that seems related to the target data and designates the node of interest N1 corresponding to this keyword. FIG. 24 shows a state in which one node in the node aggregate is designated as the node of interest N1. Of course, if the node N1 itself is the node associated with the target data, then giving a provision request (browsing instruction) to the data providing means 20 causes the target data associated with the node of interest N1 to be extracted from the data storage means 10 and presented. When the operator gives a search request to the search means 40, a search is performed for the node of interest N1, and a related set G1 consisting of nodes having at least a certain degree of relevance to the node of interest N1 is extracted. FIG. 25 shows the related set G1 found in this way. The related nodes in the related set G1 are presented to the operator as candidate nodes by the candidate presenting means 60, in the priority order of the nodes.
[0147]
The operator adopts one of the presented candidate nodes. FIG. 26 shows a state in which the node N2 in the related set G1 has been adopted. The node N2 adopted here becomes a new node of interest as shown in FIG. At this time, learning is performed to increase the weights of the links on the path from the node N1 to the node N2, so that when a search with the node N1 as the target node is performed again in the future, the node N2 is given a higher priority when it is presented as a candidate node.
[0148]
The operator can request the data associated with the new target node N2 and browse it as necessary. When a search request is made again for this new target node N2, a related set G2 consisting of nodes having at least a certain degree of relevance to the node N2 is extracted. FIG. 28 shows the related set G2 found in this way. The related nodes in the related set G2 are presented to the operator as candidate nodes by the candidate presenting means 60. As a result of the learning described above, the links on the path between the node N1 and the node N2 have been weighted, so the original target node N1 is usually included in the related set G2. If it is undesirable for a former target node to be presented again as a candidate node in a later search, a sieving function that removes past target nodes of the current series of search operations from the candidate nodes may be added to the candidate presenting means 60.
[0149]
In the examples described so far, the candidate presenting means 60 presents all the related nodes extracted by the search means 40 to the operator 100 as candidate nodes as they are. However, as described above, if the candidate presenting means 60 is given a sieving function, only some of the related nodes extracted by the search means 40 can be presented as candidate nodes. The database system according to the present embodiment uses this sieving function of the candidate presenting means 60 to perform the AND search described below.
[0150]
The AND search described here is an effective way of gradually narrowing down candidates while searching with the database system. In the general search operation described above, for example, the related set G1 shown in FIG. 25 and the related set G2 shown in FIG. 28 are mutually independent sets, so the candidates have not been narrowed down even though two search operations have been performed. A common search method for narrowing down candidates is the AND search, in which a plurality of conditions are set and only the candidates that satisfy the logical product of these conditions are extracted. The search method described here applies this general AND search to searches in the database system according to the present invention.
[0151]
FIG. 29 is a block diagram of an embodiment in which a population definition means 120 is further added to the database system shown in FIG. The population definition means 120 has a function of defining a population (mother set), that is, a specific set of nodes. In this embodiment, the candidate presenting means 60 has a function of presenting as candidate nodes only the nodes that belong to the logical product set (intersection) of the related set extracted by the search means 40 and the mother set defined by the population definition means 120. Further, when the nodes of this logical product set are presented as candidate nodes by the candidate presenting means 60, the population definition means 120 redefines the set of presented candidate nodes as a new mother set.
[0152]
Hereinafter, the AND search operation is described with a specific example. Consider a state in which the node N1 in the node aggregate shown in FIG. 30 is designated as the node of interest (the node aggregate contains a large number of nodes as in FIG. 23, but in the figures used for this explanation only the nodes that matter in each figure are drawn as black circles to keep the figures simple). In this initial state, the population definition means 120 defines the entire node aggregate as the mother set M1.
[0153]
Here, as described above, when a search request is made for this node of interest N1, a set of related nodes is extracted as the related set G1, as shown in FIG. 31. The related set G1 that the search means 40 passes to the candidate presenting means 60 contains all the related nodes extracted in this way. The candidate presenting means 60, however, presents as candidate nodes only the nodes included in the logical product set “(G1) AND (M1)” of the related set G1 and the mother set M1. In the example shown in FIG. 31, the mother set M1 at this point is the entire node aggregate, so the candidate presenting means 60 presents all the related nodes in the related set G1 as candidate nodes as they are. That is, the nodes in the hatched area in FIG. 31 are presented as candidate nodes.
[0154]
When the candidate nodes are presented in this way, the population definition means 120 redefines the set of candidate nodes as a new mother set. In this case, therefore, the related set G1 shown hatched in FIG. 31 is defined as the new mother set M2.
[0155]
Now, if the node N2 among the candidate nodes is adopted, this node N2 becomes the new target node. Assume that the operator again gives a search request for the new target node N2, this time in the “AND search” mode (when giving a search request to the search means 40, the operator can specify either the “normal search” mode or the “AND search” mode). The search means 40 executes a search process that extracts a related set G2 containing all the nodes related to the node of interest N2. FIG. 32 shows the related set G2 extracted in this way.
[0156]
The candidate presenting means 60 presents as candidate nodes only the nodes included in the logical product set “(G2) AND (M2)” of the related set G2 and the mother set M2 (the nodes in the hatched area in FIG. 32). As a result, the candidate nodes presented at this point are only the nodes included in the logical product set of the related set G1 from the first search and the related set G2 from the second search, which means that the two searches have narrowed down the candidates. When these candidate nodes are presented, the population definition means 120 redefines the set of candidate nodes as a new mother set. In this case, therefore, the set of nodes in the hatched area in FIG. 32 is defined as the new mother set M3 (FIG. 33).
[0157]
Here, if the node N3 among the candidate nodes is adopted, this node N3 becomes a new node of interest. Assume that the operator again gives a search request to the new node of interest N3 in the “AND search” mode. In this case, the search means 40 executes a search process for extracting a related set G3 including all related nodes for the node of interest N3. FIG. 34 shows the related set G3 extracted in this way.
[0158]
The candidate presenting means 60 presents as candidate nodes only the nodes included in the logical product set “(G3) AND (M3)” of the related set G3 and the mother set M3 (the nodes in the hatched area in FIG. 34). As a result, as shown in FIG. 35, the candidate nodes presented at this point are only the nodes included in the logical product set of the related set G1 from the first search, the related set G2 from the second search, and the related set G3 from the third search; the candidates have been narrowed down by three searches. When these candidate nodes are presented, the population definition means 120 redefines the set of candidate nodes as a new mother set. In this case, therefore, the set of nodes in the hatched area in FIG. 35 is defined as the new mother set M4.
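The three-step narrowing described above can be traced with ordinary set operations. The following is a minimal sketch; the class name `AndSearchSession`, the node numbering, and the contents of the related sets G1 to G3 are made up for illustration, and the related sets are given directly instead of being produced by a search.

```python
class AndSearchSession:
    """Tracks the mother set across a series of searches (illustrative)."""

    def __init__(self, all_nodes):
        self.mother_set = set(all_nodes)       # M1: the entire node aggregate

    def present(self, related_set, and_mode=True):
        related = set(related_set)
        if and_mode:
            candidates = related & self.mother_set   # (G) AND (M)
        else:
            candidates = related                     # "normal search" mode
        self.mother_set = candidates   # presented candidates become the new mother set
        return candidates

session = AndSearchSession(range(1, 11))   # hypothetical nodes 1..10
g1, g2, g3 = {1, 2, 3, 4, 5}, {3, 4, 5, 6, 7}, {4, 5, 8}
first = session.present(g1)    # M1 is everything, so G1 is shown as it is
second = session.present(g2)   # (G2) AND (M2)
third = session.present(g3)    # (G3) AND (M3)
```

With these sample sets the candidates shrink from five nodes to three and then to two, and the final result equals the intersection of G1, G2, and G3.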
[0159]
Such an AND search is very effective for narrowing down candidates with the database system according to the present embodiment. Suppose, for example, that the first search is performed with the node for which the keyword “high blood pressure” is defined as the target node, and that keywords such as “blood pressure abnormality”, “arteriosclerosis”, and “hypertensive retinopathy” are presented as keywords of the candidate nodes. If the node for which the keyword “arteriosclerosis” is defined is then adopted as the new node of interest and the second search is performed in the “AND search” mode, new keywords related to both “high blood pressure” and “arteriosclerosis” are presented as candidates. In this way, the AND search narrows down the candidates step by step and is an effective method of finding the target data.
[0160]
§8. Negative search processing
The search process described so far finds “matters related to a certain keyword”. For example, a search with the node for which the keyword “high blood pressure” is defined as the node of interest presents keywords such as “blood pressure abnormality”, “arteriosclerosis”, and “hypertensive retinopathy” as candidates. If this kind of search is called a “positive search process”, the search described in this section, which finds “matters not related to a certain keyword”, can be called a “negative search process”.
[0161]
<8.1: Basic concept of negative search processing>
As described above, when the node of interest N1 is designated in the node aggregate shown in FIG. 23, as shown in FIG. 24, and the positive search process is executed, a related set G1 of nodes having relevance to the node of interest N1 is extracted, and the nodes in the related set G1 are presented as candidate nodes (indicated by white node points). The presented candidate nodes are therefore “nodes related to the node of interest N1”. In contrast, if the nodes of the complement G1* of the related set G1 are presented as candidate nodes (indicated by white node points), as shown in FIG. 36, the presented nodes are “nodes not related to the node of interest N1”. If being “not related” is itself regarded as a kind of relevance, these nodes can be considered “nodes having the relationship of being unrelated”. Here, therefore, relevance in the ordinary sense is called “positive relevance”, and the relevance of being unrelated is called “negative relevance”. Accordingly, the nodes indicated by white node points in FIG. 25 are “positive related nodes” having “positive relevance” to the node of interest N1, and the related set G1 is called a “positive related set”. The nodes indicated by white node points in FIG. 36, on the other hand, are “negative related nodes” having “negative relevance” to the node of interest N1, and the set G1* is called a “negative related set”.
[0162]
The search means 40 shown in FIGS. 21, 22, and 29 has a function of performing two types of search processing: a “positive search process” that extracts a “positive related set” and a “negative search process” that extracts a “negative related set”. The “negative search process” is particularly effective when combined with the AND search described in §7. Here, the “negative search process” is combined with the specific example used above to explain the AND search.
[0163]
First, when the node N1 in the node aggregate shown in FIG. 30 is designated as the target node and the “positive search process” is performed as the first search, the “positive related set G1” is extracted and presented as candidate nodes. Next, when the node N2 is adopted from the candidate nodes and designated as the new target node, and the “positive search process” is performed in the “AND search” mode as the second search, the “positive related set G2” is extracted as shown in FIG. 32, and its logical product with the mother set M2 (hatched part) is presented as candidate nodes. Then, when the node N3 is adopted from the candidate nodes as shown in FIG. 33 and designated as the new node of interest, and the “positive search process” is again performed in the “AND search” mode as the third search, the “positive related set G3” is extracted as shown in FIG. 34, and its logical product with the mother set M3 (hatched part in FIG. 34) is presented as candidate nodes. As already described, the result of these three searches is that the nodes included in the logical product set “(G1) AND (G2) AND (G3)” are presented as candidate nodes, as shown in FIG. 35.
[0164]
Suppose instead that, in the state shown in FIG. 33, the node N3 is adopted as the new target node and, as the third search, the “negative search process” is performed for the node N3 in the “AND search” mode. In that case a negative related set G3* is extracted, and the logical product of the mother set M3 and the negative related set G3* (hatched part in FIG. 37) is presented as candidate nodes. As a result, as shown in FIG. 38, the nodes included in the logical product set “(G1) AND (G2) AND (G3*)” are presented as candidate nodes.
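The combination of the negative search process with the AND search can be illustrated with set operations. In this sketch the negative related set is obtained simply as a complement; the function names, the node numbering, and the contents of G1 to G3 are hypothetical.

```python
def negative_related_set(all_nodes, positive_related):
    """Complement of a positive related set within the node aggregate."""
    return set(all_nodes) - set(positive_related)

def and_candidates(mother_set, related_set):
    """Candidates presented in the AND search mode: (M) AND (G)."""
    return set(mother_set) & set(related_set)

all_nodes = set(range(1, 11))              # hypothetical nodes 1..10
g1, g2, g3 = {1, 2, 3, 4, 5}, {3, 4, 5, 6, 7}, {4, 5, 8}

m3 = and_candidates(and_candidates(all_nodes, g1), g2)   # mother set after two searches
g3_star = negative_related_set(all_nodes, g3)            # negative related set G3*
third = and_candidates(m3, g3_star)                      # (G1) AND (G2) AND (G3*)
```

With these sample sets only node 3 remains: it is related to the first two nodes of interest but not to the third.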
[0165]
An AND search combined with the “negative search process” is very effective for flexible narrowing-down searches. Consider, for example, searching past case data for cases similar to that of a patient with some cardiovascular disease. First, the node for which the keyword “cardiovascular disease” is defined is set as the node of interest, the first search is performed as a “positive search process”, and several candidate nodes are presented. Because the patient's blood pressure is high, the node for which the keyword “high blood pressure” is defined is adopted from among the candidates, and the second search is performed as a “positive search process” in the “AND search” mode. Suppose that, as a result, keywords such as “blood pressure abnormality”, “arteriosclerosis”, and “hypertensive retinopathy” are presented as keywords of the candidate nodes. If the patient shows no symptoms of arteriosclerosis, the node for which the keyword “arteriosclerosis” is defined can be adopted and designated as the target node, and the third search can be performed as a “negative search process” in the “AND search” mode. The candidate nodes presented as the result of the third search are then narrowed down to nodes not related to the keyword “arteriosclerosis” (nodes having negative relevance to it).
[0166]
In the system according to the present embodiment, a search request given to the search means 40 carries two selections. The first selects either the “normal search” mode (in which the related nodes belonging to the extracted related set are presented as candidate nodes as they are) or the “AND search” mode (in which the candidate nodes of the previous search form the mother set and only the nodes belonging to the logical product set of the extracted related set and the mother set are presented as candidate nodes). The second selects either the “positive search process” or the “negative search process”. Combining these selections makes flexible search processing possible.
[0167]
<8.2: Link definition considering signs>
One conceivable way of performing the negative search process is first to perform the positive search process to extract a positive related set G and then to extract the negative related set G* as its complement. This, however, does not obtain the negative related set G* directly; it merely obtains it indirectly from the positive related set G. With such an indirect method, priorities based on link weights cannot be defined for the candidate nodes.
[0168]
If, as described above, a signal transmission coefficient is defined for each link, a signal is transmitted from the node of interest, and nodes whose signal value is at or above a predetermined level are extracted as related nodes, a priority order based on the magnitude of the signal value can be defined for the related nodes and taken into account when the candidate nodes are presented. When the negative search process is performed by the indirect method described above, however, no priority order based on link weights can be defined for the negative related nodes. Consider, for example, the case in FIG. 36 where the negative related set G1* is obtained indirectly, by first obtaining the positive related set G1 and then taking its complement. In this case, signals are transmitted from the node of interest N1 only to the nodes in the positive related set G1. A priority can therefore be defined for the nodes in the positive related set G1 from the signal values they received, but no priority based on signal transmission can be defined for the nodes in the negative related set G1* obtained as the complement.
[0169]
The candidate nodes obtained by the negative search process are nodes that have negative relevance (the relevance of being unrelated) to the node of interest, and it is convenient if this negative relevance can be handled quantitatively in the same way as positive relevance. In other words, it is preferable to distinguish degrees of negative relevance so that candidate nodes showing stronger negative relevance can be presented preferentially. The present inventor therefore conceived of defining two types of links, positive links and negative links, so that negative relevance can be handled quantitatively just like positive relevance. That is, a positive link with a positive signal transmission coefficient is defined between nodes having positive relevance, and a negative link with a negative signal transmission coefficient is defined between nodes having negative relevance.
[0170]
FIG. 39 is a conceptual diagram showing a node/link aggregate in which both positive links and negative links are defined. A link marked “+” in the figure (drawn as a solid line) is a positive link with a positive signal transmission coefficient, and a link marked “−” (drawn as a broken line) is a negative link with a negative signal transmission coefficient. For example, the link between node A and node B (hereinafter, link AB) is a positive link, which indicates that node A and node B have positive relevance (relevance in the ordinary sense). The link BD, on the other hand, is a negative link, which indicates that node B and node D have negative relevance (the relevance of being unrelated). Every positive link has a positive signal transmission coefficient and every negative link has a negative one (the figure shows only the signs, but in practice a signal transmission coefficient with a predetermined absolute value is defined for each link).
[0171]
Search processing using a node/link aggregate in which both positive and negative links are defined is performed as follows. First, in positive search processing, that is, processing that searches for nodes having a positive relationship with the node of interest, a signal having a positive signal value is transmitted from the node of interest to other nodes along the links, and nodes at which a signal having a positive signal value is obtained are extracted as positive related nodes. FIG. 40 is a conceptual diagram showing positive search processing performed for the node of interest B with signal transmission limited to the number of hops H = 1. The symbol in parentheses at each node indicates the sign of the signal value obtained at that node, and a node marked “0” is a node to which no signal was transmitted. Thick arrows indicate signal transmission paths: solid arrows are paths along positive links, and broken arrows are paths along negative links.
[0172]
Since the positive links BA, BE, and BJ have positive signal transmission coefficients, giving a signal with a positive signal value to the node of interest B causes a signal with a positive signal value to be transmitted to each of nodes A, E, and J. Since the negative links BD and BG have negative signal transmission coefficients, the same positive signal causes a signal with a negative signal value to be transmitted to nodes D and G. No signal is transmitted to the other nodes C, F, H, I, and K under the condition H = 1. Positive search processing is accomplished by extracting nodes A, E, and J, at which a positive signal value is obtained, as positive related nodes and presenting them as candidate nodes. In practice, the signal value given to the node of interest B is multiplied by the signal transmission coefficient of each link it passes through, so the signal is attenuated, and a node at which a signal value of at least a predetermined level is not obtained is treated as if no signal had been transmitted to it. For convenience of explanation, however, such attenuation is not considered here.
[0173]
Thus, even when negative links are defined, positive search processing works without any problem as long as a signal with a positive signal value is given to the node of interest and only nodes at which a positive signal value is obtained are extracted. That is, nodes connected by negative links are not extracted as candidate nodes. As already described, learning is performed when one node is selected from the candidate nodes presented by positive search processing. For example, if candidate node J is adopted in the example of FIG. 40, then, as shown in the flowchart of FIG. 20, learning is performed that increases the signal transmission coefficient of link BJ, which leads to the adopted node J, among the learning target paths (links BA, BE, BJ) and decreases the signal transmission coefficients of the other links BA and BE. Learning is also performed that increases the frequency coefficient of the adopted node J and decreases the frequency coefficients of the other candidate nodes A and E.
[0174]
Even when the number of hops H is 2 or more, the multiplication simply takes the sign of each signal transmission coefficient into account, as in the example above. FIG. 41 shows part of a positive search process for the node of interest B with signal transmission limited to H = 2 (only three paths, B → A → F, B → D → K, and B → G → H, are illustrated; the others are omitted). A positive signal is obtained at node A through the positive link BA, and this signal is transmitted on to node F through the positive link AF, so a positive signal is also obtained at node F. A negative signal is obtained at node D through the negative link BD, and this signal is transmitted on to node K through the positive link DK, so a negative signal is obtained at node K. A negative signal is likewise obtained at node G through the negative link BG, but this signal is transmitted on to node H through the negative link GH, so a positive signal is obtained at node H. Thus, when H is 2 or more, a positive signal is obtained at any node reached through an even number of negative links, and such a node is extracted as a positive related node. In the example of FIG. 41, node H is therefore extracted as a positive related node in addition to nodes A and F. Node H has a negative relationship with “node G, which has a negative relationship with the node of interest B”, so by the principle of double negation node H can be expected to be related to node B.
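The signed propagation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the link signs follow the description of FIGS. 39 to 41, but the coefficient magnitudes, the names `LINKS`, `propagate`, and `positive_search`, and the rule of keeping the strongest signal when several paths reach one node are all hypothetical and not taken from the specification.

```python
# Hypothetical link table for part of FIG. 39: (node, node) -> signed coefficient.
# The signs follow the text; the magnitudes are invented for illustration.
LINKS = {
    ("B", "A"): 0.8, ("B", "E"): 0.7, ("B", "J"): 0.6,   # positive links
    ("B", "D"): -0.9, ("B", "G"): -0.5,                  # negative links
    ("A", "F"): 0.5, ("D", "K"): 0.5, ("G", "H"): -0.6,
}

def neighbors(node):
    """Yield (other_node, coefficient) for every link touching `node`."""
    for (u, v), coeff in LINKS.items():
        if u == node:
            yield v, coeff
        elif v == node:
            yield u, coeff

def propagate(source, initial, max_hops):
    """Spread a signal from `source`, multiplying by each link coefficient.

    If several paths reach a node, the signal with the largest absolute
    value is kept (an assumption made only for this sketch).
    """
    signals = {}
    frontier = {source: initial}
    for _ in range(max_hops):
        reached = {}
        for node, value in frontier.items():
            for other, coeff in neighbors(node):
                if other == source:
                    continue  # never send the signal back to the node of interest
                new_value = value * coeff
                if abs(new_value) > abs(reached.get(other, 0.0)):
                    reached[other] = new_value
        for node, value in reached.items():
            if abs(value) > abs(signals.get(node, 0.0)):
                signals[node] = value
        frontier = reached
    return signals

def positive_search(source, max_hops):
    """Positive related nodes: those at which a positive signal value is obtained."""
    signals = propagate(source, +1.0, max_hops)
    return sorted(node for node, value in signals.items() if value > 0)

print(positive_search("B", 1))  # one hop
print(positive_search("B", 2))  # two hops: H is reached through two negative links
```

With two hops, node H receives the product of two negative coefficients and so appears among the positive related nodes, while node K, reached through one negative link, does not.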
[0175]
Next, consider negative search processing, that is, searching for nodes having a negative relationship with the node of interest. In this case, a signal having a negative signal value is transmitted from the node of interest to other nodes along the links, and both the nodes at which a signal having a positive signal value is obtained and the nodes to which no signal was transmitted are extracted as negative related nodes. FIG. 42 is a conceptual diagram showing negative search processing performed for the node of interest B with signal transmission limited to the number of hops H = 1.
[0176]
Since the positive links BA, BE, and BJ have positive signal transmission coefficients, giving a signal with a negative signal value to the node of interest B causes a signal with a negative signal value to be transmitted to each of nodes A, E, and J. Since the negative links BD and BG have negative signal transmission coefficients, the same negative signal causes a signal with a positive signal value to be transmitted to nodes D and G. No signal is transmitted to the other nodes C, F, H, I, and K under the condition H = 1. Negative search processing is accomplished by extracting nodes D and G, at which a positive signal value is obtained, together with nodes C, F, H, I, and K, to which no signal was transmitted, as negative related nodes and presenting them as candidate nodes. Comparing the set of nodes drawn as white node points in FIG. 40 (the candidate nodes obtained by positive search processing) with the set drawn as white node points in FIG. 42 (the candidate nodes obtained by negative search processing) shows that the two sets are complements of each other.
[0177]
Thus, by defining negative links, giving a signal with a negative signal value to the node of interest, and extracting both the nodes at which a positive signal value is obtained and the nodes to which no signal was transmitted, the complement (negative related set) of the node set obtained by positive search processing (positive related set) is obtained. The same holds when the number of hops H is 2 or more.
[0178]
The important point is that negative search processing by this method obtains the negative related set directly. In the example of FIG. 42, signals are actually transmitted to nodes D and G, which are extracted as candidate nodes, and a definite signal value is obtained at each of them. The other candidate nodes C, F, H, I, and K receive no signal, so their signal value is zero. In either case, every candidate node in the negative related set has a signal value, and priorities can be defined on that basis. For example, if in FIG. 42 the absolute value of the signal transmission coefficient of negative link BD is larger than that of negative link BG, the signal value obtained at node D is larger than that obtained at node G. The priority order can then be defined as node D first, node G second, and nodes C, F, H, I, and K tied for third. Of course, as described in §4.6, if a frequency coefficient is defined for each node, the priority order can further take the frequency coefficient into account.
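The ranking of negative related nodes can be sketched as follows for the one-hop case of FIG. 42. The coefficient magnitudes, the names `LINKS_FROM_B` and `negative_search`, and the alphabetical tie-break are assumptions made for illustration only; frequency coefficients are left out of this sketch.

```python
# Hypothetical one-hop links from node B (signs as in FIG. 39, magnitudes invented).
LINKS_FROM_B = {"A": 0.8, "E": 0.7, "J": 0.6, "D": -0.9, "G": -0.5}
ALL_NODES = set("ABCDEFGHIJK")

def negative_search(links_from_source, all_nodes, source, initial=-1.0):
    """Return negative related nodes ranked by the signal value they received."""
    signals = {node: initial * coeff for node, coeff in links_from_source.items()}
    candidates = []
    for node in all_nodes - {source}:
        value = signals.get(node, 0.0)   # 0.0 stands for "no signal transmission"
        if value >= 0.0:                 # positive signal, or no signal at all
            candidates.append((node, value))
    # Larger signal value first; ties are broken alphabetically (an assumption).
    return sorted(candidates, key=lambda item: (-item[1], item[0]))

ranked = negative_search(LINKS_FROM_B, ALL_NODES, "B")
for node, value in ranked:
    print(node, value)
```

Because the absolute value assumed for link BD is larger than that for link BG, node D is listed ahead of node G, and the nodes without signal transmission follow with a signal value of zero.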
[0179]
As described above, when links are defined with signs, positive search processing is performed by giving the node of interest a signal with a positive signal value, and negative search processing is performed by giving it a signal with a negative signal value.
[0180]
<8.3: Learning by negative search processing>
The systems shown in FIG. 21, FIG. 22, and FIG. 29 include learning means 90, and, as already described, learning processing is performed on the link aggregate in the link storage means 50 when the operator selects a new node of interest from the candidate nodes. In the system shown in FIG. 22, the selection frequency of each node is stored in the frequency coefficient storage means 110, and, as already described, learning processing is also performed on this selection frequency in response to the operator's selection.
[0181]
As described above, the learning procedure for a selection made on the basis of positive search processing is shown in the flowchart of FIG. 20. The same procedure can basically be applied when the selection is made on the basis of negative search processing. For example, suppose node D is selected as the new node of interest from the candidate nodes of the negative search process described above. In this case, following the flowchart of FIG. 20, among the learning target paths (the paths on which effective signal transmission for extracting related nodes took place, that is, links BD and BG), the signal transmission coefficient of link BD, which leads to the adopted node D, is increased (its absolute value is increased while its sign stays negative), and the signal transmission coefficient of the other link BG is decreased (its absolute value is decreased while its sign stays negative). In addition, the frequency coefficient of the adopted node D is increased and the frequency coefficient of node G, which lies on the other learning target path, is decreased.
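A sign-preserving coefficient update of this kind might look like the sketch below. It does not reproduce the flowchart of FIG. 20; the function name `learn_links`, the fixed `step`, and the `floor` and `cap` limits are hypothetical choices made only to show that the absolute value changes while the sign is kept.

```python
def learn_links(coeffs, learning_paths, adopted, step=0.1, floor=0.0, cap=1.0):
    """Strengthen the link to the adopted node and weaken the other learning target links.

    Only the absolute value of each coefficient changes; its sign is kept.
    `step`, `floor`, and `cap` are illustrative assumptions.
    """
    updated = dict(coeffs)
    for link in learning_paths:
        sign = -1.0 if coeffs[link] < 0 else 1.0
        magnitude = abs(coeffs[link])
        if link == adopted:
            magnitude = min(cap, magnitude + step)
        else:
            magnitude = max(floor, magnitude - step)
        updated[link] = sign * magnitude
    return updated

# Negative search on node B with node D adopted: links BD and BG are the learning targets.
coeffs = {"BD": -0.6, "BG": -0.5, "BA": 0.8}
after = learn_links(coeffs, ["BD", "BG"], adopted="BD")
print(after)
```

Link BA is not a learning target path in the negative search, so its coefficient is left unchanged.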
[0182]
However, positive and negative search processing differ in which nodes are extracted as related nodes. The positive related nodes extracted by positive search processing are all nodes on paths where effective signal transmission occurred (learning target paths), namely nodes A, E, and J in the example of FIG. 40. The negative related nodes extracted by negative search processing, by contrast, include not only nodes on such paths (nodes D and G in the example of FIG. 42) but also nodes to which no signal was transmitted (nodes C, F, H, I, and K in the example of FIG. 42). When learning the frequency coefficients, it is therefore preferable to include the nodes without signal transmission in the learning target. Accordingly, when node D is adopted in FIG. 42, learning is performed that increases the frequency coefficient of the adopted node D and decreases the frequency coefficients of all the remaining candidate nodes G, C, F, H, I, and K that were not adopted. Likewise, when a node without signal transmission is selected, the frequency coefficient of the selected node may be increased and the frequency coefficients of all the remaining candidate nodes decreased. For example, when node I is adopted in FIG. 42, the frequency coefficient of node I may be increased and the frequency coefficients of all the remaining candidate nodes C, D, G, F, H, and K decreased.
[0183]
In this embodiment, when a node without signal transmission is selected as a result of negative search processing, an additional learning process that defines a new negative link is executed. For example, consider the case where node I is adopted. In the state shown in FIG. 42, there is no direct link between the node of interest B and the adopted node I. Node I was presented as a candidate node for node B only because no signal reached node I under the condition that the number of hops H = 1, not because a negative link had been explicitly defined between nodes B and I. Nevertheless, the fact that the operator selected node I as a result of negative search processing on the node of interest B indicates that the operator recognizes a negative relationship between node B and node I. As already described, in the state shown in FIG. 42 nodes D and G have a higher priority as candidate nodes than node I. Since the operator nonetheless passed over nodes D and G and deliberately adopted node I, it is preferable that node I be presented with a higher priority when a negative search with node B as the node of interest is performed again in the future.
[0184]
From this point of view, when a node to which no signal was transmitted is selected in negative search processing, the learning means 90 defines a new negative link indicating a negative relationship between the node of interest and the selected node, and adds this negative link to the link aggregate in the link storage means 50.
[0185]
For example, when node I, to which no signal was transmitted, is selected as the new node of interest in the state shown in FIG. 42, the operator has in effect indicated a negative relationship between the node of interest B and the adopted node I. Therefore, as shown in FIG. 44, a negative link BI is newly defined between node B and node I. A negative signal transmission coefficient is defined for the negative link BI; its absolute value may be set to a predetermined initial value, for example “70%”.
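The additional learning step can be sketched as below: frequency coefficients are adjusted for every candidate, and a negative link is created when the adopted node received no signal. The function name, the data layout (links keyed by `frozenset`), the unit `step`, and the starting frequency values are assumptions for illustration; only the 70% initial absolute value comes from the text.

```python
INITIAL_NEGATIVE = -0.70   # the text suggests an absolute value such as 70%

def learn_after_negative_selection(links, freq, source, candidates, adopted,
                                   signaled, step=1):
    """Update frequency coefficients and, if needed, define a new negative link.

    `links` maps frozenset({u, v}) to a signed coefficient, `freq` maps a node
    to its frequency coefficient, and `signaled` is the set of candidate nodes
    that actually received a signal. `step` is an illustrative assumption.
    """
    for node in candidates:
        change = step if node == adopted else -step
        freq[node] = freq.get(node, 0) + change
    if adopted not in signaled:
        # The adopted node had no signal transmission: define a new negative link.
        links[frozenset((source, adopted))] = INITIAL_NEGATIVE
    return links, freq

# State of FIG. 42 (magnitudes and frequency values are invented).
links = {frozenset("BD"): -0.9, frozenset("BG"): -0.5}
freq = {node: 10 for node in "CDFGHIK"}
learn_after_negative_selection(links, freq, "B", list("DGCFHIK"),
                               adopted="I", signaled={"D", "G"})
print(links[frozenset("BI")], freq["I"], freq["D"])
```

After this step, a repeated negative search on node B would transmit a signal through the new link BI, so node I would no longer be ranked among the zero-signal candidates.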
[0186]
FIG. 45 is a conceptual diagram showing negative search processing performed again with node B as the node of interest after the learning process shown in FIG. 44. Compared with the conceptual diagram of FIG. 42, the search after learning shown in FIG. 45 adds signal transmission through the negative link BI, and a positive signal value is obtained at node I. In other words, the priority of candidate node I is higher than in the search before learning. If the operator adopts candidate node I again, the flowchart-based learning process described above increases the absolute value of the signal transmission coefficient of the negative link BI on the path from the node of interest B to the adopted node I (in FIG. 46, the learning result is indicated by the sign "-"). Through such re-learning the absolute value of the signal transmission coefficient of negative link BI grows, so that when negative search processing is performed again with node B as the node of interest, the priority of candidate node I rises further and it is presented to the operator preferentially.
[0187]
FIGS. 47 and 48 show the learning process described above in combination with the AND search described earlier. Suppose that, through the combination with the AND search, the nodes in the hatched area of FIG. 37 are presented as candidate nodes and node N4 is selected from among them. Since the adopted node N4 was obtained by negative search processing with node N3 as the node of interest, the operator is taken to recognize a negative relationship between the node of interest N3 and the adopted node N4. If the adopted node N4 is a node to which no signal was transmitted, a new negative link is defined between the node of interest N3 and the adopted node N4.
[0188]
§9. Link reconfiguration
One of the features of the database system according to this embodiment is that the link aggregate in the link storage means 50 is learned by the learning means 90. The learning described so far, however, mainly consists of corrections that increase or decrease the signal transmission coefficient given to each static link. Of course, the link aggregate is also modified by the process described in §3, §4.7, and §6.3, in which a dynamic link is defined, promoted to a static link, and added as a new member of the link aggregate, and by the process described in §8.3, in which a new negative link is added when negative search processing is performed. The embodiments so far, however, do not address link reconfiguration, in which static links that have already been defined are consolidated. This section describes a database system that performs such consolidation of existing static links as an accompanying learning process.
[0189]
FIG. 49 is a block diagram of an embodiment in which link reconfiguration means 130 is further added to the database system described above. Based on the corrections to the signal transmission coefficients made by the learning means 90, the link reconfiguration means 130 performs link reconfiguration processing, such as deleting an existing link or adding a new link, on the link aggregate in the link storage means 50. Two specific examples of link reconfiguration processing are given below.
[0190]
<9.1: First Example of Link Reconfiguration Processing>
Suppose, as shown in the left column of FIG. 50, that for three nodes N1, N2, and N3 the signal transmission coefficient of the negative link L1 between nodes N1 and N2 is at or above a “predetermined reference”, the signal transmission coefficient of the negative link L2 between nodes N2 and N3 is at or above a “predetermined reference”, and the frequency coefficient of the intermediate node N2 between links L1 and L2 is at or below a “predetermined reference”. In this case, the negative links L1 and L2 have been involved frequently in the operator's past actions, but the intermediate node N2 itself has rarely been adopted. In other words, node N2 serves merely as a transit node on a frequently used path from node N1 to node N3. In such a case, as shown in the right column of FIG. 50, the negative links L1 and L2 may be deleted and a bypass positive link L3 newly defined between nodes N1 and N3. If the product of the signal transmission coefficients of negative links L1 and L2 is given as the signal transmission coefficient of the new positive link L3 (being a product of two negative coefficients, it is naturally positive), the attenuation of a signal transmitted between nodes N1 and N3 stays the same as before the reconfiguration. Moreover, node N3 has a negative relationship with “node N2, which has a negative relationship with node N1”, so node N3 can be expected to have a positive relationship with node N1, and connecting nodes N1 and N3 directly by the positive link L3 is reasonable.
[0191]
In short, when for a particular node N2 the frequency coefficient of node N2 is at or below a “predetermined reference”, a first negative link L1 exists whose signal transmission coefficient between node N2 and a first other node N1 is at or above a “predetermined reference”, and a second negative link L2 exists whose signal transmission coefficient between node N2 and a second other node N3 is at or above a “predetermined reference”, link reconfiguration may be performed that deletes the first negative link L1 and the second negative link L2 and adds a new positive link L3 between the first node N1 and the second node N3.
[0192]
A fixed reference value may be set in advance as the “predetermined reference” used to decide whether to perform such link reconfiguration. Alternatively, instead of setting a specific reference value, the relative magnitudes of the signal transmission coefficients and the frequency coefficient may be compared. For example, in the example of FIG. 50, link reconfiguration as described above may be performed when the signal transmission coefficient of negative link L1 is at least twice the frequency coefficient of node N2 and the signal transmission coefficient of negative link L2 is at least twice the frequency coefficient of node N2. In this case, the “predetermined reference” for the frequency coefficient of node N2 is determined by the signal transmission coefficient values of negative links L1 and L2, and conversely the “predetermined reference” for the signal transmission coefficients of negative links L1 and L2 is determined by the frequency coefficient value of node N2.
[0193]
<9.2: Second example of link reconfiguration processing>
Suppose, as shown in the left column of FIG. 51, that for three nodes N1, N2, and N3 the signal transmission coefficient of the negative link L1 between nodes N1 and N2 is at or above a “predetermined reference”, the signal transmission coefficient of the positive link L2 between nodes N2 and N3 is at or above a “predetermined reference”, and the frequency coefficient of the intermediate node N2 between links L1 and L2 is at or below a “predetermined reference”. In this case, the negative link L1 and the positive link L2 have been involved frequently in the operator's past actions, but the intermediate node N2 itself has rarely been adopted. In other words, node N2 serves merely as a transit node on a frequently used path from node N1 to node N3. In such a case, as shown in the right column of FIG. 51, the negative link L1 and the positive link L2 may be deleted and a bypass negative link L3 newly defined between nodes N1 and N3. If the product of the signal transmission coefficients of negative link L1 and positive link L2 is given as the signal transmission coefficient of the new negative link L3 (being a product of a negative and a positive coefficient, it is naturally negative), the attenuation of a signal transmitted between nodes N1 and N3 stays the same as before the reconfiguration. Moreover, node N3 has a positive relationship with “node N2, which has a negative relationship with node N1”, so node N3 can be expected to have a negative relationship with node N1, and connecting nodes N1 and N3 directly by the negative link L3 is reasonable.
[0194]
In short, when for a particular node N2 the frequency coefficient of node N2 is at or below a “predetermined reference”, a negative link L1 exists whose signal transmission coefficient between node N2 and a first other node N1 is at or above a “predetermined reference”, and a positive link L2 exists whose signal transmission coefficient between node N2 and a second other node N3 is at or above a “predetermined reference”, link reconfiguration may be performed that deletes the negative link L1 and the positive link L2 and adds a new negative link L3 between the first node N1 and the second node N3.
[0195]
Here too, a fixed reference value may be set in advance as the “predetermined reference” used to decide whether to perform such link reconfiguration. Alternatively, instead of setting a specific reference value, the relative magnitudes of the signal transmission coefficients and the frequency coefficient may be compared. For example, in the example of FIG. 51, link reconfiguration as described above may be performed when the signal transmission coefficient of negative link L1 is at least twice the frequency coefficient of node N2 and the signal transmission coefficient of positive link L2 is at least twice the frequency coefficient of node N2. In this case, the “predetermined reference” for the frequency coefficient of node N2 is determined by the signal transmission coefficient values of negative link L1 and positive link L2, and conversely the “predetermined reference” for the signal transmission coefficients of negative link L1 and positive link L2 is determined by the frequency coefficient value of node N2.
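Both reconfiguration examples can be expressed by one routine, since the sign of the bypass link follows from the product of the two deleted coefficients. The sketch below uses fixed reference values; the name `reconfigure`, the `frozenset` link keys, the comparison on absolute values, and all numbers are assumptions for illustration, and the sketch assumes no direct link between N1 and N3 exists beforehand.

```python
def reconfigure(links, freq, n1, n2, n3, coeff_ref, freq_ref):
    """Replace links n1-n2 and n2-n3 by a direct n1-n3 link when n2 is only a transit node.

    `links` maps frozenset({u, v}) to a signed coefficient. The coefficient
    test uses absolute values (an assumption for negative coefficients).
    Returns True if the reconfiguration was carried out.
    """
    l1, l2 = frozenset((n1, n2)), frozenset((n2, n3))
    if l1 not in links or l2 not in links:
        return False
    c1, c2 = links[l1], links[l2]
    if abs(c1) < coeff_ref or abs(c2) < coeff_ref or freq[n2] > freq_ref:
        return False
    del links[l1], links[l2]
    links[frozenset((n1, n3))] = c1 * c2   # the sign follows from the product
    return True

# First example (FIG. 50): negative x negative gives a positive bypass link.
links_a = {frozenset(("N1", "N2")): -0.8, frozenset(("N2", "N3")): -0.5}
done_a = reconfigure(links_a, {"N2": 1}, "N1", "N2", "N3", coeff_ref=0.4, freq_ref=2)

# Second example (FIG. 51): negative x positive gives a negative bypass link.
links_b = {frozenset(("N1", "N2")): -0.8, frozenset(("N2", "N3")): 0.5}
done_b = reconfigure(links_b, {"N2": 1}, "N1", "N2", "N3", coeff_ref=0.4, freq_ref=2)

print(done_a, links_a)
print(done_b, links_b)
```

Because the new coefficient is the product of the old ones, a signal sent from N1 reaches N3 with the same value as it did along the two-link path before reconfiguration.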
[0196]
<9.3: Specific Example of Link Reconfiguration Processing>
Finally, an example is given in which the link reconfiguration processing described above is applied to the link aggregate shown in FIG. 39. Suppose that, in FIG. 39, comparing the absolute values of the signal transmission coefficients of negative link BG and negative link GH with the frequency coefficient of node G shows that the former are at least twice the latter. In this case, negative links BG and GH are used relatively often in node selection while node G itself is seldom adopted, so node G can be regarded as serving merely as a transit node. The negative links BG and GH may then be deleted and a positive link BH newly defined between nodes B and H. The signal transmission coefficient of positive link BH may be the product of the signal transmission coefficients of the deleted negative links BG and GH.
[0197]
Similarly, suppose that, in FIG. 39, comparing the absolute values of the signal transmission coefficients of negative link HG and positive link GJ with the frequency coefficient of node G shows that the former are at least twice the latter. In this case, negative link HG and positive link GJ are used relatively often in node selection while node G itself is seldom adopted, so node G can be regarded as serving merely as a transit node. The negative link HG and the positive link GJ may then be deleted and a negative link HJ newly defined between nodes H and J. The signal transmission coefficient of negative link HJ may be the product of the signal transmission coefficients of the deleted negative link HG and the deleted positive link GJ.
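The relative criterion used in this specific example can be checked with a short calculation. FIG. 39 gives only the signs of the links, so every number below, the scale of the frequency coefficient, and the names `is_transit` and `bypass_coefficient` are hypothetical.

```python
def is_transit(c1, c2, freq_mid, factor=2.0):
    """Relative criterion: both link magnitudes are at least `factor` times
    the frequency coefficient of the intermediate node."""
    return abs(c1) >= factor * freq_mid and abs(c2) >= factor * freq_mid

def bypass_coefficient(c1, c2):
    """Coefficient of the bypass link: the product of the two deleted coefficients."""
    return c1 * c2

# Hypothetical values for node G and its links.
freq_g = 0.2
bg, gh, gj = -0.8, -0.6, 0.5   # BG and GH are negative links, GJ is a positive link

# Path B - G - H: two negative links, so the bypass link BH is positive.
if is_transit(bg, gh, freq_g):
    bh = bypass_coefficient(bg, gh)
    print("BH", bh)

# Path H - G - J: one negative and one positive link, so the bypass link HJ is negative.
if is_transit(gh, gj, freq_g):
    hj = bypass_coefficient(gh, gj)
    print("HJ", hj)
```

The two cases are shown as alternatives starting from the same original aggregate; applying one of them deletes link GH (the same link as HG), so the other could not then be applied as written.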
[0198]
§10. Loser revival type AND search
Various functions of the database system according to an embodiment of the present invention have been described so far. The essence of the present invention lies in adding to such a database system the function called “loser revival type AND search” described below.
[0199]
<10.1: Basic Concept of Loser Revival Type AND Search>
The “loser revival type AND search” described here is an extension of the “AND search” described in §7. Suppose that the target node (the node associated with the data being sought) in the node aggregate shown in FIG. 54 is node N0, and that, in order to find the target node N0, the operator designates node N1 as the first node of interest (specifically, by entering the keyword defined for node N1). Suppose further that this first search yields the related set G1 shown in FIG. 55 and that the nodes in G1 are presented as candidate nodes. In this case the target node N0 falls outside the related set G1. The operator entered a keyword that seemed appropriate for finding the target node N0 and thereby designated the first node of interest N1, but the search result shown in FIG. 55 is contrary to the operator's expectation. As already mentioned, the links indicating the relationships between the nodes of this node aggregate are defined mainly by the system administrator and therefore do not necessarily match the expectations of individual users. Consequently, as in the example of FIG. 55, the target node N0 may be missed by the search even though node N1 was designated precisely for the purpose of finding node N0.
[0200]
If the target node N0 is missed by the first search in this way, it will never be presented as a candidate node, no matter how many searches are repeated in the “AND search” mode described in §7. The related set G1 shown in FIG. 55 becomes the new mother set M2, and node N0 does not belong to M2. Therefore, even if node N2 in mother set M2 is adopted as the new node of interest, a second search is performed, and a related set G2 containing node N0 is obtained as shown in FIG. 56, the candidate nodes presented by the second search are only the portion “(M2) AND (G2)” shown by hatching in FIG. 56. As shown in FIG. 57, this hatched portion becomes the new mother set M3. Likewise, even if node N3 in mother set M3 is adopted as the new node of interest, a third search is performed, and a related set G3 containing node N0 is obtained, the candidate nodes presented by the third search are only the hatched portion “(M3) AND (G3)”. As shown in FIG. 59, this hatched portion becomes the new mother set M4. Again, even if node N4 in mother set M4 is adopted as the new node of interest, a fourth search is performed, and a related set G4 containing node N0 is obtained, the candidate nodes presented by the fourth search are only the hatched portion “(M4) AND (G4)”.
[0201]
In the end, after the series of four searches, only the portion corresponding to the logical product of all four related sets G1 to G4 (the portion shown by hatching in FIG. 61) remains as the final candidates, and the target node N0 is left out. As FIG. 61 makes clear, the target node N0 is contained in each of the related sets G2, G3, and G4, but it happened not to be contained in the related set G1 and was therefore dropped from the candidates. The “loser revival type AND search”, which is the essence of the present invention, is a device for eliminating this drawback: in a series of repeated AND searches, a node that has once dropped out of the candidates is given the chance to return as a candidate, just as in a losers' revival round of a tournament.
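The drawback of the plain AND search can be reproduced with ordinary set intersection. The related sets below are invented so that node N0 is missing only from G1; the node names other than N0 to N5 and the function name `and_search` are hypothetical.

```python
def and_search(related_sets):
    """Candidates after each search: the logical product of all related sets so far."""
    candidates = None
    history = []
    for related in related_sets:
        candidates = set(related) if candidates is None else candidates & set(related)
        history.append(candidates)
    return history

# Hypothetical related sets; the target node N0 is missing only from G1.
G1 = {"N2", "N3", "N4", "N5", "N9"}
G2 = {"N0", "N3", "N4", "N5", "N9"}
G3 = {"N0", "N4", "N5", "N9"}
G4 = {"N0", "N5", "N9"}

history = and_search([G1, G2, G3, G4])
print(sorted(history[-1]))          # final candidates after four searches
print(sorted(G2 & G3 & G4))         # the same product with G1 left out
```

Node N0 never appears in any candidate set of the plain AND search, yet it is present as soon as G1 is left out of the logical product.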
[0202]
In the above example, the final candidates remaining after the four AND searches are the logical product “(G1) AND (G2) AND (G3) AND (G4)”. However, if the related set G1 obtained in the first search is excluded from the logical product, the final candidates become “(G2) AND (G3) AND (G4)”, shown as the hatched area in the corresponding figure, and the target node N0 is presented as a candidate node. In other words, the node N0, which was dropped from the candidates in the first search, was extracted as a related node in each of the second to fourth searches and is therefore revived as a final candidate after the fourth search.
[0203]
FIG. 63 is a conceptual diagram showing a series of five search processes in this “loser revival type AND search” mode. In the figure, (1) to (5) denote the first to fifth search processes, respectively, and the bottom row shows the set of candidate nodes presented after each search process. In the first search, a related set G1 is extracted by searching on the node of interest N1, and this related set G1 is presented as it is as the set of candidate nodes. In the second search, the related set G2 is extracted by searching on the new target node N2 selected from the related set G1, and the logical product of the related sets G1 and G2 is presented as the set of candidate nodes. In the third search, the related set G3 is extracted by searching on the new target node N3 selected from the related set G2, and the logical product of the related sets G1, G2, and G3 is presented as the set of candidate nodes. Up to this point, the procedure is the same as the AND search described in §7.
[0204]
In the “loser revival type AND search” exemplified here, candidate nodes are determined in the form of “not considering search results from three or more searches earlier”. That is, in the fourth search, the related set G4 is extracted by searching on the new target node N4 selected from the related set G3. At this point, the search three searches earlier (the first search) is not considered, and the logical product of the related sets G2, G3, and G4 is presented as the set of candidate nodes. A node that is not included in the related set G1 and was therefore dropped from the candidates in the first search is thus given a chance to be revived as a candidate in the fourth search. Likewise, in the fifth search, the related set G5 is extracted by searching on the new target node N5 selected from the related set G4. At this point, the searches three or more searches earlier (the second search and earlier) are not considered, and the logical product of the related sets G3, G4, and G5 is presented as the set of candidate nodes. A node that is not included in the related set G1 or G2 and was therefore dropped from the candidates is thus also given a chance to be revived.
[0205]
FIG. 64 shows how the set of candidate nodes is determined at the i-th search (where i ≧ 3) of the “loser revival type AND search” in the form of “not considering search results from three or more searches earlier”. That is, in the i-th search, the related set Gi is extracted by searching on the new node of interest Ni selected from the related set G(i-1). At this point, the searches three or more searches earlier (the (i-3)th search and earlier) are not considered, and the logical product of the related sets G(i-2), G(i-1), and Gi is presented as the set of candidate nodes.
[0206]
However, the “loser revival type AND search” according to the present invention is not limited to the form of “not considering search results from three or more searches earlier”; in general, it can take the form of “not considering search results from n or more searches earlier” (where n is a natural number of 2 or more). The above example corresponds to n = 3. If n = 2 is set, then in FIG. 63 the logical product of G2 and G3 is presented as the candidate nodes in the third search, the logical product of G3 and G4 in the fourth search, and the logical product of G4 and G5 in the fifth search, so the condition for loser revival is relaxed. Conversely, if n = 4 is set, then in FIG. 63 the logical product of G1, G2, G3, and G4 is presented as the candidate nodes in the fourth search, so no loser revival is allowed at that point. Only in the fifth search, where the logical product of G2, G3, G4, and G5 is presented as the candidate nodes, is loser revival finally allowed.
[0207]
Stated as a general theory, in the system shown in FIG. 21, a natural number n of 2 or more is set in advance. When the search means 40 executes the i-th search process, then in the case of i ≦ n, the logical product of the i related sets extracted by the first to i-th search processes is obtained as the candidate set (this is the “AND search” described in §7), and in the case of i > n, the logical product of the n related sets extracted by the (i−n+1)th to i-th search processes is obtained as the candidate set. The candidate presenting means 60 then presents the nodes belonging to the obtained candidate set as candidate nodes.
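The general rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation of the search means 40 or the candidate presenting means 60: the function name `candidate_set`, the representation of related sets as Python sets, and the sample node names are all assumptions made for the example.

```python
from typing import Hashable, List, Set


def candidate_set(related_sets: List[Set[Hashable]], n: int) -> Set[Hashable]:
    """Candidates after the i-th search, where i = len(related_sets).

    For i <= n this is the plain AND search (logical product of G1..Gi);
    for i > n only the n most recent related sets G(i-n+1)..Gi are used.
    """
    if n < 2:
        raise ValueError("n must be a natural number of 2 or more")
    if not related_sets:
        raise ValueError("at least one search must have been performed")
    window = related_sets[-n:]  # the last min(i, n) related sets
    return set.intersection(*map(set, window))


# Hypothetical data: N0 is missing from G1 only, as in the situation of FIG. 61.
G1 = {"N2", "X"}
G2 = {"N0", "N3", "X"}
G3 = {"N0", "N4", "X"}
G4 = {"N0", "X"}
history = [G1, G2, G3, G4]

print(candidate_set(history[:3], n=3))  # third search: same as the plain AND search
print(candidate_set(history, n=3))      # fourth search: G1 is no longer considered
```

With n = 3, the node N0 reappears after the fourth search because only G2, G3, and G4 are intersected; with n = 4 the same call still intersects all four sets and N0 stays excluded.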
[0208]
<10.2: Method for defining status in a node>
Here, a specific method for realizing the above-described “loser revival type AND search” is described. In this method, a status is defined for each node when a “loser revival type AND search” is performed, and whether each node should be a candidate node is determined while these statuses are changed.
[0209]
In terms of the general theory above, a total of (n + 1) status levels, from the lowest status S1 to the highest status S(n + 1), are defined for each node, and at the start of a series of repeated search processes the status of every node is set to the second-highest status Sn. For example, when n = 3, a total of four status levels S1, S2, S3, and S4 are defined. Here, the status S1 is the lowest status, and the status S4 is the highest status. At the start of a series of repeated search processes, the status of all nodes is set to the second-highest status S3. In the embodiment described here, an exclusion status S0 is defined as an additional status. This exclusion status S0 is a special status used to prevent a node that has already served as a node of interest from being presented as a candidate node again. FIG. 65 shows a list of these statuses S1 to S4 and S0.
[0210]
These statuses are defined only while a series of repeated search processes is executed in the “loser revival type AND search” mode. At the start of such a series, all nodes are always given the second-highest status S3 (the status one level below the highest status). That is, this second-highest status S3 is the initial status of every node.
[0211]
During execution of a series of repetitive search processes in the “loser revival type AND search” mode, status transition is performed according to the following transition rules (a) to (c) at the time of each search.
(a) The status of the node (related node) extracted as a related set by the search process is promoted by one level. However, the uppermost status S4 is set as an upper limit, and the node originally in the highest status S4 maintains the status S4 as it is. Also, the node with the exclusion status S0 maintains the status S0 as it is.
(b) Nodes that have not been extracted as related sets by the search process (nodes that have not become related nodes) are dropped to the lowest status S1. However, the lowest status S1 is set as the lower limit, and the node originally in the lowest status S1 maintains the status S1 as it is. Further, the node with the exclusion status S0 maintains the status S0 as it is.
(c) The target node when the search process is performed is shifted to the exclusion status S0.
[0212]
The candidate presenting means 60 may perform a process of extracting the node of the highest status S4 as a candidate node after such status transition and presenting it to the operator. FIG. 66 is a diagram showing a list of such status transition rules.
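Transition rules (a) to (c) and the extraction of nodes in the highest status can be sketched as follows. This is a minimal sketch under stated assumptions: statuses are held as integers 0 to n + 1 in a dictionary, and the names `apply_search`, `candidates`, and the four nodes P to S are hypothetical and do not appear in the specification.

```python
from typing import Dict, Iterable, Set

EXCLUDED = 0  # exclusion status S0


def apply_search(status: Dict[str, int], target: str,
                 related: Iterable[str], n: int = 3) -> Dict[str, int]:
    """Return the statuses after one search, following rules (a) to (c)."""
    top = n + 1                      # highest status S(n+1)
    related = set(related)
    new_status = {}
    for node, s in status.items():
        if node == target or s == EXCLUDED:
            new_status[node] = EXCLUDED          # rule (c); S0 is kept as it is
        elif node in related:
            new_status[node] = min(s + 1, top)   # rule (a): one level up, capped
        else:
            new_status[node] = 1                 # rule (b): fall to S1
    return new_status


def candidates(status: Dict[str, int], n: int = 3) -> Set[str]:
    """Nodes in the highest status S(n+1) are presented as candidates."""
    return {node for node, s in status.items() if s == n + 1}


# Hypothetical four-node example with n = 3: every node starts in S3.
status = {name: 3 for name in "PQRS"}
status = apply_search(status, target="P", related={"Q", "R"})
print(status)                      # {'P': 0, 'Q': 4, 'R': 4, 'S': 1}
print(sorted(candidates(status)))  # ['Q', 'R']
```

Returning a new dictionary rather than updating in place keeps each search step easy to inspect; an in-place update would work equally well.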
[0213]
That the “loser revival type AND search” can actually be carried out by this status-based method is explained with reference to the specific example of FIG. 67. In the example shown in FIG. 67, a series of five repeated searches is performed on a node aggregate consisting of 11 nodes A to K, and rows (1) to (6) in the left column show the states before the first to sixth searches, respectively (in this figure, the sixth search has not yet been performed). The status of each node at each point is indicated by the number 0 to 4 written to the right of the node name A to K.
[0214]
The nodes A to K listed in the first row (1) are the nodes before the first search, and all are set to the initial status S3 (the notation A3 to K3 indicates that each node is in status S3). Suppose that node E is designated as the first node of interest among these nodes, and that nodes B, C, D, F, G, and H are extracted as related nodes by the first search using node E as the node of interest. In the figure, the node of interest and each related node found are connected by arrows. When the status transitions based on the above transition rules are executed for this first search, the related nodes B, C, D, F, G, and H are first promoted by one level according to transition rule (a), so their status changes from S3 to S4. Next, the nodes A, I, J, and K, which did not become related nodes, fall to the lowest status S1 according to transition rule (b). Finally, according to transition rule (c), the node of interest E moves to the exclusion status S0. Because node E keeps the exclusion status S0, it is not presented as a candidate node in any of the subsequent repeated searches.
[0215]
The nodes A to K listed in the second row (2) are the nodes after the first search. The nodes enclosed in squares are the nodes in the highest status S4 and are presented as candidate nodes. Suppose that node D is designated as the new node of interest from among these candidate nodes, and that nodes B, C, E, G, H, and I are extracted as related nodes by the second search using node D as the node of interest. When the status transitions based on the above transition rules are executed for this second search, the related nodes B, C, E, G, H, and I are first subject to promotion by one level according to transition rule (a). However, the related nodes B, C, G, and H are already in the highest status S4 and keep that status, and the related node E keeps the exclusion status S0. The only actual status transition is therefore the promotion of node I from status S1 to S2. Next, the nodes A, F, J, and K, which did not become related nodes, fall to the lowest status S1 according to transition rule (b). However, the nodes A, J, and K are already in the lowest status S1 and keep that status, so only node F actually falls to status S1. Finally, according to transition rule (c), the node of interest D moves to the exclusion status S0.
[0216]
The nodes A to K listed in the third row (3) are the nodes after the second search. Again, the nodes in the highest status S4, enclosed in squares, are presented as candidate nodes. Suppose that node C is designated as the new node of interest from among these candidate nodes, and that nodes D, E, G, H, I, and J are extracted as related nodes by the third search using node C as the node of interest. When the status transitions based on the above transition rules are executed for this third search, the related nodes D, E, G, H, I, and J are first subject to promotion by one level according to transition rule (a). However, the related nodes G and H are already in the highest status S4 and keep that status, and the related nodes D and E keep the exclusion status S0. The only actual status transitions are therefore the promotion of node I from status S2 to S3 and the promotion of node J from status S1 to S2. Next, the nodes A, B, F, and K, which did not become related nodes, fall to the lowest status S1 according to transition rule (b). However, the nodes A, F, and K are already in the lowest status S1 and keep that status, so only node B actually falls to status S1. Finally, according to transition rule (c), the node of interest C moves to the exclusion status S0.
[0217]
The nodes A to K listed in the fourth row (4) are the nodes after the third search. Again, the nodes in the highest status S4, enclosed in squares, are presented as candidate nodes. Suppose that node H is designated as the new node of interest from among these candidate nodes, and that nodes C, D, E, G, I, and J are extracted as related nodes by the fourth search using node H as the node of interest. When the status transitions based on the above transition rules are executed for this fourth search, the related nodes C, D, E, G, I, and J are first subject to promotion by one level according to transition rule (a). However, the related node G is already in the highest status S4 and keeps that status, and the related nodes C, D, and E keep the exclusion status S0. The only actual status transitions are therefore the promotion of node I from status S3 to S4 and the promotion of node J from status S2 to S3. Next, the nodes A, B, F, and K, which did not become related nodes, fall to the lowest status S1 according to transition rule (b). However, all of them are already in the lowest status S1 and keep that status, so no actual fall occurs. Finally, according to transition rule (c), the node of interest H moves to the exclusion status S0.
[0218]
The nodes A to K listed in the fifth row (5) are the nodes after the fourth search. Again, the nodes in the highest status S4, enclosed in squares, are presented as candidate nodes. Suppose that node I is designated as the new node of interest from among these candidate nodes, and that nodes C, D, E, H, J, and K are extracted as related nodes by the fifth search using node I as the node of interest. When the status transitions based on the above transition rules are executed for this fifth search, the related nodes C, D, E, H, J, and K are first subject to promotion by one level according to transition rule (a). However, the related nodes C, D, E, and H keep the exclusion status S0. The only actual status transitions are therefore the promotion of node J from status S3 to S4 and the promotion of node K from status S1 to S2. Next, the nodes A, B, F, and G, which did not become related nodes, fall to the lowest status S1 according to transition rule (b). However, the nodes A, B, and F are already in the lowest status S1 and keep that status, so only node G actually falls to status S1. Finally, according to transition rule (c), the node of interest I moves to the exclusion status S0. The nodes A to K listed in the sixth row (6) are the nodes after the fifth search.
[0219]
The point to note in the series of repeated search processes described above is that the fall experience node I, which was dropped from the candidates and fell to the lowest status S1 because it did not become a related node in the first search process, became a related node in all of the second to fourth search processes and was therefore revived as a candidate node after the fourth search process. Similarly, the fall experience node J, which was dropped from the candidates and fell to the lowest status S1 because it did not become a related node in the first search process, became a related node in all of the third to fifth search processes and was therefore revived as a candidate node after the fifth search process. In short, when n = 3 is set, a node that becomes a related node in three consecutive searches is revived as a candidate even if it failed to become a related node before that and fell to the lowest status S1. More generally, when an arbitrary n is set (where n is a natural number of 2 or more), a node that becomes a related node in n consecutive searches is revived as a candidate even if it had previously fallen to the lowest status S1. Note that a node that has once served as a target node is never presented as a candidate node, because it moves to the exclusion status S0. That is, since a node once adopted by the operator is not presented again as a candidate, duplicate adoption is avoided.
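The walkthrough of FIG. 67 can be replayed mechanically as a cross-check. The sketch below is an assumption-laden illustration: the node names, nodes of interest, and related sets are taken from the description above, while the helper `step` and the comparison against the logical product of the n most recent related sets (with already adopted nodes removed) are illustrative choices, not part of the specification.

```python
NODES = "ABCDEFGHIJK"
N = 3  # results from three or more searches earlier are not considered

# (node of interest, related nodes) for the five searches of FIG. 67
SEARCHES = [
    ("E", set("BCDFGH")),
    ("D", set("BCEGHI")),
    ("C", set("DEGHIJ")),
    ("H", set("CDEGIJ")),
    ("I", set("CDEHJK")),
]


def step(status, target, related):
    """Apply transition rules (a) to (c) for one search."""
    out = {}
    for node, s in status.items():
        if node == target or s == 0:
            out[node] = 0                    # rule (c); S0 is kept
        elif node in related:
            out[node] = min(s + 1, N + 1)    # rule (a)
        else:
            out[node] = 1                    # rule (b)
    return out


status = {node: N for node in NODES}         # initial status S3 for all nodes
used, history, trace = set(), [], []
for target, related in SEARCHES:
    status = step(status, target, related)
    used.add(target)
    history.append(related)
    by_status = {node for node, s in status.items() if s == N + 1}
    by_window = set.intersection(*history[-N:]) - used
    assert by_status == by_window            # both views agree on this data
    trace.append("".join(sorted(by_status)))

print(trace)  # ['BCDFGH', 'BCGH', 'GH', 'GI', 'J']
```

The trace shows node I appearing among the candidates after the fourth search and node J after the fifth, matching the revival of the two fall experience nodes described above.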
[0220]
<10.3: More general method using status definition>
In the above method, when n = 3 is set, for example, four status levels S1 to S4 are defined, and a node that has fallen to the lowest status S1 is revived as a candidate if it is subsequently extracted into the related set three times in a row. This is because, under transition rule (a) described above, a node extracted into the related set is promoted by one level, so three consecutive extractions yield three levels of promotion and carry the node from the lowest status S1 to the highest status S4. In the general case where a total of (n + 1) status levels S1 to S(n + 1) are defined, n consecutive extractions into the related set yield n levels of promotion, so the node moves from the lowest status S1 to the highest status S(n + 1) and is presented as a candidate.
[0221]
In the above-mentioned method, the promotion is uniformly one level for all nodes. In implementing the present invention, however, the number of promotion levels need not always be one level for every node. That is, the number of promotion levels per search can be defined generally as u levels (where u is a natural number with 0 < u < n), the value of u can be set independently for each node, and a different value of u can be set for the same node in each search process. When the number of promotion levels is generalized to u in this way, the above-described transition rule (a) becomes as follows.
(a) Each node extracted into the related set by the search process (related node) is promoted by u levels, based on the number of promotion levels u defined in advance for that node (where u is a natural number with 0 < u < n). However, the highest status S(n + 1) is the upper limit, and if a promotion by u levels would exceed the highest status, the node is promoted only to the highest status. A node in the exclusion status S0 keeps the status S0 as it is.
[0222]
With transition rule (a) generalized in this way, the example of §10.2 described above is the special case in which the number of promotion levels u is uniformly 1 for all nodes. If the number of promotion levels u is not fixed uniformly but is determined independently for each node, more flexible loser revival conditions can be set. That is, each time a search process is performed, attention is paid to the degree of association between the node of interest and each related node, and the higher the degree of association, the larger the value of u used at that time. If the number of promotion levels u is set in this way each time, nodes with a higher degree of association satisfy the loser revival condition sooner.
[0223]
For example, suppose that n = 300 is set, so that a total of 301 status levels are defined, and that the number of promotion levels u has u = 100 as its standard value, with values in the range of u = 50 to 150 set individually for each node. In this case, a standard node with u = 100 that has fallen to the lowest status satisfies the loser revival condition and is presented as a candidate if it is extracted into the related set three times in a row, whereas a weakly associated node with u = 50 does not satisfy the loser revival condition unless it is extracted into the related set six times in a row. Conversely, a strongly associated node with u = 150 satisfies the loser revival condition and is presented as a candidate after being extracted into the related set only twice in a row, even if it has fallen to the lowest status. In this way, if the value of u is set as an independent variable for each node, a different loser revival condition can be imposed on each node, and the timing of loser revival can be varied from node to node.
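The arithmetic of this example can be checked with a short sketch. It assumes, for simplicity, that u stays constant for a given node across searches (the text also allows u to vary per search); the function name `hits_to_revive` is hypothetical.

```python
def hits_to_revive(n: int, u: int) -> int:
    """Consecutive extractions needed to climb from S1 to S(n+1) at u levels each.

    Assumes a constant promotion step u for the node in every search.
    """
    if u <= 0:
        raise ValueError("u must be a positive natural number")
    status, top, hits = 1, n + 1, 0
    while status < top:
        status = min(status + u, top)  # generalized rule (a), capped at the top
        hits += 1
    return hits


for u in (100, 50, 150):
    print(u, hits_to_revive(300, u))  # 100 -> 3, 50 -> 6, 150 -> 2
```

With u = 1 the function returns n, which is the uniform one-level case of §10.2.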
[0224]
The value of the number of promotion levels u can be determined using the signal value obtained at each related node when the search process is performed. As already described, the search process is carried out by transmitting a signal having a predetermined signal value from the target node along the links, and nodes at which a signal value of a predetermined level or higher is obtained are extracted as related nodes. The signal value obtained at each related node indicates its degree of association with the target node, so a related node with a large signal value can be said to have a high degree of association with the target node. Therefore, for a strongly associated node at which a large signal value is obtained in a given search process, the number of promotion levels u for the status transition in that search process can be set to a large value, and conversely, for a weakly associated node at which only a small signal value is obtained, u can be set to a small value. Since the signal value obtained at each related node changes from one search process to the next, the number of promotion levels u defined for each node is likewise a variable that changes with each search process.
[0225]
As for the initial status at the start of a series of repeated search processes, the above-described embodiment sets the second-highest status Sn (the n-th status from the bottom) for all nodes. However, the initial status need not necessarily be the second-highest status Sn. For example, when n = 300 is set so that a total of 301 status levels are defined, and u = 100 is set as the standard value of the number of promotion levels u, it is preferable to set the initial status to around the status S200, the 200th level from the bottom, rather than to the second-highest status S300. In some cases, the highest status can also be set as the initial status. In general, any status Sj that is the j-th status from the bottom, where j is a natural number with 1 < j ≦ n + 1, may be set as the initial status in the present invention.
[0226]
<10.4: Learning in Loser Resurrection AND Search>
In the above-described “loser revival type AND search”, it is preferable for the learning means 90 to perform learning when the operator adopts a node that has been revived as a loser. That is, when a fall experience node that once fell to the lowest status S1 during a series of repeated search processes is later presented as a candidate node and is adopted as the new node of interest, a new link may be defined between this fall experience node and the node that was the node of interest when it fell to the lowest status S1, and a learning process that adds this new link to the link aggregate may be performed.
[0227]
In the example shown in FIG. 67, this learning process proceeds as follows. Nodes I and J are both fall experience nodes that fell to the lowest status S1 in the first search process. However, the fall experience node I is presented as a candidate node after the fourth search process and is selected as the node of interest for the fifth search process. In this case, a new link may be defined between the fall experience node I and node E, the node of interest in the first search, in which node I fell. This is based on the idea that “node I should have been presented as a candidate node at the time of the first search using node E as the target node”. Once learning that adds the new link has been performed in this way, node I is extracted as a related node at the first search when a search process using node E as the target node is executed again, and it is therefore no longer dropped from the candidates. Similarly, in the example shown in FIG. 67, if node J is adopted as the new node of interest after the fifth search process, learning that defines a new link between node J and node E is performed.
[0228]
This learning process can also be described using the example shown in FIG. 62. In that example, as described above, the node N0, which was dropped in the first search, is restored to the candidates by the four repeated searches. Consider the case where, in this state, the node N0 (a fall experience node) is adopted as the new node of interest. In this case, the node N0 was reached only after four searches, but from the user's point of view it would have been preferable for the node N0 to be presented as a candidate at the first search using the node N1 as the node of interest. Therefore, as shown in FIG. 69, learning that defines a new link L between the node N0 and the node N1 is performed. Once such learning has been performed, when a search process with the node N1 as the target node is executed again, the node N0 is extracted as a related node owing to the presence of the link L and is presented as a candidate node within the related set G1.
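The learning step described in this section can be sketched on top of the status method. This is an illustrative sketch only: the class `Session`, its attribute names, and the representation of a learned link as a pair (node of interest at the time of the fall, revived node) are assumptions; how the learning means 90 actually stores links is not specified here. The search data are those of the FIG. 67 example.

```python
N = 3  # assumed setting, as in the FIG. 67 example


class Session:
    """Hypothetical helper: tracks statuses and learns links for revived losers."""

    def __init__(self, nodes):
        self.status = {node: N for node in nodes}  # initial status Sn
        self.fell_under = {}   # fall experience node -> node of interest at its fall
        self.new_links = set() # learned links: (node of interest at fall, revived node)

    def search(self, target, related):
        # Learning: the adopted node is a candidate that once fell to S1.
        if self.status[target] == N + 1 and target in self.fell_under:
            self.new_links.add((self.fell_under[target], target))
        for node, s in list(self.status.items()):
            if node == target or s == 0:
                self.status[node] = 0                  # rule (c); S0 is kept
            elif node in related:
                self.status[node] = min(s + 1, N + 1)  # rule (a)
            else:
                if s > 1:
                    self.fell_under[node] = target     # remember the search causing the fall
                self.status[node] = 1                  # rule (b)


session = Session("ABCDEFGHIJK")
for target, related in [("E", "BCDFGH"), ("D", "BCEGHI"), ("C", "DEGHIJ"),
                        ("H", "CDEGIJ"), ("I", "CDEHJK")]:
    session.search(target, set(related))

print(session.new_links)  # {('E', 'I')}
```

If node J were adopted next, the same bookkeeping would add a link between node E and node J, as described for FIG. 67.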
[0229]
<10.5: Other Embodiment of Loser Resurrection AND Search>
The embodiment of the “loser revival type AND search” described above concerns the case where the “positive search process” is performed, but it is equally applicable when combined with the “negative search process” described in §8. Note that when a node that fell to the lowest status as a result of the “negative search process” is revived as a candidate and adopted as the new node of interest, the new link defined in the learning process is preferably a negative link.
[0230]
The above-described embodiment shows an example in which the “loser revival type AND search” according to the present invention is applied to a database system using a node / link aggregate. However, the basic idea of the “loser revival type AND search” according to the present invention is not limited to database systems using such a node / link aggregate; if each node in the above example is regarded as a data item itself, it can be applied widely to database systems in general.
[0231]
[Effect of the invention]
As described above, with the database system according to the present invention, when a search narrows down candidates by applying a plurality of search conditions in sequence, data once excluded from the candidates because it failed to satisfy a particular search condition can be revived as a candidate again, so that a more flexible narrowing search can be performed.
[Brief description of the drawings]
FIG. 1 is a block diagram showing the configuration of a basic database system for explaining the basic concept of the present embodiment.
FIG. 2 is a diagram showing an example of a search processing result in the database system shown in FIG. 1;
FIG. 3 is a diagram showing a state in which a new target node N4 has been adopted based on the search result shown in FIG.
FIG. 4 is a diagram showing search processing in which a dynamic link L8 is temporarily defined in the database system shown in FIG.
FIG. 5 is a diagram showing a state in which a new node of interest N6 is adopted based on the search result shown in FIG. 4 and a dynamic link L8 is promoted to a static link.
FIG. 6 is a diagram illustrating an evaluation method when a plurality of equivalent keywords are defined in one node in keyword evaluation when defining a dynamic link.
FIG. 7 is a diagram illustrating an example of a simple distributed database system including three classes A, B, and C.
FIG. 8 is a diagram showing an example in which class links and thesaurus dictionaries are defined in the distributed database system shown in FIG.
FIG. 9 is a diagram showing a state where static links L1 to L5 are defined in the distributed database system shown in FIG.
FIG. 10 is a diagram illustrating a result of a search using the node N1 as a target node using the static link illustrated in FIG. 9;
FIG. 11 is a diagram showing a result of performing the same search process as the search shown in FIG. 10 in consideration of class link weighting (signal transmission coefficient).
FIG. 12 is a diagram showing a state in which learning processing for correcting the weight (signal transmission coefficient) of the static link is performed based on the search result shown in FIG.
FIG. 13 is a diagram illustrating a result of performing the same search process as the search illustrated in FIG. 10 again in the state after the learning process illustrated in FIG. 12;
FIG. 14 is a chart showing a method for determining a priority when a candidate node is presented in consideration of the weight of a node after the search process shown in FIG. 10 is executed.
FIG. 15 is a chart showing a priority order determination method when candidate nodes are presented in consideration of node weights after the search process shown in FIG. 13 is executed;
FIG. 16 is a diagram showing an example in which search processing using both static links and dynamic links is executed in the distributed database system shown in FIG.
FIG. 17 is a diagram illustrating an example in which the definition of a dynamic link is restricted by a class link.
FIG. 18 is a flowchart showing an overall use procedure of the database system according to the embodiment.
FIG. 19 is a flowchart showing in detail a search processing procedure in step S4 in the flowchart shown in FIG. 18;
FIG. 20 is a flowchart showing in detail the learning process in step S7 in the flowchart shown in FIG.
FIG. 21 is a block diagram showing a specific configuration of the database system according to the present embodiment.
FIG. 22 is a block diagram showing an embodiment in which a frequency coefficient storage means 110 is further added to the database system shown in FIG. 21.
FIG. 23 is a conceptual diagram showing a node aggregate to be searched in the database system according to the present embodiment.
FIG. 24 is a conceptual diagram showing a state where one node N1 in the node aggregate shown in FIG. 23 is designated as a node of interest.
FIG. 25 is a conceptual diagram showing a positive related set G1 searched by a positive search process for the node of interest N1 in the state shown in FIG.
FIG. 26 is a conceptual diagram showing a state in which one node N2 is adopted from within the positive relation set G1 shown in FIG. 25.
FIG. 27 is a conceptual diagram showing a state where the adopted node N2 shown in FIG. 26 is updated to a new node of interest.
FIG. 28 is a conceptual diagram showing a positive relation set G2 searched by the positive search process for the new target node N2 shown in FIG.
FIG. 29 is a block diagram showing an embodiment in which population definition means 120 is further added to the database system shown in FIG. 21.
FIG. 30 is a conceptual diagram showing a state in which a node of interest N1 is specified in a node aggregate to be searched in the database system shown in FIG. 29.
FIG. 31 is a conceptual diagram showing a positive related set G1 searched by a positive search process for the node of interest N1 in the state shown in FIG. 30;
FIG. 32 is a conceptual diagram showing the positive related set G2 retrieved by a positive search process in the “AND search” mode for the new target node N2, after the node N2 in the positive related set G1 shown in FIG. 31 has been adopted as the new target node.
FIG. 33 is a conceptual diagram showing a candidate node set M3 presented as a positive search result in the “AND search” mode shown in FIG. 32;
FIG. 34 is a conceptual diagram showing the positive related set G3 retrieved by a positive search process in the “AND search” mode for the new target node N3, after the node N3 has been adopted as the new target node from the set of candidate nodes shown in FIG. 33.
FIG. 35 is a conceptual diagram showing how candidates corresponding to the logical product of search conditions are extracted by positive search processing in three “AND search” modes.
36 is a negative related set G1 searched by a negative search process for the node of interest N1 in the state shown in FIG.^*FIG.
FIG. 37 adopts node N3 as a new target node from the set of candidate nodes shown in FIG. 33, and searches for this new target node N3 by negative search processing in the “AND search” mode. Negative related set G3^*FIG.
FIG. 38 is a conceptual diagram showing how candidates corresponding to the logical product of search conditions are extracted by positive and negative search processing in three “AND search” modes.
FIG. 39 is a conceptual diagram illustrating an example of a node / link aggregate defined using both positive links and negative links.
FIG. 40 is a conceptual diagram illustrating a state in which a positive search process is performed with the node B designated as the node of interest and the number of hops limited to H = 1 in the node / link aggregate illustrated in FIG. 39.
FIG. 41 is a conceptual diagram illustrating a part of a state in which a positive search process is performed with the node B designated as the node of interest and the number of hops limited to H = 2 in the node / link aggregate illustrated in FIG. 39.
FIG. 42 is a conceptual diagram illustrating a state in which a negative search process is performed with the node B designated as the node of interest and the number of hops limited to H = 1 in the node / link aggregate illustrated in FIG. 39.
FIG. 43 is a conceptual diagram showing a state in which node I is adopted from among candidate nodes presented by the negative search process shown in FIG.
FIG. 44 is a conceptual diagram showing learning processing by adoption of candidate node I shown in FIG. 43.
FIG. 45 is a conceptual diagram illustrating a state in which a negative search process is performed with the node B designated as the node of interest and the number of hops limited to H = 1 in the node / link aggregate after the learning process illustrated in FIG. 44.
FIG. 46 is a conceptual diagram showing a learning process performed by adopting node I from candidate nodes presented by the negative search process shown in FIG. 45.
FIG. 47 is a conceptual diagram showing a state in which a node N4 is selected from the candidate nodes presented by the negative search process shown in FIG. 37.
FIG. 48 is a conceptual diagram illustrating a learning process in which a new negative link is defined between a target node N3 and an adopted node N4 by the adoption of the node N4 shown in FIG.
FIG. 49 is a block diagram showing an embodiment in which link reconfiguration means 130 is further added to the database system shown in FIG. 21.
FIG. 50 is a diagram showing a first example of link reconfiguration processing executed in the database system shown in FIG. 49.
FIG. 51 is a diagram showing a second example of link reconfiguration processing executed in the database system shown in FIG. 49.
FIG. 52 is a diagram illustrating an example in which the link reconfiguration process illustrated in FIG. 50 is applied to the node / link aggregate illustrated in FIG. 39.
FIG. 53 is a diagram illustrating an example in which the link reconfiguration process illustrated in FIG. 51 is applied to the node / link aggregate illustrated in FIG. 39.
FIG. 54 is a conceptual diagram showing a state in which a node of interest N1 is specified in a node aggregate that includes a target node N0.
FIG. 55 is a conceptual diagram illustrating a state in which the target node N0 has been left out of the related set G1 extracted by the search process for the node of interest N1 in the state illustrated in FIG
FIG. 56 is a conceptual diagram showing a related set G2 retrieved by a search process in the “AND search” mode for the new node of interest N2, after node N2 in the related set G1 shown in FIG. 55 has been adopted as the new node of interest.
FIG. 57 is a conceptual diagram showing an aggregate M3 of candidate nodes presented as a search result in the “AND search” mode shown in FIG. 56.
FIG. 58 is a conceptual diagram showing a related set G3 retrieved by a search process in the “AND search” mode for the new node of interest N3, after node N3 has been selected as the new node of interest from the set of candidate nodes shown in FIG. 57.
FIG. 59 is a conceptual diagram showing a candidate node set M4 presented as the search result in the “AND search” mode shown in FIG. 58.
FIG. 60 is a conceptual diagram showing a related set G4 retrieved by a search process in the “AND search” mode for the new node of interest N4, after node N4 has been selected as the new node of interest from the set of candidate nodes shown in FIG. 59.
FIG. 61 is a conceptual diagram showing how candidates corresponding to the logical product of the search conditions are extracted by four rounds of search processing in the “AND search” mode.
FIG. 62 is a conceptual diagram showing how the target node N0 is extracted as a candidate node by withdrawing the first search condition in the four rounds of “AND search” mode search processing shown in FIG. 61.
FIG. 63 is a conceptual diagram showing a series of five search processes in the “loser revival type AND search” mode according to the present invention.
FIG. 64 is a conceptual diagram showing, as a general theory, a series of search processing in the “loser revival type AND search” mode according to the present invention.
FIG. 65 is a chart showing the statuses defined for each node when searching in the “loser revival type AND search” mode according to the present invention.
FIG. 66 is a diagram showing a status transition at the time of every search in the “loser revival type AND search” mode according to the present invention.
FIG. 67 is a diagram showing a specific example of search processing in the “loser revival type AND search” mode according to the present invention.
FIG. 68 is a diagram showing the first half of the learning process after the search process in the “loser revival type AND search” mode according to the present invention.
FIG. 69 is a diagram showing a latter half of the learning process after the search process in the “loser revival type AND search” mode according to the present invention.
FIG. 70 is a diagram showing the effect of the learning process after the search process in the “loser revival type AND search” mode according to the present invention.
[Explanation of symbols]
1 ... Database
2 ... Node link aggregate
3 ... Operator
10 ... Data storage means
20 ... Data providing means
30 ... Node-of-interest setting means
40 ... Search means
50 ... Link storage means
60 ... Candidate presentation means
70 ... Candidate selection means
80 ... Updating means
90 ... Learning means
100 ... Operator
110 ... Frequency coefficient storage means
120 ... Population definition means
130 ... Link reconfiguration means
A, B, C ... class
AA, BB ... Class link (local link)
AB, AC ... Class link (remote link)
A to K ... Nodes
G1 to G5 ... Positive related set (set of nodes related to the node of interest)
G1^*, G3^* ... Negative related set (set of nodes not related to the node of interest)
H ... Number of hops
Hmax ... Upper limit of the number of hops
K1-K9 ... Keyword
K40, K70 ... Representative keywords
K41-K45, K71-K75 ... Equivalent keywords
L, L1 to L8 ... Instance link (static link and dynamic link)
M1-M4 ... Population (mother set)
N0, N1-N9, N12 ... nodes
Taa, Tbb, Tab, Tac ... Thesaurus

Claims

A database system comprising:
  data storage means for storing data to be provided;
  search means for repeatedly executing, based on a plurality of search conditions sequentially given by an operator, a search process that extracts, as a related set, data matching a given search condition from the data stored in the data storage means;
  candidate presenting means for obtaining, each time the search means executes the search process, a candidate set consisting of predetermined data based on the result of the search process, and for presenting information indicating the obtained candidate set to the operator; and
  data providing means for providing data adopted by the operator from the candidate set,
  wherein the candidate presenting means:
  allows a total of (n + 1) stages of status, from a lowest status S1 to a highest status S(n + 1), to be defined for each piece of data (where n is a preset natural number of 2 or more), and sets the status of all data to an initial status Sj (where j is a natural number satisfying 1 < j ≤ n + 1) at the start of a series of repeated search processes; and
  each time the search process is executed, performs a status transition in which data extracted as the related set by the search process is promoted by u stages (where u is a predefined natural number satisfying 0 < u < n, and may be a value common to all data or a value that differs for each piece of data) and data not extracted as the related set by the search process falls to the lowest status S1 (with the lowest status S1 as the lower limit), and performs a process of presenting the data having the highest status S(n + 1) as the candidate set.
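
The status transition recited above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the claimed implementation: the names `transition` and `candidate_set`, the item labels, and the three related sets are hypothetical, statuses are represented by their integer index (1 for S1 up to n + 1 for S(n + 1)), and promotion is assumed to be capped at the highest status.

```python
def transition(status, related, n, u=1):
    """Apply one search round to a dict mapping item -> status index (1..n+1)."""
    top = n + 1
    return {
        # Extracted items are promoted by u stages (assumed capped at the top);
        # items that were not extracted fall to the lowest status S1.
        item: min(s + u, top) if item in related else 1
        for item, s in status.items()
    }


def candidate_set(status, n):
    """Items currently holding the highest status S(n+1)."""
    return sorted(item for item, s in status.items() if s == n + 1)


n = 2                                                # statuses S1, S2, S3
status = {item: 2 for item in ("a", "b", "c", "d")}  # initial status S2 (j = 2)

# Hypothetical related sets returned by three successive search conditions.
rounds = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "d"}]

history = []
for related in rounds:
    status = transition(status, related, n)
    history.append(candidate_set(status, n))
```

In this run item `d` misses the first condition and falls to S1, then matches the next two conditions and climbs back to S3, so it reappears in the candidate set after the third round even though a plain logical product of the three conditions would have excluded it.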

A database system having a function of defining a node aggregate consisting of a plurality of nodes, associating predetermined data with each node, and providing the data associated with a node of interest when a specific node of interest is designated, the database system comprising:
  data storage means for storing data to be provided in association with the nodes;
  link storage means for storing a link aggregate consisting of a set of links indicating associations between nodes;
  node-of-interest setting means for setting a specific node of interest based on an instruction from an operator;
  search means for executing a search process that uses the link aggregate to search, under a predetermined condition, for related nodes related to the node of interest, and extracts the aggregate of the related nodes as a related set;
  candidate presenting means for presenting all or part of the nodes belonging to the related set to the operator as candidate nodes;
  candidate selection means for allowing the operator to adopt a specific candidate node from among the candidate nodes presented by the candidate presenting means;
  updating means for updating the setting of the node-of-interest setting means so that the adopted node becomes a new node of interest; and
  data providing means for extracting the data associated with the node of interest from the data storage means and providing it to the operator,
  wherein the candidate presenting means has a function of executing a series of repeated search processes while the node of interest is updated one after another based on the operator's adoption actions,
  allows a total of (n + 1) stages of status, from a lowest status S1 to a highest status S(n + 1), to be defined for each node (where n is a preset natural number of 2 or more), and sets the status of all nodes to an initial status Sj (where j is a natural number satisfying 1 < j ≤ n + 1) at the start of the series of repeated search processes, and
  each time the search process is executed, performs a status transition in which a node extracted as the related set by the search process is promoted by u stages (where u is a predefined integer satisfying 0 < u < n, and may be a value common to all nodes or a value that differs for each node; in the promotion, the highest status S(n + 1) is the upper limit) and a node not extracted as the related set by the search process falls to the lowest status S1 (with the lowest status S1 as the lower limit), and performs a process of presenting the nodes having the highest status S(n + 1) as the candidate set.
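
As a rough sketch of how the recited means could interact, the following Python example runs a short search session over a small link aggregate. Everything specific in it is an assumption made for illustration: the node names and edges are hypothetical, links are treated as undirected, the "predetermined condition" is taken to be "directly linked to the node of interest", and `build_adjacency`, `search_round`, and `run_session` are invented names.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Link storage: undirected links kept as adjacency sets (assumption)."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def search_round(status, adj, focus, n, u=1):
    """Promote nodes linked to the node of interest; all others fall to S1."""
    related = adj[focus]          # assumed condition: one hop from the focus
    top = n + 1
    return {
        node: min(s + u, top) if node in related else 1
        for node, s in status.items()
    }


def run_session(edges, nodes, first_focus, adoptions, n=2):
    """Repeat search, present candidates, and update the node of interest."""
    adj = build_adjacency(edges)
    status = {node: 2 for node in nodes}   # initial status S2 (j = 2)
    focus = first_focus
    presented = []
    for adopted in list(adoptions) + [None]:
        status = search_round(status, adj, focus, n)
        candidates = sorted(k for k, s in status.items() if s == n + 1)
        presented.append(candidates)
        if adopted is None:
            break
        if adopted not in candidates:
            raise ValueError(f"{adopted} was not presented as a candidate")
        focus = adopted                    # updating means: new node of interest
    return presented


edges = [("N1", "N2"), ("N1", "N3"), ("N1", "N5"),
         ("N2", "N3"), ("N2", "N5"), ("N3", "N0")]
nodes = ["N0", "N1", "N2", "N3", "N5"]
presented = run_session(edges, nodes, "N1", ["N2", "N3"])
```

With these hypothetical links, the third round presents `N1`, a node that previously served as the node of interest. The sketch does nothing to prevent that.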

The database system according to claim 2, wherein
a node that has served as the node of interest during the series of repeated search processes is shifted to an exclusion status, which is a special status, and no transition from the exclusion status to any other status is permitted during the series of repeated search processes.
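
The exclusion status can be sketched as an extra sentinel value outside the S1 to S(n + 1) range. This is a minimal illustration only: the sentinel `EXCLUDED = 0`, the function name, and the related sets passed in for each round are hypothetical.

```python
EXCLUDED = 0  # hypothetical sentinel, outside the status indices 1..n+1


def search_round(status, related, focus, n, u=1):
    """One status transition that keeps past nodes of interest excluded."""
    top = n + 1
    new_status = {}
    for node, s in status.items():
        if node == focus or s == EXCLUDED:
            new_status[node] = EXCLUDED        # never leaves the exclusion status
        elif node in related:
            new_status[node] = min(s + u, top)
        else:
            new_status[node] = 1
    return new_status


n = 2
status = {node: 2 for node in ("N0", "N1", "N2", "N3", "N5")}

# (node of interest, hypothetical related set) for three successive rounds.
session = [("N1", {"N2", "N3", "N5"}),
           ("N2", {"N1", "N3", "N5"}),
           ("N3", {"N1", "N2", "N0"})]
for focus, related in session:
    status = search_round(status, related, focus, n)

candidates = sorted(k for k, s in status.items() if s == n + 1)
```

Here `N1` and `N2` keep matching later searches but remain excluded, so they are not presented again as candidates.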

The database system according to claim 2,
further comprising learning means which, when a fall-experienced node, that is, a node that has fallen to the lowest status S1 during the series of repeated search processes, is presented as a candidate node and is adopted, defines a new link between the fall-experienced node and the node that was the node of interest at the time the fall-experienced node fell to the lowest status S1, and adds the new link to the link aggregate.
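
One way to picture this learning step is to remember, for each node, which node of interest was active when it fell, and to add a link when that node is later adopted. The sketch below is written under several assumptions that the claim does not fix: only a drop from a higher status to S1 counts as a fall, the node of interest itself is not recorded, the most recent fall overwrites earlier ones, and links are stored as unordered pairs. All names are hypothetical.

```python
def apply_round(status, related, focus, n, fell_at, u=1):
    """One status transition that also records where each node fell to S1."""
    top = n + 1
    new_status = {}
    for node, s in status.items():
        if node in related:
            new_status[node] = min(s + u, top)
        else:
            new_status[node] = 1
            # Assumption: a "fall" is a drop from a higher status, and the
            # node of interest itself is not treated as a fallen node.
            if s > 1 and node != focus:
                fell_at[node] = focus
    return new_status


def learn_on_adoption(adopted, fell_at, links):
    """Add a link between an adopted fall-experienced node and the old focus."""
    if adopted in fell_at:
        links.add(frozenset((fell_at[adopted], adopted)))
        return True
    return False


n = 2
status = {node: 2 for node in ("N0", "N1", "N2", "N3")}
fell_at = {}
links = {frozenset(("N1", "N2")), frozenset(("N1", "N3"))}  # hypothetical

session = [("N1", {"N2", "N3"}), ("N2", {"N0", "N3"}), ("N3", {"N0"})]
for focus, related in session:
    status = apply_round(status, related, focus, n, fell_at)

learned = learn_on_adoption("N0", fell_at, links)  # operator adopts N0
```

After the adoption, a later search from `N1` over the updated link aggregate would reach `N0` directly.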

The database system according to claim 2, wherein
the value of the number of promotion stages u is determined independently for each related node based on the degree of relationship between the node of interest and that related node.
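
A per-node value of u might be derived as in the following sketch. The mapping is purely an assumption for illustration: the degree of relationship is modelled as a hypothetical weight in (0, 1], and u is obtained by scaling that weight into the permitted range 0 < u < n. The claim does not prescribe this formula.

```python
import math


def promotion_stages(weight, n):
    """Map a relationship weight in (0, 1] to an integer u with 0 < u < n."""
    if not 0.0 < weight <= 1.0:
        raise ValueError("weight must lie in (0, 1]")
    return max(1, min(n - 1, math.ceil(weight * (n - 1))))


def promote(status, related_weights, n):
    """Related nodes rise by their own u; unrelated nodes fall to S1."""
    top = n + 1
    return {
        node: min(s + promotion_stages(related_weights[node], n), top)
        if node in related_weights else 1
        for node, s in status.items()
    }


n = 4                                     # statuses S1 to S5
status = {"N2": 2, "N3": 2, "N4": 2}      # initial status S2 (j = 2)
after = promote(status, {"N2": 1.0, "N3": 0.2}, n)
```

In this example the strongly related node `N2` jumps straight to the highest status, the weakly related node `N3` rises by one stage, and the unrelated node `N4` falls to S1.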