JP3603613B2

JP3603613B2 - Distributed search system and search device in distributed search system

Info

Publication number: JP3603613B2
Application number: JP26152598A
Authority: JP
Inventors: 武大谷; 俊朗南
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1997-10-31
Filing date: 1998-09-16
Publication date: 2004-12-22
Anticipated expiration: 2018-09-16
Also published as: JPH11195048A

Description

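[Industrial applications]
The present invention relates to a function for locating computers that hold the information resources a user wants, in a search-device environment where a vast number of computers connected to a large-scale network each hold some information resource and offer it to users.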
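[Prior art]
Information and services provided by computers are the intangible goods on which the information industry is built; in this document they are collectively called "information resources".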
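As networks have developed, the number of connected computers has become enormous, and a wide variety of services are offered, it has become very difficult to know which information resources each computer holds.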
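Even when such information is obtained, the network environment changes from moment to moment because of maintenance and failures of computers and networks, so an information resource that was available before is not necessarily available again. Users therefore need to learn which computers provide the desired information resource on the basis of the latest information at the time they actually intend to use it.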
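In addition, many computers generally hold the same information resource, and the quality of that resource (freshness, accuracy, level of abstraction, and so on) naturally differs from computer to computer according to each administrator's operating policy. Users would therefore like to find, among many computers, the ones with the better information resources. However, the quality of an information resource cannot be judged without actually using it and comparing it with others, which costs the user a great deal of effort for little benefit. For beginners with little knowledge of information resources, even making such a judgement is difficult. A function that recommends information resources that many users consider relatively good is therefore also useful.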
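In recent years, many information resources on networks have come to be served through the WWW (World Wide Web). On the WWW, the location of an information resource is expressed by a URL (Uniform Resource Locator), and a user who wants to use a resource must know its URL. The URLs an individual can learn unaided, however, cover only a tiny fraction of all information resources on the network. Search services called search engines are therefore offered on the WWW as a way to find the URL of an information resource. Their methods can basically be divided into a step that collects information about the resources available through the network and a step that manages the collected information and provides it to users. Collection methods fall broadly into the following two types.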
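・ Directory service
With this method, the information resource provider asks the administrator of a search engine service (directory service), who acts as a kind of information resource broker, to register the resource in a directory list. Many search engines such as Yahoo (http://www.yahoo.com/) and AltaVista (http://altavista.digital.com/) fall into this category. Because providers submit registration requests with a certain degree of confidence, the quality of the information is good; on the other hand, registration is often done by hand by the administrator, which places a heavy burden on the administrator. As a result, information may not be updated promptly and accurately.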
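・ Robot
This method uses a program called a robot, which follows the links (anchors) in HTML (Hyper-Text Mark-up Language, the standard language for describing information provided on the WWW) documents one after another, automatically explores URLs all over the world, and builds a database of URLs. Examples are the WWW Worm (University of Colorado, O. A. McBryan) and the RBSE Spider (University of Houston, D. Eichmann). The advantage of this method is that the information resource provider need not do anything. Strictly speaking, however, a robot cannot discover a resource unless the provider tells someone that the resource is offered and has a link made to it. Moreover, because resources are referenced without the provider's knowledge, the provider does not know whom to notify when a resource or service is updated. Furthermore, because resources are gathered mechanically, useless resources are easily picked up, which puts needless load on networks and computers.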
Next, methods of managing the collected location information about information resources and providing it to users can be classified as follows.
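・ Centralized management
All data is served from a single server; many search engines, including Yahoo and AltaVista, work this way. Maintenance is easy because there is only one place to manage, but user access concentrates on that one server, so its load tends to become very high. In addition, communication costs to the server are inevitably high for some users, so not every user can use the service comfortably, and if the single server goes down the service becomes unavailable.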
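・ Distributed management
Data is managed and provided by several different servers that share the work. Depending on how the work is shared, the method can be classified as follows.
・ Distribution of access
Load is distributed by having each user choose a server that is easy to access. Mirroring is an example. The advantage is that, because several servers offer the same function, service continues even if one server goes down. Users cannot enjoy this advantage, however, unless they know the locations of the alternative servers that offer the same service. Another problem is that every server must always hold the same data, so the cost of managing the data is high.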
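・ Distribution of services
Services are divided into several categories, and each category is assigned to some of the servers. DNS (Domain Name Service), which looks up an IP address from a computer name, is an example. The large-scale distributed database WAIS (Wide Area Information Server) can also be placed in this category. This approach can sometimes be combined with distribution of access. Because a different server manages each kind of service, maintenance is easy. If one looks at a single kind or range of service, however, the arrangement amounts to centralized management, and the disadvantages of centralized management reappear.
Users must use different servers depending on the service they want, which is inconvenient unless there is a means of finding the server that provides a given service. (DNS has no such problem because servers can be located automatically through the domain hierarchy. WAIS does this with a database named directory-of-servers.)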
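As for recommending information resources, techniques called social filtering or collaborative filtering have been developed; they recommend items a user is likely to prefer on the basis of other people's recommendations and ratings and the behavior of others with similar tastes. For example, Tapestry (Xerox Palo Alto Research Center, D. Goldberg, D. Terry) targets netnews and mailing-list articles and helps a user select and read, from a large volume of articles, those that a particular person actively recommends. In contrast, systems that recommend items rated highly by other users whose past ratings are similar (that is, who have similar values) include GroupLens (P. Resnick), which recommends netnews articles, and Ringo (MIT, P. Maes, U. Shardanand), which recommends music albums.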
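However, similar tastes in one field do not guarantee similar tastes in other fields, so following the behavior or recommendations of a particular individual is not always a good idea. In addition, taste information is managed centrally, so the problems described above for centralized management of search engine information also arise for the management of taste data.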
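[Problems to be solved by the invention]
As described in the previous section, the problems with the existing methods can be summarized as follows.
・ In directory services, registration is often done by hand by the administrator, which places a heavy burden on the administrator. Mistakes by the administrator may also prevent searches from working properly.
・ Because robot programs do not examine the content of HTML documents carefully, they are very likely to transfer useless HTML documents, which tends to increase traffic and server load.
・ To keep traffic low, robot programs must be run less often; the collected information then tends to become old, and the information obtained is more likely to be already invalid.
・ With collection by robots, the information resource provider may not know whom to notify of a change in the service offered, and even if the change can be communicated, it may not be reflected in search results immediately.
・ When the database of information resources becomes huge, a search returns an enormous number of results, making it hard for users to judge which resource is the most appropriate. This is especially so when the user does not have sufficient knowledge of the subject.
・ Similarity in particular tastes does not guarantee similarity in all tastes, so a user will not always be satisfied with what a particular individual recommends.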
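Furthermore, in a distributed search scheme such as the above, how the agents are placed and how they are interconnected governs search performance. To raise overall efficiency, agent administrators must therefore consider carefully how to build the agent network.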
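In conventional distributed search schemes, the neighbor relationships between agents are decided in advance by the agent administrator and tuning must be done by hand, so it has been difficult to respond flexibly to changes in the agents' network environment.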
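The network environment, however, changes constantly. A communication path between agents may become unusable because of a failure, or the load on an agent may temporarily become high, making it difficult to obtain search results promptly. And if results cannot be obtained from every agent that received the search request, the search results may not be as good as desired.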
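On the other hand, an agent that obtains resource information from returned search results may end up holding useless resource information that its nearby users never refer to. Such situations should be avoided in order to reduce the cost of managing unneeded information.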
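The object of the present invention is information resource discovery that solves the above problems by automating the management of information about information resources, mediating between advertisements of information resources by providers and inquiries by users, preventing information from going stale by also advertising the information resource when a search result is returned to the user, and selecting, from among many information resources, those generally recognized as good.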
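A further object of the present invention is to propagate and search resource information efficiently: each agent records, as a history of the resource advertisements, search requests, and search results it has received, information such as their frequency, their origin, and the response times of other agents, and uses these data to schedule information transfers, allocate the cost of information delivery, and select appropriate agents as transfer destinations.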
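[Means for solving the problems]
The present invention consists of a network of multiple agents, and Fig. 1 explains the principle of a single agent (search device). In the figure, (1) is the advertisement processing unit, which accepts advertisements about information resources from information resource providers and notifies the providers when necessary. The inquiry processing unit (2) accepts search requests from users and returns results to them. The agent interface (3) exchanges advertisements, inquiries, and their results with other agents. The information resource database (4) holds advertisements sent from the advertisement processing unit (1) and advertisements received through the agent interface (3). The information resource data control unit (5) handles advertisements received from the advertisement processing unit (1) or the agent interface (3) by storing them in the information resource database (4), calculating the cost spent on the advertisement, and, depending on that cost, directing that they be sent to other agents through the agent interface (3). For inquiries from the inquiry processing unit (2) or the agent interface (3), it searches the information resource database (4), calculates the cost spent on conveying the search request, and forwards the inquiry to other agents through the agent interface (3). Fig. 2 schematically shows this exchange of advertisements, inquiries, and other data in the system as a whole. Each circle in the figure represents an agent, and each agent performs the operations described here. A solid line between agents indicates that they can communicate.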
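In the present invention, as shown in Fig. 2, multiple agents with the configuration described above form an interconnected infrastructure, and an information resource provider informs the nearest agent of the services it can offer in the form of an advertisement. The agent that receives the advertisement stores its content in the information resource database and makes the same advertisement to the agents that can be reached within a certain cost. Each agent that receives the advertisement holds it for a fixed period (called the lifetime) and deletes it once that period has passed, so that old information does not remain in the agents' information resource databases indefinitely.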
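When a user makes an inquiry, the inquiry is likewise sent to the agents that can be reached within a certain cost, and when it reaches an agent that holds matching data, that agent returns the result to the user. The result is returned by retracing, in reverse, the agents the inquiry passed through, and each of those agents stores the result, so that the advertisement also propagates beyond the range originally advertised by the provider (Fig. 3). A user whose search succeeded and who obtained a satisfactory result from using the resource feeds back an evaluation as shown in Fig. 4, and the evaluation is propagated in the same way as the provider's advertisement. An agent that receives a good evaluation of an information resource extends the lifetime of its information about that resource, so that advertisements for resources that satisfy users remain in the information resource database longer. Conversely, an agent that receives a negative evaluation shortens the lifetime of the information, making it harder to find by search.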
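Each agent therefore needs to communicate only with neighboring agents, and only the information specified by an advertisement or search request is conveyed, so communication costs can be kept low. In addition, users' actual usage is reflected as survival time in the database: advertisements for resources that are used frequently and well received stay in the database longer than others and propagate to a wider range of agents, so other users find them more easily when they search.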
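As a further variation of the present invention, a configuration has been devised in which a history database (5) is added to the configuration explained with Fig. 1, as shown in Fig. 23. The history database (5) records when and from where (by what route) the advertisement information, search requests, and search results passing through the advertisement processing unit (1), the inquiry processing unit (2), and the agent interface (3) were transferred, and what content they contained.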
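For advertisements received from the advertisement processing unit (1) or the agent interface (3), the information resource data control unit (6) stores them in the information resource database (4), updates the history database (5), and, depending on the content of the history, directs that they be sent to other agents through the agent interface (3). For inquiries from the inquiry processing unit (2) or the agent interface (3), it searches the information resource database (4), updates the history database (5), and, depending on its content, forwards the inquiry to other agents through the agent interface (3). Updating the resource information database (4) on the basis of search results obtained through the agent interface (3) is also done under the direction of the information resource data control unit (6). The history database (5) is updated at that time as well, just as for advertisements and search requests.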
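Figs. 24, 25, and 26 schematically show the exchange of advertisements, search requests, and search results, respectively, in the system as a whole. Each circle in the figures represents an agent, and each agent has the configuration of Fig. 23 and performs the operations described above. A solid line between agents indicates that the agents are neighbors. Here, "agents are neighbors" means that they are logically adjacent: the agents know each other's location and can communicate directly. It does not necessarily reflect geographical or hardware (physical) proximity. Nor does the absence of a neighbor relationship mean that the agents cannot communicate.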
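In this configuration, as shown in Fig. 24, multiple agents with the configuration described above form an interconnected infrastructure, and an information resource provider informs the nearest agent of the services it can offer in the form of an advertisement. The agent that receives the advertisement stores its content in the information resource database and makes the same advertisement to its neighboring agents. In this way the resource information is gradually conveyed from agent to agent.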
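When a user makes an inquiry, the search request is likewise passed on to neighboring agents (Fig. 25), and when it reaches an agent that holds matching data, that agent returns the result to the user. The result is returned by retracing, in reverse, the agents the inquiry passed through, and each of those agents stores the result, so that the advertisement also propagates beyond the range originally advertised by the provider (Fig. 26).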
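The agents to which advertisement information, search requests, and search results are forwarded are set by the agent administrator when the system is introduced, but after the system has been in operation for some time, a history of advertisement information, search requests, and search results accumulates.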
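From that information, trends in the resource information each agent handles become visible, such as "much of the advertisement information from agent A concerns field F", "many of the search requests from agent B concern field F", or "many of the search results from agent C concern field F".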
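Using such information, when a search request for information in field F arrives from some agent, the request can be sent preferentially to the neighboring agent closest to agent A or C, or directly to agent A or C, which can be expected to make the desired information resource easier to find.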
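Conversely, when an advertisement concerning field F arrives, conveying the advertisement information toward agents with high demand for F can be expected to make information about F easier to find for users in that region. At the same time, for the distribution of advertisement information, conveying high-demand items immediately and the rest during periods of low network traffic makes it possible to avoid traffic concentration and agent overload.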
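If information about agent response times is also recorded in the history, then when a network outage or an agent failure has occurred very recently, transfers to the affected agent can be suspended temporarily and sent to other agents instead. Thus, even if part of the agent network has failed, the information search function as a whole can be provided without interruption.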
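[Embodiments of the invention]
[First embodiment]
Fig. 5 is a block diagram of one embodiment of the present invention and shows an information resource discovery device with distributed management and search functions for information about information resources. Items shown in Fig. 1 are given the same numbers. (The numbers in Figs. 11 and 13 indicate the order of the data flow and are unrelated to these.)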
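In normal operation, an information resource provider (producer) first conveys information about the service it offers to the nearest agent, through the producer interface (6), in the form of an advertisement (advertisement request). As shown in Fig. 6, an advertisement may concretely consist of pairs of the following fields and their corresponding information. (Fields marked with * are filled in automatically by the advertisement processing unit (1) and need not be specified by the provider.)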
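ID (*): A symbol that identifies the advertisement.
From (*): Information identifying the sender of the advertisement, for example a mail address. Used for inquiries from users of the information and as authentication data to prevent tampering with the advertisement content.
Subject: The name of the information resource offered.
Keywords: Keywords related to the information or service offered.
URL: The location of the information offered. This field is provided because location information is what is searched for, but it may instead hold any other information that is to be the target of searches.
Maintainer: Information identifying the administrator of the information resource. May be the same as the sender of the advertisement.
Cost: A parameter that limits the range of the advertisement, such as the maximum number of agents to pass through or the time to spend on the advertisement. If a charging system is in place for communication or computer use, it may be a monetary amount.
Date (*): The date and time at which the agent received the advertisement.
Path (*): The sequence of IDs of the agents that relayed the advertisement. Its purpose is to prevent one advertisement from being conveyed to the same agent repeatedly.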
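As a rough illustration, the advertisement fields listed above could be held in a record like the following. This is only a sketch: the Python names (`Advertisement`, `complete`, `ad_id`, `sender`) are hypothetical, and treating Cost as a hop count is just one of the options the text mentions, chosen here for simplicity.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Advertisement:
    """One advertisement record; attributes mirror the fields listed above."""
    subject: str                      # Subject: name of the information resource
    keywords: List[str]               # Keywords: related keywords
    url: str                          # URL: location of the resource
    maintainer: str                   # Maintainer: who manages the resource
    cost: int                         # Cost: assumed here to be a hop count
    # The fields marked (*) are left empty by the provider.
    ad_id: Optional[str] = None       # ID (*)
    sender: Optional[str] = None      # From (*)
    date: Optional[datetime] = None   # Date (*)
    path: List[str] = field(default_factory=list)  # Path (*)


def complete(ad: Advertisement, ad_id: str, sender: str,
             agent_id: str, now: datetime) -> Advertisement:
    """Fill in the (*) fields, as an advertisement processing unit might."""
    ad.ad_id = ad_id
    ad.sender = sender
    ad.date = now
    ad.path.append(agent_id)          # record the first agent on the path
    return ad
```

A provider would supply only the unmarked fields; the receiving agent completes the rest before storing or forwarding the record.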
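The producer interface (6) passes the advertisement to the agent's advertisement processing unit (1). That unit analyzes the content of the advertisement, fills in the fields marked with * above, and passes the advertisement to the information resource data control unit (5). There the lifetime of the advertisement is decided and its content is stored in the information resource database (4). The information resource data control unit (5) refers to the clock (8) and deletes from the information resource database (4) any advertisement whose lifetime has passed. The information resource data control unit (5) also calculates the cost spent on the advertisement and, while usable cost remains, conveys the advertisement to other agents through the agent interface (3).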
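Examples of cost include the time from requesting the advertisement until advertising ends, communication charges, and the charges for using the databases.
In particular, setting the cost to unlimited makes it possible to realize a conventional broadcast in which the advertising range is not limited. Fig. 7 shows this series of processing steps. In the figure, (6a) corresponds to the operation of the producer interface (6), (1a) to the advertisement processing unit (1), (5a) through (5c) to the information resource data control unit (5), and (3c) to the agent interface (3).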
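The advertisement flow described above (store with a lifetime, forward while cost remains, delete after expiry) can be sketched as follows. This is an illustrative model only, not the patent's implementation: the class `Agent`, the integer clock ticks, the default lifetime, and the reading of Cost as "remaining hops" are all assumptions.

```python
from typing import Dict, List, Tuple


class Agent:
    def __init__(self, agent_id: str, lifetime: int = 10):
        self.agent_id = agent_id
        self.lifetime = lifetime                    # hypothetical lifetime, in clock ticks
        self.neighbors: List["Agent"] = []
        self.db: Dict[str, Tuple[dict, int]] = {}   # ad id -> (ad, expiry tick)

    def receive(self, ad: dict, now: int) -> None:
        # The Path field keeps an ad from being relayed to the same agent twice.
        if self.agent_id in ad["path"]:
            return
        ad = {**ad, "path": ad["path"] + [self.agent_id]}
        self.db[ad["id"]] = (ad, now + self.lifetime)
        # Forward only while usable cost remains (cost = remaining hops here).
        if ad["cost"] > 0:
            for neighbor in self.neighbors:
                neighbor.receive({**ad, "cost": ad["cost"] - 1}, now)

    def purge(self, now: int) -> None:
        """Delete advertisements whose lifetime has passed."""
        self.db = {k: v for k, v in self.db.items() if v[1] > now}
```

With four agents in a chain A, B, C, D and an advertisement entering at A with cost 2, the record reaches A, B, and C but not D. A very large cost would approximate the unrestricted broadcast mentioned in the text.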
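An information searcher, on the other hand, issues a search request to the inquiry processing unit (2) through the searcher interface (7). As shown in Fig. 8, a search request has the same format as an advertisement: the fields that are already known are filled in as appropriate, and the fields the searcher wants to learn are left blank. (As with advertisements, fields marked with * are not filled in by the searcher but are supplied automatically by the inquiry processing unit (2).) The following are examples of searches that can be made in this way.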
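1. Specify some Keywords, leave the other fields blank, and search for the names and URLs of information resources related to the specified keywords.
2. To check whether a service is actually being offered, or to examine what it offers, specify the URL, leave the other fields blank, and search for Subject and Keywords.
3. To promote cooperation among those who offer the same resource, specify Subject and Keywords, leave Maintainer blank, and search for the people who manage related information resources. Conversely, specify Maintainer and search for the information resources that person manages.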
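On receiving a search request, the inquiry processing unit (2) sends it to the information resource data control unit (5), which analyzes the request and searches the information resource database (4) accordingly. If matching results are found, they are returned to the requester through the inquiry processing unit (2) and the searcher interface (7) (search result). As with advertisements, the information resource data control unit (5) calculates the cost and, if usable cost remains, asks other agents to search through the agent interface (3), collects their results, and returns them to the searcher. A search result has the same format as an advertisement: it is the search request with the fields being searched for filled in with the matching data. As mentioned for advertisements, cost may be time, charges, and so on. Setting the cost to unlimited asks every agent to search without limiting the range, while setting it to the minimum restricts the search to a single agent. Fig. 9 shows this series of processing steps. In the figure, (7a) corresponds to the operation of the searcher interface (7), (2a) and (2b) to the inquiry processing unit (2), (3b) to the agent interface (3), and (5d) through (5i) to the information resource data control unit (5).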
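The "fill in what you know, leave blank what you want" style of search request can be illustrated with a small matching function. This is a sketch under assumptions: records are plain dictionaries, a blank field is represented by `None`, keyword matching is taken to mean "all requested keywords are present", and the names `matches` and `search` are hypothetical.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[object]]


def matches(query: Record, ad: Record) -> bool:
    """An ad matches if every filled-in query field agrees with it."""
    for name, wanted in query.items():
        if wanted is None:            # blank field: what the searcher wants to learn
            continue
        if name == "keywords":
            if not set(wanted) <= set(ad.get("keywords") or []):
                return False
        elif ad.get(name) != wanted:
            return False
    return True


def search(query: Record, ads: List[Record]) -> List[Record]:
    """For each matching ad, return the query fields filled in from that ad."""
    return [{name: ad.get(name) for name in query}
            for ad in ads if matches(query, ad)]
```

For instance, `search({"keywords": ["weather"], "subject": None, "url": None}, ads)` corresponds to search style 1 above, and `search({"maintainer": "m2", "url": None}, ads)` to the second half of style 3.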
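[Second embodiment]
Fig. 11 shows an embodiment applied to a large-scale distributed database and corresponds to claim 4. The user gives the keywords of the information to be searched for to the database search front end (9), which first asks the nearest agent, through the searcher interface (7), for the locations of database servers that hold information related to those keywords. The agent that receives the search request carries out the search according to the operation of the embodiment described first. In doing so, each agent compares the search results returned from other agents with the corresponding data in its own information resource database (4), and if the search result is newer, it overwrites the data in the information resource database (4) with the new data, which realizes claim 4. Once the locations of database servers have been obtained, the database search front end (9) selects several suitable servers from the results, forwards the keywords to those database servers, and has them search their databases. Finally, it merges the search results from the database servers and returns them to the user. Fig. 12 schematically illustrates this series of operations.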
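[Third embodiment]
Fig. 13 shows another embodiment applied to a large-scale distributed database.
In the second embodiment, the search is executed after the locations of the database servers have been determined by the present invention. In this embodiment, the search for database server locations and the database search on the servers discovered are carried out in parallel.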
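As in the previous section, the administrators of database servers advertise their servers in advance. A person searching for information requests a database search through the searcher interface (7). The request is put into a suitable format by the inquiry processing unit (2) and passed to the information resource data control unit (5).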
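The information resource data control unit (5) searches the information resource database (4) for the locations of database servers that can answer the search. If information about a matching database server is found, it has that server execute the search through the database interface (10). If usable cost remains, it also issues the search request to other agents through the agent interface (3) and has them return their results. When results come back from the database server it queried, it stores them temporarily and waits until the results from the other agents have arrived. When all search results have been returned, it merges them and sends them back to the requester. Fig. 14 schematically illustrates this series of operations.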
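With this method, a database server may receive the same search request from several agents. It is therefore desirable to assign an ID to each search request in advance, to have the database server keep its search history together with that ID, and to provide a function that avoids running the same search repeatedly, for example by answering a duplicate request with "already answered".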
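The duplicate-suppression idea can be sketched in a few lines. The class below is hypothetical (the names `DatabaseServer`, `handle`, and the substring-based lookup are assumptions made for illustration); it only shows the mechanism of remembering request IDs.

```python
class DatabaseServer:
    """Toy database server that answers each request ID at most once."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.answered = set()         # IDs of search requests already handled

    def handle(self, request_id: str, keyword: str):
        if request_id in self.answered:
            # Same request arrived again via another agent: do not search twice.
            return "already answered"
        self.answered.add(request_id)
        return [row for row in self.rows if keyword in row]
```

A real server would presumably also expire old IDs so the history does not grow without bound; that detail is left out here.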
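[Fourth embodiment]
In the embodiments above, the information about information resources obtained from agents is basically supplied one-sidedly by the information resource providers. As in the embodiment of Fig. 15, however, users may actually use the information resources obtained as search results and feed back information such as their quality to the agents. This enables all users to use information resources effectively. For example, when a resource turns out to be no longer usable at the time of use, an advertisement is made to delete that resource from each agent's information resource database so that other users do not try to use the invalid resource. When a resource meets the user's wishes well, an advertisement for data control that extends its lifetime is made so that other users can also find information about it easily. This embodiment corresponds to claim 5.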
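Feedback information about the evaluation of an information resource is passed through the searcher interface (7) to the feedback processing unit (11). There it is converted to a suitable format and passed to the information resource data control unit (5) so that it can be advertised and propagated to other agents. On receiving the advertisement, the information resource data control unit (5) updates the lifetime of the corresponding entry in the information resource database (4) according to the evaluation and, as with an ordinary advertisement, conveys the evaluation to other agents through the agent interface (3) as long as cost remains. Fig. 16 shows the processing flow of the feedback processing unit (11).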
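This figure assumes three levels of evaluation from users: + (satisfied), - (dissatisfied), and * (unusable). For each of these evaluations, the unit instructs the information resource data control unit (5) to make an advertisement requesting, respectively, extension of the lifetime, shortening of the lifetime, or deletion of the corresponding entry in the database.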
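The three-level feedback rule above maps directly onto a small update function. In this sketch the database is assumed to be a dictionary from URL to expiry tick, and the size of the adjustment (`step`) is a made-up parameter; the source does not say by how much a lifetime is extended or shortened.

```python
def apply_feedback(db: dict, url: str, rating: str, now: int, step: int = 5) -> None:
    """Adjust the lifetime of one entry according to a user's rating.

    db maps URL -> expiry tick (an assumed representation).
    """
    if url not in db:
        return
    if rating == "+":                 # satisfied: extend the lifetime
        db[url] += step
    elif rating == "-":               # dissatisfied: shorten, but not into the past
        db[url] = max(now, db[url] - step)
    elif rating == "*":               # unusable: delete the entry
        del db[url]
    else:
        raise ValueError(f"unknown rating: {rating!r}")
```

The same call would be made on every agent the feedback advertisement reaches, so that well-rated resources survive longer across the network.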
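In the embodiment above, the lifetime of the advertisement in the information resource database is used as the measure of an information resource's quality. Apart from lifetime, however, search results may be prioritized, sorted, and returned by other criteria, such as the number of times the resource has been used or how recent it is. In that case, when the provider advertises and information about the resource is recorded in each agent's information resource database (4), an attribute holding the evaluation of the resource must also be recorded in the database. When information about the same resource is returned from several agents with different evaluation values, an evaluation value for the agent organization as a whole is calculated, for example by taking the average. Finally, the searcher interface (7) need not present all search results to the searcher; it may rank them against the searcher's evaluation criteria and present the best ones.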
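Combining evaluation values from several agents by averaging, then ranking, might look like the following. The function name `rank` and the `(url, score)` pair format are assumptions for the sake of the example; averaging is the combination rule the text itself suggests.

```python
from collections import defaultdict
from statistics import mean


def rank(results):
    """Average the scores reported per URL and sort best-first.

    results: iterable of (url, score) pairs, possibly from different agents.
    """
    by_url = defaultdict(list)
    for url, score in results:
        by_url[url].append(score)
    averaged = {url: mean(scores) for url, scores in by_url.items()}
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)
```

A searcher interface could then show only the first few entries of the returned list instead of every result.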
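[Fifth embodiment]
In the embodiments above, the evaluation values that each agent holds for information resources are aggregated when a searcher performs a search and are used by the searcher to choose among resources. The evaluation values may, however, be calculated independently of searching and used by an information resource provider to learn how the resources it has published are rated by the community.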
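Fig. 17 is a block diagram of an information resource discovery scheme with a function that lets information resource providers aggregate the evaluation values of their resources. The producer interface (6) accepts evaluation requests as well as advertisement requests from providers. When it receives an evaluation request, it authenticates the provider if necessary and sends the content to the evaluation request processing unit (12). The evaluation request processing unit (12) analyzes the content and sends an evaluation request to the information resource data control unit (5). The information resource data control unit (5) looks up the evaluation value of the corresponding resource in the information resource database (4) and, as with advertisement and search of resource information, also sends the evaluation request to other agents through the agent interface (3). The evaluation values returned are collected and aggregated in the information resource data control unit (5) and returned to the provider through the evaluation request processing unit (12) and the producer interface (6).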
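[Sixth embodiment]
Fig. 18 shows the configuration of a distributed search engine that manages and searches information about Web documents. Components that are the same as in the embodiments above are given the same numbers.
In terms of technical concept this embodiment is the same as the first to sixth embodiments, but it is disclosed as the best mode.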
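In normal operation, a Web document author conveys information about a Web document he or she has created to the nearest agent, through the producer interface (6), in the form of an advertisement. The concrete format of the advertisement is the same as in the embodiments above. On the basis of the advertisement information sent from the producer interface (6), the advertisement processing unit (1) adds information such as the registration date and time, the lifetime of the information, and the initial values of the advertisement cost and the evaluation value, and passes the result to the Web document data control unit (5). The Web document data control unit (5) is basically the same as the information resource data control unit; it is given this name for convenience because the information resources handled in this embodiment are Web documents. For the same reason, the Web document database (4) is the information resource database. As in the embodiments above, the Web document data control unit (5) registers the Web document information obtained from the advertisement in the Web document database (4), calculates the cost spent on the advertisement, and, if cost remains, has the advertisement conveyed to other agents through the agent interface (3). It also deletes from the Web document database (4), on the basis of the clock (8), Web document information whose validity period has passed.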
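When an advertisement request is received from another agent, it is passed to the Web document data control unit (5) through the agent interface (3).
Fig. 19 shows the processing flow after the Web document data control unit (5) receives advertisement request data from the advertisement processing unit (1) or the agent interface (3).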
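A Web document reader, on the other hand, searches for Web documents through the searcher interface (7). The inquiry processing unit (2) creates a search request from the reader's input and sends it to the Web document data control unit (5). The format of the search request is as shown in Fig. 10. The Web document data control unit (5) searches the Web document database (4) for Web documents that satisfy the search request and holds the results. Then, as with advertisements, it calculates the remaining usable cost and, if cost remains, forwards the search request to other agents through the agent interface (3) and waits for the results to be returned.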
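When results have come back from every agent to which the search request was sent, or when a fixed time has elapsed, the results collected are sent to the inquiry processing unit (2), formatted as a list, and presented to the Web document reader through the searcher interface (7). When a search request is received from another agent, it is passed to the Web document data control unit (5) through the agent interface (3), and the search results are returned to the requesting agent through the agent interface (3).
Fig. 20 shows the processing flow after the Web document data control unit (5) receives a search request from the inquiry processing unit (2) or the agent interface (3).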
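The agent that received the search request directly from the Web document reader presents the search results to the reader, and on the basis of those results the reader actually accesses the Web documents listed, through the searcher interface (7). The reader is then asked to evaluate whether each Web document satisfies his or her needs. In some cases the searcher interface (7) performs the evaluation automatically. A state in which the actual Web document does not exist, or cannot be used for some reason, is undesirable, so it can be collected automatically as a negative evaluation. Fig. 21 shows the processing flow of the searcher interface (7) in this case.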
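[Seventh embodiment]
Web documents can be collected not only from advertisements by Web document authors but also by using the Web document information held by existing search engines. One possible form of use is for agents to act as wrappers around search engines and exchange information about Web documents with one another. In this way existing search engines can be linked and integrated, and all of them can be used in a uniform manner.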
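Fig. 22 is a block diagram of a distributed search engine that uses existing search engines in this way. It is almost the same as the distributed search engine of the previous section, but differs in having a search engine interface (13) through which Web document information can be obtained from existing search engines.
The search engine interface (13) extracts the keywords contained in a search request, runs a search with an existing search engine, and presents the Web documents obtained to the searcher from the Web document data control unit by way of the searcher interface.
The Web document database (4) caches the Web document information of the existing search engines and functions as a database of evaluation information about Web documents. Web documents are analyzed, advertisement information corresponding to each Web document is created in the advertisement format described above, and it is stored in the Web document database (4).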
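The basic operation for advertisement requests is the same as in the previous section, but several operations for search requests are conceivable, depending on how the existing search engines are used.
・ A form in which the existing search engine serves mainly as the Web document database, and the Web document database (4) is used to store evaluation information about Web documents. For a search request, the search engine is searched through the search engine interface (13); for each Web document in the results, the Web document database (4) is consulted, evaluation information is attached, and the results are returned to the agent or the Web document reader.
・ A form in which the Web document database (4) is used mainly, and the existing search engine is used as a supplement when sufficient search results are not obtained.
For a search request, the Web document database (4) is searched first. If no results that meet the request are obtained, or if not enough Web document information is obtained, an existing search engine is searched through the search engine interface (13). The results are presented to the Web document reader and also registered in the Web document database (4) in preparation for the next search.
・ A form in which the existing search engine is used to manage and maintain the document database (4).
For a search request, the Web document database (4) is searched as usual. At the same time, a history of search requests is kept, and at fixed intervals the search engine is searched through the search engine interface (13) on the basis of that history, and the Web document database (4) is updated from the results.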
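As described above, according to the present invention, information about information resources can be managed automatically without relying on manual work, so administrators of resource information no longer need to spend effort on registering and maintaining information. Moreover, like a conventional directory service, the invention is basically based on declarations by information resource providers, so the information quality that was the advantage of directory services is not lost.
In addition, resource information is managed in a distributed manner, and giving redundancy to the connections between agents makes the system robust against changes in the network environment and improves both the distribution of search load and ease of access for searchers. Limiting the cost that can be spent on advertisement and search limits the range of agents to which resource information is advertised and the range of agents searched, which keeps overall traffic low. Because no needless access to information resource servers, such as that made by robot programs, is required, servers are not subjected to high load. Furthermore, an information resource provider can always access the nearest agent to update resource information, while feedback from searchers on the availability of resources causes updates to reach other agents immediately, so resource information can be expected to go stale less easily.
Feedback of searchers' evaluations adjusts the lifetime of resource information on each agent, so that information about useful resources becomes easier to find and information about others becomes harder to find. The resource information returned as search results is ranked according to an overall evaluation based on feedback from many searchers, so even searchers who lack the knowledge or skill to choose among the many available resources can be expected to find, select, and use better resources more easily. The invention therefore contributes greatly to efficient distributed management and search of large-scale resource information.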
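[Eighth embodiment]
Fig. 27 is a block diagram of this embodiment and shows an information resource discovery device with distributed management and search functions for information about information resources. Items shown in Fig. 23 are given the same numbers.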
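First, the processing of an advertisement request is explained with the flowchart of Fig. 28. In normal operation, an information resource provider (producer) conveys information about the service it offers to the nearest agent, through the producer interface (6), in the form of an advertisement (advertisement request). The concrete content of an advertisement is the URL of the resource offered, its name, keywords describing its content, the administrator's name, and so on. The producer interface (6) passes the advertisement to the agent's advertisement processing unit (1). That unit analyzes the content of the advertisement, checks that no required information is missing (for example, the URL of the resource or the related keywords) and that no invalid data is included (for example, whether the URL has a valid form and whether the kanji encoding is correct), and passes the advertisement to the information resource data control unit (5). There the history database (14) is updated on the basis of the advertisement, and the content of the advertisement is stored in the information resource database (4). The concrete format of the history database (14) is described later.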
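The advertisement information can also be conveyed, through the agent interface (3), to agents determined in advance by the agent administrator. An agent that receives advertisement information likewise obtains it from its agent interface (3), passes it to the information resource data control unit (5), and, as before, updates the history database (14) and the information resource database (4).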
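As shown in the flowchart of Fig. 29, a search request for information resources is sent from the searcher through the searcher interface (7) to the inquiry processing unit (2). There the request is checked to confirm that no required data is missing (for example, search keywords, or a mail address if the results are to be returned by mail) and that no invalid data is included (for example, whether the mail address has a valid form).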
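On receiving the search request, the inquiry processing unit (2) sends it to the information resource data control unit (5), which analyzes the request and updates the history database (14) accordingly. It then searches the information resource database (4), and the result is returned to the requester through the inquiry processing unit (2) and the searcher interface (7).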
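As with advertisements, a search can also be requested, through the agent interface (3), from the neighboring agents designated by the agent administrator, and the results collected and returned to the searcher. In this case the agent waits for a fixed time until the results from all the agents to which the request was forwarded have arrived. When a search result is sent from the agent interface (3) to the information resource data control unit (5), the unit analyzes the content of the result (for example, whether the data was corrupted in transit) and checks which search request it answers. If the result for that search request has not yet been returned to the requester (the searcher or another agent), the result is added to the return buffer that accumulates results to be sent back. The search result is then registered in the history database (14) and the content of the information resource database (4) is updated. Fig. 30 shows this series of processing steps for search results. The agent thus waits until results have been returned from every agent to which the request was sent, or until a timeout determined by reference to the clock (8), and then returns the content of the return buffer to the requester.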
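As shown in Fig. 31, an example of the concrete format of the history database consists of tuples of the time at which the advertisement information, search request, or search result was received, its ID, the related keywords, and the sender.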
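The ID identifies the information received and has a form such as "R236666@A3". The first character indicates whether the item is advertisement information, a search request, or a search result: "P", "Q", or "R", respectively. The digit string that follows is a unique number generated from the time, the process number, and so on, to distinguish items of information. The part after "@" is the ID of the agent.
Next, the keyword list holds the related keywords of the information resource in the case of advertisement information, the search keywords in the case of a search request, and all related keywords of the information resources obtained (generally more than one) in the case of a search result.
Finally, the sending agent indicates which agent sent the information directly. This database makes it possible to know what kinds of information resources are gathered from which agents, and what kinds of information resources are in high demand.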
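The ID format described above ("P", "Q", or "R", then a digit string, then "@" and an agent ID) is simple enough to parse with a regular expression. The sketch below is illustrative; the names `MessageId`, `parse_id`, and `KINDS` are hypothetical, and the assumption that agent IDs consist of word characters is mine.

```python
import re
from typing import NamedTuple

# Leading letter -> kind of message, as described in the text.
KINDS = {"P": "advertisement", "Q": "search request", "R": "search result"}
_ID_PATTERN = re.compile(r"^([PQR])(\d+)@(\w+)$")


class MessageId(NamedTuple):
    kind: str      # advertisement / search request / search result
    serial: str    # unique digit string (kept as text to preserve leading zeros)
    agent: str     # ID of the agent after "@"


def parse_id(text: str) -> MessageId:
    """Split an ID such as 'R236666@A3' into its three parts."""
    match = _ID_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a valid message ID: {text!r}")
    letter, serial, agent = match.groups()
    return MessageId(KINDS[letter], serial, agent)
```

For the example in the text, `parse_id("R236666@A3")` yields a search result with serial `236666` from agent `A3`.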
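[Ninth embodiment]
Fig. 32 shows an embodiment of a distributed search device that has a spool for advertisement information. For users, the sooner search results are obtained the better, but the same is not necessarily true of advertisement processing. When an advertisement request is accepted while a search is being processed, there is no need to process the advertisement on the spot at the expense of search performance; postponing it is sometimes preferable. Indeed, an advertisement for a resource that almost no one has ever searched for may be referenced by no one even if it is advertised to other agents immediately, so hurrying may bring users no benefit. In contrast, users will want to know right away about resources in a field that many users search repeatedly, so advertising those quickly is worthwhile. For this purpose, the configuration of this embodiment stores advertisement information that does not need to be advertised urgently in the advertisement information spool (15) and processes it in batches.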
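Fig. 33 shows the processing flow in this embodiment. On receiving advertisement information, the information resource data control unit (5) searches the history database for search requests that use the related keywords of the resource in the advertisement, their synonyms, or words related to them. It then decides a priority from the result; for example, the priority may be the number of search requests found by searching the history database (14). If the priority exceeds a certain threshold, the advertisement is processed immediately; otherwise it is stored in the advertisement information spool (15). Processing of the advertisement information in the spool (15) starts when the agent becomes idle, when the spool (15) becomes full, or at fixed intervals. At that time too, items are processed in descending order of priority.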
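Immediately after the device is started there is no history, so priorities cannot be determined. The agent administrator may therefore set the priorities initially. After that, the priorities set by the administrator may also be used in combination with the priorities determined from the history.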
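The priority rule and the spool can be sketched as follows. This assumes the simplest reading of the text: priority is the number of past search requests sharing a keyword with the advertisement (synonym handling is omitted), history entries are dictionaries with `kind` and `keywords`, and the names `priority`, `dispatch`, and `drain` are hypothetical.

```python
def priority(ad_keywords, history):
    """Count past search requests ('Q') sharing a keyword with the ad."""
    wanted = set(ad_keywords)
    return sum(1 for entry in history
               if entry["kind"] == "Q" and wanted & set(entry["keywords"]))


def dispatch(ad, history, spool, threshold):
    """Advertise at once if demand is high enough, otherwise spool the ad."""
    p = priority(ad["keywords"], history)
    if p > threshold:
        return "advertise now"
    spool.append((p, ad))
    return "spooled"


def drain(spool):
    """Empty the spool, returning its ads in descending priority order."""
    spool.sort(key=lambda item: item[0], reverse=True)
    ads = [ad for _, ad in spool]
    spool.clear()
    return ads
```

`drain` would be called when the agent is idle, when the spool fills up, or on a timer, matching the three triggers named in the text. An administrator-supplied priority could be added to the history-based count as a further term.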
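[Tenth embodiment]
The purpose of this embodiment is to record route information in the history database (14) as well, and thereby deliver advertisements and search requests more effectively. The block diagram is the same as Fig. 27, but the format of the history database (14) in Fig. 27 is changed as shown in Fig. 34: instead of the forwarding agent, it records the transfer route, which shows which agents the information passed through. Information about the transfer route can be extracted from the messages themselves, because the advertisement information, search requests, and search results exchanged carry the IDs of the agents they have passed through in order to prevent them from being relayed through the same agent repeatedly (like the Path field of mail and netnews). An agent that receives such information records the route information in the history database (14). The keywords are the set of related keywords of the target resource in the case of advertisements and search results, and the set of keywords specified as search terms in the case of search requests. The transfer route is the sequence of agents through which the advertisement information, search request, or search result has passed. For example, "A1:A2:A3" indicates that the data was conveyed from agent A3 through agent A2 to agent A1.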
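In the example of Fig. 34, it can be seen that much of the advertisement information for resources related to keyword kw1 is advertised from agent A3 (the entries marked with white circles in Fig. 34; the circles are drawn for explanation only and are not information actually recorded in the database), so resource information about kw1 is likely to be concentrated around A3. Therefore, when a search request concerning kw1 is received, forwarding it directly to agent A3, even though A3 is not a neighbor, is likely to yield many search results.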
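Conversely, it can be seen that many search requests concerning kw1 come from agent A7 (the entries marked with black circles), so forwarding advertisements for resources related to kw1 directly to A7 allows searches from A7 to be carried out effectively. On the basis of such considerations, the information resource data control unit (5) refers to the history database (14) as needed, decides on suitable agents according to the content of the advertisement information or search request, and forwards the advertisement information or search request to them.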
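The routing decision described above can be sketched as a count over the recorded transfer routes. This is only one possible reading: the origin of a message is taken to be the last element of its path (as in the "A1:A2:A3" example), history entries are assumed to be dictionaries, and `origin` and `best_target` are hypothetical names.

```python
from collections import Counter


def origin(path: str) -> str:
    """'A1:A2:A3' means the data travelled from A3 via A2 to A1."""
    return path.split(":")[-1]


def best_target(history, keyword, kind):
    """Agent from which most entries of this kind and keyword originated.

    To route a search request, look at kind 'P' (where are the ads from?).
    To route an advertisement, look at kind 'Q' (where is the demand?).
    Returns None when the history has no matching entry.
    """
    counts = Counter(origin(entry["path"]) for entry in history
                     if entry["kind"] == kind and keyword in entry["keywords"])
    return counts.most_common(1)[0][0] if counts else None
```

When no matching history exists, an agent would presumably fall back to its administrator-configured neighbors.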
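[Eleventh embodiment]
In a distributed search device that uses a history database like that of the tenth embodiment (Fig. 34), an agent state table (16) is additionally provided, as shown in Fig. 35. The agent state table (16) is a table of the response times observed when advertisement information, search requests, and search results were forwarded to other agents. Fig. 36 shows an example. Each entry of the table consists of three items: the agent ID, the response time for that agent, and the last check time, which is the time at which the response time was last measured. In the response time column, "------" indicates that communication did not succeed.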
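The purpose of this embodiment is to send an advertisement request to a nearby agent instead when it cannot be forwarded to a neighboring agent for some reason, such as a network failure. The advertisement request is processed as usual and sent to the neighboring agents. The response time is measured and the agent state table (16) is updated. If the advertisement request could not be sent to a neighboring agent (call that agent A), the history database (14) as shown in Fig. 34 is first searched to find A's neighbors. Then the agent state table (16) as shown in Fig. 36 is consulted, the agent B with the fastest response among A's neighbors is selected, and the advertisement request is forwarded to it. In this way, agents that cannot process requests are bypassed and advertisement processing can continue in the system as a whole. Search requests can avoid failures by the same procedure, so that searching continues in the system as a whole.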
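In other words, as shown in Fig. 37, if an advertisement request, search request, or search result that travelled from agent A by way of B has been received in the past, the agent knows that B is a neighbor of A and can send directly to B even if A fails.
Fig. 38 shows this series of processing steps.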
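The fallback procedure (find the failed agent's neighbors from recorded routes, then pick the one with the fastest response) might be sketched like this. The data shapes are assumptions: routes are colon-separated strings as in the tenth embodiment, and the state table maps agent ID to a response time, with `None` standing in for the "------" entry. The function names are hypothetical.

```python
def neighbors_of(agent, paths):
    """Agents that appear next to `agent` in any recorded transfer route."""
    found = set()
    for path in paths:
        hops = path.split(":")
        for left, right in zip(hops, hops[1:]):
            if left == agent:
                found.add(right)
            elif right == agent:
                found.add(left)
    return found


def fallback(agent, paths, state_table):
    """Fastest-responding known neighbor of an unreachable agent, or None.

    state_table maps agent ID -> response time; None means the last
    attempt to communicate failed.
    """
    candidates = [(state_table[n], n) for n in neighbors_of(agent, paths)
                  if state_table.get(n) is not None]
    return min(candidates)[1] if candidates else None
```

Agents with no recorded response time, or whose last contact failed, are skipped, so the request is only redirected to an agent that is believed to be reachable.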
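[Twelfth embodiment]
Fig. 39 is a block diagram of a distributed search device that, when sending spooled advertisement requests to neighboring agents, refers to the history database (14) and schedules the transfers for time slots in which network traffic and agent load are relatively low.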
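Fig. 40 shows an example of the agent state table (16) for this embodiment. Unlike the example table of the eleventh embodiment (Fig. 36), this table holds, for each time slot, the average response time and the number of measurements actually made. These values are updated mainly from the time taken by search requests and the return of search results, which are processed in real time. In this example only the average response time and the measurement count are recorded, but information such as the last check time and the state at that time may also be included, as in the table of the eleventh embodiment (Fig. 36).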
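When an advertisement request is received, advertisement information of low priority is stored in the advertisement information spool (15), as in the ninth embodiment. At that time, the advertisement scheduler (17) creates an advertisement delivery schedule that specifies in which time slot the advertisement information is to be conveyed to which agent. For example, in Fig. 40, advertisement information that must be sent to agent A1 is transferred in the 01:00 to 01:59 slot, when A1 is expected to be least busy, if its priority is very low, whereas items of somewhat higher priority are transferred as early as possible, even in a slot with a somewhat longer response time. The advertisement information is then delivered to the specified agents according to the delivery schedule.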
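One way to turn the per-slot averages into a delivery time is sketched below. The source gives only the qualitative rule (very low priority waits for the quietest slot; higher priority goes out earlier even if the slot is busier), so the concrete mechanism here, a response-time `tolerance` that grows with priority, is an assumption, as are the function name `pick_hour` and the hourly slots.

```python
def pick_hour(avg_response, now_hour, tolerance=None):
    """Choose the hour (0-23) in which to deliver a spooled advertisement.

    avg_response: dict hour -> average response time of the target agent.
    tolerance=None models the lowest priority: wait for the quietest hour.
    Otherwise return the earliest hour from now whose average is within
    the tolerance, falling back to the quietest hour if none qualifies.
    """
    upcoming = [(now_hour + i) % 24 for i in range(24)]
    known = [h for h in upcoming if h in avg_response]
    if not known:
        return now_hour               # no measurements yet: nothing to optimize
    if tolerance is not None:
        for hour in known:
            if avg_response[hour] <= tolerance:
                return hour
    return min(known, key=lambda h: avg_response[h])
```

A scheduler could map priorities to tolerances in any monotone way, so that a higher priority accepts a slower slot in exchange for earlier delivery.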
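[Thirteenth embodiment]
Fig. 41 is a block diagram of a distributed search device for a distributed search scheme that uses cost to limit the range over which information such as advertisement requests is distributed; the device has a function that divides the cost to be spent on forwarding information among the neighboring agents according to the past record of advertisements and searches stored in the history database (14).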
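When an advertisement request is received, it is processed as usual, and when it is to be forwarded to other agents, the history database (14) is searched to find out from which agents many search requests matching the advertisement have been forwarded in the past. Demand for the information can be regarded as high among the agents in the direction from which many search requests have come. If a large share of the cost is spent advertising in that direction, the advertisement information will reach the immediate vicinity of the high-demand region, and the next time a search request occurs the resource information is likely to be found at low cost. Allocating the advertisement cost among the neighboring agents in this way is the role of the cost manager (18).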
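Conversely, when a search request has been forwarded to the agent and is to be forwarded on to other agents, matching resource information is likely to be plentiful in the direction from which advertisement requests and search results have come. Spending a large share of the cost on searching in that direction should therefore yield many search results and, accordingly, better resource information.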
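Fig. 42 shows an example of cost allocation. If advertisement requests for a certain keyword have been received from neighboring agents A, B, and C in the ratio 2:5:3, there is presumably more information about that keyword in the direction of B. Spending more cost on searching B can therefore be expected to give better search results. In this embodiment, the search cost is divided in the ratio 2:5:3 and the search request is forwarded to each neighboring agent accordingly. The allocation need not be strictly proportional; some weighting may be applied.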
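The proportional split in the 2:5:3 example can be written as a short function. The name `split_cost` and the choice to fall back to an even split when there is no history are assumptions; the source only describes the proportional case and notes that other weightings are possible.

```python
def split_cost(total, counts):
    """Divide `total` cost among neighbors in proportion to past counts.

    counts maps neighbor ID -> number of matching items received from it.
    """
    if not counts:
        return {}
    whole = sum(counts.values())
    if whole == 0:
        # No history for this keyword: assumed fallback is an even split.
        share = total / len(counts)
        return {neighbor: share for neighbor in counts}
    return {neighbor: total * count / whole for neighbor, count in counts.items()}
```

With a total search cost of 100 and past advertisement counts of 2, 5, and 3 from neighbors A, B, and C, the shares come out as 20, 50, and 30.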
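[Effects of the invention]
As described above, according to the present invention, information about information resources can be managed automatically without relying on manual work, so administrators of resource information no longer need to spend effort on registering and maintaining information. Moreover, like a conventional directory service, the invention is basically based on declarations by information resource providers, so the information quality that was the advantage of directory services is not lost.
In addition, resource information is managed in a distributed manner, and giving redundancy to the connections between agents makes the system robust against changes in the network environment and improves both the distribution of search load and ease of access for searchers. Limiting the cost that can be spent on advertisement and search limits the range of agents to which resource information is advertised and the range of agents searched, which keeps overall traffic low. Because no needless access to information resource servers, such as that made by robot programs, is required, servers are not subjected to high load. Furthermore, an information resource provider can always access the nearest agent to update resource information, while feedback from searchers on the availability of resources causes updates to reach other agents immediately, so resource information can be expected to go stale less easily.
Feedback of searchers' evaluations adjusts the lifetime of resource information on each agent, so that information about useful resources becomes easier to find and information about others becomes harder to find. The resource information returned as search results is ranked according to an overall evaluation based on feedback from many searchers, so even searchers who lack the knowledge or skill to choose among the many available resources can be expected to find, select, and use better resources more easily. The invention therefore contributes greatly to efficient distributed management and search of large-scale resource information.
Furthermore, by adopting a configuration that keeps and uses histories of advertisement information, search requests, and search results in a distributed search system for resource information, it becomes possible to deliver advertisement information in time slots when agent load is low, deliver information to regions of high demand, search preferentially in regions likely to hold the information, schedule advertisement delivery according to demand and the network environment, and route advertisement information and search requests by allocating cost according to past search results. This contributes greatly to reducing traffic and to efficient information delivery and search.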
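[Brief description of the drawings]
[Fig. 1] Diagram explaining the principle of a search device (agent) in the distributed search system of the present invention.
[Fig. 2] Diagram explaining advertisement of and search for information resources in the distributed search system.
[Fig. 3] Diagram explaining propagation of information about information resources in the distributed search system.
[Fig. 4] Diagram explaining propagation of evaluations of information resources in the distributed search system.
[Fig. 5] Block diagram showing the configuration of an embodiment of the search device (agent).
[Fig. 6] Diagram showing an example of an advertisement.
[Fig. 7] Flowchart showing advertisement processing in the search device (agent).
[Fig. 8] Diagram showing an example of a search request.
[Fig. 9] Flowchart showing search processing in the search device (agent).
[Fig. 10] Diagram showing an example of a search result.
[Fig. 11] Block diagram showing the configuration of the search device in the large-scale distributed database of the second embodiment.
[Fig. 12] Diagram explaining the operation of the large-scale distributed database of the second embodiment.
[Fig. 13] Block diagram showing the configuration of the search device in the large-scale distributed database of the third embodiment.
[Fig. 14] Diagram explaining the operation of the large-scale distributed database of the third embodiment.
[Fig. 15] Block diagram showing the configuration of the search device in the large-scale distributed database of the fourth embodiment.
[Fig. 16] Flowchart showing the processing of the feedback processing unit of Fig. 15.
[Fig. 17] Block diagram showing the configuration of the search device in the large-scale distributed database of the fifth embodiment.
[Fig. 18] Block diagram showing the configuration of the search device in the large-scale distributed database of the sixth embodiment.
[Fig. 19] Flowchart showing the processing of an advertisement request in the sixth embodiment.
[Fig. 20] Flowchart showing the processing of a search request in the sixth embodiment.
[Fig. 21] Flowchart showing the processing of the searcher interface in the sixth embodiment.
[Fig. 22] Block diagram showing the configuration of the search device in the seventh embodiment.
[Fig. 23] Diagram of the principle configuration of a search device to which a history database is added.
[Fig. 24] Diagram explaining advertisement of information resources in the distributed search system of Fig. 23.
[Fig. 25] Diagram explaining search for information resources in the distributed search system of Fig. 23.
[Fig. 26] Diagram explaining the return of information resources in the distributed search system of Fig. 23.
[Fig. 27] Block diagram showing the configuration of the search device in the eighth embodiment.
[Fig. 28] Flowchart showing the processing of an advertisement request in the eighth embodiment.
[Fig. 29] Flowchart showing the processing of a search request in the eighth embodiment.
[Fig. 30] Flowchart showing the processing of a search result in the eighth embodiment.
[Fig. 31] Diagram showing an example of the history database in the eighth embodiment.
[Fig. 32] Block diagram showing the configuration of the search device in the ninth embodiment.
[Fig. 33] Flowchart showing the processing of an advertisement request in the ninth embodiment.
[Fig. 34] Diagram showing an example of the history database in the tenth embodiment.
[Fig. 35] Block diagram showing the configuration of the search device in the eleventh embodiment.
[Fig. 36] Diagram showing an example of the agent state table in the eleventh embodiment.
[Fig. 37] Diagram explaining the operation of the eleventh embodiment.
[Fig. 38] Flowchart showing the processing of an advertisement request with a search for an alternative agent in the eleventh embodiment.
[Fig. 39] Block diagram showing the configuration of the search device in the twelfth embodiment.
[Fig. 40] Diagram showing an example of the agent state table in the twelfth embodiment.
[Fig. 41] Block diagram showing the configuration of the search device in the thirteenth embodiment.
[Fig. 42] Diagram explaining the operation of the thirteenth embodiment.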
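[Explanation of reference numerals]
1: Advertisement processing unit
2: Inquiry processing unit
3: Agent interface
4: Information resource database
5: Information resource data control unit
6: Producer interface
7: Searcher interface
8: Clock
9: DB search front end
10: DB interface
11: Feedback processing unit
12: Evaluation request processing unit
13: Search engine interface
14: History database
15: Advertisement information spool
16: Agent state table
17: Advertisement scheduler
18: Cost manager

[Industrial applications]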
The present invention relates to a function for locating computers that hold the information resources a user wants, in a search-device environment where a vast number of computers connected to a large-scale network each hold some information resource and offer it to users.
[Prior art]
Information and services provided by computers are intangible goods that are the basis of the information industry, and are collectively called “information resources”.
As networks have developed, the number of connected computers has become enormous, and a wide variety of services are offered, it has become very difficult to know which information resources each computer holds.
Even when such information is obtained, the network environment changes from moment to moment because of maintenance and failures of computers and networks, so an information resource that was available before is not necessarily available again. Users therefore need to learn which computers provide the desired information resource on the basis of the latest information at the time they actually intend to use it.
In addition, many computers generally hold the same information resource, and the quality of that resource (freshness, accuracy, level of abstraction, and so on) naturally differs from computer to computer according to each administrator's operating policy. Users would therefore like to find, among many computers, the ones with the better information resources. However, the quality of an information resource cannot be judged without actually using it and comparing it with others, which costs the user a great deal of effort for little benefit. For beginners with little knowledge of information resources, even making such a judgement is difficult. A function that recommends information resources that many users consider relatively good is therefore also useful.
In recent years, many information resources on networks have come to be served through the WWW (World Wide Web). On the WWW, the location of an information resource is expressed by a URL (Uniform Resource Locator), and a user who wants to use a resource must know its URL. The URLs an individual can learn unaided, however, cover only a tiny fraction of all information resources on the network. Search services called search engines are therefore offered on the WWW as a way to find the URL of an information resource. Their methods can basically be divided into a step that collects information about the resources available through the network and a step that manages the collected information and provides it to users. Collection methods fall broadly into the following two types.
・ Directory service
With this method, the information resource provider asks the administrator of a search engine service (directory service), who acts as a kind of information resource broker, to register the resource in a directory list. Many search engines such as Yahoo (http://www.yahoo.com/) and AltaVista (http://altavista.digital.com/) fall into this category. Because providers submit registration requests with a certain degree of confidence, the quality of the information is good; on the other hand, registration is often done by hand by the administrator, which places a heavy burden on the administrator. As a result, information may not be updated promptly and accurately.
・ Robot
This method uses a program called a robot, which follows the links (anchors) in HTML (Hyper-Text Mark-up Language, the standard language for describing information provided on the WWW) documents one after another, automatically explores URLs all over the world, and builds a database of URLs. Examples are the WWW Worm (University of Colorado, O. A. McBryan) and the RBSE Spider (University of Houston, D. Eichmann). The advantage of this method is that the information resource provider need not do anything. Strictly speaking, however, a robot cannot discover a resource unless the provider tells someone that the resource is offered and has a link made to it. Moreover, because resources are referenced without the provider's knowledge, the provider does not know whom to notify when a resource or service is updated. Furthermore, because resources are gathered mechanically, useless resources are easily picked up, which puts needless load on networks and computers.
Next, the method of managing the position information on the collected information resources and providing the information to the user can be classified as follows.
・ Centralized management
A single server provides all the data; many search engines, including Yahoo and AltaVista, fall into this category. This method has the advantage that maintenance is easy because only one site has to be managed. On the other hand, because user access is concentrated on one server, the load on that server tends to become very heavy. In addition, it is unavoidable that the communication cost to the server is high for some users, so not all users can use the service comfortably, and if the single server goes down the service becomes completely unavailable.
・ Distributed management
In this method, data is managed and provided by several different servers. It can be further classified as follows according to how the work is divided.
・ Distribution of access
Load is distributed by having each user select and use a server that is easy for that user to access. Mirroring corresponds to this. The advantage of this method is that, because several servers have the same function, the service can continue even if one server goes down. However, a user cannot benefit from this unless the user knows the location of an alternative server that provides the same service. In addition, because all servers must always hold the same data, the cost of data management is high.
・ Distribution of services
The services are divided into several categories, and each category is assigned to one of several servers. DNS (Domain Name Service), which looks up an IP address from a computer name, corresponds to this. The large-scale distributed database WAIS (Wide Area Information Server) can also be placed in this category. This method can also be combined with distribution of access. Maintenance is easy because a different server is managed for each type of service. However, when attention is restricted to one type or range of service, the arrangement amounts to centralized management and shows the same disadvantages.
In addition, users must choose a server according to the service they want, so the scheme is inconvenient for users unless there is a means of finding out which server provides which service (DNS uses the domain hierarchy for this, and WAIS uses a database named directory-of-servers).
Meanwhile, as a function for recommending information resources, techniques called social filtering or collaborative filtering are being developed. They recommend what a user is likely to prefer on the basis of other people's recommendations and ratings and of the behavior of others with similar preferences. For example, Tapestry (Xerox Palo Alto Research Center, D. Goldberg, D. Terry) targets Internet news and mailing-list articles and helps a user pick out and read, from a large number of articles, those that a particular person actively recommends. In contrast, other systems recommend items rated highly by other users who gave similar ratings to past articles (that is, who have similar values). These include GroupLens (P. Resnick), which recommends NetNews articles, and Ringo (MIT, P. Maes, U. Shardanand), which recommends music albums.
However, having similar preferences in one field does not guarantee having the same tastes in other fields, so following the behavior and recommendations of one particular individual is not always a good idea. Moreover, because the preference information is managed centrally, the problems described above for centrally managed search engine data also arise in the management of preference data.
[Problems to be solved by the invention]
As mentioned in the previous section, the problems with the existing methods can be summarized as follows.
・ In directory services, registration work is often done manually by an administrator, so the burden on the administrator is heavy. In addition, an administrator's mistake may prevent searches from being performed correctly.
・ Because a robot program does not examine the contents of HTML documents closely, it is likely to transfer useless HTML documents, which tends to increase traffic and server load.
To keep traffic low, the robot program has to be run less frequently, so the collected information tends to be out of date and it becomes more likely that the information obtained is already invalid.
・ With information collection by robots, the information resource provider may not know whom to notify when the content of the provided service changes, and even when the change can be communicated it may not be reflected in search results immediately.
・ If the database of information resources is huge, a search returns an enormous number of results, and it is difficult for the user to judge which information resource is the most appropriate. This is especially pronounced when the user does not have sufficient knowledge of the subject.
・ Similar preferences in some respects do not guarantee that all preferences are similar, so recommendations made by a particular individual are not always satisfactory.
Furthermore, in the above-described distributed search method, how the agents are arranged and how the agents are connected determines the performance of the search. For this reason, the agent manager needs to carefully consider how to build an agent network in order to increase the overall efficiency.
In the conventional distributed search method, the neighbor relationships between agents are determined in advance by the agent administrator, and tuning also has to be done manually by the administrator, so it has been difficult to respond flexibly to changes in the agents' network environment.
However, the network environment changes constantly. A communication path between certain agents may become unusable because of a failure, or the load on an agent may rise temporarily, so that search results cannot be obtained quickly. Moreover, when results cannot be obtained from all the agents that received the search request, a sufficiently satisfactory search result cannot be obtained.
Conversely, an agent that acquires resource information from search results being returned through it may end up holding useless resource information that none of its nearby users ever refers to. Such a situation should be avoided in order to reduce the cost of managing unnecessary information.
The present invention automates the management of information about information resources, mediates between advertisements of information resources by information resource providers and inquiries by users, and at the same time advertises information resources when returning search results to users. In this way it prevents information from becoming stale and selects, from among many information resources, those that are generally recognized as good. Its object is thereby to provide information resource discovery that solves the problems described above.
Further, in the present invention, each agent records, as a history of the resource information advertisements, search requests, and search results it has received, information such as their frequency, their origin, and the response times of other agents. Another object of the present invention is to use these data to schedule information transfer, allocate information delivery costs, and select appropriate agents as transfer destinations, so that resource information is propagated and searched efficiently.
[Means for Solving the Problems]
The present invention consists of a network of a plurality of agents. FIG. 1 illustrates the principle of a single agent (search device). In the figure, the advertisement processing unit (1) receives advertisements about information resources from information resource providers and notifies the providers as necessary. The inquiry processing unit (2) receives search requests from users and returns the results to them. The agent interface (3) exchanges advertisements, inquiries, and their results with other agents. The information resource database (4) holds the advertisements sent from the advertisement processing unit (1) and those received through the agent interface (3). The information resource data control unit (5) stores the advertisements received from the advertisement processing unit (1) or the agent interface (3) in the information resource database (4), calculates the cost required for each advertisement, and, depending on the result, issues instructions through the agent interface (3) such as sending the advertisement on to other agents. It also searches the information resource database (4) in response to inquiries from the inquiry processing unit (2) or the agent interface (3), calculates the cost required to transmit the search request, and forwards the inquiry to other agents through the agent interface (3). FIG. 2 schematically shows the transmission and reception of data such as advertisements and inquiries in the system as a whole. In the figure, circles represent agents, each of which performs the operation described here, and a solid line between agents indicates that they can communicate with each other.
In the present invention, as shown in FIG. 2, a plurality of agents having the configuration described above form an interconnected infrastructure, and an information resource provider notifies the nearest agent, in the form of an advertisement, of information about the services it can provide. The agent that receives the advertisement stores its contents in the information resource database and relays the same advertisement to the agents within the range that can be reached within a certain cost. Each agent that receives the advertisement keeps it for a fixed period (called the lifetime) and deletes it afterwards, which prevents old information from remaining in the agent's resource database indefinitely.
When a user issues an inquiry, the inquiry is, like an advertisement, sent to the agents within the range that can be reached within a certain cost, and when it reaches an agent holding matching data, that agent returns the result to the user. The result is returned by tracing back through the agents along the path the inquiry took, and each agent on the path stores the result, so that the advertisement propagates beyond the range in which the resource provider originally advertised it (FIG. 3). A user whose search succeeded and who obtained a satisfactory result from using the information resource feeds back an evaluation as shown in FIG. 4, and the evaluation is propagated in the same manner as an advertisement issued by an information resource provider. An agent that receives a good evaluation of an information resource extends the lifetime of the information about that resource, so that advertisements for resources that satisfied users remain in the information resource database for a long time. Conversely, an agent that receives a negative evaluation shortens the lifetime of the information, making it harder to find by searching.
Each agent therefore only needs to communicate with nearby agents, and only the information specified by advertisements and search requests is transmitted, so communication costs can be kept low. Users' usage is also reflected in how long an entry remains in the database: advertisements for frequently used, popular information resources stay in the database longer than others and are propagated to a wider range of agents, so they are more easily found when another user searches.
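The cost-limited inquiry and the storing of results along the return path can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the class name `Agent`, the method `query`, the use of a hop count as the cost, and the keyword-to-URL dictionary are all assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    db: dict = field(default_factory=dict)         # keyword -> URL (assumed layout)
    neighbors: list = field(default_factory=list)  # agents this agent can contact

    def query(self, keyword, cost, path=()):
        """Return a URL for `keyword`, or None, using at most `cost` hops."""
        if keyword in self.db:
            return self.db[keyword]
        if cost <= 0:
            return None                 # budget exhausted: do not forward
        for nb in self.neighbors:
            if nb.name in path:
                continue                # never send the inquiry back up the path
            url = nb.query(keyword, cost - 1, path + (self.name,))
            if url is not None:
                self.db[keyword] = url  # store the result on the way back
                return url
        return None
```

With three agents in a chain where only the last one holds the entry, an inquiry with cost 1 fails, while an inquiry with cost 2 succeeds and leaves a copy of the entry in every agent on the return path.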
Further, as a modification of the present invention, a configuration has been devised in which a history database (5) is provided, as shown in FIG. 23, in addition to the configuration described with reference to FIG. 1. The history database (5) records, for the advertisement information, search requests, and search results transferred through the advertisement processing unit (1), the inquiry processing unit (2), and the agent interface (3), when each item was transferred, where it came from (or where it was sent), and what its content was.
The information resource data control unit (6) stores the advertisements received from the advertisement processing unit (1) or the agent interface (3) in the information resource database (4), updates the history database (5), and, depending on their contents, instructs processing such as sending them on to other agents through the agent interface (3). In response to an inquiry from the inquiry processing unit (2) or the agent interface (3), it searches the information resource database (4), updates the history database (5), and performs processing such as forwarding the inquiry to other agents through the agent interface (3). Updating of the information resource database (4) based on search results obtained through the agent interface (3) is also performed under the direction of the information resource data control unit (6). In this case too, the history database (5) is updated in the same way as for advertisements and search requests.
FIGS. 24, 25, and 26 schematically show how advertisements, search requests, and search results are exchanged across the whole system. In the figures, circles represent agents, each of which has the configuration shown in FIG. 23 and performs the operation described above. A solid line between agents indicates that they are in a neighbor relationship. Here, a neighbor relationship is a logical one: the agents know each other's location information and can communicate directly. It therefore does not necessarily reflect geographical or hardware (physical) proximity. Nor does the absence of a neighbor relationship mean that two agents cannot communicate.
In this configuration, as shown in FIG. 24, a plurality of agents having the configuration described above form an interconnected infrastructure, and an information resource provider notifies the nearest agent, in the form of an advertisement, of information about the services it can provide. The agent that receives the advertisement stores its contents in the information resource database and relays the same advertisement to its neighbor agents. In this way the resource information gradually spreads among the agents.
When a user issues an inquiry, the search request is transmitted to neighbor agents in the same way as an advertisement (FIG. 25), and when it reaches an agent holding matching data, that agent returns the result to the user. The result is returned by tracing back through the agents along the path the inquiry took, and each agent on the path stores the result, so that the advertisement propagates beyond the range in which the resource provider originally advertised it (FIG. 26).
The agents to which advertisement information for information resources, search requests, and search results are transferred are set by the agent's administrator when the system is introduced.
As the system operates, a history of advertisement information, search requests, and search results accumulates.
From this information, tendencies can be identified such as "advertisement information arriving from agent A is often related to field F", "search requests from agent B are often related to field F", or "search results from agent C are often related to field F".
Using such information, when a search request for information related to field F is received from some agent, the request is sent preferentially to a neighbor agent close to agent A or C, or directly to agent A or C, so that the desired information resource can be expected to be found more easily.
Conversely, when an advertisement related to field F arrives, transmitting the advertisement preferentially to a neighbor agent closer to an agent with high demand for F can be expected to make it easier for users in that area to find information about F. At the same time, in distributing advertisement information, high-demand information is transmitted immediately while low-demand information is transmitted during periods of low network traffic, which makes it possible to prevent concentration of network traffic and overloading of agents.
In addition, if information on agents' response times is also recorded as history, then when the network has recently been disconnected or an agent has recently gone down, transmission to that agent can be suspended temporarily and the information sent to other agents instead. The information search function as a whole can thus be provided without delay even when part of the agent network has failed.
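One way to picture the use of the history is a small structure that counts, per neighbor, how often each field has been seen and remembers the most recent response time. The sketch below is hypothetical: the class `History`, its methods, the convention that `None` means "no reply", and the 5-second timeout are assumptions for illustration and are not taken from the specification.

```python
from collections import Counter, defaultdict

class History:
    def __init__(self):
        self.fields = defaultdict(Counter)  # neighbor -> field -> number of items seen
        self.last_response = {}             # neighbor -> seconds, or None if no reply

    def record(self, neighbor, field_name, response_time):
        """Log one advertisement, request, or result received via `neighbor`."""
        self.fields[neighbor][field_name] += 1
        self.last_response[neighbor] = response_time

    def rank(self, neighbors, field_name, timeout=5.0):
        """Drop neighbors that recently failed or were slow, then order the rest
        so that those most associated with `field_name` come first."""
        alive = []
        for n in neighbors:
            t = self.last_response.get(n, 0.0)  # unknown neighbors count as responsive
            if t is not None and t <= timeout:
                alive.append(n)
        return sorted(alive, key=lambda n: -self.fields[n][field_name])
```

A search request about field F would then be sent first to the neighbors at the head of `rank(neighbors, "F")`, and a neighbor that did not answer last time is skipped until a later record shows it responding again.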
[Best Mode for Carrying Out the Invention]
[First embodiment]
FIG. 5 is a block diagram of an embodiment of the present invention, and shows an information resource discovery apparatus having a function of managing information resource information in a distributed manner and a search function. In the figure, components that also appear in FIG. 1 are given the same numbers (the numbers in FIG. 11 and FIG. 13, by contrast, indicate the order of data flow and are unrelated to these reference numerals).
In normal operation, an information resource provider (producer) sends information about the service it provides to the nearest agent in advance, in the form of an advertisement (advertisement request). As shown in FIG. 6, the concrete format of an advertisement may be a set of the following items and their corresponding values (items marked with * are filled in automatically by the advertisement processing unit (1) and need not be specified by the information resource provider).
ID (*): A symbol for identifying an advertisement.
From (*): Information that identifies the sender of the advertisement. For example, email address. Used for inquiries from users of information and used as verification data to prevent falsification of advertising content.
Subject: The name of the information resource to be provided.
Keywords: Keywords related to information or services to be provided.
URL: Location information of information to be provided. Although such a field is provided to search for position information, it may be a field describing other information to be searched.
Maintainer: Information for specifying an information resource manager. It can be the same as the sender of the ad.
Cost: A parameter that limits the advertising range. For example, the maximum number of agents passing through, or the time spent on advertising. If a charging system has been introduced for communication and use of a computer, that amount may be used.
Date (*): Date and time when the agent received the advertisement.
Path (*): Sequence of IDs of the agents that have relayed the advertisement. Its purpose is to prevent a single advertisement from being delivered to the same agent more than once.
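The item list above can be pictured as a simple record type. The following is a minimal sketch under stated assumptions: the class `Advertisement`, the function `complete`, the Python field names, and the ID scheme (agent ID plus a serial number) are hypothetical and only mirror the items listed above.

```python
from dataclasses import dataclass, field
from datetime import datetime
from itertools import count
from typing import List, Optional

_serial = count(1)  # assumed ID scheme: a per-process serial number

@dataclass
class Advertisement:
    # Items supplied by the information resource provider
    subject: str
    keywords: List[str]
    url: str
    maintainer: str
    cost: int
    # Items marked with * in the list: filled in by the agent
    ad_id: str = ""
    sender: str = ""
    date: Optional[datetime] = None
    path: List[str] = field(default_factory=list)

def complete(ad, sender, agent_id, now=None):
    """Fill in the * items when an agent first accepts the advertisement."""
    ad.ad_id = ad.ad_id or f"{agent_id}-{next(_serial)}"
    ad.sender = sender
    ad.date = now or datetime.now()
    ad.path.append(agent_id)
    return ad
```

A provider would construct the record with only Subject, Keywords, URL, Maintainer, and Cost, and the receiving agent would call `complete` before storing or relaying it.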
The producer interface (6) transmits the advertisement to the advertisement processing unit (1) of the agent. Then, the contents of the advertisement are analyzed, the fields marked with * are supplemented, and the contents of the advertisement are passed to the information resource data control unit (5). Here, the lifetime of the advertisement is determined, and the content of the advertisement is stored in the information resource database (4). The information resource data control unit (5) refers to the clock (8), and deletes the advertisement whose lifetime has already passed from the information resource database (4). The information resource data control unit (5) also calculates the cost required for the advertisement, and transmits the advertisement to another agent through the agent interface (3) when the usable cost still has room.
The cost includes a time from when an advertisement is requested to when the advertisement is completed, a communication cost, a cost required for using a database, and the like.
In particular, by making the cost unlimited, a conventional broadcast with no limit on the advertising range can also be realized. FIG. 7 shows this series of processing steps. In the figure, (6a) corresponds to the operation of the producer interface (6), (1a) to that of the advertisement processing unit (1), (5a) to (5c) to those of the information resource data control unit (5), and (3c) to that of the agent interface (3).
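The advertisement handling described above (store with a lifetime, relay while cost remains, skip agents already on the Path, delete expired entries) can be sketched as follows. This is an illustration only: the class `Agent`, the dictionary representation of an advertisement, the reading of Cost as "maximum number of agents passed through", and the one-hour default lifetime are assumptions.

```python
class Agent:
    def __init__(self, name, lifetime=3600.0):
        self.name = name
        self.lifetime = lifetime   # seconds an advertisement is kept (assumed value)
        self.neighbors = []
        self.db = {}               # advertisement ID -> (advertisement, expiry time)

    def advertise(self, ad, now):
        """Store the advertisement and relay it while usable cost remains."""
        if self.name in ad["path"]:
            return                 # this route has already passed through here
        ad = dict(ad, path=ad["path"] + [self.name], cost=ad["cost"] - 1)
        self.db[ad["id"]] = (ad, now + self.lifetime)
        if ad["cost"] > 0:
            for nb in self.neighbors:
                nb.advertise(ad, now)

    def expire(self, now):
        """Delete advertisements whose lifetime has passed (the clock check)."""
        self.db = {k: v for k, v in self.db.items() if v[1] > now}
```

With a cost of 2 and three agents in a chain, the advertisement is stored by the first two agents and never reaches the third. Passing a very large cost approximates the unlimited broadcast mentioned above.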
The information searcher, for its part, issues a search request to the inquiry processing unit (2) through the searcher interface (7). As shown in FIG. 8, the concrete format of a search request is the same as that of an advertisement: items that are already known are filled in, and items to be learned through the search are left empty. (Items marked with * are not filled in by the searcher; as with advertisements, they are supplemented automatically by the inquiry processing unit (2).) The following kinds of search, for example, are possible with this format.
1. Specify some Keywords, leave the other items empty, and search for the names or URLs of information resources related to those keywords.
2. To check whether a service is actually being provided, or what it consists of, specify the URL, leave the other items empty, and search for the Subject and Keywords.
3. To enable cooperation among parties that provide the same kind of resource, specify the Subject or Keywords, leave Maintainer empty, and search for the people who manage information resources that are likely to be related. Conversely, specify Maintainer and search for the information resources that person manages.
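The three search patterns above all amount to matching a partly filled-in template against stored advertisements. The sketch below shows one possible reading: the function names `matches` and `search`, the lower-case field names, and the rule that every requested keyword must appear in the entry are assumptions, since the text does not define the matching rule.

```python
FIELDS = ("subject", "keywords", "url", "maintainer")

def matches(entry, request):
    """True if every item filled in by the searcher agrees with the entry."""
    for name in FIELDS:
        wanted = request.get(name)
        if not wanted:
            continue    # empty item: this is what the search should fill in
        if name == "keywords":
            if not set(wanted) <= set(entry["keywords"]):
                return False
        elif entry[name] != wanted:
            return False
    return True

def search(db, request):
    """Return the stored advertisements that complete the request."""
    return [entry for entry in db if matches(entry, request)]
```

For pattern 1 the request would contain only `keywords`, for pattern 2 only `url`, and for the second half of pattern 3 only `maintainer`; the returned entries supply the items that were left empty.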
Upon receiving the search request, the inquiry processing unit (2) passes it to the information resource data control unit (5), which analyzes the request and searches the information resource database (4) accordingly. If a matching entry is found, the result is returned to the search requester through the inquiry processing unit (2) and the searcher interface (7) (search result). At this point, as with advertisements, the information resource data control unit (5) calculates the cost, and if usable cost remains it requests a search from other agents through the agent interface (3), collects the results, and returns them to the information searcher. A search result has the same format as an advertisement, with the items being searched for in the request filled in with the corresponding data. As with the information resource advertisements described above, the cost may be measured in time, expense, and so on. Setting the cost to unlimited makes it possible to search all agents without restricting the search range; conversely, setting the cost to the minimum restricts the search to a single agent. FIG. 9 shows this series of processing steps. In the figure, (7a) corresponds to the operation of the searcher interface (7), (2a) and (2b) to those of the inquiry processing unit (2), (3b) to that of the agent interface (3), and (5d) to (5i) to those of the information resource data control unit (5).
[Second embodiment]
FIG. 11 shows an embodiment for a large-scale distributed database, and corresponds to claim 4. The user gives the database search front end (9) keywords for the information to be retrieved. The database search front end (9) first asks the nearest agent, through the searcher interface (7), for the location information of database servers that hold information about the keywords. The agent that receives the search request executes the search according to the operation of the first embodiment. At this point, each agent compares the search results returned from other agents with the corresponding data in its own information resource database (4), and if a search result is newer, it overwrites the data in the information resource database (4) with the new data; this realizes claim 4. Once the location information of the database servers has been obtained, the database search front end (9) selects several suitable servers from the results, forwards the keywords to those database servers, and has them search their databases. Finally, the search results from the database servers are merged and returned to the user. FIG. 12 schematically illustrates this series of operations.
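The rule that a newer search result overwrites the agent's own entry can be sketched in a few lines. The function name `merge_result`, the use of the URL as the key, and the `date` field used for comparison are assumptions for illustration.

```python
from datetime import datetime

def merge_result(db, result):
    """Overwrite the local entry only if the returned result is newer.

    `db` maps a URL to the entry the agent currently holds; each entry
    carries a `date` timestamp (assumed layout).
    Returns True if the database was changed.
    """
    current = db.get(result["url"])
    if current is None or result["date"] > current["date"]:
        db[result["url"]] = result
        return True
    return False
```

An agent relaying search results back toward the requester would call this once per result, so that stale copies are replaced as a side effect of ordinary searches.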
[Third embodiment]
FIG. 13 shows an embodiment for another large-scale distributed database.
In the second embodiment, the database search is executed after the location information of the database servers has been found by means of the present invention. In this embodiment, the search for the location information of database servers and the database searches on the servers found are carried out in parallel.
As in the previous embodiment, the administrators of the database servers advertise their servers in advance. The information searcher requests a database search through the searcher interface (7). The request is put into a suitable format by the inquiry processing unit (2) and passed to the information resource data control unit (5).
The information resource data control unit (5) searches the information resource database (4) for the location information of database servers relevant to the search. If information about a database server is found, that server is asked to execute the search through the database interface (10). If usable cost remains, a search request is also issued to other agents through the agent interface (3), and their results are awaited. When a result is returned from a database server to which a search was requested, it is stored temporarily until the results from the other agents have all arrived. Once all search results have been returned, they are combined and sent back to the search requester. FIG. 14 schematically illustrates this series of operations.
With this method, a given database server may receive the same search request from several agents. It is therefore desirable to assign an ID to each search request in advance, to have the database server store its search history together with the IDs, and to give the server a function that avoids repeating the same search, for example by replying to a duplicate request that "it has already been answered".
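The duplicate-request check can be illustrated with a server that remembers the IDs it has already served. The class `DatabaseServer`, its `search` method, the in-memory set of IDs, and the substring match are all assumptions made to keep the sketch small.

```python
class DatabaseServer:
    ALREADY_ANSWERED = "already answered"  # reply sent for a duplicate request

    def __init__(self, records):
        self.records = records
        self.seen = set()   # IDs of search requests already processed

    def search(self, request_id, keyword):
        """Run the search once per request ID; refuse repeats."""
        if request_id in self.seen:
            return self.ALREADY_ANSWERED
        self.seen.add(request_id)
        return [r for r in self.records if keyword in r]
```

If two agents forward the request with the same ID, the first receives the matching records and the second receives only the short reply, so the search itself is executed once.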
[Fourth embodiment]
In the embodiments above, the information about information resources obtained from the agents is basically provided one-sidedly by the information resource providers. However, as in the embodiment of FIG. 15, the user may actually use an information resource obtained as a search result and feed information such as its quality back to the agents. This allows all users to make effective use of information resources. For example, if an information resource turns out to be unusable when the user actually tries it, an advertisement is issued that deletes the resource from the information resource database of each agent, so that other users will not try to use the invalid resource. Conversely, if the resource meets the user's needs well, a data control advertisement is issued that extends the lifetime, so that other users can more easily find information about the resource. This embodiment corresponds to claim 5.
Feedback information on the evaluation of the information resource is passed to the feedback processing unit (11) through the searcher interface (7). Then, the data is converted into an appropriate format and passed to the information resource data control unit (5) for advertisement so as to be transmitted to other agents. Upon receiving the advertisement, the information resource data control unit (5) updates the lifetime of the entry of the corresponding information resource in the information resource database (4) according to the evaluation, and as long as there is a margin for the cost as in a normal advertisement The evaluation is transmitted to other agents through the agent interface (3). FIG. 16 shows a processing flow of the feedback processing unit (11).
In this figure, it is assumed that the evaluation from the user is in three levels of + (satisfied), (dissatisfied), and * (unusable). Then, for each of the evaluations, the information resource data control unit (5) is instructed to make an advertisement requesting extension, shortening, and deletion of the lifetime of the corresponding information resource entry in the database.
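The mapping from the three evaluation levels to lifetime changes can be sketched as below. The function name `apply_feedback`, the rating strings `"satisfied"`, `"dissatisfied"`, and `"unusable"`, the URL-to-expiry dictionary, and the one-hour adjustment step are assumptions; the text does not state how much the lifetime is extended or shortened.

```python
def apply_feedback(db, url, rating, now, step=3600.0):
    """Adjust or remove an entry according to a user's evaluation.

    `db` maps a URL to the time (in seconds) at which its entry expires.
    """
    if url not in db:
        return
    if rating == "unusable":
        del db[url]            # delete the entry outright
        return
    if rating == "satisfied":
        db[url] += step        # extend the lifetime
    elif rating == "dissatisfied":
        db[url] -= step        # shorten the lifetime
    if db[url] <= now:
        del db[url]            # shortening may make the entry expire at once
```

Each agent that receives the evaluation as an advertisement would apply the same adjustment to its own copy before relaying the evaluation onward while cost remains.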
In the embodiments above, the lifetime of an advertisement in the information resource database is used as the measure of an information resource's quality. Apart from the lifetime, however, search results may be prioritized by other criteria, such as the number of times a resource has been used or how recent it is, and returned in sorted order. In that case, when the information resource provider issues an advertisement and information about the resource is recorded in the information resource database (4) of each agent, evaluation attributes for the resource must be recorded in the database as well. When information about the same resource is returned from several agents with different evaluation values, an evaluation value for the agent organization as a whole is calculated, for example by taking the average. Finally, the searcher interface (7) need not present all the search results to the information searcher; it may rank them according to the searcher's evaluation criteria and present the best ones.
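Averaging the evaluation values reported by different agents and presenting only the best results can be sketched as follows. The function name `rank_results`, the representation of results as (URL, value) pairs, and the choice of a plain arithmetic mean are assumptions; the text only says that an average or similar aggregate is taken.

```python
from collections import defaultdict
from statistics import mean

def rank_results(results, top=1):
    """Average per-resource evaluation values and return the best URLs.

    `results` is a list of (url, evaluation_value) pairs, where the same
    URL may be reported by several agents with different values.
    """
    by_url = defaultdict(list)
    for url, value in results:
        by_url[url].append(value)
    averaged = {url: mean(values) for url, values in by_url.items()}
    return sorted(averaged, key=averaged.get, reverse=True)[:top]
```

Other criteria mentioned above, such as usage count or recency, could be substituted by changing the value stored in each pair; the ranking step itself stays the same.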
[Fifth embodiment]
In the embodiments above, the evaluation values of information resources held by the agents are aggregated when an information searcher performs a search, and are used by the searcher to choose among information resources. The aggregation may also be performed independently of any search, and used by an information resource provider to learn how the resources it has published are regarded socially.
FIG. 17 is a configuration diagram of an information resource discovery method with a function that lets an information resource provider aggregate the evaluation values of information resources. The producer interface (6) accepts evaluation requests as well as advertisement requests from information resource providers. When an evaluation request is received, it verifies the information resource provider if necessary and sends the contents to the evaluation request processing unit (12). The evaluation request processing unit (12) analyzes the contents and sends an evaluation request to the information resource data control unit (5). The information resource data control unit (5) looks up the evaluation value of the relevant information resource in the information resource database (4) and, as with advertisements and searches of resource information, also sends the evaluation request to other agents through the agent interface (3). The evaluation values returned are collected and aggregated by the information resource data control unit (5) and returned to the information resource provider through the evaluation request processing unit (12) and the producer interface (6).
[Sixth embodiment]
FIG. 18 shows the configuration of a distributed search engine that manages and searches for information related to Web documents. In the figure, the same components as those in the above-described embodiments are given the same numbers.
This embodiment is the same as the sixth embodiment in terms of technical concept, but is disclosed as the best mode.
In normal operation, a Web document creator sends information on a Web document it has created to the nearest agent, in the form of an advertisement, through the producer interface (6). The specific format of the advertisement is the same as in each of the above embodiments. The advertisement processing unit (1) takes the advertisement information on the Web document sent from the producer interface (6), adds information such as the registration date and time, the lifetime of the information, and the initial values of the advertisement cost and the evaluation value, and passes it to the Web document data control unit (5). The Web document data control unit (5) is basically the same as the information resource data control unit; it is named this way for convenience because the information resources targeted by this embodiment are Web documents. For the same reason, the Web document database (4) is an information resource database. The Web document data control unit (5) registers the Web document information obtained from the advertisement in the Web document database (4) and calculates the cost required for the advertisement. If cost remains, the advertisement is transmitted to other agents through the agent interface (3). In addition, based on the clock (8), Web document information whose validity period has expired is deleted from the Web document database (4).
When an advertisement request is received from another agent, it is passed to the Web document data control unit (5) through the agent interface (3).
FIG. 19 shows the processing flow after the Web document data control unit (5) receives advertisement request data from the advertisement processing unit (1) or the agent interface (3).
On the other hand, a Web document viewer searches for Web documents through the searcher interface (7). The inquiry processing unit (2) creates a search request based on the input from the Web document viewer and sends it to the Web document data control unit (5). The format of the search request is as shown in FIG. The Web document data control unit (5) searches the Web document database (4) for Web documents that satisfy the search request and holds the search results. Then, as in the case of an advertisement, the remaining available cost is calculated; if cost remains, the search request is transferred to other agents through the agent interface (3), and the agent waits for the search results to be returned.
When the search results have been returned from all the agents to which search requests were issued, or when a certain period of time has elapsed, the collected search results are sent to the inquiry processing unit (2), converted into a list format, and presented to the Web document viewer through the searcher interface (7). When a search request is received from another agent, it is passed to the Web document data control unit (5) through the agent interface (3), and the search results are returned to the requesting agent through the agent interface (3).
FIG. 20 shows the processing flow after the Web document data control unit (5) receives a search request from the inquiry processing unit (2) or the agent interface (3).
The agent that received the search request directly from the Web document viewer presents the search results to the viewer. Based on the results, the viewer actually accesses, through the searcher interface (7), the Web documents indicated in the search results. The viewer is then asked to evaluate whether each Web document satisfied the request. The evaluation may also be performed automatically by the searcher interface (7): for example, because it is undesirable for a listed Web document not to exist or to be unusable for some reason, such cases can be collected automatically as negative evaluations. FIG. 21 shows the processing flow of the searcher interface (7) at this time.
[Seventh embodiment]
Web document information can be collected not only from advertisements by Web document creators but also from existing search engines. An agent acts as a wrapper for a search engine, and agents exchange information about Web documents with one another. In this way, existing search engines can be linked and integrated so that all of them can be used in a unified manner.
FIG. 22 is a configuration diagram of a distributed search engine using an existing search engine as described above. It is almost the same as the distributed search engine described in the previous section, except that it has a search engine interface (13) and can obtain Web document information from an existing search engine.
The search engine interface (13) extracts the keywords included in a search request, performs a search using an existing search engine, and converts the Web documents obtained into Web document information, which is presented to the information searcher from the Web document data control unit via the searcher interface.
The Web document database (4) caches Web document information from existing search engines and also functions as a database of evaluation information on Web documents. Each Web document is analyzed to create corresponding advertisement information in the advertisement format described above, which is stored in the Web document database (4).
The basic operation for an advertisement request is the same as the previous section, but there are several possible operations for a search request depending on the use form of an existing search engine.
- A form in which the existing search engine serves as the main Web document database, and the Web document database (4) is used to store evaluation information for Web documents.
In response to a search request, the search engine is searched through the search engine interface (13); the Web document database (4) is then searched for each Web document in the result, the evaluation information is added, and the result is returned to the agent or Web document viewer.
- A form in which the Web document database (4) is mainly used, and an existing search engine supplements it when sufficient search results cannot be obtained.
In response to a search request, the Web document database (4) is searched first. If no matching result is obtained, or too little Web document information is obtained, existing search engines are searched through the search engine interface (13). The result is then presented to the Web document viewer and registered in the Web document database (4) in preparation for the next search.
- A form in which an existing search engine is used to manage and maintain the Web document database (4).
In response to a search request, a search is performed using the Web document database (4) as usual. At the same time, the history of search requests is stored; at regular intervals, a search engine is searched through the search engine interface (13) based on that history, and the Web document database (4) is updated.
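As an illustration of the second form above (local database first, existing search engine as a supplement), the following is a minimal sketch. The names `search_with_fallback`, `fake_engine`, and `min_results`, and the use of a plain dictionary as the Web document database, are assumptions for illustration and do not come from the specification.

```python
def search_with_fallback(keyword, local_db, external_search, min_results=3):
    """Search the local Web document database first.

    The external search engine is consulted only when the local database
    yields fewer than min_results entries; the merged result is then
    registered locally in preparation for the next search.
    """
    results = list(local_db.get(keyword, []))
    if len(results) < min_results:
        for url in external_search(keyword):
            if url not in results:
                results.append(url)
        local_db[keyword] = list(results)
    return results


# Hypothetical stand-in for an existing search engine.
def fake_engine(keyword):
    return ["http://x.example/", "http://y.example/"]


db = {"patent": ["http://x.example/"]}
first = search_with_fallback("patent", db, fake_engine, min_results=2)
```

After the first call the local database already holds both entries, so a second call with the same keyword would not need to reach the external engine.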
[Eighth Embodiment]
FIG. 27 is a configuration diagram relating to the present embodiment, and shows an information resource discovery device having a function of distributing and managing information resource information and a search function. In the drawing, the same numbers are given to those shown in FIG.
First, the processing of an advertisement request will be described based on the flowchart of FIG. In normal operation, an information resource provider (producer) sends information about the service it provides to the nearest agent in the form of an advertisement (advertisement request). The advertisement specifically includes the URL of the information resource to be provided, its name, keywords indicating its content, the name of its manager, and the like. The producer interface (6) passes the advertisement to the advertisement processing unit (1) of the agent. The advertisement processing unit analyzes the content of the advertisement to check whether necessary information is missing (for example, the URL of the information resource or the related keywords) and whether incorrect data has been entered (for example, whether the URL is in a valid format and whether the kanji character code is valid), and then passes the contents of the advertisement to the information resource data control unit (5). There, the history database (14) is updated based on the contents of the advertisement, and the contents are stored in the information resource database (4). The specific format of the history database (14) will be described later.
Further, the advertisement information of the information resource can be transmitted through the agent interface (3) to agents determined in advance by the agent manager. An agent receiving the advertisement information likewise receives it at its agent interface (3), passes it to the information resource data control unit (5), and updates the history database (14) and the information resource database (4) in the same way as above.
On the other hand, an information resource search request is sent from the information searcher to the inquiry processing unit (2) through the searcher interface (7), as shown in the processing flowchart of FIG. The search request is checked for missing required data (for example, the search keyword, or the e-mail address if the result is to be sent back by e-mail) and for invalid data (for example, whether the e-mail address is in a correct format).
The inquiry processing unit (2) that has received the search request sends it to the information resource data control unit (5), which analyzes the request and updates the history database (14) accordingly. The information resource database (4) is then searched, and the result is returned to the search requester through the inquiry processing unit (2) and the searcher interface (7).
As with advertisements, a search can also be requested from nearby agents determined by the agent manager through the agent interface (3), and the results collected and returned to the information searcher. In this case, the agent waits for a certain period of time until the search results from all the agents to which the search request was sent have been collected. When a search result is passed from the agent interface (3) to the information resource data control unit (5), its contents are analyzed (for example, to check that the data was not corrupted in transit) and the search request to which it corresponds is identified. If the results for that search request have not yet been returned to the requester (a searcher or another agent), the result is added to the return buffer, which stores the search results to be returned to the requester. The search result is then registered in the history database (14), and the contents of the information resource database (4) are updated. FIG. 30 shows this series of processing for a search result. The agent waits in this way until results have been returned from all the agents to which the search request was sent, or until a timeout occurs according to the clock (8), and then returns the contents of the return buffer to the requester.
As shown in FIG. 31, a specific example of the format of the history database is a set of records, each holding the time at which advertisement information, a search request, or a search result was received, its ID, the related keywords, and the sender.
The ID identifies the received information and has a format such as "R236666@A3". The first character indicates whether the item is advertisement information, a search request, or a search result, and is “P”, “Q”, or “R”, respectively. The digit string that follows is a unique number created from the time, the process number, and the like to distinguish items of information. The part after "@" is the ID of the agent.
The keyword column contains the related keywords of the information resource in the case of advertisement information, the search keywords in the case of a search request, and, in the case of a search result, all the related keywords of the information resources (generally several) obtained in the result.
The last column, the source agent, indicates which agent sent the information directly. From this database it is possible to know from which agents information about which kinds of information resources is collected, and which kinds of information resources are in high demand.
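The ID format described above can be parsed as in the following sketch. The names `MessageId` and `parse_message_id` are hypothetical, and the sketch assumes that any spaces around "@" in the printed example are typographic and not part of the ID.

```python
from dataclasses import dataclass

# Prefix letters and their meanings, as listed in the text.
KINDS = {"P": "advertisement", "Q": "search request", "R": "search result"}


@dataclass(frozen=True)
class MessageId:
    kind: str    # advertisement, search request, or search result
    serial: str  # unique digit string built from time, process number, etc.
    agent: str   # ID of the agent, taken from the part after "@"


def parse_message_id(text: str) -> MessageId:
    """Split an ID such as 'R236666@A3' into its three parts."""
    compact = text.replace(" ", "")  # assumption: spaces are not significant
    head, sep, agent = compact.partition("@")
    if not sep or not agent or len(head) < 2:
        raise ValueError(f"malformed ID: {text!r}")
    prefix, serial = head[0], head[1:]
    if prefix not in KINDS or not serial.isdigit():
        raise ValueError(f"malformed ID: {text!r}")
    return MessageId(KINDS[prefix], serial, agent)


parsed = parse_message_id("R236666@A3")
```

Keeping the serial as a string avoids losing leading zeros, should the generating agent produce any.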
[Ninth embodiment]
FIG. 32 shows an embodiment of a distributed search device having a spool for advertisement information. For users, the sooner search results are obtained the better, but the same is not necessarily true of advertisement processing. There is no need to sacrifice search performance by processing an advertisement request on the spot merely because it arrived during search processing; it is preferable to defer it. In fact, an advertisement for an information resource that almost nobody searches for may not be referred to by anyone even if it is passed to other agents immediately, so hurrying to advertise it may bring users no benefit. On the other hand, it is worthwhile to hurry advertisements for information resources in fields that many users search frequently, because users will want to know about them immediately. In the configuration of the present embodiment, therefore, advertisement information that does not need to be advertised in a hurry is temporarily stored in the advertisement information spool (15), and the advertisement processing is performed in batches.
FIG. 33 shows the series of processing in the present embodiment. Upon receiving advertisement information, the information resource data control unit (5) searches the history database (14) for search requests that used the related keywords of the information resource included in the advertisement information, their synonyms, or words related to them, and determines the priority from the result. For example, the number of search requests found in the history database (14) may be used as the priority. If the priority is greater than a certain threshold, the advertisement is processed immediately; otherwise, the advertisement information is temporarily stored in the advertisement information spool (15). Processing of the advertisement information in the spool (15) starts when the agent becomes idle, when the spool (15) is full, or at regular intervals. At that time too, processing proceeds in order of priority.
Immediately after the device starts, there is no history, so the priority cannot be determined from it. The manager of the agent may therefore set the priorities initially. After that, the priority set by the manager and the priority determined from the history may be used in combination.
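The threshold test and priority-ordered batch processing of this embodiment can be sketched as follows. The class name `AdvertisementSpool`, the dictionary that maps a keyword to its number of past search requests, and the example values are assumptions for illustration; synonym matching and the manager-set initial priorities are omitted.

```python
import heapq
from itertools import count


class AdvertisementSpool:
    """Spool low-priority advertisements and release them by priority."""

    def __init__(self, threshold):
        self.threshold = threshold
        self._heap = []
        self._order = count()  # tie-breaker: equal priorities keep arrival order

    def submit(self, ad, keywords, history):
        """Return ad if it must be processed at once; otherwise spool it."""
        # Priority = number of past search requests for the ad's keywords.
        priority = sum(history.get(k, 0) for k in keywords)
        if priority > self.threshold:
            return ad
        heapq.heappush(self._heap, (-priority, next(self._order), ad))
        return None

    def drain(self):
        """Yield spooled advertisements, highest priority first."""
        while self._heap:
            _, _, ad = heapq.heappop(self._heap)
            yield ad


# Hypothetical history: keyword -> number of search requests seen so far.
history = {"kw1": 12, "kw2": 1, "kw3": 4}
spool = AdvertisementSpool(threshold=10)
immediate = spool.submit("ad-A", ["kw1"], history)
spool.submit("ad-B", ["kw2"], history)
spool.submit("ad-C", ["kw3"], history)
batch = list(spool.drain())
```

A caller would invoke `drain` when the agent becomes idle, when the spool reaches a size limit, or on a timer, matching the three triggers named in the text.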
[Tenth embodiment]
This embodiment aims to record route information in the history database (14) so that advertisements and search requests can be delivered more effectively. The configuration diagram is the same as FIG. 27, but the format of the history database (14) differs: information on the transfer route is recorded in place of the transfer source agent, as shown in FIG. If the exchanged information is made to carry the IDs of the agents it has passed through, in order to prevent advertisement information, search requests, and search results from being relayed to the same agent many times (as with, for example, the Path field of Netnews), the transfer route can be extracted from it. The agent that receives the information records this route information in the history database (14). The keyword field is the set of keywords related to the target information resource in the case of an advertisement or a search result, and the set of keywords specified as search words in the case of a search request. The transfer route is the sequence of agents through which the advertisement information, search request, or search result has passed. For example, “A1:A2:A3” indicates that the data was transmitted from agent A3 to agent A1 via agent A2.
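How a relaying agent might use such a transfer route to avoid loops can be sketched as follows. The function names `parse_route` and `relay` are hypothetical; the sketch assumes, following the example in the text, that the most recent agent is written first and the originator last.

```python
def parse_route(route):
    """Split a route string such as 'A1:A2:A3' into a list of agent IDs."""
    return [agent.strip() for agent in route.split(":") if agent.strip()]


def relay(route, self_id, neighbours):
    """Decide how this agent should relay a message carrying `route`.

    Returns (new_route, targets). If this agent already appears in the
    route, the message has looped and is dropped: (None, []).
    """
    visited = parse_route(route)
    if self_id in visited:
        return None, []
    # The newest hop goes first, so the originator stays at the end.
    new_route = ":".join([self_id] + visited)
    targets = [n for n in neighbours if n not in visited]
    return new_route, targets


new_route, targets = relay("A2:A3", "A1", ["A2", "A4"])
dropped = relay("A1:A2:A3", "A2", ["A1"])
```

In the first call agent A1 prepends itself and forwards only to A4, because A2 has already seen the message; the second call shows a message being dropped on arrival at an agent already in its route.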
For example, in FIG. 34 it can be seen that a large amount of advertisement information for information resources related to the keyword kw1 has been advertised from agent A3 (the entries marked with white circles in FIG. 34; the circles are added for explanation and are not information actually recorded in the database). It is therefore highly likely that resource information related to kw1 is concentrated near A3. Consequently, when a search request for kw1 is received, many search results can easily be obtained by transferring the search request directly to agent A3, even though A3 is not a neighbor.
Conversely, it can be seen that many search requests for kw1 come from agent A7 (the entries marked with black circles). If advertisements for information resources related to kw1 are transferred directly to A7, searches from A7 can be performed effectively. Based on such considerations, the information resource data control unit (5) refers to the history database (14) as needed, determines an appropriate agent according to the contents of the advertisement information or search request, and transfers the advertisement information or search request to it.
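The lookup that finds a direct transfer target from the history can be sketched as follows. The record layout `(kind, keywords, route)`, the function name `busiest_origin`, and the sample records are assumptions for illustration; the originating agent is taken to be the last entry of the transfer route, as in the "A1:A2:A3" example.

```python
from collections import Counter


def busiest_origin(history, kind, keyword):
    """Return the agent that originated the most records of `kind`
    ('P' advertisement, 'Q' search request, 'R' search result)
    mentioning `keyword`, or None if there are no such records."""
    counts = Counter(
        route.split(":")[-1]
        for rec_kind, keywords, route in history
        if rec_kind == kind and keyword in keywords
    )
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical history records: (kind, keyword set, transfer route).
history = [
    ("P", {"kw1"}, "A2:A3"),
    ("P", {"kw1"}, "A2:A3"),
    ("P", {"kw1", "kw2"}, "A4:A3"),
    ("P", {"kw2"}, "A5"),
    ("Q", {"kw1"}, "A6:A7"),
    ("Q", {"kw1"}, "A7"),
]
search_target = busiest_origin(history, "P", "kw1")  # where to send kw1 searches
advert_target = busiest_origin(history, "Q", "kw1")  # where to send kw1 adverts
```

With these records, searches for kw1 would be sent toward A3 and advertisements for kw1 toward A7, mirroring the white-circle and black-circle cases in the text.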
[Eleventh embodiment]
In the distributed search device employing the history database (FIG. 34) as in the tenth embodiment, an agent state table (16) is further provided as shown in FIG. The agent state table (16) records the response time required to transfer advertisement information, search requests, and search results to other agents. FIG. 36 shows an example of the table. Each entry of the table includes three items: an agent ID, the response time for that agent, and a final check time indicating when the response time was last measured. In the response time field, “−−−−−−” indicates that communication could not be performed normally.
The purpose of this embodiment is to deliver an advertisement request by another route when it cannot be transferred to a nearby agent for some reason, such as a network failure. Normal advertisement request processing is performed and the advertisement request is sent to a nearby agent. The response time at that time is measured, and the agent state table (16) is updated. If the advertisement request could not be sent to a nearby agent (call that agent A), the agents near A are looked up in the history database (14) as shown in FIG. Then the agent state table (16) as shown in FIG. 36 is examined, the agent B with the quickest response is selected from among A's neighboring agents, and the advertisement request is transferred there. In this way, an agent that cannot perform processing is avoided, and advertisement processing continues for the system as a whole. For a search request as well, the same procedure avoids the failure and allows the search to proceed across the whole system.
That is, as shown in FIG. 37, if an advertisement request, a search request, or a search result that passed from agent A via B has been received in the past, B is known to be near A, so even if a failure occurs at A, the information can be sent directly to B.
FIG. 38 shows this series of processing flows.
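The detour selection can be sketched as follows. The function names, the use of `None` to stand for the “−−−−−−” entry, and the sample routes and response times are assumptions for illustration. Agents adjacent to the failed agent in any recorded transfer route are treated as its neighbors.

```python
def neighbours_from_history(routes, agent):
    """Collect agents that appear next to `agent` in any recorded route."""
    near = set()
    for route in routes:
        hops = route.split(":")
        for left, right in zip(hops, hops[1:]):
            if left == agent:
                near.add(right)
            elif right == agent:
                near.add(left)
    return near


def pick_detour(routes, state_table, failed_agent, self_id):
    """Choose the reachable neighbour of the failed agent with the
    shortest recorded response time, or None if there is none.

    state_table maps agent ID -> response time in seconds; None means
    the last communication attempt failed.
    """
    candidates = [
        (state_table[a], a)
        for a in neighbours_from_history(routes, failed_agent)
        if a != self_id and state_table.get(a) is not None
    ]
    return min(candidates)[1] if candidates else None


# Hypothetical data held by agent A1: agent A has stopped responding.
routes = ["A1:A:B", "A1:A:C", "A1:D"]
state_table = {"A": None, "B": 0.8, "C": 0.3, "D": 0.2}
detour = pick_detour(routes, state_table, "A", "A1")
```

Agent D has the best response time overall but is not known to be near A, so it is not a candidate; the sketch picks C.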
[Twelfth embodiment]
FIG. 39 is a configuration diagram of a distributed search device that, when sending spooled advertisement requests to nearby agents, refers to the history database (14), selects a time period during which network traffic and agent load are relatively low, and schedules the transfer for that period.
FIG. 40 is an example of the agent state table (16) for this embodiment. Unlike the agent state table of the eleventh embodiment (FIG. 36), this table holds the average response time for each time period and the number of measurements. This information is updated mainly from the time required for search requests and the return of search results, which are processed in real time. In this example, only the average response time and the number of measurements are recorded in the table, but, as in the table of the eleventh embodiment (FIG. 36), the last check time and the state at that time may also be included.
When an advertisement request is received, low-priority advertisement information is temporarily stored in the advertisement information spool (15), as in the ninth embodiment. At that time, the advertisement scheduler (17) creates an advertisement delivery schedule that defines in which time period and to which agent each item of advertisement information is to be sent. For example, in FIG. 40, if the priority of advertisement information to be sent to agent A1 is very low, it is scheduled for transfer during the period from 01:00 to 01:59, when A1 is expected to be least busy. If the priority is somewhat high, the transfer is performed as early as possible, even in a time period in which the response time is relatively long. The advertisement information is then delivered to the designated agents according to the advertisement delivery schedule.
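One possible scheduling rule consistent with this description is sketched below. The rule itself is an assumption, not taken from the specification: a priority between 0 and 1 sets how much extra response time is tolerated, and the earliest hour at or after the current one that stays within that tolerance is chosen. The function name `pick_hour` and the sample table are likewise hypothetical.

```python
def pick_hour(hourly_avg, now_hour, priority):
    """Pick the hour of day at which to send a spooled advertisement.

    hourly_avg maps hour (0-23) -> average response time of the target
    agent in that hour. priority is in [0, 1]: 0 waits for the quietest
    hour, 1 sends at the first hour for which data exists.
    """
    if not hourly_avg:
        return now_hour
    best = min(hourly_avg.values())
    worst = max(hourly_avg.values())
    limit = best + priority * (worst - best)  # tolerated response time
    for offset in range(24):
        hour = (now_hour + offset) % 24
        if hour in hourly_avg and hourly_avg[hour] <= limit:
            return hour
    return now_hour


# Hypothetical table for agent A1: hour -> average response time (seconds).
a1 = {22: 3.0, 23: 4.0, 0: 2.0, 1: 0.5, 2: 1.0}
low = pick_hour(a1, now_hour=22, priority=0.0)
medium = pick_hour(a1, now_hour=22, priority=0.5)
high = pick_hour(a1, now_hour=22, priority=1.0)
```

With this table the lowest priority waits for the 01:00 slot, a medium priority is sent at 00:00, and the highest priority is sent in the current slot. A fuller scheduler would also weigh the number of measurements behind each average.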
[Thirteenth embodiment]
FIG. 42 is a configuration diagram of a distributed search device that, in a distributed search method in which the distribution range of information such as advertisement requests is limited by cost, has a function of allocating the cost spent on transferring information according to the past advertisements and search results recorded in the history database (14).
The operation when an advertisement request is received is as follows. Normal advertisement request processing is performed, and when the advertisement request is to be transferred to other agents, the history database (14) is searched to find out from which agents many search requests matching the advertisement have been transferred in the past. An agent from which many search requests arrive indicates high demand for that information in its direction. If a large share of the cost is spent on advertising in that direction, the advertisement information is distributed close to the area of high demand, and the next time a search request is issued there, the resource information is likely to be found at low cost. Allocating the cost of the advertisement request among the nearby agents is the role of the cost manager (18).
Conversely, when a search request is processed and is to be forwarded to other agents, a large amount of relevant resource information is considered to exist in the directions from which advertisement requests and search results have arrived. If a large share of the cost is spent on searching in those directions, many search results can be obtained, and better resource information is more likely to be found.
FIG. 42 is an example of cost distribution. If, in the past, advertisement requests for a certain keyword have been received from neighboring agents A, B, and C in the ratio 2:5:3, it is natural to suppose that there is much information about that keyword in the direction of B. Better search results can therefore be expected by allocating more of the search cost toward B. In the present embodiment, the search cost is thus distributed in the ratio 2:5:3, and the search request is transferred to each neighboring agent. The allocation need not be strictly proportional and may be weighted to some extent.
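The proportional allocation in this example can be written as a short sketch. The function name `distribute_cost`, the total cost of 100, and the even split used when no history exists are assumptions for illustration.

```python
def distribute_cost(total_cost, counts):
    """Split total_cost among neighbouring agents in proportion to the
    number of matching advertisement requests received from each.

    counts maps agent ID -> number of past requests. With no history at
    all, the cost is split evenly (an assumption of this sketch).
    """
    if not counts:
        return {}
    total = sum(counts.values())
    if total == 0:
        share = total_cost / len(counts)
        return {agent: share for agent in counts}
    return {agent: total_cost * n / total for agent, n in counts.items()}


# The 2:5:3 example from the text, with a hypothetical total cost of 100.
shares = distribute_cost(100, {"A": 2, "B": 5, "C": 3})
```

A weighted variant, as the text allows, could raise each count to a power or add a constant floor before normalizing.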
[Effects of the invention]
As described above, according to the present invention, management of information on information resources can be automatically performed without relying on humans, and a resource information manager does not need to take time to register and maintain information. In addition, since the present invention is basically based on the declaration of the information resource provider like the conventional directory service, the quality of information which is an advantage of the directory service is not impaired.
In addition, because resource information is managed in a distributed manner and the connections between agents are redundant, the system is robust against changes in the network environment and improves search load distribution and accessibility for searchers. By limiting the cost that can be used for advertisement and search, the range of agents to which resource information is advertised and the range of agents searched can be limited, so the overall communication volume is kept down. Because there is no need for a robot program or the like to make unnecessary accesses to the servers holding information resources, no high load is placed on those servers. In addition, the resource provider can always access the nearest agent to update the resource information, the resource searcher feeds back the availability of resources, and the updated information is immediately propagated to other agents, so resource information can be expected not to become obsolete.
The feedback of resource evaluations by resource searchers adjusts the lifetime of the resource information on each agent, so that information about useful information resources becomes easy to find and information about other resources becomes hard to find. In other words, the resource information obtained as a search result is ranked by an overall evaluation based on the feedback of many searchers, so even information resource searchers who lack the knowledge and skill to choose among many resources can be expected to find, select, and use better information resources more easily. The present invention therefore contributes greatly to the efficient distributed management and retrieval of large-scale resource information.
Furthermore, by retaining and using the history of advertisement information, search requests, and search results in the distributed resource information search system, it becomes possible to distribute advertisement information at times when the load on agents is small, to deliver information to areas with high demand, to search preferentially in areas where information is likely to exist, to schedule advertisement distribution according to demand and the network environment, and to route advertisement information and search requests by allocating cost according to past search results. This contributes greatly to saving traffic and to efficient information distribution and retrieval.
[Brief description of the drawings]
FIG. 1 is a diagram illustrating the principle of a search device (agent) in a distributed search system according to the present invention.
FIG. 2 is an explanatory diagram relating to advertisement and search for information resources in a distributed search system.
FIG. 3 is an explanatory diagram of propagation of information on information resources in a distributed search system.
FIG. 4 is an explanatory diagram regarding propagation of information resource evaluation in a distributed search system.
FIG. 5 is a block diagram illustrating a configuration of an embodiment of a search device (agent).
FIG. 6 is a diagram illustrating an example of an advertisement.
FIG. 7 is a flowchart showing an advertisement process in a search device (agent).
FIG. 8 is a diagram illustrating an example of a search request.
FIG. 9 is a flowchart illustrating a search process in a search device (agent).
FIG. 10 is a diagram showing an example of a search result.
FIG. 11 is a block diagram illustrating a configuration of a search device in a large-scale distributed database according to a second embodiment.
FIG. 12 is an explanatory diagram of the operation of the large-scale distributed database according to the second embodiment.
FIG. 13 is a block diagram illustrating a configuration of a search device in a large-scale distributed database according to a third embodiment.
FIG. 14 is an explanatory diagram of the operation of the large-scale distributed database according to the third embodiment.
FIG. 15 is a block diagram illustrating a configuration of a search device in a large-scale distributed database according to a fourth embodiment.
FIG. 16 is a flowchart illustrating a process of a feedback processing unit in FIG. 15;
FIG. 17 is a block diagram illustrating a configuration of a search device in a large-scale distributed database according to a fifth embodiment.
FIG. 18 is a block diagram illustrating a configuration of a search device in a large-scale distributed database according to a sixth embodiment.
FIG. 19 is a flowchart illustrating processing of an advertisement request in the sixth embodiment.
FIG. 20 is a flowchart illustrating a process of a search request according to the sixth embodiment.
FIG. 21 is a flowchart illustrating a process of the searcher interface according to the sixth embodiment.
FIG. 22 is a block diagram illustrating a configuration of a search device according to a seventh embodiment.
FIG. 23 is a diagram illustrating the principle configuration of a search device to which a history database is added.
FIG. 24 is an explanatory diagram regarding advertisement of information resources in the distributed search system of FIG. 23.
FIG. 25 is an explanatory diagram of information resource search in the distributed search system of FIG. 23.
FIG. 26 is an explanatory diagram of the return of information resources in the distributed search system of FIG. 23.
FIG. 27 is a block diagram illustrating a configuration of a search device according to an eighth embodiment.
FIG. 28 is a flowchart illustrating processing of an advertisement request in the eighth embodiment.
FIG. 29 is a flowchart illustrating processing of a search request according to the eighth embodiment.
FIG. 30 is a flowchart illustrating processing of a search result according to the eighth embodiment.
FIG. 31 is a diagram illustrating an example of a history database according to the eighth embodiment.
FIG. 32 is a block diagram illustrating a configuration of a search device according to a ninth embodiment.
FIG. 33 is a flowchart showing processing of an advertisement request in the ninth embodiment.
FIG. 34 is a diagram illustrating an example of a history database according to the tenth embodiment.
FIG. 35 is a block diagram illustrating a configuration of a search device according to the eleventh embodiment.
FIG. 36 is a diagram illustrating an example of an agent state table according to the eleventh embodiment.
FIG. 37 is an explanatory diagram of an operation in the eleventh embodiment.
FIG. 38 is a flowchart showing processing of an advertisement request by searching for an alternative agent in the eleventh embodiment.
FIG. 39 is a block diagram illustrating a configuration of a search device according to a twelfth embodiment.
FIG. 40 is a diagram illustrating an example of an agent state table according to the twelfth embodiment.
FIG. 41 is a block diagram illustrating a configuration of a search device according to a thirteenth embodiment.
FIG. 42 is an explanatory diagram of the operation in the thirteenth embodiment.
[Explanation of symbols]
1 advertisement processing unit
2 inquiry processing unit
3 agent interface
4 information resource database
5 resource data control unit
6 producer interface
7 searcher interface
8 clock
9 DB search front end
10 DB interface
11 feedback processing unit
12 evaluation request processing unit
13 search engine interface
14 history database
15 advertising information spool
16 agent state table
17 advertising scheduler
18 cost manager

Claims

A distributed search system in which a plurality of search devices are connected on a network, each of the plurality of search devices comprising: means for storing advertisement information, which is information including location information of information resources; means for receiving a search request from a user and searching the advertisement information storage means of its own search device or of another search device on the network; and means for receiving a request to register the advertisement information, registering it in the advertisement information storage means, and transferring it to another search device on the network,
wherein a range in which the advertisement information is transferred on the network is determined according to cost information given to the advertisement information.

The distributed search system according to claim 1, wherein, when the advertisement information that is the search result from another search device connected to the network is returned to the requesting search device by tracing in reverse the search devices that relayed the search request, the advertisement information stored in each of those search devices is updated.
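
The return-path update in the claim above can be illustrated with a minimal sketch. It assumes a simple in-memory model that is not part of the specification: the class `RelayAgent`, its `store` dictionary (keyword to location), and the recursive `search` method are hypothetical names chosen only for illustration.

```python
class RelayAgent:
    """Hypothetical search device that relays search requests to its neighbors."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []
        self.store = {}  # assumed storage layout: keyword -> location of the resource

    def search(self, keyword, visited=None):
        # `visited` prevents a request from looping back to a device it already passed.
        visited = visited if visited is not None else set()
        visited.add(self.name)
        if keyword in self.store:
            return self.store[keyword]
        for neighbor in self.neighbors:
            if neighbor.name in visited:
                continue
            found = neighbor.search(keyword, visited)
            if found is not None:
                # The result travels back along the relay path in reverse,
                # and each relaying device records it on the way.
                self.store[keyword] = found
                return found
        return None
```

In this sketch the recursion itself traces the relay path in reverse, so every device between the requester and the device holding the advertisement ends up with a copy, and a later request for the same keyword can be answered closer to the requester.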

The distributed search system according to claim 1, wherein an evaluation value given by an information searcher to an information resource corresponding to advertisement information obtained by a search is stored in association with the advertisement information stored on each search device on the network.

The distributed search system according to claim 3, wherein advertisement information having a high evaluation value and/or an information resource corresponding to the advertisement information is selected from the search results and presented to an information searcher.

The distributed search system according to claim 3, wherein the information resource provider is notified of the evaluation value of the information resource that the provider advertised.

The distributed search system according to claim 3, wherein the period for which advertisement information is stored on each of the search devices on the network is changed, and/or the advertisement information is deleted, based on the evaluation value given to the advertisement information.
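
The use of evaluation values in the preceding claims (presenting highly rated advertisement information and changing its retention period or deleting it) might be sketched as follows. The record type `StoredAd`, the averaging of evaluations, the default of 30 days, and the thresholds `low` and `high` are all assumptions made for illustration; the claims do not fix any of them.

```python
from dataclasses import dataclass, field


@dataclass
class StoredAd:
    location: str
    evaluations: list = field(default_factory=list)  # values given by information searchers
    retention_days: int = 30                          # assumed default retention period

    def score(self):
        # Assumed aggregation: arithmetic mean, 0.0 when nobody has rated the resource yet.
        if not self.evaluations:
            return 0.0
        return sum(self.evaluations) / len(self.evaluations)


def rank(ads):
    """Order search results so that highly evaluated advertisements come first."""
    return sorted(ads, key=lambda ad: ad.score(), reverse=True)


def adjust_retention(ads, low=2.0, high=4.0):
    """Delete poorly rated advertisements and keep well rated ones longer."""
    kept = []
    for ad in ads:
        score = ad.score()
        if ad.evaluations and score < low:
            continue  # rated and rated badly: drop from the store
        if score >= high:
            ad.retention_days *= 2  # rated well: extend the storage period
        kept.append(ad)
    return kept
```

Unrated advertisements are kept with their default period in this sketch, so a newly advertised resource is not deleted before anyone has evaluated it.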

A search device in a distributed search system, the search device being connected to a network, having advertisement information storage means for storing advertisement information that includes at least information on the location of information resources, receiving a search request from a user, searching the advertisement information storage means, and presenting the extracted advertisement information and/or the information resources corresponding to the advertisement information, the search device comprising:
an advertisement processing unit that receives registration of the advertisement information; and
a control unit that executes a process of recording the advertisement information received by the advertisement processing unit in the advertisement information storage means and transferring it to another search device connected to the network, and a process of recording advertisement information transferred from another search device connected to the network in the storage means, determining, based on cost information included in the advertisement information, whether to transfer the advertisement information to further search devices, and, when transfer is determined, transferring the advertisement information to other search devices connected to the network except the transfer-source search device.

The search device according to claim 7, wherein the control unit recalculates the cost information included in the advertisement information, rewrites the cost information of the advertisement information with the result, and then performs the transfer.
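
The cost-limited transfer described in the two claims above can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the cost is taken to be a count of search devices that may still record the advertisement (as in the later claims on subtracting the number of search devices), and the names `Advertisement`, `Agent`, and `receive`, as well as the duplicate check, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Advertisement:
    keyword: str
    location: str  # e.g. the URL of the information resource
    cost: int      # assumed: number of search devices that may still record this advertisement


class Agent:
    """Hypothetical search device holding an advertisement information store."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []
        self.store = {}  # location -> Advertisement

    def receive(self, ad, source=None):
        if ad.location in self.store:
            return  # assumed duplicate suppression; not stated in the claims
        self.store[ad.location] = ad
        remaining = ad.cost - 1  # recalculate the cost before transfer
        if remaining <= 0:
            return  # cost exhausted: record only, do not transfer further
        forwarded = Advertisement(ad.keyword, ad.location, remaining)
        for neighbor in self.neighbors:
            if neighbor is not source:  # never send back to the transfer source
                neighbor.receive(forwarded, source=self)
```

With four devices connected in a line and an initial cost of 3, the advertisement is recorded on the first three devices and never reaches the fourth, which is how the cost bounds the range of propagation in this sketch.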

The search device in the distributed search system according to claim 7, wherein the advertisement processing unit acquires an evaluation value given to the advertisement information and presents the evaluation value to the provider of the advertisement information.

A search device in a distributed search system, the search device being connected to a network, having advertisement information storage means for storing advertisement information that includes at least information on the location of information resources, receiving a search request from a user, searching the advertisement information storage means, and presenting the extracted advertisement information and/or the information resources corresponding to the advertisement information, the search device comprising:
an advertisement processing unit that receives registration of the advertisement information;
a control unit that executes a process of recording the advertisement information received by the advertisement processing unit in the advertisement information storage means and transferring it to another search device connected to the network, and a process of recording advertisement information transferred from another search device connected to the network in the storage means, determining, based on cost information included in the advertisement information, whether to transfer the advertisement information to further search devices, and, when transfer is determined, transferring the advertisement information to other search devices connected to the network except the transfer-source search device; and
a history database that accumulates, as a history, at least one type of information among the advertisement information, search requests, and search results received by the search device, together with information on the transfer source and/or transfer route of that information.

The search device according to claim 10, wherein the history database records at least keywords relating to the accumulated information in association with information on the transfer source and/or transfer route of that information.

The search device according to claim 10, further comprising an advertisement information spool that stores advertisement information,
wherein the control unit searches the history database for search requests corresponding to the received advertisement information, determines a priority of the received advertisement information based on the result of that search, determines the timing of transferring the received advertisement information based on the priority, and stores advertisement information that is not transferred immediately in the advertisement information spool.
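
One possible reading of the priority and spool mechanism in the claim above is sketched below. The priority rule (the number of past search requests that share a keyword with the advertisement), the `threshold` for immediate transfer, and the names `priority_from_history`, `AdvertisementSpool`, and `handle` are assumptions introduced for illustration.

```python
import heapq


def priority_from_history(ad_keywords, search_history):
    """Assumed priority: number of past search requests sharing a keyword with the ad.

    ad_keywords is a set of keywords; search_history is a list of keyword sets.
    """
    return sum(1 for request in search_history if ad_keywords & request)


class AdvertisementSpool:
    """Holds advertisements whose transfer is deferred, highest priority first."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker so equal priorities leave in arrival order

    def put(self, priority, ad):
        heapq.heappush(self._heap, (-priority, self._seq, ad))
        self._seq += 1

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


def handle(ad, ad_keywords, search_history, spool, send, threshold=2):
    """Transfer in-demand advertisements at once and spool the rest."""
    priority = priority_from_history(ad_keywords, search_history)
    if priority >= threshold:
        send(ad)
    else:
        spool.put(priority, ad)
    return priority
```

Here `send` stands in for whatever transfer routine the search device uses; spooled advertisements can later be drained with `pop()` when the load is low.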

The search device in the distributed search system according to claim 11, wherein the control unit searches the history database based on a keyword included in a search request, and determines a transfer destination of the search request based on the transfer source and/or transfer path of advertisement information including the keyword.

The search device in the distributed search system according to claim 11, wherein the control unit searches the history database based on a keyword included in an advertisement request, and determines a transfer destination of the advertisement request based on the transfer source and/or transfer path of search requests including the keyword.
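
The history-based routing in the two claims above is symmetrical: a search request is sent toward the devices from which matching advertisements arrived, and an advertisement request toward the devices from which matching search requests arrived. A minimal sketch of that lookup follows; the record layout (dictionaries with `keywords` and `source` fields), the function name `choose_destinations`, and the fallback of transferring to every neighbor when the history has no match are assumptions.

```python
def choose_destinations(keywords, history, neighbors):
    """Pick transfer destinations from the history of the opposite message type.

    keywords:  set of keywords in the request being routed
    history:   list of records such as {"keywords": {...}, "source": "B"}
    neighbors: names of the search devices this device can transfer to
    """
    matched = {
        record["source"]
        for record in history
        if keywords & record["keywords"] and record["source"] in neighbors
    }
    # Assumed fallback: with no matching history, transfer to all neighbors.
    return sorted(matched) if matched else sorted(neighbors)
```

To route a search request, `history` would hold past advertisement records; to route an advertisement request, it would hold past search request records.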

The search device according to claim 10, further comprising an agent state table that records, for each of the other search devices, the response time required for transferring information to that search device,
wherein, when the transfer of information to one of the other search devices fails, the control unit searches the history database, finds, from the transfer path information included in the history database, other search devices that are logically close to the search device to which the transfer failed, obtains from the agent state table the search device with the fastest response among them, and transfers to that search device the information whose transfer failed.
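
The selection of an alternative search device in the claim above might look like the following sketch. "Logically close" is interpreted here, as an assumption, as being adjacent to the failed device on some transfer route recorded in the history database; the function names and the plain dictionary used as the agent state table are likewise hypothetical.

```python
def logical_neighbors(failed, routes):
    """Devices adjacent to `failed` on any recorded transfer route (assumed notion of closeness)."""
    near = set()
    for route in routes:
        for i, agent in enumerate(route):
            if agent != failed:
                continue
            if i > 0:
                near.add(route[i - 1])
            if i + 1 < len(route):
                near.add(route[i + 1])
    near.discard(failed)
    return near


def pick_alternative(failed, routes, response_time, me):
    """Return the fastest-responding device near the failed one, or None if there is none.

    response_time plays the role of the agent state table: device name -> seconds.
    """
    candidates = [
        agent
        for agent in logical_neighbors(failed, routes)
        if agent != me and agent in response_time
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda agent: response_time[agent])
```

The device that failed to transfer excludes itself (`me`) from the candidates, since it appears next to the failed device on its own routes.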

The search device in the distributed search system according to claim 13, further comprising:
an agent state table that records, for each of the other search devices, the response time required for transferring information to that search device, obtained for each preset time zone; and
an advertisement scheduler that estimates, with reference to the agent state table, the level of load in each time zone for each of the other search devices and, based on the estimated load level and the priority of the advertisement information to be transferred, determines, for each transfer-destination search device, the time zone in which the stored advertisement information is transferred.
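
A rough sketch of the advertisement scheduler in the claim above is given below for a single transfer destination. It assumes that time zones are hours of the day, that the recorded average response time serves directly as the load estimate, and that high-priority advertisements take the earliest acceptable hour while others wait for the least loaded hour. The function `choose_slot` and its parameters `high_priority` and `acceptable` are hypothetical.

```python
def choose_slot(avg_response, priority, current_hour, high_priority=5, acceptable=1.0):
    """Choose the hour in which to transfer spooled advertisements to one destination.

    avg_response: hour of day -> average response time in seconds (agent state table row)
    """
    # Walk the next 24 hours in order, starting from the current hour.
    hours = [(current_hour + k) % 24 for k in range(24)]
    known = [h for h in hours if h in avg_response]
    if not known:
        return current_hour  # no measurements yet: assumed fallback is to send now
    if priority >= high_priority:
        for h in known:
            if avg_response[h] <= acceptable:
                return h  # earliest hour with tolerable estimated load
    # Low priority, or no acceptable hour: wait for the least loaded hour.
    return min(known, key=lambda h: avg_response[h])
```

Running the function once per destination yields a per-destination schedule, which matches the claim's requirement that the time zone be determined for each transfer-destination search device.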

The search device in the distributed search system according to claim 10, further comprising a cost manager that calculates new cost information for each transfer-destination search device based on the cost information included in the received information and on the path information, recorded in the history database, of information corresponding to the received information, rewrites the cost information included in the received information to the calculated cost information, and transfers the information to that search device.
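
The claim above does not state how the cost manager combines the received cost with the path information, so the following is only one conceivable rule, offered as a sketch: the cost is reduced by one for every destination, and a destination that appears on a route of past corresponding information receives a small bonus so that the information travels further in that direction. The function `new_costs` and the `bonus` parameter are hypothetical.

```python
def new_costs(received_cost, destinations, past_routes, bonus=1):
    """Compute a separate cost for each transfer destination (assumed rule).

    past_routes: routes, taken from the history database, of information
                 corresponding to the received information
    """
    base = received_cost - 1  # assumed: one unit of cost is consumed per transfer
    costs = {}
    for destination in destinations:
        on_route = any(destination in route for route in past_routes)
        costs[destination] = max(base + (bonus if on_route else 0), 0)
    return costs
```

A destination whose computed cost is 0 would simply not receive the information under the cost-limited transfer described in the earlier claims.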

The information search method according to claim 1, wherein the number of search devices through which the advertisement information passes on the network is set in the cost information, and each search device decrements the number of search devices included in the search range information attached to the advertisement information.

The information search method according to claim 8, wherein the number of search devices through which the advertisement information passes on the network is set in the cost information, and the calculation of cost information by the control unit decrements that number of search devices.

The search device according to claim 7, wherein the process of transferring an advertisement request to another search device connected to the network is performed for other search devices whose logical location information on the network the search device recognizes.