JP2940459B2

JP2940459B2 - Node / link search device

Info

Publication number: JP2940459B2
Application number: JP8022343A
Authority: JP
Inventors: 信也久保
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1996-02-08
Filing date: 1996-02-08
Publication date: 1999-08-25
Anticipated expiration: 2016-02-08
Also published as: JPH09218876A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a node/link search device that determines, under specified conditions, the range of nodes to be searched in a database composed of nodes and links. In particular, it relates to a node/link search device used with a database whose nodes are distributed across multiple servers on a network, as in a distributed hypermedia system.

[0002]

2. Description of the Related Art

One kind of database is the hypertext system, in which multiple nodes storing various information such as document files are searched by following links that represent the dependencies among them. Searching such hypertext proceeds smoothly if an index of the information stored in the nodes and the link relationships between nodes are held in advance.

[0003] Japanese Patent Laid-Open Publication No. Hei 4-321144 discloses a hypertext system that displays the relationships between nodes so that they can be grasped easily. Because each node can link to any other node, links among several nodes may form a loop. This system builds its display screen from a table in which the link relationships between nodes are registered in advance. It cuts any looped links so that, taking a given node as the starting point, the link relationships between nodes are displayed as a tree structure.

[0004] Hypertext systems were conventionally built on local workstations and the like, but with recent advances in communication technology, systems have appeared that distribute and store nodes on multiple servers connected via a network. Such a system is called a distributed hypermedia system. For example, the World Wide Web (hereinafter WWW) has attracted attention as a means of publishing information on the Internet. Hypermedia is hypertext that can handle not only text data such as characters and tables but also multimedia data such as video and audio.

[0005] In a distributed hypermedia system, a table representing the link relationships between nodes is created by acquiring, via the network, the information of the nodes stored on each server. The information representing the link relationships between nodes is referred to here as a node/link information database, and a device that creates a node/link information database is referred to as a node/link search device.

[0006] In the World Wide Web (WWW), nodes are distributed across multiple server machines on the Internet, and the node data stored on each server is transferred according to a procedure called the HyperText Transfer Protocol (hereinafter HTTP). The multimedia documents that serve as nodes are written in a hypermedia description language called the HyperText Markup Language (hereinafter HTML).

[0007] The node/link search device analyzes the content of a node written in HTML, acquired via the network, and extracts the hyperlinks that indicate where the next linked nodes are stored. A hyperlink is extracted as the character string lying between a predetermined character string defined by HTML that marks its start position (called a tag) and a tag that marks its end position.

[0008] A hyperlink representing a link destination node is written in a notation called Uniform Resource Locators (hereinafter URL). Nodes and hyperlinks can be searched by finding, in the text of an acquired node, the portion enclosed by the tags that denote a hyperlink and extracting from it the character string written as a URL. By repeating this search, the node/link search device builds the node/link information database. Known node/link search devices include those called "WWW Wanderer", "WWW Robot", and "WWW Spider".
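The tag-based extraction described above can be sketched as follows. This is a minimal illustration that uses Python's standard `html.parser` module rather than the character-string search the text describes; the class name `AnchorExtractor`, the function `extract_links`, and the sample page are assumptions made for the example and do not come from the patent.

```python
from html.parser import HTMLParser


class AnchorExtractor(HTMLParser):
    """Collect the HREF value of every <A> start tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return the URLs found in the hyperlink tags of one node's text."""
    parser = AnchorExtractor()
    parser.feed(html_text)
    return parser.links


# Invented sample page with two hyperlinks.
sample = (
    '<P>Contents</P>'
    '<A HREF="http://example.org/intro.html">Introduction</A> '
    '<a href="http://example.org/birth.html">Birth</a>'
)
links = extract_links(sample)
```

Each URL returned this way would be a candidate for the next read, so repeating the step node by node yields the link table the text refers to.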

[0009] For a hyperlink representing a link destination node, the URL defines the following structure:

<Scheme>:<Scheme-Specific-Part>

For the HyperText Transfer Protocol (HTTP) scheme, which the World Wide Web (WWW) uses as the transfer protocol for node document files, a hyperlink is written as follows:

http://<host>:<port>/<path>?<searchpart>

[0010] Here, "<host>" is the host name of the server machine on which the document file resides, and "<port>" is the port number used for communication. The port number may be omitted, in which case the default value "80" is assumed. "<path>" indicates the location of the document file on the server machine, and "<searchpart>" is a field used to pass data to the server machine.
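The decomposition of an HTTP URL into these parts can be sketched with Python's standard `urllib.parse.urlsplit`. The function name `split_http_url` and the sample URLs are assumptions for illustration only; note that `urlsplit` reports the host name in lower case and the path with its leading slash.

```python
from urllib.parse import urlsplit

DEFAULT_HTTP_PORT = 80  # value the text gives for an omitted port number


def split_http_url(url):
    """Return (host, port, path, searchpart) for an http URL."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        raise ValueError("only the http scheme is handled in this sketch")
    # parts.port is None when the URL carries no explicit port.
    port = parts.port if parts.port is not None else DEFAULT_HTTP_PORT
    return parts.hostname, port, parts.path, parts.query


with_port = split_http_url("http://example.com:8080/docs/a.html")
without_port = split_http_url("http://example.com/docs/b.html?q=1")
```

The host element returned here is the piece that a search range check would compare against the specified host name.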

[0011] FIG. 14 shows an example of nodes and the hyperlinks connecting them. The document files contained in nodes 1001, 1002, and 1003 are each written in HTML. The content of a document file is classified and made identifiable by various tags. Among these, the tag denoting a hyperlink is written <A HREF="URL">…</A>, and the part corresponding to "URL" indicates the link destination node. In the document file of node 1001, the regions 1004 and 1005 enclosed by dotted lines each represent a hyperlink.

[0012] Assume that node 1001 is the node written in URL form as "http://hostA/tale/TOC.html", node 1002 is "http://hostB:80/tale/Introduction.html", and node 1003 is "http://hostC:8080/tale/Birth.html". Hyperlink 1004 in node 1001 points to node 1002, and hyperlink 1005 in node 1001 points to node 1003. In this way, the hyperlinks registered in a node's document file identify its link destination nodes.

[0013] FIG. 15 shows an example of the relationships between node documents. Each region enclosed by a dash-dot line represents a server on which nodes are stored: region 1011 is the server machine with host name "A", region 1012 the server machine with host name "B", and region 1013 the server machine with host name "C".

[0014] The contents 1014 to 1016, obtained by removing the HTML-defined tags and the like from the document files contained in nodes 1001 to 1003 shown in FIG. 14, are drawn inside the regions 1011 to 1013 representing the servers on which those nodes are stored. Hyperlinks 1017 and 1018 indicate that the link destination nodes of node 1001 are nodes 1002 and 1003.

[0015] Next, the configuration of a node/link search device that builds a node/link information database from the hyperlinks registered in each node is described.

[0016] FIG. 16 shows an outline of the configuration of a conventionally used node/link search device. Multiple server machines 1023 are connected to the node/link search device 1021 via a network 1022 such as the Internet. Each server machine 1023 stores nodes containing document files. The node/link search device 1021 includes a file collection unit 1031 that collects node files from the server machines 1023, and a hyperlink extraction unit 1032 that extracts hyperlinks from the node files acquired from the server machines 1023.

[0017] It also has a node/link information database 1033 that stores and manages, based on the extracted hyperlinks, various attribute information such as node storage locations and the link information between nodes. The search range control unit 1034 restricts the node search range to the range matching the search range conditions supplied from an input terminal (not shown) or from an external configuration file. The system control unit 1035 is the circuit portion that controls the overall flow of operation of the units in the node/link search device 1021.

[0018] The hyperlink extraction unit 1032 includes a hypermedia description language analysis unit 1036 that analyzes the content of a node's multimedia document file written in a hypermedia description language. The hypermedia description language analysis unit 1036 extracts hyperlinks and link destination nodes by detecting the various tags. The search range is specified by host name, which serves as the identifier of a server machine 1023. The search range control unit 1034 includes a host name comparison unit 1037 that compares the host name of the server machine storing a node with the host name specified as the search range. By comparing the host name, which is part of the information indicating the node's storage location, it determines whether the node is within the search range.

[0019] The file collection unit 1031 of the node/link search device 1021 reads a node's multimedia document file over the network 1022 from the server machine 1023 whose host name matches the one specified as the search range. The hyperlink extraction unit 1032 then extracts the hyperlink and link destination node information from the file that was read. If the host name of the server machine storing an extracted link destination node is within the search range, the search range control unit 1034 continues the search at that link destination node. If the server machine storing the link destination node is outside the search range, the search of nodes beyond it is abandoned. Various attributes of the nodes and hyperlinks within the search range are registered in the node/link information database 1033.
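The conventional host-name check can be sketched as a simple filter over extracted link destinations. The function names `host_in_scope` and `links_to_follow` are hypothetical, and the case-insensitive comparison is an assumption of this sketch, not something the text specifies.

```python
from urllib.parse import urlsplit


def host_in_scope(url, allowed_hosts):
    """True if the URL's host exactly matches one of the allowed host names."""
    host = urlsplit(url).hostname  # lower-cased by urlsplit, None if absent
    return host is not None and host in {h.lower() for h in allowed_hosts}


def links_to_follow(extracted_urls, allowed_hosts):
    """Keep only link destinations stored on a server inside the search range."""
    return [u for u in extracted_urls if host_in_scope(u, allowed_hosts)]


# URLs of nodes 1001 and 1002 from FIG. 14; only hostA is in the search range.
followed = links_to_follow(
    ["http://hostA/tale/TOC.html", "http://hostB:80/tale/Introduction.html"],
    ["hostA"],
)
```

With only "hostA" specified, the link to node 1002 on "hostB" is dropped, which is the behavior the following paragraphs identify as a limitation.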

[0020] Besides devices that specify host names as the search range in this way, there are also node/link search devices for which no search range can be specified at all.

[0021]

PROBLEMS TO BE SOLVED BY THE INVENTION

A node/link search device for which no search range can be specified takes the entire network, such as the Internet, as its search range and searches every node on every server machine on the network. As a result, the node/link information database accumulates unnecessary information about nodes outside the range that should have been searched, which not only wastes its resources but also lowers the efficiency of retrieval. In addition, the server machines and networks holding the nodes are accessed for longer, which restricts other users' use of network resources.

[0022] Even a device that can specify a search range has conventionally been able to restrict it only in units of host names. This amounts to restricting the search range by the physical location of nodes. Hypertext, however, is searched step by step on the basis of the semantic relationships between nodes, so specifying the search range in units of server machines cannot express an appropriate range.

[0023] In the example shown in FIG. 15, node 1014, in which a table of contents is registered, resides on server "A" 1011, and nodes 1015 and 1016, in which the text corresponding to the table of contents is registered, reside on server "B" 1012 and server "C" 1013 respectively. To include in the search range not only the table of contents but also the body of the document, servers "A", "B", and "C" must all be specified as host names of the search range. As a result, many other nodes on servers "B" and "C" that have no semantic connection with the table of contents also fall within the search range, and unnecessary nodes are searched.

[0024] An object of the present invention is therefore to provide a node/link search device that can restrict the search of unnecessary nodes having no semantic connection.

[0025]

MEANS FOR SOLVING THE PROBLEMS

In the invention according to claim 1, the node/link search device comprises: search condition setting means for setting the condition that restricts the search range, applied when searching successively from an arbitrary node to its link destination nodes on the basis of the host name of the search range and of link information that is contained in each hypertext node and gives the name and storage location of a link destination node, by means of a maximum value of the number of levels, which is the number of nodes lying between the node at which the search starts and the node being searched, together with a character string that identifies the server machines to be searched; file collection means for repeatedly reading the content of the link destination node indicated by the link information from the server that stores it, in order from the starting node; node/link information storage means for storing, each time the file collection means reads the content of one node, the link information contained in that node and information associating the link destination nodes it indicates with the node just read; level calculation means for obtaining, each time the file collection means reads the content of one node, the number of levels of the link destination node indicated by the link information contained in that node; search range restriction means for stopping the file collection means from reading the content of nodes linked beyond the node just read when the number of levels obtained by the level calculation means is greater than the maximum value set by the search condition setting means; host name partial match determination means for determining, when the number of levels obtained by the search range restriction means is less than or equal to the maximum value set by the search condition setting means, whether the node's host name matches the character string specified as the host name of the search range over the length of that character string, counted from the beginning; and search range determination means for determining that the node is within the search range when the host name partial match determination means finds such a partial match.

[0026] That is, in the invention according to claim 1, the number of nodes lying between the node at which the search starts and the node being searched is taken as the number of levels, and the condition restricting the search range is specified as a maximum number of levels. The content of nodes is read from the servers one after another by following links from the starting node, and for each link destination node of a node that has been read, its name, its storage location, and the correspondence between the link source node and the link destination node are registered. Each time a node is read from a server, the number of levels of that node's link destination nodes is obtained. When this number reaches or exceeds the set maximum, following links further and reading more nodes is stopped.
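The level-limited traversal described above can be sketched over an in-memory link graph. This is only an illustration under stated assumptions: the dictionary `graph` stands in for reading a node from a server and extracting its hyperlinks, the starting node is given level 0, and a destination is skipped when its level is strictly greater than the maximum, following the wording of claim 1. The name `collect_within_levels` is hypothetical.

```python
from collections import deque


def collect_within_levels(graph, start, max_levels):
    """Breadth-first walk that stops following links past max_levels.

    graph maps a node name to the list of nodes it links to.
    Returns the level of each node read and the registered link pairs.
    """
    levels = {start: 0}   # assumption: the starting node is level 0
    link_table = []       # (link source, link destination) pairs
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for dest in graph.get(node, []):
            dest_level = levels[node] + 1
            if dest_level > max_levels:
                continue  # outside the search range: neither recorded nor read
            link_table.append((node, dest))
            if dest not in levels:
                levels[dest] = dest_level
                queue.append(dest)
    return levels, link_table


toy_graph = {"toc": ["intro", "birth"], "intro": ["notes"], "notes": ["deep"]}
levels, link_table = collect_within_levels(toy_graph, "toc", 2)
```

In this toy graph the node "deep" lies three links from the start and is therefore never read when the maximum is 2.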

[0027] The search range is thus restricted by the number of levels from the starting node. Because of this, the nodes searched are only those with a strong semantic connection to the starting node, and only information within the necessary range is collected. Furthermore, when the number of levels obtained by the search range restriction means is less than or equal to the maximum number of levels set by the search condition setting means, the host name partial match determination means determines whether the node's host name matches the character string specified as the host name of the search range over the length of that string, counted from the beginning; if it does, the node is determined to be within the search range.
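The two-stage decision described above, a level check followed by a leading partial match on the host name, can be sketched as below. The function names are hypothetical, and comparing host names without regard to case is an assumption of the sketch.

```python
def host_prefix_matches(node_host, scope_prefix):
    """True if node_host begins with the string given as the search range."""
    return node_host.lower().startswith(scope_prefix.lower())


def in_search_range(node_host, node_level, scope_prefix, max_levels):
    """Apply the level limit first, then the leading partial host name match."""
    if node_level > max_levels:
        return False
    return host_prefix_matches(node_host, scope_prefix)
```

For example, with the string "host" and a maximum of 2 levels, a node on "hostB" at level 1 would be in range, while a node on the same host at level 3, or a node on a host named "other", would not.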

[0028] In the invention according to claim 2, the node/link search device comprises: search condition setting means for setting the condition that restricts the search range, applied when searching successively from an arbitrary node to its link destination nodes on the basis of the host name of the search range and of link information that is contained in each hypertext node and gives the name and storage location of a link destination node, by means of a maximum value of the number of levels, which is the number of links lying between the node at which the search starts and the node being searched, together with a character string that identifies the server machines to be searched; file collection means for repeatedly reading the content of the link destination node indicated by the link information from the server that stores it, in order from the starting node; node/link information storage means for storing, each time the file collection means reads the content of one node, the link information contained in that node and information associating the link destination nodes it indicates with the node just read; level calculation means for obtaining, each time the file collection means reads the content of one node, the number of levels of the link to the link destination node represented by the link information contained in that node; search range restriction means for stopping the file collection means from reading the content of nodes linked beyond the node just read when the number of levels obtained by the level calculation means is greater than the maximum value set by the search condition setting means; host name partial match determination means for determining, when the number of levels obtained by the search range restriction means is less than or equal to the maximum value set by the search condition setting means, whether the node's host name matches the character string specified as the host name of the search range over the length of that character string, counted from the beginning; and search range determination means for determining that the node is within the search range when the host name partial match determination means finds such a partial match.

[0029] That is, in the invention according to claim 2, the number of links lying between the node at which the search starts and the node being searched is taken as the number of levels, and the search range is restricted by a maximum number of levels. This makes it possible to collect information only about nodes with a strong semantic connection to the starting node. The names, storage locations, and mutual link relationships can be collected for just those nodes within the search range specified by the number of levels. Furthermore, when the number of levels obtained by the search range restriction means is less than or equal to the maximum number of levels set by the search condition setting means, the host name partial match determination means determines whether the node's host name matches the character string specified as the host name of the search range over the length of that string, counted from the beginning; if it does, the node is determined to be within the search range.
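Claims 1 and 2 differ in what the number of levels counts: nodes in claim 1, links in claim 2. The sketch below compares the two counts on one path. It assumes that "between" excludes both the starting node and the node being searched, which the text does not state explicitly; the function names are hypothetical.

```python
def nodes_between(path):
    """Intermediate nodes on a path [start, ..., destination] (claim 1 reading)."""
    return max(len(path) - 2, 0)


def links_between(path):
    """Links traversed on a path [start, ..., destination] (claim 2 reading)."""
    return max(len(path) - 1, 0)


path = ["toc", "intro", "notes"]  # start -> intro -> notes
node_count = nodes_between(path)
link_count = links_between(path)
```

Under this reading the link count is always one more than the intermediate node count for any path with at least two nodes, so the same range can be expressed with either definition by shifting the maximum by one.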

[0030] In the invention according to claim 3, the nodes to be searched are distributed and stored on multiple servers connected to a network.

[0031] That is, in the invention according to claim 3, the nodes to be searched are distributed and stored on multiple servers connected via a network. Because the search range is restricted by the number of levels, only the information of the necessary nodes can be collected from the many servers even though the nodes are distributed among them. If, for example, the search range could be specified only per server, the device might collect not only nodes lying more than the required number of levels from the starting node but also unrelated nodes that have no link relationship with the starting node at all. Restricting the search range by the number of levels makes it possible, even when the nodes in the search range are distributed over multiple servers, to collect information only about the nodes in the necessary range that are semantically connected to the starting node.

[0032] In the invention according to claim 4, the file collection means comprises same-node determination means for determining whether the link destination node indicated by the link information contained in the node just read is identical to a node already read, and duplicate read cancellation means for cancelling the re-reading of that node when the same-node determination means determines that it is identical to a node already read.

[0033] That is, in the invention according to claim 4, a node that has been read once is prevented from being read again. This avoids repeatedly searching a looped range.
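The effect of cancelling re-reads can be sketched on a two-node loop. The function `crawl`, its `skip_seen` switch, and the toy graph are assumptions for illustration; the dictionary stands in for the servers, and the level limit is included only so that the unchecked run terminates.

```python
def crawl(graph, start, max_levels, skip_seen):
    """Return the order in which nodes are read from the (simulated) servers."""
    reads = []
    seen = set()
    stack = [(start, 0)]
    while stack:
        node, level = stack.pop()
        if skip_seen and node in seen:
            continue  # same node already read: cancel the re-read
        seen.add(node)
        reads.append(node)
        if level < max_levels:
            # reversed() keeps the original link order when popping from a stack
            for dest in reversed(graph.get(node, [])):
                stack.append((dest, level + 1))
    return reads


loop = {"a": ["b"], "b": ["a"]}  # a links to b, b links back to a
with_check = crawl(loop, "a", 4, skip_seen=True)
without_check = crawl(loop, "a", 4, skip_seen=False)
```

With the check each node is read once; without it the loop is walked again and again until the level limit is reached.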

[0034]

BEST MODE FOR CARRYING OUT THE INVENTION

[0035]

EMBODIMENT

FIG. 1 shows an outline of the configuration of a node/link search device according to one embodiment of the present invention. A search over a hypermedia structure can be viewed as a tree rooted at the node where the search started. The depth in this tree at which each node lies, measured from the root as the starting point, is therefore taken as its number of levels, and the search range is restricted by the number of levels from the starting point.

[0036] Multiple server machines 13 storing nodes are connected to the node/link search device 11 via a network 12 such as the Internet. The node/link search device 11 includes a file collection unit 21 that collects node files from the server machines 13, and a hyperlink extraction unit 22 that extracts hyperlinks and link destination nodes from the node files acquired from the server machines 13.

[0037] The node/link search device 11 also has a node/link information database 23 that stores and manages various attribute information on the extracted hyperlinks and link destination nodes. The search range control unit 24 restricts the node search range so that it satisfies the conditions specifying the search range, which are entered from an input terminal (not shown) or from a configuration file. The system control unit 25 is the circuit portion that controls the overall flow of operation of the units in the node/link search device 11.

【００３８】ハイパーリンク抽出部２２は、ハイパーメ
ディア記述言語で書かれたノードのマルチメディアドキ
ュメントファイルの内容を解析するハイパーメディア記
述言語解析部２６を備えている。ハイパーメディア記述
言語解析部２６は、各種タグを検出することによってハ
イパーリンクとリンク先ノードを抽出するようになって
いる。また、ハイパーリンク抽出部２２は、抽出したリ
ンク先ノードの階層数を算出する階層数算出部２７を備
えている。階層数算出部２７は、リンク元のノードの階
層数に“１”を加えたものをリンク先ノードの階層数と
して求める。The hyperlink extraction unit 22 includes a hypermedia description language analysis unit 26 that analyzes the contents of a node's multimedia document file written in a hypermedia description language. The hypermedia description language analysis unit 26 extracts hyperlinks and link destination nodes by detecting various tags. The hyperlink extraction unit 22 also includes a number-of-layers calculation unit 27 that calculates the number of layers of each extracted link destination node. The number-of-layers calculation unit 27 obtains the number of layers of the link destination node by adding 1 to the number of layers of the link source node.

【００３９】探索範囲は、サーバマシン１３の識別子と
してのホスト名と、最大階層数により指定される。探索
範囲制御部２４は、ノードの格納先のホスト名と探索範
囲として指定されたホスト名とを比較するホスト名比較
部２８と、そのノードの階層数と探索範囲の制限値とし
ての最大階層数とを比較する階層数比較部２９を備えて
いる。探索範囲制御部２４は、ノードの階層数、格納先
のホスト名によって探索範囲を制限する。また、同一の
ノードの多重読み込みを防止するために、前回の検索か
らの経過時間により探索の範囲を制限することも行う。The search range is specified by a host name, which serves as the identifier of a server machine 13, and a maximum number of layers. The search range control unit 24 includes a host name comparison unit 28 that compares the host name of the node's storage location with the host name specified as the search range, and a number-of-layers comparison unit 29 that compares the number of layers of the node with the maximum number of layers, which is the limit value of the search range. The search range control unit 24 limits the search range according to the number of layers of the node and the host name of its storage location. In addition, to prevent the same node from being read multiple times, the search range is also limited by the time elapsed since the previous search.

【００４０】ファイル収集部２１は、ネットワーク１２
を通じてサーバマシン１３からノードのマルチメディア
ドキュメントファイルを読み込む。ハイパーリンク抽出
部２２は、ハイパーメディア記述言語解析部２６の解析
結果を基にして、マルチメディアドキュメントファイル
からハイパーリンクとリンク先ノードを抽出する。次
に、階層数算出部２７は、リンク元ノードに“１”を加
えることにより、リンク先ノードまたはハイパーリンク
の階層数を算出する。The file collection unit 21 reads the multimedia document file of a node from the server machine 13 through the network 12. The hyperlink extraction unit 22 extracts hyperlinks and link destination nodes from the multimedia document file based on the analysis result of the hypermedia description language analysis unit 26. Next, the number-of-layers calculation unit 27 calculates the number of layers of the link destination node or hyperlink by adding 1 to the number of layers of the link source node.

【００４１】探索範囲制御部２４は、ノードあるいはハ
イパーリンクの階層数が指定された最大階層数以下か否
か、あるいはホスト名が指定されたものと部分一致する
か否かを基に探索範囲を制限する。探索範囲内のノード
およびハイパーリンクについては、それらの属性情報が
ノード・リンク情報データベース２３に蓄積される。ノ
ード・リンク情報データベース２３内でノードについて
の属性を登録するテーブルをノードテーブルと呼び、リ
ンクに関する属性を登録するテーブルをリンクテーブル
と呼ぶことにする。このように、階層数によって探索範
囲を制限することで、不必要なノードについての属性情
報の収集およびデータベースへの登録を防いでいる。The search range control unit 24 limits the search range based on whether the number of layers of a node or hyperlink is equal to or less than the specified maximum number of layers, or on whether the host name partially matches the specified one. For nodes and hyperlinks within the search range, their attribute information is stored in the node / link information database 23. In the node / link information database 23, the table in which node attributes are registered is called the node table, and the table in which link attributes are registered is called the link table. Limiting the search range by the number of layers in this way prevents attribute information on unnecessary nodes from being collected and registered in the database.

【００４２】図２は、ノードテーブルの登録内容の一例
を表わしたものである。ノードテーブル３１には、図の
左から、ノード識別子３２、スキーム(Scheme)３３、ホ
スト名３４、ポート番号３５、スキーム特有部(Scheme-
Specific) ３６、階層数３７、登録日時３８、探索日時
３９、最終更新日時４１が登録される。このうちノード
識別子３２は、図１では示していないデータベース・マ
ネジメント・システム（ＤＢＭＳ）によって、ノードの
属性情報をノードテーブルに登録する際に自動的に生成
される識別子である。スキーム３３、ホスト名３４、ポ
ート番号３５およびスキーム特有部３６は、ノードの格
納位置を示すＵＲＬの文字列の各部分に対応するもので
ある。階層数３７は、検索開始位置からのノードの階層
数を表わす項目である。FIG. 2 shows an example of the registered contents of the node table. In the node table 31, from the left of the figure, a node identifier 32, a scheme 33, a host name 34, a port number 35, a scheme-specific part 36, a number of layers 37, a registration date and time 38, a search date and time 39, and a last update date and time 41 are registered. Of these, the node identifier 32 is an identifier generated automatically by a database management system (DBMS), not shown in FIG. 1, when the attribute information of the node is registered in the node table. The scheme 33, the host name 34, the port number 35, and the scheme-specific part 36 correspond to the respective parts of the URL character string indicating the storage location of the node. The number of layers 37 indicates the number of layers of the node from the search start position.
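
One way to picture a row of the node table 31 is as a record whose URL-derived fields come from splitting the node's URL. The following Python sketch is illustrative only: the `NodeRecord` class, the `node_from_url` helper, and the example URL are hypothetical, and mapping the scheme-specific part 36 to the URL path is an assumption, since the text does not define that item further.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional
from urllib.parse import urlsplit


@dataclass
class NodeRecord:
    """Hypothetical in-memory counterpart of one row of node table 31."""
    node_id: int                              # node identifier 32 (assigned by the DBMS)
    scheme: str                               # scheme 33
    host: str                                 # host name 34
    port: Optional[int]                       # port number 35 (None if not given in the URL)
    scheme_specific: str                      # scheme-specific part 36 (assumed: the URL path)
    layers: int                               # number of layers 37
    registered_at: datetime                   # registration date and time 38
    searched_at: Optional[datetime] = None    # search date and time 39 (None = undefined)
    last_modified: Optional[datetime] = None  # last update date and time 41


def node_from_url(node_id: int, url: str, layers: int, now: datetime) -> NodeRecord:
    """Split a URL into the items registered for a newly found node."""
    parts = urlsplit(url)
    return NodeRecord(node_id, parts.scheme, parts.hostname or "",
                      parts.port, parts.path, layers, now)
```

A search start node would be created with `layers=0`; its search date and time stays undefined until the node has actually been read.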

【００４３】登録日時３８は、ノードがデータベースに
最初に登録された日時を示す。探索日時３９は、このノ
ードが探索された最新の日時を表わしている。最終更新
日時４１は、ノードのマルチメディアドキュメントファ
イルの最終更新日時を表わしている。図中、点線４２で
囲んだ探索日時３９と、最終更新日時４１の項目は、リ
ンク元のノードとして登録されるときに更新される属性
であり、点線４３で囲んだ項目は、リンク元ノードのハ
イパーリンクにより指し示されているリンク先ノードと
して登録されるときに更新される属性である。The registration date and time 38 indicates when the node was first registered in the database. The search date and time 39 indicates the most recent date and time at which this node was searched. The last update date and time 41 indicates when the multimedia document file of the node was last updated. In the figure, the search date and time 39 and the last update date and time 41, enclosed by dotted line 42, are attributes updated when the node is registered as a link source node, and the items enclosed by dotted line 43 are attributes updated when the node is registered as a link destination node pointed to by a hyperlink of a link source node.

【００４４】図３は、リンクテーブルの登録内容の一例
を表わしたものである。リンクテーブル５１は、図の左
から、リンク識別子５２、リンク元ノード識別子５３、
リンク先ノード識別子５４が登録される。これらの項目
には、ＤＢＭＳで自動的に割り当てられたノードの識別
番号が登録される。リンク識別子５２で示されるハイパ
ーリンクの出所元のノードのノード識別子がリンク元ノ
ード識別子５３に、ハイパーリンクの指す先のノードの
ノード識別子がリンク先ノード識別子５４として登録さ
れる。FIG. 3 shows an example of the registered contents of the link table. In the link table 51, from the left of the figure, a link identifier 52, a link source node identifier 53, and a link destination node identifier 54 are registered. The node identification numbers automatically assigned by the DBMS are registered in these items. The node identifier of the node from which the hyperlink indicated by the link identifier 52 originates is registered as the link source node identifier 53, and the node identifier of the node to which the hyperlink points is registered as the link destination node identifier 54.

【００４５】ノード・リンク情報データベースのノード
テーブル３１には、探索の開始点となるノードの情報を
予め少なくとも１つ登録しておく。この際、登録してお
くべき項目は、ノード識別子３２、スキーム３３、ホス
ト名３４、ポート番号３５、スキーム特有部３６、階層
数３７、登録日時３８である。探索の起点となるノード
の階層数には“０”を設定しておく。In the node table 31 of the node / link information database, information on at least one node serving as a search start point is registered in advance. The items to be registered at this time are the node identifier 32, the scheme 33, the host name 34, the port number 35, the scheme-specific part 36, the number of layers 37, and the registration date and time 38. The number of layers of a node serving as a starting point of the search is set to "0".

【００４６】図４は、ノード・リンク探索装置の行う動
作の流れを表わしたものである。一点破線６１で囲んだ
ステップの処理は、ノード・リンク装置のシステム制御
部２５によって行われる。一点破線６２で囲んだステッ
プの処理は探索範囲制御部２４により、一点破線６３で
囲んだステップの処理はファイル収集部２１およびハイ
パーリンク抽出部２２によって行われる。まず、探索に
先立ちシステム制御部は探索条件の入力などのシステム
の初期化を行う（ステップＳ１０１）。探索条件は、図
１で示していない入力端末や外部ファイルから入力され
る。探索条件として、ここでは探索範囲とすべきノード
の階層数の最大値（最大階層数）と、前回の探索結果を
古いものと判定するための経過時間と、探索対象とする
サーバマシンを特定するための文字列（探索するホスト
名の一部あるいは全部）が条件として設定される。FIG. 4 shows the flow of operations performed by the node / link search device. The steps enclosed by dash-dot line 61 are performed by the system control unit 25 of the node / link search device. The steps enclosed by dash-dot line 62 are performed by the search range control unit 24, and the steps enclosed by dash-dot line 63 are performed by the file collection unit 21 and the hyperlink extraction unit 22. First, prior to the search, the system control unit initializes the system, including input of the search conditions (step S101). The search conditions are input from an input terminal or an external file, neither of which is shown in FIG. 1. Here, three search conditions are set: the maximum number of layers of nodes to be included in the search range, the elapsed time after which a previous search result is judged to be stale, and a character string identifying the server machines to be searched (part or all of a host name to be searched).

【００４７】システム制御部２５は、ノードテーブル３
１を検索して、未探索のノードが存在するか否かを調べ
る（ステップＳ１０２）。ノードテーブル３１中に登録
されている探索日時３９が、未定義（未登録）の場合、
探索日時３９が現在時刻よりも探索条件として設定され
た経過時間以上過去の時刻である場合のいずれかに該当
するとき未探索のノードと判定する。未探索のノードが
存在する場合には（ステップＳ１０２；Ｙ）、探索範囲
制御部２４は、探索されていない１つのノードの属性情
報をノードテーブル３１から取り出す。この際、階層数
の小さいものから優先的にノードを選択する（ステップ
Ｓ１０３）。したがって、最初は、階層数が“０”であ
る探索開始点のノードの属性情報が取り出される。The system control unit 25 searches the node table 31 to check whether any unsearched node exists (step S102). A node is judged to be unsearched when the search date and time 39 registered in the node table 31 is undefined (unregistered), or when the search date and time 39 is earlier than the current time by at least the elapsed time set as a search condition. If an unsearched node exists (step S102; Y), the search range control unit 24 retrieves the attribute information of one unsearched node from the node table 31. At this time, nodes with a smaller number of layers are selected preferentially (step S103). Therefore, the attribute information of the search start node, whose number of layers is "0", is retrieved first.
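
The unsearched test of step S102 and the selection rule of step S103 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: node rows are plain dictionaries, and the names `is_unsearched`, `select_next`, `searched_at`, and `max_age` are hypothetical and do not come from the original text.

```python
from datetime import datetime, timedelta


def is_unsearched(searched_at, now, max_age):
    """S102: a node is unsearched if its search date and time is undefined (None)
    or lies at least max_age before the current time."""
    return searched_at is None or now - searched_at >= max_age


def select_next(nodes, now, max_age):
    """S103: among the unsearched nodes, pick one with the smallest number of layers.
    Returns None when no unsearched node remains."""
    candidates = [n for n in nodes if is_unsearched(n["searched_at"], now, max_age)]
    if not candidates:
        return None
    return min(candidates, key=lambda n: n["layers"])
```

Selecting the shallowest unsearched node first means the table is processed roughly layer by layer, starting from the node whose number of layers is 0.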

【００４８】次に、選択したノードが指定された探索範
囲内に存在するノードであるかどうかを判別する（ステ
ップＳ１０４）。探索範囲内か否かを判別する処理につ
いては後に詳細に説明する。探索範囲内に存在するノー
ドである場合には（ステップＳ１０４；Ｙ）、そのノー
ドのドキュメントファイルをネットワークを通じて読み
出し、これに記述されているリンク先ノードをノードテ
ーブルに追加登録する等のテーブル情報更新処理（ステ
ップＳ１０５）を行う。テーブル更新処理の詳細につい
ては後に説明する。１つのノードについての更新処理
（ステップＳ１０５）を終えた後、再びステップＳ１０
２に戻り、未探索のノードについての探索処理を繰り返
す。Next, it is determined whether the selected node lies within the specified search range (step S104). The determination process is described in detail later. If the node lies within the search range (step S104; Y), table information update processing is performed (step S105), in which the document file of the node is read through the network and the link destination nodes described in it are additionally registered in the node table, among other things. The table update processing is also described in detail later. After the update processing for one node (step S105) is completed, the process returns to step S102 and the search processing is repeated for the remaining unsearched nodes.

【００４９】選択したノードが指定された探索範囲外の
ノードである場合には（ステップＳ１０４；Ｎ）、ノー
ドテーブル３１内の選択したノードについての探索時刻
３９を現在時刻に変更し（ステップＳ１０６）、ステッ
プＳ１０２に戻る。未探索のノードがノードテーブルに
存在しなくなったとき（ステップＳ１０２；Ｎ）、探索
を終了するための終了処理を行い（ステップＳ１０
７）、処理を終了する（エンド）。If the selected node lies outside the specified search range (step S104; N), the search date and time 39 of the selected node in the node table 31 is changed to the current time (step S106), and the process returns to step S102. When no unsearched node remains in the node table (step S102; N), termination processing for ending the search is performed (step S107), and the processing ends (END).
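
The overall loop of FIG. 4 (steps S102 to S107) can be sketched in Python as follows. The sketch makes several simplifying assumptions that are not in the original: a dictionary `links` stands in for reading document files over the network, a boolean `searched` flag stands in for the search date and time 39 (so the elapsed-time condition is omitted), and only the number of layers is used as the range condition. The function name `crawl` is hypothetical.

```python
def crawl(start, links, max_layers):
    """Return the node table built by a layer-limited search from `start`.

    links: dict mapping a node name to the list of its link destinations
           (a stand-in for fetching and parsing the node's document file).
    """
    # The search start node is registered in advance with 0 layers.
    table = {start: {"layers": 0, "searched": False}}
    while True:
        # S102: check whether any unsearched node exists.
        pending = [name for name, row in table.items() if not row["searched"]]
        if not pending:
            return table  # S107: end of search
        # S103: prefer the node with the smallest number of layers.
        node = min(pending, key=lambda name: table[name]["layers"])
        layers = table[node]["layers"]
        if layers <= max_layers:  # S104: within the search range?
            # S105: register each new link destination one layer deeper;
            # destinations already in the table are not registered again.
            for dest in links.get(node, []):
                table.setdefault(dest, {"layers": layers + 1, "searched": False})
        # S105 / S106: record that this node has now been searched.
        table[node]["searched"] = True
```

With `links = {"A": ["B"], "B": ["A", "C"], "C": ["D"]}` and `max_layers=1`, node C is registered with 2 layers but judged out of range, so D is never registered, and the link from B back to A does not cause A to be examined again.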

【００５０】図５は、探索範囲内に存在するノードであ
るか否かを判定する際の処理の流れを表わしたものであ
る。ここでは、階層数のみによって探索範囲内か否かを
判別している。まず探索範囲制御部２４は、選択したノ
ードの階層数と探索条件として設定されている最大階層
数とを比較する（ステップＳ２０１）。ノードの階層数
が最大階層数よりも大きい場合には（ステップＳ２０
１；Ｎ）、このノードは探索範囲に存在しないものと判
定する（ステップＳ２０２）。ノードの階層数が最大階
層数よい小さいかあるいは等しい場合には（ステップＳ
２０１；Ｙ）、このノードが探索範囲に存在するものと
判定する（ステップＳ２０３）。FIG. 5 shows the flow of processing for determining whether a node lies within the search range. Here, the determination is based only on the number of layers. First, the search range control unit 24 compares the number of layers of the selected node with the maximum number of layers set as a search condition (step S201). If the number of layers of the node is larger than the maximum number of layers (step S201; N), the node is judged to be outside the search range (step S202). If the number of layers of the node is smaller than or equal to the maximum number of layers (step S201; Y), the node is judged to be within the search range (step S203).

【００５１】図６は、探索範囲内に存在するノードであ
るか否かを判定する処理の他の一例の流れを表わしたも
のである。ここでは、階層数による判別の他に、ホスト
名による判別を加えている。探索範囲制御部２４は、選
択したノードの階層数と最大階層数とを比較し（ステッ
プＳ３０１）、ノードの階層数が最大階層数よりも大き
い場合には（ステップＳ３０１；Ｎ）、このノードは探
索範囲に存在しないものと判定する（ステップＳ３０
２）。ノードの階層数が最大階層数よい小さいかあるい
は等しい場合には（ステップＳ３０１；Ｙ）、探索範囲
のホスト名として指定された文字列と、ノードのホスト
名とが部分一致するか否かを調べる（ステップＳ３０
３）。FIG. 6 shows the flow of another example of the processing for determining whether a node lies within the search range. Here, a determination based on the host name is added to the determination based on the number of layers. The search range control unit 24 compares the number of layers of the selected node with the maximum number of layers (step S301), and if the number of layers of the node is larger than the maximum number of layers (step S301; N), the node is judged to be outside the search range (step S302). If the number of layers of the node is smaller than or equal to the maximum number of layers (step S301; Y), it is checked whether the character string specified as the host name of the search range partially matches the host name of the node (step S303).

【００５２】たとえば、探索文字列として“ＡＢ”が指
定されたときには、“ＡＢＣ”や“ＡＢＤ”などホスト
名の先頭から指定された文字列“ＡＢ”を含むものはす
べて部分一致していると判別される。“ＣＡＢ”などの
ように部分一致していない場合には（ステップＳ３０
３；Ｎ）、このノードを探索範囲に存在しないものと判
定する（ステップＳ３０２）。部分一致する場合には
（ステップＳ３０３；Ｙ）、このノードが探索範囲に存
在するものと判定する（ステップＳ３０４）。ここで
は、階層数の判定を行ってからホスト名の部分一致を判
定したが、これらの順序を入れ換えて行ってもよい。For example, when "AB" is specified as the search character string, every host name that contains the specified string "AB" starting from its beginning, such as "ABC" or "ABD", is judged to be a partial match. If there is no partial match, as with "CAB" (step S303; N), the node is judged to be outside the search range (step S302). If there is a partial match (step S303; Y), the node is judged to be within the search range (step S304). Here, the host name partial match is determined after the number of layers, but the two determinations may be performed in the reverse order.
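
The determination of FIG. 6 can be sketched in Python as follows. Based on the "AB" / "CAB" example, the partial match is read here as a match from the beginning of the host name, which is an interpretation of the text; the function name `in_search_range` and its parameters are hypothetical. Passing no host pattern reduces the check to the layers-only determination of FIG. 5.

```python
def in_search_range(layers, host, max_layers, host_pattern=None):
    """Return True if a node is within the search range.

    layers:       number of layers of the node
    host:         host name of the node's storage location
    max_layers:   maximum number of layers set as a search condition
    host_pattern: optional string the host name must begin with
    """
    if layers > max_layers:
        return False  # S301; N -> S302: outside the search range
    if host_pattern is not None and not host.startswith(host_pattern):
        return False  # S303; N -> S302: host name does not match
    return True       # S304: within the search range
```

Because both conditions must hold, swapping the order of the two checks gives the same result, consistent with the remark that the determinations may be performed in either order.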

【００５３】図７は、図４に示したテーブル更新処理の
流れを表わしたものである。この処理は、ファイル収集
部２１とハイパーリンク抽出部２２により行われる。ま
ず、図４のステップＳ１０４により探索範囲内に存在す
ると判別されたノードの内容をネットワークを通じてそ
れが格納されているサーバマシンから読み出すことを行
う。このため、ファイル収集部２１は、該当するノード
の属性情報をノード・リンク情報データベース２３のノ
ードテーブル３１から読み込む（ステップＳ４０１）。
読み込んだノードの階層数をここでは、仮にｎとする。
次に、ファイル収集部２１は、サーバマシンとの通信に
用いる図示しない通信用バッファと一時バッファを初期
化した後サーバマシン１３との接続を行い、ノードのド
キュメントファイルの転送（読み込み）準備を行う（ス
テップＳ４０２）。FIG. 7 shows the flow of the table update processing shown in FIG. 4. This processing is performed by the file collection unit 21 and the hyperlink extraction unit 22. First, the contents of the node judged in step S104 of FIG. 4 to be within the search range are read through the network from the server machine on which the node is stored. To do so, the file collection unit 21 reads the attribute information of the node from the node table 31 of the node / link information database 23 (step S401). The number of layers of the node read is denoted here by n. Next, the file collection unit 21 initializes a communication buffer used for communication with the server machine and a temporary buffer (neither shown), then connects to the server machine 13 and prepares to transfer (read) the document file of the node (step S402).

【００５４】続いてファイル収集部２１は、分散ハイパ
ーメディアシステムの転送プロトコルに従って、ドキュ
メントファイルの転送を開始する。ファイル収集部２１
は、転送プロトコルメッセージヘッダ部を読み込み（ス
テップＳ４０３）、ドキュメントファイルの記述言語な
どのフォーマットを調べる（ステップＳ４０４）。読み
込んだファイルがハイパーメディア記述言語形式でない
場合には（ステップＳ４０４；Ｎ）、そのドキュメント
ファイルの終端まで読み込む（ステップＳ４０５）。そ
して、通信用バッファおよび一時バッファの解放ならび
にサーバマシンとの接続を断するなどの後処理を行い
（ステップＳ４０６）、処理を終了する（エンド）。Subsequently, the file collection unit 21 starts transferring the document file according to the transfer protocol of the distributed hypermedia system. The file collection unit 21 reads the header of the transfer protocol message (step S403) and checks the format of the document file, such as its description language (step S404). If the file is not in a hypermedia description language format (step S404; N), the document file is read to its end (step S405). Post-processing, such as releasing the communication buffer and the temporary buffer and disconnecting from the server machine, is then performed (step S406), and the processing ends (END).

【００５５】読み込んだファイルがハイパーメディア記
述言語形式の場合には（ステップＳ４０４；Ｙ）、ファ
イル収集部２１は、転送プロトコルメッセージ本体のド
キュュメントファイルを通信用バッファに読み込む（ス
テップＳ４０７）。次に、ハイパーメディアド記述言語
解析部２６により、ハイパーリンクの部分を表わすタグ
の開始文字から終了文字までを一時バッファに移動させ
る（ステップＳ４０８）。一時バッファに格納したハイ
パーリンクの中からリンク先ノードを記述している部分
を取り出す（ステップＳ４０９）。階層数算出部は、リ
ンク元のノードの階層数“ｎ”に“１”を加えた“ｎ＋
１”を、リンク先ノードの階層数として求める（ステッ
プＳ４１０）。If the file read is in a hypermedia description language format (step S404; Y), the file collection unit 21 reads the document file contained in the body of the transfer protocol message into the communication buffer (step S407). Next, the hypermedia description language analysis unit 26 moves the span from the start character to the end character of a tag representing a hyperlink into the temporary buffer (step S408). The portion describing the link destination node is extracted from the hyperlink stored in the temporary buffer (step S409). The number-of-layers calculation unit obtains "n + 1", the number of layers "n" of the link source node plus "1", as the number of layers of the link destination node (step S410).
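
Steps S408 to S410 can be sketched in Python as follows, assuming that the hypermedia description language is HTML and that hyperlinks are `<a href="...">` tags; the text itself names neither. Python's standard `html.parser` is used in place of the buffer handling described above, and the names `LinkExtractor` and `extract_links` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects link destinations from anchor tags (assumes HTML documents)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.destinations = []

    def handle_starttag(self, tag, attrs):
        # S408 / S409: detect a hyperlink tag and take out the part
        # that describes the link destination node.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative references against the source node's URL.
                    self.destinations.append(urljoin(self.base_url, value))


def extract_links(document, base_url, source_layers):
    """Return (destination URL, number of layers) pairs for one document.

    S410: each destination gets the source node's number of layers plus 1.
    """
    parser = LinkExtractor(base_url)
    parser.feed(document)
    parser.close()
    return [(url, source_layers + 1) for url in parser.destinations]
```

Every destination found in a document whose node has n layers is returned with n + 1 layers, regardless of which server the destination is stored on.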

【００５６】このようにして得たノードとハイパーリン
クの属性情報を、ノード・リンク情報データベース２３
内のノードテーブル３１およびリンクテーブル５１に書
き込む（ステップＳ４１１）。この際、リンク元ノード
の属性情報、リンク先ノードの属性情報の各一部項目
と、ハイパーリンクの属性情報（リンクテーブル）の全
項目を書き込む。これにより、今回読み込んだノードに
リンクされている１つ階層の進んだリンク先ノードにつ
いての属性情報と、読み込んだノードからそのリンク先
ノードへのハイパーリンクを各テーブルに１つ追加登録
したことになる。ただし、同一のノードについて既に登
録されている場合には、テーブルへの追加登録は行わ
ず、探索日時などの更新のみを行う。これにより同一の
ノードやハイパーリンクの多重登録が回避される。The attribute information of the node and hyperlink obtained in this way is written into the node table 31 and the link table 51 in the node / link information database 23 (step S411). At this time, some items of the attribute information of the link source node and of the link destination node, and all items of the attribute information of the hyperlink (link table), are written. As a result, the attribute information of a link destination node one layer deeper, linked from the node read this time, and the hyperlink from the node read to that link destination node are each added as one entry to the respective tables. However, if the same node has already been registered, no additional entry is made in the table, and only items such as the search date and time are updated. This avoids multiple registration of the same node or hyperlink.
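
The duplicate-free registration of step S411 can be sketched in Python as follows. This is an illustration under assumptions: both tables are dictionaries keyed by sequentially assigned identifiers (standing in for the identifiers generated by the DBMS), a node is identified by its full URL, and the function name `register_link` is hypothetical. Which items are refreshed for an already registered node is not modeled here.

```python
def register_link(node_table, link_table, source_id, dest_url, dest_layers, now):
    """Register a link destination node and the hyperlink pointing to it.

    node_table: dict of node id -> {"url", "layers", "registered_at", "searched_at"}
    link_table: dict of link id -> (source node id, destination node id)
    Returns the node identifier of the link destination.
    """
    # Look for an existing row for the same node.
    dest_id = next((nid for nid, row in node_table.items()
                    if row["url"] == dest_url), None)
    if dest_id is None:
        # New node: assign the next identifier and leave the search date undefined.
        dest_id = max(node_table, default=0) + 1
        node_table[dest_id] = {"url": dest_url, "layers": dest_layers,
                               "registered_at": now, "searched_at": None}
    # Register the hyperlink only if the same source/destination pair is absent.
    if (source_id, dest_id) not in link_table.values():
        link_table[max(link_table, default=0) + 1] = (source_id, dest_id)
    return dest_id
```

Calling the function twice with the same destination leaves one node row and one link row, and the layers and registration date of the first registration are kept.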

【００５７】次に、今回読み込んだノードのドキュュメ
ントファイルの終端まで処理を行ったか否かを調べ（ス
テップＳ４１２）、終端に到らないときは（ステップＳ
４１２；Ｎ）、ステップＳ４０７に戻る。各ノードは、
複数のリンク先ノードを有することがあるので、ファイ
ルの終端までこのような処理を繰り返し行うことによ
り、今回読み込んだノードのリンク先ノードの全てにつ
いてノードテーブルとリンクテーブルへの登録を行う。
ファイルの終端まで処理したときは（ステップＳ４１
２；Ｙ）、このノードについてのテーブル更新処理を終
了する（エンド）。Next, it is checked whether processing has reached the end of the document file of the node read this time (step S412). If the end has not been reached (step S412; N), the process returns to step S407. Because a node may have a plurality of link destination nodes, repeating this processing until the end of the file registers all link destination nodes of the node read this time in the node table and the link table. When the file has been processed to its end (step S412; Y), the table update processing for this node ends (END).

【００５８】ここで、図７のステップＳ４１１において
ノードテーブルとリンクテーブルに登録する属性情報の
内容について説明する。サーバマシンから読み込んだノ
ードをリンク元ノードとし、このノードに含まれるハイ
パーリンクに記述されているノードをリンク先ノードと
する。まず、リンク元ノードについては、探索日時３９
と、最終更新日時４１を登録する。これにより、ノード
をいつ検索したかの最新の時刻情報を残すことができ
る。たとえば、図２に示すノードテーブル３１におい
て、ノード識別子が“１”のノードをサーバマシンから
読み出し、これのハイパーリンクの指す先としてノード
識別子が“２”のノードが記述されていたものとする。
この場合は、ノード識別子が“１”のノードがリンク元
ノードであり、ノード識別子が“２”のノードがリンク
先ノードである。Here, the contents of the attribute information registered in the node table and the link table in step S411 of FIG. 7 are described. The node read from the server machine is the link source node, and a node described in a hyperlink contained in that node is a link destination node. First, for the link source node, the search date and time 39 and the last update date and time 41 are registered. This leaves a record of the most recent time at which the node was searched. For example, suppose that, in the node table 31 shown in FIG. 2, the node with node identifier "1" is read from the server machine and the node with node identifier "2" is described as the destination of one of its hyperlinks. In this case, the node with node identifier "1" is the link source node, and the node with node identifier "2" is the link destination node.

【００５９】図２において点線４２で囲まれている探索
日時と最終更新日時がリンク元ノードの属性情報として
更新される。リンク先ノードの属性情報として登録され
るのは図２の点線４４で囲まれている部分である。すな
わち、ノード識別子と、ノードの格納位置を示すＵＲＬ
の文字列に対応した、スキーム、ホスト名、ポート番
号、スキーム特有部、階層数算出部で求めた階層数、お
よびデータベースに登録された日時である登録日時であ
る。ノード識別子は、ＤＢＭＳによって自動生成された
ものが登録される。リンク先のノードについては、まだ
実際にサーバマシン１３から読み出してそれに含まれる
ハイパーリンクを調べていないので、探索日時および最
終更新日時は未登録のままとなる。In FIG. 2, the search date and time and the last update date and time, enclosed by dotted line 42, are updated as attribute information of the link source node. What is registered as attribute information of the link destination node is the portion enclosed by dotted line 44 in FIG. 2: the node identifier; the scheme, host name, port number, and scheme-specific part, which correspond to the URL character string indicating the storage location of the node; the number of layers obtained by the number-of-layers calculation unit; and the registration date and time, which is the date and time of registration in the database. The node identifier registered is the one generated automatically by the DBMS. Because the link destination node has not yet actually been read from the server machine 13 and its hyperlinks have not yet been examined, its search date and time and last update date and time remain unregistered.

【００６０】リンクテーブル５１には、ＤＢＭＳによっ
て自動生成されたリンク識別子５２と、リンク元ノード
識別子５３と、リンク先ノード識別子５４が登録され
る。図３を例に説明する。ノード識別子が“１”のノー
ドを読み込み、これのハイパーリンクによってノード識
別子が“２”のノードがリンク先ノードとなっているも
のとする。まず、リンク識別子が“１”のハイパーリン
ク（５５）の出所元のノードのノード識別子は“１”で
あるので、この値をリンク元ノード識別子（５６）とし
て登録する。またハイパーリンクの指す先のノードのノ
ード識別子は“２”であるので、この値をリンク先ノー
ド識別子（５７）として登録する。読み込んだノードの
他のハイパーリンクが登録されている場合には、それら
リンクについてもリンク識別子を割り当て、リンク元ノ
ード識別子とリンク先ノード識別子が登録される。２つ
ハイパーリンクが存在してる場合には、図３の点線５８
で示した範囲の情報がリンクテーブル５１に登録され
る。In the link table 51, a link identifier 52 generated automatically by the DBMS, a link source node identifier 53, and a link destination node identifier 54 are registered. This is explained using FIG. 3 as an example. Suppose that the node with node identifier "1" is read and that one of its hyperlinks has the node with node identifier "2" as its link destination node. First, because the node identifier of the node from which the hyperlink with link identifier "1" (55) originates is "1", this value is registered as the link source node identifier (56). Because the node identifier of the node to which the hyperlink points is "2", this value is registered as the link destination node identifier (57). If the node read contains other hyperlinks, link identifiers are assigned to those links as well, and their link source node identifiers and link destination node identifiers are registered. If there are two hyperlinks, the information in the range indicated by dotted line 58 in FIG. 3 is registered in the link table 51.

【００６１】ノードテーブル３１およびリンクテーブル
５１の更新は、１つのノードをサーバマシン１３から読
み込み、新たなリンク先を見い出すたびに行われる。し
たがって、図４の流れ図において１つのノードについて
テーブル更新処理（ステップＳ１０５）を終えた後、ス
テップＳ１０２に戻ると、テーブル更新処理において新
たに登録されたノードもその探索対象になる。これによ
り、次々とリンク先への探索が進められる。ただし、そ
の探索範囲は階層数などの条件によって制限される。The updating of the node table 31 and the link table 51 is performed each time one node is read from the server machine 13 and a new link destination is found. Therefore, in the flowchart of FIG. 4, when the table update processing (step S105) is completed for one node and the process returns to step S102, the node newly registered in the table update processing is also searched. As a result, the search for the link destination proceeds one after another. However, the search range is limited by conditions such as the number of layers.

【００６２】また、探索日時が未定義か経過時間以上過
去の時刻であることを未探索判定基準にしており、また
一度探索したノードの探索日時には現在時刻に近い時刻
が登録されるので、同一のノードについて重ねてリンク
先の調査が行われることはない。これにより、たとえば
ノード“Ａ”のリンク先がノード“Ｂ”で、ノード
“Ｂ”のリンク先がノード“Ａ”のようにリンクにより
ループが形成されている場合であっても、ノード“Ａ”
を再度調べることが無く、最大階層数までループを繰り
返したどるようなことがない。In addition, a node is judged to be unsearched only when its search date and time is undefined or is earlier than the current time by at least the elapsed time, and a time close to the current time is registered as the search date and time of a node once it has been searched, so the link destinations of the same node are not examined repeatedly. Consequently, even when links form a loop, for example when the link destination of node "A" is node "B" and the link destination of node "B" is node "A", node "A" is not examined again, and the loop is not traversed repeatedly up to the maximum number of layers.

【００６３】このようにノードの階層数によって探索の
範囲を制限しているので、起点となるノードと意味的な
つながりの強いノードだけを探索することができる。ま
た、サーバ単位でしか探索範囲を制限できない場合に比
べて不必要な探索を低減することができる。As described above, since the range of the search is limited by the number of layers of the nodes, it is possible to search only the nodes having strong semantic connection with the starting node. Further, unnecessary search can be reduced as compared with the case where the search range can be limited only on a server basis.

【００６４】変形例 Modification

【００６５】これまで説明した実施例では、ノードの属
性情報として階層数を持たせているが、変形例ではハイ
パーリンクの階層数を基にして探索範囲内か否かを判別
するようになっている。装置の構成は図１に示したもの
と同一でありその説明を省略する。In the embodiment described so far, the number of layers is held as attribute information of a node. In the modification, whether a target is within the search range is determined based on the number of layers of a hyperlink. The configuration of the device is the same as that shown in FIG. 1, and its description is omitted.

【００６６】図８は、変形例のノード・リンク探索装置
で用いられるノードテーブルの登録内容の一例を表わし
たものである。図２と同一の項目には同一の符号を付し
てあり、それらの説明を適宜省略する。ノードテーブル
７１は、図２に示したノードテーブル３１に比べて、階
層数と探索日時を登録する項目が削除されている点で相
違する。FIG. 8 shows an example of the registered contents of the node table used in the node / link search device of the modified example. The same items as those in FIG. 2 are denoted by the same reference numerals, and description thereof will be omitted as appropriate. The node table 71 is different from the node table 31 shown in FIG. 2 in that items for registering the number of layers and the search date and time are deleted.

【００６７】図９は、変形例のノード・リンク探索装置
で用いるリンクテーブルの登録内容の一例を表わしたも
のである。図３と同一項目には同一の符号を付してあ
り、それらの説明を適宜省略する。リンクテーブル８１
は、図３に示したリンクテーブル５１に加えて、階層数
の項目８２と、探索日時の項目８３を備えている。これ
らは、実施例においてはノードテーブルに登録されてい
たものである。変形例では、探索に先立って、探索の開
始点となるハイパーリンクを少なくとも１つリンクテー
ブル８１に登録しておく必要がある。ここで登録される
探索開始点となるハイパーリンクは、仮想的なものであ
り、リンク元ノードが存在しない（不明）。また、探索
の開始点となるハイパーリンクの示すリンク先ノードに
ついての属性情報をノードテーブル７１に予め登録して
おかなければならない。FIG. 9 shows an example of the registered contents of the link table used in the node / link search device of the modification. Items that are the same as those in FIG. 3 are given the same reference numerals, and their description is omitted as appropriate. The link table 81 has an item 82 for the number of layers and an item 83 for the search date and time in addition to the items of the link table 51 shown in FIG. 3. In the embodiment, these were registered in the node table. In the modification, at least one hyperlink serving as a search start point must be registered in the link table 81 prior to the search. The hyperlink registered here as a search start point is a virtual one, and its link source node does not exist (is unknown). In addition, the attribute information of the link destination node indicated by the hyperlink serving as the search start point must be registered in the node table 71 in advance.

【００６８】探索に先立ってノードテーブル７１に登録
する内容として、まず探索の起点のハイパーリンクの示
すリンク先ノードのノード識別子がある。この値はＤＢ
ＭＳによって自動的に割り当てられる。さらに、ノード
の格納位置を示すＵＲＬの文字列に対応した項目とし
て、スキーム、ホスト名、ポート番号、スキーム特有部
を初期登録しておく。また、このノードがデータベース
に登録された時刻を表わす登録日時を初期登録する。図
８の例では、点線７２で囲んだ範囲の項目が初期登録さ
れる。The contents registered in the node table 71 prior to the search first include the node identifier of the link destination node indicated by the hyperlink at the starting point of the search. This value is assigned automatically by the DBMS. In addition, the scheme, host name, port number, and scheme-specific part are initially registered as items corresponding to the URL character string indicating the storage location of the node. The registration date and time, which indicates when this node was registered in the database, is also initially registered. In the example of FIG. 8, the items in the range enclosed by dotted line 72 are initially registered.

【００６９】リンクテーブル８１に初期登録しておく内
容としては、探索の起点となるハイパーリンクの識別子
であるリンク識別子がある。この値は、ＤＢＭＳによっ
て自動生成される。探索の起点となるハイパーリンクの
リンク元は不明であるので、リンク元ノード識別子の初
期値は未定義を表わす“０”とする。リンク先ノード識
別子は、リンク先となるノードに対してＤＢＭＳの割り
当てたノード識別子と同一の値を登録しておく。階層数
は、探索の開始点であるの“０”を初期登録する。また
探索日時は、当該ハイパーリンクについての探索を行っ
た日時を登録するものであり、探索開始の初期値として
は未定義のままとする。図９の例では点線８４で囲んだ
項目が初期登録される。The contents initially registered in the link table 81 include the link identifier, which identifies the hyperlink serving as the starting point of the search. This value is generated automatically by the DBMS. Because the link source of the hyperlink serving as the starting point of the search is unknown, the initial value of the link source node identifier is set to "0", which represents undefined. As the link destination node identifier, the same value as the node identifier assigned by the DBMS to the link destination node is registered. The number of layers is initially registered as "0", because the hyperlink is the starting point of the search. The search date and time records when the hyperlink was searched, and is left undefined as its initial value at the start of the search. In the example of FIG. 9, the items enclosed by dotted line 84 are initially registered.
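
The initial registration for the modification can be sketched in Python as follows. The dictionary layout, the constant `UNDEFINED_SOURCE`, the function name `init_tables`, and the use of identifier 1 for the first row are assumptions made for illustration; in the text the identifiers are generated by the DBMS.

```python
from datetime import datetime

UNDEFINED_SOURCE = 0  # link source node identifier meaning "undefined"


def init_tables(start_url, now):
    """Build the initial node table 71 and link table 81 for one search start point."""
    # Node table 71 holds no number of layers and no search date and time.
    node_table = {1: {"url": start_url, "registered_at": now}}
    # Link table 81: a virtual hyperlink with no link source, pointing at the start node.
    link_table = {1: {"source_id": UNDEFINED_SOURCE,
                      "dest_id": 1,
                      "layers": 0,           # item 82: starting point of the search
                      "searched_at": None}}  # item 83: undefined until searched
    return node_table, link_table
```

Because the search date and time of the virtual hyperlink is undefined, it is the first hyperlink judged to be unsearched when the search begins.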

【００７０】図１０は、変形例におけるノード・リンク
探索装置の行う処理の流れを表わしたものである。一点
破線１０１で囲まれたステップは、システム制御部２５
の行う処理を表わしている。一点破線１０２で囲まれた
ステップは、探索範囲制御部２４により、一点破線１０
３で囲まれたステップは、ファイル収集部２１およびハ
イパーリンク抽出部２２によって行われる処理を表わし
ている。FIG. 10 shows the flow of processing performed by the node / link search device in the modification. The steps enclosed by dash-dot line 101 represent processing performed by the system control unit 25. The steps enclosed by dash-dot line 102 represent processing performed by the search range control unit 24, and the steps enclosed by dash-dot line 103 represent processing performed by the file collection unit 21 and the hyperlink extraction unit 22.

【００７１】まず、システム制御部２５は、当該システ
ムの初期化を行う（ステップＳ５０１）。この際、探索
範囲とするハイパーリンクの階層数の最大値としての最
大階層数と、前回の探索結果を古いものとして扱う基準
となる経過時間とを設定する。さらにホスト名によって
サーバの範囲を設定する場合には、ホスト名を制限する
ための文字列を入力する。これらは、図示しない入力端
末あるいは外部ファイルから取り込む。次に、リンクテ
ーブル８１の中に探索していないハイパーリンクが存在
するか否かを調べる（ステップＳ５０２）。探索されて
いないハイパーリンクとは、探索日時が未定義（未登
録）のもの、あるいは探索日時が現在時刻よりも経過時
間以上古いものである。First, the system controller 25 initializes the system (step S501). At this time, the maximum number of layers as the maximum value of the number of layers of the hyperlink to be the search range and the elapsed time serving as a reference for treating the previous search result as an old one are set. When setting a server range by a host name, a character string for limiting the host name is input. These are taken from an input terminal (not shown) or an external file. Next, it is determined whether or not a hyperlink that has not been searched exists in the link table 81 (step S502). A hyperlink that has not been searched is one whose search date and time is undefined (unregistered), or one whose search date and time is older than the current time by an elapsed time or more.

【００７２】[0072] When the link table 81 contains unsearched hyperlinks (step S502; Y), one of them is selected and its attribute information is retrieved (step S503). Next, the search range control unit 24 determines whether the selected hyperlink is within the search range (step S504). The determination is based either on the number of layers alone or on both the number of layers and the host name; the details are described later.

【００７３】[0073] If the hyperlink is within the search range (step S504; Y), table update processing (step S505) is performed. In this processing, the attribute information of the link destination node indicated by the hyperlink is retrieved from the node table 71, the document file of that node is read from the server machine, and the attribute information of the link destination nodes described in that file is additionally registered, among other operations. The detailed flow is described later. After the table update processing (step S505) for the link destination node indicated by the hyperlink is finished, the process returns to step S502 and the search is repeated for the remaining unsearched hyperlinks.

【００７４】[0074] If the selected hyperlink is outside the search range (step S504; N), the search date and time 39 of the selected hyperlink in the link table 81 is changed to the current time (step S506), and the process returns to step S502. When no unsearched hyperlink remains in the link table 81 (step S502; N), end processing for terminating the search is performed (step S507), and the process ends (END).
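The control flow of FIG. 10 (steps S502 to S507) can be sketched as a loop over an in-memory link table. This is a hedged illustration only: the function `crawl`, the dictionary key `searched_at`, and the callables `in_range` and `update_tables` are hypothetical names, and a Python list stands in for the link table 81 held in the database.

```python
from datetime import datetime, timedelta


def crawl(link_table, in_range, update_tables, max_age, clock=datetime.now):
    """Process link records until none is unsearched (steps S502 to S507).

    link_table    -- list of dicts, each with at least a "searched_at" key
    in_range      -- callable(link) -> bool, the S504 decision
    update_tables -- callable(link, link_table), the S505 table update;
                     it may append new link records to link_table
    max_age       -- timedelta after which a search result is stale
    """
    while True:
        now = clock()
        pending = [
            link for link in link_table
            if link["searched_at"] is None
            or now - link["searched_at"] >= max_age
        ]                                    # S502: any unsearched link?
        if not pending:
            return                           # S507: nothing left to search
        link = pending[0]                    # S503: pick one unsearched link
        if in_range(link):                   # S504
            update_tables(link, link_table)  # S505
        # S506 stamps out-of-range links. This sketch stamps in-range
        # links here as well, so that they are not selected again.
        link["searched_at"] = now
```

Because newly appended records start with an undefined search date, they are picked up on a later pass of the same loop, which is how the search moves outward from the starting hyperlink.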

【００７５】[0075] FIG. 11 shows the flow of processing for determining whether a hyperlink is within the search range. Here the determination is made solely from the number of layers of the hyperlink. First, the search range control unit 24 compares the number of layers of the selected hyperlink with the maximum number of layers set as the search condition (step S601). If the number of layers of the hyperlink is greater than the maximum number of layers (step S601; N), the hyperlink is determined to be outside the search range (step S602). If it is less than or equal to the maximum number of layers (step S601; Y), the hyperlink is determined to be within the search range (step S603).

【００７６】[0076] FIG. 12 shows the flow of another example of the processing for determining whether a hyperlink is within the search range. Here a check on the host name is added to the check on the number of layers. The search range control unit 24 compares the number of layers of the selected hyperlink with the maximum number of layers (step S701). If the number of layers of the hyperlink is greater than the maximum (step S701; N), the hyperlink is determined to be outside the search range (step S702). If it is less than or equal to the maximum (step S701; Y), the attribute information of the link destination node indicated by the hyperlink is retrieved from the node table 71 (step S703).

【００７７】[0077] It is then checked whether the host name contained in the retrieved attribute information of the link destination node partially matches the character string specified as the host name of the search range (step S704). If it does (step S704; Y), the hyperlink is determined to be within the search range (step S705). If it does not (step S704; N), the hyperlink is determined to be outside the search range (step S702). Although the number of layers is checked here before the partial match of the host name, the two checks may be performed in the reverse order.
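The two range checks of FIG. 11 and FIG. 12 can be sketched in one function. This is a minimal illustration under stated assumptions: the name `in_search_range` and its parameters are hypothetical, and because the description speaks only of a "partial match" while the claims describe matching from the beginning of the host name, the sketch offers both behaviours through an assumed `prefix_only` flag.

```python
def in_search_range(depth, host, max_depth,
                    host_pattern=None, prefix_only=False):
    """Decide whether a hyperlink is inside the search range.

    depth        -- number of layers of the hyperlink
    host         -- host name of the link destination node
    max_depth    -- maximum number of layers set as the search condition
    host_pattern -- string restricting host names, or None for the
                    depth-only check of FIG. 11
    prefix_only  -- assumption of this sketch: if True, the pattern must
                    match at the start of the host name; otherwise any
                    substring match counts as a partial match
    """
    if depth > max_depth:          # S601 / S701: deeper than allowed
        return False               # S602 / S702
    if host_pattern is None:
        return True                # S603: depth-only decision
    if prefix_only:
        return host.startswith(host_pattern)
    return host_pattern in host    # S704: partial match on the host name
```

Checking the depth first avoids the node table lookup for links that are already too deep, although the text notes that the order of the two checks may be swapped.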

【００７８】[0078] FIG. 13 shows the flow of the table update processing shown in FIG. 10. This processing is performed by the file collection unit 21 and the hyperlink extraction unit 22. First, the attribute information of the hyperlink determined in step S504 of FIG. 10 to be within the search range is read from the link table 81 (step S801). Let n denote the number of layers of this hyperlink. Next, the attribute information of the node corresponding to the link destination node identifier of this hyperlink is read from the node table 71 (step S802).

【００７９】[0079] The file collection unit 21 initializes a communication buffer and a temporary buffer (neither shown) used for communication with the server machine, then connects to the server machine and prepares to read the document file of the link destination node indicated by the hyperlink (step S803). The file collection unit 21 then starts transferring the document file in accordance with the transfer protocol of the distributed hypermedia system.

【００８０】[0080] The file collection unit 21 reads the header of the transfer protocol message (step S804). If the document file is not written in a hypermedia description language (step S805; N), the file is read to its end (step S806). Post-processing such as releasing the communication buffer and the temporary buffer and disconnecting from the server machine is then performed (step S807), and the processing ends (END).

【００８１】[0081] If the file being read is in a hypermedia description language format (step S805; Y), the file collection unit 21 reads the document file carried in the body of the transfer protocol message into the communication buffer (step S808). Next, the text from the start character to the end character of a tag representing a hyperlink is moved to the temporary buffer (step S809), and the part describing the link destination node is extracted from the hyperlink stored in the temporary buffer (step S810). The number-of-layers calculation unit 27 sets “n+1” as the number of layers of the extracted hyperlink (step S811).
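The extraction in steps S809 and S810 and the layer assignment that follows can be sketched with Python's standard `html.parser` module. The patent does not name the hypermedia description language, so treating it as HTML with `<a href="...">` anchors is an assumption of this sketch, as are the names `LinkExtractor` and `extract_links`.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href target of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.targets.append(value)


def extract_links(document, parent_depth):
    """Return (target, depth) pairs; each new link gets depth n + 1,
    where n is the depth of the hyperlink that led to this document."""
    parser = LinkExtractor()
    parser.feed(document)
    parser.close()
    return [(target, parent_depth + 1) for target in parser.targets]
```

A document reached through a hyperlink at layer 0 therefore yields hyperlinks at layer 1, and anchors without an `href` attribute are ignored.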

【００８２】[0082] The attribute information of the nodes and hyperlinks obtained in this way is written to the node table 71 and the link table 81 in the node/link information database 23 (step S812). Here, some of the items of the attribute information of the link source node and of the link destination node are written, together with all items of the attribute information of the hyperlink. As a result, the attribute information of each link destination node of the node just read, and the hyperlink from the node just read to that link destination node, are registered. However, if a hyperlink with the same link source and link destination is already registered, no new entry is added to the table and only items such as the search date and time are updated. This prevents the same node or hyperlink from being registered more than once.
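The duplicate check described for step S812 can be sketched as follows. This is an illustration only: `register_link` and the record keys are hypothetical, a list position stands in for the DBMS-generated link identifier, and refreshing just the search date and time of an existing record is one reading of "only items such as the search date and time are updated".

```python
def register_link(link_table, source_id, dest_id, depth, now):
    """Add a link record, or refresh the existing one with the same ends.

    Returns the record that was added or updated.
    """
    for record in link_table:
        if record["source"] == source_id and record["dest"] == dest_id:
            # Same link source and destination: no new row is added.
            record["searched_at"] = now
            return record
    record = {
        "link_id": len(link_table) + 1,  # stand-in for a DBMS-generated id
        "source": source_id,
        "dest": dest_id,
        "depth": depth,
        "searched_at": None,             # undefined until the link is searched
    }
    link_table.append(record)
    return record
```

Keying the check on the (source, destination) pair means that two different nodes linking to the same destination still produce two separate link records.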

【００８３】[0083] Next, it is checked whether the document file of the node just read has been processed to its end (step S813). If the end has not been reached (step S813; N), the process returns to step S808. Since a node may have multiple link destination nodes, repeating this processing until the end of the file registers all link destination nodes of the node just read in the node table and the link table. When the file has been processed to its end (step S813; Y), the processing ends (END).

【００８４】[0084] The contents of the attributes registered in the node table 71 and the link table 81 in step S812 of FIG. 13 are now described. The node read from the server machine is called the link source node, and a node described in a hyperlink contained in that node is called a link destination node. For the link source node, the last update date and time 41 is registered. For example, suppose that, in the node table 71 shown in FIG. 8, the node with node identifier “1” is read from the server machine and that one of its hyperlinks points to the node with node identifier “2”. In this case, the node with node identifier “1” is the link source node and the node with node identifier “2” is the link destination node.

【００８５】[0085] In FIG. 8, the last update date and time enclosed by the dotted line 73 is updated as attribute information of the link source node. The part enclosed by the dotted line 74 in FIG. 8 is registered as attribute information of the link destination node: the node identifier; the scheme, host name, port number, and scheme-specific part obtained from the URL string indicating the location of the node; and the registration date and time, which is when the node was registered in the database. The node identifier registered is the one automatically generated by the DBMS.

【００８６】[0086] The link table 81 registers the link identifier automatically generated by the DBMS, the link source node identifier, the link destination node identifier, the number of layers, and the search date and time. FIG. 9 is used as an example. Suppose that the node with node identifier “1”, which is the link destination of the hyperlink with link identifier “1”, has been read. Suppose further that the hyperlinks described in that node link it to the nodes with node identifiers “2” and “3”.

【００８７】[0087] Since the search was performed by referring to the hyperlink 84, the time of the search is registered as the search date and time 85. Next, a link identifier for each newly obtained hyperlink is acquired from the DBMS and registered in the link identifier field of the entry being added. For example, “2” is registered as the link identifier (87) of the hyperlink (86) whose link identifier is “2”. This value is an ID number assigned by the DBMS as appropriate, for example according to the order of registration in the table.

【００８８】[0088] Because the link source node of the hyperlink 86 has the node identifier “1”, “1” is registered as the link source node identifier 88. The link destination node identifier 89 is set to the node identifier that the DBMS assigned when the attributes of the link destination node were registered in the node table 71, namely “2”. The number of layers 91 is set to “1”, obtained by adding “1” to “0”, because this hyperlink was obtained from a node reached by following a hyperlink whose number of layers is “0”. The hyperlink 92 to the node with node identifier “3” is registered by the same procedure. For each added hyperlink, the information in the range indicated by the dotted line 93 in FIG. 9 is registered.

【００８９】[0089] The node table 71 and the link table 81 are updated each time a link destination node is read from the server machine on the basis of one hyperlink and new link destinations are found. Therefore, when the flow of FIG. 10 returns to step S502 after the table update processing (step S505) for one node, the hyperlinks newly registered by that processing also become search targets. The search thus proceeds from link destination to link destination. The search range is, however, limited by conditions such as the number of layers.

【００９０】[0090] In addition, because a hyperlink is treated as unsearched only if its search date and time is undefined or lies at least the elapsed time in the past, a hyperlink that has been searched once is not examined again. Consequently, even when links form a loop, for example when node “A” links to node “B” and node “B” links back to node “A”, the hyperlink from node “A” to node “B” is not examined a second time, and the loop is not followed repeatedly up to the maximum number of layers.

【００９１】[0091] Limiting the search range by the number of layers of hyperlinks in this way gives the same effect as limiting it by the number of layers of nodes. However, when the number of layers is assigned to hyperlinks, the node table must also be consulted to examine the link destination node of a hyperlink obtained from the link table, so more table lookups are needed than when the number of layers is assigned to nodes as in the embodiment.

【００９２】[0092]

[Effects of the Invention] As described in detail above, according to the inventions of claims 1 to 3, the search range is limited by the number of layers from the starting node, so only nodes with a strong semantic connection to the starting node are searched. Because the search range can be limited by the number of layers, only information within the necessary range is collected. Since unnecessary nodes are not searched, the load on the servers and on the communication lines to the servers is reduced and the time required for the search is shortened. Further, in the inventions of claims 1 and 2, when the number of layers obtained by the search range limiting means is less than or equal to the maximum number of layers set by the search condition setting means, the host name partial match determining means determines whether the host name of the node matches the character string specified as the host name of the search range, from the beginning and over the length of the specified string. If the host name partial match determining means finds a partial match, the node is determined to be within the search range. Nodes whose host names begin with different characters can therefore be excluded from the search range, which makes the search efficient.

【００９３】[0093] According to the invention of claim 4, a node that has already been read is prevented from being read again, so a looped range is not searched repeatedly. The search can thus be performed efficiently.

[Brief description of the drawings]

FIG. 1 is a block diagram showing an outline of the configuration of a node/link search device according to an embodiment of the present invention.

FIG. 2 is an explanatory diagram showing an example of the registered contents of a node table.

FIG. 3 is an explanatory diagram showing an example of the registered contents of a link table.

FIG. 4 is a flowchart showing the flow of operations performed by the node/link search device.

FIG. 5 is a flowchart showing the flow of processing for determining whether a node is within the search range.

FIG. 6 is a flowchart showing the flow of another example of the processing for determining whether a node is within the search range.

FIG. 7 is a flowchart showing the flow of the table update processing shown in FIG. 4.

FIG. 8 is an explanatory diagram showing an example of the registered contents of a node table used in the node/link search device of the modification.

FIG. 9 is an explanatory diagram showing an example of the registered contents of a link table used in the node/link search device of the modification.

FIG. 10 is a flowchart showing the flow of operations performed by the node/link search device of the modification.

FIG. 11 is a flowchart showing the flow of processing for determining whether a hyperlink is within the search range.

FIG. 12 is a flowchart showing the flow of another example of the processing for determining whether a hyperlink is within the search range.

FIG. 13 is a flowchart showing the flow of the table update processing shown in FIG. 10.

FIG. 14 is an explanatory diagram showing an example of nodes and the hyperlinks connecting them.

FIG. 15 is an explanatory diagram showing an example of the relationships between the documents of nodes.

FIG. 16 is a block diagram showing an outline of the configuration of a conventional node/link search device.

[Explanation of symbols]

11 Node/link search device
12 Network
13 Server machine
21 File collection unit
22 Hyperlink extraction unit
23 Node/link information database
24 Search range control unit
25 System control unit
26 Hypermedia description language analysis unit
27 Number-of-layers calculation unit
28 Host name comparison unit
29 Number-of-layers comparison unit
31, 71 Node table
51, 81 Link table

Continuation of front page

(56) References
JP-A-8-305729 (JP, A)
JP-A-6-110926 (JP, A)
P. M. E. De Bra, R. D. J. Post, "Information retrieval in the World-Wide Web: Making client-based searching feasible", Computer Networks and ISDN Systems, Volume 27, No. 2, pp. 183-192 (November 1994)
Takeshi Sugai, Mitsunori Wada, "Information Filtering on the Internet (2)", Proceedings of the 51st National Convention of the Information Processing Society of Japan (second half of 1995), pp. 4-87 to 4-88
Kimmel S., "Robot generated on the World Wide Web", Database, Vol. 19, No. 1, pp. 40-43, 46-49 (accepted by the Japan Science and Technology Information Center on February 13, 1996)

(58) Fields investigated (Int. Cl.6, DB name)
G06F 17/30
G06F 12/00 545

Claims

(57) [Claims]

1. A node/link search device comprising: search condition setting means for setting, as conditions that limit the search range when nodes are searched sequentially from an arbitrary node to its link destinations on the basis of link information that is contained in each node of a hypertext and indicates the name and storage location of a link destination node, a maximum value of the number of layers, which is the number of nodes existing from the node serving as the search starting point to the node being searched, and a character string, specified as the host name of the search range, for identifying the server machines to be searched; file collection means for repeatedly reading, in order from the node serving as the search starting point, the contents of the link destination node indicated by the link information from the server storing that node; node/link information storage means for storing, each time the contents of one node are read by the file collection means, the link information contained in that node and information indicating the correspondence between the link destination node and the node just read; number-of-layers calculating means for calculating, each time the contents of one node are read by the file collection means, the number of layers of the link destination node indicated by the link information contained in that node; search range limiting means for stopping the file collection means from reading the contents of nodes linked beyond the node just read when the number of layers obtained by the number-of-layers calculating means is greater than the maximum value of the number of layers set by the search condition setting means; host name partial match determining means for determining, when the number of layers obtained by the search range limiting means is less than or equal to the maximum value of the number of layers set by the search condition setting means, whether the character string specified as the host name of the search range and the host name of the node partially match, from the beginning, over the specified character string; and search range determining means for determining that the node is within the search range when the host name partial match determining means determines that a partial match has occurred.

2. A node/link search device comprising: search condition setting means for setting, as conditions that limit the search range when nodes are searched sequentially from an arbitrary node to its link destinations on the basis of link information that is contained in each node of a hypertext and indicates the name and storage location of a link destination node, a maximum value of the number of layers, which is the number of links existing from the node serving as the search starting point to the node being searched, and a character string, specified as the host name of the search range, for identifying the server machines to be searched; file collection means for repeatedly reading, in order from the node serving as the search starting point, the contents of the link destination node indicated by the link information from the server storing that node; node/link information storage means for storing, each time the contents of one node are read by the file collection means, the link information contained in that node and information indicating the correspondence between the link destination node and the node just read; number-of-layers calculating means for calculating, each time the contents of one node are read by the file collection means, the number of layers of the link to the link destination node represented by the link information contained in that node; search range limiting means for stopping the file collection means from reading the contents of nodes linked beyond the node just read when the number of layers obtained by the number-of-layers calculating means is greater than the maximum value of the number of layers set by the search condition setting means; host name partial match determining means for determining, when the number of layers obtained by the search range limiting means is less than or equal to the maximum value of the number of layers set by the search condition setting means, whether the character string specified as the host name of the search range and the host name of the node partially match, from the beginning, over the specified character string; and search range determining means for determining that the node is within the search range when the host name partial match determining means determines that a partial match has occurred.

3. The node/link search device according to claim 1, wherein the nodes to be searched are distributed and stored in a plurality of servers connected to a network.

4. The node/link search device according to claim 1, further comprising: same-node determining means for determining whether the link destination node indicated by the link information contained in the node just read is the same as a node that has already been read; and multiple-read stopping means for stopping the node from being read again when the same-node determining means determines that it is the same as a node that has already been read.