JP2001282856A

JP2001282856A - Index generation method, index display system, index retrieval method and index generation device

Info

Publication number: JP2001282856A
Application number: JP2000098981A
Authority: JP
Inventors: Osamu Torii; 修鳥井; Tatsunori Kanai; 達徳金井; Toshiki Kitsu; 俊樹岐津; Seiji Maeda; 誠司前田; Hirokuni Yano; 浩邦矢野; Hiroshi Yao; 浩矢尾
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2000-03-31
Filing date: 2000-03-31
Publication date: 2001-10-12

Abstract

PROBLEM TO BE SOLVED: To provide a method capable of generating an index for structured documents that is smaller than conventional indexes and enables efficient retrieval. SOLUTION: In a tree composed of a plurality of vertices and a plurality of edges that represents an XML document for which an index is to be generated, the start-point names and end-point names of edges, excluding edges that satisfy a predetermined condition, are interchanged arbitrarily; then a plurality of edges that share a common start point and have the same start-point name and the same end-point name are recursively merged into one. Next, the root vertices of the index data structures for a plurality of XML documents are merged into one. In the data structure whose root vertices have been merged, a plurality of edges that share a common start point and have the same start-point name and the same end-point name are likewise recursively merged into one. The indexes thus generated are stored edge by edge, and all edges having the same start-point name are stored in a contiguous region of a storage device.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to an index creation method and an index creation apparatus for creating an index, a data structure that is effective for quickly extracting necessary information from the information contained in structured documents. It also relates to an index display method for displaying the relevant information in response to an index browsing request, and to an index search method for retrieving the relevant information in response to an index search request.

[0002]

[Prior Art] Methods for extracting necessary information from data include, for example, scanning, searching, hashing, and indexing.

[0003] Scanning examines the data items one by one and extracts those that match the required information; it is the most primitive method. Searching arranges the data in advance according to a fixed rule and uses that rule to locate data; examples include block search and binary search. Hashing uses a function called a hash function to compute, from the content of the information sought, the location where that content is stored.

[0004] Indexing, on the other hand, prepares a table called an index separately from the data being searched and uses this table to search the data. A conventional index is a table that records pairs consisting of the content of the information sought and the location where that content is stored; examples include ISAM, VSAM, and B-TREE. An index is rarely a mere listing of such content/location pairs. In most cases it maintains a characteristic data structure so that, given the content, the location of that content can be obtained quickly (for example, a tree structure in which the depth from the root to each leaf and the size of each vertex vary little).

[0005] Relational databases are one of the fields in which research on indexes is well advanced. A relational database stores tables that consist of a vertical axis whose entries are called rows and a horizontal axis whose entries are called columns (each row is also called a tuple). FIG. 27 shows an example of a table. If an index is created for a particular column of a table, for example the column "first inventor", searches on that column can be performed at high speed. In the example of FIG. 27, the tuples that satisfy the condition can be found quickly in response to a search request such as "find the tuples whose first inventor is Toshiki Kitsu".

[0006] When the search target is a relational database, conventional indexing techniques are effective. However, the data that actually has to be searched is not always a relational database.

[0007] In recent years, structured documents such as HTML documents and XML documents have become standard document formats exchanged over communication channels such as the Internet and intranets (for XML documents see, for example, "Extensible Markup Language (XML) 1.0", W3C Recommendation 10-February-1998). A demand for obtaining necessary information from structured documents such as HTML and XML documents can be expected, but conventional indexing techniques are not effective in this case.

[0008] The difference between searching a relational database and searching structured documents stems from the difference between the structure of a table and the structure of a structured document.

[0009] FIGS. 28 and 29 show examples of structured documents. The example in FIG. 28 expresses "predetermined information relating to Japanese Patent Application No. Hei 8-232515" in XML document format. The example in FIG. 29 expresses "predetermined information relating to Japanese Patent Application No. Hei 8-47526" in XML document format. These two XML documents correspond to the two items of "predetermined information relating to a patent application" illustrated in FIG. 27 (although the amount of information included differs slightly between the examples).

[0010] Compare FIG. 27 and FIG. 28 with respect to "information on the first inventor". FIG. 27 has a flat structure in which each tuple's value corresponds to the "first inventor" column. In FIG. 28, by contrast, the "first inventor" element has a "name" element as a child element, the "name" element has a "family name" element and a "given name" element as child elements, the "family name" element has the character string "Kitsu" as a child string, and the "given name" element has the character string "Toshiki" as a child string. FIG. 28 thus has a hierarchical structure.

[0011] For example, when searching for "information on the first inventor" in FIG. 27, almost all searches take the form "find the tuples whose value in a given column is ××", such as "find the tuples whose first inventor is Toshiki Kitsu". It is therefore sufficient to prepare an index that stores, for the column named "first inventor", the column values and the corresponding tuples.

[0012] An XML document, by contrast, has a hierarchical structure, so there is more than one type of search. The search patterns for obtaining, from the character string "Kitsu", the structured documents that contain it are varied. They include "find the XML documents in which a family name is Kitsu", "find the XML documents in which the family name of a name is Kitsu", "find the XML documents in which the family name of the first inventor's name is Kitsu", and "find the XML documents in which the family name of the name of the first inventor of a published patent gazette is Kitsu".

[0013] Consider, therefore, a system that works in the same way as a relational database: for each hierarchy (corresponding to a column name in a relational database) it prepares an index that stores the values at that hierarchy (corresponding to column values) and the corresponding XML documents (corresponding to tuples). To return results quickly for all of the search requests above, such a system must prepare an index for the hierarchy "family name", an index for the hierarchy "name, family name", an index for the hierarchy "first inventor, name, family name", and an index for the hierarchy "published patent gazette, first inventor, name, family name".

[0014] Furthermore, if, in addition to the search requests above, one wishes to perform the search "find the XML documents in which the first inventor's name contains a family name", none of the four indexes prepared above can be used to perform it quickly. To perform this search at high speed, an index for the hierarchy "first inventor, name" must be newly prepared.

[0015] Moreover, this approach requires as many indexes as there are kinds of hierarchy to be searched, which is impractical in terms of index storage size.
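To make the growth described in paragraphs [0013] to [0015] concrete, the following sketch enumerates the hierarchies that could each need their own index under this approach. It assumes, purely for illustration, that a hierarchy is any contiguous run of element names along one root-to-leaf path; the function name `contiguous_subpaths` and the English element names are not from the source.

```python
def contiguous_subpaths(path):
    """Return every contiguous run of names along one root-to-leaf path."""
    return [tuple(path[i:j])
            for i in range(len(path))
            for j in range(i + 1, len(path) + 1)]


# Element path from the example of FIG. 28 (names translated for illustration).
path = ["published patent gazette", "first inventor", "name", "family name"]
hierarchies = contiguous_subpaths(path)

# The four hierarchies ending in "family name" listed in paragraph [0013].
ending_in_family_name = [h for h in hierarchies if h[-1] == "family name"]

# The additional hierarchy needed by the search of paragraph [0014].
extra = ("first inventor", "name")
```

Under this assumption a single path of four names already yields ten candidate hierarchies, and a real document contains many such paths.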

[0016]

[Problems to Be Solved by the Invention] As described above, with conventional indexing techniques it is difficult, when structured documents are the target, to create an index that responds effectively to search requests. Even when an index that can respond to search requests to some extent is created, it has problems in terms of data size and search speed.

[0017] The present invention has been made in view of the above circumstances. Its object is to provide an index creation method and an index creation apparatus that create an index of structured documents which is smaller in data size than conventional indexes and enables efficient searching, an index display method that displays the relevant information in response to an index browsing request, and an index search method that retrieves the relevant information in response to an index search request.

[0018]

[Means for Solving the Problems] The present invention is an index creation method for creating an index from one structured document represented by a structure consisting of a plurality of vertices and a plurality of edges. The vertices form hierarchical parent-child relationships in that every vertex other than the top vertex (root) has exactly one higher vertex as its parent and every vertex other than the lowest vertices has one or more lower vertices as its children. Each edge holds information relating to a pair of higher and lower vertices in a parent-child relationship (for example, the name of the start point and the name of the end point). The method determines whether the one target structured document contains a plurality of edges that have the same vertex on their higher side and hold information in a predetermined relationship (for example, edges that share the higher vertex and have the same start-point name and the same end-point name). If such edges exist, a sharing process is performed that merges the vertices on the lower side of those edges into one vertex and thereby shares the edges as a single edge, and an index is created that retains the structure of the structured document after the sharing process. The above is the case where an index is created for a single structured document only.

When an index is created for a plurality of structured documents, the top vertices of the structures representing the target structured documents are merged into one to form one new structure. In each of the target structured documents and in the new structure, it is determined whether there exist a plurality of edges that have the same vertex on their higher side and hold information in a predetermined relationship. If such edges exist, the sharing process merges the vertices on the lower side of those edges into one vertex and shares the edges as one, and an index is created that retains the structure of the structured documents after the sharing process. The sharing process for each of the target structured documents (here called the first sharing process), the similar sharing process for the new structure (here called the second sharing process), and the merging of the top vertices can be ordered in various ways. For example, they may be performed in the order first sharing process, merging of the top vertices, first sharing process; or in the order merging of the top vertices, first sharing process, second sharing process; or the first and second sharing processes may be performed together after the top vertices are merged.

Alternatively, when an index is created for a plurality of structured documents, the sharing process need not be performed on each of the target structured documents individually. Instead, the top vertices of the structured documents are merged into one to form one new structured document, the sharing process is performed on the new structured document for the edges that belonged to the original structured documents, and an index is created that retains the structure of the structured document after the sharing process.

[0019] The present invention is also an index creation apparatus for creating an index of a structured document represented by a structure consisting of a plurality of vertices, which form hierarchical parent-child relationships in that every vertex other than the top vertex has exactly one higher vertex as its parent and every vertex other than the lowest vertices has one or more lower vertices as its children, and a plurality of edges, each of which holds information relating to a pair of higher and lower vertices in a parent-child relationship. The apparatus comprises means for creating an index that retains a structure in which a plurality of edges in a fixed relationship are combined into one while the parent-child relationships in the structure of the target structured document are maintained.

[0020] The invention relating to the apparatus also holds as an invention relating to a method, and the invention relating to the method also holds as an invention relating to an apparatus.

[0021] The invention relating to the apparatus or the method also holds as a computer-readable recording medium on which is recorded a program for causing a computer to execute the procedure corresponding to the invention (or for causing a computer to function as the means corresponding to the invention, or for causing a computer to realize the functions corresponding to the invention).

[0022] In the present invention, an index is created that retains a structure in which a plurality of edges in a fixed relationship are combined into one while the parent-child relationships in the structure of the target structured document are maintained. This makes it possible to retrieve necessary information quickly from the information contained in structured documents. The index also requires less storage than a conventional index with the same search capability, which in turn contributes to search speed. Storing the parts of the index that have the same structure in a contiguous area of the index storage device contributes to a further improvement in search speed. In addition, according to the present invention, a search can be performed interactively, by dividing the search request into several steps, even when the structure of the structured documents is not known. When the structure is known, a faster search can be performed by specifying that structure in a single search request. The same index thus supports both forms of search.

[0023]

[Embodiments of the Invention] Embodiments of the invention are described below with reference to the drawings.

[0024] First, the basic configuration of the present invention and its embodiments is described.

[0025] The present invention targets documents that can be represented by a pair consisting of a set of data items called "vertices" and a set of data items called "edges".

[0026] Here, each vertex holds a name, and each edge consists of two vertices, one called the "start point" and one called the "end point". With exactly one exception, every vertex is the end point of exactly one edge.

[0027] For example, an XML document can be represented by a vertex set and an edge set as follows. The vertex set consists of the vertices that correspond one-to-one to the elements, attribute names, attribute values, and character strings, together with an exceptional vertex prepared separately from these. The edge set consists of the pairs in a parent-child relationship, namely the element vertex/element vertex pairs, the element vertex/attribute-name vertex pairs, the element vertex/character-string vertex pairs, and the attribute-name vertex/attribute-value vertex pairs, together with the pair of the exceptional vertex and the root element vertex.
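The mapping of paragraph [0027] can be sketched with Python's standard `xml.etree.ElementTree` parser. This is a minimal illustration rather than the patent's implementation: the function name `xml_to_tree`, the kind labels, and the sample document (a simplified stand-in, not the content of FIG. 28) are assumptions made here.

```python
import itertools
import xml.etree.ElementTree as ET


def xml_to_tree(xml_text):
    """Return (vertices, edges) for one XML document.

    vertices maps a vertex id to a (name, kind) pair; edges is a list of
    (start id, end id) pairs. Kinds follow the text: element, attribute name,
    attribute value, string, plus the one exceptional vertex.
    """
    ids = itertools.count()
    vertices, edges = {}, []

    def add_vertex(name, kind, parent=None):
        vid = next(ids)
        vertices[vid] = (name, kind)
        if parent is not None:
            edges.append((parent, vid))
        return vid

    def visit(elem, parent):
        eid = add_vertex(elem.tag, "element", parent)
        for attr_name, attr_value in elem.attrib.items():
            aid = add_vertex(attr_name, "attribute name", eid)
            add_vertex(attr_value, "attribute value", aid)
        if elem.text and elem.text.strip():
            add_vertex(elem.text.strip(), "string", eid)
        for child in elem:
            visit(child, eid)
            # Text that follows a child element still belongs to this element.
            if child.tail and child.tail.strip():
                add_vertex(child.tail.strip(), "string", eid)

    exception_vertex = add_vertex("root", "exception")
    visit(ET.fromstring(xml_text), exception_vertex)
    return vertices, edges


# Hypothetical sample document, used only to exercise the function.
doc = '<gazette number="8-232515"><inventor><name>Kitsu</name></inventor></gazette>'
vertices, edges = xml_to_tree(doc)
```

Each vertex except the exceptional one is added together with the edge from its parent, so the result satisfies the one-parent condition of paragraph [0026] by construction.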

[0028] The end point as seen from the start point is called the "child", and the start point as seen from the end point is called the "parent".

[0029] The statement that every vertex except the one exceptional vertex is the end point of exactly one edge means that every vertex except the exceptional vertex has exactly one parent.

[0030] The exceptional vertex is called the "root".

[0031] A data structure of this kind, consisting of a vertex set and an edge set, is called a rooted tree, and a document that can be represented by a rooted tree is called a "structured document".
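The definition in paragraphs [0026] to [0031] can be checked mechanically. The sketch below tests the one-parent condition stated in the text; the additional reachability check is an assumption added here to rule out cycles detached from the root, and the function name `is_rooted_tree` is illustrative.

```python
from collections import Counter


def is_rooted_tree(vertices, edges, root):
    """Check whether a vertex set and an edge set form a rooted tree.

    vertices: iterable of vertex ids; edges: iterable of (start, end) pairs.
    Every vertex except `root` must be the end point of exactly one edge,
    and `root` must be the end point of none.
    """
    vertices = set(vertices)
    edges = list(edges)
    if any(s not in vertices or e not in vertices for s, e in edges):
        return False
    in_degree = Counter(end for _, end in edges)
    if in_degree[root] != 0:
        return False
    if any(in_degree[v] != 1 for v in vertices - {root}):
        return False
    # Assumption beyond the text: every vertex must be reachable from the root.
    children = {}
    for s, e in edges:
        children.setdefault(s, []).append(e)
    seen, stack = {root}, [root]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen == vertices
```

For instance, `is_rooted_tree([0, 1, 2, 3], [(0, 1), (1, 2), (1, 3)], 0)` holds, while adding a second edge into vertex 2 makes the check fail because that vertex would then have two parents.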

[0032] The present invention relates to the creation, storage, browsing, and searching of an index, which is a data structure for quickly extracting necessary information from a plurality of structured documents.

[0033] The index is obtained by representing every structured document as a rooted tree and combining the roots of all the rooted trees into a single vertex, which creates one new, large rooted tree (an index may also contain only one structured document).
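The root-merging step of paragraph [0033] can be sketched as follows. The dict-based vertex representation and the function name `merge_roots` are assumptions made for this illustration, and the two sample trees are placeholders rather than the contents of FIGS. 28 and 29.

```python
def merge_roots(roots):
    """Combine the rooted trees of several documents under one new root.

    roots is a list of root vertices, each a dict
    {"name": ..., "children": [...]}. Every original root is the exceptional
    vertex, so its children are re-attached to the single shared root; the
    subtrees themselves are left unchanged.
    """
    merged = {"name": "root", "children": []}
    for root in roots:
        merged["children"].extend(root["children"])
    return merged


# Placeholder trees standing in for two documents.
x_tree = {"name": "root",
          "children": [{"name": "published patent gazette", "children": []}]}
y_tree = {"name": "root",
          "children": [{"name": "published patent gazette", "children": []}]}
index = merge_roots([x_tree, y_tree])
```

After this step the index still holds one "published patent gazette" edge per document.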

[0034] The most important feature of the rooted tree of the index is that it retains the hierarchy of the original structured documents.

[0035] For example, given a hierarchical search request such as "find the structured documents in which B is a child of A, C is a child of B, and D is a child of C", the structured documents that match the request can be found by following the hierarchy of the index's rooted tree in order, from parent vertex to child vertex or from child vertex to parent vertex. Because the rooted tree of the index retains the hierarchy of the original structured documents, once an index edge that does not match the search request is encountered during the search, there is no need to search beyond that edge. In the example above, if there is an edge in which B is a child of A but no edge in which C is a child of B, the search need not proceed beyond that edge.
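The parent-to-child traversal with early abandonment described in paragraph [0035] can be sketched as below. The helper names `node`, `follow`, and `search`, the dict-based vertices, and the sample tree are all assumptions made for this illustration.

```python
def node(name, *children):
    """Build a vertex as a plain dict; a helper for this example only."""
    return {"name": name, "children": list(children)}


def follow(vertex, names):
    """Return the vertices reached by following `names` downward from `vertex`.

    names[0] must match the vertex itself and each later name must match a
    child one level deeper. A branch is abandoned as soon as a name fails to
    match, so nothing below a mismatching edge is examined.
    """
    if vertex["name"] != names[0]:
        return []
    if len(names) == 1:
        return [vertex]
    hits = []
    for child in vertex["children"]:
        hits.extend(follow(child, names[1:]))
    return hits


def search(root, names):
    """Try every vertex of the tree as the starting point of the path."""
    hits, stack = [], [root]
    while stack:
        vertex = stack.pop()
        hits.extend(follow(vertex, names))
        stack.extend(vertex["children"])
    return hits


index = node("root",
             node("A", node("B", node("C", node("D"))), node("B", node("E"))),
             node("A", node("X")))
hits = search(index, ["A", "B", "C", "D"])
```

In the sample tree the branch A, B, E is dropped at E because E does not match C, which is the situation the paragraph describes.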

[0036] However, an index that merely combines the roots of the rooted trees of all the structured documents into one vertex is inadequate in both search speed and data structure size.

[0037] Therefore, starting from this index, an operation is performed (recursively) that combines into a single edge a plurality of edges in a fixed relationship, for example edges that share a start point and hold the same start-point name and the same end-point name. The operation may be applied within one structured document, across a plurality of structured documents, or both. In this way, edges that hold the same hierarchical information within a single structured document, and edges that hold the same hierarchical information in different structured documents, can be combined.

[0038] The rooted tree of the index finally obtained has the following features: (1) it retains the hierarchical structure of the structured documents; and (2) for any set of edges in the fixed relationship (for example, edges that share a start point and hold the same start-point name and the same end-point name), only one edge exists.

[0039] This has the following two effects.

[0040] (1) A search that had to examine a plurality of edges in the index before the transformation needs to examine only one edge in the index after the transformation, so search speed improves.

[0041] (2) A plurality of edges in the index before the transformation corresponds to only one edge in the index after the transformation, so the size of the data structure can be kept small.
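The recursive merging of paragraph [0037] and its effect on size can be sketched as follows. This is a simplified illustration under stated assumptions: vertices are plain dicts, the fixed relationship is taken to be "same start point, same start-point name, same end-point name", and the sketch does not record which document each edge came from, which a practical index would presumably need. The names `node`, `merge_edges`, and `count_edges` and the sample values are hypothetical.

```python
def node(name, *children):
    """Build a vertex as a plain dict; a helper for this example only."""
    return {"name": name, "children": list(children)}


def merge_edges(vertex):
    """Return a copy of the tree in which matching sibling edges are merged.

    All edges leaving `vertex` share their start point and start-point name,
    so two of them match exactly when their end-point names are equal. The
    children of merged end points are pooled and then merged recursively.
    """
    grouped = {}
    for child in vertex["children"]:
        grouped.setdefault(child["name"], []).extend(child["children"])
    return {
        "name": vertex["name"],
        "children": [merge_edges({"name": name, "children": pooled})
                     for name, pooled in grouped.items()],
    }


def count_edges(vertex):
    """Count the edges of the tree below `vertex`."""
    return sum(1 + count_edges(child) for child in vertex["children"])


def inventor_doc(family, given):
    return node("gazette",
                node("inventor",
                     node("name",
                          node("family name", node(family)),
                          node("given name", node(given)))))


# Two sample documents under one merged root (values are illustrative).
index = node("root",
             inventor_doc("Kitsu", "Toshiki"),
             inventor_doc("Kanai", "Tatsunori"))
merged = merge_edges(index)
```

In this example the 14 edges of the unmerged index collapse to 9, and every vertex of the result has at most one outgoing edge per end-point name, which corresponds to feature (2) of paragraph [0038].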

[0042] In addition, when the rooted tree of the index finally obtained is stored in the storage device, the edges with the same start-point name are stored together in a contiguous area of the storage device. As a result, an index search can start from a vertex other than the root, which improves search speed.
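One way to picture the layout of paragraph [0042] is a flat array of edges ordered by start-point name, with a table of offsets per name. In the sketch below a Python list stands in for the contiguous storage area; the functions `store_edges` and `edges_from`, the offset table, and the representation of an edge as a (start-point name, end-point name) pair are assumptions, not the patent's storage format.

```python
def store_edges(edges):
    """Lay edges out so that edges with the same start-point name are adjacent.

    edges is a list of (start-point name, end-point name) pairs. Returns the
    ordered list and a table mapping each start-point name to the half-open
    range [lo, hi) that its edges occupy in that list.
    """
    stored = sorted(edges, key=lambda edge: edge[0])  # stable sort
    offsets = {}
    for pos, (start_name, _end_name) in enumerate(stored):
        lo, _hi = offsets.get(start_name, (pos, pos))
        offsets[start_name] = (lo, pos + 1)
    return stored, offsets


def edges_from(stored, offsets, start_name):
    """Fetch every edge with the given start-point name in one slice."""
    lo, hi = offsets.get(start_name, (0, 0))
    return stored[lo:hi]


edges = [("root", "gazette"), ("name", "family name"),
         ("gazette", "inventor"), ("name", "given name"),
         ("inventor", "name")]
stored, offsets = store_edges(edges)
from_name = edges_from(stored, offsets, "name")
```

A search that begins at "name" can read its two edges directly from one slice without walking down from the root.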

[0043] The present embodiment is described in detail below, taking XML documents as an example of structured documents.

[0044] FIG. 1 shows a configuration example of a structured document/index processing system according to an embodiment of the present invention.

[0045] As shown in FIG. 1, this structured document/index processing system comprises an XML document input device 1, an index creation device 2, an index storage device 3, an index reading device 4, an index browsing device 5, and a search input/output device 6.

[0046] The XML document input device 1 is a device for inputting XML documents that are to be added to the index.

[0047] An XML document input from the XML document input device 1 is sent to the index creation device 2.

[0048] The index creation device 2 creates the index data structure from the XML document sent from the XML document input device 1 and stores it in the index storage device 3.

[0049] The index reading device 4 reads the index from the index storage device 3 in accordance with instructions from the index browsing device 5 or the search input/output device 6, and sends it to the index browsing device 5 or the search input/output device 6.

[0050] The index browsing device 5 accepts an index browsing request from the user, instructs the index reading device 4 to read the index, and displays the index sent from the index reading device 4.

[0051] The search input/output device 6 accepts a search request from the user, sends the search request to the index reading device 4, and displays the search result sent from the index reading device 4.

[0052] The configuration shown in FIG. 1 may also be implemented without the index browsing device 5 and/or the search input/output device 6.

[0053] Either the configuration shown in FIG. 1, or that configuration without the index browsing device 5 and/or the search input/output device 6, may additionally include a structured document storage device that stores all or part of the structured documents from which the index was created. Alternatively, the system may store no structured documents at all.

[0054] This system can be configured as a stand-alone apparatus using one or more computers, or as a server-client system in which a plurality of computers are connected via a network.

[0055] The sending referred to above covers both the case where the data itself is sent and the case where a pointer for accessing the data is sent.

[0056] In the following, the XML document illustrated in FIG. 28 and the XML document illustrated in FIG. 29 are used as concrete examples of structured documents. The structured document of FIG. 28 is named X.xml and the structured document of FIG. 29 is named Y.xml.

[0057] In addition, (E) denotes an element, (AN) an attribute name, (AV) an attribute value, (S) a character string, (WE) an arbitrary element, (WAN) an arbitrary attribute name, (WAV) an arbitrary attribute value, and (WS) an arbitrary character string.

【００５８】最初にＸＭＬ文書からインデックスを作成
する手順に関して説明する。First, a procedure for creating an index from an XML document will be described.

【００５９】ここでは、Ｘ．ｘｍｌ（図２８）からイン
デックスを作成する場合を例にとって説明する。Here, creating an index from X.xml (FIG. 28) is described as an example.

【００６０】まず、ＸＭＬ文書入力装置１からＸ．ｘｍ
ｌを入力する。First, X.xml is input from the XML document input device 1.

【００６１】Ｘ．ｘｍｌは、ＸＭＬ文書入力装置１か
ら、インデックス作成装置２に送信される。X.xml is transmitted from the XML document input device 1 to the index creation device 2.

【００６２】インデックス作成装置２では以下の手続き
が行なわれる。The following procedure is performed in the index creation device 2.

【００６３】まず、ＸＭＬ文書を木で表現する。First, an XML document is represented by a tree.

【００６４】図２に、ＸＭＬ文書Ｘ．ｘｍｌを表現した
木を例示する。FIG. 2 illustrates a tree representing the XML document X.xml.

【００６５】図２の各枝には、例えば、根公開特許公報（Ｅ）や、公開特許公報（Ｅ）出願番号（ＡＮ）のような２段の名前がふられている。Each branch in FIG. 2 carries a two-part name, for example "root / published patent publication (E)" or "published patent publication (E) / application number (AN)".

【００６６】前者は、『根』を始点名とし『公開特許公
報（Ｅ）』を終点名とする辺を意味し後者は、『公開特
許公報（Ｅ）』を始点名とし『出願番号（ＡＮ）』を終
点名とする辺を意味する。The former denotes the edge whose start point name is "root" and whose end point name is "published patent publication (E)"; the latter denotes the edge whose start point name is "published patent publication (E)" and whose end point name is "application number (AN)".

【００６７】なお、以下では、辺（根，公開特許公報
（Ｅ））は辺（公開特許公報（Ｅ），出願番号（Ａ
Ｎ））の親辺であると呼び、逆に、辺（公開特許公報
（Ｅ），出願番号（ＡＮ））は辺（根，公開特許公報
（Ｅ））の子（供）辺であると呼ぶ。つまり、一方の辺
の終点を、他方の辺が始点とするという関係にあれば、
該一方の辺は該他方の辺の親辺であり、該他方の辺は該
一方の辺の子供辺である。In the following, the edge (root, published patent publication (E)) is called the parent edge of the edge (published patent publication (E), application number (AN)); conversely, the edge (published patent publication (E), application number (AN)) is called a child edge of the edge (root, published patent publication (E)). In other words, when the end point of one edge is the start point of another, the former is the parent edge of the latter and the latter is a child edge of the former.

【００６８】また、以下では、親子関係にある辺のリス
トのことをパスと呼ぶ。例えば、図図２の木において、
（根，公開特許公報（Ｅ））（公開特許公報（Ｅ），出
願番号（ＡＮ））（出願番号（ＡＮ），特願平７−２３
２５１５（ＡＶ））はパスの一例である。In the following, a list of edges linked by parent-child relationships is called a path. For example, in the tree of FIG. 2, (root, published patent publication (E)) (published patent publication (E), application number (AN)) (application number (AN), Japanese Patent Application No. 7-232515 (AV)) is one such path.

【００６９】次に、対象とする木において、予め定めた
条件を満たす辺以外の辺の始点名と終点名を『任意
（＊）』で置き換える（名前の省略を行う）。Next, in the target tree, the start point names and end point names of all edges other than those satisfying a predetermined condition are replaced with "arbitrary (*)" (name omission).

【００７０】ここでは、一例として、図２の木におい
て、『発明の名称（Ｅ）』を始点名とする辺と、『筆頭
発明者（Ｅ）』を始点名とする辺と、『発明者（Ｅ）』
を始点名とする辺と、『名前（Ｅ）』を始点名とする辺
と、『名（Ｅ）』を始点名とする辺と、『姓（Ｅ）』を
始点名とする辺とを除いた辺の始点名と終点名を、『任
意（＊）』で置き換えるものとする。ただし、＊には、
Ｅ、ＡＮ、ＡＶ、Ｓのいずれかが入るものとし、『根』
に関しては置き換えを行なわないものとする。Here, as an example, in the tree of FIG. 2, the start point names and end point names of all edges are replaced with "arbitrary (*)", except for edges whose start point name is "title of invention (E)", "first inventor (E)", "inventor (E)", "name (E)", "given name (E)", or "surname (E)". Here * stands for one of E, AN, AV, or S, and "root" is never replaced.

【００７１】図３に、図２の木を上記のようにして置き
換えた結果の木を例示する。FIG. 3 illustrates a tree resulting from replacing the tree of FIG. 2 as described above.

【００７２】次に、対象とする木において、辺の共有を
（再帰的に）行う。Next, edges are shared (recursively) in the target tree.

【００７３】図３の木においては、辺（根，任意
（Ｅ））の終点を始点とする辺（任意（Ｅ），任意
（Ｅ））は８本存在する。これら８本の辺を１本にまと
めて新しい木を作成する。In the tree of FIG. 3, there are eight edges (arbitrary (E), arbitrary (E)) starting from the end point of the edge (root, arbitrary (E)). These eight edges are merged into one to create a new tree.

【００７４】図４に、図３の木において辺をまとめた結
果得られた木を例示する。FIG. 4 exemplifies a tree obtained as a result of grouping edges in the tree of FIG.

【００７５】図４の木においては、辺（任意（Ｅ），任
意（Ｅ））の終点を始点とする辺（任意（Ｅ），任意
（Ｓ））は２本存在し、辺（任意（Ｅ），任意（Ｅ））
の終点を始点とする辺（発明者（Ｅ），名前（Ｅ））は
４本存在する。それぞれの辺を１本にまとめ（終点は１
つに統合される）、新しい木を作成する。In the tree of FIG. 4, there are two edges (arbitrary (E), arbitrary (S)) starting from the end point of the edge (arbitrary (E), arbitrary (E)), and four edges (inventor (E), name (E)) starting from that same end point. Each group is merged into a single edge (its end points being unified into one vertex), creating a new tree.

【００７６】図５に、図４の木において辺をまとめた結
果得られた木を例示する。FIG. 5 exemplifies a tree obtained as a result of grouping edges in the tree of FIG.

【００７７】図５の木においては、辺（発明者（Ｅ），
名前（Ｅ））の終点を始点とする辺（名前（Ｅ），姓
（Ｅ））、辺（名前（Ｅ），名（Ｅ））はそれぞれ４本
存在する。これら４本の辺をそれぞれ１本にまとめ、新
しい木を作成する。In the tree of FIG. 5, there are four edges (name (E), surname (E)) and four edges (name (E), given name (E)) starting from the end point of the edge (inventor (E), name (E)). Each group of four is merged into a single edge, creating a new tree.

【００７８】図６に、図５の木において辺をまとめた結
果得られた木を例示する。FIG. 6 exemplifies a tree obtained as a result of grouping edges in the tree of FIG.

【００７９】この例では、これ以上まとめられる辺がな
いので、図６に示された木がＸ．ｘｍｌ（図２８）のイ
ンデックスのデータ構造である。In this example, no further edges can be merged, so the tree shown in FIG. 6 is the data structure of the index for X.xml (FIG. 28).
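
The recursive edge sharing walked through in FIGS. 3 to 6 can be sketched in a few lines of Python. This is an assumption-laden illustration, not the patented procedure itself: the `Vertex` class and `merge_edges` function are hypothetical names, and a vertex is modelled simply as a list of outgoing `(start_name, end_name, child_vertex)` triples.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    # Outgoing edges as (start_name, end_name, child Vertex) triples.
    edges: list = field(default_factory=list)


def merge_edges(vertex):
    """Merge outgoing edges with identical (start, end) names, recursively."""
    groups = {}
    for start, end, child in vertex.edges:
        merged = groups.setdefault((start, end), Vertex())
        merged.edges.extend(child.edges)     # end points are unified into one vertex
    # Unifying end points may expose new duplicates one level down, so recurse.
    vertex.edges = [(s, e, merge_edges(child)) for (s, e), child in groups.items()]
    return vertex


def inventor():
    return Vertex([("inventor(E)", "name(E)", Vertex())])


top = Vertex([("arbitrary(E)", "arbitrary(E)", inventor()),
              ("arbitrary(E)", "arbitrary(E)", inventor())])
root = merge_edges(Vertex([("root", "arbitrary(E)", top)]))
print(len(root.edges[0][2].edges))   # edges leaving the end point of (root, arbitrary(E))
```

Merging two sibling edges places their children under one shared end point, which is exactly what creates the new merge opportunities handled by the recursive call.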

【００８０】作成されたインデックスは、インデックス
作成装置２からインデックス格納装置３に送信され、イ
ンデックス格納装置３に格納される。The created index is transmitted from the index creation device 2 to the index storage device 3 and stored in the index storage device 3.

【００８１】作成されたインデックスは、辺の単位でイ
ンデックス格納装置３に格納する。その際に、辺の始点
の名前（根，任意（Ｅ），任意（ＡＶ），発明の名称
（Ｅ），筆頭発明者（Ｅ），発明者（Ｅ），名前
（Ｅ），姓（Ｅ），名（Ｅ））が同じである辺ごとに、
インデックス格納装置３の連続する領域に格納するよう
にしてもよい。The created index is stored in the index storage device 3 edge by edge. In doing so, the edges may be grouped by start point name (root, arbitrary (E), arbitrary (AV), title of invention (E), first inventor (E), inventor (E), name (E), surname (E), given name (E)), with each group stored in a contiguous region of the index storage device 3.
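
One way to picture the contiguous storage by start point name is the sketch below. It only models the layout as positions in a Python list; the function name `layout_by_start_name` and the `(offset, length)` region table are assumptions, and a real storage device would deal with byte offsets and record sizes that the description does not specify.

```python
from itertools import groupby


def layout_by_start_name(edges):
    """Order edges so that equal start names are adjacent; return the order and each region."""
    ordered = sorted(edges, key=lambda edge: edge[0])      # stable sort on start name
    regions = {}
    offset = 0
    for start, group in groupby(ordered, key=lambda edge: edge[0]):
        length = len(list(group))
        regions[start] = (offset, length)                  # region = (first slot, slot count)
        offset += length
    return ordered, regions


edges = [
    ("root", "arbitrary(E)"),
    ("arbitrary(E)", "arbitrary(E)"),
    ("inventor(E)", "name(E)"),
    ("arbitrary(E)", "arbitrary(AN)"),
]
ordered, regions = layout_by_start_name(edges)
first, count = regions["arbitrary(E)"]
print(ordered[first:first + count])    # all edges starting at arbitrary(E), read in one pass
```

With such a layout, every edge that leaves a vertex of a given name can be fetched with a single sequential read, which is the benefit the grouping is meant to provide.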

【００８２】図７に、図６の木をインデックス格納装置
３に格納した様子を例示する。FIG. 7 shows an example in which the tree of FIG. 6 is stored in the index storage device 3.

【００８３】図７において、四角で囲まれた領域の各々
がインデックス格納装置３上の連続領域を表し、親子の
関係にある辺は親辺の終点と子供辺の始点とを点線で結
んである。さらに、各辺には、この辺の始点および終点
の名前とともに、この辺がもともと所属していたＸＭＬ
文書の名前を付記してある。In FIG. 7, each boxed region represents a contiguous region of the index storage device 3, and edges in a parent-child relationship are shown with a dotted line joining the end point of the parent edge to the start point of the child edge. Each edge is annotated with its start point name and end point name, together with the name of the XML document to which the edge originally belonged.

【００８４】次に、インデックスが作成されていない状
態で、ＸＭＬ文書からインデックスを作成する場合に関
して、これまでの例で説明した各々の操作の処理手順を
説明する。Next, a description will be given of the processing procedure of each operation described in the examples up to this point in the case where an index is created from an XML document with no index created.

【００８５】図８に、この場合の処理手順の一例を示
す。FIG. 8 shows an example of the processing procedure in this case.

【００８６】（１）ＸＭＬ文書入力装置１から、インデ
ックスを作成しようとしているＸＭＬ文書を、入力する
（ステップＳ１）。(1) An XML document whose index is to be created is input from the XML document input device 1 (step S1).

【００８７】（２）ＸＭＬ文書入力装置１から入力され
たＸＭＬ文書を、インデックス作成装置２に、送信する
（ステップＳ２）。(2) The XML document input from the XML document input device 1 is transmitted to the index creation device 2 (step S2).

【００８８】（３）ＸＭＬ文書中にＸＭＬ文書からイン
デックスを作成する際に名前を省略するべき辺が含まれ
ているかどうか判断する（ステップＳ３）。そのような
辺が含まれている場合には（４）へ進み、そのような辺
が含まれていない場合には（５）へ進む。ＸＭＬ文書の
辺の名前を省略するかどうかは辺の名前（始点名と終点
名の組）によって決まるものであり、同一文書内に、同
一の名前の辺が複数ある場合には、すべての辺に関して
名前を保持するか、またはすべての辺に関して名前を省
略するかのいずれかである。例えば、図２において、
（公開特許公報（Ｅ），筆頭発明者（Ｅ））という名前
の辺は４本存在するが、インデックスにおいてこれらの
辺のうち一部の名前を保持し、一部の名前を省略すると
いうことは許されない（この例では４つの辺の名前のす
べてが省略されている）。また、（筆頭発明者（Ｅ），
名前（Ｅ））という名前の辺は４本存在するが、インデ
ックスにおいてこれらの辺のうち一部の名前を保持し、
一部の名前を省略することも許されない（この例では４
つの辺の名前すべてが保持されている）。(3) It is determined whether the XML document contains edges whose names should be omitted when the index is created (step S3). If it does, the process proceeds to (4); otherwise it proceeds to (5). Whether an edge's name is omitted is determined by the edge's name (the pair of start point name and end point name): when a document contains several edges with the same name, either all of them keep their names or all of them have their names omitted. For example, FIG. 2 contains four edges named (published patent publication (E), first inventor (E)); the index may not keep the names of some of these and omit the names of others (in this example, all four names are omitted). Likewise, there are four edges named (first inventor (E), name (E)), and again the index may not keep some names and omit others (in this example, all four names are kept).

【００８９】（４）省略を行なうべきすべての辺に関し
て、始点、終点名を別の名前で置き換える（ステップＳ
４）。具体的には、始点（終点）名が（Ｅ）で終わる場
合（つまり対応する頂点がエレメントである場合）に
は、『任意（Ｅ）』で置き換え、始点（終点）名が（Ａ
Ｎ）で終わる場合（つまり対応する頂点が属性名である
場合）には、『任意（ＡＮ）』で置き換え、始点（終
点）名が（ＡＶ）で終わる場合（つまり対応する頂点が
属性値である場合）には、『任意（ＡＶ）』で置き換
え、始点（終点）名が（Ｓ）で終わる場合（つまり対応
する頂点が文字列である場合）には、『任意（Ｓ）』で
置き換える。(4) For every edge whose name is to be omitted, the start point name and end point name are replaced with other names (step S4). Specifically, a start (or end) point name ending in (E), meaning the corresponding vertex is an element, is replaced with "arbitrary (E)"; one ending in (AN), an attribute name, is replaced with "arbitrary (AN)"; one ending in (AV), an attribute value, is replaced with "arbitrary (AV)"; and one ending in (S), a character string, is replaced with "arbitrary (S)".

【００９０】（５）ＸＭＬ文書からインデックスを作成
する際に、ＸＭＬ文書内で辺の共有を行なうかどうか判
断する（ステップＳ５）。辺の共有を行なう場合には
（６）へ進み、共有を行なわない場合には（７）へ進
む。(5) It is determined whether edges are to be shared within the XML document when the index is created (step S5). If so, the process proceeds to (6); otherwise it proceeds to (7).

【００９１】（６）始点を共有し、同一の始点名、同一
の終点名を持つ辺が複数存在する場合には、それらの辺
を１本にまとめる（ステップＳ６）。（終点を１つに統
合し）辺をまとめた結果、新たに始点を共有し、同一の
始点名、同一の終点名を持つ辺が複数現われる場合もあ
り得るが、それらに関しても再帰的に１本にまとめ、そ
れ以上まとめられる辺がなくなるまで、辺をまとめる。(6) If several edges share a start point and have the same start point name and the same end point name, they are merged into one (step S6). Merging edges (and unifying their end points) may produce further edges that share a start point and have the same start point name and end point name; these are also merged recursively, until no mergeable edges remain.

【００９２】（７）インデックス作成装置２で作成され
たインデックスのデータ構造を、インデックス格納装置
３に転送し、インデックス格納装置３に格納する（ステ
ップＳ７）。(7) The data structure of the index created by the index creation device 2 is transferred to the index storage device 3 and stored in the index storage device 3 (step S7).

【００９３】インデックス格納装置３にインデックスを
格納する際には、インデックスの辺の単位での格納を行
なう。When the index is stored in the index storage device 3, it is stored edge by edge.

【００９４】各辺が保持する情報は以下の通りである。（ａ）始点名（ｂ）終点名（ｃ）（必要があれば）子供辺へのポインタ（ｄ）（必要があれば）親辺へのポインタ（ｅ）（必要があれば）この辺に対応するＸＭＬ文書の
名前ただし、インデックスの木構造を保持するためには
（ｃ）と（ｄ）の両方または一方の情報を保持する必要
がある。Each edge holds the following information:
(a) start point name
(b) end point name
(c) pointers to child edges (if necessary)
(d) pointer to the parent edge (if necessary)
(e) name of the XML document corresponding to this edge (if necessary)
However, to preserve the tree structure of the index, at least one of (c) and (d) must be held.
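
The per-edge record listed above might look like the following Python sketch. The class name `EdgeRecord`, the field names, and the helper `add_child` are hypothetical; in-memory object references stand in for the "pointers" of items (c) and (d), and a set of document names stands in for item (e).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)            # identity comparison avoids recursing through parent/child links
class EdgeRecord:
    start_name: str                                   # (a) start point name
    end_name: str                                     # (b) end point name
    children: list = field(default_factory=list)      # (c) pointers to child edges
    parent: Optional["EdgeRecord"] = None             # (d) pointer to the parent edge
    documents: set = field(default_factory=set)       # (e) source document names (or URLs)


def add_child(parent, child):
    """Link a child edge under a parent edge in both directions."""
    parent.children.append(child)
    child.parent = parent
    return child


root_edge = EdgeRecord("root", "arbitrary(E)", documents={"X.xml"})
attr_edge = add_child(root_edge, EdgeRecord("arbitrary(E)", "arbitrary(AN)", documents={"X.xml"}))
print(attr_edge.parent.start_name, [c.end_name for c in root_edge.children])
```

Keeping both (c) and (d) makes traversal possible in either direction; holding only one of them still preserves the tree structure, as the description notes.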

【００９５】また、ここで作成されたインデックスに情
報が含まれるＸＭＬ文書はただ１つであるが、１つのイ
ンデックスにはただ１つのＸＭＬ文書のみ含まれるよう
にする構成と、後述するように１つのインデックスに複
数のＸＭＬ文書の情報を混在可能とする構成とがある。
１つのインデックスに含まれるＸＭＬ文書を常に１つと
する場合、あるいは１つのインデックスに含まれるＸＭ
Ｌ文書は１または複数であるが当該インデックスには新
たに別のＸＭＬ文書の情報を追加しない場合には、各辺
がそれぞれ対応するＸＭＬ文書の名前を保持する必要は
ない。しかし、１つのインデックスに複数のＸＭＬ文書
の情報が混在することを前提とする場合には、インデッ
クスに含まれる各辺がそれぞれどのＸＭＬ文書に対応す
るかを（ｅ）の情報として保持しておく必要がある。The index created here contains information from only one XML document. A system may be configured so that one index always contains exactly one XML document, or, as described later, so that one index can contain information from several XML documents. If an index always contains a single XML document, or if it contains one or more XML documents but no further XML document will be added to it, the edges need not hold the name of their corresponding XML document. If, however, information from several XML documents is expected to coexist in one index, each edge must hold, as item (e), which XML document it corresponds to.

【００９６】なお、（ｅ）は、例えばＵＲＬなどのよう
にＸＭＬ文書へのアクセス方法を示す情報等、ＸＭＬ文
書の名前以外のものを保持してもよい。Note that item (e) may hold something other than the name of the XML document, for example information indicating how to access the XML document, such as a URL.

【００９７】また、前述したように、辺を格納する際に
は、始点名が等しい辺を記憶装置の連続する領域に格納
するようにしてもよい。As described above, when edges are stored, edges with the same start point name may be stored in a contiguous region of the storage device.

【００９８】続いて、構造化文書のインデックスがイン
デックス格納装置３に格納されている状態で、このイン
デックスに他の構造化文書の情報を追加する手順に関し
て説明する。Next, a description will be given of a procedure for adding information of another structured document to an index of a structured document in a state where the index is stored in the index storage device 3.

【００９９】ここでは、Ｘ．ｘｍｌ（図２８）のインデ
ックスがインデックス格納装置３に格納されている状態
で、このインデックスにＹ．ｘｍｌ（図２９）の情報を
追加する場合を例にとって説明する。Here, the case is described in which the index of X.xml (FIG. 28) is already stored in the index storage device 3 and the information of Y.xml (FIG. 29) is added to that index.

【０１００】まず、ＸＭＬ文書入力装置１からＹ．ｘｍ
ｌを入力する。First, Y.xml is input from the XML document input device 1.

【０１０１】Ｙ．ｘｍｌは、ＸＭＬ文書入力装置１か
ら、インデックス作成装置２に送信される。Y.xml is transmitted from the XML document input device 1 to the index creation device 2.

【０１０２】インデックス作成装置２では以下の手続き
が行なわれる。The following procedure is performed in the index creation device 2.

【０１０３】まず、ＸＭＬ文書を木で表現する。First, an XML document is represented by a tree.

【０１０４】図９に、ＸＭＬ文書Ｙ．ｘｍｌを表現した
木を例示する。FIG. 9 illustrates a tree representing the XML document Y.xml.

【０１０５】次に、対象とするＸＭＬ文書を表現した木
において、予め定めた特徴を持つ辺以外の辺の始点の名
前と終点の名前を『任意（＊）』で置き換える（名前の
省略を行う）。Next, in the tree representing the target XML document, the start point names and end point names of all edges other than those having predetermined characteristics are replaced with "arbitrary (*)" (name omission).

【０１０６】図９の木において、先程の例と同様に、
『発明の名称（Ｅ）』を始点とする辺と、『筆頭発明者
（Ｅ）』を始点とする辺と、『発明者（Ｅ）』を始点と
する辺と、『名前（Ｅ）』を始点とする辺と、『名
（Ｅ）』を始点とする辺と、『姓（Ｅ）』を始点とする
辺以外の辺の始点の名前と終点の名前を、『任意
（＊）』で置き換える。また、先程の例と同様に、＊に
は、Ｅ，ＡＮ，ＡＶ，Ｓのいずれかが入るものとし、
『根』に関しては置き換えを行なわないものとする。In the tree of FIG. 9, as in the preceding example, the start point names and end point names of all edges are replaced with "arbitrary (*)", except for edges starting from "title of invention (E)", "first inventor (E)", "inventor (E)", "name (E)", "given name (E)", or "surname (E)". As before, * stands for one of E, AN, AV, or S, and "root" is never replaced.

【０１０７】図１０に、図９の木を上記のようにして置
き換えた結果の木を例示する。FIG. 10 illustrates a tree resulting from replacing the tree of FIG. 9 as described above.

【０１０８】次に、対象とするＸＭＬ文書を表現した木
において、辺の共有を（再帰的に）行う。Next, edges are shared (recursively) in the tree representing the target XML document.

【０１０９】すなわち、図１０の木において、Ｘ．ｘｍ
ｌからインデックスを作った際と同様に、始点を共有
し、同一の始点名、同一の終点名を持つ辺が複数存在す
る場合には、それらの辺を１本にまとめる、という操作
を再帰的に繰り返して新しい木を作成する。That is, in the tree of FIG. 10, just as when the index was created from X.xml, whenever several edges share a start point and have the same start point name and the same end point name, they are merged into one; this operation is repeated recursively to create a new tree.

【０１１０】図１１に、図１０の木において辺を（再帰
的に）まとめた結果得られた木を例示する。FIG. 11 exemplifies a tree obtained as a result of grouping (recursively) edges in the tree of FIG.

【０１１１】図１１に示された木がＹ．ｘｍｌのインデ
ックスのデータ構造である。The tree shown in FIG. 11 is the data structure of the index for Y.xml.

【０１１２】次に、先に作成され格納されたＸ．ｘｍｌ
（図２８）のインデックスを、インデックス格納装置３
から読み出し（読み出したデータ構造は図６に示した通
りである）、図１１の木の『根』頂点と図６の『根』頂
点とを１つにまとめて、新しいデータ構造を作成する。Next, the previously created and stored index of X.xml (FIG. 28) is read from the index storage device 3 (the data structure read is the one shown in FIG. 6), and the "root" vertex of the tree of FIG. 11 and the "root" vertex of FIG. 6 are unified into one, creating a new data structure.

【０１１３】図１２に、この結果得られた木を示す。FIG. 12 shows the tree obtained as a result.

【０１１４】さらに、この１つにまとめた新しいデータ
構造の木において、辺の共有を（再帰的に）行う。Further, in the tree of the new data structure that has been put together, sharing of edges is performed (recursively).

【０１１５】図１２の木において、始点を共有し、同一
の始点名、同一の終点名を持つ辺が複数存在する場合に
は、それらの辺を１本にまとめる、という操作を再帰的
に繰り返して新しい木を作成する。In the tree of FIG. 12, whenever several edges share a start point and have the same start point name and the same end point name, they are merged into one; this operation is repeated recursively to create a new tree.
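
Unifying the two "root" vertices and then sharing edges across documents can be sketched as a recursive merge of two nested dictionaries. This is an illustration under stated assumptions: each index is modelled as a mapping from `(start_name, end_name)` to a hypothetical entry `{"docs": set, "children": dict}`, and each input index is assumed to have already been merged internally, so that edge names are unique under every vertex.

```python
def merge_index(target, other):
    """Merge index `other` into `target`, unifying edges with equal names level by level."""
    for name, entry in other.items():
        mine = target.setdefault(name, {"docs": set(), "children": {}})
        mine["docs"] |= entry["docs"]                 # remember every source document
        merge_index(mine["children"], entry["children"])
    return target


x_index = {("root", "arbitrary(E)"): {
    "docs": {"X.xml"},
    "children": {("arbitrary(E)", "arbitrary(AN)"): {"docs": {"X.xml"}, "children": {}}},
}}
y_index = {("root", "arbitrary(E)"): {
    "docs": {"Y.xml"},
    "children": {("arbitrary(E)", "arbitrary(E)"): {"docs": {"Y.xml"}, "children": {}}},
}}

merged = merge_index(x_index, y_index)
print(sorted(merged[("root", "arbitrary(E)")]["docs"]))
```

Treating both top-level dictionaries as the outgoing edges of one shared "root" is what corresponds to unifying the root vertices; the recursion then plays the role of the repeated edge merging.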

【０１１６】図１３に、図１２の木において辺を（再帰
的に）まとめた結果得られた木を例示する。FIG. 13 illustrates a tree obtained as a result of (recursively) grouping the edges in the tree of FIG.

【０１１７】作成されたインデックス（新たな構造化文
書に対応する情報が付け加えられたインデックス）は、
インデックス作成装置２からインデックス格納装置３に
送信され、インデックス格納装置３に格納される。The resulting index (the index to which information corresponding to the new structured document has been added) is transmitted from the index creation device 2 to the index storage device 3 and stored there.

【０１１８】先程の例と同様に、格納を行なう際には、
辺の単位でインデックス格納装置３に格納し、その際
に、辺の始点名（根，任意（Ｅ），任意（ＡＮ），発明
の名称（Ｅ），筆頭発明者（Ｅ），発明者（Ｅ），名前
（Ｅ），姓（Ｅ），名（Ｅ））が同じである辺ごとにイ
ンデックス格納装置３の連続する領域に格納するように
してもよい。As in the preceding example, the index is stored in the index storage device 3 edge by edge, and the edges may be grouped by start point name (root, arbitrary (E), arbitrary (AN), title of invention (E), first inventor (E), inventor (E), name (E), surname (E), given name (E)), with each group stored in a contiguous region of the index storage device 3.

【０１１９】図１４に、図１３の木をインデックス格納
装置３に格納した様子を例示する。FIG. 14 shows an example in which the tree of FIG. 13 is stored in the index storage device 3.

【０１２０】先程の例と同様に、図１４において、四角
で囲まれた範囲内の領域がインデックス格納装置３上の
連続領域を表し、親子の関係にある辺は親辺の終点と子
供辺の始点とを点線で結んである。さらに、各辺には、
この辺の始点名および終点名とともに、この辺がもとも
と所属していたＸＭＬ文書の名前を付記してある。As in the preceding example, each boxed region in FIG. 14 represents a contiguous region of the index storage device 3, and edges in a parent-child relationship are shown with a dotted line joining the end point of the parent edge to the start point of the child edge. Each edge is annotated with its start point name and end point name, together with the name of the XML document to which the edge originally belonged.

【０１２１】次に、既にインデックスが作成されている
状態で、インデックスに新たにＸＭＬ文書の情報を追加
する場合に関して、これまでの例で説明した各々の操作
の処理手順を説明する。Next, a description will be given of the processing procedure of each operation described in the examples up to this point in a case where information of an XML document is newly added to an index in a state where an index has already been created.

【０１２２】図１５に、この場合の処理手順の一例を示
す。FIG. 15 shows an example of the processing procedure in this case.

【０１２３】（１）ＸＭＬ文書入力装置１から、インデ
ックスを作成しようとしているＸＭＬ文書を、入力する
（ステップＳ１０１）。(1) An XML document whose index is to be created is input from the XML document input device 1 (step S101).

【０１２４】（２）ＸＭＬ文書入力装置１から入力され
たＸＭＬ文書を、インデックス作成装置２に、送信する
（ステップＳ１０２）。(2) The XML document input from the XML document input device 1 is transmitted to the index creation device 2 (step S102).

【０１２５】（３）ＸＭＬ文書中にＸＭＬ文書からイン
デックスを作成する際に名前を省略するべき辺が含まれ
ているかどうか判断する（ステップＳ１０３）。そのよ
うな辺が含まれている場合には（４）へ進み、そのよう
な辺が含まれていない場合には（５）へ進む。ＸＭＬ文
書の辺の名前を省略するかどうかは辺の名前（始点名と
終点名の組）によって決まるものであり、同一文書内
に、同一の名前の辺が複数ある場合には、すべての辺に
関して名前を保持するか、またはすべての辺に関して名
前を省略するかのいずれかである。また、インデックス
格納装置３内に格納されているインデックスにおいて保
持されている辺は、新たにインデックスに追加するＸＭ
Ｌ文書においても保持しなければならないし、インデッ
クス格納装置３内に格納されているインデックスにおい
て省略されている辺は、新たにインデックスに追加する
ＸＭＬ文書においても省略しなければならない。例え
ば、図９において、（公開特許公報（Ｅ），筆頭発明者
（Ｅ））という名前の辺は４本存在するが、この辺は図
７に示されているインデックスにおいて省略されている
ので、これらの辺はすべて省略しなければならない。ま
た、（筆頭発明者（Ｅ），名前（Ｅ））という名前の辺
は４本存在するが、この辺は図７に示されているインデ
ックスにおいて保持されているので、これらの辺はすべ
て保持しなければならない。(3) It is determined whether the XML document contains edges whose names should be omitted when the index is created (step S103). If it does, the process proceeds to (4); otherwise it proceeds to (5). Whether an edge's name is omitted is determined by the edge's name (the pair of start point name and end point name): when a document contains several edges with the same name, either all of them keep their names or all of them have their names omitted. In addition, an edge name that is kept in the index stored in the index storage device 3 must also be kept in the XML document being added, and an edge name that is omitted in the stored index must also be omitted in the XML document being added. For example, FIG. 9 contains four edges named (published patent publication (E), first inventor (E)); because this edge name is omitted in the index shown in FIG. 7, all four must be omitted. Likewise, there are four edges named (first inventor (E), name (E)); because this edge name is kept in the index shown in FIG. 7, all four must be kept.

【０１２６】（４）省略を行なうべきすべての辺に関し
て、始点、終点名を別の名前で置き換える（ステップＳ
１０４）。具体的には、始点（終点）名が（Ｅ）で終わ
る場合（つまり対応する頂点がエレメントである場合）
には、『任意（Ｅ）』で置き換え、始点（終点）名が
（ＡＮ）で終わる場合（つまり対応する頂点が属性名で
ある場合）には、『任意（ＡＮ）』で置き換え、始点
（終点）名が（ＡＶ）で終わる場合（つまり対応する頂
点が属性値である場合）には、『任意（ＡＶ）』で置き
換え、始点（終点）名が（Ｓ）で終わる場合（つまり対
応する頂点が文字列である場合）には、『任意（Ｓ）』
で置き換える。(4) For every edge whose name is to be omitted, the start point name and end point name are replaced with other names (step S104). Specifically, a start (or end) point name ending in (E), meaning the corresponding vertex is an element, is replaced with "arbitrary (E)"; one ending in (AN), an attribute name, is replaced with "arbitrary (AN)"; one ending in (AV), an attribute value, is replaced with "arbitrary (AV)"; and one ending in (S), a character string, is replaced with "arbitrary (S)".

【０１２７】（５）ＸＭＬ文書からインデックスを作成
する際に、ＸＭＬ文書内で辺の共有を行なうかどうか判
断する（ステップＳ１０５）。辺の共有を行なう場合に
は（６）へ進み、共有を行なわない場合には（７）へ進
む。(5) It is determined whether edges are to be shared within the XML document when the index is created (step S105). If so, the process proceeds to (6); otherwise it proceeds to (7).

【０１２８】（６）始点を共有し、同一の始点名、同一
の終点名を持つ辺が複数存在する場合には、それらの辺
を１本にまとめる（ステップＳ１０６）。辺をまとめた
結果、新たに始点を共有し、同一の始点名、同一の終点
名を持つ辺が複数現われる場合もあり得るが、それらに
関しても再帰的に１本にまとめ、これ以上まとめられる
辺がなくなるまで、辺をまとめる。(6) If several edges share a start point and have the same start point name and the same end point name, they are merged into one (step S106). Merging may produce further edges that share a start point and have the same start point name and end point name; these are also merged recursively, until no mergeable edges remain.

【０１２９】（７）インデックス格納装置３からインデ
ックスを読み出し、読み出したインデックスのデータ構
造の『根』頂点と、前のステップまでで得られた木の
『根』頂点を１つにまとめ、新しい木を作成する（ステ
ップＳ１０７）。(7) The index is read from the index storage device 3, and the "root" vertex of its data structure and the "root" vertex of the tree obtained in the preceding steps are unified into one, creating a new tree (step S107).

【０１３０】（８）複数のＸＭＬ文書からインデックス
を作成する際に、複数のＸＭＬ文書間で辺の共有を行な
うかどうか判断する（ステップＳ１０８）。辺の共有を
行なう場合には（９）へ進み、共有を行なわない場合に
は（１０）へ進む。(8) It is determined whether edges are to be shared across the XML documents when an index is created from several XML documents (step S108). If so, the process proceeds to (9); otherwise it proceeds to (10).

【０１３１】（９）始点を共有し、同一の始点名、同一
の終点名を持ち、異なる構造化文書に属する複数存在す
る場合には、それらの辺を１本にまとめる（ステップＳ
１０９）。辺をまとめた結果、新たに始点を共有し、同
一の始点名、同一の終点名を持つ辺が複数現われる場合
もあり得るが、それらに関しても再帰的に１本にまと
め、これ以上まとめられる辺がなくなるまで、辺をまと
める。(9) If there are several edges that share a start point, have the same start point name and the same end point name, and belong to different structured documents, those edges are merged into one (step S109). Merging may produce further edges that share a start point and have the same start point name and end point name; these are also merged recursively, until no mergeable edges remain.

【０１３２】（１０）インデックス作成装置２で作成さ
れたインデックスのデータ構造を、インデックス格納装
置３に転送し、インデックス格納装置３に格納する（ス
テップＳ１１０）。(10) The data structure of the index created by the index creation device 2 is transferred to the index storage device 3 and stored in the index storage device 3 (step S110).

【０１３３】インデックス格納装置３にインデックスを
格納する際には、インデックスの辺の単位での格納を行
なう。When the index is stored in the index storage device 3, it is stored edge by edge.

【０１３４】各辺が保持する情報は以下の通りである。（ａ）始点名（ｂ）終点名（ｃ）（必要があれば）子供辺へのポインタ（ｄ）（必要があれば）親辺へのポインタ（ｅ）この辺に対応するＸＭＬ文書の名前ただし、前述と同様に、インデックスの木構造を保持す
るためには（ｃ）と（ｄ）の両方または一方の情報を保
持する必要がある。Each edge holds the following information:
(a) start point name
(b) end point name
(c) pointers to child edges (if necessary)
(d) pointer to the parent edge (if necessary)
(e) name of the XML document corresponding to this edge
As before, to preserve the tree structure of the index, at least one of (c) and (d) must be held.

【０１３５】また、ここでは、１つのインデックスに複
数のＸＭＬ文書の情報が混在することになるので、イン
デックスに含まれる各辺がそれぞれどのＸＭＬ文書に対
応するかを（ｅ）の情報として保持しておく必要があ
る。Here, because information from several XML documents coexists in one index, each edge in the index must hold, as item (e), which XML document it corresponds to.

【０１３６】なお、前述したように、（ｅ）は、例えば
ＵＲＬなどのようにＸＭＬ文書へのアクセス方法を示す
情報等、ＸＭＬ文書の名前以外のものを保持してもよ
い。As described above, item (e) may hold something other than the name of the XML document, for example information indicating how to access the XML document, such as a URL.

【０１３７】また、前述したように、辺を格納する際に
は、始点名が等しい辺を記憶装置の連続する領域に格納
するようにしてもよい。Also as described above, when edges are stored, edges with the same start point name may be stored in a contiguous region of the storage device.

【０１３８】なお、ここでは、１つのＸＭＬ文書の情報
を含むインデックスにもう１つのＸＭＬ文書の情報を付
加して２つのＸＭＬ文書の情報を含むインデックスとす
る場合について説明したが、２以上のＸＭＬ文書の情報
を含むインデックスにもう１つのＸＭＬ文書の情報を付
加する場合も、１つのＸＭＬ文書の情報を含むインデッ
クスにもう１つのＸＭＬ文書の情報を付加する場合と同
様の処理を逐次行えばよい。また、インデックスに１つ
１つＸＭＬ文書の情報を付加していくのではなく、複数
のＸＭＬ文書の情報を含むインデックスと、複数のＸＭ
Ｌ文書の情報を含むインデックスとをこれまで説明した
ように処理してまとめることも可能である。Here, the case has been described in which information from another XML document is added to an index containing information from one XML document, yielding an index containing information from two XML documents. To add another XML document to an index that already contains two or more XML documents, the same processing may simply be applied one document at a time. Instead of adding XML documents to an index one at a time, it is also possible to combine one index containing information from several XML documents with another such index by the processing described above.

【０１３９】また、３以上の単一のＸＭＬ文書の情報を
一括して１つのインデックスにまとめることも、複数の
ＸＭＬ文書の情報を含む３以上のインデックスを一括し
てまとめることも、同様に可能である。Likewise, information from three or more individual XML documents can be combined into one index in a single operation, and three or more indexes each containing information from several XML documents can be combined in a single operation.

【０１４０】次に、本実施形態のインデックス作成処理
のバリエーションについて説明する。Next, a variation of the index creation processing of the present embodiment will be described.

【０１４１】まず、１つのインデックスにはただ１つの
ＸＭＬ文書の情報のみ含ませる（複数のＸＭＬ文書の情
報はまとめない）とした場合に、図２〜図６の手順例で
は、名前の省略を行い且つ辺の共有を行うものとした
が、（名前の省略を行い且つ辺の共有を行わないように
してもよいし、）名前の省略を行わず且つ辺の共有を行
うようにしてもよい。First, when one index contains information from only one XML document (information from several XML documents is not combined), the example procedure of FIGS. 2 to 6 performs both name omission and edge sharing; however, name omission may be performed without edge sharing, or edge sharing may be performed without name omission.

【０１４２】また、１つのインデックスには複数のＸＭ
Ｌ文書の情報を含ませるとした場合に、図２〜図６およ
び図９〜図１３の手順例では、各ＸＭＬ文書内において
名前の省略を行い且つ辺の共有を行い且つ複数のＸＭＬ
文書間においても辺の共有を行うものとしたが、複数の
ＸＭＬ文書間においても辺の共有を行う場合に各ＸＭＬ
文書内における名前の省略と辺の共有との一方又は両方
を行わないようにしてもよい。When one index contains information from several XML documents, the example procedures of FIGS. 2 to 6 and FIGS. 9 to 13 perform name omission and edge sharing within each XML document and also perform edge sharing across the XML documents; however, when edges are shared across XML documents, either or both of name omission and edge sharing within each XML document may be skipped.

【０１４３】次に、インデックス読み出し装置４を介し
て、インデックス格納装置３に格納されたインデックス
を用いてＸＭＬ文書の内容を閲覧する場合の手順につい
て説明する。Next, a procedure for browsing the contents of an XML document by using the index stored in the index storage device 3 via the index reading device 4 will be described.

【０１４４】ここでは、先の例のようにして作成され格
納された図１４に例示するインデックスを用いて、ＸＭ
Ｌ文書の内容を閲覧する場合を例にとって、閲覧の手順
に関して説明する。Here, the browsing procedure is described for the case of browsing the contents of XML documents using the index illustrated in FIG. 14, which was created and stored as in the preceding example.

【０１４５】具体例として、『発明の名称』が『連続デ
ータサーバ装置および制御命令送出装置』である特許出
願に関係する所定の情報を格納してあるＸＭＬ文書をイ
ンデックスを閲覧しながら検索する場合を想定する。As a concrete example, assume that the user browses the index to find the XML document that stores predetermined information on the patent application whose "title of invention" is "continuous data server device and control command transmitting device".

【０１４６】まず、インデックス閲覧装置５に対して、
インデックス閲覧要求を入力する。First, an index browsing request is input to the index browsing device 5.

【０１４７】入力された閲覧要求は、インデックス読み
出し装置４に送信される。The input browsing request is transmitted to the index reading device 4.

【０１４８】インデックス読み出し装置４は、『根』を
始点名とする辺を格納してあるインデックス格納装置３
上の領域から『根』を始点名とする辺をすべて読み出
し、インデックス閲覧装置５に送信する。The index reading device 4 reads all edges whose start point name is "root" from the region of the index storage device 3 that stores such edges, and transmits them to the index browsing device 5.

【０１４９】なお、インデックスが複数あり得るシステ
ム構成の場合には、『根』頂点が複数存在するときに、
すべての『根』頂点について上記処理を行う方法と、所
定の基準で選択された一部の『根』頂点について上記処
理を行う方法がある。In a system configuration that allows several indexes, and hence several "root" vertices, this processing may be performed either for all "root" vertices or only for some "root" vertices selected by a predetermined criterion.

【０１５０】インデックス閲覧装置５は、受信した辺を
表示する。The index browsing device 5 displays the received edges.

【０１５１】なお、複数の辺を受信した場合に、受信し
たすべての辺を一括して表示する方法と、所定の基準で
選択された１つまたは１グループ（例えば始点を同じく
する辺のグループ）ごとに順番に表示する方法とがあ
る。When several edges are received, they may either all be displayed at once, or be displayed in turn, one edge or one group (for example, a group of edges sharing the same start point) at a time, selected by a predetermined criterion.

【０１５２】FIG. 16 shows an example of how the edges received at this point are displayed. The display in FIG. 16 means that the XML documents containing the path (root, arbitrary (E)) are named "X.xml" and "Y.xml". In FIG. 16, the portion displayed as (root, arbitrary (E)) is selectable.

【０１５３】When (root, arbitrary (E)) is selected here, the index browsing device 5 sends the index reading device 4 a request to read the child edges of the edge (root, arbitrary (E)).

【０１５４】The index reading device 4 reads the child edges of the edge (root, arbitrary (E)) from the index storage device 3 and transfers them to the index browsing device 5.

【０１５５】The index browsing device 5 displays the received edges.

【０１５６】FIG. 17 shows an example of how the edges received at this point are displayed. The display in FIG. 17 means that the XML documents containing the path (root, arbitrary (E)) (arbitrary (E), arbitrary (AN)) are named "X.xml" and "Y.xml", and that the XML documents containing the path (root, arbitrary (E)) (arbitrary (E), arbitrary (E)) are named "X.xml" and "Y.xml". In FIG. 17, the portions displayed as (arbitrary (E), arbitrary (AN)) and (arbitrary (E), arbitrary (E)) are each selectable.

【０１５７】When (arbitrary (E), arbitrary (E)) is then selected, the index browsing device 5 sends the index reading device 4 a request to read the child edges of the edge (arbitrary (E), arbitrary (E)).

【０１５８】The index reading device 4 reads the child edges of the edge (arbitrary (E), arbitrary (E)) from the index storage device 3 and transfers them to the index browsing device 5.

【０１５９】The index browsing device 5 displays the received edges.

【０１６０】FIG. 18 shows an example of how the edges received at this point are displayed. The display in FIG. 18 means the following:
The XML document containing the path (root, arbitrary (E)) (arbitrary (E), arbitrary (E)) (title of the invention (E), image transmission system (S)) is named "X.xml".
The XML document containing the path (root, arbitrary (E)) (arbitrary (E), arbitrary (E)) (title of the invention (E), data transfer system (S)) is named "Y.xml".
The XML documents containing the path (root, arbitrary (E)) (arbitrary (E), arbitrary (E)) (arbitrary (E), arbitrary (S)) are named "X.xml" and "Y.xml".
The XML documents containing the path (root, arbitrary (E)) (arbitrary (E), arbitrary (E)) (first inventor (E), name (E)) are named "X.xml" and "Y.xml".
The XML documents containing the path (root, arbitrary (E)) (arbitrary (E), arbitrary (E)) (inventor (E), name (E)) are named "X.xml" and "Y.xml".

【０１６１】As a result, the XML document storing the prescribed information about the patent application whose "title of the invention" is "Continuous data server device and control command sending device" is found to be named "X.xml".

【０１６２】Next, the processing procedure for each of the operations shown in the preceding example of browsing the index is described.

【０１６３】FIG. 19 shows an example of the processing procedure in this case.

【０１６４】(1) An index browsing request is input to the index browsing device 5 (step S201).

【０１６５】(2) The index browsing device 5 transmits the index browsing request to the index reading device 4 (step S202). An index browsing request may or may not specify a parent edge. Specifying a parent edge means that the child edges of that parent edge are to be browsed; not specifying one means that the edges having no parent edge, that is, the edges starting at a "root" vertex, are to be browsed. Here, no parent edge is specified.

【０１６６】(3) The index reading device 4 receives the index read request and determines whether a parent edge is specified (step S203). If no parent edge is specified, go to (4); if a parent edge is specified, go to (5).

【０１６７】(4) The index reading device 4 reads all edges whose start-point name is "root" from the index storage device 3 and transmits them to the index browsing device 5 (step S204). If, as described above, edges are stored in a contiguous area of the index storage device 3 for each start-point name, the edges whose start-point name is "root" are stored in one contiguous area. Then go to (6).

【０１６８】(5) The index reading device 4 reads all child edges of the edge specified as the parent edge from the index storage device 3 and transmits them to the index browsing device 5 (step S205). Then go to (6).

【０１６９】(6) The index browsing device 5 receives the edges transmitted from the index reading device 4 and displays them (step S206).

【０１７０】(7) It is determined whether a new browsing request has been input to the index browsing device 5 (step S207). If one has been input, go to (3); otherwise, go to (7).

【０１７１】Next, the procedure for searching the contents of XML documents via the search input/output device 6, using the index stored in the index storage device 3, is described.

【０１７２】Here, the search procedure is described for the case of searching the contents of XML documents using the index illustrated in FIG. 14, which was created and stored as in the previous example.

【０１７３】As a specific example, assume a search for the XML document storing the prescribed information about the patent application in which the "surname" of the "name" of the "first inventor" is "Kitsu".

【０１７４】First, the search path (first inventor (E), name (E)) (name (E), surname (E)) (surname (E), Kitsu (S)) is input to the search input/output device 6. The input search path is transmitted to the index reading device 4.

【０１７５】Here, the search request is assumed to be input as a path expression, but system configurations in which it is input in another way are also conceivable. One example is a system that accepts a natural-language request such as "Find the XML documents containing the information that the surname of the first inventor's name is Kitsu". In that case, the search input/output device 6 converts the input search request into a path expression.

【０１７６】The index reading device 4 determines whether the edge whose start-point name is first inventor (E) and whose end-point name is name (E) is retained in the index or omitted. Since this edge is retained, the device reads, from the area of the index storage device 3 that stores the edges starting at first inventor (E), the edge whose start-point name is first inventor (E) and whose end-point name is name (E). In the example of FIG. 14, exactly one edge is read, shown hatched in FIG. 20.

【０１７７】Next, among the child edges of the edge just read, all edges whose end point is surname (E) or arbitrary (E) are read. In the example of FIG. 14, exactly one edge is read, shown hatched in FIG. 21.

【０１７８】Finally, among the child edges of the edge just read, all edges whose end point is Kitsu (S) or arbitrary (S) are read. In the example of FIG. 14, exactly one edge is read, shown hatched in FIG. 22.

【０１７９】As a result, the XML document storing the prescribed information about the patent application in which the "surname" of the "name" of the "first inventor" is "Kitsu" is found to be named "X.xml".
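The three reads in this example follow one pattern: fetch the first edge by its start-point and end-point names, then repeatedly keep only those child edges whose end-point name equals the next name in the search path or is the "arbitrary" placeholder of the same kind. The following is a minimal Python sketch under stated assumptions. `Edge`, `search`, `placeholder_for`, the `by_start` dictionary (standing in for the per-start-name storage areas) and the toy index are hypothetical and do not reproduce FIG. 14; the sketch also assumes that the first edge of the path is retained, and it leaves out the recheck flag.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Edge:
    start: str
    end: str
    docs: Set[str] = field(default_factory=set)
    children: List["Edge"] = field(default_factory=list)


def placeholder_for(name: str) -> str:
    """'surname (E)' -> 'arbitrary (E)': the placeholder sharing the name's kind suffix."""
    return "arbitrary " + name[name.rindex("("):]


def search(by_start: Dict[str, List[Edge]], path: List[Tuple[str, str]]) -> Set[str]:
    """Return the names of the documents reached by following `path` through the index."""
    (start, end), rest = path[0], path[1:]
    # First read: look only in the area holding edges with this start-point name.
    current = [e for e in by_start.get(start, []) if e.end == end]
    # Later reads: child edges whose end point is the wanted name or its placeholder.
    for _, next_end in rest:
        wanted = {next_end, placeholder_for(next_end)}
        current = [c for e in current for c in e.children if c.end in wanted]
    found: Set[str] = set()
    for e in current:
        found |= e.docs
    return found


# Toy index (hypothetical).
surname = Edge("name (E)", "surname (E)", {"X.xml", "Y.xml"}, [
    Edge("surname (E)", "Kitsu (S)", {"X.xml"}),
    Edge("surname (E)", "Other (S)", {"Y.xml"}),  # made-up value for the second document
])
first = Edge("first inventor (E)", "name (E)", {"X.xml", "Y.xml"}, [surname])
BY_START = {"first inventor (E)": [first]}

result = search(BY_START, [
    ("first inventor (E)", "name (E)"),
    ("name (E)", "surname (E)"),
    ("surname (E)", "Kitsu (S)"),
])
```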

【０１８０】Next, as another specific example, assume a search for the XML document storing the prescribed information about the patent application whose "first inventor" is "Kitsu".

【０１８１】Here, only documents in which exactly two elements, whose names may be anything, lie between the "first inventor" element and the character string "Kitsu" are to be retrieved.

【０１８２】The search path (first inventor (E), (WE)) ((WE), (WE)) ((WE), Kitsu (S)) is input to the search input/output device 6. The input search path is transmitted to the index reading device 4.

【０１８３】As described above, (WE) denotes an arbitrary element. Likewise, (WAN) denotes an arbitrary attribute name, (WAV) an arbitrary attribute value, and (WS) an arbitrary character string.
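The four wildcards can be read as "any name of a given kind". A minimal Python sketch of that reading follows; the table `WILDCARD_KIND` and the function `name_matches` are hypothetical names introduced only for illustration, and names are assumed to carry their kind as a suffix such as `(E)` or `(S)`, as in the search paths above.

```python
# Wildcard in a search path -> kind suffix carried by the names it stands for.
WILDCARD_KIND = {
    "(WE)": "(E)",    # arbitrary element
    "(WAN)": "(AN)",  # arbitrary attribute name
    "(WAV)": "(AV)",  # arbitrary attribute value
    "(WS)": "(S)",    # arbitrary character string
}


def name_matches(query_name: str, stored_name: str) -> bool:
    """True if a name stored in the index satisfies a name from the search path."""
    kind = WILDCARD_KIND.get(query_name)
    if kind is None:
        return stored_name == query_name  # concrete name: exact match only
    return stored_name.endswith(kind)     # wildcard: any name of that kind
```

For example, `name_matches("(WE)", "name (E)")` holds, while `name_matches("(WE)", "Kitsu (S)")` does not, because a character string is not an element.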

【０１８４】First, the index reading device 4 checks whether any edge whose start-point name is first inventor (E) is omitted in the index. Since no such edge is omitted, the device reads, from the area of the index storage device 3 that stores the edges starting at first inventor (E), the edges that start at first inventor (E) and whose end-point name ends with (E). In the example of FIG. 14, exactly one edge is read, shown hatched in FIG. 20.

【０１８５】Next, among the child edges of the edge just read, all edges whose end-point name ends with (E) and all edges whose end-point name is (WE) are read. In the example of FIG. 14, two edges are read, shown hatched in FIG. 23.

【０１８６】Finally, among the child edges of the edges just read, all edges whose end-point name is Kitsu (S) or (WS) are read. In the example of FIG. 14, exactly one edge is read, shown hatched in FIG. 22.

【０１８７】As a result, the XML document storing the prescribed information about the patent application whose "first inventor" is "Kitsu" is found to be named "X.xml".

【０１８８】The processing procedure for each of the operations shown in the preceding examples of searching with the index is now described.

【０１８９】FIG. 24 shows an example of the processing procedure in this case.

【０１９０】(1) A search path is input to the search input/output device 6 (step S301).

【０１９１】(2) The search input/output device 6 transmits the search path to the index reading device 4 (step S302).

【０１９２】(3) The recheck flag is set to false.

【０１９３】The subsequent steps form a loop.

【０１９４】(4) Among the edges in the search path, look, in order from the head, for one whose corresponding edges have not yet been read from the index storage device 3 (step S304). If such an edge exists (call it E) and it is the first edge of the search path, go to (5); if such an edge exists (call it E) and it is not the first edge of the search path, go to (6); if no such edge exists, go to (8) (steps S341, S342).

【０１９５】(5) The index reading device 4 examines the start point and end point of E (step S305). Processing is then performed according to the start point and end point of E, as follows.

【０１９６】(5-1) When the start-point name of E ends with (E)
When the start-point name of E ends with (E), processing branches six ways on the end point of E, as follows, and each of those branches splits in two depending on whether edges are retained or omitted. Whether an edge is retained or omitted can be determined in various ways, for example by providing a table in which the conditions for retained edges are registered and consulting that table.

【０１９７】(5-1-1) When the end-point name of E ends with (E)
Check whether the edge having the same start-point name and end-point name as E is retained in the index or omitted.
(5-1-1-1) If retained: read from the index storage device 3 all edges having the same start-point name and end-point name as E, and set the "edge set" to these edges.
(5-1-1-2) If omitted: read from the index storage device 3 all (arbitrary (E), arbitrary (E)) edges, and set the "edge set" to these edges. Set the recheck flag to true.

【０１９８】(5-1-2) When the end-point name of E ends with (AN)
Check whether the edge having the same start-point name and end-point name as E is retained in the index or omitted.
(5-1-2-1) If retained: read from the index storage device 3 all edges having the same start-point name and end-point name as E, and set the "edge set" to these edges.
(5-1-2-2) If omitted: read from the index storage device 3 all (arbitrary (E), arbitrary (AN)) edges, and set the "edge set" to these edges. Set the recheck flag to true.

【０１９９】(5-1-3) When the end-point name of E ends with (S)
Check whether the edge having the same start-point name and end-point name as E is retained in the index or omitted.
(5-1-3-1) If retained: read from the index storage device 3 all edges having the same start-point name and end-point name as E, and set the "edge set" to these edges.
(5-1-3-2) If omitted: read from the index storage device 3 all (arbitrary (E), arbitrary (S)) edges, and set the "edge set" to these edges. Set the recheck flag to true.

【０２００】(5-1-4) When the end-point name of E is (WE)
Check whether any edge having the same start-point name as E is omitted in the index.
(5-1-4-1) If one exists: read from the index storage device 3 all edges having the same start-point name as E and an end-point name ending with (E), together with all (arbitrary (E), arbitrary (E)) edges, and set the "edge set" to these edges. Set the recheck flag to true.
(5-1-4-2) If none exists: read from the index storage device 3 all edges having the same start-point name as E and an end-point name ending with (E), and set the "edge set" to these edges.

【０２０１】(5-1-5) When the end-point name of E is (WAN)
Check whether any edge having the same start-point name as E is omitted in the index.
(5-1-5-1) If one exists: read from the index storage device 3 all edges having the same start-point name as E and an end-point name ending with (AN), together with all (arbitrary (E), arbitrary (AN)) edges, and set the "edge set" to these edges. Set the recheck flag to true.
(5-1-5-2) If none exists: read from the index storage device 3 all edges having the same start-point name as E and an end-point name ending with (AN), and set the "edge set" to these edges.

【０２０２】(5-1-6) When the end-point name of E is (WS)
Check whether any edge having the same start-point name as E is omitted in the index.
(5-1-6-1) If one exists: read from the index storage device 3 all edges having the same start-point name as E and an end-point name ending with (S), together with all (arbitrary (E), arbitrary (S)) edges, and set the "edge set" to these edges. Set the recheck flag to true.
(5-1-6-2) If none exists: read from the index storage device 3 all edges having the same start-point name as E and an end-point name ending with (S), and set the "edge set" to these edges.

【０２０３】(5-2) When the start-point name of E ends with (AN)
When the start-point name of E ends with (AN), processing branches two ways on the end point of E, as follows, and each of those branches splits in two depending on whether edges are retained or omitted.

【０２０４】(5-2-1) When the end-point name of E ends with (AV)
Check whether the edge having the same start-point name and end-point name as E is retained in the index or omitted.
(5-2-1-1) If retained: read from the index storage device 3 all edges having the same start-point name and end-point name as E, and set the "edge set" to these edges.
(5-2-1-2) If omitted: read from the index storage device 3 all (arbitrary (AN), arbitrary (AV)) edges, and set the "edge set" to these edges. Set the recheck flag to true.

【０２０５】(5-2-2) When the end-point name of E is (WAV)
Check whether any edge having the same start-point name as E is omitted in the index.
(5-2-2-1) If one exists: read from the index storage device 3 all edges having the same start-point name as E and an end-point name ending with (AV), together with all (arbitrary (AN), arbitrary (AV)) edges, and set the "edge set" to these edges. Set the recheck flag to true.
(5-2-2-2) If none exists: read from the index storage device 3 all edges having the same start-point name as E and an end-point name ending with (AV), and set the "edge set" to these edges.

【０２０６】(5-3) When the start-point name of E is (WE)
When the start-point name of E is (WE), processing branches six ways on the end point of E, as follows, and each of those branches splits in two depending on whether edges are retained or omitted.

【０２０７】(5-3-1) When the end-point name of E ends with (E)
Check whether any edge having the same end-point name as E is omitted in the index.
(5-3-1-1) If one exists: read from the index storage device 3 all edges having the same end-point name as E, together with all (arbitrary (E), arbitrary (E)) edges, and set the "edge set" to these edges. Set the recheck flag to true.
(5-3-1-2) If none exists: read from the index storage device 3 all edges having the same end-point name as E, and set the "edge set" to these edges.

【０２０８】(5-3-2) When the end-point name of E ends with (AN)
Check whether any edge having the same end-point name as E is omitted in the index.
(5-3-2-1) If one exists: read from the index storage device 3 all edges having the same end-point name as E, together with all (arbitrary (E), arbitrary (AN)) edges, and set the "edge set" to these edges. Set the recheck flag to true.
(5-3-2-2) If none exists: read from the index storage device 3 all edges having the same end-point name as E, and set the "edge set" to these edges.

【０２０９】(5-3-3) When the end-point name of E ends with (S)
Check whether any edge having the same end-point name as E is omitted in the index.
(5-3-3-1) If one exists: read from the index storage device 3 all edges having the same end-point name as E, together with all (arbitrary (E), arbitrary (S)) edges, and set the "edge set" to these edges. Set the recheck flag to true.
(5-3-3-2) If none exists: read from the index storage device 3 all edges having the same end-point name as E, and set the "edge set" to these edges.

【０２１０】(5-3-4) When the end-point name of E is (WE)
Check whether any edge whose end-point name ends with (E) is omitted in the index.
(5-3-4-1) If one exists: read from the index storage device 3 all edges whose end-point name ends with (E), together with all (arbitrary (E), arbitrary (E)) edges, and set the "edge set" to these edges.
(5-3-4-2) If none exists: read from the index storage device 3 all edges whose end-point name ends with (E), and set the "edge set" to these edges.

【０２１１】(5-3-5) When the end-point name of E is (WAN)
Check whether any edge whose end-point name ends with (AN) is omitted in the index.
(5-3-5-1) If one exists: read from the index storage device 3 all edges whose end-point name ends with (AN), together with all (arbitrary (E), arbitrary (AN)) edges, and set the "edge set" to these edges.
(5-3-5-2) If none exists: read from the index storage device 3 all edges whose end-point name ends with (AN), and set the "edge set" to these edges.

【０２１２】(5-3-6) When the end-point name of E is (WS)
Check whether any edge whose end-point name ends with (S) is omitted in the index.
(5-3-6-1) If one exists: read from the index storage device 3 all edges whose end-point name ends with (S), together with all (arbitrary (E), arbitrary (S)) edges, and set the "edge set" to these edges.
(5-3-6-2) If none exists: read from the index storage device 3 all edges whose end-point name ends with (S), and set the "edge set" to these edges.

【０２１３】(5-4) When the start-point name of E is (WAN)
When the start-point name of E is (WAN), processing branches two ways on the end point of E, as follows, and each of those branches splits in two depending on whether edges are retained or omitted.

【０２１４】(5-4-1) When the end-point name of E ends with (AV)
Check whether any edge having the same start-point name as E is omitted in the index.
(5-4-1-1) If one exists: read from the index storage device 3 all edges having the same end-point name as E, together with all (arbitrary (AN), arbitrary (AV)) edges, and set the "edge set" to these edges. Set the recheck flag to true.
(5-4-1-2) If none exists: read from the index storage device 3 all edges having the same end-point name as E, and set the "edge set" to these edges.

【０２１５】(5-4-2) When the end-point name of E is (WAV)
Check whether any edge whose end-point name ends with (AV) is omitted in the index.
(5-4-2-1) If one exists: read from the index storage device 3 all edges whose end-point name ends with (AV), together with all (arbitrary (AN), arbitrary (AV)) edges, and set the "edge set" to these edges.
(5-4-2-2) If none exists: read from the index storage device 3 all edges whose end-point name ends with (AV), and set the "edge set" to these edges.

【０２１６】After the processing in (5), go to (7).
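The case table for step (5) follows a small number of rules, which the Python sketch below condenses into one function. It is an illustrative generalization, not the procedure as claimed: `Edge`, `kind_of`, `first_edge_set`, the flat `index` list and the `policy` dictionary (a stand-in for the table of retained-edge conditions) are all hypothetical. Two simplifications are assumed: whenever the start point is a wildcard, the omission check looks at end-point names, and the recheck flag is left unset when both names are wildcards, since the text does not mention the flag in those cases.

```python
from typing import Dict, List, NamedTuple, Tuple


class Edge(NamedTuple):
    start: str
    end: str


WILDCARDS = {"(WE)": "(E)", "(WAN)": "(AN)", "(WAV)": "(AV)", "(WS)": "(S)"}


def kind_of(name: str) -> str:
    """Kind suffix of a concrete or wildcard name: '(E)', '(AN)', '(AV)' or '(S)'."""
    if name in WILDCARDS:
        return WILDCARDS[name]
    return name[name.rindex("("):]


def first_edge_set(query: Edge, index: List[Edge],
                   policy: Dict[Tuple[str, str], bool]) -> Tuple[List[Edge], bool]:
    """Return (edge set, recheck flag) for the first edge E of a search path.

    policy maps (start name, end name) to True if retained, False if omitted.
    """
    s, e = query
    s_wild, e_wild = s in WILDCARDS, e in WILDCARDS
    ke = kind_of(e)
    placeholder = Edge("arbitrary " + kind_of(s), "arbitrary " + ke)

    def end_ok(edge: Edge) -> bool:
        return edge.end.endswith(ke) if e_wild else edge.end == e

    if not s_wild and not e_wild:
        # Both names concrete: read E itself if retained, else only the placeholder edges.
        if policy.get((s, e), False):
            return [x for x in index if x == query], False
        return [x for x in index if x == placeholder], True

    if not s_wild:
        # Concrete start, wildcard end: is any edge with this start name omitted?
        omitted = any(not kept for (ps, _), kept in policy.items() if ps == s)
        found = [x for x in index if x.start == s and end_ok(x)]
    else:
        # Wildcard start: is any edge with a matching end name omitted?
        omitted = any(not kept for (_, pe), kept in policy.items()
                      if (pe.endswith(ke) if e_wild else pe == e))
        found = [x for x in index if end_ok(x)]
    if omitted:
        found += [x for x in index if x == placeholder and x not in found]
    return found, omitted and not (s_wild and e_wild)


# Toy data (hypothetical).
INDEX = [
    Edge("root", "arbitrary (E)"),
    Edge("arbitrary (E)", "arbitrary (E)"),
    Edge("first inventor (E)", "name (E)"),
    Edge("inventor (E)", "name (E)"),
]
POLICY = {
    ("first inventor (E)", "name (E)"): True,
    ("inventor (E)", "name (E)"): True,
    ("address (E)", "city (E)"): False,
}
```

With this toy data, a retained first edge such as (first inventor (E), name (E)) is read as is, whereas the omitted (address (E), city (E)) falls back to the (arbitrary (E), arbitrary (E)) edge and raises the recheck flag.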

【０２１７】(6) The index reading device 4 examines the end point of E (step S306). Processing is then performed according to the end point of E, as follows.

【０２１８】（６−１）Ｅの終点名が（Ｅ）で終了する
場合『辺集合』に含まれる辺の子供辺のうち、Ｅと同じ終点
名を持つ辺と終点名が任意（Ｅ）である辺をすべて読み
出し、『辺集合』を新たにこれらの辺に設定する。『辺
集合』の中に終点名が任意（Ｅ）である辺が存在する場
合には、再チェックフラグをｔｒｕｅに設定する。(6-1) When the end point name of E ends with (E) Of the child sides included in the “edge set”, the side having the same end point name as E and the end point name are arbitrary (E) Are read out, and an “edge set” is newly set for these edges. If there is an edge whose end point name is arbitrary (E) in the “edge set”, the recheck flag is set to true.

【０２１９】（６−２）Ｅの終点名が（ＡＮ）で終了す
る場合『辺集合』に含まれる辺の子供辺のうち、Ｅと同じ終点
名を持つ辺と終点名が任意（ＡＮ）である辺をすべて読
み出し、『辺集合』を新たにこれらの辺に設定する。
『辺集合』の中に終点名が任意（ＡＮ）である辺が存在
する場合には、再チェックフラグをｔｒｕｅに設定す
る。(6-2) When the end point name of E ends with (AN) Of the child sides included in the “edge set”, the side having the same end point name as E and the end point name are arbitrary (AN) Are read out, and an “edge set” is newly set for these edges.
If there is an edge whose end point name is arbitrary (AN) in the “edge set”, the recheck flag is set to true.

[0220] (6-3) When the end point name of E ends with (AV): among the child edges of the edges included in the "edge set", all edges having the same end point name as E and all edges whose end point name is arbitrary (AV) are read out, and the "edge set" is newly set to these edges. If the "edge set" then contains an edge whose end point name is arbitrary (AV), the recheck flag is set to true.

[0221] (6-4) When the end point name of E ends with (S): among the child edges of the edges included in the "edge set", all edges having the same end point name as E and all edges whose end point name is arbitrary (S) are read out, and the "edge set" is newly set to these edges. If the "edge set" then contains an edge whose end point name is arbitrary (S), the recheck flag is set to true.

[0222] (6-5) When the end point name of E is (WE): among the child edges of the edges included in the "edge set", all edges having an end point name ending with (E) and all edges whose end point name is arbitrary (E) are read out, and the "edge set" is newly set to these edges.

[0223] (6-6) When the end point name of E is (WAN): among the child edges of the edges included in the "edge set", all edges having an end point name ending with (AN) and all edges whose end point name is arbitrary (AN) are read out, and the "edge set" is newly set to these edges.

[0224] (6-7) When the end point name of E is (WAV): among the child edges of the edges included in the "edge set", all edges having an end point name ending with (AV) and all edges whose end point name is arbitrary (AV) are read out, and the "edge set" is newly set to these edges.

[0225] (6-8) When the end point name of E is (WS): among the child edges of the edges included in the "edge set", all edges having an end point name ending with (S) and all edges whose end point name is arbitrary (S) are read out, and the "edge set" is newly set to these edges.

[0226] After process (6), the procedure proceeds to (7).
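The case analysis of step (6) can be sketched in a few lines. This is an illustrative sketch only, not the implementation of the index reading device 4: the `Edge` class, the `step6` function, the `WILDCARD_KIND` table, and the choice to model an abbreviated "arbitrary" name as `text=None` are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical table: wildcard end point kind -> the kind of name it matches.
WILDCARD_KIND = {"WE": "E", "WAN": "AN", "WAV": "AV", "WS": "S"}


@dataclass
class Edge:
    kind: str                    # "E", "AN", "AV" or "S"
    text: Optional[str] = None   # None models an abbreviated, "arbitrary" name
    children: List["Edge"] = field(default_factory=list)


def step6(edge_set: List[Edge], text: Optional[str], kind: str) -> Tuple[List[Edge], bool]:
    """Apply step (6) for a search-path edge E whose end point is (text, kind).

    Returns the new edge set and whether the recheck flag must be set.
    """
    is_wildcard = kind in WILDCARD_KIND
    wanted_kind = WILDCARD_KIND.get(kind, kind)
    selected: List[Edge] = []
    set_recheck = False
    for parent in edge_set:
        for child in parent.children:
            if child.kind != wanted_kind:
                continue
            if child.text is None:
                # "arbitrary (kind)" edges are always kept as candidates.
                selected.append(child)
                # (6-1) to (6-4): a concrete name was matched only loosely,
                # so the documents must be rechecked later.
                set_recheck = set_recheck or not is_wildcard
            elif is_wildcard or child.text == text:
                # (6-5) to (6-8) accept any name of the kind;
                # (6-1) to (6-4) require the same end point name as E.
                selected.append(child)
    return selected, set_recheck
```

For example, with a parent edge whose children are a `title` element, an abbreviated element, and an `id` attribute name, `step6([parent], "title", "E")` keeps the first two children and asks for a recheck, whereas a (WE) query keeps the same two children without touching the flag.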

[0227] (7) It is determined whether the "edge set" is an empty set (step S307). An empty "edge set" means that no XML document holding information that matches the search path exists. In this case, the fact that no such XML document exists is transmitted to the search input/output device 6, and the process ends (step S309). If the "edge set" is not empty, the procedure returns to (4).

[0228] (8) The recheck flag is examined.
(8-1) When the recheck flag is true: the edges in the "edge set" are the set of edges corresponding to the search path. The XML documents corresponding to these edges are checked one document at a time to determine whether they match the search path, and those that match are transmitted to the search input/output device 6 (step S308).
(8-2) When the recheck flag is false: the edges in the "edge set" are the set of edges corresponding to the search path. The XML documents corresponding to these edges are transmitted to the search input/output device 6 (step S308).
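The control flow around steps (7) and (8) can be summarized as a loop over the search path. The sketch below is a simplified illustration under stated assumptions: edges are identified by hypothetical string ids, the per-edge processing of steps (5) and (6) is passed in as an `advance` callback, and step (4) is assumed to take the next edge of the search path and to branch to (8) when none remains. The names `run_search`, `advance`, `docs_of`, and `doc_matches` are introduced here and do not appear in the embodiment.

```python
from typing import Callable, Iterable, List, Set, Tuple

EdgeSet = Set[str]   # edges are identified by hypothetical string ids


def run_search(
    path: List[str],
    start: EdgeSet,
    advance: Callable[[EdgeSet, str], Tuple[EdgeSet, bool]],
    docs_of: Callable[[EdgeSet], Iterable[str]],
    doc_matches: Callable[[str], bool],
) -> List[str]:
    """Return the ids of the documents that match the search path."""
    edge_set, recheck = start, False
    for e in path:                        # (4): take the next edge E of the path
        edge_set, saw_arbitrary = advance(edge_set, e)   # (5) or (6)
        recheck = recheck or saw_arbitrary
        if not edge_set:                  # (7): empty set, no document can match
            return []
    docs = sorted(set(docs_of(edge_set)))
    if recheck:                           # (8-1): verify each document individually
        return [d for d in docs if doc_matches(d)]
    return docs                           # (8-2): the index alone is conclusive
```

The recheck flag is what lets abbreviated "arbitrary" names keep the index small: whenever such an edge was followed, the index result is treated only as a candidate list and each document is verified against the search path.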

[0229] This embodiment has described a method that examines the edges included in the search path in order from the beginning. Various other arrangements are possible, such as examining the search path in order from the end, starting the examination from the middle of the search path, switching to a document-by-document check once the candidate XML documents have been narrowed down during the index-based search, or any combination of these.

[0230] The following describes the case where the present system is realized as a client-server system.

[0231] FIG. 25 shows a configuration example of a client-server system. In this example, the index creation device 2, the index storage device 3, the index reading device 4, and the XML document storage device 7 are implemented on the server 10 side, and all or part of the XML document input device 1, the index browsing device 5, and the search input/output device 6 are implemented on the client 20 side. The client 20 can communicate with the server 10 via a network 30 such as the Internet or an intranet. The user of the client 20 can receive the services provided by the server 10: creation and storage of an index for desired XML documents, browsing of the index, searching of the index, and acquisition and browsing of XML documents.

[0232] FIG. 26 shows another configuration example of a client-server system. In this example, the index creation device 2, the index storage device 3, and the index reading device 4 are implemented on the server 11 side, all or part of the XML document input device 1, the index browsing device 5, and the search input/output device 6 are implemented on the client 20 side, and the XML document storage device 7 is provided by one or more other servers 12. The client 20 can communicate with the server 11 and the server 12 via the network 30. The user of the client 20 can receive the services provided by the server 11 (creation and storage of an index for desired XML documents, browsing of the index, and searching of the index) and the service provided by the server 12 (acquisition and browsing of XML documents).
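The difference between the two configurations is only where each device is placed. The following sketch records the two placements as plain data, using the device and host reference numerals from the description; the dictionary layout and the helper `host_of` are assumptions made for illustration.

```python
from typing import Dict, Set

DEVICES = {
    1: "XML document input device",
    2: "index creation device",
    3: "index storage device",
    4: "index reading device",
    5: "index browsing device",
    6: "search input/output device",
    7: "XML document storage device",
}

# Host -> device numbers. The client may carry all or only part of 1, 5 and 6.
FIG_25: Dict[str, Set[int]] = {"server 10": {2, 3, 4, 7}, "client 20": {1, 5, 6}}
FIG_26: Dict[str, Set[int]] = {
    "server 11": {2, 3, 4},
    "server 12": {7},
    "client 20": {1, 5, 6},
}


def host_of(deployment: Dict[str, Set[int]], device: int) -> str:
    """Return the host that carries the given device number."""
    for host, devices in deployment.items():
        if device in devices:
            return host
    raise KeyError(DEVICES.get(device, device))
```

In both placements the index services stay together on one server; only the XML document storage device 7 moves, from server 10 in FIG. 25 to server 12 in FIG. 26.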

[0233] Various system configurations other than the above are also possible.

[0234] Each of the functions described above can also be realized as software.

[0235] This embodiment can also be implemented as a computer-readable recording medium on which is recorded a program for causing a computer to execute predetermined means (or for causing a computer to function as predetermined means, or for causing a computer to realize predetermined functions).

[0236] Although XML documents have been used above as an example of structured documents, the present invention is of course also applicable to HTML documents and other structured documents.

[0237] The present invention is not limited to the embodiments described above and can be implemented with various modifications within its technical scope.

[0238]

[Effects of the Invention] According to the present invention, an index is created that holds a structure in which a plurality of edges in a certain relationship are merged into one while the parent-child relationships in the structure of the target structured document are maintained. It is therefore possible to create an index of structured documents that is smaller in data size than a conventional index and enables efficient search.
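The size reduction comes from merging edges that share a starting vertex and carry the same names. A minimal sketch of that merging follows, assuming a document tree is given as nested `(end point name, child edges)` pairs and the index as nested dictionaries keyed by end point name; the replacement process with abbreviated names is left out. The function names and the sample names such as `book(E)` are hypothetical.

```python
def add_document(index: dict, edges: list, doc_id: str) -> None:
    """Merge the edges of one document tree into the shared index.

    Sibling edges with the same end point name collapse into one index
    entry, recursively, and each entry remembers the documents it came from.
    """
    for end_name, children in edges:
        node = index.setdefault(end_name, {"docs": set(), "children": {}})
        node["docs"].add(doc_id)
        add_document(node["children"], children, doc_id)


def count_edges(index: dict) -> int:
    """Count the edges that remain in the merged index."""
    return sum(1 + count_edges(node["children"]) for node in index.values())


# Two small document trees hanging off a common root vertex.
doc1 = [("book(E)", [("title(E)", []), ("author(E)", []), ("author(E)", [])])]
doc2 = [("book(E)", [("title(E)", [])])]

index: dict = {}
add_document(index, doc1, "d1")
add_document(index, doc2, "d2")
```

Here the two trees contain six edges in total, while the merged index holds three: one `book(E)` edge shared by both documents, with one `title(E)` and one `author(E)` child. The parent-child relationships are unchanged, which is what allows a search path to be followed through the index.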

[Brief description of the drawings]

FIG. 1 is a diagram showing a configuration example of a structured document/index processing system according to an embodiment of the present invention.

FIG. 2 is a diagram showing an example of a tree representing the XML document of FIG. 28.

FIG. 3 is a diagram showing an example of the tree obtained by applying the replacement process to the tree of FIG. 2.

FIG. 4 is a diagram showing an example of the tree obtained by merging edges in the tree of FIG. 3.

FIG. 5 is a diagram showing an example of the tree obtained by merging edges in the tree of FIG. 4.

FIG. 6 is a diagram showing an example of the tree obtained by merging edges in the tree of FIG. 5.

FIG. 7 is a diagram for explaining the format in which the tree of FIG. 6 is stored in the index storage device.

FIG. 8 is a flowchart showing an example of the processing procedure for creating an index from an XML document.

FIG. 9 is a diagram showing an example of a tree representing the XML document of FIG. 29.

FIG. 10 is a diagram showing an example of the tree obtained by applying the replacement process to the tree of FIG. 9.

FIG. 11 is a diagram showing an example of the tree obtained by recursively merging edges in the tree of FIG. 10.

FIG. 12 is a diagram showing an example of the tree obtained by merging the tree of FIG. 11 into the tree of FIG. 6.

FIG. 13 is a diagram showing an example of the tree obtained by recursively merging edges in the tree of FIG. 10.

FIG. 14 is a diagram for explaining the format in which the tree of FIG. 13 is stored in the index storage device.

FIG. 15 is a flowchart showing an example of the processing procedure for adding the information of a new XML document to an already created index.

FIG. 16 is a diagram showing a display example of the edge information obtained by an index browsing request for the root vertex.

FIG. 17 is a diagram showing a display example of the edge information obtained by an index browsing request for the child edges of an edge in FIG. 16.

FIG. 18 is a diagram showing a display example of the edge information obtained by an index browsing request for the child edges of an edge in FIG. 17.

FIG. 19 is a flowchart showing an example of the processing procedure for browsing an index.

FIG. 20 is a diagram for explaining a search using the index.

FIG. 21 is a diagram for explaining a search using the index.

FIG. 22 is a diagram for explaining a search using the index.

FIG. 23 is a diagram for explaining a search using the index.

FIG. 24 is a flowchart showing an example of the processing procedure for performing a search using the index.

FIG. 25 is a diagram for explaining a system configuration.

FIG. 26 is a diagram for explaining a system configuration.

FIG. 27 is a diagram showing an example of a relational database.

FIG. 28 is a diagram showing an example of a structured document.

FIG. 29 is a diagram showing another example of a structured document.

[Explanation of symbols]

1: XML document input device, 2: index creation device, 3: index storage device, 4: index reading device, 5: index browsing device, 6: search input/output device, 7: XML document storage device, 10 to 12: servers, 20: client, 30: network

Continuation of front page

(72) Inventor: Toshiki Kitsu, c/o Toshiba Corporation Research and Development Center, 1 Komukai Toshiba-cho, Saiwai-ku, Kawasaki-shi, Kanagawa
(72) Inventor: Seiji Maeda, c/o Toshiba Corporation Research and Development Center, 1 Komukai Toshiba-cho, Saiwai-ku, Kawasaki-shi, Kanagawa
(72) Inventor: Hirokuni Yano, c/o Toshiba Corporation Research and Development Center, 1 Komukai Toshiba-cho, Saiwai-ku, Kawasaki-shi, Kanagawa
(72) Inventor: Hiroshi Yao, c/o Toshiba Corporation Research and Development Center, 1 Komukai Toshiba-cho, Saiwai-ku, Kawasaki-shi, Kanagawa
F-terms (reference): 5B075 ND02 ND34 NK10 NK21 NR02; 5B082 AA11 EA01

Claims

[Claims]

1. An index creation method for creating an index from a structured document represented by a structure comprising a plurality of vertices that form a hierarchical parent-child relationship, in which each vertex other than the highest vertex has a single upper vertex as its parent and each vertex other than the lowest vertices has one or more lower vertices as its children, and a plurality of edges each holding information on a pair of upper and lower vertices in a parent-child relationship, the method comprising: determining whether the structure of the target structured document contains a plurality of edges that have the same vertex on the upper side and hold information in a predetermined relationship; when such a plurality of edges exists, performing a sharing process that integrates the lower-side vertices of those edges into one vertex and thereby merges those edges into one; and creating an index that holds the structure of the structured document after the sharing process.

2. An index creation method for creating an index from a plurality of structured documents each represented by a structure comprising a plurality of vertices that form a hierarchical parent-child relationship, in which each vertex other than the highest vertex has a single upper vertex as its parent and each vertex other than the lowest vertices has one or more lower vertices as its children, and a plurality of edges each holding information on a pair of upper and lower vertices in a parent-child relationship, the method comprising: integrating the highest vertices of the structures representing the plurality of target structured documents to form one new structure; determining, for each of the plurality of target structured documents and for the new structure, whether there is a plurality of edges that have the same vertex on the upper side and hold information in a predetermined relationship; when such a plurality of edges exists, performing a sharing process that integrates the lower-side vertices of those edges into one vertex and thereby merges those edges into one; and creating an index that holds the structure of the structured documents after the sharing process.

3. An index creation method for creating an index from a plurality of structured documents each represented by a structure comprising a plurality of vertices that form a hierarchical parent-child relationship, in which each vertex other than the highest vertex has a single upper vertex as its parent and each vertex other than the lowest vertices has one or more lower vertices as its children, and a plurality of edges each holding information on a pair of upper and lower vertices in a parent-child relationship, the method comprising: integrating the highest vertices of the structures representing the plurality of target structured documents to form one new structure; determining whether the new structure contains a plurality of edges that have the same vertex on the upper side, correspond to the structured documents, and hold information in a predetermined relationship; when such a plurality of edges exists, performing a sharing process that integrates the lower-side vertices of those edges into one vertex and thereby merges those edges into one; and creating an index that holds the structure of the structured documents after the sharing process.

4. The index creation method according to claim 2, wherein each edge also holds information identifying the structured document to which the edge belongs.

5. The index creation method according to claim 1, wherein the sharing process is performed recursively and repeatedly until no further edges can be merged.

6. The index creation method according to claim 1, wherein, prior to the first sharing process, it is determined whether the information on the upper-side and lower-side vertices held by the plurality of edges includes information other than information satisfying a predetermined condition, and, when such information exists, a replacement process is performed that replaces that information with a predetermined abbreviation.

7. The index creation method according to claim 1, wherein, when the created index is stored in a storage device, the parts of the index corresponding to a plurality of edges holding the same information on the upper-side vertex are stored in a contiguous area of the storage device.

8. An index display method wherein, when a browsing request is received for an index created by the index creation method according to any one of claims 1 to 7, the index is displayed in units of the parts corresponding to a plurality of edges having the same vertex as their upper-side vertex.

9. An index search method wherein, when a search request specifying a structure is received for an index created by the index creation method according to any one of claims 1 to 7, a part holding the corresponding structure is searched for by referring to the information held in the parts of the index corresponding to the edges.

10. An index creation device for creating an index of a structured document represented by a structure comprising a plurality of vertices that form a hierarchical parent-child relationship, in which each vertex other than the highest vertex has a single upper vertex as its parent and each vertex other than the lowest vertices has one or more lower vertices as its children, and a plurality of edges each holding information on a pair of upper and lower vertices in a parent-child relationship, the device comprising: means for creating an index that holds a structure in which a plurality of the edges in a certain relationship are merged into one while the parent-child relationships in the structure of the target structured document are maintained.