JP3508623B2

JP3508623B2 - Structured document management system and method, and recording medium

Info

Publication number: JP3508623B2
Application number: JP14115399A
Authority: JP
Inventors: 拓哉北野; みさ波内; 邦敏鶴岡
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1999-05-21
Filing date: 1999-05-21
Publication date: 2004-03-22
Anticipated expiration: 2019-05-21
Also published as: JP2000331021A

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to document management technology, and more particularly to an apparatus, a method, and a recording medium for managing structured documents using a database.

[0002]

[Related Art] [Structured documents] SGML (Standard Generalized Markup Language), XML (eXtensible Markup Language), and HTML (HyperText Markup Language) are known as means of creating documents in which tags, that is, character strings enclosed by the symbols "<" and ">", are embedded in a text document to form partial documents (called "document elements"), so that display and data analysis can be performed in units of document elements. Documents created with SGML, XML, or HTML are generally called "structured documents".

[0003] Tag names, the relationships between tags, and similar definitions are given by a DTD (Document Type Definition), and the structure of the document elements of a structured document follows the tag definitions in that DTD.

[0004] Each document element identified by a tag (hereinafter also simply "element") has zero or one parent element, that is, an element that contains it, and zero or more child elements, that is, elements that it contains, so this structure can be represented as a tree. In this specification, this tree structure of document elements is called the "logical structure of a structured document".

[0005] [Management of structured documents with a database] Managing information about the structure of structured documents in a database has several advantages. What can be retrieved from the database is not only whole documents but also partial documents (document elements), in units of the elements identified by the tags in a document.

[0006] In addition to full-text search and search by a document's bibliographic information, the database can be searched using conditions on the logical structure of structured documents. That is, documents that satisfy various conditions on the element tree, or the document elements of such documents, can be retrieved.

[0007] Publications on techniques for storing such structured documents in a database, creating indexes on the document structure and on the documents themselves, and using those indexes to manage and search specific documents and document elements include, for example, JP-A-6-131340 and JP-A-10-240752.

[0008] Of these, JP-A-6-131340 discloses a document component management apparatus for structured documents comprising: document component receiving means that receives, together with a registration request, a processing target as a document component (that is, an element) from a document processing apparatus; logical structure identifying means that analyzes the received document component and identifies its logical structure; document component holding means that holds one or more document components; index generating means that generates an index of the identified logical structure; and index holding means that holds one or more generated indexes. According to that publication, the actual document files are stored in a permanent secondary storage device and the index is accessed, which makes it easy to search on conditions about a document's internal structure without retrieving the actual document files, and makes it possible to search a large set of documents for a specific structure at high speed.

[0009] JP-A-10-240752 addresses the problem of enabling high-precision structure-specified search over structured documents by allowing conditions on the position at which a logical element appears in a document to be specified. It discloses a method in which, when documents are registered in the database, the logical structures of the registered documents are superimposed to create a structure index in which structural elements with the same position of appearance in a document are represented by a single meta-node. At search time, the structure index is consulted to find the set of meta-nodes that satisfy the specified structural condition, and a character-string index is searched using the identifiers of those meta-nodes as keys to obtain the documents that satisfy the specified condition. This is said to enable high-precision structure-specified search over a document database consisting of a set of structured documents.

[0010]

[Problems to be Solved by the Invention] However, the apparatus and method described in the above publications have the following problems.

[0011] The method described in JP-A-10-240752 creates the index that manages the logical structures of multiple documents by superimposing the individual logical structures of those documents. Specifically, when documents are registered in the database, the logical structures of the registered documents are superimposed to create a structure index in which structural elements with the same position of appearance in a document are represented by a single meta-node. At search time, the structure index is consulted to find the set of meta-nodes that satisfy the specified structural condition, and a character-string index is searched using the identifiers of those meta-nodes as keys to obtain the documents that satisfy the specified condition.

[0012] However, with the method described in JP-A-10-240752, it is unclear whether using such a structure index, created by superimposing the logical structures of multiple documents, can guarantee that documents matching the logical-structure conditions the user intends are always retrieved.

[0013] In general, a structured document conforms to one particular DTD, and documents with different DTDs are generally regarded as documents of different kinds.

[0014] If the DTDs differ, there will be hardly any documents that share the same element names, and even if elements with the same structure and the same name do exist in the documents, they do not necessarily carry the same meaning based on the same usage.

[0015] The problem with the method described in JP-A-10-240752 is that, when multiple documents with different DTDs are to be registered, it is in practice impossible to create a structure index in which structural elements with the same position of appearance in a document are represented by a single meta-node. Even if such meta-nodes could be created, semantic uniqueness could not be guaranteed with respect to the elements of the actual documents.

[0016] Consequently, it also cannot be guaranteed that a document retrieved by a search based on such a structure index was retrieved on the basis of the structural element the user actually intended.

[0017] In other words, the index in the method described in JP-A-10-240752 is effective for documents that conform to a single DTD or to very similar DTDs, but is disadvantageous for documents whose DTD structures differ greatly.

[0018] The apparatus described in JP-A-6-131340, on the other hand, includes index generating means that generates an index of the logical structure, and states that accessing the index makes it easy to search on conditions about a document's internal structure without retrieving the actual document files, and makes it possible to search a large set of documents for a specific structure at high speed. However, its index structure includes registration information, component type, type-specific information, and so on, which is too much information for an index.

[0019] In general, index information is loaded into a computer's memory for processing, but memory resources are limited, and so the amount of index information that can be loaded at one time is also limited.

[0020] The smaller the index information per document or per element, the more documents and elements can have their index-based computation carried out in memory, and as a result a specific structure can be searched for at high speed in a larger set of documents.

[0021] In index-based document search, one of the main goals is, first of all, to quickly narrow a large set of documents down to the target documents.

[0022] For some of the information contained in the index structure, extracting it from the documents and checking it after the target documents have been narrowed down should have little effect on overall search performance.

[0023] Accordingly, the present invention has been made in view of the above problems, and its object is to provide an apparatus and a method that index a set of multiple documents with differing structures, such as differing DTDs, and use that index when searching the set of documents.

[0024] Another object of the present invention is to provide a method and an apparatus that, from the standpoint of reducing index memory usage, keep the amount of index information as small as possible and perform searches based on the logical structure of structured documents at higher speed. Other objects and features of the present invention will be readily apparent from the following description.

[0025]

[Means for Solving the Problems] To achieve the above objects, the present invention has, as the database schema, a document class that manages documents, a document folder class that manages sets of documents, and a DTD class that manages DTDs. The document class is given an element tree index that manages the tree-structure relationships among a document's elements; the document folder class is given an element-attribute index, which creates and manages an index for each element and each attribute of the documents; and the DTD class is given a table for managing element names and attribute names as IDs.

[0026] In the present invention, when a document is stored in the database, within the document storage execution means the element tree index creating means creates the element tree index, the element-attribute index creating means creates the element-attribute index, and the element-attribute ID table creating means creates the element-attribute ID table. Element name IDs and attribute name IDs are assigned as numerical values.

[0027] Also in the present invention, the document search means searches the database for structured documents using the element tree index, the element-attribute ID table, and the element-attribute index. When retrieving document elements, elements are retrieved by their relationship relative to a given document element in the structured document: its parent, ancestor, child, descendant, elder-sibling, younger-sibling, previous, or next elements, or any combination of these.
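The relative relationships listed in [0027] can be illustrated with a small in-memory tree. This is only a sketch under stated assumptions: the `Element` class and the helper functions are hypothetical and are not the patent's index layout; "elder" and "younger" siblings are read here as the siblings that precede and follow an element under the same parent; and the "previous" and "next" relations are left out because this passage does not define them.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(eq=False)
class Element:
    """Hypothetical element node; stands in for an entry of an element tree."""
    name: str
    parent: Element | None = field(default=None, repr=False)
    children: list[Element] = field(default_factory=list)

    def add(self, name: str) -> Element:
        """Append a new child element and return it."""
        child = Element(name, parent=self)
        self.children.append(child)
        return child


def ancestors(e: Element) -> list[Element]:
    """Parent first, then grandparent, up to the root."""
    out = []
    while e.parent is not None:
        e = e.parent
        out.append(e)
    return out


def descendants(e: Element) -> list[Element]:
    """All elements below e, in depth-first order."""
    out = []
    for child in e.children:
        out.append(child)
        out.extend(descendants(child))
    return out


def elder_siblings(e: Element) -> list[Element]:
    """Siblings that come before e under the same parent (assumed reading)."""
    if e.parent is None:
        return []
    siblings = e.parent.children
    return siblings[: siblings.index(e)]


def younger_siblings(e: Element) -> list[Element]:
    """Siblings that come after e under the same parent (assumed reading)."""
    if e.parent is None:
        return []
    siblings = e.parent.children
    return siblings[siblings.index(e) + 1:]
```

Combinations such as "younger siblings of the parent" then fall out of composing these helpers, which is the kind of combined retrieval the paragraph describes.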

[0028]

[Embodiments of the Invention] Embodiments of the present invention will now be described. FIG. 1 shows the configuration of one embodiment of the present invention. In this embodiment, the schema of the database 1 has, as its classes, a document class 11 that manages documents, a document folder class 12 that manages sets of documents, and a DTD class 13 that manages DTDs. The document class 11 is given an element tree index 14 that manages the tree-structure relationships among a document's elements; the document folder class 12 is given an element-attribute index 15, which creates and manages an index for each element and each attribute of the documents; and the DTD class 13 is given an element-attribute ID table 16 for managing element names and attribute names as IDs.
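As a rough illustration of the three schema classes and the structure each one owns, the following sketch uses Python dataclasses. All class and field names are hypothetical, and the internal layouts of the element tree index 14 and the element-attribute index 15 are deliberately left as empty containers because this passage does not specify them.

```python
from dataclasses import dataclass, field


@dataclass
class DTD:
    """Stands in for DTD class 13; owns the element-attribute ID table 16."""
    dtd_id: int
    source: str
    # element name -> numeric element name ID
    element_name_ids: dict = field(default_factory=dict)
    # (element name, attribute name) -> numeric attribute name ID
    attribute_name_ids: dict = field(default_factory=dict)


@dataclass
class Document:
    """Stands in for document class 11; owns the element tree index 14."""
    doc_id: int
    dtd: DTD          # link to the DTD this document follows
    text: str
    element_tree_index: list = field(default_factory=list)  # layout not specified here


@dataclass
class DocumentFolder:
    """Stands in for document folder class 12; owns the element-attribute index 15."""
    folder_id: int
    documents: list = field(default_factory=list)
    element_attribute_index: dict = field(default_factory=dict)  # layout not specified here


# Usage: two documents that follow different DTDs can sit in the same folder.
dtd_x = DTD(dtd_id=131, source="<!ELEMENT book (#PCDATA)>")
dtd_y = DTD(dtd_id=132, source="<!ELEMENT memo (#PCDATA)>")
doc_a = Document(doc_id=111, dtd=dtd_x, text="<book>...</book>")
doc_c = Document(doc_id=113, dtd=dtd_y, text="<memo>...</memo>")
folder = DocumentFolder(folder_id=121, documents=[doc_a, doc_c])
```

Keeping the name-to-ID table on the DTD object rather than on each document reflects the one-table-per-DTD arrangement described in the text.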

[0029] When a structured document (hereinafter simply "document") is stored in the database 1, the document storage execution unit 21 is invoked; when a search is executed, the search execution unit 22 is invoked.

[0030] The document storage execution unit 21 includes an element tree index creation unit 211, an element-attribute index creation unit 212, and an element-attribute ID table creation unit 213. When a document is stored in the database 1, the element tree index creation unit 211 creates the element tree index 14, the element-attribute index creation unit 212 creates the element-attribute index 15, and the element-attribute ID table creation unit 213 creates the element-attribute ID table 16.

[0031] The present invention provides a data storage structure for storing documents and elements in a database, a memory-saving index structure based on that storage structure, and a method of retrieving elements based on that index structure. The present invention can also be used in a system that classifies and stores multiple documents in a database and searches over each classified set of documents. In particular, it accepts search requests, in units of documents and elements, whose constraints are the logical structure of the documents, that is, the relationships among the elements in a document, and it enables retrieval in units of documents and elements.

[0032] To make search processing fast, the data structures of the indexes that hold the documents' logical structure and the information on document elements and element attributes are made as small as possible, so that search is carried out without consuming much of the computer's memory.

[0033] [Database schema for document management and search] First, the database schema of the present invention is described. FIG. 2 shows the classes of the database schema. The notation used in FIG. 2 follows OMT (Object Modeling Technique).

[0034] In the present invention, storing a document in the database means that the document storage execution unit 21 creates and registers the document to be stored as instances of the respective classes.

[0035] Likewise, searching for and retrieving a document from the database means that the document search execution unit 22 identifies the instances of each class that satisfy the search conditions and outputs those instances from the system as documents.

[0036] To store documents and DTDs in the database 1, the document class 11 and the DTD class 13 are provided as classes of its schema.

[0037] Each document and each DTD stored in the database 1 is managed as an instance of the document class 11 or of the DTD class 13, respectively.

[0038] An instance of the document class 11 is called a "document object". Similarly, an instance of the document folder class 12 in FIG. 2 is called a "document folder object", and an instance of the DTD class 13 is called a "DTD object".

[0039] Each instance of a class is assigned an ID (identification information) that uniquely identifies it within that class.

[0040] This ID may be assigned by the database management system, or it may be set by the application so as to be unique.

[0041] FIG. 3 is a diagram for explaining the present invention and shows an example of the relationship between document objects and DTD objects.

[0042] In FIG. 3, (document) A111, (document) B112, and (document) C113 are instances of the document class 11, and (DTD) X131 and (DTD) Y132 are instances of the DTD class 13. The database 1 manages these individual instances and also manages the links between instances that indicate which document was created on the basis of which DTD.

[0043] The document class 11 manages all documents in the database 1, that is, the full set of document objects, and the document class 11 and the document objects 111, 112, and 113 are in an is-a relationship. The document folder class 12 is provided as a means of managing sets of document objects.

[0044] The document folder class 12 manages arbitrary sets of document objects. In practice, a set of document objects that satisfy some condition is managed by a single instance of the document folder class 12.

[0045] Examples are grouping documents by author, or grouping them by period or by publisher, and managing each resulting set of documents. In the present invention, the document folder class 12 and the document class 11 are in a member-of relationship.

[0046] Instances of the document folder class 12, that is, document folder objects, have a structure for building a classification hierarchy based on a tree structure.

[0047] That is, every document folder object can be related to zero or one document folder object as its parent and to zero or more document folder objects as its children.

[0048] In the example instances shown in FIG. 4, (document folder) α121 is the parent of (document folder) β122 and (document folder) γ123, which are its children.

[0049] Similarly, (document folder) β122 is the parent of (document folder) δ124 and (document folder) ε125, which are its children, and (document folder) γ123 is the parent of (document folder) ζ126, which is its child.

[0050] As application examples, a document folder object for documents grouped by author can be made the parent, with document folder objects for authors divided by gender created as its children to classify the documents. Alternatively, when documents are classified by period, a document folder object that groups the 1990s can be made the parent, with a document folder object for each year of the 1990s as its children.

[0051] There is no strict rule about how parent-child relationships are actually assigned, but in general they form a part-of relationship in which the parent document folder object is made up of its child document folder objects.
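The folder hierarchy of FIG. 4 and the part-of reading above can be sketched as follows. This is a minimal illustration, not the patent's storage format: the `DocumentFolder` class here, its `path` and `all_documents` methods, and the document names are all hypothetical, and treating a parent folder's contents as the union of its children's documents is one assumed way to realize the part-of relationship.

```python
class DocumentFolder:
    """Hypothetical folder node with zero or one parent and zero or more children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.documents = []          # documents placed directly in this folder
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Folder names from the root down to this folder."""
        names = []
        node = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))

    def all_documents(self):
        """Own documents plus those of every descendant folder (assumed part-of reading)."""
        docs = list(self.documents)
        for child in self.children:
            docs.extend(child.all_documents())
        return docs


# The hierarchy described for FIG. 4.
alpha = DocumentFolder("α121")
beta = DocumentFolder("β122", parent=alpha)
gamma = DocumentFolder("γ123", parent=alpha)
delta = DocumentFolder("δ124", parent=beta)
epsilon = DocumentFolder("ε125", parent=beta)
zeta = DocumentFolder("ζ126", parent=gamma)

delta.documents.append("doc-1")      # hypothetical document names
zeta.documents.append("doc-2")
```

With this layout, a search scoped to β122 would only have to consider the documents reachable from β122, while a search scoped to α121 covers the whole hierarchy.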

[0052] [Element-attribute ID table] To retrieve specific documents or elements by specifying search conditions on the logical structure of structured documents, character-string data such as the element names and attribute names contained in the documents must be specified concretely in the search conditions.

[0053] To retrieve the documents or elements that match search conditions containing such string data, the string data of the document objects in the database must be loaded into the computer's memory and compared against the element names and attribute names in the search conditions. However, string data such as element names and attribute names is generally far smaller in size (length) than the string data of the whole document held by a document object, so loading entire document objects into memory as they are is inefficient. For this reason, an index that contains only the information needed to decide whether a search condition is satisfied, such as element names and attribute names, is created in advance, and at search time only the index is loaded into memory for processing.

[0054] When an index handles element names and attribute names as string data, the computer's memory needs a data area as long as each string. Concretely, on many computers one character occupies one byte (kana and kanji occupy two bytes per character). Consequently, if the element names or attribute names in the database are long, correspondingly more memory is needed to load them, which is a disadvantage for computational performance.

[0055] Therefore, to reduce the memory consumed by strings such as element names and attribute names in the indexes, the present invention provides the element-attribute ID table 16, which has a one-to-one relationship with the DTD class 13.
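The memory argument can be made concrete with a small comparison of a name stored as text against the same name stored as a fixed-width number. The element names, the ID value 30000, the 4-byte integer width, and the Shift_JIS encoding are all assumptions chosen for illustration; the text itself only says that IDs are numerical values.

```python
import struct

# Hypothetical long element name, stored as text: one byte per ASCII character.
ascii_name = "bibliographic-reference-entry"
ascii_bytes = ascii_name.encode("ascii")

# Hypothetical kanji element name in a two-byte-per-character encoding.
kanji_name = "文書要素"
kanji_bytes = kanji_name.encode("shift_jis")

# The same name replaced by a numeric ID, packed as a 4-byte unsigned integer.
id_bytes = struct.pack("<I", 30000)

sizes = {
    "ascii name": len(ascii_bytes),
    "kanji name": len(kanji_bytes),
    "numeric id": len(id_bytes),
}
```

The numeric form has the same size however long the name is, so the saving grows with the length of the names and with the number of index entries that refer to them.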

[0056] The kinds and number of element names and attribute names used in a structured document are limited by the DTD that the document conforms to. Within a single DTD, each element name is unique, and each attribute name is unique within the scope of its element in that DTD.

[0057] Each element name defined in the DTD is given, as a numerical value, an element name ID that is unique to it, and a table of the correspondence between element names and element name IDs is created. This table is stored in the database as an instance of the element-attribute ID table 16, in a one-to-one relationship with the DTD object.

[0058] Attribute names are likewise given numerical attribute name IDs that are unique with respect to the elements in the DTD object and that remain distinguishable even when the same attribute name occurs under different element names. That is, each attribute name ID is assigned a value from which it can be determined which element the attribute name belongs to.

[0059] The correspondence between attribute names and attribute name IDs is then created as a table, and that table is stored in the database as the instance of the element-attribute ID table 16 created earlier.
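One possible shape for such a table is sketched below. The class `ElementAttributeIdTable`, its methods, and the constant `BOUNDARY` are hypothetical. The particular numbering (element name ID in the upper decimal digits with zeros below, attribute name ID equal to the owning element's ID plus a per-element sequence number) is only one assumed scheme that satisfies the stated requirements: IDs are numeric, element name IDs are unique within the DTD, and the owning element can be recovered from an attribute name ID.

```python
BOUNDARY = 10_000  # assumed: four lower decimal digits reserved for attributes


class ElementAttributeIdTable:
    """Hypothetical per-DTD table mapping names to numeric IDs."""

    def __init__(self):
        self.element_ids = {}        # element name -> element name ID
        self.attribute_ids = {}      # (element name, attribute name) -> attribute name ID
        self._next_attribute = {}    # element name -> next sequence number

    def add_element(self, name):
        """Register an element name (idempotent) and return its ID."""
        if name not in self.element_ids:
            self.element_ids[name] = (len(self.element_ids) + 1) * BOUNDARY
            self._next_attribute[name] = 1
        return self.element_ids[name]

    def add_attribute(self, element, attribute):
        """Register an attribute of an element (idempotent) and return its ID."""
        key = (element, attribute)
        if key not in self.attribute_ids:
            base = self.add_element(element)
            self.attribute_ids[key] = base + self._next_attribute[element]
            self._next_attribute[element] += 1
        return self.attribute_ids[key]

    def element_of(self, attribute_id):
        """Recover the owning element name from an attribute name ID."""
        base = attribute_id - attribute_id % BOUNDARY
        for name, element_id in self.element_ids.items():
            if element_id == base:
                return name
        raise KeyError(attribute_id)
```

Because the same attribute name under two different elements lands in two different ID ranges, an index can store the bare number and still tell the two apart.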

[0060] FIG. 8 shows an example of an instance of the element-attribute ID table 16 in one embodiment of the present invention. The numerical values assigned to element name IDs and attribute name IDs may be arbitrary, provided that the uniqueness within the DTD and within each element described above is guaranteed. One example of how to assign element name IDs and attribute name IDs for a DTD is given below.

[0061] FIG. 5 is a flowchart showing the procedure for assigning element name IDs in one embodiment of the present invention. The DTD handled here is assumed to be valid. Where the DTD description is split across multiple documents through use of the DTD internal subset, external subset, external entities, and so on, it is assumed that a DTD parser for structured documents has merged everything into one document, so that all of the DTD information is described in a single document. It is also assumed that the DTD parser has already expanded parameter entities. The procedure shown in FIG. 5 reads the description of such a DTD sequentially from beginning to end.

【００６２】図５を参照すると、ステップＳ101で変数
ｉの初期化し、ステップＳ102、Ｓ107間のループ処理を
行う。このループ内の処理では、まず、ステップＳ103
でＤＴＤ内の要素型宣言（element type declaratio
n）を探し、要素型宣言が見つかった場合、ステップＳ1
05で、該要素宣言の要素名の要素ＩＤを上位桁の値をｉ
とし、下位の桁の値を０とした数値とし、つづいてステ
ップS106で変数ｉを一つインクリメントする。ステップ
Ｓ104の判定において、要素型宣言が見つらない場合、
処理を終了する。なお、要素型宣言は、文書要素の名
前、要素の階層構造（親子、兄弟関係等）を規定するも
のである。Referring to FIG. 5, the variable i is initialized in step S101, and the loop between steps S102 and S107 is then executed. Inside the loop, step S103 first searches the DTD for an element type declaration. If one is found, step S105 sets the element name ID of the element name in that declaration to a numerical value whose upper digits are i and whose lower digits are 0, and step S106 then increments the variable i by one. If the determination in step S104 finds no element type declaration, the process ends. An element type declaration defines the name of a document element and the hierarchical structure of elements (parent-child relationships, sibling relationships, and so on).
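The loop of FIG. 5 can be sketched in Python as follows. This is only an illustrative sketch, not the implementation of the embodiment: the regular expression, the function name `assign_element_name_ids`, the 16-bit lower part, and the starting value 1 for i are all assumptions (the text only states that i is initialized), and the DTD text is assumed to be already merged and expanded as described above.

```python
import re

# Hypothetical, simplified matcher for "<!ELEMENT name ..." declarations.
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+(\S+)")

def assign_element_name_ids(dtd_text, lower_bits=16, first=1):
    """Give each declared element an ID whose upper part is i and lower part is 0."""
    ids = {}
    i = first                                       # step S101 (start value is assumed)
    for match in ELEMENT_DECL.finditer(dtd_text):   # steps S103/S104: next declaration
        ids[match.group(1)] = i << lower_bits       # step S105: upper = i, lower = 0
        i += 1                                      # step S106
    return ids

dtd = "<!ELEMENT A (B+)>\n<!ELEMENT B (#PCDATA)>"
ids = assign_element_name_ids(dtd)
print({name: f"{value:08X}" for name, value in ids.items()})
# {'A': '00010000', 'B': '00020000'}
```

Shifting i into the upper half leaves the whole lower half free, so attribute name IDs can later be formed under the same element without colliding with the element name ID itself.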

【００６３】図５において、ステップＳ105における要
素名ＩＤの上位桁の値とは、数字の桁において予め定め
た桁の間を境として左側の桁の値ことをいい、下位桁の
値とはその右側の桁の値ことをいう。例えば数値「1234
5678」で境目の桁の間を4、5桁目とすると、上位桁の値
は「1234」、下位桁の値は「5678」となる。In FIG. 5, the upper-digit value of the element name ID in step S105 means the value of the digits to the left of a predetermined boundary between two digit positions, and the lower-digit value means the value of the digits to the right of that boundary. For example, if the boundary in the numerical value "12345678" lies between the 4th and 5th digits, the upper-digit value is "1234" and the lower-digit value is "5678".
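The split into upper and lower digits is plain positional arithmetic. The following minimal sketch reproduces the "12345678" example; the helper name `split_digits` and its `radix` parameter are hypothetical and introduced only for illustration.

```python
def split_digits(value, lower_digits, radix=10):
    """Return (upper, lower) for a boundary `lower_digits` places from the right."""
    return divmod(value, radix ** lower_digits)

print(split_digits(12345678, 4))               # (1234, 5678)
print(split_digits(0x00030002, 4, radix=16))   # (3, 2)
```

The second call shows the same split applied to a hexadecimal value with four hexadecimal digits (2 bytes) on each side of the boundary.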

【００６４】図６は、属性名ＩＤを与える処理手順を示
すフローチャートである。FIG. 6 is a flowchart showing a processing procedure for giving an attribute name ID.

【００６５】ステップＳ111とステップＳ122間のループ
処理は、ＤＴＤ内で属性リスト宣言（attribute-list
declaration）を探して処理するものであり、属性リス
ト宣言内の要素名に相当する要素名ＩＤを求め、属性リ
スト宣言が見つかった場合、変数ｉを初期化して、ルー
プ処理に入り、属性リスト宣言内で属性名を探し属性名
が見つかった場合、その属性名の属性ＩＤを上位桁の値
を要素名ＩＤ、下位桁の値をｉとした値とする。なお、
ＤＴＤ内の属性リスト宣言は、要素の付加情報として属
性を定義するものであり、どの要素にどの属性が付く
か、属性名、属性として指定可能な値、デフォルト値等
を規定するものである。The loop between steps S111 and S122 searches the DTD for attribute-list declarations and processes each one. When an attribute-list declaration is found, the element name ID corresponding to the element name in the declaration is obtained, the variable i is initialized, and an inner loop is entered that searches the declaration for attribute names. When an attribute name is found, its attribute name ID is set to a value whose upper digits are the element name ID and whose lower digits are i. An attribute-list declaration in a DTD defines attributes as additional information for an element; it specifies which attributes are attached to which element, the attribute names, the values that may be given to each attribute, default values, and so on.
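A rough Python sketch of the FIG. 6 procedure is given below. It is not the embodiment's code: the regular expression, the function name `assign_attribute_name_ids`, the starting value 1 for i, and the assumption that every attribute definition consists of exactly three whitespace-separated tokens (name, type, default) are simplifications made for illustration. Element name IDs are assumed to have been assigned beforehand with the lower 2 bytes set to 0.

```python
import re

# Hypothetical, simplified matcher: "<!ATTLIST element name type default ...>".
ATTLIST_DECL = re.compile(r"<!ATTLIST\s+(\S+)\s+(.*?)>", re.S)

def assign_attribute_name_ids(dtd_text, element_ids, first=1):
    """Attribute ID = element name ID (upper part) combined with counter i (lower part)."""
    attr_ids = {}
    for match in ATTLIST_DECL.finditer(dtd_text):
        element, body = match.group(1), match.group(2)
        element_id = element_ids[element]   # ID of the element named in the declaration
        i = first                           # i is re-initialized for each declaration
        # Assumption: each definition is exactly "name type default" (three tokens).
        for attr_name in body.split()[0::3]:
            attr_ids[(element, attr_name)] = element_id | i
            i += 1
    return attr_ids

element_ids = {"A": 0x00010000, "B": 0x00020000}   # assumed to exist already
dtd = '<!ATTLIST B a1 CDATA #IMPLIED a2 CDATA "2">'
attr_ids = assign_attribute_name_ids(dtd, element_ids)
print({key: f"{value:08X}" for key, value in attr_ids.items()})
# {('B', 'a1'): '00020001', ('B', 'a2'): '00020002'}
```

Because the counter restarts for every declaration, this sketch would assign colliding IDs if the same element had two separate attribute-list declarations; a fuller version would have to continue the counter per element.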

【００６６】当該ＤＴＤのすべての要素名の要素名ＩＤ
が割り振られた後、前記ＤＴＤの記述の先頭から最後ま
でを連続的に読み込むこととする。This procedure is carried out after element name IDs have been assigned to all element names in the DTD, and it reads the DTD description sequentially from beginning to end.

【００６７】図６のステップＳ119における属性名ＩＤ
の上位桁と下位桁の意味は、図５のステップＳ105にお
ける要素名ＩＤの場合と同じである。ただし条件とし
て、要素名ＩＤと属性名ＩＤの上位桁と下位桁の境目の
位置は同じとする。要素名ＩＤと属性名ＩＤの数値の桁
数をどこまで用意するか、また上位桁と下位桁をどこで
分けるかなどは、文書のＤＴＤの中で前述の一意性を保
証すべき要素名や属性名がどれほどの数になるか予測を
立てて決定することになる。The upper and lower digits of the attribute name ID in step S119 of FIG. 6 have the same meaning as for the element name ID in step S105 of FIG. 5, with the condition that the boundary between upper and lower digits is at the same position for element name IDs and attribute name IDs. How many digits to provide for the numerical values of the element name ID and attribute name ID, and where to place the boundary between upper and lower digits, are decided by estimating how many element names and attribute names in the document's DTD need the uniqueness guarantee described above.

【００６８】仮に要素名ＩＤと属性名ＩＤがそれぞれ４
バイトとし、その上位桁と下位桁をそれぞれ２バイトず
つとすると、文書のＤＴＤの中で前述の一意性の保証が
可能な要素名の数は65535個(２¹⁶−1)となり、同じく文
書のＤＴＤの中の各要素の中で前述の一意性の保証が可
能な属性名の数も65535個となる。４バイトというデー
タ量は、文字列に換算すると４文字分にしか相当しな
い。実際、文字列は最後がヌル文字で終わることを考え
ると、実質３文字分である。ＤＴＤ内の要素名や属性名
が３文字を超えて定義されている場合、要素名や属性名
を表すデータ領域を節約することができる。Suppose that the element name ID and the attribute name ID are each 4 bytes long, with 2 bytes each for the upper and lower digits. The number of element names whose uniqueness can be guaranteed within the document's DTD is then 65535 (2¹⁶−1), and the number of attribute names whose uniqueness can be guaranteed within each element of the DTD is likewise 65535. Expressed as a character string, 4 bytes correspond to only four characters; since a string ends with a null character, the effective length is three characters. Therefore, whenever an element name or attribute name in the DTD is longer than three characters, using the ID instead saves data area that would otherwise be needed to represent the name.

【００６９】次に、本発明の一実施の形態における要素
−属性ＩＤテーブル１６の構造について説明する。Next, the structure of the element-attribute ID table 16 in the embodiment of the present invention will be described.

【００７０】前述の通り、要素−属性ＩＤテーブルに
は、要素名と要素名ＩＤ、属性名と属性名ＩＤのそれぞ
れの対応関係のテーブルを含む。よって、要素−属性Ｉ
Ｄテーブルの構造として、少なくとも、(要素名ＩＤ、
要素名)、(属性名ＩＤ、属性名)の二つの項目を有する
データ構造を用意する。As described above, the element-attribute ID table contains tables of the correspondence between element names and element name IDs and between attribute names and attribute name IDs. The element-attribute ID table is therefore given a data structure having at least the two items (element name ID, element name) and (attribute name ID, attribute name).

【００７１】要素−属性ＩＤテーブル１６におけるデー
タエントリの順序は、それぞれ要素名、属性名の辞書的
な順序でソートしてリストとしておく。これは、各デー
タエントリの要素名、属性名の辞書的な順序を比較する
二分探索によって、それぞれ対応する要素名ＩＤ、属性
名ＩＤをより早く調べられるようにするためである。The data entries in the element-attribute ID table 16 are kept as lists sorted in lexicographic order of element name and of attribute name, respectively. This allows the corresponding element name ID or attribute name ID to be looked up more quickly by a binary search that compares the lexicographic order of the element names or attribute names of the entries.
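The lookup described here can be sketched with Python's standard `bisect` module. The two parallel lists, the sample IDs, and the function name `lookup_id` are hypothetical and chosen only to illustrate a binary search over entries sorted by name.

```python
from bisect import bisect_left

def lookup_id(sorted_names, ids, name):
    """Binary-search `sorted_names` and return the ID stored at the same position."""
    pos = bisect_left(sorted_names, name)
    if pos < len(sorted_names) and sorted_names[pos] == name:
        return ids[pos]
    return None   # name is not in the table

# Hypothetical element entries, already sorted lexicographically by name.
element_names = ["A", "B", "C", "D"]
element_name_ids = [0x00010000, 0x00020000, 0x00030000, 0x00040000]

print(f"{lookup_id(element_names, element_name_ids, 'C'):08X}")   # 00030000
print(lookup_id(element_names, element_name_ids, "Z"))            # None
```

A binary search needs on the order of log₂(n) comparisons for n entries, which is the reason for keeping the entries sorted.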

【００７２】要素−属性ＩＤテーブル作成部213は、上
記したデータ構造の要素−属性ＩＤテーブルを作成し、
ＤＴＤオブジェクトと一対一の関係を保持したままデー
タベース１に格納する。The element-attribute ID table creating unit 213 creates an element-attribute ID table having the data structure described above and stores it in the database 1 while maintaining its one-to-one relationship with the DTD object.

【００７３】図７、及び図８は、以上の手順に従って要
素名ＩＤ、属性名ＩＤを割り振った例を示す図である。
図７にはＤＴＤの一例が示されている。図７において、
「！ELEMENT 要素名内容モデル」が要素型宣言であ
あり、「！ATTLIST 要素名、属性値候補、デフォルト
値」は属性リスト宣言である。またＣＤＡＴＡは文字デ
ータのことをいう。要素型宣言の＃ＰＣＤＡＴＡは混在
内容（mixed type）を指定するものである。FIGS. 7 and 8 show an example in which element name IDs and attribute name IDs are assigned according to the above procedure. FIG. 7 shows an example of a DTD. In FIG. 7, "!ELEMENT element-name content-model" is an element type declaration, and "!ATTLIST element-name, attribute value candidates, default value" is an attribute-list declaration. CDATA denotes character data. #PCDATA in an element type declaration specifies mixed content.

【００７４】図８は、図７に示すＤＴＤの中の要素名や
属性名に対応して要素名ＩＤ、属性名ＩＤを割り振った
例を示している。要素名ＩＤ、属性名ＩＤはヘキサデシ
マル表示で示してある。この例では、要素名ＩＤと属性
名ＩＤの値は16進数で４バイト、上位桁下位桁共に２バ
イトずつとしている。FIG. 8 shows an example in which element name IDs and attribute name IDs are assigned to the element names and attribute names in the DTD shown in FIG. 7. The element name IDs and attribute name IDs are shown in hexadecimal notation. In this example, each element name ID and attribute name ID is a 4-byte value, with 2 bytes each for the upper and lower digits.

【００７５】［要素ツリーインデックス］要素ツリーイ
ンデックス１４は、要素−属性ＩＤテーブル１６の要素
名ＩＤを用いて、各文書における構造、すなわち要素の
木構造を表現するデータ構造である。要素ツリーインデ
ックス１４は各文書オブジェクトごとに作成して保持さ
せる。[Element Tree Index] The element tree index 14 is a data structure that expresses the structure in each document, that is, the tree structure of the element, using the element name ID of the element-attribute ID table 16. The element tree index 14 is created and held for each document object.

【００７６】要素ツリーインデックス１４は、(要素名
ＩＤ、要素レベル、要素開始位置)の三つ組のデータ構
造を持つデータエントリのリストである。The element tree index 14 is a list of data entries having a triple data structure of (element name ID, element level, element start position).

【００７７】この三つ組データは、文書中の要素と対応
し、データの各項目には、それぞれ要素の要素名に対応
する要素名ＩＤ、要素を木構造としたときのルート要素
からの深さ、文書中の要素の記述（開始タグを含む）の
開始バイト位置の値がそれぞれ格納される。Each triple corresponds to an element in the document, and its items store, respectively, the element name ID corresponding to the element's name, the depth of the element from the root element when the elements are viewed as a tree, and the starting byte position in the document of the element's description (including its start tag).

【００７８】そして、要素ツリーインデックス１４にお
けるこれらのエントリの順番は、文書中の要素の開始タ
グの出現順番と等しくなるようにする。この開始タグの
出現順番は換言すれば、要素の木構造を構成したときの
左深さ優先順序となる。The entries in the element tree index 14 are placed in the same order as the start tags of the elements appear in the document. In other words, this order is the left depth-first order of the tree formed by the elements.

【００７９】要素ツリーインデックスの作成手順を示
す。要素ツリーインデックス１４の作成は、要素ツリー
インデックス作成部２１１で行う。A procedure for creating an element tree index will be described. The element tree index 14 is created by the element tree index creating unit 211.

【００８０】図９は、要素ツリーインデックスの作成手
順を示すフローチャートである。要素ツリーインデック
スの三つ組のデータ構造（要素名ＩＤ、要素レベル、要
素開始位置）の各項目の値を求め、これをエントリとし
て、要素ツリーインデックス１４のリストに追加する。
ただし、ここで扱う文書は、ＤＴＤに対する妥当性が保
証され、前記ＤＴＤ自身の妥当性も保証されていること
を前提とする。また文書中のタグの省略、属性値の省略
等は、構造化文書のパーサによってすべて補完されてい
ることを前提とする。図９に示した要素ツリーインデッ
クスの作成手順は、このような文書の記述を先頭から最
後までを連続的に読み込むものとする。FIG. 9 is a flowchart showing the procedure for creating the element tree index. The value of each item of the triple (element name ID, element level, element start position) is obtained, and the triple is added as an entry to the list of the element tree index 14. The document handled here is assumed to be valid with respect to its DTD, and the DTD itself is also assumed to be valid. It is further assumed that any omitted tags, omitted attribute values, and the like in the document have all been completed by the structured document parser. The procedure for creating the element tree index shown in FIG. 9 reads the description of such a document sequentially from beginning to end.

【００８１】ステップＳ131では、変数ｉを０に初期化
し、ステップＳ132のループａ開始端とステップＳ143の
終了端間のループ処理を行う。In step S131, the variable i is initialized to 0, and the loop between the start of loop a in step S132 and its end in step S143 is executed.

【００８２】ステップＳ133では、要素の開始タグか終
了タグのどちらかを探す。In step S133, either the start tag or the end tag of the element is searched.

【００８３】タグが見つかりそれが開始タグの場合（ス
テップＳ134、Ｓ135）、開始タグの最初の文字位置を求
め（ステップＳ136）、開始タグの要素名を求め（ステ
ップＳ137）、当該文書オブジェクトと関係するＤＴＤ
オブジェクトから要素−属性ＩＤテーブルを取り出し、
要素名に対応する要素ＩＤを求め（ステップＳ138）、
要素レベルの値を変数ｉの値とし、要素名ＩＤ、要素レ
ベル、要素開始位置の三つ組データを、要素ツリーイン
デックスのエントリに追加し（ステップＳ140）、つづ
いて変数ｉを１つインクリメントし（ステップＳ14
1）。When a tag is found and it is a start tag (steps S134 and S135), the position of the first character of the start tag is obtained (step S136), the element name of the start tag is obtained (step S137), the element-attribute ID table is retrieved from the DTD object associated with the document object and the element name ID corresponding to the element name is obtained (step S138), the element level is set to the value of the variable i, the triple (element name ID, element level, element start position) is added as an entry to the element tree index (step S140), and the variable i is then incremented by one (step S141).

【００８４】一方、ステップＳ135で、終了タグの場
合、変数を１つデクリメントする（ステップＳ142）。On the other hand, if the tag is found in step S135 to be an end tag, the variable i is decremented by one (step S142).

【００８５】またステップＳ134でタグが見つらない場
合、ループを飛び出して終了する。If the tag is not found in step S134, the loop is exited and the process ends.
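The creation procedure of FIG. 9 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the embodiment's implementation: tags are found with a hypothetical regular expression, start positions are 0-based character offsets rather than byte positions, and the document is assumed to contain no empty-element tags, comments, or ">" characters inside attribute values. The `element_ids` mapping stands in for the element-attribute ID table.

```python
import re

# Hypothetical tag matcher: group 1 is "/" for an end tag, group 2 is the element name.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)[^>]*>")

def build_element_tree_index(text, element_ids):
    """Return a list of (element name ID, element level, element start position)."""
    index = []
    i = 0                                        # step S131: current element level
    for tag in TAG.finditer(text):               # step S133: next start or end tag
        if tag.group(1):                         # end tag
            i -= 1                               # step S142
        else:                                    # start tag
            start = tag.start()                  # step S136
            name_id = element_ids[tag.group(2)]  # steps S137 and S138
            index.append((name_id, i, start))    # step S140
            i += 1                               # step S141
    return index

element_ids = {"A": 0x00010000, "B": 0x00020000, "C": 0x00030000, "D": 0x00040000}
doc = '<A><B a1="1"><C>x</C><D>y</D></B><B><C>z</C></B></A>'
for name_id, level, start in build_element_tree_index(doc, element_ids):
    print(f"({name_id:08X}, {level}, {start})")
```

Since entries are appended in the order in which start tags are encountered, the resulting list is automatically in left depth-first order.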

【００８６】要素ツリーインデックスの例を示す。図７
のＤＴＤに従う文書として、図１０に示すような文書が
あるものとする。文書の左側の数字は、文書の行番号を
表すのではなく、それぞれ数字の右側に現れる要素の開
始タグや文字列の最初の文字位置、また終了タグの最後
の文字位置をそれぞれ表すものとする。An example of the element tree index follows. Assume that the document shown in FIG. 10 is a document conforming to the DTD of FIG. 7. The numbers to the left of the document are not line numbers; each number gives the position of the first character of the element start tag or character string that appears to its right, or the position of the last character of the end tag.

【００８７】図１０に示した文書中のタグ付けによって
識別される要素の関係を木構造で表すと、図１１に示す
ようなものとなる。When the relationships among the elements identified by the tagging in the document of FIG. 10 are represented as a tree structure, the result is as shown in FIG. 11.

【００８８】図１１の木構造において、ノード上のラベ
ルは要素名を表し、ノードの添字はそのノードのラベル
に対応した要素の開始位置を表している。In the tree structure of FIG. 11, the label on the node represents the element name, and the subscript of the node represents the start position of the element corresponding to the label of the node.

【００８９】図１０の文書の要素の開始タグの出現順番
は、図１１の木構造の左深さ優先順序となっている。The order in which the start tags of the elements appear in the document of FIG. 10 is the left depth-first order of the tree structure of FIG. 11.

【００９０】この要素の順番に従い、(要素名ＩＤ、要
素レベル、要素開始位置)の三つ組データのリストが作
成される。その結果は、図１２に示す通りである。According to the order of the elements, a list of triplet data (element name ID, element level, element start position) is created. The result is as shown in FIG.

【００９１】［文書の論理構造に基づく検索］文書の論
理構造に基づく検索には、文書の要素名、属性名、属性
値、要素の中身の文書に関する条件と、要素間の関係に
関する条件を満たすかどうかの判定を含む。[Search Based on Logical Structure of Document] A search based on the logical structure of a document involves determining whether conditions on the document's element names, attribute names, attribute values, and element content, as well as conditions on the relationships between elements, are satisfied.

【００９２】例えば図１０に示す文書に対して、文書の
要素名、属性名、属性値、要素の中身の文書に関する検
索条件「要素名がDであり、その要素の属性a2の値が２
で、その要素の中身に文字列"string1"を含むような要
素を持つ文書」の条件判定は、TRUEとなる。For example, for the document shown in FIG. 10, a search condition on element name, attribute name, attribute value, and element content such as "a document having an element whose element name is D, whose attribute a2 has the value 2, and whose content includes the character string "string1"" evaluates to TRUE.

【００９３】さらに、前記検索条件に要素間の関係に関
する条件「そのような要素Dを子要素に持つ要素の属性a
1の値が１である要素を持つ文書」を加えることも、文
書の論理構造に基づく検索の範囲に入る。この検索要求
を加えた図１０の文書の条件判定はTRUEとなる。Furthermore, adding to the above search condition a condition on the relationship between elements, such as "a document having an element whose attribute a1 has the value 1 and which has such an element D as a child element", also falls within the scope of a search based on the logical structure of the document. With this condition added, the document of FIG. 10 still evaluates to TRUE.

【００９４】以上の例からも分かるように、要素間の関
係に関する条件を満たすかどうかの判定処理には、ある
要素からの相対的な関係にある要素の取出し、例えばあ
る要素の親要素、祖先要素、子要素、子孫要素、兄要
素、弟要素、前要素、次要素の取出しなどが必要とな
る。As these examples show, determining whether a condition on the relationship between elements is satisfied requires retrieving elements that stand in a relative relationship to a given element, for example its parent element, ancestor elements, child elements, descendant elements, elder brother elements, younger brother elements, preceding elements, and following elements.

【００９５】これらの相対的な関係の説明については、
文献（XML Pointer Language(XPointer)の相対ロケー
ション項のancestor、child、descendant、psibling、f
sibling、preceding、following）等の記載が参照され
る。For an explanation of these relative relationships, refer to the literature, for example the descriptions of the relative location terms ancestor, child, descendant, psibling, fsibling, preceding, and following in XML Pointer Language (XPointer).

【００９６】このような相対関係に関する要素の特定
は、要素ツリーインデックスの任意のエントリを起点と
して計算することができる。The identification of elements relating to such a relative relationship can be calculated starting from an arbitrary entry of the element tree index.

【００９７】前記要素ツリーインデックスを用いて、あ
る要素からの親要素、祖先要素、子要素、子孫要素、兄
要素、弟要素、前要素、次要素を取出す手順を、それぞ
れ図１３乃至図１８、図２１、図２２にフローチャート
として示す。The procedures for retrieving, by means of the element tree index, the parent element, ancestor elements, child elements, descendant elements, elder brother elements, younger brother elements, preceding elements, and following elements of a given element are shown as flowcharts in FIGS. 13 to 18, FIG. 21, and FIG. 22, respectively.

【００９８】［親要素の取出し］図１３のフロチャート
を参照して、親要素の取出しの処理手順について説明す
る。[Retrieval of Parent Element] Referring to the flowchart of FIG. 13, the procedure for retrieving the parent element will be described.

【００９９】図１３を参照すると、ステップＳ161で
は、変数の初期化を行う。変数parentとelemは、要素ツ
リーインデックスの三つ組のエントリ、要素名ＩＤ、要
素レベル、要素開始位置の値を格納するための変数であ
る。Referring to FIG. 13, variables are initialized in step S161. The variables parent and elem each hold an entry of the element tree index, that is, a triple of element name ID, element level, and element start position.

【０１００】それぞれの変数の用途は、parentは、親要
素のエントリを格納するためのものであり、変数elem
は、計算の過程における要素ツリーインデックスの現在
エントリを格納するためのものである。As for the purpose of each variable, parent stores the entry of the parent element, and elem stores the current entry of the element tree index during the calculation.

【０１０１】変数baseは、相対関係の起点となる要素
(起点要素)のエントリの要素レベルの値を格納する。The variable base stores the element level of the entry of the element that is the starting point of the relative relationship (the starting element).

【０１０２】ステップＳ162の一つ前の要素があるかの
判定は、要素ツリーインデックスにおけるエントリのリ
ストにおいて、起点となるエントリより一つ前のエント
リが存在するかということである。The determination as to whether or not there is the previous element in step S162 is whether or not there is an entry before the entry serving as the starting point in the list of entries in the element tree index.

【０１０３】図１２に示した要素ツリーインデックス１
４の例に即して説明する。例えば起点となるエントリが
3番目のエントリ(00030000、2、30)であるとすると、
「一つ前の要素」とは、当該エントリよりも一つ前の、
2番目のエントリ(00020000、1、20)となる。This is explained using the example of the element tree index 14 shown in FIG. 12. If, for example, the starting entry is the third entry (00030000, 2, 30), the "previous element" is the second entry (00020000, 1, 20), which immediately precedes it.

【０１０４】よって、この例の場合、「一つ前の要素が
ある」と判定され、ステップ63へ進む。Therefore, in this example it is determined that "there is a previous element", and the process proceeds to step S163.

【０１０５】ステップＳ163では、ステップＳ162におけ
る「一つ前の要素」に相当するエントリを変数elemに代
入する。In step S163, the entry corresponding to the "previous element" in step S162 is assigned to the variable elem.

【０１０６】ステップＳ164では、ステップＳ163のelem
の要素レベルが、ステップ61の初期化で設定したbaseの
値より1少ない値であるか否かを判定する。In step S164, it is determined whether the element level of elem from step S163 is one less than the value of base set in the initialization of step S161.

【０１０７】ステップＳ164の判定の結果YESであれば、
現在のelemが求める親要素のエントリとなり、ステップ
Ｓ165でparentにelemの値を代入して処理を終了する。If the determination in step S164 is YES, the current elem is the entry of the desired parent element; in step S165 the value of elem is assigned to parent, and the process ends.

【０１０８】ステップＳ164の判定結果がNOであれば、
ステップＳ162に戻り、該elemに対する一つ前の要素が
あるか否かを判定して、以降は、前述と同様の処理を繰
り返す。If the determination in step S164 is NO, the process returns to step S162, determines whether there is an element immediately preceding that elem, and thereafter repeats the same processing as described above.
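The parent retrieval of FIG. 13 amounts to a backward scan for the first entry whose level is one less than that of the starting element. The sketch below illustrates this; the sample index (which is not the index of FIG. 12), the 0-based list position `k` of the starting entry, and the function name `find_parent` are assumptions introduced for illustration.

```python
# Hypothetical index: (element name ID, element level, element start position).
INDEX = [
    (0x00010000, 0, 0),    # A
    (0x00020000, 1, 10),   #   B
    (0x00030000, 2, 20),   #     C
    (0x00040000, 2, 30),   #     D
    (0x00020000, 1, 40),   #   B
    (0x00030000, 2, 50),   #     C
]

def find_parent(index, k):
    """Return the entry of the parent of index[k], or None if there is none."""
    base = index[k][1]                 # step S161: level of the starting element
    for j in range(k - 1, -1, -1):     # steps S162/S163: step to the previous entry
        elem = index[j]
        if elem[1] == base - 1:        # step S164
            return elem                # step S165: parent = elem
    return None                        # no previous element left

print(find_parent(INDEX, 3))   # (131072, 1, 10): the first B is the parent of D
print(find_parent(INDEX, 0))   # None: the root has no parent
```

The scan stops at the first match because, in left depth-first order, the nearest preceding entry one level up is the element that encloses the starting element.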

【０１０９】［祖先要素の取出し］次に図１４のフロー
チャートを参照して、祖先要素の取出しの処理手順につ
いて説明する。[Retrieval of Ancestor Element] Next, the processing procedure for retrieval of an ancestor element will be described with reference to the flowchart of FIG.

【０１１０】ステップＳ171では、必要な変数の初期化
を行う。変数ancestorとelemは、要素ツリーインデック
スのエントリの値を格納するための変数である。それぞ
れの変数の用途は、ancestorは、フローチャートで求め
る、祖先要素のエントリを格納するためのものであり、
変数elemは、計算の過程における要素ツリーインデック
スの現在エントリを格納するためのものである。In step S171, the necessary variables are initialized. The variables ancestor and elem hold entries of the element tree index. As for the purpose of each variable, ancestor stores the entry of the ancestor element obtained by this flowchart, and elem stores the current entry of the element tree index during the calculation.

【０１１１】変数baseは、起点要素のエントリの要素レ
ベルの値を格納する。変数nは、n番目の祖先要素を取出
すために指定された値である。変数iは、繰り返し処理
に関するカウンタとして用いる。The variable base stores the element level value of the entry of the starting element. The variable n is the value specified to retrieve the nth ancestor element. The variable i is used as a counter for the iterative process.

【０１１２】ステップＳ172からステップＳ174までは、
図１３のステップＳ162からステップＳ164までと同様で
ある。Steps S172 to S174 are the same as steps S162 to S164 of FIG. 13.

【０１１３】ステップＳ174の判定結果がYESであれば、
現在のelemがi番目の祖先要素のエントリとなる。If the determination in step S174 is YES, the current elem is the entry of the i-th ancestor element.

【０１１４】そしてステップＳ175で、そのi番目の祖先
要素elemが求めるn番目の祖先要素か否かを判定し、YES
であればステップＳ177でancestorにelemの値を代入し
て処理を終了する。Then, in step S175, it is determined whether that i-th ancestor element elem is the desired n-th ancestor element; if YES, the value of elem is assigned to ancestor in step S177, and the process ends.

【０１１５】一方、ステップＳ175の判定結果がNOであ
れば、該elemに対する親要素を求めるべく、ステップＳ
176でbeseの値を新たに当該elemの要素レベルに設定し
直し、かつiの値を一つ繰り上げて、ステップＳ172に戻
ってもう一度処理を繰り返す。On the other hand, if the determination in step S175 is NO, then in order to find the parent element of that elem, step S176 resets the value of base to the element level of that elem and increments i by one, and the process returns to step S172 and repeats.
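The n-th ancestor retrieval of FIG. 14 repeats the parent search while lowering base each time a parent is found. The sketch below is illustrative only: the sample index, the 0-based position `k`, the function name `find_nth_ancestor`, and the assumption that the counter i starts at 1 are not taken from the text.

```python
# Hypothetical index: (element name ID, element level, element start position).
INDEX = [
    (0x00010000, 0, 0),    # A
    (0x00020000, 1, 10),   #   B
    (0x00030000, 2, 20),   #     C
    (0x00040000, 2, 30),   #     D
    (0x00020000, 1, 40),   #   B
    (0x00030000, 2, 50),   #     C
]

def find_nth_ancestor(index, k, n):
    """Return the entry of the n-th ancestor of index[k] (1 = parent), or None."""
    base = index[k][1]                 # step S171
    i = 1                              # counter start value is an assumption
    for j in range(k - 1, -1, -1):     # steps S172/S173: step to the previous entry
        elem = index[j]
        if elem[1] == base - 1:        # step S174: elem is the i-th ancestor
            if i == n:                 # step S175
                return elem            # step S177: ancestor = elem
            base = elem[1]             # step S176: continue from this ancestor
            i += 1
    return None

print(find_nth_ancestor(INDEX, 3, 1))   # (131072, 1, 10): parent of D
print(find_nth_ancestor(INDEX, 3, 2))   # (65536, 0, 0): grandparent of D
```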

【０１１６】［子要素の取出し］次に図１５のフローチ
ャートを参照して、子要素の取出しの処理手順について
説明する。[Retrieval of Child Element] Next, with reference to the flow chart of FIG. 15, the procedure for extracting a child element will be described.

【０１１７】ステップＳ181では、必要な変数の初期化
を行う。変数childとelemは、要素ツリーインデックス
のエントリの値を格納するための変数である。それぞれ
の変数の用途は、childは本フローチャートで求める子
要素のエントリを格納するためのものであり、変数elem
は計算の過程における要素ツリーインデックスの現在エ
ントリを格納するためのものである。In step S181, the necessary variables are initialized. The variables child and elem hold entries of the element tree index. As for the purpose of each variable, child stores the entry of the child element obtained by this flowchart, and elem stores the current entry of the element tree index during the calculation.

【０１１８】変数baseは、起点要素のエントリの要素レ
ベルの値を格納する。変数nは、n番目の子要素を取出す
ために指定された値である。The variable base stores the element level value of the entry of the starting element. The variable n is the value specified to retrieve the nth child element.

【０１１９】変数iは繰り返し処理に関するカウンタと
して用いる。The variable i is used as a counter for the repetitive processing.

【０１２０】ステップＳ182の「一つ後の要素がある
か」とは、要素ツリーインデックスにおけるエントリの
リストにおいて、起点となるエントリより一つ後のエン
トリが存在するかということである。The determination in step S182 of whether there is a next element asks whether, in the list of entries of the element tree index, there is an entry immediately after the starting entry.

【０１２１】図１２に示した要素ツリーインデックスの
例で説明すると、例えば起点となるエントリが3番目の
エントリ(00030000、2、30)であるとすると、一つ後の
要素とは、そのエントリより一つ後の、4番目のエント
リ(00040000、3、40)となる。よって、この例の場合
は、一つ後の要素がある」と判定され、ステップ83へと
進む。Taking the element tree index of FIG. 12 as an example, if the starting entry is the third entry (00030000, 2, 30), the next element is the fourth entry (00040000, 3, 40), which immediately follows it. In this example it is therefore determined that "there is a next element", and the process proceeds to step S183.

【０１２２】ステップＳ183では、ステップＳ182におけ
る一つ後の要素に相当するエントリを変数elemに代入す
る。In step S183, the entry corresponding to the next element found in step S182 is assigned to the variable elem.

【０１２３】ステップＳ184では、ステップＳ183のelem
の要素レベルが、ステップＳ181の初期化で設定したbas
eの値よりも大きい値であるか否かを判定する。In step S184, it is determined whether the element level of elem from step S183 is greater than the value of base set in the initialization of step S181.

【０１２４】ステップＳ184の判定の結果YESであれば、
ステップＳ185へ進む。If the determination in step S184 is YES, the process proceeds to step S185.

【０１２５】ステップＳ184の判定の結果がNOであれ
ば、処理を終了する。ステップＳ185では、さらにelem
の要素レベルが、baseの値よりも１多い値であるか否か
を判定する。If the determination in step S184 is NO, the process ends. In step S185, it is further determined whether the element level of elem is exactly one greater than the value of base.

【０１２６】ステップＳ185の判定結果がYESであれば、
現在のelemがi番目の子要素のエントリとなる。そして
ステップＳ186で、そのi番目の子要素elemが求めるn番
目の子要素か否かを判定し、YESであればステップＳ188
でchildにelemの値を代入して処理を終了する。If the determination in step S185 is YES, the current elem is the entry of the i-th child element. Then, in step S186, it is determined whether that i-th child element elem is the desired n-th child element; if YES, the value of elem is assigned to child in step S188, and the process ends.

【０１２７】ステップＳ186の判定結果がNOであれば、
次の子要素を求めるべく、ステップＳ187でiの値を一つ
繰り上げて、ステップＳ182に戻ってもう一度処理を繰
り返す。If the determination in step S186 is NO, then in order to find the next child element, the value of i is incremented by one in step S187, and the process returns to step S182 and repeats.
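The n-th child retrieval of FIG. 15 scans forward while entries remain deeper than the starting element and counts only those exactly one level deeper. In the sketch below, the sample index, the 0-based position `k`, the function name `find_nth_child`, the counter start value 1, and the handling of a NO result in step S185 (assumed to return to step S182) are assumptions, since the text does not spell them out.

```python
# Hypothetical index: (element name ID, element level, element start position).
INDEX = [
    (0x00010000, 0, 0),    # A
    (0x00020000, 1, 10),   #   B
    (0x00030000, 2, 20),   #     C
    (0x00040000, 2, 30),   #     D
    (0x00020000, 1, 40),   #   B
    (0x00030000, 2, 50),   #     C
]

def find_nth_child(index, k, n):
    """Return the entry of the n-th child of index[k], or None if there is none."""
    base = index[k][1]                     # step S181
    i = 1                                  # counter start value is an assumption
    for j in range(k + 1, len(index)):     # steps S182/S183: step to the next entry
        elem = index[j]
        if elem[1] <= base:                # step S184 is NO: left the subtree
            break
        if elem[1] == base + 1:            # step S185: elem is the i-th child
            if i == n:                     # step S186
                return elem                # step S188: child = elem
            i += 1                         # step S187
        # deeper descendants are skipped (assumed return to step S182)
    return None

print(find_nth_child(INDEX, 0, 2))   # (131072, 1, 40): second child of A
print(find_nth_child(INDEX, 1, 3))   # None: the first B has only two children
```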

【０１２８】［子孫要素の取出し］図１６のフローチャ
ートを参照して、子孫要素の取出しの処理手順について
説明する。[Retrieval of Descendant Element] The processing procedure for retrieval of a descendant element will be described with reference to the flowchart of FIG.

【０１２９】ステップＳ191では、必要な変数の初期化
を行う。変数descendantとelemは、要素ツリーインデッ
クスのエントリの値を格納するための変数である。それ
ぞれの変数の用途は、descendantは、この処理で求める
子孫要素のエントリを格納するためのものであり、変数
elemは計算の過程における要素ツリーインデックスの現
在エントリを格納するためのものである。In step S191, the necessary variables are initialized. The variables descendant and elem hold entries of the element tree index. As for the purpose of each variable, descendant stores the entry of the descendant element obtained by this process, and elem stores the current entry of the element tree index during the calculation.

【０１３０】変数baseは、起点要素のエントリの要素レ
ベルの値を格納する。The variable base stores the element level value of the entry of the starting element.

【０１３１】変数nは、n番目の子孫要素を取出すために
指定された値である。The variable n is a value designated to fetch the n-th descendant element.

【０１３２】変数iは、繰り返し処理に関するカウンタ
として用いる。The variable i is used as a counter for the repetitive processing.

【０１３３】ステップＳ192からステップＳ194までは、
図１５のステップＳ182からステップＳ184までと同様で
ある。Steps S192 to S194 are the same as steps S182 to S184 of FIG. 15.

【０１３４】ステップＳ194の判定結果がYESであれば、
現在のelemがi番目の子孫要素のエントリとなる。If the determination in step S194 is YES, the current elem is the entry of the i-th descendant element.

【０１３５】そしてステップＳ195で、そのi番目の子孫
要素elemが求めるn番目の子孫要素か否かを判定し、YES
であれば、ステップ97でdescendantにelemの値を代入し
て処理を終了する。Then, in step S195, it is determined whether that i-th descendant element elem is the desired n-th descendant element; if YES, the value of elem is assigned to descendant in step S197, and the process ends.

【０１３６】ステップＳ195の判定結果がNOであれば、
次の子孫要素を求めるべく、ステップＳ196でiの値を一
つ繰り上げて、ステップＳ192に戻ってもう一度処理を
繰り返す。If the determination in step S195 is NO, then in order to find the next descendant element, the value of i is incremented by one in step S196, and the process returns to step S192 and repeats.
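The descendant retrieval of FIG. 16 is the simplest forward scan: every entry after the starting element is a descendant until the level drops back to base or below. The sample index, the 0-based position `k`, the function name `find_nth_descendant`, and the counter start value 1 in this sketch are assumptions.

```python
# Hypothetical index: (element name ID, element level, element start position).
INDEX = [
    (0x00010000, 0, 0),    # A
    (0x00020000, 1, 10),   #   B
    (0x00030000, 2, 20),   #     C
    (0x00040000, 2, 30),   #     D
    (0x00020000, 1, 40),   #   B
    (0x00030000, 2, 50),   #     C
]

def find_nth_descendant(index, k, n):
    """Return the entry of the n-th descendant of index[k] in document order."""
    base = index[k][1]                     # step S191
    i = 1                                  # counter start value is an assumption
    for j in range(k + 1, len(index)):     # steps S192/S193: step to the next entry
        elem = index[j]
        if elem[1] <= base:                # step S194 is NO: left the subtree
            break
        if i == n:                         # step S195
            return elem                    # step S197: descendant = elem
        i += 1                             # step S196
    return None

print(find_nth_descendant(INDEX, 1, 2))   # (262144, 2, 30): D under the first B
print(find_nth_descendant(INDEX, 1, 3))   # None: the first B has two descendants
```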

【０１３７】［兄要素の取出し］次に図１７のフローチ
ャートを参照して、兄要素の取出しの処理手順について
説明する。[Retrieval of Elder Brother Element] Next, the processing procedure for retrieving an elder brother element will be described with reference to the flowchart of FIG. 17.

【０１３８】ステップＳ201では、必要な変数の初期化
を行う。変数psiblingとelemは、要素ツリーインデック
スのエントリの値を格納するための変数である。In step S201, necessary variables are initialized. The variables psibling and elem are variables for storing the value of the entry of the element tree index.

【０１３９】それぞれの変数の用途は、psiblingは、こ
の処理手順で求める兄要素のエントリを格納するための
ものであり、変数elemは計算の過程における要素ツリー
インデックスの現在エントリを格納するためのものであ
る。As for the purpose of each variable, psibling stores the entry of the elder brother element obtained by this procedure, and elem stores the current entry of the element tree index during the calculation.

【０１４０】変数baseは、起点要素のエントリの要素レ
ベルの値を格納する。The variable base stores the element level value of the entry of the starting element.

【０１４１】変数nは、n番目の兄要素を取出すために指
定された値である。The variable n is the value specified in order to retrieve the n-th elder brother element.

【０１４２】変数iは、繰り返し処理に関するカウンタ
として用いる。The variable i is used as a counter for repeating processing.

【０１４３】ステップＳ202からステップＳ203までは、
図１３のステップＳ162からステップＳ163までと同様で
ある。Steps S202 and S203 are the same as steps S162 and S163 of FIG. 13.

【０１４４】ステップＳ204では、ステップＳ203のelem
の要素レベルが、ステップＳ201の初期化で設定したbas
eの値と同じであるか否かを判定する。At step S204, the elem of step S203.
Element level of bas set in the initialization of step S201
It is determined whether it is the same as the value of e.

【０１４５】ステップＳ204の判定結果がYESであれば、
現在のelemがi番目の兄要素のエントリとなる。If the determination in step S204 is YES, the current elem is the entry of the i-th elder brother element.

【０１４６】そしてステップＳ206で、そのi番目の兄要
素elemが求めるn番目の子孫要素か否かを判定し、YESで
あれば、ステップa8でpsiblingにelemの値を代入して、
処理を終了する。Then, in step S206, it is determined whether that i-th elder brother element elem is the desired n-th elder brother element; if YES, the value of elem is assigned to psibling in step S208, and the process ends.

【０１４７】ステップＳ206の判定結果がNOであれば、
次の兄要素を求めるべく、ステップＳ207でiの値を一つ
繰り上げて、ステップＳ202に戻って、もう一度処理を
繰り返す。If the determination in step S206 is NO, then in order to find the next elder brother element, the value of i is incremented by one in step S207, and the process returns to step S202 and repeats.

【０１４８】ステップＳ204の判定結果がNOであれば、
ステップＳ205に進み、ステップＳ203のelemの要素レベ
ルが、baseの値よりも大きい値であるか否かを判定す
る。If the determination in step S204 is NO, the process proceeds to step S205, where it is determined whether the element level of elem from step S203 is greater than the value of base.

【０１４９】ステップＳ205の判定結果がYESであれば、
ステップＳ202に戻って、もう一度処理を繰り返す。判
定結果がNOであれば、処理を終了する。If the determination in step S205 is YES, the process returns to step S202 and repeats. If the determination is NO, the process ends.
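The elder brother retrieval of FIG. 17 scans backward: entries at the same level are brothers, deeper entries belong to a brother's subtree and are skipped, and a shallower entry is the parent, which ends the search. The sample index, the 0-based position `k`, the function name `find_nth_elder_brother`, and the counter start value 1 in this sketch are assumptions.

```python
# Hypothetical index: (element name ID, element level, element start position).
INDEX = [
    (0x00010000, 0, 0),    # A
    (0x00020000, 1, 10),   #   B
    (0x00030000, 2, 20),   #     C
    (0x00040000, 2, 30),   #     D
    (0x00020000, 1, 40),   #   B
    (0x00030000, 2, 50),   #     C
]

def find_nth_elder_brother(index, k, n):
    """Return the n-th elder brother of index[k], counting backward, or None."""
    base = index[k][1]                 # step S201
    i = 1                              # counter start value is an assumption
    for j in range(k - 1, -1, -1):     # steps S202/S203: step to the previous entry
        elem = index[j]
        if elem[1] == base:            # step S204: elem is the i-th elder brother
            if i == n:                 # step S206
                return elem            # step S208: psibling = elem
            i += 1                     # step S207
        elif elem[1] < base:           # step S205 is NO: reached the parent
            break
        # elem[1] > base: inside a brother's subtree, keep scanning
    return None

print(find_nth_elder_brother(INDEX, 4, 1))   # (131072, 1, 10): the first B
print(find_nth_elder_brother(INDEX, 5, 1))   # None: the last C is an only child
```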

【０１５０】［弟要素の取出し］次に図１８のフローチ
ャートを参照して、弟要素の取出しの処理手順について
説明する。[Retrieval of younger brother element] Next, with reference to the flow chart of FIG. 18, a processing procedure for retrieving a younger brother element will be described.

【０１５１】ステップＳ211では、変数の初期化を行
う。変数fsiblingとelemは、要素ツリーインデックスの
エントリの値を格納するための変数である。In step S211, variables are initialized. The variables fsibling and elem are variables for storing the value of the entry of the element tree index.

【０１５２】それぞれの変数の用途は、変数psibling
は、この処理で求める弟要素のエントリを格納するため
のものであり、変数elemは計算の過程における要素ツリ
ーインデックスの現在エントリを格納するためのもので
ある。As for the purpose of each variable, fsibling stores the entry of the younger brother element obtained by this process, and elem stores the current entry of the element tree index during the calculation.

【０１５３】変数baseは、起点要素のエントリの要素レ
ベルの値を格納する。変数nはn番目の弟要素を取出すた
めに指定された値である。The variable base stores the element level of the entry of the starting element. The variable n is the value specified in order to retrieve the n-th younger brother element.

【０１５４】変数iは繰り返し処理に関するカウンタと
して用いる。The variable i is used as a counter for the repetitive processing.

【０１５５】ステップＳ212からステップＳ213までは、
図１５のステップＳ182からステップＳ183までと同様で
ある。Steps S212 and S213 are the same as steps S182 and S183 of FIG. 15.

【０１５６】ステップＳ214では、ステップＳ213のelem
の要素レベルが、ステップＳ211の初期化で設定したbas
eの値と同じであるか否かを判定する。In step S214, the elem of step S213.
Element level of bas set in the initialization of step S211
It is determined whether it is the same as the value of e.

【０１５７】ステップＳ214の判定結果がYESであれば、
現在のelemがi番目の弟要素のエントリとなる。If the determination result of step S214 is YES, the current elem is the entry of the i-th younger brother element.

【０１５８】そしてステップＳ216で、そのi番目の弟要
素elemが求めるn番目の子孫要素か否かを判定し、YESで
あればステップＳ218でfsiblingにelemの値を代入して
処理を終了する。Then, step S216 determines whether this i-th younger brother element elem is the requested n-th younger brother element; if YES, step S218 assigns the value of elem to fsibling and the process ends.

【０１５９】ステップＳ216の判定結果がNOであれば、
次の兄要素を求めるべく、ステップＳ217でiの値を一つ
繰り上げて、ステップＳ212に戻ってもう一度処理を繰
り返す。If the determination result of step S216 is NO, step S217 increments i by one in order to find the next younger brother element, and the process returns to step S212 and repeats.

【０１６０】ステップＳ214の判定結果がNOであれば、
ステップＳ215に進み、ステップＳ213のelemの要素レベ
ルが、baseの値よりも大きい値であるか否かを判定す
る。If the determination result of step S214 is NO, the process proceeds to step S215, which determines whether the element level of elem from step S213 is greater than the value of base.

【０１６１】ステップＳ215の判定結果がYESであれば、
ステップＳ212に戻ってもう一度処理を繰り返す。判定
結果がNOであれば処理を終了する。If the determination result of step S215 is YES, the process returns to step S212 and repeats. If the determination result is NO, the process ends.
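
The forward scan of FIG. 18 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the element tree index is modeled as a list of (element name, element level) pairs in start-tag order, and steps S212 to S213 are assumed to advance to the next entry and stop when none remains. The function name nth_younger_sibling and the tuple layout are hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical entry layout: (element name, element level), in start-tag order.
Entry = Tuple[str, int]


def nth_younger_sibling(index: List[Entry], start: int, n: int) -> Optional[int]:
    """Return the list position of the n-th younger brother of index[start], or None."""
    base = index[start][1]          # element level of the starting element
    i = 1                           # counter of younger brothers seen so far
    pos = start
    while True:
        pos += 1                    # assumed S212: move to the next entry
        if pos >= len(index):       # assumed S213: no entry left
            return None
        level = index[pos][1]
        if level == base:           # S214: same level, so this is the i-th younger brother
            if i == n:              # S216: is it the requested one?
                return pos          # S218: result found
            i += 1                  # S217: look for the next younger brother
        elif level < base:          # S215 is NO: the scan has left the parent's subtree
            return None
        # S215 is YES (level > base): a descendant of a brother, keep scanning


sample = [("doc", 0), ("a", 1), ("b", 2), ("c", 2), ("d", 1), ("e", 2), ("f", 1)]
print(nth_younger_sibling(sample, 1, 2))
```

In the sample index, element a (position 1) has the younger brothers d and f, so the call asks for f.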

【０１６２】［末弟要素の取出し］次に、ある要素の弟
要素よりもさらに弟の要素が存在しない場合、その弟要
素を末弟要素と呼ぶ。「末弟要素の取出し」プロセス
は、後に示す「右深さ優先順序での子孫要素の取出し」
に必要なサブプロセスである。[Retrieval of the youngest brother element] When a younger brother element of a given element has no further younger brother elements after it, that younger brother element is called the youngest brother element. The "retrieval of the youngest brother element" process is a subprocess required by the "retrieval of descendant elements in right depth-first order" described later.

【０１６３】図１９のフローチャートを参照して、末弟
要素の取出しの処理手順について説明する。The processing procedure for retrieving the youngest brother element will be described with reference to the flowchart of FIG. 19.

【０１６４】ステップＳ221では、変数の初期化を行
う。変数fsiblingとelemは、要素ツリーインデックスの
エントリの値を格納するための変数である。それぞれの
変数の用途は、psiblingは、この処理で求める末弟要素
のエントリを格納するためのものであり、変数elemは計
算の過程における要素ツリーインデックスの現在エント
リを格納するためのものである。変数baseは起点要素の
エントリの要素レベルの値を格納する。In step S221, variables are initialized. The variables fsibling and elem hold the values of entries of the element tree index. Of these, fsibling stores the entry of the youngest brother element that this procedure retrieves, and elem stores the current entry of the element tree index during the computation. The variable base stores the element level of the entry of the starting element.

【０１６５】ステップＳ222からステップＳ224までは、
図１８のステップＳ212からステップＳ214までと同様で
ある。Steps S222 to S224 are the same as steps S212 to S214 of FIG. 18.

【０１６６】ステップＳ224の判定結果がYESであれば、
現在のelemは弟要素のエントリとして、ステップＳ225
でfsiblingにelemの値を代入する。ただし、このfsibli
ngが末弟であるかどうかはこの時点では分からないの
で、ステップＳ222に戻ってもう一度処理を繰り返し、f
siblingの値を逐次更新していく。If the determination result of step S224 is YES, the current elem is treated as the entry of a younger brother element, and step S225 assigns the value of elem to fsibling. At this point, however, it is not yet known whether this fsibling is the youngest brother, so the process returns to step S222 and repeats, updating the value of fsibling each time.

【０１６７】ステップＳ224の判定結果がNOであれば、
ステップＳ226に進み、ステップＳ223のelemの要素レベ
ルが、baseの値よりも大きい値であるか否かを判定す
る。If the determination result of step S224 is NO, the process proceeds to step S226, which determines whether the element level of elem from step S223 is greater than the value of base.

【０１６８】ステップＳ226の判定結果がYESであれば、
ステップＳ222に戻ってもう一度処理を繰り返す。判定
結果がNOであれば処理を終了する。If the determination result of step S226 is YES, the process returns to step S222 and repeats. If the determination result is NO, the process ends.
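
The repeated overwriting of fsibling in FIG. 19 can be sketched as follows. This is a rough illustration only: it assumes the element tree index is a list of (element name, element level) pairs in start-tag order, and the name youngest_sibling is hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical entry layout: (element name, element level), in start-tag order.
Entry = Tuple[str, int]


def youngest_sibling(index: List[Entry], start: int) -> Optional[int]:
    """Return the list position of the youngest brother of index[start], or None."""
    base = index[start][1]              # element level of the starting element
    fsibling: Optional[int] = None      # last younger brother seen so far
    for pos in range(start + 1, len(index)):
        level = index[pos][1]
        if level == base:               # S224 is YES: a younger brother, remember it (S225)
            fsibling = pos
        elif level < base:              # S226 is NO: left the parent's subtree, stop
            break
        # level > base: a descendant of some brother, keep scanning
    return fsibling


sample = [("doc", 0), ("a", 1), ("b", 2), ("c", 2), ("d", 1), ("e", 2), ("f", 1)]
print(youngest_sibling(sample, 1))
```

Because the scan cannot know in advance which brother is the last one, the result is simply whatever fsibling holds when the scan leaves the parent's subtree or reaches the end of the index.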

【０１６９】［右深さ優先順序での子孫要素の取出し］
右深さ優先順序での子孫要素の取出しプロセスは、後述
する前要素の取出しに必要なサブプロセスである。[Retrieval of descendant elements in right depth-first order] The process of retrieving descendant elements in right depth-first order is a subprocess required by the retrieval of the previous element, described later.

【０１７０】ここで、右深さ優先順序とは、図１１に示
した木構造において、右から左へ深さ優先で要素をたど
るときに得られる順序である。Here, right depth-first order is the order obtained when the elements of the tree structure shown in FIG. 11 are traversed depth-first from right to left.

【０１７１】図１０に示した文書で要素の開始タグの出
現順番は、図１１の木構造では、左から右へ深さ優先で
要素をたどるときに得られる順序と一致する。よって、
右深さ優先順序で取出される要素の順番は、文書の要素
の開始タグの出現順番とは異なる。The order in which the start tags of elements appear in the document shown in FIG. 10 matches the order obtained when the elements of the tree structure of FIG. 11 are traversed depth-first from left to right. Consequently, the order in which elements are retrieved in right depth-first order differs from the order in which their start tags appear in the document.
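
The difference between the two orders can be illustrated with a small Python sketch. It assumes a flat list of (element name, element level) pairs in start-tag order and a pre-order traversal (parent before children); both the layout and the function name right_depth_first are hypothetical, and the sketch rebuilds the tree rather than following the flowchart of FIG. 20.

```python
from typing import Dict, List, Tuple

# Hypothetical entry layout: (element name, element level), in start-tag order.
Entry = Tuple[str, int]


def right_depth_first(index: List[Entry]) -> List[str]:
    """Return element names in right-to-left, depth-first (pre-order) order."""
    children: Dict[int, List[int]] = {pos: [] for pos in range(len(index))}
    roots: List[int] = []
    stack: List[int] = []               # positions of the currently open ancestors
    for pos, (_, level) in enumerate(index):
        while stack and index[stack[-1]][1] >= level:
            stack.pop()                 # close elements that are not ancestors of pos
        if stack:
            children[stack[-1]].append(pos)
        else:
            roots.append(pos)
        stack.append(pos)

    order: List[str] = []

    def visit(pos: int) -> None:
        order.append(index[pos][0])
        for child in reversed(children[pos]):   # rightmost child first
            visit(child)

    for root in reversed(roots):
        visit(root)
    return order


sample = [("doc", 0), ("a", 1), ("b", 2), ("c", 2), ("d", 1), ("e", 2), ("f", 1)]
print([name for name, _ in sample])   # left depth-first order = start-tag order
print(right_depth_first(sample))      # right depth-first order
```

For this sample the start-tag order is doc, a, b, c, d, e, f, whereas the right depth-first order is doc, f, d, e, a, c, b.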

【０１７２】図２０に示したフローチャートを参照し
て、右深さ優先順序での子孫要素の取出しの処理手順に
ついて説明する。With reference to the flow chart shown in FIG. 20, description will be given of a processing procedure for extracting descendant elements in the right depth priority order.

【０１７３】ステップＳ231では、変数の初期化を行
う。変数descendantとelemは、要素ツリーインデックス
のエントリの値を格納するための変数である。それぞれ
の変数の用途は、descendantは、この処理で求める子孫
要素のエントリを格納するためのものであり、変数elem
は計算の過程における要素ツリーインデックスの現在エ
ントリを格納するためのものである。変数nは、n番目の
子孫要素を取出すために指定された値である。変数iは
繰り返し処理に関するカウンタとして用いる。In step S231, variables are initialized. The variables descendant and elem hold the values of entries of the element tree index. Of these, descendant stores the entry of the descendant element that this procedure retrieves, and elem stores the current entry of the element tree index during the computation. The variable n is the value specified in order to retrieve the n-th descendant element. The variable i is used as a counter for the iteration.

【０１７４】ステップＳ232では、サブプロセス「末弟
要素の取出し」を呼び出し、当該要素elemの末弟要素の
エントリを取出す。In step S232, the subprocess "retrieval of the youngest brother element" is called to retrieve the entry of the youngest brother element of the element elem.

【０１７５】ステップＳ233で、その末弟要素を取出せ
たか否かを判定し、判定結果がYESであれば、その取出
した要素のエントリがi番目の子孫要素のエントリとな
る。In step S233, it is determined whether or not the youngest younger brother element can be extracted. If the determination result is YES, the entry of the extracted element becomes the entry of the i-th descendant element.

【０１７６】ステップＳ234でその取出した要素のエン
トリをelemに代入し、ステップＳ235でそのi番目の子孫
要素elemが求めるn番目の子孫要素か否かを判定する。In step S234, the entry of the fetched element is assigned to elem, and in step S235, it is determined whether or not the i-th descendant element elem is the n-th descendant element to be obtained.

【０１７７】ステップＳ235の判定結果がYESであればス
テップＳ242でdescendantにelemの値を代入して処理を
終了する。If the determination result of step S235 is YES, step S242 assigns the value of elem to descendant and the process ends.

【０１７８】ステップＳ235の判定結果がNOであればス
テップＳ236へ進む。またステップＳ233の判定結果がNO
であれば処理を終了する。If the determination result of step S235 is NO, the process proceeds to step S236. If the determination result of step S233 is NO, the process ends.

【０１７９】ステップＳ236では、iの値を一つ繰り上
げ、また別の繰り返し処理用カウンタjを新たに用意し
て1に初期化する。In step S236, the value of i is incremented by one and another iteration processing counter j is newly prepared and initialized to 1.

【０１８０】ステップＳ237は、この新たな変数値i、j
による「右深さ優先順序での子孫要素の取出し」の再帰
呼び出しとなる。Step S237 is a recursive call of the "retrieval of descendant elements in right depth-first order" with these new variable values i and j.

【０１８１】ステップＳ238で、その子孫要素を取出せ
たか否かを判定し、判定結果がYESであれば、ステップ
Ｓ239で取出した要素に相当するエントリをelemに代入
し、elemがi番目の子孫要素のエントリとなる。Step S238 determines whether that descendant element could be retrieved. If the determination result is YES, step S239 assigns the entry corresponding to the retrieved element to elem, and elem becomes the entry of the i-th descendant element.

【０１８２】そしてステップＳ240で、そのi番目の子孫
要素elemが求めるn番目の子孫要素か否かを判定し、YES
であれば、ステップＳ242でdescendantにelemの値を代
入して処理を終了する。Then, step S240 determines whether this i-th descendant element elem is the requested n-th descendant element; if YES, step S242 assigns the value of elem to descendant and the process ends.

【０１８３】ステップＳ240の判定結果がNOであれば、
次の子孫要素を求めるべく、ステップＳ241でiとjの値
を一つ繰り上げて、ステップＳ237の再帰呼び出しに戻
って同様の処理を繰り返す。If the determination result of step S240 is NO, step S241 increments i and j by one in order to find the next descendant element, and the process returns to the recursive call of step S237 and repeats in the same way.

【０１８４】ステップＳ238の判定結果がNOであれば、
ステップＳ243に進み、サブプロセス「兄要素の取出
し」を呼び出す。If the determination result of step S238 is NO, the process proceeds to step S243, which calls the subprocess "retrieval of the older brother element".

【０１８５】ステップＳ244で当該elemに対する1番目の
兄要素を取出せたか否かを判定し、判定結果がYESであ
れば、ステップＳ245で取出した要素に相当するエント
リをelemに代入し、elemがi番目の子孫要素のエントリ
となる。Step S244 determines whether the first older brother element of that elem could be retrieved. If the determination result is YES, step S245 assigns the entry corresponding to the retrieved element to elem, and elem becomes the entry of the i-th descendant element.

【０１８６】そしてステップＳ246で、そのi番目の子孫
要素elemが求めるn番目の子孫要素か否かを判定し、YES
であればステップＳ242でdescendantにelemの値を代入して
処理を終了する。Then, step S246 determines whether this i-th descendant element elem is the requested n-th descendant element; if YES, step S242 assigns the value of elem to descendant and the process ends.

【０１８７】ステップＳ246の判定結果がNOであれば、
次の子孫要素を求めるべく、ステップＳ236でiの値を一
つ繰り上げ、jの値を再度1に初期化して、ステップＳ23
7の再帰呼び出しに戻って同様の処理を繰り返す。If the determination result of step S246 is NO, step S236 increments i by one and resets j to 1 in order to find the next descendant element, and the process returns to the recursive call of step S237 and repeats in the same way.

【０１８８】［前要素の取出し］次に、図２１のフロー
チャートを参照して、前要素の取出しの処理手順につい
て説明する。[Retrieval of the previous element] Next, the processing procedure for retrieving the previous element will be described with reference to the flowchart of FIG. 21.

【０１８９】ステップＳ251では、変数の初期化を行
う。変数precedingとelemは、要素ツリーインデックス
のエントリの値を格納するための変数である。At step S251, variables are initialized. The variables preceding and elem are variables for storing the value of the entry of the element tree index.

【０１９０】それぞれの変数の用途は、precedingは、
この処理で求める前要素のエントリを格納するためのも
のであり、変数elemは、計算の過程における要素ツリー
インデックスの現在エントリを格納するためのものであ
る。変数nは、n番目の前要素を取出すために指定された
値である。変数iは、繰り返し処理に関するカウンタと
して用いる。Of these variables, preceding stores the entry of the previous element that this procedure retrieves, and elem stores the current entry of the element tree index during the computation. The variable n is the value specified in order to retrieve the n-th previous element. The variable i is used as a counter for the iteration.

【０１９１】ステップＳ252では、サブプロセス「兄要
素の取出し」を呼び出し、当該要素elemの1番目の兄要
素のエントリを取出す。In step S252, the sub-process "take out older brother element" is called to take out the entry of the first older brother element of the element elem.

【０１９２】ステップＳ253でその兄要素を取出せたか
否かを判定し、判定結果がYESであれば、ステップＳ259
で取出した要素に相当するエントリをelemに代入し、el
emがi番目の前要素のエントリとなる。Step S253 determines whether that older brother element could be retrieved. If the determination result is YES, step S259 assigns the entry corresponding to the retrieved element to elem, and elem becomes the entry of the i-th previous element.

【０１９３】そして、ステップＳ260で、そのi番目の前
要素elemが求めるn番目の前要素か否かを判定し、YESで
あればステップＳ267でprecedingにelemの値を代入して
処理を終了する。Then, step S260 determines whether this i-th previous element elem is the requested n-th previous element; if YES, step S267 assigns the value of elem to preceding and the process ends.

【０１９４】ステップＳ260の判定結果がNOであればス
テップＳ261へ進む。またステップＳ253の判定結果がNO
であればステップＳ254へ進む。If the determination result of step S260 is NO, the process proceeds to step S261. If the determination result of step S253 is NO, the process proceeds to step S254.

【０１９５】ステップＳ261では、iの値を一つ繰り上
げ、また別の繰り返し処理用カウンタjを新たに用意し
て1に初期化する。In step S261, the value of i is incremented by one and another iteration processing counter j is newly prepared and initialized to 1.

【０１９６】ステップＳ262では、この新たな変数値i、
jでサブプロセス「右深さ優先順序での子孫要素の取出
し」を呼び出す。In step S262, the subprocess "retrieval of descendant elements in right depth-first order" is called with these new variable values i and j.

【０１９７】ステップＳ263でその子孫要素を取出せた
か否かを判定し、判定結果がYESであれば、ステップＳ2
64で取出した要素に相当するエントリをelemに代入し、
elemがi番目の前要素のエントリとなる。Step S263 determines whether that descendant element could be retrieved. If the determination result is YES, step S264 assigns the entry corresponding to the retrieved element to elem, and elem becomes the entry of the i-th previous element.

【０１９８】そしてステップＳ265で、そのi番目の前要
素elemが求めるn番目の前要素か否かを判定し、YESであ
ればステップＳ267でprecedingにelemの値を代入して処
理を終了する。Then, step S265 determines whether this i-th previous element elem is the requested n-th previous element; if YES, step S267 assigns the value of elem to preceding and the process ends.

【０１９９】ステップＳ265の判定結果がNOであれば、
次の前要素を求めるべく、ステップＳ266でiとjの値を
一つ繰り上げて、ステップＳ262に戻って同様の処理を
繰り返す。If the determination result of step S265 is NO, step S266 increments i and j by one in order to find the next previous element, and the process returns to step S262 and repeats in the same way.

【０２００】ステップＳ263の判定結果がNOであれば、
ステップＳ252に戻ってもう一度処理を繰り返す。If the determination result of step S263 is NO, the process returns to step S252 and repeats.

【０２０１】ステップＳ254では、サブプロセス「親要
素の取出し」を呼び出し、当該要素elemの親要素のエン
トリを取出す。ステップＳ255でその親要素を取出せた
か否かを判定し、判定結果がYESであれば、ステップＳ2
56で取出した要素に相当するエントリをelemに代入し、
elemがi番目の前要素のエントリとなる。In step S254, the subprocess "retrieval of the parent element" is called to retrieve the entry of the parent element of the element elem. Step S255 determines whether that parent element could be retrieved. If the determination result is YES, step S256 assigns the entry corresponding to the retrieved element to elem, and elem becomes the entry of the i-th previous element.

【０２０２】そしてステップＳ257で、そのi番目の前要
素elemが求めるn番目の前要素か否かを判定し、YESであ
ればステップＳ267でprecedingにelemの値を代入して処
理を終了する。Then, step S257 determines whether this i-th previous element elem is the requested n-th previous element; if YES, step S267 assigns the value of elem to preceding and the process ends.

【０２０３】ステップＳ257判定結果がNOであれば、次
の前要素を求めるべく、ステップＳ258でiの値を一つ繰
り上げて、ステップＳ252に戻ってもう一度処理を繰り
返す。If the decision result in the step S257 is NO, the value of i is incremented by 1 in a step S258 in order to obtain the next preceding element, and the process returns to the step S252 to repeat the process again.

【０２０４】ステップＳ255の判定結果がNOであれば処
理を終了する。If the decision result in the step S255 is NO, the process ends.

【０２０５】［次要素の取出し］次に図２２のフローチ
ャートを参照して、次要素の取出しの処理手順について
説明する。[Retrieval of Next Element] Next, with reference to the flow chart of FIG. 22, the processing procedure of retrieval of the next element will be described.

【０２０６】ステップＳ271では、変数の初期化を行
う。変数followingとelemは、要素ツリーインデックス
のエントリの値を格納するための変数である。それぞれ
の変数の用途について説明すると、followingは、この
処理で求める次要素のエントリを格納するためのもので
あり、変数elemは計算の過程における要素ツリーインデ
ックスの現在エントリを格納するためのものである。変
数nは、n番目の次要素を取出すために指定された値であ
る。変数iは、繰り返し処理に関するカウンタとして用
いる。In step S271, variables are initialized. The variables following and elem hold the values of entries of the element tree index. Of these, following stores the entry of the next element that this procedure retrieves, and elem stores the current entry of the element tree index during the computation. The variable n is the value specified in order to retrieve the n-th next element. The variable i is used as a counter for the iteration.

【０２０７】ステップＳ272では、サブプロセス「弟要
素の取出し」を呼び出し、当該要素elemの1番目の弟要
素のエントリを取出す。In step S272, the subprocess "take out younger brother element" is called to take out the entry of the first younger brother element of the element elem.

【０２０８】ステップＳ273でその弟要素を取出せたか
否かを判定し、判定結果がYESであれば、ステップＳ279
で取出した要素に相当するエントリをelemに代入し、el
emがi番目の次要素のエントリとなる。Step S273 determines whether that younger brother element could be retrieved. If the determination result is YES, step S279 assigns the entry corresponding to the retrieved element to elem, and elem becomes the entry of the i-th next element.

【０２０９】そしてステップＳ280で、そのi番目の次要
素elemが求めるn番目の次要素か否かを判定し、YESであ
れば、ステップＳ287でfollowingにelemの値を代入して
処理を終了する。Then, step S280 determines whether this i-th next element elem is the requested n-th next element; if YES, step S287 assigns the value of elem to following and the process ends.

【０２１０】ステップＳ280の判定結果がNOであればス
テップＳ281へ進む。またステップＳ273の判定結果がNO
であればステップＳ274へ進む。If the determination result of step S280 is NO, the process proceeds to step S281. If the determination result of step S273 is NO, the process proceeds to step S274.

【０２１１】ステップＳ281では、iの値を一つ繰り上
げ、また別の繰り返し処理用カウンタjを新たに用意し
て1に初期化する。In step S281, the value of i is incremented by one, and another iteration processing counter j is newly prepared and initialized to 1.

【０２１２】ステップＳ282では、この新たな変数値i、
jでサブプロセス「子孫要素の取出し」を呼び出す。ス
テップＳ283でその子孫要素を取出せたか否かを判定
し、判定結果がYESであれば、ステップＳ284で取出した
要素に相当するエントリelemに代入し、elemがi番目の
次要素のエントリとなる。In step S282, the subprocess "retrieval of descendant elements" is called with these new variable values i and j. Step S283 determines whether that descendant element could be retrieved. If the determination result is YES, step S284 assigns the entry corresponding to the retrieved element to elem, and elem becomes the entry of the i-th next element.

【０２１３】そしてステップＳ285で、そのi番目の次要
素elemが求めるn番目の次要素か否かを判定し、YESであ
ればステップＳ287でfollowingにelemの値を代入して処
理を終了する。Then, step S285 determines whether this i-th next element elem is the requested n-th next element; if YES, step S287 assigns the value of elem to following and the process ends.

【０２１４】ステップＳ285の判定結果がNOであれば、
次の次要素を求めるべく、ステップＳ286でiとjの値を
一つ繰り上げて、ステップＳ282に戻って同様の処理を
繰り返す。If the determination result of step S285 is NO, step S286 increments i and j by one in order to find the subsequent next element, and the process returns to step S282 and repeats in the same way.

【０２１５】ステップＳ283の判定結果がNOであれば、
ステップＳ272に戻ってもう一度処理を繰り返す。If the determination result of step S283 is NO, the process returns to step S272 and repeats.

【０２１６】ステップＳ274では、サブプロセス「親要
素の取出し」を呼び出し、当該要素elemの親要素のエン
トリを取出す。In step S274, the subprocess "take out parent element" is called, and the entry of the parent element of the element elem is taken out.

【０２１７】ステップＳ275でその親要素を取出せたか
否かを判定し、判定結果がYESであれば、ステップＳ276
で取出した要素に相当するエントリをelemに代入し、el
emがi番目の次要素のエントリとなる。Step S275 determines whether that parent element could be retrieved. If the determination result is YES, step S276 assigns the entry corresponding to the retrieved element to elem, and elem becomes the entry of the i-th next element.

【０２１８】そしてステップＳ277で、そのi番目の次要
素elemが求めるn番目の次要素か否かを判定し、YESであ
ればステップＳ287でfollowingにelemの値を代入して処
理を終了する。Then, step S277 determines whether this i-th next element elem is the requested n-th next element; if YES, step S287 assigns the value of elem to following and the process ends.

【０２１９】ステップＳ277の判定結果がNOであれば、
次の次要素を求めるべく、ステップＳ278でiの値を一つ
繰り上げて、ステップＳ272に戻ってもう一度処理を繰
り返す。ステップＳ275の判定結果がNOであれば処理を
終了する。If the determination result of step S277 is NO, step S278 increments i by one in order to find the subsequent next element, and the process returns to step S272 and repeats. If the determination result of step S275 is NO, the process ends.

【０２２０】［要素−属性インデックス］要素−属性イ
ンデックス１５は、文書フォルダオブジェクトごとに作
成して保持するものであり、文書フォルダオブジェクト
が保有する文書集合から、要素名、属性名、属性値を条
件に、文書または要素を検索するために用いる。[Element-attribute index] The element-attribute index 15 is created and held for each document folder object. It is used to search the set of documents held by the document folder object for documents or elements, with element names, attribute names, and attribute values as the search conditions.

【０２２１】要素−属性インデックスは、(ＤＴＤＩ
Ｄ、要素−属性名ＩＤ、属性値、文書ＩＤ、要素開始位
置)の五つ組みのデータ構造を持つデータエントリのリ
ストとする。The element-attribute index is a list of data entries, each of which is a five-tuple (DTDID, element-attribute name ID, attribute value, document ID, element start position).
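
As an illustration, the five-tuple could be modeled in Python as shown below. The class name IndexEntry, the field names, and the concrete types (integer IDs, string or absent attribute value) are assumptions made for this sketch; the specification only fixes the five items and their order.

```python
from typing import NamedTuple, Optional


class IndexEntry(NamedTuple):
    """One entry of the element-attribute index (field names are hypothetical)."""
    dtd_id: int                  # DTDID of the DTD the document conforms to
    name_id: int                 # element-attribute name ID (element name ID or attribute name ID)
    attr_value: Optional[str]    # attribute value; None is assumed here when there is no attribute
    doc_id: int                  # document ID
    start_pos: int               # element start position (byte position of the start tag)


entry = IndexEntry(dtd_id=1, name_id=11, attr_value="1999", doc_id=1, start_pos=5)
print(entry)
print(entry.dtd_id, entry.name_id)
```

A named tuple keeps the positional order of the five items, so entries compare and sort in the same way as plain tuples.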

【０２２２】要素−属性名ＩＤについて説明する。要素
−属性名ＩＤとは、要素名ＩＤもしくは属性名ＩＤであ
る。要素名ＩＤは前述の通り、ＤＴＤオブジェクト内で
唯一なＩＤである。属性名ＩＤもＤＴＤオブジェクト内
の要素名に対して唯一なＩＤで、かつ異なる要素名で同
じ属性名が存在してもそれを識別可能なＩＤである。The element-attribute name ID will now be described. An element-attribute name ID is either an element name ID or an attribute name ID. As described above, an element name ID is unique within a DTD object. An attribute name ID is likewise unique with respect to an element name within a DTD object, so it remains distinguishable even when the same attribute name occurs under different element names.

【０２２３】要素−属性インデックス１５のエントリの
作成では、文書中の要素が一つも属性を持たない場合、
その要素に対応するエントリを一つ作成し、その要素−
属性名ＩＤの値を、その要素の要素名ＩＤとする。When entries of the element-attribute index 15 are created, an element in the document that has no attributes gets a single entry, and the element-attribute name ID of that entry is set to the element name ID of the element.

【０２２４】文書中の要素が一つ以上の属性を持つ場合
は、その属性に対応する分のエントリを作成し、各属性
に対応する要素−属性名ＩＤの値を、その属性の属性名
ＩＤとする。属性名ＩＤからどの要素の属性名であるか
を判別できるので、属性名ＩＤを持つエントリからその
エントリに対応する要素を特定することが可能である。When an element in the document has one or more attributes, one entry is created for each attribute, and the element-attribute name ID of each entry is set to the attribute name ID of the corresponding attribute. Because an attribute name ID identifies which element the attribute name belongs to, the element corresponding to an entry can be determined from an entry that holds an attribute name ID.
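
The two entry-creation rules above can be sketched in Python as follows. This is a simplified illustration: entries are plain tuples in the five-item order, the function name entries_for_element is hypothetical, and storing None as the attribute value of an attribute-less element is an assumption, since the text does not say what that entry holds.

```python
from typing import List, Optional, Sequence, Tuple

# (DTDID, element-attribute name ID, attribute value, document ID, element start position)
Entry = Tuple[int, int, Optional[str], int, int]


def entries_for_element(
    dtd_id: int,
    doc_id: int,
    element_name_id: int,
    start_pos: int,
    attributes: Sequence[Tuple[int, str]],   # (attribute name ID, attribute value) pairs
) -> List[Entry]:
    """Create the index entries for one element of a document."""
    if not attributes:
        # No attributes: a single entry keyed by the element name ID.
        return [(dtd_id, element_name_id, None, doc_id, start_pos)]
    # One entry per attribute, keyed by that attribute's name ID.
    return [
        (dtd_id, attr_name_id, value, doc_id, start_pos)
        for attr_name_id, value in attributes
    ]


print(entries_for_element(1, 1, 10, 0, []))
print(entries_for_element(1, 1, 20, 5, [(21, "1999"), (22, "draft")]))
```

Every entry of an element carries the same element start position, which is what ties the attribute entries back to their element.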

【０２２５】［ＤＴＤＩＤと文書ＩＤ］ＤＴＤＩＤおよ
び文書ＩＤについて説明する。ＤＴＤＩＤおよび文書Ｉ
Ｄは、それぞれデータベース中のＤＴＤクラス、文書ク
ラスの各インスタンスに割り当てられた、それぞれのク
ラス内で唯一に識別可能なオブジェクトＩＤである。こ
の番号は、データベース管理システムによって与えられ
るか、もしくは、アプリケーション側で唯一となるよう
に与えるようにしてもよい。[DTDID and document ID] The DTDID and the document ID will now be described. They are object IDs assigned to each instance of the DTD class and of the document class in the database, respectively, and each is unique within its class. These numbers may be assigned by the database management system, or the application may assign them so that they are unique.

【０２２６】要素−属性インデックス１５のエントリに
おけるＤＴＤＩＤ、文書ＩＤの値は、要素−属性名ＩＤ
に対応する要素を持つ文書の文書ＩＤ、その文書が従う
ＤＴＤのＤＴＤＩＤとなる。In an entry of the element-attribute index 15, the document ID is that of the document containing the element that corresponds to the element-attribute name ID, and the DTDID is that of the DTD to which the document conforms.

【０２２７】［属性値］属性値について説明する。属性
値は、文字通り、属性の値である。要素−属性インデッ
クス１５のエントリにおける属性値の値は、要素−属性
名ＩＤに対応する属性の属性値となる。[Attribute Value] The attribute value will be described. The attribute value is literally the value of the attribute. The value of the attribute value in the entry of the element-attribute index 15 becomes the attribute value of the attribute corresponding to the element-attribute name ID.

【０２２８】［要素開始位置］要素開始位置について説
明する。要素開始位置とは、要素ツリーインデックス１
４における要素開始位置と同じで、文書中の要素の記述
(開始タグを含む)の開始バイト位置の値である。要素−
属性インデックスのエントリにおける要素開始位置の値
は、要素−属性名ＩＤに対応する要素の要素開始位置と
なる。[Element start position] The element start position will now be described. It is the same as the element start position in the element tree index 14: the byte position at which the description of the element (including its start tag) begins in the document. In an entry of the element-attribute index, the element start position is that of the element corresponding to the element-attribute name ID.
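
To make the notion of a start byte position concrete, the following Python sketch lists the byte offset of each start tag in a small document. It is a deliberately naive illustration: it uses a regular expression rather than a real structured-document parser, ignores CDATA sections, and counts offsets from 0, which is an assumption (the text does not say whether positions start at 0 or 1). The name element_start_positions is hypothetical.

```python
import re
from typing import List, Tuple

# "<" followed by a name; "</", "<!" and "<?" do not match this pattern.
START_TAG = re.compile(rb"<([A-Za-z_][A-Za-z0-9_.:-]*)")


def element_start_positions(document: bytes) -> List[Tuple[str, int]]:
    """Return (element name, byte offset of its start tag) in document order."""
    return [
        (match.group(1).decode("ascii"), match.start())
        for match in START_TAG.finditer(document)
    ]


doc = b'<doc><title>T</title><p id="1">x</p></doc>'
print(element_start_positions(doc))
# -> [('doc', 0), ('title', 5), ('p', 21)]
```

Working on bytes rather than decoded text matters for multi-byte encodings, where character positions and byte positions differ.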

【０２２９】要素−属性インデックス１５の作成手順を
示す。要素−属性インデックスの作成は、要素−属性イ
ンデックス作成部２１２で行われる。A procedure for creating the element-attribute index 15 will be described. The element-attribute index creation unit 212 creates the element-attribute index.

【０２３０】図２３及び図２４は、要素−属性インデッ
クス１５の作成手順を示すフローチャートであり、要素
−属性インデックス１５の前記五つ組みのデータ構造
(ＤＴＤＩＤ、要素−属性名ＩＤ、属性値、文書ＩＤ、
要素開始位置)の各項目の値を求め、(ＤＴＤＩＤ、要素
−属性名ＩＤ)の値でソートした順序を保ちながら、前
記要素−属性インデックスのリストに作成したエントリ
を追加する処理手順を示している。FIGS. 23 and 24 are flowcharts showing the procedure for creating the element-attribute index 15. They show the processing that obtains the value of each item of the five-tuple (DTDID, element-attribute name ID, attribute value, document ID, element start position) and adds each created entry to the list of the element-attribute index while keeping the list sorted by the value of (DTDID, element-attribute name ID).

【０２３１】ただし、ここで扱う文書は、ＤＴＤに対す
る妥当性が保証され、前記ＤＴＤ自身の妥当性も保証さ
れていることを前提とする。また文書中のタグの省略、
属性値の省略等は、構造化文書のパーサによってすべて
補完されていることを前提とする。そして、図２３及び
図２４に示す処理はこのような文書の記述の先頭から最
後までを連続的に読み込むこととする。It is assumed, however, that the documents handled here are guaranteed to be valid with respect to their DTD and that the DTD itself is also guaranteed to be valid. It is further assumed that omitted tags, omitted attribute values, and the like have all been filled in by the structured-document parser. The processing shown in FIGS. 23 and 24 reads the description of such a document sequentially from beginning to end.

【０２３２】図２３を参照すると、ステップＳ301、Ｓ3
02で文書フォルダオブジェクトに文書オブジェクトを追
加すると、ステップＳ303、ステップＳ304で、前記文書
オブジェクトと関係するＤＴＤオブジェクトも同時に追
加する。Referring to FIG. 23, when a document object is added to the document folder object in steps S301 and S302, the DTD object related to that document object is added at the same time in steps S303 and S304.

【０２３３】そして、ＤＴＤオブジェクトと１対１の関
係を持つ要素−属性ＩＤテーブルの情報を基にしなが
ら、以下のステップでは、追加した文書オブジェクトの
各要素に対する要素−属性インデックスのエントリを作
成・追加していく。Then, based on the information in the element-attribute ID table, which has a one-to-one relationship with the DTD object, the following steps create and add element-attribute index entries for each element of the added document object.

【０２３４】ステップＳ312では、図２５に示すサブプ
ロセスを呼び出している。図２３のステップＳ312で与
える(ＤＴＤＩＤ、要素名ＩＤ)を元に、図２５のステッ
プＳ331では、その(ＤＴＤＩＤ、要素名ＩＤ)を持つエ
ントリのリストを探す。In step S312, the sub process shown in FIG. 25 is called. Based on (DTDID, element name ID) given in step S312 of FIG. 23, in step S331 of FIG. 25, a list of entries having that (DTDID, element name ID) is searched.

【０２３５】このステップＳ331では、さらに図２６に
フローチャートを示すサブプロセスを呼び出している。In step S331, a sub process whose flow chart is shown in FIG. 26 is called.

【０２３６】図２５のステップＳ331で与える要素−属
性インデックスの全エントリのリストに対する開始位置
と終了位置、そしてＤＴＤＩＤの値を目的のデータとし
て、要素−属性インデックスのリストから同じＤＴＤＩ
Ｄの値を持つエントリのリストを探す。Given the start and end positions of the list of all entries of the element-attribute index and, as the target data, the DTDID value, all supplied in step S331 of FIG. 25, this subprocess searches the element-attribute index list for the list of entries that have the same DTDID value.

【０２３７】図２５のステップＳ332では、そのエント
リリストが見つかったか否かを判定する。In step S332 of FIG. 25, it is determined whether the entry list has been found.

【０２３８】見つかった場合は、図２６のステップＳ35
4とステップＳ361で、そのエントリリストの開始位置と
終了位置が返却されるため、図２５のステップＳ333で
さらにその開始位置と終了位置のエントリリストを対象
に、同じ要素−属性名ＩＤの値を持つエントリのリスト
を探す。If it is found, steps S354 and S361 of FIG. 26 return the start and end positions of that entry list, so step S333 of FIG. 25 then searches the entry list between those positions for the list of entries that have the same element-attribute name ID value.

【０２３９】ステップＳ334でそのエントリリストが見
つかったか否かを判定し、見つかった場合は、図２６の
ステップＳ355でリスト上の挿入位置が返却されるの
で、図２５のステップＳ335でその挿入位置を返却す
る。Step S334 determines whether that entry list was found. If it was found, step S355 of FIG. 26 returns the insertion position in the list, and step S335 of FIG. 25 returns that insertion position.

【０２４０】また図２５のステップＳ332およびステッ
プＳ334において対象とするエントリリストが見つから
なかった場合も、図２６のステップＳ348でリスト上の
挿入位置が返却されるので、ステップＳ335でその挿入
位置を返却する。If the target entry list is not found in step S332 or step S334 of FIG. 25, step S348 of FIG. 26 likewise returns the insertion position in the list, and step S335 returns that insertion position.

【０２４１】図２４のステップＳ320でも同様に、図２
５のサブプロセスを呼び出している。ただし、与えるデ
ータは(ＤＴＤＩＤ、属性名ＩＤ)であり、以降の処理は
前述の(ＤＴＤＩＤ、要素名ＩＤ)の場合と同様である。Step S320 of FIG. 24 likewise calls the subprocess of FIG. 25. In this case the data supplied is (DTDID, attribute name ID), and the subsequent processing is the same as in the (DTDID, element name ID) case described above.

【０２４２】以上の処理により、要素−属性インデック
ス１５のエントリが作成され、かつそのエントリの順序
は、(ＤＴＤＩＤ、要素−属性名ＩＤ)の値でソート済み
の順序となる。By the above processing, the entry of the element-attribute index 15 is created, and the order of the entries is sorted by the value of (DTDID, element-attribute name ID).

【０２４３】要素−属性インデックス１５のエントリ
を、(ＤＴＤＩＤ、要素−属性名ＩＤ)の値でソートして
おく目的は、エントリの検索において、(ＤＴＤＩＤ、
要素−属性名ＩＤ)をキーとした二分探索を可能とする
ためである。The entries of the element-attribute index 15 are kept sorted by the value of (DTDID, element-attribute name ID) so that entries can be found by binary search using (DTDID, element-attribute name ID) as the key.
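
A sorted list with binary-search insertion and lookup can be sketched in Python with the standard bisect module. This stands in for the subprocess of FIG. 26 rather than reproducing it: the class ElementAttributeIndex, its methods, and the separate key list are choices made for this sketch, and a single probe on the combined key replaces the two-stage narrowing (first DTDID, then element-attribute name ID) described in the text.

```python
import bisect
from typing import List, NamedTuple, Optional, Tuple


class IndexEntry(NamedTuple):
    dtd_id: int
    name_id: int                 # element-attribute name ID
    attr_value: Optional[str]
    doc_id: int
    start_pos: int


class ElementAttributeIndex:
    """Entry list kept sorted by (DTDID, element-attribute name ID)."""

    def __init__(self) -> None:
        self.entries: List[IndexEntry] = []
        self._keys: List[Tuple[int, int]] = []   # sort keys, parallel to entries

    def add(self, entry: IndexEntry) -> int:
        """Insert an entry at its sorted position and return that position."""
        key = (entry.dtd_id, entry.name_id)
        pos = bisect.bisect_right(self._keys, key)   # binary search for the insertion point
        self._keys.insert(pos, key)
        self.entries.insert(pos, entry)
        return pos

    def find(self, dtd_id: int, name_id: int) -> List[IndexEntry]:
        """Return all entries with the given key, located by binary search."""
        key = (dtd_id, name_id)
        lo = bisect.bisect_left(self._keys, key)
        hi = bisect.bisect_right(self._keys, key)
        return self.entries[lo:hi]


index = ElementAttributeIndex()
index.add(IndexEntry(2, 11, "x", 3, 7))
index.add(IndexEntry(1, 11, "1999", 1, 5))
index.add(IndexEntry(1, 10, None, 1, 0))
index.add(IndexEntry(1, 11, "2004", 2, 5))
print(index.find(1, 11))
```

Keeping the keys in a parallel list avoids comparing attribute values, which may be absent or of mixed kinds, during the search.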

【０２４４】実際の二分探索の手順については、図２６
にフローチャートを示したサブプロセスが行う。The actual binary search is carried out by the subprocess whose flowchart is shown in FIG. 26.

【０２４５】図２７は、要素−属性インデックスにおけ
る、エントリの検索手順を示す図である。ステップＳ37
1では、図２６に示すサブプロセスを呼び出している。FIG. 27 shows the procedure for searching for entries in the element-attribute index. Step S371 calls the subprocess shown in FIG. 26.

【０２４６】ステップＳ371で与える要素−属性インデ
ックスの全エントリのリストに対する開始位置と終了位
置、そしてＤＴＤＩＤの値を目的のデータとして、要素
−属性インデックスのリストから同じＤＴＤＩＤの値を
持つエントリのリストを探す。Given the start and end positions of the list of all entries of the element-attribute index and, as the target data, the DTDID value, all supplied in step S371, the subprocess searches the element-attribute index list for the list of entries that have the same DTDID value.

【０２４７】ステップＳ372では、そのエントリリスト
が見つかったか否かを判定する。In step S372, it is determined whether the entry list has been found.

【０２４８】見つかった場合は、図２６のステップＳ35
4とステップＳ361で、そのエントリリストの開始位置と
終了位置が返却されるので、ステップＳ373でさらにそ
の開始位置と終了位置のエントリリストを対象に、同じ
要素−属性名ＩＤの値を持つエントリのリストを探す。If it is found, steps S354 and S361 of FIG. 26 return the start and end positions of that entry list, so step S373 then searches the entry list between those positions for the list of entries that have the same element-attribute name ID value.

【０２４９】ステップＳ374でそのエントリリストが見
つかったか否かを判定し、見つかった場合は、前述のＤ
ＴＤＩＤの場合と同様に、該当するエントリ・リストの
開始位置と終了位置が返却される。Step S374 determines whether that entry list was found. If it was found, the start and end positions of the matching entry list are returned, as in the DTDID case described above.

【０２５０】ステップＳ375からステップＳ379までのル
ープ処理では、この返却されたエントリの一つ一つに対
して適用する。The loop processing from step S375 to step S379 is applied to each of the returned entries.

【０２５１】各エントリに対して、ステップＳ376で属
性値に対する条件判定を行い、ステップＳ377の判定の
結果、YESであったエントリをステップＳ378で返却す
る。With respect to each entry, the condition judgment for the attribute value is performed in step S376, and the entry which is YES as a result of the judgment in step S377 is returned in step S378.

【０２５２】ステップＳ376の属性値に対する条件には
さまざまな場合が考えられるが、例えば属性値が数値で
あった場合の算術的な判定、属性値が文字列であった場
合の文字列一致やその部分的な一致などが挙げられる。Various conditions on the attribute value are possible in step S376: for example, an arithmetic comparison when the attribute value is numeric, or an exact or partial string match when the attribute value is a character string.
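
The search of FIG. 27, a key lookup followed by a per-entry condition on the attribute value, can be sketched as follows. This is an illustration under assumptions: entries are named tuples already sorted by (DTDID, element-attribute name ID), the IDs are integers (so name_id + 1 can serve as an upper bound), and the condition is passed in as an ordinary Python function. The names IndexEntry and search are hypothetical.

```python
import bisect
from typing import Callable, List, NamedTuple, Optional


class IndexEntry(NamedTuple):
    dtd_id: int
    name_id: int                 # element-attribute name ID
    attr_value: Optional[str]
    doc_id: int
    start_pos: int


def search(
    entries: List[IndexEntry],
    dtd_id: int,
    name_id: int,
    condition: Callable[[Optional[str]], bool],
) -> List[IndexEntry]:
    """Return entries with the given key whose attribute value satisfies condition."""
    # A two-item probe compares only against the first two fields of each entry,
    # so attribute values are never compared during the binary search.
    lo = bisect.bisect_left(entries, (dtd_id, name_id))
    hi = bisect.bisect_left(entries, (dtd_id, name_id + 1))
    # Corresponds to the loop of steps S375 to S379: test each candidate entry.
    return [entry for entry in entries[lo:hi] if condition(entry.attr_value)]


entries = [
    IndexEntry(1, 10, None, 1, 0),
    IndexEntry(1, 11, "1999", 1, 5),
    IndexEntry(1, 11, "2004", 2, 5),
    IndexEntry(2, 11, "1999", 3, 7),
]
# Arithmetic condition on a numeric attribute value.
print(search(entries, 1, 11, lambda v: v is not None and int(v) > 2000))
# Partial string match.
print(search(entries, 1, 11, lambda v: v is not None and "99" in v))
```

Passing the condition as a function keeps the key lookup independent of how attribute values are interpreted.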

【０２５３】なお、本発明に実施の形態において、文書
格納実行部２１の配列ツリーインデックス作成部２１
１、要素−属性インデックス作成部２１２、要素−属性
テーブル作成部２１３の各処理ステップ、及び検索実行
部２２は、コンピュータで実行されるプログラムにより
実行される。例えば図５、図６、図９、図１２乃至図２
７等に示したフローチャートに従い所望のプログラミン
グ言語でプログラムを作成し実行モジュールを作成し、
該ソースプログラム又は実行モジュールを記録した、コ
ンピュータで読み出し可能な記録媒体、もしくは、コン
ピュータが通信接続可能な通信媒体から、該プログラム
を読み出してコンピュータにインストールして実行する
ことで、本発明を実施することができる。In the embodiment of the present invention, the processing steps of the array tree index creation unit 211, the element-attribute index creation unit 212, and the element-attribute table creation unit 213 of the document storage execution unit 21, as well as the search execution unit 22, are carried out by a program executed on a computer. For example, the invention can be practiced by writing a program in any desired programming language according to the flowcharts shown in FIGS. 5, 6, 9, and 12 to 27, building an executable module from it, and then reading the program from a computer-readable recording medium on which the source program or executable module is recorded, or from a communication medium to which the computer can connect, and installing and running it on the computer.

【０２５４】[0254]

【実施例】本発明の一実施例として、本発明の文書格納
構造を基にした文書検索例に関して説明する。DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS As one embodiment of the present invention, an example of document retrieval based on the document storage structure of the present invention will be described.

【０２５５】［要素−属性インデックスを用いた文書及
びその要素の検索］図２８にインスタンス図として示す
ように、文書フォルダオブジェクトが一つあり、文書フ
ォルダオブジェクトは３つの文書オブジェクトA、B、C
を管理し、さらに文書オブジェクトは、個々に、ＤＴＤ
オブジェクトX、Yと関係付けられているとする。[Search for documents and their elements using the element-attribute index] As shown in the instance diagram of FIG. 28, assume that there is one document folder object, that it manages three document objects A, B, and C, and that each document object is individually associated with DTD object X or Y.

【０２５６】文書オブジェクトA、B、Cの文書ＩＤはそ
れぞれ1、2、3とし、ＤＴＤオブジェトX、YのＤＴＤＩ
Ｄはそれぞれ1、2とする。The document IDs of document objects A, B, and C are 1, 2, and 3, respectively, and the DTDIDs of DTD objects X and Y are 1 and 2, respectively.

【０２５７】文書フォルダオブジェクトは、ＤＴＤオブ
ジェクトX、Yも管理する。The document folder object also manages DTD objects X and Y.

【０２５８】図２９は、文書オブジェクトA、B、Cの内
容の例を示す図である。図２８の左側の数字は、文書の
行番号を表しているのではなく、それぞれ前記数字の右
側に現れる要素の開始タグの最初の文字位置、また終了
タグの最後の文字位置をそれぞれ表しているものとす
る。FIG. 29 is a diagram showing an example of the contents of document objects A, B, and C. The numbers on the left side of FIG. 28 are not line numbers of the document; each gives the character position of the first character of the start tag, or of the last character of the end tag, of the element appearing to its right.

【０２５９】図３０、及び図３１は、ＤＴＤオブジェク
トX、Yが管理する要素−属性ＩＤテーブルをそれぞれ示
す図である。FIGS. 30 and 31 are views showing the element-attribute ID tables managed by the DTD objects X and Y, respectively.

【０２６０】以上の文書オブジェクトA、B、Cを文書フ
ォルダオブジェクトに登録すると、文書フォルダオブジ
ェクトは、図２３及び図２４で示した手順に従い、図３
２に示す要素−属性インデックスを作成する。When document objects A, B, and C above are registered in the document folder object, the document folder object creates the element-attribute index shown in FIG. 32 according to the procedure shown in FIGS. 23 and 24.

【０２６１】検索要求の例「要素"疾患記録"の子孫要素
で、属性"pＩＤ"が1000以上の要素"患者"を取出せ」に
対する、検索実行部２２の検索処理の手順について説明
する。The search procedure of the search execution unit 22 is described below for the example search request "retrieve the elements "patient" that are descendants of the element "disease record" and whose attribute "pID" is 1000 or more".

【０２６２】ステップ１：属性"pＩＤ"の値が1000以上
の要素"患者"の取出し：Step 1: Retrieval of element "Patient" whose attribute "pID" value is 1000 or more:

【０２６３】a）文書フォルダオブジェクトは、自分が
管理するＤＴＤオブジェクトX、Yの要素−属性ＩＤテー
ブル、及びX、YのＤＴＤＩＤをロードする。A) The document folder object loads the element-attribute ID table of the DTD objects X and Y managed by itself, and the DTD IDs of X and Y.

【０２６４】b）ＤＴＤオブジェクトX、Yの要素−属性
ＩＤテーブルから、検索要求で用いられている要素名("
患者"、"疾患記録")、属性名("pＩＤ")の要素名ＩＤ、
属性名ＩＤを調べる。b) From the element-attribute ID tables of DTD objects X and Y, look up the element name IDs of the element names used in the search request ("patient", "disease record") and the attribute name ID of the attribute name ("pID").

【０２６５】・ＤＴＤオブジェクトX(ＤＴＤＩＤは１)
の場合： −"患者"=00030000、"pＩＤ"=00030001、"疾患記録"=00
040000 ・ＤＴＤオブジェクトY(ＤＴＤＩＤは２)の場合： −"患者"=00040000、"pＩＤ"=00040001、"疾患記録"=00
050000 ・"患者"と"pＩＤ"のＩＤの上位２バイトが同じ値であ
ることから、属性"pＩＤ"は、要素"患者"の属性である
ことが分かる。
- DTD object X (DTDID is 1): "patient" = 00030000, "pID" = 00030001, "disease record" = 00040000
- DTD object Y (DTDID is 2): "patient" = 00040000, "pID" = 00040001, "disease record" = 00050000
- Since the upper 2 bytes of the IDs of "patient" and "pID" have the same value, the attribute "pID" is known to be an attribute of the element "patient".
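The upper-2-byte comparison can be illustrated with a short sketch. It assumes the IDs quoted above are 32-bit values written as eight hexadecimal digits, so the upper 2 bytes are the top 16 bits; the function names are hypothetical.

```python
def upper_two_bytes(name_id: int) -> int:
    """Return the upper 2 bytes (top 16 bits) of a 32-bit name ID."""
    return (name_id >> 16) & 0xFFFF


def is_attribute_of(attr_id: int, elem_id: int) -> bool:
    """An attribute belongs to an element when their upper 2 bytes match."""
    return upper_two_bytes(attr_id) == upper_two_bytes(elem_id)


# IDs quoted in the text for DTD object X.
PATIENT_X = 0x00030000
PID_X = 0x00030001
DISEASE_RECORD_X = 0x00040000

pid_belongs_to_patient = is_attribute_of(PID_X, PATIENT_X)
pid_belongs_to_record = is_attribute_of(PID_X, DISEASE_RECORD_X)
```

Under this assumption the lower 2 bytes distinguish the attributes of one element, which is why the same attribute name under two different elements can still receive distinct IDs.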

【０２６６】c）文書フォルダオブジェクトが保有する
要素−属性インデックスにおいて、(ＤＴＤＩＤ、"pＩ
Ｄ"の属性名ＩＤ)と、属性値が1000以上という条件を満
たすデータを、図２７に示した検索探索の手順に従って
検索する。c) In the element-attribute index held by the document folder object, search for the data that match (DTDID, attribute name ID of "pID") and satisfy the condition that the attribute value is 1000 or more, according to the search procedure shown in FIG. 27.

【０２６７】このときの図２７のステップＳ376の検索
条件は、「属性値が1000以上」となる。結果、以下のエ
ントリが返却される。At this time, the search condition of step S376 of FIG. 27 is "attribute value is 1000 or more". As a result, the following entries are returned.

【０２６８】−(1、00030001、1200、1、20) −(1、00030001、1200、1、40) −(2、00040001、1500、3、40)
- (1, 00030001, 1200, 1, 20)
- (1, 00030001, 1200, 1, 40)
- (2, 00040001, 1500, 3, 40)

【０２６９】すなわち、図３３において、「○」を付け
た部分の要素が検索結果に該当する。「×」を付けた部
分は、属性値の条件である1000以上を満たさなかった要
素"患者"を表している。That is, in FIG. 33, the elements marked with "○" correspond to the search results. The part marked with "x" represents the element "patient" that did not satisfy the attribute value condition of 1000 or more.

【０２７０】ステップ２.要素"患者"が要素"疾患記録"
の子孫要素であるかどうかのチェック： a）各データの文書ＩＤから文書オブジェクトをロード
し、各文書オブジェクトの要素ツリーインデックスをロ
ードする。Step 2: Check whether the element "patient" is a descendant of the element "disease record": a) Load the document object for the document ID of each entry, and load the element tree index of each document object.

【０２７１】図３４は、文書A(文書ＩＤは１)の文書オ
ブジェクトの要素ツリーインデックスを示す、図３５
は、文書C(文書ＩＤは３)の文書オブジェクトの要素ツ
リーインデックスを示す図である。FIG. 34 is a diagram showing the element tree index of the document object of document A (document ID is 1), and FIG. 35 is a diagram showing the element tree index of the document object of document C (document ID is 3).

【０２７２】これらの要素ツリーインデックスは、図９
に示した要素ツリーインデックス作成手順によってあら
かじめ作成されたものである。These element tree indexes were created in advance by the element tree index creation procedure shown in FIG. 9.

【０２７３】b）図３４、及び図３５の要素ツリーイン
デックスを基に、ステップ1のc）で求めた属性"pＩＤ"
が1000以上の要素"患者"が、要素"疾患記録"の子孫要素
であるか否かを計算する。この計算処理が、文書の論理
構造に基づく検索に相当する。b) Based on the element tree indexes of FIGS. 34 and 35, determine whether each element "patient" whose attribute "pID" is 1000 or more, obtained in c) of step 1, is a descendant of the element "disease record". This computation corresponds to a search based on the logical structure of the document.

【０２７４】この計算は、図１６に示した子孫要素の取
出しによって、要素"疾患記録"の子孫要素にステップ1
のc）で求めた要素"患者"が存在するかチェックする。This computation uses the descendant-element extraction shown in FIG. 16 to check whether the element "patient" obtained in c) of step 1 is among the descendants of the element "disease record".

【０２７５】このチェックの方法は、要素ツリーインデ
ックスにおける三番目のデータ項目「要素開始位置」
と、要素−属性インデックスの五番目のデータ項目「要
素開始位置」の値は、同じ要素であるならば同じ値とな
ることから、要素"疾患記録"の子孫要素の「要素開始位
置」の中に、ステップ1のc）で求めた要素−属性インデ
ックスのエントリの「要素開始位置」の値が含まれてい
るかを調べることによって判定することができる。The check relies on the fact that the third data item "element start position" of the element tree index and the fifth data item "element start position" of the element-attribute index have the same value for the same element. It can therefore be made by examining whether the "element start position" of each element-attribute index entry obtained in c) of step 1 is included among the "element start positions" of the descendants of the element "disease record".
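The start-position comparison above can be sketched as follows. The sketch makes several assumptions that are not taken from the figures: the element tree index is a list of (element name ID, element level, element start position) triples in document order, the descendants of an entry are the following entries with a greater level up to the first entry whose level is not greater, and the tree contents shown are hypothetical rather than those of FIG. 34. All names are illustrative.

```python
from typing import List, NamedTuple, Set, Tuple


class TreeEntry(NamedTuple):
    name_id: int   # element name ID
    level: int     # depth from the root element
    start: int     # element start position


def descendant_starts(tree: List[TreeEntry], ancestor_name_id: int) -> Set[int]:
    """Collect the start positions of all descendants of elements with the given name ID."""
    starts: Set[int] = set()
    for i, entry in enumerate(tree):
        if entry.name_id != ancestor_name_id:
            continue
        for later in tree[i + 1:]:
            if later.level <= entry.level:
                break              # left the subtree of this ancestor
            starts.add(later.start)
    return starts


# Hypothetical element tree index for document A (document ID 1).
tree_a = [
    TreeEntry(0x00010000, 0, 0),    # hypothetical root element
    TreeEntry(0x00030000, 1, 20),   # "patient" outside any "disease record"
    TreeEntry(0x00040000, 1, 30),   # "disease record"
    TreeEntry(0x00030000, 2, 40),   # "patient" inside "disease record"
]

# Entries returned by step 1 for document A: (DTDID, name ID, value, doc ID, start).
step1_hits: List[Tuple[int, int, int, int, int]] = [
    (1, 0x00030001, 1200, 1, 20),
    (1, 0x00030001, 1200, 1, 40),
]

inside = descendant_starts(tree_a, 0x00040000)
final_hits = [hit for hit in step1_hits if hit[4] in inside]
```

With this sample data only the entry at start position 40 survives, which matches the way the entry at position 20 is dropped from the final result in the text.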

【０２７６】以上の結果、以下のエントリが要素"疾患
記録"の子孫要素となる。As a result of the above, the following entries become descendant elements of the element "disease record".

【０２７７】−(1、00030001、1200、1、40) −(2、00040001、1500、3、40)
- (1, 00030001, 1200, 1, 40)
- (2, 00040001, 1500, 3, 40)

【０２７８】すなわち、図３６において、印「○」を付
けた部分の要素が最終的な検索結果に該当する。That is, in FIG. 36, the element of the part marked with "○" corresponds to the final search result.

【０２７９】c）各文書ＩＤの文書オブジェクトとその
要素開始位置から、検索結果である要素の中身を取出
し、返却する。C) From the document object of each document ID and its element start position, the content of the element which is the search result is extracted and returned.

【０２８０】[0280]

【発明の効果】以上説明したように本発明によれば下記
記載の効果を奏する。As described above, the present invention has the following effects.

【０２８１】本発明の第１の効果は、ＤＴＤが異なるな
どで構造の異なる複数の構造化文書の集合に対する検索
において、各文書の構造に基づく検索を正しく実行可能
としている、ということである。The first effect of the present invention is that in the search for a set of a plurality of structured documents having different structures such as different DTDs, the search based on the structure of each document can be correctly executed.

【０２８２】従来より、一つの文書に対する構造に基づ
く検索は可能であり、また複数の文書に対する構造に基
づく検索では、例えば特開平10−240752号公報等に、複
数の文書の構造を代表させる構造インデックスを用いた
検索などが提案されているが、前述したように、上記特
開平10−240752号公報等に記載される方法では、前記構
造インデックスが、各文書の構造を必ずしも正しく代表
しているとは限らない、という問題点を有しており、前
記構造インデックスを用いて検索を行っても、各文書の
論理構造の条件を正しく反映した検索結果が返ってくる
保証は得られなかった。Conventionally, it is possible to perform a structure-based search for one document, and for a structure-based search for a plurality of documents, for example, Japanese Patent Laid-Open No. 10-240752 discloses a structure that represents the structure of a plurality of documents. Although a search using an index has been proposed, as described above, in the method described in the above-mentioned Japanese Patent Laid-Open No. 10-240752, the structure index does not always correctly represent the structure of each document. However, even if a search is performed using the structure index, there is no guarantee that a search result that correctly reflects the logical structure condition of each document will be returned.

【０２８３】本発明によれば、文書の構造情報が記述さ
れるＤＴＤを、各文書ごとにリンクして管理し、文書の
構造に基づく検索では、各文書ごとのＤＴＤの情報を参
照することにより、各文書ごとの正しい論理構造に基づ
く検索を実現している。According to the present invention, the DTD describing the structure information of a document is linked to and managed with each document, and a structure-based search refers to the DTD information of each document, thereby realizing a search based on the correct logical structure of each document.

【０２８４】本発明の第２の効果は、前記構造に基づく
検索に利用するためのインデックスの作成方法および利
用法を提供することにより、その検索性能を向上する、
ということである。The second effect of the present invention is that search performance is improved by providing methods of creating and using the indexes on which the structure-based search relies.

【０２８５】文書の論理構造に基づく検索要求では、文
書の要素名や属性名を指定することが多い。要素名や属
性名は通常は文字列で表されるので、上記特開平6−131
340号公報記載の装置では、インデックスとして前記要
素名や属性名の文字列を直接扱っているが、インデック
スとして文字列を扱うことの問題点は、文字列は一般に
可変長であり、文書の要素に対して必要なインデックス
の容量をあらかじめ予測できないこと、そしてその文字
列が長大であった場合、その分だけインデックスの容量
を必要とする。A search request based on the logical structure of a document often specifies element names or attribute names of the document. Because element names and attribute names are usually character strings, the device described in the above-mentioned JP-A-6-131340 handles those strings directly in its index. The problems with indexing strings are that strings are generally variable-length, so the index capacity required for the elements of a document cannot be predicted in advance, and that a long string requires correspondingly more index capacity.

【０２８６】本発明によれば、第一のインデックスとし
て、ＤＴＤクラスに要素−属性ＩＤテーブルを備えてお
り、要素名や属性名はあるＤＴＤの範囲で唯一に定めら
れることに着目し、各要素名、属性名に対応する要素名
ＩＤ、属性名ＩＤを与えてその対応関係を要素−属性Ｉ
Ｄテーブルとして保持して管理したものであり、要素名
ＩＤと属性名ＩＤは数値として与えている。例えば32ビ
ット計算器におけるint型（整数型）のデータとして与
えた場合、要素名ＩＤ、属性名ＩＤに必要な容量は固定
長で32ビットとなり、32ビットは半角文字に換算すると
4文字分であり、インデックスの省メモリ化を図ること
ができる。According to the present invention, the DTD class is provided with an element-attribute ID table as the first index. Noting that element names and attribute names are uniquely determined within the scope of a given DTD, an element name ID or attribute name ID is assigned to each element name or attribute name, and the correspondence is held and managed as the element-attribute ID table. Element name IDs and attribute name IDs are given as numeric values. For example, when they are given as int (integer) data on a 32-bit computer, each element name ID or attribute name ID requires a fixed length of 32 bits, which is equivalent to only four single-byte characters, so the memory used by the index can be reduced.

【０２８７】インデックスを省メモリ化することは、よ
り多くの文書に関する情報を一度に計算機のメモリ上に
ロード可能にすることを意味し、結果として文書の検索
性能の向上につながる。The memory saving of the index means that information on more documents can be loaded into the memory of the computer at one time, and as a result, the document search performance is improved.

【０２８８】第二のインデックスとして、要素ツリーイ
ンデックスを備え、要素ツリーインデックスは、各文書
における要素間の関係を表すインデックスであり、各要
素に対して(要素名ＩＤ、要素レベル、要素開始位置)の
三つ組のデータを与えてその文書の構造を管理する。要
素ツリーインデックスの各エントリに必要な容量は、各
項目とも数値で表すこととができるので、例えば32ビッ
ト計算器におけるint型のデータとして与えれば、一エ
ントリに対して96ビット、12バイトとなる。すなわち、
要素ツリーインデックスも省メモリなインデックスとい
える。An element tree index is provided as the second index. The element tree index represents the relationships between the elements of each document, and manages the structure of the document by holding, for each element, the triple (element name ID, element level, element start position). Since every item can be expressed as a number, if each is given as int data on a 32-bit computer, for example, one entry requires 96 bits, that is, 12 bytes. The element tree index is therefore also a memory-saving index.
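The 12-byte figure can be checked with a small sketch. It assumes each of the three items is stored as a 4-byte signed integer with no padding; the little-endian layout and the name `TREE_ENTRY` are illustrative choices, not part of the description.

```python
import struct

# One element tree index entry: (element name ID, element level, element start position),
# each packed as a 4-byte signed integer.
TREE_ENTRY = struct.Struct("<iii")

packed = TREE_ENTRY.pack(0x00030000, 2, 40)
name_id, level, start = TREE_ENTRY.unpack(packed)
entry_size_bytes = TREE_ENTRY.size
```

Because every entry has the same size, the n-th entry of a packed index can be located at byte offset n times the entry size without scanning.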

【０２８９】また要素ツリーインデックスを用いて、あ
る要素からの親要素、祖先要素、子要素、子孫要素、兄
要素、弟要素、前要素、次要素などの相対的な関係にあ
る要素の取出しを可能にするので、構造に基づく検索要
求に応えるためのインデックスとしては前記１２バイト
の構造で必要十分な情報量といえる。In addition, the element tree index makes it possible to extract elements that stand in a relative relationship to a given element, such as its parent, ancestor, child, descendant, elder brother, younger brother, previous, and next elements. The 12-byte structure therefore carries necessary and sufficient information for an index that serves structure-based search requests.

【０２９０】本発明においては、第三のインデックスと
して、要素−属性インデックスを備えている。要素−属
性インデックスは、本発明の文書フォルダクラスのイン
デックスとして、複数の文書に対する要素や属性の情報
を保持するインデックスである。インデックスの構造
は、（ＤＴＤＩＤ、要素−属性名ＩＤ、属性値、文書Ｉ
Ｄ、要素開始位置)であり、二番目の項目「要素−属性
名ＩＤ」で要素名ＩＤまたは属性名ＩＤを用いており、
文字列をそのまま用いるよりはサイズを小さく抑え、ま
た固定長としている。In the present invention, an element-attribute index is provided as the third index. The element-attribute index is an index of the document folder class of the present invention that holds element and attribute information for a plurality of documents. Its structure is (DTDID, element-attribute name ID, attribute value, document ID, element start position). The second item, "element-attribute name ID", holds an element name ID or an attribute name ID, which keeps the entry smaller than storing the character string itself and makes the item fixed-length.

【０２９１】また、一番目の項目「ＤＴＤＩＤ」からＤ
ＴＤオブジェクトの要素−属性ＩＤテーブルを参照する
ことにより、各ＤＴＤで定義した要素名、属性名に基づ
いた検索条件の判定を可能としている。In addition, by referring to the element-attribute ID table of the DTD object identified by the first item, "DTDID", search conditions can be evaluated on the basis of the element names and attribute names defined in each DTD.

【０２９２】このように、ＤＴＤに関する情報との関係
をエントリ一つ一つに持たせたので、ＤＴＤが異なる複
数の文書を単一のインデックスで扱うことを可能として
いる。As described above, since each entry has a relationship with the information regarding DTD, it is possible to handle a plurality of documents having different DTDs by a single index.

【０２９３】要素−属性インデックスの文書ＩＤからは
文書オブジェクトを特定して、要素ツリーインデックス
を用いて構造に基づく条件判定を行ったり、要素開始位
置から即座に該当する要素を取出すことも可能としてい
る。It is also possible to specify a document object from the document ID of the element-attribute index, perform condition determination based on the structure using the element tree index, and immediately extract the corresponding element from the element start position. .

【０２９４】要素−属性インデックスの各エントリにつ
いては、(ＤＴＤＩＤ、要素−属性名ＩＤ)でソートした
順序としている。The entries of the element-attribute index are in the order sorted by (DTDID, element-attribute name ID).

【０２９５】これにより、(ＤＴＤＩＤ、要素−属性名
ＩＤ)を検索キーとした二分探索によるエントリの検索
を可能とし、より高速な目的の文書や要素の取出しを可
能としている。As a result, it is possible to search for an entry by a binary search using (DTDID, element-attribute name ID) as a search key, and it is possible to retrieve a target document or element at a higher speed.
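Keeping the entries sorted by (DTDID, element-attribute name ID) means each new entry has to be inserted at the right position. The sketch below shows one way to do that with a binary search; it assumes the index is an in-memory list of 5-tuples and is not a transcription of the flowcharts of FIGS. 25 and 26. The function name `insert_entry` and the sample values 900 and `None` are hypothetical.

```python
from bisect import bisect_left
from typing import Any, List, Tuple

IndexEntry = Tuple[int, int, Any, int, int]  # (DTDID, name ID, value, doc ID, start)


def insert_entry(index: List[IndexEntry], entry: IndexEntry) -> int:
    """Insert entry after any existing entries with the same (DTDID, name ID) key."""
    dtd_id, name_id = entry[0], entry[1]
    # The 2-tuple probe (dtd_id, name_id + 1) sorts after every entry whose key
    # is (dtd_id, name_id) and before every entry with a larger key.
    position = bisect_left(index, (dtd_id, name_id + 1))
    index.insert(position, entry)
    return position


index: List[IndexEntry] = []
insert_entry(index, (2, 0x00040001, 1500, 3, 40))
insert_entry(index, (1, 0x00030001, 1200, 1, 40))
insert_entry(index, (1, 0x00030000, None, 1, 20))   # element entry without an attribute value
insert_entry(index, (1, 0x00030001, 900, 1, 20))    # hypothetical attribute value

keys = [entry[:2] for entry in index]
```

Only the first two items take part in the comparison, so attribute values of mixed types (numbers, strings, or none) never need to be ordered against each other.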

[Brief description of drawings]

【図１】本発明の一実施の形態の構成を示す図である。FIG. 1 is a diagram showing a configuration of an embodiment of the present invention.

【図２】本発明の一実施の形態におけるデータベースの
スキーマに関するクラスを示す図である。FIG. 2 is a diagram showing classes related to a database schema according to the embodiment of the present invention.

【図３】本発明の一実施の形態における文書のインスタ
ンスとＤＴＤのインスタンスの間のリンク関係の例を示
す図である。FIG. 3 is a diagram showing an example of a link relationship between a document instance and a DTD instance according to an embodiment of the present invention.

【図４】本発明の一実施の形態における文書フォルダオ
ブジェクトの木構造に基づく分類階層の一例を示す図で
ある。FIG. 4 is a diagram showing an example of a classification hierarchy based on a tree structure of a document folder object according to the embodiment of the present invention.

【図５】本発明の一実施の形態における、単一のＤＴＤ
の範囲における要素名ＩＤの一意な割り振り方法の一例
を示す図である。FIG. 5 is a diagram showing an example of a method of uniquely assigning element name IDs within the scope of a single DTD according to an embodiment of the present invention.

【図６】本発明の一実施の形態における単一のＤＴＤの
範囲における属性名ＩＤの一意な割り振り方法の一例を
示す図である。FIG. 6 is a diagram showing an example of a unique allocation method of attribute name IDs within a single DTD range according to an embodiment of the present invention.

【図７】本発明の一実施の形態におけるＤＴＤの一例を
示す図である。FIG. 7 is a diagram showing an example of a DTD according to an embodiment of the present invention.

【図８】図７のＤＴＤに対して割り振った要素名ＩＤと
属性名ＩＤを保持する要素−属性ＩＤテーブルを示す図
である。FIG. 8 is a diagram showing an element-attribute ID table holding the element name IDs and attribute name IDs assigned to the DTD of FIG. 7.

【図９】本発明の一実施の形態における要素ツリーイン
デックスの作成手順を示す流れ図である。FIG. 9 is a flowchart showing a procedure for creating an element tree index according to the embodiment of the present invention.

【図１０】本発明の一実施の形態における文書の一例を
示す図である。FIG. 10 is a diagram showing an example of a document according to the embodiment of the present invention.

【図１１】図１０の文書の要素に関する木構造表記を示
す図である。FIG. 11 is a diagram showing a tree-structure notation for the elements of the document of FIG. 10.

【図１２】図１０の文書に対する要素ツリーインデック
スを示す図である。FIG. 12 is a diagram showing the element tree index for the document of FIG. 10.

【図１３】本発明の一実施の形態における親要素の取出
し手順を示す流れ図である。FIG. 13 is a flowchart showing a procedure for extracting a parent element according to the embodiment of the present invention.

【図１４】本発明の一実施の形態における祖先要素の取
出し手順を示す流れ図である。FIG. 14 is a flowchart showing a procedure for extracting an ancestor element in the embodiment of the present invention.

【図１５】本発明の一実施の形態における子要素の取出
し手順を示す流れ図である。FIG. 15 is a flowchart showing a procedure for taking out child elements according to the embodiment of the present invention.

【図１６】本発明の一実施の形態における子孫要素の取
出し手順を示す流れ図である。FIG. 16 is a flowchart showing a procedure for extracting a descendant element according to the embodiment of the present invention.

【図１７】本発明の一実施の形態における兄要素の取出
し手順を示す流れ図である。FIG. 17 is a flowchart showing a procedure for taking out a brother element in the embodiment of the present invention.

【図１８】本発明の一実施の形態における弟要素の取出
し手順を示す流れ図である。FIG. 18 is a flowchart showing a procedure for taking out the younger brother element in the embodiment of the present invention.

【図１９】本発明の一実施の形態における末弟要素の取
出し手順を示す流れ図である。FIG. 19 is a flowchart showing a procedure for taking out the youngest brother element in the embodiment of the present invention.

【図２０】本発明の一実施の形態における右深さ優先順
序での子孫要素の取出し手順を示す流れ図である。FIG. 20 is a flowchart showing a procedure for extracting descendant elements in the right depth priority order according to the embodiment of the present invention.

【図２１】本発明の一実施の形態における前要素の取出
し手順を示す流れ図である。FIG. 21 is a flowchart showing a procedure for taking out the front element according to the embodiment of the present invention.

【図２２】本発明の一実施の形態における次要素の取出
し手順を示す流れ図である。FIG. 22 is a flowchart showing a procedure for taking out a next element according to the embodiment of the present invention.

【図２３】本発明の一実施の形態における要素−属性イ
ンデックスの作成手順を示す流れ図（その１）である。FIG. 23 is a flowchart (part 1) showing a procedure for creating an element-attribute index in the embodiment of the present invention.

【図２４】本発明の一実施の形態における要素−属性イ
ンデックスの作成手順を示す流れ図（その２）である。FIG. 24 is a flowchart (part 2) showing the procedure for creating an element-attribute index according to the embodiment of the present invention.

【図２５】本発明の一実施の形態における要素−属性イ
ンデックスにおける、エントリの挿入位置を求める手順
を示す流れ図である。FIG. 25 is a flowchart showing a procedure for obtaining an entry insertion position in the element-attribute index according to the embodiment of the present invention.

【図２６】本発明の一実施の形態におけるリスト上のデ
ータに対する目的のデータの取出し手順、およびデータ
の挿入位置の計算手順を示す流れ図である。FIG. 26 is a flow chart showing a procedure for extracting target data from data on a list and a procedure for calculating a data insertion position according to an embodiment of the present invention.

【図２７】本発明の一実施の形態における要素−属性イ
ンデックスにおける、エントリの検索手順を示す流れ図
である。FIG. 27 is a flowchart showing an entry search procedure in the element-attribute index according to the embodiment of the present invention.

【図２８】本発明の一実施例における文書フォルダ、文
書、ＤＴＤの各オブジェクトの関係の例を示すインスタ
ンスを示す図である。FIG. 28 is a diagram showing an instance showing an example of a relationship between each object of a document folder, a document, and a DTD according to an embodiment of the present invention.

【図２９】本発明の一実施例における文書オブジェクト
A、B、Cを示す図である。FIG. 29 is a diagram showing document objects A, B, and C according to an example of the present invention.

【図３０】本発明の一実施例におけるＤＴＤオブジェク
トXの要素−属性ＩＤテーブルを示す図である。FIG. 30 is a diagram showing an element-attribute ID table of a DTD object X according to an embodiment of the present invention.

【図３１】本発明の一実施例におけるＤＴＤオブジェク
トYの要素−属性ＩＤテーブルを示す図である。FIG. 31 is a diagram showing an element-attribute ID table of a DTD object Y according to an embodiment of the present invention.

【図３２】本発明の一実施例における要素−属性インデ
ックスを示す図である。FIG. 32 is a diagram showing an element-attribute index according to an embodiment of the present invention.

【図３３】本発明の一実施例における属性"pＩＤ"の値
が1000以上の要素"患者"の検索結果を示す図である。FIG. 33 is a diagram showing a search result of an element “patient” having an attribute “pID” value of 1000 or more in one embodiment of the present invention.

【図３４】本発明の一実施例における文書オブジェクト
Aの要素ツリーインデックスを示す図である。FIG. 34 is a diagram showing the element tree index of document object A according to an example of the present invention.

【図３５】本発明の一実施例における文書オブジェクト
Cの要素ツリーインデックスを示す図である。FIG. 35 is a diagram showing the element tree index of document object C according to an example of the present invention.

【図３６】本発明の一実施例における要素"疾患記録"の
子孫要素である要素"患者"の検索結果を示す図である。FIG. 36 is a diagram showing a search result of an element “patient” which is a descendant element of the element “disease record” in one example of the present invention.

[Explanation of symbols]

１データベース１１文書１２文書フォルダ１３ＤＴＤ１４要素ツリーインデックス１５要素−属性インデックス１６要素−属性ＩＤテーブル２１文書検索実行部２１１要素ツリーインデックス作成部２１２要素−属性インデックス２１３要素−属性ＩＤテーブル作成部２２検索実行部１１１、１１２、１１３文書１３１、１３２ＤＴＤ１２１〜１２６文書フォルダ
1: database
11: document
12: document folder
13: DTD
14: element tree index
15: element-attribute index
16: element-attribute ID table
21: document search execution unit
211: element tree index creation unit
212: element-attribute index
213: element-attribute ID table creation unit
22: search execution unit
111, 112, 113: documents
131, 132: DTDs
121 to 126: document folders

───────────────────────────────────────────────────── フロントページの続き (51)Int.Cl.⁷ 識別記号ＦＩＧ０６Ｆ 12/00 ５４６Ｇ０６Ｆ 12/00 ５４６Ｔ (56)参考文献北野拓哉，波内みさ，半構造化データモデルに基づくＸＭＬ文書の格納と検索及びその実装方法，情報処理学会研究報告（99−ＤＢＳ−117），1999年１月 23日，Ｖｏｌ．99，Ｎｏ．６，ｐ．31− 38 北野拓哉，波内みさ，半構造化データモデルを利用したＸＭＬ文書管理システムの試作，情報処理学会第57回（平成10 年後期）全国大会講演論文集（３）, 1998年10月５日，ｐ．283−284 (58)調査した分野(Int.Cl.⁷，ＤＢ名) G06F 17/28 - 17/30 G06F 12/00 ─────────────────────────────────────────────────── ─── Continuation of front page (51) Int.Cl. ⁷ Identification code FI G06F 12/00 546 G06F 12/00 546T (56) References Takuya Kitano, Misa Namiuchi, XML document based on semi-structured data model Storage and retrieval and its implementation method, IPSJ Research Report (99-DBS-117), January 23, 1999, Vol. 99, No. 6, p. 31- 38 Takuya Kitano, Misa Namiuchi, Prototype of XML document management system using semi-structured data model, Proc. Of the 57th National Conference of Information Processing Society of Japan (Late 1998) (3), 1998 10 5th p.m. 283-284 (58) Fields investigated (Int.Cl. ⁷ , DB name) G06F 17/28-17/30 G06F 12/00

Claims

(57) [Claims]

1. A structured document management device comprising: a database for storing structured documents; document storage execution means for analyzing an input structured document and storing it in the database; search means for searching for documents in the database; and output means for outputting search results, wherein the database has, as classes in its schema, a document class that manages documents, a DTD class that manages the DTD (Document Type Definition) of the documents, and a document folder class that manages a set of the documents and DTDs; as a first index, the document class has an element tree index representing the tree-structure relationship of the elements in a document; as a second index, the document folder class has an element-attribute index that manages an index for each element and attribute of the documents; as a third index, the DTD class has an element-attribute ID table for managing element names and attribute names as IDs, the table holding numeric element name IDs and attribute name IDs corresponding to the element names and the attribute names of the elements in the documents; the document storage execution means comprises means for creating the element tree index for each document using the element IDs of the element-attribute ID table of the database, means for assigning an element name ID and an attribute name ID to each element name and attribute name defined in the DTD of the input structured document and creating the element-attribute ID table, and means for creating the element-attribute index, which forms an index holding element and attribute information for the input structured document; the input structured document is stored in the database in a form based on its logical structure, using the element tree index, the element-attribute ID table, and the element-attribute index; and the search means, when performing a search based on the logical structure of a document in response to an input search request, searches for the target structured document using the element tree index, the element-attribute ID table, and the element-attribute index, and outputs the search result to the output means.

2. The structured document management device according to claim 1, wherein the search means extracts, as elements in a hierarchical relationship with a given document element in a structured document, any one or a combination of the parent element, ancestor element, child element, descendant element, elder brother element, younger brother element, previous element, and next element of that document element.

3. The structured document management device according to claim 1, wherein the search means, when searching for documents and elements, performs the search over a plurality of structured documents whose logical structures differ from one another, for example because their DTDs differ.

4. The structured document management device according to claim 1, wherein the means for creating the element tree index creates, as an index representing the relationships between the elements in each document, the element tree index whose structure holds for each element the three items (element name ID, element level indicating the depth from the root element when the elements are represented as a tree structure, element start position), and stores it in the database.

5. The structured document management device according to claim 1, wherein the means for creating the element-attribute index creates, as an index holding element and attribute information for a plurality of documents, the element-attribute index consisting of the quintuple (DTDID, element-attribute name ID, attribute value, document ID, element start position), and stores it in the database.

6. The structured document management device according to claim 1, wherein the means for creating the element-attribute ID table creates an element-attribute ID table holding the correspondence between element name IDs and element names and the correspondence between attribute name IDs and attribute names; assigns as a numeric value an element name ID that is unique to each element name defined in the DTD, creates the correspondence between element names and element name IDs as a table, and stores the table in the database as an instance of the element-attribute ID table in a one-to-one relationship with the object that is an instance of the DTD class; and assigns as a numeric value, to each attribute name, an attribute name ID that is unique with respect to the elements in the DTD object and can be distinguished even if the same attribute name exists under different element names, creates the correspondence between attribute names and attribute name IDs as a table, and stores the table in the database as an instance of the element-attribute ID table.

7. A schema of a database, a document class for managing documents, a DTD class for managing DTD (Document Type Definition) of the documents, and a document folder class for managing a set of the documents and DTDs. And, as a first index, the document class is provided with an element tree index representing a tree structure relationship of elements of the document, and as a second index, the document folder class is provided for a plurality of documents. It is an index that holds information on elements and attributes, (DTDID, element-attribute name ID,
attribute value, document ID, element start position), each entry being such a five-tuple, is provided; and, as a third index, the DTD class is provided with an element-attribute ID table for managing element names and attribute names as IDs, the table holding numerical element name IDs and attribute name IDs corresponding to the element names and attribute names of the elements in the document; the method comprising, in a structured document management apparatus comprising a database having the above, the following steps when storing an input document in the database:
(a) in creating the element-attribute ID table, an element name ID that is unique to each element name defined in the DTD of the input document is given as a numerical value, the correspondence between element names and element name IDs is created as a table, and the created table is stored in the database as an instance of the element-attribute ID table in a one-to-one relationship with the object that is an instance of the DTD class; and, for attribute names, numerical attribute name IDs are given that are unique within the DTD object, so that identical attribute names occurring under different element names can be distinguished, the correspondence between attribute names and attribute name IDs is created as a table, and the created table is stored in the database as an instance of the element-attribute ID table;
(b) in creating the element tree index, the first character position of the start tag and the last character position of the end tag of each element of the input document are obtained, the first character position of the start tag is taken as the element start position, the element name in the start tag is obtained, the element-attribute ID table is retrieved from the DTD object related to the document object of the document, the element name ID of the element is obtained, the element level representing the depth of the element from the root element when the elements are represented as a tree structure is then obtained, an entry consisting of the triple (element name ID, element level, element start position) is added, and the created element tree index is added to the database;
(c) the value of each item of the five-tuple data structure (DTDID, element-attribute name ID, attribute value, document ID, element start position) to be stored in the element-attribute index is obtained, the created entry is added to the list of the element-attribute index while maintaining the order sorted by DTDID and element-attribute name ID, and the created element-attribute index is stored in the database; and
a step of, in response to an input search request, when performing a search of the database based on the logical structure of the document, searching the target structured documents using the element tree index, the element-attribute ID table, and the element-attribute index, and outputting the search result from an output means.
A structured document management method comprising the above steps.
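
Steps (a) and (b) can be illustrated with a minimal Python sketch. It is not the claimed implementation. The names DTD, build_id_table, and build_element_tree_index are hypothetical, the DTD is summarised as a plain dictionary, element and attribute name IDs are assumed to share one counter, the root element is assumed to be at level 0, and the tag scanner assumes well-formed markup with no empty-element tags, comments, or CDATA sections.

```python
import re

# Hypothetical DTD summary: element name -> attribute names declared for it.
DTD = {"book": [], "title": ["lang"], "chapter": ["id", "lang"]}

def build_id_table(dtd):
    """Assign numeric IDs: one per element name, one per (element, attribute) pair."""
    element_ids, attribute_ids = {}, {}
    next_id = 1  # assumption: element and attribute IDs share one numbering space
    for element in dtd:
        element_ids[element] = next_id
        next_id += 1
    for element, attributes in dtd.items():
        for attribute in attributes:
            # Keyed by (element, attribute) so "lang" on title and on chapter differ.
            attribute_ids[(element, attribute)] = next_id
            next_id += 1
    return element_ids, attribute_ids

# Matches a start tag or an end tag and captures the element name.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)[^>]*>")

def build_element_tree_index(text, element_ids):
    """Return (element name ID, element level, element start position) triples."""
    index, depth = [], 0
    for match in TAG.finditer(text):
        is_end_tag, name = match.group(1) == "/", match.group(2)
        if is_end_tag:
            depth -= 1
        else:
            # The first character position of the start tag is the element start.
            index.append((element_ids[name], depth, match.start()))
            depth += 1
    return index

element_ids, attribute_ids = build_id_table(DTD)
doc = '<book><title>XML</title><chapter id="c1"><title>Intro</title></chapter></book>'
tree_index = build_element_tree_index(doc, element_ids)
for entry in tree_index:
    print(entry)
```

Because the triples are appended in document order, the list itself encodes the tree: an entry's level relative to its neighbours is enough to recover parent and child relationships.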

8. The structured document management method according to claim 7, wherein, when performing a search of the database based on the logical structure of the document, the element tree index is used to extract, as elements having a relative relationship with a given element, any one of a parent element, an ancestor element, a child element, a descendant element, an older sibling element, a younger sibling element, a previous element, and a next element, or a combination thereof.
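
One way such relative elements could be derived from the element tree index is sketched below in Python. It assumes that the index is a list of (element name ID, element level, element start position) triples held in document order and that elements are referred to by their position in that list. The function names and the sample values are hypothetical.

```python
# Hypothetical element tree index in document order:
# (element name ID, element level, element start position)
index = [
    (1, 0, 0),    # 0: book
    (2, 1, 6),    # 1: title
    (3, 1, 24),   # 2: chapter
    (2, 2, 41),   # 3: title inside the first chapter
    (3, 1, 71),   # 4: chapter
    (2, 2, 80),   # 5: title inside the second chapter
]

def parent(index, i):
    """Nearest preceding entry with a smaller level, or None for the root."""
    level = index[i][1]
    for j in range(i - 1, -1, -1):
        if index[j][1] < level:
            return j
    return None

def ancestors(index, i):
    """Parent, grandparent, and so on up to the root."""
    result, j = [], parent(index, i)
    while j is not None:
        result.append(j)
        j = parent(index, j)
    return result

def descendants(index, i):
    """Following entries until the level drops back to that of entry i."""
    level, result = index[i][1], []
    for j in range(i + 1, len(index)):
        if index[j][1] <= level:
            break
        result.append(j)
    return result

def children(index, i):
    """Descendants exactly one level below entry i."""
    return [j for j in descendants(index, i) if index[j][1] == index[i][1] + 1]

def older_siblings(index, i):
    """Preceding entries at the same level under the same parent."""
    level, result = index[i][1], []
    for j in range(i - 1, -1, -1):
        if index[j][1] < level:
            break
        if index[j][1] == level:
            result.append(j)
    return result[::-1]

def younger_siblings(index, i):
    """Following entries at the same level under the same parent."""
    level, result = index[i][1], []
    for j in range(i + 1, len(index)):
        if index[j][1] < level:
            break
        if index[j][1] == level:
            result.append(j)
    return result

def previous_element(index, i):
    return i - 1 if i > 0 else None

def next_element(index, i):
    return i + 1 if i + 1 < len(index) else None

print(children(index, 0), ancestors(index, 5), older_siblings(index, 4))
```

All of these lookups read only the level column, so they can be answered without reparsing the document text.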

9. The structured document management method according to claim 7, wherein, in creating the element-attribute index: when a document object is added to an object that is an instance of the document folder class, the DTD object related to the document object is also added at the same time, and an element-attribute index entry is created and added for each element of the added document object; the start tag of each element of the document is located, and the first character position of the start tag and the element name in the start tag are obtained; the element ID corresponding to the element name is obtained from the element-attribute ID table that is in a one-to-one relationship with the DTD object related to the document object; the attributes in the start tag are obtained; if there is no attribute, the insertion position in the element-attribute index is determined from (DTDID, element name ID), and an entry (DTDID, element-attribute name ID, null, document ID, element start position) is added at that position; and if the start tag contains an attribute, the attribute value of the attribute is obtained, the attribute ID corresponding to the attribute name is obtained from the element-attribute ID table, the insertion position in the element-attribute index is determined from (DTDID, attribute name ID), and an entry (DTDID, element-attribute name ID, null, document ID, element start position) is added at that position.
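
The sorted insertion described for the element-attribute index can be sketched in Python with the standard bisect module. This is an illustration under stated assumptions, not the claimed implementation: the class name ElementAttributeIndex and all ID values are hypothetical, and an attribute entry is assumed to carry the attribute value in the third slot of the five-tuple, following the five-tuple definition in claim 7.

```python
import bisect

class ElementAttributeIndex:
    """Five-tuples kept sorted by (DTDID, element-attribute name ID)."""

    def __init__(self):
        self._keys = []    # sort keys, parallel to self.entries
        self.entries = []  # (DTDID, name ID, attribute value, document ID, start)

    def add(self, dtd_id, name_id, attribute_value, document_id, start):
        key = (dtd_id, name_id)
        # Find the insertion position; entries with equal keys keep arrival order.
        position = bisect.bisect_right(self._keys, key)
        self._keys.insert(position, key)
        self.entries.insert(
            position, (dtd_id, name_id, attribute_value, document_id, start)
        )

    def find(self, dtd_id, name_id):
        """All entries for one element or attribute name within one DTD."""
        key = (dtd_id, name_id)
        low = bisect.bisect_left(self._keys, key)
        high = bisect.bisect_right(self._keys, key)
        return self.entries[low:high]

ea_index = ElementAttributeIndex()
ea_index.add(1, 5, "c1", 10, 24)   # attribute entry (assumed to store the value)
ea_index.add(1, 2, None, 10, 6)    # element without attributes: value slot is null
ea_index.add(1, 1, None, 10, 0)
ea_index.add(1, 2, None, 10, 41)
ea_index.add(1, 2, None, 11, 6)    # same element name in another document

for entry in ea_index.entries:
    print(entry)
```

Keeping a separate list of keys avoids comparing attribute values directly, which matters here because the value slot may hold either a string or a null. Because one index spans every document in the folder, a single range lookup with find returns matching elements from all of them.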

10. The structured document management method according to claim 7, wherein, in searching for documents and their elements, the search is performed over a plurality of structured documents having different logical structures, such as documents with different DTDs.

11. In a structured document management apparatus provided with a database whose schema has a document class for managing documents, a DTD class for managing the DTD (Document Type Definition) of each document, and a document folder class for managing sets of documents and DTDs, wherein, as a first index, the document class is provided with an element tree index representing the tree-structure relationship of the elements of the document; as a second index, the document folder class is provided with an element-attribute index, which holds element and attribute information for a plurality of documents and consists of five-tuples (DTDID, element-attribute name ID, attribute value, document ID, element start position); and, as a third index, the DTD class is provided with an element-attribute ID table for managing element names and attribute names as IDs, the table holding numerical element name IDs and attribute name IDs corresponding to the element names and attribute names of the elements in the document; a computer-readable recording medium on which is recorded a program for causing a computer constituting the structured document management apparatus to execute the following processes (a) to (d):
(a) in storing an input document in the database, an element-attribute ID table creation process in which an element name ID that is unique to each element name defined in the DTD is given as a numerical value, the correspondence between element names and element name IDs is created as a table, and the created table is stored in the database as an instance of the element-attribute ID table in a one-to-one relationship with the object that is an instance of the DTD class; and in which, for attribute names, numerical attribute name IDs are given that are unique within the DTD object, so that identical attribute names occurring under different element names can be distinguished, the correspondence between attribute names and attribute name IDs is created as a table, and the created table is stored in the database as an instance of the element-attribute ID table;
(b) an element tree index creation process in which the first character position of the start tag and the last character position of the end tag of each element of the document are obtained, the first character position of the start tag is taken as the element start position, the element name in the start tag is obtained, the element-attribute ID table is retrieved from the DTD object related to the document object of the document, the element name ID of the element is obtained, the element level representing the depth of the element from the root element when the elements are represented as a tree structure is then obtained, and an entry consisting of the triple (element name ID, element level, element start position) is added, whereby the element tree index is created;
(c) an element-attribute index creation process in which, when a document object is added to an object that is an instance of the document folder class, the DTD object related to the document object is also added at the same time, and an element-attribute index entry is created and added for each element of the added document object; the start tag of each element of the document is located, and the first character position of the start tag and the element name in the start tag are obtained; the element ID corresponding to the element name is obtained from the element-attribute ID table that is in a one-to-one relationship with the DTD object related to the document object; if there is no attribute in the start tag, the insertion position in the element-attribute index is determined from (DTDID, element name ID), and an entry (DTDID, element-attribute name ID, null, document ID, element start position) is added at that position; and if the start tag contains an attribute, the attribute value of the attribute is obtained, the attribute ID corresponding to the attribute name is obtained from the element-attribute ID table, the insertion position in the element-attribute index is determined from (DTDID, attribute name ID), and an entry (DTDID, element-attribute name ID, null, document ID, element start position) is added at that position; and
(d) a process in which, in response to an input search request, when a search based on the logical structure of the document is performed, the target structured documents are searched using the element tree index, the element-attribute ID table, and the element-attribute index; in which, using the element tree index, any one of a parent element, an ancestor element, a child element, a descendant element, an older sibling element, a younger sibling element, a previous element, and a next element, or a combination thereof, is extracted as an element having a relative relationship with a given element; and in which the search result is output to the output means.