JPH096803A - Document data base management device

Info

Publication number: JPH096803A
Application number: JP7155944A
Authority: JP
Inventors: Hisashi Nakatsuyama; 恒中津山
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 1995-06-22
Filing date: 1995-06-22
Publication date: 1997-01-10
Anticipated expiration: 2019-09-22
Also published as: JP3568062B2

Abstract

PURPOSE: To make it easy to judge the validity of a search expression by checking the conditions specified in the search expression, which use parent-child and ancestor-descendant relationships, against the document type. CONSTITUTION: A correspondence table holding unit 9 holds a correspondence table whose entries are triples consisting of an element (starting element), the set of elements that can appear as children of the starting element (adjacent element set), and the set of elements reachable from the starting element (reachable element set). A correspondence table generation unit 10 generates the correspondence table for a specified document type by referring to the structure of that document type. Given two elements, a source and a destination, a reachability determination unit 11 scans the correspondence table held in the correspondence table holding unit 9 and checks whether the destination is included in the adjacent element set or the reachable element set of the entry whose starting element is the source. When a structured document is searched, the conditions specified in the search expression, which use parent-child and ancestor-descendant relationships, are checked against the document type to judge whether the search expression is valid.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a document database management device that manages electronic documents.

[0002]

[Prior Art] Electronic documents created with a word processor or the like are represented as digital data, so edits such as additions, deletions, and changes can be made easily, which improves the efficiency of document creation. Furthermore, by accumulating a plurality of electronic documents in a large-capacity storage device to build a document database device, a desired document can be retrieved electronically, for example by keyword search.

[0003] In conventional document database management devices that manage electronic documents, the document data created with a word processor or the like was stored as is, and searches were performed using that data.

[0004] Meanwhile, electronic documents are being structured in order to make them easier to create and edit. The structure of a document is represented by, for example, the elements that make up the document, such as chapters, headings, and paragraphs, and by information about the relationships between those elements, for example, that a chapter has headings and paragraphs as its substructure.

[0005]

[Problems to be Solved by the Invention] The problems to be solved by the present invention will be described using as examples ODA (Office Document Architecture) (ISO 8613) and SGML (Standard Generalized Markup Language) (ISO 8879; JIS X 4151), which are international standards for document structure.

[0006] First, the terms used in this specification will be explained.

[0007] The term "document structure" refers to an information structure that represents a document. For example, the information structure defined by ODA is a document structure. An SGML subset (a restriction of its functions) that also defines the character codes to be used and the information structures used for figures, tables, and the like is likewise a document structure. For SGML, see, for example, Martin Bryan, "Introduction to SGML", ASCII Corporation, published March 31, 1991.

[0008] The term "document type" refers to a template for documents. A document type defines what logical structures the documents created from it can have, that is, the kinds of nodes that appear in the logical structure, the attributes each node can have, and the substructure each node can have. The generic logical structure of ODA and the DTD (Document Type Definition) of a document architecture obtained by subsetting SGML are document types.

[0009] Next, the problems that arise when searching structured documents as described above will be explained.

[0010] In a structured document, the content of the document is called the logical structure and is represented as a tree composed of document components such as chapters, sections, and figures. FIG. 10 shows an example of a logical structure.

[0011] A logical structure may not be created entirely freely; it is created according to the syntax rules called the document type described above. FIG. 11 shows an example of a document type. Rectangular nodes define the types of elements (element types). The label of a node is the name of the element type. Nodes with the same name denote the same element type. The element type named "section" in FIG. 11 is therefore defined recursively. Nodes drawn as ellipses define how elements are connected; such a node is called a constructor. A SEQ node indicates that instances of the nodes connected to it are generated in that order. A REP node indicates that instances of the node connected to it are generated one or more times. An OPT node indicates that an instance of the node connected to it may or may not appear. A CHO node indicates that an instance of exactly one of the nodes connected to it is generated. The logical structure of FIG. 10 satisfies the constraints of the document type of FIG. 11.
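
As an illustration only, a document type of this kind could be held as a directed graph in which element types and constructors are both nodes. The sketch below is written in Python and assumes a hypothetical schema (article, section, heading, paragraph) that is loosely inferred from the description; the actual content of FIG. 11 is not reproduced in the text, and the names `SCHEMA`, `ELEMENT_TYPES`, and `constructor_kind` are invented for this example.

```python
from typing import Dict, List, Optional

# Hypothetical document type (an assumption, not the actual FIG. 11):
#   article := REP(section)
#   section := SEQ(heading, REP(CHO(paragraph, section)))
ELEMENT_TYPES = {"article", "section", "heading", "paragraph"}

# Directed graph: each node maps to the nodes connected below it.
# Constructor nodes are named "<KIND>#<n>" so that each one is unique.
SCHEMA: Dict[str, List[str]] = {
    "article": ["REP#1"],
    "REP#1": ["section"],
    "section": ["SEQ#1"],
    "SEQ#1": ["heading", "REP#2"],
    "REP#2": ["CHO#1"],
    "CHO#1": ["paragraph", "section"],  # "section" recurs here
    "heading": [],
    "paragraph": [],
}


def is_element_type(node: str) -> bool:
    """Return True for rectangular (element type) nodes."""
    return node in ELEMENT_TYPES


def constructor_kind(node: str) -> Optional[str]:
    """Return SEQ, REP, OPT or CHO for constructor nodes, else None."""
    return None if is_element_type(node) else node.split("#")[0]
```

Because "section" appears again below the CHO constructor, the graph contains a cycle, which corresponds to the recursive definition mentioned above.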

[0012] A document database management device that manages structured documents provides a query language for describing searches. Some query languages are written as text, while others are expressed through a graphical user interface. FIG. 12 shows an example of a search expression described with a graphical user interface. The string inside a node indicates its element type. A string shown beside a node indicates that the text of that node contains the string. An arc drawn as a solid line indicates that the nodes at its two ends are in a parent-child relationship. An arc drawn as a broken line indicates that the nodes at its two ends are in an ancestor-descendant relationship. When several arcs leave a single node, only results that satisfy all of the conditions are returned; in other words, the conditions are specified as a conjunction. The search expression of FIG. 12 specifies a search for "a chapter whose heading contains the string 'document' and that has a paragraph containing the string 'database'."
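
A search expression of this graphical form can be thought of as a small tree. The following Python sketch shows one possible in-memory representation; the class name `QueryNode`, its fields, and the relation labels "child" and "descendant" are assumptions made for illustration and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class QueryNode:
    element_type: str                 # string shown inside the node
    contains: Optional[str] = None    # string shown beside the node
    # Each edge is (relation, node): "child" for a solid arc,
    # "descendant" for a broken arc. All edges must hold (conjunction).
    edges: List[Tuple[str, "QueryNode"]] = field(default_factory=list)


# A query shaped like the one described for FIG. 12.
fig12_like = QueryNode(
    "chapter",
    edges=[
        ("descendant", QueryNode("heading", contains="document")),
        ("descendant", QueryNode("paragraph", contains="database")),
    ],
)
```

Both arcs are modeled as "descendant" here because the text states that the FIG. 12 example uses the ancestor-descendant relationship.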

[0013] A search expression is specified using conditions on the elements of a document and conditions on the connection relationships between elements. In the example of FIG. 12, the former are conditions on element types, and the latter are conditions using the ancestor-descendant relationship.

[0014] FIG. 13 is a schematic diagram of a conventional database management device that performs searches as described above. A search expression created with the query editor 1 is converted by the search expression generation unit 2 into a form that can be executed by the search expression evaluation unit 3. The search expression is passed to the document type management unit 5, and documents having elements that satisfy the specified conditions are retrieved. Reference numeral 6 denotes a data dictionary and 7 a database. The search expression evaluation unit 3 also checks whether the search expression is grammatically correct; if it is not, the operator is notified and processing is aborted.

[0015] The conventional document database management device shown in FIG. 13 only checks whether a search expression is grammatically correct. Consequently, even a search expression for which no solution can exist is treated as a correct search expression. For example, the search expression shown in FIG. 14, "a paragraph that contains the string 'database' and has, connected below it, a heading that contains the string 'document'," is grammatically correct, but no document has elements that satisfy it. That is, with the structure of the example document type shown in FIG. 11, a heading can never appear below a paragraph.

[0016] Searching with such an invalid search expression returns nothing, because nothing satisfies the conditions. From the user's point of view, it is not easy to tell whether the search expression was both grammatically and semantically correct but nothing in the database satisfied the conditions, or whether the search expression was invalid in the first place; this placed a burden on the user when constructing search expressions. In addition, because the search is carried out even though no document with elements satisfying the conditions can exist, system computation time is wasted. Evaluation of the search expression shown in FIG. 14 typically scans all heading instances and all paragraph instances and then checks whether any of them satisfy the parent-child relationship, but since nothing can satisfy the conditions of FIG. 14 in the first place, there is no need to evaluate it.

[0017] An object of the present invention is therefore to make it easy to judge the validity of a search expression. Another object of the present invention is to prevent wasted computation time by not performing a search for an invalid search expression.

[0018]

[Means for Solving the Problems] To solve the above problems, the present invention provides a document database management device that manages structured documents, for which syntax rules define the structures a document can take, and that specifies a search target using conditions on the elements of a document and conditions on the connection relationships between elements, the device being characterized by comprising means for checking a given search expression against the syntax rules.

[0019] The present invention is further characterized by comprising means for generating, based on the structures a document can take, a correspondence table in which a starting element, the elements immediately below the starting element, and the elements that can exist below the starting element form one tuple, and means for scanning the correspondence table based on a search expression to verify whether the search expression is valid.

[0020]

[Operation] According to the present invention, when structured documents are searched, the conditions specified in the search expression, which use parent-child and ancestor-descendant relationships, are checked against the document type, and it is judged whether the search expression is valid.

[0021] In the present invention, the document type is examined before a search is executed, and a correspondence table is generated in which a starting element, the elements immediately below the starting element, and the elements that can exist below the starting element form one tuple. When a search expression is input, the correspondence table is scanned based on the search expression to determine whether the search expression satisfies the conditions of the correspondence table; if the conditions are not satisfied, the search expression is judged to be invalid.

[0022]

[Embodiment] FIG. 1 is a block diagram of the document database management device of the present invention. The query editor 1 is used to input search conditions, which are entered on a query editor screen such as that shown in FIG. 2. A search expression created with the query editor 1 is converted by the search expression generation unit 2 into a form that can be executed by the search expression evaluation unit 3. The search expression is passed to the search expression verification unit 4, which judges whether it is valid. The processing in the search expression verification unit 4 is described in detail later. If the search expression is valid, it is passed to the search expression evaluation unit 3, which retrieves documents having elements that satisfy the specified conditions. Reference numeral 5 denotes a document type management unit, 6 a data dictionary, and 7 a database.

[0023] The search expression verification unit 4 has a verification control unit 8, a correspondence table holding unit 9, a correspondence table generation unit 10, and a reachability determination unit 11.

[0024] The verification control unit 8 is the component that supervises the whole process; it calls the correspondence table holding unit 9, the correspondence table generation unit 10, and the reachability determination unit 11 as needed.

[0025] The correspondence table holding unit 9 holds a correspondence table whose entries are triples consisting of an element (starting element), the set of elements that can appear as children of the starting element (adjacent element set), and the set of elements reachable from the starting element (reachable element set). Here, an element B is reachable from an element A if element B can appear below (as a descendant of) an instance of element A. FIG. 3 shows the correspondence table generated from the document type of FIG. 11. For example, the entry for the element "article" shows that "section" is adjacent to it as a subordinate element and that the elements "section", "heading", and "paragraph" are all reachable from "article".
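
To make the shape of such a table concrete, the Python sketch below stores each entry under its starting element. Only the "article" row is taken from the description above; the other rows are assumptions for a hypothetical schema in which a section contains a heading followed by paragraphs or nested sections, and the names `Entry` and `TABLE` are invented for this example.

```python
from typing import Dict, FrozenSet, NamedTuple


class Entry(NamedTuple):
    adjacent: FrozenSet[str]    # element types that can appear as children
    reachable: FrozenSet[str]   # element types that can appear as descendants


TABLE: Dict[str, Entry] = {
    # Row stated in the description.
    "article": Entry(
        frozenset({"section"}),
        frozenset({"section", "heading", "paragraph"}),
    ),
    # Assumed rows (not given in the text).
    "section": Entry(
        frozenset({"heading", "paragraph", "section"}),
        frozenset({"heading", "paragraph", "section"}),
    ),
    "heading": Entry(frozenset(), frozenset()),
    "paragraph": Entry(frozenset(), frozenset()),
}
```

Keying the table by starting element turns the scan for an entry into a dictionary lookup; the patent text itself only speaks of scanning the table.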

[0026] The correspondence table generation unit 10 generates the above correspondence table for a specified document type by referring to the structure of that document type.

[0027] Given two elements, a source and a destination, the reachability determination unit 11 scans the correspondence table held in the correspondence table holding unit 9 (see FIG. 3) and checks whether the destination is included in the adjacent element set or the reachable element set of the entry whose starting element is the source. If the destination is in the adjacent element set, the destination can appear as a child of the source. If the destination is in the reachable element set, the destination can appear as a descendant of the source.
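
The check described here amounts to a set membership test. A minimal Python sketch follows, assuming the table is a dictionary from starting element to an (adjacent set, reachable set) pair; the function name `can_appear`, the relation labels, and every table row other than "article" are assumptions made for illustration.

```python
# Hypothetical correspondence table: start -> (adjacent set, reachable set).
TABLE = {
    "article": ({"section"}, {"section", "heading", "paragraph"}),
    "section": ({"heading", "paragraph", "section"},
                {"heading", "paragraph", "section"}),
    "heading": (set(), set()),
    "paragraph": (set(), set()),
}


def can_appear(table, source, destination, relation):
    """Return True if destination may appear under source.

    relation is "child" (check the adjacent set) or "descendant"
    (check the reachable set).
    """
    entry = table.get(source)
    if entry is None:
        return False  # assumption: an unknown source element is never valid
    adjacent, reachable = entry
    if relation == "child":
        return destination in adjacent
    if relation == "descendant":
        return destination in reachable
    raise ValueError(f"unknown relation: {relation!r}")


# A heading is a descendant, but not a direct child, of an article.
print(can_appear(TABLE, "article", "heading", "child"))
print(can_appear(TABLE, "article", "heading", "descendant"))
```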

[0028] FIGS. 4 to 9 show the flow of the processing executed in the search expression verification unit 4 to verify a given search expression. The present embodiment is described along this flow.

[0029] FIG. 4 shows the overall flow of search expression verification. The input to this processing is the search expression generated by the search expression generation unit 2. The verification control unit 8 calls the document type management unit 5 and obtains information on the schema (document type) that the input search expression targets (step 6-1). It then calls the correspondence table generation unit 10 to create the correspondence table for that schema (step 6-2).

[0030] FIG. 5 shows the flow of the correspondence table creation process (see step 6-2 in FIG. 4). This processing is performed by the correspondence table generation unit 10. The input is the schema information, which is represented as a directed graph such as that shown in FIG. 6. First, the root of the input schema is selected (step 7-2). Next, the set of element types reachable from the root is obtained (step 7-3). The variable S is set to this set with the root added (step 7-4). Note that the "return value" in step 7-4 refers to the result obtained by the immediately preceding process. It is then checked whether S contains any unprocessed node (step 7-5). If there is none, the process ends (step 7-10). If there is an unprocessed node, one node is selected (step 7-6). The set of element types adjacent to the selected node is obtained (step 7-7), and then the set of element types reachable from the selected node is obtained (step 7-8). The selected node (element type), the adjacent element type set obtained in step 7-7, and the reachable element type set obtained in step 7-8 are passed as a triple to the correspondence table holding unit 9, which registers the entry in the correspondence table (step 7-9). The process then returns to step 7-5.
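
The loop of FIG. 5 can be sketched in Python as follows. To keep the example short, the two sub-procedures (steps 7-7 and 7-8) are passed in as functions, and the usage below substitutes precomputed lookup tables for them; the names `build_table`, `ADJ`, and `REACH`, and the data itself, are assumptions for a hypothetical schema rather than anything specified in the patent.

```python
def build_table(root, adjacent_of, reachable_of):
    """Build {element type: (adjacent set, reachable set)} for a schema."""
    pending = set(reachable_of(root)) | {root}      # steps 7-3 and 7-4
    table = {}
    while pending:                                  # step 7-5
        node = pending.pop()                        # step 7-6
        adjacent = set(adjacent_of(node))           # step 7-7
        reachable = set(reachable_of(node))         # step 7-8
        table[node] = (adjacent, reachable)         # step 7-9
    return table                                    # step 7-10


# Stand-ins for the two sub-procedures, for a hypothetical schema.
ADJ = {
    "article": {"section"},
    "section": {"heading", "paragraph", "section"},
    "heading": set(),
    "paragraph": set(),
}
REACH = {
    "article": {"section", "heading", "paragraph"},
    "section": {"heading", "paragraph", "section"},
    "heading": set(),
    "paragraph": set(),
}

table = build_table("article", ADJ.__getitem__, REACH.__getitem__)
```

Every element type reachable from the root, plus the root itself, receives exactly one entry, so the table covers every starting element a query against this schema could name.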

[0031] FIG. 7 shows the flow of the process that obtains the reachable element type set (see step 7-3 in FIG. 5). This processing is also performed by the correspondence table generation unit 10. Its input is an element type, and its output is the set of element types reachable from that element type. The flow uses a variable S that holds a set of element types and a variable Q that holds a queue of element types. The initial value of S is the empty set (step 8-2). The initial value of Q is a queue consisting of all nodes adjacent to the input node (step 8-3). First, it is determined whether the length of Q is 0 (step 8-4). If it is 0, S contains the set of elements reachable from the input element type, so control is returned with S as the return value (step 8-10). If the length of Q is 1 or more, the head element of Q is taken out (step 8-11). If the extracted element is already in S, the process returns to step 8-4. If it is not in S, it is checked whether it is an element type (step 8-7). If it is an element type, it is added to S (step 8-8). All nodes adjacent to the extracted element are appended to the end of Q (step 8-9), and the process returns to step 8-4.
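
The flow of FIG. 7 is essentially a breadth-first traversal of the schema graph. The Python sketch below follows the steps as described, using a hypothetical schema (the actual FIG. 6 graph is not reproduced in the text); `SCHEMA`, `ELEMENT_TYPES`, and `reachable_types` are names invented for this example.

```python
from collections import deque

# Hypothetical schema graph: element types and constructors are both nodes.
ELEMENT_TYPES = {"article", "section", "heading", "paragraph"}
SCHEMA = {
    "article": ["REP#1"],
    "REP#1": ["section"],
    "section": ["SEQ#1"],
    "SEQ#1": ["heading", "REP#2"],
    "REP#2": ["CHO#1"],
    "CHO#1": ["paragraph", "section"],
    "heading": [],
    "paragraph": [],
}


def reachable_types(start, schema, element_types):
    """Return the element types that can appear below `start`."""
    s = set()                          # step 8-2
    q = deque(schema[start])           # step 8-3
    while q:                           # step 8-4
        node = q.popleft()             # step 8-11
        if node in s:                  # already recorded: skip
            continue
        if node in element_types:      # step 8-7
            s.add(node)                # step 8-8
        q.extend(schema[node])         # step 8-9
    return s                           # step 8-10


print(sorted(reachable_types("article", SCHEMA, ELEMENT_TYPES)))
```

As in the described flow, only element types are recorded in S, so the traversal terminates as long as every cycle in the graph passes through at least one element type. An implementation that cannot rely on this might additionally track visited constructor nodes.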

[0032] FIG. 8 shows the flow of the process that obtains the adjacent element type set (see step 7-7 in FIG. 5). This processing is also performed by the correspondence table generation unit 10. Its input is an element type, and its output is the set of element types adjacent to that element type. As in the process that obtains the reachable element type set, this flow uses a variable S that holds a set of element types and a variable Q that holds a queue of element types. The initial value of S is the empty set (step 9-2). The initial value of Q is a queue consisting of all nodes adjacent to the input node (step 9-3). First, it is determined whether the length of Q is 0 (step 9-4). If it is 0, S contains the set of elements reachable from the input element type, so control is returned with S as the return value (step 9-8). If the length of Q is 1 or more, the head element of Q is taken out (step 9-5). It is checked whether the extracted element is an element type (step 9-6). If it is an element type, it is added to S (step 9-7), and the process returns to step 9-4. If it is not an element type, all nodes adjacent to the extracted element are appended to the end of Q (step 9-9), and the process returns to step 9-4.
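
The flow of FIG. 8 differs from that of FIG. 7 in that the traversal stops at the first element type it meets on each path and only expands constructor nodes. A Python sketch under the same assumptions as before (hypothetical schema, invented names `SCHEMA`, `ELEMENT_TYPES`, `adjacent_types`) might look like this:

```python
from collections import deque

# Hypothetical schema graph: element types and constructors are both nodes.
ELEMENT_TYPES = {"article", "section", "heading", "paragraph"}
SCHEMA = {
    "article": ["REP#1"],
    "REP#1": ["section"],
    "section": ["SEQ#1"],
    "SEQ#1": ["heading", "REP#2"],
    "REP#2": ["CHO#1"],
    "CHO#1": ["paragraph", "section"],
    "heading": [],
    "paragraph": [],
}


def adjacent_types(start, schema, element_types):
    """Return the element types that can appear as children of `start`."""
    s = set()                          # step 9-2
    q = deque(schema[start])           # step 9-3
    while q:                           # step 9-4
        node = q.popleft()             # step 9-5
        if node in element_types:      # step 9-6
            s.add(node)                # step 9-7: record it, do not descend
        else:
            q.extend(schema[node])     # step 9-9: expand the constructor
    return s                           # step 9-8


print(sorted(adjacent_types("section", SCHEMA, ELEMENT_TYPES)))
```

Because element types are never expanded, the result contains only those element types that are separated from the input by constructor nodes alone, which is the sense of "adjacent" used in the correspondence table.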

[0033] FIG. 9 shows the flow of the verification process for a node of the search expression (see step 6-4 in FIG. 4). This processing is performed by the reachability determination unit 11. Its input is a node of the search expression, and its output is a truth value indicating whether that node is valid. First, the correspondence table held in the correspondence table holding unit 9 is scanned to find the entry whose starting element type is the input node (step 10-2). Next, it is checked whether there is an unprocessed adjacent node (step 10-3). If all have been processed, the return value is set to true and control is returned (step 10-12). If there is an unprocessed adjacent node, one node is selected (step 10-4). It is determined whether the selected node is specified as a child of the input node (step 10-5). If it is specified as a child, it is checked whether the selected node is included in the adjacent element type set of the entry (step 10-7). If it is not included, the return value is set to false and control is returned (step 10-7). If it is included, the selected node is verified (step 10-8). If in step 10-5 the selected node is not specified as a child, that is, if it is specified as a descendant, it is checked whether the selected node is included in the reachable element type set of the entry (step 10-11). If it is not included, the return value is set to false and control is returned (step 10-13). If it is included, the process goes to step 10-8. If the verification result in step 10-8 is false, the return value is set to false and control is returned (step 10-10). If it is true, the process returns to step 10-3.
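
The node verification of FIG. 9 is naturally recursive: each arc of the query is checked against the table entry of its upper node, and then the lower node is verified in the same way. The Python sketch below assumes a query node is a pair (element type, list of (relation, node) arcs) and reuses a hypothetical correspondence table; these representations, and the names `TABLE` and `verify_node`, are illustrative assumptions.

```python
# Hypothetical correspondence table: start -> (adjacent set, reachable set).
TABLE = {
    "article": ({"section"}, {"section", "heading", "paragraph"}),
    "section": ({"heading", "paragraph", "section"},
                {"heading", "paragraph", "section"}),
    "heading": (set(), set()),
    "paragraph": (set(), set()),
}


def verify_node(node, table):
    """Return True if the query subtree rooted at `node` can be satisfied.

    `node` is (element_type, [(relation, child_node), ...]) where
    relation is "child" or "descendant".
    """
    element_type, arcs = node
    entry = table.get(element_type)               # step 10-2
    if entry is None:
        return False  # assumption: an unknown element type is invalid
    adjacent, reachable = entry
    for relation, lower in arcs:                  # steps 10-3 and 10-4
        # Step 10-5: choose the set to test against.
        allowed = adjacent if relation == "child" else reachable
        if lower[0] not in allowed:
            return False                          # structurally impossible
        if not verify_node(lower, table):         # step 10-8
            return False
    return True                                   # all arcs processed


# Query shaped like FIG. 14: a heading directly below a paragraph.
fig14_like = ("paragraph", [("child", ("heading", []))])
# A satisfiable query: a section with a heading and a paragraph below it.
valid_query = ("section", [("descendant", ("heading", [])),
                           ("descendant", ("paragraph", []))])

print(verify_node(fig14_like, TABLE), verify_node(valid_query, TABLE))
```

With this table, the FIG. 14 style query is rejected without touching any stored document, which is the saving in computation time that the description aims for.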

[0034] In this embodiment, the correspondence table is constructed each time a search expression is verified; alternatively, the correspondence table may be constructed when the document type is registered in the database and simply scanned at verification time.

[0035]

[Effects of the Invention] As described above, according to the present invention, the conditions specified in the search expression, which use parent-child and ancestor-descendant relationships, are checked against the document type, and it is judged whether the search expression is valid.

[0036] This makes it easy to tell whether no search results were obtained because of a semantic error in the search expression or because no instance matched the conditions. In addition, the system no longer has to evaluate search expressions that cannot have any results, which prevents wasted computation time.

[Brief description of drawings]

FIG. 1 shows the configuration of an embodiment of the document database management device of the present invention.

FIG. 2 is an example of the graphical user interface of the query editor.

FIG. 3 is an example of a correspondence table. This is the correspondence table for the document type shown in FIG. 11.

FIG. 4 shows the flow of search expression verification.

FIG. 5 shows the flow of the correspondence table creation process.

FIG. 6 shows the document type of FIG. 11 represented as a directed graph.

FIG. 7 shows the flow of the process that obtains the set of element types reachable from a given element type.

FIG. 8 shows the flow of the process that obtains the set of element types adjacent to a given element type.

FIG. 9 shows the flow of the verification process for a node of a search expression.

FIG. 10 is an example of a document instance.

FIG. 11 is an example of a document type. This is the document type of the document instance in FIG. 10.

FIG. 12 is an example of specifying a search target.

FIG. 13 shows the configuration of a conventional document database management device.

FIG. 14 is an example of an invalid search expression. The document type used in this search expression is that of FIG. 11.

[Explanation of symbols]

1: query editor, 2: search expression generation unit, 3: search expression evaluation unit, 4: search expression verification unit, 5: document type management unit, 6: data dictionary, 7: database, 8: verification control unit, 9: correspondence table holding unit, 10: correspondence table generation unit, 11: reachability determination unit

Claims

[Claims]

1. A document database management device that manages structured documents, for which syntax rules define the structures a document can take, and that specifies a search target using conditions on the elements of a document and conditions on the connection relationships between elements, the device being characterized by comprising means for checking a given search expression against the syntax rules.

2. The document database management device according to claim 1, comprising: means for generating, based on the structures a document can take, a correspondence table in which a starting element, the elements immediately below the starting element, and the elements that can exist below the starting element form one tuple; and means for scanning the correspondence table based on a search expression to verify whether the search expression is valid.