JP2000200206A

JP2000200206A - Device and method for data management and recording medium

Info

Publication number: JP2000200206A
Application number: JP10376930A
Authority: JP
Inventors: Akihiko Matsuoka; 彰彦松岡
Original assignee: Railway Technical Research Institute
Current assignee: Railway Technical Research Institute
Priority date: 1998-12-28
Filing date: 1998-12-28
Publication date: 2000-07-18

Abstract

PROBLEM TO BE SOLVED: To make the design, construction, and alteration of a database easy to perform. SOLUTION: The data structure of the database is defined only by items and their contents. The same management number, in this case record number 1, is assigned to the number '00', the name 'Yamada', the age '30', and the height '165'. Likewise, the same record number, record number 2, is assigned to the next number '01', name 'Satoh', age '25', and height '160'. Items are searched with retrieval key 1 and contents are searched with retrieval key 2, each search yielding a set of management numbers. ANDing the two sets gives the management numbers of the data in which the content of the item specified by retrieval key 1 matches retrieval key 2.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION 1. Field of the Invention The present invention relates to a data management device, a data management method, and a recording medium, and more particularly to a data management device, a data management method, and a recording medium that can efficiently create a database covering unspecified data items by eliminating constraints on the data structure.

[0002]

2. Description of the Related Art A conventional database management system (DBMS) has, as basic functions for letting users and application programs access data and schemas, facilities for executing a database definition language, a database manipulation language, a query language, and the like.

[0003] The database definition language defines the structure of the data to be stored. In a relational database, for example, a data structure with items (fields) such as number, name, sex, age, and height, as shown in FIG. 7, is defined as follows.

[0004]
    create table person (
        number integer,    {number: integer type}
        name   char(10),   {name: character type (10)}
        sex    char(2),    {sex: character type (2)}
        age    integer,    {age: integer type}
        height float       {height: decimal type}
    );
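To make the fixed-schema approach of paragraph [0004] concrete, the following Python sketch runs the same definition against an in-memory SQLite database. SQLite and the sample row are assumptions made for illustration; the patent does not name a particular relational product.

```python
import sqlite3

# Assumption: SQLite stands in for the unspecified relational DBMS.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE person (
        number INTEGER,   -- number: integer type
        name   CHAR(10),  -- name: character type (10)
        sex    CHAR(2),   -- sex: character type (2)
        age    INTEGER,   -- age: integer type
        height FLOAT      -- height: decimal type
    )
    """
)

# Hypothetical sample row; every value has to fit the predeclared columns.
conn.execute(
    "INSERT INTO person VALUES (?, ?, ?, ?, ?)",
    (0, "Yamada", "M", 30, 165.0),
)

# Column names as the database now knows them (second field of table_info).
columns = [row[1] for row in conn.execute("PRAGMA table_info(person)")]
```

The point of the sketch is only that the set of columns is fixed by the definition before any data can be stored.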

[0005] The database manipulation language is used to register, delete, and add data, and to join or split databases, based on the data structure defined as above.

[0006] The query language is an interface language between the database and the user, and is used when the user retrieves desired information from the database.

[0007] Thus, in a conventional database management system, the framework (schema) of the data to be stored had to be fixed in advance through database design.

[0008]

[Problems to be solved by the invention] As described above, most existing database management systems register, delete, and retrieve information based on a data structure defined in a database definition language. When the data structure changes, application programs developed against the old data structure must also be changed, which is troublesome.

[0009] Until now, when a data structure was defined, constraints on the storage capacity and processing power of the database's storage device made it necessary to leave as little unused storage space as possible and to declare the types of the data explicitly. Today, however, large-capacity storage devices are available at low cost and the processing power of computers has improved dramatically, so that physical constraints such as storage capacity and processing power have almost no effect on database specifications.

[0010] The present invention has been made in view of these circumstances. It removes constraints on the data structure so that a database can be designed and built efficiently without complicated database design.

[0011]

[Means for solving the problems] The data management device according to claim 1 comprises: input means for inputting second information composed of one or more pieces of first information, each consisting of at least data of a first item and data of a second item, the data of the first item and the data of the second item being identifiable by tags; first storage means for storing the second information input by the input means; second storage means for storing third information obtained by removing the tags from the second information; search means for searching the third information stored in the second storage means based on at least a first search key for the first item and a second search key for the second item; and output means for outputting second information whose components are the first information found by the search means.

Conversion means for converting document data into the tagged second information may additionally be provided. The data of the first item may represent the type of the first information, and the data of the second item may represent the content of the first information.

The data management method according to claim 4 comprises: inputting second information composed of one or more pieces of first information, each consisting of at least data of a first item and data of a second item, the data of the first item and the data of the second item being identifiable by tags; storing the input second information; storing third information obtained by removing the tags from the second information; searching the stored third information based on at least a first search key for the first item and a second search key for the second item; and outputting second information whose components are the first information found by the search.

The recording medium according to claim 5 has recorded on it a program capable of executing the data management method according to claim 4.

In the data management device, the data management method, and the recording medium according to the present invention, second information composed of one or more pieces of first information, each consisting of at least data of a first item and data of a second item identifiable by tags, is input; the input second information is stored; third information obtained by removing the tags from the second information is stored; the stored third information is searched based on at least a first search key for the first item and a second search key for the second item; and second information whose components are the first information found by the search is output.

[0012]

[Embodiments of the invention] FIG. 1 is a block diagram showing a configuration example of one embodiment of the database management device of the present invention. As shown in the figure, this embodiment consists of a document management server (WS (workstation)) 1 for managing the databases and a document management client (PC (personal computer)) 10 made up of the applications used to access the server.

[0013] The document management server 1 has a document management database 2, a chart management database 3, and a full-text database 4. The document management database 2 and the chart management database 3 perform file management and use a relational database to manage index information and the like.

[0014] The document management database 2 manages documents in two formats: original documents (for example, various word-processor documents) and structured documents (for example, tagged documents conforming to SGML (Standard Generalized Markup Language)). The latter also embed information such as links to chart data files stored in the chart management database 3.

[0015] The full-text database 4, on the other hand, registers and manages all the text information (excluding tags) of the structured documents so that the full text of a document can be searched by free word.

[0016] The document management client 10, which runs on a PC, consists of a registration application 11 that registers data in each database on the document management server 1, an editing application 12 that creates documents using the results retrieved by the search application 13 described below, and a search application 13 that searches the data in the document management database 2, the chart management database 3, and the full-text database 4.

[0017] The registration and search applications 11 and 13 each provide a client for every database they access, so that an application can be built for each individual database.

[0018] Here, an original document file stored in the document management database 2 is a document file that can be reused (edited and printed) with the corresponding application. A word-processor document or similar that is accompanied by chart data files can also be bundled with them into a single file and treated as one original document. The document management database 2 registers index information such as the title and author name and the storage information of the original document file, together with a management number that the system assigns automatically. The original document files themselves, however, are stored in a distributed manner on the WS designated for each user.

[0019] A chart data file is an image data file such as a figure or photograph, or a spreadsheet data file such as Lotus (Lotus is a registered trademark of Lotus Development Corporation) or Excel (Excel is a trademark of Microsoft Corporation in the United States and other countries), registered in the database as individual data independent of the original document file. When chart data is registered, index information must be attached to it, just as when document data is registered.

[0020] A structured document is created from an original document, or from a printed document read by OCR or the like, by extracting the text portion and then applying editing such as adding tag information by means of a text structure analysis program. Once index information has been attached, the structured document is registered in the document management database 2, either alone or together with the original document. The text data of a registered structured document is automatically registered in the full-text database 4.

[0021] Two management numbers are assigned to a registered document: a common document number, which groups versions of the same document in different formats (original document, structured document, full-text database document), and an individual document number, which is unique to each document. In the document management database 2, the common document number is retrieved using the index information as a key. The original document and the structured document can then be downloaded using the common document number as a key.

[0022] The chart management database 3 retrieves a graphic data number using index information as a key, and the chart data file can be downloaded on the basis of that number. Chart data can likewise be retrieved by using the chart-data link information in a structured document found in the document management database 2.

[0023] A registered document stored in the full-text database 4 is searched by locating the character string entered by the user within the registered documents and sending the common document number, or the text of the surrounding context (or paragraph), to the search client.

[0024] FIG. 2 shows an example of the data structure of the data handled by the database management device of the present invention. As shown in the figure, data always takes the form of an "item" and "its content". In this example, data 1 has the content "00" for the item "number", "Yamada" for the item "name", "30" for the item "age", and "165" for the item "height". Data 2 has the content "01" for the item "number", "Sato" for the item "name", "25" for the item "age", and "160" for the item "height".
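As a small illustration of the item/content form described for FIG. 2, data 1 and data 2 can be written down as ordered pairs. The list-of-tuples representation and the helper name `content_of` are assumptions made for this sketch, not something the patent prescribes.

```python
# Data 1 and data 2 from FIG. 2 as ordered (item, content) pairs.
# The list-of-tuples representation is an assumption for illustration.
data_1 = [("number", "00"), ("name", "Yamada"), ("age", "30"), ("height", "165")]
data_2 = [("number", "01"), ("name", "Sato"), ("age", "25"), ("height", "160")]


def content_of(data, item):
    """Return the content stored under `item`, or None if the item is absent."""
    for name, content in data:
        if name == item:
            return content
    return None
```

Nothing outside the data itself declares which items exist, so asking for an item that a record lacks simply returns `None`.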

[0025] This data structure is the same as in a conventional RDB (relational database). For example, the content of the "name" item of data 1 is "Yamada". Data 1 in the RDB case (number 00, name Yamada, sex male, age 30, height 165) corresponds to the items "number" through "height" of this embodiment.

[0026] Thus, in this embodiment, the data structure is defined only by items and contents, in the following form: (field 1 item character string), (field 2 item character string/binary), ..., (field N item character string).

[0027] Because data consists only of items and their contents, data can be created simply by writing tagged text information (a structured document) as follows, without having to be aware of the database definition.

[0028]

[0029] Here, the character string "<data>" is a tag indicating the start of data. The character string "<item>" is a tag indicating that the string that follows represents the type of item. The character string "<content>" is a tag indicating that the string that follows represents the content of the immediately preceding item. The character string "</item>" is a tag marking the end of the string representing the item type, "</content>" is a tag marking the end of the string representing the item content, and "</data>" is a tag indicating the end of data.
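The listing announced in paragraph [0027] is absent from paragraph [0028] in this copy, so the exact layout of the tagged text is not known. The following Python sketch shows one plausible way the six tags could be written and read back; the function names, the one-line layout, and the lack of escaping are all assumptions.

```python
import re


def to_tagged_text(pairs):
    """Wrap (item, content) pairs in <data>, <item>, and <content> tags."""
    body = "".join(
        f"<item>{item}</item><content>{content}</content>"
        for item, content in pairs
    )
    return f"<data>{body}</data>"


# An <item> element immediately followed by its <content> element.
PAIR = re.compile(r"<item>(.*?)</item>\s*<content>(.*?)</content>", re.DOTALL)


def parse_tagged_text(text):
    """Return the (item, content) pairs found in the tagged text."""
    return PAIR.findall(text)


tagged = to_tagged_text([("number", "00"), ("name", "Yamada")])
pairs = parse_tagged_text(tagged)
```

A regular expression is enough here only because the sketch assumes tags never appear inside item names or contents; a real SGML document would need a proper parser.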

[0030] As shown in FIG. 3, the tagged text information (structured document) is split into items and their contents by a structured document interpretation program, and is then stored separately as management information (identification information and the like) and in the full-text DB (database). That is, the items and their contents are stored in the full-text database 4, and management information such as the record number corresponding to each item is stored in a storage unit (not shown).
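The split described for FIG. 3 can be sketched as follows. The in-memory list and dictionary stand in for the full-text database and the management-information store, and the names `register`, `full_text_db`, and `management_info` are hypothetical.

```python
import re
from itertools import count

# An <item> element immediately followed by its <content> element.
PAIR = re.compile(r"<item>(.*?)</item>\s*<content>(.*?)</content>", re.DOTALL)
_next_record_number = count(1)

full_text_db = []      # rows of (record_number, item, content), tags removed
management_info = {}   # record_number -> identification information


def register(tagged_text, source_name):
    """Interpret one tagged document and store it in both places."""
    record_number = next(_next_record_number)
    for item, content in PAIR.findall(tagged_text):
        full_text_db.append((record_number, item, content))
    management_info[record_number] = {"source": source_name}
    return record_number


first = register("<data><item>age</item><content>30</content></data>", "a.sgm")
```

Every pair from one document shares the same record number, which is what later lets item hits and content hits be combined.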

[0031] To search data stored in this way, the user first specifies "○○" as the search key for the item and "××" as the search key for the content. The search client of the search application 13 uses each of the specified search keys to search the full-text database 4 for matching data. Matching can be exact or partial. The management information corresponding to the data found, in this example a set of record numbers, is then obtained.
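The difference between exact and partial matching can be seen in a short sketch. The rows and the function `record_numbers` are hypothetical, and a linear scan stands in for whatever the full-text engine actually does.

```python
# Hypothetical rows of (record_number, item, content).
rows = [(1, "age", "30"), (2, "age", "25"), (3, "storage", "300")]


def record_numbers(rows, field, key, partial=False):
    """Return the record numbers whose item (field=1) or content (field=2) matches key."""
    if partial:
        return {row[0] for row in rows if key in row[field]}
    return {row[0] for row in rows if row[field] == key}


exact_items = record_numbers(rows, 1, "age")
partial_items = record_numbers(rows, 1, "age", partial=True)
```

With partial matching the item "storage" is also returned for the key "age", which is why the choice of matching mode matters for the size of the result sets.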

[0032] For example, if the item search key "○○" is "age" and the content search key is "30", the search yields the set of record numbers of the data whose item is "age", {1, 25, 44, 75, 89, 111, ...}, and the set of record numbers of the data whose content is "30", {1, 33, 44, 57, 76, 89, 93, ...}.

[0033] Applying a logical AND to the record numbers retrieved by the two search keys gives the data whose item is "age" and whose content is "30", namely the set of record numbers {1, 44, 89, ...}. In this way, data in which the age is 30 can be retrieved.
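The AND step is a set intersection. The sketch below uses only the record numbers printed in the text; the elided tails of both sets are left out rather than guessed.

```python
# Only the record numbers printed in the text; the elided tail ("...") is not guessed.
item_is_age = {1, 25, 44, 75, 89, 111}
content_is_30 = {1, 33, 44, 57, 76, 89, 93}

# The logical AND of the two result sets is a set intersection.
age_30 = sorted(item_is_age & content_is_30)
print(age_30)  # [1, 44, 89]
```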

[0034] Because this embodiment searches the full-text database 4, high-speed processing is possible. If the same processing were performed on a relational database holding data with the two fields item and content, searching a large amount of data would take considerable time. Full-text database systems generally provide a preprocessing step called learning (converting the document data into a proprietary data structure and storing it) and a mechanism for high-speed search by an application called a search engine, and this embodiment makes use of them. A detailed description of the learning method and the search engine is omitted here.
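The text does not describe the learning step or the search engine. Purely as an illustration of why preprocessing can make lookups fast, the sketch below swaps in a simple inverted index, a common full-text technique that may or may not resemble what the embodiment uses. All names and rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical (record_number, item, content) rows.
rows = [
    (1, "age", "30"), (1, "name", "Yamada"),
    (2, "age", "25"), (2, "name", "Sato"),
]

# Stand-in for "learning": build key -> record-number sets once, before any query.
by_item = defaultdict(set)
by_content = defaultdict(set)
for record_number, item, content in rows:
    by_item[item].add(record_number)
    by_content[content].add(record_number)


def search(item_key, content_key):
    """Exact-match lookup in both indexes, then AND the two result sets."""
    return by_item.get(item_key, set()) & by_content.get(content_key, set())
```

Each query then costs two dictionary lookups and one intersection instead of a scan over every stored row.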

[0035] Next, this embodiment is described in more detail using an employee information management database as an example. Suppose that at a certain company a personal information card such as the one shown in FIG. 4 is created with word-processing software such as Microsoft Word (Microsoft is a registered trademark of Microsoft Corporation in the United States and other countries), and that an employee information management database is built from it. The card is filled in by every employee and updated every April.

[0036] First, a structured document (here, an SGML document) is created in order to register the data. Tag information is added to the Word document file (a document file created with Microsoft Word) submitted by each employee, for example a file named 7198235.DOC after the employee number, to create a file named "7198235.SGM".

[0037] The management information depends on the intended application. In the above example, the information to be attached at registration can be the registration date and the employee number or name of the recipient (administrator). Once information for managing the database is included, however, the following management information is needed.

[0038]
- Management number (assigned automatically by the system at registration; corresponds to the record number)
- Registration date
- Recipient (administrator's employee number) or name
- Original file name (for example, 7198235.DOC)
- SGML file name (for example, 7198235.SGM)
- Access rights for each file
- Data deletion flag

[0039] The system management information can be as follows.

[0040]
- Original file storage directory (including the name of the machine (WS, PC, etc.))
- SGML file storage directory (including the name of the machine, etc.)

[0041] Next, the structure of the management information is described. The management information consists of system management information and database management information.

[0042] The system management information indicates where files are stored. It can be managed in any form; one approach, for example, is to create a management file (SYSTEM.INI) and manage the locations as global variables, as follows.

[0043]
    $ORG_FILE=WS1:/home/ORG
    $SGM_FILE=WS2:/home/SGML
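A server program could read the two SYSTEM.INI lines with a few lines of Python. Reading each value as a machine name, a colon, and a directory is an assumption based on the statement that the storage directory includes the machine name; the function name is hypothetical.

```python
SYSTEM_INI = """\
$ORG_FILE=WS1:/home/ORG
$SGM_FILE=WS2:/home/SGML
"""


def parse_system_ini(text):
    """Return {variable: (machine, directory)} for each $NAME=MACHINE:PATH line."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("$"):
            continue  # skip blank lines and anything that is not a variable
        name, _, value = line[1:].partition("=")
        machine, _, directory = value.partition(":")
        settings[name] = (machine, directory)
    return settings


settings = parse_system_ini(SYSTEM_INI)
```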

[0044] The database management information is managed in a table format as shown in FIG. 5, as in an ordinary database. That is, the management information is managed using a table with columns such as management number, registration date, administrator, original file name, SGML file name, access rights, and deletion status. A conventional database may be used to manage this information, or it may be read from a management file into the server program of the document management server 1 and managed by the server program.
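If the server program keeps the table of FIG. 5 in memory, one row might look like the following sketch. The field names are translations of the column names listed above; the class name, the registration date, the administrator value, and the access-rights value are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ManagementRecord:
    management_number: int   # assigned by the system at registration
    registered_on: date
    administrator: str       # employee number or name of the recipient
    original_file: str
    sgml_file: str
    access_rights: str
    deleted: bool = False    # data deletion flag


table = {}


def register(record):
    """Store a record under its management number."""
    table[record.management_number] = record


# Hypothetical values apart from the two file names taken from the text.
register(ManagementRecord(1, date(1998, 4, 1), "admin-01", "7198235.DOC", "7198235.SGM", "read"))
```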

[0045] To create the full-text database 4, index information such as that shown in FIG. 6 is built internally from the management numbers and the SGML files. That is, index information consisting of items (staff number, year and month of entry, name, phonetic reading, department, and so on) and their contents is created in association with each management number, and this serves as the full-text database 4.

[0046] As FIGS. 5 and 6 show, the management information and the full-text database 4 are linked by the management number; in this example, the management number is the link information. The full-text database 4 can therefore be searched by item and by content, and the target data can be found from the resulting sets of management numbers by AND processing or the like. Because the management information gives the original file name and other details for each retrieved management number, the original file can also be displayed.

[0047] As described above, because the embodiment uses a full-text database for searching, database design is made easier. In a conventional database management system (DBMS), a change to the database requires programs to be modified. In this embodiment, the data structure itself never changes (it always consists of items and their contents), so new items can be freely added to the database and existing items can be freely deleted.
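The claim that items can be added or deleted freely follows from every row having the same shape. The sketch below illustrates this with hypothetical rows and helper names; the item "blood type" is invented for the example.

```python
# Hypothetical index rows of (management_number, item, content).
index_rows = [(1, "name", "Yamada"), (1, "age", "30")]


def add_item(rows, management_number, item, content):
    """Add a brand-new item; no table definition has to be altered first."""
    rows.append((management_number, item, content))


def delete_item(rows, item):
    """Drop every occurrence of an item and return the remaining rows."""
    return [row for row in rows if row[1] != item]


add_item(index_rows, 1, "blood type", "A")   # an item no earlier record had
remaining = delete_item(index_rows, "age")
```

Code that searches by item and content keeps working unchanged, because it never depended on a fixed list of columns.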

[0048] The tags used in the above embodiment, such as <data>, </data>, <item>, </item>, <content>, and </content>, are examples, and the invention is not limited to them.

[0049] In the above embodiment the document management server is a workstation (WS), but it is not limited to this and may be a personal computer (PC).

[0050] In the above embodiment the document management database stores structured documents written in SGML, but SGML is only one example; other structured documents such as HTML (HyperText Markup Language) may be used.

[0051] The above embodiment describes the application of the present invention to a client/server system, but the invention is not limited to this and can also be applied to a stand-alone system.

[0052] Furthermore, the server program and the client application software of the above embodiment can be provided recorded on a CD-ROM (compact disc read-only memory), DVD (digital video disc), FD (floppy disc), or other recording medium.

[0053]

[Effects of the invention] As described above, according to the data management device, the data management method, and the recording medium of the present invention, second information composed of one or more pieces of first information, each consisting of at least data of a first item and data of a second item identifiable by tags, is input; the input second information is stored; third information obtained by removing the tags from the second information is stored; the stored third information is searched based on at least a first search key for the first item and a second search key for the second item; and second information whose components are the first information found by the search is output. Constraints on the data structure are thereby removed, and a database can be designed and built efficiently without complicated database design. Items can therefore be added to or deleted from the database easily. Moreover, because the data structure is unconstrained, a database covering unspecified data items can be created efficiently.

[Brief description of the drawings]

FIG. 1 is a block diagram showing a configuration example of one embodiment of the data management device of the present invention.

FIG. 2 is a diagram showing an example of the data structure used in the data management device of the present invention.

FIG. 3 is a diagram showing the procedure for storing text information separately as management information and in the full-text database.

FIG. 4 is a diagram showing an example of a personal information card.

FIG. 5 is a diagram showing a table for managing the management information of the database.

FIG. 6 is a diagram showing an example of the full-text database.

FIG. 7 is a diagram showing an example of the data structure of a conventional relational database.

[Explanation of symbols]

1 Document management server
2 Document management database
3 Chart management database
4 Full-text database
11 Registration application
12 Editing application
13 Search application

Claims

[Claims]

1. A data management device comprising: input means for inputting second information composed of one or more pieces of first information, each consisting of at least data of a first item and data of a second item, the data of the first item and the data of the second item being identifiable by tags; first storage means for storing the second information input by the input means; second storage means for storing third information obtained by removing the tags from the second information; search means for searching the third information stored in the second storage means based on at least a first search key for the first item and a second search key for the second item; and output means for outputting the second information having, as a component, the first information found by the search means.

2. The data management apparatus according to claim 1, further comprising a conversion unit configured to convert document data into the tagged second information.

3. The data management device according to claim 1, wherein the data of the first item indicates the type of the first information and the data of the second item indicates the content of the first information.

4. A data management method comprising: inputting second information composed of one or more pieces of first information, each consisting of at least data of a first item and data of a second item, the data of the first item and the data of the second item being identifiable by tags; storing the input second information; storing third information obtained by removing the tags from the second information; searching the stored third information based on at least a first search key for the first item and a second search key for the second item; and outputting the second information having, as a component, the first information found by the search.

5. A recording medium on which a program capable of executing the data management method according to claim 4 is recorded.