JP3202341B2 - Database system

Info

Publication number: JP3202341B2
Application number: JP21383692A
Authority: JP
Inventors: 睦治垣原; 陽子宮尾
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1992-08-11
Filing date: 1992-08-11
Publication date: 2001-08-27
Anticipated expiration: 2016-08-27
Also published as: JPH0659950A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a database system, and more particularly to a database system that collates fixed-length records directly by a full search of text-format data.

[0002]

[Prior Art] The database systems in general use today fall broadly into two classes: databases that use the relational data model (relational databases) and databases that use the network data model (network databases). A relational database has a conceptual model in which the items of each record are expressed in tabular form, and operations such as table lookups are performed by relational algebra operations based on set operations. A network database has a conceptual model expressed as a graph, in which data are represented by nodes and relationships such as parent-child relationships are represented by directed edges. Both aim to achieve high-speed retrieval of huge volumes of data and unified data management by relating the stored data records, and the data items within them, to one another.

[0003]

[Problems to Be Solved by the Invention] Conventional databases achieve unified data management and the like by relating the data to one another, but they also present several problems in use.

[0004] First, a database is difficult to build, operate, and manage unless the relationships among the records and their internal item data are thoroughly understood. Network databases in particular require knowledge of the physical data structure and of the access paths to the data, which makes them difficult to use for anyone without database expertise. In addition, because the database itself holds the relationships among the data, it needs a large amount of information beyond the data themselves, so the database tends to grow large relative to the actual volume of data.

[0005] As for data retrieval, one conceivable method is to scan all of the stored data from beginning to end and extract the matching records, but with conventional devices the search speed is too slow for this to be practical.

[0006]

[Means for Solving the Problems] The present invention is a database system comprising: a data storage unit that stores data in a record format consisting of a plurality of items; a keyword input unit for inputting a keyword to be searched for in a specific item among the plurality of items; a keyword conversion unit that converts the input keyword into a key having the same format as the record format of the data in the data storage unit, in which the input keyword is applied to the specific item and wildcards are applied to the other items; a collation unit that collates the key converted and generated by the keyword conversion unit against the data in the data storage unit; a search result storage unit that stores information on the records found, as a result of the collation by the collation unit, to match the search condition; and a search result display unit that presents the search results held in the search result storage unit to the user. The collation unit is a character string collation processor and can search the data in the data storage unit directly with the key.

[0007]

[Embodiments] An embodiment of the present invention will be described with reference to the drawings.

[0008] FIG. 1 shows an embodiment of the database system according to the present invention.

[0009] The data storage unit 11 stores the data registered as the database.

[0010] The keyword input unit 12 is a device with which a user of the database sets the keywords and search conditions to be searched for.

[0011] The keyword conversion unit 13 generates, from the keywords and search conditions input through the keyword input unit 12, data in a format corresponding to the data format of the data storage unit 11.

[0012] The collation unit 14 is a character string collation processor and collates the converted keywords against the data in the data storage unit 11. This processor can collate not just one character string but a plurality of character strings simultaneously. It also yields the end address or the start address of each character string that matched.
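The behaviour described for the collation processor (several strings collated at once, with the start or end address of each match reported) can be imitated in software. The following is a minimal Python sketch under stated assumptions, not the dedicated hardware of the embodiment: the function name `collate`, the layout of the result tuples, and the sample text are all hypothetical.

```python
def collate(text, patterns):
    """Scan `text` once and report every occurrence of every pattern.

    Returns a list of (pattern_index, start_address, end_address) tuples.
    Addresses are 0-based character offsets; the end address is inclusive.
    """
    hits = []
    for start in range(len(text)):
        # All patterns are tested at each position before moving on,
        # so the text itself is traversed only once.
        for index, pattern in enumerate(patterns):
            if pattern and text.startswith(pattern, start):
                hits.append((index, start, start + len(pattern) - 1))
    return hits


# Made-up sample text with two search strings collated in one pass.
hits = collate("Chiba, Tokyo, Chiba", ["Chiba", "Tokyo"])
```

Either address identifies the match, since the other follows from the known length of the pattern.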

[0013] The search result storage unit 15 stores, as the collation result, information on the data records that match the search conditions.

[0014] The search result display unit 16 presents the search result information to the user.

[0015] Next, the operation will be described with reference to the flowchart of FIG. 2. First, the user of the database enters the keywords to be searched for through the keyword input unit 12. At this point the user can specify a plurality of keywords and can specify search conditions (... AND ..., ... OR ...) (step 21). The keyword conversion unit 13 generates, from the keywords and search conditions input through the keyword input unit 12, the data that serve as keys for collation in the collation unit 14 (step 22). Because the operation of the keyword conversion unit 13 and the data format of the data storage unit 11 are closely related, both are described in detail later with reference to FIGS. 3 and 4. The data generated by the keyword conversion unit 13 are sent to the collation unit 14 as search keys, and the collation unit 14 collates them against the data in the data storage unit 11 (step 23). Information on the result data found by the collation as data to be extracted is stored in the search result storage unit 15 (step 24), and the search results can then be displayed on the search result display unit 16 (step 25).

[0016] Next, the search method of this database system will be described. Because this database system uses a character string collation processor as the collation unit 14, data are retrieved by collating character strings and extracting those that match. The system is therefore characterized by the data format of the database and by the processing that converts keywords into the keyword data used as search keys during collation.

[0017] As an example, consider applying this database to something like a roster whose items are "name", "birthplace", "sex", and "date of birth". First, the representation of the database data stored in the data storage unit 11 will be described with reference to FIG. 3. A record consisting of the items in this example can be expressed conceptually in the form shown in FIG. 3(A), but in this database the data are expressed in the text format shown in FIG. 3(B). This is a fixed-length record in which a "start code" indicating the start of the record is added at the start position of each record and an "end code" indicating the end of the record is added at the end position.
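The fixed-length text record described here can be illustrated with a short Python sketch. FIG. 3 is not reproduced in this text, so the concrete choices below are assumptions for illustration only: "[" and "]" as the start and end codes, a space as padding, the item widths, and the sample roster entries are all hypothetical.

```python
START_CODE = "["  # assumed start code
END_CODE = "]"    # assumed end code
PAD = " "         # assumed padding character

# (item name, width in characters); the widths are assumptions.
FIELDS = [("name", 8), ("birthplace", 6), ("sex", 1), ("birth_date", 8)]
RECORD_LENGTH = len(START_CODE) + sum(w for _, w in FIELDS) + len(END_CODE)


def encode_record(record):
    """Lay out one record as start code, padded items, end code."""
    parts = []
    for item, width in FIELDS:
        value = record[item]
        if len(value) > width:
            raise ValueError(f"{item!r} does not fit in {width} characters")
        parts.append(value.ljust(width, PAD))
    return START_CODE + "".join(parts) + END_CODE


def decode_record(text):
    """Split one encoded record back into its items."""
    body = text[len(START_CODE):-len(END_CODE)]
    record, pos = {}, 0
    for item, width in FIELDS:
        record[item] = body[pos:pos + width].rstrip(PAD)
        pos += width
    return record


# Made-up roster entries, stored back to back as plain text.
roster = [
    {"name": "山田太郎", "birthplace": "千葉県", "sex": "男", "birth_date": "19600401"},
    {"name": "川口一郎", "birthplace": "神奈川県", "sex": "男", "birth_date": "19581123"},
]
data = "".join(encode_record(r) for r in roster)
```

Because every record has the same length, record number n starts at offset n * RECORD_LENGTH, so no separate index is needed to locate it.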

[0018] Next, the operation of the keyword conversion unit 13, which generates keys suited to the above data format from the input keywords and search conditions, will be described with reference to FIG. 4. Suppose that the following search condition and keywords are input for a text-format file like that of FIG. 3(B) (in this example, "?" denotes a one-character wildcard that matches any character).

[0019] Surname = 山? AND birthplace = (Chiba Prefecture OR Kanagawa Prefecture)

The keyword conversion unit then generates the two keys shown in FIG. 4, in accordance with the record data format. Each of the two keys generated here can be regarded as a single character string, so they are passed to the collation unit 14 as keys. The character string collation processor of the collation unit collates the keys against the data of the data storage unit 11, which has the text format shown in FIG. 3(B). Because the processor used in this database system can collate a plurality of keywords and can handle wildcards, it collates these keys in a single pass over the data. Since the collation unit can determine the start or end address of a matched character string, adding start and end codes to both the stored data and the keys, as in the record format shown in FIG. 3(B) and the key format shown in FIG. 4, guarantees that the address of each matching record is obtained. The information obtained is stored in the search result storage unit as shown in the flowchart of FIG. 2.
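The key generation and collation described in this paragraph can be sketched in Python. This is a software illustration under stated assumptions, not the patented processor: FIG. 3 and FIG. 4 are not reproduced here, so the start and end codes "[" and "]", the item widths, the rule that every position not covered by a keyword is filled with the wildcard, the function names, and the sample roster are all hypothetical.

```python
from itertools import product

START_CODE, END_CODE, WILDCARD, PAD = "[", "]", "?", " "  # all assumed
FIELDS = [("name", 8), ("birthplace", 6), ("sex", 1), ("birth_date", 8)]
RECORD_LENGTH = 2 + sum(width for _, width in FIELDS)


def encode_record(values):
    """Lay out one record: start code, padded items, end code."""
    body = "".join(v.ljust(w, PAD) for v, (_, w) in zip(values, FIELDS))
    return START_CODE + body + END_CODE


def make_keys(condition):
    """Turn {item: [alternatives]} into record-shaped keys.

    Items are ANDed and the alternatives of one item are ORed, so one key
    is produced per combination of alternatives. No length validation is
    done in this sketch.
    """
    items = [name for name, _ in FIELDS if name in condition]
    keys = []
    for combo in product(*(condition[name] for name in items)):
        chosen = dict(zip(items, combo))
        body = "".join(chosen.get(name, "").ljust(width, WILDCARD)
                       for name, width in FIELDS)
        keys.append(START_CODE + body + END_CODE)
    return keys


def key_matches(key, window):
    """True if every key character is a wildcard or equals the data."""
    return all(k == WILDCARD or k == c for k, c in zip(key, window))


def search(data, keys):
    """Return the start address of every stretch of data matching any key."""
    addresses = []
    for start in range(len(data) - RECORD_LENGTH + 1):
        window = data[start:start + RECORD_LENGTH]
        if any(key_matches(key, window) for key in keys):
            addresses.append(start)
    return addresses


# Made-up roster; 千葉県 is Chiba Prefecture, 神奈川県 is Kanagawa Prefecture.
data = "".join(encode_record(r) for r in [
    ("山田太郎", "千葉県", "男", "19600401"),
    ("山本花子", "東京都", "女", "19650512"),
    ("川口一郎", "神奈川県", "男", "19581123"),
    ("山口次郎", "神奈川県", "男", "19700730"),
])
keys = make_keys({"name": ["山?"], "birthplace": ["千葉県", "神奈川県"]})
addresses = search(data, keys)
record_numbers = [a // RECORD_LENGTH for a in addresses]
```

Because each key begins with the start code and ends with the end code, and neither code occurs inside an item, a key can only match a window that is aligned with a whole record. Each reported address is therefore the start of a matching record, and dividing it by the record length gives the record number.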

[0020]

[Effects of the Invention] As described above, according to the present invention the collation unit uses a dedicated character string collation processor, so high-speed character string collation is possible. The character string data of the database can therefore be searched directly, no index registration is needed, and the database can be registered and updated easily as an ordinary text file. Furthermore, because the database holds no redundant information such as indexes or relationships among the data, it can store the data using less disk space than a conventional database.
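Registration and update on such a plain text file reduce to ordinary string operations, since there is no index to maintain. A minimal Python sketch follows, assuming a record length of 10 characters, "[" and "]" as start and end codes, and made-up record contents; the function names are hypothetical.

```python
RECORD_LENGTH = 10  # assumed: start code + 8 characters + end code


def append_record(data, record):
    """Register a new record by appending it to the text."""
    if len(record) != RECORD_LENGTH:
        raise ValueError("record has the wrong length")
    return data + record


def replace_record(data, index, record):
    """Update record number `index` (0-based) by overwriting its span."""
    if len(record) != RECORD_LENGTH:
        raise ValueError("record has the wrong length")
    start = index * RECORD_LENGTH
    if index < 0 or start >= len(data):
        raise IndexError("no such record")
    return data[:start] + record + data[start + RECORD_LENGTH:]


data = "[AAAAAAAA][BBBBBBBB]"
data = append_record(data, "[CCCCCCCC]")
data = replace_record(data, 1, "[bbbbbbbb]")
```

Neither operation touches any structure other than the record text itself, which is the property the paragraph above attributes to the absence of indexes.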

[Brief description of the drawings]

[FIG. 1] A block diagram showing an embodiment of the database system according to the present invention.

[FIG. 2] A flowchart for explaining the operation of the database system of FIG. 1.

[FIG. 3] A diagram for explaining the data format of the data storage unit of the database system of FIG. 1.

[FIG. 4] A diagram for explaining the operation of the keyword conversion unit of the database system of FIG. 1.

[Explanation of symbols]

11 data storage unit
12 keyword input unit
13 keyword conversion unit
14 collation unit
15 search result storage unit
16 search result display unit

Continuation of the front page

(72) Inventor: Yoko Miyao (宮尾陽子), c/o NEC Software (日本電気ソフトウェア株式会社), 2-17-11 Takanawa, Minato-ku, Tokyo

(56) References cited:
JP-A-1-265321 (JP, A)
東功, "Introduction to awk", Monthly ASCII, ASCII Corporation, April 1, 1989, Vol. 13, No. 4, pp. 317-322
魚住和朗, "Stranger than MS-DOS パソコン人生幸路だ -責任者ででこい!", Monthly ASCII, ASCII Corporation, January 1, 1991, Vol. 15, No. 1, pp. 313-320
山谷正己, "Introduction to File Organization", 1st ed., Ohmsha, July 25, 1980, pp. 1-26
宮原末治 and two others, "Full-text search using an SMID-type parallel processor", Transactions of the Information Processing Society of Japan, Vol. 33, No. 3, March 1992, pp. 397-404
バインス情報センター, "Using dBASE III for the First Time", 1st ed., Gijutsu Hyoronsha (技術評論社), November 1987, pp. 90-102
"Software Library MS-DOS 3.3 User's Reference Manual", NEC Corporation, 1988, pp. 5-11

(58) Field searched (Int. Cl.⁷, DB name): G06F 12/00

Claims

(57) [Claims]

1. A database system comprising: a data storage unit that stores data in a record format consisting of a plurality of items; a keyword input unit for inputting a keyword to be searched for in a specific item among the plurality of items; a keyword conversion unit that converts the input keyword into a key having the same format as the record format of the data in the data storage unit, in which the input keyword is applied to the specific item and wildcards are applied to the other items; a collation unit that collates the key converted and generated by the keyword conversion unit against the data in the data storage unit; a search result storage unit that stores information on the records found, as a result of the collation by the collation unit, to match the search condition; and a search result display unit that presents the search results held in the search result storage unit to a user; wherein the collation unit is a character string collation processor and can search the data in the data storage unit directly with the key.

2. The database system according to claim 1, wherein the keyword input unit accepts input of a plurality of keywords and of a search condition linking these keywords, and, when the search condition is "OR", the keyword conversion unit generates as many keys as there are keywords linked by "OR", each key having one of the keywords linked by "OR" applied to the corresponding item of the key.

3. The database system according to claim 1 or 2, wherein the data storage unit uses, as its data representation format, a text file consisting of fixed-length records each having a record start code and a record end code.