JPS62288934A

JPS62288934A - Massive information retrieving system

Info

Publication number: JPS62288934A
Application number: JP61131864A
Authority: JP
Inventors: Takashi Shindo; 隆進藤
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1986-06-09
Filing date: 1986-06-09
Publication date: 1987-12-15

Abstract

PURPOSE: To store the values of a designated column together by designating to the control system, when the database is constructed, the column of the database retrieval key that is effective for retrieval, so that the storage position of each piece of information in the database is determined accordingly. CONSTITUTION: A console 1 for the system manager and a plurality of user terminals 3 operated by users are connected to a database control system 4 of a mass information retrieval system. A key information ID control system table 7 is provided in the system table group of the system 4, and an external database 5 is connected to a buffer 6. When the database 5 is constructed, a manager 2 of the retrieval system designates to the system the column of the database retrieval key that is effective for retrieval. The position of each piece of information in the database 5 is determined, and values of the designated column are stored collectively in the buffer to retrieve plural pages.

Description

[Detailed description of the invention]

[Field of industrial application]

The present invention relates to a relational database management method for systems that search a large amount of information using a database. It is particularly suited to cases where the information handled falls into large categories and there are keys that characterize those categories.

[Conventional technology]

As described in Japanese Patent Application Laid-Open No. 56-21273, conventional systems have adopted methods such as creating view-specific data in order to reduce the CPU time and the number of file accesses needed for view processing when views are used.

However, no consideration was given to reducing the number of accesses to a relational database by taking into account where its data is stored on the secondary storage device.

[Problem that the invention seeks to solve]

In the conventional technology described above, a relational database management system processes and stores information according to the principle that "the database management system determines the location on the secondary storage device where data is stored," and no consideration is given to where information is located in the database. When information is retrieved from a database holding a large amount of information, the number of accesses depends on the relationship between the size of the database management system's buffer, into which information from the database is read, and the size of the database. In systems where the database is very large, the number of accesses to it increases, which leads to performance degradation.

An object of the present invention is to reduce the number of accesses to a database holding such a large amount of information when information in it is searched, and thereby to prevent performance degradation.

[Means for solving problems]

The above object is achieved, in a relational database management system within an information retrieval system, by automatically determining the storage location of information so that similar information is stored at the same location, both when the database is built and when information is added or changed. Specifically, the administrator of the information retrieval system considers the main search patterns of the system's users and the nature of the information stored in the database, and, when the database is built, specifies to the database management system the column to be used as the database search key so that it is effective for as many searches as possible. When the database is created and when information is added, deleted, or changed, the relational database management system stores the information grouped by the value of the specified column, and also creates a system table that manages where the information is stored.

[Effect]

The database management system of the information retrieval system takes the key column of the information storage table specified by the administrator of the information retrieval system, and stores the records registered in the database grouped by the value of that column. Specifically, the unit of storage is one buffer's worth of data that the database management system can input or output in a single database access (hereinafter called a page), and the system ensures that a single page contains only information with the same value in that column. When information is added to or updated in the database, an information ID, which pairs a page number with the address of that page, is registered in the key-based information ID management system table as the storage location of the information.
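The page-level grouping and the key-based information ID table described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name `ClusteredStore`, the constant `PAGE_CAPACITY`, and the in-memory lists standing in for pages are all hypothetical, and the second element of each information ID is a slot index within the page, which is an assumption (the text only says that the ID pairs a page number with the page's address).

```python
from collections import defaultdict

PAGE_CAPACITY = 4  # records per page; an arbitrary value chosen for illustration


class ClusteredStore:
    """Toy store in which each page holds records of only one key value."""

    def __init__(self):
        self.pages = []                      # page number -> list of records
        self.page_key = []                   # page number -> key value held by that page
        self.key_table = defaultdict(list)   # key value -> [(page number, slot), ...]

    def insert(self, key, record):
        # Reuse the last page already assigned to this key if it still has room.
        ids = self.key_table[key]
        if ids and len(self.pages[ids[-1][0]]) < PAGE_CAPACITY:
            page_no = ids[-1][0]
        else:
            # Otherwise open a new page dedicated to this key value.
            page_no = len(self.pages)
            self.pages.append([])
            self.page_key.append(key)
        self.pages[page_no].append(record)
        info_id = (page_no, len(self.pages[page_no]) - 1)
        ids.append(info_id)
        return info_id

    def pages_for(self, key):
        """Pages that must be read to fetch every record with this key."""
        return sorted({page_no for page_no, _ in self.key_table.get(key, [])})
```

With this layout, a lookup by key value consults `key_table` first and then reads only the pages listed there, which is the access-saving behaviour the text attributes to the system table.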

In addition, the administrator of the information retrieval system may specify as secondary keys several columns that are often used in searches of the database but have lower priority than the above key. The system then creates, in the above system table, an inverted file list in which the IDs of information with the same secondary-key column value are linked by pointers. When information in the database is updated and the value of the primary-key column changes, the system refers to the key-based information ID management system table, re-stores the information at the appropriate location, and corrects the information ID in the system table. When the value of a secondary key changes, the inverted file list is corrected by relinking the pointers. With this method, during a search the database management system can refer to the system table to find the pages that hold the information requested by the user of the information retrieval system, and can perform the search with the minimum necessary number of database accesses. In the present invention, the higher the duplication rate of the specified key values, the greater the effect.
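The pointer-linked inverted file list for a secondary key, including the relinking step performed when a secondary-key value changes, might look like the sketch below. The class `InvertedList` and its dictionary-based "pointers" are hypothetical stand-ins chosen for illustration; the text does not specify how the chains are represented.

```python
class InvertedList:
    """Toy chain of information IDs per secondary-key value, linked by 'next' pointers."""

    def __init__(self):
        self.head = {}      # secondary-key value -> first info ID in its chain
        self.next = {}      # info ID -> next info ID with the same value (or None)
        self.value_of = {}  # info ID -> current secondary-key value

    def add(self, info_id, value):
        # Push the ID onto the front of the chain for this value.
        self.next[info_id] = self.head.get(value)
        self.head[value] = info_id
        self.value_of[info_id] = value

    def remove(self, info_id):
        value = self.value_of.pop(info_id)
        successor = self.next.pop(info_id)
        if self.head[value] == info_id:
            # The removed ID was first in its chain.
            if successor is None:
                del self.head[value]
            else:
                self.head[value] = successor
            return
        # Walk the chain to find the predecessor and bypass the removed ID.
        current = self.head[value]
        while self.next[current] != info_id:
            current = self.next[current]
        self.next[current] = successor

    def update(self, info_id, new_value):
        """Relink an ID when its secondary-key value changes."""
        self.remove(info_id)
        self.add(info_id, new_value)

    def ids_for(self, value):
        result, current = [], self.head.get(value)
        while current is not None:
            result.append(current)
            current = self.next[current]
        return result
```

Only the pointers are touched on an update; the records themselves stay on their pages, which matches the distinction the text draws between primary-key changes (re-storing) and secondary-key changes (relinking).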

When a page overflows, the page nearest to it is used as the next data storage page.
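One possible reading of "nearest page" is the free page whose page number is closest to that of the full page. The helper below sketches that reading; the function name `nearest_free_page`, the use of page-number distance as the measure of nearness, and the tie-breaking rule are all assumptions, not details given in the text.

```python
def nearest_free_page(full_page, free_pages):
    """Return the free page number closest to full_page, or None if none is free.

    Ties are broken toward the lower page number; that choice is an assumption.
    """
    if not free_pages:
        return None
    return min(free_pages, key=lambda p: (abs(p - full_page), p))
```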

Because the present invention allows an index to be created on a page-by-page basis, the size of the index can be reduced.

[Example]

An embodiment of the present invention is described below with reference to FIG. 1. The information retrieval system comprises: a system administrator 1 who manages the information retrieval system; a console, for example a CRT display 2, that mediates communication between the system administrator 1 and the database management system 4 for operating the information retrieval system; a terminal, for example a CRT display 3, that mediates communication between users of the information retrieval system and the database management system 4; a database, that is, a secondary storage device 5, that stores the information searched by the users; a buffer 6 within the database management system 4, which manages the database, that receives information accessed from the database; and a key-based information ID management system table 7 created by the database management system.

When creating the database, that is, at table design time, the information retrieval system administrator 1 tells the database management system the names of the table columns that will serve as the primary key and the secondary keys. In this embodiment, the information shown in FIG. 2 is stored. That is, the table holds document information with the following columns: document number 5-1, document classification 5-2, document title 5-3, author name 5-4, author's affiliation 5-5, publisher name 5-6, document publication date 5-7, document content keywords 5-8, document content summary 5-9, document purchase date 5-10, document purchase amount 5-11, document storage location 5-12, and document management information 5-13.

Because the column values from the document purchase date 5-10 onward are document management information that is not disclosed to users, the system administrator 1 first defines a view over the columns from document number 5-1 to document content summary 5-9. Next, based on the nature of the information, the administrator selects the document classification column 5-2 as the primary key. The document information is then stored together in the database by classification, and as a result information that shares most of its document titles, author names, and content keywords is stored within the same page. As the secondary key, the administrator specifies the publisher name column 5-6, which cannot be fully classified by the primary key, and indexes are created on the remaining columns 5-3 to 5-5, 5-7, and 5-8.

Once this table design has been made and communicated to the database management system 4 at database construction time, the database management system 4 creates the key-based information ID management table 7 for the primary key, shown in FIG. 3, and the inverted file list for the secondary key as the document information is entered. When a user of the retrieval system enters search conditions from the terminal 3, the database management system 4 uses the system table and the indexes to extract the information the user requested and provides it to the user. Updates and deletions of information are performed by the system administrator 1, and the database management system 4 revises the key-based information ID management system table each time. With this method, the information requested by users of the information retrieval system is available in page units and the number of database accesses decreases, so search performance improves.
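The table design chosen in this embodiment can be written down as a small declaration. The sketch below uses the column numbers from FIG. 2 as given in the text; the English identifiers, the `DESIGN` dictionary layout, and the `project_view` helper are hypothetical and only illustrate how the view, primary key, secondary key, and indexed columns relate.

```python
# Column numbers follow the text; the identifier names are illustrative.
COLUMNS = {
    "5-1": "document_number",
    "5-2": "classification",
    "5-3": "title",
    "5-4": "author",
    "5-5": "author_affiliation",
    "5-6": "publisher",
    "5-7": "publication_date",
    "5-8": "content_keywords",
    "5-9": "content_summary",
    "5-10": "purchase_date",
    "5-11": "purchase_amount",
    "5-12": "storage_location",
    "5-13": "management_info",
}

DESIGN = {
    "primary_key": "5-2",                                # records are grouped per page by this column
    "secondary_keys": ["5-6"],                           # chained through the inverted file list
    "indexed": ["5-3", "5-4", "5-5", "5-7", "5-8"],      # ordinary indexes
    "view": [f"5-{i}" for i in range(1, 10)],            # columns 5-1 to 5-9 are visible to users
}


def project_view(record):
    """Keep only the columns exposed to users through the view."""
    visible = {COLUMNS[c] for c in DESIGN["view"]}
    return {name: value for name, value in record.items() if name in visible}
```

For example, `project_view({"title": "X", "purchase_amount": 100})` keeps the title and drops the purchase amount, mirroring the rule that columns from 5-10 onward are not disclosed to users.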

[Effect of the invention]

According to the present invention, information retrieval performance is improved in a mass information retrieval system having a relational database. The retrieval efficiency of the database depends on how the primary key is chosen and on the nature of the information, but in general the larger the amount of information, the higher the retrieval efficiency. For example, suppose one item of information has an average length of n bytes, N such items are stored in the database, and the buffer length (the size of one page) is B bytes; then [Nn/B] - 1 items of information can be stored in one page. In a conventional database, if the average proportion of information with differing primary-key values stored per page is M%, information with the same key value is spread over (100-M)/N pages, and at least (100-M)/N pages had to be accessed. With the method of the present invention, only [(100-M)n/B] pages need to be searched. Here the symbol [a] denotes the smallest integer greater than or equal to a.
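The qualitative claim, that grouping records by key value reduces the number of pages read for a given key, can be checked with a small count. The sketch below does not reproduce the expressions in the text; it is an independent illustration under simple assumptions: records arrive round-robin by key in the scattered case, pages are filled in arrival order, and the function names and parameters are hypothetical.

```python
import math


def pages_touched_scattered(num_records, num_keys, per_page):
    """Pages holding at least one record of key 0 when records arrive round-robin by key."""
    pages = set()
    for i in range(num_records):
        if i % num_keys == 0:          # record i carries key 0
            pages.add(i // per_page)   # records fill pages in arrival order
    return len(pages)


def pages_touched_clustered(num_records, num_keys, per_page):
    """Pages needed when all records of key 0 are packed together."""
    count = math.ceil(num_records / num_keys)  # number of records with key 0
    return math.ceil(count / per_page)
```

With 1000 records, 10 key values, and 20 records per page, the scattered layout spreads key 0 over 50 pages, while the clustered layout needs 5.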

[Brief explanation of drawings]

FIG. 1 is a schematic diagram illustrating an information retrieval system according to one embodiment of the present invention. FIG. 2 is an explanatory diagram of a document table, an example of a table that stores the information handled by the information retrieval system. FIG. 3 is an explanatory diagram of the key-based information ID management system table, which enhances the effect of the present invention.

1: information retrieval system administrator; 4: database management system; 5: database; 6: buffer; 7: key-based information ID management system table; 5-1 to 5-13: columns of the document table.

Claims

[Claims]

1. A mass information retrieval system comprising a relational database management system in which: the administrator of the information retrieval system specifies a key column in a table representing information in a relational database; when the database is created and when information in the database is added, deleted, or changed, the information constituting the database is stored, for each value of that column, at the same location on the secondary storage device in units of access to the secondary storage device; the storage locations are automatically created and revised in memory as one of the system tables; and the database is accessed by using that table when the information is searched.