JPH04245563A

JPH04245563A - Preparation of retrieving table

Info

Publication number: JPH04245563A
Application number: JP3010743A
Authority: JP
Inventors: Shinji Hasunuma; 蓮沼　信二
Original assignee: Matsushita Graphic Communication Systems Inc
Current assignee: Panasonic System Solutions Japan Co Ltd
Priority date: 1991-01-31
Filing date: 1991-01-31
Publication date: 1992-09-02

Abstract

PURPOSE: To create a search table in which the time required to search for a keyword is approximately equal for every index. CONSTITUTION: Each keyword representing the content of a document stored on a storage medium is given a pointer to the document numbers that can be retrieved from it. The keywords are divided into groups, and each group is represented by an index that holds a pointer from which its keywords can be retrieved. The search table is created so that the number of keywords belonging to each index is approximately equal.

Description

[Detailed description of the invention]

[0001]

[Field of Industrial Application] The present invention relates to a method of creating a search table used to search for documents stored on a storage medium.

[0002]

[Prior Art] Optical disks have become quite widespread in recent years. Their distinguishing feature is an enormous storage capacity. As a result, a large number of documents (units of stored information) are stored on a single optical disk, and a desired document must be retrieved quickly from among them. For this reason, the search information needed to find each document is also stored on the storage medium, and this search information is used to access the desired document.

[0003] An example of such search information is described with reference to FIG. 5. FIG. 5 shows a search table in which the keywords representing the contents of documents are divided into predetermined groups, each group is represented by an index, and the keywords are classified under each index. Each keyword has a pointer from which the document numbers of the documents having the content it represents can be retrieved, and each index has a pointer from which the keywords belonging to it can be retrieved. To find a desired document with this search table, the index thought to contain a keyword representing the content of the desired document is taken first, and the keywords belonging to that index are searched. If the target keyword is found, the one or more document numbers it represents are obtained from the keyword's pointer, and the target document is accessed from among them. If the target keyword is not contained in that index, the keywords of another index are searched.
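The lookup procedure for the conventional table of FIG. 5 can be sketched roughly as follows. This is a minimal illustration only: the table layout, the index labels, the sample keywords, and the function name `find_documents` are all assumptions made for the example and do not appear in the patent.

```python
# Hypothetical in-memory model of a FIG. 5 style table (all names illustrative).
# Each index points to its group of keywords; each keyword points to the
# document numbers it represents.
conventional_table = [
    ("A-F", [("audio", [3, 17]), ("disk", [1, 4, 9])]),
    ("G-M", [("laser", [4]), ("memory", [2, 9])]),
    ("N-Z", [("optics", [1]), ("search", [5, 8])]),
]


def find_documents(table, keyword, first_guess=0):
    """Search the guessed index first, then the remaining indexes in turn."""
    order = [first_guess] + [i for i in range(len(table)) if i != first_guess]
    for i in order:
        _label, group = table[i]
        for candidate, doc_numbers in group:   # scan the keywords of this index
            if candidate == keyword:
                return doc_numbers             # follow the keyword's pointer
    return []                                  # keyword is not registered


print(find_documents(conventional_table, "memory", first_guess=0))
```

The cost of a lookup depends on how many keywords each visited index holds, which is why the size of each group matters in the discussion that follows.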

[0004]

[Problem to be Solved by the Invention] When the search table of FIG. 5 described above is created, the number of keywords belonging to one index (Km in FIG. 5) is often a fixed value. That is, if I is the total number of keywords representing the contents of the documents stored on one storage medium, the search table is created with the number of indexes S set to I/Km. If keywords are subsequently registered so that their total exceeds S×Km, the number of keywords belonging to the final index (the S-th index) comes to greatly exceed Km. Consequently, the first (S-1) indexes can each be searched in roughly the same time, but searching for a keyword that belongs to the final index takes longer than for the other indexes.
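A small worked example may make the imbalance concrete. The sketch below assumes, as an illustration, that overflow keywords are simply appended to the last index; the function name `fixed_size_counts` and the figures (Km = 100, S = 10) are hypothetical and not taken from the patent.

```python
def fixed_size_counts(total_keywords, km, s):
    """Keyword count per index when every index holds km keywords and
    whatever is left over ends up in the last (S-th) index."""
    counts = []
    remaining = total_keywords
    for i in range(s):
        is_last = (i == s - 1)
        take = remaining if is_last else min(km, remaining)
        counts.append(take)
        remaining -= take
    return counts


# Table sized for I = 1000 keywords with Km = 100, so S = I / Km = 10.
print(fixed_size_counts(1000, 100, 10))   # every index holds 100 keywords
# 450 more keywords are registered later: the last index grows to 550.
print(fixed_size_counts(1450, 100, 10))
```

In the second case a search that lands in the last index has to deal with more than five times as many keywords as a search in any other index.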

[0005] The present invention has been made in view of the above problem, and its object is to provide a method of creating a search table in which the time required to search for a keyword is approximately equal for every index.

[0006]

[Means for Solving the Problem] To achieve the above object, in the search table creation method of the present invention, each keyword representing the content of a document stored on a storage medium is provided with a pointer to the document numbers that can be retrieved from that keyword, the keywords are divided into groups, each group is represented by an index, and each index is provided with a pointer from which the keywords it contains can be retrieved. When the search table consisting of these indexes and the keywords belonging to them is created, the number of keywords belonging to each index is made approximately equal.

[0007]

[Operation] With the above configuration, the number of keywords belonging to each index is approximately uniform, so retrieval as a whole becomes faster.

[0008]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings. FIG. 1 shows a search table created according to the embodiment. Before this search table is described, the system that performs searches with it is explained.

[0009] FIG. 2 shows an optical disk search system. Reference numeral 1 denotes an optical disk control device containing an optical disk; 2 denotes an input/output device used to enter the keywords and documents needed for a search and to display or record search results and retrieved documents; and 3 denotes a system control device that, on instruction from the input/output device 2, reads the search information from the optical disk and controls the search performed by the search device 4. The search device 4 expands the search information read from the optical disk and performs the search.

[0010] FIG. 3 is a block diagram of the search device 4 shown in FIG. 2. The search device 4 consists of a system interface unit 41 that interfaces with the system control device 3, a search control unit 42 that reads the search information stored on the optical disk, expands it in the main memory 43 described below, and performs the search, and the main memory 43, which serves as the work area for the search.

[0011] FIG. 4 shows the state in which the search information has been expanded in the main memory 43 shown in FIG. 3. The main memory 43 is divided into an index area 431, a keyword area 432 that holds the keywords belonging to each index, and a document number area 433 that holds the numbers of the documents each keyword points to.

[0012] Next, the operation is described. The search device 4 accesses the optical disk via the system interface unit 41 and expands the search information into the areas of the main memory 43 shown in FIG. 4. The number of keywords A managed by each index is then calculated by dividing the total number of registered keywords by S, the number of indexes that can be created in the index area 431. Starting from the head of the keywords, which are sorted in the keyword area 432 in descending (or ascending) order, an index is created for every A keywords and set in the index area 431. Each index has a pointer for accessing the keywords that belong to it. The search table shown in FIG. 1 is created in this way.
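The table creation step described above might look roughly like the following sketch. Several details are assumptions made for illustration and are not stated in the patent: the function name `build_search_table`, rounding A up so that no more than S indexes are produced, sorting in ascending order, and storing the first keyword of each group together with its offset as the index entry.

```python
import math


def build_search_table(keyword_to_docs, num_indexes):
    """Build three parallel areas modelled loosely on FIG. 4.

    keyword_to_docs: dict mapping keyword -> list of document numbers.
    num_indexes:     S, the number of indexes that can be created.
    """
    keywords = sorted(keyword_to_docs)                    # keyword area
    doc_numbers = [keyword_to_docs[k] for k in keywords]  # document number area
    # A = total keywords / S, rounded up (the rounding rule is an assumption).
    per_index = max(1, math.ceil(len(keywords) / num_indexes))
    # One index entry per run of A keywords: (first keyword, offset pointer).
    index_area = [
        (keywords[start], start)
        for start in range(0, len(keywords), per_index)
    ]
    return index_area, keywords, doc_numbers, per_index


sample = {f"k{i:02d}": [i] for i in range(10)}   # ten dummy keywords
index_area, keywords, doc_numbers, per_index = build_search_table(sample, 4)
print(per_index)     # keywords per index (A)
print(index_area)    # index entries with their offsets into the keyword area
```

Because A is recomputed from the current keyword total each time the table is built, newly registered keywords are spread over all indexes instead of accumulating in the last one.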

[0013] Next, the operation of searching for a desired document with this search table is described. When a keyword is entered from the input/output device 2 together with an instruction to search for the document numbers related to it, the specified keyword is received by the system interface unit 41. A binary search of the index area 431 then finds the keyword list to which the specified keyword belongs, and the specified keyword is located within that keyword list. Because the number of keywords managed by each index is equalized, the time needed to find the specified keyword is also about the same for every index. Unlike the search table described under the prior art, the number of keywords belonging to each index is determined by the number of indexes and the total number of keywords, so keywords never concentrate in one particular index, and retrieval as a whole becomes faster.
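The two-stage lookup described above can be sketched as follows. The sample data, the function name `lookup`, and the index entry format (first keyword plus offset) are assumptions for illustration; the patent also does not say how the keyword is located inside its group, so a plain sequential scan is assumed here. Python's standard `bisect` module stands in for the binary search of the index area.

```python
import bisect

# Hypothetical table with two keywords per index (A = 2), ascending order.
keywords = ["audio", "disk", "laser", "memory", "optics", "search"]
doc_numbers = [[3, 17], [1, 4, 9], [4], [2, 9], [1], [5, 8]]
index_area = [("audio", 0), ("laser", 2), ("optics", 4)]
per_index = 2


def lookup(index_area, keywords, doc_numbers, per_index, target):
    """Binary-search the index area, then scan only the matching group."""
    first_keys = [key for key, _offset in index_area]
    # Last index whose first keyword is <= target.
    pos = bisect.bisect_right(first_keys, target) - 1
    if pos < 0:
        return []                      # target sorts before every keyword
    start = index_area[pos][1]
    end = min(start + per_index, len(keywords))
    for offset in range(start, end):   # at most A comparisons
        if keywords[offset] == target:
            return doc_numbers[offset]
    return []                          # keyword is not registered


print(lookup(index_area, keywords, doc_numbers, per_index, "memory"))
```

Since every group holds at most A keywords, the second stage has the same upper bound on work no matter which index the keyword falls under.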

[0014] Binary search, also called the bisection search method, is a search method that is effective when the data are arranged in a fixed order, for example by magnitude for integers or in lexicographic order for character strings. The whole data set is first divided in two and the half containing the target data is identified; that half is then divided in two again and the half containing the target data is identified; repeating this step finds the target data.
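For reference, the halving procedure described in this paragraph can be written out directly. This is a generic textbook sketch, not code from the patent; the function name `binary_search` and the sample list are illustrative.

```python
def binary_search(sorted_items, target):
    """Return the position of target in sorted_items, or -1 if absent.

    sorted_items must be in ascending order (numeric or lexicographic).
    """
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2            # split the remaining range in two
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            low = mid + 1                  # target can only be in the upper half
        else:
            high = mid - 1                 # target can only be in the lower half
    return -1


words = ["audio", "disk", "laser", "memory", "optics", "search"]
print(binary_search(words, "laser"))
```

Each pass halves the remaining range, so the number of comparisons grows only with the logarithm of the number of entries.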

[0015]

[Effects of the Invention] As is clear from the above description, by using a search table created so that the number of keywords managed by each index is approximately equal, the present invention allows any keyword to be found in about the same time on average, which improves search efficiency.

[Brief explanation of the drawing]

[Figure 1] Diagram showing a search table constructed according to this embodiment

[Figure 2] Configuration diagram of the system that carries out this embodiment

[Figure 3] Configuration diagram of the search device shown in Figure 2

[Figure 4] Diagram showing the areas of the main memory of Figure 3 into which the search information is expanded

[Figure 5] Diagram showing an example of a conventional search table

[Explanation of symbols]

1 Optical disk control device
2 Input/output device
3 System control device
4 Search device
41 System interface unit
42 Search control unit
43 Main memory

Claims

[Claims]

Claim 1: A method of creating a search table in which each keyword representing the content of a document stored on a storage medium is provided with a pointer to the document numbers that can be retrieved from that keyword, the keywords are divided into groups, each group is represented by an index, and each index is provided with a pointer from which the keywords it contains can be retrieved, the method being characterized in that, when the search table consisting of the indexes and the keywords belonging to them is created, the number of keywords belonging to each index is made approximately equal.