JP3193249B2

JP3193249B2 - Keyword search method

Info

Publication number: JP3193249B2
Application number: JP29942594A
Authority: JP
Inventors: 雅光湊川; 真樹作田; 佳津子市江; 千加子百山; 隆弘林
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1994-12-02
Filing date: 1994-12-02
Publication date: 2001-07-30
Anticipated expiration: 2016-07-30
Also published as: JPH08161341A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a method of storing keywords extracted from document files and searching the document files based on those keywords.

[0002] Documents on electronic media can be used as a database by accumulating them. One way of using such data assets is to display the document keywords written in the documents and to update them as needed.

[0003] In an electronic document, a document keyword corresponds to a database instance and a document corresponds to a database record. Accordingly, unless otherwise noted, "database instance" in the following description includes document keywords.

[0004]

[Prior Art] When searching for documents that match a keyword, the conventional procedure was to open each document file in turn, search for and read the target keyword, and close the file, repeating this for every document file. Because this takes a long time, the usual approach is to manage the keywords of all document files together in a keyword management file. A search then reads only this keyword management file, so that a file is opened only once, when a match occurs, and the keywords of multiple document files are searched consecutively.

[0005]

[Problems to Be Solved by the Invention] The keyword management file stores keywords in table form. Because keywords extracted from document files can have any number of characters, the record length defined for the table is normally preset to the maximum length expected.

[0006] Consequently, a character string longer than the record length cannot be made a target of keyword search and therefore cannot be stored in the keyword management file. Conversely, if the record length is set too long and most extracted keywords are close in length to the document files themselves, the search effectively runs over a keyword management file about as large as the document files, and the search time is no longer shortened.
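The trade-off can be made concrete with a rough sketch. It assumes a fixed-length table in which every stored keyword occupies the full record length; the function name `table_size` and the byte counts are hypothetical and not taken from the patent.

```python
def table_size(keyword_lengths, record_length):
    """Return (bytes used by a fixed-length table, lengths that do not fit)."""
    stored = [n for n in keyword_lengths if n <= record_length]
    rejected = [n for n in keyword_lengths if n > record_length]
    # Every stored keyword is padded to the full record length.
    return len(stored) * record_length, rejected


lengths = [10, 24, 8, 300]        # keyword sizes in bytes (made-up figures)
small = table_size(lengths, 32)   # short records: the 300-byte keyword is lost
large = table_size(lengths, 512)  # long records: everything fits, mostly padding
```

With short records the long keyword cannot be stored at all; with long records the table grows because of padding, which is the dilemma the paragraph describes.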

[0007] The present invention has been made in view of these points. Its object is to provide a keyword search method in which any character string can be stored in the keyword management file regardless of the record length used to store keywords, and in which document files can be searched by means of that keyword management file.

[0008]

[Means for Solving the Problems] To achieve the above object, the present invention is characterized by comprising the steps of: extracting a keyword from a document file; determining whether the number of characters of the extracted keyword exceeds a preset upper limit on the number of stored characters; and storing in a keyword management file the character string of the keyword if its number of characters does not exceed the upper limit, or position information of the keyword within the document file if it does.

[0009] The present invention is further characterized by comprising the steps of: reading a record from the keyword storage table of the keyword management file; determining whether the read record contains position information; if it does not, matching the search target character string against the keyword read from the keyword management file; and if it does, opening the document file in which the character string indicated by the position information is written, extracting that character string from the file, and matching it against the search target character string.

[0010]

[Operation] With the above means, any character string can be stored in the keyword management file regardless of the record length used to store keywords, and document files can be searched by means of this keyword management file.

[0011]

[Embodiments] First, an outline of an embodiment of the present invention is described. FIG. 1 is a block diagram showing the configuration of an instance update apparatus. The instance update apparatus consists of an instance input device 20 and, connected to it, an external storage device 30 and a display device 40. The instance input device 20 comprises: an instance reading device 21 that reads instances from document files 31 stored in the external storage device 30; a memory device 22 into which the read instances are loaded; a list creation device 23 that creates list data for showing the instances loaded in the memory device 22 on the display device 40 in a horizontal or vertical list format; an event analysis device 24 that analyzes events such as instance update instructions; and an instance storage device 25 that writes updated instances in the memory device 22 back to the external storage device 30.

[0012] To display a list of instances, the instance reading device 21 of the instance input device 20 first opens a document file 31 stored in the external storage device 30, searches for and extracts the target document keywords, and loads them into the memory device 22. The list creation device 23 arranges the instances loaded in the memory device 22 according to a predetermined list format, for example the horizontal list format shown in FIG. 2, to create a list, and sends it to the display device 40. The display device 40 receives the list data from the list creation device 23 and displays the list.

[0013] When an instance is to be changed, the event analysis device 24 analyzes the instance selection and update instructions and notifies the list creation device 23 accordingly. The event analysis device 24 also rewrites the corresponding instance in the memory device 22 in response to the input confirmation instruction given after the instance has been changed. The rewritten instance is immediately reflected in the list created by the list creation device 23, and the updated list is shown on the display device 40. The rewritten instance in the memory device 22 is also written back to the document file 31 in the external storage device 30 by the instance storage device 25. To do so, the instance storage device 25 identifies the document file 31 stored in the external storage device 30 from the storage location of the rewritten instance, searches that document file 31 for the corresponding instance, and rewrites it.

[0014] In a preferred embodiment, the external storage device 30 stores a keyword management file 32, a single file holding the keywords extracted from each document file 31 that the instance reading device 21 is to read. When the instance reading device 21 goes to read the keywords of the document files 31, it reads this keyword management file 32 instead.

[0015] Ordinarily, when the instance reading device 21 reads the keywords of the document files 31, it repeats, for each document file, the operations of opening the file, searching for and reading the target keywords, and closing the file. In this embodiment, the keywords of the document files 31 are managed together in the keyword management file 32, so the instance reading device 21 reads only the keyword management file 32 when it goes to read those keywords. As a result, only one file open is needed, the keywords of multiple document files can be read consecutively, and the instance list can be displayed faster.
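The contrast between the two reading strategies can be sketched as follows. This is only an illustration under stated assumptions: keywords are marked in the text with a hypothetical `[[...]]` markup (the patent marks them by underlining), and the management file is a hypothetical tab-separated file named `keywords.tsv`.

```python
import os
import re
import tempfile


def extract(text):
    # Hypothetical markup: keywords are wrapped in [[...]].
    return re.findall(r"\[\[(.+?)\]\]", text)


def read_keywords_per_file(paths):
    """Conventional approach: open, scan, and close every document file."""
    keywords, opens = [], 0
    for path in paths:
        with open(path, encoding="utf-8") as f:
            opens += 1
            keywords += [(os.path.basename(path), k) for k in extract(f.read())]
    return keywords, opens


def read_keywords_from_index(index_path):
    """Embodiment: a single open of the management file yields all keywords."""
    with open(index_path, encoding="utf-8") as f:
        rows = [tuple(line.rstrip("\n").split("\t")) for line in f]
    return rows, 1


with tempfile.TemporaryDirectory() as d:
    paths = []
    for name, body in [("doc1.txt", "[[markup]] and [[automatic generation]]"),
                       ("doc2.txt", "notes about [[search]]")]:
        path = os.path.join(d, name)
        with open(path, "w", encoding="utf-8") as f:
            f.write(body)
        paths.append(path)

    slow, slow_opens = read_keywords_per_file(paths)

    # Build the management file once, then read it back with one open.
    index = os.path.join(d, "keywords.tsv")
    with open(index, "w", encoding="utf-8") as f:
        for doc, kw in slow:
            f.write(f"{doc}\t{kw}\n")
    fast, fast_opens = read_keywords_from_index(index)
```

Both strategies return the same keyword rows; the difference is that the per-file approach opens one file per document, whereas the management file is opened once.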

[0016] Storing character strings extracted from document files as keywords, in table form, in a management file for searching is common practice. For this purpose, a keyword management apparatus that extracts keywords from the document files 31 and generates the keyword management file 32 is connected to the external storage device 30.

[0017] FIG. 2 shows an example configuration of the keyword management apparatus. The keyword management apparatus consists of a keyword extraction device 33, which extracts keywords from the document files 31 in the external storage device 30, and a keyword storage device 34, which stores the extracted keywords in the keyword management file 32 in the external storage device 30.

[0018] The keyword extraction device 33 initially extracts the predetermined keywords from all document files 31, and the keyword storage device 34 stores the extracted keywords in the keyword management file 32. Thereafter, each time one of the document files 31 is updated, keywords are extracted from the updated document file 31 and the keyword management file 32 is updated as well. The next time the instance reading device 21 reads instances, they are read from the updated keyword management file 32.
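The update behaviour can be sketched as below. The sketch assumes the management table is held as a list of `(document name, keyword)` rows and that keywords are marked with a hypothetical `[[...]]` markup; neither detail comes from the patent.

```python
import re


def extract_keywords(text):
    # Hypothetical markup: keywords are wrapped in [[...]].
    return re.findall(r"\[\[(.+?)\]\]", text)


def refresh(table, doc_name, new_text):
    """Replace the rows of one updated document; other rows stay untouched."""
    kept = [row for row in table if row[0] != doc_name]
    return kept + [(doc_name, kw) for kw in extract_keywords(new_text)]


table = [("doc1", "markup"), ("doc2", "search")]
table = refresh(table, "doc1", "[[tagging]] enables [[automatic generation]]")
```

Only the rows belonging to the updated document are re-extracted, so an update does not require rescanning every document file.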

[0019] The keyword management file 32 stores keywords in table form. Because the keywords that the keyword extraction device 33 extracts from the document files 31 can have any number of characters, the record length of the keyword storage table defined in the keyword management file 32 is normally preset to the maximum length expected.

[0020] Consequently, a character string longer than the record length of the keyword storage table used in the keyword management file 32 cannot be made a target of keyword search, and a keyword with such a character string cannot be stored. If, however, the record length is set too long and most extracted keywords are close to that length, a second large file with substantially the same content ends up being created as the keyword management file 32.

[0021] The keyword management apparatus is therefore configured so that it can also store character strings longer than the record length of the keyword storage table. Specifically, an upper limit on the number of stored characters is set for storing keywords extracted from the document files 31 in the keyword management file 32. For a keyword that does not exceed this upper limit, the character string itself is stored in the keyword management file 32; for a keyword that exceeds the upper limit, information indicating where the keyword appears in its document file is stored in the keyword management file 32 instead. The upper limit is normally set equal to the record length of the keyword storage table.

[0022] FIG. 3 is a flowchart showing the operation of the keyword extraction device. According to this flowchart, the keyword extraction device 33 first extracts a keyword from a document file 31 stored in the external storage device 30 (step S11). It is then determined whether the extracted keyword exceeds the preset upper limit on the number of stored characters (step S12).

[0023] If it is determined in step S12 that the extracted keyword does not exceed the upper limit, the keyword storage device 34 stores the character string of the keyword in the keyword management file 32 in the external storage device 30 (step S13). If the extracted keyword does exceed the upper limit, the keyword storage device 34 instead stores the position information of the keyword within the document file in the keyword management file 32 (step S14).
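Steps S11 to S14 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Record` type, the `make_record` function, the limit of 10 characters, and the use of character offsets (rather than byte offsets) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MAX_CHARS = 10  # illustrative upper limit on stored characters


@dataclass
class Record:
    doc_name: str
    keyword: Optional[str] = None               # set when the keyword fits
    position: Optional[Tuple[int, int]] = None  # (start, end) offsets otherwise


def make_record(doc_name, text, start, end, max_chars=MAX_CHARS):
    keyword = text[start:end]                       # S11: extracted keyword
    if len(keyword) <= max_chars:                   # S12: within the upper limit?
        return Record(doc_name, keyword=keyword)    # S13: store the string
    return Record(doc_name, position=(start, end))  # S14: store the position


text = "A standard processing format enables automatic markup."
long_kw = "standard processing format"
s1 = text.index(long_kw)
r1 = make_record("doc1", text, s1, s1 + len(long_kw))
s2 = text.index("markup")
r2 = make_record("doc1", text, s2, s2 + len("markup"))
```

The long keyword produces a record holding only its location, while the short one is stored as text, so every record stays within the fixed field size.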

[0024] FIG. 4 illustrates the contents of the document files. In the illustrated example there are three document files 31a, 31b, and 31c, named "Document 1", "Document 2", and "Document 3". In these documents, the character strings to be extracted as keywords are underlined to distinguish them from other text. For example, the character strings set as keywords in "Document 1" are "マーク付け" (markup), "定型処理フォーマット" (standard processing format), and "自動生成" (automatic generation). The keyword management apparatus extracts the keywords in these documents and stores them in the keyword management file 32.

[0025] FIG. 5 illustrates the contents of the keyword management file. The keyword management file 32 is a table associating two items, "document name" and "keyword", and the upper limit on the number of characters stored in the "keyword" field is, for example, 10 Japanese characters (20 bytes). A keyword whose length is within this upper limit is stored as is in the "keyword" field.
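A fixed 20-byte keyword field can be sketched as below. The patent does not name a character encoding; Shift_JIS is assumed here because it encodes each full-width Japanese character in two bytes, which matches the "10 characters (20 bytes)" figure. The function names and the space padding are likewise hypothetical.

```python
FIELD_BYTES = 20  # 10 double-byte characters


def pack_keyword(keyword, encoding="shift_jis"):
    """Pad a keyword to the fixed field width; raise if it does not fit."""
    raw = keyword.encode(encoding)
    if len(raw) > FIELD_BYTES:
        raise ValueError("keyword exceeds the field; store position information instead")
    return raw.ljust(FIELD_BYTES, b" ")


def unpack_keyword(field, encoding="shift_jis"):
    # Trailing padding spaces are removed before decoding.
    return field.rstrip(b" ").decode(encoding)


packed = pack_keyword("自動生成")           # 4 characters, padded to the field width
exact = pack_keyword("定型処理フォーマット")  # 10 characters, fills the field exactly
```

A keyword of 11 or more full-width characters raises `ValueError` in this sketch, which is the case where the method stores position information rather than the string.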

[0026] When the number of characters of an extracted string exceeds the upper limit, as with the second keyword of "Document 3" in FIG. 4, the character string of the keyword is not stored as is in the keyword management file 32; instead, information on the position at which the string appears in the document is stored. In the illustrated example, the stored information indicates that the second keyword of "Document 3" lies between the 121st and 154th bytes counted from the beginning of the document.
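Recovering a keyword from such position information might look like the sketch below. It assumes the two byte positions are 1-based and inclusive, which the patent does not state explicitly; the function `read_span`, the file name, and the sample text are hypothetical.

```python
import os
import tempfile


def read_span(path, first_byte, last_byte, encoding="utf-8"):
    """Return the text stored between two 1-based, inclusive byte positions."""
    with open(path, "rb") as f:
        f.seek(first_byte - 1)                     # convert to a 0-based offset
        raw = f.read(last_byte - first_byte + 1)   # inclusive range length
    return raw.decode(encoding)


keyword = "a keyword too long for the table"
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "doc3.txt")
    with open(path, "wb") as f:
        # 120 filler bytes, so the keyword starts at the 121st byte.
        f.write(b"x" * 120 + keyword.encode("utf-8") + b" ...rest of the document")
    first = 121
    last = first + len(keyword.encode("utf-8")) - 1
    recovered = read_span(path, first, last)
```

Only the referenced byte range is read, so the document does not have to be scanned to retrieve the long keyword.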

[0027] When the instance reading device 21 reads instances from the external storage device 30 not from the document files 31 but from the keyword management file 32, in which keywords and position information are stored together, it must take into account whether each entry stored as a keyword is an actual keyword or position information.

[0028] That is, if the information read from the keyword management file 32 is the position information of a keyword, the document file 31c containing that keyword is first opened, the character string in the document file 31c indicated by the position information is extracted, and that character string is used as the instance for the list display. Once the character string has been extracted, the document file 31c is closed at an appropriate time.

[0029] The keyword management file 32 in the external storage device 30 can be used not only to look up keywords for reading instances but also for plain keyword searches. If a keyword search is performed without the keyword management file 32, that is, as a full-text search, every document file must be opened and the search target character string matched against all the text in each document. When the keyword management file 32 containing both keywords and position information is used, by contrast, it suffices to open the keyword management file 32 and, where position information is registered, the document files it refers to.
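Which document files an index-based search still has to open can be sketched as follows, assuming each management-file record is a hypothetical `(document name, keyword, position)` tuple in which `position` is `None` for keywords stored as text.

```python
def documents_to_open(records):
    """Documents an index-based search must open: only those referenced
    by records that hold position information."""
    return sorted({doc for doc, _keyword, position in records if position is not None})


records = [
    ("doc1", "markup", None),
    ("doc2", "search", None),
    ("doc3", None, (120, 154)),
]
needed = documents_to_open(records)
```

In this example a full-text search would open all three documents, whereas the index-based search opens the management file plus the single document referenced by position information.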

[0030] FIG. 6 is a flowchart showing the keyword search method. To search for the keyword in the document files that corresponds to a specified search target character string, the keyword management file 32 is first opened (step S21). Next, one record is read from the keyword storage table (step S22). A check is then made for the end of file (EOF) (step S23); if the end of the file has been reached, the search ends.

[0031] If the end of the file has not been reached, the read record is checked for position information (step S24). If there is none, the previously specified search target character string is matched against the keyword read from the keyword management file 32 (step S25), and processing moves on to the next record. If step S24 finds position information, the document file containing the character string indicated by that position information is opened, the character string the position information points to is extracted from it, and the string is matched against the search target character string (step S26). The result of the match is shown on the display device.
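The loop of FIG. 6 can be sketched as below. The sketch makes several assumptions not found in the patent: records are `(document name, keyword, position)` tuples held in memory, positions are character offsets, matching is exact equality, and opening a document is simulated by a `load_document` callable.

```python
def search(records, target, load_document):
    """Return (doc_name, matched) for each record, following steps S22 to S26."""
    results = []
    for doc_name, keyword, position in records:   # S22/S23: read records until the end
        if position is None:                      # S24: no position information
            candidate = keyword                   # S25: compare the stored keyword
        else:
            start, end = position                 # S26: open the document and take
            candidate = load_document(doc_name)[start:end]  # the referenced string
        results.append((doc_name, candidate == target))
    return results


docs = {"doc3": "Notes on a very long keyword phrase here."}
phrase = "a very long keyword phrase"
start = docs["doc3"].index(phrase)
records = [
    ("doc1", "markup", None),
    ("doc3", None, (start, start + len(phrase))),
]
hits = search(records, phrase, docs.__getitem__)
```

Records that hold a keyword are compared without touching any document; only records that hold position information trigger a document lookup.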

[0032] FIG. 7 shows an example hardware configuration of a computer system for carrying out the present invention. In the figure, the computer system consists of a processor 51, a read-only memory (ROM) 52, a main memory (RAM) 53, a graphics control circuit 54 and display device 55, a mouse 56, a keyboard 57, a hard disk drive (HDD) 58, a floppy disk drive (FDD) 59, and a printer 60, all interconnected by a bus 61.

[0033] The processor 51 controls the computer system as a whole. The read-only memory 52 stores, for example, the programs needed at startup. The system program, the application program for list display, and the like are loaded into the main memory 53, which also holds data such as the instances that have been read.

[0034] The graphics control circuit 54 has a video memory and the like; it converts the list display screen of the instances read into the main memory 53, among other screens, into display signals and sends them to the display device 55. The display device 55 shows the horizontal or vertical list display screen and other screens based on the display signals it receives.

[0035] The mouse 56 is a pointing device: by moving the cursor shown on the screen of the display device 55 and clicking a button, the user selects the instance to be updated on the list display screen, or selects and issues commands listed in the various menus. The keyboard 57 is used to enter characters and the like when an instance is rewritten.

[0036] The hard disk drive 58 stores the system program, the application program for displaying instance lists, the document files, the keyword management file, and so on. The floppy disk drive 59 is an external storage device that can load document files stored on a floppy disk 59a into the hard disk drive 58 and can store, for example, the data of a list display result on the floppy disk 59a.

[0037] The list of instances can also be printed on paper by passing its display data to the printer 60.

[0038]

[Effects of the Invention] As a method of storing keywords extracted from multiple document files in a keyword management file, an upper limit on the number of stored characters is set; for a keyword that does not exceed the limit, its character string is stored in the keyword management file, and for a keyword that exceeds the limit, its position of appearance in the document is stored. This allows the size of the keyword management file to be optimized and, because the keywords are managed together in one file, character strings of any length extracted from document files can be managed as keywords while search efficiency is maintained to a reasonable degree.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of the instance update apparatus.

FIG. 2 is a diagram showing an example configuration of the keyword management apparatus.

FIG. 3 is a flowchart showing the operation of the keyword extraction device.

FIG. 4 is a diagram illustrating the contents of the document files.

FIG. 5 is a diagram illustrating the contents of the keyword management file.

FIG. 6 is a flowchart showing the keyword search method.

FIG. 7 is a diagram showing an example hardware configuration of a computer system for carrying out the present invention.

[Explanation of symbols]

20: instance input device
21: instance reading device
22: memory device
23: list creation device
24: event analysis device
25: instance storage device
30: external storage device
31: document file
32: keyword management file
33: keyword extraction device
34: keyword storage device

Continuation of front page

(72) Inventor: 百山 千加子, c/o Toyama Fujitsu Ltd. (株式会社富山富士通), 富山県婦負郡八尾町保内二丁目2番1
(72) Inventor: 林 隆弘, c/o Toyama Fujitsu Ltd. (株式会社富山富士通), 富山県婦負郡八尾町保内二丁目2番1
(56) References: JP-A-60-24657 (JP, A); JP-A-5-181907 (JP, A); JP-A-5-81101 (JP, A); JP-A-62-189552 (JP, A); JP-A-5-128159 (JP, A)
(58) Fields searched (Int. Cl.7, DB name): G06F 17/30 210

Claims

(57) [Claims]

1. A keyword search method in which keywords extracted from document files are stored in a keyword management file and the document files are searched based on the stored keywords, the method comprising the steps of: extracting a keyword from a document file; determining whether the number of characters of the extracted keyword exceeds a preset upper limit on the number of stored characters; storing in the keyword management file the character string of the keyword if its number of characters does not exceed the upper limit, or position information of the keyword within the document file if it exceeds the upper limit; and, when search processing is performed, searching the document files based on the stored keyword management file.

2. The keyword search method according to claim 1, further comprising the steps of: reading a record of the keyword management file; determining whether the read record contains position information; if there is no position information, matching the search target character string against the keyword read from the keyword management file; and if there is position information, opening the document file in which the character string indicated by the position information is written, extracting from it the character string indicated by the position information, and matching it against the search target character string.