JPH08320877A

JPH08320877A - Document retrieval device

Info

Publication number: JPH08320877A
Application number: JP7126631A
Authority: JP
Inventors: Masao Ito (伊藤 正雄)
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1995-05-25
Filing date: 1995-05-25
Publication date: 1996-12-03

Abstract

PURPOSE: To handle document files with a record and field structure, even when an index search is performed across plural document files, by using a sectioning result file that contains the positions and sizes of the records and fields of each document file. CONSTITUTION: A sectioning result generation part 4 sections each document file in a document file storage part 2 into records and fields, using the sectioning conditions inputted at a sectioning condition input part 3. The positions and sizes of the sectioned records and fields are recorded in a sectioning result file storage part 5, and an index generation part 6 uses the sectioning result file to generate indexes for retrieval, so that document files having this structure can be retrieved.

Description

Detailed Description of the Invention

[0001]

【Field of Industrial Application】 The present invention relates to a document retrieval device that uses a computer to retrieve required documents at high speed from a large number of document files.

[0002]

【Prior Art】 In recent years, with the spread of word processors and personal computers and the practical use of computer character recognition, the number of electronic documents created with them has grown. As a result, interest has increased in document databases that accumulate large amounts of document information and retrieve it as needed. Among document databases, full-text search technology, which can retrieve required documents without assigning keywords to them, has attracted particular attention. In this technology, the search condition given by the user is collated against all the information in the stored documents, and the documents that satisfy the condition are output. The search condition may use character strings such as sentences, in addition to words like conventional keywords. However, collating the search condition against the document information takes longer as the data grows, so an index file for searching is created and searches are run against that index file to shorten the search time. With this approach, the target document files had to follow the format specified by the index creation method.

[0003] A conventional document retrieval device is described below. FIG. 11 shows the configuration of a conventional document retrieval device. In FIG. 11, 1101 is a document registration unit, 1102 is a document file storage unit, 1103 is an index creation unit, 1104 is an index file storage unit, 1105 is a search processing unit, 1106 is a search condition input unit, 1107 is an index search unit, and 1108 is a search result display unit.

[0004] The operation of the document retrieval device configured as above is as follows. First, in the document registration unit 1101, the index creation unit 1103 creates a search index for the document files in the document file storage unit 1102 and stores it in the index file storage unit 1104. Next, in the search processing unit 1105, the index search unit 1107 searches the index file in the index file storage unit 1104 using the search condition entered at the search condition input unit 1106, and the search result display unit 1108 displays the results.

[0005]

【Problems to Be Solved by the Invention】 However, in the conventional document retrieval device described above, even when searching across multiple files was possible, a whole file was the unit of search. Searching over multiple records therefore required creating each record as a separate file, which increased the number of files. Conversely, where one file could hold multiple records, multiple files could not be handled, so the files had to be merged into one, which made that file large. In addition, indexes were conventionally created for all fields, so the index became large. Searches were also run against the entire index file, so data that did not need to be searched was searched anyway, which added to the search time. Furthermore, the document files to be searched were conventionally specified by listing every one of them, which was difficult when the number of files was very large. It was also difficult to search a document file immediately after it had been updated.

[0006] The present invention solves the above problems of the prior art. Its object is to provide a document retrieval device that can create a search index file even for multiple document files that additionally have a record and field structure; that reduces the size of the index file by creating indexes selectively from specified fields rather than from all fields; that searches only the specified document files rather than the entire index file; and that simplifies specifying the names of the document files to be searched by allowing all files under a directory, or a suffix, to be specified.

[0007]

【Means for Solving the Problems】 To achieve the above object, and to support multiple document files that additionally have a record and field structure, the present invention comprises: a document file storage unit that stores a plurality of document files; a delimiter condition input unit for entering the conditions used to divide a document file into records and fields; a delimiter result creation unit that determines the positions and sizes of the records and fields of a document file according to the delimiter conditions; a delimiter result file storage unit that stores the information produced by the delimiter result creation unit; an index creation unit that obtains the positions and sizes of the required records and fields from the delimiter result file, reads the data from the document file, and creates a search index; an index file storage unit that stores the index file produced by the index creation unit; a search condition input unit for entering search conditions; an index search unit that performs an index search using the document file and the index file; and a search result display unit that displays results for the documents hit by the index search unit.

[0008] To allow indexes to be created selectively from specified fields, a search target field input unit is provided upstream of the index search unit. To allow only specified document files to be selected and searched, the invention has a search target file name input unit for entering the names of the files to be searched, and an index position detection unit that detects, from the delimiter result file, the index position of each specified file. It further comprises a document file selection unit that allows a directory or suffix to be given when specifying document files, and a document file change determination unit that determines whether a document file has been updated.

[0009]

【Operation】 With the above configuration, the delimiter result creation unit creates, from a plurality of document files and according to the delimiter conditions, a delimiter result file that records the position and size information of the records and fields, and the index creation unit creates an index file using that delimiter result file. This makes it possible to handle, at the same time, multiple files that have a record and field structure.

[0010] In addition, by providing a search target field input unit for entering the fields to be searched, and having the index creation unit create an index only for the specified fields, there is no need to index every field, so the index file can be made smaller. By using the delimiter result file and the index position detection unit, the position within the index file of each document file actually to be searched can be determined, so the index search need only cover the specified document files. Avoiding wasted index searching reduces both the search time and the search noise (irrelevant hits). Furthermore, because the document file selection unit allows directories and suffixes to be selected, the document files to be searched can be specified easily.

[0011]

【Examples】

(Embodiment 1) An embodiment of the present invention is described below with reference to the drawings. FIG. 1 shows the configuration of a document retrieval device according to a first embodiment of the present invention. In FIG. 1, 1 is a document registration unit, 2 is a document file storage unit, 3 is a delimiter condition input unit, 4 is a delimiter result creation unit, 5 is a delimiter result file storage unit, 6 is an index creation unit, 7 is an index file storage unit, 8 is a search processing unit, 9 is a search condition input unit, 10 is an index search unit, and 11 is a search result display unit.

[0012] In the above configuration, each document file stored in the document file storage unit 2 is divided into records and fields according to the delimiter conditions entered at the delimiter condition input unit 3. The delimiter result creation unit 4 creates a delimiter result that efficiently stores the start position and size of each record and field together with the document file names, and stores it in the delimiter result file storage unit 5. Next, the index creation unit 6 takes the position and size of each field from the delimiter result, reads the actual data from the document file storage unit 2, creates an index for document search, and stores the resulting index file in the index file storage unit 7. In the search processing unit 8, the index search unit 10 reads the index file from the index file storage unit 7 and searches it according to the search condition entered at the search condition input unit 9; the results that match the condition are sent to the search result display unit 11 and displayed.
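The sectioning step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the patent does not fix the form of the delimiter conditions, so the newline record separator, the tab field separator, and the names `RecordEntry` and `segment` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RecordEntry:
    offset: int                    # byte position of the record in the document file
    size: int                      # byte size of the record (separator excluded)
    fields: List[Tuple[int, int]]  # (offset, size) of each field, in file coordinates


def segment(data: bytes, record_sep: bytes = b"\n",
            field_sep: bytes = b"\t") -> List[RecordEntry]:
    """Split a document into records and fields, keeping only positions and sizes."""
    entries = []
    pos = 0
    while pos < len(data):
        end = data.find(record_sep, pos)
        if end == -1:               # last record has no trailing separator
            end = len(data)
        fields = []
        fpos = pos
        while True:
            fend = data.find(field_sep, fpos, end)
            if fend == -1:          # last field of the record
                fields.append((fpos, end - fpos))
                break
            fields.append((fpos, fend - fpos))
            fpos = fend + len(field_sep)
        entries.append(RecordEntry(pos, end - pos, fields))
        pos = end + len(record_sep)
    return entries
```

Because only offsets and sizes are kept, the document file itself is left untouched; a field's text is recovered later with a slice such as `data[off:off + size]`.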

[0013] By providing the delimiter result creation unit 4 and the delimiter result file storage unit 5 in this way, an index can be created for, and searches run over, data that has a record and field structure and is spread across multiple document files.

[0014] FIG. 2 shows the structure of the delimiter result file, an element of the first embodiment. In FIG. 2, 201 is the delimiter result file header, 202 is the field table of the first file, 203 is the record table of the first file, 204 is the field table of the second file, 205 is the record table of the second file, 206 indicates the continuation of these, 207 is the field table of the n-th file, 208 is the record table of the n-th file, 209 is the file name management block, and 210 is the path name storage area.

[0015] The delimiter result file header 201 holds information such as the number of files contained in the delimiter result file, the maximum field size, and the maximum record byte size. The field table 202 holds the position and size of each field, per file. The record table 203 holds the position and size of each record, per file. The file name management block 209 holds, for each file name, the number of records, the position of the field table, the position of the record table, and the tree-structure information used to store the file names. The path name storage area 210 is the area in which each file name is stored.
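As a rough illustration of the header 201, the sketch below packs the three values named in the text into a fixed-size binary header. The patent does not give field widths or byte order, so the three little-endian 32-bit integers, and the names `HEADER`, `pack_header`, and `unpack_header`, are assumptions made for this example only.

```python
import struct

# Assumed layout (not specified in the source): three little-endian
# unsigned 32-bit integers.
HEADER = struct.Struct("<III")


def pack_header(file_count: int, max_field_size: int, max_record_size: int) -> bytes:
    """Serialize the header values named in paragraph [0015]."""
    return HEADER.pack(file_count, max_field_size, max_record_size)


def unpack_header(buf: bytes) -> dict:
    """Read the header back from the start of a delimiter result file."""
    file_count, max_field_size, max_record_size = HEADER.unpack_from(buf)
    return {
        "file_count": file_count,
        "max_field_size": max_field_size,
        "max_record_size": max_record_size,
    }
```

One plausible use of the two maximum sizes is to let a reader allocate a single buffer large enough for any field or record before it starts reading.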

[0016] The record tables and field tables are arranged in order from the first file to the n-th file, but only because that order is the easiest to produce when the file is written from the beginning. They may be in any order, and may even follow the path name storage area.

[0017] Structuring the delimiter result file in this way makes it easy to manage: even when a document file is updated, the delimiter result file does not have to be rebuilt from scratch, and only the record table and field table for the modified part need to be corrected.

[0018] FIG. 3 shows the structure of the field table, one component of the delimiter result file shown in FIG. 2. In FIG. 3, 301 is the field information of the first record, 302 is the field information of the second record, 303 is the field information of the third record, 304 indicates the continuation of these, and 305 is the field information of the n-th record. 306 is the position of the first field of the first record, 307 is the size of the first field of the first record, 308 is the position of the second field of the first record, 309 is the size of the second field of the first record, 310 indicates the continuation of these, 311 is the position of the n1-th field of the first record, and 312 is the size of the n1-th field of the first record.

[0019] The field table begins with the field information 301 of the first record, followed in order by the field information 302 of the second record, the field information 303 of the third record, and so on up to the field information 305 of the n-th record. Within the field information of the first record, the position 306 and size 307 of the first field come first, followed by the position 308 and size 309 of the second field, and so on up to the position 311 and size 312 of the n1-th field.

[0020] Structuring the field table in this way makes the position and size of every field available, so they can be looked up easily when creating a per-field index or when displaying a specific field as a search result.

[0021] FIG. 4 shows the structure of the record table, one component of the delimiter result file shown in FIG. 2. In FIG. 4, 401 is the byte size of the first record, 402 is the byte size of the field table of the first record, 403 is the total byte size of the first and second records, 404 is the total byte size of the field tables of the first and second records, 405 indicates the continuation of these, 406 is the total byte size of the first through n-th records, and 407 is the total byte size of the field tables of the first through n-th records.

[0022] Structuring the record table in this way makes it easy to obtain the size of each record, as well as the position of its entries in the field table and the number of fields it contains. These values are therefore readily available when creating an index over whole records or when displaying a whole record.
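The running totals in the record table can be read back by subtracting adjacent entries. The sketch below shows one way this might work; the 8-byte field entry (a 4-byte position followed by a 4-byte size) and the names `FIELD_ENTRY`, `build_tables`, `record_info`, and `fields_of` are assumptions for illustration, not details given in the patent.

```python
import struct

# Assumed field-table entry: 4-byte position followed by 4-byte size.
FIELD_ENTRY = struct.Struct("<II")


def build_tables(record_sizes, fields_per_record):
    """Build a flat field table and a record table of running totals (FIG. 3, FIG. 4)."""
    field_table = b"".join(
        FIELD_ENTRY.pack(pos, size)
        for fields in fields_per_record
        for pos, size in fields
    )
    record_table = []
    rec_total = ft_total = 0
    for size, fields in zip(record_sizes, fields_per_record):
        rec_total += size                            # cumulative record bytes
        ft_total += len(fields) * FIELD_ENTRY.size   # cumulative field-table bytes
        record_table.append((rec_total, ft_total))
    return field_table, record_table


def record_info(record_table, k):
    """Return (record size, field-table offset, field count) for 0-based record k."""
    prev_rec, prev_ft = record_table[k - 1] if k > 0 else (0, 0)
    rec_total, ft_total = record_table[k]
    return rec_total - prev_rec, prev_ft, (ft_total - prev_ft) // FIELD_ENTRY.size


def fields_of(field_table, record_table, k):
    """Return the (position, size) pairs of every field in record k."""
    _, ft_off, count = record_info(record_table, k)
    return [
        FIELD_ENTRY.unpack_from(field_table, ft_off + i * FIELD_ENTRY.size)
        for i in range(count)
    ]
```

Storing cumulative sums means that any record's size, its field count, and the start of its field entries each cost one subtraction, with no scan over the preceding records.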

[0023] FIG. 5 shows the structure of the file name management block, one component of the delimiter result file shown in FIG. 2. In FIG. 5, 501 is the management information of the first file, 502 is the management information of the second file, 503 is the management information of the third file, 504 indicates the continuation of these, and 505 is the management information of the n-th file. For the first file, 506 is the file number within the index, 507 is the length of the file name, 508 is the offset of the field table, 509 is the offset of the record table, 510 is the number of records in the file, 511 is the update start time, 512 is the "left link", 513 is the "right link", 514 is the "return link", 515 is the "parent link", 516 is the "skip byte length", 517 is the "inspection mask", 518 is the offset to the "skip string", and 519 is the offset to the "remaining string".

[0024] Items 506 to 519 show the management information of the first file in detail; the same items 506 to 519 follow for each of the second through n-th files. The tree structure in items 512 to 519 adapts the "Patricia tree", a method for managing character strings, to file name management, and the first through n-th files are the nodes of the tree. The tree is a binary tree, which is represented by items 512 to 515. A Patricia tree expresses the position to compare in bits, but because 8-bit units are easier to process, that position is replaced by a "skip byte length", which gives the number of bytes to skip before starting the comparison, and an "inspection mask", which gives the bit to compare after skipping. The offset 518 to the "skip string" and the offset 519 to the "remaining string" are offsets into the path name storage area 210 of FIG. 2, and the file name of each node can be reconstructed from these two offsets. Managing file names with this tree structure also makes it possible to sort file names alphabetically and to check quickly whether a specified document file name is included.

[0025] In this embodiment, an adaptation of the "Patricia tree" is used to store the file names in the delimiter result file of the delimiter result file storage unit 5, but it goes without saying that the file names themselves may be stored instead.

[0026] (Embodiment 2) Next, a second embodiment of the present invention is described with reference to the drawings. FIG. 6 shows the configuration of a document retrieval device according to the second embodiment of the present invention. In FIG. 6, 601 is a document registration unit, 602 is a document file storage unit, 603 is a delimiter condition input unit, 604 is a delimiter result creation unit, 605 is a delimiter result file storage unit, 606 is a search target field input unit, 607 is an index creation unit, 608 is an index file storage unit, 609 is a search processing unit, 610 is a search condition input unit, 611 is an index search unit, and 612 is a search result display unit. It differs from the configuration of FIG. 1 in that the search target field input unit 606 is added.

[0027] The operation of the document retrieval device configured as above is described below. The basic operation is the same as in the first embodiment. The difference is that, with the search target field input unit 606 provided, the index creation unit 607 is changed to create an index only for the specified fields. Because an index is created only for the specified fields rather than for all fields, the index can be made smaller.
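A minimal sketch of field-selective indexing follows. The patent does not describe the internal structure of the index, so the character-bigram inverted index used here, and the names `build_index`, `field_table`, and `target_fields`, are assumptions chosen only to make the size effect visible.

```python
from collections import defaultdict


def build_index(data: bytes, field_table, target_fields):
    """Index only the chosen fields.

    field_table   : per record, a list of (offset, size) pairs, one per field
    target_fields : set of 0-based field numbers to index
    Returns a mapping from character bigram to a set of (record_no, field_no).
    """
    index = defaultdict(set)
    for rec_no, fields in enumerate(field_table):
        for field_no, (off, size) in enumerate(fields):
            if field_no not in target_fields:
                continue  # unselected fields add nothing to the index
            text = data[off:off + size].decode("utf-8")
            for i in range(len(text) - 1):
                index[text[i:i + 2]].add((rec_no, field_no))
    return index
```

Indexing only the title field, for example, leaves out every bigram that occurs solely in the other fields, which is the index-size saving this embodiment describes.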

[0028] In this embodiment, the index creation unit 607 creates an index for a specified field, but it may instead create indexes for several fields, for all fields, or for whole records.

[0029] (Embodiment 3) Next, a third embodiment of the present invention is described with reference to the drawings. FIG. 7 shows the configuration of a document retrieval device according to the third embodiment of the present invention. In FIG. 7, 701 is a document registration unit, 702 is a document file storage unit, 703 is a delimiter condition input unit, 704 is a delimiter result creation unit, 705 is a delimiter result file storage unit, 706 is an index creation unit, 707 is an index file storage unit, 708 is a search processing unit, 709 is a search target file name input unit, 710 is an index position detection unit, 711 is a search condition input unit, 712 is an index search unit, and 713 is a search result display unit. It differs from the configuration of FIG. 1 in that the search target file name input unit 709 and the index position detection unit 710 are added.

[0030] The operation of the document retrieval device configured as above is described below. The basic operation is the same as in the first embodiment. The difference is that, whereas the first embodiment searches all of the files stored in the document file storage unit 702, this embodiment performs the index search only on the files specified at the search target file name input unit 709. The index position detection unit 710 detects where in the delimiter result file of the delimiter result file storage unit 705 the file name specified at the search target file name input unit 709 is located. This position information is passed to the index search unit 712, which starts the search from the specified position and passes the results to the search result display unit 713 for display.
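The effect of the index position detection unit can be sketched as a range filter over index postings. The layout assumed here, in which each file owns a contiguous half-open range of global record numbers and postings are lists of such numbers, is hypothetical, as are the names `detect_index_positions` and `search`.

```python
from typing import Dict, List, Tuple


def detect_index_positions(file_table: Dict[str, Tuple[int, int]],
                           targets: List[str]) -> List[Tuple[int, int]]:
    """Return the sorted (start, end) record-number ranges of the requested files.

    file_table maps a file name to the half-open range of global record
    numbers that the file occupies in the index (an assumed layout).
    Names that are not registered are ignored.
    """
    return sorted(file_table[name] for name in targets if name in file_table)


def search(postings: Dict[str, List[int]], term: str,
           ranges: List[Tuple[int, int]]) -> List[int]:
    """Return only the hits for term that fall inside the selected ranges."""
    hits = postings.get(term, [])
    return [r for r in hits if any(start <= r < end for start, end in ranges)]
```

Restricting the search to the detected ranges means that hits from files the user did not ask for are never returned, which corresponds to the reduction in wasted searching and irrelevant hits described in this embodiment.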

[0031] In this embodiment, the search target file name input unit 709 accepts a file to be searched; multiple files may also be specified, as may a file name extension (suffix) or all of the files contained in the delimiter result file.

[0032] (Embodiment 4) Next, a fourth embodiment of the present invention is described with reference to the drawings. FIG. 8 shows the configuration of a document retrieval device according to the fourth embodiment of the present invention. In FIG. 8, 801 is a document registration unit, 802 is a document file storage unit, 803 is a delimiter condition input unit, 804 is a delimiter result creation unit, 805 is a delimiter result/index file storage unit, 806 is an index creation unit, 807 is a search processing unit, 808 is a search condition input unit, 809 is an index search unit, and 810 is a search result display unit. This embodiment differs in that the delimiter result file storage unit 5 and the index file storage unit 7 of FIG. 1 are combined into the delimiter result/index file storage unit 805.

[0033] The operation of the document retrieval device configured as above is described below. The basic operation is the same as in the first embodiment. The difference is that the delimiter result created by the delimiter result creation unit 804 and the document search index created by the index creation unit 806 are both stored in the delimiter result/index file storage unit 805. The index information is stored after the path name storage area of the delimiter result file. This reduces the number of files and so simplifies management.

[0034] (Embodiment 5) Next, a fifth embodiment of the present invention is described with reference to the drawings. FIG. 9 shows the configuration of a document retrieval device according to the fifth embodiment of the present invention. In FIG. 9, 901 is a document registration unit, 902 is a document file storage unit, 903 is a delimiter condition input unit, 904 is a document file selection unit, 905 is a delimiter result creation unit, 906 is a delimiter result file storage unit, 907 is an index creation unit, 908 is an index file storage unit, 909 is a search processing unit, 910 is a search condition input unit, 911 is an index search unit, and 912 is a search result display unit. It differs from the configuration of FIG. 1 in that the document file selection unit 904 is added.

[0035] The operation of the document search device configured as described above is explained below. The basic operation is the same as in the first embodiment. The difference is that, when the document file selection unit 904 selects document files in the document file storage unit 902, it searches for the files under a directory, treats all of them as document files, and passes the path information of the selected files to the delimiter result creation unit 905. This simplifies the specification of document files. It is also possible to traverse directories recursively, find all the files, and treat them as the document files to be searched. Furthermore, when document files and other files are mixed under a directory and the file extension (suffix) differs between the two, document files can be specified by searching only for the document file extension, which is simpler than specifying every document file name.
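The three selection modes in this paragraph (all files under a directory, recursive traversal, and filtering by extension) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `select_document_files` and its parameters are hypothetical, and the returned paths stand in for the path information passed to the delimiter result creation unit 905.

```python
from pathlib import Path
from typing import Iterable, List, Optional


def select_document_files(
    directory: str,
    recursive: bool = False,
    suffixes: Optional[Iterable[str]] = None,
) -> List[Path]:
    """Return paths of files under `directory` that are treated as document files."""
    root = Path(directory)
    # rglob("*") walks subdirectories recursively; glob("*") stays at the top level.
    candidates = root.rglob("*") if recursive else root.glob("*")
    wanted = {s.lower() for s in suffixes} if suffixes is not None else None
    selected = []
    for path in candidates:
        if not path.is_file():
            continue  # skip directories and other non-file entries
        if wanted is not None and path.suffix.lower() not in wanted:
            continue  # extension filter: keep only the document file suffixes
        selected.append(path)
    return sorted(selected)
```

For example, `select_document_files("docs", recursive=True, suffixes=[".txt"])` would return only the `.txt` files anywhere below a hypothetical `docs` directory.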

[0036] In this embodiment, the document file selection unit 904 selects document files according to a certain criterion. This criterion may also be time-based: for example, selecting document files whose creation or update date and time is newer than a given time, older than a given time, or within a given range.
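Time-based selection can be sketched in the same spirit. The example below assumes the file system's last-modified time is used as the update date and time; the name `select_by_mtime` and both bound parameters are hypothetical. Passing only one bound gives the "newer than" or "older than" case, and passing both gives the range case.

```python
import os
from datetime import datetime
from typing import Iterable, List, Optional


def select_by_mtime(
    paths: Iterable[str],
    newer_than: Optional[datetime] = None,
    older_than: Optional[datetime] = None,
) -> List[str]:
    """Keep files whose last-modified time is after `newer_than` and before `older_than`."""
    selected = []
    for path in paths:
        modified = datetime.fromtimestamp(os.path.getmtime(path))
        if newer_than is not None and modified <= newer_than:
            continue  # not newer than the lower bound
        if older_than is not None and modified >= older_than:
            continue  # not older than the upper bound
        selected.append(path)
    return selected
```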

[0037] (Sixth Embodiment) Next, a sixth embodiment of the present invention will be described. The configuration is the same as that of the fifth embodiment. The operation differs in the order in which the document file selection unit 904 passes document file names to the delimiter result creation unit 905. The files are sorted, for example, in alphabetical order of file name, in order of file creation date, or in order of file extension, and the document file paths are passed to the delimiter result creation unit 905 in that order. This makes it possible to control the order of the search results.

[0038] In this embodiment, the document file selection unit 904 sorts the document files according to a certain criterion. The files may also be sorted by file size, by file update date, or by access date. Needless to say, the sort order may be either ascending or descending.
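The ordering step of the sixth embodiment can be sketched as a table of sort keys. The key names and the function `order_document_files` are hypothetical. Creation date is left out of this sketch on purpose, because how a file's creation time is exposed depends on the platform; update time and access time are read with `os.path.getmtime` and `os.path.getatime`.

```python
import os
from typing import Callable, Dict, Iterable, List

# Each entry maps a path to a sortable value; the key names are illustrative.
SORT_KEYS: Dict[str, Callable[[str], object]] = {
    "name": lambda p: os.path.basename(p).lower(),
    "extension": lambda p: os.path.splitext(p)[1].lower(),
    "size": os.path.getsize,
    "modified": os.path.getmtime,
    "accessed": os.path.getatime,
}


def order_document_files(
    paths: Iterable[str], by: str = "name", descending: bool = False
) -> List[str]:
    """Return the paths in the order they would be handed to the next stage."""
    return sorted(paths, key=SORT_KEYS[by], reverse=descending)
```

Because the files are registered in this order, the same order carries through to the search results, which is the effect the embodiment describes.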

[0039] (Seventh Embodiment) Next, a seventh embodiment of the present invention will be described with reference to the drawings. FIG. 10 is a diagram showing the configuration of the document search device according to the seventh embodiment of the present invention. In FIG. 10, 1001 is a document registration unit, 1002 is a document file storage unit, 1003 is a delimiter condition input unit, 1004 is a document file change determination unit, 1005 is a delimiter result creation unit, 1006 is a delimiter result file storage unit, 1007 is an index creation unit, 1008 is an index file storage unit, 1009 is a search processing unit, 1010 is a search condition input unit, 1011 is an index search unit, and 1012 is a search result display unit. The difference from the configuration of FIG. 1 is the addition of the document file change determination unit 1004.

[0040] The operation of the document search device configured as described above is explained below. The basic operation is the same as in the first embodiment. The difference is that the document file change determination unit 1004 determines whether a document file has been changed and, if it has, the delimiter result file and the index file are created automatically. This keeps the document files and the index consistent.
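One simple way to realize such a change check is to compare modification times. The sketch below assumes that a document file counts as changed when its last-modified time is later than that of the index file; the patent does not state how the change is detected, and the function name `files_changed_since_index` is hypothetical.

```python
import os
from typing import Iterable, List


def files_changed_since_index(document_paths: Iterable[str], index_path: str) -> List[str]:
    """Return the document files modified after the index file was last written."""
    paths = list(document_paths)
    if not os.path.exists(index_path):
        # No index yet: every document file still has to be processed.
        return paths
    index_time = os.path.getmtime(index_path)
    return [p for p in paths if os.path.getmtime(p) > index_time]
```

A non-empty result would trigger re-creation of the delimiter result file and the index file.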

【００４１】[0041]

[Effects of the Invention] As described above, when a document search is performed over a plurality of files, the present invention uses a delimiter result file, so that a document search index can be created even for files that contain multiple records and multiple fields, and high-speed search using the index becomes possible. Since a search index needs to be created only for the designated fields, the size of the index can be reduced. At index search time, instead of searching all the files included in the index, the position of a designated file can be read from the delimiter result file and only that part searched, so unnecessary files are not searched and the search time is shortened. When document files are specified, all files under a directory can be designated, directories can be traversed recursively, and a file extension can be specified, so it is not necessary to enter the names of all the files to be searched. By providing a document file selection unit for the input to the delimiter result file, the order can be specified as alphabetical order, file creation date order, file size order, or file extension order. Finally, by providing a document file change determination unit, the delimiter result file and the index file are created automatically when a document file is changed, so searches always use the latest index file and modified document data can be searched immediately. An excellent document search device can thus be realized.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of a document search device according to a first embodiment of the present invention.

FIG. 2 is a schematic diagram showing the structure of a delimiter result file in the first embodiment of the present invention.

FIG. 3 is a schematic diagram showing the structure of a field table in the first embodiment of the present invention.

FIG. 4 is a schematic diagram showing the structure of a record table in the first embodiment of the present invention.

FIG. 5 is a schematic diagram showing the structure of a file name management block in the first embodiment of the present invention.

FIG. 6 is a block diagram showing the configuration of a document search device according to a second embodiment of the present invention.

FIG. 7 is a block diagram showing the configuration of a document search device according to a third embodiment of the present invention.

FIG. 8 is a block diagram showing the configuration of a document search device according to a fourth embodiment of the present invention.

FIG. 9 is a block diagram showing the configuration of a document search device according to fifth and sixth embodiments of the present invention.

FIG. 10 is a block diagram showing the configuration of a document search device according to a seventh embodiment of the present invention.

FIG. 11 is a block diagram showing the configuration of a conventional document search device.

[Explanation of symbols]

1 document registration unit; 2 document file storage unit; 3 delimiter condition input unit; 4 delimiter result creation unit; 5 delimiter result file storage unit; 6 index creation unit; 7 index file storage unit; 8 search processing unit; 9 search condition input unit; 10 index search unit; 11 search result display unit
601 document registration unit; 602 document file storage unit; 603 delimiter condition input unit; 604 delimiter result creation unit; 605 delimiter result file storage unit; 606 search target field input unit; 607 index creation unit; 608 index file storage unit; 609 search processing unit; 610 search condition input unit; 611 index search unit; 612 search result display unit
701 document registration unit; 702 document file storage unit; 703 delimiter condition input unit; 704 delimiter result creation unit; 705 delimiter result file storage unit; 706 index creation unit; 707 index file storage unit; 708 search processing unit; 709 search target file input unit; 710 index position detection unit; 711 search condition input unit; 712 index search unit; 713 search result display unit
801 document registration unit; 802 document file storage unit; 803 delimiter condition input unit; 804 delimiter result creation unit; 805 delimiter result / index file storage unit; 806 index creation unit; 807 search processing unit; 808 search condition input unit; 809 index search unit; 810 search result display unit
901 document registration unit; 902 document file storage unit; 903 delimiter condition input unit; 904 document file selection unit; 905 delimiter result creation unit; 906 delimiter result file storage unit; 907 index creation unit; 908 index file storage unit; 909 search processing unit; 910 search condition input unit; 911 index search unit; 912 search result display unit
1001 document registration unit; 1002 document file storage unit; 1003 delimiter condition input unit; 1004 document file change determination unit; 1005 delimiter result creation unit; 1006 delimiter result file storage unit; 1007 index creation unit; 1008 index file storage unit; 1009 search processing unit; 1010 search condition input unit; 1011 index search unit; 1012 search result display unit
1101 document registration unit; 1102 document file storage unit; 1103 index creation unit; 1104 index file storage unit; 1105 search processing unit; 1106 search condition input unit; 1107 index search unit; 1108 index result display unit

Claims

[Claims]

1. A document search device comprising: a document file storage unit for storing a plurality of document files; a delimiter condition input unit for inputting conditions for dividing the document files into records and fields; a delimiter result creation unit for obtaining the positions and sizes of the records and fields of the document files according to the delimiter conditions; a delimiter result file storage unit for storing the information created by the delimiter result creation unit; an index creation unit that obtains the positions and sizes of the required records and fields from the delimiter result file, obtains the corresponding data from the document files, and creates an index for search; an index file storage unit for storing the index file created by the index creation unit; a search condition input unit for inputting search conditions; an index search unit for performing an index search using the document file and the index file; and a search result display unit for displaying results for the documents hit by the index search unit.

2. The document search device according to claim 1, wherein a search target field input unit for inputting the fields to be searched is provided before the index creation unit, and the index creation unit creates an index only for the designated fields, thereby reducing the size of the index.

3. The document search device according to claim 1, wherein a search target file input unit for inputting a search target file and an index position detection unit are provided before the index search unit, and the index position detection unit detects the position in the index file at which the search target file is located and the index is searched from that position, thereby shortening the search time.

4. The document search device according to claim 1, wherein a delimiter result / index file storage unit for storing the delimiter result file and the index file in one file is provided, thereby facilitating management of the index file.

5. The document search device according to claim 1, wherein a document file selection unit is provided that selects the document files in the document file storage unit by designating all files under a directory, by designating all files found by recursively traversing the directories below a directory, by designating only files with a given file extension (suffix), or by designating document files according to their creation or update date and time, so that the document files to be searched can be specified easily.

6. The document search device according to claim 5, wherein the order in which the document file selection unit stores the delimiter result information of the document files in the delimiter designation file can be designated as alphabetical order of file names, order of file creation date, order of file size, or order of file extension.

7. The document search device as described, wherein a document file change determination unit is provided for determining whether a document file has been changed after the time at which the delimiter result file and the index file were created, and, when there is a change, the delimiter result file and the index file are created automatically, so that the delimiter result file and the index file are always kept up to date.