JPH08153112A

JPH08153112A - Device and method for document preparation

Info

Publication number: JPH08153112A
Application number: JP6292714A
Authority: JP
Inventors: Yasuo Tanosaki; 康雄田野崎; Isamu Iwai; 勇岩井
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1994-11-28
Filing date: 1994-11-28
Publication date: 1996-06-11

Abstract

PURPOSE: To provide a document creating apparatus and method that can accurately retrieve the documents a user needs without using any special thesaurus or synonym dictionary. CONSTITUTION: A control unit 4 performs a full-text search on a keyword that the user specifies through an input device 1 at search time, then searches recursively using words or phrases contained in the retrieved documents as secondary keywords. If a document obtained indirectly through this search does not contain the keyword specified by the user, the control unit 4 displays on a display device 2, in order and in association with the search result, the words or phrases that were adopted as keywords in the intermediate documents. The control unit 4 also selects the secondary keywords from the documents.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a document creating apparatus and a document creating method that rapidly retrieve documents containing a designated keyword as well as documents containing words or phrases related to the designated keyword.

[0002]

[Prior Art] Conventional document retrieval systems of this kind use one of two methods: one in which keywords are attached to documents by hand in advance and consulted at search time, and one in which no keywords are attached and a full-text search is performed.

[0003] The former method uses as attached keywords words that do not appear in the document itself, that is, words related to words in the document or words that summarize its content. Efficient retrieval was therefore possible provided that suitable keywords had been assigned in advance. Assigning such keywords, however, takes a great deal of labor, and the task is extremely difficult when a large volume of data is updated every week, as with patent documents.

[0004] The latter method addresses this situation: even without keywords attached in advance, any document containing the keyword character string entered at search time can be retrieved at high speed. With this method, however, only documents whose text matches the entered keyword string exactly are retrieved, so a document is not found even if it contains words related to the input keyword.

[0005] To solve this problem, a widely used approach is to expand the input keyword using a thesaurus or synonym dictionary. By its nature, however, a thesaurus or synonym dictionary can describe only knowledge that has already been systematized, which makes this approach unsuitable for tasks such as literature search in advanced scientific fields, where new terms are coined daily. Users sometimes maintain a thesaurus or synonym dictionary to suit documents in such fields, but this generally takes a great deal of labor.

[0006]

[Problems to Be Solved by the Invention] As described above, a full-text search system has the advantage that keywords need not be attached in advance, but it is not sufficient for accurately retrieving the documents a user needs. A thesaurus or synonym dictionary can be used as a supplement, but to handle documents in new fields it must be updated and maintained by hand, which has been a major practical obstacle.

[0007] The present invention has been made in view of the above circumstances. Its object is to provide a document creating apparatus and a document creating method that can accurately retrieve the documents a user needs without using a special thesaurus or synonym dictionary.

[0008]

[Means for Solving the Problems] To achieve the above object, the present invention comprises: recursive search execution means for executing a search recursively, using as secondary keywords the words or phrases contained in documents obtained by a full-text search on the keyword specified by the user at search time; keyword path display means for displaying in order, in association with the search result, the words or phrases adopted as keywords in the intermediate documents when a document obtained indirectly by the search means does not contain the keyword specified by the user; and a secondary keyword selection scheme for selecting the secondary keywords from a document.

[0009]

[Operation] The above configuration exploits the general property of text that words related to a given word tend to appear near it. Applying the secondary keyword selection scheme to a document that contains the keyword specified by the user at search time therefore selects words related to that keyword. Searching with the selected words through the recursive search execution means retrieves documents related to the keyword even if they do not contain it directly. In addition, when a retrieved document does not contain the keyword specified by the user, the keyword path display means makes explicit why that document was retrieved.

[0010]

[Embodiment] In outline, the present invention comprises: recursive search execution means for executing a search recursively, using as secondary keywords the words or phrases contained in documents obtained by a full-text search on the keyword specified by the user at search time; keyword path display means for displaying in order, in association with the search result, the words or phrases adopted as keywords in the intermediate documents when a document obtained indirectly by the search means does not contain the keyword specified by the user; and a secondary keyword selection scheme for selecting the secondary keywords from a document. This configuration exploits the general property of text that words related to a given word tend to appear near it. Applying the secondary keyword selection scheme to a document that contains the keyword specified by the user at search time therefore selects words related to that keyword. Searching with the selected words through the recursive search execution means retrieves documents related to the keyword even if they do not contain it directly. In addition, when a retrieved document does not contain the keyword specified by the user, the keyword path display means makes explicit why that document was retrieved.

[0011] An embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram showing the hardware configuration of an apparatus according to an embodiment of the present invention.

[0012] In FIG. 1, reference numerals 1 to 4 denote the components of the apparatus according to the embodiment: 1 is an input device, 2 is a display device, 3 is an external storage device, and 4 is a control device.

[0013] The input device 1 inputs character codes, control commands, position information, and the like, and consists of, for example, a keyboard, a mouse, and a unit that controls them. The display device 2 displays messages prompting the user for input, entered character strings, document data selected after a search, and the like, and consists of, for example, a VRAM and a display that shows the bit information stored in the VRAM as rows of dots.

[0014] The external storage device 3 consists of, for example, a hard disk device and stores a plurality of document data items, as shown in FIG. 2. ID numbers serving as document identifiers are assigned in the order in which the documents are stored. Each document data item contains: a display flag storage area indicating whether the document is displayed when it is obtained as a search result; an auxiliary information storage area holding auxiliary document information such as the title, author name, and creation date and time; a text storage area holding the text data; a non-text data storage area holding chart data, image data, and the like; and a search index storage area. The search index records, for each character code, the positions at which that character appears in the text data. Consulting it at search time makes it possible to determine whether any given character string exists in the text faster than by reading the text data directly.
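The document record and its per-character search index can be sketched as follows. This is a minimal illustration, not the format used in the embodiment: the field names (`display`, `aux`, `non_text`, `index`) and the set-based lookup in `contains` are assumptions, since the specification states only that the index maps each character code to its positions in the text.

```python
from collections import defaultdict
from dataclasses import dataclass, field


def build_char_index(text):
    """Map each character to the list of positions where it occurs."""
    index = defaultdict(list)
    for pos, ch in enumerate(text):
        index[ch].append(pos)
    return dict(index)


def contains(index, keyword):
    """Return True if `keyword` occurs in the text the index was built from."""
    if not keyword:
        return False
    # One position set per keyword character (assumed lookup strategy).
    pos_sets = [set(index.get(ch, ())) for ch in keyword]
    # A match starts at `start` when the i-th character sits at start + i.
    return any(
        all(start + i in pos_sets[i] for i in range(1, len(keyword)))
        for start in pos_sets[0]
    )


@dataclass
class Document:
    doc_id: int                                    # ID number, assigned in storage order
    text: str                                      # text storage area
    display: bool = True                           # display flag
    aux: dict = field(default_factory=dict)        # title, author, creation date
    non_text: list = field(default_factory=list)   # charts, images
    index: dict = field(init=False)                # search index storage area

    def __post_init__(self):
        self.index = build_char_index(self.text)


doc = Document(1, "Yangon is the capital of Myanmar.", aux={"title": "Sample"})
print(contains(doc.index, "Myanmar"))
```

The lookup touches only the positions of the keyword's characters, which is the reason the index can answer faster than a scan of the full text.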

[0015] For ordinary documents the display flag is "true". For documents created specially to make the secondary search more efficient, for example documents that describe the definitions of terms, the flag is set to "false", and in that case the auxiliary information storage area and the non-text data storage area are empty.

[0016] The control device 4 is built around, for example, a central processing unit (CPU) and is connected to each hardware device by a bus. It controls each device and handles processing such as the transfer of data between devices.

[0017] The memory 5 consists of, for example, dynamic RAM. As shown in FIG. 3, it comprises a program section 4a, which stores the programs with which the control device 4 executes its various control and processing operations, and a data buffer section 4b, which stores the data needed during processing.

[0018] The program section 4a is further divided into a main processing unit 4d, an initialization unit 4e, a keyword input unit 4f, a keyword search unit 4g, a secondary keyword selection unit 4k, a candidate document list display unit 4h, a document selection unit 4i, a document display unit 4j, and so on.

[0019] The data buffer section 4b consists of a primary keyword storage buffer 4s, a secondary keyword storage buffer 4x, a primary search result storage buffer 4t, a display document count storage buffer 4u, a secondary search result storage buffer 4v, a work buffer 4w, and so on.

[0020] The functions of each part of the program section 4a and the data buffer section 4b are described below. The main processing unit 4d controls the processing of the entire apparatus. It executes the primary and secondary searches by branching within the program and calling the modules from the initialization unit 4e onward.

[0021] The initialization unit 4e performs the initial setup of each hardware device and initializes the contents of the buffers that make up the data buffer section 4b. The keyword input unit 4f has the user enter, through the keyboard of the input device 1, the character string that serves as the search keyword, and stores it in the primary keyword storage buffer 4s.

[0022] The keyword search unit 4g refers to the search index of the designated document data in the external storage device 3, searches for the keyword, and returns the result to the main processing unit 4d.

[0023] The secondary keyword selection unit 4k selects a word or phrase suitable as a secondary keyword from a document that was found by the search and contains the designated keyword, and returns the result to the main processing unit 4d.

[0024] The candidate document list display unit 4h lists on the display screen of the display device 2 the titles of the documents whose ID numbers are stored in the primary search result storage buffer 4t and the secondary search result storage buffer 4v.

[0025] The document selection unit 4i has the user select one of the titles already listed by the candidate document list display unit 4h. The document display unit 4j reads from the external storage device 3 the text data of the document corresponding to the title selected by the document selection unit 4i and displays the text on the display screen. If the displayed document was obtained by the secondary search, the keywords used to find it are displayed along with it.

[0026] The primary search result storage buffer 4t is an array variable whose elements are structures. As shown in FIG. 4(a), each element stores the ID number of a document found by the search and its display flag information.

[0027] The secondary search result storage buffer 4v is likewise an array variable whose elements are structures. As shown in FIG. 4(b), each element stores the ID number of a document found by the search and its display flag information, together with the primary and secondary keyword information used to retrieve that document.
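The two result buffers described above can be pictured as lists of small records. The sketch below is illustrative only: the class and field names are assumptions, and the sample values are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimaryHit:
    """One element of the primary search result buffer (FIG. 4(a))."""
    doc_id: int
    display: bool


@dataclass(frozen=True)
class SecondaryHit:
    """One element of the secondary search result buffer (FIG. 4(b))."""
    doc_id: int
    display: bool
    primary_keyword: str    # keyword entered by the user
    secondary_keyword: str  # keyword selected from an intermediate document


# Hypothetical contents after a search for "Yangon".
primary_results = [PrimaryHit(doc_id=3, display=True)]
secondary_results = [SecondaryHit(7, True, "Yangon", "Myanmar")]
```

Keeping both keywords in each secondary entry is what later allows the display step to show why a document that lacks the user's keyword was retrieved.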

[0028] The primary keyword storage buffer 4s and the secondary keyword storage buffer 4x are buffers for storing character strings, and the work buffer 4w can hold the various variables needed by each process.

[0029] The specific processing flow of the apparatus configured as above is described below, mainly with reference to FIG. 5. Document data is assumed to be stored in the external storage device 3 in advance.

[0030] The main processing unit 4d controls the overall processing. It first starts the initialization unit 4e (step 6a). The initialization unit 4e initializes the primary keyword storage buffer 4s, the secondary keyword storage buffer 4x, the primary search result storage buffer 4t, the display document count storage buffer 4u, the secondary search result storage buffer 4v, the work buffer 4w, and the other buffers of the data buffer section 4b, and performs the initial setup of the input device 1 and the display device 2.

[0031] Next, the main processing unit 4d starts the keyword input unit 4f (step 6b). The keyword input unit 4f has the user enter a keyword through the keyboard of the input device 1 and stores it in the primary keyword storage buffer 4s.

[0032] The main processing unit 4d then executes the primary search, which begins at step 6c. Here it runs the keyword search unit 4g on all the document data stored in the external storage device 3 (step 6c).

[0033] The keyword search unit 4g refers to the search index in the designated document data, determines whether the keyword character string stored in the primary keyword storage buffer 4s exists in the corresponding text data (step 6d), and returns the result to the main processing unit 4d.

[0034] If the keyword search succeeds, the main processing unit 4d stores the ID number of the matching document and the display flag held in its document data, in order, in the designated area of the primary search result storage buffer 4t (step 6e).

[0035] At this point, if the display flag is "true", the value of the display document count storage buffer 4u is incremented (step 6f). Next, the main processing unit 4d checks the value of the display document count storage buffer 4u (step 6g). If it is smaller than a fixed number (set to 16 in the apparatus of this embodiment), control moves to step 6h and the secondary search begins. If it is equal to or greater than that number, the secondary search is skipped and control moves to step 6o.
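Steps 6c to 6g can be sketched as follows. This is a minimal illustration under stated assumptions: documents are plain dictionaries with hypothetical keys `id`, `text`, and `display`, and Python's substring test stands in for the index lookup performed by the keyword search unit 4g. Only the threshold of 16 comes from the embodiment.

```python
SECONDARY_SEARCH_THRESHOLD = 16  # fixed number used in the embodiment


def primary_search(documents, keyword):
    """Return (hits, displayable_count) for a full-text search on `keyword`."""
    hits = []
    displayable = 0
    for doc in documents:                             # step 6c: every stored document
        if keyword in doc["text"]:                    # step 6d: stand-in for the index lookup
            hits.append((doc["id"], doc["display"]))  # step 6e
            if doc["display"]:
                displayable += 1                      # step 6f
    return hits, displayable


def needs_secondary_search(displayable, threshold=SECONDARY_SEARCH_THRESHOLD):
    """Step 6g: run the secondary search only when few documents can be shown."""
    return displayable < threshold


docs = [
    {"id": 1, "text": "Yangon is a port city.", "display": True},
    {"id": 2, "text": "Definition: Yangon, a city.", "display": False},
    {"id": 3, "text": "Nothing relevant here.", "display": True},
]
hits, count = primary_search(docs, "Yangon")
print(hits, count, needs_secondary_search(count))
```

Documents whose display flag is false are still recorded as hits, so they can feed the secondary search, but they do not count toward the number of documents to display.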

[0036] In step 6h, the secondary keyword selection unit 4k is first run on every document whose ID number is stored in the primary search result storage buffer 4t (step 6h).

[0037] The processing flow of the secondary keyword selection unit 4k is described with reference to FIG. 6. The secondary keyword selection unit 4k first reads the text data corresponding to the document ID given by the main processing unit 4d (step 7a) and extracts from it one sentence that contains the keyword character string stored in the primary keyword storage buffer 4s (step 7b).

[0038] If more than one sentence qualifies, the first one to appear is used. Morphological analysis is then performed on the extracted sentence to segment it into words and determine the part of speech of each word, and the result is stored in a buffer (step 7c). FIG. 7 shows the data stored in the buffer at this point alongside the original sentence.
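Step 7b, including the rule that the first qualifying sentence wins, can be sketched as follows. The sentence splitter is an assumption: it simply cuts at the Japanese full stop and at `.`, `!`, and `?`, so abbreviations such as "e.g." would be split incorrectly. Morphological analysis (step 7c) is outside the scope of this sketch.

```python
import re

# A sentence is a run of non-terminator characters plus an optional terminator.
SENTENCE = re.compile(r"[^。.!?]+[。.!?]?")


def first_sentence_containing(text, keyword):
    """Return the first sentence of `text` that contains `keyword`, or None."""
    for match in SENTENCE.finditer(text):
        sentence = match.group().strip()
        if keyword in sentence:
            return sentence
    return None


text = ("Rice is a major export. Yangon is the largest city in Myanmar. "
        "Yangon has a port.")
print(first_sentence_containing(text, "Yangon"))
```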

[0039] Next, referring to the data stored in the buffer, the unit extracts the noun closest to the word that corresponds to the keyword character string stored in the primary keyword storage buffer 4s (step 7d) and returns it to the main processing unit 4d as a keyword character string (step 7e).

[0040] For example, if the character string stored in the primary keyword storage buffer 4s is "ヤンゴン" (Yangon), the noun "ミャンマー" (Myanmar) in FIG. 7 is extracted.
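The nearest-noun rule of step 7d can be sketched as follows. Several things here are assumptions: the tokens are given as `(surface, part_of_speech)` pairs as a morphological analyzer might produce them, distance is counted in tokens, and ties go to the earlier token. The sample sentence is hypothetical and is not the one shown in FIG. 7.

```python
def nearest_noun(tokens, keyword):
    """Return the noun closest to `keyword` in a tokenized sentence, or None.

    `tokens` is a list of (surface, part_of_speech) pairs.
    """
    # Locate the token that matches the primary keyword.
    k = next((i for i, (surface, _) in enumerate(tokens) if surface == keyword), None)
    if k is None:
        return None
    # Rank the other nouns by token distance; ties go to the earlier token.
    candidates = [
        (abs(i - k), i, surface)
        for i, (surface, pos) in enumerate(tokens)
        if pos == "noun" and i != k
    ]
    return min(candidates)[2] if candidates else None


# Hypothetical analysis of "Myanmar's Yangon hosted the meeting".
tokens = [
    ("Myanmar", "noun"), ("'s", "particle"), ("Yangon", "noun"),
    ("hosted", "verb"), ("the", "determiner"), ("meeting", "noun"),
]
print(nearest_noun(tokens, "Yangon"))
```

In this sample, "Myanmar" lies two tokens from the keyword and "meeting" three, so "Myanmar" is selected, mirroring the Yangon example in the text.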

[0041] Next, the main processing unit 4d stores the keyword character string obtained from the secondary keyword selection unit 4k in the secondary keyword storage buffer 4x (step 6i), then runs the keyword search unit 4g on all the document data stored in the external storage device 3 except the documents already obtained by the primary and secondary searches (step 6j).

[0042] As in step 6c, the keyword search unit 4g refers to the search index in the designated document data, determines whether the keyword character string stored in the secondary keyword storage buffer 4x exists in the corresponding text data, and returns the result to the main processing unit 4d.

[0043] If the keyword search succeeds (step 6l), the main processing unit 4d stores the ID number of the matching document, the display flag held in its document data, and the contents of the primary keyword storage buffer 4s and the secondary keyword storage buffer 4x, in order, in the designated area of the secondary search result storage buffer 4v (step 6m).

[0044] If the display flag is "true", the value of the display document count storage buffer 4u is incremented (step 6n). The candidate document list display unit 4h then starts (step 6o).
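The secondary search loop of steps 6h to 6n can be sketched as follows. The data layout is an assumption (a dictionary from document ID to a record with hypothetical keys `text` and `display`), the substring test stands in for the index lookup, and `select_secondary` is a placeholder for the secondary keyword selection unit 4k.

```python
def secondary_search(documents, primary_keyword, primary_ids, select_secondary):
    """Return secondary hits as (doc_id, display, primary_kw, secondary_kw)."""
    found = set(primary_ids)      # documents already obtained by earlier searches
    results = []
    for doc_id in primary_ids:                                # step 6h
        secondary = select_secondary(documents[doc_id]["text"], primary_keyword)
        if not secondary:
            continue                                          # no usable keyword
        for other_id, other in documents.items():             # step 6j
            if other_id in found:
                continue                                      # skip known documents
            if secondary in other["text"]:                    # step 6l
                results.append(
                    (other_id, other["display"], primary_keyword, secondary)
                )                                             # step 6m
                found.add(other_id)
    return results


docs = {
    1: {"text": "Yangon is a city in Myanmar.", "display": True},
    2: {"text": "Myanmar exports rice.", "display": True},
    3: {"text": "Unrelated text.", "display": True},
}
# Toy selector used only for this example.
pick = lambda text, keyword: "Myanmar" if "Myanmar" in text else None
print(secondary_search(docs, "Yangon", [1], pick))
```

Document 2 never mentions "Yangon", yet it is returned together with the keyword pair that led to it.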

[0045] The candidate document list display unit 4h loads, from the auxiliary information storage area of the document data stored in the external storage device 3, the title information of the documents whose ID numbers are stored in the primary search result storage buffer 4t and the secondary search result storage buffer 4v, and lists the titles on the display screen of the display device 2.
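Building the candidate list can be sketched as follows. The record layout and key names are assumptions. The sketch also assumes that only hits whose display flag is true are listed; the paragraph above does not state this, but it matches the role the display flag is given in paragraph [0014].

```python
def candidate_titles(primary_hits, secondary_hits, documents):
    """Return the titles to list, primary hits first, then secondary hits."""
    titles = []
    for hit in list(primary_hits) + list(secondary_hits):
        if hit["display"]:  # assumed: hidden helper documents are not listed
            titles.append(documents[hit["id"]]["aux"]["title"])
    return titles


documents = {
    1: {"aux": {"title": "Port cities of Southeast Asia"}},
    2: {"aux": {}},  # definition document: auxiliary information is empty
    3: {"aux": {"title": "Rice exports"}},
}
primary_hits = [{"id": 1, "display": True}, {"id": 2, "display": False}]
secondary_hits = [{"id": 3, "display": True}]
print(candidate_titles(primary_hits, secondary_hits, documents))
```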

[0046] Next, the document selection unit 4i starts (step 6p). The document selection unit 4i has the user select one of the titles already listed by the candidate document list display unit 4h.

[0047] The ID number of the designated document is obtained at this point. Next, the document display unit 4j starts (step 6q). The document display unit 4j reads the text data and other data of the document corresponding to the title selected by the document selection unit 4i from the document data in the external storage device 3 and displays the text, charts, and so on, on the display screen of the display device 2.

[0048] If the result originates from an entry in the secondary search result storage buffer 4v, the unit refers to the secondary keyword information stored in that buffer, underlines the portion of the text that contains this keyword, and at the same time displays the primary and secondary keyword information from the buffer, linked together, at the lower left of the screen. FIG. 8 shows an example of a document displayed as a search result in this way.

[0049] The above is one embodiment of the present invention. In this embodiment the search goes no further than the secondary level. When the number of documents to display is small, however, it is also possible to go through the entries of the secondary search result storage buffer 4v in order, store the secondary keyword information of each entry in the primary keyword storage buffer 4s, and repeat the processing of steps 6h to 6n. In that case, by recording the path of entries referenced in the secondary search result storage buffer 4v, the keyword reference path can be displayed at the lower left of the screen in step 6q, as shown in FIG. 9. The secondary keyword selection unit 4k may also select secondary keywords by another method, for example by parsing the sentence syntactically. Various other modifications are possible without departing from the spirit of the invention.
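The multi-level extension and the keyword reference path can be sketched as follows. This is an illustration under stated assumptions, not the procedure of the embodiment: documents are a dictionary from ID to text, the substring test stands in for the index lookup, the depth limit `max_depth` replaces the document-count condition, and `next_word` is a toy selector standing in for the secondary keyword selection unit 4k.

```python
def recursive_search(documents, keyword, select_secondary, max_depth=3):
    """Return {doc_id: keyword_path}; a direct hit has a path of length 1."""
    paths = {}
    frontier = [(keyword, [keyword])]
    for _ in range(max_depth):
        next_frontier = []
        for kw, path in frontier:
            for doc_id, text in documents.items():
                if doc_id in paths or kw not in text:
                    continue                  # already found, or no match
                paths[doc_id] = path          # remember how this document was reached
                secondary = select_secondary(text, kw)
                if secondary and secondary not in path:
                    next_frontier.append((secondary, path + [secondary]))
        if not next_frontier:
            break
        frontier = next_frontier
    return paths


def format_path(path):
    """Render a keyword reference path for display."""
    return " -> ".join(path)


def next_word(text, keyword):
    """Toy selector (hypothetical): the word following `keyword`, if any."""
    words = text.split()
    if keyword in words and words.index(keyword) + 1 < len(words):
        return words[words.index(keyword) + 1]
    return None


docs = {1: "Yangon Myanmar", 2: "Myanmar ASEAN", 3: "ASEAN summit"}
paths = recursive_search(docs, "Yangon", next_word)
print(format_path(paths[3]))
```

Each document is stored with the chain of keywords that led to it, so the display step can show that chain for any document that does not contain the user's keyword.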

[0050]

[Effects of the Invention] As described in detail above, according to the present invention, a document that does not contain the keyword specified by the user can still be obtained as a search result, without using a thesaurus or synonym dictionary, provided that it contains words related to that keyword. The desired literature and materials can therefore be obtained efficiently, which is of great practical benefit.

[Brief description of drawings]

FIG. 1 is a block diagram showing the hardware configuration of an embodiment of the present invention.

FIG. 2 is a diagram showing the structure of the document data stored in the external storage device according to the embodiment.

FIG. 3 is a diagram showing the configuration of the memory according to the embodiment.

FIG. 4 is a diagram showing the structures of the primary search result storage buffer and the secondary search result storage buffer according to the embodiment.

FIG. 5 is a flowchart showing the overall processing flow of the apparatus according to the embodiment.

FIG. 6 is a flowchart showing the processing flow of the secondary keyword selection unit according to the embodiment.

FIG. 7 is a diagram showing an example of the morphological analysis performed by the secondary keyword selection unit according to the embodiment.

FIG. 8 is a diagram showing a display example of a document obtained as a search result according to the embodiment.

FIG. 9 is a diagram showing an example of a keyword reference path according to the embodiment.

[Explanation of symbols]

1: input device, 2: display device, 3: external storage device, 4: control device, 4a: program section, 4b: data buffer section, 4d: main processing unit, 4e: initialization unit, 4f: keyword input unit, 4g: keyword search unit, 4k: secondary keyword selection unit, 4h: candidate document list display unit, 4i: document selection unit, 4j: document display unit, 4s: primary keyword storage buffer, 4x: secondary keyword storage buffer, 4t: primary search result storage buffer, 4u: display document count storage buffer, 4v: secondary search result storage buffer, 4w: work buffer, 5: memory.

Claims

[Claims]

1. A document creating apparatus comprising: keyword input means for inputting initial keyword information; document storage means for storing document information; keyword search means for searching for document information that includes designated keyword information; and retrieved document display means for displaying the document information retrieved by the keyword search means, wherein, when predetermined document information is retrieved by the keyword search means, a search is recursively executed using phrase information included in that document information as secondary keyword information.
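The recursive search recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function names, the `max_depth` limit, the substring match used as the full-text search, and the fixed-vocabulary picker are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Set


def recursive_search(
    docs: Dict[str, str],
    keyword: str,
    pick_secondary: Callable[[str, str], List[str]],
    max_depth: int = 2,  # assumed safety limit, not taken from the claims
) -> List[str]:
    """Return ids of documents reached from `keyword`, directly or through
    secondary keywords taken from the documents that were hit."""
    found: List[str] = []
    seen_keywords: Set[str] = set()

    def search(kw: str, depth: int) -> None:
        if kw in seen_keywords or depth > max_depth:
            return
        seen_keywords.add(kw)
        for doc_id, text in docs.items():
            if kw not in text:  # plain substring match stands in for full-text search
                continue
            if doc_id not in found:
                found.append(doc_id)
            # Phrases from the hit document become secondary keywords.
            for secondary in pick_secondary(text, kw):
                search(secondary, depth + 1)

    search(keyword, 0)
    return found


# Hypothetical data and a hypothetical picker with a fixed vocabulary.
docs = {
    "a": "kana-kanji conversion uses a dictionary",
    "b": "the dictionary is stored on disk",
    "c": "unrelated memo",
}


def picker(text: str, kw: str) -> List[str]:
    return [w for w in ("dictionary", "disk") if w in text and w != kw]


result = recursive_search(docs, "conversion", picker)
print(result)
```

In this example document "b" does not contain the initial keyword "conversion", yet it is reached through the secondary keyword "dictionary" found in document "a".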

2. The document creating apparatus according to claim 1, further comprising keyword path display means for displaying, in order and in association with the search result, the phrase information adopted as keyword information in the documents referred to along the way.

3. The document creating apparatus according to claim 1 or 2, wherein the subsequent search is performed only when the number of documents obtained by the search at each level is a certain number or more.
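Claims 2 and 3 add a keyword path and a hit-count condition to the recursive search. The following Python sketch shows one way these could fit together. The names `search_with_paths`, `min_hits`, and `max_depth`, the substring matching, and the sample data are hypothetical and do not come from the specification.

```python
from typing import Callable, Dict, List, Tuple

Path = Tuple[str, ...]


def search_with_paths(
    docs: Dict[str, str],
    keyword: str,
    pick_secondary: Callable[[str, str], List[str]],
    min_hits: int = 2,   # assumed value for the "certain number" in claim 3
    max_depth: int = 3,  # assumed safety limit
) -> Dict[str, Path]:
    """Map each retrieved document id to the keyword path that first reached it."""
    paths: Dict[str, Path] = {}

    def search(kw: str, path: Path) -> None:
        if kw in path or len(path) >= max_depth:
            return
        path = path + (kw,)
        hits = [doc_id for doc_id, text in docs.items() if kw in text]
        for doc_id in hits:
            paths.setdefault(doc_id, path)  # keep the first path found
        if len(hits) < min_hits:
            return  # too few documents at this level: no subsequent search
        for doc_id in hits:
            for secondary in pick_secondary(docs[doc_id], kw):
                search(secondary, path)

    search(keyword, ())
    return paths


# Hypothetical data and picker.
docs = {
    "a": "conversion uses a dictionary",
    "b": "conversion needs a dictionary on disk",
    "c": "disk layout",
}


def picker(text: str, kw: str) -> List[str]:
    return [w for w in ("dictionary", "disk") if w in text and w != kw]


paths = search_with_paths(docs, "conversion", picker)
for doc_id, path in paths.items():
    print(doc_id, " -> ".join(path))
```

Document "c" lacks the initial keyword, so its stored path lists the intermediate keywords that led to it. Raising `min_hits` to 3 stops the search after the first level in this data set.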

4. The document creating apparatus according to claim 1, wherein phrase information in a sentence containing the primary keyword information used to retrieve a document referred to along the way is used as the secondary keyword information for performing the next-stage search in that document.

5. The document creating apparatus according to claim 1, further comprising secondary keyword selecting means for performing morphological analysis on one sentence containing primary keyword information to segment words and analyze part-of-speech information, and for selecting the noun closest to the primary keyword information in the sentence as secondary keyword information.
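The selection rule in claim 5 can be sketched as follows. The sketch assumes the sentence has already been segmented and tagged, so no real morphological analyzer is called. It also assumes that "closest" means distance counted in tokens and that ties go to the earlier token; the claim does not specify either point.

```python
from typing import List, Optional, Tuple

Token = Tuple[str, str]  # (surface form, part of speech); this layout is an assumption


def nearest_noun(tokens: List[Token], primary: str) -> Optional[str]:
    """Return the noun closest to `primary` in a pre-tagged sentence, or None."""
    idx = next((i for i, (word, _) in enumerate(tokens) if word == primary), None)
    if idx is None:
        return None
    best: Optional[int] = None
    for i, (word, pos) in enumerate(tokens):
        if pos != "noun" or word == primary:
            continue
        # Strict comparison keeps the earlier token when distances tie.
        if best is None or abs(i - idx) < abs(best - idx):
            best = i
    return tokens[best][0] if best is not None else None


# Hypothetical tagged sentence.
sentence = [
    ("dictionary", "noun"),
    ("is", "verb"),
    ("used", "verb"),
    ("for", "particle"),
    ("conversion", "noun"),
    ("of", "particle"),
    ("text", "noun"),
]
secondary = nearest_noun(sentence, "conversion")
print(secondary)
```

Here "text" is two tokens away from "conversion" while "dictionary" is four tokens away, so "text" is selected as the secondary keyword.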

6. The document creating apparatus according to claim 1, wherein a part of the document information has the property of being a non-terminal node only, and for this document information, each piece of word information in its sentences is segmented in advance and stored together with part-of-speech information.

7. The document creating apparatus according to claim 1, wherein, when document information having the non-terminal-node-only property is retrieved, its content is controlled so as not to be displayed even if it contains the initial keyword information.
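Claims 6 and 7 describe documents that serve only as non-terminal nodes: they carry pre-segmented words with part-of-speech information and are never shown to the user. The following Python sketch shows one possible data layout. The `Document` class, its field names, and the `displayable` helper are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Document:
    doc_id: str
    text: str
    non_terminal: bool = False  # True: used only as a stepping stone in the search
    # Pre-segmented (word, part of speech) pairs, stored in advance for
    # non-terminal documents so no analysis is needed at search time.
    tokens: List[Tuple[str, str]] = field(default_factory=list)


def displayable(hits: List[Document]) -> List[str]:
    """Ids of retrieved documents whose content may be shown to the user."""
    return [doc.doc_id for doc in hits if not doc.non_terminal]


# Hypothetical search hits: the glossary is suppressed even though it matched.
hits = [
    Document("glossary", "conversion: see dictionary", non_terminal=True,
             tokens=[("conversion", "noun"), ("see", "verb"), ("dictionary", "noun")]),
    Document("manual", "conversion uses a dictionary"),
]
shown = displayable(hits)
print(shown)
```

The glossary entry still supplies secondary keywords through its stored tokens, but only "manual" is listed for display.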

8. A document creating method comprising: inputting initial keyword information; storing document information; searching for document information that includes designated keyword information; and displaying the document information retrieved by the keyword search, wherein, when predetermined document information is retrieved by the keyword search, a search is recursively executed using phrase information included in that document information as secondary keyword information.

9. The document creating method according to claim 8, wherein the phrase information adopted as keyword information in the documents referred to along the way is displayed in order and in association with the search result.

10. The document creating method according to claim 8 or 9, wherein the subsequent search is performed only when the number of documents obtained by the search at each level is a certain number or more.

11. The document creating method according to claim 8, wherein phrase information in a sentence containing the primary keyword information used to retrieve a document referred to along the way is used as the secondary keyword information for performing the next-stage search in that document.

12. The document creating method according to claim 8, wherein a secondary keyword selection means is provided for performing morphological analysis on one sentence containing primary keyword information to segment words and analyze part-of-speech information, and for selecting the noun closest to the primary keyword information in the sentence as secondary keyword information.

13. The document creating method according to claim 8, wherein a part of the document information has the property of being a non-terminal node only, and for this document information, each piece of word information in its sentences is segmented in advance and stored together with part-of-speech information.

14. The document creating method according to claim 8, wherein, when document information having the non-terminal-node-only property is retrieved, its content is controlled so as not to be displayed even if it contains the initial keyword information.