JPH0793345A - Document retrieval device

Info

Publication number: JPH0793345A
Application number: JP5233433A
Authority: JP
Inventors: Seiji Miike; 誠司三池; Kazuo Sumita; 一男住田
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1993-09-20
Filing date: 1993-09-20
Publication date: 1995-04-07

Abstract

PURPOSE: To provide a document retrieval device that lets a user know what retrieval words and retrieval expressions should be input, and what synonyms exist for a word the user has input. CONSTITUTION: Index words consisting of character strings that include a retrieval character string input from an input part 11 are retrieved from a dictionary for analysis, and the retrieval result is displayed at the input part 11. Retrieval characters are selected from the displayed retrieval result, the selected retrieval characters are retrieved from the documents stored in a document data storage part 17, and the retrieval result is displayed at a retrieval result display part 16.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a document retrieval device that retrieves a document desired by a user from a document database storing a plurality of documents.

[0002]

[Prior Art] In recent years, document information retrieval devices, in particular document retrieval devices based on the full-text search method, have been realized. Because the full-text search method treats every character or word in a document as a search target, it has the advantages that no labor is needed to assign keywords and that the user does not need to be familiar with the keywords.

[0003]

[Problems to Be Solved by the Invention] In the full-text search method, however, all the necessary search terms must be input in order to avoid search omissions. Because it is generally difficult for a user to predict which words are used in the documents he or she wants to retrieve, it is difficult to input an appropriate search expression. That is, to search without omissions, the user must input every search term he or she can think of, and even then the documents may use words other than those the user thought of. On the other hand, the retrieved documents will also include many documents that the user does not need.

[0004] Since it is generally difficult for a user to input all the necessary search terms, a common practice is to prepare a synonym dictionary and to search using both the search terms and their synonyms. However, synonym information differs depending on the user, the type of document, and the field of specialization, so creating a synonym dictionary has been very difficult.

[0005] The present invention has been made in view of the above points. Its object is to provide a document retrieval device that lets the user know what search terms and search expressions to input, and also lets the user know what synonyms exist for a word the user has input.

[0006]

[Means for Solving the Problems] The document retrieval device according to claim 1 comprises: a dictionary storing at least headwords; display means; document storage means storing a plurality of documents; input means for inputting a search character string; dictionary search means for searching the dictionary for headwords composed of a character string that includes the search character string input by the input means; means for displaying the search result obtained by the dictionary search means on the display means; selection means for selecting search characters from the search result displayed on the display means; search means for searching the documents stored in the document storage means for the search characters selected by the selection means; and means for displaying the search result obtained by the search means on the display means.

[0007] The document retrieval device according to claim 2 comprises: a dictionary storing at least headwords; display means; document storage means storing a plurality of documents; input means for inputting a search character string; dictionary search means for searching the dictionary for headwords that match a part of the search character string input by the input means; means for displaying the search result obtained by the dictionary search means on the display means; selection means for selecting search characters from the search result displayed on the display means; search means for searching the documents stored in the document storage means for the search characters selected by the selection means; and means for displaying the search result obtained by the search means on the display means.

[0008] The document retrieval device according to claim 3 comprises: a dictionary storing information on the words that constitute compound words; display means; document storage means storing a plurality of documents; input means for inputting a search compound word; dictionary search means for searching the dictionary for compound words composed of the same constituent words as those constituting the search compound word input by the input means; means for displaying the search result obtained by the dictionary search means on the display means; selection means for selecting search characters from the search result displayed on the display means; search means for searching the documents stored in the document storage means for the search characters selected by the selection means; and means for displaying the search result obtained by the search means on the display means.

[0009] The document retrieval device according to claim 4 comprises: display means; document storage means storing a plurality of documents; input means for inputting a search character string; first search means for searching the documents stored in the document storage means for character strings adjacent to the search character string input by the input means; means for displaying the search result obtained by the first search means on the display means; selection means for selecting search characters from the search result displayed on the display means; second search means for searching the documents stored in the document storage means for the search characters selected by the selection means; and means for displaying the search result obtained by the second search means on the display means.

[0010] The document retrieval device according to claim 5 is characterized in that the first search means of claim 4 searches the documents that the user has selected or discarded with the selection means.

[0011] The document retrieval device according to claim 6 is characterized in that the first search means of claim 6 displays, for the characters immediately before or immediately after the character string input by the user from the input means, only those that differ within the range of characters of the same character type.

[0012]

[Operation] The dictionary is searched for headwords composed of a character string that includes the search character string input from the input means, for headwords that match a part of the input search character string, or for compound words composed of the same constituent words as the input search compound word, and the search result is displayed on the display means. Search characters are then selected from the displayed search result with the selection means, the documents stored in the document storage means are searched for the selected search characters, and that search result is displayed on the display means.

[0013]

[Embodiments] A document retrieval device according to a first embodiment of the present invention will be described below with reference to FIGS. 1 to 12. FIG. 1 is a block diagram showing the overall configuration of the document retrieval device. In FIG. 1, reference numeral 1 denotes a central processing unit that performs the control and processing of the present invention. Connected to the central processing unit 1 are a storage unit 2 such as a semiconductor memory, magnetic disk, or optical disk; a display controller 3 that controls a display unit 4, such as a liquid crystal display or plasma display, for displaying search results and document contents; an input unit 6, such as a keyboard or mouse, for inputting the user's search commands; and an input controller 5 that controls the input unit 6.

[0014] Next, the configuration of this embodiment will be described with reference to FIG. 2. In FIG. 2, reference numeral 11 denotes an input unit that receives keywords, or a sentence or text in natural language, instructing a search, has the input analysis unit convert the user's input into a search command, and outputs the command to the search unit 15. Reference numeral 12 denotes an input analysis unit that receives the user's input from the input unit 11, converts it into a search command, outputs the search command to the input unit, and also performs the synonym candidate creation process. Reference numeral 13 denotes an analysis dictionary that is referenced by the input analysis unit 12 and consists of a Japanese analysis dictionary and a synonym dictionary. Reference numeral 14 denotes a control unit that controls all the units. Reference numeral 15 denotes a search unit that receives the search command from the input unit, refers to the document data stored in the document data storage unit 17, retrieves the set of relevant documents, and outputs the search result to the search result display unit 16. Reference numeral 16 denotes a search result display unit that receives the search result and displays the retrieved documents. Reference numeral 17 denotes a document data storage unit that stores the document data together with an index table consisting of words and the document IDs that identify the document data.

[0015] The structure of the analysis dictionary 13 is shown in FIG. 6, the search processing of the search unit 15 in FIG. 12, and the structure of the keyword index table in the document data storage unit 17 in FIG. 11.

[0016] Next, the operation of the first embodiment configured as described above will be described. First, when an input sentence is entered from the input unit 11, it is sent to the input analysis unit 12, which performs morphological analysis of the input sentence, creation of the search expression, and creation of synonym candidates.

[0017] FIG. 3 shows an example of an input sentence, the result of morphological analysis of that sentence in the input analysis unit 12, and the search expression created from it. The method of morphological analysis is not the focus of the present invention; a method such as the one described in "Japanese Information Processing" (supervised by Makoto Nagao, Institute of Electronics and Communication Engineers) may be used. The input analysis unit 12 extracts the runs of consecutive nouns from the morphological analysis result and joins them with "*" to create the search expression. As shown in FIG. 4, the input unit 11 then displays the input sentence 『自然言語処理用のソフトについて』 ("About software for natural language processing"), the search expression, and the words in the search expression. When the user designates a search term in the window of the input unit 11 and then designates [Synonym candidate display] with a mouse or the like, the input unit starts the input analysis unit 12. For example, when the user designates the search term 「ソフト」 ("soft", a short form of "software"), the square symbol on the line where 「ソフト」 is displayed is shown in reverse video.
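
The noun-run step above can be sketched as follows. This is only an illustration under assumptions: the morpheme list and its part-of-speech tags are hypothetical (the actual analysis is the one in FIG. 3), each run of consecutive nouns is read as one search term, and `build_search_expression` is a name introduced here.

```python
from itertools import groupby


def build_search_expression(morphemes):
    """Join each run of consecutive nouns into one term, then join terms with '*'.

    `morphemes` is a list of (surface, part_of_speech) pairs, as a
    morphological analyzer might produce them.
    """
    terms = []
    for is_noun, run in groupby(morphemes, key=lambda m: m[1] == "noun"):
        if is_noun:
            terms.append("".join(surface for surface, _ in run))
    return "*".join(terms)


# Hypothetical analysis of the input sentence 自然言語処理用のソフトについて.
morphemes = [
    ("自然", "noun"), ("言語", "noun"), ("処理", "noun"),
    ("用", "suffix"), ("の", "particle"),
    ("ソフト", "noun"), ("について", "particle"),
]

print(build_search_expression(morphemes))
```

Under these assumed tags the sentence yields two terms, 自然言語処理 and ソフト, joined by "*". A different analyzer could segment or tag the sentence differently and so produce a different expression.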

[0018] The synonym candidate creation process in the input analysis unit 12 is described below. First, the structure of the analysis dictionary 13, which is used for morphological analysis and for creating synonym candidates, will be described with reference to FIG. 6. As shown in FIG. 6, the analysis dictionary 13 is a set of entries, each consisting of a headword together with the left connection code, right connection code, and part of speech used in the analysis.

[0019] FIG. 7 shows the structure of the search dictionary used to look up the analysis dictionary. As shown in FIG. 7, the search dictionary consists of a storage area for the first characters of the headwords of the analysis dictionary and a storage area for the second and subsequent characters. Each entry in a storage area consists of a character number and its character, a list of the character numbers of the characters that can follow it, a separator 0, a list of the dictionary information (left connection code, right connection code, part of speech) for the headword formed by the characters up to that point, and another separator 0. The characters in a storage area are arranged, for example, in JIS code order. Take 「ソフト」 ("soft") as an example: the dictionary information at the position of 「ト」, reached by starting from 「ソ」 and following the character numbers of the characters that can follow, indicates that the headword 「ソフト」 exists and gives its dictionary information. Likewise, the dictionary information at the position of 「ア」, reached by following the character numbers onward from 「ト」, corresponds to 「ソフトウェア」 ("software"), and the dictionary information at 「機」, reached by following the character numbers onward from 「ア」, corresponds to 「ソフトウェア危機」 ("software crisis").
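
A small model of this search dictionary may make the layout easier to follow. It is a sketch under assumptions: the character numbers and connection codes below are invented for illustration, the separators are replaced by Python tuples, and `CELLS`, `FIRST_CHARS`, and `lookup` are hypothetical names rather than anything taken from FIG. 7.

```python
# Each cell: (character, character numbers that can follow, dictionary information).
# Dictionary information is (left code, right code, part of speech) when a
# headword ends at this cell, otherwise None. All numbers and codes are invented.
CELLS = {
    1: ("ソ", [2], None),
    2: ("フ", [3], None),
    3: ("ト", [4], (10, 10, "noun")),   # headword ソフト ends here
    4: ("ウ", [5], None),
    5: ("ェ", [6], None),
    6: ("ア", [7], (10, 10, "noun")),   # headword ソフトウェア ends here
    7: ("危", [8], None),
    8: ("機", [], (10, 10, "noun")),    # headword ソフトウェア危機 ends here
}

# First-character storage area: character -> character number.
FIRST_CHARS = {"ソ": 1}


def lookup(word):
    """Return the dictionary information for `word`, or None if it is not a headword."""
    if not word:
        return None
    cell = FIRST_CHARS.get(word[0])
    if cell is None:
        return None
    for ch in word[1:]:
        following = [n for n in CELLS[cell][1] if CELLS[n][0] == ch]
        if not following:
            return None
        cell = following[0]
    return CELLS[cell][2]


print(lookup("ソフト"))
print(lookup("ソフ"))
```

Here `lookup("ソフト")` reaches the cell for 「ト」 and returns its dictionary information, while `lookup("ソフ")` stops at a cell that carries none, so the string is not treated as a headword.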

[0020] FIG. 8 shows the flow of the synonym candidate creation process described above. First, the first character of the search term is looked up in the first-character storage area. If a matching character exists, the character numbers of the characters that can follow are traced for as long as any remain, and every headword obtained along the way is output. In this way, synonym candidates such as those in FIG. 5 are displayed on the input unit 11.
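
The traversal can be sketched with a nested-dictionary trie. This is one possible reading of the flow, not the procedure of FIG. 8 itself: it matches the whole search term first and then collects every headword reachable below that point. The headword list, the `END` marker, and the function names are assumptions made for the example.

```python
END = None  # key marking that a headword ends at this node


def build_trie(headwords):
    """Build a nested-dict trie; node[END] holds the headword that ends there."""
    root = {}
    for word in headwords:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = word
    return root


def synonym_candidates(trie, term):
    """Return every headword that starts with `term`, in sorted order."""
    node = trie
    for ch in term:
        if ch not in node:
            return []
        node = node[ch]
    found, stack = [], [node]
    while stack:
        current = stack.pop()
        for key, child in current.items():
            if key is END:
                found.append(child)
            else:
                stack.append(child)
    return sorted(found)


# Hypothetical headwords standing in for the analysis dictionary.
trie = build_trie(["ソフト", "ソフトウェア", "ソフトウェア危機", "ハード"])
print(synonym_candidates(trie, "ソフト"))
```

With this toy dictionary the term 「ソフト」 yields itself plus 「ソフトウェア」 and 「ソフトウェア危機」, which the user could then accept or discard.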

[0021] The user can modify, delete, or add to the displayed synonym candidates. FIG. 9 shows an example of the display after the user has selected 「ソフトウェア」 ("software") from among the synonyms of 「ソフト」 in FIG. 5. When the user designates [Select], the input analysis unit 12 stores the modified result in the synonym dictionary and in a buffer (not shown) and returns control to the input unit 11. The input unit 11 then updates its display according to the contents of the buffer, as shown in FIG. 10.

[0022] The process of searching the document data storage unit 17 is described below. FIG. 11 shows an example of the keyword index table in the document data storage unit 17. In this embodiment, the keywords in the documents are stored in the keyword index table in the form of a TRIE structure. To reduce both storage capacity and search effort, the TRIE structure stores character strings that keywords have in common only once. The keyword index table stores the characters that make up each keyword and the link information between those characters.

[0023] For example, for the keyword 「機械」 ("machine"), the link information of the character 「機」 contains the link "00935". This link is the address at which the character 「械」 is stored. The character 「械」 stored at address "00935" records that the document data containing the keyword 「機械」 is "file4", and it also holds "01201" as the link for another keyword, 「機械翻訳」 ("machine translation"), which contains 「機械」 as its first two characters. Following this link shows that the document data having 「機械翻訳」 as a keyword are "file25" and "file21" ("file4" and the like are the names of the document files in which the document data is stored).

[0024] For keywords that begin with the same character string, such as 「実例」 ("actual example") and 「実験」 ("experiment"), the link information of the character 「実」 holds two addresses, "01003" and "01004", which are the addresses at which 「験」 and 「例」 are stored, respectively.

[0025] A "0" in the link information is a separator between addresses and document data. The first characters of all keywords are stored in a fixed contiguous storage area, sorted in an order such as JIS code order.
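
The link-following described for the keyword index table can be modelled with a small address-keyed table. The addresses 00935, 01201, 01003, and 01004 and the file names file4, file25, and file21 are the ones mentioned in the text; every other address, the choice to store 「翻」 at 01201, and the names `BLOCKS`, `FIRST`, and `files_for` are assumptions for this sketch. The "0" separators are replaced by separate Python lists.

```python
# address -> (character, document files for the keyword ending here, next addresses)
BLOCKS = {
    "00010": ("機", [], ["00935"]),               # address invented
    "00935": ("械", ["file4"], ["01201"]),        # keyword 機械
    "01201": ("翻", [], ["01202"]),
    "01202": ("訳", ["file25", "file21"], []),    # address invented; keyword 機械翻訳
    "00020": ("実", [], ["01003", "01004"]),      # address invented
    "01003": ("験", [], []),                      # files for 実験 are not given in the text
    "01004": ("例", [], []),                      # files for 実例 are not given in the text
}

# First-character area: character -> address of its block.
FIRST = {"機": "00010", "実": "00020"}


def files_for(keyword):
    """Follow the links character by character and return the document files."""
    if not keyword or keyword[0] not in FIRST:
        return []
    address = FIRST[keyword[0]]
    for ch in keyword[1:]:
        for nxt in BLOCKS[address][2]:
            if BLOCKS[nxt][0] == ch:
                address = nxt
                break
        else:
            return []  # no link leads to this character
    return BLOCKS[address][1]


print(files_for("機械"))
print(files_for("機械翻訳"))
```

Because 「機械」 and 「機械翻訳」 share their first two characters, both lookups pass through the same blocks for 「機」 and 「械」, which is the sharing the TRIE structure is meant to provide.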

[0026] For each search term in the search expression created by the input analysis unit 12, the search unit 15 searches the document files stored in the document data storage unit 17 and obtains the set of document files that contain that search term. It then obtains the set of document files that match the search expression by applying logical set operations to those document file sets.

[0027] The processing of the search unit 15 is described below with reference to FIG. 12. First, as initialization, the variable i is set to 1 and the variable N is set to the number of search terms. The variable i is the index of the search term currently being processed. While i is smaller than N, the "first-character storage area" is searched for the first character of search term i, and the block in which that character is stored is found and taken as block A. Because the characters in the "first-character storage area" are stored in sorted order, the block holding a given character can be found by binary search.

[0028] Next, 2 is stored in the variable k, which is the character position of interest within search term i. While the value of k is smaller than the length of search term i, the character stored in each block is compared with the corresponding character of the search term to find the matching block. If document data corresponding to the search term exists, the corresponding block in the keyword index table is eventually found, and the document file names stored in the link information of that block are assigned to document file set i.
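
The per-term search loop can be sketched as below. It is a simplified model under assumptions: blocks are nested Python lists rather than linked addresses, code-point order stands in for JIS code order, positions are 1-based as in the text, and the loop bound is chosen here so that the last character is also compared. The helper names are hypothetical.

```python
from bisect import bisect_left


def block(ch, files=(), children=()):
    """A block is [character, document files, child blocks sorted by character]."""
    return [ch, list(files), sorted(children, key=lambda b: b[0])]


# First-character storage area, kept sorted so that binary search applies.
FIRST_AREA = sorted(
    [
        block("機", children=[
            block("械", ["file4"], [
                block("翻", children=[block("訳", ["file25", "file21"])]),
            ]),
        ]),
        block("実", children=[block("験"), block("例")]),
    ],
    key=lambda b: b[0],
)


def find_block(blocks, ch):
    """Binary-search a sorted block list for the block holding character `ch`."""
    keys = [b[0] for b in blocks]
    pos = bisect_left(keys, ch)
    if pos < len(keys) and keys[pos] == ch:
        return blocks[pos]
    return None


def search_term(term):
    """Return the set of document files whose keyword equals `term`."""
    if not term:
        return set()
    current = find_block(FIRST_AREA, term[0])  # "block A" in the text
    k = 2                                      # character position of interest
    while current is not None and k <= len(term):
        current = find_block(current[2], term[k - 1])
        k += 1
    return set(current[1]) if current is not None else set()


print(search_term("機械翻訳"))
```

Running `search_term` once per search term gives the per-term document file sets that the next step combines.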

[0029] By performing the above process for every search term, document file sets 1 to N are set to the sets of document files corresponding to the respective search terms. In the next step, the intersection of all the document file sets is obtained by a set operation and taken as the final document file set.
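
The final step amounts to a set intersection, which can be sketched in a few lines. The file names in the example are illustrative, and returning an empty set when there are no search terms is a choice made here, not something the text specifies.

```python
def matching_files(file_sets):
    """Intersect the per-term document file sets.

    `file_sets` is a list of Python sets, one per search term.
    """
    if not file_sets:
        return set()
    return set.intersection(*file_sets)


# Illustrative sets for two search terms.
per_term = [{"file4", "file21", "file25"}, {"file21", "file30"}]
print(matching_files(per_term))
```

Only files present in every per-term set survive, which matches joining all terms of the search expression with "*".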

[0030] The titles and other details of the documents in the document file set obtained by the search unit 15 are displayed on the search result display unit 16, and the search process ends. The search dictionary structure for looking up the analysis dictionary 13, shown in FIG. 7, stores only the lists of character numbers of characters that can follow, but lists of character numbers of characters that can precede may be stored as well. That is, the first-character storage area of FIG. 7(a) stores, along with the list of character numbers of the characters that can follow, a list of character numbers of the characters that can precede. In addition, a storage area is provided that has the same structure as FIG. 7(b) but stores lists of character numbers of the characters that can precede instead of those that can follow. With this dictionary structure, headwords that contain the search term at a position other than the beginning can be retrieved in the same way as in the process shown in FIG. 8.

[0031] In FIG. 7 the storage area for the first characters is separate from the storage area for the second and subsequent characters, but a single storage area may be used for both. Furthermore, the search unit 15 described above searches with a keyword index table that is created in advance by morphological analysis of the documents and consists of words and the IDs of the documents containing them, but the documents need not be morphologically analyzed in advance. In that case, a search is first performed with a keyword index table consisting of characters and the IDs of the documents containing each character. Next, the sentences in the retrieved documents are morphologically analyzed to check whether each search term occurs in the document as a word, and only the documents that contain the search terms as words are displayed. Alternatively, the documents may be processed directly without using a keyword index table.
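
The two-stage alternative, a character-level index followed by a word-level check, might look like the sketch below. It rests on assumptions: the documents are toy strings already segmented with spaces, `str.split` stands in for a real morphological analyzer, and the function names are introduced only for this example.

```python
from collections import defaultdict


def build_char_index(docs):
    """Map each character to the IDs of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for ch in text:
            index[ch].add(doc_id)
    return index


def search(term, docs, index, tokenize):
    """Stage 1: documents containing every character of `term`.
    Stage 2: keep those in which `term` occurs as a whole word."""
    if not term:
        return []
    candidates = set.intersection(*(index.get(ch, set()) for ch in term))
    return sorted(d for d in candidates if term in tokenize(docs[d]))


# Toy documents, pre-segmented with spaces in place of morphological analysis.
docs = {
    "d1": "東京都 に 住む",
    "d2": "京都 の 寺",
}
index = build_char_index(docs)
print(search("京都", docs, index, str.split))
```

Both documents contain the characters 「京」 and 「都」, so stage 1 keeps both, but only d2 contains 「京都」 as a word; d1 contains it only inside 「東京都」 and is dropped in stage 2.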

[0032] The description above covers displaying, at search time, the headwords in the morphological analysis dictionary that contain the search term as a part, but they may instead be displayed when the user registers a word in the analysis dictionary.

[0033] As described above, in the first embodiment, in a full-text search system that morphologically analyzes documents and search request sentences, the headwords in the morphological analysis dictionary that contain the character string of an input search term are displayed, and the user can add the displayed headwords to the search terms and then search. This reduces the burden on the user of entering every search term without omission at the outset, and it also helps prevent search omissions. Furthermore, storing the added search terms in the synonym dictionary makes them available for later searches.

[0034] Next, a second embodiment of the present invention will be described with reference to FIGS. 13 to 20. In the second embodiment, in a full-text search system that performs syntactic and semantic analysis of documents and search request sentences, the headwords in the syntactic and semantic analysis dictionary whose syntactic-semantic structure contains the syntactic-semantic structure of the input search request sentence are displayed together with those syntactic-semantic relations.

[0035] The device configuration and functional configuration of the second embodiment are the same as those of the first embodiment shown in FIGS. 1 and 2, except for the following point. In the first embodiment, the input analysis unit 12 performs morphological analysis on the search command input from the input unit 11 and on the documents in the document data storage unit 17; in the second embodiment, it performs both morphological analysis and syntactic and semantic analysis on the search command and the documents.

[0036] That is, the input analysis unit 12 performs morphological analysis and syntactic and semantic analysis of the input sentence, creation of the search expression, and creation of synonym candidates. FIG. 13 shows the format of the dictionary used by the input analysis unit 12, with an example. As shown in FIG. 13, the analysis dictionary is a set of entries, each consisting of a headword, a left connection code, a right connection code, a part of speech, and a semantic structure. This analysis dictionary is used for morphological analysis, syntactic and semantic analysis, and synonym candidate creation.

[0037] FIG. 14 shows an example of an input sentence together with, for that sentence, the morphological analysis result, syntactic analysis result, semantic analysis result, result of applying the unnecessary-expression rules, and the created search expression, all as produced in the input analysis unit 12. The methods of morphological analysis and syntactic and semantic analysis are not the focus of the present invention; methods such as those described in "Japanese Information Processing" (supervised by Makoto Nagao, Institute of Electronics and Communication Engineers) may be used. In applying the unnecessary-expression rules, rules such as those shown in FIG. 15 are matched against the semantic analysis result, and the parts that match a rule are deleted from the semantic analysis result.

【００３８】次に、上記のように構成された本発明の第
２実施例の動作について説明する。入力部１１から検索
要求文字を入力すると、図１２に示すように、入力文と
検索式および検索式中の単語が表示され。利用者が、
［同義候補表示］をマウスなどで指定すると、入力部１
１は入力解析部１２を起動する。Next, the operation of the second embodiment of the present invention configured as above will be described. When a search request is input from the input unit 11, the input sentence, the search expression, and the words in the search expression are displayed as shown in FIG. 12. When the user selects [Synonym candidate display] with a mouse or the like, the input unit 11 activates the input analysis unit 12.

【００３９】この入力解析部１２は、同義語候補作成の
処理を行ない、その結果を入力部１１へ転送する。入力
部１１は転送された結果を入力部用ウィンドウに表示す
る。図１７に、同義語候補作成処理結果の表示の一例を
示している。図１７では、「文書検索」の同義語候補と
して「文書情報検索」が表示されている。The input analysis unit 12 carries out a process of creating synonym candidates and transfers the result to the input unit 11. The input unit 11 displays the transferred result in the input unit window. FIG. 17 shows an example of the display of the synonym candidate creation processing result. In FIG. 17, “document information search” is displayed as a synonym candidate for “document search”.

【００４０】以下、図１８のフロ−チャ−トを参照して
同義語候補作成処理について説明する。つまり、検索式
の中で直接の意味関係にある単語Ａと単語Ｂを取り出
し、これらの単語Ａ、単語Ｂおよびそれらの間の意味関
係Ｃが、辞書中の一つの意味構造の中に存在するかどう
かを照合する。このように、検索式の中のすべての２つ
の単語とその間の意味関係が、辞書中の一つの意味構造
の中に存在するかどうかを照合し、すべてが存在する場
合にその意味構造を入力部へ渡す。この処理を辞書中の
すべての意味構造について行なっている。このようにし
て検索された同義語候補は表示される。The synonym candidate creation process will be described below with reference to the flowchart of FIG. 18. A word A and a word B that have a direct semantic relationship are taken from the search expression, and it is checked whether word A, word B, and the semantic relationship C between them exist within a single semantic structure in the dictionary. This check is made for every pair of words in the search expression and the semantic relationship between them; if all of them exist in that semantic structure, the semantic structure is passed to the input unit. This process is performed for all semantic structures in the dictionary. The synonym candidates retrieved in this way are displayed.
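The containment check of FIG. 18 can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Relation` tuple, the relation labels ("object", "modifier", "tool"), and the dictionary contents are hypothetical and are not taken from FIG. 13 or FIG. 14.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    """Word A, word B, and the semantic relationship C between them (hypothetical layout)."""
    word_a: str
    relation: str
    word_b: str


def synonym_candidates(search_expr, dictionary):
    """Return headwords whose semantic structure contains every relation of the search expression."""
    candidates = []
    for headword, structure in dictionary.items():
        # All (A, C, B) triples of the search expression must appear in this one structure.
        if all(rel in structure for rel in search_expr):
            candidates.append(headword)
    return candidates


# Hypothetical dictionary: headword -> set of relations forming its semantic structure.
DICTIONARY = {
    "document information search": {
        Relation("document", "modifier", "information"),
        Relation("information", "object", "search"),
        Relation("document", "object", "search"),
    },
    "document search": {Relation("document", "object", "search")},
    "machine translation": {Relation("machine", "tool", "translation")},
}

query = [Relation("document", "object", "search")]
print(synonym_candidates(query, DICTIONARY))
```

In this sketch the search expression for "document search" is found inside the structure of "document information search", which corresponds to the kind of candidate shown in FIG. 17.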

【００４１】そして、利用者は、表示された同義語候補
について、修正、削除、追加を行なうことができる。図
１７の表示から、利用者が新たに検索式を選択した後の
表示の例を図１９に示す。図において、利用者が［選
択］を指定すると、入力解析部１２は当該の修正結果を
同義語辞書と図示しないバッファに格納し、制御を入力
部１１へ戻す。入力部１１はバッファの内容に応じて、
図２０に示すように表示する。The user can then modify, delete, or add to the displayed synonym candidates. FIG. 19 shows an example of the display after the user has newly selected a search expression from the display of FIG. 17. When the user selects [Select] in this display, the input analysis unit 12 stores the modification result in the synonym dictionary and in a buffer (not shown), and returns control to the input unit 11. The input unit 11 then produces the display shown in FIG. 20 according to the contents of the buffer.

【００４２】そして、検索部１５において、前述した検
索処理が行われ、検索結果表示部１６に、検索部１５で
得られた文書ファイル集合のタイトルなどが表示され
る。なお、形態素解析および構文意味解析による検索方
法は本発明の主眼ではなく特願平５−１２５６１号公報
などに記述されている方法を用いればよい。また、特願
平５−１２５６１号公報では、予め文書を形態素解析お
よび構文意味解析して作成したキーワードインデックス
テーブルを用いて検索するが、次のように、予め解析し
ない方法でもよい。まず、入力された検索語について、
文字とその文字を含む文書のＩＤからなるキーワードイ
ンデックステーブルを用いて検索し、検索された文書中
の文を形態素解析、構文意味解析し、単語および構文意
味解析結果がその文書に含まれているかどうかをチェッ
クする。その結果、当該解析結果を含む文書のみを表示
する。また、キーワードインデックステーブルを用いず
に直接文書を解析処理することも可能である。Then, the search unit 15 performs the search processing described above, and the search result display unit 16 displays the titles and other information of the set of document files obtained by the search unit 15. The search method based on morphological analysis and syntactic and semantic analysis is not the focus of the present invention; a method such as that described in Japanese Patent Application No. 5-12561 may be used. In Japanese Patent Application No. 5-12561, the search uses a keyword index table created by performing morphological analysis and syntactic and semantic analysis on the documents in advance, but a method without advance analysis may also be used, as follows. First, the input search words are looked up in a keyword index table consisting of characters and the IDs of the documents containing each character. The sentences in the retrieved documents are then subjected to morphological analysis and syntactic and semantic analysis, and it is checked whether each document contains the words and the syntactic and semantic analysis result. As a result, only the documents that contain the analysis result are displayed. It is also possible to analyze the documents directly without using a keyword index table.

【００４３】以上の説明では、検索要求文の構文意味構
造を一部に含んでいる構文意味解析用辞書中の見出し単
語を、検索時に表示する方法について述べたが、利用者
が解析用辞書に単語を登録する時点に表示してもよい。The above description covered a method of displaying, at search time, the headwords in the syntactic and semantic analysis dictionary whose structures include the syntactic and semantic structure of the search request sentence; they may instead be displayed when the user registers a word in the analysis dictionary.

【００４４】以上のように第２実施例によれば、文書お
よび検索要求文を構文意味解析するフルテキストサーチ
システムにおいて、入力された検索要求文について、当
該検索要求部の構文意味構造を構文意味構造の中に含ん
でいる構文意味解析用辞書中の見出し語と当該構文意味
関係を表示するようにしたので、利用者は表示された構
文意味関係などを検索式に追加して検索することができ
る。従って、利用者は検索要求文の入力の負担が軽減さ
れ、また検索漏れを防ぐ効果を得られる。また、追加し
た単語や構文意味関係表現を、同義語辞書や同義意味表
現辞書に保存しておくことによって、後の検索時に利用
することができる。As described above, according to the second embodiment, in a full-text search system that performs syntactic and semantic analysis on documents and search request sentences, for an input search request sentence the system displays the headwords in the syntactic and semantic analysis dictionary whose syntactic and semantic structures contain the syntactic and semantic structure of the search request sentence, together with the corresponding syntactic and semantic relationships. The user can therefore add the displayed syntactic and semantic relationships and the like to the search expression and search with them. This reduces the user's burden of inputting search request sentences and helps prevent search omissions. In addition, by saving the added words and syntactic and semantic relation expressions in the synonym dictionary or the synonymous expression dictionary, they can be used in later searches.

【００４５】次に、図２１乃至図２５を参照して本発明
の第３実施例について説明する。この第３実施例の機器
構成、機能構成および入力解析部１２における形態素解
析と検索式作成の処理は、前述した第１実施例と同じで
あるのでその説明を省略する。Next, a third embodiment of the present invention will be described with reference to FIGS. 21 to 25. The device configuration, the functional configuration, and the morphological analysis and search expression creation performed by the input analysis unit 12 in the third embodiment are the same as in the first embodiment described above, so their description is omitted.

【００４６】まず、入力部１１から検索要求入力文を入
力すると、入力部１１に図２１に示すように表示され
る。そして、利用者が、当該の入力部用ウィンドウ上の
検索語を指定し、［同義語候補表示］をマウスなどで指
定すると、入力部１１は入力解析部１２を起動する。図
２１では、利用者が検索語の中の「ソフトウェア」を指
定し、「ソフトウェア」が表示されている行の四角の記
号が反転表示されている。First, when a search request input sentence is input from the input unit 11, it is displayed on the input unit 11 as shown in FIG. Then, when the user specifies the search word on the input section window and specifies [Synonym candidate display] with a mouse or the like, the input section 11 activates the input analysis section 12. In FIG. 21, the user specifies "software" in the search word, and the square symbol of the line in which "software" is displayed is highlighted.

【００４７】入力解析部１２は、反転表示された検索語
に対する同義語候補作成の処理を図２３を参照して後述
するように行われ、その結果が入力部１１に転送され
る。入力部１１は転送された結果を入力部用ウィンドウ
に表示する。図２２に、同義語候補作成処理結果の表示
の一例を示し、図２２では「ソフトウェア」の同義語候
補として「ソフト」と「ウェア」が表示されている。The input analysis unit 12 creates synonym candidates for the highlighted search word, as described later with reference to FIG. 23, and transfers the result to the input unit 11. The input unit 11 displays the transferred result in the input unit window. FIG. 22 shows an example of the display of the synonym candidate creation result; in FIG. 22, "soft" (ソフト) and "ware" (ウェア) are displayed as synonym candidates for "software" (ソフトウェア).

【００４８】以下、上記の入力解析部１２における同義
語候補作成処理について図２３のフロ−チャ−トを参照
しながら説明する。なお、解析用辞書１３の内容の形式
と例は図６と同じであり、解析用辞書１３が格納されて
いる形式は図７と以下の点を除いて同じである。つま
り、図７では後方に接続可能な文字の文字番号を格納し
ていたが、この実施例では、前方に接続可能な文字の文
字番号を格納している。The synonym candidate creation process in the input analysis unit 12 will be described below with reference to the flowchart of FIG. 23. The format and example of the contents of the analysis dictionary 13 are the same as in FIG. 6, and the storage format of the analysis dictionary 13 is the same as in FIG. 7 except for the following point: whereas FIG. 7 stores the character numbers of the characters that can be connected after each character, this embodiment stores the character numbers of the characters that can be connected before it.

【００４９】まず、変数ｎに１をセットし、検索語の１
文字目の文字を解析用辞書の見出しの１文字目の文字か
ら探す。そして一致する文字があった場合には見出し情
報が格納されていれば、それを表示出力する。First, the variable n is set to 1, and the first character of the search word is looked up among the first characters of the headwords in the analysis dictionary. If a matching character is found and headword information is stored for it, that information is displayed.

【００５０】そして、検索語の文字数がｎに一致するま
で、後方に接続可能な文字の文字番号がある限りそれら
をたどっていき、その過程で得られる同義語候補につい
てはすべて表示するようにしている。Then, until n equals the number of characters in the search word, the character numbers of the characters connectable behind are followed for as long as they exist, and all synonym candidates obtained along the way are displayed.
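The character-by-character traversal can be illustrated with a small trie. This is a sketch under assumptions rather than the storage format of FIG. 7: nested Python dicts stand in for the character-number links, the headword list is hypothetical, and the walk is restarted at every character position of the search word so that headwords such as "ware" are found as well as "soft".

```python
def build_trie(headwords):
    """Build a character trie; the empty-string key marks the end of a headword."""
    root = {}
    for word in headwords:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = word  # headword information stored at this node
    return root


def headwords_within(search_word, trie):
    """Follow connectable characters through the trie and collect every headword met on the way."""
    found = []
    for start in range(len(search_word)):
        node = trie
        for ch in search_word[start:]:
            if ch not in node:
                break  # no connectable character: stop this walk
            node = node[ch]
            if "" in node and node[""] not in found:
                found.append(node[""])
    return found


# Hypothetical analysis-dictionary headwords.
TRIE = build_trie(["soft", "ware", "hardware", "firmware"])
print(headwords_within("software", TRIE))
```

Restarting the walk at each offset is a simplification chosen for the sketch; the embodiment instead stores the characters connectable before each character, which serves the same purpose of reaching headwords that end the search word.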

【００５１】このようにして、表示された同義語候補表
示処理結果について、利用者は修正、削除、追加を行な
うことができる。図２２で「ソフトウェア」の同義語候
補の中から、利用者が「ソフト」を選択した後の表示の
例を図２４に示す。図２４で、利用者が［選択］を指定
すると、入力解析部１２は当該の修正結果を同義語辞書
と図示しないバッファに格納し、制御を入力部１１へ戻
す。入力部１１はバッファの内容に応じて、図２５に示
すように表示する。The user can then modify, delete, or add to the displayed synonym candidates. FIG. 24 shows an example of the display after the user has selected "soft" (ソフト) from the synonym candidates for "software" (ソフトウェア) in FIG. 22. When the user selects [Select] in FIG. 24, the input analysis unit 12 stores the modification result in the synonym dictionary and in a buffer (not shown), and returns control to the input unit 11. The input unit 11 then produces the display shown in FIG. 25 according to the contents of the buffer.

【００５２】前述した検索部１５では、予め文書を形態
解析して作成した、単語とその単語を含む文書ＩＤから
なるキーワードインデックステーブルを用いて検索した
が、次のように、予め文書を形態素解析しない方法でも
よい。まず、文字とその文字を含む文書のＩＤからなる
キーワードインデックステーブルを用いて検索し、検索
された文書中の文について形態素解析し、検索語がその
文書に含まれているかどうかをチェックしてそれらを含
む文書のみを表示する。また、キーワードインデックス
テーブルを用いずに直接文書を処理してもよい。The search unit 15 described above searches using a keyword index table, created by morphologically analyzing the documents in advance, that consists of words and the IDs of the documents containing each word. However, a method that does not morphologically analyze the documents in advance may also be used, as follows. First, a search is performed using a keyword index table consisting of characters and the IDs of the documents containing each character. The sentences in the retrieved documents are then morphologically analyzed, it is checked whether the search words are contained in each document, and only the documents that contain them are displayed. The documents may also be processed directly without using a keyword index table.
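The two-stage alternative (character index first, morphological check second) might look like the sketch below. The document contents are hypothetical, and `tokenize` is only a whitespace splitter standing in for a real morphological analyzer, which Japanese text would require.

```python
from collections import defaultdict


def build_char_index(documents):
    """Character -> set of IDs of the documents containing that character."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for ch in set(text):
            index[ch].add(doc_id)
    return index


def tokenize(text):
    """Placeholder for morphological analysis (assumption: words are space separated)."""
    return text.lower().split()


def search(query, documents, char_index):
    # Stage 1: keep only documents that contain every character of the search word.
    candidate_ids = set(documents)
    for ch in set(query):
        candidate_ids &= char_index.get(ch, set())
    # Stage 2: analyze the candidates and keep those containing the search word as a word.
    return sorted(d for d in candidate_ids if query in tokenize(documents[d]))


# Hypothetical document store.
DOCUMENTS = {
    "file1": "the host computer is down",
    "file2": "ghost stories for the team",
    "file3": "printer manual",
}
CHAR_INDEX = build_char_index(DOCUMENTS)
print(search("host", DOCUMENTS, CHAR_INDEX))
```

Here "file2" passes the character stage because "ghost" contains every character of "host", and is then rejected by the word-level check, which is the role the text assigns to the morphological analysis of the retrieved documents.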

【００５３】上述の説明では、検索語を構成とする単語
と一致する形態素解析用辞書中の見出し単語を、検索時
に表示する方法について述べたが、利用者が解析用辞書
に単語を登録する時点に表示してもよい。The above description covered a method of displaying, at search time, the headwords in the morphological analysis dictionary that match the words constituting the search word; they may instead be displayed when the user registers a word in the analysis dictionary.

【００５４】この第３実施例では、検索語「ソフトウェ
ア」と見出し単語「ソフト」のように、入力された検索
語の一部と一致する形態素解析用辞書中の見出し単語を
表示する方法について述べた。さらに、検索語の一部と
形態素解析用辞書中の見出し単語の一部とが一致する場
合に、当該の見出し単語を表示するようにしてもよい。
例えば、検索語として複合語である「機械翻訳」が入力
された場合に、複合語を構成する「翻訳」という単語を
含んでいる「自動翻訳」や「同時翻訳」、「翻訳本」な
どの見出し単語を表示し、利用者が選択するようにして
もよい。The third embodiment has described a method of displaying headwords in the morphological analysis dictionary that match part of the input search word, as with the search word "software" (ソフトウェア) and the headword "soft" (ソフト). In addition, a headword may be displayed when part of the search word matches part of that headword in the morphological analysis dictionary. For example, when the compound word "machine translation" (機械翻訳) is input as a search word, headwords that contain the constituent word "translation" (翻訳), such as "automatic translation" (自動翻訳), "simultaneous translation" (同時翻訳), and "translated book" (翻訳本), may be displayed for the user to choose from.

【００５５】以上のようにこの第３実施例によれば、文
書および検索要求分を形態素解析するフルテキストサー
チシステムにおいて、入力された検索語の一部と一致す
る形態素解析用辞書中の見出し単語を表示するようにし
たので、利用者は表示された見出し単語を検索語に追加
して検索することができる。従って、利用者は最初に検
索語を漏れなく入力する必要がないので、検索式の入力
の負担が軽減され、また検索漏れを防ぐ効果を得られ
る。また、追加した検索語を同義語辞書に保存しておく
ことによって後の検索時に利用することができる。As described above, according to the third embodiment, in a full-text search system that morphologically analyzes documents and search requests, headwords in the morphological analysis dictionary that match part of an input search word are displayed, so the user can add the displayed headwords to the search words and search with them. Because the user does not need to input every search word at the outset, the burden of inputting search expressions is reduced and search omissions are less likely. In addition, by saving the added search words in the synonym dictionary, they can be used in later searches.

【００５６】次に、本発明の第４実施例について図２６
乃至図３０を参照しながら説明する。この第４実施例の
機器構成、機能構成および入力解析部１２における形態
素解析、構文意味解析と検索式作成の処理は、前述した
第２実施例と同じであるので説明を省略する。Next, a fourth embodiment of the present invention will be described with reference to FIGS. 26 to 30. The device configuration, the functional configuration, and the morphological analysis, syntactic and semantic analysis, and search expression creation performed by the input analysis unit 12 in the fourth embodiment are the same as in the second embodiment described above, so their description is omitted.

【００５７】入力解析部１２では、入力文の形態素解
析、構文意味解析と検索式の作成処理、および同義候補
作成の処理を行なう。入力解析部１２で用いる辞書の形
式、入力解析部の解析結果の例、および不要表現規則の
例は、図１３、図１４及び図１５と同じである。The input analysis unit 12 performs a morphological analysis of an input sentence, a syntactic and semantic analysis, a search expression creation process, and a synonym candidate creation process. The format of the dictionary used in the input analysis unit 12, the example of the analysis result of the input analysis unit, and the example of the unnecessary expression rule are the same as those in FIGS. 13, 14 and 15.

【００５８】入力部１１より検索要求入力文が入力され
ると、図２６に示すように、入力文、検索式および検索
式中の単語が表示される。利用者が、［同義候補表示］
をマウスなどで指定すると、入力部１１は入力解析部１
２を起動する。入力解析部１２は、同義候補作成の処理
を行ない、その結果を入力部１１へ転送する。入力部１
１は転送された結果を入力部用ウィンドウに表示する。
図２７には、同義候補作成処理結果の表示を示すもの
で、図２７では、「文書情報検索」の同義候補として
「文書検索」が表示されている状態を示している。When a search request input sentence is input from the input unit 11, the input sentence, the search expression, and the words in the search expression are displayed as shown in FIG. 26. When the user selects [Synonym candidate display] with a mouse or the like, the input unit 11 activates the input analysis unit 12. The input analysis unit 12 creates synonym candidates and transfers the result to the input unit 11, which displays the transferred result in the input unit window. FIG. 27 shows the display of the synonym candidate creation result, in which "document search" (文書検索) is displayed as a synonym candidate for "document information search" (文書情報検索).

【００５９】以下、図２８を参照して同義語候補作成処
理について説明する。まず、検索式の中から単語Ａを取
り出し、辞書中の意味構造と照合する。辞書中の意味構
造の中に存在する場合に、辞書中の意味構造の中から単
語Ａと直接に意味関係をもつ単語Ｂと、当該の意味関係
Ｃを取り出す。単語Ｂと検索式を照合し、単語Ａと単語
Ｂの間に２つ以上の意味関係が存在し、その中に意味関
係Ｃが存在することをチェックする。辞書中の一つの意
味構造について、その中に含まれる直接に意味関係をも
つすべての２つの単語とその間の意味関係が、検索式の
中に存在するかどうかをチェックする。そして、すべて
が存在する場合にその意味構造を入力部１１へ渡す。こ
の処理を辞書中のすべての意味構造について行なう。The synonym candidate creation process will be described below with reference to FIG. 28. First, a word A is taken from the search expression and matched against a semantic structure in the dictionary. If it exists in that semantic structure, a word B that has a direct semantic relationship with word A, and that semantic relationship C, are taken from the semantic structure in the dictionary. Word B is then matched against the search expression, and it is checked that two or more semantic relationships exist between word A and word B and that semantic relationship C is among them. For one semantic structure in the dictionary, it is checked in this way whether every pair of directly related words it contains, and the semantic relationship between them, exist in the search expression. If all of them exist, the semantic structure is passed to the input unit 11. This process is performed for all semantic structures in the dictionary.

【００６０】利用者は、以上のようにして検索された表
示された同義語候補表示処理結果について、修正、削
除、追加を行なうことができる。図２７で利用者が新た
に表示された検索式を選択して後の表示の例を選択した
後の表示の例を図２９に示す。The user can modify, delete, or add to the synonym candidates retrieved and displayed as described above. FIG. 29 shows an example of the display after the user has selected the newly displayed search expression in FIG. 27.

【００６１】そして、図２９の表示状態で、利用者が
［選択］を指定すると、入力解析部１２は当該の修正結
果を同義語辞書と図示しないバッファに格納し、制御を
入力部１１へ戻す。入力部１１はバッファの内容に応じ
て、図３０に示すように表示する。When the user selects [Select] in the display state of FIG. 29, the input analysis unit 12 stores the modification result in the synonym dictionary and in a buffer (not shown), and returns control to the input unit 11. The input unit 11 then produces the display shown in FIG. 30 according to the contents of the buffer.

【００６２】なお、特願平５−１２５６１号公報では、
予め文書を形態素解析および構文意味解析して作成した
キーワードインデックステーブルを用いて検索するが、
次のように、予め解析しない方法でもよい。つまり、ま
ず、入力された検索語について、文字とその文字を含む
文書のＩＤからなるキーワードインデックステーブルを
用いて検索し、次に、検索された文書中の文を形態素解
析、構文意味解析し、単語および構文意味解析結果がそ
の文書に含まれているかどうかをチェックする。その結
果、当該解析結果を含む文書のみを表示する。また、キ
ーワードインデックステーブルを用いずに直接文書を解
析処理することも可能である。In Japanese Patent Application No. 5-12561, the search uses a keyword index table created by performing morphological analysis and syntactic and semantic analysis on the documents in advance, but a method without advance analysis may also be used, as follows. First, the input search words are looked up in a keyword index table consisting of characters and the IDs of the documents containing each character. Next, the sentences in the retrieved documents are subjected to morphological analysis and syntactic and semantic analysis, and it is checked whether each document contains the words and the syntactic and semantic analysis result. As a result, only the documents that contain the analysis result are displayed. It is also possible to analyze the documents directly without using a keyword index table.

【００６３】上述の説明では、検索要求文の構文意味構
造の一部と一致する構文意味解析用辞書中の見出し単語
を、検索時に表示する方法について述べたが、利用者が
解析用辞書に単語を登録する時点に表示してもよい。The above description covered a method of displaying, at search time, the headwords in the syntactic and semantic analysis dictionary that match part of the syntactic and semantic structure of the search request sentence; they may instead be displayed when the user registers a word in the analysis dictionary.

【００６４】上述の説明では、入力文の「文書情報検
索」と見出し単語「文書検索」の場合のように、入力さ
れた複合語の構文意味構造の一部と一致する構文意味構
造の見出し単語を表示する方法について述べた。さら
に、入力文の一部と構文意味解析用辞書中の見出し単語
の一部とが一致する場合に、当該の見出し単語を表示す
るようにしてもよい。例えば、検索語として複合語であ
る「機械翻訳」が入力された場合に、複合語の構文意味
構造を構成する「翻訳」という単語を構文意味構造の中
に含んでいる「自動翻訳」や「同時翻訳」などの見出し
単語を表示し、利用者が選択するようにしてもよい。当
然ながら、構文意味構造中の単語だけでなく、（道具）
や（対象）などの関係も一致する場合に同義の候補とす
ることができる。以上のようにこの第４実施例によれ
ば、文書および検索要求文を構文意味解析するフルテキ
ストサーチシステムにおいて、入力された検索要求文に
ついて、当該検索要求文の構文意味構造の一部と一致す
る構文意味構造の構文意味解析用辞書中の見出し語と当
該構文意味関係を表示するようにしたので、利用者は表
示された構文意味関係などを検索式に追加して検索する
ことができる。従って、利用者は検索要求文の入力の負
担が軽減され、また検索漏れを防ぐ効果を得られる。ま
た、追加した単語や構文意味関係表現を、同義語辞書や
同義表現辞書に保存しておくことによって、後の検索時
に利用することができる。The above description covered a method of displaying headwords whose syntactic and semantic structure matches part of the syntactic and semantic structure of an input compound word, as with "document information search" (文書情報検索) in the input sentence and the headword "document search" (文書検索). In addition, a headword may be displayed when part of the input sentence matches part of that headword in the syntactic and semantic analysis dictionary. For example, when the compound word "machine translation" (機械翻訳) is input as a search word, headwords whose syntactic and semantic structures contain the word "translation" (翻訳), which is part of the syntactic and semantic structure of the compound word, such as "automatic translation" (自動翻訳) and "simultaneous translation" (同時翻訳), may be displayed for the user to choose from. Naturally, a headword can be treated as a synonym candidate when not only the words in the syntactic and semantic structure but also relations such as (tool) and (object) match. As described above, according to the fourth embodiment, in a full-text search system that performs syntactic and semantic analysis on documents and search request sentences, for an input search request sentence the system displays the headwords in the syntactic and semantic analysis dictionary whose syntactic and semantic structures match part of the syntactic and semantic structure of the search request sentence, together with the corresponding syntactic and semantic relationships. The user can therefore add the displayed syntactic and semantic relationships and the like to the search expression and search with them. This reduces the user's burden of inputting search request sentences and helps prevent search omissions. In addition, by saving the added words and syntactic and semantic relation expressions in the synonym dictionary or the synonymous expression dictionary, they can be used in later searches.

【００６５】以上説明した第１実施例乃至第４実施例で
は入力文中の単語や構文意味構造と、解析用辞書中の見
出し語やその構文意味構造とが一部分一致する場合に、
見出し語を同義候補として表示するものであった。以下
の第５実施例から第７実施例では、解析用辞書１３では
なく、文書データ記憶部１７の文書の中から同義候補を
抽出し表示するようにしている。In the first to fourth embodiments described above, a headword is displayed as a synonym candidate when a word or syntactic and semantic structure in the input sentence partially matches a headword in the analysis dictionary or its syntactic and semantic structure. In the fifth to seventh embodiments below, synonym candidates are extracted from the documents in the document data storage unit 17, rather than from the analysis dictionary 13, and displayed.

【００６６】以上の第１乃至第４実施例では、文書の検
索処理を行なう前に解析用辞書の検索を行なったが、文
書の検索処理の後に解析用辞書の検索を行なってもよ
い、例えば、利用者の入力した文字列で文書の検索の処
理を行ない、利用者がその結果を見て判断し、必要な場
合のみに解析用辞書の検索を行えるようにし、これを利
用して文書の検索処理をやり直すようにしてもよい。ま
た、文書の検索処理と、解析用辞書の検索を並行して処
理することが可能である。すなわち、利用者の入力した
文字列で文書の検索処理を行ない、並列計算機などによ
り並行して解析用辞書の検索を行なっておく。従って、
利用者は、必要な場合に直ちに解析用辞書の検索を見る
ことが可能になる。In the first to fourth embodiments above, the analysis dictionary is searched before the document search process, but it may instead be searched afterwards. For example, the document search may be performed with the character string input by the user; the user then examines the result and, only if necessary, has the analysis dictionary searched and uses its result to redo the document search. The document search and the analysis dictionary search can also be processed in parallel: the document search is performed with the character string input by the user while the analysis dictionary is searched concurrently, for example on a parallel computer. The user can then see the result of the analysis dictionary search immediately when it is needed.
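Running the document search and the analysis dictionary search side by side can be sketched with a thread pool. Everything here is illustrative: the two search functions, their data, and the substring matching are hypothetical stand-ins, and a thread pool is only one possible substitute for the parallel computer mentioned in the text.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical document store and dictionary headwords.
DOCUMENTS = {"file1": "software design notes", "file2": "hardware manual"}
HEADWORDS = ["soft", "ware", "software house"]


def search_documents(query):
    """Stand-in for the document search of the search unit."""
    return sorted(doc_id for doc_id, text in DOCUMENTS.items() if query in text)


def search_dictionary(query):
    """Stand-in for the analysis dictionary search (partial match in either direction)."""
    return [h for h in HEADWORDS if h in query or query in h]


def run_search(query):
    """Start both searches together so the dictionary result is ready when the user asks for it."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        doc_future = pool.submit(search_documents, query)
        dict_future = pool.submit(search_dictionary, query)
        return doc_future.result(), dict_future.result()


documents_found, candidates = run_search("software")
print(documents_found, candidates)
```

In an interactive system the dictionary result would typically be kept in the background and shown only on request, rather than returned together with the document hits as in this sketch.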

【００６７】次に、本発明の第５実施例について図３１
乃至図３５を参照して説明する。この第５実施例の機器
構成、機能構成および入力解析部における形態素解析と
検索式作成の処理は、第１実施例と同じであるので説明
を省略する。同義語候補作成の処理は検索部１５で行な
われる。Next, a fifth embodiment of the present invention will be described with reference to FIGS. 31 to 35. The device configuration, the functional configuration, and the morphological analysis and search expression creation performed by the input analysis unit in the fifth embodiment are the same as in the first embodiment, so their description is omitted. The synonym candidate creation is performed by the search unit 15.

【００６８】ところで、文書デ−タ記憶部１７のキ−ワ
−ドインデックステ−ブルを図３３を参照して説明す
る。図３３に示すように、キーワードインデックステー
ブルでは、単語と、各単語を含んでいる文書ＩＤとが、
当該の文書で当該の単語と隣接する単語ともに格納され
ている。例えば、「ホスト」という単語は、ｆｉｌｅ１
３とｆｉｌｅ８２に含まれており、「ホスト」の後ろに
「コンピュータ」という単語が用いられている。検索部
１５は、このキーワードインデックステーブルを引くこ
とにより、入力された単語にどのような単語が続いてい
るかの情報を入力部１１に渡すようにしている。The keyword index table of the document data storage unit 17 will now be described with reference to FIG. 33. As shown in FIG. 33, the keyword index table stores each word and the IDs of the documents containing it, together with the words adjacent to that word in those documents. For example, the word "host" (ホスト) is contained in file13 and file82, where it is followed by the word "computer" (コンピュータ). By looking up this keyword index table, the search unit 15 passes to the input unit 11 information on which words follow the input word.
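A keyword index that also records the following word, in the spirit of FIG. 33, could be sketched as below. The nested-dict layout and the sample documents are assumptions made for illustration; the documents are given as word lists on the assumption that morphological analysis has already been applied.

```python
from collections import defaultdict


def build_adjacency_index(documents):
    """word -> {following word -> set of IDs of the documents where that pair occurs}."""
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, words in documents.items():
        for word, following in zip(words, words[1:]):
            index[word][following].add(doc_id)
    return index


def compound_candidates(word, index):
    """Compound-word candidates formed by the given word and each word that follows it."""
    followers = index.get(word, {})
    return sorted(f"{word} {nxt}" for nxt in followers)


# Hypothetical, already tokenized documents.
DOCUMENTS = {
    "file13": ["the", "host", "computer", "failed"],
    "file82": ["a", "host", "computer", "and", "a", "host", "processor"],
}
INDEX = build_adjacency_index(DOCUMENTS)
print(compound_candidates("host", INDEX))
```

Looking up "host" yields the words that follow it in the stored documents, which is the information the search unit 15 is described as passing to the input unit 11.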

【００６９】入力部１１より検索要求入力文が入力され
ると、入力文、検索式、検索式中の検索語および検索語
を構成している構成語が図３１に示すように表示されて
いる。利用者が、当該の入力部用ウィンドウ上の構成語
を指定し、［同義語候補表示］をマウスなどで指定する
と、入力部１１は入力解析部１２を起動する。When a search request input sentence is input from the input unit 11, the input sentence, the search expression, the search words in the search expression, and the constituent words that make up each search word are displayed as shown in FIG. 31. When the user specifies a constituent word in the input unit window and selects [Synonym candidate display] with a mouse or the like, the input unit 11 activates the input analysis unit 12.

【００７０】図３１の場合には、利用者が構成語の中の
「ホスト」を指定し、「ホスト」が表示されている行の
四角の記号が反転表示されている。そして、検索部１５
は、同義語候補作成の処理と文書検索の処理を行なう。
検索部１５は、同義語候補作成の処理を行なう。つま
り、検索部１５は、このキーワードインデックステーブ
ルを引くことにより、入力された単語にどのような単語
が続いているかの情報を得て、入力部１１へ渡す。その
結果、入力部１１は転送された結果を入力部用ウィンド
ウに表示する。図３２に、同義語候補作成処理結果の表
示の一例を示すもので、「ホスト計算機」の同義語候補
として「ホストコンピュータ」と「ホストサテライトシ
ステム」、「ホストプロセッサー」、「ホスト集中シス
テム」が表示されている状態を示している。In the case of FIG. 31, the user has specified "host" (ホスト) among the constituent words, and the square symbol on the line where "host" is displayed is highlighted. The search unit 15 then performs synonym candidate creation and document search. For synonym candidate creation, the search unit 15 looks up the keyword index table to obtain information on which words follow the input word, and passes it to the input unit 11. The input unit 11 displays the transferred result in the input unit window. FIG. 32 shows an example of the display of the synonym candidate creation result, in which "host computer" (ホストコンピュータ), "host satellite system" (ホストサテライトシステム), "host processor" (ホストプロセッサー), and "host centralized system" (ホスト集中システム) are displayed as synonym candidates for "host computer" (ホスト計算機).

【００７１】そして、利用者は、図３２のように表示さ
れた同義語候補作成処理結果について、修正、削除、追
加を行なうことができる。図３２で「ホスト計算機」の
同義語の中から、利用者が「ホストコンピュータ」と
「ホストプロセッサー」を選択した後の表示の例を図３
４に示す。第３４図で、利用者が［選択］を指定する
と、検索部は当該の修正結果を同義語辞書と図示しない
バッファに格納し、制御を入力部１１へ戻す。入力部１
１はバッファの内容に応じて、図３５に示すように表示
する。The user can modify, delete, or add to the synonym candidate creation result displayed as in FIG. 32. FIG. 34 shows an example of the display after the user has selected "host computer" (ホストコンピュータ) and "host processor" (ホストプロセッサー) from the synonym candidates for "host computer" (ホスト計算機) in FIG. 32. When the user selects [Select] in FIG. 34, the search unit stores the modification result in the synonym dictionary and in a buffer (not shown), and returns control to the input unit 11. The input unit 11 then produces the display shown in FIG. 35 according to the contents of the buffer.

【００７２】上記実施例では、文書から検索語を構成す
る構成語を含んでいる複合語を抽出し表示したが、当該
の複合語を含んでいる文を表示してもよい。さらに、検
索部１５では、予め文書を形態素解析して作成した、単
語とその単語を含む文書ＩＤからなるキーワードインデ
ックステーブルを用いて検索したが、次のように、予め
文書を形態素解析しない方法でもよい。まず、入力され
た検索語について、文字とその文字を含む文書のＩＤか
らなるキーワードインデックステーブルを用いて検索
し、次に、検索された文書中の文を形態素解析し、上述
と同様にして同義語候補を抽出し表示する。その後、利
用者が同義語の選択を行なう。さらに、検索語と選択さ
れた同義語について、文字のキーワードインデックステ
ーブルを用いて検索する。次に、検索された文書中の文
を形態素解析し、検索語が単語としてその文書に含まれ
ているかどうかをチェックしてそれらを含む文書のみを
表示する。なお、キーワードインデックステーブル用い
ずに直接文書を処理して検索することも可能である。In the above embodiment, compound words containing the constituent words of the search word are extracted from the documents and displayed, but the sentences containing those compound words may be displayed instead. Further, the search unit 15 searches using a keyword index table, created by morphologically analyzing the documents in advance, that consists of words and the IDs of the documents containing each word; however, a method that does not morphologically analyze the documents in advance may also be used, as follows. First, the input search word is looked up in a keyword index table consisting of characters and the IDs of the documents containing each character. The sentences in the retrieved documents are then morphologically analyzed, and synonym candidates are extracted and displayed in the same way as described above. The user then selects synonyms. Next, the search word and the selected synonyms are looked up in the character-based keyword index table. The sentences in the retrieved documents are then morphologically analyzed, it is checked whether each search word occurs as a word in the document, and only the documents that contain them are displayed. It is also possible to process and search the documents directly without using a keyword index table.

【００７３】以上述べた説明では、同義語の候補を、検
索時に表示する方法について述べたが、利用者が解析用
辞書に単語を登録する時点に表示してもよい。また、次
のように複合語の構成語の情報を格納した辞書を用い
て、入力された複合語の構成語を含む辞書の見出しを検
索することが可能である。まず、図４４に辞書の形式と
例を示す。図４５に、入力解析部１２における同義語候
補作成処理の流れを示す。図４５に示すように、利用者
が指定した、複合語を構成している構成語について、複
合語中の構成語の並びの順位および構成語の表記が同じ
である複合語を辞書から探し、入力部１１へ渡す。な
お、解析用辞書は、構成語の並びの順位ごとに、構成語
で検索できるように格納してもよい。また、図７などと
同様に、構成語を見出しとし、後方に接続可能な構成語
の番号、あるいは前方に接続可能な構成語の番号を格納
して、構成語から複合語を検索できるっようにしてもよ
い。The above description covered a method of displaying synonym candidates at search time; they may instead be displayed when the user registers a word in the analysis dictionary. It is also possible to use a dictionary that stores information on the constituent words of compound words, and to search for dictionary headwords that contain a constituent word of the input compound word, as follows. FIG. 44 shows the format and an example of the dictionary, and FIG. 45 shows the flow of synonym candidate creation in the input analysis unit 12. As shown in FIG. 45, for a constituent word of a compound word specified by the user, compound words in which the constituent word has the same position in the order of constituent words and the same notation are looked up in the dictionary and passed to the input unit 11. The analysis dictionary may be stored so that it can be searched by constituent word for each position in the order of constituent words. Alternatively, as in FIG. 7 and elsewhere, each constituent word may be used as a headword, with the numbers of the constituent words that can be connected after it, or of those that can be connected before it, stored so that compound words can be retrieved from their constituent words.
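The lookup by constituent word and position can be sketched as follows. The dictionary contents and the `(position, constituent)` key are hypothetical choices for illustration and are not the format of FIG. 44.

```python
from collections import defaultdict


def build_constituent_index(compounds):
    """(position, constituent word) -> compound headwords having that constituent at that position."""
    index = defaultdict(list)
    for headword, constituents in compounds.items():
        for position, constituent in enumerate(constituents):
            index[(position, constituent)].append(headword)
    return index


def compounds_sharing(constituent, position, index, exclude=None):
    """Compound words whose constituent at the given position has the same notation."""
    return [h for h in index.get((position, constituent), []) if h != exclude]


# Hypothetical dictionary: compound headword -> ordered constituent words.
COMPOUNDS = {
    "host computer": ["host", "computer"],
    "host processor": ["host", "processor"],
    "personal computer": ["personal", "computer"],
    "computer host": ["computer", "host"],
}
INDEX = build_constituent_index(COMPOUNDS)
print(compounds_sharing("host", 0, INDEX, exclude="host computer"))
```

Keying the index on the position as well as the constituent reflects the remark that the dictionary may be stored so that it can be searched by constituent word for each position in the order of constituent words.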

【００７４】また、この第５実施例では、文書データ記
憶部１７のすべての文書について、検索語に隣接する単
語を検索し表示したが、入力された検索語で文書の検索
を行ない、検索された文書の中からのみ検索語に隣接す
る単語を検索し表示してもよい。さらに、利用者が検索
された文書の中から適切であると判断した任意の数の文
書に対して、その中からのみ検索語に隣接する単語を検
索し表示してもよい。また、利用者が検索された文書の
中から適切でないと判断した任意の数の文書に対して、
その中から検索語に隣接する単語を検索表示し、利用者
がｎｏｔの条件を付けて検索するのに用いられることが
できるようにすることもできる。In the fifth embodiment, words adjacent to the search word are retrieved from all the documents in the document data storage unit 17 and displayed. Alternatively, the documents may first be searched with the input search word, and words adjacent to the search word retrieved and displayed only from the documents found. Further, words adjacent to the search word may be retrieved and displayed only from any number of the retrieved documents that the user judges to be relevant. Conversely, for any number of the retrieved documents that the user judges not to be relevant, words adjacent to the search word may be retrieved from them and displayed so that the user can use them in a search with a NOT condition.

【００７５】以上のように第５実施例によれば、入力さ
れた検索語が複合語である場合に、複合語を構成してい
る単語が検索対象の文書中で他のどのような単語と隣接
しているかを表示したので、利用者が必要な複合語や単
語を検索語に追加し、検索できるようにすることができ
る。従って、利用者は最初に検索語を漏れなく入力する
必要がないので、検索式の入力の負担が軽減され、また
検索漏れを防ぐ効果を得られる。また、追加した検索語
を同義語辞書に保存しておくことによって後の検索時に
利用することができる。As described above, according to the fifth embodiment, when the input search word is a compound word, the device displays which other words the constituent words of the compound word are adjacent to in the documents to be searched, so the user can add the necessary compound words or words to the search words and then search. The user therefore does not need to enter every search word at the outset, which reduces the burden of entering a search expression and helps prevent search omissions. In addition, storing the added search words in the synonym dictionary makes them available for later searches.

【００７６】次に、本発明の第６実施例について図３６
乃至図４１を参照して説明する。この第６実施例の機器
構成と機能構成、および入力解析部１２における形態素
解析、構文意味解析と検索式作成の処理は、第２実施例
と同じであるので説明を省略する。文書データ記憶部１
７、検索部１５における文書検索の処理、検索結果表示
部などの説明は第２実施例と同様であるのでここでは省
略する。なお、同義候補作成の処理は検索部１５で行な
われる。Next, the sixth embodiment of the present invention will be described with reference to FIGS. 36 to 41. The device configuration and functional configuration of the sixth embodiment, and the morphological analysis, syntactic and semantic analysis, and search expression creation processing in the input analysis unit 12, are the same as in the second embodiment, so their description is omitted. The document data storage unit 17, the document search processing in the search unit 15, the search result display unit, and the like are also the same as in the second embodiment and are omitted here. Note that the synonym candidate creation processing is performed by the search unit 15.

【００７７】図３８は文書データ記憶部１７内の意味構
造インデックスを示しておく。文書データ記憶部１７に
はさらに、図３９に示す意味構造の接続情報が格納され
ている。図３９では、図３８に示されている意味構造の
番号で、どの意味構造がどの意味構造に接続するかが記
憶されている。例えば、１０１番の意味構造の後ろに、
３０１番と４０５番の意味構造が接続することが示され
ている。意味構造の番号と、図３９の接続情報により、
任意の意味構造に接続する意味構造が検索できる。FIG. 38 shows the semantic structure index in the document data storage unit 17. The document data storage unit 17 further stores the connection information for semantic structures shown in FIG. 39. In FIG. 39, the semantic structure numbers shown in FIG. 38 are used to record which semantic structure connects to which. For example, it shows that semantic structures No. 301 and No. 405 connect after semantic structure No. 101. Using the semantic structure numbers and the connection information of FIG. 39, the semantic structures that connect to any given semantic structure can be retrieved.
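The connection lookup can be illustrated with a small sketch. Only the example that structures No. 301 and No. 405 follow No. 101 comes from the text; the dictionary layout and the function names are assumptions and do not reproduce the table format of FIG. 39.

```python
# Connection information keyed by semantic structure number.
# The single entry mirrors the example in the text; the layout is hypothetical.
CONNECTIONS = {101: [301, 405]}

def following(structure_id, connections):
    """Semantic structures that connect after the given structure."""
    return list(connections.get(structure_id, []))

def preceding(structure_id, connections):
    """Semantic structures that the given structure connects after."""
    return sorted(src for src, dsts in connections.items() if structure_id in dsts)
```

Looking up both directions from the same table is one possible design; a real index might store forward and backward links separately for speed.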

【００７８】入力部１１から検索要求入力文を入力する
と、図３６に示すように、入力文、検索式および検索式
中の単語が表示されている。利用者が、［同義候補表
示］をマウスなどで指定すると、入力部１１は入力解析
部１２を起動する。図３６では、利用者が検索語の中の
「翻訳」を指定し、「翻訳」が表示されている行の四角
の記号が反転表示されている状態を示している。When a search request input sentence is input from the input unit 11, as shown in FIG. 36, the input sentence, the search formula, and the words in the search formula are displayed. When the user designates “display of synonymous candidates” with a mouse or the like, the input unit 11 activates the input analysis unit 12. In FIG. 36, the user has designated “translation” in the search word, and the square symbol of the line in which “translation” is displayed is highlighted.

【００７９】そして、検索部１５は、同義候補作成の処
理と文書検索の処理を行なう。検索部１５は、同義語候
補作成の処理を行ない、その結果を入力部１１へ転送す
る。入力部１１は転送された結果を入力部用ウィンドウ
に表示する。図３７に、同義候補作成処理結果の表示の
一例を示すもので、「機械翻訳」の同義語候補として
「自動翻訳」と「人間翻訳」が表示されている。Then, the search section 15 carries out a synonym candidate creation process and a document search process. The search unit 15 performs a synonym candidate creation process and transfers the result to the input unit 11. The input unit 11 displays the transferred result in the input unit window. FIG. 37 shows an example of the display of the synonym candidate creation processing result, and "automatic translation" and "human translation" are displayed as synonym candidates for "machine translation".

【００８０】利用者は入力部用ウンイドに表示された同
義語候補作成処理結果について、修正、削除、追加を行
なうことができる。図３７で「機械翻訳」の同義候補の
中から、利用者が「自動翻訳」を選択した後の表示の例
を図４０に示す。図４０で、利用者が［選択］を指定す
ると、検索部１５は当該の修正結果を同義語辞書と図示
しないバッファに格納し、制御を入力部１１へ戻す。入
力部１１はバッファの内容に応じて、図４１に示すよう
に表示する。The user can correct, delete, and add to the synonym candidate creation results displayed in the input unit window. FIG. 40 shows an example of the display after the user selects “automatic translation” from the synonym candidates for “machine translation” in FIG. 37. When the user specifies [select] in FIG. 40, the search unit 15 stores the modified result in the synonym dictionary and in a buffer (not shown), and returns control to the input unit 11. The input unit 11 then produces the display shown in FIG. 41 according to the contents of the buffer.

【００８１】なお、特願平５−１２５６１号公報では、
予め文書を形態素解析および構文意味解析して作成した
キーワードインデックステーブルを用いて検索するが、
次のように、予め解析しない方法でもよい。まず、入力
された検索語について、文字とその文字を含む文書のＩ
Ｄからなるキーワードインデックステーブルを用いて検
索し、次に、検索された文書中の文を形態素解析、構文
意味解析し、上述と同様にして同義候補を抽出し表示す
る。その後、利用者が同義候補の選択を行ない、検索語
と選択された同義候補について、文字のキーワードイン
デックステーブルを用いて検索する。次に、検索された
文書中の文を形態素解析、構文意味解析し、単語および
構文意味解析結果がその文書に含まれているかどうかを
チェックする。その結果、当該解析結果を含む文書のみ
を表示する。なお、キーワードインデックステーブルを
用いずに直接文書解析処理することも可能である。In Japanese Patent Application No. 5-12561, documents are searched using a keyword index table created in advance by morphological analysis and syntactic and semantic analysis of the documents, but a method that does not analyze the documents in advance may also be used, as follows. First, the input search word is searched for using a keyword index table consisting of characters and the IDs of the documents containing each character. Next, the sentences in the retrieved documents are subjected to morphological analysis and syntactic and semantic analysis, and synonym candidates are extracted and displayed in the same manner as described above. The user then selects synonym candidates, and the search word and the selected synonym candidates are searched for using the character keyword index table. Next, the sentences in the retrieved documents are subjected to morphological analysis and syntactic and semantic analysis to check whether the words and the syntactic and semantic analysis results are contained in each document. Only the documents that contain them are displayed. It is also possible to analyze the documents directly without using the keyword index table.
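The two-stage idea above (a coarse character-level index followed by a check of the retrieved documents) might look like the sketch below. It is not the method of the cited application: the verification step here is a plain substring test standing in for morphological and syntactic-semantic analysis, and the `DOCS` data and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical documents. Document 2 contains every character of "機械翻訳"
# but not the word itself, so it is a false hit of the character index.
DOCS = {
    1: "機械翻訳の研究",
    2: "翻訳機械の修理",
    3: "自動翻訳システム",
}

def build_char_index(docs):
    """Map each character to the IDs of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for ch in set(text):
            index[ch].add(doc_id)
    return index

def candidate_docs(term, index):
    """Stage 1: documents containing every character of the term."""
    sets = [index.get(ch, set()) for ch in term]
    return set.intersection(*sets) if sets else set()

def verified_docs(term, docs, index):
    """Stage 2: keep only candidates that really contain the term.
    A substring test stands in for the analysis step of the text."""
    return sorted(d for d in candidate_docs(term, index) if term in docs[d])

CHAR_INDEX = build_char_index(DOCS)
```

The character index needs no analysis at indexing time, at the cost of false candidates such as document 2 that the second stage has to filter out.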

【００８２】以上の説明では、文書から同義候補を抽出
し表示したが、当該の同義候補を含んでいる文を表示し
てもよい。さらに、以上の説明では、同義語の候補を、
検索時に表示する方法について述べたが、利用者が解析
用辞書に単語を登録する時点に表示してもよい。In the above description, synonym candidates are extracted from the documents and displayed, but the sentences containing those synonym candidates may be displayed. Furthermore, although the above description covers displaying synonym candidates at the time of retrieval, they may also be displayed when the user registers a word in the analysis dictionary.

【００８３】この実施例では、入力文の構文意味解析結
果を構成する部分が、文書中でどのような単語と隣接し
ているかを表示した。構成する部分ではなく、検索語ま
たは入力文の構文意味解析結果全体が、文書中でどのよ
うな単語と隣接しているかを表示してもよい。This embodiment displays which words in the documents are adjacent to the parts that make up the syntactic and semantic analysis result of the input sentence. Instead of those constituent parts, it may display which words in the documents are adjacent to the search word or to the entire syntactic and semantic analysis result of the input sentence.

【００８４】また、構文意味解析の情報を格納した辞書
を用いて、入力された検索式の一部を含む単語を検索す
ることができる。図４６に、この場合の入力解析部１２
の同義候補作成処理の流れを示す。Further, by using a dictionary that stores syntactic and semantic analysis information, it is possible to search for words that include a part of the input search expression. FIG. 46 shows the flow of the synonym candidate creation processing of the input analysis unit 12 in this case.

【００８５】この第６実施例では、文書データ記憶部１
７のすべての文書について、検索語に隣接する単語を検
索し表示したが、入力された検索語で文書の検索を行な
い、検索された文書の中からのみ検索語に隣接する単語
を検索し表示してもよい。さらに、利用者が検索された
文書の中から適切であると判断した任意の数の文書に対
して、その中からのみ検索語に隣接する単語を検索し表
示してもよい。また、利用者が検索された文書の中から
適切でないと判断した任意の数の文書に対して、その中
から検索語に隣接する単語を検索表示し、利用者がｎｏ
ｔの条件を付けて検索するのに用いられることができる
ようにすることもできる。In the sixth embodiment, words adjacent to the search word are searched for and displayed over all the documents in the document data storage unit 17. Alternatively, documents may first be retrieved with the input search word, and words adjacent to the search word may be searched for and displayed only within the retrieved documents. Furthermore, the search for adjacent words may be limited to any number of documents that the user has judged to be appropriate among the retrieved documents. Conversely, for any number of documents that the user has judged to be inappropriate among the retrieved documents, words adjacent to the search word may be searched for and displayed so that the user can use them in a search with a NOT condition.

【００８６】以上のように第６実施例によれば、入力さ
れた検索語が複合語である場合に、複合語を構成してい
る単語が検索対象の文書中で他のどのような単語と隣接
しているかを表示するようにしたので、利用者が必要な
複合語や単語を検索語に追加し、検索することができる
ようにする。従って、利用者は最初に検索語を漏れなく
入力する必要がないので、検索式の入力の負担が軽減さ
れ、また検索漏れを防ぐ効果を得られる。また、追加し
た検索語を同義語辞書に保存しておくことによって後の
検索時に利用することができる。As described above, according to the sixth embodiment, when the input search word is a compound word, the device displays which other words the constituent words of the compound word are adjacent to in the documents to be searched, so the user can add the necessary compound words or words to the search words and then search. The user therefore does not need to enter every search word at the outset, which reduces the burden of entering a search expression and helps prevent search omissions. In addition, storing the added search words in the synonym dictionary makes them available for later searches.

【００８７】次に、本発明の第７実施例について図４２
及び図４３を参照して説明する。前述した第５実施例で
は検索語を構成する単語が、文書中でどのような単語と
隣接しているかを表示したが、この第７実施例では、検
索語と一致する文字列がどのような文字列と隣接してい
るかを表示するようにしている。Next, the seventh embodiment of the present invention will be described with reference to FIGS. 42 and 43. The fifth embodiment described above displays which words in the documents are adjacent to the words that make up the search word; the seventh embodiment instead displays which character strings are adjacent to character strings that match the search word.

【００８８】この第７実施例の機器構成、機能構成およ
び入力解析部１２における形態素解析と検索式作成の処
理は、前述した第１実施例と同じであるので説明を省略
する。また、文書データ記憶部１７、検索部１５、検索
結果表示部１６などの説明は前述した第１実施例と同様
であるのでここでは省略する。The device configuration, the functional configuration, and the morphological analysis and the retrieval expression creating process in the input analysis unit 12 of the seventh embodiment are the same as those in the first embodiment described above, and the description thereof will be omitted. The description of the document data storage unit 17, the search unit 15, the search result display unit 16 and the like is the same as in the first embodiment described above, and therefore will be omitted here.

【００８９】検索部１５は、検索結果の文書中の文字列
について、入力文字列との照合を行なう。そして、図４
２に示すように、入力文字列を含む検索結果表示部から
表示する。The search unit 15 collates the character strings in the retrieved documents with the input character string. Then, as shown in FIG. 42, the portions containing the input character string are displayed on the search result display unit.

【００９０】図４２では、「計算機」が入力された場合
の検索結果の文書から、「計算機」という文字列を含ん
だ文を検索した結果を表示している。また、「計算機」
の直後の文字列について、ひらがな、カタカナ、漢字、
その他の４つの文字の種類に従って、「計算機」の直後
の文字列で同じ文字種である範囲の文字が同じである文
は所定の１文のみを表示するようにしてもよい。図４２
の結果について、この処理を行なった結果を図４３に示
す。FIG. 42 shows the result of searching the retrieved documents, for the input "calculator", for sentences containing the character string "calculator". In addition, the character string immediately after "calculator" may be classified by four character types (hiragana, katakana, kanji, and other), and among sentences in which the run of same-type characters immediately after "calculator" is identical, only one predetermined sentence may be displayed. FIG. 43 shows the result of applying this processing to the result of FIG. 42.
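The character-type grouping can be sketched as below. This is a minimal illustration, not the processing behind FIG. 43: the Unicode ranges are a simplification, the sample sentences are invented, and keeping the first sentence of each group is an assumption about what "predetermined" means.

```python
def char_type(ch):
    """Classify a character into one of four types (simplified Unicode ranges)."""
    if "\u3041" <= ch <= "\u309f":
        return "hiragana"
    if "\u30a0" <= ch <= "\u30ff":
        return "katakana"
    if "\u4e00" <= ch <= "\u9fff":
        return "kanji"
    return "other"

def following_run(sentence, term):
    """Return the run of same-type characters right after the first match of
    `term`, an empty string if nothing follows, or None if there is no match."""
    pos = sentence.find(term)
    if pos < 0:
        return None
    rest = sentence[pos + len(term):]
    if not rest:
        return ""
    kind = char_type(rest[0])
    end = 1
    while end < len(rest) and char_type(rest[end]) == kind:
        end += 1
    return rest[:end]

def dedupe_by_following_run(sentences, term):
    """Keep one sentence (here: the first seen) per distinct following run."""
    seen, kept = set(), []
    for sentence in sentences:
        run = following_run(sentence, term)
        if run is None or run in seen:
            continue
        seen.add(run)
        kept.append(sentence)
    return kept

# Hypothetical sentences containing the search word "計算機".
SENTENCES = [
    "計算機システムを導入した。",
    "新しい計算機システムが動く。",
    "計算機の性能を測る。",
    "計算機科学を学ぶ。",
]
```

Here the first two sentences share the following run "システム", so only the first of them is kept, while the sentences followed by "の" and "科学" each remain.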

【００９１】以上のように第７実施例によれば、検索語
と一致する文字列がどのような文字列と隣接しているか
を表示するようにしたので、検索漏れを防ぐ効果を得ら
れる。As described above, according to the seventh embodiment, it is possible to display what kind of character string the character string that matches the search word is adjacent to. Therefore, it is possible to obtain an effect of preventing omission of the search.

【００９２】なお、以上の第１実施例から第４実施例の
辞書は、検索処理のための辞書だけではなく、利用者の
用語集や、機械翻訳システムやワードプロセッサなどの
辞書を用いて同義候補を作成するように拡張することが
できる。The dictionaries of the first to fourth embodiments described above are not limited to dictionaries for retrieval processing; the embodiments can be extended to create synonym candidates using a user's glossary or the dictionaries of a machine translation system, a word processor, or the like.

【００９３】前述した第５実施例、第６実施例、第７実
施例では、検索語の後ろに続く単語を表示する場合を説
明したが、前に続く単語、あるいはその両方を検索し、
表示するようにすることが可能である。第１実施例から
第７実施例では、日本語の解析と日本語文書の検索の例
を述べたが、同様に他の言語で行なうことも可能であ
る。In the fifth, sixth, and seventh embodiments described above, the case of displaying the words that follow the search word has been described, but it is also possible to search for and display the words that precede it, or both. The first to seventh embodiments describe examples of Japanese analysis and Japanese document retrieval, but the same can be done in other languages.

【００９４】[0094]

【発明の効果】以上詳述したように本発明によれば、検
索要求文が入力されると、利用者が要求すれば、日本語
解析用辞書および検索対象の文書を検索して、同義語等
を表示するようにしたので、利用者がどのような検索語
・検索式を入力すればよいかがわかるようにでき、少な
い労力で、検索漏れの少ない検索結果を得られるように
することができる。As described in detail above, according to the present invention, when a search request sentence is input, the Japanese analysis dictionary and the documents to be searched are searched at the user's request and synonyms and the like are displayed. The user can thus see what search words and search expressions to input, and can obtain search results with few omissions with little effort.

[Brief description of drawings]

【図１】本発明の第１実施例に係わる機器構成を示すブ
ロック図。FIG. 1 is a block diagram showing a device configuration according to a first embodiment of the present invention.

【図２】第１実施例の機能構成を示すブロック図。FIG. 2 is a block diagram showing a functional configuration of the first embodiment.

【図３】第１実施例の入力解析部の解析結果の一例を示
す図。FIG. 3 is a diagram showing an example of an analysis result of an input analysis unit of the first embodiment.

【図４】第１実施例の入力部の表示の一例を示す図。FIG. 4 is a diagram showing an example of a display of an input unit of the first embodiment.

【図５】第１実施例の入力部の表示の一例を示す図。FIG. 5 is a diagram showing an example of a display of an input unit of the first embodiment.

【図６】第１実施例の解析用辞書の形式とその一例を示
す図。FIG. 6 is a diagram showing a format of an analysis dictionary according to the first embodiment and an example thereof.

【図７】第１実施例の解析用辞書の格納形式の一例を示
す図。FIG. 7 is a diagram showing an example of a storage format of an analysis dictionary of the first embodiment.

【図８】第１実施例の解析用辞書の検索処理を示すフロ
−チャ−ト。FIG. 8 is a flowchart showing a search process of an analysis dictionary according to the first embodiment.

【図９】第１実施例の入力部の表示の一例を示す図。FIG. 9 is a diagram showing an example of a display of an input unit of the first embodiment.

【図１０】第１実施例の入力部の表示の一例を示す図。FIG. 10 is a diagram showing an example of a display of the input unit of the first embodiment.

【図１１】第１実施例のキーワードインデックステーブ
ルを示す図。FIG. 11 is a diagram showing a keyword index table of the first embodiment.

【図１２】第１実施例の検索部の処理を示すフロ−チャ
−ト。FIG. 12 is a flowchart showing the processing of the search unit of the first embodiment.

【図１３】本発明の第２実施例に係わる解析用辞書の形
式とその一例を示す図。FIG. 13 is a diagram showing a format of an analysis dictionary according to a second embodiment of the present invention and an example thereof.

【図１４】第２実施例の入力解析部の解析結果の例を示
す図。FIG. 14 is a diagram showing an example of an analysis result of the input analysis unit of the second embodiment.

【図１５】第２実施例の不要表現規則の例を示す図。FIG. 15 is a diagram showing an example of an unnecessary expression rule according to the second embodiment.

【図１６】第２実施例の入力部の表示の一例を示す図。FIG. 16 is a diagram showing an example of a display of an input unit of the second embodiment.

【図１７】第２実施例の入力部の表示の一例を示す図。FIG. 17 is a diagram showing an example of a display on the input unit according to the second embodiment.

【図１８】第２実施例の解析用辞書の検索処理を示すフ
ロ−チャ−ト。FIG. 18 is a flowchart showing the search processing of the analysis dictionary of the second embodiment.

【図１９】第２実施例の入力部の表示の一例を示す図。FIG. 19 is a diagram showing an example of a display on the input unit according to the second embodiment.

【図２０】第２実施例の入力部の表示の一例を示す図。FIG. 20 is a diagram showing an example of a display of an input unit of the second embodiment.

【図２１】本発明の第３実施例に係わる入力部の表示の
一例を示す図。FIG. 21 is a diagram showing an example of a display of an input unit according to the third embodiment of the present invention.

【図２２】第３実施例の入力部の表示の一例を示す図。FIG. 22 is a diagram showing an example of a display on the input unit of the third embodiment.

【図２３】第３実施例の解析用辞書の検索処理を示すフ
ロ−チャ−ト。FIG. 23 is a flowchart showing the search processing of the analysis dictionary of the third embodiment.

【図２４】第３実施例の入力部の表示の一例を示す図。FIG. 24 is a diagram showing an example of a display of an input unit of the third embodiment.

【図２５】第３実施例の入力部の表示の一例を示す図。FIG. 25 is a diagram showing an example of a display of an input unit of the third embodiment.

【図２６】本発明の第４実施例に係わる入力部の表示の
一例を示す図。FIG. 26 is a diagram showing an example of a display of an input unit according to the fourth embodiment of the present invention.

【図２７】第４実施例の入力部の表示の一例を示す図。FIG. 27 is a diagram showing an example of a display on the input unit according to the fourth embodiment.

【図２８】第４実施例の解析用辞書の検索処理を示すフ
ロ−チャ−ト。FIG. 28 is a flowchart showing the search processing of the analysis dictionary of the fourth embodiment.

【図２９】第４実施例の入力部の表示の一例を示す図。FIG. 29 is a diagram showing an example of a display on the input unit according to the fourth embodiment.

【図３０】第４実施例の入力部の表示の一例を示す図。FIG. 30 is a diagram showing an example of a display on the input unit of the fourth embodiment.

【図３１】本発明の第５実施例に係わる入力部の表示の
一例を示す図。FIG. 31 is a diagram showing an example of a display of an input unit according to the fifth embodiment of the present invention.

【図３２】第５実施例の入力部の表示の一例を示す図。FIG. 32 is a diagram showing an example of a display on the input unit of the fifth embodiment.

【図３３】第５実施例のキーワードインデックステーブ
ルを示す図。FIG. 33 is a diagram showing a keyword index table of the fifth embodiment.

【図３４】第５実施例の入力部の表示の一例を示す図。FIG. 34 is a diagram showing an example of a display on the input unit of the fifth embodiment.

【図３５】第５実施例の入力部の表示の一例を示す図。FIG. 35 is a diagram showing an example of a display on the input unit of the fifth embodiment.

【図３６】本発明の第６実施例に係わる入力部の表示の
一例を示す図。FIG. 36 is a diagram showing an example of a display of an input unit according to the sixth embodiment of the present invention.

【図３７】第６実施例の入力部の表示の一例を示す図。FIG. 37 is a diagram showing an example of a display on the input unit of the sixth embodiment.

【図３８】第６実施例の入力部の表示の一例を示す図。FIG. 38 is a diagram showing an example of a display on the input unit according to the sixth embodiment.

【図３９】第６実施例の意味構造インデックステーブル
を示す図。FIG. 39 is a diagram showing a semantic structure index table according to the sixth embodiment.

【図４０】第６実施例の意味構造インデックステーブル
を示す図。FIG. 40 is a diagram showing a semantic structure index table according to the sixth embodiment.

【図４１】第６実施例の入力部の表示の一例を示す図。FIG. 41 is a diagram showing an example of a display on the input unit according to the sixth embodiment.

【図４２】本発明の第７実施例の入力部の表示の一例を
示す図。FIG. 42 is a diagram showing an example of a display on the input unit according to the seventh embodiment of the present invention.

【図４３】第７実施例の表示部の表示の一例を示す図。FIG. 43 is a diagram showing an example of a display on the display unit of the seventh embodiment.

【図４４】解析用辞書の形式とその一例を示す図。FIG. 44 is a diagram showing a format of an analysis dictionary and an example thereof.

【図４５】解析用辞書の検索処理を示すフロ−チャ−
ト。FIG. 45 is a flowchart showing a search process of an analysis dictionary.

【図４６】解析用辞書の検索処理を示すフロ−チャ−
ト。FIG. 46 is a flowchart showing a search process of an analysis dictionary.

[Explanation of symbols]

１…中央処理装置、２…記憶部、３…表示コントロ−
ラ、４…表示部、５…入力コントロ−ラ、６，１１…入
力部、１２…入力解析部、１３…解析用辞書、１４…制
御部、１５…検索部、１６…検索結果表示部、１７…文
書デ−タ記憶部。1 ... Central processing unit, 2 ... Storage unit, 3 ... Display controller, 4 ... Display unit, 5 ... Input controller, 6, 11 ... Input unit, 12 ... Input analysis unit, 13 ... Analysis dictionary, 14 ... Control unit, 15 ... Search unit, 16 ... Search result display unit, 17 ... Document data storage unit.

Claims

[Claims]

1. A document retrieval device comprising: a dictionary that stores at least headwords; display means; document storage means for storing a plurality of documents; input means for inputting a search character string; dictionary search means for searching the dictionary for headwords consisting of character strings that include the search character string input by the input means; means for displaying the search results of the dictionary search means on the display means; selecting means for selecting search characters from the search results displayed on the display means; searching means for searching the documents stored in the document storage means for the search characters selected by the selecting means; and means for displaying the search results of the searching means on the display means.

2. A document retrieval device comprising: a dictionary that stores at least headwords; display means; document storage means for storing a plurality of documents; input means for inputting a search character string; dictionary search means for searching the dictionary for headwords that match a part of the search character string input by the input means; means for displaying the search results of the dictionary search means on the display means; selecting means for selecting search characters from the search results displayed on the display means; searching means for searching the documents stored in the document storage means for the search characters selected by the selecting means; and means for displaying the search results of the searching means on the display means.

3. A document retrieval device comprising: a dictionary that stores information on the words constituting compound words; display means; document storage means for storing a plurality of documents; input means for inputting a search compound word; dictionary search means for searching the dictionary for compound words composed of a constituent word identical to a constituent word of the search compound word input by the input means; means for displaying the search results of the dictionary search means on the display means; selecting means for selecting search characters from the search results displayed on the display means; searching means for searching the documents stored in the document storage means for the search characters selected by the selecting means; and means for displaying the search results of the searching means on the display means.

4. A document retrieval device comprising: display means; document storage means for storing a plurality of documents; input means for inputting a search character string; first searching means for searching the documents stored in the document storage means for character strings adjacent to the search character string input by the input means; means for displaying the search results of the first searching means on the display means; selecting means for selecting search characters from the search results displayed on the display means; second searching means for searching the documents stored in the document storage means for the search characters selected by the selecting means; and means for displaying the search results of the second searching means on the display means.

5. The document retrieval device according to claim 4, wherein the first searching means performs its search on documents that the user has selected or discarded by the selecting means.

6. The document retrieval device according to claim 4, wherein the first searching means displays only those results in which the characters within the same-character-type range immediately before or after the character string input by the user from the input means differ from one another.