JPH09198400A - Information retrieval device

Info

Publication number: JPH09198400A
Application number: JP8006055A
Authority: JP
Inventors: Takehiro Koyama; 剛弘小山
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 1996-01-17
Filing date: 1996-01-17
Publication date: 1997-07-31

Abstract

PROBLEM TO BE SOLVED: To prioritize search results more precisely even when the keyword is a sentence or a clause, and to improve retrieval efficiency by prompting the user to check the results in descending order of priority. SOLUTION: A thesaurus expansion unit 2 refers to a thesaurus dictionary 3, expands the keyword input from an input unit 1, and passes the expanded keyword to a search unit 4. The search unit 4 searches a text information storage unit 5 using the thesaurus-expanded words and passes the search results to a thesaurus expansion check unit 6, a relation check unit 7, and a related word check unit 8. The check units 6, 7, and 8 respectively check whether a result was hit through thesaurus expansion, whether the words and inter-word relations match those of the keyword, and the number and positions of the keywords and related words, and pass the check results to a priority calculation unit 9. The priority calculation unit 9 calculates a priority by weighting the check results. A display unit 10 displays the search results in accordance with the priority.

Description

Detailed Description of the Invention

[0001]

[Field of the Invention] The present invention relates to an information retrieval device that searches text information or the like using a keyword, calculates the importance of each search result, and displays the search results according to that importance.

[0002]

[Description of the Related Art] In recent years, as more capable document processing has increased the volume of documents that can be handled, the amount of information subject to retrieval has also tended to increase, and the demand for obtaining important information more efficiently keeps growing. For example, in a device that searches text information using keywords, the search results mix information of many levels, from important information to junk that has almost nothing to do with the keyword, and the challenge is how to obtain the important information efficiently from such a large volume.

[0003] In response to this demand, a technique has been put to practical use in which keyword search results are prioritized by importance and presented to the user, who checks them starting from the highest priority and thereby obtains important information efficiently.

[0004] Meanwhile, quite a few devices that perform keyword-based information retrieval, such as the document retrieval device described in Japanese Patent Laid-Open No. 62-248032, expand the keyword with synonyms and the like using a thesaurus dictionary in order to minimize retrieval omissions.

[0005] However, in this type of document retrieval device, expanding the keyword with synonyms and the like using the thesaurus dictionary helps prevent retrieval omissions but also enlarges the set of search results further, so prioritizing the search results becomes even more important than in other devices.

[0006] Typical conventional examples of prioritizing search results include the information retrieval method described in Japanese Patent Laid-Open No. 59-223865 and the document retrieval device described in Japanese Patent Laid-Open No. 4-281565. The former prioritizes results by the number of keywords contained in the searched text. The latter prioritizes results by the number and positions (title, abstract, body) of the keywords.

[0007] Other known conventional techniques for prioritizing search results include a method that calculates the priority of a search result from the number and positions of keywords (words) and related words (words strongly related to the keyword), and a method that extracts words and inter-word relations from the keyword (a word, clause, or sentence), likewise extracts words and inter-word relations from the search results, and preferentially displays the results whose words and relations match.

[0008]

[Problems to Be Solved by the Invention] The importance of a search result can be judged by the amount of description it contains concerning the keyword. Such description may be direct description, which refers to the keyword itself, or indirect description, which discusses items related to the keyword.

[0009] Methods that prioritize by the number or positions of keywords in the search results (Japanese Patent Laid-Open No. 59-223865 or No. 4-281565) consider only direct description of the keyword. In particular, when a result contains few sentences and few keyword occurrences, the scores differ little and accurate prioritization is not possible.

[0010] In this respect, a scheme that prioritizes by the number and positions of keywords and related words in the search results can be said to take both direct and indirect description into account to some extent, since the keyword itself corresponds to direct description and the related words correspond to indirect description. This scheme can prioritize accurately even when there are few sentences, but it restricts the keyword to a single word and cannot prioritize at all when the keyword is a sentence or a clause.

[0011] Attention has therefore turned to a scheme that enables prioritization even when the keyword is a sentence or a clause: multiple words and the relations between them are extracted from the keyword, words and inter-word relations are likewise extracted from the search results, and the search results whose relations match are displayed preferentially.

[0012] In this scheme, for example, when the keyword "create a document" (文書を作成する) is given, the words "document" and "create" and the relation between them, the wo-case (the object-marking particle を), are extracted. Search results in which the words and the inter-word relation match, such as "create a document", "creation of a document", and "document creation", are then displayed preferentially, so the range of application is far wider than that of schemes that restrict the keyword to a single word.

[0013] However, because this scheme evaluates only whether the words and inter-word relations in the keyword and in the search result match, the calculated priority is merely binary (match or mismatch). Prioritization therefore remains coarse, and even if the search results are displayed in priority order to prompt checking from the highest priority, the expected benefit in terms of obtaining important information efficiently cannot be achieved.

[0014] The present invention was made to solve the above problems. Its object is to provide an information retrieval device that extends the prioritization of keyword search results so that it also works when the keyword is a sentence or a clause, that enables highly accurate prioritization of the search results through a comprehensive evaluation based on a more diverse set of checks, and that lets important information be obtained more efficiently by prompting the user to check the results from the highest priority.

[0015]

[Means for Solving the Problems] To achieve the above object, the first invention is characterized by comprising: text information storage means for storing text information; input means for inputting a search key; search expression creation means for creating, from the search key input via the input means, a search expression that includes synonyms obtained by synonym expansion of the words of the search key; related word storage means for storing related words; related word acquisition means for acquiring the related words of the search key from the related word storage means; search means for searching the text information storage means with the search expression; synonym expansion check means for checking, for each search result of the search means, how many of the synonym-expanded words in the search expression it contains; relation check means for checking, for each search result, whether the relations between words in that search result match the relations between words in the search key; related word check means for checking, for each search result, the number and positions of the search key and its related words contained in that search result; priority calculation means for calculating a priority for each search result based on the check results of the synonym expansion check means, the relation check means, and the related word check means; and display means for displaying each search result according to the priority corresponding to that search result.

[0016] The second invention is characterized by comprising: text information storage means for storing text information; analysis information storage means for storing analysis information obtained by analyzing the text information in advance; input means for inputting a search key; search expression creation means for creating, from the search key input via the input means, a search expression that includes synonyms obtained by synonym expansion of the words of the search key; related word storage means for storing related words; related word acquisition means for acquiring the related words of the search key from the related word storage means; search means for searching the analysis information storage means with the search expression; synonym expansion check means for checking, for each search result of the search means, how many of the synonym-expanded words in the search expression it contains; relation check means for checking, for each search result, whether the relations between words in that search result match the relations between words in the search key; related word check means for checking, for each search result, the number and positions of the search key and its related words contained in that search result; priority calculation means for calculating a priority for each search result based on the check results of the synonym expansion check means, the relation check means, and the related word check means; and display means for displaying each search result according to the priority corresponding to that search result.

[0017]

[Embodiments of the Invention] Embodiments of the present invention are described in detail below with reference to the accompanying drawings. FIG. 1 shows the schematic configuration of an information retrieval device according to an embodiment of the first invention. The device comprises an input unit 1, a thesaurus expansion unit 2, a thesaurus dictionary 3, a search unit 4, a text information storage unit 5, a thesaurus expansion check unit 6, a relation check unit 7, a related word check unit 8, a priority calculation unit 9, and a display unit 10.

[0018] The input unit 1 has a morphological analysis dictionary. On receiving a keyword D1 from the user, it performs morphological analysis using that dictionary and transmits the analysis information D2 of keyword D1 to the thesaurus expansion unit 2.

[0019] On receiving the analysis information D2 of keyword D1, the thesaurus expansion unit 2 searches the thesaurus dictionary 3 for the nouns and sa-hen verbs (verbal nouns that take suru) in keyword D1 and obtains synonym and related word information D3. It then uses this information D3 to expand keyword D1 with synonyms and transmits the resulting synonym expansion information and related word information, together referred to as D4, to the search unit 4.

[0020] The thesaurus dictionary 3 stores, for each headword, information on its synonyms and related words. Here a synonym is a word with the same meaning as the headword, and a related word is a word whose meaning differs from the headword but which is strongly related to it. For example, for the headwords "AI" and "OA", the following information is stored:
AI: synonyms "artificial intelligence, ..."; related words "computer, expert system, ..."
OA: synonyms "office automation, ..."; related words "word processor, personal computer, fax, ..."
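The headword-to-synonyms-and-related-words layout described above can be sketched as a small lookup table. This is only an illustration: the names `ThesaurusEntry`, `THESAURUS`, and `lookup` are hypothetical and do not appear in the patent, and the entries are limited to the words the text actually lists.

```python
from dataclasses import dataclass, field


@dataclass
class ThesaurusEntry:
    """One headword's record: same-meaning words and strongly related words."""
    synonyms: list = field(default_factory=list)
    related: list = field(default_factory=list)


# Hypothetical in-memory stand-in for the thesaurus dictionary 3.
THESAURUS = {
    "AI": ThesaurusEntry(
        synonyms=["artificial intelligence"],
        related=["computer", "expert system"],
    ),
    "OA": ThesaurusEntry(
        synonyms=["office automation"],
        related=["word processor", "personal computer", "fax"],
    ),
}


def lookup(headword):
    """Return the entry for a headword, or an empty entry if it is unknown."""
    return THESAURUS.get(headword, ThesaurusEntry())


print(lookup("OA").related)
```

Keeping synonyms and related words in separate fields matters later, because the two kinds of words are scored differently.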

[0021] On receiving the information D4, consisting of the synonym expansion information and the related word information of keyword D1, the search unit 4 searches the text information storage unit 5 with a search expression that treats the independent words (content words) of keyword D1 as an intersection (AND) and the synonyms of each word as a union (OR). It transmits the search result D5 to the thesaurus expansion check unit 6, the relation check unit 7, and the related word check unit 8.

[0022] The text information storage unit 5 stores text information, and searching it with any word returns the descriptions concerning that word.

[0023] On receiving the search result D5 from the search unit 4, the thesaurus expansion check unit 6 refers to the synonym information obtained by the thesaurus expansion, checks how many of those synonyms were hit in the search result D5, and transmits the check result D6 to the priority calculation unit 9.

[0024] On receiving the search result D5 from the search unit 4, the relation check unit 7 extracts from keyword D1 the words that make it up and the relations between those words, likewise extracts the words and inter-word relations from each item of the search result D5, checks whether the words and inter-word relations of the two (keyword D1 and search result D5) match, and transmits the check result D7 to the priority calculation unit 9.

[0025] On receiving the search result D5 from the search unit 4, the related word check unit 8 checks the number and positions of keyword D1 and its related words present in the search result D5 and transmits the check result D8 to the priority calculation unit 9.

[0026] On receiving the search result D5, the thesaurus expansion check result D6, the relation check result D7, and the related word check result D8, the priority calculation unit 9 calculates the priority of each item of the search result D5 by reflecting these check results and transmits the resulting priority information D9 to the display unit 10. On receiving the priority information D9 for the search result D5 from the priority calculation unit 9, the display unit 10 displays each item of the search result D5 according to its priority.

[0027] Next, the operation of each unit is described using a concrete example. FIG. 2 shows the flow of processing, annotated with example inputs and outputs of each unit, for the case where the user specifies "create a document" (文書を作成する) as the keyword.

[0028] On receiving the sentence "create a document" as keyword D1, the input unit 1 performs morphological analysis, decomposes keyword D1 into morphemes, attaches part-of-speech information to each morpheme, and transmits the resulting information D2 (document [noun], wo /, create [sa-verb]) to the thesaurus expansion unit 2. Here [ ] and / carry the part-of-speech information: [noun] marks a noun, / marks a particle, and [sa-verb] marks a sa-hen verb. As the example "create a document" shows, keyword D1 is not limited to a single word and may also be a clause or a sentence.

[0029] On receiving the analysis information D2 of keyword D1 (document [noun], wo /, create [sa-verb]), the thesaurus expansion unit 2 searches the thesaurus dictionary 3 for the noun and the sa-hen verb of keyword D1 ("document" and "create") and obtains information D3, consisting of the synonym information of keyword D1 (for "document" (文書), the synonyms "dokyumento" (ドキュメント, the loanword for "document") and "specification"; for "create", the synonym "produce") and the related word information (for "document creation", the related word "word processor").

[0030] Next, using the synonym and related word information D3, the thesaurus expansion unit 2 expands keyword D1 with synonyms and transmits the result, namely the synonym expansion information [(document, dokyumento, specification) wo (create, produce)] together with the related word information (word processor), to the search unit 4 as information D4. So that keywords that are sentences or clauses can also be handled, the headwords of the related word information in the thesaurus dictionary 3 are compound words formed by joining the independent words of keyword D1. In this example, the compound word "document creation" is synthesized from keyword D1 "create a document" and used to look up the related word information. Although the related word information is obtained from the thesaurus dictionary 3 in this example, it may instead be entered directly by the user or extracted on the fly from the text information. In the latter case, the related word information can be extracted from the text information using the frequency of co-occurrence with the keyword (how often a word appears near the keyword). For example, suppose the co-occurrence range is one sentence. When keyword D1 is a word, the frequency of each word that co-occurs with keyword D1 (appears in the same sentence as keyword D1) is counted, and the high-frequency words, excluding unnecessary words (stop words), are taken as related words. When keyword D1 is a sentence or a clause, the independent words are extracted from keyword D1, the frequency of each word that co-occurs with all of them (appears in a sentence containing every independent word of keyword D1) is counted, and the high-frequency words, excluding unnecessary words, are taken as related words.
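The co-occurrence approach described above can be sketched as follows. This is a minimal illustration under stated assumptions: sentences are assumed to be already tokenized into base forms by a morphological analyzer, each word is counted at most once per sentence, and the function name `extract_related_words` and its parameters are hypothetical rather than taken from the patent.

```python
from collections import Counter


def extract_related_words(sentences, keyword_terms, stop_words, top_n=3):
    """Pick related words by sentence-level co-occurrence with the keyword.

    sentences     : list of token lists, one list per sentence (assumed pre-analyzed)
    keyword_terms : the keyword itself, or all independent words of a phrase keyword
    stop_words    : unnecessary words to exclude from the candidates
    """
    required = set(keyword_terms)
    excluded = required | set(stop_words)
    counts = Counter()
    for tokens in sentences:
        present = set(tokens)
        # A sentence counts only if it contains every keyword term.
        if required <= present:
            counts.update(present - excluded)
    return [word for word, _ in counts.most_common(top_n)]


sentences = [
    ["word processor", "document", "create"],
    ["document", "create", "word processor", "the"],
    ["document", "table"],
    ["printer", "document", "create"],
]
print(extract_related_words(sentences, ["document", "create"], ["the"], top_n=1))
```

The same function covers both cases in the text: a single-word keyword is passed as a one-element list, and a phrase keyword is passed as the list of its independent words.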

[0031] On receiving the information D4, consisting of the synonym expansion information [(document, dokyumento, specification) wo (create, produce)] and the related word information (word processor) of keyword D1, the search unit 4 searches the text information storage unit 5 with a search expression in which the independent words of keyword D1 are combined as an intersection and the synonyms of each word as a union, that is, [(document, dokyumento, specification) & (create, produce)].
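The search expression above is an AND of OR-groups, which can be sketched as below. The sketch assumes each stored text has already been reduced to base-form tokens (so that "creation" is represented by "create"); the helper names and the sample `DOCUMENTS`, including the non-matching document 5, are hypothetical.

```python
def build_search_expression(expanded_groups):
    """Each group holds one independent word plus its synonyms (an OR-group)."""
    return [frozenset(group) for group in expanded_groups]


def matches(expression, tokens):
    """AND across groups, OR within each group."""
    present = set(tokens)
    return all(group & present for group in expression)


def search(expression, documents):
    """Return the ids of documents that satisfy the expression."""
    return [doc_id for doc_id, tokens in documents.items()
            if matches(expression, tokens)]


# [(document, dokyumento, specification) & (create, produce)]
EXPRESSION = build_search_expression([
    ["document", "dokyumento", "specification"],
    ["create", "produce"],
])

# Hypothetical pre-analyzed texts; 1 to 4 mirror the four example documents.
DOCUMENTS = {
    1: ["word processor", "document", "create"],
    2: ["specification", "create"],
    3: ["document", "create"],
    4: ["document", "table", "create"],
    5: ["table", "print"],
}

print(search(EXPRESSION, DOCUMENTS))
```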

[0032] In this example, as shown in column a of FIG. 2, four documents are retrieved as the search result D5: document 1 "...create a document with a word processor...", document 2 "...creation of a specification...", document 3 "...document creation...", and document 4 "...create a table in a document...". Here 1 to 4 are document numbers. The search result D5 is transmitted from the search unit 4 to the thesaurus expansion check unit 6, the relation check unit 7, and the related word check unit 8.

[0033] On receiving the search result D5 (document 1 "...create a document with a word processor...", document 2 "...creation of a specification...", document 3 "...document creation...", document 4 "...create a table in a document..."), the thesaurus expansion check unit 6 checks how many of the synonyms obtained by thesaurus expansion of keyword D1 (dokyumento, specification, produce) were hit and transmits the check result D6 to the priority calculation unit 9.

[0034] In this example, only document 2 is hit through a synonym ("specification"), so the check result D6 indicates that document 2 has exactly one synonym hit. In column b of FIG. 2, this is shown by the number "1" attached to document 2. Although this embodiment outputs how many of the thesaurus-expanded words were hit, when a hit is made through a thesaurus-expanded word, the semantic distance between the original word and the expanded word may instead be calculated, output, and used for prioritization.
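The synonym hit count can be sketched as a simple membership count. The sketch assumes pre-analyzed token lists for the four example documents and counts distinct synonyms present in each document; the function name `count_synonym_hits` is hypothetical.

```python
def count_synonym_hits(tokens, synonyms):
    """Count how many of the thesaurus-expanded synonyms appear in one result."""
    present = set(tokens)
    return sum(1 for synonym in synonyms if synonym in present)


# Synonyms produced by expanding "create a document" (original words excluded).
SYNONYMS = ["dokyumento", "specification", "produce"]

# Hypothetical pre-analyzed tokens for documents 1 to 4.
RESULTS = {
    1: ["word processor", "document", "create"],
    2: ["specification", "create"],
    3: ["document", "create"],
    4: ["document", "table", "create"],
}

check_result_d6 = {doc_id: count_synonym_hits(tokens, SYNONYMS)
                   for doc_id, tokens in RESULTS.items()}
print(check_result_d6)
```

The semantic-distance variation mentioned in the text would replace the integer count with a distance value per hit; that is not sketched here because the patent does not specify how the distance is computed.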

[0035] On receiving the search result D5 (document 1 "...create a document with a word processor...", document 2 "...creation of a specification...", document 3 "...document creation...", document 4 "...create a table in a document...") from the search unit 4, the relation check unit 7 extracts the words and inter-word relations from keyword D1, likewise extracts the words and inter-word relations from each of documents 1 to 4 in the search result D5, and checks whether each extracted relation matches the relation of keyword D1.

[0036] In this example, (document ← [wo-case] ← create) is extracted from keyword D1 "create a document". From the search result D5, (document ← [wo-case] ← create) is extracted from document 1 "create a document with a word processor", (specification ← [wo-case] ← create) from document 2 "creation of a specification", (document ← [wo-case] ← create) from document 3 "document creation", and (document ← [de-case] ← create) from document 4 "create a table in a document". Here [wo-case] denotes the relation, and (document ← [wo-case] ← create) indicates that the wo-case (object) of "create" is "document".

[0037] In this case, against the words and inter-word relation extracted from keyword D1 (document ← [wo-case] ← create), the relation (document ← [wo-case] ← create) extracted from document 1 "create a document with a word processor" matches, the relation (specification ← [wo-case] ← create) extracted from document 2 "creation of a specification" matches, the relation (document ← [wo-case] ← create) extracted from document 3 "document creation" matches, and the relation (document ← [de-case] ← create) extracted from document 4 "create a table in a document" does not match. The relation check unit 7 transmits the check result D7 corresponding to these judgments to the priority calculation unit 9. In column b of FIG. 2, this is shown by circles attached to documents 1, 2, and 3, for which a relation match was found.
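The match judgment above can be sketched on relations that have already been extracted as (head, case, dependent) triples. Two points are assumptions of this sketch and are not stated in the patent: the triple representation itself, and treating thesaurus synonyms as equivalent when comparing words, which is one way to arrive at the stated result that document 2 ("specification") matches. The names `canonical`, `relation_matches`, and `SYNONYM_MAP` are hypothetical.

```python
def canonical(word, synonym_map):
    """Map a synonym back to the keyword's own word; other words are unchanged."""
    return synonym_map.get(word, word)


def relation_matches(key_relation, doc_relation, synonym_map):
    """Compare two (head, case, dependent) triples, treating synonyms as equal."""
    key_head, key_case, key_dep = key_relation
    doc_head, doc_case, doc_dep = doc_relation
    return (
        key_case == doc_case
        and canonical(key_head, synonym_map) == canonical(doc_head, synonym_map)
        and canonical(key_dep, synonym_map) == canonical(doc_dep, synonym_map)
    )


# Assumption: synonyms from the thesaurus expansion map to the original words.
SYNONYM_MAP = {
    "dokyumento": "document",
    "specification": "document",
    "produce": "create",
}

KEY_RELATION = ("create", "wo", "document")

# Relations assumed to be already extracted from documents 1 to 4.
DOC_RELATIONS = {
    1: ("create", "wo", "document"),
    2: ("create", "wo", "specification"),
    3: ("create", "wo", "document"),
    4: ("create", "de", "document"),
}

check_result_d7 = {doc_id: relation_matches(KEY_RELATION, rel, SYNONYM_MAP)
                   for doc_id, rel in DOC_RELATIONS.items()}
print(check_result_d7)
```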

[0038] On receiving the search result D5 (document 1 "...create a document with a word processor...", document 2 "...creation of a specification...", document 3 "...document creation...", document 4 "...create a table in a document...") from the search unit 4, the related word check unit 8 checks the number and positions (title, body) of the related words of keyword D1 in each document (for simplicity, only the number of related words is output in this example) and transmits the check result D8 to the priority calculation unit 9.

[0039] In this example, the related word "word processor" of keyword D1 "create a document" appears in the text of document 1 ("create a document with a word processor"), so a check result D8 indicating that document 1 contains exactly one related word is transmitted to the priority calculation unit 9. In column b of FIG. 2, this is shown by the number "1", the count of related words, attached to document 1.
Is transmitted to the priority calculation unit 9. About this situation,
In the column b of FIG. 2, a document in which a related word exists is associated with a number “1” indicating the number of the related words.

[0040] On receiving the search result D5 (document 1 "...create a document with a word processor...", document 2 "...creation of a specification...", document 3 "...document creation...", document 4 "...create a table in a document..."), the thesaurus expansion check result D6 (document 2 has one hit), the relation check result D7 (documents 1, 2, and 3 match), and the related word check result D8 (document 1 contains one related word), the priority calculation unit 9 refers to the check results D6, D7, and D8 and calculates the priority of each of documents 1 to 4 in the search result D5.

[0041] The priority is calculated, for example, by applying a weight to each check item. The specific weights can be set, for example, as follows.

[0042]
(1) Hit through a thesaurus-expanded synonym: -500 per hit
(2) Words and inter-word relation match those of the keyword: 2000
(3) Related word: 100 per word
Calculating the priorities with these values gives the results shown in Table 1 below (corresponding to column c of FIG. 2). That is, document 1 has a matching relation (2000) and one related word (100), for a total of 2100. Document 2 has a hit through thesaurus expansion (-500) and a matching relation (2000), for a total of 1500. Document 3 has only a matching relation, for a total of 2000. Document 4 satisfies none of the check items, for a total of 0.
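The weighted sum described above can be worked through in a few lines. The weights and the per-document check results are the ones given in the text; the constant and function names are hypothetical.

```python
# Weights from the example: per synonym hit, per relation match, per related word.
W_SYNONYM_HIT = -500
W_RELATION_MATCH = 2000
W_RELATED_WORD = 100


def priority(synonym_hits, relation_match, related_words):
    """Combine the three check results into one priority score."""
    score = W_SYNONYM_HIT * synonym_hits
    if relation_match:
        score += W_RELATION_MATCH
    score += W_RELATED_WORD * related_words
    return score


# (synonym hits D6, relation match D7, related word count D8) per document.
CHECKS = {
    1: (0, True, 1),
    2: (1, True, 0),
    3: (0, True, 0),
    4: (0, False, 0),
}

priorities_d9 = {doc_id: priority(*checks) for doc_id, checks in CHECKS.items()}
print(priorities_d9)
```

The negative weight on synonym hits ranks results found only through expansion below results that use the keyword's own words, while the large relation weight keeps any relation match above any non-match.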

[0043] The priority information D9 for the search result D5 (document 1 [2100] "... create a document with a word processor ...", document 2 [1500] "... creation of a specification ...", document 3 [2000] "... document creation ...", document 4 [0] "... create a table in a document ...") is transmitted from the priority calculation unit 9 to the display unit 10.

[0044] On receiving the priority information D9 for the search result D5 (document 1 [2100] "... create a document with a word processor ...", document 2 [1500] "... creation of a specification ...", document 3 [2000] "... document creation ...", document 4 [0] "... create a table in a document ..."), the display unit 10 displays the search result D10 in accordance with those priorities. Specifically, as shown in column d of FIG. 2, it refers to the scores in the priority information D9 and displays the results in descending order of score:
1: ... create a document with a word processor ...
2: ... document creation ...
3: ... creation of a specification ...
4: ... create a table in a document ...

[0045] In this example the search results are sorted and displayed in order of priority, but various modifications are of course possible, such as displaying only results at or above a certain score, or grouping results by score. The priority score may also be shown to the user. Thus, in the first invention, the results of a keyword search are evaluated comprehensively in terms of the semantic distance from the keyword, the match or mismatch of the words and the relations between them with those of the keyword, and the amount of description relating to the keyword. This allows the search results to be prioritized with high accuracy, and important information can be obtained efficiently by checking the results in order from the highest priority.
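The display variations mentioned in this paragraph (sorting, a score cut-off, and grouping by score) can be sketched as follows. The cut-off value, the band boundaries, and all names are hypothetical; only the four scores and snippets come from the example.

```python
from itertools import groupby

# (document number, score, snippet) from the example priority information D9.
priority_info = [
    (1, 2100, "create a document with a word processor"),
    (2, 1500, "creation of a specification"),
    (3, 2000, "document creation"),
    (4, 0, "create a table in a document"),
]

# Base behaviour: sort by score, highest first.
ranked = sorted(priority_info, key=lambda item: item[1], reverse=True)

# Variation 1: show only results at or above a cut-off score (value is an assumption).
THRESHOLD = 1000
above = [item for item in ranked if item[1] >= THRESHOLD]


# Variation 2: group results into score bands (band boundaries are assumptions).
def band(item):
    score = item[1]
    return "high" if score >= 2000 else "medium" if score >= 1000 else "low"


# groupby needs equal keys to be adjacent; sorting by score guarantees that here.
groups = {name: [doc for doc, _, _ in members]
          for name, members in groupby(ranked, key=band)}

for rank, (doc, score, snippet) in enumerate(ranked, start=1):
    print(f"{rank}: ... {snippet} ... [{score}]")
```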

[0046] Next, an embodiment of the second invention will be described. FIG. 3 shows a schematic configuration of an information retrieval device according to the second invention; parts that perform the same functions as parts of the device according to the first invention in FIG. 1 are given the same reference numerals. In the device according to the second invention, the analysis information storage unit 11 takes the place of the text information storage unit 5 of the first invention as the search target, and the text information storage unit 5 is connected directly to the display unit 10.

[0047] In the first invention, the search unit 4 searches the text information itself: it simply retrieves text information containing the search words obtained by thesaurus expansion of the keyword D1. Consequently, when the priority of the search result D5 is subsequently calculated, the relation check unit 7 and the related word check unit 8 have to analyze the search result D5 again in order to extract, for each result, the words and the relations between them, and the number and positions of the keyword D1 and its related words.

[0048] The second invention was made to suppress the increase in search time caused by this kind of analysis processing. Whereas the device according to the first invention searches the text information itself and then calculates priorities, the device according to the second invention searches analysis information obtained by analyzing the text in advance, and calculates priorities from the results of that search.

[0049] That is, in the device according to the second invention, the text information stored in the text information storage unit 5 is analyzed in advance, and the result is stored as analysis information in the analysis information storage unit 11. The analysis information need only contain at least the segmented morpheme information and the part-of-speech information. When the search unit 4 searches the analysis information storage unit 11 for a given word, the analysis information for the passages containing that word is obtained directly, which reduces the analysis load on the relation check unit 7 and the related word check unit 8.
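One way to picture the analysis information storage unit 11 is as pre-segmented documents plus a word-to-document index, so that a lookup returns the stored morphemes and part-of-speech tags with no analysis at search time. The sketch below assumes that layout; the patent does not specify a data structure. The tags 名, サ動, and サ名 follow the FIG. 4 example, and the tag 助 for particles is an assumption added for illustration.

```python
from collections import defaultdict

# Analysis information prepared in advance: (morpheme, part of speech) per document.
# 名 / サ動 / サ名 follow the FIG. 4 example; 助 (particle) is an assumed tag.
ANALYSIS_INFO = {
    1: [("ワープロ", "名"), ("で", "助"), ("文書", "名"), ("を", "助"), ("作成", "サ動")],
    2: [("仕様書", "名"), ("の", "助"), ("作成", "サ名")],
    3: [("文書", "名"), ("作成", "サ名")],
    4: [("文書", "名"), ("で", "助"), ("表", "名"), ("を", "助"), ("作成", "サ動")],
}


def build_index(analysis_info):
    """Map each morpheme to the set of documents that contain it."""
    index = defaultdict(set)
    for doc_id, morphemes in analysis_info.items():
        for surface, _pos in morphemes:
            index[surface].add(doc_id)
    return index


index = build_index(ANALYSIS_INFO)


def lookup(word):
    """Return the stored analysis information of every document containing the word."""
    return {doc: ANALYSIS_INFO[doc] for doc in sorted(index.get(word, ()))}


print(sorted(index["文書"]))   # [1, 3, 4]
print(lookup("仕様書"))        # {2: [('仕様書', '名'), ('の', '助'), ('作成', 'サ名')]}
```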

[0050] Next, the information retrieval operation of the device according to the second invention will be outlined. In this operation, the search unit 4 obtains search words based on the keyword D1 in the same way as in the first invention.

[0051] That is, on receiving the information D4, which comprises the synonym expansion information and the related word information for the keyword D1, the search unit 4 searches the analysis information storage unit 11 using the intersection (AND) across the independent (content) words of the keyword D1 and the union (OR) across the synonyms of each word, and transmits the search result D50 to the thesaurus expansion check unit 6, the relation check unit 7, and the related word check unit 8.

[0052] The analysis information storage unit 11 stores analysis information obtained by analyzing the text information in advance. By searching it with a search expression that combines the independent words of the keyword D1 and their synonyms according to the above logic, the search unit 4 obtains, as the search result D50, the analysis information of the passages that satisfy the search expression.

[0053] On receiving the search result D50 from the search unit 4, the thesaurus expansion check unit 6, the relation check unit 7, and the related word check unit 8 respectively check the semantic distance from the keyword D1, whether the words and the relations between them match those of the keyword D1, and the amount of description relating to the keyword D1.

[0054] That is, on receiving the search result D50 from the search unit 4, the thesaurus expansion check unit 6 refers to the synonym information obtained by the thesaurus expansion, checks how many of those synonyms were hit in the search result D50, and transmits the check result D6 to the priority calculation unit 9.

[0055] On receiving the search result D50 from the search unit 4, the relation check unit 7 extracts from the keyword D1 the words that make it up and the relations between those words, checks whether the words and relations in each result of the search result D50 match those of the keyword D1, and transmits the check result D7 to the priority calculation unit 9. Because the search result D50 is read from analysis information obtained by analyzing the text information in advance, the relation check unit 7 does not need to analyze the search result D50 again for this check.

[0056] On receiving the search result D50 from the search unit 4, the related word check unit 8 checks the number and positions of the words in the search result D50 that match the keyword D1 or its related words, and transmits the check result D8 to the priority calculation unit 9. Here too, because the search result D50 is part of the analysis information obtained by analyzing the text information in advance, the related word check unit 8 does not need to analyze the search result D50 again for this check.

[0057] On receiving the search result D50, the thesaurus expansion check result D6, the relation check result D7, and the related word check result D8, the priority calculation unit 9 weights the check results D6, D7, and D8 with a value predetermined for each check item, thereby calculating the priority of each result in the search result D50, and transmits the resulting priority information D9 to the display unit 10.

[0058] On receiving the priority information D9 for the search result D50, the display unit 10 reads the text information corresponding to each result in the search result D50 from the text information storage unit 5, and displays that text information to the user in accordance with the priority given to the corresponding result.

[0059] In the second invention, the analysis information storage unit 11 stores the analysis information in a form that includes at least the segmented morpheme information and the part-of-speech information, and the search unit 4 basically uses the keyword D1 to obtain, as the search result D50, the analysis information of the passages corresponding to the keyword D1. The priority information D9 transmitted from the priority calculation unit 9 to the display unit 10 therefore contains at least the segmented morpheme information and part-of-speech information relating to the keyword D1. Accordingly, when displaying the search results in priority order, the display unit 10 need not retrieve the text information from the text information storage unit 5; instead, it may reconstruct the original text from the morpheme information by removing the part-of-speech information from the analysis information in the priority information D9.

[0060] Next, the operation of each component will be described using a concrete example. FIG. 4 shows the flow of processing together with example inputs and outputs of each component when the user specifies "文書を作成する" ("create a document") as the keyword D1. The operation up to obtaining the search words by thesaurus expansion of the keyword D1 is the same as in the first invention, so the description starts from the subsequent processing.

[0061] On receiving the information D4, which comprises the synonym expansion information [(文書, ドキュメント, 仕様書) を (作成, 製作)] and the related word information (ワープロ) for the keyword D1 "文書を作成する" ("create a document"), the search unit 4 searches the analysis information storage unit 11 with a search expression that takes the intersection across the independent words of the keyword D1 and the union across synonyms, namely [(文書, ドキュメント, 仕様書) & (作成, 製作)]. As a result, it obtains the following analysis information (see column a of FIG. 4):
Document 1: …ワープロ［名］で／文書［名］を／作成［サ動］…
Document 2: …仕様書［名］の／作成［サ名］…
Document 3: …文書［名］作成［サ名］…
Document 4: …文書［名］で／表［名］を／作成［サ動］…
Here 文書 and ドキュメント both mean "document", 仕様書 means "specification", 作成 and 製作 mean "create" and "produce", ワープロ means "word processor", and 表 means "table"; the tags ［名］, ［サ動］, and ［サ名］ mark a noun, a suru-verb, and a verbal noun, respectively. The search result D50 is transmitted from the search unit 4 to the thesaurus expansion check unit 6, the relation check unit 7, and the related word check unit 8. The numbers 1 to 4 in the search result D50 are the numbers of the documents (whose content is analysis information).
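The search expression [(文書, ドキュメント, 仕様書) & (作成, 製作)] is an AND of OR-groups: a document qualifies if it contains at least one word from every group. The following sketch shows that evaluation over pre-segmented morphemes. The list-of-sets representation and the function name `matches` are assumptions, and document 5 is a made-up non-matching document added only to show the filter rejecting something.

```python
# OR within each group, AND between groups.
search_expression = [
    {"文書", "ドキュメント", "仕様書"},   # "document" and its synonyms
    {"作成", "製作"},                     # "create" and its synonym
]

# Morphemes of documents 1 to 4 from the example, plus hypothetical document 5.
documents = {
    1: ["ワープロ", "で", "文書", "を", "作成"],
    2: ["仕様書", "の", "作成"],
    3: ["文書", "作成"],
    4: ["文書", "で", "表", "を", "作成"],
    5: ["表", "を", "印刷"],
}


def matches(morphemes, expression):
    """True if the document contains at least one word from every group."""
    words = set(morphemes)
    return all(words & group for group in expression)


hits = [doc for doc, morphemes in documents.items()
        if matches(morphemes, search_expression)]
print(hits)  # [1, 2, 3, 4]
```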

[0062] On receiving the search result D50 (document 1 "…ワープロ［名］で／文書［名］を／作成［サ動］…", document 2 "…仕様書［名］の／作成［サ名］…", document 3 "…文書［名］作成［サ名］…", document 4 "…文書［名］で／表［名］を／作成［サ動］…"), the thesaurus expansion check unit 6 checks how many of the synonyms obtained by thesaurus expansion of the keyword D1 (ドキュメント, 仕様書, 製作) were hit, and transmits the check result D6 to the priority calculation unit 9.

[0063] In this example, as in the first embodiment, only document 2 is hit by a synonym (仕様書, "specification"), so the check result D6 indicates that document 2 has one hit. In column b of FIG. 4 this is shown by the numeral "1" attached to document 2. Here too, besides outputting how many thesaurus-expanded words were hit, the unit may, when a thesaurus-expanded word is hit, calculate and output the semantic distance between the original word and the expanded word for use in prioritization.
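The thesaurus expansion check amounts to counting, for each document, how many of the expanded synonyms appear in it. A minimal sketch, assuming the documents are available as morpheme lists and counting each distinct synonym once (the text does not say whether repeated occurrences count separately):

```python
# Synonyms produced by thesaurus expansion, excluding the keyword's own words.
expanded_synonyms = {"ドキュメント", "仕様書", "製作"}

documents = {
    1: ["ワープロ", "で", "文書", "を", "作成"],
    2: ["仕様書", "の", "作成"],
    3: ["文書", "作成"],
    4: ["文書", "で", "表", "を", "作成"],
}


def synonym_hits(morphemes):
    """Number of distinct expanded synonyms found in one document."""
    return len(expanded_synonyms & set(morphemes))


check_d6 = {doc: synonym_hits(m) for doc, m in documents.items()}
print(check_d6)  # {1: 0, 2: 1, 3: 0, 4: 0}
```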

[0064] On receiving the search result D50 (document 1 "…ワープロ［名］で／文書［名］を／作成［サ動］…", document 2 "…仕様書［名］の／作成［サ名］…", document 3 "…文書［名］作成［サ名］…", document 4 "…文書［名］で／表［名］を／作成［サ動］…") from the search unit 4, the relation check unit 7 extracts the words and the relations between them from the keyword D1, likewise extracts the words and the relations between them from each of documents 1 to 4 in the search result D50, and checks whether each extracted relation matches that of the keyword D1.

[0065] In this example too, as in the first invention, the relation (文書 ← ［を格］ ← 作成) extracted from the keyword D1 "文書を作成する" is compared with the analysis information making up the search result D50: (文書 ← ［を格］ ← 作成) extracted from document 1 matches, (仕様書 ← ［を格］ ← 作成) extracted from document 2 matches, (文書 ← ［を格］ ← 作成) extracted from document 3 matches, and (文書 ← ［で格］ ← 作成) extracted from document 4 does not match. Here ［を格］ and ［で格］ denote the cases marked by the particles を and で. The relation check unit 7 transmits the check result D7 reflecting these determinations to the priority calculation unit 9. In column b of FIG. 4 this is shown by circles attached to documents 1, 2, and 3, for which a relation match was found.
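The comparison step of the relation check can be sketched as below, taking the extracted relations as given (noun, case particle, predicate) triples. How the relations are extracted from the analysis information is not shown, and the triple representation and the synonym normalization table are assumptions; the normalization is included because the text counts document 2 (仕様書) as matching the keyword's 文書.

```python
# Map each thesaurus-expanded word back to the keyword word it came from (assumed table).
CANONICAL = {"ドキュメント": "文書", "仕様書": "文書", "製作": "作成"}


def normalize(relation):
    """Replace synonyms in a (noun, case, predicate) triple with the keyword's words."""
    noun, case, predicate = relation
    return (CANONICAL.get(noun, noun), case, CANONICAL.get(predicate, predicate))


# (文書 ← ［を格］ ← 作成), extracted from the keyword "文書を作成する".
keyword_relation = ("文書", "を", "作成")

# Relation extracted from each document, as listed in the text.
document_relations = {
    1: ("文書", "を", "作成"),
    2: ("仕様書", "を", "作成"),
    3: ("文書", "を", "作成"),
    4: ("文書", "で", "作成"),
}

check_d7 = {doc: normalize(rel) == keyword_relation
            for doc, rel in document_relations.items()}
print(check_d7)  # {1: True, 2: True, 3: True, 4: False}
```

Document 4 fails because the case differs (で rather than を) even though both words are present, which is exactly the distinction a plain keyword match cannot make.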

[0066] On receiving the search result D50 (document 1 "…ワープロ［名］で／文書［名］を／作成［サ動］…", document 2 "…仕様書［名］の／作成［サ名］…", document 3 "…文書［名］作成［サ名］…", document 4 "…文書［名］で／表［名］を／作成［サ動］…") from the search unit 4, the related word check unit 8 checks the number and positions (title, body) of the keyword D1 and its related words in each of documents 1 to 4 (in this example, for simplicity, only the number of related words is output), and transmits the check result D8 to the priority calculation unit 9.

[0067] In this example too, as in the first invention, the related word ワープロ ("word processor") of the keyword D1 "文書を作成する" is present in the analysis information of document 1 (…ワープロ［名］で／文書［名］を／作成［サ動］…), so a check result D8 indicating that document 1 contains one related word is transmitted to the priority calculation unit 9. In column b of FIG. 4 this is shown by the numeral "1", the number of related words, attached to document 1, in which the related word is present.
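A minimal sketch of the related word check follows. It counts related words per section so that the position (title or body) is retained; the section names and the dictionary layout are assumptions, and, as in the example in the text, only related words are counted rather than the keyword's own words.

```python
related_words = {"ワープロ"}  # related word of the keyword "文書を作成する"

# Each document is a mapping from section name to morphemes (layout is an assumption).
documents = {
    1: {"body": ["ワープロ", "で", "文書", "を", "作成"]},
    2: {"body": ["仕様書", "の", "作成"]},
    3: {"body": ["文書", "作成"]},
    4: {"body": ["文書", "で", "表", "を", "作成"]},
}


def related_word_check(sections, related):
    """Count related-word occurrences in each section and in total."""
    by_section = {name: sum(1 for m in morphemes if m in related)
                  for name, morphemes in sections.items()}
    return {"total": sum(by_section.values()), "by_section": by_section}


check_d8 = {doc: related_word_check(sections, related_words)["total"]
            for doc, sections in documents.items()}
print(check_d8)  # {1: 1, 2: 0, 3: 0, 4: 0}
```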

[0068] On receiving the search result D50 (document 1 "…ワープロ［名］で／文書［名］を／作成［サ動］…", document 2 "…仕様書［名］の／作成［サ名］…", document 3 "…文書［名］作成［サ名］…", document 4 "…文書［名］で／表［名］を／作成［サ動］…"), the thesaurus expansion check result D6 (document 2 has one hit), the relation check result D7 (documents 1, 2, and 3 match), and the related word check result D8 (document 1 contains one related word), the priority calculation unit 9 refers to the check results D6, D7, and D8 and calculates the priority of each of documents 1 to 4 making up the search result D50.

[0069] The weighting of each check item in this priority calculation is performed in the same way as in the first invention. As a result, in this invention too, in line with Table 1 above (see column c of FIG. 4), the following priority information D9 for the search result D50 is obtained and transmitted from the priority calculation unit 9 to the display unit 10: document 1 [2100] "…ワープロ［名］で／文書［名］を／作成［サ動］…", document 2 [1500] "…仕様書［名］の／作成［サ名］…", document 3 [2000] "…文書［名］作成［サ名］…", document 4 [0] "…文書［名］で／表［名］を／作成［サ動］…".

[0070] On receiving the priority information D9 for the search result D50 (document 1 [2100] "…ワープロ［名］で／文書［名］を／作成［サ動］…", document 2 [1500] "…仕様書［名］の／作成［サ名］…", document 3 [2000] "…文書［名］作成［サ名］…", document 4 [0] "…文書［名］で／表［名］を／作成［サ動］…"), the display unit 10 retrieves the text information corresponding to each result in the search result D50 from the text information storage unit 5, and then displays that text information to the user as the search result D10 in accordance with the priority given to the corresponding result.

[0071] Specifically, as shown in column d of FIG. 4, it refers to the scores in brackets [ ] in the priority information D9 and displays the results in descending order of score:
1: ... create a document with a word processor ...
2: ... document creation ...
3: ... creation of a specification ...
4: ... create a table in a document ...

[0072] In the above embodiment, the display unit 10 actually retrieves the text information and displays it in priority order. Alternatively, the text information may be reconstructed from the analysis information of the search result D50, in which case the text information storage unit 5 need not be provided.
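Reconstructing display text from analysis information can be as simple as stripping the part-of-speech tags. The sketch below assumes the analysis information is held as strings in the notation of the FIG. 4 example, with ［…］ tags and ／ phrase separators; the function name `to_text` is hypothetical, and a real store might hold structured records instead.

```python
import re

# Matches a full-width bracketed part-of-speech tag, or a full-width phrase separator.
TAG_OR_SEPARATOR = re.compile(r"［[^］]*］|／")


def to_text(analysis):
    """Remove part-of-speech tags and separators, leaving the original morphemes."""
    return TAG_OR_SEPARATOR.sub("", analysis)


print(to_text("ワープロ［名］で／文書［名］を／作成［サ動］"))  # ワープロで文書を作成
print(to_text("仕様書［名］の／作成［サ名］"))                  # 仕様書の作成
```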

[0073] Thus, the second invention searches stored analysis information obtained by analyzing the text information in advance, and calculates priorities from it. In the first invention, the text information is searched directly before the priorities are calculated, so the text information of the search results has to be analyzed during the subsequent relation check and related word check. In the second invention, by contrast, priorities are calculated using analysis information prepared in advance, so the text information of the search results does not have to be analyzed during the relation check and related word check, and processing is faster.

【００７４】[0074]

[Effects of the Invention] As described above, the first invention extends keyword-based prioritization of search results so that it also applies when the keyword is a sentence or a clause. For the relationship between the keyword and each search result, it checks the semantic distance from the keyword, the match or mismatch of the words and the relations between them with those of the keyword, and the amount of description relating to the keyword; weights the result of each check as a score to calculate a priority; and displays the search results in accordance with that priority. The comprehensive evaluation based on these multiple checks allows the search results to be prioritized with high accuracy, and prompting the user to check results from the highest priority allows important information to be retrieved more efficiently.

[0075] In the first invention, the text information is searched directly with the keyword, and the above checks are then performed after analyzing the text information of the search results. In the second invention, analysis information obtained by analyzing the text information in advance is prepared instead; the analysis information is first searched with the keyword, and the checks are then performed. Because the relation check and the related word check no longer require analysis of the text information of the search results, a faster search operation can be achieved.

[Brief description of drawings]

FIG. 1 is a schematic configuration diagram of an information retrieval device according to an embodiment of the first invention.

FIG. 2 is a conceptual diagram showing the flow of information retrieval processing in the device of FIG. 1, together with example inputs and outputs of each component.

FIG. 3 is a schematic configuration diagram of an information retrieval device according to an embodiment of the second invention.

FIG. 4 is a conceptual diagram showing the flow of information retrieval processing in the device of FIG. 3, together with example inputs and outputs of each component.

[Explanation of symbols]

1: input unit, 2: thesaurus expansion unit, 3: thesaurus dictionary, 4: search unit, 5: text information storage unit, 6: thesaurus expansion check unit, 7: relation check unit, 8: related word check unit, 9: priority calculation unit, 10: display unit, 11: analysis information storage unit

Claims

[Claims]

1. An information retrieval device comprising: text information storage means for storing text information; input means for inputting a search key; search expression creating means for creating, from the search key input from the input means, a search expression that includes synonyms obtained by synonym expansion of the words of the search key; related word storage means for storing related words; related word acquisition means for acquiring related words of the search key from the related word storage means; search means for searching the text information storage means using the search expression; synonym expansion checking means for checking, for each search result of the search means, how many of the synonym-expanded words in the search expression are included; relation checking means for checking, for each search result, whether the relations between words in the search result match the relations between words in the search key; related word checking means for checking, for each search result, the number and positions of the search key and its related words included in the search result; and display means for displaying each search result in accordance with a priority corresponding to that search result, the priority being based on the check results of the synonym expansion checking means, the relation checking means, and the related word checking means.

2. An information retrieval device comprising: text information storage means for storing text information; analysis information storage means for storing analysis information obtained by analyzing the text information in advance; input means for inputting a search key; search expression creating means for creating, from the search key input from the input means, a search expression that includes synonyms obtained by synonym expansion of the words of the search key; related word storage means for storing related words; related word acquisition means for acquiring related words of the search key from the related word storage means; search means for searching the analysis information storage means using the search expression; synonym expansion checking means for checking, for each search result of the search means, how many of the synonym-expanded words in the search expression are included; relation checking means for checking, for each search result, whether the relations between words in the search result match the relations between words in the search key; related word checking means for checking, for each search result, the number and positions of the search key and its related words included in the search result; and display means for displaying each search result in accordance with a priority corresponding to that search result, the priority being based on the check results of the synonym expansion checking means, the relation checking means, and the related word checking means.