JP2003162531A

JP2003162531A - Document retrieval system and document retrieval method

Info

Publication number: JP2003162531A
Application number: JP2001361625A
Authority: JP
Inventors: Atsushi Hosoda; 篤志細田
Original assignee: Matsushita Electric Works Ltd
Current assignee: Panasonic Electric Works Co Ltd
Priority date: 2001-11-27
Filing date: 2001-11-27
Publication date: 2003-06-06
Anticipated expiration: 2021-11-27
Also published as: JP3915488B2

Abstract

PROBLEM TO BE SOLVED: To facilitate the retrieval of a document that conforms to the user's retrieval intention.

SOLUTION: A document database DB1 stores a plurality of documents used within a plurality of specific fields. A retrieval sentence setup means 11 has the user set a retrieval sentence and specify the field that defines the retrieval range. When the retrieval sentence has been set by the retrieval sentence setup means 11, a retrieval processing means 12 extracts from the document database DB1 the documents that satisfy the retrieval conditions specified by the retrieval sentence. The retrieval processing means 12 then determines, for each extracted document, a rating criterion that rates its use value in the field specified through the retrieval sentence setup means 11, arranges the extracted documents in order of the magnitude of the rating criterion as the retrieval result, and delivers the result to a retrieval result output means 13.

COPYRIGHT: (C)2003,JPO

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a document retrieval system and a document retrieval method for retrieving documents related to a desired keyword from a document database in which documents are stored.

[0002]

[Prior Art] In general, a widely adopted technique for retrieving documents from a document database is to supply a term as a keyword and match it, as a character string, against keywords assigned to each document in advance or against the full text of the document.

[0003] However, neither the keywords assigned to documents nor the vocabulary used within them is standardized, and different terms are often used for nearly the same concept. As a result, the desired document may not contain a term matching the given keyword, and an appropriate document may fail to be extracted (a so-called search omission). When keywords are given as free words, supplying several related terms as keywords can reduce the likelihood of search omissions. However, finding the related terms for every search is laborious, and appropriate terms cannot be found unless the user is familiar with the field to which the target documents belong.

[0004] To address this, Japanese Unexamined Patent Publication No. H11-126202 describes a technique in which a simple search condition entered by the user is converted into a search expression registered in a table in advance. Expanding the search condition in this way increases the number of terms beyond those entered, which reduces the likelihood of search omissions, and converting to an appropriate search expression eliminates unnecessary noise. The form of the search expression into which a search condition is expanded is registered in the table in advance for each search condition. In other words, the search expression is fixed for a given search condition.

[0005] Japanese Unexamined Patent Publication No. H10-72107 describes a technique that, when the given keyword fails to extract the desired document, decomposes the keyword into a plurality of terms and derives further terms related to the terms contained in the keyword, thereby expanding the number of terms and reducing the likelihood of search omissions.

[0006] Furthermore, Japanese Unexamined Patent Publication No. H8-171569 describes, in addition to expanding a keyword into a plurality of related terms, a technique for keeping terms that do not match the search intention out of the keywords. Categories are assigned to the terms that serve as keywords, and when a given keyword falls into more than one category, the other terms of each category are presented to the user so that terms that do not match the search intention are not expanded.

[0007]

[Problems to Be Solved by the Invention] The technique described in Japanese Unexamined Patent Publication No. H11-126202 can reduce the likelihood of search omissions and eliminate unnecessary noise for a given search condition, but the search expression for each search condition is fixed; the technique merely replaces an ambiguous input search condition with one better suited to searching. Search expressions corresponding to the search conditions must therefore be prepared in advance, and setting appropriate search expressions takes an enormous amount of labor. Consequently, when specialized technical terms are used as search conditions, it is likely that no corresponding search expression has been set, so search omissions increase and the results are more likely to contain a large amount of unnecessary noise.

[0008] The technique described in Japanese Unexamined Patent Publication No. H10-72107 only decomposes a keyword into a plurality of terms and expands the number of terms, so it can handle even specialized technical terms relatively easily. However, the keyword is decomposed and the number of terms expanded only after the keyword has failed to extract the desired document, so several rounds of search processing are often needed before the desired document is extracted. In other words, it takes a relatively long time to reach the desired document. Moreover, because the technique only expands the number of terms, it is difficult to eliminate unnecessary noise.

[0009] Japanese Unexamined Patent Publication No. H8-171569 describes a technique that classifies terms into categories and has the user select among the terms classified into each category, so that terms are expanded only within the categories that match the search intention. With this technique, expanding the vocabulary is expected to reduce search omissions, and restricting the categories is expected to keep unnecessary noise relatively low.

[0010] However, even the technique described in that publication cannot eliminate noise entirely, so a search condition often extracts a plurality of documents. The user must ultimately pick out the desired document from among the extracted documents, and the effort required to find a document matching the search intention remains considerable.

[0011] The present invention has been made in view of the above circumstances. Its object is to provide a document retrieval system and a document retrieval method in which field-specific evaluation scales are associated with the documents to be searched and the search results are sorted and displayed in order of the magnitude of the evaluation scale, so that documents matching the search intention are arranged where they can be found quickly and are easy to extract.

[0012]

[Means for Solving the Problems] The invention of claim 1 comprises: a document database storing a plurality of documents used within a plurality of specific fields; search sentence setting means that has the user set a search sentence and specify the field that defines the search range; search processing means that extracts from the document database the documents matching the search conditions of the search sentence set by the search sentence setting means; and search result output means that outputs the search results of the search processing means. The search processing means has a function of obtaining an evaluation scale for each document in each field based on evaluation data classified by field for each document registered in the document database, a function of having the user select a desired evaluation scale from among a plurality of types of evaluation scales obtained in different ways, and a function of obtaining, for each extracted document, the evaluation scale of the type selected by the user with respect to the field specified through the search sentence setting means, arranging the documents in order of the magnitude of that evaluation scale, and passing them to the search result output means as the search result.
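The flow recited in claim 1 (extract matching documents, then order them by a user-selected evaluation scale for a user-specified field) can be illustrated with a minimal Python sketch. The `Document` class, the `search` function, the substring match, and the scale names `document_importance` and `reference_frequency` are all assumptions made for illustration; the patent does not prescribe a data layout or a matching method.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    text: str
    # Hypothetical layout of evaluation data classified by field, e.g.
    # {"quality control": {"document_importance": 3, "reference_frequency": 12}}
    evaluation: dict = field(default_factory=dict)


def search(documents, keyword, field_name, scale):
    """Extract documents containing `keyword`, then sort them by the
    user-selected evaluation scale for the user-specified field."""
    # Plain substring matching stands in for the search condition (assumption).
    hits = [d for d in documents if keyword in d.text]

    def score(doc):
        # Documents with no evaluation data for the field score 0 and rank last.
        return doc.evaluation.get(field_name, {}).get(scale, 0)

    # Largest evaluation scale first.
    return sorted(hits, key=score, reverse=True)
```

Calling `search(docs, "overcurrent", "quality control", "reference_frequency")` would return only the documents mentioning "overcurrent", ordered by their reference frequency in the quality control field; switching the last argument changes the ordering without changing which documents are extracted.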

[0013] The invention of claim 2 is the invention of claim 1 in which the search sentence setting means comprises: primary search sentence input means that has the user enter a primary search sentence, which is a natural-language sentence containing keywords; search sentence analysis means that extracts the keywords from the primary search sentence; a related term database in which related terms for the terms that serve as keywords are registered for each of the fields; secondary search sentence generation means that has a function of matching the keywords extracted by the search sentence analysis means against the related term database and generating a secondary search sentence using related terms for those keywords; and execution search sentence selection means that has the user select, from the primary search sentence and the secondary search sentence, the search sentence to be given to the search processing means.

[0014] The invention of claim 3 is the invention of claim 2 in which, in the related term database, terms having a specific relevance in each field are registered as related terms in addition to synonyms of the terms that serve as keywords, and each related term is associated with a term importance that indicates its utility value in each field. The secondary search sentence generation means presents the term importance to the user together with the related terms and has the user select the desired related terms from among them.

[0015] The invention of claim 4 is the invention of any one of claims 1 to 3, further comprising a document importance database in which the utility value in each field of each document registered in the document database is registered as a document importance serving as the evaluation data. When the document importance is selected as one of the options for the evaluation scale, the search processing means matches the documents extracted under the search conditions against the document importance database and arranges the documents using the document importance as the evaluation scale.

[0016] The invention of claim 5 is the invention of any one of claims 1 to 3, further comprising a reference frequency database in which the reference frequency by field of each document registered in the document database is registered as the evaluation data. When the reference frequency is selected as one of the options for the evaluation scale, the search processing means matches the documents extracted under the search conditions against the reference frequency database and arranges the documents using the reference frequency as the evaluation scale.

[0017] The invention of claim 6 is the invention of claim 2 or claim 3, further comprising a document importance database in which the utility value in each field of each document registered in the document database is registered as a document importance serving as the evaluation data, and a reference frequency database in which the reference frequency by field of each document registered in the document database is registered as the evaluation data. When an evaluation point, obtained as a weighted sum of the frequency of occurrence of the keywords in each extracted document, the document importance, and the reference frequency, is selected as one of the options for the evaluation scale, the search processing means matches the documents extracted under the search conditions against the document importance database and the reference frequency database, determines the frequency of occurrence of the keywords in each extracted document to obtain the evaluation point, and arranges the documents using the evaluation point as the evaluation scale.
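The evaluation point of claim 6 is a weighted sum of three quantities. A minimal sketch follows, assuming equal default weights and a tuple layout for the rows; the claim does not state weight values at this point, so the numbers and the function names here are hypothetical.

```python
def evaluation_point(keyword_frequency, document_importance, reference_frequency,
                     weights=(1.0, 1.0, 1.0)):
    """Weighted sum of keyword frequency, document importance, and
    reference frequency. The default weights are placeholders."""
    w_kw, w_imp, w_ref = weights
    return (w_kw * keyword_frequency
            + w_imp * document_importance
            + w_ref * reference_frequency)


def rank_by_evaluation_point(rows, weights=(1.0, 1.0, 1.0)):
    """rows: (doc_id, keyword_frequency, document_importance, reference_frequency).
    Returns the rows ordered from the largest evaluation point to the smallest."""
    return sorted(rows,
                  key=lambda r: evaluation_point(r[1], r[2], r[3], weights),
                  reverse=True)
```

Changing the weights changes how strongly each quantity influences the ordering; setting a weight to zero removes that quantity from the ranking altogether.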

[0018] The invention of claim 7 is the invention of claim 5 or claim 6 in which, when a document registered in the document database is referenced, the search processing means adds to the reference frequency a value that is larger the higher the reference frequency already registered in the reference frequency database.
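One way to read the update in claim 7 is an increment that grows with the current reference frequency. The linear form below, along with the `base` and `rate` parameters, is purely an assumption for illustration; the claim only requires that the added value be larger when the registered reference frequency is larger.

```python
def updated_reference_frequency(current, base=1.0, rate=0.1):
    """Return the reference frequency after one more reference.

    The increment is base + rate * current (a hypothetical rule), so
    frequently referenced documents gain more per reference.
    """
    increment = base + rate * current
    return current + increment
```

With this rule a document that is already referenced often pulls further ahead of rarely referenced ones each time it is opened.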

[0019] The invention of claim 8 is the invention of claim 5 or claim 6 in which, when a document registered in the document database is referenced, the search processing means has the user enter a vote value corresponding to the document's utility value, and the sum of the vote values is used as the reference frequency.

[0020] The invention of claim 9 is the invention of claim 4 or claim 6 in which, when a document registered in the document database is referenced, the search processing means has the user enter a vote value corresponding to the document's utility value and corrects the document importance of that document, as registered in the document importance database, so that it becomes larger as the vote value becomes larger.
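The vote-based variants of claims 8 and 9 can be sketched as follows. Summing the votes follows claim 8 directly; the additive correction with a `gain` factor for claim 9 is an assumed form, since the claim only requires that a larger vote value produce a larger document importance.

```python
def reference_frequency_from_votes(votes):
    """Claim 8: the sum of the users' vote values serves as the reference frequency."""
    return sum(votes)


def corrected_importance(importance, vote, gain=0.1):
    """Claim 9 (assumed additive form): raise the document importance
    in proportion to the vote value; `gain` is a hypothetical tuning factor."""
    return importance + gain * vote
```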

[0021] The invention of claim 10 is a method for retrieving a target document from a document database storing a plurality of documents used within a plurality of specific fields. The user is first made to set a search sentence and specify the field that defines the search range, and the documents matching the search conditions of the set search sentence are then extracted from the document database. The user is further made to select, from among a plurality of types obtained in different ways, the evaluation scale of each document in each field that is obtained from evaluation data classified by field for each document. The evaluation scale of the selected type is then obtained for each extracted document with respect to the specified field, and the extracted documents are arranged in order of the magnitude of that evaluation scale and output as the search result.

[0022] The invention of claim 11 is the invention of claim 10 in which the user is made to enter a primary search sentence, which is a natural-language sentence containing keywords; the keywords are extracted from the primary search sentence; the extracted keywords are matched against a related term database, in which related terms for the terms that serve as keywords are registered for each of the fields, to generate a secondary search sentence using the related terms; and the user is made to select the search sentence from the primary search sentence and the secondary search sentence.

[0023] The invention of claim 12 is the invention of claim 10 or claim 11 in which the documents are arranged using, as the evaluation scale, at least one of the following: the document importance, which is the utility value in each field of each document registered in the document database; the reference frequency by field of each document registered in the document database; and the frequency of occurrence of the keywords in each document extracted from the document database under the search conditions.

[0024] The invention of claim 13 is the invention of claim 12 in which, when a document registered in the document database is referenced, the user is made to enter a vote value corresponding to the document's utility value, and the sum of the vote values is used as the reference frequency.

[0025] The invention of claim 14 is the invention of claim 12 in which, when a document registered in the document database is referenced, the user is made to enter a vote value corresponding to the document's utility value, and the document importance of that document is corrected so that it becomes larger as the vote value becomes larger.

[0026]

[Embodiments of the Invention] This embodiment shows an example in which a server 1 having a document database DB1 storing documents and a terminal 2 operated by a user to search for documents stored in the document database DB1 are connected via a local area network NT. The technical idea of the present invention is equally applicable when the server 1 and the terminal 2 are connected via a wide area network such as the Internet, or when the functions of the server 1 and the terminal 2 are implemented on a single computer without using a network. The embodiment is described using the example of searching for documents used in the process from product planning to sales, that is, in the planning, development, manufacturing, and sales of products, and assumes the following fields: product planning, product development, product design, manufacturing technology, quality control, product sales, and trouble records. The technical idea of the present invention can nevertheless be applied to other kinds of documents, provided that documents from a plurality of fields are registered in the document database DB1 and used by field. For example, besides fields classified by business function, a classification by the user's department within the company can also serve as fields.

[0027] Documents stored in the document database DB1 need not be given keywords when full-text search is possible. When full-text search is not possible, as when a document is image data, keywords or a description are assigned as appropriate; for documents of this kind, the search is performed on the terms contained in the keywords or description. In addition, the document database DB1 associates a headline with each document.

[0028] As shown in FIG. 1, the server 1 includes search sentence setting means 11 that sets a search sentence for extracting documents stored in the document database DB1. The search sentence set by the search sentence setting means 11 is given to search processing means 12, which matches the documents stored in the document database DB1 against the search conditions specified by the search sentence. As described later, the search sentence is given in the form of a natural-language sentence. The search processing means 12 extracts the terms that serve as keywords from the search sentence and, if there is more than one such term, extracts the logical relationship among the terms. Here, the logical relationship among terms means a combination of logical AND, logical OR, and negation, and it can be extracted by semantic analysis of the search sentence. Extracting the terms and their logical relationship from the search sentence in this way establishes the search conditions. The search processing means 12 does not merely match the documents registered in the document database DB1 against the search conditions; it also determines the documents to extract by referring to a document importance database DB2, which associates a document importance with each document, and a reference frequency database DB3, which associates with each document the frequency with which it has been extracted in the past. This processing is described later.
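A search condition built from keywords combined by AND, OR, and negation can be evaluated against a document as in the sketch below. The nested-tuple representation and the `matches` function are hypothetical, and the semantic analysis that would derive such a condition from a natural-language search sentence is not shown.

```python
def matches(condition, text):
    """Evaluate a search condition against a document's text.

    A condition is a nested tuple (hypothetical representation):
      ("term", word)    true if `word` occurs in the text
      ("and", c1, ...)  true if every sub-condition holds
      ("or", c1, ...)   true if any sub-condition holds
      ("not", c)        true if the sub-condition does not hold
    """
    op = condition[0]
    if op == "term":
        return condition[1] in text
    if op == "and":
        return all(matches(c, text) for c in condition[1:])
    if op == "or":
        return any(matches(c, text) for c in condition[1:])
    if op == "not":
        return not matches(condition[1], text)
    raise ValueError(f"unknown operator: {op!r}")
```

For example, the condition `("and", ("term", "current"), ("not", ("term", "draft")))` holds for a document that mentions "current" and does not mention "draft".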

[0029] Information about the documents extracted by the search processing means 12 is stored in search result output means 13 and presented from the search result output means 13 to the terminal 2. The screen of the display device provided on the terminal 2 lists the headlines of the documents extracted by the search processing means 12. When the user designates, from the listed headlines, a document that matches the search intention, the designated document is read from the document database DB1 and transferred to the terminal 2. Because only the headlines of the extracted documents are presented to the terminal 2 and only the documents corresponding to the headlines selected by the user are transferred, few documents are transmitted over the local area network NT, and documents can be transferred without greatly increasing traffic even when their data size is large.

[0030] The search sentence setting means 11 includes: primary search sentence input means 14, to which the primary search sentence specified by the user is entered in the form of a natural-language sentence; search sentence analysis means 15, which performs morphological analysis of the primary search sentence to extract the terms that serve as keywords and sets keywords (words or compound words) expanded to include related terms for the extracted keywords; secondary search sentence generation means 16, which generates a natural-language secondary search sentence using the keywords set by the search sentence analysis means 15; and execution search sentence selection means 17, which presents the primary and secondary search sentences to the user and has the user select the search sentence to use for the document search. The search sentence setting means 11 also includes a part-of-speech database DB4, in which the parts of speech of terms are registered for reference during morphological analysis in the search sentence analysis means 15, and a related term database DB5, in which related terms are registered for reference when the search sentence analysis means 15 sets keywords that include related terms. The related terms registered in the related term database DB5 are not limited to synonyms of the terms that serve as keywords; technical or special terms of each field, and terms that have a specific relevance to the term other than synonymy, are also registered as related terms. For example, documents handled within a company generally contain terms, such as those concerning troubles, that are important even though they suggest something unfavorable, so terms with this kind of specific relevance are also registered in the related term database DB5 as related terms.

[0031] The related term database DB5 registers not only the related terms for each term but also, in association with each related term, the importance that the related term has for each field (hereinafter, term importance). Suppose, for example, that for the term "electrical characteristics" the related terms are "leakage, short circuit, overcurrent, current" in the field of quality control, "current, voltage, temperature" in the field of product planning, and "current, phase, Lissajous figure, temperature drift" in the field of product design. The related terms are then classified by field as in Table 1, and a numerical term importance is associated with each related term. In this example, terms such as "leakage", "short circuit", "overcurrent", and "temperature drift" suggest something unfavorable with respect to "electrical characteristics" but are nevertheless important, so they are registered in the related term database DB5 as related terms of "electrical characteristics". The method of setting the term importance is not described in detail here, but it is set based on the number of documents in each field in which the term is used and on the frequency with which the term occurs within a single document.

[0032]

[Table 1]
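The structure of the related term database DB5 described above, with related terms grouped by field and a numerical term importance attached to each, can be sketched as a nested mapping. The terms and fields come from the "electrical characteristics" example in the text; the importance values are placeholders invented for illustration, because the contents of Table 1 are not reproduced here.

```python
# keyword -> field -> {related term: term importance}
# All importance values are hypothetical placeholders.
RELATED_TERMS = {
    "electrical characteristics": {
        "quality control": {
            "leakage": 0.9, "short circuit": 0.8, "overcurrent": 0.7, "current": 0.5,
        },
        "product planning": {
            "current": 0.6, "voltage": 0.5, "temperature": 0.4,
        },
        "product design": {
            "current": 0.7, "phase": 0.5, "Lissajous figure": 0.4, "temperature drift": 0.6,
        },
    },
}


def related_terms(keyword, field_name, database=RELATED_TERMS):
    """Return (related term, term importance) pairs for the keyword in the
    given field, most important first; an empty list if nothing is registered."""
    terms = database.get(keyword, {}).get(field_name, {})
    return sorted(terms.items(), key=lambda item: item[1], reverse=True)
```

A list ordered this way is the kind of output that could be shown to the user so that related terms can be chosen with their term importance in view.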

【００３３】[0033]

The search sentence analyzing means 15 performs part-of-speech decomposition (morphological analysis) on the primary search sentence entered through the primary search sentence input means 14 and extracts the vocabulary that will serve as keywords (mainly nouns, although verbs, adjectives, adverbs, and adjectival verbs are also possible). If the vocabulary obtained by the part-of-speech decomposition contains a compound word, the compound word is extracted as well. For example, when the primary search sentence is 「製品の電気特性について」 ("About the electrical characteristics of the product"), collating the primary search sentence with the part-of-speech database DB4 decomposes it into 「製品／の／電気／特性／に／ついて」 (product / of / electricity / characteristics / about), where ／ marks a part-of-speech boundary. Because this primary search sentence contains the compound word 「電気特性」 ("electrical characteristics"), it is finally converted into 「製品／の／電気特性／に／ついて」. As a result of this conversion, "product" and "electrical characteristics" are adopted as keywords.
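The compound-word step and the keyword selection can be sketched as follows. This is a toy illustration under stated assumptions: the token list is supplied by hand rather than produced by a real morphological analyzer, and `PART_OF_SPEECH`, `COMPOUNDS`, `merge_compounds`, and `extract_keywords` are hypothetical names standing in for the part-of-speech database DB4 and the search sentence analyzing means 15.

```python
# Toy stand-in for the part-of-speech database DB4 (hypothetical contents).
PART_OF_SPEECH = {
    "製品": "noun", "の": "particle", "電気": "noun",
    "特性": "noun", "に": "particle", "ついて": "verb",
}

# Hypothetical table of registered compound words, keyed by adjacent token pairs.
COMPOUNDS = {("電気", "特性"): "電気特性"}


def merge_compounds(tokens):
    """Join adjacent tokens that form a registered compound word."""
    merged, i = [], 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in COMPOUNDS:
            merged.append(COMPOUNDS[pair])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def extract_keywords(tokens):
    """Keep nouns and registered compounds as keywords.

    This sketch ignores the other parts of speech that the text says are possible.
    """
    compound_words = set(COMPOUNDS.values())
    return [
        token for token in merge_compounds(tokens)
        if token in compound_words or PART_OF_SPEECH.get(token) == "noun"
    ]
```

Calling `extract_keywords(["製品", "の", "電気", "特性", "に", "ついて"])` keeps 「製品」 and the merged compound 「電気特性」 and drops the particles and the verb.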

【００３４】[0034]

Once the vocabulary to be adopted as keywords has been determined from the primary search sentence, each keyword is collated with the related term database DB5. When related terms exist for keywords extracted by the search sentence analyzing means 15, the user is shown which keywords have related terms. Because related terms differ from field to field, a field can be designated, and the search sentence analyzing means 15 collates related terms for the designated field. If the user then asks to see the related terms of a particular keyword, a list of the related terms of that keyword is presented together with their term importance, and the user can select the desired related terms while referring to the term importance. The secondary search sentence generating means 16 generates a secondary search sentence from the keywords extracted from the primary search sentence and those related terms stored in the related term database DB5 that the user has selected.

【００３５】[0035]

For example, suppose that the keywords extracted from the primary search sentence are "product" and "electrical characteristics" as described above, that the contents of Table 1 are registered in the related term database DB5 as the related terms of "electrical characteristics", and that the user has designated "quality control" as the field. Then "leakage, short circuit, overcurrent, current" are extracted as related terms and presented to the user together with their term importance. If the user selects "leakage" and "overcurrent" as related terms, the keywords become "product", "leakage", and "overcurrent", so the secondary search sentence generating means 16 generates the secondary search sentence 「製品の漏電、過電流について」 ("About leakage and overcurrent of the product").
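One way to picture the generation of the secondary search sentence is a simple template that substitutes the selected related terms for the expanded keyword. The sketch below is an assumption, not the method of the secondary search sentence generating means 16: the function name `build_secondary_sentence` and the fixed template "…の…について" are hypothetical and were chosen only because they reproduce the example sentence.

```python
def build_secondary_sentence(keywords, expanded_keyword, selected_terms):
    """Rebuild a natural-language sentence with one keyword replaced by related terms.

    keywords         -- keywords extracted from the primary search sentence
    expanded_keyword -- the keyword whose related terms were selected
    selected_terms   -- related terms chosen by the user
    The sentence template is an assumption of this sketch.
    """
    if not selected_terms:
        raise ValueError("at least one related term must be selected")
    remaining = [k for k in keywords if k != expanded_keyword]
    head = "の".join(remaining)
    terms = "、".join(selected_terms)
    return f"{head}の{terms}について" if head else f"{terms}について"
```

With the keywords 「製品」 and 「電気特性」 and the selected terms 「漏電」 and 「過電流」, the function returns 「製品の漏電、過電流について」.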

【００３６】[0036]

When a primary search sentence has been entered and a secondary search sentence has been generated by the operations described above, the execution search sentence selecting means 17 presents both the primary and the secondary search sentence to the user. At this stage the user can choose which of the two to use for the search. In this embodiment, however, only one search sentence can be selected from the primary and secondary search sentences. Presenting the secondary search sentence as well as the primary one, and allowing the search to be run with the secondary search sentence, makes it possible to search with keywords that the user had not thought of.

【００３７】[0037]

Both the primary and the secondary search sentence are natural sentences, so a natural-language search sentence is input to the search processing means 12 regardless of which one the user selects. As described above, the search processing means 12 extracts the search condition from the natural-language search sentence. If the secondary search sentence of the above example is used, the search sentence is "About leakage and overcurrent of the product". The search processing means 12 therefore extracts the vocabulary "product", "leakage", and "overcurrent", and collates with the document database DB1, as the search condition, the logical sum of the logical product of "product" and "leakage" and the logical product of "product" and "overcurrent". In other words, writing ∧ for logical product and ∨ for logical sum, documents satisfying the search condition product ∧ (leakage ∨ overcurrent) are extracted from the document database DB1.
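The search condition product ∧ (leakage ∨ overcurrent) can be checked against document text with a few lines of code. This is a minimal sketch under stated assumptions: documents are plain strings held in a dictionary, matching is by substring, and the names `matches` and `search` are hypothetical.

```python
def matches(text, required, alternatives):
    """True when every required keyword occurs and at least one alternative occurs.

    An empty list of alternatives is treated as "no OR clause", so only the
    required keywords are checked in that case.
    """
    has_required = all(word in text for word in required)
    has_alternative = not alternatives or any(word in text for word in alternatives)
    return has_required and has_alternative


def search(documents, required, alternatives):
    """Return the ids of the documents that satisfy the condition, in input order."""
    return [
        doc_id for doc_id, text in documents.items()
        if matches(text, required, alternatives)
    ]
```

For the example condition, `required` would be `["product"]` and `alternatives` would be `["leakage", "overcurrent"]`; a search with the primary search sentence alone corresponds to an empty list of alternatives.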

【００３８】[0038]

As described above, the search processing means 12 does not merely collate the documents registered in the document database DB1 with the search condition. It also determines the documents to be extracted by referring to a document importance database DB2, which associates each document with a document importance for each field, and a reference frequency database DB3, which associates each document with the frequency with which it has been referred to in the past (hereinafter, "reference frequency"). Specifically, the search processing means 12 uses as evaluation data the frequency with which the keywords used in the search condition appear in a document (hereinafter, "appearance frequency"), the document importance of each document for each field, and the reference frequency for each field. From these it obtains an evaluation scale for each document, arranges the headings of the documents that satisfy the search condition in descending order of the evaluation scale, and outputs them to the search result output means 13. For the field-specific document importance and reference frequency, the values for the field designated in the search sentence setting means 11 are used. Tables 2 and 3 show example data of the document importance database DB2 and the reference frequency database DB3, respectively. The document importance stored in the document importance database DB2 is set by, for example, the person who registered the document.

【００３９】[0039]

【表２】 [Table 2]

【００４０】[0040]

【表３】 [Table 3]

【００４１】[0041]

As the reference frequency, the number of times the body of a document whose heading was extracted as described above has been requested may be used. In this embodiment, however, each time a document is viewed (that is, each time its body is requested), a value weighted so as to be larger the larger the reference frequency stored in the reference frequency database DB3 is obtained, and this value is added to the current reference frequency. Alternatively, a value obtained by totaling the voting values that users enter after viewing each document may be used as the reference frequency. The former reference frequency grows more rapidly the more often the body is requested, whereas the latter varies with the judgment of users, because each user who has viewed a document is asked to vote on its utility value on a multi-level scale.
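The weighted update of the reference frequency can be illustrated with a small function. The text does not give the weighting formula, so the linear form used here, the parameter `growth_rate`, and the function name `updated_reference_frequency` are all assumptions of this sketch; the only property taken from the text is that the added value grows with the stored reference frequency.

```python
def updated_reference_frequency(current, growth_rate=0.1):
    """Return the reference frequency after one viewing of the document.

    The increment grows with the current value, so frequently viewed
    documents gain faster. The form 1 + growth_rate * current is an
    assumption of this sketch, not a formula from the source.
    """
    increment = 1.0 + growth_rate * current
    return current + increment
```

A plain view counter corresponds to `growth_rate=0`, in which case every viewing adds exactly 1.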

【００４２】[0042]

As the evaluation scale, the appearance frequency, the document importance, the reference frequency, and so on can be selected in addition to the evaluation point EP obtained by the following equation. That is, the user can select the desired evaluation scale from several types of evaluation scale that are derived from the evaluation data in different ways.

EP = ω1 × appearance frequency + ω2 × document importance + ω3 × reference frequency

Here ω1, ω2, and ω3 are weighting coefficients, which are set as appropriate according to how the appearance frequency, document importance, and reference frequency are calculated. For example, when, as in Table 4, the appearance frequency is the number of occurrences of the keywords as a percentage of the number of words in the document, the document importance is a value on a 10-level scale, and the reference frequency is the number of references, the coefficients can be set to, for example, ω1 = 50, ω2 = 1, and ω3 = 0.05. Alternatively, when the appearance frequency and the reference frequency are normalized so that their values fall in roughly the same range as the document importance, the coefficients can be set to, for example, ω1 = 1.0, ω2 = 0.8, and ω3 = 1.2.
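The evaluation point is a plain weighted sum, which can be written directly in code. In the sketch below the function name `evaluation_point` and its parameter names are hypothetical; the default weights are the first set of example coefficients given in the text (50, 1, 0.05), and the second set (1.0, 0.8, 1.2) can be passed for normalized inputs.

```python
def evaluation_point(appearance, importance, reference, weights=(50.0, 1.0, 0.05)):
    """EP = w1 * appearance frequency + w2 * document importance + w3 * reference frequency."""
    w1, w2, w3 = weights
    return w1 * appearance + w2 * importance + w3 * reference


# Second example weight set from the text, for normalized inputs.
NORMALIZED_WEIGHTS = (1.0, 0.8, 1.2)
```

Setting one or two of the weights to 0 reduces the same function to a scale based on the remaining quantities, for example `weights=(0.0, 1.0, 0.0)` for document importance alone.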

【００４３】[0043]

【表４】 [Table 4]

【００４４】[0044]

Arranging the documents in descending order of the evaluation point EP as described above amounts to arranging them with both the order of document importance and the order of reference frequency in the field designated in the search sentence setting means 11 taken into account, that is, with documents of higher importance and higher reference frequency placed first. A value obtained by changing the weighting coefficients of the evaluation point in the above equation can also be used as the evaluation scale. Using the appearance frequency, document importance, or reference frequency alone as the evaluation scale, or using a combination of any two of them, corresponds to setting one or more of the weighting coefficients to 0.

【００４５】[0045]

FIG. 2 shows the processing procedure in the search processing means 12. When a search is started in the search processing means 12, the search condition is collated with the document database DB1 and the search is executed (S1). The search results that match the search condition are temporarily stored in memory (S2). The search processing means 12 also acquires the designated field from the search sentence setting means 11 (S3) and, for that field, extracts the document importance corresponding to each document from the document importance database DB2 (S4) and the reference frequency from the reference frequency database DB3 (S5). The documents stored in memory are then sorted using the document importance and reference frequency obtained in this way, and the result is output to the search result output means 13 (S6).
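Steps S1 to S6 can be followed in a compact sketch. Everything here is an assumption made for illustration: the databases are plain dictionaries, the search condition is passed in as a predicate, the function name `run_search` is hypothetical, and the sort key (document importance first, then reference frequency) is only one possible way of using the two values, since the procedure itself does not fix how they are combined.

```python
def run_search(documents, condition, field, importance_db, reference_db):
    """Sketch of steps S1 to S6: match, look up per-field data, sort, output.

    documents     -- {doc_id: text}, a stand-in for the document database DB1
    condition     -- predicate applied to each document text
    field         -- the field designated by the user
    importance_db -- {doc_id: {field: importance}}, a stand-in for DB2
    reference_db  -- {doc_id: {field: reference frequency}}, a stand-in for DB3
    """
    # S1, S2: collate the search condition and hold the hits in memory.
    hits = [doc_id for doc_id, text in documents.items() if condition(text)]
    # S3 to S5: for the designated field, look up importance and reference frequency.
    scored = [
        (doc_id, importance_db[doc_id][field], reference_db[doc_id][field])
        for doc_id in hits
    ]
    # S6: sort by importance, then reference frequency, both descending
    # (the sort key is an assumption of this sketch).
    scored.sort(key=lambda row: (row[1], row[2]), reverse=True)
    return [doc_id for doc_id, _, _ in scored]
```

Only documents that pass the condition are looked up in the two per-field tables, which mirrors the order of the steps: matching first, then scoring for the designated field.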

【００４６】[0046]

The operation of this embodiment is described below by following a concrete work procedure. The screens shown in FIGS. 3 to 10 are assumed to be displayed on the display device of a terminal 2 connected to a server 1. Before a document search is started, the screen shown in FIG. 3 is first displayed on the terminal 2. This screen has a field F1 that prompts for input of a primary search sentence, a field F2 that displays related terms for the keywords extracted from the primary search sentence entered in field F1, and a field F3 for designating the field in which documents are to be searched. Near field F1 are three buttons B1 to B3: "Execute search", "Expand phrase", and "Reset". The "Execute search" button B1 is used to instruct execution of a document search using only the primary search sentence, the "Expand phrase" button B2 is used to request generation of a secondary search sentence, and the "Reset" button B3 is used to erase the primary search sentence written in field F1 so that a new primary search sentence can be entered.

【００４７】[0047]

Below field F1, which the primary search sentence input means 14 presents on the screen of the terminal 2, is field F2, which displays the related terms of keywords, and above and below the right end of field F2 are buttons B4 labeled "Search with secondary search sentence". Below the left end of field F2 is field F3 for designating the field, and a button B5 is provided adjacent to field F3. Operating button B5 (normally by clicking it with a pointing device such as a mouse) brings up a pop-up menu for field F3, and placing the cursor on one of the options in the pop-up menu and selecting it fixes the content of field F3. (With a pointing device such as a mouse, selecting generally means placing the cursor on the option and clicking; with a keyboard, it means moving the cursor with the cursor keys and then pressing the return key.) The options for field F3 are the fields to which the documents stored in the document database DB1 relate; in this embodiment, as described above, the selectable fields are product planning, product development, product design, manufacturing technology, quality control, product sales, and trouble records. At the lower right of the screen on which fields F1 and F2 are displayed is a button B6 labeled "Logout", which is operated to end the document search process.

【００４８】[0048]

Now suppose that, as shown in FIG. 4, the primary search sentence "About the electrical characteristics of the product" is entered in field F1. As described above, the search sentence analyzing means 15 then performs morphological analysis with reference to the part-of-speech database DB4, and "product" and "electrical characteristics" are extracted as keywords. When the "Expand phrase" button B2 is operated, the search sentence analyzing means 15 collates the keywords with the related term database DB5. If any keyword has related terms registered in the related term database DB5, that keyword is displayed in the "Phrase to expand" column of field F2, and a "Phrase search" button B13 is displayed in the "Execute expansion" column of field F2. The appearance of "electrical characteristics" in field F2 thus indicates that related terms are registered for "electrical characteristics".

【００４９】[0049]

When the user wants to know the related terms of the vocabulary item "electrical characteristics", operating the "Expand phrase" button B2 causes a field F5 to be displayed on the screen of the terminal 2, as shown in FIG. 5. Field F5 lists the related terms of "electrical characteristics" for the field selected in field F3. Above field F5 is a field F4 that indicates which vocabulary item the related terms shown in field F5 belong to. In the illustrated example, field F5 shows the related terms of "electrical characteristics" in the field of "product design" (the contents of Table 1 are assumed). The term importance is shown alongside each related term. At the lower right of field F5 is a "Back" button B7, which returns the display to the previous screen.

【００５０】[0050]

On the screen of FIG. 5, in which the related terms are shown in field F5, the user can refer to the term importance and select the related terms to be used as keywords. That is, as shown in FIG. 6, when the user selects a related term to be used as a keyword (generally by clicking near the related term with the mouse cursor), the selected related term is shown in reverse video (the hatched areas in the figure are the areas shown in reverse video). The illustrated example shows the state in which "leakage" and "overcurrent" have been selected from the related terms. The reason is that, among the terms related to "electrical characteristics", "current" is important vocabulary in a field such as product design but is not very important in the field of "quality control", where "leakage" and "overcurrent" matter more. A user who wants to search for documents in the field of "quality control" will therefore select "leakage" and "overcurrent". Operating the "Back" button B7 with "leakage" and "overcurrent" shown in reverse video causes "leakage, overcurrent" to be displayed in the "Expanded phrase" column of field F2, as shown in FIG. 7. In other words, the screen of the terminal 2 shows the primary search sentence in field F1 and, in field F2, the related terms that the user has selected from the related term database DB5 for the keywords extracted from the primary search sentence. If the expanded phrases need to be corrected, the user can select the "Expanded phrase" column and change its content to other vocabulary.

【００５１】[0051]

When the "Search with secondary search sentence" button B4 is operated in this state, the secondary search sentence generating means 16 automatically generates a secondary search sentence, which is a natural sentence. Once the secondary search sentence has been generated, the execution search sentence selecting means 17 displays it in a field F6 in a newly opened window W1, as shown in FIG. 8. Here, because "leakage" and "overcurrent" have been selected as related terms, the secondary search sentence "About leakage and overcurrent of the product" is generated. Window W1 asks the user whether to search for documents using the generated secondary search sentence, and the user operates either the "Yes" button B8 or the "No" button B9 in window W1. Operating the "Yes" button B8 closes window W1 and automatically runs the search with the secondary search sentence; operating the "No" button B9 closes window W1 and returns to the previous screen. Because the primary search sentence is displayed in field F1 on the previous screen, operating the "Execute search" button B1 there runs a search with the primary search sentence.

【００５２】[0052]

When one of the primary and secondary search sentences has been selected and execution of the search has been instructed, the search processing means 12 searches the document database DB1, extracts the documents that meet the condition with reference to the document importance database DB2 and the reference frequency database DB3, and arranges the headings of the extracted documents in descending order of the evaluation point EP described above. That is, as shown in FIG. 9, field F1 shows the search sentence with which the search was run (here, the secondary search sentence), and a field F7 displayed below field F1 shows the type of evaluation scale and how it is calculated. In the illustrated example the evaluation point EP is used as the evaluation scale, so the equation for the evaluation point EP is shown. A field F8 displayed below field F7 lists, for each document, its location (file name), its evaluation point EP, and its heading. Field F8 has a "Display" button B10 for each document, and operating a "Display" button B10 displays on the screen the body of the document whose heading is shown.

【００５３】[0053]

In the screen shown in FIG. 9, a field F11 at the lower left of field F8 lets the user select the evaluation scale by which the extracted documents are arranged. A button B14 is provided beside field F11, and operating button B14 brings up a pop-up menu whose options include, in addition to "Evaluation point", several different types of evaluation scale such as "Appearance frequency", "Document importance", and "Reference frequency". The documents extracted by the search processing means 12 are re-sorted according to the evaluation scale selected from the pop-up menu. The user can thus reorder the document headings by various evaluation scales and choose among various ways of finding the documents that match the search intention. The evaluation scale displayed in field F11 is also shown in field F7.
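Re-sorting the same result list by a user-selected evaluation scale can be sketched with a table of key functions. The names `SCALES` and `sort_by_scale`, the record layout, and the sample weights inside the "evaluation point" entry (the first example coefficients, 50, 1, and 0.05) are assumptions made for illustration.

```python
# Each selectable scale maps a result record to the number used for sorting.
# Records are dictionaries with the hypothetical keys used below.
SCALES = {
    "evaluation point": lambda d: 50.0 * d["appearance"] + 1.0 * d["importance"] + 0.05 * d["reference"],
    "appearance frequency": lambda d: d["appearance"],
    "document importance": lambda d: d["importance"],
    "reference frequency": lambda d: d["reference"],
}


def sort_by_scale(results, scale):
    """Return the headings ordered from the highest to the lowest value of the chosen scale."""
    key = SCALES[scale]
    return [d["heading"] for d in sorted(results, key=key, reverse=True)]
```

Because the list of hits is unchanged and only the key function differs, switching scales in the pop-up menu amounts to one more call to `sort_by_scale` with a different name.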

【００５４】[0054]

After documents matching the search intention have been extracted from the document database DB1 by the operations described above, a screen having a field F9 that shows the field and a field F10 that shows the extracted document is displayed, as shown in FIG. 10. This screen has a section for entering the user's voting value for the extracted document, in which the user votes on the utility value of the document on a three-level scale: "Very useful", "Useful", and "Not very useful". Each level has a radio button B12, and selecting one of the radio buttons B12 enters the voting value. The voting value entered on this screen is used in calculating the reference frequency in the reference frequency database DB3, as described above. That is, the value added to the reference frequency is made larger for the voting value corresponding to "Very useful" and smaller for the voting value corresponding to "Not very useful". The voting value is also used as a correction to the document importance stored in the document importance database DB2. The document importance database DB2 has an area in which a correction value set on the basis of the voting values is stored for each field in association with each document, and the correction value set on the basis of the voting values is used in subsequent searches. The correction value is set so as to increase the document importance for the voting value corresponding to "Very useful" and to decrease it for the voting value corresponding to "Not very useful".
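The effect of a vote on the stored values can be sketched as two lookup tables and one update function. The source gives no numeric values, so the increments in `REFERENCE_INCREMENT`, the corrections in `IMPORTANCE_CORRECTION` (including the choice of 0 for "useful"), the record layout, and the function name `apply_vote` are all assumptions; only the direction of each adjustment follows the text.

```python
# Hypothetical amounts per vote; only their ordering follows the description.
REFERENCE_INCREMENT = {"very useful": 3, "useful": 2, "not very useful": 1}
IMPORTANCE_CORRECTION = {"very useful": 1, "useful": 0, "not very useful": -1}


def apply_vote(record, field, vote):
    """Update one document's per-field reference frequency and importance correction in place.

    record -- {"reference": {field: value}, "correction": {field: value}}
    field  -- the field shown on the voting screen
    vote   -- one of the keys of REFERENCE_INCREMENT
    """
    record["reference"][field] = record["reference"].get(field, 0) + REFERENCE_INCREMENT[vote]
    record["correction"][field] = record["correction"].get(field, 0) + IMPORTANCE_CORRECTION[vote]
    return record
```

Keeping the correction separate from the registrant's document importance, as the text describes for the database DB2, means the original importance value is never overwritten by votes.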

【００５５】[0055]

FIG. 11 shows the overall processing procedure of this embodiment. When a document is to be retrieved from the document database DB1, the user first enters a primary search sentence (S1). The entered primary search sentence is decomposed into parts of speech (S2). If related terms for the keywords extracted from the primary search sentence are registered in the related term database DB5 (S3) and the user specifies that related terms are to be used, a secondary search sentence is generated automatically (S4). The generated secondary search sentence is presented to the user, who chooses whether to use it (S5). If there are no related terms, the search is performed with the primary search sentence. Once the search sentence has been decided, the search is executed (S6). If several documents are extracted as the search result (S7), they are sorted in descending order of the evaluation scale (S8); if only one document is extracted, it is output as it is. When the user requests to view the body of a document extracted in this way, the document is regarded as having been referred to (S9) and its reference count is updated (S10). If no document is referred to, the process simply ends.

【００５６】[0056]

【発明の効果】[Effects of the Invention]

The invention of claim 1 comprises: a document database storing a plurality of documents used within a range of a plurality of specific fields; search sentence setting means that lets a user set a search sentence and designate the field of the search range; search processing means that extracts from the document database the documents matching the search condition of the search sentence set by the search sentence setting means; and search result output means that outputs the search result of the search processing means. The search processing means has a function of obtaining an evaluation scale of each document in each field on the basis of evaluation data classified by field for each document registered in the document database, a function of letting the user select a desired evaluation scale from a plurality of types of evaluation scale that are obtained in different ways, and a function of obtaining, for each extracted document, the evaluation scale of the type selected by the user with respect to the field designated in the search sentence setting means, arranging the documents in order of the magnitude of that evaluation scale, and passing them to the search result output means as the search result. Because the search results are output in order of an evaluation scale that evaluates the utility value of each document by field, documents that are likely to match the user's search intention can be presented first, which increases the likelihood that the target document can be extracted without the user having to consider documents that are merely unwanted noise. Moreover, because the user selects the desired evaluation scale from a plurality of types, the order of the documents can be changed by using a different evaluation scale, and selecting an evaluation scale suited to the purpose increases the likelihood of reaching the target document.

[0057] According to the invention of claim 2, in the invention of claim 1, the search sentence setting means comprises: primary search sentence input means for having the user input a primary search sentence, which is a natural-language sentence containing keywords; search sentence analysis means for extracting keywords from the primary search sentence; a related term database in which related terms for vocabulary that can serve as keywords are registered by field; secondary search sentence generation means having a function of collating the keywords extracted by the search sentence analysis means against the related term database and generating a secondary search sentence that uses related terms for those keywords; and execution search sentence selection means for having the user select, from the primary search sentence and the secondary search sentence, the search sentence to be given to the search processing means. Because the related terms that expand the keywords are classified by field, the keywords can be expanded to reduce search omissions, while restricting the field reduces the likelihood that unnecessary noise is included.

[0058] According to the invention of claim 3, in the invention of claim 2, the related term database registers as related terms not only synonyms of the vocabulary that can serve as keywords but also vocabulary having a specific relevance in each field, and each related term is associated with a term importance that indicates its utility value in each field. The secondary search sentence generation means presents the term importance to the user together with the related terms and has the user select the desired related terms from among them. Because a term importance is set for each related term by field, the term importance gives the user a measure of how effective each candidate term is as an expansion, and the user can choose how far to expand the keywords. This reduces the number of extracted documents that are noise, that is, documents that do not match the search intention. In other words, documents that match the user's search intention are more likely to be extracted.
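
The keyword expansion described for claims 2 and 3 can be sketched as follows. This is a minimal illustration under stated assumptions: the `RELATED_TERMS` table, its contents, and the representation of a secondary search sentence as OR-groups of terms are all hypothetical and are not specified by the source.

```python
# Hypothetical related term database:
# field -> keyword -> list of (related term, term importance in that field).
RELATED_TERMS = {
    "lighting": {"dimmer": [("light controller", 0.9), ("brightness control", 0.6)]},
    "power": {"dimmer": [("phase control", 0.7)]},
}


def candidate_terms(keywords, field_name):
    """Look up related terms for each keyword in the specified field.

    Terms are returned with their term importance, highest first, so they
    can be shown to the user as a guide for choosing expansions.
    """
    table = RELATED_TERMS.get(field_name, {})
    return {
        kw: sorted(table.get(kw, []), key=lambda pair: pair[1], reverse=True)
        for kw in keywords
    }


def secondary_query(keywords, chosen):
    """Build the secondary search sentence from the user's chosen terms.

    Assumption: each keyword becomes a group of alternatives (keyword OR
    chosen related terms); the source does not fix this representation.
    """
    return [[kw, *chosen.get(kw, [])] for kw in keywords]
```

Because the lookup is keyed by field, the same keyword yields different candidate terms in different fields, which is how restricting the field limits noise while still widening the query.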

[0059] According to the invention of claim 4, in the invention of any one of claims 1 to 3, the system comprises a document importance database in which the utility value of each document registered in the document database in each field is registered as a document importance serving as the evaluation data. When document importance is selected as one of the evaluation scale options, the search processing means collates the documents extracted by the search conditions against the document importance database and arranges the documents using the document importance as the evaluation scale. Because the utility value of a document in each field is set as its document importance, if the registrant sets the document importance, the documents can be ordered by their utility value in each field as judged by a person, and documents of higher utility value are more likely to be presented at higher ranks.

[0060] According to the invention of claim 5, in the invention of any one of claims 1 to 3, the system comprises a reference frequency database in which the reference frequency of each document registered in the document database is registered by field as the evaluation data. When reference frequency is selected as one of the evaluation scale options, the search processing means collates the documents extracted by the search conditions against the reference frequency database and arranges the documents using the reference frequency as the evaluation scale. Documents with a higher reference frequency, that is, documents that have been used more often, are therefore presented at higher ranks.

[0061] According to the invention of claim 6, in the invention of claim 2 or claim 3, the system comprises a document importance database in which the utility value of each document registered in the document database in each field is registered as a document importance serving as the evaluation data, and a reference frequency database in which the reference frequency of each document registered in the document database is registered by field as the evaluation data. When an evaluation point, obtained by weighted addition of the appearance frequency of the keywords in each extracted document, the document importance, and the reference frequency, is selected as one of the evaluation scale options, the search processing means collates the documents extracted by the search conditions against the document importance database and the reference frequency database, obtains the appearance frequency of the keywords in each extracted document, computes the evaluation point from these values, and arranges the documents using the evaluation point as the evaluation scale. The documents are therefore presented in an order that takes appearance frequency, document importance, and reference frequency into account together, which raises the probability that the target document is presented at a high rank.
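
The evaluation point described for claim 6 is a weighted sum of three measures. The sketch below shows one way to compute it and rank by it; the weight values and the function names are assumptions, since this passage does not state how the weights are chosen.

```python
def evaluation_point(appearance, importance, reference, weights=(1.0, 1.0, 1.0)):
    """Weighted addition of appearance frequency, document importance and
    reference frequency. The default weights are an assumption."""
    w_appearance, w_importance, w_reference = weights
    return (w_appearance * appearance
            + w_importance * importance
            + w_reference * reference)


def rank_by_evaluation_point(measures, weights=(1.0, 1.0, 1.0)):
    """Order document ids by evaluation point, highest first.

    `measures` maps a document id to the tuple
    (appearance frequency, document importance, reference frequency)
    already looked up for the specified field.
    """
    return sorted(
        measures,
        key=lambda doc_id: evaluation_point(*measures[doc_id], weights=weights),
        reverse=True,
    )
```

Changing the weights shifts the balance between the three measures, so the same set of extracted documents can be ordered with more emphasis on registrant judgment, on usage history, or on keyword occurrence.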

[0062] According to the invention of claim 7, in the invention of claim 5 or claim 6, when a document registered in the document database is referenced, the search processing means adds to the reference frequency a value that is larger the larger the reference frequency already registered in the reference frequency database, which enables ranking that emphasizes reference frequency.
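
The update rule described for claim 7 only requires that the added value grow with the reference frequency already registered. The sketch below uses a linear rule, `base + rate * current`, purely as an assumption; the function name, the dictionary keyed by `(doc_id, field_name)`, and the constants are likewise hypothetical.

```python
def record_reference(ref_freq, doc_id, field_name, base=1.0, rate=0.1):
    """Update the reference frequency when a document is referenced.

    Assumed rule: the increment is base + rate * current, so a document
    with a higher registered frequency receives a larger increment.
    """
    key = (doc_id, field_name)
    current = ref_freq.get(key, 0.0)
    ref_freq[key] = current + base + rate * current
    return ref_freq[key]
```

Under this rule frequently referenced documents pull ahead faster than rarely referenced ones, which is one way to obtain a ranking that emphasizes reference frequency.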

[0063] According to the invention of claim 8, in the invention of claim 5 or claim 6, when a document registered in the document database is referenced, the search processing means has the user input a vote value corresponding to the utility value and uses the total of the vote values as the reference frequency. Because the rank at which a document is presented is determined by users' judgments of its utility value, documents recognized as having higher utility value are presented at higher ranks.

[0064] According to the invention of claim 9, in the invention of claim 4 or claim 6, when a document registered in the document database is referenced, the search processing means has the user input a vote value corresponding to the utility value and corrects the document importance of that document registered in the document importance database so that it becomes larger the larger the vote value. Because the document importance is corrected by users' judgments of utility value, documents that correspond to the users' value judgments become easier to extract.
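
The correction described for claim 9 only requires that a larger vote value produce a larger document importance. The sketch below uses a weighted blend of the old importance and the vote as one rule with that property; the blend itself, the `alpha` parameter, and the dictionary keyed by `(doc_id, field_name)` are assumptions and are not taken from the source.

```python
def apply_vote(importance, doc_id, field_name, vote, alpha=0.2):
    """Correct the registered document importance using a user's vote.

    Assumed rule: new = (1 - alpha) * old + alpha * vote. For a fixed old
    value, a larger vote always yields a larger corrected importance.
    """
    key = (doc_id, field_name)
    old = importance.get(key, 0.0)
    importance[key] = (1 - alpha) * old + alpha * vote
    return importance[key]
```

A small `alpha` keeps the registrant's original setting dominant, while a larger `alpha` lets user votes move the importance more quickly.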

[0065] The invention of claim 10 is a method for retrieving a target document from a document database storing a plurality of documents used within a plurality of specific fields. After the user sets a search sentence and specifies the field of the search range, documents that match the search conditions of the set search sentence are extracted from the document database. The user then selects, from a plurality of types that are obtained in different ways, the evaluation scale for each document in each field that is obtained from evaluation data classified by field for each document. Next, the evaluation scale of the selected type is obtained for the specified field for each extracted document, and the extracted documents are arranged in order of magnitude of that evaluation scale and output as the search result. Because the search results are output in order of magnitude of an evaluation scale that rates the utility value of each document by field, documents that are likely to match the user's search intention can be presented preferentially, and the user is more likely to be able to extract the target document without having to consider documents that are unnecessary noise. Moreover, because the user selects the desired evaluation scale from a plurality of types, the order of the documents can be changed by using a different evaluation scale, and selecting an evaluation scale suited to the purpose increases the likelihood of reaching the target document.

[0066] According to the invention of claim 11, in the invention of claim 10, after the user inputs a primary search sentence, which is a natural-language sentence containing keywords, the keywords are extracted from the primary search sentence. The extracted keywords are then collated against a related term database, in which related terms for vocabulary that can serve as keywords are registered by field, to generate a secondary search sentence that uses the related terms, and the user selects the search sentence from the primary search sentence and the secondary search sentence. Because the related terms that expand the keywords are classified by field, the keywords can be expanded to reduce search omissions, while restricting the field reduces the likelihood that unnecessary noise is included.

[0067] According to the invention of claim 12, in the invention of claim 10 or claim 11, the documents are arranged using as the evaluation scale at least one of: the document importance, which is the utility value in each field of each document registered in the document database; the reference frequency by field of each document registered in the document database; and the appearance frequency of the keywords in each document extracted from the document database by the search conditions. When the document importance is used as the evaluation scale, documents can be extracted in an order that follows the intention of the person who set the document importance. When the reference frequency is used, documents can be extracted in an order that follows their usage history. When the appearance frequency is used, documents can be extracted in an order that follows an objective measure of the documents themselves. Using an evaluation scale that combines these therefore increases the likelihood that the target document is placed and presented near the top.

[0068] According to the invention of claim 13, in the invention of claim 12, when a document registered in the document database is referenced, the user inputs a vote value corresponding to the utility value, and the total of the vote values is used as the reference frequency. Because the rank at which a document is presented is determined by users' judgments of its utility value, documents recognized as having higher utility value are presented at higher ranks.

[0069] According to the invention of claim 14, in the invention of claim 12, when a document registered in the document database is referenced, the user inputs a vote value corresponding to the utility value, and the document importance of that document is corrected so that it becomes larger the larger the vote value. Because the document importance is corrected by users' judgments of utility value, documents that correspond to the users' value judgments become easier to extract.

[Brief description of drawings]

FIG. 1 is a block diagram showing an embodiment of the present invention.

FIG. 2 is a diagram explaining the operation of the search processing means used in the embodiment.

FIG. 3 is a diagram explaining the operation of the embodiment.

FIG. 4 is a diagram explaining the operation of the embodiment.

FIG. 5 is a diagram explaining the operation of the embodiment.

FIG. 6 is a diagram explaining the operation of the embodiment.

FIG. 7 is a diagram explaining the operation of the embodiment.

FIG. 8 is a diagram explaining the operation of the embodiment.

FIG. 9 is a diagram explaining the operation of the embodiment.

FIG. 10 is a diagram explaining the operation of the embodiment.

FIG. 11 is a diagram explaining the operation of the embodiment, showing the overall processing procedure.

[Explanation of symbols]

1 Server
2 Terminal
11 Search sentence setting means
12 Search processing means
13 Search result output means
14 Primary search sentence input means
15 Search sentence analysis means
16 Secondary search sentence generation means
DB1 Document database
DB2 Document importance database
DB3 Reference frequency database
DB4 Part-of-speech database
DB5 Related term database


[Procedure amendment]

[Submission date] November 28, 2001

[Procedure Amendment 1]

[Document to be amended] Drawings

[Item to be amended] FIG. 9

[Amendment method] Change

[Amendment content]

[FIG. 9]

Claims

[Claims]

1. A document retrieval system comprising: a document database storing a plurality of documents used within a plurality of specific fields; search sentence setting means for having a user set a search sentence and specify the field of the search range; search processing means for extracting from the document database documents that match the search conditions of the search sentence set by the search sentence setting means; and search result output means for outputting the search result of the search processing means, wherein the search processing means has a function of obtaining an evaluation scale for each document in each field on the basis of evaluation data classified by field for each document registered in the document database, a function of having the user select a desired evaluation scale from a plurality of types of evaluation scales that are obtained in different ways, and a function of obtaining, for each extracted document, the evaluation scale of the type selected by the user for the field specified through the search sentence setting means, arranging the documents in order of magnitude of the evaluation scale, and passing them to the search result output means as the search result.

2. The document retrieval system according to claim 1, wherein the search sentence setting means comprises: primary search sentence input means for having the user input a primary search sentence, which is a natural-language sentence containing keywords; search sentence analysis means for extracting keywords from the primary search sentence; a related term database in which related terms for vocabulary that can serve as keywords are registered by field; secondary search sentence generation means having a function of collating the keywords extracted by the search sentence analysis means against the related term database and generating a secondary search sentence that uses related terms for those keywords; and execution search sentence selection means for having the user select, from the primary search sentence and the secondary search sentence, the search sentence to be given to the search processing means.

3. The document retrieval system according to claim 2, wherein the related term database registers as related terms not only synonyms of the vocabulary that can serve as keywords but also vocabulary having a specific relevance in each field, each related term is associated with a term importance that indicates its utility value in each field, and the secondary search sentence generation means presents the term importance to the user together with the related terms and has the user select the desired related terms from among them.

4. The document retrieval system according to any one of claims 1 to 3, further comprising a document importance database in which the utility value of each document registered in the document database in each field is registered as a document importance serving as the evaluation data, wherein, when document importance is selected as one of the evaluation scale options, the search processing means collates the documents extracted by the search conditions against the document importance database and arranges the documents using the document importance as the evaluation scale.

5. The document retrieval system according to any one of claims 1 to 3, further comprising a reference frequency database in which the reference frequency of each document registered in the document database is registered by field as the evaluation data, wherein, when reference frequency is selected as one of the evaluation scale options, the search processing means collates the documents extracted by the search conditions against the reference frequency database and arranges the documents using the reference frequency as the evaluation scale.

6. The document retrieval system according to claim 2 or claim 3, further comprising: a document importance database in which the utility value of each document registered in the document database in each field is registered as a document importance serving as the evaluation data; and a reference frequency database in which the reference frequency of each document registered in the document database is registered by field as the evaluation data, wherein, when an evaluation point obtained by weighted addition of the appearance frequency of the keywords in each extracted document, the document importance, and the reference frequency is selected as one of the evaluation scale options, the search processing means collates the documents extracted by the search conditions against the document importance database and the reference frequency database, obtains the appearance frequency of the keywords in each extracted document, computes the evaluation point from these values, and arranges the documents using the evaluation point as the evaluation scale.

7. The document retrieval system according to claim 5 or claim 6, wherein, when a document registered in the document database is referenced, the search processing means adds to the reference frequency a value that is larger the larger the reference frequency registered in the reference frequency database.

8. The document retrieval system according to claim 5 or claim 6, wherein, when a document registered in the document database is referenced, the search processing means has the user input a vote value corresponding to the utility value and uses the total of the vote values as the reference frequency.

9. The document retrieval system according to claim 4 or claim 6, wherein, when a document registered in the document database is referenced, the search processing means has the user input a vote value corresponding to the utility value and corrects the document importance of that document registered in the document importance database so that it becomes larger the larger the vote value.

10. A document retrieval method for retrieving a target document from a document database storing a plurality of documents used within a plurality of specific fields, the method comprising: having a user set a search sentence and specify the field of the search range; extracting from the document database documents that match the search conditions of the set search sentence; having the user select, from a plurality of types that are obtained in different ways, the evaluation scale for each document in each field that is obtained from evaluation data classified by field for each document; obtaining, for each extracted document, the evaluation scale of the selected type for the specified field; and arranging the extracted documents in order of magnitude of that evaluation scale and outputting them as the search result.

11. The document retrieval method according to claim 10, wherein, after the user inputs a primary search sentence, which is a natural-language sentence containing keywords, the keywords are extracted from the primary search sentence, the extracted keywords are collated against a related term database in which related terms for vocabulary that can serve as keywords are registered by field to generate a secondary search sentence that uses the related terms, and the user selects the search sentence from the primary search sentence and the secondary search sentence.

12. The document retrieval method according to claim 10 or claim 11, wherein the documents are arranged using as the evaluation scale at least one of: the document importance, which is the utility value in each field of each document registered in the document database; the reference frequency by field of each document registered in the document database; and the appearance frequency of the keywords in each document extracted from the document database by the search conditions.

13. The document retrieval method according to claim 12, wherein, when a document registered in the document database is referenced, the user inputs a vote value corresponding to the utility value, and the total of the vote values is used as the reference frequency.

14. The document retrieval method according to claim 12, wherein, when a document registered in the document database is referenced, the user inputs a vote value corresponding to the utility value, and the document importance of that document is corrected so that it becomes larger the larger the vote value.