JP2013041385A

JP2013041385A - Document retrieval method, document retrieval device, and document retrieval program

Info

Publication number: JP2013041385A
Application number: JP2011177398A
Authority: JP
Inventors: Masao Tamaki; 理雄玉木
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2011-08-15
Filing date: 2011-08-15
Publication date: 2013-02-28

Abstract

PROBLEM TO BE SOLVED: To provide a document retrieval method that retrieves patent documents efficiently while reducing retrieval omissions. SOLUTION: A document retrieval device performs the steps of: retrieving, on the basis of inputted retrieval keywords, documents including the retrieval keywords from a database in which retrieval object documents are accumulated; extracting first keywords from the first retrieval result; extracting the classification codes included in the group of documents in the first retrieval result; tabulating the extracted classification codes; retrieving documents including the classification codes from the database on the basis of the tabulated classification codes; extracting second keywords included in the group of documents in the second retrieval result; comparing the first keywords with the second keywords to extract keywords that are not included in the first keywords; and displaying the first retrieval result and the keywords in the keyword comparison result.

Description

The present invention relates to a document search method, a document search device, and a document search program for searching for documents.

Searching for prior art is an essential task when filing a patent application. An inventor usually uses a patent search system, of which IPDL is a representative example, to investigate the prior art. The inventor typically searches with keywords suggested by his or her own invention: the inventor anticipates search keywords that will hit the desired patent gazettes, repeats the search while refining how the keywords are entered, and finds useful information.

In a general search system, one or more search keywords are input and logical operations can be applied to them: documents containing all of the keywords (logical product, or AND search), documents containing any of them (logical sum, or OR search), or documents containing none of them (negation, or NOT search). This brings the results closer to the set of documents the user expects. Many search systems also support advanced searches that combine logical operations, such as documents that contain either keyword A or keyword B and also contain keyword C.
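The logical operations described above can be illustrated with set operations. The following is a minimal sketch, not part of the patent: the corpus `DOCS`, the document ids, and the helper `hits` are hypothetical, and plain substring matching stands in for a real full-text index.

```python
# Hypothetical mini-corpus: document id -> body text (illustrative data only).
DOCS = {
    "D1": "camera with zoom lens",
    "D2": "camera with high-power lens",
    "D3": "telescope with zoom lens",
    "D4": "tripod for camera",
}


def hits(keyword):
    """Return the ids of documents whose text contains the keyword."""
    return {doc_id for doc_id, text in DOCS.items() if keyword in text}


all_ids = set(DOCS)

and_result = hits("camera") & hits("zoom lens")          # AND search
or_result = hits("zoom lens") | hits("high-power lens")  # OR search
not_result = all_ids - hits("camera")                    # NOT search

# Combination: (A OR B) AND C
combined = (hits("zoom lens") | hits("high-power lens")) & hits("camera")
```

In this toy corpus the AND search for "camera" and "zoom lens" misses D2, which says "high-power lens" instead; this is the kind of omission the following paragraph discusses.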

However, several problems arise when such search techniques are actually used. Consider, for example, searching for prior art on a camera zoom lens by specifying the keywords "camera" and "zoom lens". If a gazette uses the phrase "high-power lens" instead of "zoom lens", the search does not hit it. In other words, unless the searcher thinks of such synonyms, or of keywords that cover the same concept, relevant information is missed. Suppose instead that the searcher thinks of the keywords "camera", "zoom lens", and "high-power lens" and widens the search to documents containing any of them. If thousands or tens of thousands of results are then presented, they are likely to include many gazettes the inventor does not want, and reading all of them is not realistic. It is also possible that a document that should be examined as a known example uses none of these keywords. Thus, even when the goal is a result set with no (or few) omissions, if too many documents hit the search it is practical to narrow them down to a manageable number. Conversely, when no documents or too few documents hit the search, it is necessary to suggest keywords associated with the search keywords the user entered, in order to widen the search.

One way of narrowing down is to search by classification code. For patent information, for example, hierarchical classification codes such as the IPC (International Patent Classification), the Japanese FI (File Index) and F (File Forming) terms, and the U.S. Patent Classification are assigned to each patent document. In FI and F-terms in particular, the classification code values are subdivided into more than 100,000 items. Therefore, if the user can learn which classification code values match the purpose and intent of the search, searching with those values as keys makes a high-precision search with little search noise and few omissions possible. However, because the classifications are hierarchical and finely subdivided, it is difficult for an inventor to sift through unfamiliar classification codes and find the ones that match the purpose and intent of the search.

In Patent Document 1, a keyword dictionary is prepared and keywords related to the input keyword are derived from it, assisting the inventor's search.

In Patent Document 2, classification codes are extracted from the search result and the classification codes that should be used for narrowing down are derived, assisting the inventor's search.

Patent Document 1: JP 2010-003015 A
Patent Document 2: JP 2008-165401 A

As described above, deriving related keywords and classification codes makes it possible to address search omissions and narrowing down, but the following problems remain.
(Problem 1) The keyword dictionary is prepared by the system provider and is not always up to date. Even when a keyword that should newly be treated as related appears, it is not necessarily reflected in the dictionary, so the keyword may not be derived.
(Problem 2) Classification codes remain unfamiliar to inventors. Even if a classification code that is effective for a search result is identified, the barrier to using it in the next search is still high, so it does not necessarily contribute much to making the inventor's search more efficient.

The present invention has been made to solve the problems described above. Its object is to provide a document search method that reduces search omissions when searching patent documents and allows the search to be performed efficiently, together with a document search device and a document search program that implement the method.

One means for solving the above problems is as follows. The present invention is a document search method in a document search device that performs a first search process of searching, based on an input search keyword, a database in which the documents to be searched are accumulated for documents including the search keyword, and storing the search result in a storage device. The method is characterized in that the document search device performs: extraction of a first keyword from the result of the first search process; a classification extraction process of extracting the classification codes included in the document group of the result of the first search process; a classification totaling process of totaling the extracted classification codes and storing the totaling result in the storage device; a second search process of searching the database, based on the totaled classification codes, for documents including those classification codes and storing the search result in the storage device; a keyword extraction process of extracting a second keyword included in the document group of the search result of the second search process; a keyword comparison process of comparing the first keyword with the second keyword and extracting keywords that do not exist in the first keyword; and a search result display process of displaying the search result obtained in the first search process and the keywords of the keyword comparison result.

According to the present invention, search omissions in patent searches can be reduced and searches can be performed efficiently.

FIG. 1 is a diagram showing an example of the overall configuration of the document search system.
FIG. 2 is a diagram showing an example of the hardware configuration of the document search device 102.
FIG. 3 is a flowchart showing an example of an outline of the operation of the document search device 102.
FIG. 4 is a diagram showing an example of a record of the gazette table.
FIG. 5 is a diagram showing an example of a search result group.
FIG. 6 is a diagram showing an example of a search keyword group.
FIG. 7 is a diagram showing an example of a classification code group.
FIG. 8 is a diagram showing an example of a re-search result group.
FIG. 9 is a diagram showing an example of a search keyword group obtained by re-search.
FIG. 10 is a diagram showing an example of a group of related keywords.
FIG. 11 is a diagram showing an example of the process of generating the search result group A and the keyword group α.
FIG. 12 is a diagram showing an example of a display result.

Hereinafter, an embodiment of the present invention will be described in detail with reference to the drawings.

First, the overall configuration of the document search system according to the present embodiment will be described.

FIG. 1 is a configuration diagram schematically showing the overall configuration of a document search system constructed as an embodiment of the present invention. The system consists of a terminal 101 operated by the inventor, a document search device (server) 102 on which the system runs, and a network 103 connecting the terminal 101 and the document search device 102.

FIG. 2 is a diagram illustrating an example of the hardware configuration of the document search device 102 constructed as an embodiment of the present invention. The document search device 102 can be realized by a general computer and includes a communication device 200 that exchanges information with the terminal 101 via the network 103, an input device 201 with which the inventor or another user inputs search keywords, a display device 202 that displays search results, a processing device 102, a storage device 203, and a document database 204 (hereinafter, "database" is abbreviated as "DB"). These devices are connected by a bus or the like. The processing device 102 includes a search execution processing unit 205, a search result display processing unit 206, a classification extraction processing unit 207, a classification totaling processing unit 208, a re-search processing unit 209, a keyword extraction processing unit 210, and a keyword comparison processing unit 211. The input device 201 and the display device 202 need not be part of the document search device 102. Likewise, the storage device 203 and the document DB 204 need not be inside the document search device 102 and may be located in another device. The processing device 102 can be realized by a CPU or the like. It loads various programs from the storage device 203 into a memory (not shown) and executes predetermined programs to realize the processing units described above. That is, the processing units 205 to 212 are realized as processes of the processing device 102. The predetermined programs that realize the processing units may be stored in the storage device 203 in advance; may be stored in a portable, computer-usable storage medium and read out as needed via a reader; or may be downloaded as needed from another device connected to the communication device 200, which uses the network 103 (a computer-usable communication medium) or a carrier wave propagating on the network 103, and then stored in the storage device 203.

The storage device 203 stores a keyword group α 501, a search result group A 601, a classification code 701, a search result group B 801, a keyword group β 901, and related keywords 1001, all described later. The document DB 204 stores a gazette table whose records hold, for each gazette, information such as the body text 402, IPC 403, FI 404, and F-term 405.
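One way to picture a record of the gazette table is as a small data class. This is only a sketch: the class name, field names, publication number, and code value below are hypothetical and chosen for illustration; the source specifies only that each record carries a publication number 401, body text 402, IPC 403, FI 404, and F-term 405.

```python
from dataclasses import dataclass, field


@dataclass
class GazetteRecord:
    """One row of the gazette table (field names are illustrative)."""
    publication_number: str                      # corresponds to 401
    body: str                                    # corresponds to 402
    ipc: list = field(default_factory=list)      # corresponds to 403
    fi: list = field(default_factory=list)       # corresponds to 404
    f_terms: list = field(default_factory=list)  # corresponds to 405


# Hypothetical sample row; the number and the code are placeholders.
records = [GazetteRecord("PUB-0001", "camera with zoom lens", ipc=["G02B"])]

# The table is keyed by publication number so that a record can be
# looked up from an entry of a search result group.
table = {r.publication_number: r for r in records}
```

Keying the table by publication number mirrors how the later steps identify a record from a search result group.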

The search execution processing unit 205 executes search processing based on a search request from the terminal 101 or the input device 201.

The search result display processing unit 206 transmits the search result screen to the terminal 101 that requested the search, or displays it on the display device 202.

The classification extraction processing unit 207 extracts the classification codes included in the document group of the search result.

The classification totaling processing unit 208 totals the extracted classification codes.

The re-search processing unit 209 searches the document DB 204 for documents including the classification codes, based on the totaled classification codes.

The keyword extraction processing unit 210 extracts the keywords included in the document group of the search result found by the re-search.

The keyword totaling processing unit 211 totals the keywords extracted by the keyword extraction processing unit 210.

The keyword comparison processing unit 212 compares the keywords totaled by the keyword totaling processing unit 211 with the keyword group α 501 described later.

Next, the operation of the document search system according to the present embodiment will be described.

FIG. 3 is a flowchart showing an example of an outline of the operation of the document search system according to the present embodiment.

In steps 301 and 302, the search execution process 205 receives a search request containing the search keywords that the inventor or another user entered from the input device 201 or the terminal 101, and executes search processing based on the received search keywords.

FIG. 11 shows the series of processes in step 303. In the search execution process 205, a full-text search is performed on the body text 402 of the gazette table of FIG. 4 using the input search keywords 1101, the records that hit are collected, and the search result group A 501 of FIG. 5, which consists of the publication numbers 401 of the gazette table of FIG. 4, is generated in the storage device 201. Then, for each publication number in the search result group A 501, the gazette table of FIG. 4 is referenced, the record is identified by its publication number 401, the keywords in the body text 402 are acquired, and the keyword group α 601 of FIG. 6 is generated in the storage device 201. The keywords are acquired by morphological analysis. The gazette table of FIG. 4 is a table prepared in advance by the system, in which document information is accumulated.
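Step 303 can be sketched as follows, under several assumptions that are not in the source: the table `TABLE` and its publication numbers are hypothetical, multiple search keywords are combined with AND, and a simple regular-expression word split stands in for morphological analysis (a real system for Japanese text would use a morphological analyzer).

```python
import re

# Hypothetical gazette table: publication number -> body text.
TABLE = {
    "P1": "camera with zoom lens",
    "P2": "camera with high-power lens",
    "P3": "zoom lens barrel for a telescope",
}


def tokenize(text):
    """Stand-in for morphological analysis: lowercase word split."""
    return re.findall(r"[a-z0-9-]+", text.lower())


def first_search(search_keywords):
    """Return (search result group A, keyword group alpha)."""
    # Full-text search over the body text; AND semantics are an assumption.
    group_a = [pub for pub, body in TABLE.items()
               if all(kw in body for kw in search_keywords)]
    # Collect the keywords found in the bodies of the hit documents.
    alpha = set()
    for pub in group_a:
        alpha.update(tokenize(TABLE[pub]))
    return group_a, alpha


group_a, alpha = first_search(["camera", "zoom lens"])
```

Here `group_a` holds only publication numbers, as the search result group A does in the description, and the bodies are looked up again when the keywords are collected.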

In steps 304 and 305, for each document in the search result group A 501, the classification extraction process 207 refers to the gazette table of FIG. 4, identifies the record by its publication number 401, and extracts the IPC 403 as classification data. The classification totaling process 208 then totals the extracted codes, and the patent classification group 701 of FIG. 7 is generated in the storage device 201. The FI 404 or the F-term 405 may also be used as the classification data. It is also possible to generate the group from only the top-ranked classifications, that is, those assigned to the largest numbers of gazettes in the totaled result.
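The totaling in steps 304 and 305 amounts to counting, for each classification code, how many gazettes in the search result group carry it. A minimal sketch using `collections.Counter` follows; the data in `IPC_BY_PUB`, the function name, and the `top_n` parameter are hypothetical.

```python
from collections import Counter

# Hypothetical data: publication number -> IPC codes (sample values).
IPC_BY_PUB = {
    "P1": ["G02B", "G03B"],
    "P2": ["G02B"],
    "P3": ["H04N", "G02B"],
}
group_a = ["P1", "P2", "P3"]


def tally_classifications(pubs, top_n=None):
    """Count the gazettes per classification code, optionally keeping the top N."""
    counts = Counter()
    for pub in pubs:
        # set() ensures a code is counted once per gazette.
        counts.update(set(IPC_BY_PUB[pub]))
    return counts.most_common(top_n)


all_codes = tally_classifications(group_a)
top_codes = tally_classifications(group_a, top_n=1)
```

Passing `top_n` corresponds to the optional narrowing to the classifications assigned to the most gazettes.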

In steps 306 and 307, the re-search process 209 executes a search using the classification codes in the patent classification group 701, the publication numbers 401 of the documents that hit are extracted, and the search result group B 801 of FIG. 8 is generated in the storage device 201. If the document DB 204 stores a number other than the publication number 401, such as an application number or a registration number, that number may be extracted instead.
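The re-search in steps 306 and 307 can be sketched as a lookup by classification code. The source does not say whether a document must carry all of the codes or any one of them; the sketch below assumes any one (OR), and the data and function name are hypothetical.

```python
# Hypothetical data: publication number -> IPC codes (sample values).
IPC_BY_PUB = {
    "P1": ["G02B", "G03B"],
    "P2": ["G02B"],
    "P3": ["H04N"],
    "P4": ["A01B"],
}


def second_search(codes):
    """Return the publication numbers carrying at least one of the codes."""
    wanted = set(codes)
    return [pub for pub, ipc in IPC_BY_PUB.items() if wanted & set(ipc)]


group_b = second_search(["G02B"])
```

Because the match is on classification codes rather than on words, `group_b` can contain documents that the keyword search missed.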

In steps 308 and 309, for each document in the search result group B 801, the keyword extraction process 210 refers to the gazette table of FIG. 4, identifies the record by its publication number 401, and extracts keywords from the body text 402 by morphological analysis. The keyword totaling process 211 then totals them, and the keyword group β 901 of FIG. 9 is generated in the storage device 201. It is also possible to generate the group from only the top-ranked keywords, that is, those that appear most frequently in the totaled result.
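Steps 308 and 309 can be sketched in the same way as the classification totaling. As before, this is an illustration under stated assumptions: the bodies in `BODY_BY_PUB` are hypothetical, a regular-expression word split replaces morphological analysis, and frequency is counted as total occurrences across the document group.

```python
import re
from collections import Counter

# Hypothetical bodies of the documents in search result group B.
BODY_BY_PUB = {
    "P1": "camera with zoom lens",
    "P2": "camera with high-power lens",
    "P5": "variable magnification lens",
}


def tokenize(text):
    """Stand-in for morphological analysis: lowercase word split."""
    return re.findall(r"[a-z0-9-]+", text.lower())


def tally_keywords(pubs, top_n=None):
    """Count keyword occurrences over the documents, optionally keeping the top N."""
    counts = Counter()
    for pub in pubs:
        counts.update(tokenize(BODY_BY_PUB[pub]))
    return counts.most_common(top_n)


beta = tally_keywords(["P1", "P2", "P5"])
top_beta = tally_keywords(["P1", "P2", "P5"], top_n=1)
```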

In steps 310 and 311, the keyword comparison process 212 compares the keyword group α 501 with the keyword group β 901, the keywords that exist in the keyword group β 901 but not in the keyword group α 501 are extracted, and the related keyword group 1001 of FIG. 10 is generated in the storage device 201.
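The comparison in steps 310 and 311 is a set difference: keywords in β that are absent from α. A minimal sketch with hypothetical sample sets:

```python
# Hypothetical keyword groups (sample values only).
alpha = {"camera", "zoom", "lens"}                          # from the first search
beta = {"camera", "lens", "high-power", "magnification"}    # from the re-search

# Related keywords: present in beta, absent from alpha.
related = sorted(beta - alpha)
```

In this example the related keywords include "high-power", the kind of synonym the searcher did not think of when entering "zoom lens".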

In step 312, the search result display process 206 displays the search result group A 601 and the related keyword group 1001 on the display device 202. FIG. 12 shows a display example.

An embodiment of the present invention has been described above.

According to the above embodiment, search omissions when searching patent documents can be reduced and searches can be performed efficiently. Because related keywords are displayed together with the search result, the user can understand efficiently and directly which keywords should be added to narrow the search. The related keywords also help the user decide which keywords to search with again, so the keywords to use in the next search can be selected efficiently.

Although an embodiment of the present invention has been described above, the present invention is not limited to it, and various modifications are possible without departing from its gist.

DESCRIPTION OF SYMBOLS: 101: terminal, 102: network, 103: document search device (server), 201: input device, 202: display device, 203: memory, 204: document DB.

Claims

A document search method in a document search device that performs a first search process of searching, based on an input search keyword, a database in which documents to be searched are accumulated for documents including the search keyword, and storing the search result in a storage device, wherein the document search device performs:
extraction of a first keyword from the result of the first search process;
a classification extraction process of extracting classification codes included in the document group of the result of the first search process;
a classification totaling process of totaling the extracted classification codes and storing the totaling result in the storage device;
a second search process of searching the database, based on the totaled classification codes, for documents including the classification codes, and storing the search result in the storage device;
a keyword extraction process of extracting a second keyword included in the document group of the search result of the second search process;
a keyword comparison process of comparing the first keyword with the second keyword and extracting keywords that do not exist in the first keyword; and
a search result display process of displaying the search result obtained in the first search process and the keywords of the keyword comparison result.

A document search device comprising first search means for searching, based on an input search keyword, a database in which documents to be searched are accumulated for documents including the search keyword, and storing the search result in a storage device, the document search device further comprising:
means for extracting a first keyword from the search result of the first search means;
classification extraction means for extracting classification codes included in the document group of the search result;
classification totaling means for totaling the extracted classification codes and storing the totaling result in the storage device;
second search means for searching the database, based on the totaled classification codes, for documents including the classification codes, and storing the search result in the storage device;
keyword extraction means for extracting a second keyword included in the document group of the search result of the second search means;
keyword comparison means for comparing the first keyword with the second keyword and extracting keywords that do not exist in the first keyword; and
search result display means for displaying the search result obtained by the first search means and the keywords of the keyword comparison result.

A document search program for causing a computer to execute a first search process of searching, based on an input search keyword, a database in which documents to be searched are accumulated for documents including the search keyword, and storing the search result in a storage device, the program further causing the computer to execute:
extraction of a first keyword from the result of the first search process;
a classification extraction process of extracting classification codes included in the document group of the result of the first search process;
a classification totaling process of totaling the extracted classification codes and storing the totaling result in the storage device;
a second search process of searching the database, based on the totaled classification codes, for documents including the classification codes, and storing the search result in the storage device;
a keyword extraction process of extracting and totaling second keywords included in the document group of the search result of the second search process, and storing the totaling result in the storage device;
a keyword comparison process of comparing the first keyword with the second keyword and extracting keywords that do not exist in the first keyword; and
a search result display process of displaying the search result obtained in the first search process and the keywords of the keyword comparison result.