JP2004342016A

JP2004342016A - Information retrieval program and medium having information retrieval program recorded thereon

Info

Publication number: JP2004342016A
Application number: JP2003140555A
Authority: JP
Inventors: Kazunobu Igarashi; 和信五十嵐
Original assignee: ULT RES CO Ltd; ULT RESEARCH CO Ltd
Current assignee: ULT RES CO Ltd; ULT RESEARCH CO Ltd
Priority date: 2003-05-19
Filing date: 2003-05-19
Publication date: 2004-12-02
Anticipated expiration: 2023-05-19
Also published as: JP3929418B2

Abstract

PROBLEM TO BE SOLVED: To provide a means that eliminates the effort of keyword registration and can perform retrieval from the data set to be investigated.

SOLUTION: Under the information retrieval program, a CPU extracts the keywords constituting an information block of reference data from a storage means 2. The CPU also extracts the keywords constituting a plurality of information blocks from the data to be retrieved, which is stored in the storage means 2. The CPU then displays the keywords extracted from the reference data and the keywords extracted from the data to be retrieved on a display means 4, and has an investigator select, for each displayed reference-data keyword, the keywords of the data to be retrieved that are associated with it through a relationship such as synonymy. Finally, the CPU retrieves similar information blocks from the data to be retrieved based on the associated keywords.

COPYRIGHT: (C)2005,JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
本発明は、目的とする資料等を調査する為のシステムに係り、特に産業財産権等に関する異議申し立て資料の調査や、先願調査等のように、実際に文書等の内容の読み込みを調査者が行う必要のある調査を補助し、調査作業を容易にするシステムに関する。
【０００２】
【従来の技術】
従来、特許情報等を機械検索によって検索できる特許情報データベース検索システムが開発使用されている。特許情報データベースを例にとれば、特許出願毎に発生する特許情報をデータベースとしてホストコンピュータの記憶手段に記憶させることによって構築されている。
【０００３】
そして、このような特許情報データベースには、一般的な書誌的事項の他に、各種キーワード（例えばフリーキーワード、固定キーワード、Ｆターム等）がファイル毎に記憶され、これらを指定することにより、対応する特許情報が検索できるように構成されている。また、近年発行されているＣＤ−ＲＯＭ形式の特許公報等や、各メーカによって編集された特許情報に関するＣＤ−ＲＯＭ等（以下、これらを電子特許情報という）には、上記した書誌的事項の他に明細書、要約書及び図面も電子情報として記憶され、各種検索項目に基づいて対応する特許情報が検索できるように構成されている。
【０００４】
ここで、前述の特許情報データベースや電子特許情報を用いて無効資料や異議申立資料を調査しようとする場合、調査者が検索式を作成し、その検索式にヒットした内容の抄録を見て更に必要により公報全文を読んで、無効資料あるいは異議申立資料として利用できるかを判断している。この検索の結果、ヒット件数が少ない場合は直接公報に目を通すことが行われる。該当する公報が見つからない場合は、再度検索式を作り直してヒットした公報を読むことが繰り返される。また、これから出願しようとする発明に対する先願調査等も、同様の手順によって行われる。
【０００５】
ここで、少し詳しく特許等調査の過程について説明する。まず、調査者は本願の内容と特徴点を理解した後、前述の特許データベースや電子特許情報に記憶されている分類記号、キーワード等を組合わせて、目的とする技術が例えば数十件乃至百件程度の回答集合中に含まれるように検索式を作成する。勿論、検索の結果ヒットした件数が少なく、このヒットしたデータ集合中に所望の技術が網羅されていれば良いのであるが、僅かな件数に絞り込むと部分的に関連する類似技術の抽出や、他分野にある類似技術が抽出されないことも多い。このため、調査者は通常は数十件、多くても数百件の抽出を目安として、ヒットしたデータ集合中に、目的とする技術が含まれるように検索式を作成する。数百件を超える集合が作成された場合は、その後のスクリーニングに多大な時間を要するので、効率的な検索を行うことは難しい。しかし、必ずその中に目的とする類似技術が抽出されていると確信できる場合は、ヒットした全ての出願について抄録や、公報等を読んで、類似技術を探すこととなる。この作業は手作業で行われる。また、多くの公報等を読んだ後に類似技術が抽出できない場合は、再び検索式を作り変えて同様な操作を繰り返す。そして、いくら探しても類似技術が抽出できないことが確認できるまで、調査作業が続けられる。特に公知例、異議、無効資料調査にあっては、検索式を作成するために要する時間に加えて機械検索の結果を人が熟読して内容を更に吟味し、取捨する作業時間が加わることになる。そのため、調査が完了するまでに多くの時間を要し、調査コストも多大なものとなる。分類体系や調査の手法に習熟した調査の専門家であっても、機械検索の結果を吟味、処理するのに多大な時間を要しているのが調査作業の実情である。
【０００６】
ところで、従来より広く使用されている一般的な検索処理方法として、パターンマッチングがある。これは、検索に使用するキーワードや検索式等（以下「検索用キーワード」と略す。）と、文書・文献等の検索単位毎に登録されているキーワード（以下、「登録キーワード」と略す。）とを比較し、一致するか（完全に、あるいは一部分）により、当該検索単位を検索結果として抽出するか否かを判断する手法である。その為、検索用キーワードと登録キーワードとが、不一致（完全に、あるいは一部分）ではあるが同義語等（類義語等を含む）の関係にある場合、検索することができないという問題点がある。即ち、技術思想の表現には非常に多くの用語があるので、パターンマッチングによる機械検索処理では、目的とする類似技術を抽出するのは非常に困難である。
【０００７】
このパターンマッチングの例としては、「主キー」という文字列を検索用キーワードに指定した場合において、「主キー」、「キー」、「キー項目」等の登録キーワードが、「主キー」という文字列の少なくとも一部分を含むため、類似のキーワードとして判定されることになる。しかし、「メイン項目」や「検索項目」等の技術上の同義語等である登録キーワードについては、「主キー」という文字列を検索用キーワードに指定した場合、検索により抽出することができない。
【０００８】
また、前述の登録キーワードは、検索行為に使用される為、事前に文書・文献等の検索単位毎に登録されている必要があり、その登録の手間を要する。（以下、この手間を「キーワード登録の手間」と称す。）
【０００９】
そして、この登録キーワードの決定方法としては、文書・文献等の検索単位毎に特徴のあるキーワードを一部抽出したり、あるいは管理者が検索単位毎にキーワードをいくつか設定すること等が行われている。この為、一般的に登録キーワードは、文書・文献等の検索単位を構成する一部分、あるいは一側面しか表していないことが多い。（以下、この点を「登録キーワードの一部表現性」と称す。）この為、登録キーワード以外の検索単位を構成するキーワード等を、検索用キーワードに指定した場合に、前述の登録キーワードの一部表現性によって、当該検索単位は検索の結果抽出されないことになる。
【００１０】
更に、先の「主キー」の例のように、技術上の同義語等である登録キーワードを統一して使用すべき必要が生じる。（以下、この点を「登録キーワードの統一必要性」と称す。）この登録キーワードの統一必要性の例としては、先の「主キー」という用語を登録キーワードに使用すると定めた場合、「メイン項目」や「検索項目」という登録キーワードを使用できなくなることが挙げられる。
【００１１】
しかし、この登録キーワードの統一必要性が遵守されていない場合は、検索を行う調査者にとっては好ましいことではなく、特に産業財産権等の新規性や進歩性等を判断する際には、その問題点が一層顕在化することになる。
【００１２】
ここで、新規性とは、ある特定の発明と、それに対する一以上の引用する発明とを認定し、両者を比較することにより相違点が生じるかにより、ある特定の発明が新しいものであるか否かを判断する特許要件である。この新規性の判断において、従来は調査者が目視で内容を確認することにより同一の発明であると認定される（新規性なしと判定される）発明同士であっても、特定の発明と引用する発明とが、同義の他の技術用語等を登録キーワードに使用していた場合は、精度の高い検索が行えないことがあった。本出願の出願時において普及している国内特許データベースには、Ｆターム、国際特許分類、ＦＩ記号その他、技術内容がシソーラス化された記号として付与されている。しかし、新しい発明は旧来の技術体系に必ずしも分類できるわけではないので、キーワードと分類記号を組み合わせても、完全に的を絞って該当特許等をヒットさせることは難しい。更に、分類記号の少ない、論文、一般技術文献を調査対象とする場合は、少ない件数に絞り込んだデータ集合中から、適切な文献を機械検索によりヒットさせることが難しい。そのため、ある程度の件数のデータ集合を作成し、その中で該当する文献等があるか否かを調査者が検討する必要がある。
【００１３】
また、進歩性とは、ある特定の発明と、それに対する一以上の引用する発明とを認定し、両者を対比することによりそれぞれの発明を特定する為の事項の一致点及び相違点を明らかにし、その相違点が当業者にとって容易であるか否かにより判定される特許要件である。この進歩性の判断において、従来は調査者が目視で確認することにより相違点があまりないと認定される（進歩性なしと判定される）発明同士であっても、特定の発明と引用する発明とが、同義の他の技術用語等を登録キーワードに使用していた場合は、適切な検索抽出が行えないことがあった。
【００１４】
一般的に産業財産権等の明細書等の書類は、作成者が多数存在することもあり、先の「主キー」の例のように同一、同種の意味である技術用語を多種多様な用語により表現している。このように、登録キーワードの統一必要性が考慮されていない事により、従来の検索方法では同一の発明や、類似の発明を検索することが非常に難しいという不都合があった。
【００１５】
また、検索単位毎の登録キーワードが、検索単位を構成する一部分等である場合は、登録キーワードの一部表現性により、精度の高い調査を行うことが難しいという不都合があった。
【００１６】
以上のような理由から、引例資料等の調査対象資料の抽出を迅速かつ正確に行えるシステムの出現が望まれている。また、特定の特許情報等に基づく新規性や進歩性分析を自動的に、または簡便化できる装置の出現が望まれている。
【００１７】
これに関しては、
【特許文献１】において、検索の対象となる複数の情報を、予め当該複数の情報に含まれるいくつかのキーワード群と対応付けて記憶しておき、当該検索対象情報のキーワードを集計して表示し、その中から調査者に選択されたキーワードに基づいて前記の検索の対象となる複数の情報から検索を行う方法が開示されている。
【００１８】
しかし、この方法では、事前に検索の対象となる複数の情報毎にキーワード群の対応付けを登録しなければならないという手間を要し、キーワード登録の手間を解決できていない。更に、複数の情報に含まれる特定のキーワード群を検索に使用するため、登録キーワードの一部表現性や統一必要性を解決できていない。
【００１９】
また、
【特許文献２】においては、検索の対象となる文献単位毎に、所定のキーワード群への関連の有無を表現するパターンを作成し、調査者がパターンを指示することにより、当該指示されたパターンに類似のパターンを持つ文献単位を検索する方法が開示されている。
【００２０】
しかし、事前に所定のキーワード群を決定しなければならないことや、その所定のキーワード群を途中で変更した場合等に、過去のパターンを再利用するのが難しいという不都合がある。また、
【特許文献１】と同様に、事前に検索の対象となる複数の情報毎にパターンの対応付けを登録しなければならないという手間を要し、キーワード登録の手間を解決できてはいない。更に、所定のキーワード群への関連の有無を表現するパターンを検索に使用するため、登録キーワードの一部表現性や統一必要性も解決できていない。
【００２１】
【特許文献１】
特開平９−７３４５３号
【００２２】
【特許文献２】
特開昭６１−１８２１３１号
【００２３】
【発明が解決しようとする課題】
このように、上記のような従来の検索方法にあっては、単に特定のキーワードや検索式を指定することにより、この検索式に該当するキーワード等を含む特許情報が抽出されるのみである。この為、新規性や進歩性等の判断は、抽出された特許情報の内容を調査者が実際に目視等で確認することによって行うのが一般的であり、抽出された特許情報（特許公報等）の件数が多い場合にその理解や把握に多大な手間を有する為、短時間かつ適切に新規性や進歩性等の判断を行うことが困難であるという不都合があった。
【００２４】
また、調査対象のデータ集合に応じて適切なキーワードや検索式を決定する行為は、調査者が行わなければならず、調査者の負担が重いという不都合があった。
【００２５】
更に、
【特許文献１】及び
【特許文献２】においても解決されていない課題として、以下の不都合が存在した。
【００２６】
まず、キーワード登録の手間を解決できていない為、調査対象のデータが発生する度に、キーワード登録を行わなければならないという不都合があった。
【００２７】
更に、登録キーワードの一部表現性を解決できていない為、調査漏れが発生しやすく、精度の高い調査を行い難いという不都合があった。
【００２８】
これに加え、調査対象のデータ集合において、登録キーワードの統一必要性が考慮されていない場合、同義語等の関係にある技術用語が複数使用されている為に、精度の高い調査を行い難いという不都合があった。
【００２９】
【発明の目的】
本発明は、かかる従来技術や
【特許文献１】及び
【特許文献２】の有する不都合を改善し、特に、キーワード登録の手間を解決し、登録キーワードを事前に登録する手間を要せずに、調査対象のデータ集合から検索を行うことのできる好適な手段の提供を目的とする。
【００３０】
また、登録キーワードの一部表現性を解決し、精度の高い調査を行う為に、調査対象のデータ集合に応じた適切な検索キーワードを調査者が選択することができる好適な手段の提供を目的とする。
【００３１】
更に、調査対象のデータ集合において、登録キーワードの統一必要性が考慮されていない場合においても、同義語等の関係にある技術用語等を、調査者が適切に検索キーワードとして選択することができる好適な手段を提供することを目的とする。
【００３２】
【課題を解決するための手段】
上記目的を達成する為、本発明に係るシステムは、一以上のキーワードからなる情報ブロックを含む基準データと、一以上のキーワードからなる情報ブロックを複数含む検索対象データとを記憶する記憶手段を備えている。また、当該システムは、前記基準データを構成するキーワードと、前記検索対象データを構成するキーワードとの関連付けを入力する入力手段を備えている。更に、当該システムは、前記基準データの情報ブロックを基にして前記検索対象データから類似の情報ブロックを検索する処理を実行するコンピュータを備えている。これに加え、当該システムは、当該検索結果を表示する表示手段を備えている。以上のようなシステムにおいて、本発明に係る情報探索プログラムは、前記コンピュータに、前記記憶手段に格納された前記基準データから当該基準データの情報ブロックを構成するキーワードを抽出する第１抽出ステップを実行させる。また、当該プログラムは、前記コンピュータに、前記記憶手段に格納された前記検索対象データから当該検索対象データの複数の情報ブロックを構成するキーワードを抽出するステップを実行させる。更に、当該プログラムは、前記基準データから抽出された一以上のキーワードと、前記検索対象データから抽出された一以上のキーワードとを前記表示手段に表示して、当該表示された基準データのキーワード毎に、前記検索対象データのキーワードの同義語等の関係による関連付けを調査者に選択させるステップを実行させる。これに加え、当該プログラムは、前記コンピュータに、当該関連付けされた検索対象データのキーワードに基づいて、前記検索対象データより前記類似の情報ブロックを検索する検索ステップを実行させる。
【００３３】
本発明によると、前記基準データのキーワード毎に関連付けされた、前記検索対象データのキーワードに基づいて、前記検索対象データより前記類似の情報ブロックを検索する為、従来とは異なり、調査者がより精度の高い検索を行うことが可能になる。また、従来とは異なり、前記検索対象データの情報ブロック毎にキーワードを登録しておかなくても、調査者が検索を行うことが可能になる。更に、従来とは異なり、前記検索対象データに応じた適切な検索キーワードを、調査者が選択することができる。これに加え、従来とは異なり、基準データのキーワード毎に同義語等の関係にある、前記検索対象データのキーワードを、調査者が検索キーワードとして適切に選択できる。
【００３４】
また、他の発明に係る情報探索プログラムは、前記検索ステップが、前記基準データから抽出されたキーワードと、当該キーワード毎に関連付けされた検索対象データから抽出されたキーワードとに基づいて、前記検索対象データより前記類似の情報ブロックを検索することを特徴としている。
【００３５】
本発明によると、前記基準データのキーワードと、当該キーワード毎に関連付けされた前記検索対象データのキーワードとに基づいて、前記検索対象データより前記類似の情報ブロックを検索する為、従来とは異なり、調査者がより精度の高い検索を行うことが可能になる。
【００３６】
更に、他の発明に係る情報探索プログラムは、前記第１抽出ステップが、前記基準データから抽出されたキーワードを調査者が編集することを前記入力手段から受け付けるステップを含むことを特徴としている。
【００３７】
本発明によると、前記基準データのキーワードを調査者が編集することにより、
従来とは異なり、調査者がより精度の高い検索を行うことが可能になる。
【００３８】
また、他の発明に係る情報探索プログラムは、前記第１抽出ステップが、前記基準データから抽出されたキーワードを調査者が取捨することを前記入力手段から受け付けるステップを含むことを特徴としている。
【００３９】
本発明によると、前記基準データのキーワードを調査者が取捨することにより、
従来とは異なり、調査者がより精度の高い検索を行うことが可能になる。
【００４０】
更に、他の発明に係る情報探索プログラムは、前記コンピュータに、前記基準データから抽出されたキーワード毎に調査者から検索処理における重要度を表す重み付けとの関連付けを受け付けさせるステップを備えている。そして、前記検索ステップは、当該基準データのキーワード毎に関連付けされた、前記検索対象データから抽出されたキーワードと、前記重み付けとに基づいて、前記検索対象データより前記類似の情報ブロックを検索するステップを実行することを特徴としている。
【００４１】
本発明によると、前記基準データのキーワード毎に関連付けされた、前記検索対象データのキーワードと、前記重み付けとに基づいて、前記検索対象データより前記類似の情報ブロックを検索する為、従来とは異なり、調査者がより精度の高い検索を行うことが可能になる。また、従来とは異なり、検索結果が重み付けにより補正される為、より細やかな検索を行うことが可能になる。
【００４２】
また、他の発明に係る情報探索プログラムは、前記検索ステップが、前記検索対象データより類似の情報ブロックを検索した際に当該情報ブロック毎に類似度を示す情報を付与するステップを含んでいる。そして、当該プログラムが、前記コンピュータに、前記類似度順に前記検索結果を表示するステップを実行させることを特徴としている。
【００４３】
本発明によると、前記類似度順に前記検索結果が表示される為、調査者が容易に検索結果を理解することが可能になる。また、従来とは異なり、当該検索結果の個々の内訳に、調査者が容易にアクセスすることが可能になる。
【００４４】
更に、他の発明に係る情報探索プログラムは、前記基準データの情報ブロックが特定の特許等出願情報であり、前記検索対象データの複数の情報ブロックが複数の特許等出願情報または科学技術文献情報であることを特徴としている。
【００４５】
本発明によると、前記基準データの情報ブロックである特定の特許等出願情報に対して、前記検索対象データの複数の情報ブロックである複数の特許出願情報または科学技術文献から、類似の特許出願情報または科学技術文献を検索することが可能になる。これにより、従来とは異なり、調査者が費やす調査時間を短縮し、負担を軽減することが可能になる。
【００４６】
また、他の発明に係る情報探索プログラムは、前記基準データの情報ブロックが特定の特許出願の一以上の請求項であることを特徴としている。
【００４７】
本発明によると、前記基準データの情報ブロックである特定の特許等出願の一以上の請求項に対して、前記検索対象データの複数の情報ブロックである複数の特許出願情報または科学技術文献から、類似の特許出願情報または科学技術文献を検索することが可能になる。これにより、従来とは異なり、請求項毎に類似の特許出願情報または科学技術文献を検索することが可能になる。
【００４８】
更に、他の発明に係る情報探索プログラムは、特定の特許出願の一以上の請求項である前記基準データの情報ブロックが従属項である場合に、前記第１抽出ステップが、当該従属項が引用している独立項を前記基準データの情報ブロックに加えるステップを含むことを特徴としている。
【００４９】
本発明によると、特定の特許出願の一以上の請求項である前記基準データの情報ブロックが従属項である場合に、当該従属項が引用している独立項を前記基準データの情報ブロックに加えることが可能になる。これにより、従来とは異なり、前記基準データの情報ブロックが従属項の場合であっても、請求項毎に類似の特許出願情報または科学技術文献を検索することが可能になる。
【００５０】
これにより、前述の目的を達成しようとするものである。
【００５１】
【発明の実施の形態】
以下、本発明の一実施形態を図１乃至図１９に基づいて説明する。
【００５２】
図１は、本発明の一実施形態である情報探索システムの構成図である。この情報探索システム１は図１に示すように、記憶手段２と、入力手段３と、表示手段４と、これらを制御するコンピュータ５とから成り立っている。これら各手段及びコンピュータ５は相互に接続されており、データの授受を行いながら協働することにより情報探索機能１５が実現される。また、記憶手段２は、検索処理の基準となる基準データ２１と、基準データの検索対象となる検索対象データ２２とを記憶している。
【００５３】
次に、図１における記憶手段２、及びそこに記憶されている基準データ２１と検索対象データ２２とを詳細に図示したものが、図２の記憶手段２の詳細図である。ここで、基準データ２１は、複数のキーワードからなる情報ブロック２１ａを含んでいる。また、検索対象データ２２は、複数のキーワードからなる情報ブロック２２ａを、複数含んでいる。また、検索対象データ２２はキーワード、分類記号等を検索式に使用することにより、調査者によって予め絞り込まれた検索データであるが、元々その件数が少ない場合は調査者による絞込みを行わないデータとする。
【００５４】
ここで、基準データ２１に含まれるキーワードと、検索対象データ２２に含まれるキーワードとは、一以上の語句や記号や数値等、あるいはその組み合わせから構成されている。この例として、「キー項目」や、「システム」や、「検索結果」等が挙げられる。また図２において、基準データ２１に含まれるキーワードａと、検索対象データ２２に含まれるキーワードａ´とは、同義語等（類義語等を含む）の関係にある。これと同様に、基準データ２１に含まれるキーワードｂと、検索対象データ２２に含まれるキーワードｂ´とは、同義語等の関係にある。ここでいう同義語等の関係とは、前述の「主キー」と、「メイン項目」や「検索項目」等との関係のように、キーワード同士が同義語等の関係にあることをいう。
【００５５】
更に、基準データの情報ブロック２１ａは、情報探索の基準となる情報単位を表している。また、基準データの情報ブロック２１ａの探索対象となる情報単位を表しているのが、検索対象データ２２に複数含まれる情報ブロック２２ａである。ここで、前述した情報探索機能１５は、コンピュータ５が、基準データ２１のキーワード又は／及びこれと同義語等の関係にある検索対象データ２２のキーワードを基にして、基準データの情報ブロック２１ａと類似の検索対象データの情報ブロック２２ａを判断する機能である。（後述）そして、この基準データの情報ブロック２１ａと、検索対象データの情報ブロック２２ａとは、文書や文献毎に区分されているのが望ましい。その場合、特定文献である基準データの情報ブロック２１ａに基づいて、特定文献の類似文献である、検索対象データの情報ブロック２２ａが、前述の情報探索機能１５により判断されることになる。
【００５６】
続いて、図１の情報探索システムを、パーソナルコンピュータに適用した場合におけるブロック図が図３である。ここで、図１における記憶手段２はメモリ６やハードディスク１０に、入力手段３はキーボード等９に、表示手段４はディスプレイ７に、コンピュータ５はＣＰＵ８に夫々対応している。そして、各構成要素はバス１３により接続されており、相互に通信を行えるように構成されている。また、図１における記憶手段２に記憶されていた基準データ２１や検索対象データ２２は、ファイルやデータベース等の形式により、基準データ１１や検索対象データ１２としてハードディスク１０に記憶されている。同様に、情報探索機能１５は、情報探索プログラム１４としてハードディスク１０に記憶されている。そして、図３におけるシステムは、ＣＰＵ８が情報探索プログラム１４を解釈することにより、計算処理を行い、各構成要素を制御する。また、ＣＰＵ８が情報探索プログラム１４を解釈・実行する際には、情報探索プログラム１４やデータ等を、ハードディスク１０からメモリ６に読み込む。
【００５７】
ここで、キーボード等９はパーソナルコンピュータに一般的に使用されているマウス等を含む。また、ディスプレイ７は、ＣＲＴディスプレイや液晶ディスプレイ等である。さらに、図示した基準データのキーワード１１ａは、基準データ１１に含まれるキーワードがＣＰＵ８により抽出され、メモリ６に読み込まれたことを概念的に表している。また、検索対象データのキーワード１２ａについても同様である。
【００５８】
次に、本実施形態における各構成要素の動作を図４のフローチャートに従って説明する。
【００５９】
図３の情報探索システムを使用して調査を行おうとする者（以下「調査者」と略す。）は、入力手段であるキーボード等９を使用し、調査開始のコマンドを入力すると、ＣＰＵ８はそのコマンドを解釈し、情報探索プログラム１４を起動する。
【００６０】
ここでＣＰＵ８が、起動された情報探索プログラム１４を解釈し、実行することにより、図４のフローチャートに図示された４つの処理が行われる。この４つの処理とはすなわち、基準データからキーワードを抽出する第１抽出ステップＳ１と、検索対象データからキーワードを抽出するステップＳ２と、基準データのキーワードと検索対象データのキーワードとの関連付けを選択させるステップＳ３と、関連付けられた検索対象データのキーワードに基づいて検索対象データの情報ブロックを検索するステップＳ４である。以上の４つの処理によって、情報探索機能１５が実現する。また、基準データからキーワードを抽出する第１抽出ステップＳ１と、検索対象データからキーワードを抽出するステップＳ２とを、予め別ステップで処理しておき、情報探索プログラム１４の実行開始によって、即座に基準データのキーワードと検索対象データのキーワードとの関連付けを選択させるステップＳ３から処理を進めることもできる。
【００６１】
以下にＳ１乃至Ｓ４の、夫々のステップの概略を先に説明する。
【００６２】
まず、基準データからキーワードを抽出する第１抽出ステップＳ１とは、ハードディスク１０に格納された基準データ１１に含まれる情報ブロックを構成している、複数のキーワードを抽出するステップである。このステップにより抽出されたキーワードは、Ｓ３及びＳ４において使用される。以下、基準データからキーワードを抽出する第１抽出ステップＳ１を、「基準データからの第１抽出ステップ」と略す。
【００６３】
次に、検索対象データからキーワードを抽出するステップＳ２とは、ハードディスク１０に格納された、検索対象データ１２に含まれる複数の情報ブロックを構成している、複数のキーワードを抽出するステップである。このステップにより抽出されたキーワードは、Ｓ３及びＳ４において使用される。以下、検索対象データからキーワードを抽出するステップＳ２を、「検索対象データからの抽出ステップ」と略す。
【００６４】
続いて、基準データのキーワードと検索対象データのキーワードとの関連付けを選択させるステップＳ３とは、Ｓ１において基準データ１１から抽出されたキーワードと、Ｓ２において検索対象データ１２から抽出されたキーワードとの関連付けを、調査者に選択させるステップである。このステップにより選択された関連付けは、Ｓ４において使用される。以下、基準データのキーワードと検索対象データのキーワードとの関連付けを選択させるステップＳ３を、「関連付け選択ステップ」と略す。
【００６５】
続いて、関連付けられた検索対象データのキーワードに基づいて検索対象データの情報ブロックを検索するステップＳ４とは、Ｓ３において関連付けられた検索対象データ１２のキーワードに基づいて、基準データ１１の情報ブロックと類似の、検索対象データ１２の情報ブロックを検索するステップである。この検索ステップにより、検索対象データ１２の情報ブロック毎に、基準データ１１の情報ブロックに類似しているか否かが判定される。以下、関連付けられた検索対象データのキーワードに基づいて検索対象データの情報ブロックを検索するステップＳ４を、「検索ステップ」と略す。
【００６６】
以上が、Ｓ１乃至Ｓ４の各ステップの概略であり、以下に夫々のステップを詳細に説明する。
【００６７】
まず、基準データからの第１抽出ステップＳ１においては、前述のＣＰＵ８が情報探索プログラム１４を解釈し、実行することにより、図５のフローチャートに図示されたＳ１１乃至Ｓ１５の処理を行う。以下、図５のフローチャートに基づいて、Ｓ１１乃至Ｓ１５の夫々の処理について説明する。
【００６８】
最初に、Ｓ１１においてＣＰＵ８は、ディスプレイ７に図７の基準データ入力画面３０を表示させ、調査者からの基準データ入力命令を待つ。ここで調査者は、キーボード等９を使用し、基準データ読込ボタン３４を通じて基準データ読込を指示する。更に、調査者は、パスやファイル名等の基準データを特定する情報もあわせて指定する。そうすると、ＣＰＵ８は調査者の基準データ読込の指示を受け、指定された基準データを、基準データ表示部３３に表示する。また、ＣＰＵ８は、当該基準データの分析を行い、「請求項」という文字や墨付括弧の記号により、基準データを請求項毎に分割し、請求項の一覧を請求項一覧表示部３１に表示する。この、調査者が入力する基準データの例としては、特許等の明細書がある。また、このような基準データの受付処理の例としては、調査者が図８に図示したデータを基準データとして指定した場合、基準データ入力画面３０は図９のような状態になる。
【００６９】
次に、ＣＰＵ８は、図９の入力画面を表示すると、請求項一覧表示部３１に表示した基準データの請求項のうち、どれを使用するかの選択指示である、調査者の請求項選択命令を待つ。ここで調査者は、キーボード等９を使用し、請求項一覧表示部３１の各請求項をダブルクリックすること等によって、一以上の請求項の選択を命令する。そうすると、ＣＰＵ８は調査者の請求項選択の指示を受け、指定された請求項のデータを、基準データ表示部３３から抽出して、選択請求項表示部３２に表示する。
【００７０】
この抽出・表示処理の際に、ＣＰＵ８により、先の調査者による請求項選択の指示により選択された、一以上の請求項の中に、従属項を含んでいるか否かの分岐判定Ｓ１２が行われる。ここで、従属項を含んでいるか否かは、基準データ表示部３３から抽出した請求項データにおいて、「請求項」の文字や「請求項１及至３」の文字のような、他の請求項を引用しているか否かにより判定する。このＳ１２において、従属項を含んでいるとＣＰＵ８が判定した場合、当該従属項が引用している請求項のデータも基準データ表示部３３から抽出し、あわせて選択請求項表示部３２に表示する（Ｓ１３）。またこのＳ１３において、従属項が引用している請求項が、更に他の請求項を引用している場合（他の請求項の従属項目である場合）、当該引用している請求項のデータも、あわせて選択請求項表示部３２に表示する。このような処理を、引用している請求項が独立項になるまで繰り返す。以上のような従属項判定Ｓ１２の例として、図９の状態の基準データ入力画面３０において、調査者が請求項一覧表示部３１から「請求項２」を指定した場合、基準データ入力画面３０は図１０の状態になり、請求項１と２のデータが反映される。このようにして、調査者の請求項選択の指示が選択請求項表示部３２に反映されることになる。この選択請求項表示部３２に表示されている請求項データは、前述の情報探索機能における情報探索の基準となる情報単位である、基準データの情報ブロックを表している。
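As a rough illustration of the dependent-claim handling in S12 and S13 described in paragraph 【００７０】 (a selected claim that cites another claim pulls in the cited claim, repeatedly, until an independent claim is reached), the following Python sketch may help. The function name `with_cited_claims`, the regular expression, and the sample claim texts are hypothetical and not taken from the patent; ranges such as 「請求項１乃至３」 are deliberately not expanded here.

```python
import re

# Assumption: a citation looks like "請求項" followed by a claim number.
# In Python, \d also matches full-width digits such as "１", and int() parses them.
CLAIM_REF = re.compile(r"請求項\s*(\d+)")


def with_cited_claims(selected: int, claims: dict[int, str]) -> list[int]:
    """Return the selected claim plus every claim it cites, directly or indirectly."""
    needed: set[int] = set()
    stack = [selected]
    while stack:
        number = stack.pop()
        if number in needed or number not in claims:
            continue  # already collected, or a citation to an unknown claim
        needed.add(number)
        # Follow each citation found in this claim's text (S13).
        stack.extend(int(ref) for ref in CLAIM_REF.findall(claims[number]))
    return sorted(needed)


# Hypothetical sample claims, for illustration only.
claims = {
    1: "キー項目を記憶する記憶手段を備えた検索システム。",
    2: "請求項１に記載の検索システムであって、検索結果を順序付けて表示する。",
    3: "請求項2に記載の検索システムであって、重み付けを受け付ける。",
}

print(with_cited_claims(2, claims))  # claim 2 together with the claim it cites
```

Selecting claim 2 here yields claims 1 and 2, which mirrors the Fig. 10 example in which choosing 「請求項２」 brings the data of claims 1 and 2 into the selected-claims pane.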
【００７１】
そして、この図１０の基準データ入力画面３０において、調査者がキーボード等９を使用し、選択請求項表示部３２に表示された請求項データを編集することができる。すなわち、選択請求項表示部３２に表示された請求項データに、文字や記号等を付け加えたり、又は削除したりすることができる。
【００７２】
続いて、この図１０の基準データ入力画面３０において、調査者がキーボード等９を使用し、キーワード抽出ボタン３５をクリックすること等によって、基準データの情報ブロックからのキーワード抽出を命令する。そうすると、ＣＰＵ８は調査者のキーワード抽出の指示を受け、選択請求項表示部３２に表示された請求項データを、キーワードに分割する（Ｓ１４）。この分割の方法としては、ＣＰＵ８が請求項データに含まれる助詞や空白文字等毎に、キーワードに分割する方法がある。このようなキーワード分割の例として、図１０の入力画面における選択請求項表示部３２に表示された請求項データは、図１１に図示した分割キーワード一覧３７のようにキーワード分割される。
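Paragraph 【００７２】 says only that the claim data is split into keywords at particles, whitespace, and the like (S14). A minimal sketch of that idea, assuming a tiny hand-picked delimiter set, could look like the following. The delimiter list and the function name `split_keywords` are assumptions; splitting on single characters will mis-split words that happen to contain those characters, so a practical implementation would more likely rely on morphological analysis.

```python
import re

# Assumption: a deliberately small set of single-character particles,
# plus Japanese punctuation and any whitespace (including full-width spaces).
DELIMITERS = re.compile(r"[のをにはがとで、。\s]+")


def split_keywords(text: str) -> list[str]:
    """Split text into keyword candidates at particles, punctuation, and whitespace."""
    return [part for part in DELIMITERS.split(text) if part]


print(split_keywords("キー項目の順序を検索結果に表示する"))
# → ['キー項目', '順序', '検索結果', '表示する']
```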
【００７３】
そして、請求項データをキーワードに分割すると、ＣＰＵ８はディスプレイ７に図１２の探索条件決定画面４０を表示させ、請求項データを請求項データ表示部４２に、分割キーワード一覧３７をキーワード一覧表示部４３に表示する（Ｓ１５）。この図１２の探索条件決定画面４０において、調査者がキーボード等９を使用し、請求項データ表示部４２に表示された請求項データを編集することができる。すなわち、請求項データ表示部４２に表示された請求項データに、文字や記号等を付け加えたり、又は削除したりすることができる。ここで、請求項データ表示部４２に表示された請求項データを編集した場合、調査者がキーボード等９を使用し、キーワード再抽出ボタン４１をクリックすること等によって、請求項データ表示部４２からのキーワード再抽出、及び抽出したキーワード一覧をキーワード一覧表示部４３へ再表示できる。
【００７４】
このようにして、ＣＰＵ８はキーワード一覧表示部４３にキーワードの一覧を表示すると、当該キーワードのうち、使用するキーワードの指示である、調査者のキーワード選択命令を待つ。ここで調査者は、キーボード等９を使用し、キーワード一覧表示部４３の各キーワードをダブルクリックすること等により、一以上のキーワードの選択を命令する。そうすると、ＣＰＵ８は調査者のキーワード選択の指示を受け、指定されたキーワードを、キーワード一覧表示部４３から抽出して、検索条件表示部４８の構成要素の列に表示する。このように、調査者がキーワード一覧表示部４３から、例えば「キー項目」、「順序」、「検索結果」の各キーワードを順次クリックすることにより、図１３のように検索条件表示部４８の構成要素の列に選択したキーワードが表示される。この場合において調査者が選択するキーワードは、キーワード一覧表示部４３に表示された全てのキーワードではなく、発明の特徴を表すようなものを選ぶのが適当である。このように、キーワードの取捨及び編集は専ら調査者が決定するものであり、情報探索プログラム１４による自動的な決定がなされるのでは無い。
【００７５】
以上が、基準データからの第１抽出ステップＳ１における各処理の説明である。ここで、図１３において検索条件表示部４８の構成要素の列に表示されている各キーワードが、基準データからの第１抽出ステップＳ１で抽出されたキーワードである。以下、検索対象データからの抽出ステップＳ２について説明する。
【００７６】
この検索対象データからの抽出ステップＳ２においても、前述のＣＰＵ８が情報探索プログラム１４を解釈し、実行することにより以下の処理が行われる。まずＣＰＵ８は、キーボード等９を通じて調査者から、Ｓ１によって受け付けた基準データに対しての、検索対象となる検索対象データの入力を受け付ける。この検索対象データの例としては、複数件数の特許公報等があげられ、その場合には出願毎の特許公報等が検索対象データにおける個々の情報ブロックとなる。また、前述のように、検索対象データにおける個々の情報ブロックは、基準データの情報ブロックとの類似度が判定される情報単位である。
【００７７】
そして、ＣＰＵ８は、調査者により入力された検索対象データを、前述の基準データからの第１抽出ステップＳ１におけるＳ１４と同様に、キーワードに分割する。この分割の方法としては、ＣＰＵ８が検索対象データに含まれる助詞や空白文字等毎に、キーワードに分割する方法がある。以上のようにして、検索対象データからの抽出ステップＳ２において分割された、検索対象データのキーワード一覧３８を図１４に図示する。ここで、図１４において表示されているキーワードが、検索対象データからの抽出ステップＳ２で抽出されたキーワードとする。
【００７８】
以上が、検索対象データからの抽出ステップＳ２における各処理の説明である。この検索対象データからの抽出ステップＳ２においては、調査者は検索対象データを指定するのみであり、他の処理は情報探索プログラム１４に基づいて行われる。そのため、この検索対象データからの抽出ステップＳ２の後に、基準データからの第１抽出ステップＳ１を情報探索プログラム１４が行っても不都合は無い。つまり、基準データからの第１抽出ステップＳ１と、検索対象データからの抽出ステップＳ２とは、順不同である。以下、関連付け選択ステップＳ３について説明する。
【００７９】
この関連付け選択ステップＳ３においても、前述のＣＰＵ８が情報探索プログラム１４を解釈し、実行することにより以下の処理が行われる。ここで、Ｓ１とＳ２を終えると、図１３の探索条件決定画面４０がディスプレイ７に表示されており、ＣＰＵ８は調査者からの関連付け選択命令を待つ。ここで調査者は、キーボード等９を使用して、検索条件表示部４８の構成要素の列に表示された各キーワードのうち、任意のキーワード（以下、「任意キーワード」と略す。）を選択し、同義語ボタン４４をクリックすること等により、関連付け選択画面表示を指示する。これを検知すると、ＣＰＵ８は、当該任意キーワードデータを検索条件表示部４８から抽出し、情報探索プログラム１４が予め保持している関連付け選択画面５０をディスプレイ７に表示する。（後述）
【００８０】
この関連付け選択画面５０は、対象キーワード表示部５１と、ヒットキーワード一覧表示部５２と、ヒット外キーワード一覧表示部５３と、関連付け実行ボタン５４とを有している。ここで、対象キーワード表示部５１とは、先に調査者によって検索条件表示部４８の構成要素の列から選択された、任意キーワードを表示する部位である。図１５の関連付け選択画面５０においては、任意キーワードとして「キー項目」が選択されたことが分かる。また、ヒットキーワード一覧表示部５２とは、任意キーワードを検索キーとして、Ｓ２で抽出されたキーワードに対して検索を行った結果、ヒットしたキーワードの一覧を表示する部位である。更に、ヒット外キーワード一覧表示部５３とは、任意キーワードを検索キーとして、Ｓ２で抽出されたキーワードに対して検索を行った結果、ヒットしなかったキーワードの一覧を表示する部位である。このＳ２で抽出されたキーワードに対して行う検索方法は、前述の一般的なパターンマッチングが使用される。その結果、Ｓ２で抽出されたキーワードは各々、ヒットキーワード一覧表示部５２とヒット外キーワード一覧表示部５３との、どちらか一方に振り分けられ、表示される。また、この振り分け処理は、調査者が選択した類似度基準により判断される。図１５のように、関連付け選択画面５０は、調査者が類似度基準を選択できるように、ラジオボタンを複数備えている。図１５の関連付け選択画面５０においては、類似度が（弱）、（中）、（強）の３つのラジオボタンを備えており、調査者がどのラジオボタンを選ぶかにより類似度基準を選択できる。この類似度基準のキーワード振り分け例としては、「主キー」という任意キーに対して類似度基準を（強）に設定した場合、Ｓ２で抽出されたキーワードのうち、「主キー」という文字を完全に含むキーワードのみが、ヒットキーワード一覧表示部５２に表示されることが挙げられる。また同様に、「主キー」という任意キーに対して類似度基準を（弱）に設定した場合、Ｓ２で抽出されたキーワードのうち、「主キー」という文字の一部分（例えば「主」や「キー」）を含むキーワードが、ヒットキーワード一覧表示部５２に表示されることが挙げられる。
【００８１】
先の関連付け選択画面５０をディスプレイ７に表示すると、ＣＰＵ８は、先の抽出した任意キーワードデータを対象キーワード表示部５１に表示する。更にＣＰＵ８は、情報探索プログラム１４により初期設定されている類似度基準と、Ｓ２で抽出されたキーワードとを読み込み、当該類似度基準に基づいて先のパターンマッチングにより、Ｓ２で抽出されたキーワードを、ヒットキーワード一覧表示部５２とヒット外キーワード一覧表示部５３とに振り分けて表示する。このような類似度基準を設ける理由は、構成要素として選択したキーワードに類似するキーワードが検索対象データ中に多く含まれる場合は、類似度の高いキーワードのみをヒットキーワード一覧表示部５２に表示させて調査者が即座に必要とする検索用のキーワードを選択できるようにするためである。従って、構成要素として選択したキーワードに類似するキーワードが検索対象データ中に殆ど無い場合は、類似度の低いキーワードも含めてヒットキーワード一覧表示部５２に表示させ、調査者が選択できるようにしている。更に、ヒットキーワード一覧表示部５２に表示されるキーワードの件数が一定範囲内となるように、情報探索プログラム１４によりラジオボタンの１つを選択させることもできる。
【００８２】
また、調査者がキーボード等９を使用して、関連付け選択画面５０上の類似度基準選択ラジオボタンの選択を行い、当該類似度基準を変更すると、ＣＰＵ８は当該類似度基準に基づいて、再度先のＳ２で抽出されたキーワードを読み込み、先のパターンマッチングによりヒットキーワード一覧表示部５２とヒット外キーワード一覧表示部５３とに振り分けて表示する。このようにして、情報探索プログラム１４により初期設定されている類似度基準を、調査者が変更することができる。
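Paragraphs 【００８０】 to 【００８２】 describe sorting the keywords extracted in S2 into a hit list and a non-hit list according to a similarity criterion chosen by radio button: with (強) only keywords containing the whole string hit, and with (弱) keywords containing any part of it hit. The sketch below is one possible reading. The patent does not define (中), so the two-character threshold used for `"medium"` is purely an assumption, as are the function names and the sample keywords.

```python
def shares_substring(query: str, keyword: str, length: int) -> bool:
    """True if some substring of `query` with the given length occurs in `keyword`."""
    length = max(1, min(length, len(query)))
    return any(query[i:i + length] in keyword
               for i in range(len(query) - length + 1))


def partition(query: str, keywords: list[str], level: str) -> tuple[list[str], list[str]]:
    """Split keywords into (hits, misses) for the chosen similarity level."""
    # strong: the whole query must occur; weak: one shared character is enough.
    # medium: at least two consecutive shared characters (assumption, not in the source).
    min_len = {"strong": len(query), "medium": 2, "weak": 1}[level]
    hits = [k for k in keywords if shares_substring(query, k, min_len)]
    misses = [k for k in keywords if k not in hits]
    return hits, misses


# Hypothetical keywords standing in for the S2 extraction result.
extracted = ["主キー項目", "検索キー", "主項目", "検索項目"]
for level in ("strong", "medium", "weak"):
    print(level, partition("主キー", extracted, level))
```

Re-running `partition` with a different level corresponds to the investigator changing the radio button, after which both lists are redisplayed.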
【００８３】
続いて、調査者は、キーボード等９を使用して、ヒットキーワード一覧表示部５２とヒット外キーワード一覧表示部５３とに表示されたＳ２で抽出されたキーワードの一覧のうち、対象キーワード表示部５１に表示された任意キーワードと同義語等の関係にあると考えるキーワードを選択する。この選択方法としては、ヒットキーワード一覧表示部５２とヒット外キーワード一覧表示部５３とに表示された、各キーワード毎に備えられたチェックボックスにチェックすることにより選択する方法がある。通常、ヒットキーワード一覧表示部５２には調査者が構成要素として選択したキーワードは表示しない。調査者が構成要素として選択したキーワードは表示するまでもなく、調査者によって選択されたものとして処理が行われる。なお、調査者が構成要素として選択したキーワードも表示させると、検索対象データ中に選択した構成要素の有無を調査者が確認可能となる。このチェック付けを検知すると、ＣＰＵ８は任意キーワードと当該チェックされたキーワードとの関連付けを記憶する。また、チェック付けが解除されると、ＣＰＵ８は任意キーワードと当該チェックが解除されたキーワードとの関連付けを削除する。図１５の関連付け選択画面５０を例にとると、任意キーワードである「キー項目」に対して、「検索キー」と「検索項目」と「特徴部分」とがキーワード選択され、関連付けが記憶されている状態である。
【００８４】
このように、調査者がキーボード等９を使用して、任意キーワードと、同義語等の関係にあると考えるキーワードを選択し、関連付け選択画面５０上の関連付け実行ボタン５４をクリックすること等により、任意キーワードに対する他のキーワードの関連付け保存が行われる。これを受けてＣＰＵ８は、当該関連付けを保存し、当該任意キーワードに関連付けられた他のキーワードを、探索条件決定画面４０上の検索条件表示部４８の、当該任意キーワードの行であり、同義語項目の列と交わる領域に各々表示する。この例として、図１６の探索条件決定画面４０上の検索条件表示部４８においては、「キー項目」という任意キーワードに対して「検索キー」、「検索項目」、「特徴部分」という３つのキーワードが、「順序」という任意キーワードに対しては「手続」、「順番」という２つのキーワードが夫々関連付けられていることを示している。また同時に、図１６の探索条件決定画面４０上の検索条件表示部４８においては、「検索結果」という任意キーワードに対して、全くキーワードが関連付けられていないことも示している。このように、任意キーワードに対して関連付けキーワードの無い場合は、構成要素として選択された語句、即ちこの場合は「検索結果」のキーワードのみを用いて検索用データベースの検索が行われる。（後述）
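The bookkeeping in paragraphs 【００８３】 and 【００８４】 (ticking a checkbox stores an association between the chosen element and a keyword, unticking removes it, and an element with no associated keywords is searched by itself) could be modelled roughly as follows. The class and method names are hypothetical; the keywords are those shown for Fig. 16.

```python
class Associations:
    """Keeps, for each selected element, the keywords the investigator ticked."""

    def __init__(self) -> None:
        self._synonyms: dict[str, set[str]] = {}

    def set_checked(self, element: str, keyword: str, checked: bool) -> None:
        """Store the association when checked, delete it when unchecked."""
        chosen = self._synonyms.setdefault(element, set())
        if checked:
            chosen.add(keyword)
        else:
            chosen.discard(keyword)

    def search_terms(self, element: str) -> list[str]:
        """The element itself followed by its associated keywords (sorted for stability)."""
        return [element] + sorted(self._synonyms.get(element, set()))


assoc = Associations()
for kw in ("検索キー", "検索項目", "特徴部分"):
    assoc.set_checked("キー項目", kw, True)
for kw in ("手続", "順番"):
    assoc.set_checked("順序", kw, True)

print(assoc.search_terms("キー項目"))
print(assoc.search_terms("検索結果"))  # no associations: only the element itself
```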
【００８５】
以上が、関連付け選択ステップＳ３における各処理の説明である。ここで、図１６において検索条件表示部４８の構成要素の列に表示されている各任意キーワードに対し、当該任意キーワード毎の行の同義語項目に表示されているキーワードが、Ｓ３において当該任意キーワード毎に関連付けされた検索対象データのキーワードである。以下、検索ステップＳ４について説明する。
【００８６】
この検索ステップＳ４においても、前述のＣＰＵ８が情報探索プログラム１４を解釈し、実行することにより以下の処理が行われる。ここで、検索ステップＳ４は図６のフローチャートに図示されたＳ４１乃至Ｓ４３の処理から成り立っている。以下、図６のフローチャートに基づいて、Ｓ４１乃至Ｓ４３の夫々の処理について説明する。
【００８７】
まず、図１６の探索条件決定画面４０において、ＣＰＵ８は、検索条件表示部４８の構成要素の列に表示されている各任意キーワード毎に重み付けを付与する指示である、調査者の重み付け付与命令を待つ。ここで調査者は、キーボード等９を使用し、検索条件表示部４８の構成要素の列に表示されている各キーワードをクリックすること等により当該キーワードを選択した上で、重み付け付与メーター４５を操作するか或いはキーボード等９から数値を入力することにより、重み付け付与の指示を行う。ここで重み付け付与メーター４５は、図１６のように、中央の重み付け指示棒を左右に操作することにより、特定範囲の数値の中から、重み付け指示棒が指し示す値を選択できるように構成されている。この調査者からの重み付け付与の指示を受けると、ＣＰＵ８は、検索条件表示部４８の該当キーワードの行における重みの列に、重み付け付与メーター４５に示された値を表示する（Ｓ４１）。この例として、図１６の探索条件決定画面４０において、「キー項目」という任意キーワードに対して「８」が、「順序」という任意キーワードに対して「６」が、「検索結果」という任意キーワードに対して「３」が夫々調査者によって重み付け付与された場合、探索条件決定画面４０は図１７のような状態になる。また、調査者が特に重み付け指定していない、検索条件表示部４８の構成要素の列に表示されているキーワードについては、情報探索プログラム１４により初期設定値が自動設定されると、調査者にとって簡便であり望ましい。
【００８８】
ここで、調査者による重み付け付与ステップＳ４１は、説明の便宜上からＳ４に含めたが、Ｓ１の最後や、Ｓ２、Ｓ３の他のステップに存在していても良い。また調査者は、探索条件決定画面４０に表示されている時には適時、探索条件決定画面４０に表示されている各種データを保存することや、以前保存された当該各種データを読み込むことができる。例えば、調査者が探索条件決定画面４０上の設定保存ボタン４６をキーボード等９を使用し、クリックすることを受けて、ＣＰＵ８は、当該探索条件決定画面４０に表示されている各種データをハードディスク１０に保存する。また、調査者が探索条件決定画面４０上の設定読込ボタン４７をキーボード等９を使用してクリックし、設定ファイルを指定することを受けて、ＣＰＵ８は、当該設定ファイルをハードディスク１０からメモリ６に読み込み、探索条件決定画面４０に当該設定ファイルの内容を表示する。
【００８９】
次に、ＣＰＵ８は、検索条件表示部４８の条件に基づいて、検索対象データの情報ブロック毎に類似度判定を行う探索開始指示である、調査者の探索開始命令を待つ。ここで調査者は、キーボード等９を使用し、探索条件決定画面４０上の探索開始ボタン４９をクリックすること等によって、探索開始を指示する。そうすると、ＣＰＵ８は調査者からの探索開始指示を受け、検索条件表示部４８に表示されている検索条件データを抽出する。この例として、図１７の探索条件決定画面４０の状態において、調査者からの探索開始指示がなされると、図１８のような検索条件データ６０が抽出される。ここで、検索条件データ６０には新たに管理番号の列が存在するが、これは検索条件データ６０の各レコードをユニークにする為に管理番号を付したものである。
【００９０】
続いて、ＣＰＵ８は、検索条件データ６０に基づいて、検索対象データの情報ブロック毎に類似度判定を行う（Ｓ４２）。この類似度判定方法としては、検索条件データ６０において、ある特定の構成要素に対する同義語の数がＮで、検索対象データのある特定の情報ブロック（以下「特定情報ブロック」と称す。）に対して、Ｎ個のキーワードを使用した検索の結果Ｍ個ヒットした場合、その構成要素に対しての特定情報ブロックの得点は「（Ｍ／Ｎ）＊重み」として、特定情報ブロック毎に記録される。同様に、他の構成要素である「順序」及び「検索結果」についても同様の検索と記録を行う。最後に、各構成要素毎の得点を、特定情報ブロック毎に集計して、その特定情報ブロックの類似度得点とする。
【００９１】
例えば、第１８図に示す検索条件データ６０において、構成要素が「キー項目」の行では、同義語の数が３つ（「検索キー」、「検索項目」、「特徴部分」）であるのでＮの数は構成要素として選択した「キー項目」を加えてＮ＝４となる。これに対し、検索対象データのある特定の情報ブロックにおいて、「検索キー」と「検索項目」と「特徴部分」と「キー項目」とを検索キーワードとして、その情報ブロックを検索した場合「検索キー」と「検索項目」と「特徴部分」と「キー項目」の４つのキーワードが共に存在した場合は、Ｍ＝４であることが分かり、第１８図に示す検索条件データ６０より、構成要素が「キー項目」の重みは８であることが分かる。よって、この特定情報ブロックにおける構成要素が「キー項目」の行の得点は、「（Ｍ／Ｎ）＊重み」の式に代入すると「（４／４）＊８」となり、当該式の解として「８」であることが分かる。同様にして、構成要素が「順序」の行と「検索結果」の行とにおいても、当該特定情報ブロックにおいて得点を算出する。そして、以上の３つの構成要素の行毎に算出された得点を合計することにより、当該特定情報ブロックの合計得点（類似度得点）が算出される。なお、ここでは３つの構成要素の行毎に算出された得点を単純に合計しているが、３つの構成要件の行毎に算出された得点の積を求めて、合計得点を補正することができる。然るに、同じ技術を広範な同義語で表現できる特許明細書や技術文献といった文章を対象に類似度を求めるので、得点の算定方法を厳密に定める意味は殆ど無いため、種種の計算方法を採用できる。
【００９２】
また、先のＳ４２において、検索条件データ６０におけるある特定の構成要素に対する同義語の数であるＮに、当該特定の構成要素を加えない数（Ｎ−１）をＮとしても良い。例えば、図１８に示す検索条件データ６０において、構成要素が「キー項目」の行では、同義語の数が３つ（「検索キー」、「検索項目」、「特徴部分」）であるのでＮ＝３とし、調査者が選んだ３つのキーワードを検索キーワードとして、先のＳ４２を行う。このように、当該構成要素自体をキーワードとして採用しないことにより、検索対象データ中のキーワードのみによる類似度判定を行うことができる。
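The scoring rule of S42 in paragraphs 【００９０】 to 【００９２】 can be sketched as follows: for each element, N search terms are tried against an information block, M of them hit, the element contributes (M / N) × weight, and the contributions are summed per block. The names `Condition` and `block_score` are hypothetical, as are the two sample block texts; the elements, synonyms, and weights follow the Fig. 17 / Fig. 18 example. Skipping an element that has no synonyms when the element itself is excluded is an assumption, since the patent does not cover that case.

```python
from dataclasses import dataclass


@dataclass
class Condition:
    element: str         # 構成要素 chosen from the reference data
    synonyms: list[str]  # keywords from the search target data tied to it
    weight: float        # 重み assigned by the investigator


def block_score(block_text: str, conditions: list[Condition],
                include_element: bool = True) -> float:
    """Sum of (M / N) * weight over all elements for one information block."""
    total = 0.0
    for cond in conditions:
        terms = ([cond.element] if include_element else []) + cond.synonyms
        if not terms:
            continue  # assumption: nothing to search for, so no contribution
        hits = sum(1 for term in terms if term in block_text)  # M
        total += hits * cond.weight / len(terms)               # (M / N) * weight
    return total


conditions = [
    Condition("キー項目", ["検索キー", "検索項目", "特徴部分"], 8),
    Condition("順序", ["手続", "順番"], 6),
    Condition("検索結果", [], 3),
]

# Hypothetical information blocks.
block_a = "キー項目として検索キー、検索項目及び特徴部分を用いる。"
block_b = "検索キーと検索項目を手続に従って処理し、検索結果を表示する。"

print(block_score(block_a, conditions))                         # (4/4)*8 for キー項目 only
print(block_score(block_b, conditions))                         # 2*8/4 + 1*6/3 + 1*3/1
print(block_score(block_b, conditions, include_element=False))  # variant of [0092]
```

For `block_a` all four terms of the 「キー項目」 row are present, so that row scores (4/4) × 8 = 8, matching the worked example in the text. The `include_element=False` switch corresponds to the variant in which the element itself is left out of N.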
【００９３】
このようにして、ＣＰＵ８は、検索条件データ６０に基づいて、検索対象データの情報ブロック毎に類似度得点を算出すると、当該類似度得点の高い順に検索対象データの情報ブロックをソート処理し、ディスプレイ７に一覧表示する（Ｓ４３）。この例としては、図１９のような表示方式がある。この図１９の検索情報データの情報ブロック類似度順一覧７０においては、「文献番号」の項目の値がその情報ブロックの当初の表示順番を意味し、「得点順位」の項目の値がその情報ブロックの合計得点順番を意味している。例えば、「文献番号」が「３７３」の情報ブロックの行からは、当初は３７３番目の行に表示されていた情報ブロックであり、合計得点が他の情報ブロックに比べ一番高かったことが分かる。同様にして、「文献番号」が「３７３」の情報ブロックの行からは、「公開番号」が「Ｈ０３−１２３４」であることや、「公開日」が「Ｈ０３／１０／１０」であること、また「発明の名称」が「Ａシステム」であることが読み取れる。更に、「リンク」の項目にはパスとファイル名等の、個別の情報ブロックデータの識別情報を保持している。これにより、調査者がキーボード等９を使用し、情報ブロック類似度順一覧７０上の個別の情報ブロックを選択することで、即座にＣＰＵ８が、当該情報ブロックをディスプレイ７に表示することができる。
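The ordering step S43 in paragraph 【００９３】 amounts to sorting the information blocks by total score in descending order and showing the resulting rank next to the original document number. A minimal sketch, with a hypothetical function name and made-up scores:

```python
def rank_blocks(scores: dict[int, float]) -> list[tuple[int, int, float]]:
    """Return (rank, document number, score) rows, highest score first."""
    # sorted() is stable, so blocks with equal scores keep their original order.
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(rank, doc_no, score)
            for rank, (doc_no, score) in enumerate(ordered, start=1)]


# Hypothetical total scores keyed by document number (文献番号).
for row in rank_blocks({1: 3.0, 373: 17.0, 42: 9.0}):
    print(row)
```

With these sample values document 373 comes first, in the same way that the Fig. 19 example lists 文献番号 373 at the top of the ranking.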
【００９４】
以上のようなＳ４３により、調査者は、基準データの情報ブロックに対して、類似度得点の高い検索対象データの情報ブロックを、類似度得点順に認知することができる。
【００９５】
以上述べてきたように本実施形態によれば、基準データから抽出したキーワード毎に、検索対象データから抽出したキーワードを複数関係づけることができ、当該関係付けられた検索対象データからのキーワードに基づいて、検索対象データの情報ブロックを検索できるので、もれなく確実に目的とする情報ブロックを検索及び抽出できる効果がある。そのため、調査者は、検索対象データに含まれる内容を情報ブロック単位毎に、注意を要するもの順に見ることができ、読込み調査の時間が大幅に短縮できる。
【００９６】
【発明の効果】
本発明は以上のように構成され機能するので、これによると、基準データから抽出されたキーワード毎に、検索対象データから抽出されたキーワードを関連付けることができ、当該関連付けられた検索対象データからのキーワードに基づいて、検索対象データの情報ブロック毎に類似度を判定することができる。
【００９７】
また、検索対象データの情報ブロック毎に、事前に検索用キーワードを登録する必要が無く、検索対象データ全てをキーワード対象とすることができる。
【００９８】
更に、検索対象データにおいて、登録キーワードの統一必要性が考慮されていない場合においても、同義語等の関係にある技術用語等を、調査者が適切に検索キーワードとして選択することができる。
【００９９】
これに加え、基準データから抽出されたキーワード毎に、関連付けられた検索対象データから抽出されたキーワードに対し、調査者が重み付けを付与することによって、類似度判定を調整することができる。
【０１００】
以上のような点により、従来よりも細やかな調査を行うことが可能になり、もれなく確実に検索対象データから目的とする情報ブロックを検索及び抽出できる効果があるという、従来にない優れた情報探索システムを提供することができる。
【図面の簡単な説明】
【図１】本発明の一実施形態である情報探索システムの構成図である。
【図２】図１における記憶手段２の詳細図である。
【図３】図１の情報探索システムを、パーソナルコンピュータに適用した場合におけるブロック図である。
【図４】情報探索システムの処理を図示したフローチャート図である。
【図５】図４におけるＳ１を詳細化したフローチャート図である。
【図６】図４におけるＳ４を詳細化したフローチャート図である。
【図７】基準データ入力画面３０の画面図である。
【図８】基準データのデータ例である。
【図９】基準データ入力画面３０の画面図である。
【図１０】基準データ入力画面３０の画面図である。
【図１１】基準データをキーワードに分割したキーワード一覧図である。
【図１２】探索条件決定画面４０の画面図である。
【図１３】探索条件決定画面４０の画面図である。
【図１４】検索対象データをキーワードに分割したキーワード一覧図である。
【図１５】関連付け選択画面５０の画面図である。
【図１６】探索条件決定画面４０の画面図である。
【図１７】探索条件決定画面４０の画面図である。
【図１８】検索条件データ６０のデータ例である。
【図１９】検索情報データの情報ブロック類似度順一覧７０の表示例である。
【符号の説明】
１情報探索システム
２記憶手段
３入力手段
４表示手段
５コンピュータ
６メモリ
７ディスプレイ
８ＣＰＵ
９キーボード等
１０ハードディスク
１１基準データ
１１ａ基準データのキーワード
１２検索対象データ
１２ａ検索対象データのキーワード
１３バス
１４情報探索プログラム
１５情報探索機能
２１基準データ
２１ａ基準データの情報ブロック
２２検索対象データ
２２ａ検索対象データの情報ブロック
３０基準データ入力画面
３１請求項一覧表示部
３２選択請求項表示部
３３基準データ表示部
３４基準データ読込ボタン
３５キーワード抽出ボタン
３６基準データ
３７分割キーワード一覧
３８検索対象データのキーワード一覧
４０探索条件決定画面
４１キーワード再抽出ボタン
４２請求項データ表示部
４３キーワード一覧表示部
４４同義語ボタン
４５重み付け付与メーター
４６設定保存ボタン
４７設定読込ボタン
４８検索条件表示部
４９探索開始ボタン
５０関連付け選択画面
５１対象キーワード表示部
５２ヒットキーワード一覧表示部
５３ヒット外キーワード一覧表示部
５４関連付け実行ボタン
６０検索条件データ
７０検索情報データの情報ブロック類似度順一覧
[0001]
[Technical field of the invention]
The present invention relates to a system for searching for target materials and the like, and in particular to a system that assists, and thereby eases, searches in which the investigator must actually read the contents of documents, such as searches for opposition materials concerning industrial property rights and prior-application searches.
[0002]
[Prior art]
Conventionally, patent information database search systems that can retrieve patent information and the like by machine search have been developed and used. Such a patent information database is built by storing the patent information generated for each patent application as a database in the storage means of a host computer.
[0003]
In such a patent information database, various keywords (for example, free keywords, fixed keywords, and F-terms) are stored for each file in addition to general bibliographic items, and the corresponding patent information can be retrieved by specifying them. In addition, recently issued patent gazettes in CD-ROM form and CD-ROMs of patent information compiled by various vendors (hereinafter collectively "electronic patent information") store the specification, abstract, and drawings as electronic information in addition to the bibliographic items above, so that the corresponding patent information can be retrieved based on various search fields.
[0004]
When the patent information database or electronic patent information described above is used to search for invalidation or opposition materials, the investigator creates a search formula, looks at the abstracts of the hits, reads the full text of the gazette where necessary, and judges whether each hit can be used as invalidation or opposition material. If the search yields few hits, the investigator reads the gazettes directly. If no relevant gazette is found, the investigator rewrites the search formula and reads the newly hit gazettes, and this cycle is repeated. A prior-application search for an invention about to be filed follows the same procedure.
[0005]
The patent search process is described here in somewhat more detail. After understanding the content and distinguishing features of the application in question, the investigator combines classification symbols, keywords, and the like stored in the patent database or electronic patent information to create a search formula such that the target technology falls within an answer set of, for example, several tens to about a hundred documents. Ideally the search would return few hits and the hit set would still cover the desired technology, but narrowing to a very small number often fails to capture partially related similar technology or similar technology in other fields. The investigator therefore usually aims for several tens of hits, or at most a few hundred, when creating a search formula so that the hit set contains the target technology. If the set exceeds a few hundred documents, the subsequent screening takes so long that an efficient search is difficult. Even so, when the investigator is confident that the target similar technology is in the set, the investigator reads the abstracts, gazettes, and so on of every hit to look for it. This work is done by hand. If no similar technology is found after reading many gazettes, the investigator rewrites the search formula and repeats the same procedure, and the search continues until it is confirmed that no similar technology can be found however hard one looks. For prior-art, opposition, and invalidation searches in particular, the time needed to create the search formula is compounded by the time a person needs to read the machine search results closely, examine the content further, and decide what to keep or discard. The search therefore takes a long time to complete and is very costly. In practice, even a search specialist well versed in the classification systems and search techniques needs a great deal of time to examine and process the results of a machine search.
[0006]
A common search technique that has long been in wide use is pattern matching. It compares the keywords, search formulas, and the like used for the search (hereinafter "search keywords") with the keywords registered for each search unit, such as a document or article (hereinafter "registered keywords"), and decides whether to return the search unit as a result according to whether they match, fully or partially. Consequently, when a search keyword and a registered keyword do not match, fully or partially, but are synonyms or the like (including near-synonyms), the search unit cannot be found. In other words, because a technical idea can be expressed in a great many terms, it is very difficult to extract the target similar technology by machine search based on pattern matching.
[0007]
As an example of this pattern matching, when the character string "primary key" is specified as the search keyword, registered keywords such as "primary key", "key", and "key item" each contain at least part of that character string and are therefore judged to be similar keywords. However, registered keywords that are its technical synonyms, such as "main item" or "search item", cannot be retrieved when "primary key" is specified as the search keyword.
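The limitation described above can be reproduced with a few lines of Python. This is only a sketch: it treats two keywords as a partial match when they share at least one word, which is an assumption standing in for the character-level partial matching of the Japanese terms, and the function name `partial_match` is hypothetical.

```python
def partial_match(search_keyword: str, registered_keyword: str) -> bool:
    """True if the two keywords share at least one whitespace-separated word."""
    search_words = set(search_keyword.lower().split())
    registered_words = set(registered_keyword.lower().split())
    return bool(search_words & registered_words)


registered = ["primary key", "key", "key item", "main item", "search item"]
hits = [kw for kw in registered if partial_match("primary key", kw)]
missed = [kw for kw in registered if kw not in hits]

print(hits)    # → ['primary key', 'key', 'key item']
print(missed)  # → ['main item', 'search item']
```

The two synonyms end up in `missed` even though they denote the same concept, which is exactly the gap the description attributes to pattern matching.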
[0008]
Further, since the above-mentioned registered keywords are used for searching, they must be registered in advance for each search unit such as a document, and this registration is time-consuming. (Hereinafter, this effort is referred to as "keyword registration effort.")
[0009]
As a method of determining registered keywords, characteristic keywords are partially extracted from each search unit such as a document, or an administrator sets some keywords for each search unit. For this reason, registered keywords generally represent only a part or one aspect of the search unit. (Hereinafter, this point is referred to as “partial expression of registered keywords.”) Consequently, when a keyword that appears in a search unit but is not one of its registered keywords is designated as a search keyword, the search unit is not extracted as a search result because of the partial expression of registered keywords.
[0010]
Further, as in the case of the above-mentioned "primary key", registered keywords that are technical synonyms or the like need to be used in a unified manner. (Hereinafter, this point is referred to as “necessity of unification of registered keywords.”) For example, when the term “primary key” is used as a registered keyword, the registered keywords “main item” and “search item” cannot be used.
[0011]
However, if this necessity of unification of registered keywords is not observed, the result is unfavorable for researchers who conduct searches, and the problem becomes especially apparent when judging the novelty or inventive step of industrial property rights and the like.
[0012]
Here, novelty is a patent requirement under which a particular invention and one or more cited inventions are identified and compared, and whether the particular invention is new is determined by whether the comparison reveals a difference. In the determination of novelty, even inventions that a researcher would conventionally judge to be the same invention (that is, lacking novelty) by visually checking their contents may not be retrieved with high accuracy if the cited invention uses, as a registered keyword, another technical term with the same meaning as the term used in the particular invention. The national patent databases in widespread use at the time of filing of the present application provide F-terms, International Patent Classification symbols, FI symbols, and the like as thesaurus-like symbols that classify technical content. However, since a new invention cannot always be classified into an existing technical system, it is difficult to hit the relevant patent or the like precisely even by combining keywords and classification symbols. Further, when papers or general technical documents, which carry few classification symbols, are to be investigated, it is difficult to hit an appropriate document by machine search from a data set narrowed down to a small number. Therefore, the researcher needs to create a data set of a certain size and examine whether it contains a relevant document.
[0013]
In addition, inventive step is a patent requirement under which a particular invention and one or more cited inventions are identified, the two are compared to clarify the points of coincidence and difference in the matters specifying each invention, and a determination is made as to whether a person skilled in the art could easily have arrived at the difference. In the determination of inventive step, even inventions that a researcher would conventionally judge to differ only slightly (that is, lacking inventive step) by visual inspection may not be appropriately searched and extracted if the cited invention uses another synonymous technical term or the like as a registered keyword.
[0014]
In general, documents such as specifications of industrial property rights have many different authors, and technical terms with the same or similar meanings, as in the above "primary key" example, are expressed in a variety of ways. Since the necessity of unifying registered keywords is thus not taken into consideration, there is an inconvenience that it is extremely difficult to search for the same invention or a similar invention using the conventional search method.
[0015]
Further, when the registered keyword for each search unit is a part or the like constituting the search unit, there is an inconvenience that it is difficult to perform a highly accurate survey due to the partial expression of the registered keyword.
[0016]
For the reasons described above, the emergence of a system that can quickly and accurately extract materials to be investigated, such as reference materials, is desired. In addition, an apparatus that can automatically or simply analyze novelty and inventive step based on specific patent information or the like is desired.
[0017]
In this regard, Patent Literature 1 discloses a method in which a plurality of pieces of information to be searched are stored in advance in association with keyword groups contained in them, the keywords of the information to be searched are totaled and displayed, and a search over the plurality of pieces of information is performed based on keywords that the researcher selects from the displayed totals.
[0018]
However, in this method, it is necessary to register the association of the keyword group for each of a plurality of pieces of information to be searched in advance, and the time and effort for registering the keyword cannot be solved. Furthermore, since a specific keyword group included in a plurality of pieces of information is used for a search, the partial expression of registered keywords and the necessity of unifying them cannot be solved.
[0019]
Also, Patent Literature 2 discloses a method in which a pattern expressing the presence or absence of association with a predetermined keyword group is created for each document unit to be searched and, when the researcher designates a pattern, document units having a pattern similar to the designated pattern are retrieved.
[0020]
However, the predetermined keyword group must be determined in advance, and if it is changed partway through, past patterns are difficult to reuse. Also, as in Patent Document 1, a pattern association must be registered in advance for each of the plurality of pieces of information to be searched, so the keyword registration effort is not eliminated. Further, since a pattern expressing the presence or absence of association with a predetermined keyword group is used for the search, the partial expression of registered keywords and the necessity of unifying them are not resolved.
[0021]
[Patent Document 1]
JP-A-9-73453
[0022]
[Patent Document 2]
JP-A-61-182131
[0023]
[Problems to be solved by the invention]
As described above, the conventional search method merely extracts patent information containing keywords or the like that correspond to a specified keyword or search formula. For this reason, novelty, inventive step, and the like are generally judged by a researcher who actually reviews the contents of the extracted patent information visually or otherwise. When the number of extracted cases is large, reading and understanding them takes a great deal of time. Therefore, there is an inconvenience that it is difficult to judge novelty, inventive step, and the like appropriately in a short time.
[0024]
In addition, the act of determining an appropriate keyword or search formula according to the data set to be surveyed must be performed by the researcher, and there is a disadvantage that the burden on the researcher is heavy.
[0025]
Furthermore, the following inconveniences remain unsolved even in Patent Document 1 and Patent Document 2.
[0026]
First, since the trouble of keyword registration has not been solved, there has been a disadvantage that keyword registration must be performed every time data to be investigated is generated.
[0027]
Furthermore, since the expression of some of the registered keywords has not been solved, there has been a problem that it is easy to omit the search and it is difficult to perform a highly accurate search.
[0028]
In addition, if the unified necessity of registered keywords is not considered in the data set to be surveyed, it is difficult to conduct a highly accurate survey because multiple technical terms such as synonyms are used. There was an inconvenience.
[0029]
[Object of the invention]
An object of the present invention is to improve on the inconveniences of such prior art and of Patent Document 1 and Patent Document 2, and in particular to eliminate the keyword registration effort by providing suitable means for searching the data set to be investigated without registering keywords in advance.
[0030]
A further object is to resolve the partial expression of registered keywords and, so that a highly accurate survey can be conducted, to provide suitable means by which a researcher can select appropriate search keywords according to the data set to be surveyed.
[0031]
A still further object is to provide suitable means by which the researcher can appropriately select, as search keywords, technical terms and the like related as synonyms or the like, even when the necessity of unifying registered keywords has not been considered in the data set to be surveyed.
[0032]
[Means for Solving the Problems]
In order to achieve the above object, a system according to the present invention includes storage means for storing reference data consisting of an information block containing one or more keywords and search target data consisting of a plurality of information blocks each containing one or more keywords. The system also includes input means for inputting an association between a keyword constituting the reference data and a keyword constituting the search target data, a computer that executes a process of searching the search target data for information blocks similar to the information block of the reference data, and display means for displaying the search result. In such a system, the information search program according to the present invention causes the computer to execute a first extraction step of extracting the keywords constituting the information block of the reference data from the reference data stored in the storage means. The program further causes the computer to execute a step of extracting, from the search target data stored in the storage means, the keywords constituting the plurality of information blocks of the search target data. The program further causes the computer to execute a step of displaying on the display means the one or more keywords extracted from the reference data and the one or more keywords extracted from the search target data, and having the researcher select, for each displayed keyword of the reference data, an association with keywords of the search target data based on a relation such as synonymy. In addition, the program causes the computer to execute a search step of searching the search target data for the similar information blocks based on the associated keywords of the search target data.
[0033]
According to the present invention, the similar information block is searched from the search target data based on the keyword of the search target data, which is associated with each keyword of the reference data. A highly accurate search can be performed. Further, unlike the related art, the researcher can perform a search without registering a keyword for each information block of the search target data. Further, unlike the related art, the researcher can select an appropriate search keyword according to the search target data. In addition, unlike the related art, the researcher can appropriately select a keyword of the search target data, which has a synonymous relationship or the like for each keyword of the reference data, as a search keyword.
[0034]
Also, an information search program according to another invention is characterized in that the search step searches the search target data for the similar information blocks based on the keywords extracted from the reference data and the keywords extracted from the search target data that are associated with each of those keywords.
[0035]
According to the present invention, the similar information blocks are searched for in the search target data based on the keywords of the reference data and the keywords of the search target data associated with each of them, so that, unlike the related art, the researcher can perform a more accurate search.
[0036]
Furthermore, an information search program according to another invention is characterized in that the first extraction step includes a step of receiving from the input means that a researcher edits a keyword extracted from the reference data.
[0037]
According to the present invention, because the researcher edits the keywords of the reference data, the researcher can perform a search with higher accuracy than before.
[0038]
Further, the information search program according to another invention is characterized in that the first extracting step includes a step of receiving from the input means that the investigator discards a keyword extracted from the reference data.
[0039]
According to the present invention, because the researcher can discard keywords of the reference data, the researcher can perform a search with higher accuracy than before.
[0040]
Furthermore, an information search program according to another invention causes the computer to execute a step of accepting from the researcher, for each keyword extracted from the reference data, an associated weight indicating its importance in the search processing. The search step then searches the search target data for the similar information blocks based on the keywords extracted from the search target data and the weight associated with each keyword of the reference data.
[0041]
According to the present invention, the similar information block is searched from the search target data based on the keyword of the search target data and the weight associated with each keyword of the reference data. This enables the researcher to perform a search with higher accuracy. Further, unlike the related art, since the search result is corrected by weighting, a more detailed search can be performed.
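As a rough illustration of how such weights might enter the search step, the following sketch scores an information block by the share of the total weight whose reference keyword, or one of its associated search target keywords, appears in the block. The scoring formula, the name `weighted_similarity`, and the simple substring test are assumptions made for illustration; the text above does not state a formula.

```python
from typing import Dict, List


def weighted_similarity(
    block_text: str,
    associations: Dict[str, List[str]],
    weights: Dict[str, float],
) -> float:
    """Return the weighted share of reference keywords covered by the block.

    associations maps each reference keyword to the search target keywords
    the researcher associated with it; weights maps each reference keyword
    to its importance. Both structures are hypothetical.
    """
    total = sum(weights.get(ref_kw, 0.0) for ref_kw in associations)
    if total == 0:
        return 0.0
    text = block_text.lower()
    matched = 0.0
    for ref_kw, target_kws in associations.items():
        candidates = [ref_kw] + list(target_kws)
        # The reference keyword counts as covered if it or any synonym appears.
        if any(candidate.lower() in text for candidate in candidates):
            matched += weights.get(ref_kw, 0.0)
    return matched / total


associations = {"primary key": ["main item", "search item"], "order": ["sequence"]}
weights = {"primary key": 3.0, "order": 1.0}
score = weighted_similarity("records are sorted by main item", associations, weights)
```

With these sample values only the heavily weighted keyword is covered, so the block receives three quarters of the maximum score.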
[0042]
Further, the information search program according to another invention includes a step of, when the search step searches for similar information blocks from the search target data, adding information indicating the degree of similarity to each information block. Then, the program causes the computer to execute a step of displaying the search results in the order of the similarity.
[0043]
According to the present invention, the search results are displayed in the order of the similarity, so that the researcher can easily understand the search results. Further, unlike the related art, the researcher can easily access the individual breakdown of the search result.
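Ordering the results by similarity can be sketched as follows. The `SearchHit` record and the `rank_hits` function are hypothetical names, and the similarity values are assumed to have been computed by the search step beforehand.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SearchHit:
    block_id: str      # identifier of an information block in the search target data
    similarity: float  # degree of similarity attached by the search step


def rank_hits(similarities: Dict[str, float]) -> List[SearchHit]:
    """Attach the similarity to each block and sort from most to least similar."""
    hits = [SearchHit(block_id, value) for block_id, value in similarities.items()]
    return sorted(hits, key=lambda hit: hit.similarity, reverse=True)


ranked = rank_hits({"doc-A": 0.2, "doc-B": 0.9, "doc-C": 0.5})
ranked_ids = [hit.block_id for hit in ranked]
```

Because each hit keeps its own similarity value, the display can show both the order and the individual score of every information block.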
[0044]
Further, an information search program according to another invention is characterized in that the information block of the reference data is specific patent application information, and the plurality of information blocks of the search target data are a plurality of pieces of patent application information or technical literature information.
[0045]
According to the present invention, for specific patent application information which is an information block of the reference data, similar patent application information is obtained from a plurality of patent application information or a plurality of technical documents which are a plurality of information blocks of the search target data. Alternatively, it becomes possible to search for technical documents. As a result, unlike the related art, it is possible to reduce the research time spent by the researcher and reduce the burden.
[0046]
An information search program according to another invention is characterized in that the information block of the reference data is one or more claims of a specific patent application.
[0047]
According to the present invention, for one or more claims of a specific patent application or the like, which is an information block of the reference data, from a plurality of patent application information or scientific and technical literature which is a plurality of information blocks of the search target data, It will be possible to search for similar patent application information or scientific and technical literature. As a result, unlike the related art, it is possible to search for similar patent application information or scientific and technical literature for each claim.
[0048]
Further, an information search program according to another invention is characterized in that, when the information block of the reference data, which is one or more claims of a specific patent application, is a dependent claim, the first extraction step includes a step of adding the independent claim cited by the dependent claim to the information block of the reference data.
[0049]
According to the present invention, when the information block of the reference data, which is one or more claims of a specific patent application, is a dependent claim, an independent claim cited by the dependent claim is added to the information block of the reference data. It becomes possible. Thus, unlike the related art, even if the information block of the reference data is a dependent claim, similar patent application information or scientific and technical literature can be searched for each claim.
[0050]
This aims to achieve the above-mentioned object.
[0051]
[Best Mode for Carrying Out the Invention]
Hereinafter, an embodiment of the present invention will be described with reference to FIGS.
[0052]
FIG. 1 is a configuration diagram of an information search system according to an embodiment of the present invention. As shown in FIG. 1, the information search system 1 includes a storage unit 2, an input unit 3, a display unit 4, and a computer 5 for controlling these. These means and the computer 5 are connected to each other, and the information search function 15 is realized by cooperating while transmitting and receiving data. The storage unit 2 stores reference data 21 serving as a reference for a search process and search target data 22 serving as a search target of the reference data.
[0053]
Next, the storage unit 2 in FIG. 1 and the reference data 21 and search target data 22 stored in it are shown in detail in the detailed view of the storage unit 2 in FIG. Here, the reference data 21 consists of an information block 21a containing a plurality of keywords. The search target data 22 consists of a plurality of information blocks 22a each containing a plurality of keywords. The search target data 22 is a set of data that the researcher has narrowed down in advance by using keywords, classification symbols, or the like in a search formula; if the number of items is small to begin with, data that has not been narrowed down is used.
[0054]
Here, each keyword included in the reference data 21 and in the search target data 22 consists of one or more words, symbols, numerical values, and the like, or a combination thereof. Examples include "key item", "system", and "search result". Further, in FIG. 2, the keyword a included in the reference data 21 and the keyword a′ included in the search target data 22 are related as synonyms or the like. Similarly, the keyword b included in the reference data 21 and the keyword b′ included in the search target data 22 are related as synonyms or the like. Being related as synonyms or the like here means that the keywords have the same or a similar meaning, like the relation between the above-mentioned “primary key” and “main item” or “search item”.
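The relation between a keyword of the reference data and its counterparts in the search target data can be pictured as a simple mapping. The sketch below is only one possible representation; the class name `KeywordAssociations` and its methods are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class KeywordAssociations:
    """Maps each reference-data keyword to associated search-target keywords."""

    mapping: Dict[str, List[str]] = field(default_factory=dict)

    def associate(self, reference_keyword: str, target_keyword: str) -> None:
        """Record that the two keywords are related as synonyms or the like."""
        targets = self.mapping.setdefault(reference_keyword, [])
        if target_keyword not in targets:  # ignore repeated selections
            targets.append(target_keyword)

    def targets_for(self, reference_keyword: str) -> List[str]:
        """Return the associated search-target keywords (empty if none)."""
        return list(self.mapping.get(reference_keyword, []))


associations = KeywordAssociations()
associations.associate("primary key", "main item")
associations.associate("primary key", "search item")
associations.associate("primary key", "main item")  # duplicate, has no effect
```

One reference keyword may map to several search target keywords, which mirrors the way “primary key” corresponds to both “main item” and “search item”.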
[0055]
Further, the information block 21a of the reference data represents the information unit that serves as the reference for the information search. The information blocks searched against the information block 21a of the reference data are the plurality of information blocks 22a included in the search target data 22. The information search function 15 described above is the function by which the computer 5 determines the information blocks 22a of the search target data that are similar to the information block 21a of the reference data, based on the keywords of the reference data 21 and/or the keywords of the search target data 22 that are related to those keywords as synonyms or the like. It is desirable that the information block 21a of the reference data and the information blocks 22a of the search target data be divided document by document. In this case, based on the information block 21a of the reference data, which is a specific document, the information search function 15 determines the information blocks 22a of the search target data that are documents similar to the specific document.
[0056]
Next, FIG. 3 is a block diagram in a case where the information search system of FIG. 1 is applied to a personal computer. Here, the storage means 2 in FIG. 1 corresponds to the memory 6 or the hard disk 10, the input means 3 corresponds to the keyboard 9 or the like, the display means 4 corresponds to the display 7, and the computer 5 corresponds to the CPU 8. The components are connected by a bus 13 so that they can communicate with each other. The reference data 21 and the search target data 22 stored in the storage unit 2 in FIG. 1 are stored in the hard disk 10 as the reference data 11 and the search target data 12 in a format such as a file or a database. Similarly, the information search function 15 is stored in the hard disk 10 as the information search program 14. Then, in the system in FIG. 3, the CPU 8 interprets the information search program 14 to perform calculation processing and control each component. When the CPU 8 interprets and executes the information search program 14, it reads the information search program 14, data, and the like from the hard disk 10 into the memory 6.
[0057]
Here, the keyboard and the like 9 include a mouse and the like generally used in personal computers. The display 7 is a CRT display, a liquid crystal display, or the like. Further, the illustrated keyword 11a of the reference data conceptually indicates that the keyword included in the reference data 11 has been extracted by the CPU 8 and read into the memory 6. The same applies to the keyword 12a of the search target data.
[0058]
Next, the operation of each component in the present embodiment will be described with reference to the flowchart of FIG.
[0059]
When a person who intends to conduct a survey using the information search system of FIG. 3 (hereinafter abbreviated as “researcher”) inputs a survey start command using the keyboard or the like 9 serving as input means, the CPU 8 receives and interprets the command and starts the information search program 14.
[0060]
Here, the CPU 8 interprets and executes the activated information search program 14, whereby the four processes shown in the flowchart of FIG. 4 are performed. The four processes are a first extraction step S1 of extracting keywords from the reference data, a step S2 of extracting keywords from the search target data, a step S3 of selecting associations between the reference data keywords and the search target data keywords, and a step S4 of searching for information blocks of the search target data based on the associated keywords of the search target data. The information search function 15 is realized by these four processes. Alternatively, the first extraction step S1 and the step S2 can be processed separately in advance, so that execution of the information search program 14 starts immediately from step S3, in which the associations between the reference data keywords and the search target data keywords are selected.
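The flow of the four processes can be outlined in a few lines. This is a sketch under stated assumptions, not the program of the embodiment: keyword extraction is reduced to whitespace splitting, the researcher's choice in S3 is modeled as a callback, and all names (`extract_keywords`, `run_information_search`, `choose`) are hypothetical.

```python
from typing import Callable, Dict, List

Associations = Dict[str, List[str]]


def extract_keywords(text: str) -> List[str]:
    """Stand-in for S1/S2: split on whitespace and drop repeated words."""
    keywords: List[str] = []
    for word in text.lower().split():
        if word not in keywords:
            keywords.append(word)
    return keywords


def run_information_search(
    reference_block: str,
    target_blocks: Dict[str, str],
    select: Callable[[List[str], List[str]], Associations],
) -> List[str]:
    reference_keywords = extract_keywords(reference_block)                # S1
    target_keywords = extract_keywords(" ".join(target_blocks.values()))  # S2
    associations = select(reference_keywords, target_keywords)            # S3
    search_keywords = {kw for kws in associations.values() for kw in kws}
    return [                                                              # S4
        block_id
        for block_id, text in target_blocks.items()
        if any(kw in text.lower() for kw in search_keywords)
    ]


def choose(reference_keywords: List[str], target_keywords: List[str]) -> Associations:
    """Hypothetical researcher choice: relate "key" to "item" when both exist."""
    if "key" in reference_keywords and "item" in target_keywords:
        return {"key": ["item"]}
    return {}


targets = {"D1": "records sorted by main item", "D2": "a display with a touch panel"}
similar = run_information_search("primary key order", targets, choose)
```

Since S1 and S2 depend only on the stored data, their results could be computed ahead of time and passed straight to the selection step, as the text notes.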
[0061]
The outline of each step of S1 to S4 will be described below first.
[0062]
First, the first extraction step S1 of extracting keywords from the reference data is a step of extracting a plurality of keywords constituting an information block included in the reference data 11 stored on the hard disk 10. The keywords extracted in this step are used in S3 and S4. Hereinafter, the first extraction step S1 of extracting a keyword from the reference data is abbreviated as “first extraction step from the reference data”.
[0063]
Next, the step S2 of extracting a keyword from the search target data is a step of extracting a plurality of keywords constituting a plurality of information blocks included in the search target data 12 stored in the hard disk 10. The keywords extracted in this step are used in S3 and S4. Hereinafter, step S2 of extracting a keyword from search target data is abbreviated as “extraction step from search target data”.
[0064]
Subsequently, step S3 of selecting the association between the keywords of the reference data and the keywords of the search target data is a step of having the researcher select associations between the keywords extracted from the reference data 11 in S1 and the keywords extracted from the search target data 12 in S2. The associations selected in this step are used in S4. Hereinafter, step S3 of selecting the association between the keywords of the reference data and the keywords of the search target data is abbreviated as “association selection step”.
[0065]
Subsequently, step S4 of searching for information blocks of the search target data based on the associated keywords of the search target data is a step of searching the search target data 12 for information blocks similar to the information block of the reference data 11, based on the keywords of the search target data 12 associated in S3. In this search step, it is determined whether each information block of the search target data 12 is similar to the information block of the reference data 11. Hereinafter, step S4 of searching for information blocks of the search target data based on the associated keywords of the search target data is abbreviated as “search step”.
[0066]
The above is the outline of each step of S1 to S4, and each step will be described in detail below.
[0067]
First, in the first extraction step S1 from the reference data, the CPU 8 interprets and executes the information search program 14 to perform the processing of S11 to S15 shown in the flowchart of FIG. Hereinafter, based on the flowchart of FIG. 5, each process of S11 to S15 will be described.
[0068]
First, in S11, the CPU 8 displays the reference data input screen 30 of FIG. 7 on the display 7 and waits for a reference data input command from the researcher. Here, the researcher uses the keyboard 9 or the like to instruct reading of the reference data through the reference data reading button 34. The researcher also specifies information identifying the reference data, such as a path and a file name. The CPU 8 then receives the instruction to read the reference data and displays the designated reference data on the reference data display unit 33. Further, the CPU 8 analyzes the reference data, divides it into individual claims at the characters “Claim” or at the black-bracket symbols, and displays a list of claims on the claim list display unit 31. An example of the reference data input by the researcher is a specification of a patent or the like. As an example of this process of accepting reference data, when the researcher specifies the data shown in FIG. 8 as reference data, the reference data input screen 30 is in the state shown in FIG. 9.
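Dividing the reference data into claims at a marker can be sketched with a regular expression. The sketch assumes that each claim begins with a black-bracket heading of the form 【請求項N】, as is usual in Japanese gazettes; the pattern and the function name `split_claims` are illustrative and are not taken from the embodiment.

```python
import re
from typing import Dict

# Assumed marker format: black brackets around "請求項" (claim) and a number.
CLAIM_MARKER = re.compile(r"【請求項\s*(\d+)】")


def split_claims(reference_text: str) -> Dict[int, str]:
    """Return a mapping from claim number to the text of that claim."""
    claims: Dict[int, str] = {}
    markers = list(CLAIM_MARKER.finditer(reference_text))
    for index, marker in enumerate(markers):
        # A claim runs from the end of its marker to the start of the next one.
        end = markers[index + 1].start() if index + 1 < len(markers) else len(reference_text)
        claims[int(marker.group(1))] = reference_text[marker.end():end].strip()
    return claims


sample = "【請求項1】A search system.\n【請求項2】The system according to claim 1.\n"
claims = split_claims(sample)
```

The resulting dictionary keys are what a claim list display would enumerate, and the values are the claim texts shown when a claim is selected.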
[0069]
Next, when the input screen of FIG. 9 is displayed, the CPU 8 waits for the researcher's claim selection instruction. Here, the researcher instructs selection of one or more claims by double-clicking each claim in the claim list display section 31 using the keyboard 9 or the like. Then, the CPU 8 receives the claim selection instruction from the researcher, extracts the data of the specified claim from the reference data display unit 33, and displays the data on the selected claim display unit 32.
[0070]
At the time of this extraction and display processing, the CPU 8 executes a branch determination S12 of determining whether a dependent claim is included among the one or more claims selected by the researcher's claim selection instruction. Whether a dependent claim is included is determined by whether the claim data extracted from the reference data display unit 33 cites another claim, for example through the characters “claim” or “claims 1 to 3”. If the CPU 8 determines in S12 that a dependent claim is included, the data of the claim cited by the dependent claim is also extracted from the reference data display section 33 and displayed on the selected claim display section 32 (S13). In S13, when the claim cited by the dependent claim itself cites another claim (that is, when it is itself a dependent claim), the data of that cited claim is likewise displayed on the selected claim display section 32. This processing is repeated until the cited claim is an independent claim. As an example of the dependent claim determination S12, when the researcher specifies “Claim 2” from the claim list display section 31 on the reference data input screen 30 in the state of FIG., the screen changes to the state shown in FIG. 10, in which the data of claims 1 and 2 are displayed. In this way, the researcher's claim selection instruction is reflected in the selected claim display section 32. The claim data displayed on the selected claim display section 32 represents the information block of the reference data, which is the information unit serving as the reference for the information search in the information search function.
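The repeated following of citations in S12 and S13 can be sketched as below. The citation pattern assumes English wording such as "claim 1" or "claims 1 to 3"; that pattern and the names `cited_claims` and `resolve_with_cited_claims` are assumptions for illustration.

```python
import re
from typing import Dict, List, Set

# Assumed citation wording: "claim 1", "claims 1 to 3" or "claims 1-3".
CITATION = re.compile(r"claims?\s+(\d+)(?:\s*(?:to|-)\s*(\d+))?", re.IGNORECASE)


def cited_claims(claim_text: str) -> Set[int]:
    """Return the numbers of all claims cited in the given claim text."""
    cited: Set[int] = set()
    for match in CITATION.finditer(claim_text):
        first = int(match.group(1))
        last = int(match.group(2)) if match.group(2) else first
        cited.update(range(first, last + 1))
    return cited


def resolve_with_cited_claims(selected: int, claims: Dict[int, str]) -> List[int]:
    """Collect the selected claim and every claim it cites, directly or not."""
    needed: Set[int] = set()
    pending = [selected]
    while pending:
        number = pending.pop()
        if number in needed or number not in claims:
            continue
        needed.add(number)
        pending.extend(cited_claims(claims[number]))  # follow until independent
    return sorted(needed)


claims = {
    1: "A search system comprising a storage unit.",
    2: "The system according to claim 1, further comprising a display.",
    3: "The system according to claim 2, wherein the display is a CRT.",
    4: "The system according to any one of claims 1 to 3, further comprising a mouse.",
}
```

Selecting claim 2 in this sample yields claims 1 and 2, which corresponds to the behavior described for the selected claim display; the loop stops as soon as it reaches a claim that cites nothing.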
[0071]
Then, the investigator can edit the claim data displayed on the selected claim display section 32 using the keyboard 9 or the like on the reference data input screen 30 of FIG. That is, it is possible to add or delete characters, symbols, and the like to the claim data displayed on the selected claim display section 32.
[0072]
Subsequently, on the reference data input screen 30 of FIG. 10, the researcher instructs the extraction of keywords from the information block of the reference data by, for example, clicking the keyword extraction button 35 using the keyboard 9 or the like. Then, the CPU 8 receives the keyword extraction instruction from the researcher and divides the claim data displayed on the selected claim display section 32 into keywords (S14). As one method of division, the CPU 8 splits the claim data into keywords at each particle or blank character contained in it. As an example of such a keyword division, the claim data displayed on the selected claim display section 32 on the input screen of FIG. 10 is divided into keywords as in the divided keyword list 37 shown in FIG. 11.
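A naive version of this division can be sketched as follows. The separator set (whitespace, Japanese punctuation, and a handful of single-character particles) is an assumption chosen for the example, and the name `split_into_keywords` is hypothetical. Splitting on single characters would also break words that happen to contain them, so a practical implementation would more likely rely on a morphological analyzer.

```python
import re
from typing import List

# Assumed separators: whitespace, the punctuation marks "、" and "。",
# and a few common Japanese particles.
SEPARATORS = re.compile(r"[\s、。はがをにのとで]+")


def split_into_keywords(claim_text: str) -> List[str]:
    """Split claim text at separators, dropping empty pieces and duplicates."""
    keywords: List[str] = []
    for piece in SEPARATORS.split(claim_text):
        if piece and piece not in keywords:
            keywords.append(piece)
    return keywords


# "Display the search results in the order of the key items."
keywords = split_into_keywords("キー項目の順序で検索結果を表示する")
```

The pieces produced here correspond to “key item”, “order”, “search result”, and “display”, the kind of entries that a divided keyword list would present to the researcher for selection.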
[0073]
When the claim data has been divided into keywords, the CPU 8 displays the search condition determination screen 40 of FIG. 12 on the display 7, displays the claim data on the claim data display section 42, and displays the divided keyword list 37 on the keyword list display section 43 (S15). On the search condition determination screen 40 of FIG. 12, the researcher can edit the claim data displayed on the claim data display section 42 using the keyboard 9 or the like. That is, characters or symbols can be added to or deleted from the claim data displayed on the claim data display section 42. After editing the claim data displayed on the claim data display section 42, the researcher can click the keyword re-extraction button 41 using the keyboard 9 or the like to re-extract keywords from the edited claim data and redisplay the extracted keyword list on the keyword list display section 43.
[0074]
When the list of keywords is displayed on the keyword list display section 43 in this way, the CPU 8 waits for the researcher's keyword selection instruction, which specifies the keywords to be used. The investigator selects one or more keywords by double-clicking each keyword in the keyword list display section 43 using the keyboard 9 or the like. The CPU 8 receives the selection instruction, extracts the specified keyword from the keyword list display section 43, and displays it in the component column of the search condition display unit 48. For example, when the researcher sequentially selects the keywords “key item”, “order”, and “search result” from the keyword list display section 43, the selected keywords are displayed in the component column of the search condition display unit 48 as shown in FIG. In this case, it is appropriate for the researcher to select not all of the keywords displayed in the keyword list display section 43 but those that indicate the features of the invention. As described above, whether to discard or edit a keyword is decided solely by the researcher and is not decided automatically by the information search program 14.
[0075]
The above is the description of each process in the first extraction step S1 from the reference data. Here, each keyword displayed in the column of the component of the search condition display unit 48 in FIG. 13 is a keyword extracted in the first extraction step S1 from the reference data. Hereinafter, the extraction step S2 from the search target data will be described.
[0076]
Also in the extraction step S2 from the search target data, the following processing is performed by the CPU 8 interpreting and executing the information search program 14. First, the CPU 8 receives input of search target data to be searched for the reference data received in S1 from the researcher through the keyboard 9 or the like. Examples of the search target data include a plurality of patent publications, and in that case, the patent publications for each application are individual information blocks in the search target data. Further, as described above, each information block in the search target data is an information unit whose similarity to the information block of the reference data is determined.
[0077]
Then, the CPU 8 divides the search target data input by the researcher into keywords in the same way as in S14 of the first extraction step S1 from the reference data described above. One method of division is for the CPU 8 to split the search target data into keywords at each particle or blank character it contains. FIG. 14 illustrates the keyword list 38 of the search target data divided in this way. The keywords shown in FIG. 14 are the keywords extracted in the extraction step S2 from the search target data.
[0078]
The above is the description of each process in the extraction step S2 from the search target data. In the extraction step S2 from the search target data, the researcher only specifies the search target data, and the other processing is performed based on the information search program 14. Therefore, there is no inconvenience even if the information search program 14 performs the first extraction step S1 from the reference data after the extraction step S2 from the search target data. That is, the first extraction step S1 from the reference data and the extraction step S2 from the search target data are in no particular order. Hereinafter, the association selection step S3 will be described.
[0079]
Also in this association selection step S3, the following processing is performed by the CPU 8 interpreting and executing the information search program 14. After S1 and S2, the search condition determination screen 40 of FIG. 13 is displayed on the display 7, and the CPU 8 waits for an association selection command from the researcher. The researcher uses the keyboard 9 or the like to select an arbitrary keyword (hereinafter abbreviated as “optional keyword”) from the keywords displayed in the component column of the search condition display unit 48 and then, by clicking the synonym button 44 or the like, issues an instruction to display an association selection screen. Upon detecting this, the CPU 8 extracts the arbitrary keyword data from the search condition display section 48 and displays the association selection screen 50, held in advance by the information search program 14, on the display 7 (see below).
[0080]
The association selection screen 50 has a target keyword display section 51, a hit keyword list display section 52, a non-hit keyword list display section 53, and an association execution button 54. The target keyword display section 51 displays the arbitrary keyword previously selected by the researcher from the component column of the search condition display section 48. In the association selection screen 50 of FIG. 15, “key item” has been selected as the arbitrary keyword. The hit keyword list display section 52 displays a list of the keywords that hit when the keywords extracted in S2 are searched using the arbitrary keyword as a search key. The non-hit keyword list display section 53 displays a list of the keywords that did not hit in the same search. The above-described general pattern matching is used as the search method for the keywords extracted in S2. As a result, each keyword extracted in S2 is sorted into and displayed on either the hit keyword list display section 52 or the non-hit keyword list display section 53. This sorting is determined by the similarity criterion selected by the researcher. As shown in FIG. 15, the association selection screen 50 includes a plurality of radio buttons so that the researcher can select a similarity criterion. The association selection screen 50 of FIG. 15 has three radio buttons, similarity (weak), (medium), and (strong), and the similarity criterion is set according to which radio button the investigator selects.
As an example of keyword sorting based on this similarity criterion, when the similarity criterion is set to (strong) for an arbitrary keyword “primary key”, only those keywords extracted in S2 that completely contain the characters “primary key” may be displayed on the hit keyword list display section 52. Similarly, when the similarity criterion is set to (weak) for the arbitrary keyword “primary key”, keywords extracted in S2 that contain a part of the characters “primary key” (for example, “primary” or “key”) are displayed on the hit keyword list display section 52.
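The sorting into hit and non-hit lists under the (strong) and (weak) criteria can be sketched as below. This is a rough illustration, not the program's actual matching logic: the function name `sort_keywords`, the candidate list, and the choice to treat each whitespace-separated word of the target as a "part" are assumptions, and the (medium) criterion is omitted because the text does not define it.

```python
def sort_keywords(target, candidates, criterion="strong"):
    """Sort candidate keywords into (hits, non_hits) for a target keyword.

    strong: the candidate contains the whole target string.
    weak:   the candidate contains at least one part of the target
            (parts are assumed here to be its whitespace-separated words).
    """
    parts = target.split()
    hits, non_hits = [], []
    for keyword in candidates:
        if criterion == "strong":
            matched = target in keyword
        elif criterion == "weak":
            matched = any(part in keyword for part in parts)
        else:
            # "medium" is not specified in the text, so it is not guessed here.
            raise ValueError(f"unsupported criterion: {criterion}")
        (hits if matched else non_hits).append(keyword)
    return hits, non_hits


# Hypothetical keywords standing in for those extracted in S2.
candidates = ["primary key", "composite primary key", "key item", "order"]
strong_hits, strong_misses = sort_keywords("primary key", candidates, "strong")
weak_hits, weak_misses = sort_keywords("primary key", candidates, "weak")
```

With these sample inputs the weak criterion additionally moves “key item” into the hit list, which matches the intent that a looser criterion surfaces more candidates.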
[0081]
When the association selection screen 50 is displayed on the display 7, the CPU 8 displays the previously extracted arbitrary keyword data on the target keyword display section 51. Further, the CPU 8 reads the similarity criterion initially set by the information search program 14 and the keywords extracted in S2, executes the pattern matching based on that similarity criterion, and displays the keywords extracted in S2 sorted between the hit keyword list display section 52 and the non-hit keyword list display section 53. The similarity criterion is provided so that, if the search target data contains many keywords similar to the keyword selected as a component, only the keywords having a high degree of similarity are displayed on the hit keyword list display section 52 and the researcher can immediately select a required search keyword. Conversely, when the search target data contains almost no keyword similar to the keyword selected as a component, keywords having a low similarity are also displayed on the hit keyword list display section 52 so that the researcher can select among them. Further, the information search program 14 can select one of the radio buttons so that the number of keywords displayed on the hit keyword list display section 52 falls within a certain range.
[0082]
When the investigator uses the keyboard 9 or the like to select a similarity criterion radio button on the association selection screen 50 and changes the similarity criterion, the CPU 8 reads the keywords extracted in S2 again and, by the pattern matching based on the new similarity criterion, sorts and displays them on the hit keyword list display section 52 and the non-hit keyword list display section 53. In this way, the investigator can change the similarity criterion initially set by the information search program 14.
[0083]
Subsequently, using the keyboard 9 or the like, the researcher selects, from the list of keywords extracted in S2 and displayed on the hit keyword list display section 52 and the non-hit keyword list display section 53, the keywords considered to have a synonymous relationship with the arbitrary keyword displayed on the target keyword display section 51. One selection method is to check a check box provided for each keyword displayed on the hit keyword list display section 52 and the non-hit keyword list display section 53. Normally, the keyword selected by the researcher as a component is not displayed in the hit keyword list display section 52; although it is not displayed, it is processed as if it had been selected by the investigator. If the keyword selected as a component is displayed as well, the researcher can confirm whether the selected component is present in the search target data. Upon detecting a check, the CPU 8 stores the association between the arbitrary keyword and the checked keyword. When a check is released, the CPU 8 deletes the association between the arbitrary keyword and the keyword whose check was released. In the association selection screen 50 of FIG. 15, for example, “search key”, “search item”, and “characteristic portion” are selected for the arbitrary keyword “key item”, and their associations are stored.
[0084]
As described above, the researcher uses the keyboard 9 or the like to select keywords considered to have a relationship, such as a synonym relationship, with the arbitrary keyword, and then clicks the association execution button 54 on the association selection screen 50 to instruct that the other keywords be associated with the arbitrary keyword and stored. In response, the CPU 8 saves the association and displays the other keywords associated with the arbitrary keyword in the search condition display unit 48 on the search condition determination screen 40, in the area where the row of the arbitrary keyword intersects the column of the synonym item. As an example, the search condition display section 48 on the search condition determination screen 40 in FIG. 16 shows that the three keywords “search key”, “search item”, and “characteristic portion” are associated with the arbitrary keyword “key item”, and that the two keywords “procedure” and “order” are associated with the arbitrary keyword “order”. It also shows that no keyword is associated with the arbitrary keyword “search result”. When there is no associated keyword for an arbitrary keyword, the search is performed using only the phrase selected as a component, in this case the keyword “search result” (see below).
[0085]
The above is the description of each process in the association selection step S3. For each optional keyword displayed in the component column of the search condition display unit 48 in FIG. 16, the keywords displayed in the synonym item of that keyword's row are the keywords of the search target data that were associated with the optional keyword in S3. Hereinafter, the search step S4 will be described.
[0086]
Also in this search step S4, the following processing is performed by the CPU 8 interpreting and executing the information search program 14. The search step S4 consists of the processes S41 to S43, each of which is described below with reference to the flowchart of FIG. 6.
[0087]
First, on the search condition determination screen 40 of FIG. 16, the CPU 8 waits for the investigator's weighting instruction, which assigns a weight to each optional keyword displayed in the component column of the search condition display unit 48. The researcher selects a keyword by, for example, clicking it in the component column of the search condition display unit 48 using the keyboard 9 or the like, and then gives the weighting instruction by operating the weighting meter 45 or by inputting a numerical value from the keyboard 9 or the like. As shown in FIG. 16, the weighting meter 45 is configured so that a value within a specific range can be selected by moving the central weighting indicator to the left or right. On receiving the weighting instruction from the researcher, the CPU 8 displays the value indicated by the weighting meter 45 in the weight column of the corresponding keyword's row in the search condition display unit 48 (S41). As an example, when the investigator assigns a weight of “8” to the optional keyword “key item”, “6” to the optional keyword “order”, and “3” to the optional keyword “search result” on the search condition determination screen 40 of FIG. 16, the search condition determination screen 40 is in a state as shown in FIG. In addition, it is desirable, for the investigator's convenience, that the information search program 14 automatically set initial values for the keywords displayed in the component column of the search condition display unit 48 for which the investigator has not specified a weight.
[0088]
Here, the weighting step S41 by the investigator is included in S4 for convenience of description, but it may instead be performed at the end of S1, or as a separate step in S2 or S3. While the search condition determination screen 40 is displayed, the investigator can at any appropriate time save the various data displayed on it or read previously saved data. For example, when the researcher clicks the setting save button 46 on the search condition determination screen 40 using the keyboard 9 or the like, the CPU 8 saves the various data displayed on the search condition determination screen 40 to the hard disk 10. When the researcher clicks the setting read button 47 on the search condition determination screen 40 using the keyboard 9 or the like and specifies a setting file, the CPU 8 reads the setting file from the hard disk 10 into the memory 6 and displays its contents on the search condition determination screen 40.
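Saving and reading the settings could look roughly like the sketch below. The specification does not state the format of the setting file, so JSON, the function names `save_settings` and `load_settings`, and the field names are all assumptions made for illustration.

```python
import json
from pathlib import Path


def save_settings(path, conditions):
    """Write the displayed search conditions to a setting file (JSON assumed)."""
    text = json.dumps(conditions, ensure_ascii=False, indent=2)
    Path(path).write_text(text, encoding="utf-8")


def load_settings(path):
    """Read a previously saved setting file back into memory."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

A round trip through these two functions returns the same list of dictionaries, which is all the save and read buttons need in order to restore the screen state.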
[0089]
Next, the CPU 8 waits for the investigator's search start command, which instructs it to begin the similarity determination for each information block of the search target data based on the conditions in the search condition display unit 48. The researcher instructs the start of the search by, for example, clicking the search start button 49 on the search condition determination screen 40 using the keyboard 9 or the like. The CPU 8 receives the search start instruction from the researcher and extracts the search condition data displayed on the search condition display section 48. As an example, when a search start instruction is given by the researcher in the state of the search condition determination screen 40 in FIG. 17, search condition data 60 as shown in FIG. 18 is extracted. The search condition data 60 has a newly added management number column, which assigns a management number to make each record of the search condition data 60 unique.
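One way to picture the search condition data 60 is as a list of records, each carrying a management number, a component, its associated synonyms, and a weight. The sketch below is an assumption about the layout: the class name `SearchCondition`, the field names, and sequential numbering from 1 are hypothetical, while the keywords and weights are the ones mentioned in the description.

```python
from dataclasses import dataclass, field


@dataclass
class SearchCondition:
    management_number: int          # makes each record unique
    component: str                  # keyword selected from the reference data
    synonyms: list = field(default_factory=list)  # associated target keywords
    weight: int = 1                 # hypothetical default weight


def build_search_conditions(rows):
    """Number each (component, synonyms, weight) row sequentially from 1."""
    return [
        SearchCondition(number, component, list(synonyms), weight)
        for number, (component, synonyms, weight) in enumerate(rows, start=1)
    ]


# Rows follow the example in the description (synonyms as given in the text).
conditions = build_search_conditions([
    ("key item", ["search key", "search item", "characteristic portion"], 8),
    ("order", ["procedure", "order"], 6),
    ("search result", [], 3),
])
```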
[0090]
Subsequently, the CPU 8 performs a similarity determination for each information block of the search target data based on the search condition data 60 (S42). The similarity determination works as follows. Let N be the number of synonyms for a specific component in the search condition data 60. If a search of a specific information block of the search target data (hereinafter referred to as a “specific information block”) using those N keywords results in M hits, the score of the specific information block for that component is recorded as “(M / N) * weight”. The same search and recording are performed for the other components “order” and “search result”. Finally, the scores for each component are totaled for each specific information block, and the total is set as the similarity score of the specific information block.
[0091]
For example, in the search condition data 60 shown in FIG. 18, in the row where the component is “key item”, the number of synonyms is three (“search key”, “search item”, and “characteristic part”), and adding the “key item” selected as a component gives N = 4. Suppose that a certain information block of the search target data is searched using “search key”, “search item”, “characteristic part”, and “key item” as search keywords, and that all four keywords hit, so that M = 4. From the search condition data 60 shown in FIG. 18, the weight of “key item” is 8. Substituting these values into the expression “(M / N) * weight” gives “(4/4) * 8”, so the score of this specific information block for the row whose component is “key item” is “8”. The score is calculated in the same way for the rows whose components are “order” and “search result”. The total score (similarity score) of the specific information block is then calculated by summing the scores calculated for the rows of the three components. Here the scores calculated for each row are simply totaled, but the total score can also be modified, for example by taking the product of the scores calculated for the rows of the three components. However, since similarity is being calculated for texts such as patent specifications and technical documents, which can express the same technology with a wide range of synonyms, there is little point in fixing the score calculation method strictly, and various calculation methods can be adopted.
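The “(M / N) * weight” calculation and the summation over components can be sketched as follows. This is a minimal illustration under assumptions: M is taken to be the number of the N keywords that occur at least once in the block, a hit is tested by plain substring search, and the function names and the sample block text are hypothetical.

```python
def component_score(block_text, component, synonyms, weight):
    """Score one component for one information block as (M / N) * weight."""
    keywords = list(synonyms) + [component]   # N includes the component itself
    n = len(keywords)
    # M: how many of the N keywords appear in the block (assumed meaning of a hit).
    m = sum(1 for keyword in keywords if keyword in block_text)
    return (m / n) * weight


def similarity_score(block_text, conditions):
    """Total the component scores to obtain the block's similarity score."""
    return sum(
        component_score(block_text, component, synonyms, weight)
        for component, synonyms, weight in conditions
    )


# Hypothetical information block and conditions modeled on the example.
block = ("A search key and a search item form the characteristic part; "
         "the key item fixes the procedure.")
conditions = [
    ("key item", ["search key", "search item", "characteristic part"], 8),
    ("order", ["procedure"], 6),
    ("search result", [], 3),
]
key_item_score = component_score(block, *conditions[0])
total = similarity_score(block, conditions)
```

For the “key item” row all four keywords occur in the sample block, so the component score reproduces the (4/4) * 8 = 8 of the worked example; the other two rows contribute (1/2) * 6 and (0/1) * 3.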
[0092]
In step S42, N may instead be set to (N-1), that is, the number of synonyms for a specific component in the search condition data 60 excluding the component itself. For example, in the search condition data 60 shown in FIG. 18, in the row where the component is “key item”, the number of synonyms is three (“search key”, “search item”, and “characteristic part”), so N = 3, and S42 is performed using the three keywords selected by the researcher as search keywords. By not using the component itself as a keyword in this way, the similarity determination can be performed using only keywords found in the search target data.
[0093]
After calculating the similarity score for each information block of the search target data based on the search condition data 60 in this way, the CPU 8 sorts the information blocks of the search target data in descending order of similarity score and displays them in a list on the display 7 (S43). As an example of this, there is a display method as shown in FIG. In the information block similarity order list 70 of the search information data in FIG. 19, the value of the “document number” item is the initial display order of the information block, and the value of the “score order” item is the rank of the information block by total score. For example, the row of the information block whose “document number” is “373” shows that this block was initially displayed on the 373rd row and that its total score was the highest among all information blocks. The same row also shows that the “publication number” is “H03-1234”, the “publication date” is “H03/10/10”, and the “title of the invention” is “A system”. Further, the “link” item holds identification information for the individual information block data, such as a path and a file name. Thus, when the researcher uses the keyboard 9 or the like to select an individual information block on the information block similarity order list 70, the CPU 8 can immediately display that information block on the display 7.
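The ordering step S43 amounts to a descending sort on the similarity score while retaining each block's original document number. The sketch below assumes a simple list of (document number, score) pairs; the function name `rank_blocks` and the sample scores are hypothetical.

```python
def rank_blocks(scored_blocks):
    """Return (score_order, document_number, score) rows, highest score first.

    scored_blocks: (document_number, score) pairs in initial display order.
    Python's sort is stable, so blocks with equal scores keep that order.
    """
    ordered = sorted(scored_blocks, key=lambda item: item[1], reverse=True)
    return [
        (rank, document_number, score)
        for rank, (document_number, score) in enumerate(ordered, start=1)
    ]


# Hypothetical scores; document 373 is given the highest total as in the example.
ranking = rank_blocks([(1, 3.0), (2, 8.0), (373, 11.0)])
```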
[0094]
Through S43 as described above, the researcher can recognize the information blocks of the search target data having a high similarity score with respect to the information blocks of the reference data in the order of the similarity score.
[0095]
As described above, according to the present embodiment, a plurality of keywords extracted from the search target data can be associated with each keyword extracted from the reference data, and the information blocks of the search target data can be searched based on those associated keywords. The target information block can therefore be reliably found and extracted. As a result, the investigator can view the contents of the search target data in units of information blocks, in the order in which they merit attention, and can greatly reduce the time required for the reading investigation.
[0096]
[Effect of the Invention]
Since the present invention is configured and functions as described above, keywords extracted from the search target data can be associated with each keyword extracted from the reference data, and the similarity can be determined for each information block of the search target data based on the associated keywords.
[0097]
Further, it is not necessary to register a search keyword in advance for each information block of the search target data, and all the search target data can be set as a keyword target.
[0098]
Furthermore, even when the keywords in the search target data have been registered without regard to unifying them, the researcher can appropriately select technical terms or the like having a relationship such as synonymy as search keywords.
[0099]
In addition, for each keyword extracted from the reference data, the similarity determination can be adjusted by assigning a weight to the keyword extracted from the associated search target data by the researcher.
[0100]
In view of the above points, more detailed investigations than before can be conducted, and a target information block can be reliably searched for and extracted from the search target data. An unprecedented, superior information search system can thus be provided.
[Brief description of the drawings]
FIG. 1 is a configuration diagram of an information search system according to an embodiment of the present invention.
FIG. 2 is a detailed diagram of a storage unit 2 in FIG.
FIG. 3 is a block diagram when the information search system of FIG. 1 is applied to a personal computer.
FIG. 4 is a flowchart illustrating a process of the information search system.
FIG. 5 is a detailed flowchart of S1 in FIG. 4;
FIG. 6 is a detailed flowchart of S4 in FIG. 4;
FIG. 7 is a screen diagram of a reference data input screen 30.
FIG. 8 is a data example of reference data.
FIG. 9 is a screen diagram of a reference data input screen 30.
FIG. 10 is a screen diagram of a reference data input screen 30.
FIG. 11 is a keyword list obtained by dividing reference data into keywords.
FIG. 12 is a screen diagram of a search condition determination screen 40.
FIG. 13 is a screen diagram of a search condition determination screen 40.
FIG. 14 is a keyword list diagram in which search target data is divided into keywords.
FIG. 15 is a screen diagram of an association selection screen 50.
FIG. 16 is a screen diagram of a search condition determination screen 40.
FIG. 17 is a screen diagram of a search condition determination screen 40.
FIG. 18 is a data example of search condition data 60.
FIG. 19 is a display example of an information block similarity order list 70 of search information data.
[Explanation of symbols]
1 Information search system
2 storage means
3 Input means
4 Display means
5 Computer
6 memory
7 Display
8 CPU
9 Keyboard, etc.
10 Hard disk
11 Reference data
11a Keywords for reference data
12 Search target data
12a Keywords for search target data
13 bus
14 Information Search Program
15 Information search function
21 Reference data
21a Information block of reference data
22 Search target data
22a Information block of search target data
30 Reference data input screen
31 Claim list display section
32 Selected claim display section
33 Reference data display
34 Reference data read button
35 Keyword extraction button
36 Reference data
37 List of split keywords
38 Keyword list of search target data
40 Search condition determination screen
41 Keyword re-extraction button
42 Claim data display section
43 Keyword list display section
44 Synonym Button
45 Weighting meter
46 Setting save button
47 Setting read button
48 Search condition display area
49 Search start button
50 Association selection screen
51 Target keyword list display area
52 Hit keyword list display area
53 Non-hit keyword list display area
54 Association execution button
60 Search condition data
70 Information block similarity order list of search information data

Claims

Storage means for storing reference data including an information block including one or more keywords and search target data including a plurality of information blocks including one or more keywords;
Input means for inputting the association between the keyword constituting the reference data and the keyword constituting the search target data,
A computer that executes a process of searching for a similar information block from the search target data based on the information block of the reference data,
And a display means for displaying the search result.
To the computer,
A first extraction step of extracting a keyword constituting an information block of the reference data from the reference data stored in the storage unit;
Extracting a keyword constituting a plurality of information blocks of the search target data from the search target data stored in the storage unit;
An association selection step of displaying, on the display means, one or more keywords extracted from the reference data and one or more keywords extracted from the search target data, and allowing the investigator to select, for each displayed keyword of the reference data, an association with keywords of the search target data based on a synonymous relationship or the like;
A search step of searching for the similar information block from the search target data based on the associated keyword of the search target data.

The information search program according to claim 1,
The search step includes:
An information search program characterized by searching for the similar information block from the search target data based on a keyword extracted from the reference data and a keyword extracted from the search target data associated with each such keyword.

The information search program according to claim 2,
The first extracting step includes:
An information search program characterized by including a step of accepting, from the input unit, that a researcher edits a keyword extracted from the reference data.

The information search program according to claim 2,
The first extracting step includes:
An information search program characterized by including a step of receiving from an input unit that a researcher discards a keyword extracted from the reference data.

The information search program according to claim 1,
Said computer,
A step of receiving an association with a weight representing the importance in search processing from a researcher for each keyword extracted from the reference data,
An information search program characterized in that the search step executes a step of searching for the similar information block from the search target data based on the keyword extracted from the search target data and the weight associated with each keyword of the reference data.

The information search program according to claim 1, wherein:
The search step includes:
When a similar information block is searched from the search target data, including a step of giving information indicating the degree of similarity for each information block,
To the computer,
An information search program for executing a step of displaying the search results in the order of the similarity.

The information search program according to claim 1, wherein
The information block of the reference data is specific patent application information,
An information search program characterized in that a plurality of information blocks of the search target data are a plurality of patent or other application information or scientific and technical literature information.

The information search program according to claim 7,
An information search program wherein the information block of the reference data is one or more claims of a specific patent application.

The information search program according to claim 8,
If the information block of the reference data, which is one or more claims of a particular patent application, is a dependent claim,
The first extracting step includes:
An information search program characterized by including a step of adding an independent term cited by the dependent claim to an information block of the reference data.

A recording medium on which the information search program according to claim 1 is recorded.