JP3929418B2 - Information search program and medium on which information search program is recorded

Info

Publication number: JP3929418B2
Application number: JP2003140555A
Authority: JP
Inventors: 和信五十嵐
Original assignee: アルトリサーチ株式会社
Priority date: 2003-05-19
Filing date: 2003-05-19
Publication date: 2007-06-13
Anticipated expiration: 2023-05-19
Also published as: JP2004342016A

Description

【０００１】
【発明の属する技術分野】
本発明は、目的とする資料等を調査する為のシステムに係り、特に産業財産権等に関する異議申し立て資料の調査や、先願調査等のように、実際に文書等の内容の読み込みを調査者が行う必要のある調査を補助し、調査作業を容易にするシステムに関する。
【０００２】
【従来の技術】
従来、特許情報等を機械検索によって検索できる特許情報データベース検索システムが開発使用されている。特許情報データベースを例にとれば、特許出願毎に発生する特許情報をデータベースとしてホストコンピュータの記憶手段に記憶させることによって構築されている。
【０００３】
そして、このような特許情報データベースには、一般的な書誌的事項の他に、各種キーワード（例えばフリーキーワード、固定キーワード、Ｆターム等）がファイル毎に記憶され、これらを指定することにより、対応する特許情報が検索できるように構成されている。また、近年発行されているＣＤ−ＲＯＭ形式の特許公報等や、各メーカによって編集された特許情報に関するＣＤ−ＲＯＭ等（以下、これらを電子特許情報という）には、上記した書誌的事項の他に明細書、要約書及び図面も電子情報として記憶され、各種検索項目に基づいて対応する特許情報が検索できるように構成されている。
【０００４】
ここで、前述の特許情報データベースや電子特許情報を用いて無効資料や異議申立資料を調査しようとする場合、調査者が検索式を作成し、その検索式にヒットした内容の抄録を見て更に必要により公報全文を読んで、無効資料あるいは異議申立資料として利用できるかを判断している。この検索の結果、ヒット件数が少ない場合は直接公報に目を通すことが行われる。該当する公報が見つからない場合は、再度検索式を作り直してヒットした公報を読むことが繰り返される。また、これから出願しようとする発明に対する先願調査等も、同様の手順によって行われる。
【０００５】
ここで、少し詳しく特許等調査の過程について説明する。まず、調査者は本願の内容と特徴点を理解した後、前述の特許データベースや電子特許情報に記憶されている分類記号、キーワード等を組合わせて、目的とする技術が例えば数十件乃至百件程度の回答集合中に含まれるように検索式を作成する。勿論、検索の結果ヒットした件数が少なく、このヒットしたデータ集合中に所望の技術が網羅されていれば良いのであるが、僅かな件数に絞り込むと部分的に関連する類似技術の抽出や、他分野にある類似技術が抽出されないことも多い。このため、調査者は通常は数十件、多くても数百件の抽出を目安として、ヒットしたデータ集合中に、目的とする技術が含まれるように検索式を作成する。数百件を超える集合が作成された場合は、その後のスクリーニングに多大な時間を要するので、効率的な検索を行うことは難しい。しかし、必ずその中に目的とする類似技術が抽出されていると確信できる場合は、ヒットした全ての出願について抄録や、公報等を読んで、類似技術を探すこととなる。この作業は手作業で行われる。また、多くの公報等を読んだ後に類似技術が抽出できない場合は、再び検索式を作り変えて同様な操作を繰り返す。そして、いくら探しても類似技術が抽出できないことが確認できるまで、調査作業が続けられる。特に公知例、異議、無効資料調査にあっては、検索式を作成するために要する時間に加えて機械検索の結果を人が熟読して内容を更に吟味し、取捨する作業時間が加わることになる。そのため、調査が完了するまでに多くの時間を要し、調査コストも多大なものとなる。分類体系や調査の手法に習熟した調査の専門家であっても、機械検索の結果を吟味、処理するのに多大な時間を要しているのが調査作業の実情である。
【０００６】
ところで、従来より広く使用されている一般的な検索処理方法として、パターンマッチングがある。これは、検索に使用するキーワードや検索式等（以下「検索用キーワード」と略す。）と、文書・文献等の検索単位毎に登録されているキーワード（以下、「登録キーワード」と略す。）とを比較し、一致するか（完全に、あるいは一部分）により、当該検索単位を検索結果として抽出するか否かを判断する手法である。その為、検索用キーワードと登録キーワードとが、不一致（完全に、あるいは一部分）ではあるが同義語等（類義語等を含む）の関係にある場合、検索することができないという問題点がある。即ち、技術思想の表現には非常に多くの用語があるので、パターンマッチングによる機械検索処理では、目的とする類似技術を抽出するのは非常に困難である。
【０００７】
このパターンマッチングの例としては、「主キー」という文字列を検索用キーワードに指定した場合において、「主キー」、「キー」、「キー項目」等の登録キーワードが、「主キー」という文字列の少なくとも一部分を含むため、類似のキーワードとして判定されることになる。しかし、「メイン項目」や「検索項目」等の技術上の同義語等である登録キーワードについては、「主キー」という文字列を検索用キーワードに指定した場合、検索により抽出することができない。
【０００８】
また、前述の登録キーワードは、検索行為に使用される為、事前に文書・文献等の検索単位毎に登録されている必要があり、その登録の手間を要する。（以下、この手間を「キーワード登録の手間」と称す。）
【０００９】
そして、この登録キーワードの決定方法としては、文書・文献等の検索単位毎に特徴のあるキーワードを一部抽出したり、あるいは管理者が検索単位毎にキーワードをいくつか設定すること等が行われている。この為、一般的に登録キーワードは、文書・文献等の検索単位を構成する一部分、あるいは一側面しか表していないことが多い。（以下、この点を「登録キーワードの一部表現性」と称す。）この為、登録キーワード以外の検索単位を構成するキーワード等を、検索用キーワードに指定した場合に、前述の登録キーワードの一部表現性によって、当該検索単位は検索の結果抽出されないことになる。
【００１０】
更に、先の「主キー」の例のように、技術上の同義語等である登録キーワードを統一して使用すべき必要が生じる。（以下、この点を「登録キーワードの統一必要性」と称す。）この登録キーワードの統一必要性の例としては、先の「主キー」という用語を登録キーワードに使用すると定めた場合、「メイン項目」や「検索項目」という登録キーワードを使用できなくなることが挙げられる。
【００１１】
しかし、この登録キーワードの統一必要性が遵守されていない場合は、検索を行う調査者にとっては好ましいことではなく、特に産業財産権等の新規性や進歩性等を判断する際には、その問題点が一層顕在化することになる。
【００１２】
ここで、新規性とは、ある特定の発明と、それに対する一以上の引用する発明とを認定し、両者を比較することにより相違点が生じるかにより、ある特定の発明が新しいものであるか否かを判断する特許要件である。この新規性の判断において、従来は調査者が目視で内容を確認することにより同一の発明であると認定される（新規性なしと判定される）発明同士であっても、特定の発明と引用する発明とが、同義の他の技術用語等を登録キーワードに使用していた場合は、精度の高い検索が行えないことがあった。本出願の出願時において普及している国内特許データベースには、Ｆターム、国際特許分類、ＦＩ記号その他、技術内容がシソーラス化された記号として付与されている。しかし、新しい発明は旧来の技術体系に必ずしも分類できるわけではないので、キーワードと分類記号を組み合わせても、完全に的を絞って該当特許等をヒットさせることは難しい。更に、分類記号の少ない、論文、一般技術文献を調査対象とする場合は、少ない件数に絞り込んだデータ集合中から、適切な文献を機械検索によりヒットさせることが難しい。そのため、ある程度の件数のデータ集合を作成し、その中で該当する文献等があるか否かを調査者が検討する必要がある。
【００１３】
また、進歩性とは、ある特定の発明と、それに対する一以上の引用する発明とを認定し、両者を対比することによりそれぞれの発明を特定する為の事項の一致点及び相違点を明らかにし、その相違点が当業者容易であるか否かにより判定される特許要件である。この新規性の判断において、従来は調査者が目視で確認することにより相違点があまりないと認定される（進歩性なしと判定される）発明同士であっても、特定の発明と引用する発明とが、同義の他の技術用語等を登録キーワードに使用していた場合は、適切な検索抽出が行えないことがあった。
【００１４】
一般的に産業財産権等の明細書等の書類は、作成者が多数存在することもあり、先の「主キー」の例のように同一、同種の意味である技術用語を多種多様な用語により表現している。このように、登録キーワードの統一必要性が考慮されていない事により、従来の検索方法では同一の発明や、類似の発明を検索することが非常に難しいという不都合があった。
【００１５】
また、検索単位毎の登録キーワードが、検索単位を構成する一部分等である場合は、登録キーワードの一部表現性により、精度の高い調査を行うことが難しいという不都合があった。
【００１６】
以上のような理由から、引例資料等の調査対象資料の抽出を迅速かつ正確に行えるシステムの出現が望まれている。また、特定の特許情報等に基づく新規性や進歩性分析を自動的に、または簡便化できる装置の出現が望まれている。
【００１７】
これに関しては、【特許文献１】において、検索の対象となる複数の情報を、予め当該複数の情報に含まれるいくつかのキーワード群と対応付けて記憶しておき、当該検索対象情報のキーワードを集計して表示し、その中から調査者に選択されたキーワードに基づいて前記の検索の対象となる複数の情報から検索を行う方法が開示されている。
【００１８】
しかし、この方法では、事前に検索の対象となる複数の情報毎にキーワード群の対応付けを登録しなければならないという手間を要し、キーワード登録の手間を解決できていない。更に、複数の情報に含まれる特定のキーワード群を検索に使用するため、登録キーワードの一部表現性や統一必要性を解決できていない。
【００１９】
また、【特許文献２】においては、検索の対象となる文献単位毎に、所定のキーワード群への関連の有無を表現するパターンを作成し、調査者がパターンを指示することにより、当該指示されたパターンに類似のパターンを持つ文献単位を検索する方法が開示されている。
【００２０】
しかし、事前に所定のキーワード群を決定しなければならないことや、その所定のキーワード群を途中で変更した場合等に、過去のパターンを再利用するのが難しいという不都合がある。また、【特許文献１】と同様に、事前に検索の対象となる複数の情報毎にパターンの対応付けを登録しなければならないという手間を要し、キーワード登録の手間を解決できてはいない。更に、所定のキーワード群への関連の有無を表現するパターンを検索に使用するため、登録キーワードの一部表現性や統一必要性も解決できていない。
【００２１】
【特許文献１】
特開平9−73453号
【００２２】
【特許文献２】
特開昭61−182131号
【００２３】
【発明が解決しようとする課題】
このように、上記のような従来の検索方法にあっては、単に特定のキーワードや検索式を指定することにより、この検索式に該当するキーワード等を含む特許情報が抽出されるのみである。この為、新規性や進歩性等の判断は、抽出された特許情報の内容を調査者が実際に目視等で確認することによって行うのが一般的であり、抽出された特許情報（特許公報等）の件数が多い場合にその理解や把握に多大な手間を有する為、短時間かつ適切に新規性や進歩性等の判断を行うことが困難であるという不都合があった。
【００２４】
また、調査対象のデータ集合に応じて適切なキーワードや検索式を決定する行為は、調査者が行わなければならず、調査者の負担が重いという不都合があった。
【００２５】
更に、【特許文献１】及び【特許文献２】においても解決されていない課題として、以下の不都合が存在した。
【００２６】
まず、キーワード登録の手間を解決できていない為、調査対象のデータが発生する度に、キーワード登録を行わなければならないという不都合があった。
【００２７】
更に、登録キーワードの一部表現性を解決できていない為、調査漏れが発生しやすく、精度の高い調査を行い難いという不都合があった。
【００２８】
これに加え、調査対象のデータ集合において、登録キーワードの統一必要性が考慮されていない場合、同義語等の関係にある技術用語が複数使用されている為に、精度の高い調査を行い難いという不都合があった。
【００２９】
【発明の目的】
本発明は、かかる従来技術や【特許文献１】及び【特許文献２】の有する不都合を改善し、特に、キーワード登録の手間を解決し、登録キーワードを事前に登録する手間を要せずに、調査対象のデータ集合から検索を行うことのできる好適な手段の提供を目的とする。
【００３０】
また、登録キーワードの一部表現性を解決し、精度の高い調査を行う為に、調査対象のデータ集合に応じた適切な検索キーワードを調査者が選択することができる好適な手段の提供を目的とする。
【００３１】
更に、調査対象のデータ集合において、登録キーワードの統一必要性が考慮されていない場合においても、同義語等の関係にある技術用語等を、調査者が適切に検索キーワードとして選択することができる好適な手段を提供することを目的とする。
【００３２】
【課題を解決するための手段】
上記目的を達成する為、本発明に係るシステムは、一以上のキーワードからなる情報ブロックを含む基準データと、一以上のキーワードからなる情報ブロックを複数含む検索対象データとを記憶する記憶手段を備えている。また、当該システムは、前記基準データを構成するキーワードと、前記検索対象データを構成するキーワードとの関連付けを入力する入力手段を備えている。更に、当該システムは、前記基準データの情報ブロックを基にして前記検索対象データから類似の情報ブロックを検索する処理を実行するコンピュータを備えている。これに加え、当該システムは、当該検索結果を表示する表示手段を備えている。以上のようなシステムにおいて、本発明に係る情報探索プログラムは、前記コンピュータに、前記記憶手段に格納された前記基準データから当該基準データの情報ブロックを構成するキーワードを抽出する第１抽出ステップを実行させる。また、当該プログラムは、前記コンピュータに、前記記憶手段に格納された前記検索対象データから当該検索対象データの複数の情報ブロックを構成するキーワードを抽出するステップを実行させる。更に、当該プログラムは、前記基準データから抽出された一以上のキーワードと、前記検索対象データから抽出された一以上のキーワードとを前記表示手段に表示して、当該表示された基準データのキーワード毎に、前記検索対象データから抽出されたキーワードとの関連付けを調査者に選択させるステップを実行させる。これに加え、当該プログラムは、前記コンピュータに、当該関連付けされた，検索対象データから抽出されたキーワードに基づいて、当該キーワードを抽出した検索対象データより前記類似の情報ブロックを検索する検索ステップを実行させる。
【００３３】
本発明によると、前記基準データのキーワード毎に関連付けされた、前記検索対象データのキーワードに基づいて、前記検索対象データより前記類似の情報ブロックを検索する為、従来とは異なり、調査者がより精度の高い検索を行うことが可能になる。また、従来とは異なり、前記検索対象データの情報ブロック毎にキーワードを登録しておかなくても、調査者が検索を行うことが可能になる。更に、従来とは異なり、前記検索対象データに応じた適切な検索キーワードを、調査者が選択することができる。これに加え、従来とは異なり、基準データのキーワード毎に同義語等の関係にある、前記検索対象データのキーワードを、調査者が検索キーワードとして適切に選択できる。
【００３４】
また、他の発明に係る情報探索プログラムは、前記検索ステップが、前記基準データから抽出されたキーワードと、当該キーワード毎に関連付けされた検索対象データから抽出されたキーワードとに基づいて、前記検索対象データより前記類似の情報ブロックを検索することを特徴としている。
【００３５】
本発明によると、前記基準データのキーワードと、当該キーワード毎に関連付けされた前記検索対象データのキーワードとに基づいて、前記検索対象データより前記類似の情報ブロックを検索する為、従来とは異なり、調査者がより精度の高い検索を行うことが可能になる。
【００３６】
更に、他の発明に係る情報探索プログラムは、前記第１抽出ステップが、前記基準データから抽出されたキーワードを調査者が編集することを前記入力手段から受け付けるステップを含むことを特徴としている。
【００３７】
本発明によると、前記基準データのキーワードを調査者が編集することにより、
従来とは異なり、調査者がより精度の高い検索を行うことが可能になる。
【００３８】
また、他の発明に係る情報探索プログラムは、前記第１抽出ステップが、前記基準データから抽出されたキーワードを調査者が取捨することを前記入力手段から受け付けるステップを含むことを特徴としている。
【００３９】
本発明によると、前記基準データのキーワードを調査者が取捨することにより、
従来とは異なり、調査者がより精度の高い検索を行うことが可能になる。
【００４０】
更に、他の発明に係る情報探索プログラムは、前記コンピュータに、前記基準データから抽出されたキーワード毎に調査者から検索処理における重要度を表す重み付けとの関連付けを受け付けさせるステップを備えている。そして、前記検索ステップは、当該基準データのキーワード毎に関連付けされた、前記検索対象データから抽出されたキーワードと、前記重み付けとに基づいて、前記検索対象データより前記類似の情報ブロックを検索するステップを実行することを特徴としている。
【００４１】
本発明によると、前記基準データのキーワード毎に関連付けされた、前記検索対象データのキーワードと、前記重み付けとに基づいて、前記検索対象データより前記類似の情報ブロックを検索する為、従来とは異なり、調査者がより精度の高い検索を行うことが可能になる。また、従来とは異なり、検索結果が重み付けにより補正される為、より細やかな検索を行うことが可能になる。
【００４２】
また、他の発明に係る情報探索プログラムは、前記検索ステップが、前記検索対象データより類似の情報ブロックを検索した際に当該情報ブロック毎に類似度を示す情報を付与するステップを含んでいる。そして、当該プログラムが、前記コンピュータに、前記類似度順に前記検索結果を表示するステップを実行させることを特徴としている。
【００４３】
本発明によると、前記類似度順に前記検索結果が表示される為、調査者が容易に検索結果を理解することが可能になる。また、従来とは異なり、当該検索結果の個々の内訳に、調査者が容易にアクセスすることが可能になる。
【００４４】
更に、他の発明に係る情報探索プログラムは、前記基準データの情報ブロックが特定の特許等出願情報であり、前記検索対象データの複数の情報ブロックが複数の特許等出願情報または科学技術文献情報であることを特徴としている。
【００４５】
本発明によると、前記基準データの情報ブロックである特定の特許等出願情報に対して、前記検索対象データの複数の情報ブロックである複数の特許出願情報または科学技術文献から、類似の特許出願情報または科学技術文献を検索することが可能になる。これにより、従来とは異なり、調査者が費やす調査時間を短縮し、負担を軽減することが可能になる。
【００４６】
また、他の発明に係る情報探索プログラムは、前記基準データの情報ブロックが特定の特許出願の一以上の請求項であることを特徴としている。
【００４７】
本発明によると、前記基準データの情報ブロックである特定の特許等出願の一以上の請求項に対して、前記検索対象データの複数の情報ブロックである複数の特許出願情報または科学技術文献から、類似の特許出願情報または科学技術文献を検索することが可能になる。これにより、従来とは異なり、請求項毎に類似の特許出願情報または科学技術文献を検索することが可能になる。
【００４８】
更に、他の発明に係る情報探索プログラムは、特定の特許出願の一以上の請求項である前記基準データの情報ブロックが従属項である場合に、前記第１抽出ステップが、当該従属項が引用している独立項を前記基準データの情報ブロックに加えるステップを含むことを特徴としている。
【００４９】
本発明によると、特定の特許出願の一以上の請求項である前記基準データの情報ブロックが従属項である場合に、当該従属項が引用している独立項を前記基準データの情報ブロックに加えることが可能になる。これにより、従来とは異なり、前記基準データの情報ブロックが従属項の場合であっても、請求項毎に類似の特許出願情報または科学技術文献を検索することが可能になる。
【００５０】
これにより、前述の目的を達成しようとするものである。
【００５１】
【発明の実施の形態】
以下、本発明の一実施形態を図１及至図１９に基づいて説明する。
【００５２】
図１は、本発明の一実施形態である情報探索システムの構成図である。この情報探索システム１は図１に示すように、記憶手段２と、入力手段３と、表示手段４と、これらを制御するコンピュータ５とから成り立っている。これら各手段及びコンピュータ５は相互に接続されており、データの授受を行いながら協働することにより情報探索機能１５が実現される。また、記憶手段２は、検索処理の基準となる基準データ２１と、基準データの検索対象となる検索対象データ２２とを記憶している。
【００５３】
次に、図１における記憶手段２、及びそこに記憶されている基準データ２１と検索対象データ２２とを詳細に図示したものが、図２の記憶手段２の詳細図である。ここで、基準データ２１は、複数のキーワードからなる情報ブロック２１ａを含んでいる。また、検索対象データ２２は、複数のキーワードからなる情報ブロック２２ａを、複数含んでいる。また、検索対象データ２２はキーワード、分類記号等を検索式に使用することにより、調査者によって予め絞り込まれた検索データであるが、元々その件数が少ない場合は調査者による絞込みを行わないデータとする。
【００５４】
ここで、基準データ２１に含まれるキーワードと、検索対象データ２２に含まれるキーワードとは、一以上の語句や記号や数値等、あるいはその組み合わせから構成されている。この例として、「キー項目」や、「システム」や、「検索結果」等が挙げられる。また図２において、基準データ２１に含まれるキーワードａと、検索対象データ２２に含まれるキーワードａ´とは、同義語等（類義語等を含む）の関係にある。これと同様に、基準データ２１に含まれるキーワードｂと、検索対象データ２２に含まれるキーワードｂ´とは、同義語等の関係にある。ここでいう同義語等の関係とは、前述の「主キー」と、「メイン項目」や「検索項目」等との関係のように、キーワード同士が同義語等の関係にあることをいう。
【００５５】
更に、基準データの情報ブロック２１ａは、情報探索の基準となる情報単位を表している。また、基準データの情報ブロック２１ａの探索対象となる情報単位を表しているのが、検索対象データ２２に複数含まれる情報ブロック２２ａである。ここで、前述した情報探索機能１５は、コンピュータ５が、基準データ２１のキーワード又は／及びこれと同義語等の関係にある検索対象データ２２のキーワードを基にして、基準データの情報ブロック２１ａと類似の検索対象データの情報ブロック２２ａを判断する機能である。（後述）そして、この基準データの情報ブロック２１ａと、検索対象データの情報ブロック２２ａとは、文書や文献毎に区分されているのが望ましい。その場合、特定文献である基準データの情報ブロック２１ａに基づいて、特定文献の類似文献である、検索対象データの情報ブロック２２ａが、前述の情報探索機能１５により判断されることになる。
【００５６】
続いて、図１の情報探索システムを、パーソナルコンピュータに適用した場合におけるブロック図が図３である。ここで、図１における記憶手段２はメモリ６やハードディスク１０に、入力手段３はキーボード等９に、表示手段４はディスプレイ７に、コンピュータ５はＣＰＵ８に夫々対応している。そして、各構成要素はバス１３により接続されており、相互に通信を行えるように構成されている。また、図１における記憶手段２に記憶されていた基準データ２１や検索対象データ２２は、ファイルやデータベース等の形式により、基準データ１１や検索対象データ１２としてハードディスク１０に記憶されている。同様に、情報探索機能１５は、情報探索プログラム１４としてハードディスク１０に記憶されている。そして、図３におけるシステムは、ＣＰＵ８が情報探索プログラム１４を解釈することにより、計算処理を行い、各構成要素を制御する。また、ＣＰＵ８が情報探索プログラム１４を解釈・実行する際には、情報探索プログラム１４やデータ等を、ハードディスク１０からメモリ６に読み込む。
【００５７】
ここで、キーボード等９はパーソナルコンピュータに一般的に使用されているマウス等を含む。また、ディスプレイ７は、ＣＲＴディスプレイや液晶ディスプレイ等である。さらに、図示した基準データのキーワード１１ａは、基準データ１１に含まれるキーワードがＣＰＵ８により抽出され、メモリ６に読み込まれたことを概念的に表している。また、検索対象データのキーワード１２ａについても同様である。
【００５８】
次に、本実施形態における各構成要素の動作を図４のフローチャートに従って説明する。
【００５９】
図３の情報探索システムを使用して調査を行おうとする者（以下「調査者」と略す。）は、入力手段であるキーボード等９を使用し、調査開始のコマンドを入力すると、ＣＰＵ８はそのコマンドを解釈し、情報探索プログラム１４を起動する。
【００６０】
ここでＣＰＵ８が、起動された情報探索プログラム１４を解釈し、実行することにより、図４のフローチャートに図示された４つの処理が行われる。この４つの処理とはすなわち、基準データからキーワードを抽出する第１抽出ステップＳ１と、検索対象データからキーワードを抽出するステップＳ２と、基準データのキーワードと検索対象データのキーワードとの関連付けを選択させるステップＳ３と、関連付けられた検索対象データのキーワードに基づいて検索対象データの情報ブロックを検索するステップＳ４である。以上の４つの処理によって、情報探索機能１５が実現する。また、基準データからキーワードを抽出する第１抽出ステップＳ１と、検索対象データからキーワードを抽出するステップＳ２とを、予め別ステップで処理しておき、情報探索プログラム１４の実行開始によって、即座に基準データのキーワードと検索対象データのキーワードとの関連付けを選択させるステップＳ３から処理を進めることもできる。
【００６１】
以下にＳ１及至Ｓ４の、夫々のステップの概略を先に説明する。
【００６２】
まず、基準データからキーワードを抽出する第１抽出ステップＳ１とは、ハードディスク１０に格納された基準データ１１に含まれる情報ブロックを構成している、複数のキーワードを抽出するステップである。このステップにより抽出されたキーワードは、Ｓ３及びＳ４において使用される。以下、基準データからキーワードを抽出する第１抽出ステップＳ１を、「基準データからの第1抽出ステップ」と略す。
【００６３】
次に、検索対象データからキーワードを抽出するステップS２とは、ハードディスク１０に格納された、検索対象データ１２に含まれる複数の情報ブロックを構成している、複数のキーワードを抽出するステップである。このステップにより抽出されたキーワードは、Ｓ３及びＳ４において使用される。以下、検索対象データからキーワードを抽出するステップS２を、「検索対象データからの抽出ステップ」と略す。
【００６４】
続いて、基準データのキーワードと検索対象データのキーワードとの関連付けを選択させるステップS３とは、Ｓ１において基準データ１１から抽出されたキーワードと、Ｓ２において検索対象データ１２から抽出されたキーワードとの関連付けを、調査者に選択させるステップである。このステップにより選択された関連付けは、Ｓ４において使用される。以下、基準データのキーワードと検索対象データのキーワードとの関連付けを選択させるステップS３を、「関連付け選択ステップ」と略す。
【００６５】
続いて、関連付けられた検索対象データのキーワードに基づいて検索対象データの情報ブロックを検索するステップＳ４とは、Ｓ３において関連付けられた検索対象データ１２のキーワードに基づいて、基準データ１１の情報ブロックと類似の、検索対象データ１２の情報ブロックを検索するステップである。この検索ステップにより、検索対象データ１２の情報ブロック毎に、基準データ１１の情報ブロックに類似しているか否かが判定される。以下、関連付けられた検索対象データのキーワードに基づいて検索対象データの情報ブロックを検索するステップＳ４を、「検索ステップ」と略す。
【００６６】
以上が、Ｓ１及至Ｓ４の各ステップの概略であり、以下に夫々のステップを詳細に説明する。
【００６７】
まず、基準データからの第1抽出ステップＳ１においては、前述のＣＰＵ８が情報探索プログラム１４を解釈し、実行することにより、図５のフローチャートに図示されたＳ１１及至Ｓ１５の処理を行う。以下、図５のフローチャートに基づいて、Ｓ１１及至Ｓ１５の夫々の処理について説明する。
【００６８】
最初に、Ｓ１１においてＣＰＵ８は、ディスプレイ７に図７の基準データ入力画面３０を表示させ、調査者からの基準データ入力命令を待つ。ここで調査者は、キーボード等９を使用し、基準データ読込ボタン３４を通じて基準データ読込を指示する。更に、調査者は、パスやファイル名等の基準データを特定する情報もあわせて指定する。そうすると、ＣＰＵ８は調査者の基準データ読込の指示を受け、指定された基準データを、基準データ表示部３３に表示する。また、ＣＰＵ８は、当該基準データの分析を行い、「請求項」という文字や墨付括弧の記号により、基準データを請求項毎に分割し、請求項の一覧を請求項一覧表示部３１に表示する。この、調査者が入力する基準データの例としては、特許等の明細書がある。また、このような基準データの受付処理の例としては、調査者が図８に図示したデータを基準データとして指定した場合、基準データ入力画面３０は図９のような状態になる。
【００６９】
次に、ＣＰＵ８は、図９の入力画面を表示すると、請求項一覧表示部３１に表示した基準データの請求項のうち、どれを使用するかの選択指示である、調査者の請求項選択命令を待つ。ここで調査者は、キーボード等９を使用し、請求項一覧表示部３１の各請求項をダブルクリックすること等によって、一以上の請求項の選択を命令する。そうすると、ＣＰＵ８は調査者の請求項選択の指示を受け、指定された請求項のデータを、基準データ表示部３３から抽出して、選択請求項表示部３２に表示する。
【００７０】
この抽出・表示処理の際に、ＣＰＵ８により、先の調査者による請求項選択の指示により選択された、一以上の請求項の中に、従属項を含んでいるか否かの分岐判定Ｓ１２が行われる。ここで、従属項を含んでいるか否かは、基準データ表示部３３から抽出した請求項データにおいて、「請求項」の文字や「請求項１及至３」の文字のような、他の請求項を引用しているか否かにより判定する。このＳ１２において、従属項を含んでいるとＣＰＵ８が判定した場合、当該従属項が引用している請求項のデータも基準データ表示部３３から抽出し、あわせて選択請求項表示部３２に表示する（Ｓ１３）。またこのＳ１３において、従属項が引用している請求項が、更に他の請求項を引用している場合（他の請求項の従属項目である場合）、当該引用している請求項のデータも、あわせて選択請求項表示部３２に表示する。このような処理を、引用している請求項が独立項になるまで繰り返す。以上のような従属項判定Ｓ１２の例として、図９の状態の基準データ入力画面３０において、調査者が請求項一覧表示部３１から「請求項２」を指定した場合、基準データ入力画面３０は図１０の状態になり、請求項１と２のデータが反映される。このようにして、調査者の請求項選択の指示が選択請求項表示部３２に反映されることになる。この選択請求項表示部３２に表示されている請求項データは、前述の情報探索機能における情報探索の基準となる情報単位である、基準データの情報ブロックを表している。
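The dependent-claim handling of steps S12 and S13 above (a selected claim pulls in every claim it cites, repeated until an independent claim is reached) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names, the reference pattern, and the sample claim texts are all assumptions made for the example.

```python
import re

# Map full-width digits to ASCII so "請求項１" and "請求項1" are treated alike.
_FULLWIDTH = str.maketrans("０１２３４５６７８９", "0123456789")
# Assumed pattern: "請求項N", optionally followed by a range marker and a second number.
# "及至" is included because the source text uses that spelling for ranges.
_REF = re.compile(r"請求項(\d+)(?:\s*(?:乃至|及至|から|～|〜|-)\s*(\d+))?")


def cited_claims(claim_text):
    """Return the set of claim numbers that the given claim text cites."""
    refs = set()
    for match in _REF.finditer(claim_text.translate(_FULLWIDTH)):
        start = int(match.group(1))
        end = int(match.group(2)) if match.group(2) else start
        refs.update(range(start, end + 1))
    return refs


def with_cited_claims(selected, claims):
    """Return the selected claims plus every claim they cite, directly or indirectly."""
    result = set()
    pending = list(selected)
    while pending:
        number = pending.pop()
        if number in result or number not in claims:
            continue  # already handled, or a reference to a claim we do not have
        result.add(number)
        pending.extend(cited_claims(claims[number]))
    return sorted(result)


# Hypothetical claim texts (claim headers such as 【請求項１】 already stripped).
claims = {
    1: "キー項目の順序で検索結果を表示するシステム。",
    2: "前記キー項目を編集できる請求項１に記載のシステム。",
    3: "請求項１乃至２のいずれかに記載のシステムを記録した媒体。",
}
selected_from_2 = with_cited_claims([2], claims)
selected_from_3 = with_cited_claims([3], claims)
```

Selecting claim 2 yields claims 1 and 2, which mirrors the FIG. 10 example in the text, where choosing 「請求項２」 causes the data of claims 1 and 2 to be shown.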
【００７１】
そして、この図１０の基準データ入力画面３０において、調査者がキーボード等９を使用し、選択請求項表示部３２に表示された請求項データを編集することができる。すなわち、選択請求項表示部３２に表示された請求項データに、文字や記号等を付け加えたり、又は削除したりすることができる。
【００７２】
続いて、この図１０の基準データ入力画面３０において、調査者がキーボード等９を使用し、キーワード抽出ボタン３５をクリックすること等によって、基準データの情報ブロックからのキーワード抽出を命令する。そうすると、ＣＰＵ８は調査者のキーワード抽出の指示を受け、選択請求項表示部３２に表示された請求項データを、キーワードに分割する（Ｓ１４）。この分割の方法としては、ＣＰＵ８が請求項データに含まれる助詞や空白文字等毎に、キーワードに分割する方法がある。このようなキーワード分割の例として、図１０の入力画面における選択請求項表示部３２に表示された請求項データは、図１１に図示した分割キーワード一覧３７のようにキーワード分割される。
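Step S14 above splits the selected claim data into keywords at particles and whitespace. A rough sketch of that idea follows. The delimiter set here is an assumption chosen for the example (the source does not list the particles used), and a production system would more likely rely on a morphological analyzer, since single-character splitting can cut through ordinary words.

```python
import re

# Assumed delimiters: a few single-character particles, Japanese punctuation, whitespace.
_DELIMITERS = re.compile(r"[のをにはがでへ、。\s]+")


def split_keywords(text):
    """Split text at the assumed delimiters and return unique pieces in first-seen order."""
    keywords = []
    for piece in _DELIMITERS.split(text):
        if piece and piece not in keywords:
            keywords.append(piece)
    return keywords


# Hypothetical claim fragment built from keywords that appear in the description.
sample_keywords = split_keywords("キー項目の順序を検索結果に表示する")
```

For the sample fragment this produces 「キー項目」, 「順序」, 「検索結果」 and 「表示する」, from which the investigator would then pick the keywords that characterize the invention.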
【００７３】
そして、請求項データをキーワードに分割すると、ＣＰＵ８はディスプレイ７に図１２の探索条件決定画面４０を表示させ、請求項データを請求項データ表示部４２に、分割キーワード一覧３７をキーワード一覧表示部４３に表示する（Ｓ１５）。この図１２の探索条件決定画面４０において、調査者がキーボード等９を使用し、請求項データ表示部４２に表示された請求項データを編集することができる。すなわち、請求項データ表示部４２に表示された請求項データに、文字や記号等を付け加えたり、又は削除したりすることができる。ここで、請求項データ表示部４２に表示された請求項データを編集した場合、調査者がキーボード等９を使用し、キーワード再抽出ボタン４１をクリックすること等によって、請求項データ表示部４２からのキーワード再抽出、及び抽出したキーワード一覧をキーワード一覧表示部４３へ再表示できる。
【００７４】
このようにして、ＣＰＵ８はキーワード一覧表示部４３にキーワードの一覧を表示すると、当該キーワードのうち、使用するキーワードの指示である、調査者のキーワード選択命令を待つ。ここで調査者は、キーボード等９を使用し、キーワード一覧表示部４３の各キーワードをダブルクリックすること等により、一以上のキーワードの選択を命令する。そうすると、ＣＰＵ８は調査者のキーワード選択の指示を受け、指定されたキーワードを、キーワード一覧表示部４３から抽出して、検索条件表示部４８の構成要素の列に表示する。このように、調査者がキーワード一覧表示部４３から、例えば「キー項目」、「順序」、「検索結果」の各キーワードを順次クリックすることにより、図１３のように検索条件表示部４８の構成要素の列に選択したキーワードが表示される。この場合において調査者が選択するキーワードは、キーワード一覧表示部４３に表示された全てのキーワードではなく、発明の特徴を表すようなものを選ぶのが適当である。このように、キーワードの取捨及び編集は専ら調査者が決定するものであり、情報探索プログラム１４による自動的な決定がなされるのでは無い。
【００７５】
以上が、基準データからの第1抽出ステップＳ１における各処理の説明である。ここで、図１３において検索条件表示部４８の構成要素の列に表示されている各キーワードが、基準データからの第1抽出ステップＳ１で抽出されたキーワードである。以下、検索対象データからの抽出ステップＳ２について説明する。
【００７６】
この検索対象データからの抽出ステップＳ２においても、前述のＣＰＵ８が情報探索プログラム１４を解釈し、実行することにより以下の処理が行われる。まずＣＰＵ８は、キーボード等９を通じて調査者から、Ｓ１によって受け付けた基準データに対しての、検索対象となる検索対象データの入力を受け付ける。この検索対象データの例としては、複数件数の特許公報等があげられ、その場合には出願毎の特許公報等が検索対象データにおける個々の情報ブロックとなる。また、前述のように、検索対象データにおける個々の情報ブロックは、基準データの情報ブロックとの類似度が判定される情報単位である。
【００７７】
そして、ＣＰＵ８は、調査者により入力された検索対象データを、前述の基準データからの第1抽出ステップＳ１におけるＳ１４と同様に、キーワードに分割する。この分割の方法としては、ＣＰＵ８が検索対象データに含まれる助詞や空白文字等毎に、キーワードに分割する方法がある。以上のようにして、検索対象データからの抽出ステップＳ２において分割された、検索対象データのキーワード一覧３８を図１４に図示する。ここで、図１４において表示されているキーワードが、検索対象データからの抽出ステップＳ２で抽出されたキーワードとする。
【００７８】
以上が、検索対象データからの抽出ステップＳ２における各処理の説明である。この検索対象データからの抽出ステップＳ２においては、調査者は検索対象データを指定するのみであり、他の処理は情報探索プログラム１４に基づいて行われる。そのため、この検索対象データからの抽出ステップＳ２の後に、基準データからの第1抽出ステップＳ１を情報探索プログラム１４が行っても不都合は無い。つまり、基準データからの第1抽出ステップＳ１と、検索対象データからの抽出ステップＳ２とは、順不同である。以下、関連付け選択ステップS３について説明する。
【００７９】
この関連付け選択ステップS３においても、前述のＣＰＵ８が情報探索プログラム１４を解釈し、実行することにより以下の処理が行われる。ここで、Ｓ１とＳ２を終えると、図１３の探索条件決定画面４０がディスプレイ７に表示されており、ＣＰＵ８は調査者からの関連付け選択命令を待つ。ここで調査者は、キーボード等９を使用して、検索条件表示部４８の構成要素の列に表示された各キーワードのうち、任意のキーワード（以下、「任意キーワード」と略す。）を選択し、同義語ボタン４４をクリックすること等により、関連付け選択画面表示を指示する。これを検知すると、ＣＰＵ８は、当該任意キーワードデータを検索条件表示部４８から抽出し、情報探索プログラム１４が予め保持している関連付け選択画面５０をディスプレイ７に表示する。（後述）
【００８０】
この関連付け選択画面５０は、対象キーワード表示部５１と、ヒットキーワード一覧表示部５２と、ヒット外キーワード一覧表示部５３と、関連付け実行ボタン５４とを有している。ここで、対象キーワード表示部５１とは、先に調査者によって検索条件表示部４８の構成要素の列から選択された、任意キーワードを表示する部位である。図１５の関連付け選択画面５０においては、任意キーワードとして「キー項目」が選択されたことが分かる。また、ヒットキーワード一覧表示部５２とは、任意キーワードを検索キーとして、Ｓ２で抽出されたキーワードに対して検索を行った結果、ヒットしたキーワードの一覧を表示する部位である。更に、ヒット外キーワード一覧表示部５３とは、任意キーワードを検索キーとして、Ｓ２で抽出されたキーワードに対して検索を行った結果、ヒットしなかったキーワードの一覧を表示する部位である。このＳ２で抽出されたキーワードに対して行う検索方法は、前述の一般的なパターンマッチングが使用される。その結果、Ｓ２で抽出されたキーワードは各々、ヒットキーワード一覧表示部５２とヒット外キーワード一覧表示部５３との、どちらか一方に振り分けられ、表示される。また、この振り分け処理は、調査者が選択した類似度基準により判断される。図１５のように、関連付け選択画面５０は、調査者が類似度基準を選択できるように、ラジオボタンを複数備えている。図１５の関連付け選択画面５０においては、類似度が（弱）、（中）、（強）の３つのラジオボタンを備えており、調査者がどのラジオボタンを選ぶかにより類似度基準を選択できる。この類似度基準のキーワード振り分け例としては、「主キー」という任意キーに対して類似度基準を（強）に設定した場合、Ｓ２で抽出されたキーワードのうち、「主キー」という文字を完全に含むキーワードのみが、ヒットキーワード一覧表示部５２に表示されることが挙げられる。また同様に、「主キー」という任意キーに対して類似度基準を（弱）に設定した場合、Ｓ２で抽出されたキーワードのうち、「主キー」という文字の一部分（例えば「主」や「キー」）を含むキーワードが、ヒットキーワード一覧表示部５２に表示されることが挙げられる。
【００８１】
先の関連付け選択画面５０をディスプレイ７に表示すると、ＣＰＵ８は、先の抽出した任意キーワードデータを対象キーワード表示部５１に表示する。更にＣＰＵ８は、情報探索プログラム１４により初期設定されている類似度基準と、Ｓ２で抽出されたキーワードとを読み込み、当該類似度基準に基づいて先のパターンマッチングにより、Ｓ２で抽出されたキーワードを、ヒットキーワード一覧表示部５２とヒット外キーワード一覧表示部５３とに振り分けて表示する。このような類似度基準を設ける理由は、構成要素として選択したキーワードに類似するキーワードが検索対象データ中に多く含まれる場合は、類似度の高いキーワードのみをヒットキーワード一覧表示部５２に表示させて調査者が即座に必要とする検索用のキーワードを選択できるようにするためである。従って、構成要素として選択したキーワードに類似するキーワードが検索対象データ中に殆ど無い場合は、類似度の低いキーワードも含めてヒットキーワード一覧表示部５２に表示させ、調査者が選択できるようにしている。更に、ヒットキーワード一覧表示部５２に表示されるキーワードの件数が一定範囲内となるように、情報探索プログラム１４によりラジオボタンの１つを選択させることもできる。
【００８２】
また、調査者がキーボード等９を使用して、関連付け選択画面５０上の類似度基準選択ラジオボタンの選択を行い、当該類似度基準を変更すると、ＣＰＵ８は当該類似度基準に基づいて、再度先のＳ２で抽出されたキーワードを読み込み、先のパターンマッチングによりヒットキーワード一覧表示部５２とヒット外キーワード一覧表示部５３とに振り分けて表示する。このようにして、情報探索プログラム１４により初期設定されている類似度基準を、調査者が変更することができる。
【００８３】
続いて、調査者は、キーボード等９を使用して、ヒットキーワード一覧表示部５２とヒット外キーワード一覧表示部５３とに表示されたＳ２で抽出されたキーワードの一覧のうち、対象キーワード表示部５１に表示された任意キーワードと同義語等の関係にあると考えるキーワードを選択する。この選択方法としては、ヒットキーワード一覧表示部５２とヒット外キーワード一覧表示部５３とに表示された、各キーワード毎に備えられたチェックボックスにチェックすることにより選択する方法がある。通常、ヒットキーワード一覧表示部５２には調査者が構成要素として選択したキーワードは表示しない。調査者が構成要素として選択したキーワードは表示するまでもなく、調査者によって選択されたものとして処理が行われる。なお、調査者が構成要素として選択したキーワードも表示させると、検索対象データ中に選択した構成要素の有無を調査者が確認可能となる。このチェック付けを検知すると、ＣＰＵ８は任意キーワードと当該チェックされたキーワードとの関連付けを記憶する。また、チェック付けが解除されると、ＣＰＵ８は任意キーワードと当該チェックが解除されたキーワードとの関連付けを削除する。図１５の関連付け選択画面５０を例にとると、任意キーワードである「キー項目」に対して、「検索キー」と「検索項目」と「特徴部分」とがキーワード選択され、関連付けが記憶されている状態である。
【００８４】
このように、調査者がキーボード等９を使用して、任意キーワードと、同義語等の関係にあると考えるキーワードを選択し、関連付け選択画面５０上の関連付け実行ボタン５４をクリックすること等により、任意キーワードに対する他のキーワードの関連付け保存が行われる。これを受けてＣＰＵ８は、当該関連付けを保存し、当該任意キーワードに関連付けられた他のキーワードを、探索条件決定画面４０上の検索条件表示部４８の、当該任意キーワードの行であり、同義語項目の列と交わる領域に各々表示する。この例として、図１６の探索条件決定画面４０上の検索条件表示部４８においては、「キー項目」という任意キーワードに対して「検索キー」、「検索項目」、「特徴部分」という３つのキーワードが、「順序」という任意キーワードに対しては「手続」、「順番」という２つのキーワードが夫々関連付けられていることを示している。また同時に、図１６の探索条件決定画面４０上の検索条件表示部４８においては、「検索結果」という任意キーワードに対して、全くキーワードが関連付けられていないことも示している。このように、任意キーワードに対して関連付けキーワードの無い場合は、構成要素として選択された語句、即ちこの場合は「検索結果」のキーワードのみを用いて検索用データベースの検索が行われる。（後述）
【００８５】
以上が、関連付け選択ステップS３における各処理の説明である。ここで、図１６において検索条件表示部４８の構成要素の列に表示されている各任意キーワードに対し、当該任意キーワード毎の行の同義語項目に表示されているキーワードが、S３において当該任意キーワード毎に関連付けされた検索対象データのキーワードである。以下、検索ステップＳ４について説明する。
【００８６】
この検索ステップＳ４においても、前述のＣＰＵ８が情報探索プログラム１４を解釈し、実行することにより以下の処理が行われる。ここで、検索ステップＳ４は図６のフローチャートに図示されたＳ４１及至Ｓ４３の処理から成り立っている。以下、図６のフローチャートに基づいて、Ｓ４１及至Ｓ４３の夫々の処理について説明する。
【００８７】
まず、図１６の探索条件決定画面４０において、ＣＰＵ８は、検索条件表示部４８の構成要素の列に表示されている各任意キーワード毎に重み付けを付与する指示である、調査者の重み付け付与命令を待つ。ここで調査者は、キーボード等９を使用し、検索条件表示部４８の構成要素の列に表示されている各キーワードをクリックすること等により当該キーワードを選択した上で、重み付け付与メーター４５を操作するか或いはキーボード等９から数値を入力することにより、重み付け付与の指示を行う。ここで重み付け付与メーター４５は、図１６のように、中央の重み付け指示棒を左右に操作することにより、特定範囲の数値の中から、重み付け指示棒が指し示す値を選択できるように構成されている。この調査者からの重み付け付与の指示を受けると、ＣＰＵ８は、検索条件表示部４８の該当キーワードの行における重みの列に、重み付け付与メーター４５に示された値を表示する（Ｓ４１）。この例として、図１６の探索条件決定画面４０において、「キー項目」という任意キーワードに対して「８」が、「順序」という任意キーワードに対して「６」が、「検索結果」という任意キーワードに対して「３」が夫々調査者によって重み付け付与された場合、探索条件決定画面４０は図１７のような状態になる。また、調査者が特に重み付け指定していない、検索条件表示部４８の構成要素の列に表示されているキーワードについては、情報探索プログラム１４により初期設定値が自動設定されると、調査者にとって簡便であり望ましい。
【００８８】
ここで、調査者による重み付け付与ステップＳ４１は、説明の便宜上からＳ４に含めたが、Ｓ１の最後や、Ｓ２、Ｓ３の他のステップに存在していても良い。また調査者は、探索条件決定画面４０に表示されている時には適時、探索条件決定画面４０に表示されている各種データを保存することや、以前保存された当該各種データを読み込むことができる。例えば、調査者が探索条件決定画面４０上の設定保存ボタン４６をキーボード等９を使用し、クリックすることを受けて、ＣＰＵ８は、当該探索条件決定画面４０に表示されている各種データをハードディスク１０に保存する。また、調査者が探索条件決定画面４０上の設定読込ボタン４７をキーボード等９を使用してクリックし、設定ファイルを指定することを受けて、ＣＰＵ８は、当該設定ファイルをハードディスク１０からメモリ６に読み込み、探索条件決定画面４０に当該設定ファイルの内容を表示する。
【００８９】
次に、ＣＰＵ８は、検索条件表示部４８の条件に基づいて、検索対象データの情報ブロック毎に類似度判定を行う探索開始指示である、調査者の探索開始命令を待つ。ここで調査者は、キーボード等９を使用し、探索条件決定画面４０上の探索開始ボタン４９をクリックすること等によって、探索開始を指示する。そうすると、ＣＰＵ８は調査者からの探索開始指示を受け、検索条件表示部４８に表示されている検索条件データを抽出する。この例として、図１７の探索条件決定画面４０の状態において、調査者からの探索開始指示がなされると、図１８のような検索条件データ６０が抽出される。ここで、検索条件データ６０には新たに管理番号の列が存在するが、これは検索条件データ６０の各レコードをユニークにする為に管理番号を付したものである。
【００９０】
続いて、ＣＰＵ８は、検索条件データ６０に基づいて、検索対象データの情報ブロック毎に類似度判定を行う（Ｓ４２）。この類似度判定方法としては、検索条件データ６０において、ある特定の構成要素に対する同義語の数がＮで、検索対象データのある特定の情報ブロック（以下「特定情報ブロック」と称す。）に対して、Ｎ個のキーワードを使用した検索の結果Ｍ個ヒットした場合、その構成要素に対しての特定情報ブロックの得点は「（Ｍ／Ｎ）＊重み」として、特定情報ブロック毎に記録される。同様に、他の構成要素である「順序」及び「検索結果」についても同様の検索と記録を行う。最後に、各構成要素毎の得点を、特定情報ブロック毎に集計して、その特定情報ブロックの類似度得点とする。
【００９１】
例えば、第１８図に示す検索条件データ６０において、構成要素が「キー項目」の行では、同義語の数が３つ（「検索キー」、「検索項目」、「特徴部分」）であるのでＮの数は構成要素として選択した「キー項目」を加えてＮ＝４となる。これに対し、検索対象データのある特定の情報ブロックにおいて、「検索キー」と「検索項目」と「特徴部分」と「キー項目」とを検索キーワードとして、その情報ブロックを検索した場合「検索キー」と「検索項目」と「特徴部分」と「キー項目」の４つのキーワードが共に存在した場合は、Ｍ＝４であることが分かり、第１８図に示す検索条件データ６０より、構成要素が「キー項目」の重みは８であることが分かる。よって、この特定情報ブロックにおける構成要素が「キー項目」の行の得点は、「（Ｍ／Ｎ）＊重み」の式に代入すると「（４／４）＊８」となり、当該式の解として「８」であることが分かる。同様にして、構成要素が「順序」の行と「検索結果」の行とにおいても、当該特定情報ブロックにおいて得点を算出する。そして、以上の３つの構成要素の行毎に算出された得点を合計することにより、当該特定情報ブロックの合計得点（類似度得点）が算出される。なお、ここでは３つの構成要素の行毎に算出された得点を単純に合計しているが、３つの構成要件の行毎に算出された得点の積を求めて、合計得点を補正することができる。然るに、同じ技術を広範な同義語で表現できる特許明細書や技術文献といった文章を対象に類似度を求めるので、得点の算定方法を厳密に定める意味は殆ど無いため、種種の計算方法を採用できる。
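The scoring rule of step S42, "(M / N) * weight" summed over the constituent elements, and the ranking of step S43 can be sketched as below. The keywords and weights are taken from the FIG. 16 to FIG. 18 examples in the text. The function name, the data layout, the `include_element` flag, and the two sample information blocks are assumptions for illustration only, and substring containment stands in for whatever matching the actual program performs.

```python
def block_score(block_text, conditions, include_element=True):
    """Sum (M / N) * weight over all constituent elements for one information block.

    conditions maps each constituent element to (associated keywords, weight).
    With include_element=True the element itself counts toward N, as in the
    worked example where N = 4 for 「キー項目」.
    """
    total = 0.0
    for element, (synonyms, weight) in conditions.items():
        keywords = list(synonyms)
        if include_element:
            keywords.append(element)
        if not keywords:
            continue  # nothing to search with for this element
        n = len(keywords)
        m = sum(1 for keyword in keywords if keyword in block_text)
        total += (m / n) * weight
    return total


conditions = {
    "キー項目": (["検索キー", "検索項目", "特徴部分"], 8),
    "順序": (["手続", "順番"], 6),
    "検索結果": ([], 3),
}

# Hypothetical information blocks.
blocks = {
    "A": "検索キー、検索項目、特徴部分およびキー項目を用いる。",
    "B": "手続と順番に従い検索結果を表示する。",
}

scores = {name: block_score(text, conditions) for name, text in blocks.items()}
# Step S43: list the blocks in descending order of similarity score.
ranking = sorted(scores, key=scores.get, reverse=True)
```

Block A contains all four keywords of the 「キー項目」 row, so that row contributes (4 / 4) * 8 = 8, matching the worked example. Passing `include_element=False` corresponds to the variant in which the constituent element itself is left out of N.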
【００９２】
また、先のＳ４２において、検索条件データ６０におけるある特定の構成要素に対する同義語の数であるＮに、当該特定の構成要素を加えない数（Ｎ−１）をＮとしても良い。例えば、第１８図に示す検索条件データ６０において、構成要素が「キー項目」の行では、同義語の数が３つ（「検索キー」、「検索項目」、「特徴部分」）であるのでＮ＝３とし、調査者が選んだ３つのキーワードを検索キーワードとして、先のＳ４２を行う。このように、当該構成要素自体をキーワードとして採用しないことにより、検索対照データ中のキーワードのみによる類似度判定を行うことができる。
【００９３】
このようにして、ＣＰＵ８は、検索条件データ６０に基づいて、検索対象データの情報ブロック毎に類似度得点を算出すると、当該類似度得点の高い順に検索対象データの情報ブロックをソート処理し、ディスプレイ７に一覧表示する（Ｓ４３）。この例としては、図１９のような表示方式がある。この図１９の検索情報データの情報ブロック類似度順一覧７０においては、「文献番号」の項目の値がその情報ブロックの当初の表示順番を意味し、「得点順位」の項目の値がその情報ブロックの合計得点順番を意味している。例えば、「文献番号」が「３７３」の情報ブロックの行からは、当初は３７３番目の行に表示されていた情報ブロックであり、合計得点が他の情報ブロックに比べ一番高かったことが分かる。同様にして、「文献番号」が「３７３」の情報ブロックの行からは、「公開番号」が「Ｈ０３−１２３４」であることや、「公開日」が「Ｈ０３／１０／１０」であること、また「発明の名称」が「Ａシステム」であることが読み取れる。更に、「リンク」の項目にはパスとファイル名等の、個別の情報ブロックデータの識別情報を保持している。これにより、調査者がキーボード等９を使用し、情報ブロック類似度順一覧７０上の個別の情報ブロックを選択することで、即座にＣＰＵ８が、当該情報ブロックをディスプレイ７に表示することができる。
【００９４】
以上のようなＳ４３により、調査者は、基準データの情報ブロックに対して、類似度得点の高い検索対象データの情報ブロックを、類似度得点順に認知することができる。
【００９５】
以上述べてきたように本実施形態によれば、基準データから抽出したキーワード毎に、検索対照データから抽出したキーワードを複数関係づけることができ、当該関係付けられた検索対照データからのキーワードに基づいて、検索対照データの情報ブロックを検索できるので、もれなく確実に目的とする情報ブロックを検索及び抽出できる効果がある。そのため、調査者は、検索対照データに含まれる内容を情報ブロック単位毎に、注意を要するもの順に見ることができ、読込み調査の時間が大幅に短縮できる。
【００９６】
【発明の効果】
本発明は以上のように構成され機能するので、これによると、基準データから抽出されたキーワード毎に、検索対象データから抽出されたキーワードを関連付けることができ、当該関連付けられた検索対象データからのキーワードに基づいて、検索対象データの情報ブロック毎に類似度を判定することができる。
【００９７】
また、検索対象データの情報ブロック毎に、事前に検索用キーワードを登録する必要が無く、検索対象データ全てをキーワード対象とすることができる。
【００９８】
更に、検索対象データにおいて、登録キーワードの統一必要性が考慮されていない場合においても、同義語等の関係にある技術用語等を、調査者が適切に検索キーワードとして選択することができる。
【００９９】
これに加え、基準データから抽出されたキーワード毎に、関連付けられた検索対象データから抽出されたキーワードに対し、調査者が重み付けを付与することによって、類似度判定を調整することができる。
【０１００】
以上のような点により、従来よりも細やかな調査が行うことが可能になり、もれなく確実に検索対象データから目的とする情報ブロックを検索及び抽出できる効果があるという、従来にない優れた情報探索システムを提供することができる。
【図面の簡単な説明】
【図１】本発明の一実施形態である情報探索システムの構成図である。
【図２】図１における記憶手段２の詳細図である。
【図３】図１の情報探索システムを、パーソナルコンピュータに適用した場合におけるブロック図である。
【図４】情報探索システムの処理を図示したフローチャート図である。
【図５】図４におけるＳ１を詳細化したフローチャート図である。
【図６】図４におけるＳ４を詳細化したフローチャート図である。
【図７】基準データ入力画面３０の画面図である。
【図８】基準データのデータ例である。
【図９】基準データ入力画面３０の画面図である。
【図１０】基準データ入力画面３０の画面図である。
【図１１】基準データをキーワードに分割したキーワード一覧図である。
【図１２】探索条件決定画面４０の画面図である。
【図１３】探索条件決定画面４０の画面図である。
【図１４】検索対象データをキーワードに分割したキーワード一覧図である。
【図１５】関連付け選択画面５０の画面図である。
【図１６】探索条件決定画面４０の画面図である。
【図１７】探索条件決定画面４０の画面図である。
【図１８】検索条件データ６０のデータ例である。
【図１９】検索情報データの情報ブロック類似度順一覧７０の表示例である。
【符号の説明】
１情報探索システム
２記憶手段
３入力手段
４表示手段
５コンピュータ
６メモリ
７ディスプレイ
８ＣＰＵ
９キーボード等
１０ハードディスク
１１基準データ
１１ａ基準データのキーワード
１２検索対象データ
１２ａ検索対象データのキーワード
１３バス
１４情報探索プログラム
１５情報探索機能
２１基準データ
２１ａ基準データの情報ブロック
２２検索対象データ
２２ａ検索対象データの情報ブロック
３０基準データ入力画面
３１請求項一覧表示部
３２選択請求項表示部
３３基準データ表示部
３４基準データ読込ボタン
３５キーワード抽出ボタン
３６基準データ
３７分割キーワード一覧
３８検索対象データのキーワード一覧
４０探索条件決定画面
４１キーワード再抽出ボタン
４２請求項データ表示部
４３キーワード一覧表示部
４４同義語ボタン
４５重み付け付与メーター
４６設定保存ボタン
４７設定読込ボタン
４８検索条件表示部
４９探索開始ボタン
５０関連付け選択画面
５１対象キーワード一覧表示部
５２ヒットキーワード一覧表示部
５３ヒット外キーワード一覧表示部
５４関連付け実行ボタン
６０検索条件データ
７０検索情報データの情報ブロック類似度順一覧
[0001]
[Technical field to which the invention belongs]
The present invention relates to a system for searching for target materials and the like. In particular, it relates to a system that assists and eases searches in which the investigator must actually read the contents of documents, such as searches for opposition materials concerning industrial property rights and prior-application searches.
[0002]
[Prior art]
Conventionally, patent information database search systems that retrieve patent information and the like by machine search have been developed and used. A patent information database, for example, is built by storing the patent information generated for each patent application as a database in the storage means of a host computer.
[0003]
In addition to general bibliographic items, such a patent information database stores various keywords (for example, free keywords, fixed keywords, and F-terms) for each file, and the corresponding patent information can be retrieved by specifying them. Furthermore, patent gazettes issued in CD-ROM format in recent years and CD-ROMs of patent information edited by various manufacturers (hereinafter collectively referred to as “electronic patent information”) store, in addition to the bibliographic items described above, the specification, abstract, and drawings as electronic information, so that the corresponding patent information can be retrieved based on various search items.
[0004]
When the patent information database or electronic patent information described above is used to search for invalidation materials or opposition materials, the investigator creates a search formula, looks at the abstracts of the hits, reads the full gazette where necessary, and judges whether each hit can be used as invalidation or opposition material. When the search yields only a few hits, the investigator reads the gazettes directly. If no relevant gazette is found, the investigator rewrites the search formula and reads the newly hit gazettes, and this process is repeated. A prior-application search for an invention that is about to be filed follows the same procedure.
[0005]
The patent search process is now described in somewhat more detail. First, after understanding the content and characteristic features of the subject application, the investigator combines the classification symbols, keywords, and so on stored in the patent database or electronic patent information described above and creates a search formula so that the target technology falls within an answer set of, for example, several tens to about a hundred documents. It would of course be ideal if the search produced only a few hits and those hits covered all of the desired technology, but narrowing the set to a handful of documents often fails to extract partially related similar technology or similar technology in other fields. For this reason, the investigator usually creates a search formula that aims at several tens of hits, or several hundred at most, such that the target technology is included in the hit data set. When a set of more than several hundred documents is created, the subsequent screening takes a great deal of time, so an efficient search is difficult. Nevertheless, if the investigator is confident that the target similar technology is certainly in the set, the investigator reads the abstracts, gazettes, and so on of every hit application to look for it. This work is done by hand. If no similar technology is found after reading many gazettes, the search formula is rewritten and the same operation is repeated. The search continues until it can be confirmed that no similar technology can be found however much one looks. Particularly in searches for known examples, opposition materials, and invalidation materials, the time a person spends carefully reading the machine search results, examining their content further, and deciding what to keep or discard is added to the time needed to create the search formula. Completing a search therefore takes a long time, and its cost is high. Even for search experts who are well versed in classification systems and search methods, the reality is that examining and processing machine search results takes a great deal of time.
[0006]
Pattern matching has long been widely used as a general search processing method. In this method, the keywords, search formulas, and so on used for a search (hereinafter “search keywords”) are compared with the keywords registered for each search unit such as a document or publication (hereinafter “registered keywords”), and whether the search unit is extracted as a search result is decided by whether they match, completely or partially. Consequently, when a search keyword and a registered keyword do not match, completely or partially, but are synonyms or the like (including near-synonyms), the search unit cannot be retrieved. In other words, because a technical idea can be expressed in a very large number of terms, it is very difficult to extract the target similar technology by machine search based on pattern matching.
[0007]
As an example of pattern matching, when the character string “primary key” is specified as the search keyword, registered keywords such as “primary key”, “key”, and “key item” are judged to be similar keywords because they contain at least part of the character string “primary key”. However, registered keywords that are technical synonyms, such as “main item” and “search item”, cannot be extracted by a search in which the character string “primary key” is specified as the search keyword.
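The limitation described above can be illustrated with a small sketch. It assumes that "containing at least part of the character string" is approximated by sharing at least one whole word; the source does not define the matching rule precisely, and the function name is hypothetical.

```python
def partial_match(search_keyword: str, registered_keyword: str) -> bool:
    """Return True when the two keywords share at least one whole word (assumed rule)."""
    search_words = set(search_keyword.lower().split())
    registered_words = set(registered_keyword.lower().split())
    return bool(search_words & registered_words)


registered = ["primary key", "key", "key item", "main item", "search item"]
hits = [kw for kw in registered if partial_match("primary key", kw)]
misses = [kw for kw in registered if kw not in hits]
```

Under this rule “primary key”, “key”, and “key item” are hits, while the synonyms “main item” and “search item” are missed, which is the gap the text points out.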
[0008]
Furthermore, because registered keywords are used for searching, they must be registered in advance for each search unit such as a document or publication, and this registration takes effort. (Hereinafter, this effort is referred to as the “keyword registration effort”.)
[0009]
Registered keywords are typically determined by extracting some characteristic keywords from each search unit such as a document or publication, or by having an administrator set several keywords for each search unit. As a result, registered keywords generally represent only a part, or only one aspect, of the search unit. (Hereinafter, this point is referred to as the “partial expression of registered keywords”.) Consequently, when a keyword that appears in a search unit but is not one of its registered keywords is specified as the search keyword, the partial expression of registered keywords means that the search unit is not extracted by the search.
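A short sketch may make the “partial expression of registered keywords” concrete. The `SearchUnit` type, both functions, and the sample document are hypothetical and exist only for this illustration. The sketch contrasts a search limited to registered keywords with one that treats the whole text as the keyword source.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SearchUnit:
    """One document: its full text and the few keywords registered for it."""
    text: str
    registered_keywords: Tuple[str, ...]


def hit_by_registered_keywords(unit: SearchUnit, search_keyword: str) -> bool:
    # Conventional search: only the registered keywords are compared.
    return any(search_keyword in kw or kw in search_keyword
               for kw in unit.registered_keywords)


def hit_by_full_text(unit: SearchUnit, search_keyword: str) -> bool:
    # Alternative: every expression in the document is a candidate keyword.
    return search_keyword in unit.text


unit = SearchUnit(
    text="The system displays each search result in the order of its key item.",
    registered_keywords=("system", "search result"),
)
missed = not hit_by_registered_keywords(unit, "key item")
found = hit_by_full_text(unit, "key item")
```

Here “key item” occurs in the document but was never registered, so the registered-keyword search misses the document even though the full text contains the term.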
[0010]
Furthermore, as in the “primary key” example, registered keywords that are technical synonyms or the like must be used in a unified manner. (Hereinafter, this point is referred to as the “necessity of unification of registered keywords”.) For example, if the term “primary key” is adopted as a registered keyword, registered keywords such as “main item” and “search item” must not be used.
[0011]
However, if this necessity of unification of registered keywords is not observed, the result is undesirable for the investigator who performs the search, and the problem becomes especially apparent when judging the novelty or inventive step of industrial property rights and the like.
[0012]
Here, novelty is a patent requirement under which a specific invention and one or more cited inventions are identified and compared, and the specific invention is judged to be new or not according to whether any difference exists between them. In judging novelty, even among inventions that an investigator would conventionally recognize as identical by reading their contents (and thus judge to lack novelty), a high-accuracy search may not be possible when the cited invention uses, as registered keywords, technical terms that differ from those of the specific invention but have the same meaning. In the domestic patent databases in general use at the time of filing of the present application, F-terms, International Patent Classification symbols, FI symbols, and the like are assigned to the technical content as thesaurus symbols. However, because a new invention cannot always be classified within an existing technical system, it is difficult to hit exactly the right patents even when keywords and classification symbols are combined. Furthermore, when papers or general technical documents, to which few classification symbols are assigned, are to be investigated, it is difficult to hit the appropriate documents by machine search from a data set narrowed down to a small number of items. It is therefore necessary to create a data set of a certain size and have the investigator examine whether it contains relevant documents.
[0013]
Inventive step is a patent requirement under which a specific invention and one or more cited inventions are identified, the two are compared to clarify the points of agreement and difference between the matters specifying each invention, and a judgment is made according to whether those skilled in the art could easily arrive at the difference. In judging inventive step as well, even among inventions that an investigator would conventionally recognize as differing only slightly by reading their contents (and thus judge to lack an inventive step), appropriate search and extraction may not be possible when the cited invention uses, as registered keywords, other technical terms having the same meaning as those of the specific invention.
[0014]
In general, documents such as specifications for industrial property rights are written by a great many authors, and, as in the “primary key” example above, technical terms with the same or similar meanings are expressed in a wide variety of ways. Because the necessity of unifying registered keywords is thus not taken into account, it is very difficult to search for an identical or similar invention with the conventional search methods.
[0015]
Further, when the registered keywords for each search unit represent only a part of the search unit, there is the disadvantage that the partial expression of registered keywords makes a high-accuracy investigation difficult.
[0016]
For these reasons, there is a demand for a system that can quickly and accurately extract the materials targeted by an investigation, such as reference materials. There is also a demand for an apparatus that can automate or simplify novelty and inventive step analysis based on specific patent information.
[0017]
In this regard, [Patent Document 1] discloses a method in which a plurality of pieces of information to be searched are stored in advance in association with several keyword groups contained in that information, the keywords of the information to be searched are aggregated and displayed, and a search of the plurality of pieces of information is performed based on keywords selected by the investigator.
[0018]
However, this method requires the association with keyword groups to be registered in advance for each of the plurality of pieces of information to be searched, so the keyword registration effort is not eliminated. Furthermore, because only specific keyword groups contained in the plurality of pieces of information are used for the search, neither the partial expression of registered keywords nor the necessity of unification of registered keywords is resolved.
[0019]
[Patent Document 2] discloses a method in which, for each document unit to be searched, a pattern expressing the presence or absence of an association with a predetermined keyword group is created, and document units having a pattern similar to a pattern specified by the investigator are retrieved.
[0020]
However, this method is inconvenient in that the predetermined keyword group must be decided in advance, and past patterns are difficult to reuse if the predetermined keyword group is changed partway through. Further, as with [Patent Document 1], the pattern must be registered in advance for each of the plurality of pieces of information to be searched, so the keyword registration effort is not eliminated. Furthermore, because a pattern expressing the presence or absence of an association with a predetermined keyword group is used for the search, neither the partial expression of registered keywords nor the necessity of unification of registered keywords is resolved.
[0021]
[Patent Document 1]
JP-A-9-73453
[0022]
[Patent Document 2]
JP-A-61-182131
[0023]
[Problems to be solved by the invention]
As described above, conventional search methods simply extract patent information containing keywords and the like that correspond to a specified keyword or search expression. For this reason, novelty, inventive step, and the like are generally judged by the investigator actually reading the contents of the extracted patent information (patent publications and the like). Understanding and grasping that information takes a great deal of effort, and it is difficult to judge novelty and inventive step appropriately in a short time.
[0024]
In addition, appropriate keywords and search expressions suited to the data set under investigation must be determined by the investigator, which places a heavy burden on the investigator.
[0025]
Furthermore, the following inconveniences exist as problems that are not solved in [Patent Document 1] and [Patent Document 2].
[0026]
First, since the trouble of keyword registration cannot be solved, there is a disadvantage that the keyword registration must be performed every time data to be investigated is generated.
[0027]
Furthermore, because the partial expression of registered keywords is not resolved, omissions are likely to occur and a high-accuracy investigation is difficult to perform.
[0028]
In addition, when the necessity of unifying registered keywords has not been taken into account in the data set under investigation, multiple technical terms such as synonyms are in use, which makes a high-accuracy investigation difficult.
[0029]
[Object of the Invention]
The present invention remedies the inconveniences of the prior art and of [Patent Document 1] and [Patent Document 2]. In particular, its object is to eliminate the keyword registration effort and to provide a suitable means for searching a data set under investigation without requiring registered keywords to be registered in advance.
[0030]
A further object is to resolve the partial expression of registered keywords and, so that a high-accuracy investigation can be conducted, to provide a suitable means by which the investigator can select appropriate search keywords suited to the data set under investigation.
[0031]
A still further object is to provide a suitable means by which the investigator can appropriately select, as search keywords, technical terms related as synonyms or the like, even when the necessity of unifying registered keywords has not been taken into account in the data set under investigation.
[0032]
[Means for Solving the Problems]
In order to achieve the above objects, a system according to the present invention includes storage means that stores reference data containing an information block composed of one or more keywords, and search target data containing a plurality of information blocks each composed of one or more keywords. The system further includes input means for inputting an association between a keyword constituting the reference data and a keyword constituting the search target data; a computer that executes processing for searching the search target data for information blocks similar to the information block of the reference data; and display means for displaying the search results. In such a system, the information search program according to the present invention causes the computer to execute a first extraction step of extracting the keywords constituting the information block of the reference data from the reference data stored in the storage means. The program also causes the computer to execute a step of extracting the keywords constituting the plurality of information blocks of the search target data from the search target data stored in the storage means. The program further causes the computer to execute a step of displaying, on the display means, the one or more keywords extracted from the reference data and the one or more keywords extracted from the search target data, and of having the investigator select associations between each displayed keyword of the reference data and the keywords extracted from the search target data. In addition, the program causes the computer to execute a search step of searching the search target data from which the keywords were extracted for the similar information blocks, based on the associated keywords extracted from the search target data.
[0033]
According to the present invention, the similar information blocks are searched for in the search target data based on the keywords of the search target data associated with each keyword of the reference data, so that, unlike the conventional case, the investigator can perform a more accurate search. Also, unlike the conventional case, the investigator can perform a search without registering keywords for each information block of the search target data, and can select appropriate search keywords suited to the search target data. In addition, unlike the conventional case, the investigator can appropriately select, as search keywords, keywords of the search target data that are related as synonyms or the like to each keyword of the reference data.
[0034]
Further, an information search program according to another invention is characterized in that the search step searches the search target data for the similar information blocks based on the keywords extracted from the reference data and the keywords extracted from the search target data that are associated with each of those keywords.
[0035]
According to the present invention, the similar information blocks are searched for in the search target data based on both the keywords of the reference data and the keywords of the search target data associated with each of them, so that, unlike the conventional case, the investigator can perform a search with higher accuracy.
[0036]
Furthermore, an information search program according to another invention is characterized in that the first extraction step includes a step of accepting from the input means that the investigator edits a keyword extracted from the reference data.
[0037]
According to the present invention, because the investigator edits the keywords of the reference data, the investigator can, unlike the conventional case, perform a search with higher accuracy.
[0038]
An information search program according to another invention is characterized in that the first extraction step includes a step of accepting from the input means that an investigator discards a keyword extracted from the reference data.
[0039]
According to the present invention, because the investigator selects which keywords of the reference data to keep and which to discard, the investigator can, unlike the conventional case, perform a search with higher accuracy.
[0040]
Furthermore, an information search program according to another invention causes the computer to execute a step of accepting from the investigator, for each keyword extracted from the reference data, an association with a weight representing its importance in the search processing. The program is characterized in that the search step searches the search target data for the similar information blocks based on the keywords extracted from the search target data that are associated with each keyword of the reference data, and on the weights.
[0041]
According to the present invention, the similar information blocks are searched for in the search target data based on the keywords of the search target data associated with each keyword of the reference data and on the weights, so that, unlike the conventional case, the investigator can perform a more accurate search. Also, unlike the conventional case, the search results are adjusted by the weights, so that a more fine-grained search can be performed.
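One way such weighting could enter the search is sketched below. The scoring rule (add the weight of each reference keyword for which the block contains at least one associated keyword), the function name `weighted_score`, and the sample keyword “hit list” are assumptions for illustration; this passage does not state how the weights are applied.

```python
def weighted_score(block_keywords, associations, weights):
    """Score one information block of the search target data.

    associations: reference keyword -> associated search-target keywords
    weights:      reference keyword -> importance chosen by the investigator
    Assumed rule: a reference keyword contributes its weight once if any
    of its associated keywords occurs in the block.
    """
    score = 0.0
    for ref_keyword, target_keywords in associations.items():
        if any(kw in block_keywords for kw in target_keywords):
            score += weights.get(ref_keyword, 1.0)  # default weight assumed
    return score


associations = {
    "primary key": ["main item", "search item"],
    "search result": ["hit list"],  # hypothetical associated keyword
}
weights = {"primary key": 3.0, "search result": 1.0}

score_a = weighted_score({"main item", "database"}, associations, weights)
score_b = weighted_score({"hit list", "display"}, associations, weights)
score_c = weighted_score({"search item", "hit list"}, associations, weights)
```

With these sample weights, a block that matches the more important “primary key” concept scores higher than one that matches only “search result”.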
[0042]
An information search program according to another invention is characterized in that the search step includes a step of assigning, when similar information blocks are searched for in the search target data, information indicating a degree of similarity to each information block, and in that the program causes the computer to execute a step of displaying the search results in order of similarity.
[0043]
According to the present invention, since the search results are displayed in the order of similarity, the investigator can easily understand the search results. Also, unlike the conventional case, the researcher can easily access the individual breakdown of the search results.
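Ordering results by similarity can be sketched as follows. The similarity measure used here (the fraction of the selected search keywords found in a block) and the names `rank_by_similarity`, `doc1` to `doc3`, and “hit list” are assumptions for illustration, not the measure defined by the invention.

```python
def rank_by_similarity(blocks, search_keywords):
    """Return (block id, similarity) pairs, most similar first.

    blocks:          block id -> set of keywords in that information block
    search_keywords: keywords of the search target data chosen for the search
    Similarity is assumed to be the share of search keywords the block contains.
    """
    results = []
    for block_id, keywords in blocks.items():
        if search_keywords:
            similarity = len(keywords & search_keywords) / len(search_keywords)
        else:
            similarity = 0.0
        results.append((block_id, similarity))
    # sorted() is stable, so blocks with equal similarity keep input order.
    return sorted(results, key=lambda item: item[1], reverse=True)


blocks = {
    "doc1": {"main item"},
    "doc2": {"main item", "sort", "hit list"},
    "doc3": {"display"},
}
ranked = rank_by_similarity(blocks, {"main item", "sort", "hit list"})
ranked_ids = [block_id for block_id, _ in ranked]
```

Displaying `ranked` from top to bottom gives the investigator the most similar information blocks first.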
[0044]
Furthermore, in an information search program according to another invention, the information block of the reference data is specific patent application information, and the plurality of information blocks of the search target data are a plurality of patent application information or scientific and technical literature information. It is characterized by being.
[0045]
According to the present invention, for specific patent application information serving as the information block of the reference data, similar patent application information or scientific and technical literature can be searched for among the plurality of pieces of patent application information or scientific and technical literature information serving as the plurality of information blocks of the search target data. As a result, unlike the conventional case, the time the investigator spends on the investigation and the burden on the investigator can be reduced.
[0046]
An information search program according to another invention is characterized in that the information block of the reference data is one or more claims of a specific patent application.
[0047]
According to the present invention, for one or more claims of a specific patent application that is an information block of the reference data, from a plurality of patent application information or scientific and technical documents that are a plurality of information blocks of the search target data, It is possible to search for similar patent application information or scientific and technical literature. This makes it possible to search for similar patent application information or scientific and technical literature for each claim, unlike the prior art.
[0048]
Furthermore, an information search program according to another invention is characterized in that, when the information block of the reference data, which is one or more claims of a specific patent application, is a dependent claim, the first extraction step includes a step of adding the independent claim cited by the dependent claim to the information block of the reference data.
[0049]
According to the present invention, when the information block of the reference data, which is one or more claims of a specific patent application, is a dependent claim, the independent claim cited by the dependent claim can be added to the information block of the reference data. Thus, unlike the conventional case, similar patent application information or scientific and technical literature can be searched for claim by claim even when the information block of the reference data is a dependent claim.
[0050]
As a result, the above-described object is achieved.
[0051]
[Detailed Description of the Invention]
Hereinafter, an embodiment of the present invention will be described with reference to FIGS.
[0052]
FIG. 1 is a configuration diagram of an information search system according to an embodiment of the present invention. As shown in FIG. 1, the information search system 1 comprises storage means 2, input means 3, display means 4, and a computer 5 that controls them. These means and the computer 5 are connected to one another and cooperate while exchanging data, thereby realizing an information search function 15. The storage means 2 stores reference data 21, which serves as the reference for search processing, and search target data 22, which is searched with respect to the reference data.
[0053]
Next, the storage means 2 of FIG. 1 and the reference data 21 and search target data 22 stored in it are shown in detail in FIG. 2. Here, the reference data 21 includes an information block 21a composed of a plurality of keywords, and the search target data 22 includes a plurality of information blocks 22a each composed of a plurality of keywords. The search target data 22 is assumed to be data that the investigator has narrowed down in advance using a search expression containing keywords, classification symbols, and the like.
[0054]
Here, the keywords included in the reference data 21 and in the search target data 22 each consist of one or more words, symbols, numerical values, or combinations of these. Examples include “key item”, “system”, and “search result”. In FIG. 2, the keyword a included in the reference data 21 and the keyword a′ included in the search target data 22 are related as synonyms or the like (including near-synonyms). Similarly, the keyword b included in the reference data 21 and the keyword b′ included in the search target data 22 are related as synonyms or the like. A relationship of synonyms or the like here means a relationship such as that between “primary key” and “main item” or “search item” described above.
[0055]
Further, the information block 21a of the reference data represents the unit of information that serves as the reference for the information search, and the plurality of information blocks 22a included in the search target data 22 represent the units of information to be searched with respect to the information block 21a. The information search function 15 described above is the function by which the computer 5 determines the information blocks 22a of the search target data that are similar to the information block 21a of the reference data, based on the keywords of the reference data 21 and/or the keywords of the search target data 22 that are related to them as synonyms or the like. The information block 21a of the reference data and the information blocks 22a of the search target data are preferably delimited document by document. In that case, based on the information block 21a of the reference data, which is a specific document, the information search function 15 determines the information blocks 22a of the search target data that are documents similar to that specific document.
[0056]
FIG. 3 is a block diagram of the information search system of FIG. 1 as applied to a personal computer. The storage means 2 of FIG. 1 corresponds to the memory 6 and the hard disk 10, the input means 3 to the keyboard 9 and the like, the display means 4 to the display 7, and the computer 5 to the CPU 8. The components are connected by a bus 13 so that they can communicate with one another. The reference data 21 and the search target data 22 stored in the storage means 2 in FIG. 1 are stored on the hard disk 10 as the reference data 11 and the search target data 12, in file or database format. Similarly, the information search function 15 is stored on the hard disk 10 as the information search program 14. In the system of FIG. 3, the CPU 8 interprets the information search program 14 to perform computation and to control each component. When the CPU 8 interprets and executes the information search program 14, the program and data are read from the hard disk 10 into the memory 6.
[0057]
Here, the keyboard 9 is taken to include a mouse and other input devices commonly used with a personal computer. The display 7 is a CRT display, a liquid crystal display, or the like. The reference data keyword 11a shown in the figure conceptually indicates that the keywords included in the reference data 11 have been extracted by the CPU 8 and read into the memory 6. The same applies to the keyword 12a of the search target data.
[0058]
Next, the operation of each component in this embodiment will be described with reference to the flowchart of FIG.
[0059]
When a person who intends to conduct an investigation using the information search system of FIG. 3 (hereinafter abbreviated as “investigator”) inputs a command to start an investigation using the keyboard 9 serving as the input means, the CPU 8 interprets the command and starts the information search program 14.
[0060]
Here, the CPU 8 interprets and executes the started information search program 14, whereby the four processes shown in the flowchart of FIG. 4 are performed: a first extraction step S1 of extracting keywords from the reference data; a step S2 of extracting keywords from the search target data; a step S3 of selecting associations between the keywords of the reference data and the keywords of the search target data; and a step S4 of searching for information blocks of the search target data based on the associated keywords of the search target data. The information search function 15 is realized by these four processes. It is also possible to perform the first extraction step S1 and the step S2 beforehand as separate processing, so that execution of the information search program 14 proceeds immediately from step S3, in which the associations between the keywords of the reference data and the keywords of the search target data are selected.
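The flow of S1 to S4 can be pictured with the following minimal sketch. All names (`extract_keywords`, `run_search`, `investigator_choice`), the word-level keyword extraction, and the sample texts are assumptions for illustration; in the system described here the association in S3 is chosen interactively by the investigator, which the callback merely stands in for.

```python
import re


def extract_keywords(text):
    """Stand-in for S1/S2: the lower-cased words of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def run_search(reference_text, target_blocks, choose_associations):
    # S1: extract keywords from the reference data.
    ref_keywords = extract_keywords(reference_text)
    # S2: extract keywords from every information block of the search target data.
    block_keywords = {bid: extract_keywords(t) for bid, t in target_blocks.items()}
    target_keywords = set().union(*block_keywords.values())
    # S3: the investigator associates reference keywords with target keywords.
    associations = choose_associations(ref_keywords, target_keywords)
    # S4: search the blocks using the associated target keywords.
    wanted = set().union(*associations.values())
    return [bid for bid, kws in block_keywords.items() if kws & wanted]


def investigator_choice(ref_keywords, target_keywords):
    """Hypothetical stand-in for the investigator's interactive selection."""
    picks = {"key": {"index"}, "sorts": {"ordered"}}
    return {r: t & target_keywords for r, t in picks.items() if r in ref_keywords}


hits = run_search(
    "A system that sorts records by key",
    {"D1": "Records are ordered by an index field", "D2": "A display shows images"},
    investigator_choice,
)
```

Because S1 and S2 depend only on their own inputs, their results could equally be computed beforehand and passed straight to S3, as the paragraph above notes.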
[0061]
Below, the outline of each step of S1 to S4 will be described first.
[0062]
First, the first extraction step S1 for extracting a keyword from reference data is a step of extracting a plurality of keywords constituting an information block included in the reference data 11 stored in the hard disk 10. The keywords extracted by this step are used in S3 and S4. Hereinafter, the first extraction step S1 for extracting a keyword from reference data is abbreviated as “first extraction step from reference data”.
[0063]
Next, the step S2 for extracting keywords from the search target data is a step of extracting a plurality of keywords constituting a plurality of information blocks included in the search target data 12 stored in the hard disk 10. The keywords extracted by this step are used in S3 and S4. Hereinafter, step S2 for extracting a keyword from search target data is abbreviated as “extraction step from search target data”.
[0064]
Subsequently, step S3 of selecting the associations between the keywords of the reference data and the keywords of the search target data is a step of having the investigator select associations between the keywords extracted from the reference data 11 in S1 and the keywords extracted from the search target data 12 in S2. The associations selected in this step are used in S4. Hereinafter, step S3 of selecting the associations between the keywords of the reference data and the keywords of the search target data is abbreviated as the “association selection step”.
[0065]
Subsequently, step S4 of searching for information blocks of the search target data based on the associated keywords of the search target data is a step of searching for information blocks of the search target data 12 that are similar to the information block of the reference data 11, based on the keywords of the search target data 12 associated in S3. Through this search step, it is determined whether each information block of the search target data 12 is similar to the information block of the reference data 11. Hereinafter, step S4 is abbreviated as the “search step”.
[0066]
The above is the outline of each step from S1 to S4, and each step will be described in detail below.
[0067]
First, in the first extraction step S1 from the reference data, the CPU 8 interprets and executes the information search program 14, thereby performing the processes S11 to S15 shown in the flowchart of FIG. 5. Each of the processes S11 to S15 is described below with reference to the flowchart of FIG. 5.
[0068]
First, in S11, the CPU 8 displays the reference data input screen 30 of FIG. 7 on the display 7 and waits for a reference data input command from the investigator. The investigator uses the keyboard 9 or the like to issue a command, via the reference data read button 34, to read the reference data, and also specifies information identifying the reference data, such as a path and file name. On receiving this command, the CPU 8 displays the designated reference data on the reference data display unit 33. The CPU 8 also analyzes the reference data, divides it into individual claims based on the characters “claim” and the symbols enclosed in lenticular brackets (【 】), and displays a list of the claims on the claim list display unit 31. An example of reference data input by the investigator is a specification of a patent or the like. For example, when the investigator designates the data shown in FIG. 8 as the reference data, the reference data input screen 30 enters the state shown in FIG. 9.
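The claim-splitting part of S11 could look roughly like the sketch below. The marker format `【請求項N】` is an assumption based on the usual layout of Japanese patent publications; the function name `split_claims` and the sample text are likewise hypothetical.

```python
import re

# Assumed marker: the word for "claim" plus a number, inside lenticular brackets.
CLAIM_MARK = re.compile(r"【請求項([0-9０-９]+)】")


def split_claims(text):
    """Divide claim text into {claim number: claim body}."""
    # With one capture group, split() returns
    # [preamble, number1, body1, number2, body2, ...].
    parts = CLAIM_MARK.split(text)
    claims = {}
    for number, body in zip(parts[1::2], parts[2::2]):
        claims[int(number)] = body.strip()  # int() also accepts full-width digits
    return claims


sample = (
    "【特許請求の範囲】\n"
    "【請求項１】A search device comprising storage means.\n"
    "【請求項２】The search device according to claim 1, further comprising a display.\n"
)
claims = split_claims(sample)
```

The keys of `claims` correspond to the entries that would be listed on the claim list display unit 31.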
[0069]
Next, having displayed the input screen of FIG. 9, the CPU 8 waits for the investigator's claim selection command, that is, an instruction as to which of the claims of the reference data displayed on the claim list display unit 31 are to be used. The investigator selects one or more claims by, for example, double-clicking each claim in the claim list display unit 31 using the keyboard 9 or the like. On receiving the command, the CPU 8 extracts the data of the designated claims from the reference data display unit 33 and displays it on the selected claim display unit 32.
[0070]
During this extraction and display processing, the CPU 8 performs a branch determination S12 as to whether the one or more claims selected by the investigator's claim selection command include a dependent claim. Whether a dependent claim is included is determined by whether the claim data extracted from the reference data display unit 33 cites another claim, for example through the characters “claim” or “claims 1 to 3”. If the CPU 8 determines in S12 that a dependent claim is included, the data of the claim cited by the dependent claim is also extracted from the reference data display unit 33 and displayed together on the selected claim display unit 32 (S13). In S13, when a claim cited by the dependent claim itself cites yet another claim (that is, when it is in turn a dependent claim), the data of that cited claim is likewise added to the selected claim display unit 32. This processing is repeated until the cited claim is an independent claim. As an example of the dependent claim determination S12, when the investigator designates “Claim 2” from the claim list display unit 31 on the reference data input screen 30 in the state of FIG. 9, the data of claims 1 and 2 is reflected as shown in FIG. 10. In this way, the investigator's claim selection command is reflected on the selected claim display unit 32. The claim data displayed there represents the information block of the reference data, which is the unit of information serving as the reference for the information search in the information search function described above.
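The repeated citation-following of S12 and S13 can be sketched as below. The citation pattern (English “claim N” or “claims N to M”), the function names, and the sample claims are assumptions for illustration; the actual program works on the wording of the original claims.

```python
import re

# Assumed citation wording: "claim 1" or "claims 1 to 3".
CITATION = re.compile(r"claims?\s+(\d+)(?:\s+to\s+(\d+))?", re.IGNORECASE)


def cited_claims(body):
    """S12: claim numbers cited in a claim body (empty for an independent claim)."""
    cited = set()
    for match in CITATION.finditer(body):
        first = int(match.group(1))
        last = int(match.group(2)) if match.group(2) else first
        cited.update(range(first, last + 1))
    return cited


def expand_selection(selected, claims):
    """S13: add every claim cited directly or indirectly by the selection."""
    result = set()
    pending = list(selected)
    while pending:
        number = pending.pop()
        if number in result or number not in claims:
            continue  # already handled, or cites a claim that does not exist
        result.add(number)
        pending.extend(cited_claims(claims[number]))
    return sorted(result)


claims = {
    1: "A search device comprising storage means.",
    2: "The search device according to claim 1, further comprising a display.",
    3: "The search device according to claim 2, wherein the display is a CRT.",
    4: "The search device according to claims 1 to 2, further comprising a mouse.",
}
selected_with_parents = expand_selection([3], claims)
```

Selecting claim 2 alone yields claims 1 and 2, which matches the FIG. 10 example in the text.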
[0071]
Then, on the reference data input screen 30 in FIG. 10, the investigator can edit the claim data displayed on the selected claim display section 32 by using the keyboard 9 or the like. In other words, characters, symbols, etc. can be added to or deleted from the claim data displayed on the selected claim display section 32.
[0072]
Subsequently, on the reference data input screen 30 of FIG. 10, the investigator clicks the keyword extraction button 35 using the keyboard 9 or the like to command keyword extraction from the information block of the reference data. On receiving this command, the CPU 8 divides the claim data displayed on the selected claim display unit 32 into keywords (S14). One method of division is for the CPU 8 to split the claim data into keywords at each particle or blank character it contains. As an example, the claim data displayed on the selected claim display unit 32 in the input screen of FIG. 10 is divided into the keywords shown in the divided keyword list 37 of FIG. 11.
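Splitting at particles and blank characters, as in S14, might be sketched as follows for Japanese claim text. The particle list is a deliberately tiny, hypothetical one, and splitting on single characters will cut through words that happen to contain them, so this only illustrates the idea and is not a usable segmenter.

```python
import re

# Hypothetical, minimal particle list (no, ni, wo, ga, wa, de, to).
PARTICLES = "のにをがはでと"
SPLIT_PATTERN = re.compile(rf"[\s{PARTICLES}]+")


def split_into_keywords(claim_text):
    """Split claim text into keywords at each particle or blank character."""
    return [piece for piece in SPLIT_PATTERN.split(claim_text) if piece]


# Sample: "display the search results in the order of the key item".
keywords = split_into_keywords("キー項目の順に検索結果を表示する")
# keywords gloss: key item / order / search result / display
```

The first three pieces correspond to the “key item”, “order”, and “search result” keywords used as examples elsewhere in this description.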
[0073]
When the claim data has been divided into keywords, the CPU 8 displays the search condition determination screen 40 of FIG. 12 on the display 7, with the claim data shown in the claim data display unit 42 and the divided keyword list 37 shown in the keyword list display unit 43 (S15). On the search condition determination screen 40 of FIG. 12, the investigator can edit the claim data displayed in the claim data display unit 42 using the keyboard 9 or the like; that is, characters, symbols, and the like can be added to or deleted from it. When the claim data has been edited, the investigator can click the keyword re-extraction button 41 using the keyboard 9 or the like to re-extract keywords from the claim data in the claim data display unit 42 and redisplay the extracted keyword list in the keyword list display unit 43.
[0074]
When the CPU 8 has displayed the list of keywords in the keyword list display unit 43 in this way, it waits for the investigator's keyword selection command, that is, an instruction as to which of the keywords are to be used. The investigator selects one or more keywords by, for example, double-clicking each keyword in the keyword list display unit 43 using the keyboard 9 or the like. On receiving the command, the CPU 8 extracts the designated keywords from the keyword list display unit 43 and displays them in the component column of the search condition display unit 48. For example, when the investigator clicks the keywords “key item”, “order”, and “search result” in turn in the keyword list display unit 43, the selected keywords are displayed in the component column of the search condition display unit 48 as shown in FIG. 13. The investigator should select not all of the keywords displayed in the keyword list display unit 43 but those that represent the features of the invention. In this way, keyword selection and editing are decided solely by the investigator and are not determined automatically by the information search program 14.
[0075]
The above is the description of each process in the first extraction step S1 from the reference data. Here, each keyword displayed in the component column of the search condition display unit 48 in FIG. 13 is the keyword extracted in the first extraction step S1 from the reference data. Hereinafter, the extraction step S2 from the search target data will be described.
[0076]
Also in the extraction step S2 from the search target data, the CPU 8 described above interprets and executes the information search program 14 to perform the following processing. First, the CPU 8 receives from the investigator, through the keyboard 9 or the like, input of the search target data to be searched against the reference data received in S1. An example of the search target data is a plurality of patent gazettes. In this case, the patent gazette for each application is an individual information block of the search target data. As described above, each information block of the search target data is the unit of information whose similarity to the information block of the reference data is determined.
[0077]
The CPU 8 then divides the search target data input by the investigator into keywords, in the same way as S14 in the first extraction step S1 from the reference data described above. One method of division is for the CPU 8 to split the search target data into keywords at each particle or blank character it contains. FIG. 14 shows the keyword list 38 of the search target data divided in this way in the extraction step S2. The keywords shown in FIG. 14 are taken to be the keywords extracted in the extraction step S2 from the search target data.
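The division at particles and blank characters can be illustrated with a minimal sketch. The particle list, the function name, and the removal of duplicate keywords are assumptions made for illustration; the description does not enumerate the particles or say how duplicates are handled.

```python
import re

# Hypothetical particle list; the description does not say which particles are used.
PARTICLES = ["の", "を", "に", "は", "が", "と", "で"]


def split_into_keywords(text, particles=PARTICLES):
    """Split text at each particle or run of blank characters.

    Empty pieces and repeated keywords are dropped (an assumption),
    and the first-seen order is kept.
    """
    pattern = "|".join([r"\s+"] + [re.escape(p) for p in particles])
    keywords = []
    for piece in re.split(pattern, text):
        if piece and piece not in keywords:
            keywords.append(piece)
    return keywords


print(split_into_keywords("検索結果の順序を表示"))  # ['検索結果', '順序', '表示']
```

Splitting on single characters in this naive way also cuts words that merely contain a particle character, which is one reason the investigator's editing and re-extraction of keywords remains useful.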
[0078]
The above is the description of each process in the extraction step S2 from the search target data. In this step the investigator only specifies the search target data, and all other processing is performed by the information search program 14. There is therefore no problem if the information search program 14 performs the first extraction step S1 from the reference data after the extraction step S2 from the search target data. That is, the first extraction step S1 and the extraction step S2 may be performed in either order. Hereinafter, the association selection step S3 will be described.
[0079]
Also in this association selection step S3, the CPU 8 described above interprets and executes the information search program 14 to perform the following processing. When S1 and S2 are finished, the search condition determination screen 40 of FIG. 13 is displayed on the display 7, and the CPU 8 waits for an association selection command from the investigator. The investigator uses the keyboard 9 or the like to select an arbitrary keyword (hereinafter abbreviated as “arbitrary keyword”) from among the keywords displayed in the component column of the search condition display unit 48, and instructs display of the association selection screen by clicking the synonym button 44 or the like. On detecting this, the CPU 8 extracts the arbitrary keyword data from the search condition display unit 48 and displays on the display 7 the association selection screen 50 (described later), which the information search program 14 holds in advance.
[0080]
The association selection screen 50 includes a target keyword display unit 51, a hit keyword list display unit 52, a non-hit keyword list display unit 53, and an association execution button 54. The target keyword display unit 51 displays the arbitrary keyword that the investigator previously selected from the component column of the search condition display unit 48. In the association selection screen 50 of FIG. 15, “key item” is selected as the arbitrary keyword. The hit keyword list display unit 52 lists the keywords that were hit when the keywords extracted in S2 were searched using the arbitrary keyword as the search key. The non-hit keyword list display unit 53 lists the keywords that were not hit in that search. The general pattern matching described above is used as the search method. As a result, each keyword extracted in S2 is sorted into either the hit keyword list display unit 52 or the non-hit keyword list display unit 53. This sorting is determined by the similarity criterion selected by the investigator. As shown in FIG. 15, the association selection screen 50 includes radio buttons for selecting the similarity criterion: three buttons, for similarity (weak), (medium), and (strong), and the criterion is set by whichever button the investigator selects.
As an example of sorting by the similarity criterion, when the criterion is set to (strong) for the arbitrary keyword “primary key”, only those keywords extracted in S2 that contain the complete string “primary key” are displayed in the hit keyword list display unit 52. Similarly, when the criterion is set to (weak) for the arbitrary keyword “primary key”, keywords extracted in S2 that contain a part of the string “primary key” (for example, “primary” or “key”) are displayed in the hit keyword list display unit 52.
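The sorting into hit and non-hit lists can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, a "part" of the arbitrary keyword is taken to be one of its blank-separated words, and the (medium) criterion is left out because the description does not define it.

```python
def sort_keywords(arbitrary_keyword, extracted_keywords, criterion):
    """Sort keywords extracted in S2 into (hit, non-hit) lists.

    "strong": the keyword must contain the whole arbitrary keyword.
    "weak":   the keyword must contain at least one blank-separated
              word of the arbitrary keyword (an assumed notion of "part").
    """
    if criterion == "strong":
        parts = [arbitrary_keyword]
    elif criterion == "weak":
        parts = arbitrary_keyword.split()
    else:
        raise ValueError("only 'strong' and 'weak' are sketched here")

    hits, non_hits = [], []
    for keyword in extracted_keywords:
        if any(part in keyword for part in parts):
            hits.append(keyword)
        else:
            non_hits.append(keyword)
    return hits, non_hits


extracted = ["primary key column", "key item", "primary index", "search result"]
print(sort_keywords("primary key", extracted, "strong"))
print(sort_keywords("primary key", extracted, "weak"))
```

With the (strong) criterion only “primary key column” is a hit; with the (weak) criterion every keyword containing “primary” or “key” is a hit, and “search result” remains in the non-hit list.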
[0081]
When the association selection screen 50 is displayed on the display 7, the CPU 8 displays the previously extracted arbitrary keyword data in the target keyword display unit 51. The CPU 8 also reads the similarity criterion initially set by the information search program 14 and the keywords extracted in S2, and, based on that criterion, sorts the keywords extracted in S2 by the pattern matching described above into the hit keyword list display unit 52 and the non-hit keyword list display unit 53. The similarity criterion is provided because, when the search target data contains many keywords similar to the keyword selected as a component, displaying only the highly similar keywords in the hit keyword list display unit 52 lets the investigator quickly select the keywords needed for the search. Conversely, when the search target data contains almost no keywords similar to the keyword selected as a component, keywords of low similarity are also displayed in the hit keyword list display unit 52 so that the investigator can select them. Furthermore, the information search program 14 can select one of the radio buttons so that the number of keywords displayed in the hit keyword list display unit 52 falls within a certain range.
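One way the program could pick a radio button so that the hit count falls within a range is sketched below. The strategy (try the strictest criterion first, fall back to the loosest), the range bounds, and all names are assumptions; the description only states that such automatic selection is possible. As before, only the (strong) and (weak) criteria are modelled.

```python
def contains_whole(arbitrary_keyword, keyword):
    # (strong): the whole arbitrary keyword appears in the keyword
    return arbitrary_keyword in keyword


def contains_part(arbitrary_keyword, keyword):
    # (weak): any blank-separated word of the arbitrary keyword appears
    return any(part in keyword for part in arbitrary_keyword.split())


# Ordered from strictest to loosest (assumed order of preference).
CRITERIA = [("strong", contains_whole), ("weak", contains_part)]


def auto_select_criterion(arbitrary_keyword, extracted_keywords, min_hits, max_hits):
    """Return the strictest criterion whose hit count lies in
    [min_hits, max_hits]; if none does, return the loosest one."""
    for name, matches in CRITERIA:
        hit_count = sum(1 for kw in extracted_keywords if matches(arbitrary_keyword, kw))
        if min_hits <= hit_count <= max_hits:
            return name
    return CRITERIA[-1][0]


extracted = ["primary key column", "key item", "primary index", "search result"]
print(auto_select_criterion("primary key", extracted, 2, 5))
```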
[0082]
When the investigator changes the similarity criterion by selecting a similarity criterion radio button on the association selection screen 50 using the keyboard 9 or the like, the CPU 8 again reads the keywords extracted in S2 and, based on the new criterion, sorts them by the pattern matching described above into the hit keyword list display unit 52 and the non-hit keyword list display unit 53. In this way, the investigator can change the similarity criterion initially set by the information search program 14.
[0083]
Subsequently, using the keyboard 9 or the like, the investigator selects, from the list of keywords extracted in S2 and displayed in the hit keyword list display unit 52 and the non-hit keyword list display unit 53, those keywords considered to have a relationship such as synonymy with the arbitrary keyword displayed in the target keyword display unit 51. One selection method is to check a check box provided for each keyword displayed in the hit keyword list display unit 52 and the non-hit keyword list display unit 53. Usually, the keyword that the investigator selected as a component is not itself displayed in the hit keyword list display unit 52; it need not be displayed, but it is treated as having been selected by the investigator. If it is displayed as well, the investigator can confirm whether the selected component is present in the search target data. When a check is detected, the CPU 8 stores the association between the arbitrary keyword and the checked keyword. When a check is cleared, the CPU 8 deletes the association between the arbitrary keyword and the unchecked keyword. In the association selection screen 50 of FIG. 15, for example, “search key”, “search item”, and “feature part” are selected for the arbitrary keyword “key item”, and these associations are stored.
[0084]
In this way, using the keyboard 9 or the like, the investigator selects keywords considered to be in a relationship such as synonymy with the arbitrary keyword and clicks the association execution button 54 on the association selection screen 50, thereby instructing that those keywords be stored in association with the arbitrary keyword. In response, the CPU 8 saves the association, and each keyword associated with the arbitrary keyword is displayed in the search condition display unit 48 of the search condition determination screen 40, in the cell where the row of the arbitrary keyword intersects the synonym column. For example, the search condition display unit 48 of the search condition determination screen 40 in FIG. 16 shows that the three keywords “search key”, “search item”, and “feature part” are associated with the arbitrary keyword “key item”, and that the two keywords “procedure” and “order” are associated with the arbitrary keyword “order”. It also shows that no keyword is associated with the arbitrary keyword “search result”. When no keyword is associated with an arbitrary keyword, the search (described later) is performed using only the word or phrase selected as the component, in this case the keyword “search result”.
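The storing and deleting of associations when check boxes are checked and cleared can be sketched with a small in-memory structure. The class and method names are hypothetical; the description does not specify how the CPU 8 holds the associations.

```python
from collections import defaultdict


class AssociationStore:
    """Holds, for each arbitrary keyword, the keywords associated with it."""

    def __init__(self):
        self._links = defaultdict(list)

    def check(self, arbitrary_keyword, keyword):
        # Checking a box stores the association (once).
        if keyword not in self._links[arbitrary_keyword]:
            self._links[arbitrary_keyword].append(keyword)

    def uncheck(self, arbitrary_keyword, keyword):
        # Clearing a box deletes the association if it exists.
        if keyword in self._links[arbitrary_keyword]:
            self._links[arbitrary_keyword].remove(keyword)

    def synonyms(self, arbitrary_keyword):
        return list(self._links.get(arbitrary_keyword, []))

    def search_terms(self, arbitrary_keyword):
        # With no associations, only the component itself is searched.
        extra = [s for s in self.synonyms(arbitrary_keyword) if s != arbitrary_keyword]
        return [arbitrary_keyword] + extra


store = AssociationStore()
for kw in ("search key", "search item", "feature part"):
    store.check("key item", kw)
print(store.synonyms("key item"))
print(store.search_terms("search result"))
```

The example mirrors the state described for FIG. 16: three keywords are associated with “key item”, while “search result” has none and is searched on its own.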
[0085]
The above is the description of each process in the association selection step S3. For each arbitrary keyword displayed in the component column of the search condition display unit 48 in FIG. 16, the keywords displayed in the synonym column of its row are the keywords of the search target data associated with that arbitrary keyword in S3. Hereinafter, the search step S4 will be described.
[0086]
Also in this search step S4, the CPU 8 described above interprets and executes the information search program 14 to perform the following processing. The search step S4 comprises the processes S41 to S43 shown in the flowchart of FIG. 6, each of which is described below with reference to that flowchart.
[0087]
First, on the search condition determination screen 40 of FIG. 16, the CPU 8 waits for the investigator's weighting command, which assigns a weight to each arbitrary keyword displayed in the component column of the search condition display unit 48. The investigator uses the keyboard 9 or the like to select a keyword by clicking it in the component column of the search condition display unit 48, and then operates the weighting meter 45 to supply a numerical value with the keyboard 9 or the like, thereby giving the weighting command. As shown in FIG. 16, the weighting meter 45 lets the investigator choose a value within a specific numerical range by moving the central weighting indicator bar left and right. On receiving the weighting command, the CPU 8 displays the value shown on the weighting meter 45 in the weight column of the corresponding keyword row of the search condition display unit 48 (S41). For example, on the search condition determination screen 40 of FIG. 16, the investigator may assign a weight of “8” to the arbitrary keyword “key item”, “6” to the arbitrary keyword “order”, and “3” to the arbitrary keyword “search result”, and the search condition determination screen 40 then shows these values in the weight column. For keywords in the component column of the search condition display unit 48 to which the investigator has not assigned a weight, it is convenient for the investigator, and therefore desirable, that the information search program 14 set an initial value automatically.
[0088]
The weighting step S41 by the investigator is included in S4 here for convenience of explanation, but it may instead be performed at the end of S1 or within S2 or S3. In addition, whenever the search condition determination screen 40 is displayed, the investigator can save the various data shown on it or read back previously saved data. For example, when the investigator clicks the setting save button 46 on the search condition determination screen 40 using the keyboard 9 or the like, the CPU 8 saves the various data displayed on the search condition determination screen 40 to the hard disk 10. When the investigator clicks the setting read button 47 on the search condition determination screen 40 using the keyboard 9 or the like and specifies a setting file, the CPU 8 reads the setting file from the hard disk 10 into the memory 6 and displays its contents on the search condition determination screen 40.
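Saving and reading a setting file can be sketched as a simple round trip. The JSON format, the field names, and the file name are assumptions chosen for illustration; the description does not state the format of the setting file.

```python
import json
import tempfile
from pathlib import Path


def save_settings(path, conditions):
    """Write the search conditions to a setting file (JSON is an assumed format)."""
    Path(path).write_text(
        json.dumps(conditions, ensure_ascii=False, indent=2), encoding="utf-8"
    )


def load_settings(path):
    """Read the search conditions back from a setting file."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Example values taken from the embodiment (components, synonyms, weights).
conditions = [
    {"component": "key item",
     "synonyms": ["search key", "search item", "feature part"], "weight": 8},
    {"component": "order", "synonyms": ["procedure", "order"], "weight": 6},
    {"component": "search result", "synonyms": [], "weight": 3},
]

with tempfile.TemporaryDirectory() as folder:
    settings_file = Path(folder) / "search_conditions.json"  # hypothetical name
    save_settings(settings_file, conditions)
    restored = load_settings(settings_file)

print(restored == conditions)
```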
[0089]
Next, the CPU 8 waits for the investigator's search start command, which starts the similarity determination for each information block of the search target data based on the conditions in the search condition display unit 48. The investigator gives the search start command by, for example, clicking the search start button 49 on the search condition determination screen 40 using the keyboard 9 or the like. On receiving the command, the CPU 8 extracts the search condition data displayed in the search condition display unit 48. For example, when the search start command is given in the state of the search condition determination screen 40 in FIG. 17, search condition data 60 as shown in FIG. 18 is extracted. The search condition data 60 has an additional management number column, which holds a management number that makes each record of the search condition data 60 unique.
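The extracted search condition data, with a management number added to each record, might be represented as below. The record layout, the names, and the numbering from 1 are assumptions for illustration; FIG. 18 itself is not reproduced in the text.

```python
from dataclasses import dataclass, field


@dataclass
class SearchCondition:
    management_number: int            # makes each record unique
    component: str                    # keyword chosen from the reference data
    synonyms: list = field(default_factory=list)  # associated keywords
    weight: int = 0                   # importance assigned by the investigator


def build_search_condition_data(rows):
    """rows: (component, synonyms, weight) tuples in display order.

    Management numbers are assigned sequentially from 1 (an assumption).
    """
    return [
        SearchCondition(number, component, list(synonyms), weight)
        for number, (component, synonyms, weight) in enumerate(rows, start=1)
    ]


data = build_search_condition_data([
    ("key item", ["search key", "search item", "feature part"], 8),
    ("order", ["procedure", "order"], 6),
    ("search result", [], 3),
])
for record in data:
    print(record)
```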
[0090]
Subsequently, the CPU 8 performs similarity determination for each information block of the search target data based on the search condition data 60 (S42). The similarity determination method is as follows. Let N be the number of synonyms for a specific component in the search condition data 60. A specific information block of the search target data (hereinafter referred to as a “specific information block”) is searched using those N keywords, and, with M being the number of the N keywords found in the block, the score of the specific information block for that component is recorded as “(M / N) * weight”. The same search and recording are performed for the other components, “order” and “search result”. Finally, the scores for the components are totaled for each specific information block to obtain the similarity score of that block.
[0091]
For example, in the search condition data 60 shown in FIG. 18, the row whose component is “key item” has three synonyms (“search key”, “search item”, and “feature part”); adding the component “key item” itself gives N = 4. Suppose a specific information block of the search target data is searched using “search key”, “search item”, “feature part”, and “key item” as search keywords, and all four are found in the block; then M = 4. The search condition data 60 shown in FIG. 18 also shows that the weight of “key item” is 8. Substituting into “(M / N) * weight” therefore gives “(4 / 4) * 8”, so the score of this specific information block for the component “key item” is “8”. The scores for the “order” and “search result” rows are calculated for the block in the same way, and the total score (similarity score) of the specific information block is obtained by summing the scores calculated for the three component rows. Although the scores are simply summed here, the total score may instead be corrected by taking the product of the scores calculated for the three component rows. Because similarity is being calculated for texts such as patent specifications and technical documents, in which the same technology can be expressed with a wide range of synonyms, there is little point in fixing the score calculation method strictly, and various calculation methods can be adopted.
[0092]
In S42 above, N may instead be the number of synonyms alone, without adding the specific component itself (that is, N − 1 relative to the count used above). For example, in the search condition data 60 shown in FIG. 18, the row whose component is “key item” has three synonyms (“search key”, “search item”, and “feature part”), so N = 3, and S42 is performed using the three keywords selected by the investigator as search keywords. By not using the component itself as a keyword, similarity determination can be performed using only keywords from the search target data.
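The score “(M / N) * weight” and the two ways of counting N described for S42 can be sketched as follows. The function names, the use of plain substring search as the pattern matching, and the sample information block are assumptions for illustration.

```python
def component_score(block_text, component, synonyms, weight, include_component=True):
    """Score one component for one information block as (M / N) * weight.

    N is the number of search keywords and M the number of them found
    in the block. If include_component is False, the component itself
    is not searched, unless it has no synonyms at all.
    """
    keywords = list(synonyms)
    if include_component and component not in keywords:
        keywords.append(component)
    if not keywords:
        keywords = [component]  # no association: search the component alone
    found = sum(1 for keyword in keywords if keyword in block_text)
    return found / len(keywords) * weight


def similarity_score(block_text, conditions, include_component=True):
    """Total the component scores (simple sum, as in the embodiment)."""
    return sum(
        component_score(block_text, component, synonyms, weight, include_component)
        for component, synonyms, weight in conditions
    )


conditions = [
    ("key item", ["search key", "search item", "feature part"], 8),
    ("order", ["procedure", "order"], 6),
    ("search result", [], 3),
]
# Hypothetical information block text.
block = ("The search key and search item identify the feature part; "
         "each key item fixes the order of the search result.")

print(component_score(block, "key item", conditions[0][1], 8))  # 8.0, i.e. (4 / 4) * 8
print(similarity_score(block, conditions))
```

Replacing the sum with a product, as the description allows, would only change `similarity_score`; the per-component calculation stays the same.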
[0093]
When the CPU 8 has calculated the similarity score for each information block of the search target data based on the search condition data 60, it sorts the information blocks in descending order of similarity score and displays them as a list on the display 7 (S43). FIG. 19 shows an example of such a display. In the information block similarity order list 70 of FIG. 19, the value of the item “document number” indicates the initial display order of the information block, and the value of the item “scoring rank” indicates the rank of the information block by total score. For example, the row of the information block whose “document number” is “373” shows that this block was initially displayed in the 373rd row and has the highest total score of all the information blocks. The same row shows that the “publication number” is “H03-1234”, the “publication date” is “H03/10/10”, and the “title of the invention” is “A system”. The item “link” holds identification information for the individual information block data, such as a path and file name. Thus, when the investigator selects an individual information block on the information block similarity order list 70 using the keyboard 9 or the like, the CPU 8 can immediately display that information block on the display 7.
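The ordering of S43 can be sketched as a sort on precomputed similarity scores. The function name, the dictionary keys, and the handling of ties (blocks with equal scores keep their initial order) are assumptions for illustration.

```python
def rank_blocks(scores):
    """scores: similarity scores listed in initial display order.

    The document number is the 1-based initial position; the scoring
    rank is the 1-based position after sorting by descending score.
    """
    numbered = list(enumerate(scores, start=1))
    # sorted() is stable, so equal scores keep their initial order.
    ordered = sorted(numbered, key=lambda pair: pair[1], reverse=True)
    return [
        {"scoring_rank": rank, "document_number": document_number, "score": score}
        for rank, (document_number, score) in enumerate(ordered, start=1)
    ]


for row in rank_blocks([2.0, 14.0, 8.0]):
    print(row)
```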
[0094]
Through S43 as described above, the investigator can review the information blocks of the search target data in descending order of their similarity score with respect to the information block of the reference data.
[0095]
As described above, according to the present embodiment, a plurality of keywords extracted from the search target data can be associated with each keyword extracted from the reference data, and the information blocks of the search target data can be searched based on those associated keywords, so that the target information block can be searched for and extracted without omission. The investigator can therefore view the contents of the search target data, information block by information block, in the order in which they deserve attention, which greatly reduces the time required for the reading investigation.
[0096]
[Effects of the Invention]
Since the present invention is configured and functions as described above, keywords extracted from the search target data can be associated with each keyword extracted from the reference data, and similarity can be determined for each information block of the search target data based on those keywords.
[0097]
Further, it is not necessary to register a search keyword in advance for each information block of search target data, and all search target data can be set as a keyword target.
[0098]
Furthermore, even when the search target data was prepared without regard to unifying registered keywords, the investigator can appropriately select technical terms that have synonyms and the like as search keywords.
[0099]
In addition, for each keyword extracted from the reference data, similarity determination can be adjusted by assigning a weight to the keyword extracted from the associated search target data.
[0100]
Owing to the above, a more detailed investigation than before is possible, and the target information block can be searched for and extracted from the search target data without omission, so that an information search system superior to conventional ones can be provided.
[Brief description of the drawings]
FIG. 1 is a configuration diagram of an information search system according to an embodiment of the present invention.
FIG. 2 is a detailed view of storage means 2 in FIG.
FIG. 3 is a block diagram when the information search system of FIG. 1 is applied to a personal computer.
FIG. 4 is a flowchart illustrating processing of the information search system.
FIG. 5 is a flowchart showing details of S1 in FIG. 4.
FIG. 6 is a flowchart showing details of S4 in FIG.
FIG. 7 is a screen view of the reference data input screen 30.
FIG. 8 is a data example of reference data.
FIG. 9 is a screen view of the reference data input screen 30.
FIG. 10 is a screen view of the reference data input screen 30.
FIG. 11 is a keyword list obtained by dividing reference data into keywords.
FIG. 12 is a screen view of the search condition determination screen 40.
FIG. 13 is a screen view of the search condition determination screen 40.
FIG. 14 is a keyword list obtained by dividing search target data into keywords.
FIG. 15 is a screen view of the association selection screen 50.
FIG. 16 is a screen view of the search condition determination screen 40.
FIG. 17 is a screen view of the search condition determination screen 40.
FIG. 18 is a data example of the search condition data 60.
FIG. 19 is a display example of an information block similarity order list 70 of search information data.
[Explanation of symbols]
1 Information search system
2 Storage means
3 Input means
4 Display means
5 Computer
6 Memory
7 Display
8 CPU
9 Keyboard etc.
10 Hard disk
11 Reference data
11a Keywords of reference data
12 Search target data
12a Keywords of search target data
13 Bus
14 Information search program
15 Information search function
21 Reference data
21a Information block of reference data
22 Search target data
22a Information block of search target data
30 Reference data input screen
31 Claim list display unit
32 Selected claim display unit
33 Reference data display unit
34 Reference data reading button
35 Keyword extraction button
36 Reference data
37 Divided keyword list
38 Keyword list of search target data
40 Search condition determination screen
41 Keyword re-extraction button
42 Claim data display unit
43 Keyword list display unit
44 Synonym button
45 Weighting meter
46 Setting save button
47 Setting read button
48 Search condition display unit
49 Search start button
50 Association selection screen
51 Target keyword display unit
52 Hit keyword list display unit
53 Non-hit keyword list display unit
54 Association execution button
60 Search condition data
70 Information block similarity order list of search information data

Claims

An information search program for use in a system comprising:
storage means for storing reference data comprising an information block that includes one or more keywords, and search target data comprising a plurality of information blocks each including one or more keywords;
input means for inputting an association between a keyword constituting the reference data and a keyword constituting the search target data;
a computer that executes a process of searching the search target data for information blocks similar to the information block of the reference data; and
display means for displaying the search results,
the program causing the computer to execute:
a first extraction step of extracting keywords constituting the information block of the reference data from the reference data stored in the storage means;
a step of extracting keywords constituting the plurality of information blocks of the search target data from the search target data stored in the storage means;
a step of displaying on the display means one or more keywords extracted from the reference data and one or more keywords extracted from the search target data, and allowing an investigator to select, for each displayed keyword of the reference data, an association with a keyword extracted from the search target data; and
a search step of searching the search target data from which the keywords were extracted for the similar information block, based on the associated keywords extracted from the search target data.

The information search program according to claim 1,
wherein, in the search step,
the similar information block is searched for in the search target data based on the keywords extracted from the reference data and the keywords extracted from the search target data that are associated with each of those keywords.

The information search program according to claim 2,
wherein the first extraction step includes
a step of receiving, from the input means, the investigator's editing of the keywords extracted from the reference data.

The information search program according to claim 2,
wherein the first extraction step includes
a step of receiving, from the input means, the investigator's discarding of keywords extracted from the reference data.

The information search program according to claim 1,
wherein the computer is caused to execute
a step of receiving from the investigator, for each keyword extracted from the reference data, an association with a weight representing its importance in the search process, and
the search step searches the search target data for the similar information block based on the keywords extracted from the search target data that are associated with each keyword of the reference data and on the weights.

The information search program according to any one of claims 1 to 5,
wherein the search step includes
assigning information indicating similarity to each information block when searching the search target data for similar information blocks, and
the computer is caused to execute
a step of displaying the search results in order of similarity.

The information search program according to any one of claims 1 to 6,
wherein the information block of the reference data is specific patent application information, and
the plurality of information blocks of the search target data are a plurality of pieces of patent application information or scientific and technical literature information.

The information search program according to claim 7,
wherein the information block of the reference data is one or more claims of a specific patent application.

The information search program according to claim 8,
wherein, if the information block of the reference data that is one or more claims of a specific patent application is a dependent claim,
the first extraction step includes
a step of adding the independent claim cited by the dependent claim to the information block of the reference data.

A recording medium recording the information search program according to claim 1.