JP2005031813A

JP2005031813A - Abstract preparation supporting system, program, abstract preparation supporting method, patent document retrieving system, and patent document retrieving method

Info

Publication number: JP2005031813A
Application number: JP2003193867A
Authority: JP
Inventors: Seiichi Okada; 聖一岡田; Masaaki Hasegawa; 雅昭長谷川; Katsuyoshi Sato; 克良佐藤; Kazuo Sumita; 一男住田; Kazuhiro Kimura; 和広木村; Hidekazu Irisawa; 秀和入澤
Original assignee: JAPAN PATENT INFORMATION ORGANIZATION; Toshiba Solutions Corp
Current assignee: JAPAN PATENT INFORMATION ORGANIZATION; Toshiba Digital Solutions Corp
Priority date: 2003-07-08
Filing date: 2003-07-08
Publication date: 2005-02-03
Anticipated expiration: 2023-07-08
Also published as: JP4301879B2

Abstract

PROBLEM TO BE SOLVED: To efficiently and properly prepare an abstract in a second language from a patent document in a first language.

SOLUTION: This abstract preparation supporting system comprises: an extraction rule dictionary 2 that specifies main-part extraction rules for each fixed classification; a document structure analyzing part 11 that analyzes the chapter structure of an electronic text patent description; a main part selecting part 12 that selects an extraction rule dictionary according to the classification given in the document structure analysis result and selects main parts from that result on the basis of the rules in the dictionary; a machine translating part 13 that translates the document structure analysis result into the native language; an abstract candidate extracting part 14 that extracts native-language abstract candidates from the native-language translation result for each main part selected by the main part selecting part; and an abstract editing/display control part 15 that displays at least the native-language abstract candidates, or displays the native-language abstract candidates in association with the corresponding original-text main parts, and prepares the native-language abstract, including correction of the abstract candidates.

COPYRIGHT: (C)2005,JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
本発明は、外国語特許文献等から母国語の抄録を作成する際に利用される抄録作成支援システム、プログラム、抄録作成支援方法及び特許文献検索システム並びに特許文献検索方法に関する。
【０００２】
【従来の技術】
従来から日本国以外の国においてどのような特許が出願されているかの情報をサービスすることを目的とし、外国語（例えば英語）で記載された特許文献（請求の範囲を含む明細書、図面等）全文を参照し、母国語（例えば日本語）などの和文抄録を作成することが日常的に行われている。この作成された和文抄録は、検索システムのデータベースに蓄積され、エンドユーザへの閲覧や検索に提供されている。
【０００３】
ところで、英語特許文献から和文抄録を作成する場合、現状では、抄録作成者が和文抄録の作成対象となっている英語特許文献の全文を読み、その特許文献から主要部分（以下、要部という）が何であるかを把握し、当該要部を母国語に翻訳する作業が行われている。作成すべき和文抄録は、技術分野、要旨、実施例などの中から予め定められた日本語の長さ（例えば合計文字数６００文字〜８００文字程度）の範囲で記載し作成している。
【０００４】
このような抄録作成作業は、すべて抄録作成者による人手の作業に頼っているので、抄録作成にかかる手間が非常に大きい。また、抄録作成作業工程において、抄録を作成する前段階では、抄録作成者が英語特許文献全文にわたって内容を把握する必要がある。その結果、抄録作成者は、英語特許文献全文を読みきる必要があるが、この読解のために多大な作業コストがかかり、また抄録作成作業の効率が非常に悪いことから、その作業効率の改善が求められていた。
【０００５】
そこで、抄録作成作業の効率化を図るために、近年、幾つかの抄録作成支援システムが提案されている。
【０００６】
その１つの抄録作成支援システムは、文章を要約するための文章要約システムであって、技術論文などの章節構造を抽出し、章節ごとに抄録としてそのまま利用するか（例えば要旨の部分）、要約処理を行うか、或いは要約処理をせずに無視するか（例えば文献）といった処理の切り分けを行うものである（特許文献１参照）。
【０００７】
他の１つの抄録作成支援システムは、文書情報を検索するための文書情報検索装置及び文書検索結果表示方法であって、検索結果を提示する際に、原文から抄録生成を自動的に行い、この生成された抄録をユーザに提示する。そして、抄録と原文との対応関係を取り出し、抄録表示と原文表示との連動や原文表示中の抄録部分を強調表示する構成である（特許文献２参照）。
【０００８】
【特許文献１】
特許第２７６６２６１号公報
【０００９】
【特許文献２】
特許第２９５７８７５号公報
【００１０】
【発明が解決しようとする課題】
従って、以上のような抄録作成支援システムでは、技術論文などを対象とするものであり、しかも抄録作成のための重要文抽出の処理方法としては、文書中における単語の使用頻度等の統計情報に基づいた方法（例えば使用頻度の高い単語が重要であり、重要語を含む文は重要であるという選定基準）、接続詞や文末の表現に基づいて重要文を選定する方法（例えば、「要するに」という副詞を用いている文は重要であるという選定基準）などが開示されている。
【００１１】
しかしながら、これらの重要文抽出の処理方法は、技術論文などを対象にしたものであり、かつ、一律の選定基準に従って重要文を選定する方法であるが、これをそのまま特許文献に適用しても、特許文献的な観点から要部であると判断できる質のよい重要文を抽出することができない問題がある。
【００１２】
また、前述する抄録作成支援システムは、単一の言語に関する抄録の作成が目的であり、例えば英語特許文献から和文抄録を作成するための支援機能をもっていないので、英語特許文献から和文抄録が作成できないという問題がある。
【００１３】
しかも、理想的には、自動的に抄録が作成されることが好ましいが、品質の良い抄録を作成するためには、最終的には抄録作成者等の人手によるチェックが必要となる。このため、抄録作成を自動的に行うのではなく、抄録作成を支援するというシステムの提供が必要となる。
【００１４】
さらに、以上のような抄録作成支援システムでは、「要旨」「発明の属する技術分野」「実施例」などの章節構造に従って、例えば「要旨」については要部として抽出する。しかしながら、要部となるべき単語やフレーズ等の抽出規則辞書に関し、予め国際特許分類（以下、ＩＰＣ分類と呼ぶ）の区分ごとに準備し、抄録作成対象とする特許文献のＩＰＣ分類や「発明の属する技術分野」「実施例」などの章節構造等に従って抽出規則辞書を切り替える等の内容が記載されておらず、その点からも上記システムのままでは利用することが難しい。また、抄録と機械翻訳との連携についても何ら記載されていない。
【００１５】
さらに、抄録作成者が作成する文章は和文抄録であるので、要部抽出とともに日本語に翻訳する作業も必要になってくる。従来、人手により翻訳する場合、仕事量との関係から同じ技術分野の特許文献であっても、別の分野の抄録作成者が担当することも多く、人手による翻訳に限れば、専門用語の翻訳に対して一貫性を保持することが非常に難しい状況にある。
【００１６】
そこで、英語特許文献から日本語に翻訳支援するための機械翻訳システムの利用が考えられる。機械翻訳システムについては、すでに幾つかのパッケージソフトウエアが市販されており、例えばＩＰＣ分類に基づいて機械翻訳するための専門用語辞書を準備し、機械翻訳時にどのＩＰＣ分類に関する特許文献であるかを判定し、特許文献のＩＰＣ分類に従って専門用語辞書を選択的に切替え使用することにより、適切な翻訳を行うシステムも提案されている（特開２００３−１６０６３号公報）。しかし、このシステムは、機械翻訳を意図するものであり、抄録作成については記載されていない。
【００１７】
よって、以上の説明から、外国語特許明細書（ここでいう、外国語特許明細書とは、請求の範囲を含み、さらに公開公報、公告公報、登録特許公報を問わず、特許関連文献（実用新案等を含む）の他、請求の範囲を含んだ明細書を指す意味である。従って、各国によって呼び名が異なるが、公に提供された電子化された特許出願書類といえる）を対象とし、従来の抄録作成システムで抄録を作成し、この作成された英文抄録を英日翻訳システムで日本語に翻訳することが考えられる。
【００１８】
しかしながら、抄録作成者が質の良い和文抄録を作成するためには、単に日本語に翻訳された結果を読むだけでは不十分であり、対応する英語の原文やその文章の前後の文章の文脈も参照する必要がある。従って、従来の抄録作成支援システムと機械翻訳システムとを組み合わせただけでは、異なる言語の抄録作成作業を一連の作業手順として進めることができないので、英語特許文献から和文抄録を作成するための支援機能を十分に果たすことができない。
【００１９】
何れにせよ、技術分野に応じた特許文献の特徴や特許抄録は、技術的な動向を把握し、またその他の実務的な面からも非常に重要な書類であり、外国語特許文献から適切な母国語の抄録を作成する支援システムの実現が待たれているのが実状であると言える。
【００２０】
本発明は上記事情にかんがみてなされたもので、電子化された第１言語の特許出願書類から効率的、かつ適切な第２言語の抄録を作成する抄録作成支援システム、プログラム及び抄録作成支援方法を提供することを目的とする。
【００２１】
また、本発明の他の目的は、電子化された第１言語の特許出願書類から所要とする第１言語の特許出願書類を検索し、この検索された第１言語の特許出願書類から前記第１言語の検索要部及び第２言語の検索候補（検索要部）を作成し、効率的な検索作業を実現する特許文献検索システム、プログラム及び特許文献検索方法を提供することを目的とする。
【００２２】
【課題を解決するための手段】
（１）上記課題を解決するために、本発明に係る抄録作成支援システムは、一定の分類ごとに少なくとも所要とする第１言語の章節の表題及びこの各表題の章節に記載される説明文中に含まれる第１言語の単語，フレーズ等のキーフレーズを規定する抽出規則辞書と、電子化された第１言語の特許出願書類に記載される文章の章節構造を解析する文書構造解析手段と、この文書構造解析手段で解析された文書構造解析結果に記載されている分類に従って所要の前記抽出規則辞書を選択し、当該抽出規則辞書に規定される表題及びキーフレーズに基づいて前記文書構造解析結果から要部を選定する要部選定手段と、文書構造解析手段で解析された文書構造解析結果を第２言語に翻訳処理する機械翻訳手段と、要部選定手段で選定される各要部ごとに機械翻訳手段で翻訳された第２言語の翻訳結果から第２言語の抄録候補を抽出する抄録候補抽出手段と、この抄録候補抽出手段で抽出される第２言語の抄録候補及びこの第２言語の抄録候補に対応する第１言語の要部のうち、少なくとも第２言語の抄録候補又は第２言語の抄録候補と第１言語の要部とを対応付けて表示し、当該第２言語の抄録候補の修正処理を含んで第２言語の抄録を作成する抄録編集表示制御手段とを設けた構成である。
【００２３】
この発明は以上のような構成とすることにより、第１言語の特許出願書類に記載される文章の章節構造から分野別に従って第１言語の要部と第１言語の文書構造解析結果より翻訳処理された第２言語の翻訳結果とから第２言語の抄録候補を取り出し、この第２言語の抄録候補を修正を含めて第２言語の抄録を作成するので、抄録作成者が第１言語の特許出願書類を読解することなく、自動的に第２言語の抄録候補を抽出し、抄録作成者による修正確認を含めて短時間に第２言語の抄録を作成することが可能であり、効率的に抄録を作成でき、かつ、適切な抄録を作成することが可能である。
【００２４】
なお、上記抄録の作成は、記録媒体に記録されるプログラムや抄録作成方法によっても実現可能であり、これによりシステムと同様な作用効果を奏することができる。
【００２５】
（２）また、本発明に係る特許文献検索システムは、電子化された複数の第１言語の特許出願書類を記憶する記憶手段と、検索条件のもとに前記記憶手段から少なくとも１つの第１言語の特許出願書類を検索する文書検索手段と、一定の分類ごとに少なくとも所要とする第１言語の章節の表題及びこの各表題の章節に記載される説明文中に含まれる第１言語の単語，フレーズ等のキーフレーズを規定する抽出規則辞書と、文書検索手段により検索された前記電子化された第１言語の特許出願書類に記載される文章の章節構造を解析する文書構造解析手段と、この文書構造解析手段で解析された文書構造解析結果に記載されている分類に従って所要の抽出規則辞書を選択し、当該抽出規則辞書に規定される表題及びキーフレーズに基づいて文書構造解析結果から検索対象となる要部を選定する要部選定手段と、文書構造解析手段で解析された文書構造解析結果を第２言語に翻訳処理する機械翻訳手段と、要部選定手段で選定される各検索対象要部ごとに機械翻訳手段で翻訳された第２言語の翻訳結果から第２言語の検索候補を抽出する検索候補抽出手段と、この検索候補抽出手段で抽出される第２言語の検索候補及びこの第２言語の検索候補に対応する前記第１言語の検索要部のうち、少なくとも前記第１言語の検索要部又は前記第１言語の検索要部と前記第２言語の検索候補とを対応付けて表示し、或いは前記第１言語の検索要部及び前記第２言語の検索候補と前記第１言語の文書構造解析結果及び前記第２言語の翻訳結果とを選択的に表示する検索結果表示制御手段とを設けた構成である。
【００２６】
この発明は以上のような構成とすることにより、第１言語の多数の電子化された第１言語の特許出願書類の中から検索条件に従って少なくとも１つの第１言語の特許出願書類を検索し、この検索された第１言語の特許出願書類に対し、前記（１）に記載する構成を取り込んで、少なくとも第１言語の検索要部と第２言語の検索候補とを取り出して表示できるので、効率的な検索作業を実現することが可能である。
【００２７】
なお、上記検索要部や検索候補の取り出しは、記録媒体に記録されるプログラムや特許文献検索方法によっても実現可能であり、これによりシステムと同様な作用効果を奏することができる。
【００２８】
【発明の実施の形態】
以下、本発明の実施の形態について図面を参照して説明する。
【００２９】
図１は本発明に係る抄録作成支援システムの一実施の形態を説明する構成図である。
【００３０】
この抄録作成支援システムは、電子化された外国語特許文献などの第１言語の特許出願書類を入力する原文明細書入力部１と、ＩＰＣ分類の一定区分ごとに所定の要素をもつ抽出規則で構成された複数の抽出規則辞書を格納する抽出規則辞書格納部２と、前記原文明細書入力部１から入力される外国語特許文献などから母国語などの第２言語の抄録を作成処理する抄録作成処理制御部３と、この抄録作成処理制御部３がＣＰＵで構成されている場合、抄録作成の一連の処理を実行させるためのプログラムを格納する記録媒体４と、この抄録作成処理制御部３で作成される母国語の抄録その他この抄録に関連する情報を保存する抄録関連情報格納用データベース５と、抄録作成の処理途中において必要な情報を一時格納するバッフアメモリ６と、作成された母国語の抄録その他必要な情報を出力する出力部７とにより構成されている。８は翻訳処理時に使用する翻訳処理上の知識を格納する翻訳辞書部である。
【００３１】
なお、前述する第１言語の特許出願書類とは、公に提供された刊行物としての性格を有する公開公報、公告公報、登録特許公報及び実用新案に関係する公報を含む全てのテキスト情報を指す。また、特許出願書類の中の明細書とは、厳密な意味で請求の範囲や要約とは別書類の場合も考えられるが、頒布される特許文献では明細書と請求の範囲や要約は所定の順序で一体に印刷されるものであるので、請求の範囲や要約を含めて指すものとする。以下、第１言語の特許出願書類は、説明の便宜上、外国語特許明細書又は英語特許明細書と総称する。
【００３２】
原文明細書入力部１は、一般にキーボード、マウスなどが用いられ、キーボードによる各種の制御指示の入力、マウスによる外国語特許明細書中の特定領域の指定により選択される当該外国語特許明細書の入力の他、外国語特許明細書を読み取るＯＣＲ（Ｏｐｔｉｃａｌ Ｃｈａｒａｃｔｅｒ Ｒｅａｄｅｒ）による読み取り情報の入力、フロッピーディスク、磁気テープ、磁気ディスクなどに保存される電子化された外国語特許明細書の入力など、異なる種々の入力形態による外国語特許明細書の入力を含むものである。
【００３３】
前記抽出規則辞書格納部２は、一定の区分毎（例えば電気、機械、化学、物理その他の分野などをさらに細分化した区分毎），例えばＩＰＣ分類の少なくとも上位分類毎の各抽出規則辞書が設けられ、各抽出規則辞書には所定の抽出規則が規定されている。この各抽出規則辞書の規則の一例としては例えば抄録項目、フィールド、キーフレーズの３つの要素から構成され、具体的な一例としては後記する図５に示す通りである。
【００３４】
前記抄録作成処理制御部３は、機能的には、原文明細書入力部１から入力される外国語特許明細書の項目（文書構造）を解析し、各項目ごとに認識可能な一定の形式に基づくタグ付けした文書構造の明細書を作成する文書構造解析手段としての文書構造解析部１１と、この文書構造解析部１１で文書構造を解析された文書構造解析結果の中から抽出規則辞書に定められる規則に従って要部を抽出する要部選定手段としての要部選定部１２と、文書構造解析部１１から出力される文書構造解析結果に対し、翻訳辞書部８に規定する規則・知識を用いて、母国語に翻訳する翻訳処理手段としての機械翻訳部１３と、要部選定部１２の要部抽出結果である各要部ごとに、機械翻訳部１３で翻訳された外国語特許明細書全文の母国語翻訳結果の全文の中から母国語の抄録候補を抽出する抄録候補抽出手段としての抄録候補抽出部１４と、この抄録候補抽出部１４から抽出される母国語の抄録候補と要部選定部１２で抽出された外国語要部（原文要部）とを対応付けて表示し、又は前記外国語の文書構造解析結果全文及び前記機械翻訳部１３で翻訳処理された母国語の翻訳結果とを対応付けて表示し、当該母国語の抄録候補の抄録作成者による修正処理を含んで母国語の抄録を作成する抄録編集表示制御部１５とが設けられている。
【００３５】
前記出力部７は、機械翻訳部１３の出力である翻訳結果を出力したり、入力部１から入力される外国語特許明細書を表示したり、或いは作成された母国語抄録を表示したり、外国語特許明細書全文と翻訳済み明細書全文その他所要とする種々の形式で表示する機能を有するものであって、通常，各種ディスプレイなどの表示手段が用いられるが、その他、例えばプリンタなどの印字手段、或いはフロッピーディスク、磁気テープ、磁気ディスクへの書き込み登録手段、さらには他のメディアに対して送信する送信手段その他ユーザの所望する各種の出力形態が挙げられる。
【００３６】
次に、以上のような抄録作成支援システムの動作及び本発明に係るプログラム、抄録作成支援方法について図２を参照しながら説明する。
【００３７】
先ず、原文明細書入力部１から電子化された外国語特許明細書，例えば英語特許明細書が入力されると（Ｓ１）、抄録作成処理制御部３は、記録媒体４に格納されるプログラムに従って一連の処理を実行する。なお、この抄録作成処理制御部３は、例えば論理回路等で構成されるハード構成によって処理することも可能であるが、ここではプログラムに従って一連の処理を実行するものとする。
【００３８】
一般に、英語特許明細書は、図３にその一部を示すように、要旨（Ａｂｓｔｒａｃｔ）、請求項（Ｃｌａｉｍｓ）、発明の分野（ＦＩＥＬＤＯＦＴＨＥＩＮＶＥＮＴＩＯＮ）、発明の背景（ＢＡＣＫＧＲＯＵＮＤＯＦＴＨＥＩＮＶＥＮＴＩＯＮ）などの明細書の項目（文書構造）の他、ここでは図示されていないが、発明者やＩＰＣ分類コード等が記載されている。
【００３９】
そこで、抄録作成処理制御部３は、原文明細書入力部１から入力される電子化された英語特許明細書を受け取ると、例えばバッフアメモリ６に一時格納した後、文書構造解析部１１にて英語特許明細書の文書構造解析処理を行う（Ｓ２：文書構造解析機能、文書構造解析ステップに相当する）。この文書構造解析処理部１１は、特許明細書のフォーマットがＸＭＬ形式であることを前提とした場合、予め定められる英語特許明細書の各項目（文書構造）を解析し、この項目ごとにＸＭＬタグ情報に基づいてタグ付けした文書構造に置き換え、文書構造解析結果としてバッフアメモリ６又はデータベース５に格納する（Ｓ３）。
【００４０】
図４は文書構造解析部１１により構造解析された文書構造解析結果の一部を示す図である。ここでは、Ａｂｓｔｒａｃｔ、Ｃｌａｉｍｓ、ＦＩＥＬＤＯＦＴＨＥＩＮＶＥＮＴＩＯＮといった各項目に従った文書構造がＸＭＬ形式で表現されている。この文書構造解析結果に示されているように、タグ＜ｆｉｅｌｄｈｅａｄｅｒ＝”ＸＸＸＸ”＞から当該タグに対応する＜／ｆｉｅｌｄ＞までの範囲が「ＸＸＸＸ」の表題に対応する部分となる。なお、ここでは、特許明細書のフォーマットがＸＭＬ形式であることを前提とした例であるが、他の形式による場合には英語特許明細書の各項目（文書構造）を解析し、この項目ごとに他の形式に従った文書構造に置き換えることは言うまでもない。
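The paragraph above describes the document structure analysis result: each section of the specification is wrapped in a tag of the form `<field header="XXXX">` ... `</field>`, where `XXXX` is the section title. A minimal sketch of reading such a result is shown below. Only the `field` element, its `header` attribute, and the `ipc` element come from the description; the `patent` root element, the `p` paragraph element, and the sample text are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical analysis result. <field header="..."> and <ipc> follow the text;
# the <patent> root and the <p> paragraph elements are assumed for this sketch.
SAMPLE = """<patent>
  <ipc>C08F002/00</ipc>
  <field header="ABSTRACT"><p>A process for polymerization is disclosed.</p></field>
  <field header="FIELD OF THE INVENTION"><p>[0001] This invention relates to polymers.</p></field>
</patent>"""


def sections_by_header(xml_text):
    """Map each section title to the list of its paragraph texts."""
    root = ET.fromstring(xml_text)
    return {
        field.get("header"): [p.text for p in field.findall("p")]
        for field in root.iter("field")
    }


sections = sections_by_header(SAMPLE)
```

Keeping the section title as the dictionary key mirrors how the later steps look up paragraphs by field name.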
【００４１】
以上のようにして文書構造解析結果を得た後、要部選定部１２は、抽出規則辞書格納部２に格納される各抽出規則辞書の規則に従い、文書構造解析結果の中から順次要部を抽出する（Ｓ４：要部選定機能、要部選定ステップに相当する）。
【００４２】
この抽出規則辞書は、クラス（上位分類）からサブグループ（下位分類）まで細分化して展開されているＩＰＣ分類に基づく一定の技術区分（例えば、日本の公報発行区分に近い３０程度の技術区分や本発明を応用する分野の必要に応じた技術区分）ごとに設けられ、各抽出規則辞書内の規則の形式は、図５に示すように抄録項目、フィールド、キーフレーズの３つの要素から構成されている。抄録項目は、要部として出力する内容を日本語で記載されており、ここでは、技術分野、要旨、実施例（実施の形態を含む）という３つの項目が例示されている。フィールドは文書構造における各章節の表題を示している。キーフレーズは、フィールドで設定されている各章節を検索し、当該キーフレーズに定義されている単語やフレーズが表題の中味を説明する文中に含まれていれば、それを要部として抽出することを意味する。
【００４３】
例えば抄録作成対象である英語特許明細書の文書構造において、「ＢＲＥＩＦＤＥＳＣＲＩＰＴＩＯＮＯＦＴＨＥＰＲＩＯＲＡＲＴ」「ＦＩＥＬＤＯＦＴＨＥＩＮＶＥＮＴＩＯＮ」「ＡＢＳＴＲＡＣＴ」などの表題に対応する段落の文中に、辞書の規則で定める技術分野に対応するキーフレーズとなる「ｒｅｌａｔｅｓｔｏ」「ｉｓｄｉｒｅｃｔｅｄｔｏｗａｒｄ」「ｔｈｅｉｎｖｅｎｔｉｏｎｃｏｎｃｅｒｎｓ」などの単語やフレーズが含まれていれば、その段落が「技術分野」という抄録項目に相当する要部の候補であることを意味する。また、「ＳＵＭＭＡＲＹＯＦＴＨＥＩＮＶＥＮＴＩＯＮ」「ＳＵＭＭＡＲＹ」「ＡＢＳＴＲＡＣＴ」という表題に対応する段落の文中に、辞書の規則で定める要旨に対応するキーフレーズとなる「ｉｓｐｒｏｐｏｓｅｄ」「ｉｓｄｉｓｃｌｏｓｅｄ」「ｉｓｐｒｏｖｉｄｅｄ」「ｄｉｓｃｌｏｓｅｄｉｓ」などの単語、フレーズが含まれていれば、その段落が「要旨」という抄録項目に相当する要部の候補であることを意味する。なお、各抄録項目にはそれぞれ複数の条件の登録が可能であるので、例えば「要旨」に対応する規則には２つの抽出規則Ｒ２、Ｒ３に分けて登録されている。４番目の抽出規則であるＲ４に対応するキーフレーズには「Ｆｉｇ．？」が記載されているが、この「？」は任意の数字に照合することを意味している。つまり、キーフレーズの「Ｆｉｇ．？」は例えば「Ｆｉｇ．１」「Ｆｉｇ．２」等の何れとも照合可能であることを意味する。
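The rule format described above (abstract item, field, key phrase, with `?` in `Fig.?` matching any number) can be sketched as follows. This is an illustrative sketch, not the patent's implementation: the `Rule` class, the function names, and the concrete contents of `R1` and `R4` are assumptions, since the rule table of Fig. 5 is not reproduced in the text. Section titles are written with their spaces restored (for example `FIELD OF THE INVENTION`).

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    rule_id: str
    item: str           # abstract item, e.g. "technical field"
    fields: tuple       # section titles in which to search
    key_phrases: tuple  # words or phrases that mark a main part


def phrase_to_regex(phrase):
    """Compile a key phrase; '?' stands for any number, as in 'Fig.?'."""
    parts = [re.escape(part) for part in phrase.split("?")]
    return re.compile(r"\d+".join(parts), re.IGNORECASE)


def matches(rule, field, paragraph):
    """True if the paragraph sits in one of the rule's fields and contains a key phrase."""
    if field.upper() not in rule.fields:
        return False
    return any(phrase_to_regex(k).search(paragraph) for k in rule.key_phrases)


# Hypothetical rule contents; only the key phrases are quoted from the description.
R1 = Rule("R1", "technical field",
          ("FIELD OF THE INVENTION", "ABSTRACT"),
          ("relates to", "is directed toward", "the invention concerns"))
R4 = Rule("R4", "embodiment",
          ("DETAILED DESCRIPTION OF THE INVENTION",),
          ("Fig.?",))
```

Escaping the literal parts of the phrase before joining them keeps the dot in `Fig.` from matching arbitrary characters.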
【００４４】
そこで、要部選定部１２は、以上のような抽出規則辞書内の規則を用いて、図６に示す要部選定処理を実行する。具体的には、バッフアメモリ６又はデータベース５から図４に示す英語特許明細書の文書構造解析結果を読み込む（Ｓ１１）。一般に、特許明細書にはＩＰＣ分類が記載されているので、図４に示す文書構造解析結果にもタグ情報によりＩＰＣが認識可能になっている。つまり、図４の例では、「＜ｉｐｃ＞」と「＜／ｉｐｃ＞」とで囲まれている範囲がＩＰＣ分類が記されている部分であり、詳しくは「Ｃ０８Ｆ００２／００」と記載されている。この分類情報を用いることによって、クラス（上位分類）からサブグループ（下位分類）まで細分化して展開されているＩＰＣ分類に基づく技術区分に応じた抽出規則辞書を判断することができる。
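The paragraph above explains that the IPC code found between `<ipc>` and `</ipc>` (in the example, `C08F002/00`) determines which extraction rule dictionary to use. One possible way to do this is a longest-prefix lookup, sketched below. The prefix keys, the `GENERAL` fallback, and the dictionary names are all hypothetical; the description only says that the dictionaries are organised by technical categories based on the IPC classification.

```python
def select_dictionary(ipc_code, dictionaries, default_key="GENERAL"):
    """Return the dictionary whose IPC prefix is the longest prefix of ipc_code."""
    best = max(
        (prefix for prefix in dictionaries if ipc_code.startswith(prefix)),
        key=len,
        default=None,
    )
    return dictionaries[best] if best is not None else dictionaries[default_key]


# Hypothetical dictionary table keyed by IPC prefix.
DICTIONARIES = {
    "C": "chemistry rules",
    "C08": "polymer rules",
    "GENERAL": "fallback rules",
}
```

A longest-prefix match lets a coarse category (section `C`) coexist with a finer one (class `C08`) without duplicating rules.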
【００４５】
要部選定部１２は、文書構造解析結果からＩＰＣ分類を取り出すと、抽出規則辞書格納部２から該当するＩＰＣ分類に対応する抽出規則辞書を読み込んだ後（Ｓ１２）、次のステップＳ１３に移行し、抽出規則辞書の最初に参照すべき抽出規則（ＩＤ＝Ｒ１）として、変数ｉに１を設定する。そして、変数ｉ＝１の抽出規則に基づき、当該規則のフィールド及びキーフレーズと照合する段落が抄録作成対象の英語特許明細書（文書構造解析結果）に存在するか否かを判断する（Ｓ１４）。ここで、照合する段落が存在していれば、この照合する段落の位置情報（特定情報）を抽出し（図２のＳ５、図６のＳ１５）、バッフアメモリ６又はデータベース５に記憶する。
【００４６】
この段落の位置情報を抽出した後、或いはステップＳ１４において照合する段落が存在しない場合、次のステップＳ１６に移行し、抽出規則辞書に規定する次に処理すべき抽出規則（ＩＤ＝Ｒ２）として、変数ｉに＋１をインクリメントした後、該当変数に相当する抽出規則の項目数全部が終了したか否かを判断し（Ｓ１７）、未だ抽出規則の項目が残っている場合にはステップＳ１４に戻り、変数ｉの抽出規則のフィールド及びキーフレーズと照合する段落が抄録作成対象の特許明細書（文書構造解析結果）に存在するか否かを判断し、全ての抽出規則について繰り返し実行する。
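Steps S13 to S17 above amount to a loop over the extraction rules that records the position of every matching paragraph. The sketch below follows that flow under simplifying assumptions: rules are plain tuples, key phrases are matched as case-insensitive substrings, and the output rows use the field, paragraph position, abstract item, and rule ID layout that the description attributes to Fig. 8. The function and variable names are hypothetical.

```python
def select_main_parts(sections, rules):
    """sections: {title: [paragraphs]}; rules: [(rule_id, item, fields, key_phrases)]."""
    results = []
    i = 0                                   # S13: start with the first rule
    while i < len(rules):                   # S17: stop once every rule is processed
        rule_id, item, fields, phrases = rules[i]
        for field in fields:                # S14: look for matching paragraphs
            for pos, text in enumerate(sections.get(field, []), start=1):
                if any(p.lower() in text.lower() for p in phrases):
                    # S15: record the position information of the match
                    results.append((field, pos, item, rule_id))
        i += 1                              # S16: move on to the next rule
    return results


sections = {
    "FIELD OF THE INVENTION": ["This invention relates to polymers.", "Other text."],
    "ABSTRACT": ["A process is disclosed."],
}
rules = [
    ("R1", "technical field", ("FIELD OF THE INVENTION",), ("relates to",)),
    ("R2", "summary", ("ABSTRACT",), ("is disclosed", "is provided")),
]
main_parts = select_main_parts(sections, rules)
```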
【００４７】
図７は、本発明の主旨を説明するために、要部の抽出部分を模式的に示した図である。例えば図５に示す抽出規則辞書の２番目の抽出規則（ＩＤ＝Ｒ２）では、フィールドの中に「ＡＢＳＴＲＡＣＴ」、キーフレーズの中に「ｉｓｄｉｓｃｌｏｓｅｄ」が記載されている。従って、この２番目の抽出規則が「Ａｂｓｔｒａｃｔ」である表題に関係する図示点線枠で囲む説明文２１の中に含まれる「ｉｓｄｉｓｃｌｏｓｅｄ」（下線引きで示している）と照合するので、この「Ａｂｓｔｒａｃｔ」の説明文２１又はその文中の段落位置（特定情報）を「要旨」として抽出することになる。また、図５に示す抽出規則辞書の１番目の抽出規則（ＩＤ＝Ｒ１）では、フィールドの中に「ＦＩＥＬＤＯＦＴＨＥＩＮＶＥＮＴＩＯＮ」、キーフレーズの中に「ｒｅｌａｔｅｓｔｏ」が記載されている。従って、この１番目の抽出規則が「ＦＩＥＬＤＯＦＴＨＥＩＮＶＥＮＴＩＯＮ」の点線枠で囲む図７の説明文２２の中に含まれる「ｒｅｌａｔｅｓｔｏ」（下線引きで示している）と照合するので、この「ＦＩＥＬＤＯＦＴＨＥＩＮＶＥＮＴＩＯＮ」の説明文２２又はその文中の段落位置（特定情報）を「技術分野」として抽出する。さらに、図５に示す抽出規則辞書の３番目の抽出規則（ＩＤ＝Ｒ３）では、フィールドの中に「ＤＡＴＡＩＬＥＤＤＥＳＣＲＩＰＴＩＯＮＯＦＴＨＥＩＮＶＥＮＴＩＯＮ」、キーフレーズの中に「ｔｈｉｓｉｎｖｅｎｔｉｏｎａｄｄｒｅｓｓｅｓ」が記載されている。従って、この３番目の抽出規則が「ＤＡＴＡＩＬＥＤＤＥＳＣＲＩＰＴＩＯＮＯＦＴＨＥＩＮＶＥＮＴＩＯＮ」の点線枠で囲む説明文２３の中に含まれる「ｔｈｉｓｉｎｖｅｎｔｉｏｎａｄｄｒｅｓｓｅｓ」（下線引きで示している）と照合するので、この「ＤＡＴＡＩＬＥＤＤＥＳＣＲＩＰＴＩＯＮＯＦＴＨＥＩＮＶＥＮＴＩＯＮ」の説明文２３又はその文中の段落位置（特定情報）を「要旨」として抽出することになる。
【００４８】
図８は要部選定部１２による抽出結果の表現形式の一例を表す図である。
【００４９】
図７は図４の文書構造解析結果と対応づけながら抽出される要部について要部特定情報２１，２２，２３を付けて表したが、図８はより具体的に特許明細書中のフィールド、文中の段落位置などのポインタで対応付けて抽出することができる。因みに、図８は、フィールド、段落（段落位置，特定情報）、抄録項目、要部抽出に使用された抽出規則の組み合わせにより構成されている。具体的には、図８の１行目の情報は、「ＦＩＥＬＤＯＦＴＨＥＩＮＶＥＮＴＩＯＮ」という章節であって、１段落目が抽出規則Ｒ１に照合したことを示しており、抽出規則Ｒ１は、抄録項目の「技術分野」に対応する規則であることを表している。また、３行目の情報は、「ＤＡＴＡＩＬＥＤＤＥＳＣＲＩＰＴＩＯＮＯＦＴＨＥＩＮＶＥＮＴＩＯＮ」という章節であって、１段落目が抽出規則Ｒ３に照合したことを示しており、この抽出規則Ｒ３は、抄録項目の「要旨」に対応する規則であることを表している。
【００５０】
以上のようにして要部選定部１２が英語文書構造解析結果から例えば図８に示すような要部を抽出した後、或いは要部選定部１２による要部選定処理と同時並行的に処理する機械翻訳部１３が翻訳辞書部８の翻訳処理上の知識を用いて、英語特許文献の文書構造解析結果を翻訳処理し（Ｓ６）、母国語である例えば日本語の翻訳結果を得るものである（Ｓ７）。これらステップＳ６，Ｓ７は翻訳処理機能、翻訳処理ステップに相当する。
【００５１】
図９は図４に示す英語の文書構造解析結果に対し、ＸＭＬタグ以外の他の文書を日本語に翻訳処理した翻訳結果を示す図である。なお、ＸＭＬタグを除いて翻訳処理した理由は、翻訳結果についても元の英語の文書構造解析結果を保持しておく必要があるので、ＸＭＬタグについては翻訳処理を行わないこととする。なお、英日機械翻訳については、既に幾つかのパッケージソフトが市販されており、それらのパッケージソフトで実現されている翻訳処理技術を利用すればよい。これにより、図９のような翻訳結果を容易に取得することが可能である。
【００５２】
引き続き、抄録候補抽出部１４は、要部選定部１２で抽出された要部抽出結果（図８参照）と機械翻訳部１３によって取得された機械翻訳結果（図９参照）とを比較しながら、母国語である例えば日本語の抄録候補を抽出する（Ｓ８：抄録候補抽出機能、抄録候補抽出ステップに相当する）。
【００５３】
この抄録候補抽出部１４では、具体的には、図８の１行目の「ＦＩＥＬＤＯＦＴＨＥＩＮＶＥＮＴＩＯＮ」の１段落目に関する抄録候補を抽出することになる。図９に示す事例では、＜ｆｉｅｌｄｈｅａｄｅｒ＝”ＦＩＥＬＤＯＦＴＨＥＩＮＶＥＮＴＩＯＮ”＞において、「［０００１］」で始まる段落が抄録候補に相当する。また、図８の２行目の「ＡＢＳＴＲＡＣＴ」の１段落目に関する抄録候補を抽出することになる。図９に示す事例からは、＜ｆｉｅｌｄｈｅａｄｅｒ＝”ＡＢＳＴＲＡＣＴ”＞において、「プロセス」で始まる段落が抄録候補に相当する。以下、同様に図８に示す要部抽出結果の各行の情報に従い、機械翻訳結果から抄録候補を順次取り出し、データベース５の所定エリアに記憶する。図１０は、その抄録候補の抽出結果を示す図である。
【００５４】
ところで、以上説明した実施の形態では、行数や段落数が変化しないことを前提とし、機械翻訳処理した例について述べたが、機械翻訳処理の中には一文を複数文に分けて訳出する処理を行う場合もある。このような場合、図８に示すように段落や文単位の対応では、対応関係がずれてしまう可能性が生ずる。しかし、文書構造解析部１１では、英語特許明細書側の全ての文や段落に対して、ユニークな番号をもつＸＭＬタグを割り振りして処理しているので、その文書構造解析結果に基づいて、ＸＭＬタグの情報を保持したままで機械翻訳を行えば問題はなくなる。
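The point made above is that alignment by paragraph count can drift when machine translation splits one sentence into several, whereas alignment by the unique ID carried in each XML tag does not. A minimal sketch of ID-based candidate extraction follows. The ID strings, the data layout, and the function name are assumptions; the description only states that every sentence and paragraph receives a uniquely numbered tag that is preserved through translation.

```python
def extract_candidates(main_parts, translation):
    """Pick the translated text for each selected main part by its unique ID.

    main_parts:  list of (paragraph_id, abstract_item)
    translation: {paragraph_id: [translated sentences]}; one source sentence may
                 have been split into several target sentences.
    """
    return [
        (item, pid, "".join(translation[pid]))
        for pid, item in main_parts
        if pid in translation
    ]


translation = {
    "p1": ["本発明はポリマーに関する。"],
    "p2": ["プロセスが開示される。", "それは触媒を用いる。"],  # split by the translator
}
candidates = extract_candidates([("p2", "summary")], translation)
```

Because the lookup key is the ID rather than the position, a split sentence simply yields a longer candidate instead of shifting every later paragraph.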
【００５５】
従って、この抄録作成支援システムでは、以上のようにして抄録候補を抽出した段階では、抄録関連情報格納用データベース５には英語特許明細書全文（図３参照）、英語文書構造解析結果（図４）、日本語機械翻訳結果（図９参照）、日本語抄録候補抽出結果（図１０参照）等が格納されているので、抄録編集表示制御部１５は、入力部１からの選択操作のもとに、抄録関連情報格納用データベース５から所要の情報を選択的に取り出し、所要のフォーマットに従って出力部７に表示する（Ｓ９）。
【００５６】
図１１は抄録作成者による所要の操作のもとに出力部７に表示された日本語・英語の要部表示（上段）又は日本語・英語の全文表示（下段）の画面表示例を示す図である。
【００５７】
先ず、抄録作成者は、初期画面を取り出すと、この初期画面には要部ボタン３１、全文ボタン３２、開くボタン３３、コピーボタン３４、保存ボタン３５、終了ボタン３６その他の所要とするボタン（図示せず）の他、図１１の上段左側の表示エリア３７には日本語の翻訳要部（抄録候補）が順次表示される。一方、この翻訳要部の表示に連動して図１１の上段右側の表示エリア３８に翻訳要部と対応関係をとりながら英語の原文要部が順次表示される。なお、図面のスペースの関係から翻訳要部及び原文要部のそれぞれの文字が記載できないので、図１２に一部の翻訳要部及び原文要部を記載してある。なお、各表示エリア３７，３８は、入力部１を構成するマウスなどのポインティングデバイスによって１つの段落を選択可能になっている。
【００５８】
すなわち、表示エリア３７と表示エリア３８は連動しており、表示エリア３７のある段落が選択されると、表示エリア３８には原文特許明細書に対応する個所の英文の要部が連動して選択表示される。３９は抄録作成エリアであって、抄録作成者が表示エリア３７，３８の翻訳要部、原文要部を確認しながら最適な抄録であると判断したとき、当該翻訳要部をコピーし、必要に応じて文字削除や挿入等の修正処理を施して和文抄録とするエリアである。つまり、抄録作成者は抄録作成エリア３９の内容を適宜編集処理して和文抄録を作成する。また、抄録作成者が参考とするために、表示エリア４０には発明者、出願年月日、ＩＰＣなどの書誌情報が表示され、表示エリア４１には特許された発明の名称の翻訳結果が表示される。
【００５９】
さらに、全文ボタン３２を操作した時には、図１１の下段に示すように翻訳全文と原文全文とが連動して表示される。その後、要部ボタン３１を操作すると、同図の上段に示す日本語・英語の要部が表示される。
【００６０】
図１３は抄録作成者又はユーザ（以下、抄録作成者と総称する）が実際に利用する場合の状態遷移図並びに抄録編集表示制御部１５の状態遷移に伴う処理を説明する図である。
【００６１】
同図において、状態５１はシステムの立ち上げ時の初期状態、状態５２は抄録作成者が抄録処理画面の開操作により表示エリア３７に翻訳結果の要部が表示されている状態、状態５３はシステムの終了を意味する終了状態、状態５４は抄録作成者による全文表示の操作により表示エリア３７に明細書全文の翻訳結果が表示されている状態を意味する。
【００６２】
さらに、具体的に状態遷移について説明すると、抄録作成者が状態５１にて開くボタン３３を操作すると、英語特許明細書のファイル名を当該抄録作成者に問い合わせて取得することにより、英語特許明細書から抽出された要部の翻訳結果、つまり翻訳要部をデータベース５の所要エリアから取り出し、表示エリア３７に表示し、これに連動して表示エリア３８には翻訳要部に対応する原文要部を表示し、さらに表示エリア３９を初期化した後、状態５２に遷移する。
【００６３】
この状態５２において、表示エリア３７に表示されている段落を選択してコピーボタン３４を操作すると、当該段落が表示エリア３９に挿入される。状態５２は、要部（翻訳）の状態であって、抄録項目（図１１，図１２参照）である「技術分野」「要旨」「実施例」が段落毎に設定されている状態にあるので、その抄録項目に従って表示エリア３９にコピーによって順次挿入する（Ｓ５２）。
【００６４】
また、状態５４では、全文ボタン３２の操作により、表示エリア３７に翻訳された全文（全文翻訳）が表示され、これに連動して表示エリア３８に原文全文が表示される（Ｓ５３）。この状態５４は、抄録項目が決定されていないので、抄録作成者からの入力を要求する等の処理を経て抄録項目を設定した後、表示エリア３７に表示されている段落を必要に応じてコピー操作により表示エリア３９にコピーする（Ｓ５４）。また、状態５２或いは状態５４において、保存ボタン３５を操作すると、抄録作成者による保存先の和文抄録のファイル名の入力に基づき、当該ファイル名のもとにデータベース５に表示エリア３９の内容が保存される（Ｓ５５、Ｓ５６）。なお、状態５４において開くボタン３３を操作すれば、ステップＳ５１と同様の処理を実行し（Ｓ５７）、また要部ボタン３１を操作すれば、ステップＳ５２と同様の処理を実行する（Ｓ５８）。そして、各状態において、終了ボタン３６を操作すれば、抄録作成及び所要とする情報の取り出し表示処理が終了する。
【００６５】
従って、以上のような実施の形態によれば、外国語特許明細書の項目（文書構造）を解析し、一定の形式に基づくタグ付けした文書構造解析結果を取得した後、この文書構造解析結果に基づいて、要部選定部１２が予め定める抽出規則辞書の規則に従って要部を抽出し、また機械翻訳部１３では、文書構造解析結果の内容を通常の翻訳処理のもとに例えば母国語に翻訳する。しかる後、抄録候補抽出部１４が要部抽出結果である各要部ごとに翻訳された日本語全文の翻訳結果の中から抄録候補を取り出した後、翻訳要部と原文要部を対応付けて表示し、翻訳要部をコピーし、原文要部を参照し、必要に応じて英語特許明細書の文書構造解析結果全部及び翻訳された翻訳結果全文を参照しながら、文字削除、挿入により和文抄録を作成するので、和文抄録を効率的に作成でき、かつ人手による和文抄録作業を適切に支援することができる。
【００６６】
なお、要部表示は、図１１、図１２に示すような形態に限るものではない。例えば図１４に示すように、要部ボタン３１を翻訳要部ボタン３１ａと原文要部ボタン３１ｂに分け、また全文ボタン３２についても翻訳全文ボタン３２ａと原文全文ボタン３２ｂとに分け、これらボタンを個別に選択操作することにより、４つの表示形態に選択して表示する構成であってもよい。要するに、英語特許明細書から選択した要部の英文やその翻訳結果、オリジナルの英語特許明細書、その翻訳した全文明細書などを自在に順次表示することが可能である。
【００６７】
次に、図１５は本発明に係る抄録作成支援システムの他の実施の形態を説明する図である。
【００６８】
この実施の形態は、抽出規則辞書格納部２に格納される各抽出規則辞書の規則に予め重要度を設定し、抽出された段落の重要度に基づいて抄録候補の表示量ないし表示数を制御する例である。
【００６９】
本発明に係る抄録作成支援システムでは、図１５に示す抽出規則辞書に予め重要度を設定しておけば、要部選定部１２によって要部を選定すると、図８に示す要部抽出結果にさらに重要度が付加された図１６に示す要部抽出結果が得られる。つまり、要部抽出結果には、抽出される各段落毎に、抽出の時に用いられた抽出規則の重要度情報が付加されている。その結果、図示されていないが、図１０に示す抄録候補の翻訳結果にも各要部ごとに重要度の情報が付加される。なお、予め「１」は重要度が最も高く、数値が大きくなるに従って重要度が低くなるように取り決められているものとする。
【００７０】
その結果、抄録編集表示制御部１５では、図１７に示すごとく重要度を考慮しつつ抄録候補の抽出処理を実行する。すなわち、抄録編集表示制御部１５は、先ず最初に変数ｉに１番目の抽出規則（ＩＤ＝Ｒ１）である１を設定した後（Ｓ６１）、この抽出規則（ＩＤ＝Ｒ１）に対応する重要度が予め定める閾値よりも大きいか、つまり閾値よりも重要度が高いか否かを判断し（Ｓ６２）、重要度が高いと判断された場合には、当該フィールドの段落に対応する日本語の翻訳要部を表示エリア３７に表示する（Ｓ６３）。この翻訳要部を表示した後又はステップＳ６２において閾値よりも重要度が低いと判断された場合、ステップＳ６４に移行し、変数ｉに＋１を加えた後、この変数ｉの番目の抽出規則（ＩＤ＝Ｒｉ）が未だ残っているか否かを判断し（Ｓ６５）、未だ残っている場合にはステップＳ６２に戻り、同様の処理を実行する。全ての抽出段落数について重要度を調べて要部翻訳を表示したとき、重要度に関する処理は終了する。
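Steps S61 to S65 above filter the translated main parts by the importance attached to the rule that extracted them. The sketch below assumes the convention stated earlier (1 is the most important, larger numbers are less important) and reads "importance higher than the threshold" as "rank number not greater than the threshold"; that reading, along with the names used, is an assumption.

```python
def visible_candidates(candidates, threshold):
    """candidates: list of (rule_id, importance_rank, translated_text).

    Rank 1 is the most important, so a candidate is shown when its rank
    number does not exceed the threshold (assumed interpretation).
    """
    shown = []
    for rule_id, rank, text in candidates:  # S61, S64, S65: visit every extracted part
        if rank <= threshold:               # S62: compare importance with the threshold
            shown.append(text)              # S63: display in the candidate area
    return shown


candidates = [
    ("R1", 1, "technical field text"),
    ("R2", 2, "summary text"),
    ("R4", 3, "embodiment text"),
]
```

Raising the threshold shows more candidates, which matches the later remark that the abstract creator can change the threshold at any time.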
【００７１】
従って、以上のような実施の形態によれば、抄録作成段階で予め定める閾値よりも重要度の高い翻訳要部だけを表示エリア３７に表示して和文抄録を作成することができ、また少ない表示量を用いて効率的に和文抄録を作成できる。
【００７２】
なお、重要度の閾値は、抄録作成者が随時変更でき、抄録作成者の所要とする和文抄録を容易に作成できる。
【００７３】
図１８は本発明に係る抄録作成支援システムのさらに他の実施の形態を説明する図である。
【００７４】
この実施の形態は、図４に示す文書構造解析結果に記載される明細書から抄録候補として「代表図の説明」を抽出する例である。
【００７５】
本発明に係る抄録作成支援システムでは、抽出規則辞書に規定する規則の中のキーフレーズにフレーズの照合だけでなく、特殊記号を追加することにより、抽出規則辞書の表現形式を拡張し、「代表図の説明」を要部として抽出するものである。
【００７６】
このキーフレーズに導入する記号のうち、「＄ＲｅｐＦｉｇ＄」は代表図であることを意味する記号であり、「＄＾Ｌｉｎｅ＄」は指定されたフィールドの先頭からＬｉｎｅ段落までをキーフレーズの探索範囲であることを意味する記号である。「＄^＊Ｌｉｎｅ＄」は照合する段落以下のＬｉｎｅ段落を取り出すことを意味する。「＄^＊＄」は、例えば「ＳＵＭＭＡＲＹ」フィールドに関しては全ての段落を要部として取り出すことを意味する。
【００７７】
このシステムは、抽出規則辞書部に図１８に示すような抽出規則を記載することにより、要部抽出部１２では、１番目の抽出規則（ＩＤ＝Ｒ１）に関し、例えば代表図がＦｉｇ．１であれば、フィールドに規定する「ＦＩＥＬＤＯＦＴＨＥＩＮＶＥＮＴＩＯＮ」に対応する文書構造解析結果の中に「Ｆｉｇ．１ｓｈｏｗｓ」というキーフレーズが存在するかを探索することを意味する。また、２番目の抽出規則（ＩＤ＝Ｒ２）に関し、フィールドに規定する「ＤＡＴＡＩＬＥＤＤＥＳＣＲＩＰＴＩＯＮＯＦＴＨＥＩＮＶＥＮＴＩＯＮ」に対応する文書構造解析結果の中の３段落目までを探索範囲とし、「Ｆｉｇ．？ｓｈｏｗｓ」（「？」は任意の数字）というキーフレーズが存在しているかを探索することを意味する。さらに、３番目の抽出規則（ＩＤ＝Ｒ３）に関し、フィールドに規定する「ＣＬＡＩＭＳ」に対応する文書構造解析結果の中の２段落（他のキーフレーズの指定がないため）を抽出することを意味する。
【００７８】
因みに、米国公開特許明細書のＸＭＬ形式の定義である“−／／ＵＳＰＴＯ／／ＤＴＤＰＡＰＶ１．６２００２−０１−０１／／ＥＮ”におけるｐａｐ−ｖ１６−２００２−０１−０１．Ｄｔｄによれば、＜ｒｅｐｒｅｓｅｎｔａｔｉｖｅ・ｆｉｇｕｒｅ＞と＜／ｒｅｐｒｅｓｅｎｔａｔｉｖｅ・ｆｉｇｕｒｅ＞とで囲まれた数字が代表図を示している。このような形式で記載された特許明細書を入力した場合、代表図が何であるかを特許明細書中から容易に取り出すことができる。
【００７９】
従って、要部選定部１２の図６に示すステップＳ１４では、以上のような定義情報を用いて、抽出規則辞書中の「＄ＲｅｐＦｉｇ＄」という記号の具体的な文字列に置き換えた後、通常のキーフレーズに従って探索するように処理すればよい。すなわち、例えば抄録作成対象とする特許明細書の代表図が「Ｆｉｇ．１」であれば、「＄ＲｅｐＦｉｇ＄」は「Ｆｉｇ．１」に置換できるので、抽出規則Ｒ１のキーフレーズは「Ｆｉｇ．１ｓｈｏｗｓ」となる。よって、フィールドに規定する「ＤＡＴＡＩＬＥＤＤＥＳＣＲＩＰＴＩＯＮＯＦＴＨＥＩＮＶＥＮＴＩＯＮ」に対応する文書構造解析結果の中から「Ｆｉｇ．１ｓｈｏｗｓ」というキーフレーズを探索することになる。
【００８０】
また、要部選定部１２のキーフレーズの探索のステップＳ１４（図６）において、抽出規則に記載されているキーフレーズに、特殊記号「＄＾Ｌｉｎｅ＄」が含まれていれば、キーフレーズの探索範囲をＬｉｎｅ段落まで限定したり、特殊記号「＄^＊Ｌｉｎｅ＄」が含まれていれば、キーフレーズに照合する段落以下のＬｉｎｅ段落を要部として抽出したり、キーフレーズに特殊記号「＄^＊＄」が含まれていれば、処理対象のフィールドの全段落を要部として選択する等の処理を行うことになる。
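The special symbols described above can be sketched as a small interpreter: `$RepFig$` is replaced by the representative figure, `$^N$` limits the search to the first N paragraphs of the field, `$^*N$` takes N paragraphs starting at the matching one, and `$^*$` takes every paragraph of the field. How the symbols are written alongside ordinary words in one key phrase is not shown in the text, so the combined syntax below (for example `$^3$ Fig.? shows`) is an assumption, as are the function name and sample paragraphs.

```python
import re


def apply_rule(paragraphs, key_phrase, representative_figure=None):
    """Return the paragraphs selected by a key phrase that may contain special symbols."""
    phrase = key_phrase
    if representative_figure is not None:
        phrase = phrase.replace("$RepFig$", f"Fig.{representative_figure}")
    if phrase == "$^*$":                        # take the whole field
        return list(paragraphs)
    take = 1
    m = re.search(r"\$\^\*(\d+)\$", phrase)     # $^*N$: take N paragraphs from the match
    if m:
        take = int(m.group(1))
        phrase = phrase.replace(m.group(0), "")
    limit = len(paragraphs)
    m = re.search(r"\$\^(\d+)\$", phrase)       # $^N$: search only the first N paragraphs
    if m:
        limit = int(m.group(1))
        phrase = phrase.replace(m.group(0), "")
    phrase = phrase.strip()
    # '?' matches any number; an empty phrase matches the first paragraph.
    pattern = re.compile(r"\d+".join(re.escape(x) for x in phrase.split("?")))
    for i, text in enumerate(paragraphs[:limit]):
        if pattern.search(text):
            return paragraphs[i:i + take]
    return []


paras = ["Intro text.", "Fig.1 shows a reactor.", "Fig.2 shows a valve.", "More."]
```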
【００８１】
従って、このような実施の形態によれば、特許明細書の中から代表図の説明を抄録候補として容易に抽出することができる。
【００８２】
なお、以上のような特殊記号を新たに定義し、それに合せて要部選定部１２を構成すれば、代表図の説明以外にも要部とする内容を容易に抽出することが可能である。
【００８３】
図１９は本発明に係る抄録作成支援システムのさらに他の実施の形態を説明する図である。
【００８４】
この実施の形態は、抽出規則辞書格納部２の中に新たに不要語辞書を設け、抽出規則辞書の条件に照合する段落であっても、不要語に照合する段落に限り、抽出しない構成とする例である。
【００８５】
図１９は不要語辞書の一例を示す図であって、例えば１番目と２番目の規則Ｆ１，Ｆ２の不要語は、第２の実施例を説明する際に用いられるフレーズであり、要部として選択する上で適切でない場合がある。また、３番目と４番目の規則Ｆ３，Ｆ４の不要語は、従属形式請求項となっている場合に用いられるフレーズであり、これも要部として選択すべきでない場合が多い為である。
【００８６】
図２０は抽出規則辞書格納部２に抽出規則辞書の他に不要語辞書を加えた場合の要部選定部１２の一連の処理例を示すフローチャートである。
【００８７】
この要部選定部１２において、図６との処理の違いは、ステップＳ７４にて抽出規則ｉと照合する段落が存在するか否かを判断し、段落有りと判断されたとき、不要語辞書の規則に基づいて照合した段落の中に不要語が存在するか否かを判断し（Ｓ７５）、不要語が含まれている場合には要部として抽出しない処理としたものである。従って、その他のステップＳ７１〜Ｓ７４は図６のステップＳ１１〜Ｓ１４に相当し、またステップＳ７６〜Ｓ７８は図６のステップＳ１５〜Ｓ１７に相当するので、詳細な処理は図６の説明に譲る。
【００８８】
従って、以上のような実施の形態によれば、不要語辞書の規則に定める不要語が機械翻訳結果の要部に相当する段落に存在する場合、要部として適切でないと判断し、その段落については要部として抽出しないので、より精度の高い和文抄録を作成できる。
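The additional check in step S75 can be sketched as a filter applied to paragraphs that already matched an extraction rule. The unwanted phrases below are hypothetical examples of the two kinds the description mentions (phrases that introduce a second embodiment and phrases typical of dependent claims); the actual entries of Fig. 19 are not reproduced in the text.

```python
def drop_unwanted(matched_paragraphs, unwanted_phrases):
    """S75: discard matched paragraphs that contain any unwanted phrase."""
    return [
        paragraph
        for paragraph in matched_paragraphs
        if not any(u.lower() in paragraph.lower() for u in unwanted_phrases)
    ]


# Hypothetical unwanted-word dictionary entries.
UNWANTED = ("in another embodiment", "according to claim")

matched = [
    "A process for polymerization is provided.",
    "In another embodiment, a catalyst is provided.",
    "The process according to claim 1, wherein a catalyst is provided.",
]
kept = drop_unwanted(matched, UNWANTED)
```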
【００８９】
なお、図１９に示す不要語辞書では、ＩＰＣ分類毎に不要語辞書を用意する場合を想定しているが、例えば抽出規則辞書内の各抄録項目や各抽出規則に対して、不要語を設定するような構成であってもよい。
【００９０】
また、前述する各実施の形態では、要部選択並びに抄録候補生成において段落を処理単位とするものであったが、これらは他の言語単位（文や節、フレーズ、単語等）を処理単位として構成することもできる。
【００９１】
さらに、本実施の形態では、抄録作成対象とする例えば英語特許明細書に記載されたＩＰＣ分類に従って抽出規則辞書を選択的に利用する形態をとったが、例えば機械翻訳に関し、専門分野に応じた専門用語辞書を選択的に利用するようにすれば、翻訳精度の向上が期待され、さらにＩＰＣ分類に従って機械翻訳で利用する辞書を選択するように変形することも可能である。
【００９２】
さらに、近年、機械翻訳システムでは、翻訳メモリの機能が実現されている（例えばＴｈｅ翻訳プロフェッショナルＶ８．０）。この翻訳メモリとは、ある英文について過去に精度の高い日本語訳を作成した場合、当該英文と類似する英文については、同じ日本語訳を再利用した方が質の良い翻訳を行える可能性が高いという事実に基づき、過去の英文−日本語訳の組合せを記憶しておき、新しい英文を翻訳する際に過去の類似文を検索し、類似文が存在する場合に当該類似文に対応する日本文訳をユーザに提示する機能である。
【００９３】
そこで、本発明に係る抄録作成支援システムにおいても、例えば要部選定部１２で抽出された要部に関する翻訳結果を抄録作成者に提示し、その翻訳結果を参考にしつつ最終的に和文抄録を作成する構成でもよい。この場合、要部選定部１２で抽出された英文要部と修正結果の和文抄録とを記憶しておけば、機械翻訳部１３における翻訳メモリ機能を利用することが可能となる。つまり、本発明システムとしては、要部選定部１２で抽出された英文要部と抄録作成者による編集終了後の和文抄録とを対として記憶する記憶手段と、翻訳メモリ機能とをさらに設けた機械翻訳部１３とすれば、翻訳メモリ機能を有効に活用しながら和文抄録を作成できる。この際、ＩＰＣ分類に対応させて英文要部−和文抄録を記憶する構成とすれば、抄録作成対象の英語特許明細書のＩＰＣ分類に従って探索する英文要部−和文抄録の過去事例を効率的に選択利用することが可能である。
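The translation-memory idea above (store each English main part together with the final edited Japanese abstract, grouped by IPC classification, and offer the stored Japanese text when a similar English sentence appears) can be sketched as follows. The class, its methods, and the use of `difflib.SequenceMatcher` with a 0.8 similarity cut-off are assumptions chosen for illustration; the description does not specify how similarity is measured.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Stores (source main part, edited abstract) pairs grouped by IPC class."""

    def __init__(self):
        self._entries = {}

    def add(self, ipc_class, source, target):
        self._entries.setdefault(ipc_class, []).append((source, target))

    def lookup(self, ipc_class, source, min_ratio=0.8):
        """Return the stored target of the most similar source, or None."""
        best, best_ratio = None, min_ratio
        for stored_source, target in self._entries.get(ipc_class, []):
            ratio = SequenceMatcher(None, stored_source.lower(), source.lower()).ratio()
            if ratio >= best_ratio:
                best, best_ratio = target, ratio
        return best


memory = TranslationMemory()
memory.add("C08", "This invention relates to a polymerization process.",
           "本発明は重合方法に関する。")
```

Grouping entries by IPC class keeps each lookup within one technical field, which is the efficiency gain the paragraph describes.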
【００９４】
また、以上の実施の形態では、日本語を母国語とする抄録作成者が和文抄録する例について述べたが、例えば英語から中国語、英語からドイツ語、日本語から英語など，他の言語対に対しても同様に適用可能である。従って、この場合には、抽出規則辞書における抽出規則や機械翻訳を適用する言語についても、他の言語対に合せて組み込むようにすれば、他の言語対の対応する抄録作成支援システムを実現できることは言うまでもない。
【００９５】
また、上記実施の形態では、日本語の抄録候補と原文の要部とを対応付けて表示したが、抄録候補抽出部１４で抽出される日本語の抄録候補だけを表示することも可能である。
【００９６】
次に、図２１は本発明に係る要部作成機能付き検索システムの一実施の形態を説明する機能構成図である。
【００９７】
この検索システムは、文書検索部６１、図１の抽出規則辞書格納部２と同様の機能をもつ抽出規則辞書格納部６２、図１の文書構造解析部１１と同様の機能をもつ文書構造解析部６３、図１の要部選定部１２と同様の機能をもつ要部選定部６４、図１の機械翻訳部１３と同様の機能をもつ機械翻訳部６５、検索候補抽出部６６、検索結果表示制御部６７により構成され、その他種々の構成要素例えば特許文献保存用データベース６８、翻訳辞書（図示せず）、一連の検索処理用プログラムを記録する記録媒体等が設けられている。なお、同図において図１と同一機能をもつ構成要素６２〜６５等は、図１，図２で既に詳しく説明しているので、ここでは省略する。
【００９８】
先ず、特許文献保存用データベース６８には、多数の外国語特許明細書，例えば英語特許明細書が電子化されて蓄積される。
【００９９】
この状態において、文書検索部６１は、図示しない検索入力部から例えば技術分野やＩＰＣ分類等の検索条件のもとに検索指示が入力されると、記録媒体から検索処理用プログラムに従って一連の処理を実行する。すなわち、文書検索部６１は、検索条件が入力されると、データベース６８から検索条件に適合した複数の英語特許明細書を検索する（７１）。なお、この文書検索部６１としては、従来公知の種々の文書検索技術が用いられる。
【０１００】
文書検索部６１で複数の英語特許明細書が検索されると、一時バッフアメモリなどに格納した後、例えば検索された順番に検索結果である特許明細書を文書構造解析部６３に送出し、特許明細書の文書構造を解析する。なお、特許明細書の文書構造解析については図１，図２で既に説明した通りである。その結果、図４に示すような文書構造解析結果を取り出すことができる（７２）。
【０１０１】
このようにして取り出された文書構造解析結果は、要部選定部６４及び機械翻訳部６５に送られる。この要部選定部６４は、図６に示す処理の流れに従って文書構造解析結果の中から抽出規則辞書格納部６２に格納される抽出規則辞書（図５参照）に定める規則に従って要部を抽出する（７３）。この要部抽出結果は図７又は図８に示す通りである。一方、機械翻訳部６５では、文書構造解析結果に対し、翻訳辞書部８に規定する規則・知識を用いて、母国語に翻訳し、図９に示すような機械翻訳結果を取得する（７４）。なお、これら要部選定部６４及び機械翻訳部６５は、図１及び図２に関連する説明の中で詳しく説明されているので、ここでは省略する。
【０１０２】
引き続き、検索候補抽出部６６は、要部選定部６４の抽出結果である各要部ごとに機械翻訳部６５で翻訳された外国語特許明細書全文の日本語翻訳結果の中から翻訳された検索要部を取り出し、例えば検索結果保存用データベース（図示せず）に格納する。この検索結果保存用データベースには、検索した原文である全文英語特許明細書、機械翻訳部６５で機械翻訳された全文翻訳された母国語特許明細書、要部選定部６４で抽出された原文要部、検索候補抽出部６６で生成された翻訳要部が格納されている。
【０１０３】
そこで、検索結果表示制御部６７では、図２２に示す処理手順に従って図１の７に相当する出力部に表示する。図２３は画面表示例を示す図である。なお、初期画面には全文表示ボタン８１、要部表示ボタン８２、次選択ボタン８３、前選択ボタン８４が形成されている。
【０１０４】
検索結果表示制御部６７は、要部ボタン８２が操作されると、検索された英語特許明細書が複数存在する場合には画面奥行き方向に表示エリア８５、８６が重なりをもって表示される。この状態において、変数ｉとして１がセットされると、検索結果保存用データベースから検索されたある１つの英語特許明細書ｉから抽出された原文要部（要部抽出結果）を取り出し、当該原文要部を項分けして表示エリア８５に表示する。そして、この原文要部に連動し、各原文要部に対応する翻訳要部を表示エリア８６に表示する（Ｓ８２）。
【０１０５】
しかる後、何れかのボタンが操作されたかを判断し（Ｓ８３）、ボタン操作がない場合には繰り返し判断する。このステップＳ８３において、ボタン操作有りと判断された場合には、全文表示ボタン８１が操作されたか否かを判断し（Ｓ８４）、全文ボタン８１が操作されたと判断した場合には、該当特許明細書ｉの原文全文を表示エリア８５に表示し、それに伴って原文全文に対応する機械翻訳結果である全文翻訳を表示エリア８６に表示する（Ｓ８５）。そして、次に何れかのボタンが操作されたかを判断し（Ｓ８６）、ここで、要部表示ボタン８２が操作されたと判断された場合（Ｓ８７）にはステップＳ８２に戻って同様に英語特許明細書ｉから抽出された原文要部及び翻訳要部を表示する。
【０１０６】
ところで、ステップＳ８４において、ボタン操作が行われたが、全文表示ボタン８１の操作でないと判断された場合、前選択ボタン８４の操作か、次選択ボタン８３の操作かを判断する（Ｓ８９、Ｓ９０）。ここで、前選択ボタン８４が操作されたと判断された場合、変数ｉに１（下限数）の範囲でｉ−１をセットし（Ｓ９１）、ステップＳ８２に移行し、１つ前の英語特許明細書ｉから抽出された原文要部及び翻訳要部を表示エリア８５，８６にそれぞれ表示する。一方、次選択ボタン８３が操作されたと判断された場合、変数ｉにＮ（上限数）の範囲でｉ＋１をセットし（Ｓ９２）、ステップＳ８２に移行し、次の英語特許明細書ｉから抽出された原文要部及び翻訳要部を表示エリア８５，８６にそれぞれ表示する。
【０１０７】
なお、ステップＳ８７において、ボタン操作が要部表示ボタン８２の操作でなく、前選択ボタン８４が操作された場合（Ｓ９３）には変数ｉにｉ−１をセットし（Ｓ９４）、ステップＳ８５に移行し、原文全文及び翻訳全文を表示エリア８５，８６にそれぞれ表示する。また、次選択ボタン８３が操作された場合（Ｓ９５）には変数ｉにｉ＋１をセットし（Ｓ９６）、ステップＳ８５に移行し、原文全文及び翻訳全文を表示エリア８５，８６にそれぞれ表示する。
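The button handling in steps S83 to S96 can be summarised as a small state update on the pair (document index, view mode). The sketch below is an assumption-laden simplification: the button names are hypothetical, and the index is clamped to the range 1 to N in both view modes, although the description states the bounds explicitly only for the main-part view.

```python
def step(state, button, n_documents):
    """Update (index, mode) for one button press.

    state:  (i, mode) where mode is "main" (main parts) or "full" (full text)
    button: "full", "main", "prev", or "next" (hypothetical names)
    """
    i, mode = state
    if button == "full":        # S84, S85: switch to the full-text view
        mode = "full"
    elif button == "main":      # S87: switch back to the main-part view
        mode = "main"
    elif button == "prev":      # S91, S94: previous document, not below 1
        i = max(1, i - 1)
    elif button == "next":      # S92, S96: next document, not above N
        i = min(n_documents, i + 1)
    return (i, mode)


state = (1, "main")
for pressed in ("next", "full", "next", "next", "prev"):
    state = step(state, pressed, n_documents=3)
```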
【０１０８】
従って、以上のような実施の形態によれば、多数の外国語特許明細書の中から検索条件のもとに少なくとも１つ以上の外国語特許明細書を検索し、この検索された外国語特許明細書について、全文翻訳された母国語特許明細書、原文要部、翻訳要部を取り出し、表示画面に選択的に外国語特許明細書−全文翻訳された母国語特許明細書、原文要部−翻訳要部の如く対の関係で表示するので、検索者は容易に所要とする技術内容を把握でき、ひいては検索効率を大幅に高めることができる。
【０１０９】
なお、図２３では、要部を項目別に表示したが、項目別でなく、要部選定部６４で抽出された段落の出現順に表示する構成であってもよい。
【０１１０】
また、図２３には図示されていないが、全文表示の場合、既に要部選定部６４で要部が抽出されているので、当該全文中の要部相当部分を例えば異なる色などで強調表示すれば、全文中でどの個所が重要であるかが一目瞭然に認識でき、さらに検索効率を高めることができる。
【０１１１】
なお、本願発明は、上記実施の形態に限定されるものでなく、その要旨を逸脱しない範囲で種々変形して実施できる。
【０１１２】
また、各実施の形態は可能な限り組み合わせて実施することが可能であり、その場合には組み合わせによる効果が得られる。さらに、上記各実施の形態には種々の上位，下位段階の発明が含まれており、開示された複数の構成要素の適宜な組み合わせにより種々の発明が抽出され得るものである。例えば問題点を解決するための手段に記載される全構成要件から幾つかの構成要件が省略されうることで発明が抽出された場合には、その抽出された発明を実施する場合には省略部分が周知慣用技術で適宜補われるものである。
【０１１３】
【発明の効果】
以上説明したように本発明によれば、第１言語の特許文献から効率的、適切な第２言語の抄録を作成できる抄録作成支援システム、プログラム及び抄録作成支援方法を提供できる。
【０１１４】
また、本発明は、検索された第１言語の特許明細書に対応する全文翻訳の第２言語の特許明細書、原文要部、翻訳要部を作成し、選択的に表示することにより、迅速に技術内容を把握でき、ひいては検索効率を大幅に向上させる特許文献検索システム、プログラム及び特許文献検索方法を提供できる。
【図面の簡単な説明】
【図１】本発明に係る抄録作成支援システムの一実施の形態を示す構成図。
【図２】図１に示す抄録作成支援システムの一連の処理の流れを説明する図。
【図３】抄録作成対象となる例えば英語特許明細書の一例を示す図。
【図４】図１に示す文書構造解析部で解析された英語特許明細書の文書構造解析結果を示す図。
【図５】抽出規則辞書格納部に格納される抽出規則辞書内の規則を説明する図。
【図６】図１に示す要部選定部の具体的な処理の流れを説明するフローチャート。
【図７】要部選定部で抽出された原文要部抽出結果のイメージ図。
【図８】要部選定部で抽出された要部抽出結果の一表現例を示す図。
【図９】図３に示す抄録作成対象となる例えば英語特許明細書の機械翻訳結果を示す図。
【図１０】図１に示す抄録候補抽出部で取り出した和文の抄録候補の一例を説明する図。
【図１１】抄録候補の表示例であって、特に抽出された翻訳要部及び原文要部の表示例、この翻訳要部から和文抄録を編集処理する例及び翻訳全文及び原文全文の表示例を説明する図。
【図１２】翻訳要部と原文要部を対応付けて表示する例を示す図。
【図１３】図１に示す抄録編集表示制御部による状態遷移とその状態遷移に伴う処理を説明する図。
【図１４】別の抄録候補の表示例を示す図。
【図１５】本発明に係る抄録作成支援システムの他の実施の形態を説明するための抽出規則辞書の一例を示す図。
【図１６】要部選定部において図１５に示す抽出規則辞書を用いて抽出した要部抽出結果の一例を示す図。
【図１７】要部選定部における一連の処理の流れを説明するフローチャート。
【図１８】本発明に係る抄録作成支援システムのさらに他の実施の形態を説明するための抽出規則の一例を示す図。
【図１９】本発明に係る抄録作成支援システムのさらに他の実施の形態を説明するための不要語辞書の一例を示す図。
【図２０】要部選定部において抽出規則辞書及び不要語辞書を用いて要部を抽出するための処理の一例を示すフローチャート。
【図２１】本発明に係る検索システムの一実施の形態を説明する機能構成図。
【図２２】図２１における検索結果表示制御部の一連の処理を説明するフローチャート。
【図２３】検索結果の表示例を示す図。
【符号の説明】
１…原文明細書入力部、２…抽出規則辞書格納部、３…抄録作成処理制御部、４…記録媒体、５…抄録関連情報格納用データベース、７…出力部、８…翻訳辞書部、１１…文書構造解析部、１２…要部選定部、１３…機械翻訳部、１４…抄録候補抽出部、１５…抄録編集表示制御部、６１…文書検索部、６２…抽出規則辞書格納部、６４…要部選定部、６５…機械翻訳部、６６…検索候補抽出部、６７…検索結果表示制御部。
[0001]
[Technical field of the invention]
The present invention relates to an abstract creation support system, a program, an abstract creation support method, a patent document search system, and a patent document search method that are used when creating a native-language abstract from a foreign-language patent document or the like.
[0002]
[Prior art]
Conventionally, with the aim of providing information on what patents have been filed in countries other than Japan, it has been routine practice to consult the full text of patent documents (specifications including claims, drawings, etc.) written in a foreign language (for example, English) and to create abstracts in the native language (for example, Japanese), that is, Japanese abstracts. The Japanese abstracts thus created are stored in the database of a search system and made available to end users for browsing and searching.
[0003]
When a Japanese abstract is created from an English patent document, the abstract creator currently reads the full text of the English patent document in question, identifies its principal parts (hereinafter referred to as main parts), and translates those main parts into the native language. The Japanese abstract is written by drawing on the technical field, the summary, the embodiments, and so on, within a predetermined length in Japanese (for example, about 600 to 800 characters in total).
[0004]
Because this abstract creation work relies entirely on manual work by the abstract creator, it takes a great deal of effort. Moreover, before writing the abstract, the abstract creator must grasp the content of the entire English patent document. The creator therefore has to read the whole document through, which incurs a large work cost and makes the abstract creation work very inefficient, so an improvement in work efficiency has been called for.
[0005]
In order to improve the efficiency of abstract creation work, several abstract creation support systems have been proposed in recent years.
[0006]
One such abstract creation support system is a text summarization system that extracts the chapter and section structure of a technical paper or the like and decides, for each section, whether to use it as the abstract as is (for example, the summary section), to apply summarization processing to it, or to ignore it without summarization (for example, the references) (see Patent Document 1).
[0007]
Another abstract creation support system is a document information retrieval device and document retrieval result display method that, when presenting retrieval results, automatically generates an abstract from the original text and presents the generated abstract to the user. It also extracts the correspondence between the abstract and the original text, links the abstract display with the original-text display, and highlights the abstract portions within the original-text display (see Patent Document 2).
[0008]
[Patent Document 1]
Japanese Patent No. 2766261
[0009]
[Patent Document 2]
Japanese Patent No. 2957875
[0010]
[Problems to be solved by the invention]
Therefore, the abstract creation support system as described above is intended for technical papers, etc., and the important sentence extraction processing method for abstract creation is based on statistical information such as the frequency of words used in the document. Based on the method (for example, selection criteria that words that are frequently used are important, and sentences containing important words are important), methods for selecting important sentences based on conjunctions and expressions at the end of sentences (for example, “In short” The selection criteria that sentences using adverbs are important are disclosed.
[0011]
However, these important sentence extraction processing methods are intended for technical papers and the like, and are methods for selecting important sentences according to uniform selection criteria. However, there is a problem that it is impossible to extract a high-quality important sentence that can be determined to be a main part from the viewpoint of patent literature.
[0012]
The abstract creation support system described above is intended to create an abstract for a single language. For example, it does not have a support function for creating a Japanese abstract from an English patent document, so a Japanese abstract cannot be created from an English patent document. There is a problem.
[0013]
Ideally, abstracts would be created fully automatically. To produce a high-quality abstract, however, a final manual check by the abstract creator is necessary. What is needed, therefore, is a system that supports abstract creation rather than one that creates abstracts automatically.
[0014]
Further, in the abstract creation support systems described above, main parts are extracted according to the chapter structure, such as "summary", "technical field to which the invention belongs", and "examples" (for instance, the "summary" is extracted as a main part). However, they do not describe preparing extraction rule dictionaries, which define the words and phrases that should mark main parts, in advance for each category of the International Patent Classification (hereinafter referred to as IPC classification), nor do they describe switching the extraction rule dictionary according to the IPC classification of the patent document to be abstracted or according to chapter structure such as "technical field" and "examples". It is therefore difficult to use these systems as they are. In addition, there is no description of linking abstract creation with machine translation.
[0015]
Furthermore, since the text the abstract creator produces is a Japanese abstract, the main parts must be extracted and translated into Japanese. Conventionally, when translation is done manually, the workload often means that patent documents in the same technical field are assigned to abstract creators from different fields, and with manual translation alone it is very difficult to keep the translation of technical terms consistent.
[0016]
It is therefore conceivable to use a machine translation system to support translation from English patent documents into Japanese. Several machine translation packages are already on the market. A system has also been proposed that prepares technical term dictionaries for machine translation by IPC classification, determines which IPC classification the patent document to be translated belongs to, and selectively switches technical term dictionaries according to that classification in order to translate appropriately (Japanese Patent Laid-Open No. 2003-16063). However, that system is intended for machine translation and does not address the creation of abstracts.
[0017]
From the above, one conceivable approach is to create an abstract from a foreign language patent specification using a conventional abstract creation system and then translate the resulting English abstract into Japanese using an English-Japanese translation system. (Here, a foreign language patent specification is a specification including the claims, and also covers patent-related documents such as utility models. Although the names differ from country to country, the term refers to publicly provided, digitized patent application documents.)
[0018]
However, for an abstract creator to produce a high-quality Japanese abstract, simply reading the Japanese translation is not sufficient; the corresponding English source text and the surrounding context must also be consulted. A mere combination of a conventional abstract creation support system and a machine translation system does not allow abstract creation across languages to proceed as a single series of work steps, and so cannot adequately support the creation of Japanese abstracts from English patent documents.
[0019]
In any case, patent abstracts that reflect the characteristics of patent documents in each technical field are very important documents for grasping technical trends and for other practical purposes, and the realization of a support system for creating abstracts in the native language has long been awaited.
[0020]
The present invention has been made in view of the above circumstances, and its object is to provide an abstract creation support system, program, and abstract creation support method for efficiently and appropriately creating an abstract in a second language from a digitized patent application document in a first language.
[0021]
Another object of the present invention is to provide a patent document search system, program, and patent document search method that retrieve a required first-language patent application document from among digitized first-language patent application documents and create, from the retrieved document, search main parts in the first language and search candidates in the second language, thereby realizing efficient search work.
[0022]
[Means for Solving the Problems]
(1) In order to solve the above-mentioned problems, the abstract creation support system according to the present invention comprises: an extraction rule dictionary that defines, for each predetermined classification, at least the required first-language chapter titles and key phrases, such as first-language words and phrases, contained in the explanatory text written under those titles; document structure analysis means for analyzing the chapter structure of the text of a digitized patent application document in the first language; main part selection means for selecting the required extraction rule dictionary according to the classification described in the document structure analysis result produced by the document structure analysis means, and for selecting main parts from the document structure analysis result based on the titles and key phrases defined in that extraction rule dictionary; machine translation means for translating the document structure analysis result produced by the document structure analysis means into the second language; abstract candidate extraction means for extracting, for each main part selected by the main part selection means, a second-language abstract candidate from the second-language translation result produced by the machine translation means; and abstract edit display control means for displaying, of the second-language abstract candidates extracted by the abstract candidate extraction means and the first-language main parts corresponding to them, at least the second-language abstract candidates, or the second-language abstract candidates and the first-language main parts in association with each other, and for creating a second-language abstract through processing that includes correction of the abstract candidates.
[0023]
According to the present invention, with the configuration described above, main parts in the first language are selected, in accordance with the field, from the chapter structure of the text of a first-language patent application document; second-language abstract candidates are extracted from the second-language translation result obtained by translating the first-language document structure analysis result; and the second-language abstract is created by correcting those candidates. The second-language abstract candidates can therefore be extracted automatically without the abstract creator reading through the first-language application documents, and the second-language abstract can be created in a short time, including confirmation and correction by the abstract creator. Abstracts can thus be created both efficiently and appropriately.
[0024]
Note that the abstract can also be created by a program or by an abstract creation support method, with the same effects as the system.
[0025]
(2) Further, the patent document search system according to the present invention comprises: storage means for storing a plurality of digitized patent application documents in a first language; document search means for retrieving at least one first-language patent application document from the storage means based on a search condition; an extraction rule dictionary that defines, for each predetermined classification, at least the required first-language chapter titles and key phrases, such as first-language words and phrases, contained in the explanatory text written under those titles; document structure analysis means for analyzing the chapter structure of the text of the digitized first-language patent application document retrieved by the document search means; main part selection means for selecting the required extraction rule dictionary according to the classification described in the document structure analysis result produced by the document structure analysis means, and for selecting search main parts from the document structure analysis result based on the titles and key phrases defined in that extraction rule dictionary; machine translation means for translating the document structure analysis result produced by the document structure analysis means into the second language; search candidate extraction means for extracting, for each search main part selected by the main part selection means, a second-language search candidate from the second-language translation result produced by the machine translation means; and search result display control means for displaying, of the second-language search candidates extracted by the search candidate extraction means and the first-language search main parts corresponding to them, at least the first-language search main parts, or the first-language search main parts and the second-language search candidates in association with each other, or for selectively displaying the first-language search main parts and second-language search candidates, the first-language document structure analysis result, and the second-language translation result.
[0026]
With the configuration described above, the present invention retrieves at least one first-language patent application document matching a search condition from among a plurality of digitized first-language patent application documents, applies the configuration described in (1) above to the retrieved document, and extracts and displays at least the first-language search main parts and the second-language search candidates, thereby realizing efficient search work.
[0027]
The search main parts and search candidates can also be extracted by a program or by a patent document search method, with the same operational effects as the system.
[0028]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[0029]
FIG. 1 is a block diagram illustrating an embodiment of an abstract creation support system according to the present invention.
[0030]
This abstract creation support system comprises: an original text description input unit 1 for inputting first-language patent application documents such as digitized foreign language patent documents; an extraction rule dictionary storage unit 2 that stores a plurality of extraction rule dictionaries, each holding extraction rules with predetermined elements for a predetermined category of the IPC classification; an abstract creation processing control unit 3 that creates an abstract in a second language, such as the native language, from the foreign language patent document or the like input from the original text description input unit 1; a recording medium 4 that, when the abstract creation processing control unit 3 is implemented by a CPU, stores a program for executing the series of abstract creation processes; an abstract-related information storage database 5 that stores the native-language abstract created by the abstract creation processing control unit 3 and other abstract-related information; a buffer memory 6 that temporarily stores information needed during the abstract creation process; and an output unit 7 that outputs the created native-language abstract and other necessary information. Reference numeral 8 denotes a translation dictionary unit that stores the translation knowledge used during translation processing.
[0031]
Note that the above-mentioned first-language patent application documents refer to all text information having the character of publicly provided publications, including publicly available gazettes, publicly known gazettes, registered patent gazettes, and gazettes relating to utility models. Strictly speaking, the specification in a patent application document may be a document separate from the claims and the abstract, but in the patent documents that are distributed, the specification, the claims, and the abstract are printed together in no fixed order, so the term is used here to include the claims and the abstract. Hereinafter, first-language patent application documents are collectively referred to as foreign language patent specifications or English patent specifications for convenience of explanation.
[0032]
The original text description input unit 1 is generally a keyboard, a mouse, and the like. It covers various input forms: inputting control instructions from the keyboard; inputting a portion of a foreign language patent specification selected by designating a specific area with the mouse; inputting information read from a foreign language patent specification by an OCR (optical character reader); and inputting digitized foreign language patent specifications stored on a floppy disk, magnetic tape, magnetic disk, or the like.
[0033]
The extraction rule dictionary storage unit 2 holds an extraction rule dictionary for each predetermined category (for example, each category obtained by further subdividing the electrical, mechanical, chemical, physical, or other fields), for example for each IPC category, at least at the level of the upper categories. Predetermined extraction rules are defined in each extraction rule dictionary. Each rule consists of, for example, three elements: abstract item, field, and key phrase. A specific example is as shown in FIG.
[0034]
Functionally, the abstract creation processing control unit 3 comprises: a document structure analysis unit 11, serving as the document structure analysis means, that analyzes the items (document structure) of the foreign language patent specification input from the original text description input unit 1 and creates a description of the document structure, tagged in a fixed format so that each item can be recognized; a main part selection unit 12, serving as the main part selection means, that extracts main parts from the document structure analysis result produced by the document structure analysis unit 11 according to the rules defined in the extraction rule dictionary; a machine translation unit 13, serving as the translation processing means, that translates the document structure analysis result output from the document structure analysis unit 11 into the native language using the rules and knowledge defined in the translation dictionary unit 8; an abstract candidate extraction unit 14, serving as the abstract candidate extraction means, that, for each main part extracted by the main part selection unit 12, extracts a native-language abstract candidate from the full native-language translation of the foreign language patent specification produced by the machine translation unit 13; and an abstract editing display control unit 15 that displays the native-language abstract candidates extracted by the abstract candidate extraction unit 14 in association with the foreign-language main parts (main parts of the original text) extracted by the main part selection unit 12, or displays the entire foreign-language document structure analysis result in association with the native-language translation result produced by the machine translation unit 13, and creates the native-language abstract through processing that includes correction of the native-language abstract candidates by the abstract creator.
[0035]
The output unit 7 has the functions of outputting the translation result from the machine translation unit 13, displaying the foreign language patent specification input from the input unit 1, displaying the created native-language abstract, and displaying the full text of the foreign language patent specification, its translation, and other information in the various required formats. Display means such as various displays are normally used, but the output unit also covers the various output forms a user may want, such as means for writing to and registering on a floppy disk, magnetic tape, or magnetic disk, and transmission means for sending to other media.
[0036]
Next, the operation of the above abstract creation support system, the program according to the present invention, and the abstract creation support method will be described with reference to FIG.
[0037]
First, when a digitized foreign language patent specification, for example an English patent specification, is input from the original text description input unit 1 (S1), the abstract creation processing control unit 3 executes a series of processes in accordance with the program stored in the recording medium 4. The abstract creation processing control unit 3 could also be implemented in hardware, for example with logic circuits, but here the series of processes is executed according to a program.
[0038]
Generally, as shown in FIG. 3, an English patent specification contains items (document structure) such as Abstract, Claims, FIELD OF THE INVENTION, and BACKGROUND OF THE INVENTION. In addition to these items, the inventor, the IPC classification code, and the like are also described, although they are not shown here.
[0039]
Therefore, when the abstract creation processing control unit 3 receives the digitized English patent specification from the original text description input unit 1, it temporarily stores the specification in, for example, the buffer memory 6, and the document structure analysis unit 11 then performs document structure analysis on it (S2: corresponds to the document structure analysis function and the document structure analysis step). Assuming that the patent specification is in XML format, the document structure analysis unit 11 analyzes each predetermined item (document structure) of the English patent specification, converts the document structure into tags based on the XML tag information for each item, and stores the result in the buffer memory 6 or the database 5 as the document structure analysis result (S3).
[0040]
FIG. 4 shows part of the document structure analysis result produced by the document structure analysis unit 11. Here, the document structure for each item, such as Abstract, Claims, and FIELD OF THE INVENTION, is expressed in XML format. In the document structure analysis result, the range from a tag <field header="XXXX"> to the corresponding </field> is the portion belonging to the title "XXXX". This example assumes that the patent specification is in XML format; for other formats, each item (document structure) of the English patent specification is likewise analyzed, and its document structure is converted into the corresponding format.
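As a rough illustration of the tagged result described above, the following sketch wraps each titled chapter of a plain-text specification in a <field header="..."> element. Only the <field header> and <ipc> tags come from the description; the <document> root, the <p> paragraph tag, the list of known titles, and the sample sentences are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: a small, fixed list of chapter titles to recognize.
KNOWN_HEADERS = (
    "ABSTRACT",
    "CLAIMS",
    "FIELD OF THE INVENTION",
    "BACKGROUND OF THE INVENTION",
)

def analyze_structure(text, ipc):
    """Wrap each titled chapter in <field header="...">, one <p> per paragraph line."""
    root = ET.Element("document")
    ET.SubElement(root, "ipc").text = ipc
    field = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.upper() in KNOWN_HEADERS:
            # A recognized title opens a new chapter.
            field = ET.SubElement(root, "field", header=line.upper())
        elif field is not None:
            ET.SubElement(field, "p").text = line
    return root

# Hypothetical sample text, not taken from FIG. 3.
sample = """Abstract
A process for polymerizing monomers is disclosed.

FIELD OF THE INVENTION
This invention relates to polymerization.
"""
doc = analyze_structure(sample, "C08F 002/00")
print(ET.tostring(doc, encoding="unicode"))
```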
[0041]
After the document structure analysis result has been obtained as described above, the main part selection unit 12 sequentially extracts main parts from it according to the rules of the extraction rule dictionaries stored in the extraction rule dictionary storage unit 2 (S4: corresponds to the main part selection function and the main part selection step).
[0042]
The extraction rule dictionaries are prepared for each technical category (for example, about 30 technical categories close to the categories used for issuing Japanese publications, or technical categories chosen according to the needs of the field to which the present invention is applied), based on the IPC classification, which is subdivided from class (upper category) down to subgroup (lower category). Each rule in an extraction rule dictionary consists of three elements, abstract item, field, and key phrase, as shown in FIG. The abstract item describes, in Japanese, the content to be output as a main part; three items are illustrated here: technical field, summary, and example (including embodiment). The field gives the titles of chapters in the document structure. The key phrase means that each chapter listed in the field is searched, and if a sentence explaining the content under that title contains a word or phrase defined as a key phrase, it is extracted as a main part.
[0043]
For example, in the document structure of an English patent specification to be abstracted, if a paragraph under a title such as "BRIEF DESCRIPTION OF THE PRIOR ART", "FIELD OF THE INVENTION", or "ABSTRACT" contains a word or phrase such as "relates to", "is directed toward", or "the invention concerns", which are key phrases defined in the dictionary rules for the technical field, that paragraph is a candidate for the main part corresponding to the abstract item "technical field". Likewise, if a paragraph under the title "SUMMARY OF THE INVENTION", "SUMMARY", or "ABSTRACT" contains a word or phrase such as "is promoted", "is disclosed", "is provided", or "disclosed is", that paragraph is a candidate for the main part corresponding to the abstract item "summary". Because several conditions can be registered for each abstract item, the rule for "summary", for example, is divided into two extraction rules, R2 and R3. The key phrase of the fourth extraction rule R4 is "FIG.?". The "?" matches an arbitrary number, so the key phrase "FIG.?" matches "FIG. 1", "FIG. 2", and so on.
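One possible way to implement the "?" wildcard described above is to compile each key phrase into a regular expression. This is only a sketch: the description does not say how matching is implemented, and the choices that "?" means one or more digits, that whitespace before the number is optional, and that matching ignores case are assumptions.

```python
import re

def keyphrase_to_regex(key_phrase):
    """Compile a key phrase; each '?' stands for a number (assumed: one or more digits)."""
    literal_parts = [re.escape(part) for part in key_phrase.split("?")]
    # Assumption: optional whitespace may separate the phrase from the number.
    return re.compile(r"\s*\d+".join(literal_parts), re.IGNORECASE)

def matches(key_phrase, sentence):
    """True if the key phrase occurs anywhere in the sentence."""
    return keyphrase_to_regex(key_phrase).search(sentence) is not None

print(matches("FIG.?", "FIG. 1 shows the reactor."))            # number after "FIG."
print(matches("FIG.?", "FIG. A shows the reactor."))            # no number follows
print(matches("relates to", "This invention relates to polymers."))
```

Key phrases without a "?" pass through unchanged, so the same function can serve rules such as "relates to" and "is disclosed".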
[0044]
The main part selection unit 12 therefore executes the main part selection process shown in FIG. 6 using the rules of the extraction rule dictionary described above. Specifically, it reads the document structure analysis result of the English patent specification shown in FIG. 4 from the buffer memory 6 or the database 5 (S11). Because the IPC classification is generally written in a patent specification, it can also be identified from the tag information in the document structure analysis result of FIG. 4. In the example of FIG. 4, the range enclosed by "<ipc>" and "</ipc>" is where the IPC classification is given, specifically "C08F 002/00". Using this classification information, the extraction rule dictionary for the corresponding technical category can be determined based on the IPC classification, which is subdivided from class (upper category) down to subgroup (lower category).
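The dictionary selection described above might be sketched as a lookup that falls back from more specific to less specific IPC prefixes. The dictionary names, the prefix table, the <document> root element, and the fallback order are hypothetical; the description only states that a dictionary is chosen from the IPC classification found between <ipc> and </ipc>.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical mapping from IPC prefixes to extraction rule dictionary names.
DICTIONARIES = {
    "C08": "polymer_rules",
    "C": "chemistry_rules",
    "H": "electricity_rules",
}
DEFAULT_DICTIONARY = "general_rules"

def ipc_prefixes(ipc):
    """Return IPC prefixes from most to least specific: subclass, class, section."""
    m = re.match(r"\s*([A-H])(\d{2})([A-Z])", ipc)
    if not m:
        return []
    section, cls, subclass = m.groups()
    return [section + cls + subclass, section + cls, section]

def select_dictionary(analysis_xml):
    """Pick the dictionary registered for the most specific matching prefix."""
    root = ET.fromstring(analysis_xml)
    ipc = root.findtext("ipc", default="")
    for prefix in ipc_prefixes(ipc):
        if prefix in DICTIONARIES:
            return DICTIONARIES[prefix]
    return DEFAULT_DICTIONARY

print(select_dictionary("<document><ipc>C08F 002/00</ipc></document>"))
```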
[0045]
When the IPC classification has been extracted from the document structure analysis result, the main part selection unit 12 reads the extraction rule dictionary corresponding to that IPC classification from the extraction rule dictionary storage unit 2 (S12) and proceeds to step S13, where the variable i is set to 1 so that the first extraction rule in the dictionary (ID = R1) is referred to first. Then, based on the extraction rule for i = 1, it determines whether the English patent specification to be abstracted (the document structure analysis result) contains a paragraph that matches the field and key phrase of the rule (S14). If there is a matching paragraph, its position information (specific information) is extracted (S5 in FIG. 2, S15 in FIG. 6) and stored in the buffer memory 6 or the database 5.
[0046]
After the position information of the paragraph has been extracted, or when step S14 finds no matching paragraph, the process proceeds to step S16, where the variable i is incremented by 1 to designate the extraction rule to be processed next (ID = R2). It is then determined whether all extraction rules have been processed (S17). If rules remain, the process returns to step S14 and determines whether the patent specification to be abstracted (the document structure analysis result) contains a paragraph that matches the field and key phrase of extraction rule i. This is repeated for all extraction rules.
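The loop over steps S13 to S17 can be sketched as follows. The Rule and Hit record layouts, the dictionary-of-paragraph-lists document representation, the sample sentences, and the plain case-insensitive substring matching are assumptions; the step numbers in the comments refer to the steps named in the description.

```python
from typing import NamedTuple

class Rule(NamedTuple):
    rule_id: str
    abstract_item: str
    fields: tuple
    key_phrases: tuple

class Hit(NamedTuple):
    field: str
    paragraph: int        # 1-based paragraph position within the field
    abstract_item: str
    rule_id: str

# Illustrative rules modeled on the three-element format (abstract item, field, key phrase).
RULES = [
    Rule("R1", "technical field", ("FIELD OF THE INVENTION",), ("relates to",)),
    Rule("R2", "summary", ("ABSTRACT",), ("is disclosed",)),
    Rule("R3", "summary", ("DETAILED DESCRIPTION OF THE INVENTION",),
         ("this invention addresses",)),
]

def select_main_parts(document, rules):
    hits = []
    i = 0                                      # S13: start with the first rule
    while i < len(rules):                      # S17: stop once every rule is done
        rule = rules[i]
        for field in rule.fields:              # S14: collate field and key phrase
            for pos, paragraph in enumerate(document.get(field, []), start=1):
                text = paragraph.lower()
                if any(k.lower() in text for k in rule.key_phrases):
                    # S15: record the position information of the paragraph
                    hits.append(Hit(field, pos, rule.abstract_item, rule.rule_id))
        i += 1                                 # S16: move on to the next rule
    return hits

# Hypothetical document: chapter title -> list of paragraphs.
document = {
    "ABSTRACT": ["A process for polymerizing monomers is disclosed."],
    "FIELD OF THE INVENTION": ["This invention relates to polymerization.",
                               "It is useful in industry."],
    "DETAILED DESCRIPTION OF THE INVENTION": ["This invention addresses reactor fouling."],
}
hits = select_main_parts(document, RULES)
for hit in hits:
    print(hit)
```

Each Hit holds a field, a paragraph position, an abstract item, and the rule that matched, which is one way to represent the position information stored in step S15.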
[0047]
FIG. 7 schematically shows the extracted main parts in order to explain the gist of the present invention. For example, in the second extraction rule (ID = R2) of the extraction rule dictionary shown in FIG. 5, the field is "ABSTRACT" and the key phrase is "is disclosed". This rule matches "is disclosed" (underlined) in the explanatory text 21, enclosed in a dotted frame under the title "Abstract", so the explanatory text 21 of "Abstract", or its paragraph position (specific information), is extracted as the "summary". In the first extraction rule (ID = R1), the field is "FIELD OF THE INVENTION" and the key phrase is "relates to". This rule matches "relates to" (underlined) in the explanatory text 22 of FIG. 7, enclosed in the dotted frame under "FIELD OF THE INVENTION", so the explanatory text 22, or its paragraph position (specific information), is extracted as the "technical field". Further, in the third extraction rule (ID = R3), the field is "DETAILED DESCRIPTION OF THE INVENTION" and the key phrase is "this invention addresses". This rule matches "this invention addresses" (underlined) in the explanatory text 23, enclosed in the dotted frame under "DETAILED DESCRIPTION OF THE INVENTION", so the explanatory text 23, or its paragraph position (specific information), is extracted as the "summary".
[0048]
FIG. 8 is a diagram illustrating an example of an expression format of an extraction result by the main part selection unit 12.
[0049]
FIG. 7 shows the extracted main parts as main part specifying information 21, 22, and 23 in association with the document structure analysis result of FIG. 4. More specifically, as shown in FIG. 8, the main parts can be extracted in association with pointers such as the field in the patent specification and the paragraph position within it. Each entry in FIG. 8 is a combination of a field, a paragraph (paragraph position, specific information), an abstract item, and the extraction rule used to extract the main part. Specifically, the first line of FIG. 8 indicates that the first paragraph of the chapter "FIELD OF THE INVENTION" matches extraction rule R1, and that R1 is a rule for the abstract item "technical field". The third line indicates that the first paragraph of the chapter "DETAILED DESCRIPTION OF THE INVENTION" matches extraction rule R3, and that R3 is a rule for the abstract item "summary".
[0050]
After the main part selection unit 12 has extracted the main parts from the English document structure analysis result as shown in FIG. 8, or in parallel with the main part selection process, the machine translation unit 13 translates the document structure analysis result of the English patent document using the translation knowledge of the translation dictionary unit 8 (S6) and obtains the translation result in the native language, for example Japanese (S7). Steps S6 and S7 correspond to the translation processing function and the translation processing step.
[0051]
FIG. 9 shows the translation result obtained by translating into Japanese everything other than the XML tags in the English document structure analysis result shown in FIG. 4. The XML tags are excluded from translation because the translation result must retain the document structure of the original English analysis result. Several packaged software products for English-Japanese machine translation are already on the market, and the translation techniques they implement may be used. The translation result shown in FIG. 9 can thus be obtained easily.
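Translating only the text between tags can be sketched by walking the XML tree and passing text nodes to a translation function while leaving tag names and attributes untouched. The function toy_translate below is a hypothetical stand-in that merely prefixes a marker; it does not represent any real translation package, and the <document> and <p> tags are assumptions.

```python
import xml.etree.ElementTree as ET

def toy_translate(text):
    """Hypothetical stand-in for an English-Japanese engine: only adds a marker."""
    return "【訳】" + text

def translate_preserving_tags(xml_text, translate):
    """Translate text nodes only; tag names and attributes are left as they are."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.text and elem.text.strip():
            elem.text = translate(elem.text)
        if elem.tail and elem.tail.strip():
            elem.tail = translate(elem.tail)
    return ET.tostring(root, encoding="unicode")

source_xml = (
    '<document><field header="FIELD OF THE INVENTION">'
    "<p>This invention relates to polymerization.</p>"
    "</field></document>"
)
translated_xml = translate_preserving_tags(source_xml, toy_translate)
print(translated_xml)
```

Because the header attribute stays in English, the translated tree can still be addressed by the same chapter titles as the original.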
[0052]
Subsequently, the abstract candidate extraction unit 14 collates the main part extraction result (see FIG. 8) produced by the main part selection unit 12 with the machine translation result (see FIG. 9) obtained by the machine translation unit 13, and extracts abstract candidates in the native language, for example Japanese (S8: corresponds to the abstract candidate extraction function and the abstract candidate extraction step).
[0053]
Specifically, the abstract candidate extraction unit 14 first extracts the abstract candidate for the first paragraph of "FIELD OF THE INVENTION", given on the first line of FIG. 8. In the example of FIG. 9, the paragraph beginning with "[0001]" inside <field header="FIELD OF THE INVENTION"> is that abstract candidate. Next, it extracts the abstract candidate for the first paragraph of "ABSTRACT", given on the second line of FIG. 8. In the example of FIG. 9, the paragraph beginning with "process" inside <field header="ABSTRACT"> is that abstract candidate. In the same way, abstract candidates are extracted one after another from the machine translation result according to each line of the main part extraction result shown in FIG. 8, and are stored in a predetermined area of the database 5. FIG. 10 shows the extraction result of the abstract candidates.
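The lookup described above amounts to using each (field, paragraph position) entry of the main part extraction result as a pointer into the translated document. The sketch below assumes a <p> tag per paragraph and uses placeholder strings instead of real translated text; the tuple layout of MAIN_PARTS is also an assumption.

```python
import xml.etree.ElementTree as ET

# Placeholder translated document; real content would be Japanese text.
TRANSLATED = """<document>
  <field header="FIELD OF THE INVENTION">
    <p>(translated paragraph 1 of the field chapter)</p>
    <p>(translated paragraph 2 of the field chapter)</p>
  </field>
  <field header="ABSTRACT">
    <p>(translated paragraph 1 of the abstract)</p>
  </field>
</document>"""

# Hypothetical main part extraction result: (field, paragraph position, abstract item).
MAIN_PARTS = [
    ("FIELD OF THE INVENTION", 1, "technical field"),
    ("ABSTRACT", 1, "summary"),
]

def extract_candidates(translated_xml, main_parts):
    """Return (abstract item, translated paragraph) for each selected main part."""
    root = ET.fromstring(translated_xml)
    by_header = {f.get("header"): f.findall("p") for f in root.iter("field")}
    candidates = []
    for header, position, item in main_parts:
        paragraphs = by_header.get(header, [])
        if 1 <= position <= len(paragraphs):   # skip pointers that do not resolve
            candidates.append((item, paragraphs[position - 1].text))
    return candidates

candidates = extract_candidates(TRANSLATED, MAIN_PARTS)
for item, text in candidates:
    print(item, "->", text)
```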
[0054]
The embodiment above assumes that machine translation does not change the number of lines or paragraphs. In machine translation, however, a single sentence may be split into several sentences when translated. In that case, correspondences between paragraphs and sentences such as those shown in FIG. 8 may shift. However, the document structure analysis unit 11 assigns XML tags with unique numbers to every sentence and paragraph of the English patent specification, so no problem arises as long as machine translation is performed on the document structure analysis result with the XML tag information retained.
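The role of the unique numbers can be illustrated with a sketch in which every sentence carries an id, so that a translated sentence is looked up by id rather than by counting lines. The <p> and <s> tag names, the id scheme, and the toy engine that splits a sentence at a semicolon are assumptions made for illustration.

```python
import copy
import xml.etree.ElementTree as ET

def tag_sentences(paragraphs):
    """Give every paragraph and sentence a unique id (tag names are assumptions)."""
    root = ET.Element("document")
    count = 0
    for p_no, sentences in enumerate(paragraphs, start=1):
        p = ET.SubElement(root, "p", id=f"p{p_no}")
        for sentence in sentences:
            count += 1
            ET.SubElement(p, "s", id=f"s{count}").text = sentence
    return root

def toy_translate(sentence):
    """Hypothetical engine: may return several sentences for one input sentence."""
    return [part.strip().upper() for part in sentence.split(";")]

def translate_tree(root, translate):
    """Translate sentence text only; ids and tags are carried over unchanged."""
    out = copy.deepcopy(root)
    for s in out.iter("s"):
        s.text = " | ".join(translate(s.text))
    return out

def aligned_pair(source, target, sentence_id):
    """Look up the same sentence id in the source and translated trees."""
    path = f".//s[@id='{sentence_id}']"
    return source.find(path).text, target.find(path).text

source = tag_sentences([["A is heated; B is added.", "C is cooled."]])
target = translate_tree(source, toy_translate)
print(aligned_pair(source, target, "s1"))
print(aligned_pair(source, target, "s2"))
```

Even though the first sentence is split in two by the toy engine, the second sentence is still found under the same id in both trees.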
[0055]
Thus, in this abstract creation support system, once abstract candidates have been extracted as described above, the abstract related information storage database 5 holds the full text of the English patent specification (see FIG. 3), the English document structure analysis result (see FIG. 4), the Japanese machine translation result (see FIG. 9), the Japanese abstract candidate extraction result (see FIG. 10), and the like. In response to a selection operation from the input unit 1, the abstract editing display control unit 15 selectively retrieves the required information from the abstract related information storage database 5 and displays it on the output unit 7 in the required format (S9).
[0056]
FIG. 11 is a diagram showing screen display examples, namely the Japanese/English main part display (upper part) and the Japanese/English full text display (lower part), which are displayed on the output unit 7 in response to operations by the abstract creator.
[0057]
First, when the abstract creator brings up the initial screen, a main part button 31, a full text button 32, an open button 33, a copy button 34, a save button 35, an end button 36, and other necessary buttons are displayed. In addition, the Japanese translation main parts (abstract candidates) are sequentially displayed in the display area 37 on the upper left side of FIG. 11. In conjunction with the display of the translation main parts, the corresponding English original text main parts are sequentially displayed in the display area 38 on the upper right side of the same figure. Because the individual characters of the translation main parts and the original text main parts cannot be reproduced in the limited space of the drawing, only a part of them is shown. In each of the display areas 37 and 38, a single paragraph can be selected with a pointing device, such as a mouse, that constitutes the input unit 1.
[0058]
That is, the display area 37 and the display area 38 are linked: when a paragraph in the display area 37 is selected, the corresponding English main part of the original patent specification is displayed in the selected state in the display area 38. Reference numeral 39 denotes an abstract creation area. When the abstract creator, while checking the translation main parts and the original text main parts in the display areas 37 and 38, judges a translation main part to be suitable for the abstract, that translation main part is copied into this area and, as necessary, corrected by deleting or inserting characters to produce the Japanese abstract. In other words, the abstract creator creates the Japanese abstract by appropriately editing the contents of the abstract creation area 39. For the abstract creator's reference, bibliographic information such as the inventor, the application date, and the IPC is displayed in the display area 40, and the translation of the title of the invention is displayed in the display area 41.
[0059]
Further, when the full text button 32 is operated, the translated full text and the original full text are displayed in conjunction with each other, as shown in the lower part of the figure. Thereafter, when the main part button 31 is operated, the Japanese/English main parts shown in the upper part of the figure are displayed again.
[0060]
FIG. 13 is a state transition diagram of the system in actual use by an abstract creator or user (hereinafter collectively referred to as the abstract creator), together with the processing of the abstract editing display control unit 15 associated with each state transition.
[0061]
In this figure, state 51 is the initial state when the system is started; state 52 is the state in which the abstract creator has opened the abstract processing screen and the main parts of the translation result are displayed in the display area 37; state 53 is the end state, in which the system has terminated; and state 54 is the state in which the translation of the full text of the specification is displayed in the display area 37 as a result of a full text display operation by the abstract creator.
[0062]
The state transitions are described in more detail below. When the abstract creator operates the open button 33 in the state 51, the system asks for and obtains the file name of the English patent specification, retrieves from the required area of the database 5 the translation result of the main parts extracted from that specification, that is, the translation main parts, and displays them in the display area 37. After this display and the initialization of the display area 39, the state transitions to the state 52.
[0063]
In the state 52, when a paragraph displayed in the display area 37 is selected and the copy button 34 is operated, the paragraph is inserted into the display area 39. Because the state 52 shows the (translated) main parts, an abstract item such as “technical field”, “summary”, or “example” (see FIGS. 11 and 12) is already set for each paragraph, and each copied paragraph is inserted into the display area 39 at the position corresponding to its abstract item (S52).
[0064]
When the full text button 32 is operated, the state transitions to the state 54, in which the full text translation is displayed in the display area 37 and the original full text is displayed in the display area 38 in conjunction with it (S53). In the state 54 the abstract item is not yet determined, so the abstract item is first set, for example by requesting input from the abstract creator, and the paragraph displayed in the display area 37 is then copied to the display area 39 by a copy operation as necessary (S54). When the save button 35 is operated in the state 52 or the state 54, the contents of the display area 39 are saved in the database 5 under the file name that the abstract creator enters as the save destination of the Japanese abstract (S55, S56). If the open button 33 is operated in the state 54, the same processing as step S51 is executed (S57), and if the main part button 31 is operated, the same processing as step S52 is executed (S58). When the end button 36 is operated in any state, the abstract creation and the extraction and display of the required information are terminated.
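The transitions among the states 51 to 54 can be sketched as a small lookup table. This is a minimal sketch under assumptions: the event names are hypothetical labels for the buttons, the open and main part operations in the state 54 are assumed to lead to the state 52, and events not described for a state are assumed to be ignored.

```python
# (state, event) -> next state, following the transitions in the description.
TRANSITIONS = {
    (51, "open"): 52,       # open button 33 in the initial state
    (52, "copy"): 52,       # copy button 34 (S52)
    (52, "full_text"): 54,  # full text button 32 (S53)
    (52, "save"): 52,       # save button 35 (S55)
    (54, "copy"): 54,       # copy after setting the abstract item (S54)
    (54, "save"): 54,       # save button 35 (S56)
    (54, "open"): 52,       # assumed target of S57
    (54, "main_part"): 52,  # assumed target of S58
}


def next_state(state, event):
    """Return the state reached from `state` when `event` occurs."""
    if state == 53:
        raise ValueError("the system has already ended")
    if event == "end":
        return 53           # end button 36 terminates from any state
    # Assumption: an event with no described transition leaves the state as is.
    return TRANSITIONS.get((state, event), state)


def run(events, state=51):
    """Apply a sequence of events starting from the initial state 51."""
    for event in events:
        state = next_state(state, event)
    return state
```

For example, `run(["open", "full_text", "main_part"])` walks from the initial state to the main part display, to the full text display, and back to the main part display.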
[0065]
Therefore, according to the embodiment described above, the items (document structure) of the foreign language patent specification are analyzed to obtain a tagged document structure analysis result in a fixed format. Based on this result, the main part selection unit 12 extracts the main parts according to the rules of a predetermined extraction rule dictionary, and the machine translation unit 13 translates the contents of the document structure analysis result into, for example, the native language by ordinary translation processing. The abstract candidate extraction unit 14 then extracts, for each main part in the main part extraction result, the abstract candidate from the Japanese translation of the full text. The translation main parts and the original text main parts are displayed in association with each other, and the abstract creator copies a translation main part, refers to the corresponding original text main part and, if necessary, to the full document structure analysis result of the English patent specification and its full translation, and creates the Japanese abstract by deleting and inserting characters. As a result, Japanese abstracts can be created efficiently, and the manual work of preparing Japanese abstracts is appropriately supported.
[0066]
The main part display is not limited to the form described above. For example, as shown in FIG. 14, the main part button 31 may be divided into a translation main part button 31a and an original text main part button 31b, and the full text button 32 into a translated full text button 32a and an original full text button 32b, so that one of four display modes is selected and displayed by a selection operation. In short, the English text of the main parts selected from the English patent specification, its translation, the original English patent specification, the translated full text of the specification, and the like can be displayed freely and in any order.
[0067]
Next, FIG. 15 is a diagram for explaining another embodiment of the abstract creation support system according to the present invention.
[0068]
In this embodiment, priorities are set in advance in the rules of each extraction rule dictionary stored in the extraction rule dictionary storage unit 2, and the display amount or the number of displayed abstract candidates is controlled based on the importance of the extracted paragraphs.
[0069]
In this abstract creation support system, when importance levels are set in advance in the extraction rule dictionary as shown in FIG. 15 and the main part selection unit 12 selects the main parts, the main part extraction result shown in FIG. 16 is obtained, in which an importance level is added to each entry. That is, the importance of the extraction rule used for the extraction is attached to the main part extraction result for each extracted paragraph. As a result, although not shown in the drawings, importance information is also attached to the translation results of the abstract candidates. Here, “1” denotes the highest importance, and the importance decreases as the numerical value increases.
[0070]
Using this information, the abstract editing display control unit 15 executes abstract candidate extraction processing that takes the importance into account. That is, the abstract editing display control unit 15 first sets the variable i to 1, which designates the first extraction rule (ID = R1) (S61), and then determines whether the importance corresponding to this extraction rule is higher than a predetermined threshold (S62). If the importance is determined to be higher, the Japanese translation main part corresponding to the paragraph of the field is displayed in the display area 37 (S63). After this translation main part is displayed, or when the importance is determined in step S62 to be lower than the threshold, the process proceeds to step S64, where 1 is added to the variable i, and it is then determined whether an extraction rule (ID = Ri) corresponding to this variable i still remains (S65). If one remains, the process returns to step S62 and the same processing is repeated. When the importance has been checked for all extracted paragraphs and the qualifying translation main parts have been displayed, the importance-related processing ends.
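The loop of steps S61 to S65 can be sketched as follows. This is a minimal sketch, not the flowchart itself: the dictionary keys and the function name `select_for_display` are hypothetical, the index is 0-based rather than starting at 1, and "higher importance than the threshold" is assumed to mean a numerically smaller rank, since 1 is the highest importance.

```python
def select_for_display(extracted, threshold):
    """Return the translation main parts that are important enough to display.

    extracted: list of dicts with "rule_id", "importance" and "translation".
    Assumption: importance is a rank (1 = most important), so an entry beats
    the threshold when its rank is numerically smaller than the threshold.
    """
    shown = []
    i = 0                                        # S61 (0-based here)
    while i < len(extracted):                    # S65: is any entry left?
        entry = extracted[i]
        if entry["importance"] < threshold:      # S62
            shown.append(entry["translation"])   # S63: display in area 37
        i += 1                                   # S64
    return shown


extracted = [
    {"rule_id": "R1", "importance": 1, "translation": "A"},
    {"rule_id": "R2", "importance": 3, "translation": "B"},
    {"rule_id": "R3", "importance": 2, "translation": "C"},
]
visible = select_for_display(extracted, threshold=3)
```

Raising or lowering `threshold` changes how many candidates are shown without re-running the extraction.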
[0071]
Therefore, according to the embodiment described above, only the translation main parts whose importance is higher than a predetermined threshold are displayed in the display area 37 at the abstract creation stage, so the Japanese abstract can be created efficiently from an appropriately limited quantity of candidates.
[0072]
The importance threshold can be changed by the abstract creator at any time, and the Japanese abstract required by the abstract creator can be easily created.
[0073]
FIG. 18 is a diagram for explaining still another embodiment of the abstract creation support system according to the present invention.
[0074]
This embodiment is an example in which the “description of the representative diagram” is extracted as an abstract candidate from the specification described in the document structure analysis result.
[0075]
In this embodiment, the expression format of the extraction rule dictionary is extended so that the key phrases in its rules support not only phrase matching but also special symbols, which allows the description of the representative diagram to be extracted as a main part.
[0076]
Among the symbols introduced into the key phrases, “$ RepFig $” denotes the representative diagram, and “$ ^ Line $” specifies that the key phrase search range extends from the beginning of the specified field to the Line-th paragraph. “$ ^* Line $” means that the Line paragraphs following the matched paragraph are extracted. “$ ^* $” means that all paragraphs are extracted as main parts, for example for the “SUMMARY” field.
[0077]
In this system, extraction rules such as those shown in FIG. 18 are described in the extraction rule dictionary. For the first extraction rule (ID = R1), if the representative diagram is FIG. 1, the main part selection unit 12 searches the document structure analysis result corresponding to “FIELD OF THE INVENTION”, which is defined in the field, for the key phrase “FIG. 1 shows”. For the second extraction rule (ID = R2), the search range is limited to the first three paragraphs of the document structure analysis result corresponding to “DETAILED DESCRIPTION OF THE INVENTION”, and the rule searches for the key phrase “FIG. ? shows” (“?” stands for an arbitrary number). For the third extraction rule (ID = R3), two paragraphs of the document structure analysis result corresponding to “CLAIMS” are extracted, because no other key phrase is specified.
[0078]
Incidentally, according to “pap-v16-2002-01-01.dtd”, identified as “-//USPTO//DTD PAP V1.6 2002-01-01//EN”, which defines the XML format of US published patent specifications, the number enclosed between <representative figure> and </representative figure> indicates the representative diagram. When a patent specification in this format is input, the representative diagram can therefore be identified easily from the patent specification.
[0079]
Therefore, in step S14 shown in FIG. 6, the main part selection unit 12 may use this definition information to replace the symbol “$ RepFig $” in the extraction rule dictionary with a specific character string and then search according to the resulting key phrase. For example, if the representative diagram of the patent specification to be abstracted is “FIG. 1”, “$ RepFig $” is replaced with “FIG. 1”, so the key phrase of the extraction rule R1 becomes “FIG. 1 shows”. The key phrase “FIG. 1 shows” is then searched for in the document structure analysis result corresponding to “DETAILED DESCRIPTION OF THE INVENTION” defined in the field.
[0080]
In addition, in the key phrase search of step S14 (FIG. 6), the main part selection unit 12 performs the following processing: if the key phrase described in the extraction rule contains the special symbol “$ ^ Line $”, the key phrase search range is limited to the first Line paragraphs; if it contains the special symbol “$ ^* Line $”, the Line paragraphs following the paragraph matched by the key phrase are extracted as main parts; and if it contains the special symbol “$ ^* $”, all paragraphs of the field being processed are selected as main parts.
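The key phrase handling can be sketched as follows for two of the features: substitution of the representative diagram symbol and the “?” wildcard, plus a search range limit. This is a minimal sketch under assumptions: the symbol is written as `$RepFig$` without spaces, the range limit is passed as a separate hypothetical parameter `search_limit` instead of being parsed from a symbol, and the function names are illustrative only.

```python
import re


def compile_key_phrase(key_phrase, representative_figure):
    """Turn a key phrase into a regular expression.

    "$RepFig$" is replaced by the representative figure label, and "?" stands
    for an arbitrary number. Everything else is matched literally.
    """
    phrase = key_phrase.replace("$RepFig$", representative_figure)
    literal_parts = [re.escape(part) for part in phrase.split("?")]
    return re.compile(r"\d+".join(literal_parts))


def select_paragraphs(paragraphs, key_phrase, representative_figure,
                      search_limit=None):
    """Return the indexes of paragraphs that contain the key phrase.

    search_limit restricts the search to the first N paragraphs of the field;
    None means the whole field is searched.
    """
    pattern = compile_key_phrase(key_phrase, representative_figure)
    scope = paragraphs if search_limit is None else paragraphs[:search_limit]
    return [i for i, text in enumerate(scope) if pattern.search(text)]


paragraphs = [
    "This invention relates to a document system.",
    "FIG. 1 shows a system overview.",
    "FIG. 2 shows a detail.",
    "FIG. 3 shows a variation.",
]
rep_hits = select_paragraphs(paragraphs, "$RepFig$ shows", "FIG. 1")
wild_hits = select_paragraphs(paragraphs, "FIG. ? shows", "FIG. 1", search_limit=3)
```

Escaping the literal parts keeps characters such as the period in "FIG." from being treated as regular expression operators.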
[0081]
Therefore, according to such an embodiment, it is possible to easily extract the explanation of the representative diagram from the patent specification as an abstract candidate.
[0082]
If further special symbols are defined in the same way and the main part selection unit 12 is configured to handle them, contents other than the description of the representative diagram can also be easily extracted as main parts.
[0083]
FIG. 19 is a diagram for explaining still another embodiment of the abstract creation support system according to the present invention.
[0084]
In this embodiment, an unnecessary word dictionary is additionally provided in the extraction rule dictionary storage unit 2, and a paragraph that contains an unnecessary word is not extracted even if it satisfies the conditions of the extraction rule dictionary.
[0085]
FIG. 19 is a diagram showing an example of the unnecessary word dictionary. For example, the unnecessary words of the first and second rules F1 and F2 are phrases used when describing a second embodiment, and paragraphs containing them may not be appropriate for selection as main parts. Likewise, the unnecessary words of the third and fourth rules F3 and F4 are phrases used in dependent claims, which are often not to be selected as main parts.
[0086]
FIG. 20 is a flowchart showing a series of processing examples of the main part selection unit 12 when an unnecessary word dictionary is added to the extraction rule dictionary storage unit 2 in addition to the extraction rule dictionary.
[0087]
The processing of the main part selection unit 12 differs from that in FIG. 6 as follows: when it is determined in step S74 that a paragraph matching the extraction rule i exists, it is further determined, based on the unnecessary word dictionary, whether an unnecessary word exists in the matched paragraph (S75), and if an unnecessary word is included, the paragraph is not extracted as a main part. The other steps S71 to S74 correspond to steps S11 to S14 in FIG. 6, and steps S76 to S78 correspond to steps S15 to S17 in FIG. 6; for their details, refer to the description of FIG. 6.
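The additional check of step S75 can be sketched as a filter applied after the key phrase match. This is a minimal sketch: the function name, the plain substring matching, and the sample unnecessary words are assumptions for illustration, since the actual entries of FIG. 19 are not reproduced here.

```python
def select_main_parts(paragraphs, key_phrase, unnecessary_words):
    """Return indexes of paragraphs that match the rule and pass the filter."""
    selected = []
    for index, text in enumerate(paragraphs):
        if key_phrase not in text:                          # S74: rule match?
            continue
        if any(word in text for word in unnecessary_words):  # S75: filter
            continue                                        # not a main part
        selected.append(index)
    return selected


# Hypothetical unnecessary words, chosen only to illustrate the idea.
unnecessary_words = ["second embodiment", "according to claim"]
paragraphs = [
    "The device comprises a sensor.",
    "In a second embodiment, the device comprises a sensor array.",
    "The device according to claim 1, wherein the device comprises a lens.",
]
kept = select_main_parts(paragraphs, "the device comprises".capitalize(), unnecessary_words)
kept_lower = select_main_parts(paragraphs, "device comprises", unnecessary_words)
```

The second call shows the filter at work: all three paragraphs contain the key phrase, but only the first survives the unnecessary word check.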
[0088]
Therefore, according to the embodiment described above, when an unnecessary word defined in the rules of the unnecessary word dictionary is present in a paragraph that would otherwise be selected as a main part, the paragraph is judged inappropriate and is not extracted as a main part, so a more accurate Japanese abstract can be created.
[0089]
The unnecessary word dictionary shown in FIG. 19 assumes that one dictionary is prepared for each IPC classification. Alternatively, a configuration may be adopted in which unnecessary words are set, for example, for each abstract item or for each extraction rule in the extraction rule dictionary.
[0090]
Further, in each of the embodiments described above, the paragraph is used as the processing unit for selecting main parts and generating abstract candidates, but other linguistic units (sentences, sections, phrases, words, and the like) may also be used as the processing unit.
[0091]
Furthermore, in the present embodiment, the extraction rule dictionary is selected according to the IPC classification described in the patent specification targeted for abstract creation, for example an English patent specification. Because translation accuracy can be expected to improve if a technical term dictionary is likewise used selectively, the dictionary used for machine translation may also be switched according to the IPC classification.
[0092]
Furthermore, in recent years, translation memory functions have been implemented in machine translation systems (for example, The Translation Professional V8.0). A translation memory relies on the observation that, if a highly accurate Japanese translation of an English sentence was created in the past, that translation is likely to be reusable for similar English sentences. The function stores past English-Japanese translation pairs, searches them for similar sentences when a new English sentence is translated, and, if a similar sentence is found, presents the corresponding Japanese translation to the user.
[0093]
Accordingly, the abstract creation support system according to the present invention may also be configured so that the translation result for the main parts extracted by the main part selection unit 12 is presented to the abstract creator, who finalizes the Japanese abstract while referring to it. In this case, if the English main parts extracted by the main part selection unit 12 and the corrected Japanese abstract are stored, the translation memory function of the machine translation unit 13 can be used. That is, if the system further includes storage means for storing pairs consisting of an English main part extracted by the main part selection unit 12 and the Japanese abstract as finally edited by the abstract creator, and uses a machine translation unit 13 having a translation memory function, a Japanese abstract can be created while making effective use of the translation memory. If the English main part and Japanese abstract pairs are stored by IPC classification, past pairs can be retrieved according to the IPC classification of the English patent specification targeted for abstract creation and used selectively and efficiently.
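The reuse of stored English main part and Japanese abstract pairs can be sketched with a simple similarity lookup. This is a minimal sketch, not the mechanism of any particular product: the class name, the similarity measure (the character-based ratio of Python's `difflib.SequenceMatcher`), and the 0.8 cutoff are all assumptions for illustration.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Stores (source, target) pairs and returns the target of a similar source."""

    def __init__(self, min_similarity=0.8):
        self.pairs = []
        self.min_similarity = min_similarity  # hypothetical cutoff

    def store(self, source, target):
        """Remember an English main part and its edited Japanese abstract."""
        self.pairs.append((source, target))

    def lookup(self, sentence):
        """Return the stored target whose source is most similar, or None."""
        best_target = None
        best_score = 0.0
        for source, target in self.pairs:
            score = SequenceMatcher(None, source, sentence).ratio()
            if score > best_score:
                best_target, best_score = target, score
        return best_target if best_score >= self.min_similarity else None


memory = TranslationMemory()
memory.store("The present invention relates to a printer.", "stored Japanese abstract 1")
suggestion = memory.lookup("The present invention relates to a scanner.")
no_match = memory.lookup("zzz")
```

Keeping one such memory per IPC classification would be one way to realize the selective use by classification that the text describes.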
[0094]
The above embodiments describe an example in which an abstract creator whose native language is Japanese creates a Japanese abstract, but the same applies to other language pairs, such as English to Chinese, English to German, or Japanese to English. Needless to say, an abstract creation support system for another language pair can be realized by incorporating extraction rules and machine translation suited to that language pair.
[0095]
Further, in the above embodiments, the Japanese abstract candidates and the original text main parts are displayed in association with each other, but it is also possible to display only the Japanese abstract candidates extracted by the abstract candidate extraction unit 14.
[0096]
Next, FIG. 21 is a functional configuration diagram illustrating an embodiment of a search system with a main part creation function according to the present invention.
[0097]
This search system includes a document search unit 61, an extraction rule dictionary storage unit 62 having the same function as the extraction rule dictionary storage unit 2 in FIG. 1, a document structure analysis unit 63 having the same function as the document structure analysis unit 11 in FIG. 1, a main part selection unit 64 having the same function as the main part selection unit 12 in FIG. 1, a machine translation unit 65 having the same function as the machine translation unit 13 in FIG. 1, a search candidate extraction unit 66, and a search result display control unit 67. It is also provided with various other components, such as a patent document storage database 68, a translation dictionary (not shown), and a recording medium on which a series of search processing programs is recorded. The components 62 to 65, which have the same functions as their counterparts in FIG. 1, have already been described in detail.
[0098]
First, a large number of foreign language patent specifications, for example, English patent specifications, are digitized and stored in the patent document storage database 68.
[0099]
In this state, when a search instruction is input from a search input unit (not shown) together with a search condition such as a technical field or an IPC classification, the document search unit 61 executes a series of processes according to the search processing program read from the recording medium. That is, when the search condition is input, the document search unit 61 searches the database 68 for English patent specifications that match the search condition (71). Various known document search techniques can be used for the document search unit 61.
[0100]
When a plurality of English patent specifications are retrieved by the document search unit 61, they are stored in a temporary buffer memory or the like and then sent to the document structure analysis unit 63, for example in the order in which they were retrieved, and the document structure of each patent specification is analyzed. The document structure analysis of a patent specification has already been described. As a result, a document structure analysis result as shown in FIG. 4 is obtained (72).
[0101]
The document structure analysis result obtained in this way is sent to the main part selection unit 64 and the machine translation unit 65. The main part selection unit 64 extracts the main parts from the document structure analysis result according to the rules defined in the extraction rule dictionary (see FIG. 5) stored in the extraction rule dictionary storage unit 62, following the processing flow shown in FIG. 6 (73). The main part extraction result is as shown, for example, in FIG. 7. Meanwhile, the machine translation unit 65 translates the document structure analysis result into the native language using the rules and knowledge defined in the translation dictionary unit 8, and obtains a machine translation result as shown in FIG. 9 (74). The main part selection unit 64 and the machine translation unit 65 operate as described in detail with reference to FIGS. 1 and 2, so their description is omitted here.
[0102]
Subsequently, for each main part in the extraction result of the main part selection unit 64, the search candidate extraction unit 66 extracts the corresponding translated main part from the Japanese translation of the full text of the foreign language patent specification produced by the machine translation unit 65, and stores it, for example, in a search result storage database (not shown). The search result storage database stores the full text of each retrieved original English patent specification, the full text native language translation produced by the machine translation unit 65, the original text main parts extracted by the main part selection unit 64, and the translation main parts generated by the search candidate extraction unit 66.
[0103]
The search result display control unit 67 then displays the results on an output unit corresponding to the output unit 7 in FIG. 1 according to a predetermined processing procedure. FIG. 23 is a diagram showing a screen display example. A full text display button 81, a main part display button 82, a next selection button 83, and a previous selection button 84 are provided on the initial screen.
[0104]
When the main part display button 82 is operated and a plurality of English patent specifications have been retrieved, the search result display control unit 67 displays the display areas 85 and 86 stacked in the depth direction of the screen. In this state, the variable i is set to 1, and the original text main parts (the main part extraction result) extracted from the i-th retrieved English patent specification are read from the search result storage database, classified, and displayed in the display area 85. In conjunction with the original text main parts, the translation main part corresponding to each of them is displayed in the display area 86 (S82).
[0105]
Thereafter, it is determined whether any button has been operated (S83), and this determination is repeated while there is no button operation. If it is determined in step S83 that a button has been operated, it is determined whether the full text display button 81 was operated (S84). If so, the full text of the corresponding patent specification i is displayed in the display area 85, and the full text translation, which is the machine translation result corresponding to the original full text, is displayed in the display area 86 (S85). Next, it is again determined which button has been operated (S86). If it is determined that the main part display button 82 has been operated (S87), the process returns to step S82, and the original text main parts and translation main parts extracted from the English patent specification i are displayed as before.
[0106]
If it is determined in step S84 that a button other than the full text display button 81 was operated, it is determined whether the previous selection button 84 or the next selection button 83 was operated (S89, S90). When it is determined that the previous selection button 84 was operated, the variable i is set to i-1, provided that it does not fall below the lower limit of 1 (S91), and the process proceeds to step S82, where the original text main parts and translation main parts extracted from the previous English patent specification i are displayed in the display areas 85 and 86, respectively. When it is determined that the next selection button 83 was operated, the variable i is set to i+1, provided that it does not exceed the upper limit N (S92), and the process proceeds to step S82, where the original text main parts and translation main parts extracted from the next English patent specification i are displayed in the display areas 85 and 86, respectively.
[0107]
If the button operated in step S87 is not the main part display button 82 but the previous selection button 84 (S93), the variable i is set to i-1 (S94), and the process proceeds to step S85, where the original full text and the translated full text are displayed in the display areas 85 and 86, respectively. If the next selection button 83 is operated (S95), the variable i is set to i+1 (S96), and the process likewise proceeds to step S85 to display the original full text and the translated full text in the display areas 85 and 86, respectively.
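The button handling for browsing the retrieved specifications can be sketched as a function from the current position and display mode to the next ones. This is a minimal sketch: the mode and button names are hypothetical labels, and the index is assumed to be kept within 1 to N in both display modes, although the text states the limits explicitly only for the main part display.

```python
def handle_button(i, mode, button, n):
    """Return the new (i, mode) after a button operation.

    i:      1-based index of the specification currently displayed.
    mode:   "main" for the main part display, "full" for the full text display.
    button: "full_text", "main_part", "previous" or "next".
    n:      number of retrieved specifications (upper limit N).
    """
    if button == "full_text":       # full text display button 81
        return i, "full"
    if button == "main_part":       # main part display button 82
        return i, "main"
    if button == "previous":        # previous selection button 84
        return max(1, i - 1), mode  # do not go below the lower limit 1
    if button == "next":            # next selection button 83
        return min(n, i + 1), mode  # do not go beyond the upper limit N
    return i, mode                  # unknown buttons change nothing


state = (1, "main")
for pressed in ["next", "full_text", "next", "next", "previous"]:
    state = handle_button(state[0], state[1], pressed, n=3)
```

Because the display mode is carried through the previous and next operations, the searcher stays in the view already chosen while stepping through the results.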
[0108]
Therefore, according to the embodiment described above, at least one foreign language patent specification is retrieved from a large number of foreign language patent specifications based on a search condition, and for each retrieved specification the full text native language translation, the original text main parts, and the translation main parts are obtained. These are selectively displayed on the display screen as pairs, that is, the foreign language patent specification with its full text native language translation, or the original text main parts with the translation main parts. The searcher can therefore easily grasp the required technical contents, and search efficiency is greatly improved.
[0109]
In FIG. 23, the main part is displayed for each item. However, the main part may be displayed in the order of appearance of the paragraphs extracted by the main part selection unit 64 instead of for each item.
[0110]
Although not shown in FIG. 23, because the main parts have already been extracted by the main part selection unit 64, the portions of the full text that correspond to main parts may be highlighted in the full text display, for example in a different color. This makes it possible to recognize at a glance which parts of the full text are important and further improves search efficiency.
[0111]
Note that the present invention is not limited to the above-described embodiment, and various modifications can be made without departing from the scope of the invention.
[0112]
In addition, the embodiments can be implemented in combination wherever possible, in which case the effects of the combination are obtained. Further, each of the above embodiments includes inventions at various higher and lower levels, and various inventions can be extracted by appropriately combining the disclosed constituent elements. For example, when an invention is extracted by omitting some of the constituent elements described in the means for solving the problem, the omitted portions are supplemented as appropriate by well-known conventional techniques when the extracted invention is implemented.
[0113]
[Effect of the invention]
As described above, according to the present invention, it is possible to provide an abstract creation support system, a program, and an abstract creation support method that can efficiently and appropriately create an abstract of a second language from patent documents in the first language.
[0114]
In addition, the present invention quickly creates a second-language patent specification, an original-text main part, and a translated main part corresponding to the retrieved first-language patent specification, and displays them selectively. It is therefore possible to provide a patent document search system, a program, and a patent document search method that allow the technical content to be grasped easily and thus greatly improve search efficiency.
[Brief description of the drawings]
FIG. 1 is a configuration diagram showing an embodiment of an abstract creation support system according to the present invention.
FIG. 2 is a diagram for explaining the flow of a series of processes of the abstract creation support system shown in FIG. 1.
FIG. 3 is a diagram showing an example of a patent specification, here in English, that is the target of abstract creation.
FIG. 4 is a diagram showing the document structure analysis result of the English patent specification analyzed by the document structure analysis unit shown in FIG. 1.
FIG. 5 is a diagram for explaining rules in the extraction rule dictionary stored in the extraction rule dictionary storage unit.
FIG. 6 is a flowchart for explaining a specific processing flow of the main part selection unit shown in FIG. 1.
FIG. 7 is a conceptual diagram of a main part extraction result extracted by the main part selection unit.
FIG. 8 is a diagram illustrating an example of an expression of a main part extraction result extracted by the main part selection unit.
FIG. 9 is a diagram showing a machine translation result of the English patent specification shown in FIG. 3, which is the target of abstract creation.
FIG. 10 is a diagram for explaining an example of Japanese abstract candidates extracted by the abstract candidate generation unit shown in FIG. 1.
FIG. 11 is a diagram for explaining a display example of abstract candidates, in particular a display example of the extracted translation main part and original-text main part, an example of editing a Japanese abstract from the translation main part, and a display example of the full translation and the original full text.
FIG. 12 is a diagram showing an example in which a translation main part and an original text main part are displayed in association with each other.
FIG. 13 is a diagram for explaining state transitions by the abstract edit display control unit shown in FIG. 1 and processing associated with the state transitions.
FIG. 14 is a diagram showing a display example of another abstract candidate.
FIG. 15 is a diagram showing an example of an extraction rule dictionary for explaining another embodiment of the abstract creation support system according to the present invention.
FIG. 16 is a diagram showing an example of a main part extraction result extracted by the main part selection unit using the extraction rule dictionary.
FIG. 17 is a flowchart illustrating a flow of a series of processes in a main part selection unit.
FIG. 18 is a diagram showing an example of an extraction rule for explaining still another embodiment of the abstract creation support system according to the present invention.
FIG. 19 is a diagram showing an example of an unnecessary word dictionary for explaining still another embodiment of the abstract creation support system according to the present invention.
FIG. 20 is a flowchart illustrating an example of processing for extracting a main part using an extraction rule dictionary and an unnecessary word dictionary in the main part selection unit.
FIG. 21 is a functional configuration diagram illustrating an embodiment of a search system according to the present invention.
FIG. 22 is a flowchart for explaining a series of processes of the search result display control unit.
FIG. 23 is a diagram showing a display example of search results.
[Explanation of symbols]
1 ... Original text description input unit, 2 ... Extraction rule dictionary storage unit, 3 ... Abstract creation process control unit, 4 ... Recording medium, 5 ... Abstract related information storage database, 7 ... Output unit, 8 ... Translation dictionary unit, 11 ... Document structure analysis unit, 12 ... Main part selection unit, 13 ... Machine translation unit, 14 ... Abstract candidate extraction unit, 15 ... Abstract edit display control unit, 61 ... Document search unit, 62 ... Extraction rule dictionary storage unit, 64 ... Main part selection unit, 65 ... Machine translation unit, 66 ... Search candidate extraction unit, 67 ... Search result display control unit.

Claims

An extraction rule dictionary that defines, for each given classification, the required first-language chapter titles and the key phrases, such as first-language words and phrases, contained in the descriptive text of the chapter under each title;
A document structure analyzing means for analyzing the chapter structure of the sentence described in the electronic first language patent application document;
Main part selection means for selecting the required extraction rule dictionary according to the classification described in the document structure analysis result produced by the document structure analysis means, and for selecting a main part from the document structure analysis result based on the titles and key phrases defined in that extraction rule dictionary;
Machine translation means for translating the document structure analysis result analyzed by the document structure analysis means into a second language;
Abstract candidate extraction means for extracting second language abstract candidates from the translation result of the second language translated by the machine translation means for each main part selected by the main part selection means;
Abstract edit display control means for displaying only the second-language abstract candidates extracted by the abstract candidate extraction means, or displaying the second-language abstract candidates and the first-language main parts in association with each other, and for creating a second-language abstract, including correction processing of the second-language abstract candidates. An abstract creation support system characterized by comprising the above means.

The abstract creation support system according to claim 1, wherein the extraction rules of the extraction rule dictionary specify the second-language abstract item into which content is classified when the abstract is created, and the main part selection means comprises means for identifying a main part that matches an extraction rule and, when outputting information identifying that main part, appending the second-language abstract item corresponding to the main part to the main part identifying information.

The abstract creation support system according to claim 1, wherein the abstract edit display control means further comprises means for selectively displaying, in correspondence with each other, the second-language abstract candidates extracted by the abstract candidate extraction means and the first-language main parts, or the full-text document structure analysis result in the first language and the second-language translation result produced by the machine translation means, and for creating a second-language abstract, including correction processing of the second-language abstract candidates.

The abstract creation support system according to claim 1, wherein importance information is additionally defined in the extraction rules of the extraction rule dictionary,
the main part selection means identifies a main part that matches an extraction rule and outputs the information identifying that main part with the importance information appended to it, and
the abstract candidate extraction means comprises means for comparing the importance information output from the main part selection means with a threshold value indicating a predetermined importance level and, for each main part whose importance information exceeds the threshold value, extracting a second-language abstract candidate from the second-language translation result produced by the machine translation means.

The abstract creation support system according to claim 1, wherein a key phrase containing a reference symbol related to the representative drawing is added to the extraction rule dictionary, and
the main part selection means selects the required extraction rule dictionary according to the classification described in the document structure analysis result produced by the document structure analysis means, and selects, from the document structure analysis result, the passage that explains the representative drawing based on the titles and key phrases defined in that extraction rule dictionary.

The abstract creation support system according to any one of claims 1 to 5, wherein an unnecessary word dictionary is provided in addition to the extraction rule dictionary, and
the main part selection means further comprises means for selecting the required extraction rule dictionary according to the classification described in the document structure analysis result produced by the document structure analysis means, selecting a main part from the document structure analysis result based on the titles and key phrases defined in that extraction rule dictionary, and excluding the selected main part if it contains an unnecessary word listed in the unnecessary word dictionary.

In a computer that stores, at least for each given classification, extraction rules comprising first-language chapter titles and key phrases, such as first-language words and phrases, contained in the descriptive text of the chapter under each title, and that creates a second-language abstract from an electronic first-language patent application document,
A document structure analysis function for analyzing the chapter structure of a sentence described in an electronic first language patent application document;
A main part selection function for selecting the required extraction rule dictionary according to the classification described in the document structure analysis result, and for selecting a main part from the document structure analysis result based on the titles and key phrases defined in that extraction rule dictionary;
A machine translation function for translating the document structure analysis result analyzed by the document structure analysis function into a second language;
Abstract candidate extraction function for extracting second language abstract candidates from the second language translation result translated by the machine translation function for each main part selected by the main part selection function;
A function for displaying only the second-language abstract candidates extracted by the abstract candidate extraction function, or displaying the second-language abstract candidates and the first-language main parts in association with each other, and for creating a second-language abstract, including correction processing of the second-language abstract candidates. A program characterized by causing the computer to realize the above functions.

In an abstract creation support method in which extraction rules are stored that define, for each given classification, the required first-language chapter titles and the key phrases, such as first-language words and phrases, contained in the descriptive text of the chapter under each title, and in which a second-language abstract is created from an electronic first-language patent application document,
A document structure analysis step for analyzing a chapter structure of a sentence described in an electronic first language patent application document;
A main part selection step of selecting the required extraction rule dictionary according to the classification described in the document structure analysis result, and selecting a main part from the document structure analysis result based on the titles and key phrases defined in that extraction rule dictionary;
A machine translation processing step for translating the document structure analysis result analyzed in the document structure analysis step into a second language;
An abstract candidate extraction step for extracting an abstract candidate in a second language from the translated second language translation result for each essential part selected in the essential part selection step;
A step of displaying only the extracted second-language abstract candidates, or displaying the second-language abstract candidates and the first-language main parts in association with each other, and creating a second-language abstract, including correction processing of the second-language abstract candidates. An abstract creation support method characterized by comprising the above steps.

Storage means for storing a plurality of digitized patent application documents in a first language;
Document search means for searching for at least one patent application document in the first language from the storage means under search conditions;
An extraction rule dictionary that defines, for each given classification, the required first-language chapter titles and the key phrases, such as first-language words and phrases, contained in the descriptive text of the chapter under each title;
Document structure analyzing means for analyzing a chapter structure of a sentence described in the electronic first language patent application document searched by the document searching means;
Main part selection means for selecting the required extraction rule dictionary according to the classification described in the document structure analysis result produced by the document structure analysis means, and for selecting the main part to be searched from the document structure analysis result based on the titles and key phrases defined in that extraction rule dictionary;
Machine translation means for translating the document structure analysis result analyzed by the document structure analysis means into a second language;
Search candidate extraction means for extracting second language search candidates from the translation result of the second language translated by the machine translation means for each search target principal part selected by the principal part selection means;
Search result display control means for, after the second-language search candidates are extracted by the search candidate extraction means, displaying the first-language search main parts, displaying the first-language search main parts and the second-language search candidates in correspondence with each other, or displaying the first-language document structure analysis result and the second-language translation result in correspondence with each other. A patent document search system characterized by comprising the above means.

The patent document search system according to claim 9, wherein importance information is additionally defined in the extraction rules of the extraction rule dictionary,
the main part selection means identifies a main part that matches an extraction rule and outputs the information identifying that main part with the importance information appended to it, and
the search candidate extraction means compares the importance information output from the main part selection means with a threshold value indicating a predetermined importance level, and controls the search amount based on the threshold value.

In a computer that stores a plurality of electronic first-language patent application documents and an extraction rule dictionary defining, for each given classification, the required first-language chapter titles and the key phrases, such as first-language words and phrases, contained in the descriptive text of the chapter under each title, and that extracts at least second-language search candidates from the first-language patent application documents,
A document search function for searching for at least one patent application document in the first language from the stored information under a search condition;
A document structure analysis function for analyzing a chapter structure of a sentence described in the electronic first language patent application document searched by this function;
A main part selection function for selecting the required extraction rule dictionary according to the classification described in the document structure analysis result produced by the document structure analysis function, and for selecting the main part to be searched from the document structure analysis result based on the titles and key phrases defined in that extraction rule dictionary;
A machine translation function for translating the document structure analysis result analyzed by the document structure analysis function into a second language;
A search candidate extraction function for extracting a search candidate of the second language from the translation result of the second language translated by the machine translation function for each search target main part selected by the main part selection function;
A search result display control function for displaying the second-language search candidates extracted by the search candidate extraction function, or displaying the second-language search candidates in association with the first-language search main parts. A program characterized by causing the computer to realize the above functions.

In a search method in which a plurality of electronic first-language patent application documents and an extraction rule dictionary are stored, the dictionary defining, for each given classification, the required first-language chapter titles and the key phrases, such as first-language words and phrases, contained in the descriptive text of the chapter under each title, and in which at least second-language search candidates are extracted from the first-language patent application documents,
A document search step of searching for at least one first language patent application document from among the stored information under a search condition;
A document structure analysis step of analyzing a chapter structure of a sentence described in the electronic first language patent application document searched by this step;
A main part selection step of selecting the required extraction rule dictionary according to the classification described in the document structure analysis result produced in the document structure analysis step, and selecting the main part to be searched from the document structure analysis result based on the titles and key phrases defined in that extraction rule dictionary;
A machine translation processing step for translating the document structure analysis result analyzed in the document structure analysis step into a second language;
A search candidate extraction step of extracting second-language search candidates from the second-language translation result produced in the machine translation processing step, for each search target main part selected in the main part selection step;
A step of displaying the second-language search candidates extracted in the search candidate extraction step, or displaying the second-language search candidates and the first-language search main parts in association with each other. A patent document search method characterized by comprising the above steps.