JP2004133896A

JP2004133896A - Patent specification debugging tool and patent specification debugging tool program

Info

Publication number: JP2004133896A
Application number: JP2003186186A
Authority: JP
Inventors: Masatoshi Shibuya; 渋谷　正敏; Kenji Sato; 佐藤　謙治
Original assignee: Individual
Current assignee: Individual
Priority date: 2002-08-14
Filing date: 2003-06-30
Publication date: 2004-04-30
Anticipated expiration: 2023-06-30
Also published as: JP4102897B2

Abstract

PROBLEM TO BE SOLVED: To provide a patent specification debugging tool that displays a claim sentence graphically, extracts the technical elements in the claim sentence, clearly shows the relationships among those technical elements in the graphical display, and outputs the points of the claim sentence that should be reconsidered.

SOLUTION: An input Japanese claim sentence is morphologically analyzed using a morpheme dictionary for claim sentence analysis. Technical elements placed before the case particle 'to' (「と」) and technical elements identified from the words and phrases that follow a demonstrative word such as "aforesaid" (「前記」) are extracted. A tree structure is created from the subject, which is the element specifying the invention, and the technical elements directly related to the subject. The relationships among the technical elements are displayed graphically on the tree structure so that they can be clearly understood. Broader terms for the technical elements in the claim sentence are detected and output, together with problems in grammar and problems in the usage of the technical elements, so that the claim sentence can easily be reconsidered.

COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明が属する技術分野】
この発明は、日本語で作成された特許明細書のクレーム文を解析して図式化表示してクレーム文としての問題点を自動検出し出力する特許明細書デバッグツールに関するものである。
【０００２】
【従来の技術】
特許明細書の作成を支援するツールとしては、特開２００１−３０６７５４号や特開２０００−４８０１３号に開示されているように、特許明細書の作成の手順や作成の要点やチェックリストを表示するツールや、特開２００２−２０７７２０号に開示されているように、学術論文を特許明細書に変換するツールがある。
【０００３】
また、特開平９−２９３０７５号に開示されているように、特許明細書のクレーム文をパターン照合した上で階層データを作り第二の言語に翻訳するツールがある。
【０００４】
その他、本発明の関連技術として重要な日本文を自動解析する技術としては、機械翻訳関係の特開平１１−２５０８７号、特開平７−２９５９８５等をはじめ多数の文献がある。
【０００５】
【発明が解決しようとする課題】
日本語の特許明細書の不慣れな者にとって、特許明細書を作成するのは、たいへん困難な作業であり、特許技術者に依頼する場合も小規模の企業にとっては、高額な負担となり、特許出願を困難なものとしている。特に、クレーム文は長文となる場合があり、さらに種々の指示語の存在が、クレーム文の解読を難解なものにするため、日本語の特許明細書の作成に不慣れな者にとって、自分が作成した特許明細書のクレーム文が自分の意図したように作成されているのか判断するのが困難なだけでなく、出願人が専門の特許技術者に特許明細書の作成を依頼した場合でも、作成された特許明細書のクレーム文が出願人に属する発明者が自分が意図したとおりに書かれているか明確に判断できない場合が多く、誤った出願をしてしまうという問題もある。
【０００６】
日本語の特許明細書のクレーム文を明確に解読できるようにする方法として、クレーム文を技術要素に分解して図式化表示する方法は有効であるが、従来の技術である特開平９−２９３０７５号のように、パターン照合する方法では、日本語の特許明細書のクレーム文のように多種多様な形態をとるものにとって、実用的なレベルになるまでクレーム文のパターンを準備することは事実上困難であり、より日本語の特許明細書のクレーム文の実情にあった方法でクレーム文を解析し、図式化表示する方法が必要になる。
【０００７】
また、熟練した特許技術者でも、扱う特許明細書の技術分野に不慣れな場合には、不必要な技術要素を過大評価してクレーム文の前提部に記載してしまい請求の範囲を狭めたり、技術要素としてより上位概念の語句を用いてクレーム文を作成できる場合でも、その技術分野に不慣れな場合には、上位概念の語句が思い浮かばなかったりするために、技術要素としては発明者より提示された下位概念の語句をそのまま用いて出願してしまい請求の範囲を狭めてしまう問題点もある。
【０００８】
また、本発明の関連技術として重要な日本文を自動解析する技術としては、「原文」を「形態素解析」し、「構文解析」し、「意味解析」していくトランスファー方式が通常用いられているが、通常の文章の場合には、自然言語のあいまい性のために「構文解析」の段階で、まだ十分に精度の良い解析ができないのが現状であるが、特許明細書のクレーム文ではあいまい性をともなう表現が少なく、また常用される特殊な表現方法に着目して、現状の日本文の自動解析技術でクレーム文を解析できるようにする必要がある。
【０００９】
本発明は上記の問題点を解決するためになされたものであり、
多種多様な文構造を持つ日本語の特許明細書のクレーム文であっても、そのクレームを構成する技術要素を格助詞「と」の前に前置される特性と、クレーム文内の技術要素の相互の関係を示すために用いられる「前記」または「該」などの修飾語（以下、ものを指し示す修飾語という意味で「指示語」と表記する）の後に技術要素が配置される特性とに着目し、
また、一般の日本文の自動解析に用いる形態素辞書にクレーム文の技術要素として用いられた用語であれば、たとえば、「エンジン電子制御装置」のように本来４つの形態素「エンジン」、「電子」、「制御」、「装置」に分解される用語であっても、クレーム文の技術要素として使用された実績があれば１つの名詞相当の用語として追加登録できる「クレーム文解析用形態素辞書」を用いてクレーム文を形態素解析し、
その形態素解析の結果から格助詞「と」に前置された語句からクレーム文の技術要素を抽出し、その技術要素間の従属関係を論理的に解析して、クレーム文である請求項の末尾に記載される発明を特定する要素（以下「主題」と記載する）に直結する技術要素を１次の技術要素として抽出し、その主題と１次の技術要素ごとに枝分けした木構造で図式化表示し、さらに、クレーム文に含まれる指示語で特定できる技術要素を抽出して、指示語の種類に基づき参照元の技術要素と参照先の技術要素の関係を定めて、上記の木構造の上で技術要素間の引用関係を線で結ぶことで明確に表示し、クレーム文の内容を図式化表示することにより、出願人の発明者や特許明細書作成者がクレーム文が意図したように作成できたかを明確に判断できるようにする。
【００１０】
また、前提部が存在するクレーム文では、クレーム文を前提部と特徴部に分離し、前提部と特徴部の主題を抽出し、前提部と特徴部ごとに上記の木構造の表示と図式化表示を行うほか、
前提部に他で参照されていない技術要素を自動的に検出し出力するようにして、請求の範囲が狭くなることを防止したり、技術要素の上位概念の語句を検出出力し、また、前提部に特徴部より下位概念の語句があれば自動的に検出して、出願人の発明者や特許明細書作成者がクレーム文の技術要素の再検討が容易にできるようにする。
【００１１】
【課題を解決するための手段】
この目的を達成するために、本発明の請求項１に記載の特許明細書デバッグツールは、データ入出力部を有する特許明細書デバッグツールであって、前記データ入出力部に入力された日本語の特許明細書のクレーム文を解析する辞書ファイルと、前記辞書ファイルを用いて解析された結果として得られた前記クレーム文の格助詞「と」とものを指し示す修飾語である指示語とにより特定されて名詞または名詞と同じ働きを持つ名詞句を前記クレーム文の技術要素として特定する技術要素特定手段と、前記辞書ファイルを用いて解析して得られた前記クレーム文の末尾に記載されている名詞または名詞と同じ働きを持つ名詞句を発明の主題として特定する主題特定手段と、前記技術要素特定手段により該特定された技術要素の中から、前記主題特定手段により特定された前記主題に直接関係する該技術要素を１次の技術要素として特定する１次の技術要素特定手段と、この１次の技術要素特定手段により特定された該１次の技術要素と前記主題とを図面の上に配置し、該１次の技術要素と前記主題との関係を図式として表示するクレーム文第一の図式化手段とを備えたことを特徴とするものである。従って、この発明によれば、クレーム文第一の図式化手段により、クレーム文を分解して、その従属関係を図式化して表示されるようになり、クレーム文の構成が明確に理解できるようになる。
【００１２】
また、請求項２に記載の特許明細書デバッグツールは、前記辞書ファイルには日本文の自動解析に用いる形態素辞書にクレーム文の技術要素として用いられた用語を名詞相当の用語として追加登録した形態素解析用辞書を含むことを特徴とするものであり、従って、特殊な専門用語や複雑な形態素から構成される技術用語がクレーム文内にあっても容易に技術要素を抽出できるようになる。
【００１３】
また、請求項３に記載の特許明細書デバッグツールは、前記主題特定手段には、前記クレーム文の中から前提部と特徴部とに分離する要素を検出し、前記クレーム文を前提部と特徴部とに分離する機能と前記前提部の主題を特定する機能と、前記クレーム文を前提部と特徴部に分離する要素を検出できない場合は、前記クレーム文には前提部がなく、前記クレーム文を全て特徴部として処理する機能とを含むことを特徴とするものであり、従って、前提部つきのクレーム文と、構成要素列挙型のクレーム文とに識別して処理できるようになる。
【００１４】
また、請求項４に記載の特許明細書デバッグツールは、前記クレーム文第一の図式化手段には、前記前提部の主題と前記特徴部の主題ごとに前記１次の技術要素との関係を図式として表示する機能を有することを特徴とするものであり、従って、前提部つきのクレーム文であれば、前提部と特徴部に分けて図式化表示できるようになり、構成要素列挙型のクレーム文であれば、クレーム文をすべて特徴部として扱い、クレーム文を図式化して表示できるようになり、クレーム文の構成が明確に理解できるようになる。
【００１５】
また、請求項５に記載の特許明細書デバッグツールは、前記技術要素特定手段のうち、前記指示語により特定された技術要素に関して前記クレーム文の中の参照元の技術要素を特定する参照元技術要素特定手段と、この参照元技術要素特定手段により特定された該参照元の技術要素と該指示語により特定された技術要素とを前記クレーム文第一の図式化手段により表示された前記図式の上で線で結び、なおかつ、該指示語を該図式から削除した該図式を表示するクレーム文第二の図式化手段とを備えたことを特徴とするものである。従って、クレーム文第二の図式化手段により、参照元技術要素特定手段により定義された技術要素どうしの関係が明確に理解できるようになる。
【００１６】
また、請求項６に記載の特許明細書デバッグツールは、前記技術要素特定手段の中に、前記指示語の後に動詞または動詞と同じ働きを持つ動詞句または用言が続き、その後に名詞または名詞と同じ働きをもつ名詞句が続く場合は、該動詞または該動詞句または該用言を除いた該名詞または名詞と同じ働きをもつ名詞句を技術要素として特定する機能を含むことを特徴とするものであり、従って、指示語の後の動詞句を除かないと技術要素を抽出できない場合でも技術要素を抽出できるようになる。
【００１７】
また、請求項７に記載の特許明細書デバッグツールは、前記技術要素特定手段の中に、前記クレーム文の中の並立の意味を有する接続詞を格助詞「と」に変換する機能を含むことを特徴とするものであり、従って、この方法により接続詞「または」等により並立された語句を技術要素として抽出できるようになる。
【００１８】
また、請求項８に記載の特許明細書デバッグツールは、前記技術要素特定手段の中で、格助詞「と」に前置された語句として特定された技術要素を「と」による技術要素とし、この格助詞「と」の後に続く形態素の品詞と読点または符号とにより前記「と」による技術要素の属性を定め、該属性が前記クレーム文の中で先行した「と」による技術要素の属性から変化した状態を遷移状態として、該遷移状態により、前記「と」による技術要素どうしの従属関係を演算式またはこの演算式に準じた方式で決定する方法を前記１次の技術要素特定手段に含むことを特徴とするものである。従って、前記「と」による技術要素どうしの従属関係が演算式のみで決定できるようになるので、プログラムの処理の負担が軽くなるばかりでなく、多種多様な形態をとる日本語のクレーム文に対して柔軟な処理が可能になる。
【００１９】
また、請求項９に記載の特許明細書デバッグツールは、前記１次の技術要素特定手段のうち、前記「と」による技術要素どうしの従属関係を演算式で決定する方法の中に、前記「と」による技術要素に対し位置づけの値を定義する演算式を含み、該位置づけの値を定義する演算式は、前記遷移状態に基づいて定義され、また、特別に定義された技術要素については位置づけの値を補正できる機能を含み、前記位置づけの値の大小の比較により前記１次の技術要素を特定する機能を含むことを特徴とするものであり、従って、前記１次の技術要素を演算式のみで決定できるようになるので、プログラムの処理の負担が軽くなるばかりでなく、多種多様な形態をとる日本語のクレーム文に対して柔軟な処理が可能になる。
【００２０】
また、請求項１０に記載の特許明細書デバッグツールは、前記「と」による技術要素を特定する方法の中に、前記格助詞「と」が並立の意味を持たない場合は、該格助詞「と」に前置される語句を「と」による技術要素として抽出しない機能を含むことを特徴とするものであり、従ってこの方法により並立を意味しない格助詞「と」に前置される語句を技術要素として抽出されることを防止できるようになる。
【００２１】
また、請求項１１に記載の特許明細書デバッグツールは、前記技術要素特定手段により特定された技術要素の最後尾の形態素が助詞の場合は、該助詞を除いたものを技術要素として特定する機能を有することを特徴とするものであり、従ってこの方法により最後尾に助詞となる技術要素を抽出しなくなる。
【００２２】
また、請求項１２に記載の特許明細書デバッグツールは、前記前提部に属する前記技術要素のうち、他で参照されない技術要素があれば、第一の警告を出力することを特徴とするものである。従って、前提部に他で引用されない技術要素があるのに気づき、クレーム文の再検討が容易にできるようになる。
【００２３】
また、請求項１３に記載の特許明細書デバッグツールは、前記「と」による技術要素の属性の並びから判断して、格助詞「と」の使い方として、文法に反する場合があれば、第二の警告を出力することを特徴とするものである。従って、格助詞「と」の誤った使い方に気づき、クレーム文の再検討が容易にできるようになる。
【００２４】
また、請求項１４に記載の特許明細書デバッグツールは、前記主題特定手段により特定された前記前提部の主題と前記の特徴部の主題の比較を行い、それぞれの主題が異なる場合は、「クレーム文の前提部と特徴部との主題が異なる」ことを意味する文章または図形を出力することを特徴とするものであり、従って、前提部と特徴部との主題を同じものにすべきかの検討が容易にできるようになる。
【００２５】
また、請求項１５に記載の特許明細書デバッグツールは、前記クレーム文の中の前記技術要素の前記上位概念の用語を前記辞書ファイルの中の上位概念用語辞典から検出して出力する技術要素上位概念検出手段を備えたことを特徴とするものである。従って、技術要素を上位概念に置き換えるべきか等のクレーム文の再検討が容易にできるようになる。
【００２６】
また、請求項１６に記載の特許明細書デバッグツールは、前記特徴部の中の複数の技術要素のいずれかと前記前提部の中の複数の技術要素の前記上位概念の用語のいずれかとを比較して同じ語句があれば、第三の警告を出力することを特徴とするものである。従って、前提部に、特徴部より下位概念の語句が使われているのに気づき、クレーム文の再検討が容易にできるようになる。
【００２７】
さらに、請求項１７に記載のコンピュータプログラムを汎用のパーソナルコンピュータ等にインストールすることにより、汎用のパーソナルコンピュータを用いて、容易に特許明細書のクレーム文を図式化表示できるようになるため、クレーム文の構成やクレーム文の中の技術要素の関連を明確に理解できるようになり、また、クレーム文の再検討が容易にできるようになる。
【００２８】
【発明の実施の形態】
以下、この発明の特許明細書デバッグツールをコンピュータプログラムとして具体化した実施の形態について説明する。
【００２９】
図１は汎用のパーソナルコンピュータのブロック図であり、この汎用のパーソナルコンピュータにコンピュータプログラムとしての特許明細書デバッグツールをインストールして、本発明の機能を具体化させるものである。
【００３０】
中央演算装置であるＣＰＵ１と、ＲＡＭまたはＲＯＭで構成される内部メモリ２と、入出力インターフェース３とはバスライン１１で接続され、入出力インターフェース３には、データ通信端末４とデータ出力端末５とデータ入力端末６と外部メモリドライブ端末７とが接続されている。
【００３１】
データ通信端末４としては、外部サーバと接続するためのＴＡ（ターミナルアダプタ）や通信モデム等があり、データ出力端末５としては、モニター画面やプリンター等があり、データ入力端末６としては、キーボードやマウス等があり、外部メモリドライブ端末７としては、記憶媒体のＨＤ（ハードディスク）１２やＣＤ（コンパクトディスク）やＦＤ（フロッピディスク）等をドライブするＨＤドライバーやＣＤドライバーやＦＤドライバー等がある。
【００３２】
ＯＳ（オペレーティングシステム）８とＯＳ８の上で動く特許明細書デバッグツール９は外部メモリ（ＣＰＵ１にバスライン１１で直結されていないメモリ）ドライブ端末７の一つであるＨＤドライバーによりアクセスされるＨＤ（ハードディスク）１２の中にインストールされていて、ＯＳ８が内部メモリに書き込まれた後、特許明細書デバッグツール９の一部または全部が内部メモリ２に書き込み、図１４や図５のフローチャート等で示すプログラムを実行する形態をとる。
【００３３】
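The claim sentence to be analyzed by the patent specification debugging tool 9, such as the one shown in FIG. 3, is entered from the keyboard of the data input terminal 6, read from a storage medium through the external memory drive terminal 7, or received from an external server through the data communication terminal 4, and is stored in the patent specification data file 10.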
【００３４】
また、データ入力端末６のキーボードやマウス等は、特許明細書デバッグツール９を操作するために使われ、データの加工結果や図４に示すような技術要素相関図等は、データ出力端末５のモニター画面やプリンタ等に出力される。
【００３５】
図２は、特許明細書デバッグツール９と特許明細書データファイル１０との構成図である。特許明細書デバッグツール９はデバッグを実行する実行ファイル２０とクレーム文解析用辞書ファイル２９から構成され、実行ファイル２０は、デバッグ対象のクレーム文を形態素解析（クレーム文をクレーム文解析用辞書ファイル２９に内蔵された形態素解析用辞書３０で定義された文字列に分解し、品詞等を解析）するための形態素解析部２１と、形態素解析の結果とクレーム文解析用辞書ファイル２９に内蔵された前提部判定ファイル３２とにより、クレーム文の特徴部と前提部とを抽出し、特徴部と前提部との主題を抽出する主題抽出部２２と、格助詞「と」の前に置かれた名詞句をクレーム文の技術要素として抽出する「と」による技術要素抽出部２３と、主題抽出部２２で抽出した主題と「と」による技術要素抽出部２３で抽出した技術要素の関係を木構造等で出力する木構造出力部２４と、「上記」、「前記」、「この」、「これらの」、「該」、「同」などの指示語により特定できる技術要素を抽出する指示語による技術要素抽出部２５と、技術要素相互の関連を示す図を出力する技術要素相関図出力部２６と、技術要素の上位概念の検出を行う上位概念検索部２７と、クレーム文のエラー等を出力するエラーメッセージ出力部２８とに分かれていて、それぞれコンピュータプログラムとして機能する。
【００３６】
クレーム文解析用辞書ファイル２９は、形態素解析用辞書３０と上位概念辞書３１と前提部判定用語ファイル３２と前処理ファイル３３と指示語ファイル３４とポジション特別処理ファイル３５とから構成される。
【００３７】
形態素解析用辞書３０は、日本語の単語の辞書で単語の品詞や活用変化や語尾変化や単語と単語との関係から品詞を確定させるための文法等が格納されていて、形態素解析部２１でクレーム文を形態素解析するために用いられる。この形態素解析用辞書３０は、一般の日本文の自動解析に用いられる形態素辞書であっても良いが、この形態素辞書にクレーム文の技術要素として用いられた用語であれば、たとえば、「エンジン電子制御装置」のように本来４つの形態素「エンジン」、「電子」、「制御」、「装置」に分解される用語であっても、「エンジン電子制御装置」を１つの名詞相当の用語として追加登録できる「クレーム文解析用辞書としての形態素解析用辞書３０」であっても良い。
【００３８】
上位概念辞書３１は、語句に対する上位概念の語句が格納されていて、例えば、本発明でクレーム文の技術要素として「ばね」や「ゴム」といった語句が抽出された場合、それらの語句の上位概念として「弾性体」という語句が上位概念辞書３１に格納されていて、その上位概念の語句が自動的に検索できるようにするために用いられる。
【００３９】
前提部判定用語ファイル３２は、このファイルに登録されている語句、例えば「において、」や「であって、」等の語句で区切られたクレーム文のうち、文頭からこれらの語句までを前提部とし、これに続き文末までを特徴部と判定するために用いられる。
【００４０】
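The pre-processing file 33 is used to convert phrases registered in it, wherever they appear in the claim sentence, into specified replacement phrases so that technical elements can be extracted. For example, if the file specifies that the conjunction 「及び」 is to be converted into the case particle 「と」, every 「及び」 in the claim sentence is converted into 「と」.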
【００４１】
指示語ファイル３４は、このファイルに登録されている語句で、クレーム文内の技術要素の相互の関係を示すために用いる指示語、例えば、図１５に示すような「上記」、「前記」、「この」、「同」、「該」等などのものを指し示す修飾語で、これらの指示語に続き、直後の動詞句のような用言を除いた語句と、他の部分で一致する語句を検出し、その語句をあらたな技術要素と決める時に使用し、同ファイルで、それぞれの登録されている語句に定義されている「参照距離」の数値である「＋２」、「＋１」、「−１」、「−２」により、同ファイルに登録されている語句により決められた技術要素の参照対象とするもう一方の技術要素を決めるもので、「−２」であれば、先に出た技術要素のうち、一番先に出た技術要素を参照対象とし、「−１」であれば、先に出た技術要素のうち、一番最後に出た技術要素を参照対象とし、「＋１」であれば、後から出る技術要素のうち、一番先に出る技術要素を参照対象とし、「＋２」であれば、後から出る技術要素のうち、一番最後に出る技術要素を参照対象とするように用いられる。
【００４２】
ポジション特別処理ファイル３５は、このファイルに登録されている語句、例えば、「特徴」等が技術要素として検出された場合は、後述する位置づけＰＳの値を定められた方法に従い操作するために用いられる。
【００４３】
図２の特許明細書データファイル１０の構成部品として、クレームファイル５０には、デバッグの対象とする特許明細書の請求項の原文がクレーム文として格納されていて、アドレス付き技術要素ファイル５１は、図７に示すように、デバッグ対象のクレーム文が、形態素と技術要素に分解された表形式のファイルとして格納され、形態素と技術要素を「クレーム要素」の列に記載し、その左列を「アドレス」の項目としてアドレスを記し、「クレーム要素」の右の列には、「語義」、「属性ＴＸ」、「位置づけＰＳ」、「参照技術要素のアドレス」、「上位概念用語」を項目とする列が設けられ、「語義」の列には各要素の品詞や技術要素や主題や特徴部、前提部のそれぞれの文頭、文末等を表す語句を記し、「属性ＴＸ」の列には各要素の後述する属性ＴＸが記され、「位置づけＰＳ」の列には技術要素の後述する位置づけＰＳの値が記され、「参照技術要素のアドレス」の列には、指示語により特定された技術要素の参照対象とする技術要素のアドレスが記され、「上位概念用語」の列には、各技術要素の上位概念用語が記され、技術要素相関図ファイル５２には、図４のような木構造と木構造上で技術要素の相互の関係を図示するためのデータが格納されている。
【００４４】
図５は、デバッグ対象のクレーム文を入力した後、図４のような技術要素相関図を出力するまでの処理手順を示したフローチャートである。
【００４５】
具体的には、図３に示すクレーム文「ＡとＢとに保持されたＣと、前記Ｃを支持するＤと、ＥがＨに変位した時、前記Ｄを駆動するＩとを備えたＪにおいて、ＫがＬした時に、前記ＩがＭ以下となるＮと、前記ＫがＯした時に、前記Ｎが、ＰとＱとＲとを牽引してＳにするＴとを備えたことを特徴とするＪ。」を例にして説明する。なおＡ〜Ｔの部分には、任意の普通名詞等で構成される技術的な用語が入るものとする。
【００４６】
図５の処理Ｓ１では、デバッグ対象のクレーム文を形態素解析部２１において、形態素解析用辞書３０を参照して形態素解析（文を形態素解析用辞書３０に登録された文字列に分解し、品詞等を解析する）の処理が行われる。また、形態素解析用辞書３０に登録されていない未知語は、すべて名詞句として処理する。
【００４７】
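Next, in step S2, if the morphological analysis shows that the claim sentence being debugged contains a phrase registered in the pre-processing file 33, that phrase is converted into the replacement specified in the file. For example, if the conjunction 「および」 or 「及び」 is registered in the pre-processing file 33 with the case particle 「と」 specified as its replacement, it is converted accordingly. Grammatically, one should write 「東京と大阪とを結ぶ新幹線」 ("the Shinkansen connecting Tokyo and Osaka"), but claim sentences sometimes use the form 「東京及び大阪とを結ぶ新幹線」, and with that form the technical elements may not be extractable by the method described later. This is why the special handling of step S2 is needed. The form 「東京と大阪を結ぶ新幹線」 is also used, but it departs from the basic form in which every coordinated phrase carries the case particle 「と」, so in that case it is acceptable that 「大阪」 cannot be extracted as a technical element by the method described later.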
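The replacement performed in step S2 can be pictured with a minimal sketch. The rule table below stands in for the pre-processing file 33; its contents and the function name `preprocess` are assumptions for illustration, and the input is assumed to be a list of morphemes already produced by step S1.

```python
# Hypothetical contents of the pre-processing file 33: phrase -> replacement.
PREPROCESS_RULES = {"および": "と", "及び": "と"}

def preprocess(morphemes):
    """Replace every registered morpheme with its specified replacement."""
    return [PREPROCESS_RULES.get(m, m) for m in morphemes]

# 「東京及び大阪とを結ぶ新幹線」 becomes 「東京と大阪とを結ぶ新幹線」.
tokens = ["東京", "及び", "大阪", "と", "を", "結ぶ", "新幹線"]
converted = preprocess(tokens)
```

Working on morphemes rather than raw characters avoids rewriting a registered phrase that merely appears inside a longer word.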
【００４８】
次に、処理Ｓ３において、クレーム文の全文にアドレスを付ける。アドレスの値はクレーム文頭からの文字数で良い。
【００４９】
図６に処理Ｓ３までの形態素解析の結果の一部を示す。左の列から「アドレス」、「形態素相当」、「品詞相当」である。「品詞相当」の中で使われる「名詞句」は、全体として一つの名詞と同じはたらきをするものや、形態素解析用辞書３０で登録されている単語ごとに「名詞句」定義されている場合も「名詞句」として表す。例えば、「エンジン電子制御装置」は本来の形態素としては、「エンジン（普通名詞）」、「電子（普通名詞）」、「制御（普通名詞）」、「装置（普通名詞）」に分解され、この４つの語句が連なってできた語句であるが、形態素解析用辞書３０に「品詞相当」が「名詞句」として「エンジン電子制御装置」が形態素相当として登録されていれば、「最長一致法」（特開平１１−２５０８９号参照）を用いて形態素解析をした場合、「エンジン電子制御装置」が「名詞句」として解析される。「品詞相当」の中で使われる「動詞句」は、全体として一つの動詞と同じはたらきをするものや、形態素解析用辞書３０で登録されている単語ごとに「動詞句」と定義されている場合も「動詞句」として表す。ここで、図６の列の項目名を「形態素相当」としているのは、本来の形態素と異なるものが含まれ、「品詞相当」としているのは、本来の文法の品詞と異なるものが含まれるためである。
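The effect of registering a compound term in the morphological analysis dictionary 30 can be illustrated with a generic greedy longest-match segmenter. This is only a sketch under stated assumptions: it is not necessarily the method of the cited publication, the dictionary contents are illustrative, and `longest_match` is a hypothetical name. Following paragraph 0046, runs of characters not found in the dictionary are treated as noun phrases.

```python
def longest_match(text, dictionary):
    """Greedy longest-match segmentation.

    dictionary maps a surface string to its part-of-speech label.
    Unknown character runs are emitted as noun phrases (名詞句).
    """
    max_len = max(map(len, dictionary))
    result, unknown, i = [], "", 0
    while i < len(text):
        for n in range(min(max_len, len(text) - i), 0, -1):
            word = text[i:i + n]
            if word in dictionary:
                if unknown:  # flush pending unknown run first
                    result.append((unknown, "名詞句"))
                    unknown = ""
                result.append((word, dictionary[word]))
                i += n
                break
        else:  # no dictionary entry starts here
            unknown += text[i]
            i += 1
    if unknown:
        result.append((unknown, "名詞句"))
    return result

base = {"エンジン": "普通名詞", "電子": "普通名詞", "制御": "普通名詞", "装置": "普通名詞"}
extended = dict(base, **{"エンジン電子制御装置": "名詞句"})
```

With `base` the term splits into four common nouns; with `extended` the same text is returned as a single noun phrase, which is the behavior the paragraph describes.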
【００５０】
次に、処理Ｓ４において、前提部または特徴部の主題抽出を行う。この処理のより詳細な処理手順を図８に示す。
【００５１】
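When, in step S3 of FIG. 5, addresses have been assigned to the whole claim sentence as character counts from the start of the claim sentence, and the address of the first character of each morpheme-equivalent phrase is used as that phrase's address as shown in FIG. 6, then in step S401 of FIG. 8 the minimum address (1 in the example of FIG. 6) is the address of the start of the claim sentence, and the maximum address (121 in the example of FIG. 7) is the address of the end of the claim sentence and can also serve as the address of the end of the characterizing part.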
【００５２】
次に、処理Ｓ４０２で、検索範囲をクレーム全文に設定し、次に、判定処理Ｓ４０３で、クレーム全文の中に、前提部判定用語ファイル３２に登録されている語句（例えば、「において、」、「であって、」のようにクレーム文で前提部の区切りとして常用される語句）が有るかどうかを判定する。
【００５３】
判定処理Ｓ４０３の判定結果が“有る”の場合、判定処理Ｓ４０４で、前提部判定用語ファイル３２に登録されている語句が１項のクレーム文中に２つ以上有るかどうかを判定し、２つ以上あれば、処理Ｓ４０５において、“前提部判定用語が２回以上使われているエラー”として、エラー処理をして、処理Ｓ４０６において、全処理を終了し、判定処理Ｓ４０４の判定結果が“２つ以上無い”場合には、クレーム文中に前提部が存在するので、次に、処理Ｓ４０７で、クレーム文の文頭のアドレス（図６、図７の例では１）を前提部文頭のアドレスとし、次に、処理Ｓ４０８において、処理Ｓ４０３で検出した語句の先頭のアドレス（図７の例では５０）を前提部の文末のアドレスとし、次に、処理Ｓ４０９において、前提部文末の次の形態素相当のアドレス（図７の例では、“Ｋ”が次の形態素相当になり、そのアドレスは５５）を特徴部の文頭のアドレスとし、次に、処理Ｓ４１０において、前提部文末（図７の例では、アドレス５０の“において、”）直前の動詞句（図７の例では、アドレス４６の“備えた”）を含まない独立した最長の名詞句（図７の例では、アドレス４９の“Ｊ”）を「前提部主題」とし、「前提部主題」の先頭の文字のアドレスを前提部主題のアドレスとし、前提部主題の属性ＴＸを１として記録し、処理Ｓ４１３に進む。
【００５４】
ここで、属性ＴＸとは、図５の処理Ｓ４以下で抽出される「主題」、「技術要素」等のクレーム要素について、その属性を判別するための数値で、図９に示す表に従い定義され、属性ＴＸ＝０は、後述する特徴部主題であり、属性ＴＸ＝１は、前提部主題であり、属性ＴＸ＝３は、後述する技術要素であって、後ろに読点が連なっている格助詞「と」の前の動詞句を含まない独立した最長の名詞句であり、属性ＴＸ＝４は、後述する技術要素であって、後ろに格助詞が連なっている格助詞「と」の前の動詞句を含まない独立した最長の名詞句であり、属性ＴＸ＝５は、後述する技術要素であり、後ろに読点や格助詞を伴わない格助詞「と」の前の動詞句を含まない独立した最長の名詞句であり、属性ＴＸ＝７は、後述する技術要素であって、後述する指示語により指定された最長の語句として、それぞれ定義される。
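For elements found before the case particle 「と」, the attribute values 3, 4, and 5 depend only on what follows the particle. A minimal sketch of that rule is given below; the token representation (surface, part-of-speech label) and the name `tx_after_to` are assumptions, not taken from FIG. 9.

```python
def tx_after_to(tokens, i):
    """tokens[i] is the case particle と; return the attribute TX of the
    technical element placed before it.

    3: と is followed by a comma (読点)
    4: と is followed by another case particle (格助詞)
    5: と is followed by neither
    """
    next_pos = tokens[i + 1][1] if i + 1 < len(tokens) else None
    if next_pos == "読点":
        return 3
    if next_pos == "格助詞":
        return 4
    return 5

# Opening of the example claim: 「ＡとＢとに保持されたＣと、」
tokens = [("A", "名詞句"), ("と", "格助詞"), ("B", "名詞句"), ("と", "格助詞"),
          ("に", "格助詞"), ("保持された", "動詞句"), ("C", "名詞句"),
          ("と", "格助詞"), ("、", "読点")]
```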
【００５５】
また、上記の“独立した最長の名詞句”とは、用言を含まない“「名詞句１」＋格助詞＋「名詞句２」”という語句な場合に、格助詞が所有・限定を意味する「の」が使われている場合は、「名詞句１」が所有者で、「名詞句２」は「名詞句１」の従属者の立場になり独立していないため、「名詞句２」だけでは「前提部主題」とならないことを意味し、格助詞が前の語句に主語格を与える「が」や目的格を与える「を」などである場合には、「名詞句２」は「名詞句１」に対して独立した立場であるため、「名詞句２」だけで「前提部主題」になることを意味する。例えば「エンジンの冷却装置」という語句では、所有・限定を意味する格助詞「の」があるため独立していない名詞句「冷却装置」だけでは「前提部主題」にならないことを意味している。また、“「名詞句１」＋格助詞＋「名詞句２」”という語句の格助詞として、並立を意味する「と」が使われている場合も、「名詞句２」と「名詞句１」は２つで１組と考えられ、「名詞句２」が独立した最長の名詞句とは考えられない。
【００５６】
図８の判定処理Ｓ４０３の結果が“（前提部判定用語が）無い”の場合は、処理Ｓ４１１において、クレーム文頭を特徴部文頭のアドレスとし、次に、処理Ｓ４１２において、前提部に該当するアドレスが無いという意味で、前提部文頭のアドレスおよび前提部文末のアドレスを０に設定し、処理Ｓ４１３に進む。構成要素列挙型のクレーム文のように前提部が無いクレーム文は、この処理で全文が特徴部として以下処理される。
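The branching of steps S403 to S412 (no delimiter found, exactly one found, two or more found) could be sketched as follows. The list stands in for the preamble-determining term file 32; the function name, the return convention, and the use of character offsets instead of the addresses of FIG. 7 are assumptions for illustration.

```python
# Hypothetical contents of the preamble-determining term file 32.
PREAMBLE_TERMS = ["において、", "であって、"]

def split_claim(claim):
    """Return (preamble, characterizing_part).

    preamble is None when no registered term occurs, in which case the
    whole claim is treated as the characterizing part.
    """
    hits = [(pos, term) for term in PREAMBLE_TERMS
            for pos in range(len(claim)) if claim.startswith(term, pos)]
    if not hits:
        return None, claim
    if len(hits) >= 2:
        raise ValueError("preamble-determining term used two or more times")
    pos, term = hits[0]
    # The preamble ends where the registered term begins; the
    # characterizing part starts right after the term.
    return claim[:pos], claim[pos + len(term):]
```

Raising an exception for the duplicate case mirrors the text, where the tool records an error and stops all processing.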
【００５７】
処理Ｓ４１３において、特徴部文末（図７の例ではアドレス１２１の“。”）直前の動詞句（図７の例ではアドレス１１８の“する”）を含まない独立した最長の名詞句（図７の例ではアドレス１２０の“Ｊ”）を「特徴部主題」とし、「特徴部主題」の先頭の文字のアドレスを「特徴部主題」のアドレス（図７の例では１２０）とし、特徴部の属性ＴＸを図９の定義に従い０として、判定処理Ｓ４１４に進む。
【００５８】
判定処理Ｓ４１４において、特徴部主題と前提部主題の比較を行い、両者が等しければ処理Ｓ４を終了し、等しくなければ、処理Ｓ４１５で、“特徴部主題と前提部主題が等しくない”というメッセージを出力するための処理を行ってから処理Ｓ４を終了する。
【００５９】
次に、図５の処理Ｓ５において、格助詞「と」で特定できる技術要素の検出を行う。この処理のより詳細な処理手順を図１０に示す。
【００６０】
処理Ｓ５０１で検索範囲をクレーム全文に設定し、その検索範囲において、処理Ｓ５０２で、格助詞「と」の手前の動詞句を含まない独立した最長の名詞句をｉ番目の技術要素Ｎｉとして抽出し、技術要素Ｎｉの先頭の文字のアドレスを技術要素ＮｉのアドレスＡｉとして、図１２のように記録する。
【００６１】
図１２は、図３に示すクレーム文を解析して抽出した格助詞「と」の前の技術要素である。ここでｉ＝６の“Ｍ以下”は、「ＩがＭ以下となる」において、格助詞「と」に先行する語句は“ＩがＭ以下”となるが、“Ｉ”は主語格を示す格助詞「が」により分離されるため、“独立した最長の名詞句”として“Ｍ以下”が抽出されたことになり、ｉ＝１２の“特徴”は、「ことを特徴とする」において、格助詞「と」に先行する語句は“ことを特徴”になるが、目的格を示す格助詞「を」により、“ことを”と“特徴”とが分離されるため、独立した最長の名詞句として“特徴”が抽出されたことになる。“Ｍ以下”と“特徴”は、結果の意味を示す格助詞「と」についていたもので、その他の並立の意味を示す格助詞「と」についたものと性格が異なるが、処理Ｓ５０２を実行すると技術要素として抽出される。
【００６２】
“Ｍ以下”や“特徴”がクレーム文の技術要素として抽出されるのが不都合であれば、処理Ｓ１の形態素解析の他に、格助詞「と」の意味を解析する処理を追加し、“結果の意味を示す格助詞「と」に前置される語句は技術要素としない”、または、“並立の意味を示す格助詞「と」に前置される語句を技術要素とする”というルールを追加設定するようにしてもよい。
【００６３】
次に、処理Ｓ５０３において、技術要素Ｎｉの属性ＴＸｉを図９の属性ＴＸを定義する表に従い記録する。図１２に図３のクレーム文の技術要素Ｎｉの属性ＴＸｉを示す。
ｉ＝１のアドレスＡ１＝１の技術要素Ｎ１の“Ａ”は、格助詞「と」のみに前置されているので、ＴＸ１＝５となり、
ｉ＝２のアドレスＡ２＝３の技術要素Ｎ２の“Ｂ”は、“格助詞「と」＋格助詞「に」”に前置されているので、ＴＸ２＝４となり、
ｉ＝３のアドレスＡ３＝１１の技術要素Ｎ３の“Ｃ”は、“格助詞「と」＋読点”に前置されているので、ＴＸ３＝３となる。
以下同様の方法で、
技術要素Ｎ４の“Ｄ”は、ＴＸ４＝３となり、
技術要素Ｎ５の“Ｉ”は、ＴＸ５＝４となり、
技術要素Ｎ６の“Ｍ以下”は、ＴＸ６＝５となり、
技術要素Ｎ７の“Ｎ”は、ＴＸ７＝３となり、
技術要素Ｎ８の“Ｐ”は、ＴＸ８＝５となり、
技術要素Ｎ９の“Ｑ”は、ＴＸ９＝５となり、
技術要素Ｎ１０の“Ｒ”は、ＴＸ１０＝４となり、
技術要素Ｎ１１の“Ｔ”は、ＴＸ１１＝４となり、
技術要素Ｎ１２の“特徴”は、ＴＸ１２＝５となる。
【００６４】
次に、処理Ｓ５０４において、位置づけＰＳの初期値（ｉ＝１の技術要素Ｎ１の位置づけＰＳの値）であるＰＳ１を０に設定する。ここで、位置づけＰＳとは、技術要素Ｎｉの次元の高低を判定する数値で、より数値が大きければ、より主要な技術要素となり、より数値が小さければ、より枝葉な技術要素となる。前提部または特徴部において、最も大きな数値の位置づけＰＳの値をもつ技術要素が１次の技術要素となり、そのＰＳの値より１小さいＰＳの値を持つ技術要素が２次の技術要素となる。
【００６５】
次に、処理Ｓ５０５において、クレーム文の文頭から順次抽出された順に技術要素Ｎｉの属性ＴＸｉの値を１つ前の属性ＴＸｉ−１からの遷移の状態により、図１１に示す表の定義に基づき技術要素Ｎｉの位置づけＰＳｉの値を設定していく処理を行う。
【００６６】
図３のクレーム文の場合、図１２に示すようにｉ＝１の技術要素Ｎ１の“Ａ”の位置づけＰＳ１は初期設定された０のままであるが、ｉ＝２の技術要素Ｎ２の“Ｂ”の位置づけＰＳ２は、属性ＴＸがＴＸ１＝５からＴＸ２＝４に遷移したため、図１１の表の定義の“ＰＳ＝ＰＳ”に従い、ＰＳ１と同じ０となる。
【００６７】
図１１において、位置づけＰＳの値が１増えるのは、ＴＸが「４から３に」、「４から４に」、「４から５に」、「５から３に」遷移した場合であり、位置づけＰＳの値が１減るのは、ＴＸが「３から５に」遷移した場合のみであり、位置づけＰＳの値が増減しないのは、ＴＸが「３から３に」、「３から４に」、「５から４に」、「５から５に」遷移した場合である。なお、図１１に示す定義は、日本語の文例の検証をもとに定めたものであるが、より煩雑な文章に応用するためにはより高度な定義を必要とする。
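The transition rule just described can be checked against the example claim with a short sketch. The table below encodes only the nine transitions listed in this paragraph; the names `DELTA` and `positions` are assumptions, and transitions outside the table are deliberately left unsupported.

```python
# Change in position value PS for each (previous TX, current TX) transition.
DELTA = {
    (4, 3): +1, (4, 4): +1, (4, 5): +1, (5, 3): +1,
    (3, 5): -1,
    (3, 3): 0, (3, 4): 0, (5, 4): 0, (5, 5): 0,
}

def positions(tx_values):
    """Return the PS value of every element, starting from PS1 = 0."""
    if not tx_values:
        return []
    ps = [0]
    for prev, cur in zip(tx_values, tx_values[1:]):
        ps.append(ps[-1] + DELTA[(prev, cur)])
    return ps

# TX1..TX12 of the example claim (A, B, C, D, I, M以下, N, P, Q, R, T, 特徴).
example_tx = [5, 4, 3, 3, 4, 5, 3, 5, 5, 4, 4, 5]
example_ps = positions(example_tx)  # [0, 0, 1, 1, 1, 2, 3, 2, 2, 2, 3, 4]
```

The result agrees with the values stated in the text: PS3 = 1, PS7 = 3, PS8 = 2, PS11 = 3, and PS12 = 4 before the special correction.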
【００６８】
図１２に示すように、ｉ＝３の技術要素Ｎ３の“Ｃ”は属性ＴＸ３が「ＴＸ２の４から３に」遷移したため、位置づけＰＳは１つ増え、ＰＳ３は１になる。すなわち、位置づけＰＳが１となった技術要素Ｎ３の“Ｃ”は、位置づけＰＳが０の技術要素Ｎ１の“Ａ”や技術要素Ｎ２の“Ｂ”より主要な技術要素の位置づけになったと考えられる。
【００６９】
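Continuing in the same way, the attribute TX7 of technical element N7 "N" (i = 7) is reached by a transition "from 5 to 3" (TX6 = 5), so the position value increases by 1 and PS7 becomes 3. The attribute TX8 of technical element N8 "P" (i = 8) is reached by a transition "from 3 to 5", so the position value decreases by 1 and PS8 becomes 2. In other words, the position value of technical element N8 "P" is one smaller than that of technical element N7 "N", so "P" is one level less principal than "N".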
【００７０】
また、ｉ＝１１の技術要素Ｎ１１の“Ｔ”は属性ＴＸが「４から４に」遷移したため、位置づけＰＳは１増えＰＳ１１は３になり、ｉ＝１２の技術要素Ｎ１２の“特徴”は、属性ＴＸが「４から５に」遷移するため、位置づけＰＳは１増え、この段階では、ＰＳ１２は４になり、特徴部の中で最も大きな位置づけＰＳとなる。
【００７１】
次に、処理Ｓ５０６で、全技術要素Ｎｉを検索対象とし、判定処理Ｓ５０７において、各技術要素Ｎｉが「ポジション特別処理ファイル」３５に登録されている語句と一致するかどうかを判定し、ＮＯ、すなわち、一致するものが無ければ処理Ｓ５０９に進むが、ＹＥＳ、すなわち、一致するものがあれば処理Ｓ５０８で、その一致する技術要素Ｎｉの位置づけＰＳｉの値から１００減じたものをＰＳｉに設定しなおしてから、処理Ｓ５０９へ進む。「ポジション特別処理ファイル」に“特徴”が登録されている場合は、図１２の例のように、ｉ＝１２の技術要素Ｎ１２は、“特徴”であるため、処理Ｓ５０８において、位置づけＰＳ１２の値は４であったが、１００減じられ−９６となる。
【００７２】
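Next, in step S509, the technical elements Ni whose addresses belong to the characterizing part and whose position value PSi is the largest are extracted as the "primary technical elements" for the subject of the characterizing part. In step S510, the technical elements Ni whose addresses belong to the preamble and whose position value PSi is the largest are extracted as the "primary technical elements" for the subject of the preamble.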
【００７３】
図１２の例では、特徴部主題に関する１次の技術要素Ｎｉは、位置づけＰＳｉが特徴部で最大の３となるｉ＝７の“Ｎ”とｉ＝１１の“Ｔ”とになり、前提部主題に関する１次の技術要素Ｎｉは、位置づけＰＳｉが前提部で最大の１となるｉ＝３の“Ｃ”と、ｉ＝４の“Ｄ”と、ｉ＝５の“Ｉ”とになり、図４に示すように、それぞれ、特徴部主題の“Ｊ”と前提部主題の“Ｊ”とに直結する位置づけの技術要素となる。
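Steps S506 to S510 amount to a correction followed by a maximum search within each part. A minimal sketch, assuming the position special-processing file 35 contains only 「特徴」 and that elements and PS values are passed as parallel lists (both assumptions for illustration):

```python
# Hypothetical contents of the position special-processing file 35.
SPECIAL_TERMS = {"特徴"}

def primary_elements(elements, ps, special=SPECIAL_TERMS):
    """Return the elements with the largest PS in one part of the claim,
    after subtracting 100 from the PS of specially registered terms."""
    adjusted = [p - 100 if e in special else p for e, p in zip(elements, ps)]
    if not adjusted:
        return []
    top = max(adjusted)
    return [e for e, p in zip(elements, adjusted) if p == top]

preamble = primary_elements(["A", "B", "C", "D", "I"], [0, 0, 1, 1, 1])
characterizing = primary_elements(
    ["M以下", "N", "P", "Q", "R", "T", "特徴"], [2, 3, 2, 2, 2, 3, 4])
```

For the example claim this yields C, D, and I for the preamble and N and T for the characterizing part, as in FIG. 4.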
【００７４】
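Another property of the position value PS is that, within each of the characterizing part and the preamble, a technical element that does not have the maximum PS value is subordinate to the next technical element whose PS value is larger by 1. In the preamble of FIG. 12, "A" (i = 1) and "B" (i = 2) have a position value of 0, and both are subordinate, at the same level, to "C" (i = 3), the next element whose PS is larger by 1. In the characterizing part, "M以下" (i = 6) is subordinate to "N" (i = 7), and "P", "Q", and "R" (i = 8 to 10) are subordinate to "T" (i = 11), the next element whose position value is larger by 1. Thus, simply by analyzing the sequence of position values PSi in FIG. 12, it is easy to determine which technical element is subordinate to which. When a tree structure is built from the technical elements identified by the case particle 「と」, analyzing the sequence of position values in FIG. 12 therefore makes it possible to build a more detailed tree structure.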
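The subordination property can be sketched as a forward scan. The reading used here, "the first later element in the same part whose PS is exactly one larger", is an interpretation of the paragraph; the function name and the dictionary result (which requires element names to be unique within a part) are assumptions.

```python
def parents(elements, ps):
    """Map each element to the element it is subordinate to, or None.

    An element is subordinate to the first later element whose PS is
    exactly one larger; elements with no such successor map to None.
    """
    result = {}
    for i, (name, value) in enumerate(zip(elements, ps)):
        result[name] = next(
            (elements[j] for j in range(i + 1, len(ps)) if ps[j] == value + 1),
            None,
        )
    return result

# Characterizing part of the example, with 特徴 already corrected to -96.
tree = parents(["M以下", "N", "P", "Q", "R", "T", "特徴"],
               [2, 3, 2, 2, 2, 3, -96])
```

Here "M以下" maps to "N", and "P", "Q", and "R" map to "T", while the primary elements "N" and "T" have no parent among the elements and attach to the subject instead.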
【００７５】
図１０の判定処理Ｓ５０７と処理Ｓ５０８で、“特徴”の位置づけＰＳの値を特別処理として、１００減じたのは、“特徴”が特徴部主題に関する１次の技術要素として適切でないと考えたためである。
【００７６】
次に、処理Ｓ５１１において、処理Ｓ５０１から処理Ｓ５１０までの処理の間で、文法上、格助詞「と」が正しく使われていないと考えられる場合は、警告を出力するためのエラー処理を行う。例えば、属性ＴＸが３、４、５となる格助詞「と」が１個だけ存在する場合や、属性ＴＸが３となる格助詞「と」の後に、属性ＴＸが４となる格助詞「と」が無い場合等は警告を出力する。
【００７７】
次に、図５の処理Ｓ６において、前提部または特徴部の主題の説明部を抽出する。
【００７８】
処理Ｓ５１０で前提部の１次の技術要素を抽出できた場合、前提部で最後の１次の技術要素の次の語句から、前提部主題の直前の語句までを前提部主題の説明部として抽出し、処理Ｓ５１０で前提部の１次の技術要素を抽出できなかった場合は、前提部の文頭から前提部主題の直前の語句までを前提部主題の説明部として抽出する。図３のクレーム文の例では、図７、図１２に示すように、アドレス４３の技術要素“Ｉ”が最後の１次の技術要素となるため、次のアドレス４４の「と」から前提部主題であるアドレス４９の“Ｊ”の直前の語句「備えた」までの「とを備えた」を前提部主題の説明部として抽出する。
【００７９】
処理Ｓ５０９で特徴部の１次の技術要素を抽出できた場合、特徴部で最後の１次の技術要素の次の語句から、特徴部主題の直前の語句までを特徴部主題の説明部として抽出し、処理Ｓ５０９で特徴部の１次の技術要素を抽出できなかった場合は、特徴部の文頭から特徴部主題の直前の語句までを特徴部主題の説明部として抽出する。図３のクレーム文の例では、図７、図１２に示すように、アドレス１０６の技術要素“Ｔ”が最後の１次の技術要素となるため、次のアドレス１０７の「と」から特徴部主題であるアドレス１２０の“Ｊ”の直前の語句「する」までの「とを備えたことを特徴とする」を特徴部主題の説明部として抽出する。
【００８０】
次に、図５の処理Ｓ７において、処理Ｓ５で抽出した格助詞「と」で特定できる技術要素を説明する技術要素説明部を抽出する。
【００８１】
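In step S7, when there is a primary technical element Ni, the following candidate start points are considered:
（１）the phrase that follows another primary technical element located before Ni, skipping the case particle 「と」 and any other particle or comma immediately after that element,
（２）the start of the preamble, and
（３）the start of the characterizing part.
Among these, the candidate whose address minus the address of Ni is negative and closest to 0 is chosen, and the span from that candidate up to the phrase immediately before Ni is extracted as the technical element description part of that primary technical element.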
【００８２】
図３のクレーム文の例では、図１２に示すように前提部の第１番目の１次の技術要素は、ｉ＝３の“Ｃ”で、技術要素説明部は、前提部文頭から、“Ｃ”の直前までの「ＡとＢとに保持された」になり、前提部の第２番目の１次の技術要素は、ｉ＝４の“Ｄ”で、“Ｄ”には手前に別の１次の技術要素“Ｃ”があり、“Ｃ”の直後の格助詞「と」と読点を除いた語句から、“Ｄ”の直前の語句までの「前記Ｃを支持する」が技術要素説明部になる。同様にして、前提部第３番目の１次の技術要素“Ｉ”の技術要素説明部は「ＥがＨに変位した時、前記Ｄを駆動する」であり、特徴部第１番目の１次の技術要素“Ｎ”の技術要素説明部は「ＫがＬした時に、前記ＩがＭ以下となる」であり、特徴部第２番目の１次の技術要素“Ｔ”の技術要素説明部は「前記ＫがＯした時に、前記Ｎが、ＰとＱとＲとを牽引してＳにする」である。
【００８３】
ここまでの説明で、クレーム文を特徴部と前提部の各主題と、各主題の説明部と、各主題に関する１次の技術要素と、各１次の技術要素の技術要素説明部に分解することができたので、図５の処理Ｓ８で木構造の出力処理を行う。
【００８４】
図４は、図３のクレーム文を木構造で図示したものであるが、処理Ｓ８の段階では、まだ指示語の「前記」等は削除されていない。図４の例のように、前提部の主題と特徴部の主題とは分離して配置し、各主題の近くに、主題に関する１次の技術要素を分離して他と干渉しないように配置し、主題と１次の技術要素とをそれぞれ枠で囲み、主題の枠とその主題に関する１次の技術要素の枠とを線で結ぶ。次に、各主題の主題説明部をそれぞれの主題の近くに、１次の技術要素や枠と枠とを結ぶ線と干渉しないように配置する。次に、各１次の技術要素の技術要素説明部を関係する１次の技術要素の近くに他と干渉しないように配置し、その技術要素説明部を枠で囲み、その枠と、その技術要素説明部と関係する１次の技術要素を囲む枠とを線で結ぶ。また、１次の技術要素以外の技術要素も枠で囲み、クレーム文の木構造として出力する。
【００８５】
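Next, in step S9 of FIG. 5, the pairs of technical elements that can be identified by demonstrative words are extracted. Here, a demonstrative word is a phrase registered in the demonstrative word file 34 as shown in FIG. 15, such as 「上記」, 「前記」, 「この」, 「同」, or 「該」. It is used to identify a technical element as follows: the phrase following the demonstrative word, with any immediately following verb phrase removed, is compared with the rest of the claim sentence, and the longest string that also appears elsewhere is taken as the technical element. In the demonstrative word file 34, a "reference distance" of "+2", "+1", "−1", or "−2" is set for each registered phrase. Among the technical elements with the same wording located elsewhere, the one to be treated as the reference target of the technical element identified by the demonstrative word is decided by this "reference distance", according to the following "rule for deciding the reference technical element from the reference distance":
If the "reference distance" is "−2", the first-appearing technical element with the same wording is the reference technical element.
If the "reference distance" is "−1", the immediately preceding technical element with the same wording is the reference technical element.
If the "reference distance" is "+1", the immediately following technical element with the same wording is the reference technical element.
If the "reference distance" is "+2", the last-appearing technical element with the same wording is the reference technical element.
For example, 「前記」 is registered in the demonstrative word file 34 with a reference distance of "−2". In the sentence 「ＡはＢで、前記ＡはＣになるが、前記ＡはＤにならない。」, 「Ａは」 is a technical element identified by the demonstrative word 「前記」. Under the rule above, the reference technical element of both the second and the third 「Ａは」 is the first 「Ａは」; the reference technical element of the third 「Ａは」 is not the second 「Ａは」.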
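The reference-distance rule can be written as a small lookup over the addresses of all occurrences of the same phrase. This is a sketch under stated assumptions: the function name and the addresses in the example are illustrative, and returning `None` when no candidate exists is a choice made here rather than something the text specifies.

```python
def reference_target(addresses, own, distance):
    """Pick the reference technical element for the occurrence at `own`.

    addresses: sorted addresses of every occurrence of the same phrase.
    distance:  the reference distance of the demonstrative word
               (-2, -1, +1 or +2).
    """
    earlier = [a for a in addresses if a < own]
    later = [a for a in addresses if a > own]
    if distance == -2:   # first occurrence that appeared before
        return earlier[0] if earlier else None
    if distance == -1:   # occurrence immediately before
        return earlier[-1] if earlier else None
    if distance == 1:    # occurrence immediately after
        return later[0] if later else None
    if distance == 2:    # last occurrence that appears after
        return later[-1] if later else None
    raise ValueError("reference distance must be -2, -1, +1 or +2")

# Three occurrences of the same phrase at illustrative addresses 1, 8 and 18.
occurrences = [1, 8, 18]
```

With a distance of −2, both the second and third occurrences resolve to address 1, matching the 「前記Ａは」 example; with −1, the third occurrence would resolve to the second instead.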
【００８６】
また、他の部分で最も長い文で一致する語句を探す時に、指示語ファイル３４に登録されている語句に続いた“直後の動詞句を除く”理由は、「電波の干渉量を測定するステップと、この測定した干渉量に基づき」の例のように、指示語「この」の後の動詞句「測定した」を除かないと、「干渉量」を技術要素として、抽出できないためである。
【００８７】
図１３に処理Ｓ９のより詳細な処理手順を示す。
処理手順の主要な点として、処理Ｓ９０３で、指示語ファイル３４の中から「指示語」を選択し、判定処理Ｓ９０４で、クレーム文の中に該当する「指示語」があるかを検索し、有れば処理Ｓ９１０で「指示語」の後の直後の動詞句を除いた語句のうち、最初の句読点までの語句を取り出し「検索バッファ」にセットする。この時、この語句の先頭のアドレスＡＳｉと、指示語ＳＪｉとを記録しておく。次に、判定処理Ｓ９１１において、クレーム文の中の他の場所に、「検索バッファ」に登録された語句と同じ語句が有るか検索し、無ければ、処理Ｓ９１４で「検索バッファ」の語尾１文字を減らし、再度、判定処理Ｓ９１１で「検索バッファ」と同じ語句がクレーム文の中に有るかの判定処理を行う。
【００８８】
処理Ｓ９１４の後で、判定処理Ｓ９１５で「検索バッファ」の文字数が０かどうか判定し、０でなければ、判定処理Ｓ９１１を行うが、０であれば参照する技術要素が無いのに指示語を用いているので、処理Ｓ９１６でエラーとして記録して、判定処理Ｓ９０７に進む。
【００８９】
判定処理Ｓ９１１において、他の場所に「検索バッファ」と同じ語句があるとなった場合には、処理Ｓ９１２で、この時の「検索バッファ」内の語句を「技術要素」として登録し、この「指示語」と新たに検出された「技術要素」の「アドレス」を記録し、次に処理Ｓ９１３において、他の場所の同じ語句の「技術要素」を全て検索して、「指示語」とその検索された「技術要素」の「アドレス」を記録し、判定処理Ｓ９０７に進む。
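The search-buffer loop of steps S910 to S916 shortens the candidate phrase one character at a time until the same string is found somewhere else in the claim. A minimal sketch, using zero-based character offsets instead of the addresses of FIG. 7 and a hypothetical function name:

```python
def find_referent(claim, start, phrase):
    """Return the longest prefix of `phrase` that also occurs elsewhere.

    claim:  full claim text.
    start:  offset in `claim` where `phrase` begins (just after the
            demonstrative word and any removed verb phrase).
    Returns None when the buffer shrinks to nothing, which the tool
    records as an error (a demonstrative word with no referent).
    """
    buf = phrase
    while buf:
        pos = claim.find(buf)
        while pos != -1:
            if pos != start:      # a match at another location
                return buf
            pos = claim.find(buf, pos + 1)
        buf = buf[:-1]            # drop the last character and retry
    return None

claim = "KがLした時に、前記KがOした時に、"
start = claim.index("前記") + len("前記")
element = find_referent(claim, start, "KがOした時に")
```

For this fragment the buffer shrinks from 「KがOした時に」 down to 「Kが」, which reproduces the 「Ｋが」 element mentioned for the example claim, trailing particle included.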
【００９０】
判定処理Ｓ９０７で、指示語ファイル３４に別の「指示語」が無いとなった場合には、処理Ｓ９１７に進み、処理Ｓ９１０で記録したアドレスＡＳｉごとの指示語ＳＪｉに対する「参照距離」を指示語ファイル３４より求め、“「参照距離」の数値により参照技術要素を決めるルール”に従い、アドレスＡＳｉの「技術要素」に対する「参照技術要素」を特定し、相互の参照技術要素のアドレス（複数参照されていれば複数のアドレス）を“アドレス付き技術要素ファイル５１”に記録する。このアドレスは、図７の例であれば、「参照技術要素のアドレス」の列に記録し、参照元の技術要素であれば参照先の技術要素のアドレスが、参照先の技術要素であれば参照元の技術要素のアドレスが、それぞれ記録されていて、当該する技術要素が他のどの技術要素と関連しているか容易にわかるようにする。
【００９１】
図３のクレーム文の例では、「前記」が指示語として図１５の指示語ファイル３４に参照距離が「−２」として登録されているので、「Ｃ」、「Ｄ」、「Ｉ」、「Ｋが」、「Ｎ」が指示語で特定される技術要素として抽出される。また、参照技術要素は、それぞれの技術要素に１つずつ存在している。技術要素として、「Ｋが」のような語句が抽出されて不都合であれば、“技術要素のうち最後の語句が助詞であればその助詞を削除する”等のルールを追加設定してもよい。
【００９２】
次に、図５の処理Ｓ１０において、指示語ファイル３４に登録された指示語があれば、処理Ｓ８で出力した木構造の中から、その指示語を削除する。ここで指示語を削除するのは、図式を見やすくするためである。
【００９３】
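Next, in step S11 of FIG. 5, within the tree structure produced up to step S10, the technical elements extracted in step S9 are enclosed in frames, and the frame of each referring technical element is connected by a line to the frame of the technical element it refers to, producing a diagram that clearly shows that the two technical elements are related. For the claim sentence of FIG. 3, a technical element correlation diagram such as FIG. 4 can be output.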
【００９４】
以上が、デバッグ対象のクレーム文を入力した後、技術要素相関図を出力するまでの具体的な手順である。
【００９５】
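Next, as processing after the technical element correlation diagram has been output, as shown in the procedure of FIG. 14, step S21 sets the preamble in the addressed technical element file 51 as the search target. Decision step S22 then checks whether any technical element in that search target has an empty reference technical element address. If none does, processing proceeds to step S24; if any does, step S23 outputs, as a warning, that those technical elements are not referenced, and processing proceeds to step S24. In the example of FIGS. 4 and 7, "A" and "B" in the preamble were extracted as technical elements but are not referenced from anywhere else, so they are reported. The reason for reporting that "A" and "B" in the preamble are not referenced from anywhere is to make it easier for the applicant to consider whether stating "A" and "B" narrows the scope of the claims, whether the statement would become disadvantageous if the technical elements "A" and "B" fell out of use in the future, and whether the claim would still stand if the statements about "A" and "B" were deleted.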
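The check of steps S21 to S23 reduces to filtering preamble rows whose reference column is empty. In the sketch below the row layout loosely follows the columns described for the addressed technical element file 51, but the field names and the address values are assumptions for illustration.

```python
def unreferenced_in_preamble(rows):
    """Return the preamble technical elements with no reference address."""
    return [r["element"] for r in rows
            if r["part"] == "preamble" and not r["refs"]]

# Illustrative rows; addresses are placeholders, not the values of FIG. 7.
rows = [
    {"element": "A", "part": "preamble", "refs": []},
    {"element": "B", "part": "preamble", "refs": []},
    {"element": "C", "part": "preamble", "refs": [14]},
    {"element": "N", "part": "characterizing", "refs": [90]},
]
warnings = unreferenced_in_preamble(rows)
```

For these rows the result is A and B, the two elements the text singles out for reconsideration.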
【００９６】
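Next, step S24 sets the technical elements of the entire claim sentence as the search target, and step S25 uses the broader-term dictionary 31 to look up the broader terms of each technical element up to the second order (the broader term of the broader term) and records them in the addressed technical element file 51. Decision step S26 then checks whether any broader term of a technical element in the preamble is the same as a technical element in the characterizing part. If not, processing ends; if so, step S27 outputs that "the preamble contains a term narrower than one in the characterizing part", and all processing ends. The broader terms are written to the addressed technical element file 51 in step S25, as in FIG. 7, so that the applicant can more easily consider whether the application could be filed with a broader term substituted. The output of step S27 lets the applicant more easily check whether the preamble is written more narrowly than the characterizing part.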
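Steps S25 to S27 can be sketched as a bounded chain lookup followed by a set comparison. The dictionary below stands in for the broader-term dictionary 31: the entries for 「ばね」 and 「ゴム」 follow the example given for that dictionary, while the entry mapping 「弾性体」 to 「部材」 and both function names are assumptions added for illustration.

```python
# Stand-in for the broader-term dictionary 31 (term -> broader term).
# The 弾性体 -> 部材 entry is hypothetical.
BROADER = {"ばね": "弾性体", "ゴム": "弾性体", "弾性体": "部材"}

def broader_terms(term, depth=2):
    """Return the broader terms of `term`, up to `depth` levels."""
    found = []
    while term in BROADER and len(found) < depth:
        term = BROADER[term]
        found.append(term)
    return found

def narrower_in_preamble(preamble, characterizing):
    """Return preamble elements that have a broader term which also
    appears as a technical element of the characterizing part."""
    targets = set(characterizing)
    return [p for p in preamble if targets & set(broader_terms(p))]

flagged = narrower_in_preamble(["ばね"], ["弾性体"])
```

Here 「ばね」 in the preamble is flagged because its broader term 「弾性体」 appears in the characterizing part, which is the situation step S27 reports.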
【００９７】
上記の実施の形態は、請求項１項分のクレーム文を図式化表示する方法について説明したが、請求項が２項以上のクレーム文を入力した場合も、句点または“請求項”等のキーワードを基準にして、文の切れ目を見つけ、同様の手順で請求項２項以上の技術要素相関図等を出力できる。
【００９８】
上記の実施の形態では、格助詞「と」に前置される技術要素の属性ＴＸを図９にも示すように３種類のみを設定して説明したが、この属性ＴＸを４種類以上設定し、図１１に示すような属性ＴＸの遷移状態に対する位置づけＰＳを定義する方法もそれに対応するように変更してもよいし、図２に示すクレーム文解析用辞書ファイル２９に６種類の辞書ファイルのみについて説明したが、その他の辞書ファイルを追加設定し、あらたに追加設定した辞書ファイルを用いて図式化表示の方法を変更してもよい。
【００９９】
上記の実施の形態では、格助詞の「と」と指示語による技術要素を抽出する方法のみを説明したが、「または」や「かつ」などの接続詞により技術要素を抽出してクレーム文の一部を集合理論の図で図式化表示をしたり、構文解析や意味解析の手法を取り入れて、クレーム文の一部を簡易なフローチャート図やシステムブロック図で図式化表示することもできる。
【０１００】
上記の実施の形態では、クレーム文解析用辞書ファイル２９の形態素解析用辞書３０や上位概念辞書３１は、外部メモリのＨＤに格納して使用するように説明したが、形態素解析用辞書３０は、逐次公開される特許願公開文献の請求項に記載されている技術要素を抽出し追加登録したものをダウンロード等により使用できるようにしてもよいし、上位概念辞書３１はインターネット上のＷｅｂサーバーに格納して使用するようにして、これを利用する特許技術者が辞書の内容を随時更新できるように構成してもよい。
【０１０１】
上記の実施の形態は、汎用のパーソナルコンピュータで動かすコンピュータプログラムとして、実施の形態を説明したが、コンピュータであれば、汎用パーソナルコンピュータに限らず、ＣＰＵとプログラムを搭載した専用機として特許明細書デバッグツールを構成してもよい。
【０１０２】
【発明の効果】
以上説明したように、本発明によれば、多種多様な文構造を持つ日本語の特許明細書のクレーム文であっても、「において、」や「であって、」のようなクレーム文の前提部を示す常用の語句により、クレーム文の前提部と特徴部との分離と主題の抽出とができ、並立の意味を有する格助詞「と」に前置された語句により技術要素を抽出でき、格助詞「と」に続く助詞または読点の状態により技術要素の属性を数値化し、その１変数の属性の数値の遷移の状態から、技術要素の位置づけや従属関係を論理的に解析できるようになり、また、主題の説明部や技術要素の説明部を分離することにより、パターン照合法を用いずに、論理的手法で、多種多様な形態をとる日本語の特許明細書のクレーム文を木構造のような図式化表示ができるようになる。
【０１０３】
また、「前記」や「上記」などの指示語に続く語句からクレーム文の技術要素を抽出し、指示語の種類により、参照元、参照先の関係を定義し、上記の木構造の上で技術要素相互の関係を明確に図式化表示できるようになる。
【０１０４】
クレーム文を木構造で表し、技術要素の関係を明確に図式化表示することにより、出願人の発明者や特許明細書作成者が意図したようにクレーム文が作成されているか、容易に確認できるようになるだけでなく、格助詞「と」に関する文法上の誤りを自動的に検出できる他、前提部に他で参照されない技術要素が有れば、それを自動的に検出できるため、出願人の発明者や特許明細書作成者がクレーム文の見直しをすることが容易になる。
【０１０５】
上記で検出したクレーム文の技術要素に関して、上位概念の語句を自動的に検出して出力し、前提部に特徴部より下位概念の語句が有れば自動的に検出されるため、出願人の発明者や特許明細書作成者がクレーム文の技術要素の見直しを容易にできるようになる。
【０１０６】
日本文の自動解析に用いる形態素辞書にクレーム文の技術要素として用いられた用語を名詞相当の用語として追加登録した形態素解析用辞書を用いることにより、特殊な専門用語や複雑な形態素から構成される技術用語がクレーム文内にあっても容易に技術要素を抽出できるようになる。
【図面の簡単な説明】
【図１】ブロック図
【図２】特許明細書デバッグツールの構成図
【図３】デバッグ対象クレーム文
【図４】技術要素相関図
【図５】フローチャート
【図６】Ｓ３の段階での形態素解析結果
【図７】アドレス付き技術要素ファイル５１の例
【図８】前提部または特徴部の主題抽出のフローチャート
【図９】属性ＴＸ表
【図１０】格助詞「と」で特定できる技術要素の抽出フローチャート
【図１１】属性ＴＸの遷移状態に対する位置づけＰＳを決める式
【図１２】格助詞「と」の前の技術要素
【図１３】指示語で特定できる技術要素の抽出フローチャート
【図１４】ゼネラルフローチャート
【図１５】指示語ファイルの表
【符号の説明】
１　ＣＰＵ
２　内部メモリ
３　入出力インターフェース
４　データ通信端末
５　データ出力端末
６　データ入力端末
７　外部メモリドライブ端末
８　ＯＳ
９　特許明細書デバッグツール
１０　特許明細書データファイル
１１　バスライン
１２　ＨＤ
２０　実行ファイル
２１　形態素解析部
２２　主題部抽出部
２３　「と」による技術要素抽出部
２４　木構造出力部
２５　指示語による技術要素抽出部
２６　技術要素相関図出力部
２７　上位概念検索部
２８　エラーメッセージ出力部
２９　クレーム文解析用辞書ファイル
３０　形態素解析用辞書
３１　上位概念辞書
３２　前提部判定用語ファイル
３３　前処理ファイル
３４　指示語ファイル
３５　ポジション特別処理ファイル
５０　クレームファイル
５１　アドレス付き技術要素ファイル
５２　技術要素相関図ファイル
In the drawings, the same reference numerals denote the same or corresponding parts.
[0001]
[Technical field of the invention]
The present invention relates to a patent specification debugging tool that analyzes the claim sentences of a patent specification written in Japanese, displays them graphically, and automatically detects and outputs problems in the claim sentences.
[0002]
[Prior art]
Known tools for supporting the drafting of patent specifications include tools that display the drafting procedure, key drafting points, and checklists, as disclosed in JP-A-2001-306754 and JP-A-2000-48013, and a tool that converts an academic paper into a patent specification, as disclosed in JP-A-2002-207720.
[0003]
Further, as disclosed in Japanese Patent Application Laid-Open No. 9-293075, there is a tool which creates hierarchical data after pattern matching of a claim sentence of a patent specification and translates it into a second language.
[0004]
In addition, there is a large body of literature on techniques for automatically analyzing Japanese text, which are important related art for the present invention, including the machine-translation publications JP-A-11-25087 and JP-A-7-295985.
[0005]
[Problems to be solved by the invention]
For those unfamiliar with Japanese patent specifications, drafting one is a very difficult task, and commissioning a patent engineer is a heavy expense for a small company, which makes filing a patent application difficult. In particular, claim sentences can be long, and the various demonstrative words in them make them hard to decipher. As a result, a person unaccustomed to drafting Japanese patent specifications finds it difficult to judge whether the claim sentences they wrote say what they intended. Moreover, even when the applicant commissions a professional patent engineer to draft the specification, the applicant's inventor often cannot clearly judge whether the claim sentences were written as intended, so there is also the problem that an erroneous application may be filed.
[0006]
Decomposing a claim sentence into technical elements and displaying it graphically is an effective way to make the claim sentences of a Japanese patent specification clearly readable. However, with a pattern-matching method such as the prior art of JP-A-9-293075, it is practically impossible to prepare enough claim-sentence patterns to reach a practical level, because the claim sentences of Japanese patent specifications take a wide variety of forms. A method is therefore needed that analyzes and graphically displays claim sentences in a way better suited to how such claim sentences are actually written.
[0007]
Also, even a skilled patent engineer who is unfamiliar with the technical field of the specification being handled may overvalue an unnecessary technical element and write it into the preamble of the claim sentence, narrowing the scope of the claims. Likewise, even where the claim sentence could be written with a broader term for a technical element, an engineer unfamiliar with the field may fail to think of that term and file the application using the narrower term presented by the inventor as is, again narrowing the scope of the claims.
[0008]
As for techniques for automatically analyzing Japanese text, which are important related art for the present invention, the transfer method, in which the source text is subjected to morphological analysis, then syntactic analysis, then semantic analysis, is generally used. For ordinary text, the ambiguity of natural language means that sufficiently accurate analysis is not yet possible at the syntactic-analysis stage. The claim sentences of a patent specification, however, contain few ambiguous expressions, so it is necessary to focus on the special expressions customarily used in them and make claim sentences analyzable with current automatic analysis technology for Japanese text.
[0009]
The present invention has been made to solve the above problems. It focuses on two characteristics that hold even for the claim sentences of Japanese patent specifications, which have a wide variety of sentence structures: the technical elements that make up a claim are placed before the case particle 「と」 ("to"), and a technical element is placed after a modifier such as 「前記」 or 「該」 that is used to show the relationship between technical elements within the claim sentence (hereinafter written as "demonstrative word", meaning a modifier that points to a thing).
The claim sentence is morphologically analyzed using a "morpheme dictionary for claim sentence analysis". This is a morpheme dictionary of the kind used for automatic analysis of general Japanese text, to which any term that has been used as a technical element of a claim sentence can be additionally registered as a single noun-equivalent term, even a term such as 「エンジン電子制御装置」 (engine electronic control unit) that would normally be decomposed into the four morphemes 「エンジン」, 「電子」, 「制御」, and 「装置」.
From the result of the morphological analysis, the technical elements of the claim sentence are extracted from the phrases placed before the case particle 「と」, and the dependency relationships among those technical elements are analyzed logically. The technical elements directly connected to the element that specifies the invention, stated at the end of the claim (hereinafter "subject"), are extracted as primary technical elements, and the claim is displayed graphically as a tree structure branching from the subject to each primary technical element.
In addition, the technical elements that can be identified by demonstrative words in the claim sentence are extracted, the relationship between the referring technical element and the referenced technical element is determined from the type of demonstrative word, and the reference relationships between technical elements are shown clearly by connecting them with lines on the tree structure. By displaying the content of the claim sentence graphically in this way, the applicant's inventor or the drafter of the specification can clearly judge whether the claim sentence was written as intended.
[0010]
In the case of a claim sentence with a premise part, the claim sentence is separated into the premise part and the feature part, the subject of each is extracted, and the tree structure is displayed diagrammatically for each of the premise part and the feature part.
Technical elements in the premise part that are not referred to elsewhere are automatically detected and output to prevent the claims from being narrowed; the higher-concept terms of the technical elements are detected and output; and a term in the premise part that is of a lower concept than one in the feature part is automatically detected. The inventor, the applicant, or the drafter of the patent specification can thus easily review the technical elements of the claim sentence.
[0011]
[Means for Solving the Problems]
In order to achieve this object, the patent specification debug tool according to claim 1 of the present invention is a patent specification debug tool having a data input/output unit, and comprises: a dictionary file for analyzing the claim sentence of the patent specification; technical element specifying means for specifying, as a technical element of the claim sentence, a noun or a noun phrase having the same function as a noun that is identified by the case particle "to" or by a designator (a modifier that points to a thing) in the claim sentence, as obtained by analysis using the dictionary file; subject specifying means for specifying, as the subject of the invention, a noun or a noun phrase having the same function as a noun that is described at the end of the claim sentence, as obtained by analysis using the dictionary file; primary technical element specifying means for specifying, among the technical elements specified by the technical element specifying means, a technical element directly related to the subject specified by the subject specifying means as a primary technical element; and claim sentence first diagramming means for arranging the primary technical element and the subject on a drawing and displaying the relationship between them as a diagram. Therefore, according to the present invention, the claim sentence is decomposed by the claim sentence first diagramming means and its dependency relations are displayed graphically, so that the configuration of the claim sentence can be clearly understood.
[0012]
Further, the patent specification debug tool according to claim 2 is characterized in that the dictionary file includes an analysis dictionary in which terms used as technical elements of claim sentences are additionally registered, as terms equivalent to nouns, in the morphological dictionary used for automatic analysis of Japanese sentences. Technical elements can therefore be extracted easily even if the claim sentence contains special technical terms or technical terms composed of multiple morphemes.
[0013]
In the patent specification debug tool according to claim 3, the subject specifying means has a function of detecting, in the claim sentence, an element that separates it into a premise part and a characteristic part, separating the claim sentence into the premise part and the characteristic part, and specifying the subject of each; and a function of treating the whole claim sentence as the characteristic part, with no premise part, when no such separating element can be detected. It is therefore possible to distinguish and process both claim sentences with a premise part and claim sentences of the component enumeration type.
[0014]
Further, the patent specification debug tool according to claim 4 is characterized in that the claim sentence first diagramming means has a function of displaying, as a diagram, the relationship between the primary technical elements and the subject separately for the subject of the premise part and the subject of the characteristic part. Therefore, a claim sentence that has a premise part can be displayed graphically divided into the premise part and the characteristic part, while for a claim sentence of the component enumeration type the whole claim sentence is treated as the characteristic part and displayed diagrammatically, so that the structure of the claim sentence can be clearly understood.
[0015]
Further, the patent specification debug tool according to claim 5 includes, within the technical element specifying means, reference-source technical element specifying means for specifying the technical element in the claim sentence that is referred to by a technical element specified by a descriptive word; and claim sentence second diagramming means for displaying a diagram in which the reference-source technical element specified by the reference-source technical element specifying means and the technical element specified by the descriptive word, with the descriptive word deleted, are connected by a line on the diagram. The claim sentence second diagramming means therefore makes it possible to understand clearly the relationship between the technical elements identified by the reference-source technical element specifying means.
[0016]
Further, in the patent specification debug tool according to claim 6, the technical element specifying means includes the following function: if the descriptive word is followed by a verb, or by a verb phrase having the same function as a verb, which is in turn followed by a noun or a noun phrase having the same function as a noun, the noun or noun phrase, with the verb or verb phrase excluded, is specified as a technical element. A technical element can therefore be extracted even in cases where it could not be extracted unless the verb phrase after the descriptive word is removed.
[0017]
Further, the patent specification debug tool according to claim 7 is characterized in that the technical element specifying means includes a function of converting a conjunction having a parallel meaning in the claim sentence into the case particle "to". Phrases arranged in parallel with a conjunction such as "or" can therefore be extracted as technical elements.
[0018]
In addition, the patent specification debug tool according to claim 8 is characterized as follows. In the technical element specifying means, a technical element specified as a phrase preceding the case particle "to" is called a technical element by "to". The attribute of a technical element by "to" is determined by the part of speech of the morpheme following that case particle "to" and by any reading point (comma) or symbol. The way the attribute changes from that of the preceding technical element by "to" in the claim sentence is taken as a transition state. The primary technical element specifying means includes a method of determining the subordinate relation between the technical elements by "to" using an arithmetic expression, or a method equivalent to an arithmetic expression, based on the transition state. Because the dependency between the technical elements by "to" can be determined by arithmetic expressions alone, the processing load of the program is reduced and Japanese claim sentences in a wide variety of forms can be processed flexibly.
[0019]
In addition, in the patent specification debug tool according to claim 9, the method by which the primary technical element specifying means determines the dependency relationship between the technical elements by "to" using an arithmetic expression includes an arithmetic expression that defines a positioning value for each technical element by "to". This arithmetic expression is defined based on the transition state, and the means has a function of identifying the primary technical element by comparing the positioning values. Because the primary technical element can be determined by arithmetic expressions alone, the processing load of the program is reduced and Japanese claim sentences in various forms can be processed flexibly.
[0020]
Further, in the patent specification debug tool according to claim 10, the method for specifying technical elements by "to" includes a function whereby, when the case particle "to" does not have a parallel meaning, the phrase placed before that case particle "to" is not extracted as a technical element by "to". Such a phrase can therefore be prevented from being extracted as a technical element.
[0021]
Also, the patent specification debug tool according to claim 11 has a function of, when the last morpheme of a technical element specified by the technical element specifying means is a particle, specifying the element with that particle removed as the technical element. A technical element ending in a particle is therefore never extracted.
[0022]
Further, the patent specification debug tool according to claim 12 outputs a first warning when, among the technical elements belonging to the premise part, there is a technical element that is not referred to elsewhere. The user can therefore notice that the premise part contains a technical element that is not cited elsewhere and can easily reconsider the claim sentence.
[0023]
In addition, the patent specification debug tool according to claim 13 judges, from the arrangement of the attributes of the technical elements by "to", whether the case particle "to" is used contrary to grammar, and outputs any such usage. The user can therefore notice the erroneous use of the case particle "to" and easily reconsider the claim sentence.
[0024]
In addition, the patent specification debug tool according to claim 14 compares the subject of the premise part specified by the subject specifying means with the subject of the characteristic part and, if they differ, outputs a sentence or figure meaning that "the subject of the premise part is different from the subject of the characteristic part". It therefore becomes easy to examine whether the subjects of the premise part and the characteristic part should be the same.
[0025]
In addition, the patent specification debug tool according to claim 15 is provided with higher-concept detecting means that detects, from a broader term dictionary in the dictionary file, the higher-concept term of a technical element in the claim sentence and outputs it. It is therefore easy to reconsider whether a technical element in the claim sentence should be replaced with a higher-concept term.
[0026]
Also, the patent specification debug tool according to claim 16 compares any of the plurality of technical elements in the characteristic part with any of the terms of the superordinate concept of the plurality of technical elements in the premise part. If the same word is found, a third warning is output. Therefore, it is possible to notice that the words of the lower concept are used in the premise part rather than the characteristic part, and it is possible to easily review the claim sentence.
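The comparison behind the third warning can be illustrated with a minimal Python sketch. It assumes the broader term dictionary is a plain mapping from a term to its higher-concept term; the names `BROADER` and `third_warnings` and the sample elements "shaft" and "lever" are hypothetical, while "spring", "rubber", and "elastic body" are the examples given for the high-level concept dictionary. It is not the tool's actual implementation.

```python
# Hypothetical broader-term dictionary: term -> higher-concept term.
BROADER = {"spring": "elastic body", "rubber": "elastic body"}


def third_warnings(premise_elements, feature_elements, broader=BROADER):
    """Return (premise term, broader term) pairs for which the broader term
    of a premise-part technical element also appears in the characteristic
    part, i.e. the premise part uses the lower-concept wording."""
    feature = set(feature_elements)
    hits = []
    for term in premise_elements:
        upper = broader.get(term)  # None when no broader term is registered
        if upper is not None and upper in feature:
            hits.append((term, upper))
    return hits
```

Each returned pair is a candidate for review: the drafter can check whether the premise part should also use the broader term.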
[0027]
Further, by installing the computer program according to claim 17 on a general-purpose personal computer or the like, the claim text of the patent specification can be easily displayed in a schematic form using a general-purpose personal computer. And the relationship between the technical elements in the claim statement can be clearly understood, and the claim statement can be easily reviewed.
[0028]
[Best Mode for Carrying Out the Invention]
Hereinafter, an embodiment in which the patent specification debug tool of the present invention is embodied as a computer program will be described.
[0029]
FIG. 1 is a block diagram of a general-purpose personal computer, in which a patent specification debug tool as a computer program is installed in the general-purpose personal computer to realize the functions of the present invention.
[0030]
A CPU 1, which is a central processing unit, an internal memory 2 composed of a RAM or a ROM, and an input/output interface 3 are connected by a bus line 11, and a data communication terminal 4, a data output terminal 5, a data input terminal 6, and an external memory drive terminal 7 are connected to the input/output interface 3.
[0031]
The data communication terminal 4 includes a TA (terminal adapter) and a communication modem for connecting to an external server; the data output terminal 5 includes a monitor screen and a printer; the data input terminal 6 includes a keyboard, a mouse, and the like; and the external memory drive terminal 7 includes an HD driver, a CD driver, and an FD driver that drive storage media such as an HD (hard disk) 12, a CD (compact disk), and an FD (floppy disk).
[0032]
An OS (Operating System) 8 and the patent specification debug tool 9 running on the OS 8 are held in external memory (memory that is not directly connected to the CPU 1 via the bus line 11) accessed through the HD driver, which is one of the external memory drive terminals 7. After the OS 8 has been written into the internal memory 2, part or all of the patent specification debug tool 9 is written into the internal memory 2, and the program shown in the flowcharts is executed.
[0033]
A claim sentence to be analyzed by the patent specification debug tool 9, such as the one shown in FIG. 3, is input via the keyboard of the data input terminal 6, input from a storage medium via the external memory drive terminal 7, or input from an external server via the data communication terminal 4, and is stored in the patent specification data file 10.
[0034]
The keyboard and mouse of the data input terminal 6 are used to operate the patent specification debug tool 9. The data processing results and the technical element correlation diagram are output to a monitor screen or a printer.
[0035]
FIG. 2 is a configuration diagram of the patent specification debug tool 9 and the patent specification data file 10. The patent specification debug tool 9 includes an execution file 20 for executing debugging and a claim sentence analysis dictionary file 29. The execution file 20 consists of the following units, each of which functions as a computer program: a morphological analysis unit 21 that morphologically analyzes the claim sentence to be debugged (decomposes the claim sentence into character strings defined by the morphological analysis dictionary 30 incorporated in the claim sentence analysis dictionary file 29 and analyzes their parts of speech and the like); a subject extraction unit 22 that uses the premise part determination term file 32 to extract the feature part and the premise part of the claim sentence and extracts the theme of each; a "to" technical element extracting unit 23 that extracts a noun phrase placed before the case particle "to" as a technical element of the claim sentence; a tree structure output unit 24 that outputs, as a tree structure or the like, the relationship between the subject extracted by the subject extraction unit 22 and the technical elements extracted by the "to" technical element extracting unit 23; a technical element extracting unit 25 using instruction words, which extracts technical elements that can be specified by an instruction word such as "above", "the above", "this", "these", "the", or "the same"; a technical element correlation diagram output unit 26 that outputs a diagram showing the relationship between the technical elements; a high concept searching unit 27 that detects the high concept of a technical element; and an error message output unit 28 that outputs errors in the claim sentence.
[0036]
The claim sentence analysis dictionary file 29 includes a morphological analysis dictionary 30, a high-level concept dictionary 31, a premise part determination term file 32, a pre-processing file 33, a directive word file 34, and a position special processing file 35.
[0037]
The morphological analysis dictionary 30 is a dictionary of Japanese words that stores the part of speech of each word, the grammar for determining the part of speech from inflection or word endings, the relations between words, and the like, and is used for morphological analysis of claim sentences. The morphological analysis dictionary 30 may be a morphological dictionary used for automatic analysis of general Japanese sentences. Alternatively, a "morphological analysis dictionary 30 as a claim sentence analysis dictionary" may be used, in which any term used as a technical element of a claim sentence can be additionally registered as a term equivalent to one noun. For example, "engine electronic control unit" can be registered as one such term even though it would originally be decomposed into the four morphemes "engine", "electronic", "control", and "unit".
[0038]
The high-level concept dictionary 31 stores the high-level concepts of words and phrases. For example, the word "elastic body" is stored in the high-level concept dictionary 31 as the high-level concept of words such as "spring" and "rubber", so that when these words are extracted as technical elements of a claim sentence in the present invention, the word of their high-level concept can be searched for automatically.
[0039]
The premise part determination term file 32 is used, for a claim sentence delimited by a word or phrase registered in this file (for example, a word such as "at,"), to determine the part up to that word as the premise part and the part following it, up to the end of the sentence, as the characteristic part.
[0040]
The pre-processing file 33 is used to convert words registered in this file into specified words in the claim sentence so that technical elements can be extracted. For example, if the file specifies that the conjunction "and" is to be converted to the case particle "to", "and" in the claim sentence is converted to "to".
[0041]
The directive word file 34 registers the descriptive words used to indicate the mutual relationship between the technical elements in the claim sentence, that is, modifiers that point to things, such as "this", "same", and "the". It is used to determine as a new technical element the phrase that follows one of these descriptive words, excluding any verb or verb phrase immediately after the descriptive word, when the same words appear in another part of the claim sentence. The file also defines, for each registered phrase, a numerical "reference distance" of "+2", "+1", "-1", or "-2", which determines the other technical element that is referred to by the technical element determined by that phrase. If the value is "-2", the first of the previously appearing technical elements is referred to; if "-1", the last of the previously appearing technical elements is referred to; if "+1", the first of the later appearing technical elements is referred to; and if "+2", the last of the later appearing technical elements is referred to.
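The reference distance rule can be sketched in Python as follows. This is only an illustration under stated assumptions: technical elements are represented as hypothetical `(address, text)` pairs in order of appearance, candidate referents are elements with identical text, and "-2" is read as "the first previously appearing element" by symmetry with the other three values. The function name `resolve_reference` is not from the source.

```python
def resolve_reference(elements, index, distance):
    """Return the address of the element referred to by elements[index],
    or None when no candidate exists.

    elements: list of (address, text) in order of appearance (assumed shape).
    distance: reference distance, one of -2, -1, +1, +2.
    """
    text = elements[index][1]
    # Candidates are other occurrences of the same words (assumption).
    earlier = [e for e in elements[:index] if e[1] == text]
    later = [e for e in elements[index + 1:] if e[1] == text]
    # distance -> (candidate pool, position picked inside that pool)
    rule = {
        -2: (earlier, 0),   # first of the previously appearing elements
        -1: (earlier, -1),  # last of the previously appearing elements
        1: (later, 0),      # first of the later appearing elements
        2: (later, -1),     # last of the later appearing elements
    }
    pool, pick = rule[distance]
    return pool[pick][0] if pool else None
```

With a directive such as "the" registered at distance -1, "the C" would resolve to the nearest earlier "C", which is the usual antecedent reading.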
[0042]
The position special processing file 35 is used to manipulate the value of the positioning PS, which will be described later, according to a predetermined method when a word registered in the file, for example "feature", is detected as a technical element.
[0043]
As components of the patent specification data file 10 of FIG. 2, the claim file 50 stores the original text of the claims of the patent specification to be debugged as claim sentences. As shown in FIG. 7, the claim sentence to be debugged is also stored as a tabular file decomposed into morphemes and technical elements. The morphemes and technical elements are entered in the "claim element" column; the column to its left holds the "address"; and the columns to its right hold "semantic", "attribute TX", "positioning PS", "address of reference technical element", and "super-concept term". The "semantic" column holds words indicating the part of speech of each element, or that the element is a technical element, a subject, or the beginning or end of the sentence of the characteristic part or the premise part. The "attribute TX" column holds the attribute TX of each element, described later. The "positioning PS" column holds the value of the positioning PS of each technical element, described later. The "address of reference technical element" column holds the address of the technical element that is referred to. The "super-concept term" column holds the superordinate term of each technical element. The technical element correlation diagram file 52 stores the data for drawing the tree structure and for illustrating the mutual relationship of the technical elements on the tree structure.
[0044]
FIG. 5 is a flowchart showing a processing procedure from input of a claim sentence to be debugged to output of a technical element correlation diagram as shown in FIG.
[0045]
More specifically, the processing is described using the claim sentence shown in FIG. 3 as an example: "In a J comprising a C held in A and B, a D supporting the C, and an I driving the D when E is displaced to H, a J characterized in that, when K is L, N is such that the I is less than or equal to M, and when K is O, the N pulls P, Q and R to S to make S." Note that each of the portions A to T stands for a technical term, including arbitrary common nouns and the like.
[0046]
In processing S1 of FIG. 5, the morphological analysis unit 21 refers to the morphological analysis dictionary 30 to morphologically analyze the sentence to be debugged (it decomposes the sentence into character strings registered in the morphological analysis dictionary 30 and analyzes their parts of speech and the like). All unknown words that are not registered in the morphological analysis dictionary 30 are processed as noun phrases.
[0047]
Next, in process S2, if words registered in the pre-processing file 33 are found in the claim sentence to be debugged as a result of the morphological analysis, they are converted into the words specified in the pre-processing file 33. For example, if a conjunction meaning "and" is registered in the pre-processing file 33 with the instruction to convert it to the case particle "to", it is converted accordingly. Grammatically, parallel words should originally each be followed by the case particle "to", as in "the Shinkansen connecting Tokyo "to" Osaka "to"", but a claim sentence sometimes joins them with a conjunction instead, as in "the Shinkansen connecting Tokyo and Osaka". Since a technical element may then not be extractable by the method described later, a special treatment such as process S2 is necessary. The same phrase may also be written with the case particle "to" after "Tokyo" only. This is contrary to the original form of attaching the case particle "to" to every parallel word, so in this case the method described later need not be able to extract "Osaka" as a technical element.
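A minimal Python sketch of the conversion in process S2 is shown below. It assumes the morphological analysis result is available as a list of hypothetical `(surface, part of speech)` pairs and that the pre-processing file 33 can be modelled as a small mapping; English and romanized placeholders stand in for the Japanese words. The names `PREPROCESS` and `preprocess` are illustrative only.

```python
# Hypothetical stand-in for the pre-processing file 33:
# registered conjunction -> (replacement surface, replacement part of speech).
PREPROCESS = {"and": ("to", "case particle")}


def preprocess(morphemes, table=PREPROCESS):
    """Replace registered conjunctions with the case particle "to".

    morphemes: list of (surface, part_of_speech) pairs (assumed shape).
    Returns a new list; the input list is left unchanged.
    """
    converted = []
    for surface, pos in morphemes:
        if pos == "conjunction" and surface in table:
            converted.append(table[surface])
        else:
            converted.append((surface, pos))
    return converted
```

After this step both parallel words are followed by "to" or an equivalent marker, so a later extraction pass that looks only at the phrase before "to" can find the first of them without special cases for conjunctions.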
[0048]
Next, in step S3, addresses are assigned to the whole of the claim sentence. The address value may be the number of characters from the beginning of the claim sentence.
[0049]
FIG. 6 shows part of the result of the morphological analysis up to processing S3. The columns are, from the left, "address", "morpheme equivalent", and "part of speech equivalent". A "noun phrase" in the "part of speech equivalent" column is a phrase that as a whole has the same function as one noun, or a phrase registered in the morphological analysis dictionary 30 with "noun phrase" as its part of speech. For example, "engine electronic control unit" is originally decomposed into the four consecutive morphemes "engine (common noun)", "electronic (common noun)", "control (common noun)", and "unit (common noun)". However, if "engine electronic control unit" is registered in the morphological analysis dictionary 30 as a morpheme equivalent whose part of speech is "noun phrase", the "longest match method" (see Japanese Unexamined Patent Application Publication No. 11-25089) analyzes "engine electronic control unit" as a single "noun phrase". Likewise, a "verb phrase" in the "part of speech equivalent" column is a phrase that as a whole works the same as one verb, or a phrase registered in the morphological analysis dictionary 30 with "verb phrase" as its part of speech. The column headings in FIG. 6 are "morpheme equivalent" and "part of speech equivalent" because the entries include words that differ from the original morphemes and parts of speech that differ from the original grammatical parts of speech.
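The effect of registering a compound term can be illustrated with a simplified greedy longest-match sketch in Python. It is not the method of the cited publication, only an illustration under assumptions: the dictionary is a hypothetical mapping from character string to part of speech, the input has no spaces (as in Japanese text, here imitated with romanized placeholders), addresses are 1-based character offsets as in step S3, and runs of unregistered characters are emitted as noun phrases as in processing S1.

```python
def longest_match(text, dictionary):
    """Split text into (address, morpheme equivalent, part of speech) rows,
    always preferring the longest registered string at each position."""
    max_len = max(map(len, dictionary))
    rows, pos, unknown_start = [], 0, None
    while pos < len(text):
        match = None
        # Try the longest candidate first, then shorter ones.
        for length in range(min(max_len, len(text) - pos), 0, -1):
            if text[pos:pos + length] in dictionary:
                match = text[pos:pos + length]
                break
        if match is None:
            # Unregistered character: start or extend an unknown run.
            if unknown_start is None:
                unknown_start = pos
            pos += 1
            continue
        if unknown_start is not None:
            # Unknown words are processed as noun phrases.
            rows.append((unknown_start + 1, text[unknown_start:pos], "noun phrase"))
            unknown_start = None
        rows.append((pos + 1, match, dictionary[match]))
        pos += len(match)
    if unknown_start is not None:
        rows.append((unknown_start + 1, text[unknown_start:], "noun phrase"))
    return rows
```

With only the four component words registered, the compound is split into four common nouns; once the compound itself is registered as a "noun phrase", the same input yields a single row for it.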
[0050]
Next, in step S4, the subject of the premise or characteristic part is extracted. FIG. 8 shows a more detailed processing procedure of this processing.
[0051]
In process S3 of FIG. 5, addresses are assigned to the whole claim sentence by the number of characters from its beginning, and, as shown in FIG. 6, the address of the first character of each morpheme equivalent is used as the address of that morpheme equivalent. In step S401 of FIG. 8, the minimum address (1 in the example of FIG. 6) is taken as the address of the beginning of the claim sentence, and the maximum address (121 in the example of FIG. 7) is taken as the address of the end of the claim sentence, which also serves as the address of the end of the feature part sentence.
[0052]
Next, in step S402, the search range is set to the full text of the claim. Then, in determination process S403, it is determined whether the search range contains a word registered in the premise part determination term file 32 (for example, a phrase such as "and," that is commonly used as a delimiter of the premise part in a claim sentence).
[0053]
If the determination result in determination process S403 is "Yes", determination process S404 determines whether two or more words registered in the premise part determination term file 32 appear in the one claim sentence. If so, error processing is performed in step S405 for "an error in which the premise part determination term is used twice or more", and all processing is terminated in step S406. If the result of determination process S404 is that there are not two or more, the claim sentence has a premise part. In that case, in step S407, the address of the head of the claim sentence (1 in the examples of FIGS. 6 and 7) is set as the address of the head of the premise part. In step S408, the head address (50 in the example of FIG. 7) of the phrase detected in step S403 is set as the address of the end of the premise part sentence. The address of the morpheme equivalent that follows the detected phrase (in the example of FIG. 7, "K" is the next morpheme equivalent and its address is 55) is then set as the address of the head of the feature part. Then, in step S410, the independent longest noun phrase ("J" at address 49 in the example of FIG. 7) that does not include the verb phrase ("provided" at address 46 in the example of FIG. 7) immediately before the end of the premise part sentence (address 50 in the example of FIG. 7) is set as the "premise part theme", the address of the first character of the "premise part theme" is set as the address of the premise part theme, the attribute TX of the premise part theme is recorded as 1, and the process proceeds to step S413.
[0054]
Here, the attribute TX is a numerical value for determining the attribute of a claim element such as a "subject" or a "technical element" extracted in processing S4 and the subsequent steps of FIG. 5, and is defined according to the table shown in FIG. 9. Attribute TX = 0 is the characteristic part theme described later. Attribute TX = 1 is the premise part theme. Attribute TX = 3 is a technical element described later that is an independent longest noun phrase, not including a verb phrase, placed before a case particle "to" that is followed by a reading point (comma). Attribute TX = 4 is likewise a technical element described later that is an independent longest noun phrase not including a verb phrase. Attribute TX = 5 is a technical element described later that is an independent longest noun phrase, not including a verb phrase, placed before a case particle "to" that is not followed by a reading point or a case particle. Attribute TX = 7 is a technical element described later that is the longest phrase specified by an instruction word.
[0055]
The "independent longest noun phrase" mentioned above is defined as follows for a phrase of the form "noun phrase 1" + case particle + "noun phrase 2" that does not include a verb. When the case particle is "no", which expresses possession or limitation, "noun phrase 1" is the owner and "noun phrase 2" is subordinate to "noun phrase 1" and not independent, so "noun phrase 2" alone does not constitute the "premise part theme". When the case particle is "ga", which marks the preceding phrase as the subject, or "wo", which marks it as the object, "noun phrase 2" is independent of "noun phrase 1", so "noun phrase 2" alone becomes the "premise part theme". For example, in the phrase "cooling system for the engine", the noun phrase "cooling system" alone does not become the "premise part theme", because the phrase contains the case particle "no" expressing possession or limitation. When the case particle is "to" with a parallel meaning, "noun phrase 1" and "noun phrase 2" are treated as one set of two, and "noun phrase 2" alone is not regarded as the independent longest noun phrase.
[0056]
When the result of determination process S403 in FIG. 8 is negative (there is no premise part determination term), the address of the head of the claim sentence is set as the address of the head of the characteristic part in step S411. Then, in step S412, the address of the beginning and the address of the end of the premise part sentence are both set to 0 to indicate that there is no premise part, and the process proceeds to step S413. In the case of a claim sentence having no premise part, such as a claim sentence of the component enumeration type, the whole sentence is thus processed as the characteristic part.
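The branch structure of determination processes S403 and S404 and the address settings that follow can be sketched in Python as below. The sketch assumes the analyzed claim is a list of hypothetical `(address, morpheme equivalent)` rows and that the premise part determination term file 32 is a set of strings; the function name, the exception type, and the returned keys are illustrative. Theme extraction (steps S410 and S413) is left out.

```python
class PremiseTermError(ValueError):
    """Raised when the premise part determination term appears twice or more."""


def split_claim(rows, premise_terms):
    """Return the head/end addresses of the premise part and the head
    address of the characteristic part.

    rows: list of (address, morpheme_equivalent) in order (assumed shape).
    premise_terms: set of registered delimiter phrases (assumed shape).
    """
    hits = [i for i, (_, surface) in enumerate(rows) if surface in premise_terms]
    if len(hits) >= 2:
        # Error case: delimiter used twice or more; processing stops.
        raise PremiseTermError("premise part determination term is used twice or more")
    head = rows[0][0]
    if not hits:
        # No premise part: both premise addresses are 0 and the whole
        # claim sentence is treated as the characteristic part.
        return {"premise_head": 0, "premise_end": 0, "feature_head": head}
    i = hits[0]
    # The characteristic part starts at the morpheme after the delimiter,
    # if there is one.
    feature_head = rows[i + 1][0] if i + 1 < len(rows) else None
    return {"premise_head": head, "premise_end": rows[i][0], "feature_head": feature_head}
```

Using the addresses quoted for FIG. 7 (delimiter at 50, next morpheme "K" at 55), the function returns 1, 50, and 55 for the three addresses.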
[0057]
In process S413, the independent longest noun phrase immediately before the end of the characteristic part sentence ("." at address 121 in the example of FIG. 7), excluding the verb phrase there (at address 118 in the example of FIG. 7), is taken as the "characteristic part subject" ("J" at address 120 in the example of FIG. 7). The address of the first character of the "characteristic part subject" is recorded as its address (120 in the example of FIG. 7), its attribute TX is set to 0 in accordance with the definition in FIG. 9, and the procedure proceeds to determination process S414.
[0058]
In determination process S414, the characteristic part subject and the premise part subject are compared. If the two are equal, process S4 ends. If they are not equal, process S415 outputs a message indicating that the characteristic part subject and the premise part subject are not equal, and process S4 then ends.
[0059]
Next, in a process S5 in FIG. 5, a technical element that can be specified by the case particle “to” is detected. FIG. 10 shows a more detailed processing procedure of this processing.
[0060]
In step S501, the search range is set to the full claim. In step S502, the longest independent noun phrase that does not include the verb phrase before the case particle "to" is extracted as the i-th technical element Ni in the search range. The address of the first character of the technical element Ni is recorded as the address Ai of the technical element Ni as shown in FIG.
[0061]
FIG. 12 shows the technical elements before the case particle "to" that are extracted by analyzing the claim sentence shown in FIG. 3. For "M or less" at i = 6, the phrase preceding the case particle "to" is "I is M or less" ("I" + "ga" + "M or less"), but because the case particle "ga", which marks the subject, separates "I" from "M or less", only "M or less" is extracted as the independent longest noun phrase. Likewise, for "feature" at i = 12, the phrase preceding the case particle "to" is "koto" + "wo" + "feature", but because the case particle "wo", which marks the object, separates "koto" from "feature", only "feature" is extracted as the independent longest noun phrase. "M or less" and "feature" precede a case particle "to" that expresses a result, which differs in character from the other case particles "to" that express parallelism, but here they are extracted as technical elements all the same.
[0062]
If it is inconvenient to extract "M or less" or "feature" as a technical element of the claim sentence, a process that analyzes the meaning of the case particle "to" may be added to the morphological analysis of process S1, together with a rule such as "a phrase preceding a case particle 'to' that expresses a result is not a technical element" or "a phrase preceding a case particle 'to' that expresses parallelism is a technical element".
[0063]
Next, in process S503, the attribute TXi of each technical element Ni is recorded according to the table defining the attribute TX in FIG. 9. FIG. 12 shows the attribute TXi of each technical element Ni in the claim sentence of FIG. 3.
"A" of technical element N1 (i = 1, address A1 = 1) is followed only by the case particle "to", so TX1 = 5.
"B" of technical element N2 (i = 2, address A2 = 3) is followed by the case particle "to" and then the case particle "ni", so TX2 = 4.
"C" of technical element N3 (i = 3, address A3 = 11) is followed by the case particle "to" and then a reading mark, so TX3 = 3.
In the same manner,
"D" of technical element N4 has TX4 = 3,
"I" of technical element N5 has TX5 = 4,
"M or less" of technical element N6 has TX6 = 5,
"N" of technical element N7 has TX7 = 3,
"P" of technical element N8 has TX8 = 5,
"Q" of technical element N9 has TX9 = 5,
"R" of technical element N10 has TX10 = 4,
"T" of technical element N11 has TX11 = 4, and
"feature" of technical element N12 has TX12 = 5.
[0064]
Next, in process S504, PS1, the initial value of the positioning PS (the value of the positioning PS of technical element N1 with i = 1), is set to 0. Here, the positioning PS is a numerical value that indicates the level of a technical element Ni in the hierarchy: the larger the value, the closer the element is to a main (trunk) technical element, and the smaller the value, the closer it is to a branch or leaf technical element. Within the premise part or the characteristic part, the technical element with the largest PS value is the primary technical element, and a technical element whose PS value is one smaller is a secondary technical element.
[0065]
Next, in process S505, the value of the positioning PSi of each technical element Ni is set in order from the beginning of the claim sentence, according to the transition from the immediately preceding attribute TXi-1 to the attribute TXi, based on the definition in the table shown in FIG. 11.
[0066]
In the case of the claim sentence of FIG. 3, as shown in FIG. 12, the positioning PS1 of "A" of technical element N1 (i = 1) remains at the initial value 0. For "B" of technical element N2 (i = 2), the attribute TX has transitioned from TX1 = 5 to TX2 = 4, so the positioning PS2 is 0, the same as PS1, according to the definition "PS = PS" in the table of FIG. 11.
[0067]
In FIG. 11, the value of the positioning PS increases by 1 when TX transitions "4 to 3", "4 to 4", "4 to 5", or "5 to 3"; it decreases by 1 only when TX transitions "3 to 5"; and it neither increases nor decreases when TX transitions "3 to 3", "3 to 4", "5 to 4", or "5 to 5". Note that the definition shown in FIG. 11 was determined by verification against Japanese example sentences, and a more refined definition would be needed to apply it to more complicated sentences.
[0068]
As shown in FIG. 12, for "C" of technical element N3 (i = 3), the attribute has transitioned "4 to 3" (TX2 = 4 to TX3 = 3), so the positioning PS increases by 1 and PS3 becomes 1. In other words, "C" of technical element N3, whose positioning PS is 1, is regarded as a more principal technical element than "A" of technical element N1 and "B" of technical element N2, whose positioning PS is 0.
[0069]
Similarly, for "N" of technical element N7 (i = 7), the attribute has transitioned "5 to 3" (TX6 = 5 to TX7 = 3), so the positioning PS increases by 1 and PS7 becomes 3. For "P" of technical element N8 (i = 8), the attribute has transitioned "3 to 5", so the positioning PS decreases by 1 and PS8 becomes 2. That is, "P" of technical element N8 is positioned one level below "N" of technical element N7 and is therefore one level less principal as a technical element.
[0070]
Also, for "T" of technical element N11 (i = 11), the attribute TX has transitioned "4 to 4", so the positioning PS increases by 1 and PS11 becomes 3. For "feature" of technical element N12 (i = 12), the attribute TX has transitioned "4 to 5", so the positioning PS increases by 1 again. At this stage PS12 becomes 4, the largest positioning PS in the characteristic part.
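The transition rule described for FIG. 11 can be checked against the worked example with a short sketch. The following Python is only an illustration under stated assumptions: the names `PS_DELTA` and `positioning` are hypothetical and do not appear in the specification, and the table contains only the nine transitions listed in paragraph [0067].

```python
# Change in PS for each (previous TX, current TX) pair, as listed in paragraph [0067].
# Only the nine transitions named there are covered; any other pair raises KeyError.
PS_DELTA = {
    (4, 3): +1, (4, 4): +1, (4, 5): +1, (5, 3): +1,
    (3, 5): -1,
    (3, 3): 0, (3, 4): 0, (5, 4): 0, (5, 5): 0,
}


def positioning(tx_values):
    """Return the PS value of every technical element, starting from PS1 = 0."""
    if not tx_values:
        return []
    ps = [0]  # process S504: initial value PS1 = 0
    for prev_tx, cur_tx in zip(tx_values, tx_values[1:]):
        # process S505: apply the transition from TX(i-1) to TX(i)
        ps.append(ps[-1] + PS_DELTA[(prev_tx, cur_tx)])
    return ps


# TX1..TX12 for the claim sentence of FIG. 3, as listed in paragraph [0063].
TX = [5, 4, 3, 3, 4, 5, 3, 5, 5, 4, 4, 5]
PS = positioning(TX)
print(PS)  # [0, 0, 1, 1, 1, 2, 3, 2, 2, 2, 3, 4]
```

The printed values agree with the ones stated in the text (PS3 = 1, PS7 = 3, PS8 = 2, PS11 = 3, PS12 = 4); the remaining values follow from the same rule.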
[0071]
Next, in process S506, all the technical elements Ni are set as search targets, and determination process S507 determines whether each technical element Ni matches a phrase registered in the "position special processing file" 35. If "NO", that is, if there is no match, the procedure proceeds to process S509. If "YES", that is, if there is a match, process S508 subtracts 100 from the positioning PSi of the matching technical element Ni and sets the result as PSi again, and the procedure then proceeds to process S509. If "feature" is registered in the "position special processing file" 35, then, as in the example of FIG. 12, the positioning PS12 of technical element N12 (i = 12), "feature", which was 4, is reduced by 100 to -96.
[0072]
Next, in process S509, the technical elements Ni whose addresses belong to the characteristic part and whose positioning PSi is the maximum are extracted as the "primary technical elements" related to the characteristic part subject. In process S510, the technical elements Ni whose addresses belong to the premise part and whose positioning PSi is the maximum are extracted as the "primary technical elements" related to the premise part subject.
[0073]
In the example of FIG. 12, the primary technical elements Ni related to the characteristic part subject are "N" at i = 7 and "T" at i = 11, whose positioning PSi is 3, the maximum in the characteristic part. The primary technical elements Ni related to the premise part subject are "C" at i = 3, "D" at i = 4, and "I" at i = 5, whose positioning PSi is 1, the maximum in the premise part. As shown in FIG. 4, these are the technical elements directly connected to "J" of the characteristic part subject and "J" of the premise part subject, respectively.
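Processes S506 to S510 can be sketched in the same way. This is a minimal sketch, not the tool's implementation: the `Element` record, the `part` label used in place of address ranges, and both function names are assumptions made for illustration. The PS values are the ones obtained by applying the FIG. 11 rule to the TX values of paragraph [0063].

```python
from dataclasses import dataclass


@dataclass
class Element:
    phrase: str
    part: str  # "premise" or "feature"; stands in for the address-range test
    ps: int


def apply_special_processing(elements, special_phrases, penalty=100):
    """Processes S507/S508: subtract 100 from PS of every registered phrase."""
    for element in elements:
        if element.phrase in special_phrases:
            element.ps -= penalty


def primary_elements(elements, part):
    """Processes S509/S510: elements of the given part whose PS is the maximum."""
    in_part = [e for e in elements if e.part == part]
    if not in_part:
        return []
    top = max(e.ps for e in in_part)
    return [e.phrase for e in in_part if e.ps == top]


phrases = ["A", "B", "C", "D", "I", "M or less", "N", "P", "Q", "R", "T", "feature"]
ps_values = [0, 0, 1, 1, 1, 2, 3, 2, 2, 2, 3, 4]
elements = [
    Element(phrase, "premise" if index < 5 else "feature", ps)
    for index, (phrase, ps) in enumerate(zip(phrases, ps_values))
]

# "feature" is assumed to be registered in the position special processing file 35.
apply_special_processing(elements, {"feature"})
print(primary_elements(elements, "premise"))  # ['C', 'D', 'I']
print(primary_elements(elements, "feature"))  # ['N', 'T']
```

Without the special processing, "feature" (PS12 = 4) would be selected as the only primary technical element of the characteristic part, which is the situation the subtraction of 100 is meant to avoid.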
[0074]
Another property of the positioning PS is that, within each of the characteristic part and the premise part, a technical element whose PS value is not the maximum is subordinate to the next technical element at which the PS value increases by 1. In the premise part in FIG. 12, the positioning PS of "A" at i = 1 and "B" at i = 2 is 0, and "A" and "B" are subordinate, side by side, to "C" at i = 3, where PS next increases by 1. In the characteristic part, "M or less" at i = 6 is subordinate to "N" at i = 7, and "P", "Q", and "R" at i = 8 to 10 are likewise subordinate to "T" at i = 11. Because merely analyzing the sequence of positioning values PSi in FIG. 12 makes it easy to judge which technical element each technical element is subordinate to, a finer tree structure can be created by analyzing the sequence of positioning PS in FIG. 12 when building a tree structure from the technical elements specified by each case particle "to".
[0075]
In determination process S507 and process S508 of FIG. 10, the PS value of "feature" is reduced by 100 as special processing because "feature" is not considered appropriate as a primary technical element related to the characteristic part subject.
[0076]
Next, in process S511, error processing is performed that outputs a warning if, during processes S501 to S510, the case particle "to" is found not to be used grammatically correctly. For example, a warning is output when there is only one case particle "to" with attribute TX of 3, 4, or 5, or when a case particle "to" with attribute TX of 4 appears after a case particle "to" with attribute TX of 3.
[0077]
Next, in process S6 of FIG. 5, a description part of the subject of the premise part or the characteristic part is extracted.
[0078]
If the primary technical elements of the premise part could be extracted in process S510, the text from the word following the last primary technical element in the premise part to the phrase immediately before the premise part subject is extracted as the explanation part of the premise part subject. If the primary technical elements of the premise part could not be extracted in process S510, the text from the beginning of the premise part to the phrase immediately before the premise part subject is extracted as the explanation part of the premise part subject. In the example of the claim sentence of FIG. 3, as shown in FIGS. 7 and 12, the technical element "I" at address 43 is the last primary technical element, so "provided", that is, the text from the word following "I" up to the word "provided" immediately before "J" at address 49 (the premise part subject), is extracted as the explanation part of the premise part subject.
[0079]
If the primary technical elements of the characteristic part could be extracted in process S509, the text from the phrase following the last primary technical element to the phrase immediately before the characteristic part subject is extracted as the explanation part of the characteristic part subject. If the primary technical elements of the characteristic part could not be extracted in process S509, the text from the beginning of the characteristic part to the phrase immediately before the characteristic part subject is extracted as the explanation part of the characteristic part subject. In the example of the claim sentence of FIG. 3, as shown in FIGS. 7 and 12, the technical element "T" at address 106 is the last primary technical element, so "The feature is provided", that is, the text from the word following "T" up to the word "do" immediately before "J" at address 120 (the characteristic part subject), is extracted as the explanation part of the characteristic part subject.
[0080]
Next, in processing S7 of FIG. 5, a technical element explanation part that describes a technical element that can be specified by the case particle “to” extracted in processing S5 is extracted.
[0081]
In process S7, for each primary technical element Ni, the following candidate start positions are considered:
(1) the phrase following another primary technical element that precedes the primary technical element Ni, skipping the case particle "to" immediately after that element and any other particle or reading mark,
(2) the beginning of the premise part,
(3) the beginning of the characteristic part.
The candidate for which the value obtained by subtracting the address of the primary technical element Ni from the candidate's address is negative and nearest to zero is selected, and the text from there up to the phrase immediately before the primary technical element Ni is extracted as the technical element explanation part of that primary technical element.
[0082]
In the example of the claim sentence of FIG. 3, as shown in FIG. 12, the first primary technical element of the premise part is "C" at i = 3, and its technical element explanation part is "A and B are held", which runs from the beginning of the premise part to just before "C". The second primary technical element of the premise part is "D" at i = 4. Another primary technical element, "C", precedes "D", so "supporting C", which runs from the phrase following "C" (skipping the case particle "to" and the reading mark) to the phrase immediately before "D", becomes its technical element explanation part. Similarly, the technical element explanation part of the third primary technical element of the premise part, "I", is "drive E when E is displaced to H". The technical element explanation part of the first primary technical element of the characteristic part, "N", is "when K becomes L, the I becomes M or less", and that of the second primary technical element of the characteristic part, "T", is "When the K goes O, the N pulls P, Q, and R to S."
[0083]
Through the processing described so far, the claim sentence has been decomposed into the subjects of the characteristic part and the premise part, the explanation part of each subject, the primary technical elements related to each subject, and the technical element explanation part of each primary technical element. Using these results, the tree structure output process is performed in process S8 of FIG. 5.
[0084]
FIG. 4 shows the claim sentence of FIG. 3 as a tree structure. At the stage of process S8, however, descriptive words such as "said" have not yet been deleted. As in the example of FIG. 4, the premise part subject and the characteristic part subject are placed apart from each other, and the primary technical elements related to each subject are placed near that subject so as not to interfere with one another. The subject and the primary technical elements are each surrounded by a frame, and the frame of the subject is connected by a line to the frame of each primary technical element related to it. Next, the explanation part of each subject is placed near that subject so as not to interfere with the primary technical elements or the lines connecting the frames. Next, the technical element explanation part of each primary technical element is placed near the related primary technical element so as not to interfere with anything else, the technical element explanation part is surrounded by a frame, and that frame is connected by a line to the frame of the related primary technical element. Technical elements other than the primary technical elements are also surrounded by frames, and the result is output as the tree structure of the claim sentence.
[0085]
Next, in process S9 of FIG. 5, technical elements that can be specified by a descriptive word are extracted. Here, a descriptive word is a phrase registered in the descriptive word file 34 shown in FIG. 15, for example "above", "the above", "this", "the same", or "the". It serves to identify as a technical element the phrase that follows it (excluding any verb phrase immediately after it) and that matches, over the longest span, a phrase elsewhere in the claim sentence. In the descriptive word file 34, a "reference distance" of "+2", "+1", "-1", or "-2" is set for each registered phrase. Among a technical element specified by a descriptive word registered in the file and the technical elements of the same phrase elsewhere, the technical element that is referred to (the reference technical element) is determined by the value of the "reference distance", according to the following rule.
If the "reference distance" is "-2", the technical element of the same phrase appearing first is the reference technical element.
If the "reference distance" is "-1", the technical element of the same phrase immediately before is the reference technical element.
If the "reference distance" is "+1", the technical element of the same phrase immediately after is the reference technical element.
If the "reference distance" is "+2", the technical element of the same phrase appearing last is the reference technical element.
For example, suppose "said" is registered in the descriptive word file 34 with its "reference distance" set to "-2". In the sentence "A is B, said A is C, and said A does not become D.", "A is" is the technical element specified by the descriptive word "said". Under the above rule, the reference technical element of both the second and the third "A is" is the "A is" that appears first; the reference technical element of the third "A is" is not the second "A is".
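The "reference distance" rule can be expressed as a small lookup. The sketch below is illustrative only: the function name `reference_address`, its arguments, and the sample addresses are assumptions, and returning `None` when the rule points at no other occurrence is a design choice of this sketch rather than something stated in the specification.

```python
def reference_address(addresses, current, reference_distance):
    """Pick the address of the reference technical element.

    addresses: addresses of every occurrence of the same phrase, in claim order.
    current: index into `addresses` of the occurrence that carries the descriptive word.
    Returns None when the rule points at no other occurrence.
    """
    if reference_distance == -2:
        target = 0                    # occurrence appearing first
    elif reference_distance == -1:
        target = current - 1          # occurrence immediately before
    elif reference_distance == 1:
        target = current + 1          # occurrence immediately after
    elif reference_distance == 2:
        target = len(addresses) - 1   # occurrence appearing last
    else:
        raise ValueError("reference distance must be -2, -1, +1 or +2")
    if target == current or not 0 <= target < len(addresses):
        return None
    return addresses[target]


# Hypothetical addresses of the three occurrences of "A" in
# "A is B, said A is C, and said A does not become D."
a_addresses = [0, 8, 17]
print(reference_address(a_addresses, 1, -2))  # 0
print(reference_address(a_addresses, 2, -2))  # 0
print(reference_address(a_addresses, 2, -1))  # 8
```

With a reference distance of -2, both later occurrences point back to the first one, as in the "A is" example; with -1, the third occurrence would instead point to the second.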
[0086]
The reason the verb phrase immediately following a phrase registered in the descriptive word file 34 is excluded when searching for the longest matching phrase elsewhere is as follows. In an example such as "a step of measuring the amount of radio wave interference" followed by "based on this measured interference amount", "interference amount" cannot be extracted as a technical element unless the verb phrase "measured" after the descriptive word "this" is removed.
[0087]
FIG. 13 shows a more detailed processing procedure of process S9.
The main points of the procedure are as follows. In process S903, a descriptive word is selected from the descriptive word file 34, and determination process S904 searches for that descriptive word in the claim sentence. If it is present, process S910 takes the text that follows the descriptive word, excluding any verb phrase immediately after it, up to the first punctuation mark, and sets it in the "search buffer". At this time, the head address ASi of this phrase and the descriptive word SJi are recorded. Next, determination process S911 searches for the same phrase as the one in the "search buffer" elsewhere in the claim sentence. If it is not found, then after process S914, determination process S911 again determines whether the same phrase as the "search buffer" is present elsewhere in the claim sentence.
[0088]
After process S914, determination process S915 determines whether the number of characters in the "search buffer" is 0. If it is not 0, determination process S911 is performed. If it is 0, the descriptive word is being used with no matching phrase elsewhere, so this is recorded as an error in process S916, and the procedure proceeds to determination process S907.
[0089]
If determination process S911 finds the same phrase as the "search buffer" elsewhere, process S912 registers the phrase in the "search buffer" at that time as a "technical element" and records the descriptive word and the address of the newly detected "technical element". Process S913 then searches for all "technical elements" of the same phrase elsewhere and records the descriptive word and the address of each "technical element" found, and the procedure proceeds to determination process S907.
[0090]
When determination process S907 finds no further descriptive word in the descriptive word file 34, the procedure proceeds to process S917. For each address ASi recorded in process S910, the "reference distance" of the descriptive word SJi is looked up in the descriptive word file 34, the reference technical element corresponding to the technical element at address ASi is specified according to the rule that determines the reference technical element from the value of the "reference distance", and the addresses of the mutually referring technical elements (there may be more than one) are recorded in the "addressed technical element file" 51. In the example of FIG. 7, these addresses are recorded in the column "address of the reference technical element". For a technical element that is a reference source, the address of the referenced technical element is recorded; for a technical element that is referenced, the address of the referencing technical element is recorded. This makes it easy to see which other technical elements a given technical element relates to.
[0091]
In the example of the claim sentence of FIG. 3, "said" is registered as a descriptive word in the descriptive word file 34 of FIG. 15 with a "reference distance" of "-2", so "C", "D", "I", "K", and "N" are extracted as the technical elements specified by the descriptive word, and one reference technical element exists for each of them. If a phrase such as "K ga" is extracted as a technical element and this is inconvenient, a rule such as "if the last word of the technical element is a particle, delete the particle" may be additionally set.
[0092]
Next, in process S10 of FIG. 5, any descriptive word registered in the descriptive word file 34 is deleted from the tree structure output in process S8. The descriptive words are deleted at this point to make the diagram easier to read.
[0093]
Next, in process S11 of FIG. 5, in the tree structure obtained up to process S10, each technical element extracted in process S9 is surrounded by a frame, the frame of the reference-source technical element and the frame of the reference-destination technical element are connected by a line, and a diagram that clearly shows that the two technical elements are interrelated is output. In the example of the claim sentence of FIG. 3, a technical element correlation diagram as shown in FIG. 4 can be output.
[0094]
The above is the specific procedure from the input of the claim statement to be debugged to the output of the technical element correlation diagram.
[0095]
Next, as processing after the technical element correlation diagram is output, as shown in the processing procedure of FIG. 14, process S21 sets the examination target to the premise part of the addressed technical element file 51. Determination process S22 then checks, for each technical element in the target set in process S21, whether the address of the reference technical element is blank. If it is not blank, the procedure proceeds to process S24. If it is blank, a message that the technical element is "not referenced from anywhere" is output as a warning, and the procedure then proceeds to process S24. For example, in the examples of FIG. 4 and FIG. 7, "A" and "B" of the premise part are extracted as technical elements but are not referenced from anywhere else, so they are output in this message. The reason for reporting that "A" and "B" of the premise part are not referenced from anywhere is to make it easier for the applicant to consider whether the description of "A" and "B" narrows the scope of the claim, whether it would be a disadvantage if the technical elements "A" and "B" were not used when the invention is later worked, and whether the statements regarding "A" and "B" should be deleted from the claim.
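The check in determination process S22 can be sketched as a filter over the records of the addressed technical element file 51. The record layout below (`phrase`, `address`, `part`, `reference_addresses`) is an assumption for illustration. The addresses of "A", "B", "C", and "I" are those given in the text, while the reference addresses 60 and 70 are made-up placeholders.

```python
def unreferenced_premise_elements(records):
    """Return premise-part technical elements whose reference address column is blank."""
    return [
        record["phrase"]
        for record in records
        if record["part"] == "premise" and not record["reference_addresses"]
    ]


# Hypothetical excerpt of the addressed technical element file 51.
records = [
    {"phrase": "A", "address": 1, "part": "premise", "reference_addresses": []},
    {"phrase": "B", "address": 3, "part": "premise", "reference_addresses": []},
    {"phrase": "C", "address": 11, "part": "premise", "reference_addresses": [60]},
    {"phrase": "I", "address": 43, "part": "premise", "reference_addresses": [70]},
]

for phrase in unreferenced_premise_elements(records):
    print(f'warning: "{phrase}" is not referenced from anywhere')
```

For these records the warning is produced for "A" and "B" only, matching the example of FIG. 4 and FIG. 7.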
[0096]
Next, in process S24, the examination target is set to the technical elements of the full claim text. In process S25, the superordinate concept dictionary 31 is used to find the superordinate terms of each technical element up to the secondary superordinate concept (the superordinate concept of the superordinate concept), and these are recorded in the addressed technical element file 51. Then, determination process S26 checks whether any superordinate term of a technical element in the premise part is the same as a technical element in the characteristic part. If so, process S27 outputs the message "the premise part contains a term of a lower concept than the characteristic part", and the entire procedure ends. The superordinate terms are output to the addressed technical element file 51 in process S25, as shown in FIG. 7, to make it easier for the applicant to examine whether a term can be replaced with a superordinate term. The output of process S27 is intended to make it easier for the applicant to examine whether the premise part is written more narrowly than the characteristic part.
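Processes S25 and S26 can be illustrated with a toy dictionary. Everything concrete in this sketch is an assumption: the dictionary is modelled as a simple mapping from a term to a single superordinate term, and the entries ("bolt", "fastener", "joining member") are invented examples, not contents of the superordinate concept dictionary 31.

```python
def superordinate_terms(term, dictionary, levels=2):
    """Process S25: follow the dictionary up to the secondary superordinate concept."""
    found = []
    current = term
    for _ in range(levels):
        current = dictionary.get(current)
        if current is None:
            break
        found.append(current)
    return found


def premise_terms_below_feature(premise, feature, dictionary):
    """Process S26: premise terms whose superordinate term is a characteristic part term."""
    feature_terms = set(feature)
    return [
        (term, upper)
        for term in premise
        for upper in superordinate_terms(term, dictionary)
        if upper in feature_terms
    ]


# Hypothetical dictionary entries: term -> superordinate term.
dictionary = {"bolt": "fastener", "fastener": "joining member"}

print(superordinate_terms("bolt", dictionary))
# ['fastener', 'joining member']
print(premise_terms_below_feature(["bolt", "spring"], ["fastener"], dictionary))
# [('bolt', 'fastener')]
```

A non-empty result corresponds to the case in which process S27 would report that the premise part contains a term of a lower concept than the characteristic part.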
[0097]
In the above-described embodiment, the method of displaying the claim sentence corresponding to claim 1 as a diagram has been described. However, even when two or more claim sentences are input, sentence breaks can be found by referring to keywords such as punctuation marks or "claim", and technical element correlation diagrams and the like for claim 2 onward can be output by a similar procedure.
[0098]
In the above embodiment, only three types of attribute TX are set for the technical element preceding the case particle "to", as shown in FIG. 9. However, four or more types of attribute TX may be set, with the method of defining the positioning PS from the transition of the attribute TX shown in FIG. 11 changed accordingly. Likewise, only six types of dictionary file are provided in the claim sentence analysis dictionary file 29, but another dictionary file may be additionally set, and the method of graphical display may be changed using the newly set dictionary file.
[0099]
In the above-described embodiment, only the method of extracting technical elements by the case particle "to" and by descriptive words has been described. However, technical elements may also be extracted by connectives such as "or" and "and", with the relevant part displayed as a set-theory diagram, or syntactic analysis or semantic analysis may be adopted so that part of a claim sentence is displayed as a simple flowchart or system block diagram.
[0100]
In the above-described embodiment, the morphological analysis dictionary 30 and the superordinate concept dictionary 31 of the claim sentence analysis dictionary file 29 have been described as being stored on the HD of the external memory for use. However, technical elements described in the claims of published patent applications, which are published one after another, may be extracted and additionally registered, and the updated dictionaries may be obtained by downloading or the like. The superordinate concept dictionary 31 may also be stored on a Web server on the Internet and configured so that the patent practitioners who use it can update its contents at any time.
[0101]
In the above-described embodiment, the tool has been described as a computer program run on a general-purpose personal computer. However, the computer is not limited to a general-purpose personal computer, and the tool may be configured as a dedicated machine equipped with a CPU and the program.
[0102]
【Effects of the Invention】
As described above, according to the present invention, even for claim sentences in Japanese patent specifications, which take a variety of sentence structures, the premise part can be separated from the characteristic part and the subject can be extracted by means of common phrases, such as "at", that mark the premise part of a claim sentence. Technical elements can be extracted from the phrases preceding the case particle "to" in its parallel meaning. The attribute of each technical element is quantified from the particle or reading mark that follows the case particle "to", and the positioning and dependency of the technical elements can be analyzed logically from the transitions of this single numerical attribute. In addition, by separating out the explanation part of the subject and the explanation parts of the technical elements, claim sentences of Japanese patent specifications, which take various forms, can be displayed graphically as a tree structure by a logical method that does not rely on pattern matching.
[0103]
In addition, technical elements of the claim sentence are extracted from the phrases that follow descriptive words such as "said" or "above", and the relationship between reference source and reference destination is defined by the type of descriptive word, so the relationships between technical elements can be displayed clearly in a diagram.
[0104]
Displaying the claim sentence as a tree structure and showing the relationships between technical elements clearly in a diagram makes it easy to confirm whether the claim sentence was written as intended by the inventor (the applicant) or the drafter of the patent specification. In addition, grammatical errors related to the case particle "to" and technical elements in the premise part that are not referenced anywhere else are detected automatically, which makes it easy for the inventor and the drafter to review the claim sentence.
[0105]
For the technical elements of the claim sentence detected as above, superordinate terms are automatically detected and output, and a case in which the premise part contains a term of a lower concept than the characteristic part is automatically detected, so the inventor and the drafter of the patent specification can easily review the technical elements of the claim sentence.
[0106]
By using a morphological analysis dictionary in which terms used as technical elements of claim sentences are additionally registered, as noun-equivalent terms, in a morpheme dictionary used for automatic analysis of Japanese sentences, technical elements can be extracted easily even when the claim sentence contains special technical terms or technical terms composed of multiple morphemes.
[Brief description of the drawings]
FIG. 1 is a block diagram.
FIG. 2 is a configuration diagram of the patent specification debugging tool.
FIG. 3 is a claim sentence to be debugged.
FIG. 4 is a technical element correlation diagram.
FIG. 5 is a flowchart.
FIG. 6 is the morphological analysis result at the stage of S3.
FIG. 7 is an example of the addressed technical element file 51.
FIG. 8 is a flowchart of subject extraction for the premise part or the characteristic part.
FIG. 9 is the attribute TX table.
FIG. 10 is a flowchart of extraction of technical elements specified by the case particle "to".
FIG. 11 is the expression that determines the positioning PS from the transition of the attribute TX.
FIG. 12 shows the technical elements before the case particle "to".
FIG. 13 is a flowchart of extraction of technical elements that can be specified by a descriptive word.
FIG. 14 is a general flowchart.
FIG. 15 is a table of the descriptive word file.
[Explanation of symbols]
1 CPU
2 Internal memory 3 Input / output interface 4 Data communication terminal 5 Data output terminal 6 Data input terminal 7 External memory drive terminal 8 OS
9 patent specification debug tool 10 patent specification data file 11 bus line 12 HD
Reference Signs List 20 Executable file 21 Morphological analysis unit 22 Subject part extraction unit 23 Technical element extraction unit by "to" 24 Tree structure output unit 25 Technical element extraction unit by instruction word 26 Technical element correlation diagram output unit 27 Upper concept search unit 28 Error message output Section 29 claim sentence analysis dictionary file 30 morphological analysis dictionary 31 upper concept dictionary 32 premise part determination term file 33 pre-processing file 34 instruction word file 35 position special processing file 50 claim file 51 addressed technical element file 52 technical element correlation diagram In the file diagram, the same reference numerals indicate the same or corresponding parts.

Claims

1. A patent specification debugging tool having a data input/output unit, comprising: a dictionary file for analyzing a claim sentence of a Japanese patent specification input to the data input/output unit; technical element specifying means for specifying, as technical elements of the claim sentence, a noun or a noun phrase having the same function as a noun that is identified by the case particle "to" in the claim sentence, and a noun or such a noun phrase that is obtained as the thing pointed to by a modifier (a descriptive word), each obtained as a result of analysis using the dictionary file; subject specifying means for specifying, as a subject of the invention, a noun or a noun phrase having the same function as a noun that is described at the end of the claim sentence and obtained by analysis using the dictionary file; primary technical element specifying means for specifying, as a primary technical element, a technical element that, among the technical elements specified by the technical element specifying means, is directly related to the subject specified by the subject specifying means; and claim sentence first diagramming means for arranging the primary technical element specified by the primary technical element specifying means and the subject on a drawing and displaying the relationship between the primary technical element and the subject as a diagram.
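For illustration only, the arrangement of a subject and its primary technical elements can be pictured as a simple text tree. The following Python sketch and its function name `render_tree` are assumptions made for this illustration and do not reproduce the diagram format of the drawings.

```python
def render_tree(subject, primary_elements):
    """Render the subject as the root and each primary technical element
    as a branch directly under it."""
    lines = [subject]
    last = len(primary_elements) - 1
    for index, element in enumerate(primary_elements):
        branch = "└─ " if index == last else "├─ "
        lines.append(branch + element)
    return "\n".join(lines)


tree = render_tree("特許明細書デバッグツール", ["辞書ファイル", "技術要素特定手段"])
print(tree)
```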

2. The patent specification debugging tool according to claim 1, wherein the dictionary file includes a morphological analysis dictionary in which terms used as technical elements of claim sentences are additionally registered, as terms corresponding to nouns, in a morphological dictionary used for automatic analysis of Japanese sentences.

3. The patent specification debugging tool according to claim 1 or 2, wherein the subject specifying means includes: a function of detecting, in the claim sentence, an element that separates a premise part from a characteristic part, and separating the claim sentence into the premise part and the characteristic part; a function of specifying a subject of the premise part; and a function of, when no element that separates the claim sentence into a premise part and a characteristic part can be detected, treating the claim sentence as having no premise part and processing the entire claim sentence as the characteristic part.
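The separation into a premise part and a characteristic part can be sketched as follows. The separator terms, the choice of splitting at the earliest separator, and the function name `split_claim` are assumptions for this illustration; the actual terms would come from a file such as the premise part determination term file 32.

```python
# Assumed separator terms that mark the end of a premise part.
SEPARATORS = ("において、", "であって、")


def split_claim(claim, separators=SEPARATORS):
    """Return (premise, characteristic). When no separator is found, the
    premise is empty and the whole claim is the characteristic part."""
    best = None
    for sep in separators:
        pos = claim.find(sep)
        if pos != -1 and (best is None or pos < best[0]):
            best = (pos, sep)
    if best is None:
        return "", claim
    pos, sep = best
    return claim[:pos], claim[pos + len(sep):]


with_premise = split_claim("入力部と表示部を備えた装置において、解析部を備えたことを特徴とする装置")
without_premise = split_claim("解析部を備えた装置")
print(with_premise)
print(without_premise)
```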

4. The patent specification debugging tool according to any one of claims 1 to 3, wherein the claim sentence first diagramming means has a function of graphically displaying, for each of the subject of the premise part and the subject of the characteristic part, the relationship between that subject and its primary technical elements.

5. The patent specification debugging tool according to any one of claims 1 to 4, further comprising: reference source technical element specifying means for specifying, for a technical element that the technical element specifying means has specified by a descriptive word, the reference source technical element in the claim sentence; and claim sentence second diagramming means for connecting, by a line on the diagram displayed by the claim sentence first diagramming means, the reference source technical element and the technical element specified by the descriptive word, and for displaying the diagram with the descriptive word deleted.

6. The patent specification debugging tool according to any one of claims 1 to 5, wherein the technical element specifying means includes a function of, when the descriptive word is followed by a verb or a verb phrase having the same function as a verb, which is in turn followed by a noun or a noun phrase having the same function as a noun, specifying as a technical element the noun or noun phrase with the verb or verb phrase excluded.

7. The patent specification debugging tool according to any one of claims 1 to 6, wherein the technical element specifying means includes a function of converting a conjunction having a parallel meaning in the claim sentence into the case particle "to".
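A minimal sketch of such a conversion, assuming a small hand-written list of parallel conjunctions, is given below. The list and the function name `normalize_conjunctions` are hypothetical, and plain string replacement is used here only for brevity; a tool of the kind described would operate on the morphological analysis result.

```python
# Assumed parallel conjunctions; longer forms are listed first for clarity.
PARALLEL_CONJUNCTIONS = ("ならびに", "並びに", "および", "及び")


def normalize_conjunctions(claim, conjunctions=PARALLEL_CONJUNCTIONS):
    """Replace each parallel conjunction with the case particle と ("to")."""
    for word in conjunctions:
        claim = claim.replace(word, "と")
    return claim


normalized = normalize_conjunctions("入力部及び表示部並びに記憶部")
print(normalized)
```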

8. The patent specification debugging tool according to claim 1, wherein, in the technical element specifying means, a technical element specified as a phrase preceding the case particle "to" is defined as a technical element by "to", an attribute of the technical element by "to" is determined by the part of speech of the morpheme following the case particle "to" or by a punctuation mark or symbol following it, and a state in which the attribute has changed from the attribute of the preceding technical element by "to" in the claim sentence is defined as a transition state; and wherein the primary technical element specifying means includes a method of determining the dependency relationship between the technical elements by "to" by an arithmetic expression, or a method based on the arithmetic expression.

9. The patent specification debugging tool according to any one of claims 1 to 8, wherein, in the method of determining the dependency relationship between the technical elements by "to" by an arithmetic expression, the primary technical element specifying means has an arithmetic expression that defines a positioning value of each technical element by "to", the arithmetic expression being defined based on the transition state, and includes a function capable of correcting the positioning value for specially defined technical elements and a function of specifying the primary technical element by comparing the magnitudes of the positioning values.

10. The patent specification debugging tool according to any one of claims 1 to 9, wherein the method of specifying the technical element by "to" includes a function of not extracting the phrase preceding the case particle "to" as a technical element by "to" when the case particle "to" does not have a parallel meaning.

11. The patent specification debugging tool according to claim 1, further comprising a function of, when the last morpheme of a technical element specified by the technical element specifying means is a particle, specifying the portion other than the particle as the technical element.
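As an illustration of removing a trailing particle, the following Python sketch represents a technical element as a list of (surface form, part of speech) pairs. This representation, the tag 助詞 for particles, and the function name `strip_trailing_particle` are assumptions for this illustration.

```python
PARTICLE = "助詞"  # assumed part-of-speech tag for particles


def strip_trailing_particle(morphemes):
    """Drop the last morpheme when it is a particle; otherwise return the
    morphemes unchanged."""
    if morphemes and morphemes[-1][1] == PARTICLE:
        return morphemes[:-1]
    return morphemes


element = [("表示", "名詞"), ("部", "名詞"), ("の", "助詞")]
cleaned = strip_trailing_particle(element)
print("".join(surface for surface, _ in cleaned))
```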

12. The patent specification debugging tool according to any one of claims 1 to 11, wherein a first warning is output if, among the technical elements belonging to the premise part, there is a technical element that is not referred to elsewhere.

13. The patent specification debugging tool according to any one of claims 1 to 12, wherein a second warning is output if, judging from the sequence of the attributes of the technical elements by "to", the usage of the case particle "to" is grammatically inconsistent.

14. The patent specification debugging tool according to any one of claims 1 to 13, wherein the subject of the premise part specified by the subject specifying means is compared with the subject of the characteristic part, and, when the subjects differ, a sentence or graphic meaning "the subject of the premise part and the subject of the characteristic part of the claim sentence are different" is output.

15. The patent specification debugging tool according to any one of claims 1 to 14, further comprising technical element superordinate concept detecting means for detecting, from a superordinate concept term dictionary in the dictionary file, a superordinate-concept term for a technical element in the claim sentence, and outputting the detected term.

16. The patent specification debugging tool according to any one of claims 1 to 15, wherein the plurality of technical elements in the characteristic part are compared with the superordinate-concept terms of the plurality of technical elements in the premise part, and a third warning is output if an identical term is found.

17. A computer program which, when installed on a computer, causes the computer to have functions equivalent to those of the patent specification debugging tool according to any one of claims 1 to 16.