JP2011257817A

JP2011257817A - Patent specification analyzer and text analyzer

Info

Publication number: JP2011257817A
Application number: JP2010129477A
Authority: JP
Inventors: Kenichiro Ayaki; 健一郎綾木
Original assignee: Individual
Current assignee: Individual
Priority date: 2010-06-04
Filing date: 2010-06-04
Publication date: 2011-12-22

Abstract

PROBLEM TO BE SOLVED: To accurately extract the case components that define a patent claim.
SOLUTION: Component requirement classification means 11 divides a claim containing a plurality of component requirements into the individual component requirements. Dependency analysis means 12 performs morphological analysis on each component requirement, breaks it down into phrases each having words and the parts of speech of those words, and analyzes the dependency among the phrases. Phrase structure configuration means 13 arranges the dependency of the phrases into a tree-like phrase structure. A plurality of case component patterns 15 and case component exclusion patterns 16 are likewise configured in tree form from phrases having words and their parts of speech. Case component extraction means 14 compares the phrase structure with the case component patterns 15 and the case component exclusion patterns 16 and extracts the matching portions as case components.

Description

The present invention relates to a patent specification analysis device and a text analysis device, and more particularly to a patent specification analysis device that analyzes and displays patent claims and to a text analysis device that analyzes and displays the conditions in documents such as contracts.

Conventionally, for patent specification analysis devices and text analysis devices, a technique is known that counts and displays the degree of limitation of the invention defined in the claims of a patent specification.

Patent Document 1 describes a technique that extracts the character strings written in a specific section of a patent specification in electronic form, then counts all the case components and displays the count as the degree of limitation of the invention.

Japanese Unexamined Patent Application Publication No. 2009-259154 (JP 2009-259154 A)

However, conventional patent specification analysis devices and text analysis devices had the following problems (A) to (C).

(A) Case components were extracted using only the character string of a cue phrase as the index. A portion with an entirely different part of speech could therefore still match the cue phrase string, so a character string that is not a case component might be erroneously extracted as one.

(B) Case components were extracted using the character string of a cue phrase as the index. Consequently, when a word in a case-component phrase was inflected, the phrase might be erroneously judged not to be a case component.

(C) Patterns that are not case components were extracted, and then excluded, using only the duplication of cue phrase character strings as the index. Consequently, when a word was inflected, for example when the same verb was written once in the active voice and once in the passive voice, the duplicated description might be erroneously judged not to be a duplicate.

The patent specification analysis device of the present invention comprises: component requirement classification means for dividing a claim having a plurality of component requirements into the individual component requirements; dependency analysis means for performing morphological analysis on each component requirement to break it down into phrases each having words and the parts of speech of the words, and for analyzing the dependency of the phrases; phrase structure configuration means for configuring the dependency of the phrases into a tree-like phrase structure; a plurality of case component patterns and case component exclusion patterns, each configured in tree form from phrases having words and the parts of speech of the words; and case component extraction means for comparing the phrase structure with the plurality of case component patterns and case component exclusion patterns and extracting case components.

The patent specification analysis device and the text analysis device of the present invention provide the following effects (A) and (B).

(A) When the phrase structure is compared with the plurality of case component patterns and the case component exclusion patterns to extract case components, what is compared is not the character string of a cue phrase but a tree-like phrase structure made up of phrases having words and the parts of speech of the words. Case components can therefore be extracted more accurately than by matching cue phrase character strings.

(B) The case component patterns and the case component exclusion patterns are evaluated on the basis of the base form of each verb. Therefore, even when a verb is written once in the active voice and once in the passive voice, for example, the two occurrences can be judged to be the same verb.

FIG. 1 is a diagram showing the schematic configuration of the patent specification analysis device according to Embodiment 1 of the present invention.
FIG. 2 is a diagram showing the case component patterns (part 1) of FIG. 1.
FIG. 3 is a diagram showing the case component patterns (part 2) of FIG. 1.
FIG. 4 is a diagram showing the case component exclusion patterns of FIG. 1.
FIG. 5 is a flowchart showing the operation of the patent specification analysis device of FIG. 1.
FIG. 6 is a flowchart showing the operation of the component requirement classification means of FIG. 1.
FIG. 7 is a flowchart showing the operation of the dependency analysis means of FIG. 1.
FIG. 8 is a flowchart showing the XML conversion operation of the phrase structure configuration means of FIG. 1.
FIG. 9 is a flowchart showing the tree-building operation of the phrase structure configuration means of FIG. 1.
FIG. 10 is a flowchart showing the operation of the case component extraction means of FIG. 1.
FIG. 11 is a diagram showing the phrase dependency analysis result of FIG. 1.
FIG. 12 is a diagram showing the dependency analysis XML produced by the processing shown in FIG. 8.
FIG. 13 is a diagram showing the tree-like phrase structure of FIG. 1.
FIG. 14 is a diagram showing the tree-like phrase structure of FIG. 1 from which the case components have been extracted.
FIG. 15 is a diagram showing the schematic configuration of the contract analysis device according to Embodiment 2 of the present invention.
FIG. 16 is a flowchart showing the operation of the contract analysis device of FIG. 15.

The modes for carrying out the present invention will become apparent when the following description of the preferred embodiments is read in conjunction with the accompanying drawings. The drawings, however, are for illustration only and do not limit the scope of the present invention.

(Configuration of Embodiment 1)
FIG. 1 is a diagram showing the schematic configuration of the patent specification analysis device according to Embodiment 1 of the present invention.

The patent specification analysis device 10 has component requirement classification means 11, dependency analysis means 12, phrase structure configuration means 13, and case component extraction means 14. The case component extraction means 14 holds case component patterns 15 and case component exclusion patterns 16.

The component requirement classification means 11 divides the claim text 21 into component requirements and outputs the claim text 22 divided into the individual component requirements. The dependency analysis means 12 performs dependency analysis on the divided claim text 22 and outputs the phrase dependency analysis result 23. The phrase structure configuration means 13 converts the phrase dependency analysis result 23 into XML format, arranges it into a tree according to the dependencies of the phrases, and outputs the tree-like phrase structure 24. The tree-like phrase structure 24 is made up of phrases each having words and the parts of speech of the words. The case component extraction means 14 matches the tree-like phrase structure 24 against the case component patterns 15 and the case component exclusion patterns 16 and outputs the tree-like phrase structure 25 from which the case components have been extracted. The case component patterns 15 and the case component exclusion patterns 16 are likewise configured in tree form from phrases having words and the parts of speech of the words.

FIG. 2 is a diagram showing the case component patterns (part 1) of FIG. 1. The left side shows what each case indicates, and the right side shows the corresponding case component pattern 15 as a tree-like phrase structure. In the tree-like phrase structure on the right, one line corresponds to one phrase. The tree-like phrase structure of each case component pattern 15 is made up of phrases having words and the parts of speech of the words.

In the first line of target case No. 1, the bracketed [noun phrase] denotes a word string that ends in a noun. The 「を（助詞）」 written after [noun phrase] indicates that the word is 「を」 (wo) and that its part of speech is particle (助詞). The 「-D」 written after 「を（助詞）」 indicates the position of the phrase on which this phrase depends.

In the second line of target case No. 1, the bracketed [verb phrase] denotes a phrase that contains a verb. The 「-+」 written after [verb phrase] indicates the position of the phrase that depends on this phrase. The 「-+」 of this phrase is at the same position as the 「-D」 of the preceding phrase, which shows that this phrase receives the dependency of the preceding phrase. The 「-*」 written after 「-+」 indicates that it does not matter which words make up the phrases after the verb phrase or what the dependency structure after the verb phrase looks like. With these symbols, target case No. 1 represents the pattern in which a phrase consisting of [noun phrase] followed by the particle 「を」 depends on a phrase containing a verb.

The phrase being depended on is not simply the phrase that appears immediately after "[noun phrase] を", as in 「ラプラシアンフィルタ処理を行う」 (perform Laplacian filter processing); it is the phrase that receives the dependency under Japanese grammar. The text 「ラプラシアンフィルタ処理をＲＧＢ信号毎に行う」 (perform Laplacian filter processing for each RGB signal) is therefore also covered by this dependency pattern.
Similarly, target case No. 2 represents the pattern in which a phrase consisting of [noun phrase] followed by the particle 「が」 (ga) depends on a phrase containing a verb.
Condition case No. 1 represents the pattern in which a phrase consisting of [noun phrase] followed by the particle 「に」 (ni) depends on a phrase containing a verb.
Condition case No. 2 represents the pattern in which a phrase consisting of [noun phrase] followed by the particle 「に」 and the particle 「より」 (yori) depends on a phrase containing a verb.
Condition case No. 3 represents the pattern in which, after an arbitrary phrase, a phrase consisting of the verb 「基づい」 (based on) and the particle 「て」 (te) depends on a phrase containing a verb.
Condition case No. 4 represents the pattern in which, after an arbitrary phrase, a phrase consisting of the verb 「応じ」 (in response to) and the particle 「て」 depends on a phrase containing a verb.

Condition case No. 5 represents the pattern in which, after an arbitrary phrase, a phrase consisting of the verb 「対応させ」 (made to correspond) and the particle 「て」 depends on a phrase containing a verb.

Condition case No. 6 represents the pattern in which, after an arbitrary phrase, a phrase consisting of the noun 「場合」 (case), the particle 「に」, and the particle 「は」 (wa) depends on a phrase containing a verb.
Time case No. 1 represents the pattern in which, after an arbitrary phrase, a phrase consisting of the noun 「とき」 (when) depends on a phrase containing a verb.
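As an illustration only, the single-hop patterns of FIG. 2 (a phrase made of [noun phrase] plus one or more particles that depends on a phrase containing a verb) could be matched roughly as follows. The `Word` and `Phrase` classes, the part-of-speech labels, and the hand-made segmentation of the example sentence are assumptions introduced for this sketch; they are not the data format of the device, and the exclusion patterns are not applied here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Word:
    surface: str  # the word as written
    pos: str      # part of speech: "noun", "particle", "verb", ...


@dataclass
class Phrase:
    words: List[Word]
    head: Optional[int]  # index of the phrase this phrase depends on (None = no head)


def is_noun_phrase_with(phrase: Phrase, particles: Tuple[str, ...]) -> bool:
    """True if the phrase is [noun phrase] followed by exactly the given particles."""
    n = len(particles)
    if len(phrase.words) <= n:
        return False
    tail = phrase.words[-n:]
    if any(w.pos != "particle" or w.surface != p for w, p in zip(tail, particles)):
        return False
    # A noun phrase is a word string that ends in a noun.
    return phrase.words[-n - 1].pos == "noun"


def has_verb(phrase: Phrase) -> bool:
    return any(w.pos == "verb" for w in phrase.words)


def find_case(phrases: List[Phrase], particles: Tuple[str, ...]) -> List[Tuple[int, int]]:
    """Return (dependent, head) index pairs where the dependent matches the
    particle pattern and its grammatical head contains a verb."""
    hits = []
    for i, p in enumerate(phrases):
        if p.head is None:
            continue
        if is_noun_phrase_with(p, particles) and has_verb(phrases[p.head]):
            hits.append((i, p.head))
    return hits


# Hypothetical segmentation of 「ラプラシアンフィルタ処理を ＲＧＢ信号毎に 行う」.
# The 「を」 phrase depends on 「行う」 even though another phrase sits between them.
phrases = [
    Phrase([Word("ラプラシアンフィルタ処理", "noun"), Word("を", "particle")], head=2),
    Phrase([Word("ＲＧＢ信号毎", "noun"), Word("に", "particle")], head=2),
    Phrase([Word("行う", "verb")], head=None),
]

target_hits = find_case(phrases, ("を",))          # target case No. 1
condition_hits = find_case(phrases, ("に",))       # condition case No. 1
yori_hits = find_case(phrases, ("に", "より"))      # condition case No. 2
```

Because the match follows the `head` link rather than the next phrase in the text, the intervening 「ＲＧＢ信号毎に」 does not prevent 「ラプラシアンフィルタ処理を」 from being matched.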

FIG. 3 is a diagram showing the case component patterns (part 2) of FIG. 1. As in FIG. 2, the left side shows what each case indicates, and the right side shows the corresponding case component pattern 15 as a tree-like phrase structure. As in FIG. 2, one line of the tree-like phrase structure on the right corresponds to one phrase.
Starting point case No. 1 represents the pattern in which a phrase consisting of [noun phrase] followed by the particle 「から」 (kara) depends on a phrase containing a verb.
Destination case No. 1 represents the pattern in which a phrase consisting of [noun phrase] followed by the particle 「へ」 (e) depends on a phrase containing a verb.
Destination case No. 2 represents the pattern in which a phrase consisting of [noun phrase] followed by the particle 「に」 depends on a phrase containing a verb.

Destination case No. 3 represents the pattern in which a phrase consisting of [noun phrase] followed by the particle 「に」 depends on a phrase consisting of the verb 「対し」 (toward) and the particle 「て」, which in turn depends on a phrase containing a verb.

Mediation case No. 1 represents the pattern in which a phrase consisting of [noun phrase] followed by the particle 「を」 depends on a phrase consisting of the verb 「介し」 (via) and the particle 「て」, which in turn depends on a phrase containing a verb.

Mediation case No. 2 represents the pattern in which a phrase consisting of [noun phrase] followed by the particle 「に」 depends on a phrase consisting of the verb 「媒介させ」 (made to mediate) and the particle 「て」, which in turn depends on a phrase containing a verb.

Use/role case No. 1 represents the pattern in which a phrase consisting of [noun phrase] followed by the particle 「として」 (as) and the particle 「の」 (no) depends on a phrase containing a verb.
State case No. 1 represents the pattern in which a phrase consisting of [noun phrase] followed by the noun 「状態」 (state) and the particle 「で」 (de) depends on a phrase containing a verb.

Raw material case No. 1 represents the pattern in which a phrase consisting of [noun phrase] followed by the particle 「から」 depends on a phrase consisting of the verb 「なる」 (consist), which in turn depends on an arbitrary phrase.

Raw material case No. 2 represents the pattern in which a phrase consisting of [noun phrase] followed by the particle 「を」 depends on a phrase consisting of the verb 「有する」 (have), which in turn depends on an arbitrary phrase.
Comparison case No. 1 represents the pattern in which a phrase consisting of [noun phrase] followed by the particle 「より」 depends on a phrase containing an adjective.
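Several of the patterns in FIG. 3 span more than one dependency link, for example destination case No. 3 ([noun phrase] 「に」, then 「対し」「て」, then a verb phrase). A minimal sketch of such chained matching is given below. The predicate helpers, the tuple representation of words, and the example segmentation are all assumptions made for illustration and do not reproduce the pattern format of FIG. 3.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Word = Tuple[str, str]  # (surface, part of speech)


@dataclass
class Phrase:
    words: List[Word]
    head: Optional[int]  # index of the phrase this phrase depends on


Pred = Callable[[Phrase], bool]


def noun_phrase_with(particle: str) -> Pred:
    """[noun phrase] followed by the given particle."""
    def pred(p: Phrase) -> bool:
        return (len(p.words) >= 2
                and p.words[-1] == (particle, "particle")
                and p.words[-2][1] == "noun")
    return pred


def verb_te(verb: str) -> Pred:
    """A phrase made of the given verb form followed by the particle 「て」."""
    return lambda p: p.words == [(verb, "verb"), ("て", "particle")]


def contains(pos: str) -> Pred:
    """A phrase containing at least one word with the given part of speech."""
    return lambda p: any(w[1] == pos for w in p.words)


def matches_chain(phrases: List[Phrase], start: int, first: Pred, hops: List[Pred]) -> bool:
    """Check `first` on the starting phrase, then follow head links,
    checking one predicate per dependency hop."""
    if not first(phrases[start]):
        return False
    idx = start
    for pred in hops:
        nxt = phrases[idx].head
        if nxt is None or not pred(phrases[nxt]):
            return False
        idx = nxt
    return True


DESTINATION_3 = (noun_phrase_with("に"), [verb_te("対し"), contains("verb")])
MEDIATION_1 = (noun_phrase_with("を"), [verb_te("介し"), contains("verb")])
COMPARISON_1 = (noun_phrase_with("より"), [contains("adjective")])

# Hypothetical segmentation of 「サーバに 対して データを 送信する」.
phrases = [
    Phrase([("サーバ", "noun"), ("に", "particle")], head=1),
    Phrase([("対し", "verb"), ("て", "particle")], head=3),
    Phrase([("データ", "noun"), ("を", "particle")], head=3),
    Phrase([("送信", "noun"), ("する", "verb")], head=None),
]

is_destination = matches_chain(phrases, 0, *DESTINATION_3)
is_mediation = matches_chain(phrases, 2, *MEDIATION_1)
```

Expressing each pattern as a starting predicate plus a list of per-hop predicates keeps single-hop and multi-hop patterns in one form; a pattern ending in "an arbitrary phrase" would simply use a predicate that always returns true.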
FIG. 4 is a diagram showing the case component exclusion patterns of FIG. 1.

The case component exclusion patterns 16 comprise four patterns that exclude the characterizing portion, two patterns that exclude repetition by 「前記」 ("said"), and four patterns that exclude the dependency of the particle 「を」 on specific verbs. The tree-like phrase structure of each case component exclusion pattern 16 is made up of phrases having words and the parts of speech of the words.

Characterizing portion No. 1 represents the pattern in which a phrase consisting of [noun phrase] followed by the particle 「を」 depends on the verb 「有する」 (have), which depends on a phrase consisting of the noun 「こと」 followed by the particle 「を」, which depends on a phrase consisting of the noun 「特徴」 (characteristic) followed by the particle 「を」, which depends on the verb 「する」, which in turn depends on the end of the claim text, that is, on the subject matter of the claim, written as 「・・・装置／方法。」 ("... device/method.") or the like.

Characterizing portion No. 2 represents the pattern in which a phrase consisting of [noun phrase] followed by the particle 「を」 depends on the verb 「備える」 (comprise), which depends on a phrase consisting of the noun 「こと」 followed by the particle 「を」, which depends on a phrase consisting of the noun 「特徴」 followed by the particle 「を」, which depends on the verb 「する」, which in turn depends on the end of the claim text, that is, on the subject matter of the claim, written as 「・・・装置／方法。」 or the like.

Characterizing portion No. 3 represents the pattern in which a phrase consisting of [noun phrase] followed by the particle 「を」 depends on the verb 「有する」, which in turn depends on the end of the claim text, that is, on the subject matter of the claim, written as 「・・・装置／方法。」 or the like.

Characterizing portion No. 4 represents the pattern in which a phrase consisting of [noun phrase] followed by the particle 「を」 depends on the verb 「備える」, which in turn depends on the end of the claim text, that is, on the subject matter of the claim, written as 「・・・装置／方法。」 or the like.

Characterizing portions No. 1 to No. 4 are all patterns that contain a verb depending on the phrase that states the subject matter of the patent. Such a pattern is purely formal and formulaic and does not affect the technical scope of the patent right. Moreover, a case component is something that limits a verb belonging to a component requirement. Accordingly, a portion that limits a verb belonging to the subject matter of the patent is excluded from the case components even if it formally matches a case component pattern.

Repetition by 「前記」, No. 1 represents the pattern in which a phrase structure that has the noun 「前記」 ("said") at its head and the particles 「に」 and 「より」 at its end depends on a phrase containing a verb.

Repetition by 「前記」, No. 2 represents the pattern in which a phrase structure that has the noun 「前記」 at its head and the particle 「が」 at its end depends on a phrase containing a verb.

For repetition by 「前記」, No. 1 and No. 2, it is further determined that the noun phrase following the noun 「前記」 is identical to a noun phrase that has already appeared, and that the base form of the verb phrase appearing here is identical to the base form of a verb phrase that has already appeared. This makes it possible to detect a verb phrase that was first written in the active voice and is then written again in the passive voice.

The portions matched by repetition by 「前記」, No. 1 and No. 2 are restated only to make the claims clearer and could be omitted. Extracting them as case components would make the case components vary with the style of description, which is undesirable. They are therefore excluded from the case components.
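The repetition check described above can be sketched as follows. This is a simplified illustration under stated assumptions: the 「前記」 structure is taken to fit in a single phrase, each word is assumed to carry a `base` (dictionary form) field supplied by the morphological analyzer, and the sets of previously seen noun phrases and verb base forms are assumed to be collected elsewhere. None of these names come from the specification.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Word:
    surface: str
    pos: str
    base: str  # dictionary (base) form, assumed to be given by the analyzer


@dataclass
class Phrase:
    words: List[Word]
    head: Optional[int]


# Particle endings of repetition patterns No. 1 (「に」「より」) and No. 2 (「が」).
TAILS = (("に", "より"), ("が",))


def noun_after_zenki(phrase: Phrase) -> Optional[str]:
    """Return the noun phrase between a leading 「前記」 and a trailing
    particle ending, or None if the phrase does not have that shape."""
    ws = phrase.words
    if not ws or ws[0].surface != "前記":
        return None
    for tail in TAILS:
        n = len(tail)
        if (len(ws) > n + 1
                and tuple(w.surface for w in ws[-n:]) == tail
                and all(w.pos == "particle" for w in ws[-n:])):
            return "".join(w.surface for w in ws[1:-n])
    return None


def verb_base(phrase: Phrase) -> Optional[str]:
    """Base form of the first verb in the phrase, if any."""
    for w in phrase.words:
        if w.pos == "verb":
            return w.base
    return None


def is_repetition(phrases: List[Phrase], i: int,
                  seen_nouns: Set[str], seen_verb_bases: Set[str]) -> bool:
    p = phrases[i]
    if p.head is None:
        return False
    noun = noun_after_zenki(p)
    base = verb_base(phrases[p.head])
    return (noun is not None and noun in seen_nouns
            and base is not None and base in seen_verb_bases)


# Hypothetical example: 「送る」 appeared earlier in the active voice, and
# 「前記送信手段により 送られ」 now restates it in the passive voice.
phrases = [
    Phrase([Word("前記", "noun", "前記"), Word("送信手段", "noun", "送信手段"),
            Word("に", "particle", "に"), Word("より", "particle", "より")], head=1),
    Phrase([Word("送ら", "verb", "送る"), Word("れ", "auxiliary", "れる")], head=None),
]

repeated = is_repetition(phrases, 0, {"送信手段"}, {"送る"})
not_repeated = is_repetition(phrases, 0, {"送信手段"}, {"受ける"})
```

Comparing `base` rather than `surface` is what lets the passive 「送られ」 be recognized as the same verb as the earlier active 「送る」.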

Dependency on a specific verb, No. 1 represents the pattern in which a phrase consisting of a noun phrase followed by the particle 「を」 depends on the verb 「行う」 (perform). The verb 「行う」 may be inflected. This pattern is excluded because, although it formally matches the pattern of the case indicating a target, it cannot actually be regarded as a case.

The verb 「行う」 has no specific meaning of its own; its meaning is determined only when a phrase consisting of a noun phrase followed by the particle 「を」 depends on it. The same meaning could be expressed by writing the verb 「する」 directly after the noun phrase, and it is undesirable for the number of case components to change as a result of such rewriting.

For example, consider the text 「受信データにＲＧＢ変換処理を行う変換手段」 (conversion means that performs RGB conversion processing on received data). Formally, 「ＲＧＢ変換処理を」 corresponds to the case indicating a target. However, the text can be rewritten as 「受信データにＲＧＢ変換処理する変換手段」, so treating this portion as a case component would make the case components change with a purely formal change of expression, which is undesirable. This pattern is therefore not treated as a case component.
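The exclusion of 「を」 phrases that depend on verbs such as 「行う」 can be sketched as below. The way the base form of the head phrase is assembled (concatenating the base forms of its nouns and verbs, so that 「実行」+「し」 yields 「実行する」) is an assumption chosen for this illustration, as are the class names and the segmentation of the example; a real morphological analyzer may tokenize differently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:
    surface: str
    pos: str
    base: str  # dictionary (base) form


@dataclass
class Phrase:
    words: List[Word]
    head: Optional[int]


# Verbs named in the four "dependency on a specific verb" exclusion patterns.
EXCLUDED_VERBS = {"行う", "実行する", "処理する", "する"}


def predicate_base(phrase: Phrase) -> str:
    """Assumed normalization: join the base forms of nouns and verbs and
    ignore auxiliaries and particles, so inflected forms compare equal."""
    return "".join(w.base for w in phrase.words if w.pos in ("noun", "verb"))


def is_excluded_wo(phrases: List[Phrase], i: int) -> bool:
    """True if phrase i is [noun phrase]「を」 and depends on an excluded verb."""
    p = phrases[i]
    if p.head is None or len(p.words) < 2:
        return False
    last = p.words[-1]
    if not (last.pos == "particle" and last.surface == "を"
            and p.words[-2].pos == "noun"):
        return False
    return predicate_base(phrases[p.head]) in EXCLUDED_VERBS


# Hypothetical segmentation of 「受信データに ＲＧＢ変換処理を 行う 変換手段」.
claim = [
    Phrase([Word("受信データ", "noun", "受信データ"), Word("に", "particle", "に")], head=2),
    Phrase([Word("ＲＧＢ変換処理", "noun", "ＲＧＢ変換処理"), Word("を", "particle", "を")], head=2),
    Phrase([Word("行う", "verb", "行う")], head=3),
    Phrase([Word("変換手段", "noun", "変換手段")], head=None),
]

# Inflected form 「処理を 行った」 and an ordinary verb 「画像を 表示する」.
inflected = [
    Phrase([Word("処理", "noun", "処理"), Word("を", "particle", "を")], head=1),
    Phrase([Word("行っ", "verb", "行う"), Word("た", "auxiliary", "た")], head=None),
]
ordinary = [
    Phrase([Word("画像", "noun", "画像"), Word("を", "particle", "を")], head=1),
    Phrase([Word("表示", "noun", "表示"), Word("する", "verb", "する")], head=None),
]
```

In this sketch 「ＲＧＢ変換処理を」 is excluded while 「受信データに」 is left for the ordinary case component patterns, and 「画像を 表示する」 is not excluded because its normalized predicate is 「表示する」 rather than a bare 「する」.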

Dependency on a specific verb, No. 2 represents the pattern in which a phrase consisting of a noun phrase followed by the particle 「を」 depends on the verb 「実行する」 (execute). The verb 「実行する」 may be inflected.

Dependency on a specific verb, No. 3 represents the pattern in which a phrase consisting of a noun phrase followed by the particle 「を」 depends on the verb 「処理する」 (process). The verb 「処理する」 may be inflected.

Dependency on a specific verb, No. 4 represents the pattern in which a phrase consisting of a noun phrase followed by the particle 「を」 depends on the verb 「する」 (do). The verb 「する」 may be inflected.
(Operation of Embodiment 1)
FIG. 5 is a flowchart showing the operation of the patent specification analysis device of FIG. 1.

When the patent specification analysis device 10 starts processing, in step S1 it extracts the claim text 21 of each patent gazette or laid-open publication from a CSV file in which each row contains the information of one patent gazette or laid-open publication.
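Step S1 could look roughly like the following sketch, which uses Python's standard `csv` module. The specification does not state the column layout of the CSV file, so the header row and the column names `publication_number` and `claims` are purely hypothetical.

```python
import csv
import io
from typing import List


def extract_claim_texts(csv_text: str, claim_column: str = "claims") -> List[str]:
    """Read one publication per row and return the non-empty claim texts.
    `claim_column` is a hypothetical column name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row[claim_column] for row in reader if row.get(claim_column)]


# Hypothetical two-row input; a real file would be opened with open(..., newline="").
sample_csv = (
    "publication_number,claims\n"
    "JP0000000A,信号を受信する受信手段と、前記信号を増幅する増幅手段と、を備える装置。\n"
    "JP0000001A,信号を受信し、前記信号を増幅し、出力する方法。\n"
)

claim_texts = extract_claim_texts(sample_csv)
```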

In step S2, the component requirement classification means 11 divides the claim text 21 into component requirements. In step S3, the dependency analysis means 12 performs dependency analysis on each component requirement and outputs the result in CSV form. In step S4, the phrase structure configuration means 13 converts the dependency analysis CSV into XML, and in step S5 it arranges the dependency analysis XML into a tree according to the dependencies. In step S6, case components are extracted from the tree-structured dependency analysis XML, and the operation of FIG. 5 ends.
FIG. 6 is a flowchart showing the operation of the component requirement classification means of FIG. 1.

When processing starts, in step S10 the component requirement classification means 11 counts the occurrences of the enumeration-format delimiter 「と、」 and stores the count as Sk1. In step S11 it counts the occurrences of the run-on (書き流し) format delimiter 「し、」 and stores the count as Sk2. In step S12 it counts the occurrences of the Jepson-format delimiter 「を特徴とする・・・であって、」 and stores the count as Sj.
In step S13, if Sj is not greater than 0, the processing of step S14 is performed; if Sj is greater than 0, the processing of step S17 is performed.

In step S14, if Sk1 is greater than or equal to Sk2, the text is segmented at the enumeration-format delimiters in step S15; otherwise, it is segmented at the run-on-format delimiters in step S16. The processing of FIG. 6 then ends.

In step S17, if Sk1 is greater than or equal to Sk2, the text is segmented at the Jepson-format and enumeration-format delimiters in step S18; otherwise, it is segmented at the Jepson-format and run-on-format delimiters in step S19. The processing of FIG. 6 then ends.
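Steps S10 to S19 can be sketched as follows. One simplification is assumed: the Jepson-format delimiter, which the specification writes with an elided middle as 「を特徴とする・・・であって、」, is approximated here by the fixed string 「であって、」. The function and constant names are likewise illustrative only.

```python
import re
from typing import List

ENUM_DELIM = "と、"        # enumeration-format delimiter
RUN_ON_DELIM = "し、"      # run-on-format delimiter
JEPSON_DELIM = "であって、"  # assumed stand-in for the Jepson-format delimiter


def split_component_requirements(claim: str) -> List[str]:
    sk1 = claim.count(ENUM_DELIM)    # S10
    sk2 = claim.count(RUN_ON_DELIM)  # S11
    sj = claim.count(JEPSON_DELIM)   # S12

    # S14 / S17: choose the enumeration format when Sk1 >= Sk2.
    delims = [ENUM_DELIM if sk1 >= sk2 else RUN_ON_DELIM]
    if sj > 0:                       # S13: Jepson delimiter is added (S18 / S19)
        delims.append(JEPSON_DELIM)

    # re.split with a capturing group returns text and delimiters alternately;
    # each delimiter is re-attached to the segment it terminates.
    pattern = "(" + "|".join(re.escape(d) for d in delims) + ")"
    parts = re.split(pattern, claim)
    segments = []
    for k in range(0, len(parts), 2):
        segment = parts[k] + (parts[k + 1] if k + 1 < len(parts) else "")
        if segment:
            segments.append(segment)
    return segments


enum_claim = "信号を受信する受信手段と、前記信号を増幅する増幅手段と、を備える装置。"
run_on_claim = "信号を受信し、前記信号を増幅し、出力する方法。"
jepson_claim = "信号を処理する装置であって、" + enum_claim

enum_parts = split_component_requirements(enum_claim)
run_on_parts = split_component_requirements(run_on_claim)
jepson_parts = split_component_requirements(jepson_claim)
```

Counting plain substrings mirrors the flowchart but can also hit a 「と、」 or 「し、」 that does not end a component requirement; the sketch makes no attempt to guard against that.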

Claims are written in two broad styles: the component requirement enumeration format and the run-on (書き流し) format. In the enumeration format, each component requirement is written as a noun phrase, and the noun phrases are listed with the particle 「と」 and the symbol 「、」 appended. Splitting at the particle 「と」 followed by the symbol 「、」 therefore separates the component requirements.

In the run-on format, each component requirement is written as a verb phrase. Each component requirement mostly ends in 「し」, an inflected form of the verb 「する」, followed by the symbol 「、」. Splitting at the verb form 「し」 followed by the symbol 「、」 therefore separates the component requirements.

The Jepson format has 「を特徴とする・・・であって、」 at the end of the known portion and is combined with either the enumeration format or the run-on format. Splitting at the Jepson-format ending pattern in addition to the ending pattern of the enumeration or run-on format therefore separates the component requirements.
FIG. 7 is a flowchart showing the operation of the dependency analysis means of FIG. 1.

When processing starts, in step S20 the dependency analysis means 12 reads the claim text 21 that has been segmented into component requirements. Steps S21 to S26 are repeated for each component requirement.

ステップＳ２２において、現在の構成要件の文章を形態素変換し、単語と文節とを抽出する。ステップＳ２３〜Ｓ２５に渡って、現在の構成要件に含まれている文節毎に処理を繰り返す。ステップＳ２４において、現在の文節がどの文節に係り受けているかを判断する。ステップＳ２５において、構成要件に含まれている全ての文節の処理が終了したならば、ステップＳ２６の処理を行う。ステップＳ２６において、現在の請求項に含まれている全ての構成要件の処理が終了したならば、図７の処理を終了する。
図８は、図１の文節構造構成手段におけるＸＭＬ化の動作を示すフローチャートである。 In step S22, the sentence of the current configuration requirement is morphologically converted to extract words and phrases. Over steps S23 to S25, the process is repeated for each clause included in the current configuration requirement. In step S24, it is determined to which phrase the current phrase is dependent. In step S25, if all the clauses included in the configuration requirements have been processed, the process of step S26 is performed. In step S26, when the processing of all the constituent requirements included in the current claim is completed, the processing of FIG. 7 is ended.
FIG. 8 is a flowchart showing the XML conversion operation of the clause structure construction means of FIG. 1.

When the process starts, in step S30 the clause structure construction means 13 generates an XML data structure consisting only of a root. Steps S31 to S40 are repeated for each component requirement in the input clause dependency analysis result 23, which is in CSV format. In step S32, a component requirement node is added to the XML root. Steps S33 to S39 are repeated for each clause of that component requirement. In step S34, a structure node is added to the component requirement. In step S35, a clause node is added to that structure node. Steps S36 to S38 are repeated for each word of the clause. In step S37, a word node is added to the clause, and the reading and part of speech of the word are recorded. In step S38, once all words in the clause have been processed, the process proceeds to step S39. In step S39, once all clauses in the component requirement have been processed, the process proceeds to step S40. In step S40, once all component requirements have been processed, the process of FIG. 8 ends.
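The nested loops of steps S30 to S40 can be sketched with Python's standard `xml.etree.ElementTree` module as follows. The English element names (`root`, `requirement`, `structure`, `clause`, `word`), the attribute names `id`, `head` and `pos`, and the plain-list input are assumptions made for illustration; the text above does not fix the markup at this level of detail.

```python
import xml.etree.ElementTree as ET


def build_dependency_xml(requirements):
    """Build a flat XML document from per-requirement clause lists."""
    root = ET.Element("root")                                    # S30
    for clauses in requirements:                                 # S31-S40
        requirement_node = ET.SubElement(root, "requirement")    # S32
        for clause in clauses:                                   # S33-S39
            ids = {"id": str(clause["id"]), "head": str(clause["head"])}
            structure_node = ET.SubElement(
                requirement_node, "structure", dict(ids))        # S34
            clause_node = ET.SubElement(
                structure_node, "clause", dict(ids))             # S35
            for reading, pos in clause["words"]:                 # S36-S38
                word_node = ET.SubElement(
                    clause_node, "word", {"pos": pos})           # S37
                word_node.text = reading
    return root


# Hypothetical input: one component requirement with two clauses.
requirements = [[
    {"id": 0, "head": 1, "words": [("シンゴウ", "名詞"), ("ヲ", "助詞")]},
    {"id": 1, "head": -1, "words": [("ジュシン", "名詞"), ("シ", "動詞")]},
]]
root = build_dependency_xml(requirements)
print(ET.tostring(root, encoding="unicode"))
```

At this stage every structure node hangs directly under its requirement node; the dependency is only recorded in the `head` attribute.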

FIGS. 9(a) and 9(b) are flowcharts showing the tree construction operation of the clause structure construction means of FIG. 1. FIG. 9(a) is a flowchart showing the upper-level process, and FIG. 9(b) is a flowchart showing the recursive process that is called both from the upper-level process and from the recursive process itself.

When the upper-level process starts, in step S50 the recursive process of (b) is called with the first structure of the component requirement as the target and the last structure of the component requirement as the designated destination, and the upper-level process of FIG. 9(a) then ends.

When the recursive process starts, steps S51 to S55 are repeated, taking each structure from the target structure to the designated destination structure in turn as the current structure. In step S52, it is determined whether the current structure depends on the next structure. If it does, the process of step S53 is performed; if it does not, the process of step S54 is performed. In step S53, the current structure is added to the head of the child nodes of the next structure. In step S54, the recursive process is called recursively with the next structure as the new target and the dependency destination structure as the new designated destination. In step S55, once every structure up to the designated destination has been processed, the process of FIG. 9(b) ends.
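To illustrate the kind of tree this step produces, the sketch below attaches each structure node under the structure it depends on. It is deliberately not the recursive traversal of FIG. 9(b): it is a simpler direct-attachment variant, assuming that every clause depends on a later clause of the same component requirement. The element and attribute names and the sample clauses are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_clause_tree(clauses):
    """clauses: (clause_id, head_id, text) tuples in sentence order.

    A head_id that matches no clause (here -1) marks a top-level structure.
    """
    requirement = ET.Element("requirement")
    structures = {}
    for clause_id, head_id, text in clauses:
        structure = ET.Element(
            "structure", {"id": str(clause_id), "head": str(head_id)})
        ET.SubElement(structure, "clause").text = text
        structures[clause_id] = structure
    for clause_id, head_id, _ in clauses:
        node = structures[clause_id]
        if head_id in structures:
            parent = structures[head_id]
            # Insert before the parent's own clause node so that the
            # clause node stays the last child of its structure node.
            parent.insert(len(parent) - 1, node)
        else:
            requirement.append(node)
    return requirement


# Hypothetical component requirement: both modifiers depend on clause 2.
tree = build_clause_tree([
    (0, 2, "増幅器で"),
    (1, 2, "信号を"),
    (2, -1, "増幅し、"),
])
print(ET.tostring(tree, encoding="unicode"))
```

Processing the clauses in sentence order keeps dependents in their original order under the parent, with the parent's own clause node last.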
FIG. 10 is a flowchart showing the operation of the case component extraction means of FIG. 1.

When the process starts, in step S60 the case component extraction means 14 reads in its input data. Steps S61 to S67 are repeated for all component requirements. Steps S62 to S66 are repeated for all clauses in the current component requirement, in order from the last clause. In step S63, it is determined whether the clause matches a case component pattern 15, and in step S64, it is determined whether it matches a case component exclusion pattern 16. The process of step S65 is performed only when the clause matches a case component pattern 15 and does not match a case component exclusion pattern 16; otherwise, the process proceeds to step S66. In step S66, once all clauses have been processed, the process proceeds to step S67. In step S67, once all component requirements have been processed, the process of FIG. 10 ends.
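The decision made in steps S62 to S66 can be sketched as below. The pattern tables are simplified, hypothetical stand-ins for the case component patterns 15 and the exclusion patterns 16 (which the specification describes as tree-shaped), the case type names are invented for the example, and step S65 is assumed to record the clause as a case component.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Word(NamedTuple):
    surface: str
    base: str
    pos: str


class Clause(NamedTuple):
    clause_id: int
    head_id: int
    words: Tuple[Word, ...]


# Hypothetical patterns: final particle of the clause -> case type.
CASE_PATTERNS: Dict[str, str] = {"に": "goal", "を": "object"}
# Hypothetical exclusion: a 「を」 clause whose head clause starts with this verb.
EXCLUDED_HEAD_VERBS = {"する"}


def case_type_of(clause: Clause) -> Optional[str]:
    last = clause.words[-1]
    return CASE_PATTERNS.get(last.surface) if last.pos == "助詞" else None


def is_excluded(clause: Clause, by_id: Dict[int, Clause]) -> bool:
    head = by_id.get(clause.head_id)
    if head is None or clause.words[-1].surface != "を":
        return False
    first = head.words[0]
    return first.pos == "動詞" and first.base in EXCLUDED_HEAD_VERBS


def extract_case_components(clauses: List[Clause]) -> List[Tuple[int, str]]:
    by_id = {c.clause_id: c for c in clauses}
    found = []
    for clause in reversed(clauses):                 # S62-S66, from the end
        case_type = case_type_of(clause)             # S63
        excluded = is_excluded(clause, by_id)        # S64
        if case_type is not None and not excluded:
            found.append((clause.clause_id, case_type))  # S65 (assumed)
    return found


# Hypothetical clauses for 「メモリに データを 記憶する 処理を する」.
clauses = [
    Clause(0, 2, (Word("メモリ", "メモリ", "名詞"), Word("に", "に", "助詞"))),
    Clause(1, 2, (Word("データ", "データ", "名詞"), Word("を", "を", "助詞"))),
    Clause(2, 3, (Word("記憶", "記憶", "名詞"), Word("する", "する", "動詞"))),
    Clause(3, 4, (Word("処理", "処理", "名詞"), Word("を", "を", "助詞"))),
    Clause(4, -1, (Word("する", "する", "動詞"),)),
]
print(extract_case_components(clauses))
```

In this example 「処理を」 is dropped because it depends on a bare 「する」, while 「データを」 and 「メモリに」 are kept.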
FIG. 11 is a diagram showing the clause dependency analysis result of FIG. 1.

As a rule, each line contains one word. A line containing a word lists, separated by tabs, the character string that constitutes the word, its reading, the character string that constitutes the word, the part of speech, and detailed information about that part of speech.

Words are grouped into clauses by lines that begin with an asterisk “*”. A line beginning with an asterisk lists, separated by spaces, the ID number of the clause, the ID number of the clause on which it depends, and “D”.
Clauses are in turn grouped by component requirement by lines that begin with “EOS” in half-width characters.
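A reader for this line-oriented format might look like the following sketch. The five-column order of the word lines (surface, reading, base form, part of speech, detail), the joined form `1D` of the dependency field, and all names are assumptions based on the description above, not a specification of the analyzer's output.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Word:
    surface: str
    reading: str
    base: str
    pos: str
    detail: str


@dataclass
class Clause:
    clause_id: int
    head_id: int  # ID of the clause this clause depends on
    words: List[Word] = field(default_factory=list)


def parse_dependency_result(text: str) -> List[List[Clause]]:
    """Return one list of clauses per component requirement."""
    requirements: List[List[Clause]] = []
    clauses: List[Clause] = []
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("EOS"):          # end of a component requirement
            requirements.append(clauses)
            clauses = []
        elif line.startswith("*"):          # clause header: "* <id> <head>D"
            fields = line.split()
            clauses.append(Clause(int(fields[1]), int(fields[2].rstrip("D"))))
        else:                               # word line, tab-separated
            columns = line.split("\t")
            columns += [""] * (5 - len(columns))
            # Assumes a clause header always precedes the first word line.
            clauses[-1].words.append(Word(*columns[:5]))
    return requirements


# Hypothetical analysis result for one component requirement.
sample = (
    "* 0 1D\n"
    "信号\tシンゴウ\t信号\t名詞\t一般\n"
    "を\tヲ\tを\t助詞\t格助詞\n"
    "* 1 -1D\n"
    "受信\tジュシン\t受信\t名詞\tサ変接続\n"
    "し\tシ\tする\t動詞\t自立\n"
    "EOS\n"
)
parsed = parse_dependency_result(sample)
print(len(parsed), "component requirement(s),", len(parsed[0]), "clauses")
```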
FIG. 12 is a diagram showing the dependency analysis XML produced by the process shown in FIG. 8.

A “component requirement” node is connected to a root node (not shown). A plurality of “structure” nodes are connected to this “component requirement” node. The attributes of each “structure” node record the ID of the “clause” node that is its child and the ID of the “clause” node it depends on. All “structure” nodes are connected directly under the “component requirement” node, so these connections do not reflect the dependencies between the “structure” nodes.

One “clause” node is connected to each “structure” node. The attributes of the “clause” node record the ID of that “clause” node and the ID of the “clause” node it depends on.

A plurality of “word” nodes are connected to each “clause” node. The attribute of a “word” node records the part of speech of the word, and the value of the “word” node records the reading of the word.
FIG. 13 is a diagram showing the tree-like clause structure of FIG. 1.

A “component requirement” node is connected to a root node (not shown). A plurality of “structure” nodes are connected to this “component requirement” node. The last of the sibling nodes of such a “structure” node is the “clause” node of the dependency destination.
FIG. 14 is a diagram showing the tree-like clause structure of FIG. 1 with the case components extracted.

The attributes of a “structure” node record whether it is a case component, the type of the case component, and the pattern by which the case component was extracted. For example, the “structure” node on the fifth line is a case component representing a goal (着点), and was extracted by the pattern 「〜に」.
(Effect of Example 1)

According to the patent specification analysis apparatus of Example 1, when the clause structure is compared with the plurality of case component patterns 15 and the case component exclusion patterns 16 to extract case components, what is compared is not the character string of a clue phrase but a tree-like clause structure composed of words, the parts of speech of the words, and clauses. Case components can therefore be extracted more accurately than by matching the character strings of clue phrases.

(Configuration of Example 2)
FIG. 15 is a diagram showing the schematic configuration of a contract analysis apparatus according to Example 2 of the present invention. Elements identical to those in FIG. 1, which shows Example 1, are given the same reference numerals.

The contract analysis apparatus 10A of Example 2 has provision classification means 11A, which differs from the component requirement classification means 11 of the patent specification analysis apparatus 10 of Example 1, and has case component exclusion patterns 16A, which differ from the case component exclusion patterns 16 of Example 1. In all other respects it has the same configuration as the patent specification analysis apparatus 10 of Example 1.
(Operation of Example 2)
FIG. 16 is a flowchart showing the operation of the contract analysis apparatus of FIG. 15.

When the process starts, in step S70 the element classification means extracts the titles of the articles and assigns them the attribute “excluded from case extraction”. In step S71, it extracts the preamble of the articles and assigns it the attribute “excluded from case extraction”, and assigns the attribute “subject to case extraction” to the body text of the articles. Steps S72 to S74 are repeated for all articles.

In step S73, the text is divided article by article, and in step S74, case components are extracted from the portions to which the attribute “subject to case extraction” has been assigned. When all articles have been processed, the process of FIG. 16 ends.
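The attribute assignment of steps S70 to S73 can be sketched as follows, assuming that each article title such as 「第１条（目的）」 stands on its own line and that everything before the first title is the preamble. The header pattern, the function name and the sample contract are hypothetical; case component extraction itself (step S74) is left out.

```python
import re
from typing import List, Tuple

# Assumed layout: an article title occupies a whole line starting with 第N条.
ARTICLE_HEADER = re.compile(
    r"^第[0-9０-９一二三四五六七八九十百]+条.*$", re.MULTILINE)


def classify_contract(text: str) -> List[Tuple[str, bool]]:
    """Return (segment, is_case_extraction_target) pairs in document order."""
    segments: List[Tuple[str, bool]] = []
    position = 0
    seen_header = False
    for match in ARTICLE_HEADER.finditer(text):
        before = text[position:match.start()].strip()
        if before:
            # Preamble (before the first title) is excluded; bodies are targets.
            segments.append((before, seen_header))
        segments.append((match.group().strip(), False))  # title: excluded
        position = match.end()
        seen_header = True
    tail = text[position:].strip()
    if tail:
        segments.append((tail, seen_header))
    return segments


# Hypothetical contract text used only for illustration.
contract = (
    "甲と乙は、次のとおり契約を締結する。\n"
    "第１条（目的）\n"
    "甲は、乙に業務を委託する。\n"
    "第２条（期間）\n"
    "本契約は、締結日に効力を生じる。\n"
)
for segment, is_target in classify_contract(contract):
    print("target  " if is_target else "excluded", segment)
```

Only the segments flagged as targets would then be passed on to dependency analysis and case component extraction.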

In this way, case components can be extracted automatically from a contract just as from a patent specification. As with a patent specification, the case components can also serve as a quantitative index of how narrowly the scope of effect of the contract is limited.
(Effect of Example 2)

According to the contract analysis apparatus 10A of Example 2, when the clause structure is compared with the plurality of case component patterns 15 and the case component exclusion patterns 16 to extract case components, what is compared is not the character string of a clue phrase but a tree-like clause structure composed of words, the parts of speech of the words, and clauses. Case components can therefore be extracted more accurately than by matching the character strings of clue phrases.
(Modification)
The present invention is not limited to the above examples, and various usage forms and modifications are possible. One such usage form or modification is, for example, (a) below.

(a) In Examples 1 and 2, the dependency analysis result is converted into XML having a tree-like structure, but a binary format may be used instead as long as it can handle a tree-like data structure.

Case components are highly correlated with the rate at which requests in invalidation trials are granted, and also with the rate at which infringement is found in infringement lawsuits. This means that the number of case components is highly correlated with the breadth of the technical scope of a patent, and indicates that the quality of a patent can be evaluated quantitatively by means of case components.
The present invention provides an apparatus that extracts case components automatically. With the present invention, the quality of a patent can be judged mechanically and quantitatively, without relying on human judgment.

DESCRIPTION OF SYMBOLS
10 Patent specification analysis apparatus
10A Contract analysis apparatus
11 Component requirement classification means
11A Provision classification means
12 Dependency analysis means
13 Clause structure construction means
14 Case component extraction means
15 Case component patterns
16, 16A Case component exclusion patterns
21 Claim text
22 Claim text divided into component requirements
23 Clause dependency analysis result
24 Tree-like clause structure
25 Tree-like clause structure with case components extracted

Claims

A component requirement classifying means for classifying a claim having a plurality of component requirements for each component requirement;
A morpheme analysis of the constituent requirements, disassembled into phrases having words and parts of speech of the words, and dependency analysis means for analyzing the dependency of the clauses;
Clause structure configuring means for configuring the clause dependency into a tree-like clause structure;
A plurality of case component patterns and case component exclusion patterns configured in a tree shape with clauses having words and word parts of speech;
Case component extraction means for comparing the phrase structure with the plurality of case component patterns and case component exclusion patterns and extracting as case components;
A patent specification analysis apparatus comprising:

The component requirement classifying unit classifies the claim into the component requirements according to a pattern corresponding to the statement format after determining the statement format of the claim. Patent specification analyzer.

The case component extraction means compares the phrase structure with the plurality of case component patterns and case component exclusion patterns by at least the original form of the first word and the part of speech of the first word, and extracts the case component as a case component;
The patent specification analysis apparatus according to claim 1.

The exclusion pattern of the case component is
4. The patent specification analysis apparatus according to claim 3, further comprising a verb depending on a clause indicating a subject of the patent.

The exclusion pattern of the case component is
A pattern in which a phrase ending with the particle "wo" depends on the verb "do",
A pattern in which a phrase ending with the particle "wo" depends on the verb "execute",
A pattern in which a phrase ending with the particle "wo" depends on the verb "processing",
4. The patent specification analysis apparatus according to claim 3, wherein the phrase ending with the particle “s” is one of patterns depending on the verb “s”.

Element classifying means for classifying a sentence including a plurality of elements for each element;
Dependency analysis means for performing morphological analysis of the elements, decomposing them into phrases having words and the parts of speech of the words, and analyzing the dependency of the phrases;
Clause structure configuring means for configuring the clause dependency into a tree-like clause structure;
A plurality of case component patterns and case component exclusion patterns configured in a tree shape with clauses having words and word parts of speech;
Case component extraction means for comparing the phrase structure with the plurality of case component patterns and case component exclusion patterns and extracting as case components;
A sentence analyzing apparatus characterized by comprising: