JP2016224998A

JP2016224998A - Information processing device

Info

Publication number: JP2016224998A
Application number: JP2016198076A
Authority: JP
Inventors: 真之正林; Masayuki Masabayashi
Original assignee: Individual
Current assignee: Individual
Priority date: 2016-10-06
Filing date: 2016-10-06
Publication date: 2016-12-28
Anticipated expiration: 2035-04-09
Also published as: JP6734174B2

Abstract

PROBLEM TO BE SOLVED: To provide an appropriate method of evaluating the quality of a publication such as a specification. SOLUTION: A claim word frequency distribution generation unit 41 separates the content of the claims of a patent publication into words and generates a claim word frequency distribution indicating the frequency distribution of the separated words. A specification word frequency distribution generation unit 42 separates the content of the specification of the patent publication into words and generates a specification word frequency distribution indicating the frequency distribution of the separated words. A specification synonym frequency distribution generation unit 43 classifies the words extracted from the specification into a plurality of groups, each corresponding to one of the words in the claims, and generates a specification synonym frequency distribution indicating a frequency distribution in units of the classified groups. An evaluation information generation unit 44 generates evaluation information based on the claim word frequency distribution and the specification synonym frequency distribution. SELECTED DRAWING: Figure 2

Description

The present invention relates to an information processing apparatus.

Patent rights have long been acquired by many companies for purposes such as protecting businesses that are planned or under way. A patent right is also a property right, and can be put to effective use by assigning it to a third party, establishing an exclusive license, or granting a non-exclusive license.

To obtain a patent right, the applicant must submit, in addition to the application form, the documents attached to it. These attached documents are the claims, the specification, the drawings, and the abstract.
Much research and development has been devoted to techniques that support the drafting of such attached documents (see, for example, Patent Document 1) and to techniques that analyze their sentence structure and the like (see, for example, Patent Document 2).

Patent Document 1: JP 2013-080278 A
Patent Document 2: JP 2014-010728 A

However, although an appropriate method of evaluating the quality of a publication such as a specification has been demanded, conventional techniques, including those of Patent Documents 1 and 2, have not been able to meet this demand sufficiently.

The present invention has been made in view of this situation, and its object is to realize an appropriate method of evaluating the quality of a publication such as a specification.

An information processing apparatus according to one aspect of the present invention includes:
first information generating means for separating the content of a first document that can be included in a publication relating to an intellectual property right into predetermined unit information consisting of characters, figures, symbols, or combinations thereof, and generating first information indicating the frequency distribution of the separated unit information;
second information generating means for separating the content of a second document that can be included in the publication into the unit information, and generating second information indicating the frequency distribution of the separated unit information;
third information generating means for classifying each piece of unit information in the second information into a plurality of groups, each corresponding to one of the plurality of pieces of unit information in the first information, and generating third information indicating a frequency distribution in units of the classified groups; and
evaluation information generating means for calculating, for each piece of unit information in the first information, a score that is higher the smaller the difference between the frequency rank of that unit information in the first information and the frequency rank of the corresponding group in the third information, taking the sum of the scores of the pieces of unit information in the first information as a total score, and generating, as evaluation information that rates the content of the publication more highly the higher the total score, evaluation information indicating the degree of fulfillment of the support requirement.

According to the present invention, an appropriate method of evaluating the quality of a publication such as a specification can be realized.

FIG. 1 is a block diagram showing the hardware configuration of a publication evaluation apparatus as one embodiment of an information processing apparatus to which the present invention is applied.
FIG. 2 is a functional block diagram showing the functional configuration of the publication evaluation apparatus of FIG. 1.
FIG. 3 is a schematic diagram outlining the evaluation information generation process executed by the publication evaluation apparatus having the functional configuration of FIG. 2.
FIG. 4 is a flowchart showing the flow of the evaluation information generation process executed by the publication evaluation apparatus having the functional configuration of FIG. 2.

Embodiments of the present invention are described below with reference to the drawings.

FIG. 1 is a block diagram showing the hardware configuration of a publication evaluation apparatus 1 as one embodiment of the information processing apparatus of the present invention.

The publication evaluation apparatus 1 includes a CPU (Central Processing Unit) 11, a ROM (Read Only Memory) 12, a RAM (Random Access Memory) 13, a bus 14, an input/output interface 15, an input unit 16, an output unit 17, a storage unit 18, a communication unit 19, and a drive 20.

The CPU 11 executes various processes in accordance with a program recorded in the ROM 12 or a program loaded from the storage unit 18 into the RAM 13.
The RAM 13 also stores, as appropriate, data and the like that the CPU 11 needs in order to execute the various processes.

The CPU 11, the ROM 12, and the RAM 13 are connected to one another via the bus 14. The input/output interface 15 is also connected to the bus 14. The input unit 16, the output unit 17, the storage unit 18, the communication unit 19, and the drive 20 are connected to the input/output interface 15.

The input unit 16 consists of a keyboard, a mouse, and the like, and inputs various kinds of information in response to instruction operations by the operator.
The output unit 17 consists of a display, a speaker, and the like, and outputs images and sound.
The storage unit 18 consists of a hard disk, a DRAM (Dynamic Random Access Memory), or the like, and stores various kinds of data.
The communication unit 19 controls communication with other devices (not shown) via a network 4 that includes the Internet.

A removable medium 31, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 20 as appropriate. A program read from the removable medium 31 by the drive 20 is installed in the storage unit 18 as necessary. The removable medium 31 can also store the various kinds of data stored in the storage unit 18, in the same way as the storage unit 18.

In the present embodiment, the publication evaluation apparatus 1 configured in this way evaluates the quality of a specification that can be included in a patent publication, a published application, or a utility model registration publication.
For convenience, the following description refers to patent rights, but it applies in essentially the same way to utility model rights.

To obtain a patent right, the applicant must submit, in addition to the application form, the claims, the specification, any necessary drawings, and the abstract as documents attached to the application.

The claims serve as the title document of the patent right. That is, the technical scope of a patented invention is determined on the basis of the statements in the claims.
The specification serves as a technical document that discloses the content of the invention to third parties.
The statements in the claims must satisfy the requirement that "the invention for which a patent is sought is stated in the detailed description of the invention" (hereinafter, the "support requirement"). In Japan, the support requirement is set out in Article 36, paragraph 6, item 1 of the Patent Act.
Failure to satisfy the support requirement is a ground for refusal in Japan (Article 49, item 4 of the Patent Act). In other words, a patent right cannot be obtained in Japan unless the specification satisfies the support requirement.
Whether the support requirement is satisfied is therefore one of the important factors in the quality of a specification.
Accordingly, the publication evaluation apparatus 1 of the present embodiment generates, based on the frequency distribution of the words in the claims and the frequency distribution of the words in the specification, an evaluation value indicating the degree of fulfillment of the support requirement (hereinafter, the "support information fulfillment index") as evaluation information for evaluating the content of the specification.
Hereinafter, the series of processes up to the generation of the evaluation information (support information fulfillment index) for the patent publication or published application under evaluation is referred to as the "evaluation information generation process".

FIG. 2 is a functional block diagram showing, among the functional components of the publication evaluation apparatus 1, those used to execute the evaluation information generation process.
When the evaluation information generation process is executed, a claim word frequency distribution generation unit 41, a specification word frequency distribution generation unit 42, a specification synonym frequency distribution generation unit 43, and an evaluation information generation unit 44 function in the CPU 11 of the publication evaluation apparatus 1. The evaluation information generation unit 44 includes a weighting unit 51.
A publication information DB 61 is provided in one area of the storage unit 18 of the publication evaluation apparatus 1.

The claim word frequency distribution generation unit 41 separates the content of the claims included in the patent publication or published application under evaluation into words, and generates information indicating the frequency distribution of the separated words (hereinafter, the "claim word frequency distribution").

The specification word frequency distribution generation unit 42 separates the content of the specification included in the patent publication or published application under evaluation into words, and generates information indicating the frequency distribution of the separated words (hereinafter, the "specification word frequency distribution").
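As a minimal sketch of what the two word frequency distributions could look like, the snippet below counts words with Python's `collections.Counter`. The function name `word_frequency_distribution`, the sample sentence, and the regex tokenizer are assumptions made for illustration; the description does not specify how words are separated, and Japanese text would need a morphological analyzer rather than a regex.

```python
import re
from collections import Counter


def word_frequency_distribution(text: str) -> Counter:
    """Split a text into words and count how often each word occurs."""
    # Assumption: a lowercase regex tokenizer stands in for the
    # word-separation step, which the description leaves open.
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(words)


# Hypothetical claim text, used only to exercise the function.
claims_text = (
    "An elastic body that receives stress, "
    "wherein the elastic body deforms under the stress."
)
claim_dist = word_frequency_distribution(claims_text)

# Words ordered from most to least frequent (the frequency ranking).
for word, count in claim_dist.most_common():
    print(word, count)
```

The same function can be applied to the specification text to obtain the second distribution.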

In generating the support information fulfillment index, the claim word frequency distribution could be compared directly with the specification word frequency distribution, but such a comparison is not always meaningful. The reason is explained below.

In addition to the support requirement described above, the detailed description of the invention in the specification must also satisfy the requirement that it be "stated clearly and sufficiently to the extent that a person having ordinary knowledge in the technical field to which the invention belongs can carry it out" (hereinafter, the "enablement requirement"). In Japan, the enablement requirement is set out in Article 36, paragraph 4, item 1 of the Patent Act.
Failure to satisfy the enablement requirement is also a ground for refusal in Japan (Article 49, item 4 of the Patent Act). In other words, a patent right cannot be obtained in Japan unless the specification satisfies the enablement requirement.
Because the claims have the character of a title document, as described above, they are generally written in terms of superordinate concepts as far as possible so as to secure a broad scope of rights. That is, the words that appear in the claim word frequency distribution are often general words for superordinate concepts.
The specification, by contrast, tends to use words that indicate specific technical content in order to satisfy the enablement requirement, that is, subordinate concepts that exemplify the superordinate concepts (the words used in the claims). Within the specification, words indicating specific technical content therefore often appear as the words that explain a superordinate concept.
Put differently, the embodiments in the specification are rarely described using the superordinate-concept words of the claims as they are; they are usually described using words for several subordinate concepts (examples).
For this reason, comparing the claim word frequency distribution with the specification word frequency distribution as they stand may not be a meaningful comparison.

In the present embodiment, therefore, the specification synonym frequency distribution generation unit 43 classifies the words extracted from the specification into a plurality of groups, each corresponding to one of the words (superordinate concepts) in the claims.
The method of classification into groups is not particularly limited. The present embodiment adopts a method in which synonyms are classified into the same group, on the assumption that the superordinate concept of a set of synonyms corresponds to one particular word in the claims.
The specification synonym frequency distribution generation unit 43 generates information indicating a frequency distribution in units of the classified groups (the groups corresponding to the words in the claims). This information is hereinafter referred to as the "specification synonym frequency distribution".
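The grouping step can be sketched as follows, assuming a hand-written mapping from each claim word to the specification words treated as its synonyms or subordinate concepts. The mapping `groups`, the counts, and the function name `synonym_group_distribution` are hypothetical; the description leaves the classification method open.

```python
from collections import Counter


def synonym_group_distribution(spec_dist, groups):
    """Aggregate specification word counts into groups keyed by claim words."""
    # Invert the hypothetical mapping: specification word -> claim word.
    word_to_group = {
        member: claim_word
        for claim_word, members in groups.items()
        for member in members
    }
    grouped = Counter()
    for word, count in spec_dist.items():
        group = word_to_group.get(word)
        if group is not None:  # words outside every group are ignored here
            grouped[group] += count
    return grouped


# Hypothetical specification word counts and synonym groups.
spec_dist = Counter({"stress": 12, "spring": 9, "rubber": 7, "housing": 3})
groups = {"elastic body": {"spring", "rubber"}, "stress": {"stress"}}

group_dist = synonym_group_distribution(spec_dist, groups)
print(group_dist.most_common())
```

Ignoring words that belong to no group is a simplification of this sketch, not something the description prescribes.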

The evaluation information generation unit 44 generates the support information fulfillment index (evaluation information) based on the claim word frequency distribution and the specification synonym frequency distribution.
In the present embodiment, for example, the evaluation information generation unit 44 calculates a similarity by comparing the frequency ranking of the words in the claim word frequency distribution with the frequency ranking of the groups in the specification synonym frequency distribution (each group corresponding to a particular word in the claim word frequency distribution). The evaluation information generation unit 44 generates this similarity, or a value derived from it, as the support information fulfillment index (evaluation information).
Specifically, for example, the evaluation information generation unit 44 calculates, for each word in the claim word frequency distribution, a score based on the difference between its rank and the rank of the corresponding group in the specification synonym frequency distribution, and generates the support information fulfillment index (evaluation information) based on the total of the scores of the words in the claim word frequency distribution.
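One way to write this score down, as a sketch rather than a definition given in the description, is shown below. The symbols are assumptions introduced here: C is the set of claim words, r_C(w) is the frequency rank of word w in the claims, r_G(w) is the frequency rank of its corresponding group in the specification, and I is the index. The linear form of s(w) follows the worked example for FIG. 3 (1 for equal ranks, 0.1 less per rank of difference, never below 0).

```latex
s(w) = \max\bigl(0,\; 1 - 0.1\,\lvert r_C(w) - r_G(w) \rvert\bigr),
\qquad
I = \sum_{w \in C} s(w)
```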

FIG. 3 is a schematic diagram outlining the evaluation information generation process executed by the publication evaluation apparatus 1 having the functional configuration of FIG. 2.

In the example of FIG. 3, in the frequency ranking of the claim word frequency 71, "elastic body" ranks first and "stress" ranks second.
In the frequency ranking of the specification word frequency 72, "stress" ranks first, "spring" second, and "rubber" third. Comparing the claim word frequency 71 with the specification word frequency 72 by frequency rank, the first-ranked words of the claims and the specification differ, and "elastic body", ranked first in the claims, does not appear in the specification ranking at all. If the support information fulfillment index (evaluation information) were generated simply from these ranks, the similarity (degree of agreement) would be judged low and the index could take a low value.
Specifically, suppose for example that a score of 1 is given when the ranks match, 0.9 when the rank difference is 1, and 0.8 when the rank difference is 2, that the score decreases by a further 0.1 for each additional rank of difference, and that once it reaches 0 it stays at 0. Then, for the claim word frequency 71, the first-ranked "elastic body", which is absent from the specification ranking, scores 0, and the second-ranked "stress", whose rank difference is 1, scores 0.9. If the total of these scores is taken as the support information fulfillment index (evaluation information), its value is 0.9.
The index, which ought to be high, comes out low because the superordinate concept "elastic body" has not been associated with its subordinate concepts "spring" and "rubber".
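The direct comparison in this example can be reproduced with a short sketch. The counts are hypothetical and chosen only to give the rankings stated for FIG. 3, and the rule that a claim word missing from the specification ranking scores 0 is an assumption of this sketch.

```python
def ranks(counts):
    """Map each word to its 1-based frequency rank (most frequent first)."""
    ordered = sorted(counts, key=counts.get, reverse=True)
    return {word: position + 1 for position, word in enumerate(ordered)}


def rank_score(diff):
    """1 for equal ranks, 0.1 less per rank of difference, never below 0."""
    return max(0, 10 - diff) / 10


# Hypothetical counts that yield the rankings described for FIG. 3.
claim_counts = {"elastic body": 5, "stress": 3}
spec_counts = {"stress": 12, "spring": 9, "rubber": 7}

claim_rank = ranks(claim_counts)
spec_rank = ranks(spec_counts)

naive_total = 0.0
for word, rank in claim_rank.items():
    if word in spec_rank:
        naive_total += rank_score(abs(rank - spec_rank[word]))
    # Assumption: a claim word absent from the specification ranking adds 0.

print(naive_total)
```

With these inputs the total equals the 0.9 quoted in the example.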

In the present embodiment, therefore, the specification synonym frequency distribution generation unit 43 classifies "spring" and "rubber", extracted from the specification, as belonging to the group corresponding to "elastic body" in the claims.
The result of this classification is the specification synonym frequency 73. In the frequency ranking of the specification synonym frequency 73, first place goes to the group to which "spring" and "rubber" belong, that is, the group corresponding to "elastic body" in the claims. Second place goes to the group to which "stress" belongs, that is, the group corresponding to "stress" in the claims.
In this case the ranks in the claim word frequency 71 and the specification synonym frequency 73 match, so a support information fulfillment index (evaluation information) generated from these ranks takes a high value, reflecting a high similarity (degree of agreement).
Specifically, with the scoring described above, for the claim word frequency 71 the first-ranked "elastic body" scores 1 and the second-ranked "stress" also scores 1. The total of these scores, the high value of 2, therefore becomes the support information fulfillment index (evaluation information).
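The grouped comparison can be sketched in the same way. The group counts below are hypothetical (spring and rubber merged under "elastic body"); only the resulting ranks matter.

```python
def ranks(counts):
    """Map each key to its 1-based frequency rank (most frequent first)."""
    ordered = sorted(counts, key=counts.get, reverse=True)
    return {key: position + 1 for position, key in enumerate(ordered)}


def rank_score(diff):
    """1 for equal ranks, 0.1 less per rank of difference, never below 0."""
    return max(0, 10 - diff) / 10


# Hypothetical counts: claim words, and specification groups keyed by claim word.
claim_counts = {"elastic body": 5, "stress": 3}
group_counts = {"elastic body": 16, "stress": 12}

claim_rank = ranks(claim_counts)
group_rank = ranks(group_counts)

# Per-word score from the rank difference, then the total as the index.
scores = {
    word: rank_score(abs(rank - group_rank[word]))
    for word, rank in claim_rank.items()
    if word in group_rank
}
index = sum(scores.values())
print(scores, index)
```

Both claim words now match their group's rank, so each contributes the full score and the total is 2, as in the example.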

Both the claim word frequency 71 and the specification synonym frequency 73 are distributions based simply on how often words appear. A support information fulfillment index (evaluation information) obtained by comparing such simple appearance frequencies may not always be highly reliable.
For this reason, the evaluation information generation unit 44 of the present embodiment is provided with the weighting unit 51.

For example, when the score based on the frequency ranking described above is used, the evaluation information generation unit 44 can weight the score according to the rank of each word in the claim word frequency 71.
Even when ranks match, a match at first place contributes more to the overall similarity than a match at tenth place.
The weighting unit 51 therefore weights the score so that it is higher for a match at first place and lower for a match at tenth place. For example, the score is weighted so that a match at first place scores 2 while a match at tenth place scores 0.2.
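A rank-dependent weight of this kind could look like the sketch below. The weight function 2 / rank is an assumption chosen only because it reproduces the two values given in the text (2 at first place, 0.2 at tenth place); the description does not state the weighting formula.

```python
def rank_score(diff):
    """1 for equal ranks, 0.1 less per rank of difference, never below 0."""
    return max(0, 10 - diff) / 10


def rank_weight(claim_rank):
    # Hypothetical weight: 2 / rank gives 2.0 at rank 1 and 0.2 at rank 10.
    return 2 / claim_rank


def weighted_score(claim_rank, group_rank):
    """Base score from the rank difference, scaled by the claim-side rank."""
    return rank_score(abs(claim_rank - group_rank)) * rank_weight(claim_rank)


first_place_match = weighted_score(1, 1)
tenth_place_match = weighted_score(10, 10)
print(first_place_match, tenth_place_match)
```

Any decreasing function of the claim-side rank would serve the same purpose; the reciprocal is just the simplest one that fits the quoted numbers.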

As another example, the evaluation information generation unit 44 can weight the score of each word in the claim word frequency 71 according to how closely it is related to the words belonging to the corresponding group in the specification synonym frequency 73.
For example, the group in the specification synonym frequency 73 that corresponds to "elastic body" in the claims contains "spring" and "rubber". These are subordinate concepts of "elastic body", but the word "elastic body" itself is not in the group.
By contrast, the group in the specification synonym frequency 73 that corresponds to "stress" in the claims contains the word "stress" itself.
The weighting unit 51 therefore treats "stress" in the claims as having more closely related words in its corresponding group than "elastic body" in the claims, and gives it a greater weight.
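A relatedness weight along these lines might be sketched as follows. The two weight values (1.0 when the claim word itself appears in its group, 0.8 otherwise) and the function name `relatedness_weight` are hypothetical; the text only says that the more closely related case should receive the greater weight.

```python
def relatedness_weight(claim_word, group_members, exact=1.0, related=0.8):
    """Return a larger weight when the claim word itself is in its group."""
    # Assumption: relatedness is reduced to "exact word present or not".
    return exact if claim_word in group_members else related


# Groups from the FIG. 3 example: specification words per claim word.
groups = {"elastic body": {"spring", "rubber"}, "stress": {"stress"}}

weights = {
    claim_word: relatedness_weight(claim_word, members)
    for claim_word, members in groups.items()
}
print(weights)
```

A finer-grained measure of relatedness could replace the binary test without changing how the weight is applied to the score.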

A support information fulfillment index (evaluation information) generated after weighting of these kinds can be expected to reflect the actual situation more closely. That is, weighting makes it possible to generate a more accurate support information fulfillment index (evaluation information).

In the claim word frequency 71, words that do not characterize the patented invention, for example common words such as "said" shown in FIG. 3, may rank high in the frequency ranking. Such words are best removed as noise information.
That is, the evaluation information generation unit 44 removes noise information (for example, the word "said") from the words in the claim word frequency 71, and generates the support information fulfillment index (evaluation information) based on the total of the scores of the words that remain after the noise information has been removed.
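Noise removal can be sketched as a filter applied before ranking. The noise list and the counts are hypothetical; the description names only "said" as an example of noise information.

```python
# Hypothetical noise list; the description gives "said" as one example.
NOISE_WORDS = {"said"}


def remove_noise(counts, noise=NOISE_WORDS):
    """Drop noise words from a word-count mapping."""
    return {word: n for word, n in counts.items() if word not in noise}


def ranking(counts):
    """Words ordered from most to least frequent."""
    return sorted(counts, key=counts.get, reverse=True)


# Hypothetical claim counts in which a common word outranks the technical terms.
claim_counts = {"said": 9, "elastic body": 5, "stress": 3}

raw_ranking = ranking(claim_counts)
clean_ranking = ranking(remove_noise(claim_counts))
print(raw_ranking, clean_ranking)
```

After filtering, the technical terms move up to first and second place, so their rank differences are no longer distorted by the common word.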

A support information fulfillment index (evaluation information) generated after noise removal can likewise be expected to reflect the actual situation more closely. That is, removing noise information makes it possible to generate a more accurate support information fulfillment index (evaluation information).

FIG. 4 is a flowchart showing the flow of the evaluation information generation process executed by the publication evaluation apparatus 1 having the functional configuration of FIG. 2.

In step S1, the claim word frequency distribution generation unit 41 generates the claim word frequency distribution.
In step S2, the specification word frequency distribution generation unit 42 generates the specification word frequency distribution.
In step S3, the specification synonym frequency distribution generation unit 43 generates the specification synonym frequency distribution.
In step S4, the evaluation information generation unit 44 generates the support information fulfillment index (evaluation information) based on the claim word frequency distribution and the specification synonym frequency distribution.
The evaluation information generation process then ends.
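Steps S1 to S4 can be strung together in a compact sketch. Everything concrete here is an assumption made for illustration: the regex tokenizer, the noise list, the sample texts, the synonym groups, and the decision to rank only those claim words that have a group.

```python
import re
from collections import Counter

# Hypothetical noise list applied to both documents.
NOISE = {"a", "an", "and", "by", "is", "said", "the", "then", "under"}


def word_dist(text):
    """Tokenize with a simple regex and count the non-noise words."""
    return Counter(w for w in re.findall(r"[a-z]+", text.lower()) if w not in NOISE)


def ranks(dist):
    """1-based frequency ranks; ties are broken alphabetically."""
    ordered = sorted(dist, key=lambda w: (-dist[w], w))
    return {w: i + 1 for i, w in enumerate(ordered)}


def evaluate(claims_text, spec_text, groups):
    claim_dist = word_dist(claims_text)              # step S1
    spec_dist = word_dist(spec_text)                 # step S2
    word_to_group = {m: g for g, members in groups.items() for m in members}
    group_dist = Counter()                           # step S3
    for word, count in spec_dist.items():
        if word in word_to_group:
            group_dist[word_to_group[word]] += count
    # Simplification: only claim words that have a group are ranked.
    claim_rank = ranks({w: c for w, c in claim_dist.items() if w in groups})
    group_rank = ranks(group_dist)
    total = 0.0                                      # step S4
    for word, rank in claim_rank.items():
        if word in group_rank:
            total += max(0, 10 - abs(rank - group_rank[word])) / 10
    return total


claims_text = "An elastomer is deformed by stress. Said elastomer then recovers."
spec_text = (
    "The spring is compressed under stress. "
    "The rubber and the spring absorb stress. The spring then recovers."
)
groups = {"elastomer": {"elastomer", "spring", "rubber"}, "stress": {"stress"}}

index = evaluate(claims_text, spec_text, groups)
print(index)
```

Weighting and a richer synonym source would slot into step S4 and step S3 respectively without changing the overall flow.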

Although an embodiment of the present invention has been described above, the present invention is not limited to that embodiment. The effects described in the embodiment are merely a list of the most favorable effects arising from the present invention, and the effects of the present invention are not limited to those described in the embodiment.

For example, in the embodiment described above, words were extracted from the claims of a patent publication or published application to generate one frequency distribution, and words were extracted from the specification to generate another.
The invention is not limited to this, however. For example, characters, figures, symbols, or combinations of these (hereinafter, "characters and the like") may be extracted from the specification, the drawings, or the abstract.
Furthermore, although the embodiment described above dealt with publications relating to patents, there is no particular need to limit the invention to patents. The present invention can be applied to the evaluation of publications in general that relate to intellectual property rights granted on the premise that two or more kinds of documents containing characters and the like are submitted.

Also, for example, in the above-described embodiment the unit information of the frequency distribution is a word. However, the unit information is not limited to this, and any unit information consisting of characters, figures, symbols, or combinations thereof can be adopted.

In other words, an information processing apparatus to which the present embodiment is applied only needs to have the following configuration, and can take a wide variety of embodiments.

That is, the information processing apparatus of the present invention includes:
first information generating means (for example, the claim word frequency distribution generation unit 41 in FIG. 2) for separating the content of a first document (for example, the claims) that can be included in a publication relating to an intellectual property right into predetermined unit information (for example, words) consisting of characters, figures, symbols, or combinations thereof, and generating first information (for example, a claim word frequency distribution) indicating the frequency distribution of the unit information after separation;
second information generating means (for example, the specification word frequency distribution generation unit 42 in FIG. 2) for separating the content of a second document (for example, the specification) that can be included in the publication into the unit information (for example, words), and generating second information (for example, a specification word frequency distribution) indicating the frequency distribution of the unit information after separation; and
evaluation information generating means for generating, based on the first information and the second information, evaluation information that evaluates the content of the publication.
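As a rough illustration of this configuration, the following Python sketch builds the two frequency distributions and derives a simple evaluation value from them. It is only a sketch under stated assumptions: the function names, the regular-expression word splitter, and the coverage ratio used as the evaluation value are all hypothetical and are not taken from the publication. Japanese text in particular would need a proper word segmenter in place of the splitter shown here.

```python
import re
from collections import Counter


def split_words(text):
    """Hypothetical unit splitter: lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def frequency_distribution(text, split=split_words):
    """Separate a document into unit information and count each unit."""
    return Counter(split(text))


def coverage(first_info, second_info):
    """Assumed evaluation value: the share of distinct units in the first
    document that also occur at least once in the second document."""
    if not first_info:
        return 0.0
    hits = sum(1 for unit in first_info if second_info.get(unit, 0) > 0)
    return hits / len(first_info)


claims_text = "A device comprising a sensor and a controller"
spec_text = "The sensor sends data. The controller reads the sensor data."

first_info = frequency_distribution(claims_text)   # first information
second_info = frequency_distribution(spec_text)    # second information
evaluation = coverage(first_info, second_info)     # evaluation information
```

Passing a different `split` function (for example, one that returns character n-grams) changes the unit information without touching the rest of the sketch.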

By adopting such an information processing apparatus, publications relating to intellectual property rights can be evaluated more appropriately.

Note that "the content of the first document and the content of the second document that can be included in a publication" need not be taken from a certified copy of the publication. The phrase is also meant to cover the contents of the first and second documents that would be included in the publication if one were issued.
In other words, issuance of a publication is not essential. Information indicating the contents of the first and second documents may exist even before filing, or after filing but before publication, and the wording "can be included in a publication" is used to cover these cases as well.
Accordingly, the contents of the first and second documents that can be included in a publication also include, for example, the contents of copy data held by the applicant for the "claims" and "specification" that have been filed but not yet published, and the contents of documents (for example, an invention report prepared by the applicant) that describe the contents of the "claims" and "specification" before filing.
Likewise, the second user need not be a candidate implementing business, and may be, for example, a person in charge of intellectual property on the applicant's side.
Thus, for example, a person in charge of intellectual property at the patent applicant (a company) can evaluate a draft specification before filing, and can therefore rewrite a poorly rated draft so that it rates highly and then file the application.

Furthermore, such evaluation information can be applied to other fields such as patent analysis. For example, it can be used to correct (weight) a patent evaluation index based on the number of citations received.
Also, for example, when searching for licensing candidates based on patent analysis, patents whose specifications have high evaluation information (a high support information fulfillment index) can be presented preferentially to prospective licensees and the like as patents with a "high matching potential".

The series of processes described above can be executed by hardware or by software.
In other words, the functional configuration of FIG. 3 is merely an example and is not limiting. That is, it is sufficient that the publication evaluation apparatus 1 has functions capable of executing the above-described series of processes as a whole, and the functional blocks used to realize those functions are not limited to the example of FIG. 2.
In addition, one functional block may be constituted by hardware alone, by software alone, or by a combination thereof.

When the series of processes is executed by software, a program constituting the software is installed on a computer or the like from a network or a recording medium.
The computer may be a computer incorporated in dedicated hardware. Alternatively, the computer may be one that can execute various functions by installing various programs, for example a general-purpose personal computer.

The recording medium containing such a program is constituted not only by the removable medium 31 of FIG. 1, which is distributed separately from the apparatus main body in order to provide the program to the user, but also by a recording medium or the like that is provided to the user already incorporated in the apparatus main body. The removable medium 31 is composed of, for example, a magnetic disk (including a floppy disk), an optical disk, or a magneto-optical disk. The optical disk is composed of, for example, a CD-ROM (Compact Disk-Read Only Memory) or a DVD (Digital Versatile Disk). The magneto-optical disk is composed of an MD (Mini-Disk) or the like. The recording medium provided to the user already incorporated in the apparatus main body is composed of, for example, the ROM 12 of FIG. 1 in which the program is recorded, or a hard disk included in the storage unit 18.

In this specification, the steps describing the program recorded on the recording medium include not only processes performed in time series in the described order, but also processes that are executed in parallel or individually and not necessarily in time series.

DESCRIPTION OF SYMBOLS: 1 ... publication evaluation apparatus, 11 ... CPU, 12 ... ROM, 13 ... RAM, 14 ... bus, 15 ... input/output interface, 16 ... input unit, 17 ... output unit, 18 ... storage unit, 19 ... communication unit, 20 ... drive, 31 ... removable medium, 41 ... claim word frequency distribution generation unit, 42 ... specification word frequency distribution generation unit, 43 ... specification synonym frequency distribution generation unit, 44 ... evaluation information generation unit, 51 ... weighting unit, 61 ... publication information DB

Claims

An information processing apparatus comprising:
first information generating means for separating the content of a first document that can be included in a publication relating to an intellectual property right into predetermined unit information consisting of characters, figures, symbols, or combinations thereof, and generating first information indicating the frequency distribution of the unit information after separation;
second information generating means for separating the content of a second document that can be included in the publication into the unit information, and generating second information indicating the frequency distribution of the unit information after separation;
third information generating means for classifying each piece of unit information in the second information into a plurality of groups, each corresponding to one of the plurality of pieces of unit information in the first information, and generating third information indicating a frequency distribution whose units are the groups after classification; and
evaluation information generating means for assigning, to each piece of unit information in the first information, a score that is higher as the difference between the frequency rank of that unit information in the first information and the frequency rank of the corresponding group in the third information is smaller, and for generating, as evaluation information that rates the content of the publication more highly as the total of the scores is higher, evaluation information indicating the degree to which the support requirement is fulfilled.
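The grouping and rank-based scoring recited above can be illustrated with a short Python sketch. This is an illustration under assumptions, not the claimed implementation: the hand-written synonym groups, the alphabetical tie-breaking rule, and the per-unit score formula (number of units minus the absolute rank difference) are hypothetical choices made only so that a smaller rank difference gives a higher score.

```python
def group_distribution(first_info, second_info, groups):
    """Third information: for each unit of the first information, sum the
    frequencies of the second-document units assigned to its group.
    `groups` is a hypothetical mapping from a claim unit to its synonyms."""
    return {
        unit: sum(second_info.get(w, 0) for w in groups.get(unit, {unit}))
        for unit in first_info
    }


def ranks(dist):
    """Frequency rank of each unit (1 = most frequent).
    Ties are broken alphabetically, which is an assumption."""
    ordered = sorted(dist, key=lambda u: (-dist[u], u))
    return {u: i + 1 for i, u in enumerate(ordered)}


def support_index(first_info, third_info):
    """Total score over the units of the first information.
    Assumed per-unit score: n - |rank difference|."""
    r1, r3 = ranks(first_info), ranks(third_info)
    n = len(r1)
    return sum(n - abs(r1[u] - r3[u]) for u in r1)


first_info = {"sensor": 3, "controller": 2, "housing": 1}
second_info = {"sensor": 5, "detector": 4, "controller": 1,
               "cpu": 1, "case": 6, "the": 20}
groups = {
    "sensor": {"sensor", "detector"},
    "controller": {"controller", "cpu"},
    "housing": {"housing", "case"},
}

third_info = group_distribution(first_info, second_info, groups)
index = support_index(first_info, third_info)
```

In this example the groups total 9, 2, and 6 occurrences, so "housing" and "controller" swap ranks relative to the claims and the index is 7 out of a possible 9.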