JP2018026119A

JP2018026119A - Classification system, control method of classification system, and program

Info

Publication number: JP2018026119A
Application number: JP2017141042A
Authority: JP
Inventors: Ryo O; Reo Kato; Satoshi Naganuma; Ryutaro Tajima; Hidehisa Hiromoto; Seiji Takano
Original assignee: Nomura Research Institute Ltd
Current assignee: Nomura Research Institute Ltd
Priority date: 2016-07-29
Filing date: 2017-07-20
Publication date: 2018-02-15
Anticipated expiration: 2037-07-20
Also published as: JP7038499B2

Abstract

PROBLEM TO BE SOLVED: To address the problems that the work of assigning in-house classifications is not performed efficiently and that feedback on classification results cannot be used efficiently.

SOLUTION: A classification system comprises a model construction unit that constructs a learning model based on teacher data indicating a relationship between input data relating to at least one publication and output data indicating at least one classification assigned to the at least one publication corresponding to the input data. The classification system inputs processing target data into the learning model and provides information relating to the processing target data to a user terminal according to the result output from the learning model. When teacher information is received from the user terminal, the classification system submits teacher data according to the teacher information to the model construction unit.

SELECTED DRAWING: Figure 2

Description

The present invention relates to a technique for classifying publications such as patent gazettes (including published application gazettes and granted patent gazettes).

Patent publications are assigned classification numbers. For example, a patent publication is issued after an International Patent Classification (IPC) code has been assigned to it. Patent Document 1 discloses a technique that, when a user specifies an IPC code, displays the titles and abstracts of patent publications to which that code is assigned, making it easier for the user to understand the classification.

While the International Patent Classification is used in common worldwide, users cannot freely define its classification scheme. To make patent searches easier even for research-and-development and engineering departments that are unfamiliar with them, companies assign their own classifications (referred to as in-house classifications) to their own and other companies' patent publications.

Patent Document 1: JP 2007-233910 A

In-house classifications are assigned by a person in charge of classification who, for example, visually checks newly issued patent publications and enters the classifications manually. This work is burdensome, and classifications are sometimes assigned in error. In addition, feedback that a classification is incorrect may be passed from, for example, a development department to the person in charge of classification, but it is not necessarily reflected in subsequent classification work.

When an in-house classification is assigned, a qualitative evaluation of the target publication may be assigned along with it, and such an evaluation may also be assigned in error. Moreover, the qualitative evaluation is not always performed together with the in-house classification; it may be performed independently.

A classification system according to the present invention includes: a model construction unit that constructs a learning model based on teacher data indicating a relationship between input data relating to at least one publication and output data indicating at least one classification assigned to the at least one publication corresponding to the input data; a processing target data input unit that inputs processing target data relating to at least one publication into the learning model as the input data; a providing unit that provides information relating to the processing target data to a user terminal according to a result output from the learning model in response to the input of the processing target data; a receiving unit that receives, from the user terminal, teacher information from which the teacher data is derived; and a teacher data input unit that submits the teacher data corresponding to the teacher information to the model construction unit.

According to the present invention, the work of assigning in-house classifications can be performed efficiently, and feedback on classification results can be used efficiently. Furthermore, the work of assigning qualitative evaluations can also be performed efficiently.

A diagram explaining an outline of the classification work.
A diagram showing an example of a configuration including the classification system.
A diagram showing an example of the configuration of the learning model.
A diagram explaining teacher data.
A diagram showing an example of a flowchart of processing performed by the classification system.
A diagram showing an example of a UI screen on the user terminal.
A diagram showing an example of classification data with evaluations stored in the classified DB.
A diagram explaining values used in the conversion table.
A diagram explaining another example of teacher data.
A diagram showing another example of a UI screen on the user terminal.
A diagram explaining another example of teacher data.
A diagram showing another example of a UI screen on the user terminal.
A diagram showing another example of a UI screen on the user terminal.

Embodiments of the present invention are described in detail below with reference to the drawings. The configurations described in the following embodiments are merely examples, and the present invention is not limited to the illustrated configurations.

<Usage of in-house classification>
Before describing the embodiments, the way in-house classification is currently used is described first. FIG. 1 is a diagram showing an example of how in-house classification is used. In-house classification is used in workflows such as patent searches and the watching of patent information.

In the first scene W1, a person in charge of searching sets a predetermined search formula and distribution destinations. For example, the search conditions used in an SDI (Selective Dissemination of Information) search and their distribution destinations are set. An SDI search applies search conditions set in advance to newly issued publications. The search results are distributed automatically to the set destinations. Because not every newly issued publication needs to be classified, in the first scene W1 the new publications are roughly narrowed down using predetermined search conditions such as an international search classification. Here, "publication" includes various publications such as published application gazettes and granted patent gazettes for patents, utility models, designs, and trademarks. It may include not only publications issued by a patent office but also periodically issued technical documents and technical papers. The target may be anything, as long as classifications are assigned to some kind of document.

In the second scene W2, the person in charge of classification assigns in-house classifications to the new publications. In-house classifications can be defined freely by the user company or the like. For example, publications can be classified by development department or by product. Hierarchical classification, such as major, middle, and minor classifications, is also possible. In general, in-house classification is used to sort new publications. Sorting new publications is of course only one example, and in-house classification can be used in a variety of ways. A new publication to which an in-house classification has been assigned is sent to the corresponding department. More than one in-house classification may be assigned. For example, two in-house classifications, classification C1 and classification C2, may be assigned to a publication P1.

In the third scene W3, the person in charge in each department checks the new publications that have been sent. If a classification is wrong at this point, the person in charge of classification is notified that it is wrong. The person in charge in each department also checks the new publications and assigns an evaluation (weighting) to each of them. The user company or the like can define various evaluations (weightings) of its own, such as a value indicating the degree of relevance to the company's own technology or a value indicating the degree of possible infringement.

In the fourth scene W4, the person in charge of searching watches the progress of each publication according to the evaluation (weighting) assigned by each department and, as necessary, consults with the department in charge.

The above describes the workflow for in-house classification as it is generally performed.

<Embodiment 1>
This embodiment describes a form in which the classification work of, for example, the second scene W2 of the workflow shown in FIG. 1 is handled by the classification system described below. The classification system described in this embodiment need not be used in a workflow such as the one shown in FIG. 1 and can be used in any form.

<Configuration>
FIG. 2 is a diagram showing an example of the configuration of the classification system 200 according to the embodiment. The classification system 200 includes a model construction unit 201, a learning model 202, a teacher data input unit 203, a processing target data input unit 204, an output result providing unit 205, a provision destination database (DB) 206, a teacher information receiving unit 207, a teacher information integration unit 208, a teacher information reflection unit 209, and a classified DB 210. FIG. 2 merely shows one example of the configuration, and other components may be included. Not all of the components shown in FIG. 2 are necessarily essential. The classification system 200 may be configured to communicate with the publication DB 250 through a network.

The classification system 200 can be implemented as an information processing apparatus. The information processing apparatus may include a CPU, a memory, an HDD, and a network interface. A program stored in the HDD may be read temporarily into the memory and executed by the CPU, so that the CPU functions as each of the units shown in FIG. 2. At least some of the units shown in FIG. 2 may be implemented by a plurality of information processing apparatuses connected to one another through various networks. At least some of the units shown in FIG. 2 (for example, the model construction unit 201 and the learning model 202) may be implemented by distributed processing across a plurality of information processing apparatuses.

The classification system 200 is configured to communicate with a terminal used by a user (hereinafter referred to as the user terminal 261). Through the network, it may provide search results to the user terminal 261 and receive teacher information, described later, from the user terminal 261.

The user terminal 261 may be any type of terminal, such as a personal computer, a tablet, or a mobile terminal. The classification system 200 manages user logins and may provide the user terminal 261 with search conditions, search results, and the like that are specific to the logged-in user.

The classification system 200 of this embodiment may be configured differently for each user company. If company B, which differs from company A, were to use a learning model 202 built from teacher data that company A customized for itself, company B would not obtain the results it wants. Likewise, letting company C, a rival of company A, use the learning model 202 that company A customized for itself would be contrary to company A's intentions and inappropriate from the standpoint of information protection. In the classification system 200 of this embodiment, the learning model 202 is therefore customized for each user company. The user company is not limited to a company and may be any organization or group.

The model construction unit 201 constructs the learning model 202. In this specification, "construction" is used as a concept that includes generating a new model, changing or updating an already generated model, and replacing an existing model with a newly generated one. For example, when no learning model has been constructed, the model construction unit 201 may generate a new model. When a learning model has already been constructed, the model construction unit 201 may construct a new learning model by changing or updating part of the configuration of that model. The model construction unit 201 may also generate a new learning model and replace the already constructed learning model with it.

The model construction unit 201 may construct, as the learning model 202, a neural network such as the one shown in FIG. 3. The neural network includes an input layer 310, an intermediate layer 320, and an output layer 330. Each layer consists of a plurality of nodes. FIG. 3 shows an example in which the input layer 310 includes nodes 311, 312, and 313, the intermediate layer 320 includes nodes 321, 322, and 323, and the output layer 330 includes nodes 331, 332, and 333. The number of nodes in each layer is not limited to the illustrated example, and a layer generally consists of many nodes. The intermediate layer 320 is not limited to a single layer and may consist of a plurality of layers. A weight is set for each node, and the value obtained by multiplying the input data by the weight is passed on to the nodes of the following layer.

In supervised learning, for example, input data and the output data corresponding to that input data are prepared as teacher data. The model construction unit 201 repeatedly adjusts the weights set for the nodes so that the result output for the input data becomes equal to the corresponding output data. In this way, the model construction unit 201 constructs the learning model 202. When data to be processed is subsequently input to the learning model 202, the learning model 202 outputs a result according to the constructed model.
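The repeated weight adjustment described above can be illustrated with a minimal sketch. It assumes a single layer of sigmoid output nodes trained by plain gradient descent, which is a simplification and not the network of FIG. 3; the function names, the learning rate, the epoch count, and the toy feature vectors are all hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def predict(weights, bias, features):
    """Return one score in (0, 1) per output node (one node per classification)."""
    return [
        sigmoid(sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(weights, bias)
    ]


def train(samples, n_features, n_classes, epochs=500, lr=0.5):
    """Repeatedly nudge the weights so that predictions approach the targets."""
    weights = [[0.0] * n_features for _ in range(n_classes)]
    bias = [0.0] * n_classes
    for _ in range(epochs):
        for features, targets in samples:
            outputs = predict(weights, bias, features)
            for c in range(n_classes):
                # Difference between the model output and the teacher output.
                error = outputs[c] - targets[c]
                for j in range(n_features):
                    weights[c][j] -= lr * error * features[j]
                bias[c] -= lr * error
    return weights, bias


# Toy teacher data (assumed): feature vector -> targets for (C1, C2, C3).
samples = [
    ([1.0, 0.0, 1.0], [1, 0, 1]),
    ([0.0, 1.0, 0.0], [0, 1, 0]),
]
weights, bias = train(samples, n_features=3, n_classes=3)
scores = predict(weights, bias, [1.0, 0.0, 1.0])
```

After training, the scores for the first sample move toward its targets, which is the behavior the paragraph describes; a real system would use an established machine learning library rather than this hand-written loop.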

In this embodiment, the model construction unit 201 may construct the learning model 202 by executing a known machine learning process. For example, the model construction unit 201 may use a CNN (Convolutional Neural Network) or an RNN (Recurrent Neural Network). Other methods may also be used. The learning model 202 may be constructed with an SVM (Support Vector Machine) instead of a neural network. In this embodiment, the learning model 202 itself may be of any type.

The teacher data input unit 203 submits teacher data to the model construction unit 201. In the initial stage, the teacher data input unit 203 may submit data obtained from the classified DB 210 and the publication DB 250 as teacher data.

The publication DB 250 is, for example, a DB that stores all available issued publications. As described above, the publication DB 250 includes various publications such as published application gazettes and granted patent gazettes for patents, utility models, designs, and trademarks, and may include not only publications issued by a patent office but also periodically issued technical documents and technical papers. The publication DB 250 therefore need not be a single DB and may be a collection of DBs. The publication DB 250 may include DBs of publications from other countries as well as Japanese publications.

The classified DB 210 stores, for example, data that associates the identification information of a publication (for example, its application number or publication number) with the in-house classifications assigned to that publication. For example, it stores data indicating that classification C1 and classification C2 are assigned to publication P10.

The teacher data input unit 203 may submit teacher data such as that shown in FIG. 4 as the initial teacher data. The teacher data may be data indicating the relationship between input data relating to at least one publication and output data indicating at least one classification assigned to the at least one publication corresponding to the input data. More specifically, the output data may be data indicating the degree of approximation of each of the classifications that can be assigned as in-house classifications. FIG. 4 is a diagram explaining an example of teacher data. In the example of FIG. 4, the classifications that can be assigned as in-house classifications are assumed to be classification C1, classification C2, and classification C3. In the teacher data shown in FIG. 4, the text data of the abstract of publication P1 is prepared as the input data, and the corresponding output data contains a value of 1 for the degree of approximation of classification C1 to publication P1, a value of 0 for classification C2, and a value of 1 for classification C3. In this embodiment, a value of 1 indicates the closest approximation. In other words, the teacher data indicates that publication P1 should be classified into classification C1 and classification C3. A value of 0, on the other hand, indicates the least approximation. In other words, the teacher data indicates that publication P1 should not be classified into classification C2.
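The teacher data of FIG. 4 can be pictured as a text input paired with one 0/1 value per candidate classification. The following is a minimal sketch of that shape; the `TeacherRecord` type, the `make_teacher_record` helper, and the placeholder abstract text are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass

# Candidate in-house classifications, in the fixed order used for the targets.
CLASSES = ("C1", "C2", "C3")


@dataclass(frozen=True)
class TeacherRecord:
    publication_id: str
    input_text: str   # e.g. the abstract text of the publication
    targets: tuple    # one degree-of-approximation value per entry in CLASSES


def make_teacher_record(publication_id, input_text, assigned, classes=CLASSES):
    """Encode the assigned classifications as 1 and all others as 0."""
    targets = tuple(1 if name in assigned else 0 for name in classes)
    return TeacherRecord(publication_id, input_text, targets)


record = make_teacher_record("P1", "abstract text of publication P1", {"C1", "C3"})
```

For publication P1 with classifications C1 and C3, `record.targets` is `(1, 0, 1)`, matching the values described for FIG. 4.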

The model construction unit 201 thus constructs the learning model 202 using teacher data such as that shown in FIG. 4. For example, the model construction unit 201 may change the weights of the nodes of the neural network that makes up the learning model 202. When publication P1 is input as processing target data to the learning model 202 constructed in this way, data indicating the degrees of approximation of classifications C1 and C2 is obtained as output. If the teacher data of FIG. 4 is used as the initial teacher data, inputting publication P1 to the learning model 202 as processing target data yields degrees of approximation for classification C1 and classification C3 that are extremely close to 1. When another publication P11 is input to the learning model 202 as processing target data, values indicating the degrees of approximation of classification C1, classification C2, and classification C3 are output.

Although FIG. 4 shows an example that uses the text data of the abstract as input data, the input data is not limited to this. It may be the text data of the claims or the text data of the full specification. The user may also enter an arbitrary text document. Bibliographic data or image data may also be used as input data. Part of any of these kinds of data may be used, or several kinds of data may be combined.

As the initial teacher data, the teacher data input unit 203 may acquire the data needed for the input data (for example, the text data of a publication's abstract) from the publication DB 250 and the data needed for the output data (for example, data indicating the classifications assigned to a particular publication) from the classified DB 210. If the data needed for the input data is stored in the classified DB 210 (for example, if the abstract in the example of FIG. 4 is also stored in the classified DB 210), the teacher data input unit 203 may use teacher data obtained without referring to the publication DB 250.

The teacher data may include additional information indicating which part of the publication (for example, the bibliographic information, abstract, claims, or drawings) was used for learning. The processing target data input unit 204 may refer to this additional information and use the corresponding part of the publication input as processing target data as the input data.
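One way to read this is that the same publication sections used for learning are picked out again at classification time. The sketch below assumes the additional information is a list of section names stored under a hypothetical `"sections"` key; the field names and sample texts are illustrative only.

```python
def build_model_input(publication, additional_info):
    """Join the sections of the publication that were used for learning."""
    parts = [
        publication[name]
        for name in additional_info["sections"]
        if name in publication
    ]
    return "\n".join(parts)


# Hypothetical publication record and additional information.
publication = {
    "bibliography": "bibliographic text",
    "abstract": "abstract text",
    "claims": "claims text",
}
additional_info = {"sections": ["abstract", "claims"]}
model_input = build_model_input(publication, additional_info)
```

Keeping the section list alongside the teacher data means learning and classification draw on the same parts of a publication.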

The processing target data input unit 204 acquires new arrival information from the publication DB 250. It may acquire the new arrival information at a predetermined timing. For example, it may acquire new arrival information each time the publication DB 250 is updated. It may also acquire new arrival information at predetermined intervals from the previous acquisition, such as weekly or monthly. The new arrival information may consist of the publications issued since the processing target data input unit 204 last acquired new arrival information. The processing target data input unit 204 inputs the publications identified by the new arrival information to the learning model 202 as processing target data. The processing target data input unit 204 may use, as processing target data, publications narrowed down from the new arrival information by predetermined search conditions.
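The selection of new arrivals can be sketched as a filter on the issue date followed by an optional search condition. The record layout, the `select_new_arrivals` name, and the use of an IPC prefix as the search condition are assumptions made for this example, not details given in the text.

```python
from datetime import date


def select_new_arrivals(publications, last_fetched, condition=None):
    """Keep publications issued after the previous fetch that meet the condition."""
    selected = []
    for pub in publications:
        if pub["published"] <= last_fetched:
            continue  # already covered by an earlier fetch
        if condition is not None and not condition(pub):
            continue  # outside the predetermined search conditions
        selected.append(pub)
    return selected


# Hypothetical contents of the publication DB.
publications = [
    {"id": "P30", "published": date(2018, 2, 1), "ipc": "G06F"},
    {"id": "P31", "published": date(2018, 2, 15), "ipc": "G06N"},
    {"id": "P32", "published": date(2018, 2, 15), "ipc": "A01B"},
]
new_items = select_new_arrivals(
    publications,
    last_fetched=date(2018, 2, 8),
    condition=lambda pub: pub["ipc"].startswith("G06"),
)
```

Here only P31 remains: P30 predates the last fetch and P32 fails the search condition.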

The processing target data input unit 204 may also input any publication other than those identified by the new arrival information to the learning model 202 as processing target data. For example, when the number of in-house classification types is increased, publications that have already been classified may be input to the learning model 202 as processing target data. Likewise, when the conditions for narrowing down new arrival information are changed, for example because the technical field has expanded, issued publications that were previously outside the scope of processing may be input to the learning model 202 as processing target data.

When processing target data has been input to the learning model 202, the output result providing unit 205 may provide identification information that identifies the processing target data (for example, an application number, publication number, or patent number) to the user terminal 261 according to the result output by the learning model 202.

The output result providing unit 205 identifies the classifications to be assigned to the publication that is the processing target data, based on the result output by the learning model 202. For example, the output result providing unit 205 may identify, as the classifications to be assigned to the publication, those classifications whose degree-of-approximation value output by the learning model 202 exceeds a predetermined threshold. The output result providing unit 205 assigns the identified classifications to the publication. The output result providing unit 205 may then identify the users corresponding to the assigned classifications and provide the identification information of the processing target data to the user terminals 261 of those users.
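The threshold rule can be written in a few lines. In this sketch the threshold of 0.5 and the example scores are assumed values; the text only says that a predetermined threshold is used.

```python
def assign_classifications(scores, threshold=0.5):
    """Return the classifications whose score exceeds the threshold."""
    return [label for label, score in scores.items() if score > threshold]


# Hypothetical output of the learning model for one publication.
scores = {"C1": 0.92, "C2": 0.08, "C3": 0.61}
assigned = assign_classifications(scores)
```

With these scores the publication receives C1 and C3. Because each classification is tested independently, a publication can receive several classifications or none.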

The provision destination DB 206 stores data indicating the users to whom information on publications with a given classification is provided. For example, the data may indicate that publications assigned classification C1 are provided to users belonging to development department D1. There may be more than one destination user: if several users belong to development department D1, the information may be provided to all of them. The destinations may also span several departments.
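A lookup against the provision destination DB can be sketched as two mappings, one from classification to departments and one from department to users. The table contents and the user names below are invented for illustration; only classification C1 and department D1 appear in the text.

```python
# Hypothetical contents of the provision destination DB.
DESTINATIONS = {"C1": ["D1"], "C2": ["D2"], "C3": ["D1", "D3"]}
MEMBERS = {"D1": ["alice", "bob"], "D2": ["carol"], "D3": ["bob", "dave"]}


def recipients_for(classifications, destinations=DESTINATIONS, members=MEMBERS):
    """Collect each destination user once, in first-seen order."""
    recipients = []
    for classification in classifications:
        for department in destinations.get(classification, []):
            for user in members.get(department, []):
                if user not in recipients:
                    recipients.append(user)
    return recipients


users = recipients_for(["C1", "C3"])
```

A user who belongs to several destination departments, such as `bob` here, is listed only once, so a publication with several classifications does not trigger duplicate notifications.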

The provided content may be bibliographic information such as the identification information of the publication, and may include part of the substantive content, such as the abstract. The output result providing unit 205 notifies the destination user, for example at the user's registered e-mail address, that an output result has been extracted. When the user terminal 261 accesses the system in response to this notification, the output result providing unit 205 may present the output result to the accessing user terminal 261.

The output result providing unit 205 also outputs to the classified DB 210 data that associates the identification information of the publication input to the learning model 202 as processing target data with the classification indicated by the result output from the learning model 202, thereby updating the classified DB 210. In this embodiment, the classified DB 210 is first updated using the classification indicated by the result output from the learning model 202, and the data stored in the classified DB 210 is then corrected based on teacher information from users. The embodiment is not limited to this example, however.

The teacher information receiving unit 207 receives teacher information transmitted from the user terminal 261 in response to the output result being provided by the output result providing unit 205. The teacher information may be information in which the user specifies whether the result output by the output result providing unit 205 (that is, the content output by the learning model 202) is correct (appropriate) or incorrect (inappropriate). For example, suppose that publication P21, provided to the members of development department D1, is unrelated to development department D1. In this case, a member of development department D1 sends, as teacher information, information indicating that the publication is unrelated to the member's own department, and the teacher information receiving unit 207 receives it. Further, if a member of development department D1 can determine that publication P21 relates not to the member's own department but to another development department D2, the teacher information receiving unit 207 may receive teacher information specifying that classification C2 be newly assigned to publication P21.

When the teacher information receiving unit 207 receives multiple pieces of teacher information, the teacher information integration unit 208 integrates them. Integration is performed in two cases.

The first case is where the learning model 202 outputs multiple classifications for one publication, so that teacher information is sent from the user terminals of multiple classification destinations. For example, suppose that publication P3 is input to the learning model 202 as processing target data and that the learning model 202 outputs a result indicating that classifications C1, C2, and C3 all apply. Suppose further that the teacher information receiving unit 207 receives, from a member of development department D1 (corresponding to classification C1) and from a member of development department D2 (corresponding to classification C2), teacher information indicating that the publication is unrelated to their respective departments. The teacher information integration unit 208 then integrates the received teacher information into a single piece of teacher information stating that outputting classification C3 for publication P3 is correct. The integrated teacher information may additionally state that outputting classifications C1 and C2 for publication P3 is incorrect. The integrated teacher information is sent to the teacher information reflection unit 209.
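The publication P3 example can be sketched as below. The function and field names are hypothetical, and the sketch assumes that a classification for which no rejection was received is treated as correct, which is how the example above reads.

```python
def integrate_across_classifications(assigned, rejected):
    """Merge per-department feedback on one publication into one record.

    `assigned` is the list of classifications output by the model;
    `rejected` is the set of classifications whose departments reported
    the publication as unrelated. Unrejected classifications are assumed
    to be correct.
    """
    return {
        "correct": [c for c in assigned if c not in rejected],
        "incorrect": [c for c in assigned if c in rejected],
    }


# Publication P3: departments D1 (C1) and D2 (C2) both rejected it.
print(integrate_across_classifications(["C1", "C2", "C3"], {"C1", "C2"}))
```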

The second case is where multiple users are registered as the destination of one classification, so that teacher information is received from multiple users in the same department. There are various ways of deciding which teacher information to adopt in this case. For example, the teacher information integration unit 208 may output to the teacher information reflection unit 209 only the teacher information on which all users gave the same evaluation. In other words, when some users give a different evaluation, the teacher information integration unit 208 need not adopt teacher information for that publication. With this configuration, only teacher data on which all users agree is reflected, so a more robust learning model 202 can be constructed.
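A minimal sketch of the unanimity rule follows. It assumes each user's evaluation is a boolean (True meaning the classification is appropriate); the function name and data layout are illustrative only.

```python
def unanimous(evaluations):
    """Return the shared evaluation if every user agrees, otherwise None.

    `evaluations` maps a user name to True (classification appropriate)
    or False (inappropriate). None means the publication is not adopted
    as teacher information.
    """
    values = set(evaluations.values())
    if len(values) == 1:
        return values.pop()
    return None


print(unanimous({"alice": False, "bob": False}))  # all agree
print(unanimous({"alice": False, "bob": True}))   # disagreement, not adopted
```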

Alternatively, when some users give a different evaluation, the teacher information integration unit 208 may adopt teacher information by majority vote. When many users perform evaluations, their evaluations can be expected to disagree more often. If little teacher information can be adopted, the teacher data will not contain enough samples. As a result, the learning model 202 may fail to improve and may not deliver sufficient results. A majority vote allows teacher information to be adopted even when the evaluations are not unanimous.
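The majority-vote alternative might look like the following. The text does not say how a tie is handled; returning None (no adoption) on a tie is an assumption made for this sketch, as are the names used.

```python
from collections import Counter


def majority(evaluations):
    """Return the evaluation given by most users, or None on a tie or no input.

    `evaluations` maps a user name to True (appropriate) or False
    (inappropriate). Tie handling is an assumption of this sketch.
    """
    counts = Counter(evaluations.values()).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie: do not adopt teacher information
    return counts[0][0]


print(majority({"alice": False, "bob": False, "carol": True}))
```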

The teacher information integration unit 208 may assign a different weight to each user, because proficiency differs from user to user: a user with high proficiency is given a larger weight, and a user with low proficiency a smaller weight. The weighted average of the values evaluated by the users (that is, of the teacher information) may then be adopted as the teacher information output to the teacher information reflection unit 209. For example, even if several users with low proficiency submit teacher information indicating that a classification is inappropriate, the teacher information integration unit 208 need not output teacher information indicating that the classification is inappropriate to the teacher information reflection unit 209 if a user with high proficiency has not judged the classification to be inappropriate.
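The weighted average can be illustrated as follows. The numeric encoding (1.0 for appropriate, 0.0 for inappropriate), the weight values, and the 0.5 cutoff are assumptions chosen for the example; the description does not fix them.

```python
def weighted_evaluation(evaluations, weights, cutoff=0.5):
    """Return (weighted average, is_appropriate) for one classification.

    `evaluations` maps a user to 1.0 (appropriate) or 0.0 (inappropriate);
    `weights` maps a user to a proficiency weight. Both layouts and the
    cutoff are hypothetical.
    """
    total = sum(weights[user] for user in evaluations)
    if total == 0:
        raise ValueError("no weighted evaluations to average")
    score = sum(weights[user] * value for user, value in evaluations.items()) / total
    return score, score >= cutoff


# Two low-proficiency users reject the classification; one expert accepts it.
evaluations = {"novice_a": 0.0, "novice_b": 0.0, "expert": 1.0}
weights = {"novice_a": 1.0, "novice_b": 1.0, "expert": 3.0}
print(weighted_evaluation(evaluations, weights))
```

With these weights the expert's judgment outweighs the two rejections, so "inappropriate" would not be forwarded, which mirrors the example in the text.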

The teacher information integration unit 208 may be configured to accept instructions from a user designated as the leader. When some of the users give a different evaluation, whether to adopt the teacher information may be decided according to an instruction from the leader. Specifically, the teacher information integration unit 208 determines whether the received teacher information includes any that differs from the rest. If it does, the unit provides the user terminal registered as the leader's with a UI screen asking which teacher information to adopt. On receiving from that user terminal a designation of the teacher information to adopt, the teacher information integration unit 208 adopts the designated teacher information.

The teacher information reflection unit 209 reflects the teacher information output from the teacher information integration unit 208 by updating the classified DB 210 based on that teacher information. Take, as in the earlier example, the case where publication P3 is input to the learning model 202 as processing target data and the learning model 202 outputs a degree of approximation of 0.6 for classification C1, 0.7 for classification C2, and 0.8 for classification C3. Given this result, the output result providing unit 205 may determine that any classification whose degree of approximation exceeds a predetermined threshold (for example, 0.5) is an appropriate classification. The output result providing unit 205 therefore specifies classifications C1, C2, and C3 as the classifications corresponding to publication P3, and outputs information on publication P3 to the user terminals 261 corresponding to each classification. The output result providing unit 205 also stores in the classified DB 210 classified data associating publication P3 with classifications C1, C2, and C3.
Suppose now that the integrated teacher information indicates that, for publication P3, classification C3 is appropriate and classifications C1 and C2 are inappropriate. The teacher information reflection unit 209 then changes the classification data on publication P3 already stored in the classified DB 210 to classification data that associates publication P3 with classification C3 only.
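The update of the classified DB 210 for publication P3 can be sketched with an in-memory dictionary standing in for the database. The storage layout and function name are assumptions for illustration.

```python
def reflect_teacher_info(classified_db, publication, incorrect):
    """Remove classifications reported as inappropriate from a publication's entry.

    `classified_db` maps a publication identifier to its list of
    classifications (a stand-in for the classified DB).
    """
    current = classified_db.get(publication, [])
    classified_db[publication] = [c for c in current if c not in incorrect]
    return classified_db


db = {"P3": ["C1", "C2", "C3"]}
reflect_teacher_info(db, "P3", {"C1", "C2"})
print(db)
```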

The teacher information reflection unit 209 also outputs teacher data based on the integrated teacher information to the teacher data input unit 203. Specifically, it outputs teacher data whose input data is publication P3 and whose output data is a degree of approximation of 0 for classification C1, 0 for classification C2, and 1 for classification C3. In response, the teacher data input unit 203 inputs the teacher data to the model construction unit 201, which reconstructs the learning model 202. Subsequent processing target data is thus processed using a learning model that reflects the teacher data.
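A teacher data record of the kind described (input publication, target value per classification) could be assembled as below. The record layout and names are hypothetical.

```python
def build_teacher_record(publication_text, all_classes, correct_classes):
    """Build one teacher data record with target 1.0 for correct classes, else 0.0."""
    return {
        "input": publication_text,
        "output": {c: (1.0 if c in correct_classes else 0.0) for c in all_classes},
    }


record = build_teacher_record("text of publication P3", ["C1", "C2", "C3"], {"C3"})
print(record["output"])
```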

FIG. 5 is a flowchart showing an example of the processing according to this embodiment. The processing shown in FIG. 5 is executed in the classification system 200 when a user logs in to the classification system 200 from the user terminal 261 and a learning model is constructed using teacher data.

In step S501, the teacher data input unit 203 acquires teacher data. When a learning model is constructed from scratch, the teacher data input unit 203 may refer to the classified DB 210 and acquire, as teacher data, classified data that was classified in the past, for example manually by a person in charge of classification.

In step S502, the teacher data input unit 203 inputs the acquired teacher data to the model construction unit 201. The model construction unit 201 constructs the learning model 202 using the input teacher data. Any teacher data may be used as the initial teacher data; it is not limited to the example described. When a learning model has already been constructed, steps S501 and S502 may be omitted.

In step S503, the processing target data input unit 204 acquires processing target data from the publication DB 250. For example, the processing target data input unit 204 acquires new-arrival information on publications issued during a predetermined period. It may acquire the new-arrival information according to various conditions set from the user terminal 261, such as the distribution frequency. The processing target data input unit 204 inputs the newly arrived publications to the learning model 202 as processing target data, and the learning model 202 outputs a result based on the input processing target data.

In step S504, the output result providing unit 205 provides information on the processing target data to the user terminal according to the output result. The learning model 202 outputs data indicating the degree of approximation of each classification for the publication given as input data. The output result providing unit 205 specifies the classifications whose degree of approximation exceeds a predetermined threshold, refers to the providing destination DB 206 to identify the destination users corresponding to the specified classifications, and provides the result to the user terminals of those users. For example, the output result providing unit 205 may provide the user terminal with a list of identification information identifying the processing target data.

In step S505, the teacher information receiving unit 207 receives teacher information from the user. The user terminal 261 is sent, for example, publication data assigned a classification related to the development department to which the user belongs, or a list of such publications. The user checks the publications on the user terminal 261 and marks any publication judged to be unrelated to the user's own development department as an inappropriately classified publication.

FIG. 6 shows an example of a UI screen 600 on which the user marks a publication judged to be unrelated to the user's own development department as an inappropriately classified publication. The UI screen 600 is displayed on the user terminal 261 and shows a list of newly arrived publications for development department D1, to which the user belongs. The user can tick the check box shown beside a list entry to mark the publication as inappropriately classified, or select an appropriate classification destination from a pull-down menu. FIG. 6 shows a list screen, but the check boxes and pull-down menus may instead be shown on a detail screen that displays the details of each publication. When a publication is marked as inappropriately classified via the UI screen 600, that information is transmitted as teacher information. If the user can determine an appropriate classification destination, the user may select it from the pull-down menu, in which case the appropriate classification destination may also be transmitted as teacher information.

In step S506, the teacher information integration unit 208 integrates the multiple pieces of teacher information received by the teacher information receiving unit 207. For example, when the teacher information receiving unit 207 has received multiple pieces of teacher information on a particular publication, the teacher information integration unit 208 integrates them.

In step S507, the teacher information reflection unit 209 reflects the teacher information in the classified DB 210. For example, suppose that the classified DB 210 stores classification data in which publication P3 is associated with classifications C1, C2, and C3. If teacher information indicating that classifications C1 and C2 are wrong is obtained, the teacher information reflection unit 209 updates the classification data so that publication P3 is associated with classification C3 only.

In step S508, the teacher information reflection unit 209 generates teacher data from the teacher information. In the example above, it generates teacher data whose input data is publication P3 and whose output data is a degree of approximation of 0 for classification C1, 0 for classification C2, and 1 for classification C3. The generated teacher data is output to the teacher data input unit 203.

In step S509, the teacher data input unit 203 outputs the teacher data to the model construction unit 201. The model construction unit 201 constructs the learning model 202 based on the teacher data.

In step S510, the processing target data input unit 204 determines whether there is further processing target data, for example whether it is time to acquire the next batch of newly arrived publications. If there is further processing target data, the processing returns to step S503 and is repeated from there.

Although not shown in the figure, the processing shown in FIG. 5 may end when an instruction to stop the classification processing is received from the user terminal 261.

As described above, according to this embodiment, the learning model 202 can be made to perform the processing of assigning in-house classifications, so classifications can be assigned efficiently. In addition, by using the results output by the learning model 202 as teacher data, the learning model 202 can be improved over time, and feedback on the classification results can be used efficiently.

<Embodiment 2>
Embodiment 1 described a mode in which, when information on a specific publication is input as processing target data, the learning model outputs the degree of approximation of each classification for that publication. As described in the section on usage forms of in-house classification, each development department attaches an evaluation to each classified publication. Embodiment 2 describes a mode in which this evaluation is used as a parameter for adjusting the degree of approximation. In the example of Embodiment 1, the teacher data used the value 1 to indicate that a classification is appropriate and the value 0 to indicate that it is not. In Embodiment 2, the value of the output data used as teacher data is set, according to the evaluation, to any value from 0 to 1 inclusive.

Embodiment 1 also described a mode in which the value indicating the degree of approximation output from the learning model 202 is used to determine whether a classification is correct. Embodiment 2 additionally describes a mode in which the order of the list of information provided to the user terminal is changed according to the value indicating the degree of approximation.

The configuration of the classification system in Embodiment 2 can be the same as that described in Embodiment 1.

In Embodiment 2, the classified DB 210 stores evaluated classification data in which, for each publication, at least one classification is associated with an evaluation of that classification. FIG. 7 shows an example of the evaluated classification data stored in the classified DB 210. For example, publication P31 is associated with two classifications, C1 and C2. Classification C1 of publication P31 is associated with the evaluation "high weighting", and classification C2 of publication P31 with the evaluation "low weighting". These evaluations are assigned by each development department and indicate, for example, the degree of relevance to the company's own technology. The higher the evaluation, that is, the higher the weighting, the more relevant the publication is to the company's own technology and the more attention it deserves.

In Embodiment 2, the teacher data input unit 203 uses classification data with evaluations attached as teacher data. The teacher data input unit 203 converts evaluated classification data such as that shown in FIG. 7 into scores.

FIG. 8 illustrates an example of scoring and shows an example of the conversion table used to convert evaluations into scores. The conversion table sets a weighting parameter for each evaluation. For example, when the evaluation attached to classification C1 of publication P31 is "high weighting", classification C1 of publication P31 is given a score of 1. When the evaluation attached to classification C2 of publication P31 is "low weighting", classification C2 of publication P31 is given a score of 0.6.

FIG. 9 shows an example of the teacher data in Embodiment 2. As in Embodiment 1, the text data of the abstract of each publication is used as input data. Unlike Embodiment 1, however, the output data uses values scored from the evaluations as the values indicating the degree of approximation. FIG. 9 shows an example in which publications P31 and P32 of FIG. 7 are used as teacher data. Publication P31 is assigned classifications C1 and C2 but not classification C3. In the teacher data of FIG. 9, the degree of approximation of classification C3 for publication P31 is therefore set to 0. Because the weighting attached to classification C1 of publication P31 is "high weighting", the degree of approximation of classification C1 for publication P31 is set to 1. Because the weighting attached to classification C2 of publication P31 is "low weighting", the degree of approximation of classification C2 for publication P31 is set to 0.6.
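The scoring of publication P31 can be sketched as follows. Only the two table entries stated in the text (1 for "high weighting", 0.6 for "low weighting") are used; the record layout, the function name, and the placeholder abstract text are assumptions.

```python
# Conversion table with only the two entries given in the description.
CONVERSION_TABLE = {"high weighting": 1.0, "low weighting": 0.6}


def build_scored_record(abstract_text, all_classes, evaluations):
    """Build a teacher data record whose targets are evaluation-based scores.

    `evaluations` maps each assigned classification to its evaluation label;
    classifications that were not assigned get a target of 0.0.
    """
    output = {
        c: (CONVERSION_TABLE[evaluations[c]] if c in evaluations else 0.0)
        for c in all_classes
    }
    return {"input": abstract_text, "output": output}


record = build_scored_record(
    "abstract of publication P31",  # placeholder text
    ["C1", "C2", "C3"],
    {"C1": "high weighting", "C2": "low weighting"},
)
print(record["output"])
```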

When the teacher data input unit 203 inputs such teacher data to the model construction unit 201, the model construction unit 201 constructs the learning model 202 according to the teacher data. When the processing target data input unit 204 inputs processing target data to the constructed learning model 202, the learning model 202 outputs a predetermined output value (a value indicating the degree of approximation) as output data.

The output result providing unit 205 provides the result output from the learning model 202 to the user terminal 261. For example, as described in Embodiment 1, the output result providing unit 205 specifies each classification whose degree of approximation exceeds a predetermined threshold as a classification of the publication input as processing target data. The output result providing unit 205 may provide the user terminal 261 with a list in which the publications whose degree of approximation exceeds the predetermined threshold are ordered by degree of approximation. Suppose, for example, that a list of publications classified under classification C1 is to be provided to the users of development department D1, and that a group of newly arrived publications has been input to the learning model 202 as processing target data. In this case, the output result providing unit 205 extracts the publications for which the degree of approximation for classification C1 output from the learning model 202 exceeds the predetermined threshold, and generates a list in which the extracted publications are sorted in descending order of their degree of approximation for classification C1. The output result providing unit 205 may provide the list generated in this way to the user terminal 261.
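The extract-then-sort step for classification C1 might be sketched like this. The publication identifiers, scores, and function name are made up for illustration.

```python
def ranked_publications(scores_by_publication, classification, threshold=0.5):
    """Return (publication, score) pairs above the threshold, highest score first.

    `scores_by_publication` maps a publication identifier to a dict of
    per-classification degrees of approximation (hypothetical layout).
    """
    hits = [
        (publication, scores[classification])
        for publication, scores in scores_by_publication.items()
        if scores.get(classification, 0.0) > threshold
    ]
    return sorted(hits, key=lambda item: item[1], reverse=True)


new_arrivals = {
    "P41": {"C1": 0.55, "C2": 0.9},
    "P42": {"C1": 0.95},
    "P43": {"C1": 0.3},
}
print(ranked_publications(new_arrivals, "C1"))
```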

FIG. 10 shows an example of a UI screen on which the user terminal 261 displays the list provided by the output result providing unit 205. In Embodiment 2, a weighting can be assigned to each publication. Publications weighted in this way are received by the teacher information receiving unit 207 as teacher information. Although a list box is used here to assign the weighting, an input box into which a value is entered directly, or any other on-screen input control, may be used instead.

As described in Embodiment 1, the teacher information integration unit 208 may integrate multiple pieces of teacher information. For example, it may adopt a majority vote, apply per-user weighting, or leave the decision to a user.

As described in Embodiment 1, the teacher information reflection unit 209 generates teacher data reflecting the teacher information and outputs it to the teacher data input unit 203. For example, it generates teacher data that reflects the weighting by using a conversion table such as that shown in FIG. 8. Although this embodiment shows an example in which one conversion table is shared by all classifications, this is not a limitation; a different conversion table may be used for each classification. The teacher data input unit 203 inputs the teacher data to the model construction unit 201, and the model construction unit 201 constructs the learning model 202.

As described above, in this embodiment the user's evaluation can be included as an element of the teacher data, so a learning model 202 that outputs more appropriate classifications can be constructed.

To simplify the description, Embodiment 2 uses only major classifications. However, a major classification may contain middle classifications, and a middle classification may in turn contain minor classifications. In such a case, the parameters may simply be subdivided further.

Embodiment 2 describes an example in which the publications are rearranged in descending order of the degree of approximation and displayed on the user terminal 261, but the same display mode may be adopted in Embodiment 1. That is, even when no evaluation (weight) is assigned to a classification and no evaluation (weight) information is used as teacher information, a list of publications ordered according to the degree of approximation output by the learning model 202 may be provided to the user terminal 261.

<Embodiment 3>
Embodiment 1 describes a mode in which, when information about a specific publication is input as processing target data, the learning model outputs the degree of approximation of each classification for that publication. Embodiment 3 describes a mode in which, when information about a specific publication is input as processing target data, the learning model outputs a degree of risk for that publication in addition to the degrees of approximation of the classifications. Here, the degree of risk refers to the potential for conflict of a target such as a target company, target business, target product, or target component; more specifically, it is a measure of the extent to which the target conflicts with the specific publication. Embodiment 3 first describes an example in which the learning model 202 is trained on the degree of risk in addition to the degrees of approximation described in Embodiment 1, and processing target data is input to the learning model to output the degrees of approximation of the classifications and the degree of risk.

The configuration of the classification system in Embodiment 3 can be the same as that described in Embodiment 1.

In Embodiment 3, the classified DB 210 stores, for each publication, at least one classification together with the degree of approximation of that classification and a degree of risk, in association with one another.

In Embodiment 3, the teacher data input unit 203 uses, as teacher data, the publications stored in the classified DB 210 together with the classifications assigned to each publication, their degrees of approximation, and the degree of risk.

FIG. 11 shows an example of teacher data in Embodiment 3. As described in Embodiment 1, the text data of the abstract is used as input data for each publication. FIG. 11 shows an example in which publications P1 and P2 are used as teacher data. For publication P1, classification C1 is assigned a degree of approximation of 1, classification C2 a degree of approximation of 0, and classification C3 a degree of approximation of 1, and the publication is assigned a degree of risk of 0.8. For publication P2, classification C1 is assigned a degree of approximation of 1, classification C2 a degree of approximation of 0, and classification C3 a degree of approximation of 0, and the publication is assigned a degree of risk of 1. In this example one degree of risk is assigned per publication, but a degree of risk may instead be assigned to each individual classification of a publication.
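
The teacher data of FIG. 11 can be represented, for example, as in the following non-limiting sketch. The numeric values are those given for publications P1 and P2; the class name `TeacherRecord`, its field names, the placeholder abstract strings, and the layout of the target vector are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TeacherRecord:
    publication_id: str
    abstract_text: str               # input data (text of the abstract)
    approximation: Dict[str, float]  # degree of approximation per classification
    risk: float                      # one degree of risk per publication

    def target_vector(self, class_order: List[str]) -> List[float]:
        """Output data for training: approximations in a fixed class order,
        followed by the degree of risk (an assumed layout)."""
        return [self.approximation[c] for c in class_order] + [self.risk]


teacher_data = [
    TeacherRecord("P1", "<abstract of P1>", {"C1": 1.0, "C2": 0.0, "C3": 1.0}, 0.8),
    TeacherRecord("P2", "<abstract of P2>", {"C1": 1.0, "C2": 0.0, "C3": 0.0}, 1.0),
]
```

Under this layout, the target vector for P1 in the order C1, C2, C3 is the three degrees of approximation followed by the degree of risk.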

When the teacher data input unit 203 submits such teacher data to the model construction unit 201, the model construction unit 201 constructs the learning model 202 according to the teacher data. When the processing target data input unit 204 inputs processing target data to the constructed learning model 202, the learning model 202 outputs predetermined output values (the degrees of approximation of the classifications and the degree of risk) as output data.

The output result providing unit 205 provides the result output from the learning model 202 to the user terminal 261. For example, as described in Embodiment 1, the output result providing unit 205 identifies a classification whose degree of approximation exceeds a predetermined threshold as a classification of the publication that was input as processing target data. The output result providing unit 205 may provide the user terminal 261 with a list of the publications whose degree of approximation exceeds the predetermined threshold, ordered according to the degree of approximation and including the degree of risk. For example, assume that a list of publications classified under classification C1 is to be provided to a user in development department D1, and that a group of newly arrived publications has been input to the learning model 202 as processing target data. In this case, the output result providing unit 205 extracts the publications for which the degree of approximation for classification C1, as output by the learning model 202, exceeds the predetermined threshold. It then generates a list in which the extracted publications are arranged in descending order of the degree of approximation for classification C1. The output result providing unit 205 may provide the list generated in this way to the user terminal 261. Because the list also includes the degree of risk, a list in which the publications are arranged in descending order of the degree of risk may be generated and used instead. In addition, the publications presented to the user need not be those whose degree of approximation exceeds the predetermined threshold; they may instead be those whose degree of risk exceeds a predetermined threshold.
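
The two alternatives described above, selecting and ordering by the degree of approximation or by the degree of risk, can be sketched as follows. The function name `select_publications`, the record layout, and the example values are assumptions for illustration only.

```python
from typing import Dict, List


def select_publications(
    results: List[Dict],
    classification: str,
    threshold: float,
    key: str = "approximation",
) -> List[Dict]:
    """Keep publications whose chosen value exceeds the threshold and
    return them in descending order of that value."""
    if key not in ("approximation", "risk"):
        raise ValueError("key must be 'approximation' or 'risk'")

    def value(record: Dict) -> float:
        if key == "risk":
            return record["risk"]
        return record["approximation"].get(classification, 0.0)

    kept = [record for record in results if value(record) > threshold]
    return sorted(kept, key=value, reverse=True)


# Hypothetical model outputs for three publications.
example_results = [
    {"id": "P20", "approximation": {"C1": 0.9}, "risk": 0.2},
    {"id": "P21", "approximation": {"C1": 0.7}, "risk": 0.8},
    {"id": "P22", "approximation": {"C1": 0.3}, "risk": 0.6},
]
by_approximation = [r["id"] for r in select_publications(example_results, "C1", 0.5)]
by_risk = [r["id"] for r in select_publications(example_results, "C1", 0.5, key="risk")]
```

In this example the two criteria select different publications: P20 and P21 by the degree of approximation, and P21 and P22 by the degree of risk.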

FIG. 12 shows an example of a UI screen on which the user terminal 261 displays the list provided by the output result providing unit 205. In Embodiment 3, each publication is displayed with its classification and degree of risk. The degree of approximation may also be displayed in this list. The user can also correct any of the classification, its degree of approximation, and the degree of risk. The target publication, together with the corrected classification, its degree of approximation, and the degree of risk, is received by the teacher information receiving unit 207 as teacher information.

As described above, in this embodiment the user's evaluation can be included as an element of the teacher data, so a learning model 202 that outputs more appropriate classifications and degrees of risk can be constructed.

Embodiment 3 has been described using a configuration in which a single learning model, the learning model 202, learns publications, classifications, degrees of approximation, and degrees of risk as teacher data. However, a learning model for classification and a learning model for the degree of risk may be configured separately, and the learning model for the degree of risk may be further subdivided into one learning model per classification. Specifically, when there are classifications C1 and C2, there is one learning model for classification, and learning models for the degree of risk are prepared for C1 and for C2. In such a configuration, the classifications, degrees of approximation, and degrees of risk set in advance by the user are read out. First, the learning model for classification is trained on the target publications together with their classifications and degrees of approximation. Next, for each classified publication, the risk learning model corresponding to its classification is trained using the publication and its degree of risk. When information about a specific publication is input as processing target data, the learning model for classification first outputs an appropriate classification, and the publication is then processed by the risk learning model corresponding to the output classification, which outputs the degree of risk of the target publication. The degree of risk here can be regarded as the degree of risk within the target classification.
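
The two-stage inference described above, in which a classification model selects the classification and a per-classification risk model then outputs the degree of risk, can be sketched as follows. The models are replaced here by toy keyword-based functions; `classify_then_score_risk`, `toy_classifier`, and `toy_risk_models` are hypothetical names, and choosing the classification with the highest degree of approximation is an assumption about what "an appropriate classification" means.

```python
from typing import Callable, Dict, Tuple

Classifier = Callable[[str], Dict[str, float]]
RiskModel = Callable[[str], float]


def classify_then_score_risk(
    text: str,
    classifier: Classifier,
    risk_models: Dict[str, RiskModel],
) -> Tuple[str, float]:
    """Stage 1: pick the classification with the highest approximation.
    Stage 2: run the risk model prepared for that classification."""
    approximations = classifier(text)
    best_class = max(approximations, key=approximations.get)
    return best_class, risk_models[best_class](text)


def toy_classifier(text: str) -> Dict[str, float]:
    # Stand-in for the learning model for classification.
    return {
        "C1": 0.9 if "sensor" in text else 0.1,
        "C2": 0.9 if "database" in text else 0.1,
    }


# Stand-ins for the risk learning models prepared for C1 and for C2.
toy_risk_models: Dict[str, RiskModel] = {
    "C1": lambda text: 0.8 if "wireless" in text else 0.2,
    "C2": lambda text: 0.5,
}

result = classify_then_score_risk("a wireless sensor node", toy_classifier, toy_risk_models)
```

The returned degree of risk is the one produced by the model for the selected classification, which corresponds to the degree of risk within the target classification.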

In the configuration above, a risk learning model corresponding to each classification is used in addition to the learning model for classification in order to output the degree of risk within the target classification. However, the configuration may output not only the degree of risk within a target classification but also the degree of risk for a target department or a target device. This can be realized by constructing a learning model for the degree of risk of the target department and a learning model for the degree of risk of the target device, respectively. In this case, the teacher data, namely publication data and the degree of risk of that publication data, must be learned by the risk learning model for the corresponding target department or the corresponding target device. For example, a user finds multiple publications with which product A may conflict, assigns each a quantified degree of risk, and has the risk learning model for product A learn them. When a new publication is then input to this learning model as processing target data, the learning model outputs the degree of risk for that publication, that is, the degree to which product A conflicts with it. In addition, although the embodiments have been described on the premise that classifications are assigned by a learning model, the configuration that assigns a degree of risk by a learning model need not include the operation of assigning classifications; an operation that only outputs the degree of risk of a target publication by a learning model is also within the scope of the present invention. Conversely, a configuration in which the degree of risk is obtained by a learning model and a classification is assigned afterwards is also possible.

Furthermore, although Embodiment 3 describes the degree of risk, a configuration in which the development department sets a degree of risk instead of a weight at W3 in FIG. 1 is also possible.

Although the above embodiments have been described mainly using the classification of patent-related publications as an example, the invention is not limited to this and can be applied to classification processing for any type of publication.

<Embodiment 4>
Embodiment 1 describes a mode in which, when information about a specific publication is input as processing target data, the learning model outputs the degree of approximation of each of multiple in-house classifications that may be assigned to that publication. Embodiment 4 describes a mode in which, when information about an application is input as processing target data, the learning model outputs multiple classifications defined by domestic and foreign patent offices, international organizations, and the like that may be assigned to the application, such as IPC, CPC (Cooperative Patent Classification), ECLA (European Classification), ICO (In Computer Only), FI (File Index), and F-term, together with their degrees of approximation.

The configuration of the classification system in Embodiment 4 can be the same as that described in Embodiment 1.

The model construction unit 201 of the classification system 200 according to this embodiment constructs a learning model based on teacher data indicating the relationship between multiple pieces of input data related to an application and output data indicating at least one classification assigned to the application. The processing executed by the model construction unit 201 is the same as that described in Embodiment 1. Here, the application may be a patent application, a utility model registration application, a design registration application, or a trademark registration application, and the multiple pieces of input data related to the application are data on the documents and articles submitted when filing the application.

The processing target data input unit 204 inputs multiple pieces of processing target data related to the application into the learning model as the multiple pieces of input data. More specifically, when the application is a patent application or a utility model registration application, the processing target data may include at least one of the application form, the specification, the patent claims or utility model claims, the drawings, and the abstract. When the application is a design registration application, the processing target data may include at least one of the application form and, as stated in the application form, the article to which the design is applied, the description of that article, the description of the design, and the drawings. Here, the drawings may include a front view, left and right side views, a rear view, a perspective view, and the like. When the application is a trademark registration application, the processing target data may include at least one of the application form and, as stated in the application form, the designated goods, the designated services, and the trademark. Here, the trademark may consist of characters, figures, colors alone, sounds, holograms, and the like.

Like the classification system 200 according to Embodiment 1, the classification system 200 according to this embodiment includes: a providing unit (output result providing unit 205) that provides information on the multiple pieces of processing target data to the user terminal 261 according to the result output from the learning model in response to the input of that data; a receiving unit (teacher information receiving unit 207) that receives, from the user terminal 261, teacher information from which teacher data is derived; and a teacher data input unit 203 that submits teacher data corresponding to the teacher information to the model construction unit 201.

Furthermore, for at least one classification determined to be assigned to the application, the providing unit of the classification system 200 according to this embodiment provides the user terminal 261 with the processing target data, among the multiple pieces of processing target data, that served as the basis for that determination.

FIG. 13 shows another example of a UI screen on the user terminal. With the UI screen shown in the figure displayed on the user terminal 261, the user can select multiple pieces of input data related to the application and check the classifications that may be assigned to the application, as provided by the classification system 200, together with the basis for assigning them.

The user enters data related to the application in the input data 301 field. For example, when the application is a patent application, data on the application form, the specification, the claims, the drawings, and the abstract may be entered in the input data 301 field. When data is entered in the input data 301 field, the items contained in the input data 301 are listed on the item designation screen 302. Using the item designation screen 302, the user can designate the abstract text given in the abstract, or all or some of the claims, as part of the processing target data to be input to the learning model. Similarly, the paragraphs of the specification contained in the input data 301 are listed on the paragraph designation screen 303. Using the paragraph designation screen 303, the user can designate paragraphs of the specification as part of the processing target data to be input to the learning model. When the user has finished designating the processing target data, the user presses the candidate extraction button 304 to input the designated processing target data into the learning model and obtain output data indicating at least one classification to be assigned to the application. The output data includes a value indicating the degree of approximation of each of the multiple classifications that may be assigned to the application corresponding to the input data.

The classification candidates 305 on the UI screen show the multiple classifications that may be assigned to the application and the degree of approximation of each. In this example, the classifications that may be assigned to the application are the IPC codes "G06F15/60" and "G06F17/00", and the degree of approximation is shown as "approximate value 1" for "G06F15/60" and "approximate value 0.8" for "G06F17/00". In this example the classifications are displayed sorted in descending order of the degree of approximation, but the display order is arbitrary.

When the user selects one of the classifications displayed in the classification candidates 305, the items that served as the basis for assigning that classification are displayed in the item-level assignment basis area 306, and the paragraphs of the specification that served as the basis are displayed in the paragraph-level assignment basis area 307. When the user then selects one of the items displayed in the item-level assignment basis area 306 or one of the paragraphs displayed in the paragraph-level assignment basis area 307, the full text of that item or paragraph is displayed in the assignment basis area 308. Double-clicking any one of the classifications displayed in the classification candidates 305 displays a description of that classification. In this way, the user can check the passages that served as the basis for a classification automatically assigned by the learning model and confirm that the classification is appropriate.

When multiple items or multiple paragraphs are each input to the learning model separately, the classification system 200 according to this embodiment may determine, as the assignment basis, the items or paragraphs for which the degree of approximation of the classification indicated by the output data of the learning model is equal to or greater than a predetermined value. Alternatively, the classification system 200 may determine the assignment basis from how the output data changes between the case where the multiple items and paragraphs are input to the learning model and the case where input data from which a specific item or paragraph has been deleted is input to the learning model. For example, if deleting a specific item or paragraph lowers the degree of approximation by a threshold or more, that item or paragraph may be treated as the assignment basis. Although the example above uses the paragraph as the granularity of the assignment basis, the same processing can be used to identify a sentence, a phrase, or a word as the assignment basis. The passage that served as the assignment basis may also be highlighted. Examples of highlighting include changing the background color or the text color of only the paragraph, sentence, phrase, or word that served as the assignment basis relative to other passages, and emphasizing the text by underlining it, setting it in bold, or enlarging the font. In addition, as an application of FIG. 13, a function may be provided that displays the paragraph serving as the assignment basis in the assignment basis area 308 and further highlights the sentences, phrases, or words that serve as the assignment basis.
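
The deletion-based determination of the assignment basis can be sketched as follows. This is a minimal illustration under stated assumptions: `score_fn` stands in for the learning model's degree of approximation for one classification, and the names `find_assignment_basis`, `toy_score`, and the example paragraphs are hypothetical.

```python
from typing import Callable, List


def find_assignment_basis(
    paragraphs: List[str],
    score_fn: Callable[[List[str]], float],
    drop_threshold: float,
) -> List[int]:
    """Return the indices of paragraphs whose deletion lowers the degree of
    approximation by at least drop_threshold."""
    full_score = score_fn(paragraphs)
    basis = []
    for i in range(len(paragraphs)):
        reduced = paragraphs[:i] + paragraphs[i + 1:]
        if full_score - score_fn(reduced) >= drop_threshold:
            basis.append(i)
    return basis


def toy_score(paragraphs: List[str]) -> float:
    # Stand-in for the learning model: high when any paragraph mentions
    # "neural network", low otherwise.
    return 1.0 if any("neural network" in p for p in paragraphs) else 0.2


example_paragraphs = [
    "A housing holds the sensor.",
    "A neural network classifies documents.",
    "The cover is made of plastic.",
]
basis_indices = find_assignment_basis(example_paragraphs, toy_score, drop_threshold=0.5)
```

Only the second paragraph is reported as the assignment basis, because deleting it is the only change that lowers the score by the threshold or more. The same loop could be run over sentences, phrases, or words instead of paragraphs.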

From the classifications displayed in the classification candidates 305, the user can select the classification considered to best represent the features of the application and press the lead determination button 309 to assign that classification to the application as the lead classification. The user can also select, from the classifications displayed in the classification candidates 305, a classification considered to represent the features of the application well and press the classification determination button 310 to assign that classification to the application.

When the processing target data that served as the basis for determining that a certain classification is to be assigned to the application is contained in predetermined processing target data among the multiple pieces of processing target data, the providing unit may provide that classification to the user terminal as the lead classification. Here, the predetermined processing target data is data considered to represent the features of the application well; for example, when the application is a patent application, it is the abstract or the higher-level claims, which are considered to represent the features of the invention well.

The providing unit may also provide, as the lead classification, the classification with the highest degree of approximation among the multiple classifications that may be assigned to the application. In addition, when the passage provided as the assignment basis is one that represents the features of the application well, the classification system 200 may treat the classification assigned on that basis as the lead classification.
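
The lead classification rules above can be sketched as follows. Combining them into a single function, with the basis-location rule taking precedence and the highest degree of approximation as a fallback, is an assumption made for illustration; the embodiment presents the rules as alternatives. The section labels, the name `pick_lead_classification`, and the basis sections attached to the example candidates are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical labels for sections considered to represent the features
# of the application well.
KEY_SECTIONS = {"abstract", "higher_level_claim"}


def pick_lead_classification(candidates: List[Dict]) -> Optional[str]:
    """Prefer candidates whose assignment basis lies in a key section;
    among the preferred pool, take the highest degree of approximation."""
    if not candidates:
        return None
    keyed = [c for c in candidates if c["basis_sections"] & KEY_SECTIONS]
    pool = keyed or candidates
    return max(pool, key=lambda c: c["approximation"])["code"]


example_candidates = [
    {"code": "G06F15/60", "approximation": 1.0, "basis_sections": {"description"}},
    {"code": "G06F17/00", "approximation": 0.8, "basis_sections": {"abstract"}},
]
lead = pick_lead_classification(example_candidates)
```

With these assumed basis sections, the classification whose basis lies in the abstract is chosen as the lead classification even though its degree of approximation is lower; if no candidate had a basis in a key section, the highest degree of approximation would decide.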

The processing target data input unit 204 of the classification system 200 may input text data and image data related to the application into the learning model as the multiple pieces of input data, and the model construction unit 201 may construct the learning model based on teacher data indicating the relationship between the text data and image data and output data indicating at least one classification assigned to the application. Here, when the application is a patent application, the text data may include the application form, the specification, the claims, and the abstract, and the image data may include the drawings as well as chemical formulas, mathematical formulas, and tables inserted in the specification. In this case, the learning model is a single model that learns the text data and the image data as one set of data. The text data to be input may be limited to the paragraphs of the specification that describe the drawings, or may be text obtained by applying OCR (Optical Character Recognition) to the drawings.

Alternatively, the processing target data input unit 204 of the classification system 200 may input text data related to the application, as part of the multiple pieces of input data, into a first sub-learning model included in the learning model, and input image data related to the application, as part of the multiple pieces of input data, into a second sub-learning model included in the learning model. The model construction unit 201 may construct the first sub-learning model based on teacher data indicating the relationship between the text data and output data indicating at least one classification assigned to the application, and construct the second sub-learning model based on teacher data indicating the relationship between the image data and output data indicating at least one classification assigned to the application. In this case, the learning model includes the first sub-learning model, which receives the text data, and the second sub-learning model, which receives the image data. In this way, a separate sub-learning model may be constructed for each type of data, and the output data of the sub-learning models may be integrated to form the output data indicating the classification to be assigned to the application.
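
The embodiment does not specify how the outputs of the sub-learning models are integrated. As one possible illustration, the sketch below combines the per-classification degrees of approximation from the text and image sub-models by a weighted average. The function name `integrate_outputs`, the `text_weight` parameter, and the example values are assumptions.

```python
from typing import Dict


def integrate_outputs(
    text_scores: Dict[str, float],
    image_scores: Dict[str, float],
    text_weight: float = 0.5,
) -> Dict[str, float]:
    """Weighted average of the two sub-model outputs per classification.
    A classification missing from one sub-model counts as 0.0 there."""
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be between 0 and 1")
    classes = sorted(set(text_scores) | set(image_scores))
    return {
        c: text_weight * text_scores.get(c, 0.0)
        + (1.0 - text_weight) * image_scores.get(c, 0.0)
        for c in classes
    }


# Hypothetical sub-model outputs for classifications C1 and C2.
combined = integrate_outputs({"C1": 1.0, "C2": 0.5}, {"C1": 0.5}, text_weight=0.5)
```

Other integration rules, such as taking the maximum per classification or training a further model on the sub-model outputs, would fit the same interface.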

Classifications defined by domestic and foreign patent offices, international organizations, and the like may be revised, including through the consolidation or abolition of classes. The classification system 200 according to this embodiment may be used in such cases as well. When a classification is revised, either the learning model may be reconstructed based on the new classification, or the learning model constructed on the old classification may be left as it is and a conversion between the old and new classifications may be applied to the output data of the learning model.
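The second option, keeping the model built on the old classification and converting its output, can be sketched as a lookup table applied after inference. The table contents, the classification codes, and the rule of keeping the highest score when several old classes merge into one new class are assumptions for illustration only.

```python
def convert_scores(old_scores, old_to_new):
    """Re-express scores for old classification codes as scores for new codes."""
    new_scores = {}
    for old_code, score in old_scores.items():
        # Codes absent from the table are assumed to be unchanged by the revision.
        for new_code in old_to_new.get(old_code, [old_code]):
            # When several old codes merge into one new code, keep the highest score.
            new_scores[new_code] = max(new_scores.get(new_code, 0.0), score)
    return new_scores


# Hypothetical revision: A01 and A02 are merged into A10, and B01 is split in two.
old_to_new = {"A01": ["A10"], "A02": ["A10"], "B01": ["B11", "B12"]}
old_scores = {"A01": 0.5, "A02": 0.75, "B01": 0.25, "C01": 0.125}

new_scores = convert_scores(old_scores, old_to_new)
print(new_scores)
```

A split class passes its score to every successor here, which tends to over-assign; in that situation reconstructing the model on the new classification may be the better choice.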

Although this example describes the case where IPC codes are displayed as the classification candidates 305, the classification candidates 305 may instead display other domestic or foreign patent classifications such as CPC, ECLA, ICO, FI, or F-terms.

When the input data relates to a design registration application, the classification candidates 305 may display domestic or foreign design classifications, such as the design classification or D-terms. Similarly, when the input data relates to a trademark registration application, the classification candidates 305 may display domestic or foreign trademark-related classifications, such as the figure classification of figurative trademarks, the similar group codes of designated goods and services, or the Nice Classification. In this case, the learning model may take a trademark as input data and include the figure classification or the like in its output data, or may take designated goods and services as input data and include similar group codes or the Nice Classification in its output data.

The learning model may also be constructed from teacher data in which designated goods and services are the input data and appropriate designated goods and services are the output data. Here, appropriate designated goods and services means designated goods and services that have been registered. The learning model may also be constructed from teacher data in which a trademark is the input data and designated goods and services are the output data. This yields a learning model that proposes the goods and services to designate when filing an application for a similar trademark. The learning model may further be constructed from teacher data in which a trademark is the input data and its pronunciation is the output data.

The matters described in Embodiment 4 can be applied to Embodiments 1, 2, and 3 as appropriate. As a first example, the publication classification of Embodiment 4 may be assigned in addition to the in-house classification of Embodiment 1, 2, or 3, so that both an in-house classification and a publication classification are assigned to the company's own applications. The publication classification assigned here can be used as the publication classification that the applicant assigns at the time of filing. When the publication classification assigned by the patent office is published at the time of publication of the application and imported as data, the publication classification assigned in Embodiment 4 may be written over the publication classification assigned by the patent office. As a second example, just as Embodiment 4 displays the grounds for assigning a publication classification, Embodiments 1, 2, and 3 may also display the grounds for assigning an in-house classification.

Conversely, the matters described in Embodiment 1, 2, or 3 can be applied to Embodiment 4 as appropriate. For example, the function described in Embodiment 1 for integrating teacher information received from a plurality of user terminals can be applied to Embodiment 4. This function may be supplemented with a sub-function that decides by majority vote when the teacher information from the user terminals differs, a sub-function that decides the teacher information taking a weight for each user into account, and a sub-function that decides the teacher information based on an instruction from a leader.
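The three sub-functions for resolving conflicting teacher information can be sketched as follows. The data layout (one label per user), the default weight of 1.0, and the simplification that the leader's own label serves as the leader's instruction are assumptions for this example; the description does not say how ties are broken or how the instruction is conveyed.

```python
from collections import Counter, defaultdict


def majority_vote(labels_by_user):
    """Adopt the label submitted by the largest number of users."""
    counts = Counter(labels_by_user.values())
    return counts.most_common(1)[0][0]


def weighted_vote(labels_by_user, weight_by_user):
    """Adopt the label with the largest total weight of the users who chose it."""
    totals = defaultdict(float)
    for user, label in labels_by_user.items():
        # Users without a configured weight are assumed to have weight 1.0.
        totals[label] += weight_by_user.get(user, 1.0)
    return max(totals, key=totals.get)


def leader_decision(labels_by_user, leader):
    """Adopt the label chosen by the designated leader (simplified instruction)."""
    return labels_by_user[leader]


# Hypothetical feedback on one publication from three user terminals.
feedback = {"alice": "relevant", "bob": "not relevant", "carol": "relevant"}
weights = {"alice": 1.0, "bob": 3.0, "carol": 1.0}

print(majority_vote(feedback))
print(weighted_vote(feedback, weights))
print(leader_decision(feedback, "bob"))
```

With this data the majority vote and the weighted vote disagree, which shows why the choice of sub-function matters when one user's judgment is trusted more than the others'.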

Each unit that implements the functions of the embodiments described above can be implemented in, for example, hardware or software. When implemented in software, program code that controls the hardware may be executed by various processors such as a CPU or MPU. Hardware such as circuits that realize the functions of the program code may also be provided. Part of the program code may be realized in hardware, with the remainder executed by the processors.

DESCRIPTION OF SYMBOLS
200 Classification system
201 Model construction unit
202 Learning model
203 Teacher data input unit
204 Processing target data input unit
205 Output result providing unit
206 Destination DB
207 Teacher information receiving unit
208 Teacher information integration unit
209 Teacher information reflection unit
210 Classified DB
250 Gazette DB
261 User terminal

Claims

A classification system comprising:
a model construction unit that constructs a learning model based on teacher data indicating a relationship between input data relating to at least one publication and output data indicating at least one classification assigned to the at least one publication corresponding to the input data;
a processing target data input unit that inputs processing target data relating to at least one publication to the learning model as the input data;
a providing unit that provides information on the processing target data to a user terminal according to a result output from the learning model in response to the input of the processing target data;
a receiving unit that receives, from the user terminal, teacher information from which the teacher data is derived; and
a teacher data input unit that inputs the teacher data corresponding to the teacher information to the model construction unit.

The classification system according to claim 1, wherein the teacher information transmitted from the user terminal includes information indicating that a specific classification of a specific publication is not appropriate, and
the model construction unit constructs, based on the teacher data corresponding to the teacher information, a learning model that does not assign the specific classification to the specific publication.

The classification system according to claim 1, wherein the output data indicated by the teacher data includes a value indicating a degree of approximation for each of a plurality of classifications that can be assigned to the at least one publication corresponding to the input data.

The classification system according to claim 3, wherein the value indicating the degree of approximation includes a value indicating approximation or a value indicating non-approximation.

The classification system according to claim 4, wherein the value indicating the degree of approximation further includes an intermediate value between the value indicating approximation and the value indicating non-approximation.

The classification system according to claim 3, wherein the receiving unit receives teacher information including an evaluation, and
the value indicating the degree of approximation is determined according to the evaluation.

The classification system, wherein the processing target data input unit inputs processing target data relating to a plurality of publications to the learning model, and
the providing unit provides the user terminal with a list of publications in which the plurality of publications are sorted in descending order of the value indicating approximation.

The classification system according to claim 7, wherein the plurality of publications are newly arrived publications issued in a predetermined period.

The classification system according to claim …, wherein the providing unit:
acquires information identifying the user corresponding to each classification; and
provides, based on the acquired information identifying the user, the information on the processing target data to the user terminal of the user corresponding to the classification output from the learning model.

The classification system according to claim 9, wherein the providing unit:
provides the processing target data to the user terminal of the user corresponding to a classification whose degree of approximation, output from the learning model in response to the input of the processing target data, exceeds a predetermined threshold; and
does not provide the processing target data to the user terminal of the user corresponding to a classification whose degree of approximation is equal to or less than the predetermined threshold.

The classification system according to claim 9 or 10, wherein, when the degree of approximation output from the learning model in response to the input of the processing target data exceeds a threshold for a plurality of classifications, the providing unit provides the information on the processing target data to the user terminal of the user corresponding to each of those classifications,
the receiving unit receives teacher information from each of the user terminals, and
the classification system further comprises an integration unit that integrates the teacher information received from the plurality of user terminals.

The classification system according to claim 11, wherein when the teacher information of some user terminals is different from the teacher information of other user terminals, the integration unit determines the teacher information to be adopted by majority vote.

The classification system according to claim 11, wherein the integration unit:
acquires weight information set for each user; and
when the teacher information of some user terminals differs from the teacher information of other user terminals, determines the teacher information taking the weight of each user into account.

The classification system according to claim 11, wherein the integration unit:
is able to identify a leader among the users; and
when the teacher information of some user terminals differs from the teacher information of other user terminals, receives an instruction from the leader and determines the teacher information based on the received instruction.

A classification system comprising:
a model construction unit that constructs a learning model based on teacher data indicating a relationship between input data relating to at least one publication and both output data indicating at least one classification assigned to the at least one publication corresponding to the input data and output data indicating a degree of risk assigned to the publication corresponding to the input data;
a processing target data input unit that inputs processing target data relating to at least one publication to the learning model as the input data;
a providing unit that provides information on the processing target data to a user terminal according to a result output from the learning model in response to the input of the processing target data;
a receiving unit that receives, from the user terminal, teacher information from which the teacher data is derived; and
a teacher data input unit that inputs the teacher data corresponding to the teacher information to the model construction unit.

A classification system comprising:
a model construction unit that constructs a learning model based on teacher data indicating a relationship between input data relating to at least one publication and output data indicating a degree of risk assigned to the publication corresponding to the input data;
a processing target data input unit that inputs processing target data relating to at least one publication to the learning model as the input data;
a providing unit that provides information on the processing target data to a user terminal according to a result output from the learning model in response to the input of the processing target data;
a receiving unit that receives, from the user terminal, teacher information from which the teacher data is derived; and
a teacher data input unit that inputs the teacher data corresponding to the teacher information to the model construction unit.

A method for controlling a classification system having a model construction unit that constructs a learning model based on teacher data indicating a relationship between input data relating to at least one publication and output data indicating at least one classification assigned to the at least one publication corresponding to the input data, the method comprising:
a processing target data input step of inputting processing target data relating to at least one publication to the learning model as the input data;
a providing step of providing information on the processing target data to a user terminal according to a result output from the learning model in response to the input of the processing target data;
a receiving step of receiving, from the user terminal, teacher information from which the teacher data is derived; and
a teacher data input step of inputting the teacher data corresponding to the teacher information to the model construction unit.

A program for causing a computer to function as each unit of the classification system according to any one of claims 1 to 16.

A classification system comprising:
a model construction unit that constructs a learning model based on teacher data indicating a relationship between a plurality of input data relating to an application and output data indicating at least one classification assigned to the application;
a processing target data input unit that inputs a plurality of processing target data relating to the application to the learning model as the plurality of input data;
a providing unit that provides information on the plurality of processing target data to a user terminal according to a result output from the learning model in response to the input of the plurality of processing target data;
a receiving unit that receives, from the user terminal, teacher information from which the teacher data is derived; and
a teacher data input unit that inputs the teacher data corresponding to the teacher information to the model construction unit,
wherein the providing unit provides the user terminal with the processing target data, among the plurality of processing target data, that is the basis for determining that the at least one classification is assigned to the application.

The classification system according to claim 19, wherein, when the processing target data that is the basis for determining that the at least one classification is assigned to the application is included in predetermined processing target data among the plurality of processing target data, the providing unit provides the classification to the user terminal as the first classification.

The classification system according to claim 19 or 20, wherein the output data indicated by the teacher data includes a value indicating a degree of approximation for each of a plurality of classifications that can be assigned to the application corresponding to the input data, and
the providing unit provides the user terminal with the classification having the highest degree of approximation among the plurality of classifications that can be assigned to the application as the top classification.

The classification system according to any one of claims … to 21, wherein the processing target data input unit inputs text data and image data relating to the application to the learning model as the plurality of input data, and
the model construction unit constructs the learning model based on teacher data indicating a relationship between the text data and image data and output data indicating at least one classification assigned to the application.

The classification system according to any one of claims 19 to 21, wherein the processing target data input unit inputs text data relating to the application, as part of the plurality of input data, to a first sub-learning model included in the learning model, and inputs image data relating to the application, as part of the plurality of input data, to a second sub-learning model included in the learning model, and
the model construction unit constructs the first sub-learning model based on teacher data indicating a relationship between the text data and output data indicating at least one classification assigned to the application, and constructs the second sub-learning model based on teacher data indicating a relationship between the image data and output data indicating at least one classification assigned to the application.

The classification system according to any one of claims 19 to 23, wherein the application is a patent application or a utility model registration application, and
the plurality of processing target data include at least one of the application form, the specification, the claims or the utility model registration claims, the drawings, and the abstract.

The classification system according to claim …, wherein the application is a design registration application, and
the plurality of processing target data include at least one of the application form, the article to which the design described in the application form relates, the description of the article to which the design relates, the description of the design, and the drawings.

The classification system according to any one of claims 19 to 23, wherein the application is a trademark registration application, and
the plurality of processing target data include at least one of the application form and the designated goods, the designated services, and the trademark described in the application form.

The classification system according to any one of claims 19 to 26, wherein the output data indicated by the teacher data includes a value indicating a degree of approximation for each of a plurality of classifications that can be assigned to the application corresponding to the input data,
the providing unit:
acquires information identifying the user corresponding to each classification;
provides, based on the acquired information identifying the user, the information on the processing target data to the user terminal of the user corresponding to the classification output from the learning model; and
when the degree of approximation output from the learning model in response to the input of the processing target data exceeds a threshold for a plurality of classifications, provides the information on the processing target data to the user terminal of the user corresponding to each of those classifications,
the receiving unit receives teacher information from each of the user terminals, and
the classification system further comprises an integration unit that integrates the teacher information received from the plurality of user terminals.

The classification system according to claim 27, wherein when the teacher information of some user terminals is different from the teacher information of other user terminals, the integration unit determines the teacher information to be adopted by majority vote.

The classification system according to claim 27, wherein the integration unit:
acquires weight information set for each user; and
when the teacher information of some user terminals differs from the teacher information of other user terminals, determines the teacher information taking the weight of each user into account.

The classification system according to claim 27, wherein the integration unit:
is able to identify a leader among the users; and
when the teacher information of some user terminals differs from the teacher information of other user terminals, receives an instruction from the leader and determines the teacher information based on the received instruction.