JP2021026406A

JP2021026406A - Information processing apparatus, information processing method, and program

Info

Publication number: JP2021026406A
Application number: JP2019142515A
Authority: JP
Inventors: 竜太植田; Ryuta Ueda
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2019-08-01
Filing date: 2019-08-01
Publication date: 2021-02-22
Anticipated expiration: 2039-08-01
Also published as: JP7406885B2

Abstract

To appropriately evaluate input data even when a classifier has not sufficiently learned the characteristics needed to classify that input data.
SOLUTION: An information processing apparatus according to the present invention comprises: a likelihood acquisition unit that acquires class likelihoods for medical data to which a correct label has been assigned, by using a first classifier that classifies medical data into classes; a classification result evaluation unit that evaluates a degree of deviation based on the class likelihoods acquired by the likelihood acquisition unit and the class corresponding to the correct label; a determination unit that determines whether the degree of deviation evaluated by the classification result evaluation unit satisfies a predetermined criterion; and a classifier learning unit that trains a second classifier using, as training data, the medical data determined by the determination unit to satisfy the predetermined criterion.
SELECTED DRAWING: Figure 6

Description

The present invention relates to an information processing apparatus, an information processing method, and a program that train, based on the classification results of a classifier for medical data to which correct labels have been assigned, a classifier different from that classifier.

Computer-aided diagnosis (CAD) systems are known that analyze medical images and present information that helps a physician interpret them. Some CAD systems that classify and present candidate diagnoses for differential diagnosis from medical images are realized by having a classifier perform machine learning on training data that pairs medical data with the correct diagnosis name (correct label).

The CAD system disclosed in Patent Document 1 stores support results, such as lesions detected by a machine-learning-based system for detecting abnormalities such as abnormal shadows, in association with the results after a physician has corrected them, and quantitatively evaluates the performance of the support processing.

Japanese Patent No. 4104036

With the technique of Patent Document 1, the performance of a classifier can be evaluated based on correction information for the classification results of a single classifier. However, Patent Document 1 does not disclose training a different classifier based on the classification results of a classifier for medical data to which correct labels have been assigned.

An information processing apparatus according to the present invention has the following configuration. That is, it comprises:
a likelihood acquisition unit that acquires class likelihoods for medical data to which a correct label has been assigned, by using a first classifier that classifies medical data into classes;
a classification result evaluation unit that evaluates a degree of deviation based on the class likelihoods acquired by the likelihood acquisition unit and the class corresponding to the correct label;
a determination unit that determines whether the degree of deviation evaluated by the classification result evaluation unit satisfies a predetermined criterion; and
a classifier learning unit that trains a second classifier using, as training data, the medical data determined by the determination unit to satisfy the predetermined criterion.

According to the present invention, a classifier different from a given classifier can be trained based on the classification results of the given classifier for medical data to which correct labels have been assigned.

System configuration diagram of an information processing system including the information processing apparatus of Embodiments 1 to 4
Hardware configuration diagram of the information processing apparatus of Embodiments 1 to 4
Conceptual diagram showing the configuration of the medical image DB of Embodiments 1 to 4
Flow diagram of the classifier creation process of the information processing apparatus
Flow diagram of the evaluation of medical data to be classified by the information processing apparatus
Functional block diagram of the information processing apparatus of Embodiment 1
Example of a display screen of the information processing apparatus of Embodiment 1
Flow diagram of the processing of the information processing apparatus of Embodiment 1
Example of a display screen of the information processing apparatus of Embodiment 1
Functional block diagram of the information processing apparatus of Embodiment 2
Example of a display screen of the information processing apparatus of Embodiment 2
Flow diagram of the processing of the information processing apparatus of Embodiment 2
Classification data diagram of Embodiment 3
Example of a display screen of the information processing apparatus of Embodiment 3
Example of a display screen of the information processing apparatus of Embodiment 3
Functional block diagram of the information processing apparatus of Embodiment 4
Flow diagram of the processing of the information processing apparatus of Embodiment 4

Hereinafter, the invention is described in detail based on its embodiments, with reference to the accompanying drawings. Unless otherwise noted, items already described in another embodiment are given the same reference numerals and their description is omitted. The configurations shown in the following embodiments are merely examples, and the present invention is not limited to the illustrated configurations.

<Embodiment 1>
Embodiment 1 describes an information processing apparatus that is a CAD system for classifying the diagnosis name of lung nodule shadows on chest X-ray CT (Computed Tomography) images. The information processing apparatus of this embodiment evaluates the degree of deviation between the classification likelihoods produced by a classifier and the correct answer, evaluates the similarity between the data selected based on that degree of deviation and the data to be processed, and notifies the user of the result. Based on the notification, the user can choose whether to continue processing.

(System configuration)
FIG. 1 is a system configuration diagram of an information processing system including the information processing apparatus of this embodiment.

In FIG. 1, the information processing system comprises a medical image database (hereinafter, medical image DB) 102, an information processing apparatus 101, and a LAN (Local Area Network) 103.

The medical image DB 102 stores medical data that includes medical images captured by a medical imaging device such as a CT scanner and the diagnosis names for those images. It also provides a known database function for searching and retrieving the medical data via the LAN 103. The structure of the medical data stored in the medical image DB 102 is described with reference to FIG. 3.

(Hardware configuration)
FIG. 2 is a hardware configuration diagram of the information processing apparatus 101 of this embodiment.

In FIG. 2, the storage medium 201 is a storage medium, such as an HDD (Hard Disk Drive), that stores the OS (Operating System), the processing programs that perform the various processes of this embodiment, and various information. The ROM (Read Only Memory) 202 stores programs, such as the BIOS (Basic Input Output System), that initialize the hardware and start the OS. The CPU (Central Processing Unit) 203 performs the arithmetic processing involved in executing the BIOS, the OS, and the processing programs. The RAM (Random Access Memory) 204 temporarily stores information while the CPU 203 executes a program. The LAN interface 205 conforms to a standard such as IEEE (Institute of Electrical and Electronics Engineers) 802.3ab and is an interface for communicating via the LAN 103. Reference numeral 207 denotes a display that shows the display screen, and 206 denotes a display interface that converts the screen information to be shown on the display 207 into a signal and outputs it. Reference numeral 209 denotes a keyboard for key input, 210 denotes a mouse for specifying coordinate positions on the screen and inputting button operations, and 208 denotes an input interface that receives signals from the keyboard 209 and the mouse 210. Reference numeral 211 denotes an internal bus over which the blocks communicate.

(Structure of the medical data)
FIG. 3 is a conceptual diagram showing the structure of the medical data stored in the medical image DB 102.

In FIG. 3, the medical data stored in the medical image DB 102 consists of a first medical data set 310 and a medical data set 320 to be classified. The first medical data set 310 is a medical data set used to verify the classifier. It includes a plurality of first medical data 311-j (j = 1, ..., N1), each of which consists of information such as patient information 301, a diagnosis name 302, and an image 303. The patient information 301 is information about the patient, such as the patient ID, name, age, and sex. The diagnosis name 302 is the diagnosis name for the image 303; in this embodiment it is, for example, one of three types: "primary", "metastasis", and "benign". Here, "primary" refers to primary lung cancer, "metastasis" to metastatic lung cancer, and "benign" to a benign nodule. The image 303 is a three-dimensional partial-region image, extracted from a CT image, that contains a lung nodule.
The medical data set 320 to be classified is the medical data set that the information processing apparatus 101 classifies. It includes a plurality of medical data 321-j (j = 1, ..., N3) to be classified, each of which consists of patient information 301 and a series image 304. The series image 304 is the set of images obtained by a CT scanner in a single scan and consists of multiple cross-sectional images (hereinafter, slice images).
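
The record layout described above can be sketched as plain data classes. This is only an illustrative sketch: the class and field names (`PatientInfo`, `LabeledMedicalData`, `UnlabeledMedicalData`, `voxels`, `slices`) are hypothetical and do not appear in the source, and images are represented as nested lists purely as placeholders.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical names throughout; only the reference numerals come from the source.
Volume = List[List[List[float]]]   # a 3-D partial-region image (image 303)
Slice = List[List[float]]          # one cross-sectional slice image


@dataclass
class PatientInfo:                 # patient information 301
    patient_id: str
    name: str
    age: int
    sex: str


@dataclass
class LabeledMedicalData:          # first medical data 311-j
    patient: PatientInfo
    diagnosis: str                 # diagnosis name 302: "primary", "metastasis" or "benign"
    voxels: Volume                 # image 303


@dataclass
class UnlabeledMedicalData:        # medical data to be classified 321-j
    patient: PatientInfo
    slices: List[Slice]            # series image 304


patient = PatientInfo("P001", "Example Patient", 63, "F")
labeled = LabeledMedicalData(patient, "primary", [[[0.0]]])
unlabeled = UnlabeledMedicalData(patient, [[[0.0]], [[0.0]]])
```

Only the labeled records carry a diagnosis name, which is what allows them to be used to verify a classifier.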

The diagnosis name 302 may instead be "malignant" or "benign", or a finer subdivision of primary, metastasis, and benign. The image 303 may also be a combination of a CT image and coordinate information indicating the three-dimensional partial region that contains the lung nodule. The first medical data 311-i and the third medical data 321-i (the medical data to be classified) may include information other than that described above.

FIGS. 4 and 5 show, in simplified form, the configuration by which the information processing apparatus 101 evaluates the data to be classified (input data) and issues a notification. First, FIG. 4 is used to describe the configuration for creating, from the first medical data set 310, the classification data sets used to evaluate the medical data to be classified. Next, FIG. 5 is used to describe the configuration for evaluating the medical data set to be classified based on the classification data sets thus created.

FIG. 4 shows the flow for creating the classification data sets and classifiers used to evaluate the medical data to be classified. In the description below, i is assumed to be set to 1 as the start condition. At least one trained classifier is assumed to exist; if no trained classifier exists, a classifier trained on the first medical data set is provided. The flow is described under these preconditions.

First, the i-th medical data acquisition unit 401 acquires the i-th medical data from the medical image DB 102. When i = 1, the i-th data is the first medical data set; that is, the i-th medical data acquisition unit 401 acquires the first medical data set 310. The acquired medical data is sent to the i-th classifier 402, and the i-th (first) classifier outputs its classification of the diagnosis name as likelihoods. The classifier is described later.

Next, the classification result evaluation unit 403 receives the classification results of the i-th classifier 402 and evaluates the degree of deviation between each classification result and the correct answer. The evaluation method is also described later. It is then determined whether the deviation from the correct answer is at or above a predetermined criterion, and the (i+1)-th medical data is generated from the i-th medical data. Data whose deviation does not satisfy the predetermined criterion is stored as the i-th classification data for the i-th classifier. The classification data is thus the group of data for which the deviation between the i-th classifier's classification and the correct answer is smaller than the predetermined criterion (does not satisfy it), that is, data the classifier can classify accurately.

Meanwhile, the medical data whose degree of deviation is larger than the predetermined criterion (satisfies it), namely the (i+1)-th medical data, is sent to the learning unit 404 for the (i+1)-th classifier. The learning unit trains the (i+1)-th classifier on the (i+1)-th medical data, using the diagnosis name 302 associated with each item as its class (label). Then i is replaced by i+1 and the flow is executed again from 401. With this configuration, data whose deviation is at or above the predetermined criterion is evaluated recursively to build classification data sets, and comparing those classification data sets with the input data makes it possible to evaluate the data to be classified (input data), as described with FIG. 5.

Although this flow does not specify an end condition, the process may end, for example, when the number of training data falls to or below a certain level, when the accuracy falls to or below a certain level, or when the training data is judged insufficient for the model structure. A judgment of overfitting or insufficient learning may also serve as the end condition. Alternatively, the flow may be run only a number of times chosen by the user, or the end condition may be set according to the number of items in the medical data set or the variance of the data.
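
The repeated flow of FIG. 4 can be sketched as a loop. This is a minimal sketch under stated assumptions, not the patented implementation: `build_cascade`, `train`, `deviation`, `min_samples`, and `max_rounds` are hypothetical names, the classifier is any callable returning class likelihoods, and the two end conditions used here (too few remaining samples, or a fixed number of rounds) are just two of the options the text lists. The small tolerance `eps` is an added assumption to absorb floating-point rounding at the threshold.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[object, int]                      # (input, index of the correct class)
Classifier = Callable[[object], Sequence[float]]  # returns class likelihoods


def build_cascade(
    first_classifier: Classifier,
    data: List[Sample],
    train: Callable[[List[Sample]], Classifier],         # hypothetical training routine
    deviation: Callable[[Sequence[float], int], float],  # hypothetical deviation measure
    threshold: float = 1.2,
    min_samples: int = 1,
    max_rounds: int = 10,
    eps: float = 1e-9,
):
    """Return (classifiers, classification_sets), one classification set per classifier."""
    classifiers = [first_classifier]
    classification_sets: List[List[Sample]] = []
    current = list(data)
    for round_index in range(max_rounds):
        clf = classifiers[-1]
        easy, hard = [], []
        for x, label in current:
            # Samples at or above the threshold go on to the next classifier.
            if deviation(clf(x), label) >= threshold - eps:
                hard.append((x, label))
            else:
                easy.append((x, label))
        classification_sets.append(easy)       # i-th classification data
        if len(hard) < min_samples or round_index == max_rounds - 1:
            break                               # assumed end conditions
        classifiers.append(train(hard))         # (i+1)-th classifier
        current = hard                          # (i+1)-th medical data
    return classifiers, classification_sets
```

Each classifier ends up paired with the set of samples it classified with small deviation, which is the association the later similarity evaluation relies on.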

Next, the flow that the information processing apparatus 101 performs when the medical data set 320 to be classified is input is described with reference to FIG. 5. First, the classification target medical data acquisition unit 501 acquires the medical data set 320 to be classified from the medical image DB 102 and sends it to the classification target medical data evaluation unit 502. The classification target medical data evaluation unit 502 evaluates the similarity between the input medical data set 320 and the classification data sets. That is, for the classification data set associated with each classifier, which is the set of data that classifier can classify accurately, it evaluates whether the similarity of the medical data set 320 to be classified is at or above a certain level. The evaluation result is sent to the notification unit 503, which issues a notification based on the evaluation result it receives. The method of evaluating similarity to a classifier's classification data is also described later.

The following description uses the functional block diagram (FIG. 6), which shows the functions for carrying out the flows of FIGS. 4 and 5.

(Functional blocks)
FIG. 6 is a functional block diagram of the information processing apparatus 101 of this embodiment.

In FIG. 6, in addition to the functional blocks shown in FIGS. 4 and 5, the information processing apparatus 101 comprises: a likelihood acquisition unit 601 that acquires likelihoods as the classifier's classification result; a determination unit 602 that identifies medical data whose degree of deviation exceeds the predetermined criterion; the (i+1)-th medical data set 603, which is the data determined by the determination unit 602 to exceed the predetermined criterion; and the classification data set 620 in the medical image DB 102, which stores the medical data determined by the determination unit 602 not to exceed the predetermined criterion as the classification data for the i-th classifier. The function of each unit is described below.

The description here is divided, following FIGS. 4 and 5, into the flow for creating the classification data set 620 used to evaluate the medical data set 320 to be classified, and the flow for evaluating the medical data set 320 to be classified based on the created classification data set 620.

(Flow for creating the classification data set 620)
The i-th medical data acquisition unit 401 acquires medical data from the medical image DB 102. When i = 1, it acquires, for example, the first medical data set 310. It then sends the acquired medical data set to the likelihood acquisition unit 601.

The i-th classifier 402 (the first classifier when i = 1) classifies its input into a diagnosis name (class); when i = 1, for example, the input is the partial-region image of a lung nodule in the image 303 of the first medical data. As the classification result, the i-th classifier 402 outputs likelihoods indicating which class the input image belongs to. That is, the classifier 402 outputs the likelihood of "primary", the likelihood of "metastasis", and the likelihood of "benign". Specifically, the classifier 402 is a CNN (Convolutional Neural Network) trained by machine learning on the i-th medical data set (the first medical data set 310 when i = 1).

The likelihood acquisition unit 601 inputs the images 303 of the first medical data set 310, received from the i-th medical data acquisition unit, into the i-th classifier 402 (the first classifier) and acquires the classification likelihoods that the i-th classifier 402 outputs. Specifically, the classification result of the i-th classifier 402 is the output of the final layer of the CNN: the values obtained by applying the operation called softmax to the outputs of the three nodes corresponding to "primary", "metastasis", and "benign".
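
As an illustration of how three final-layer node outputs become class likelihoods, the following is a standard softmax written with the standard library only. The node values `[2.0, 1.0, 0.1]` and the name `CLASSES` are made-up examples, not values from the source, and subtracting the maximum before exponentiating is a common numerical-stability choice rather than something the source specifies.

```python
import math
from typing import List, Sequence

CLASSES = ("primary", "metastasis", "benign")  # class order assumed for illustration


def softmax(node_outputs: Sequence[float]) -> List[float]:
    """Convert raw final-layer outputs into likelihoods that sum to 1."""
    peak = max(node_outputs)                   # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in node_outputs]
    total = sum(exps)
    return [e / total for e in exps]


likelihoods = softmax([2.0, 1.0, 0.1])         # hypothetical node outputs
for name, value in zip(CLASSES, likelihoods):
    print(f"{name}: {value:.3f}")
```

The resulting values are non-negative and sum to 1, so they can be compared directly with a one-hot correct answer such as (1.0, 0.0, 0.0).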

When i = 1, the classification result evaluation unit 403 evaluates, for each first medical data 311-j (j = 1, ..., N1) in the first medical data set 310, the degree of deviation between the class likelihoods acquired by the likelihood acquisition unit 601 and the correct answer. For example, it compares the likelihood of the correct class, which is the class corresponding to the correct label, with a predetermined value. Specifically, for the correct class, the classification result evaluation unit 403 calculates the absolute value of the difference between the likelihood of the correct class in the classification result of the classifier 402 and 1.0. For the classes other than the correct class, it takes the class with the highest likelihood among them and calculates the absolute value of the difference between that likelihood and 0.0. The evaluation value is the sum of these two differences.

For example, let the correct answer when the diagnosis name is "primary" be written (1.0, 0.0, 0.0), and let a classification result with a likelihood of 0.8 for "primary", 0.2 for "metastasis", and 0.0 for "benign" be written (0.8, 0.2, 0.0). Evaluating the deviation between the likelihoods (0.8, 0.2, 0.0) and the correct answer (1.0, 0.0, 0.0) gives |0.8 - 1.0| + |0.2 - 0.0| = 0.4. Similarly, the deviation for likelihoods (0.8, 0.1, 0.1) is 0.3. For likelihoods (0.6, 0.4, 0.0) it is 0.8, and for (0.6, 0.2, 0.2) it is 0.6. Thus, even when the classification result is "primary" and matches the correct answer, the deviation grows when the likelihood of "primary" is low and the highest likelihood among the other classes is high. Likewise, the deviation is 1.4 for likelihoods (0.3, 0.7, 0.0), 1.05 for (0.3, 0.35, 0.35), 1.8 for (0.1, 0.9, 0.0), and 1.35 for (0.1, 0.45, 0.45). Thus, even when the classification result differs from the correct answer "primary", the deviation shrinks when the likelihood of "primary" is relatively high and the highest likelihood among the other classes is low.

In other words, the deviation evaluated in this embodiment differs from a simple accuracy rate or from the likelihood of the correct class alone: it combines how right and how wrong the result is. The evaluation value calculated by the classification result evaluation unit 403 is not limited to this form. For example, it may be the absolute value of the difference between the correct-class likelihood from the i-th classifier 402 and 1.0, plus the differences between the likelihoods of the classes other than the correct class and 0.0. Any form is acceptable as long as the evaluation value expresses the degree of deviation between the correct answer and the classification result of the i-th classifier 402.
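
The evaluation value described above, the gap between the correct-class likelihood and 1.0 plus the gap between the largest other likelihood and 0.0, can be written as a short function. This is a minimal sketch; the function and parameter names (`deviation`, `correct_index`) are hypothetical, and class index 0 is assumed to stand for "primary" as in the worked examples.

```python
from typing import Sequence


def deviation(likelihoods: Sequence[float], correct_index: int) -> float:
    """Deviation between class likelihoods and a one-hot correct answer."""
    # How far the correct class falls short of a likelihood of 1.0.
    correct_gap = abs(likelihoods[correct_index] - 1.0)
    # How far the strongest wrong class rises above a likelihood of 0.0.
    others = [p for k, p in enumerate(likelihoods) if k != correct_index]
    wrong_gap = abs(max(others) - 0.0)
    return correct_gap + wrong_gap


# Worked examples from the text, correct class "primary" at index 0.
print(deviation((0.8, 0.2, 0.0), 0))    # approximately 0.4
print(deviation((0.3, 0.35, 0.35), 0))  # approximately 1.05
```

Because only the strongest wrong class counts, spreading the same wrong-class mass over two classes lowers the value, which is why (0.8, 0.1, 0.1) scores about 0.3 while (0.8, 0.2, 0.0) scores about 0.4.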

Based on the evaluation results of the classification result evaluation unit 403, the determination unit 602 determines, for each first medical data 311-j (j = 1, ..., N1) of the first medical data set 310 (when i = 1), whether it satisfies the predetermined criterion, and obtains the (i+1)-th medical data set 603 consisting of the medical data that satisfies it. When i = 1, the medical data set determined to satisfy the predetermined criterion is the second medical data set. The predetermined criterion is a fixed value defined in advance, for example, a deviation of 1.2 or more. In this case, even among data whose correct answer is "primary" and whose classification result differs, data with likelihoods (0.3, 0.35, 0.35) or (0.25, 0.375, 0.375) has a deviation of 1.05 or 1.125 respectively and is therefore not included in the second medical data set 603, the data set that satisfies the predetermined criterion. By contrast, data with likelihoods (0.3, 0.5, 0.2) has a deviation of 1.2, as does data with likelihoods (0.25, 0.45, 0.3), so both belong to the second medical data set 603.
A setting unit (not shown) for setting the predetermined criterion may be provided separately, or a criterion value may be accepted from the user via a GUI or the like. The determination unit 602 also stores the data that it determines, based on the evaluation results of the classification result evaluation unit 403, not to satisfy the predetermined criterion in the storage unit (medical image DB 102) as the classification data 620 for the i-th classifier, in association with that classifier. Medical data determined not to satisfy the predetermined criterion is, for example, data whose deviation is less than 1.2, that is, data for which the determination unit 602 finds the deviation between the classifier's classification result and the correct answer to be below the predetermined criterion. The determination unit 602 determines whether the predetermined criterion is satisfied. The predetermined criterion is, for example, a threshold: here, data satisfies the criterion when its degree of deviation exceeds the threshold and does not satisfy it when its degree of deviation does not exceed the threshold.
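
The threshold decision can be sketched as a simple partition of already-scored samples. The names `split_by_threshold` and `eps` are hypothetical; the sketch follows the "1.2 or more" wording of the example, and the small tolerance `eps` is an added assumption so that values that are mathematically equal to the threshold are not lost to floating-point rounding.

```python
from typing import List, Tuple


def split_by_threshold(
    scored: List[Tuple[str, float]],   # (sample id, deviation) pairs
    threshold: float = 1.2,
    eps: float = 1e-9,                 # tolerance for rounding; an assumption
) -> Tuple[List[str], List[str]]:
    """Return (next_training_set, classification_data)."""
    next_training_set: List[str] = []    # satisfies the criterion: goes to classifier i+1
    classification_data: List[str] = []  # does not satisfy it: kept for classifier i
    for sample_id, value in scored:
        if value >= threshold - eps:
            next_training_set.append(sample_id)
        else:
            classification_data.append(sample_id)
    return next_training_set, classification_data


# Deviations taken from the four examples in the text.
scored = [("s1", 1.05), ("s2", 1.125), ("s3", 1.2), ("s4", 1.2)]
hard, easy = split_by_threshold(scored)
```

With these inputs the first two samples stay with the current classifier as its classification data, and the last two are passed on as training data for the next classifier.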

That is, medical data determined by the determination unit 602 not to satisfy the predetermined criterion are used as the classification data corresponding to the classifier that classified those medical data. The classification data and the classifier that classified the medical data constituting the classification data are stored in the medical image DB in association with each other.

The learning unit 404 for the (i+1)-th classifier (the second classifier when i = 1) trains the (i+1)-th classifier using teacher data that pair the (i+1)-th medical data set 603 determined by the determination unit 602 with the diagnosis names. Like the i-th classifier, the (i+1)-th classifier is configured to output likelihoods for the diagnosis names when an image to be classified is input. That is, the information processing apparatus 101 has a likelihood acquisition unit 601 that uses the i-th classifier 402, which classifies medical data into classes, to acquire class likelihoods for medical data to which a correct label has been assigned, and a classification result evaluation unit 403 that evaluates the degree of deviation based on the class likelihoods acquired by the likelihood acquisition unit 601 and the class corresponding to the correct label. It further has a determination unit 602 that determines whether the deviation evaluated by the classification result evaluation unit 403 satisfies the predetermined criterion, and a learning unit for the (i+1)-th classifier that trains the (i+1)-th classifier using, as teacher data, the medical data determined by the determination unit 602 to satisfy the predetermined criterion.

When the flow up to this point finishes and no termination is determined under the termination conditions described above, the flow is repeated with i + 1 substituted for i. That is, the information processing apparatus 101 has a control unit (CPU 203) that treats the (i+1)-th classifier trained with the teacher data as the classifier that classifies medical data (the i-th classifier 402) and can repeatedly execute the processing of the likelihood acquisition unit 601, the classification result evaluation unit 403, the determination unit 602, and the learning unit 404 on the medical data determined to satisfy the predetermined criterion. Through this iterative processing, the information processing apparatus 101 stores a plurality of classifiers and the classification data corresponding to each of them in the storage unit (medical image DB 102).

The repetition of the classifier creation and classification data creation flow may be governed by any of the termination conditions described above. For example, the training data available for training a classifier may shrink as the flow repeats. Because a reduction in teacher data lowers classifier accuracy, the repetition ends when the number of teacher data falls to or below a predetermined number relative to, for example, the model structure of the classifier. Alternatively, the repetition may end when the classification accuracy of the classifier falls below a predetermined level, or when overfitting or underfitting, caused by skew in the teacher data given to the classifier, by their number, or by the number of training iterations, is detected. The flow may of course also be configured to repeat only a number of times specified by the user. That is, the information processing apparatus 101 ends the repetition when any one of the following determinations is made: the number of teacher data for training the classifier is at or below a predetermined number, the classification accuracy of the classifier is at or below a predetermined level, overfitting has occurred, underfitting has occurred, or the number of repetitions specified by the user has been exceeded.
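The repeated flow and two of its termination conditions can be sketched as follows. This is an illustrative outline only: `build_cascade`, `majority_trainer`, and the parameter names are hypothetical, the trainer is a toy stand-in for whatever learning method the classifier actually uses, and only the "too few teacher data" and "user-specified number of rounds" conditions are modelled. The deviation formula is the one assumed from the worked examples in the description.

```python
def deviation(likelihoods, correct_index):
    """|1 - p_correct| plus the highest likelihood among the other classes."""
    others = [p for k, p in enumerate(likelihoods) if k != correct_index]
    return abs(1.0 - likelihoods[correct_index]) + max(others)


def build_cascade(dataset, train, threshold=1.2, min_samples=2, max_rounds=5):
    """Return a list of (classifier, classification_data) pairs, one per round.

    dataset: list of (features, correct_index) pairs.
    train: callable that takes such a list and returns predict(features),
           which yields one likelihood per class.
    """
    stages = []
    current = list(dataset)
    for _ in range(max_rounds):            # user-specified limit on rounds
        if len(current) < min_samples:     # too little teacher data remains
            break
        predict = train(current)           # i-th classifier from i-th data set
        kept, passed_on = [], []
        for features, label in current:
            d = deviation(predict(features), label)
            (passed_on if d >= threshold else kept).append((features, label))
        stages.append((predict, kept))     # kept = classification data of round i
        if not passed_on:                  # nothing left for an (i+1)-th classifier
            break
        current = passed_on                # becomes the (i+1)-th data set
    return stages


def majority_trainer(samples, n_classes=2):
    """Toy trainer (assumption): always predicts the class frequencies it saw."""
    counts = [0] * n_classes
    for _, label in samples:
        counts[label] += 1
    freqs = [c / len(samples) for c in counts]
    return lambda features: freqs


data = [(None, 0)] * 4 + [(None, 1)] * 2
stages = build_cascade(data, majority_trainer)
print([len(kept) for _, kept in stages])
```

Each returned pair corresponds to one classifier and the classification data stored in association with it.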

The following describes the processing performed when the medical data set 320 to be classified is input, based on the plurality of classifiers created as described above, which constitute the information processing apparatus 101, and the classification data sets 620 corresponding to each of those classifiers.

(Flow for evaluating the medical data set 320 to be classified)
The classification-target medical data acquisition unit 501 acquires the medical data set 320 to be classified from the medical image DB 102 and transmits it to the classification-target medical data evaluation unit 502.

The classification-target medical data evaluation unit 502 evaluates the similarity between the classification data set 620 and the partial region image of the lung nodule extracted from the series image 304 of the medical data set 320 to be classified. That is, the classification-target medical data evaluation unit 502 evaluates the similarity between the classification data 620 and the medical data set 320 to be classified.

The partial region image of the lung nodule is extracted based on operations on the display screen described with reference to FIG. 7. Similarity is evaluated by a machine-learned classifier (a classifier that classifies the classification-target medical data) trained on the classification data created in the classification data creation flow above, with the classifier associated with each item of classification data assigned as its class (label). The classifier that classifies the classification-target medical data is, for example, a CNN. The likelihood output by the CNN is used as the similarity (hereinafter, the degree of similarity). That is, the information processing apparatus 101 has a classification-target medical data evaluation unit 502 that evaluates the medical data to be classified using a classifier trained on teacher data consisting of classification data labeled with each of the plurality of classifiers. The classification-target medical data evaluation unit 502 outputs its classification result as likelihoods.
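To illustrate how classification data can be labeled with the classifier they belong to and how likelihoods can then serve as degrees of similarity, here is a small sketch. The description names a CNN as an example; the nearest-centroid model below is only a toy stand-in chosen to keep the example self-contained, and all function names and the feature vectors are hypothetical. It assumes every classifier has at least one item of classification data.

```python
import math


def build_router_training_set(classification_data_per_classifier):
    """Label every item of classification data with the index of its classifier."""
    return [(item, k)
            for k, items in enumerate(classification_data_per_classifier)
            for item in items]


def softmax(scores):
    """Turn arbitrary scores into likelihoods that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train_centroid_router(training_set, n_classifiers):
    """Toy stand-in for the CNN: one mean feature vector per classifier label."""
    sums = [None] * n_classifiers
    counts = [0] * n_classifiers
    for vec, k in training_set:
        sums[k] = list(vec) if sums[k] is None else [a + b for a, b in zip(sums[k], vec)]
        counts[k] += 1
    centroids = [[a / counts[k] for a in sums[k]] for k in range(n_classifiers)]

    def similarity(vec):
        # Score = negative squared distance to each centroid; softmax of the
        # scores gives one degree of similarity per classifier.
        scores = [-sum((a - b) ** 2 for a, b in zip(vec, c)) for c in centroids]
        return softmax(scores)

    return similarity


# Hypothetical 2-D features for the classification data of two classifiers.
per_classifier = [[(0.0, 0.0), (0.2, 0.0)], [(3.0, 3.0), (3.0, 3.2)]]
router = train_centroid_router(build_router_training_set(per_classifier), 2)
sims = router((0.1, 0.1))
print(sims)
```

The output has one entry per classifier, so it can be shown to the user directly as the degree of similarity to each classifier's classification data.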

The notification unit 503 notifies the user of information based on the evaluation result from the classification-target medical data evaluation unit 502. Specifically, it displays on the display screen the degree of similarity to the classification data corresponding to each classifier. The display screen is described with reference to FIG. 7.

(Display screen)
FIG. 7 is a diagram showing an example of the display screen of the information processing apparatus 101 of the present embodiment.

In FIG. 7, the display screen 700 is a user interface screen displayed on the display 207. The display screen 700 comprises a patient information display area 701, an image display area 702, and a diagnosis support button 704. Also in FIG. 7, 703 is a lung nodule partial region, and 705 is a notification area displayed by the notification unit 503.

The patient information display area 701 displays the patient information 301 of the medical data 321-j (j = 1, ..., N3) to be classified. FIG. 7 shows a display example in which the patient name is "Taro Shimomaruko", the patient ID is "pat0123456", the age is "75", and the sex is "male".

The image display area 702 displays the series image 304 of the medical data 321-j (j = 1, ..., N3) to be classified. In the image display area 702, the display can be changed, for example by scrolling through the slices of the displayed series image or by changing the grayscale display conditions known as WL (Window Level) / WW (Window Width).

The lung nodule partial region 703 is specified, for example, by the user dragging the mouse in the image display area 702, and is cleared by a mouse click. The nodule partial region 703 is displayed on the slice image as the user drags, and a three-dimensional region (a cube) with the same depth, centered on the displayed slice image, is specified. Specification of the partial region is not limited to user operation; for example, it may be performed by other image processing means, or by a machine-learning-based model designed to extract partial regions from the image region.

The diagnosis support button 704 is a button for classifying the diagnosis name from the image of the lung nodule partial region 703. When the diagnosis support button 704 is clicked with the mouse, the information processing apparatus 101 extracts the image of the lung nodule partial region 703 and classifies the diagnosis name from the extracted image.

The notification area 705 displays information based on the similarity between the image of the lung nodule partial region 703 and the classification data set 620 associated with each classifier. Specifically, it is a pop-up window that displays the degree of similarity and provides buttons for choosing whether to execute or cancel the processing.

(Processing flow)
FIG. 8 is a flowchart of the processing of the information processing apparatus 101 of the present embodiment.

This processing is executed in response to a user instruction after the information processing apparatus 101 has started. When instructing execution, the user specifies the medical data 321-j (j = 1, ..., N3) to be classified.

In step S801, the i-th medical data acquisition unit 401 reads the i-th medical data set 310 from the medical image DB 102.

In step S802, the likelihood acquisition unit 601 inputs the images 303 constituting the i-th medical data set 310 read in step S801 into the i-th classifier 402 and acquires, as the classification result of the i-th classifier 402, the likelihood of classification into each class (diagnosis name).

In step S803, the classification result evaluation unit 403 evaluates the degree of deviation between the class likelihoods and the correct answer, based on the class likelihoods acquired by the likelihood acquisition unit 601 and the diagnosis names 302 of the i-th medical data set 310.

In step S804, the determination unit 602 determines whether the degree of deviation evaluated by the classification result evaluation unit 403 satisfies the predetermined criterion, and thereby determines whether each item of medical data 311-j (j = 1, ..., N1) of the i-th medical data set 310 belongs to the (i+1)-th medical data set 603. If the degree of deviation exceeds the predetermined criterion, the item is placed in the (i+1)-th medical data set 603; if it does not, the item is stored in the medical image DB as part of the classification data set 620 for the i-th classifier 402. Here, the degree of deviation between the correct label and the classification result exceeding the predetermined criterion means that it is above a predetermined threshold, and not exceeding the criterion means that it is not above that threshold.

In step S805, the learning unit 404 for the (i+1)-th classifier performs machine learning of the (i+1)-th classifier with the diagnosis names 302 as the correct labels (classes). That is, an (i+1)-th classifier is generated that uses the (i+1)-th medical data as training data and classifies them with the diagnosis names as classes.

In step S806, i + 1 is substituted for i to update the value of i.

In step S807, it is determined whether to end classifier generation. In addition to the conditions described above, the processing may end, for example, when the number of training data falls to or below a certain number, when the accuracy falls to or below a certain level, or when the training data are judged insufficient for the model structure. Detection of overfitting or underfitting may also serve as the termination condition, the flow may be run only a predetermined number of times decided by the user, or the termination condition may be set according to the number of data in the medical data set or the variance of the data. If the termination condition is not satisfied, the classifier training flow is repeated from step S801. If it is satisfied, the processing proceeds to the next step, S808. The steps up to this point correspond to the classification data creation flow, which is the processing of FIG. 4 described above. The steps that follow correspond to the flow for evaluating the medical data set 320 to be classified (FIG. 5). Note that steps S801 to S807 are a flow for training and creating classifiers, and may be omitted when trained classifiers already exist in a different information processing apparatus or in the same one.

In step S808, the classification-target medical data acquisition unit 501 acquires the medical data set 320 to be classified from the medical image DB 102 and reads out the classification-target data. The user interface control unit (CPU 203) reads, in step S808, the medical data 321-j (j = 1, ..., N3) to be classified that were specified when this processing was started and, in step S809, displays the display screen 700, an example of which is shown in FIG. 7. In step S810, the image of the lung nodule partial region 703 specified by user operation is extracted.

In step S811, the classification-target medical data evaluation unit 502 evaluates the similarity between the image extracted in step S810 and the images of the classification data set 620. The evaluation is based on the classification result of a machine-learned classifier (a classifier that classifies the classification-target medical data) trained with the classifier associated with each item of classification data assigned as its class (label).

In step S812, the notification unit 503 displays the notification area 705 on the display screen 700 based on the evaluation result of step S811. The notification area 705 may be displayed by the notification unit 503 only when the evaluation result of step S811 exceeds a predetermined value.

As described above, according to the present embodiment, the information processing apparatus 101 evaluates, in the classification result evaluation unit 403, the degree of deviation between the classification likelihoods of the i-th classifier and the correct answer; the classification-target medical data evaluation unit 502 evaluates the similarity between the classification data determined by the determination unit 602 on the basis of that deviation and the data to be classified; and the evaluation result is reported to the user via the notification unit 503. The present invention provides a plurality of classifiers for a classification target and compares the target with the classification data sets 620 corresponding to those classifiers. With this configuration, even when a single classifier cannot sufficiently learn from the training data the features needed to classify the medical data, for example because of variance in the input data, providing a plurality of classifiers makes it possible to evaluate the input data appropriately. In addition, data that were mislabeled in the teacher data, for example, can be separated from the teacher data and pooled as teacher data (classification data) for a different classifier or as medical data. When no classification data corresponding to any of the classifiers show similarity meeting a predetermined criterion, it can be concluded that the training data used to train the classifiers do not include data containing the features needed to classify the medical data in question. It is therefore possible to evaluate input data appropriately based on the classification results of the classifiers, which is the problem addressed by the present invention.

In addition, the present embodiment allows the user to be made aware, via the notification unit 503, of the evaluation result from the classification-target medical data evaluation unit 502. This notification lets the user decide whether or not to execute the classification processing. When the data to be classified have low similarity to the classification data of the classifiers provided in the information processing apparatus 101, the user can recognize in advance that the classification result would be unreliable, and can choose beforehand to cancel classification processing whose result is expected to be unreliable. The plurality of classifiers may be provided in a single information processing apparatus 101 or in a plurality of information processing apparatuses, or the classification processing may be performed in a virtual environment built across a plurality of information processing apparatuses.

(Modification 1-1)
In the classifier creation flow of the present embodiment, the number of items in the classification data and in the (i+1)-th medical data set, as well as the number of classes, is expected to decrease with each repetition of the flow. Consequently, even when the likelihood for the classification data set of a classifier created after several rounds of the flow is higher than that for the classification data set of a classifier created after fewer rounds, it may be undesirable for the deviation to be judged by the same criterion. In such cases, for example, the criterion used by the determination unit 602 may be raised, or the threshold for executing classification processing may be set higher, as the number of rounds increases. The gist of Modification 1-1 is to make the user aware of the evaluation of the input data; thus, even without changing the criterion, the user may be notified of the number of rounds that were run to create the classifier with the high likelihood, of the number of data used to train that classifier, or of both.

(Modification 1-2)
In Embodiment 1, the classification result evaluation unit 403 evaluated the degree of deviation between the classification result of the i-th classifier 402 and the correct answer as follows: for the correct class, it calculated the absolute value of the difference between the likelihood of the correct class and 1.0; for the other classes, it calculated the absolute value of the difference between 0.0 and the likelihood of the most likely incorrect class; and it summed the two differences. In the present modification of Embodiment 1, by contrast, the classification result evaluation unit 403 evaluates the deviation (calculates the evaluation value) by subtracting the likelihood of the most likely incorrect class from the likelihood of the correct class. With this method, the evaluation value representing the smallest deviation is 1.0, and the value representing the largest deviation is -1.0. The determination unit 602 uses as its predetermined criterion, for example, an evaluation value of -0.2 or less to determine that data belong to the (i+1)-th medical data set.
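The evaluation value of this modification can be sketched as a signed margin. The function names and the tolerance `eps` are illustrative assumptions; only the subtraction itself, the range from 1.0 to -1.0, and the example criterion of -0.2 come from the description.

```python
def margin(likelihoods, correct_index):
    """Likelihood of the correct class minus the highest other likelihood."""
    p_correct = likelihoods[correct_index]
    best_other = max(p for k, p in enumerate(likelihoods) if k != correct_index)
    return p_correct - best_other


def goes_to_next_data_set(likelihoods, correct_index, criterion=-0.2, eps=1e-9):
    """True when the evaluation value is at or below the criterion."""
    # eps is an assumption that absorbs floating-point rounding at the boundary.
    return margin(likelihoods, correct_index) <= criterion + eps


print(margin((1.0, 0.0, 0.0), 0))   # confident and correct: smallest deviation
print(margin((0.0, 1.0, 0.0), 0))   # confident and wrong: largest deviation
print(goes_to_next_data_set((0.3, 0.6, 0.1), 0))
print(goes_to_next_data_set((0.4, 0.35, 0.25), 0))
```

Unlike the sum used in Embodiment 1, this value is positive whenever the correct class has the highest likelihood and negative whenever another class does.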

The classification result evaluation unit 403 may also use only the absolute value of the difference between the likelihood of the correct class and 1.0. In this case, the predetermined criterion of the determination unit 602 is determined from the number of classes. Specifically, whether data belong to the (i+1)-th medical data set is determined by how far the deviation value falls below the reciprocal of the number of classes. For example, with three classes, data are determined to belong to the (i+1)-th medical data set when the value is 0.3135 or less, about 5% below 1/3 = 0.33.... Note that this modification cannot take into account how the classifier is wrong among the classes other than the correct one. For example, for data whose correct answer is "primary", a classification of (0.32, 0.68, 0.0) and one of (0.32, 0.34, 0.34) yield the same deviation value, and both are determined not to belong to the (i+1)-th medical data set.

(Modification 1-3)
In Embodiment 1, the notification unit 503 displayed the notification area 705 before the start of classification processing and had the user choose whether to execute or cancel it. Alternatively, as shown in FIG. 9A, the degree of similarity may be displayed together with the classification result after classification processing. Also, as shown in FIG. 9B, when the degree of similarity exceeds a predetermined value, the classification-target medical data evaluation unit 502 may perform control so that classification processing by the i-th classifier 402 is not executed, and the notification unit 503 may display, together with the degree of similarity, the fact that the processing was not executed. That is, the information processing apparatus 101 decides, based on the similarity evaluated by the classification-target medical data evaluation unit 502, whether or not to use the medical data to be classified as input to the i-th classifier.

The notification area 901 of the notification unit 503 in FIG. 9A is an example of the notification area of this modification. It displays the classification result, a likelihood of 83% for "primary", 12% for "metastasis", and 5% for "benign", together with the fact that the degree of similarity to the third classification data is 95%.

The notification area 902 in FIG. 9B is another example of the notification area of this modification. It displays that the degree of similarity to the third classification data is 95% and that the processing was not executed. It also displays a confirmation button for the user. Alternatively, without displaying a confirmation button, the notification window may be closed automatically after being displayed for a certain time.

According to this modification, when the degree of similarity to the classification data set corresponding to a classifier created by multiple rounds of the classifier creation flow exceeds a predetermined value, the user no longer needs to perform an operation to instruct execution or cancellation of the processing.

(Modification 1-4)
This modification describes the case where a trained classifier already exists and classification processing is executed using it. Here, the processing is described for the case where there is a single trained classifier and its teacher data are available. First, the teacher data used to create the trained classifier are compared with the first medical data set, duplicate data are removed, the two are merged, and the processing of step S801 may be executed with the result as the first data set 310. This configuration enables the classification processing when new teacher data are acquired after a classifier has been created, or when another trained classifier is used. When the classification target of the trained model differs, or when the variance between the teacher data used to create the trained classifier and the first medical data set is large, the teacher data need not be added to the first medical data set; instead, the first classifier may be created by fine-tuning or transfer learning of the trained classifier using the first medical data set. This configuration is expected to improve classification accuracy and robustness relative to the number and quality of the teacher data.

<Embodiment 2>
As in the first embodiment, the second embodiment describes an information processing apparatus that is a CAD system performing diagnostic inference on lung nodule shadows in chest X-ray CT images.

In the first embodiment, the determination unit 602 evaluated the degree of deviation between the likelihood output by the classifier and the correct answer, and the classification-target medical data evaluation unit 502 evaluated the similarity between the data determined on the basis of that degree of deviation (classification data set 620) and the medical data set to be classified, and notified the user of the result. The second embodiment further includes a classifier setting unit 1001 that sets a classifier based on the evaluation result of the classification-target medical data evaluation unit. That is, the information processing apparatus 101 has a classifier setting unit 1001 that, based on the evaluation result by the classification-target medical data evaluation unit 502, sets, from among a plurality of classifiers, the classifier that classifies the medical data to be classified.

The system configuration and hardware configuration of the information processing apparatus according to this embodiment and the configuration of the medical image DB 102 are the same as in the first embodiment, so their description is omitted.

FIG. 10 is a functional block diagram of the information processing apparatus of this embodiment. In FIG. 10, reference numeral 1001 denotes the classifier setting unit.

The classifier setting unit 1001 sets, based on the evaluation result by the classification-target medical data evaluation unit 502, an i-th classifier that classifies diagnosis names for the medical data set 320 to be classified. The classification-target medical data evaluation unit 502 calculates likelihoods in a classification whose classes are the plurality of classifiers. For example, when the classifier creation flow has been carried out three times, first to third classifiers exist. Here, the classification-target medical data evaluation unit 502 trains its classifier using teacher data in which each classifier serves as a class (label) and is paired with the classification data corresponding to that classifier. The classification result obtained when the medical data set to be classified is input is computed with a Softmax operation, so that the values (likelihoods) assigned to the classes sum to 1. For example, for the classes (first classifier, second classifier, third classifier, other), a classification result of (0.6, 0.2, 0.1, 0.1) gives the likelihood that the input belongs to the classification data of each class. In this case, the medical data to be classified has the highest likelihood of belonging to the classification data corresponding to the first classifier. In other words, it is highly likely to resemble those data in the first medical data set for which the degree of deviation from the first classifier's classification result was smaller than the predetermined criterion. That is, the classification result obtained when the medical data is classified by the first classifier can be regarded as reliable with a likelihood of 0.6. The classifier to which the data is input is selected in response to the classification result from the classification-target medical data evaluation unit 502.
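The Softmax step above can be illustrated with a minimal sketch. The class names and the raw scores are hypothetical; the scores are chosen only so that the resulting likelihoods reproduce the (0.6, 0.2, 0.1, 0.1) example in the text, and the model that would produce such scores is not shown.

```python
import math


def softmax(scores):
    """Convert raw scores into likelihoods that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical class names: one class per classifier plus "other".
CLASSES = ["classifier_1", "classifier_2", "classifier_3", "other"]


def evaluate(scores):
    """Return a {class name: likelihood} mapping for one target sample."""
    return dict(zip(CLASSES, softmax(scores)))


# Raw scores picked so that the likelihoods match the example in the text.
raw_scores = [math.log(0.6), math.log(0.2), math.log(0.1), math.log(0.1)]
likelihoods = evaluate(raw_scores)
best = max(likelihoods, key=likelihoods.get)  # class with the highest likelihood
```

Here `best` names the first classifier, whose classification data the target sample most resembles.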

The classifier setting unit 1001 determines, based on the classification result by the classification-target medical data evaluation unit 502, whether the medical data to be classified may be input to a classifier. When input is permitted, it sets the classifier to which the medical data set 320 to be classified is input and causes that classifier to classify diagnosis names. In the simplest case, the classifier setting unit 1001 sets the classifier showing the highest likelihood in the classification result from the classification-target medical data evaluation unit 502 as the classifier that performs the classification process. Alternatively, the classifier setting unit 1001 may set a threshold and select the classifier that exceeds the threshold and has the highest likelihood. Alternatively, the classifier setting unit 1001 may set every classifier whose likelihood exceeds the threshold as a classifier for the medical data to be classified.
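The three selection rules (highest likelihood, highest likelihood above a threshold, all classifiers above a threshold) can be sketched as one function. The function name, the parameters, and the decision to leave the "other" class out of the candidates are assumptions made for illustration; the description does not fix these details.

```python
def select_classifiers(likelihoods, threshold=None, allow_multiple=False,
                       other="other"):
    """Pick the classifier(s) that should receive the target data.

    likelihoods: {class name: likelihood}, possibly including an "other" class.
    Returns a list of classifier names; an empty list means that input to a
    classifier is not permitted.
    """
    # Assumption: the "other" class is never a candidate classifier.
    candidates = {k: v for k, v in likelihoods.items() if k != other}
    if threshold is not None:
        candidates = {k: v for k, v in candidates.items() if v > threshold}
    if not candidates:
        return []
    if allow_multiple:
        # Every classifier above the threshold, highest likelihood first.
        return sorted(candidates, key=candidates.get, reverse=True)
    # Single classifier with the highest likelihood.
    return [max(candidates, key=candidates.get)]


example = {"classifier_1": 0.6, "classifier_2": 0.2,
           "classifier_3": 0.1, "other": 0.1}
simple = select_classifiers(example)
strict = select_classifiers(example, threshold=0.7)
several = select_classifiers(example, threshold=0.15, allow_multiple=True)
```

With the example likelihoods, the simple rule picks the first classifier, a threshold of 0.7 rejects the input, and a threshold of 0.15 with multiple selection keeps the first and second classifiers.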

The classifier setting unit 1001 may refuse input of the medical data set 320 to be classified to a classifier when, for example, the likelihood in the classification result of the classification-target medical data evaluation unit 502 is below a threshold, or when the differences between the class likelihoods are small. In addition, as described in Modification 1-1, a classifier created through many iterations of the classifier creation flow may be less reliable, in terms of training data and number of classes, than a classifier created in fewer iterations. A threshold may therefore be set so that only classifiers numbered i or lower are permitted to classify diagnosis names, or input may be refused based on a lower limit on the number of training data for the classifier or on the number of training data having each diagnosis name that makes up the training data.

FIG. 11 is an example of a display screen of the information processing apparatus of this embodiment.

In FIG. 11, the notification area 1101 is an example of a notification area produced by the notification unit 503. In this embodiment, the notification area 1101 displays the classification result by the classifier of the classification-target medical data evaluation unit 502 together with a notice that the third classifier (shown as "classifier 3" in the example screen) was used. That is, the notification unit 503 notifies information indicating the classifier that classified the medical data to be classified, together with the classification result by that classifier.

FIG. 12 is a flow chart of the processing of the information processing apparatus of this embodiment.

In the processing of this embodiment, step S1212 is executed following step S811. In step S1212, the classifier setting unit 1001 determines whether input to a classifier is permitted, based on the classification result calculated by the classification process of the classification-target medical data evaluation unit 502. As described above, the end condition is set by the number of data, the likelihood, the number of classes, the classifier number (the number i indicating how many iterations of the flow produced the classifier), and the like. When the classifier setting unit 1001 determines in step S1212 that the end condition is satisfied, step S1213 is executed; when it determines that the end condition is not satisfied, step S1214 is executed.

In step S1213, the notification unit 503 notifies the user that the end condition has been satisfied and has the user select whether to execute the classification process again. When the user selects classification, step S1214 is executed.

In step S1214, the classifier setting unit 1001 sets at least one classifier to which the medical data set 320 to be classified is input. The classifier is chosen according to the likelihood, classifier number, number of data, number of classes, and the like described above.

In step S1215, the i-th classifier (one or more) set by the classifier setting unit 1001 in step S1214 classifies the diagnosis name.

In step S1216, the notification unit 503 displays the classification result of the i-th classifier (one or more) and the classifier used for the classification.
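Steps S1212 to S1216 can be read as the following control flow. This is a rough sketch only: `run_flow`, the callback parameters, and the returned dictionary are hypothetical names, the classifier is chosen by the simple highest-likelihood rule, and a single classifier is used although the description also allows several.

```python
def run_flow(likelihoods, end_condition, ask_user, classifiers, sample):
    """Sketch of steps S1212 to S1216 for one sample.

    likelihoods:   {classifier name: likelihood} from the evaluation unit.
    end_condition: callable returning True when the end condition is met.
    ask_user:      callable(message) returning True if the user still wants
                   classification to run.
    classifiers:   {classifier name: callable(sample) -> diagnosis name}.
    """
    # S1212: decide whether input to a classifier is permitted.
    if end_condition(likelihoods):
        # S1213: notify the user and let the user decide.
        if not ask_user("End condition satisfied. Run classification?"):
            return None
    # S1214: set the classifier (here: the one with the highest likelihood).
    name = max(classifiers, key=lambda n: likelihoods.get(n, 0.0))
    # S1215: classify the diagnosis name with the selected classifier.
    result = classifiers[name](sample)
    # S1216: report the result together with the classifier that was used.
    return {"classifier": name, "result": result}


# Hypothetical stand-ins for trained classifiers.
demo_classifiers = {"classifier_1": lambda s: "A", "classifier_2": lambda s: "B"}
demo_likelihoods = {"classifier_1": 0.6, "classifier_2": 0.2, "other": 0.2}
report = run_flow(demo_likelihoods, lambda lk: False, lambda msg: False,
                  demo_classifiers, "sample")
```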

As described above, according to this embodiment, the similarity to the classification data associated with each of a plurality of classifiers is determined, and based on that similarity the classifier setting unit 1001 determines whether input to a classifier is permitted. Comparison against the classification data of a plurality of classifiers makes it possible to distinguish more clearly, for the medical data set 320 to be classified, between data of a kind that was included in the training data but whose features could not be learned sufficiently and data of a kind that was not included in the training data. Furthermore, performing the similarity determination with a classifier based on the classification data, and classifying the diagnosis name with an i-th classifier that satisfies a predetermined criterion, improves the reliability of the classification result and allows the output of the classifier to be anticipated in advance in the form of similarity to the classification data.

(Modification 2-1)
In the second embodiment, the classifier setting unit 1001 set the classifier when the similarity, the number of data, and the number of classes were at or above predetermined criteria, based on the classification result of the classifier trained on the classification data. Alternatively, the notification unit 503 may notify the user of information for selecting a classifier, such as the similarity, the number of data, and the number of classes, and the apparatus may have input means by which the user sets the classifier. For example, the classifier may be set through an input reception unit such as a pull-down menu or check boxes. This configuration makes it possible to consult results from several of the created classifiers, or to select a classifier that includes the class whose diagnosis name the user wants to check.

(Modification 2-2)
The second embodiment gave a small difference between the likelihoods of the classifiers as one condition for not inputting the medical data set 320 to be classified into a classifier. However, when the classifiers with a small likelihood difference all have classification ability for the medical data set 320 to be classified, the likelihood difference between them will naturally be small. In that case, even though the likelihood difference is small, classification by any of those classifiers yields a highly reliable diagnosis name.

In other words, a small difference between the likelihoods assigned to the classifiers does not necessarily mean low similarity to the classification data (medical data not satisfying the predetermined criterion), that is, the data for which a classifier's classification result deviated little from the correct answer. Therefore, when the likelihood difference between classes (classifiers) in the classification result is small, the difference from the likelihood of the other label, which covers data not assigned to the classification data of any classifier, is also compared. When the likelihood difference between classifiers is small and the difference from the other class is large, the classifier setting unit 1001 sets the plurality of classifiers with the small likelihood difference as classifiers for classification, and the diagnosis name is classified using them. The diagnosis name classification results from the plurality of classifiers are then compared to obtain the classification result. This configuration reduces the possibility that data is excluded from input to the classifiers merely because the likelihood difference is small even though the classifiers have classification ability.
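One way to express the check in this modification is sketched below. The two margins (`close_margin`, `other_margin`) and their default values are hypothetical; the description only says that the difference between classifiers is small and the difference from the other class is large.

```python
def select_with_other_margin(likelihoods, close_margin=0.1, other_margin=0.2,
                             other="other"):
    """Return classifiers to use when several likelihoods are close together.

    Classifiers within `close_margin` of the top likelihood are treated as
    having a small likelihood difference. They are all selected only if each
    of them exceeds the "other" class by at least `other_margin`.
    """
    ranked = sorted(((v, k) for k, v in likelihoods.items() if k != other),
                    reverse=True)
    top_value = ranked[0][0]
    close = [k for v, k in ranked if top_value - v <= close_margin]
    if len(close) == 1:
        return close  # one clear winner, no ambiguity to resolve
    lowest_close = min(likelihoods[k] for k in close)
    if lowest_close - likelihoods.get(other, 0.0) >= other_margin:
        return close  # close to each other, but far from "other"
    return []  # ambiguous and not clearly separated from "other"


both_capable = {"c1": 0.45, "c2": 0.42, "c3": 0.08, "other": 0.05}
unclear = {"c1": 0.30, "c2": 0.28, "c3": 0.12, "other": 0.30}
chosen = select_with_other_margin(both_capable)
rejected = select_with_other_margin(unclear)
```

In the first example both close classifiers are kept; in the second the input is rejected because the other class is just as likely.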

(Modification 2-3)
Modification 2-2 described a configuration in which, when the difference in likelihood between the classifiers (classes) in the classification result is small, the likelihood of, for example, the other class is compared with the likelihoods of the classifier classes whose difference is small, and when that difference is larger than a predetermined criterion, the classifier setting unit 1001 sets those classifiers as classifiers for the medical data to be classified.

In Modification 2-3, the classifier setting unit 1001 may set a plurality of classifiers, normalize their results, and then compare the sums to obtain the diagnosis name classification result.

As an example, assume that classifiers up to a third classifier exist and that the diagnosis names are A, B, C, and D (other). Based on the medical data acquired by the classification-target medical data acquisition unit 501, the classification-target medical data evaluation unit 502 performs a class classification in which the classification data corresponding to each classifier is the training data and the classifier name is the label. Suppose the resulting likelihoods are (first classifier, second classifier, third classifier) = (0.6, 0.2, 0.2). In this modification, the classifier setting unit 1001 sets every one of these classifiers as a classifier for the medical data set to be classified and has each execute the classification process. Suppose the results are as follows. For classifier A (the first classifier), (diagnosis name A, diagnosis name B, diagnosis name C, diagnosis name D) = (0.6, 0.4, 0.0, 0.0). For classifier B, (diagnosis name A, diagnosis name B, diagnosis name C, diagnosis name D) = (0.9, 0.1, 0.0, 0.0). For classifier C, (diagnosis name A, diagnosis name B, diagnosis name C, diagnosis name D) = (0.5, 0.5, 0.0, 0.0). The number of training data and the number of classes are assumed here to be the same for all classifiers. If they differ, a normalization process may be applied to remove the variation in classification likelihood between classifiers, and when reliability is low, for example when the number of training data is below a predetermined criterion, the likelihood of each classifier may be multiplied by a coefficient.

The result of the class classification with classifier names as labels is then multiplied by each classifier's diagnosis name classification result. That is, for classifier A the weighted result is (0.6 × 0.6, 0.6 × 0.4, 0.6 × 0.0, 0.6 × 0.0), and the other classifiers are processed in the same way. The weighted likelihoods are then summed over the classifiers for each diagnosis name, giving (0.64, 0.36, 0.0, 0.0). This sum may be used as the classification result of the diagnosis name classification process.
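The weighted sum in this modification can be checked with a short sketch that uses the numbers from the example. The function name `combine` and the dictionary layout are assumptions made for illustration, and the optional normalization and reliability coefficients mentioned in the text are not included.

```python
def combine(classifier_likelihoods, diagnosis_likelihoods):
    """Weight each classifier's diagnosis likelihoods and sum per diagnosis.

    classifier_likelihoods: {classifier name: likelihood of that classifier}.
    diagnosis_likelihoods:  {classifier name: [likelihood per diagnosis name]}.
    """
    n = len(next(iter(diagnosis_likelihoods.values())))
    totals = [0.0] * n
    for name, weight in classifier_likelihoods.items():
        for i, p in enumerate(diagnosis_likelihoods[name]):
            totals[i] += weight * p
    return totals


# Values taken from the example: classifiers A, B, C and diagnoses A to D.
weights = {"A": 0.6, "B": 0.2, "C": 0.2}
per_classifier = {
    "A": [0.6, 0.4, 0.0, 0.0],
    "B": [0.9, 0.1, 0.0, 0.0],
    "C": [0.5, 0.5, 0.0, 0.0],
}
totals = combine(weights, per_classifier)  # approximately [0.64, 0.36, 0.0, 0.0]
```

Diagnosis name A receives 0.36 + 0.18 + 0.10 = 0.64 and diagnosis name B receives 0.24 + 0.02 + 0.10 = 0.36, which matches the sum stated in the text.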

<Embodiment 3>
In one aspect of the present invention, data for which the degree of deviation between a classifier's classification result and the correct answer does not satisfy a predetermined criterion are stored as the classification data of that classifier, and the difficult data that satisfy the predetermined criterion are pooled as teacher data for another classifier or as medical data. As a result, a plurality of classifiers are created. Evaluating the similarity between the classification data corresponding to each of the classifiers and the medical data to be classified lets the user recognize the reliability of the classification of that medical data, and providing a plurality of classifiers that have learned different features makes it possible to identify and select a classifier suited to the medical data to be classified.

This embodiment describes the flow for creating classification data and classifiers, and the processing that the information processing apparatus 101 performs on the teacher data through repetition of that flow.

The more times the flow for creating classification data and a classifier from the i-th medical data is repeated, the smaller the number of data and the number of classes become. On the other hand, classification data carrying the same label may exist across several classifiers. For simplicity, the labels are taken to be diagnosis names, the first medical data set is assumed to carry labels A to E (diagnosis names), and the description refers to FIG. 13. FIG. 13 shows the classification data corresponding to the first to N-th classifiers and, for each diagnosis name label, the number of classification data samples. As explained above, the number of data and the number of classes decrease toward the lower rows, from the first classification data corresponding to the first classifier to the N-th classification data corresponding to the N-th classifier. Now consider, for example, the first and second classification data. The first classification data are data that did not satisfy the predetermined criterion with the first classifier (for example, the degree of deviation between the classification result and the correct answer was below a predetermined threshold), that is, data the first classifier classified accurately. By contrast, the second classification data could not be classified accurately by the first classifier but were classified accurately by the second classifier. It is therefore likely that some feature separates the data for each label in the first classification data from the data for the same label in the second classification data. Here, a classifier is trained with diagnosis name A in the first classification data and diagnosis name A in the second classification data treated as separate labels. Training on the classification data of classifiers that share diagnosis name A, labeled per classifier, allows robustness for diagnosis name A to be achieved through a plurality of classifiers. The plurality of classifiers having diagnosis name A may each be provided as a label, or the classifier may be trained with combinations of a plurality of diagnosis names and a plurality of classification data as labels.
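The labeling scheme of this embodiment, in which the same diagnosis name becomes a separate label for each classifier, can be sketched as follows. The function name, the tuple form of the joint label, and the sample identifiers are hypothetical; the training of the classifier itself is not shown.

```python
def build_joint_labels(classification_data):
    """Build (classifier index, diagnosis name) labels for training.

    classification_data: {classifier index: [(sample, diagnosis name), ...]},
    i.e. the classification data stored for each classifier.
    Returns the samples, their joint labels, and the sorted list of classes.
    """
    samples, labels = [], []
    for index in sorted(classification_data):
        for sample, diagnosis in classification_data[index]:
            samples.append(sample)
            # Diagnosis A of classifier 1 and diagnosis A of classifier 2
            # become two different labels.
            labels.append((index, diagnosis))
    classes = sorted(set(labels))
    return samples, labels, classes


# Hypothetical classification data for two classifiers.
data = {
    1: [("img_001", "A"), ("img_002", "B")],
    2: [("img_003", "A")],
}
samples, labels, classes = build_joint_labels(data)
```

The resulting classes distinguish diagnosis name A of the first classifier from diagnosis name A of the second classifier.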

Using the classifier created by this configuration as the classifier in the classification-target medical data evaluation unit 502 yields a more detailed evaluation result than classification by a classifier whose classes are the classification data corresponding to each of the plurality of classifiers.

For example, when medical data in the medical data set 320 to be classified has a similarity of 95% to the data labeled diagnosis name A of the second classifier, the notification area 1405 in FIG. 14(a) shows the classifier number, the diagnosis name, and the similarity. In this configuration, the classifier in the classification-target medical data evaluation unit 502 has classes that encompass the classes of each of the plurality of classifiers. The evaluation by the classification-target medical data evaluation unit 502 can therefore be obtained without going through the classifier setting step of the classifier setting unit 1001. When the likelihood for diagnosis name A of the second classifier is high while the likelihood for diagnosis name A of the first classifier is low, this indicates that learning succeeded on features different from those of the first classifier and that a highly reliable classification of the diagnosis name has become possible. The notification may include other information as long as it contains at least one of these items. For example, as shown in the notification area 1406 in FIG. 14(b), several diagnosis names may be classified, each with a different classification data set to which it is most similar. In such a case, the notification unit 503 may notify the similarity and the diagnosis name for each diagnosis name. As a further example, as in the pie chart 1500 of FIG. 15, the proportion of classification data corresponding to each classifier may be notified in association with the classification result by the classifier.

(Modification 3-1) Customizing the diagnosis names the user wants to classify
Modification 3-1 describes a configuration in which evaluation is performed for specific diagnosis names selected by the user. Suppose, for example, that the user wants to run the classification process for diagnosis name A and diagnosis name B. The case described here is one where, as in the third embodiment, training uses as labels the diagnosis names that make up the classification data of the plurality of classifiers. The classes are then, for example, (diagnosis name A of the first classifier, diagnosis name A of the second classifier, ..., diagnosis name A of the (N-1)-th classifier, diagnosis name A of the N-th classifier, diagnosis name B of the first classifier, diagnosis name B of the second classifier, ..., diagnosis name B of the (N-1)-th classifier, diagnosis name B of the N-th classifier, other). The user may enter the diagnosis names of interest through the input interface 208, or may specify the class configuration. A classifier that classifies diagnosis names is then trained using the classification data corresponding to the classes so created. With this configuration, class likelihoods are output only for the diagnosis names the user wants. Once a classifier has been trained using the classification data corresponding to the created classes, further classifiers may be created by the classifier creation flow described above.
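The class list of this modification can be generated mechanically from the number of classifiers and the diagnosis names the user selects. The helper names below are hypothetical, and mapping every unselected diagnosis name to the other class is an assumption about how the remaining data would be handled.

```python
def build_custom_classes(num_classifiers, selected):
    """List the classes for the user-selected diagnosis names.

    Produces (classifier index, diagnosis name) for every selected diagnosis
    and every classifier 1..N, followed by a single "other" class.
    """
    classes = [(i, d) for d in selected for i in range(1, num_classifiers + 1)]
    classes.append("other")
    return classes


def to_custom_label(classifier_index, diagnosis, selected):
    """Map one classification data item to its class in the custom scheme."""
    if diagnosis in selected:
        return (classifier_index, diagnosis)
    return "other"  # assumption: unselected diagnosis names share one class


# The user is interested in diagnosis names A and B; two classifiers exist.
custom_classes = build_custom_classes(2, ["A", "B"])
label_a = to_custom_label(2, "A", ["A", "B"])
label_c = to_custom_label(1, "C", ["A", "B"])
```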

<Embodiment 4>
Improving classifier performance depends in part on the number and quality of the teacher data. Quality is judged, for example, by whether annotations (that is, labels) have been assigned appropriately. Teacher data may contain erroneous annotations, or the same annotation may be assigned to data that cannot be properly classified together by the learned features.

In the present embodiment, a mode is described in which, based on the plurality of classifiers and the flow for creating classification data described above, the teacher data is annotated again or new medical data is annotated (hereinafter, relabeling). Here, the user can be prompted to relabel when the medical data described above has not been appropriately annotated, or when it is appropriate to give it a different label in view of the model structure of the classification. As shown in FIG. 13, it is assumed that medical data having the same diagnosis name exists across the classification data corresponding to the plurality of classifiers, and a classifier corresponding to that diagnosis name is created. This is effective, for example, when a plurality of samples exist in each of a plurality of classification data, as with diagnosis name A in FIG. 13. The classifier created here performs two-class classification between the first classification data of diagnosis name A and the second classification data of diagnosis name A. The number of classes may of course be more than two; the number does not matter. As the classifier, for example, a classifier based on Gradient-weighted Class Activation Mapping (hereinafter, Grad-CAM) is used. Grad-CAM is a technique that can display, as a heat map together with the likelihood, the image regions that strongly influence each class. That is, for diagnosis name A, the user can be shown, together with the likelihood, the features used when the classification data corresponding to the class of the first classifier and to the class of the second classifier are classified into two classes. Further, the user can relabel the classification data based on the Grad-CAM heat map, and by adding the relabeled data to the teacher data and training the classifier, the reliability and robustness of the classification result by the classifier can be ensured. The functional blocks of the information processing apparatus 101 are described below with reference to FIG. 16. The information processing apparatus 101 may have the other configurations described in the other embodiments, may separately have only the functional blocks described below, or may be configured by a plurality of information processing apparatuses.
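
As a non-limiting illustration of how a Grad-CAM heat map is obtained, the following minimal sketch computes the map from the feature maps of one convolutional layer and the gradients of a class score with respect to those feature maps. The function name `grad_cam_heatmap`, the nested-list data layout, and the final normalization to the range 0 to 1 are assumptions made for this example and are not taken from the embodiment; in practice the feature maps and gradients would be obtained from the trained classifier.

```python
def grad_cam_heatmap(activations, gradients):
    """Compute a Grad-CAM heat map for one class.

    activations: K feature maps, each an H x W nested list of floats.
    gradients:   gradients of the class score with respect to those
                 feature maps, in the same K x H x W layout.
    Returns an H x W nested list scaled to [0, 1] (assumed normalization).
    """
    height = len(activations[0])
    width = len(activations[0][0])

    # Channel weights: spatial average of the gradient of each feature map.
    weights = [
        sum(sum(row) for row in grad_map) / (height * width)
        for grad_map in gradients
    ]

    # Weighted sum of the feature maps.
    heatmap = [[0.0] * width for _ in range(height)]
    for weight, feature_map in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += weight * feature_map[i][j]

    # ReLU: keep only regions that contribute positively to the class.
    heatmap = [[max(0.0, value) for value in row] for row in heatmap]

    # Scale to [0, 1] for display, unless the map is all zeros.
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[value / peak for value in row] for row in heatmap]
    return heatmap
```

Computing one such map for the class corresponding to the first classifier and another for the class corresponding to the second classifier would let the two regions of interest for diagnosis name A be compared side by side.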

In FIG. 16, the information processing apparatus 101 has a classification data acquisition unit 1601 that acquires classification data, that is, medical data having a correct label for which, among the classification results by each of the plurality of classifiers, the degree of deviation between the correct label and the classification result does not satisfy a predetermined criterion (the degree of deviation is less than a predetermined threshold). From the acquired classification data, the classification data acquisition unit 1601 transmits to the classifier learning unit 1602, as teacher data, the classification data that correspond to different classifiers and carry the same correct label. The classifier learning unit 1602 trains a classifier based on the transmitted classification data. The medical data acquisition unit 1603 acquires medical data from the medical image DB 102 and transmits it to the medical data label evaluation unit 1604. The medical data label evaluation unit 1604 transmits the acquired medical data to the classifier learning unit 1602 and causes the classifier trained by the classifier learning unit 1602 to execute classification processing. The medical data on which the classification processing has been executed and the classification result are then transmitted to the correct label setting unit 1605. Having acquired the medical data and the classification result, the correct label setting unit 1605 determines whether the medical data is labeled; if it is not labeled, a new label is set, whereas if it is already labeled, the label is replaced. The medical data label evaluation unit 1604 transmits the classification result by the classifier to the notification unit 1606, and the notification unit 1606 notifies the classification result. That is, in the present embodiment, the information processing apparatus 101 is an information processing apparatus having a plurality of classifiers that classify medical data into classes, and has the classification data acquisition unit 1601 that acquires, among the classification results by each of the plurality of classifiers for medical data to which a correct label is given, classification data for which the degree of deviation between the correct label and the classification result does not satisfy a predetermined criterion. It further has the classifier learning unit 1602 that trains a classifier using, as teacher data, those of the acquired classification data that correspond to different classifiers and have the same correct label.
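
As a non-limiting illustration of the selection performed by the classification data acquisition unit 1601, the following minimal sketch keeps only the data whose degree of deviation is below a threshold and then gathers, for each correct label, the data that appear under two or more different classifiers. The record fields (`data_id`, `classifier_id`, `label`, `deviation`) and the function name are hypothetical and are chosen only for this example.

```python
from collections import defaultdict


def select_relabeling_teacher_data(records, threshold):
    """Select teacher data for the classifier trained per diagnosis name.

    records: iterable of dicts with the hypothetical keys
             'data_id', 'classifier_id', 'label', and 'deviation'.
    Returns {label: [(data_id, classifier_id), ...]} containing only labels
    that occur in the classification data of at least two classifiers.
    The classifier_id serves as the class of the new classifier.
    """
    by_label = defaultdict(list)
    for record in records:
        # Classification data: the deviation is below the threshold.
        if record["deviation"] < threshold:
            by_label[record["label"]].append(record)

    teacher_data = {}
    for label, group in by_label.items():
        classifier_ids = {record["classifier_id"] for record in group}
        # The same correct label must span different classifiers.
        if len(classifier_ids) >= 2:
            teacher_data[label] = [
                (record["data_id"], record["classifier_id"]) for record in group
            ]
    return teacher_data
```

With this layout, diagnosis name A in FIG. 13 would yield one group whose samples are tagged with the first or the second classifier, which is the two-class teacher data described above.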

The information processing apparatus 101 may also have a correct label setting unit 1605 that sets the correct label of medical data based on the classification result of the trained classifier. It further has a notification unit 1606 that notifies the classification result.

FIG. 17 shows the processing flow of the present embodiment. In step S1701, the classification data acquisition unit 1601 acquires, from the classification data set 620 in the medical image DB 102, classification data sets that correspond to different classifiers and have the same label. In step S1702, the classifier learning unit 1602 trains on the acquired classification data with, for example, a learner based on Grad-CAM. In step S1703, the classification data acquisition unit 1601 determines whether there is further classification data that requires training (generation) of a classifier; if it determines that a classifier needs to be generated, the process returns to step S1701 and is executed again. If the classification data acquisition unit 1601 determines that training (generation) of the classifiers is complete, the process proceeds to the subsequent steps. In step S1704, the medical data acquisition unit 1603 acquires medical data. The medical data acquired by the medical data acquisition unit 1603 may or may not have a correct label. Possible targets include, for example, data for which the classifier described in the above embodiments, whose classes are the classifiers, yields only a small difference in class likelihood, and data that newly requires a correct label as teacher data. In step S1705, the label of the medical data is evaluated. The medical data label evaluation unit 1604 causes the classifier created by the classifier learning unit 1602 to execute classification processing. It then acquires, as the classification result, the likelihood for each class and the Grad-CAM heat map, and notifies the classification result via the notification unit 1606. That is, the classification result by the trained classifier is a likelihood, and the classifier is a classifier based on Grad-CAM. The notification unit 1606 notifies the Grad-CAM heat map. As described above with reference to FIG. 15, the notification unit 1606 may also notify at least one of the number and the ratio of training data corresponding to each classifier. In step S1706, the correct label setting unit 1605 determines whether a label is currently attached and, if so, determines its consistency with the classification result. The correct label setting unit 1605 may take as the correct label the class with the highest likelihood in the classification result of the classifier, or a class whose likelihood exceeds a threshold. The two criteria may also be combined. In step S1707, the label is replaced; if no label is attached, a label is assigned based on the classification result (step S1707). If it is determined in step S1706 that a label is attached and that the label is highly reliable, the processing ends. That is, the correct label setting unit 1605 in the information processing apparatus 101 replaces the correct label of medical data to which a correct label has been given.
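
As a non-limiting illustration of steps S1706 and S1707, the following minimal sketch combines the two criteria mentioned above: the class with the highest likelihood is adopted only if its likelihood also exceeds a threshold. The function name, the default threshold of 0.5, the returned action strings, and the choice to leave the current label unchanged when the threshold is not exceeded are all assumptions made for this example.

```python
def set_correct_label(likelihoods, current_label=None, threshold=0.5):
    """Decide the correct label from per-class likelihoods.

    likelihoods:   dict mapping class name to likelihood.
    current_label: label already attached to the data, or None.
    Returns (label, action), where action is 'assign', 'replace', or 'keep'.
    """
    # Class with the highest likelihood in the classification result.
    best_class = max(likelihoods, key=likelihoods.get)

    # Assumed behavior: without sufficient likelihood, change nothing.
    if likelihoods[best_class] <= threshold:
        return current_label, "keep"

    if current_label is None:
        # No label attached: assign one based on the classification result.
        return best_class, "assign"
    if current_label == best_class:
        # Label attached and consistent with the result: end processing.
        return current_label, "keep"
    # Label attached but inconsistent with the result: replace it.
    return best_class, "replace"
```

For example, `set_correct_label({"A1": 0.8, "A2": 0.2}, current_label="A2")` returns `("A1", "replace")`, whereas the same likelihoods with `current_label="A1"` return `("A1", "keep")`.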

(Modification 4-1)
Embodiment 4 above described a method of relabeling classification data that was annotated erroneously or that cannot be classified by the features learned by the classifier. In Modification 4-1, when the correct label setting unit 1605 newly annotates image data, labeling is prompted using the Grad-CAM described in Embodiment 4. That is, when data that newly requires labeling is input to a classifier based on Grad-CAM, it is possible to obtain, for example, the image region to be noted for diagnosis name A of the first classifier and the image region to be noted for diagnosis name A of the second classifier. Based on the region of interest for diagnosis name A of the first classifier and the region of interest for diagnosis name A of the second classifier, the user can decide which label to attach to the new image data. The present modification is not limited to a form in which the user performs labeling based on the Grad-CAM heat map; the information processing apparatus 101 may perform labeling based on the likelihoods output by a classifier whose labels are the diagnosis names of the plurality of classifiers. Data labeled by the information processing apparatus 101 may also be used, as medical data in the medical image DB 102, in the flow for creating classifiers. That is, the setting of the correct label by the correct label setting unit assigns a correct label to medical data to which no correct label has been given.

101 Information processing apparatus
102 Medical image DB
103 LAN
310 i-th medical data set
320 Medical data set to be classified
401 i-th medical data acquisition unit
402 i-th classifier
403 Classification result evaluation unit
404 (i+1)-th classifier learning unit
501 Classification target medical data acquisition unit
502 Classification target medical data evaluation unit
503 Notification unit
601 Likelihood acquisition unit
602 Determination unit
603 (i+1)-th medical data set

Claims

An information processing apparatus comprising:
a likelihood acquisition unit that acquires a class likelihood for medical data to which a correct label is given, using a first classifier that classifies medical data into classes;
a classification result evaluation unit that evaluates a degree of deviation based on the class likelihood acquired by the likelihood acquisition unit and the class corresponding to the correct label;
a determination unit that determines whether the degree of deviation evaluated by the classification result evaluation unit satisfies a predetermined criterion; and
a classifier learning unit that trains a second classifier using, as teacher data, the medical data determined by the determination unit to satisfy the predetermined criterion.

The information processing apparatus according to claim 1, wherein the medical data determined by the determination unit not to satisfy the predetermined criterion is used as classification data corresponding to the classifier that classified that medical data.

The information processing apparatus according to claim 1 or 2, further comprising a control unit capable of repeatedly executing the processes of the likelihood acquisition unit, the classification result evaluation unit, the determination unit, and the learning unit, with the second classifier trained on the teacher data used as the classifier that classifies the medical data and the medical data determined to satisfy the predetermined criterion used as the medical data.

The information processing apparatus according to claim 3, wherein the information processing apparatus has a plurality of classifiers and a plurality of classification data corresponding to each of the plurality of classifiers by the repetition.

The information processing apparatus according to claim 4, further comprising a classification target medical data evaluation unit that evaluates medical data to be classified, using a classifier trained with, as teacher data, the classification data labeled with each of the plurality of classifiers.

The information processing apparatus according to claim 5, further comprising a notification unit for notifying an evaluation result by the medical data evaluation unit to be classified.

The information processing apparatus according to claim 5, wherein the classifier in the medical data evaluation unit to be classified is a classifier that calculates a classification result into a class corresponding to the plurality of classifiers by likelihood.

The information processing apparatus according to claim 5 or 7, further comprising a classifier setting unit that sets, from among the plurality of classifiers, the classifier that classifies the medical data to be classified, based on the evaluation result by the classification target medical data evaluation unit.

The information processing device according to claim 8, wherein the setting unit of the classifier sets the classifier having the highest likelihood as a classifier for classifying the medical data to be classified.

The information processing apparatus according to claim 8 or 9, wherein the classifier setting unit sets a classifier whose likelihood exceeds a threshold as the classifier for classifying the medical data to be classified.

The information processing apparatus according to any one of claims 8 to 10, further comprising a notification unit that notifies information indicating the classifier set by the classifier setting unit and the classification result by that classifier.

The information processing apparatus according to claim 3 or 4, wherein the repetition is terminated when any of the following determinations is made: that the number of teacher data for training the classifier is equal to or less than a predetermined number; that the classification accuracy of the classifier is equal to or less than a predetermined value; that overfitting has occurred; that the classifier is insufficiently trained; or that a number of repetitions specified by the user has been exceeded.

An information processing method comprising:
a likelihood acquisition step of acquiring a class likelihood for medical data to which a correct label is given, using a first classifier that classifies medical data into classes;
a classification result evaluation step of evaluating a degree of deviation based on the class likelihood acquired by the likelihood acquisition unit and the class corresponding to the correct label;
a determination step of determining whether the degree of deviation evaluated by the classification result evaluation unit satisfies a predetermined criterion; and
a classifier learning step of training a second classifier using, as teacher data, the medical data determined by the determination unit to satisfy the predetermined criterion.

The information processing method according to claim 13, wherein the data determined in the determination step not to satisfy the predetermined criterion is used as classification data corresponding to the classifier that classified that data.

The information processing method according to claim 13 or 14, further comprising a control step capable of repeatedly executing the processes of the likelihood acquisition step, the classification result evaluation step, the determination step, and the learning step, with the second classifier trained on the teacher data used as the classifier that classifies the medical data and the medical data determined to satisfy the predetermined criterion used as the medical data.

A program for causing a computer to execute the information processing method according to claim 15.