WO2021140604A1 - Teaching data generation system and teaching data generation method - Google Patents

Teaching data generation system and teaching data generation method

Info

Publication number
WO2021140604A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
teacher data
creator
teacher
correct answer
Prior art date
Application number
PCT/JP2020/000424
Other languages
French (fr)
Japanese (ja)
Inventor
順也 福岡
Original Assignee
国立大学法人長崎大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 国立大学法人長崎大学
Priority to JP2021569658A (patent JP7482537B2)
Priority to PCT/JP2020/000424 (WO2021140604A1)
Publication of WO2021140604A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning

Definitions

  • the present invention relates to a teacher data creation system used in artificial intelligence (AI) in deep learning and the like, and a teacher data creation method.
  • AI artificial intelligence
  • Techniques for computer-aided diagnostic support of medical images have been proposed (see, for example, Patent Document 1).
  • Teacher data is essential to the judgments behind AI-based diagnostic support, but for some teacher data it is unclear what should count as the "correct answer".
  • For example, determining the "correct answer" in the diagnosis of a pathological image requires time to obtain prognostic information, so it is impossible to determine or select the "correct answer" in near real time and acquire it as teacher data.
  • Interstitial pneumonia has a 5-year survival rate of about 20%, a prognosis as poor as that of cancer, yet it is extremely difficult to classify.
  • The concordance rate of diagnosis results among pathologists who diagnosed interstitial pneumonia has been shown to be extremely low, and statistical proof that cases with poor prognosis can be significantly discriminated was considered necessary.
  • Conventionally, in the medical field, teacher data used for artificial intelligence has been standardized on the basis of status, such as the world authorities in the field or the brand name of a facility.
  • An object of the present invention is to provide a teacher data creation system and a teacher data creation method that enable acquisition of teacher data having a high accuracy rate.
  • To solve the above problems, the present invention is a teacher data creation system comprising: an information acquisition device that acquires data with known correct answers used for selecting the creator of teacher data, data with unknown correct answers that are candidates for teacher data, the judgment results for each piece of data, and data on the correctness of the judgment results for the data with known correct answers; a storage device that stores these data; and a control device that selects the creator of the teacher data from the judgment results given by creator candidates for the data with known correct answers and from the data on the correctness of those judgment results, selects first teacher data from the data with known correct answers used for selecting the creator, and selects second teacher data from the data with unknown correct answers based on the judgment results of the selected creator.
  • The present invention is also a teacher data creation method comprising the steps of: acquiring data with known correct answers used for selecting the creator of teacher data, the judgment results given by creator candidates for those data, and data on the correctness of the judgment results; selecting the creator of the teacher data from the candidates' judgment results and the data on their correctness; selecting first teacher data from the data with known correct answers used for selecting the creator; acquiring data with unknown correct answers that are candidates for teacher data; acquiring the creator's judgment results for those data; and selecting second teacher data from the data with unknown correct answers based on the creator's judgment results.
  • By using, for the selection of the creator of the teacher data, data whose correct answer is known and for which the correctness of a judgment can therefore be determined, the candidates can be grouped based on their judgment results for those data, and the best group with the highest accuracy rate can be extracted.
  • Among the data used to select the creator of the teacher data, the data for which the best group's judgment results have a high accuracy rate are selected as teacher data, which makes them suitable as teacher data for artificial intelligence learning.
  • Likewise, by accumulating data whose correct answer is unknown together with the best group's judgment results for them, and using as teacher data those whose judgment results are expected to have a high accuracy rate, these teacher data also become suitable as teacher data for artificial intelligence learning.
  • FIG. 1 is a functional block diagram showing an example of the teacher data creation system of the present embodiment.
  • The teacher data creation system 1A of the present embodiment includes an information acquisition device 2 that acquires the data used for selecting the creator of teacher data, the data that are candidates for teacher data, the judgment results for each piece of data, and data on the correctness of those judgment results; a display device 3 that displays these data; and a storage device 4 that stores them.
  • The teacher data creation system 1A also includes a control device 5 that selects the creator of the teacher data and the teacher data.
  • The control device 5 selects the creator of the teacher data from the data used for selecting the creator, the judgment results for those data, and the data on the correctness of the judgment results. The control device 5 also selects teacher data from the data used for selecting the creator, the judgment results for those data, and the data on the correctness of the judgment results. Further, the control device 5 selects additional teacher data from the data that are candidates for teacher data and the judgment results for those candidate data.
  • The information acquisition device 2 may be one to which each piece of data is input by an operator using a keyboard, a mouse, a pen tablet, or the like, or one to which each piece of data is input via a communication line, a recording medium, or the like.
  • the storage device 4 may be a database installed in a hospital or the like, or an external database that can be connected via a communication line.
  • the control device 5 may be an information processing device such as a personal computer.
  • As the data used for selecting the creator of the teacher data, the teacher data creation system 1A acquires, with the information acquisition device 2, images or the like whose correct answer is known and for which it can be determined whether a candidate's judgment is correct, and stores them in the storage device 4.
  • The teacher data creation system 1A also acquires, with the information acquisition device 2, the judgment results given by the creator candidates for the data used for selecting the creator, together with the data that determines whether those judgment results are correct, and stores them in the storage device 4.
  • The control device 5 gathers, by cluster analysis or the like, those candidates for the creator of the teacher data whose judgment results resemble one another.
  • For example, a plurality of judgment items are set for the data used to select the creator of the teacher data, and the candidates are grouped based on, among other things, the per-item match rate between the judgment result for each item and the data that determines whether the judgment result is correct.
  • The group whose judgment results have the best match rate with the data that determines correctness, or a match rate equal to or higher than a predetermined threshold, is set as the best group.
  • The best group is taken as the creator of the teacher data.
  • the control device 5 selects a set of data related to the correctness of the judgment result corresponding to the data used for selecting the creator of the teacher data belonging to the best group as the teacher data A for artificial intelligence learning. Since the correct answer is known for the teacher data A, the accuracy of the judgment result is guaranteed.
  • Alternatively, from the data used for selecting the creator of the teacher data, sets of data for which the judgment results by the creators belonging to the best group have a match rate with the data that determines correctness equal to or higher than a predetermined threshold and show a high match rate among those creators, together with the corresponding judgment results, may be selected as teacher data A.
  • In this case, the teacher data A guarantees the accuracy of the judgment results and yields judgments close to those of the creators of the teacher data.
  • the concordance rate with the data that determines the correctness of the judgment result is less than a predetermined threshold, that is, the judgment result is incorrect.
  • As candidate data for the teacher data, the teacher data creation system 1A acquires, with the information acquisition device 2, data whose correct answer is unknown and for which the correctness of a judgment result is therefore uncertain, and stores them in the storage device 4. The teacher data creation system 1A also acquires, with the information acquisition device 2, the judgment results for those data by the creators of the teacher data belonging to the best group, and stores them in the storage device 4.
  • Sets of data for which the judgment results show a high match rate among the creators of the teacher data belonging to the best group, together with the corresponding judgment results by those creators, are selected as teacher data B for artificial intelligence learning.
  • The judgment results are expected to be accurate for the teacher data B as well.
  • As the data used for selecting the creator of the teacher data, the teacher data creation system 1A acquires, with the information acquisition device 2, a plurality of pathological images of cases whose prognosis is known and for which it can therefore be determined whether a diagnosis result is correct, and stores them in the storage device 4.
  • The teacher data creation system 1A designates a large number of pathologists as candidates for the creator of the teacher data, acquires with the information acquisition device 2 their diagnosis results for each pathological image and the prognosis information, which is the data that determines whether the diagnosis results are correct, and stores them in the storage device 4.
  • The control device 5 uses a cluster analysis method to group the pathologists based on the per-item concordance rate between the diagnosis results for a plurality of diagnosis items set for the pathological images and the prognosis information.
  • The group with the best concordance rate between the diagnosis results and the prognosis information, or with a concordance rate equal to or higher than a predetermined threshold, is set as the best group, and this best group is taken as the creator of the teacher data.
  • the control device 5 uses the pathological image used for selecting the creator of the teacher data and the set of the corresponding prognosis information as the teacher data A for artificial intelligence learning. Since the correct answer is known for the teacher data A, the accuracy of the diagnosis result is guaranteed.
  • Alternatively, from the pathological images used for selecting the creator of the teacher data, sets of a pathological image for which the diagnosis results by the pathologists belonging to the best group have a concordance rate with the prognosis information equal to or higher than a predetermined threshold and show a high concordance rate among those pathologists, together with the corresponding diagnosis results, may be selected as the teacher data A for artificial intelligence learning.
  • the teacher data A is the teacher data that guarantees the accuracy of the diagnosis result and brings the diagnosis result close to that of an actual pathologist.
  • The teacher data creation system 1A acquires, with the information acquisition device 2, pathological images for which the prognosis is unknown or no prognosis information is included, so that the correctness of a diagnosis result is uncertain, and stores them in the storage device 4. It likewise acquires the diagnosis results for these pathological images by the pathologists belonging to the best group and stores them in the storage device 4.
  • The control device 5 selects, as teacher data B for artificial intelligence learning, sets of a pathological image for which the diagnosis results show a high concordance rate among the pathologists belonging to the best group and the corresponding diagnosis results by those pathologists. The diagnosis results are expected to be accurate for the teacher data B as well.
  • FIG. 2 is a flowchart showing an example of a method of creating teacher data according to the present embodiment, and an example of selecting teacher data from a pathological image will be described.
  • As the data used to select the creator of the teacher data, data whose correct answer is known and for which it can be determined whether a candidate's judgment result is correct are selected.
  • In this example, a plurality of pathological images of cases whose prognosis is known, and for which it can therefore be determined whether a diagnosis result is correct, are acquired.
  • the diagnosis results by a large number of pathologists for these pathological images and known prognosis information are acquired (step SA1).
  • the pathologists are grouped based on the concordance rate between the diagnosis results and the prognosis information for a plurality of diagnosis items using a cluster analysis method or the like (step SA2).
  • the pathological image used to select the creator of the teacher data and the set of the corresponding prognosis information are selected as the teacher data A for artificial intelligence learning (step SA4).
  • Pathological images for which the prognosis is unknown or no prognosis information is included, so that the correctness of a diagnosis result is uncertain, are acquired (step SA5).
  • Diagnosis results by a plurality of pathologists belonging to the best group are acquired for the pathological images with unknown prognosis (step SA6).
  • FIG. 3 is a flowchart showing another example of the method of creating teacher data according to the present embodiment, and an example of selecting teacher data from a pathological image will be described.
  • In step SB1, step SB2 and step SB3, the best group is extracted by the same processing as in step SA1, step SA2 and step SA3 of FIG. 2, and the pathologists belonging to the best group are selected as the creators of the teacher data.
  • Sets of pathological images of cases showing a high concordance rate among the pathologists belonging to the best group and the corresponding diagnosis results by those pathologists are selected as the teacher data A for artificial intelligence learning (step SB4). It is desirable that the teacher data A be further limited to those for which the concordance rate between the judgment result and the prognosis information is equal to or higher than a predetermined threshold value.
  • In step SB5, step SB6 and step SB7, the same processing as in step SA5, step SA6 and step SA7 of FIG. 2 is performed to select the best group from the pathological images whose prognosis is unknown as a candidate for teacher data.
  • <Example of operation and effect of the teacher data creation system and creation method of this embodiment> When a pathologist belonging to a group with a high concordance rate between diagnosis results and prognosis information for pathological images with known prognosis, that is, a group with a high accuracy rate, diagnoses a pathological image with unknown prognosis, the concordance rate between the diagnosis result and the actual prognosis information is expected to be high.
  • Data such as pathological images with known prognosis, whose correct answer is known and for which the correctness of a judgment result can be determined, can be used as the data for selecting the creator of the teacher data.
  • The correctness of the candidates' judgment results for such data can be determined by computer, and based on those judgment results the candidates for the creator of the teacher data can be grouped and the best group with the highest accuracy rate extracted.
  • The above-mentioned teacher data A, which is the data for which the best group's judgment results have a high accuracy rate, is suitable as teacher data for artificial intelligence learning.
  • The teacher data B, which is a collection of the above data, is also suitable as the teacher data for artificial intelligence learning.
  • In addition to the teacher data A whose correct answer is known, or the teacher data A for which the accuracy rate of the creators of the teacher data is equal to or higher than a predetermined threshold, a large number of teacher data B whose correct answer is unknown but which can be expected to have a high accuracy rate can be added.
  • 1A ... Teacher data creation system, 2 ... Information acquisition device, 3 ... Display device, 4 ... Storage device, 5 ... Control device
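
The best-group rule in the list above (the group with the best match rate, or a match rate at or above a predetermined threshold) might be sketched as follows. This is only an illustration under stated assumptions: the function name `select_best_group`, the use of a per-group mean, and the choice to return an empty list when no group reaches the threshold are all hypothetical and are not taken from the patent text.

```python
from statistics import mean
from typing import Dict, List, Optional


def select_best_group(
    groups: List[List[str]],
    match_rate: Dict[str, float],
    threshold: Optional[float] = None,
) -> List[str]:
    """Return the group whose members have the highest mean match rate.

    Assumption: a group's score is the mean of its members' match rates.
    If a threshold is given and even the best group falls below it,
    an empty list is returned (no group qualifies).
    """
    scored = [(mean([match_rate[name] for name in group]), group) for group in groups]
    best_rate, best_group = max(scored, key=lambda pair: pair[0])
    if threshold is not None and best_rate < threshold:
        return []
    return best_group


# Hypothetical candidates and match rates, for illustration only.
groups = [["P1", "P2"], ["P3"]]
match_rate = {"P1": 1.0, "P2": 0.875, "P3": 0.5}
best = select_best_group(groups, match_rate)
none_qualifies = select_best_group(groups, match_rate, threshold=0.95)
print(best, none_qualifies)
```

Combining "best" and "threshold" in one function is a design choice of this sketch; the source presents them as alternatives.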

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Provided are a teaching data generation system and a teaching data generation method which enable selection of highly accurate teaching data. This teaching data generation system 1A acquires and stores data having a known accuracy that is used in selection of a teaching data generator, data having an unknown accuracy that becomes a teaching data candidate, a determination result regarding each piece of data, and data about whether the determination result is correct regarding the data having a known accuracy, and then selects the teaching data generator from a determination result made by a teaching data generator candidate regarding the data having a known accuracy that is used in the selection of a teaching data generator, and from the data about whether said determination result regarding the data having a known accuracy is correct. Said teaching data generation system 1A also selects the teaching data from the data having a known accuracy that was used in the selection of the teaching data generator, the data having an unknown accuracy that becomes a teaching data candidate, the determination result regarding each piece of data, and the data about whether the determination result is correct regarding the data having a known accuracy.

Description

Teacher data creation system and teacher data creation method
The present invention relates to a system and a method for creating teacher data used by artificial intelligence (AI), such as in deep learning.
Techniques for computer-aided diagnostic support of medical images have been proposed (see, for example, Patent Document 1). Computer-based lesion detection and automatic diagnosis generally rely on machine learning that uses past cases as teacher data, and in recent years diagnostic support by artificial intelligence (AI) has been proposed.
Teacher data is essential to the judgments behind AI-based diagnostic support, but for some teacher data it is unclear what should count as the "correct answer".
For example, determining the "correct answer" in the diagnosis of a pathological image requires time to obtain prognostic information, so it is impossible to determine or select the "correct answer" in near real time and acquire it as teacher data.
There is also a limit to the human capacity to generate correct-answer data. For example, interstitial pneumonia (UIP) has a 5-year survival rate of about 20%, a prognosis as poor as that of cancer, yet it is extremely difficult to classify. Indeed, the concordance rate of diagnosis results among pathologists who diagnosed interstitial pneumonia has been shown to be extremely low, and statistical proof that cases with poor prognosis can be significantly discriminated was considered necessary.
Patent Document 1: Japanese Unexamined Patent Publication No. 2015-116319
Conventionally, in the medical field, teacher data used for artificial intelligence has been standardized on the basis of status, such as the world authorities in the field or the brand name of a facility.
However, in fields such as interstitial pneumonia, where the concordance rate of diagnosis results among pathologists is low, it remained unclear whether the teacher data were correct, and it was difficult to train artificial intelligence using teacher data with a high accuracy rate.
An object of the present invention is to provide a teacher data creation system and a teacher data creation method that make it possible to obtain teacher data with a high accuracy rate.
To solve the above problems, the present invention is a teacher data creation system comprising: an information acquisition device that acquires data with known correct answers used for selecting the creator of teacher data, data with unknown correct answers that are candidates for teacher data, the judgment results for each piece of data, and data on the correctness of the judgment results for the data with known correct answers; a storage device that stores these data; and a control device that selects the creator of the teacher data from the judgment results given by creator candidates for the data with known correct answers and from the data on the correctness of those judgment results, selects first teacher data from the data with known correct answers used for selecting the creator, and selects second teacher data from the data with unknown correct answers based on the judgment results of the selected creator.
The present invention is also a teacher data creation method comprising the steps of: acquiring data with known correct answers used for selecting the creator of teacher data, the judgment results given by creator candidates for those data, and data on the correctness of the judgment results; selecting the creator of the teacher data from the candidates' judgment results and the data on their correctness; selecting first teacher data from the data with known correct answers used for selecting the creator; acquiring data with unknown correct answers that are candidates for teacher data; acquiring the creator's judgment results for those data; and selecting second teacher data from the data with unknown correct answers based on the creator's judgment results.
By using, for the selection of the creator of the teacher data, data whose correct answer is known and for which the correctness of a judgment can therefore be determined, the candidates can be grouped based on their judgment results for those data, and the best group with the highest accuracy rate can be extracted.
Among the data used to select the creator of the teacher data, the data for which the best group's judgment results have a high accuracy rate are selected as teacher data, which makes them suitable as teacher data for artificial intelligence learning.
Likewise, by accumulating data whose correct answer is unknown together with the best group's judgment results for them, and using as teacher data those whose judgment results are expected to have a high accuracy rate, these teacher data also become suitable as teacher data for artificial intelligence learning.
Accordingly, artificial intelligence can be trained with teacher data that have a high accuracy rate, and the amount of such teacher data available for training can be increased.
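As a rough illustration of how first and second teacher data might be picked once a best group exists, the sketch below keeps a known-answer item when enough of the best group judged it correctly, and keeps an unknown-answer item when enough of the best group agree with one another. The function name `select_teacher_data`, the single shared cutoff `min_rate`, and the use of the majority judgment as the label are assumptions made for this example, not details stated in the source.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# case id -> (judgments by best-group members, known answer or None if unknown)
Cases = Dict[str, Tuple[List[str], Optional[str]]]


def select_teacher_data(
    cases: Cases, min_rate: float = 0.8
) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
    """Split cases into first (known answer) and second (unknown answer) teacher data."""
    first: List[Tuple[str, str]] = []
    second: List[Tuple[str, str]] = []
    for case_id, (judgments, answer) in cases.items():
        if answer is not None:
            # Known answer: fraction of best-group judgments that are correct.
            correct = sum(j == answer for j in judgments) / len(judgments)
            if correct >= min_rate:
                first.append((case_id, answer))
        else:
            # Unknown answer: fraction of the best group sharing the majority judgment.
            label, count = Counter(judgments).most_common(1)[0]
            if count / len(judgments) >= min_rate:
                second.append((case_id, label))
    return first, second


# Hypothetical toy cases, for illustration only.
cases: Cases = {
    "img1": (["x", "x", "x", "x"], "x"),
    "img2": (["x", "y", "y", "x"], "x"),
    "img3": (["y", "y", "y", "y"], None),
    "img4": (["x", "y", "x", "y"], None),
}
first, second = select_teacher_data(cases)
print(first, second)
```

Here `img2` and `img4` are dropped because only half of the best group is correct or in agreement, which is below the assumed cutoff.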
FIG. 1 is a functional block diagram showing an example of the teacher data creation system of the present embodiment.
FIG. 2 is a flowchart showing an example of the teacher data creation method of the present embodiment.
FIG. 3 is a flowchart showing another example of the teacher data creation method of the present embodiment.
Hereinafter, embodiments of the teacher data creation system and the teacher data creation method of the present invention will be described with reference to the drawings.
<Configuration example of the teacher data creation system of this embodiment>
FIG. 1 is a functional block diagram showing an example of the teacher data creation system of the present embodiment.
The teacher data creation system 1A of the present embodiment includes an information acquisition device 2 that acquires the data used for selecting the creator of teacher data, the data that are candidates for teacher data, the judgment results for each piece of data, data on the correctness of those judgment results, and the like; a display device 3 that displays these data; and a storage device 4 that stores them.
The teacher data creation system 1A also includes a control device 5 that selects the creator of the teacher data and the teacher data. The control device 5 selects the creator of the teacher data from the data used for selecting the creator, the judgment results for those data, the data on the correctness of the judgment results, and the like. The control device 5 also selects teacher data from the data used for selecting the creator, the judgment results for those data, the data on the correctness of the judgment results, and the like. Further, the control device 5 selects additional teacher data from the data that are candidates for teacher data, the judgment results for those candidate data, and the like.
The information acquisition device 2 may be one to which each piece of data is input by an operator using a keyboard, a mouse, a pen tablet, or the like, or one to which each piece of data is input via a communication line, a recording medium, or the like. The storage device 4 may be a database installed in a hospital or the like, or an external database that can be connected via a communication line. The control device 5 may be an information processing device such as a personal computer.
 As the data used for selecting creators of teacher data, the teacher data creation system 1A acquires, with the information acquisition device 2, images or the like for which the correct answer is known, so that the judgment results of creator candidates can be checked for correctness, and stores them in the storage device 4.
 The teacher data creation system 1A also acquires, with the information acquisition device 2, the judgment results given by the creator candidates for the data used for creator selection, together with the data that determine whether those judgment results are correct, and stores them in the storage device 4.
 The control device 5 uses cluster analysis or the like to gather creator candidates whose judgment results are similar to one another.
 For example, multiple judgment items are set for the data used for creator selection, and the creator candidates are divided into groups based on, among other things, the per-item match rate between the judgment result for each item and the data that determine whether the judgment result is correct.
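 The per-item match rate mentioned above could be computed along the following lines. This is only a minimal sketch: the item names ("grade", "invasion"), the label values, and the function name are hypothetical and do not appear in the specification.

```python
from typing import Dict, Sequence


def per_item_match_rates(
    judgments: Sequence[Dict[str, str]],
    reference: Sequence[Dict[str, str]],
) -> Dict[str, float]:
    """Return, for each judgment item, the fraction of samples on which
    one candidate's judgment equals the known correct answer."""
    items = reference[0].keys()
    n = len(reference)
    return {
        item: sum(j[item] == r[item] for j, r in zip(judgments, reference)) / n
        for item in items
    }


# Hypothetical known answers for four samples, two judgment items each.
reference = [
    {"grade": "high", "invasion": "yes"},
    {"grade": "low", "invasion": "no"},
    {"grade": "high", "invasion": "no"},
    {"grade": "low", "invasion": "yes"},
]
# Hypothetical judgments by one creator candidate for the same samples.
candidate = [
    {"grade": "high", "invasion": "yes"},
    {"grade": "low", "invasion": "yes"},
    {"grade": "high", "invasion": "yes"},
    {"grade": "high", "invasion": "yes"},
]

rates = per_item_match_rates(candidate, reference)
```

 The resulting vector of per-item rates, one per candidate, is the kind of feature on which the grouping step could then operate.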
 Then, among the groups of creator candidates, the group whose judgment results have the highest match rate with the data that determine correctness, or a match rate at or above a predetermined threshold, is designated the best group, and the members of the best group are taken as the creators of teacher data.
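 The grouping and best-group selection described above might be sketched as follows. The specification only says "cluster analysis or the like"; the greedy distance-threshold grouping used here is a simple stand-in chosen for illustration, not the method of the embodiment, and the candidate names, rates, and `max_gap` parameter are assumptions.

```python
from statistics import mean
from typing import Dict, List


def group_candidates(
    rates: Dict[str, List[float]], max_gap: float = 0.1
) -> List[List[str]]:
    """Greedy grouping: a candidate joins the first group in which every
    member's per-item match rates differ from its own by at most max_gap;
    otherwise it starts a new group."""
    groups: List[List[str]] = []
    for name in sorted(rates):
        for group in groups:
            if all(
                max(abs(a - b) for a, b in zip(rates[name], rates[m])) <= max_gap
                for m in group
            ):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


def best_group(groups: List[List[str]], rates: Dict[str, List[float]]) -> List[str]:
    """Pick the group whose members have the highest average match rate."""
    return max(groups, key=lambda g: mean(mean(rates[m]) for m in g))


# Hypothetical per-item match rates (two items) for five candidates.
rates = {
    "P1": [0.90, 0.85],
    "P2": [0.88, 0.90],
    "P3": [0.60, 0.55],
    "P4": [0.62, 0.50],
    "P5": [0.92, 0.40],
}

groups = group_candidates(rates)
creators = best_group(groups, rates)
```

 In this example P1 and P2 form the group with the highest average match rate and would be taken as the creators. A threshold-based variant, accepting any group at or above a predetermined rate, would replace `max` with a filter.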
 The control device 5 selects, as teacher data A for artificial intelligence training, the set consisting of the data used for selecting the creators belonging to the best group and the corresponding data on the correctness of the judgment results. Because the correct answer for teacher data A is known, the accuracy of the judgment results is assured.
 Alternatively, from among the data used for selecting the creators, the control device 5 may select as teacher data A those data for which the judgment results by the multiple creators belonging to the best group both match the data that determine correctness at or above a predetermined threshold and show a high match rate among those creators, together with the corresponding judgment results by the creators. In this case, teacher data A not only assures the accuracy of the judgment results but also yields judgment results close to those of the creators.
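 The alternative selection of teacher data A could look like the sketch below, which keeps a sample only when the best group's judgments are both mostly correct and mostly in agreement. The thresholds, sample identifiers, and labels are hypothetical, and majority share is used here as one possible measure of the "match rate among creators".

```python
from collections import Counter
from typing import Dict, List, Tuple


def select_teacher_data_a(
    judgments: Dict[str, Dict[str, str]],  # creator -> {sample_id: label}
    truth: Dict[str, str],                 # sample_id -> known correct label
    min_agreement: float = 0.8,
    min_accuracy: float = 0.8,
) -> List[Tuple[str, str]]:
    """Return (sample_id, agreed label) pairs for samples on which the
    creators agree with each other and with the known answer."""
    selected = []
    for sample_id, correct in truth.items():
        labels = [j[sample_id] for j in judgments.values()]
        majority, count = Counter(labels).most_common(1)[0]
        agreement = count / len(labels)                 # share of the majority label
        accuracy = labels.count(correct) / len(labels)  # share matching the truth
        if agreement >= min_agreement and accuracy >= min_accuracy:
            selected.append((sample_id, majority))
    return selected


# Hypothetical known answers and best-group judgments.
truth = {"img1": "malignant", "img2": "benign", "img3": "malignant"}
judgments = {
    "P1": {"img1": "malignant", "img2": "benign", "img3": "benign"},
    "P2": {"img1": "malignant", "img2": "malignant", "img3": "benign"},
    "P3": {"img1": "malignant", "img2": "benign", "img3": "benign"},
}

teacher_data_a = select_teacher_data_a(judgments, truth)
```

 Only img1 passes both checks here: img2 lacks sufficient agreement, and img3 is agreed upon but wrong.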
 For data on which the multiple creators belonging to the best group show a high match rate among themselves but whose match rate with the data that determine correctness falls below the predetermined threshold, that is, data for which the agreed judgment is wrong, all creators belonging to the best group are made aware that the data in question are prone to misjudgment. This further improves the quality of the creators belonging to the best group.
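 Flagging such error-prone data might be sketched as follows: a sample is reported when the creators largely agree yet their match rate with the known answer is below a threshold. The function name, thresholds, and example data are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List


def find_error_prone(
    judgments: Dict[str, Dict[str, str]],  # creator -> {sample_id: label}
    truth: Dict[str, str],                 # sample_id -> known correct label
    min_agreement: float = 0.8,
    accuracy_threshold: float = 0.8,
) -> List[str]:
    """Return sample ids on which creators agree with each other but
    fall below the accuracy threshold against the known answer."""
    flagged = []
    for sample_id, correct in truth.items():
        labels = [j[sample_id] for j in judgments.values()]
        agreement = Counter(labels).most_common(1)[0][1] / len(labels)
        accuracy = labels.count(correct) / len(labels)
        if agreement >= min_agreement and accuracy < accuracy_threshold:
            flagged.append(sample_id)
    return flagged


# Hypothetical known answers and best-group judgments.
truth = {"img1": "malignant", "img2": "benign", "img3": "malignant"}
judgments = {
    "P1": {"img1": "malignant", "img2": "benign", "img3": "benign"},
    "P2": {"img1": "malignant", "img2": "malignant", "img3": "benign"},
    "P3": {"img1": "malignant", "img2": "benign", "img3": "benign"},
}

error_prone = find_error_prone(judgments, truth)
```

 The flagged list (img3 in this example) is what would be shared with every creator in the best group.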
 In addition, as data that are candidates for teacher data, the teacher data creation system 1A acquires, with the information acquisition device 2, data for which the correct answer is unknown and the correctness of a judgment result is therefore undetermined, and stores them in the storage device 4. Furthermore, the teacher data creation system 1A acquires, with the information acquisition device 2, the judgment results given for those data by the creators belonging to the best group, and stores them in the storage device 4.
 The control device 5 selects, as teacher data B for artificial intelligence training, the set of those data for which the judgment results show a high match rate among the multiple creators belonging to the best group, together with the corresponding judgment results by the creators. The judgment results in teacher data B can likewise be expected to be accurate.
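 Selection of teacher data B from data with no known answer could be sketched as below, keeping only samples on which the best group's judgments largely coincide. The agreement threshold, sample identifiers, and labels are hypothetical, and majority share again stands in for the "match rate among creators".

```python
from collections import Counter
from typing import Dict, List, Tuple


def select_teacher_data_b(
    judgments: Dict[str, Dict[str, str]],  # creator -> {sample_id: label}
    min_agreement: float = 0.8,
) -> List[Tuple[str, str]]:
    """Return (sample_id, agreed label) pairs for unlabeled samples on
    which the share of the majority label reaches min_agreement."""
    # Only consider samples that every creator has judged.
    sample_ids = sorted(set.intersection(*(set(j) for j in judgments.values())))
    selected = []
    for sample_id in sample_ids:
        labels = [j[sample_id] for j in judgments.values()]
        majority, count = Counter(labels).most_common(1)[0]
        if count / len(labels) >= min_agreement:
            selected.append((sample_id, majority))
    return selected


# Hypothetical judgments by best-group creators on samples with unknown answers.
judgments_unknown = {
    "P1": {"u1": "benign", "u2": "malignant"},
    "P2": {"u1": "benign", "u2": "benign"},
    "P3": {"u1": "benign", "u2": "malignant"},
}

teacher_data_b = select_teacher_data_b(judgments_unknown)
```

 Sample u1 is unanimous and is kept with its agreed label; u2 is split and is left out.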
 An example of obtaining teacher data from pathological images with the teacher data creation system 1A of the present embodiment is described below.
 As the data used for selecting creators of teacher data, the teacher data creation system 1A acquires, with the information acquisition device 2, multiple pathological images of cases whose prognosis is known, so that it can be determined whether a diagnosis result is correct, and stores them in the storage device 4.
 The teacher data creation system 1A also designates a large number of pathologists as candidates for creators of teacher data. With the information acquisition device 2, it acquires the diagnosis results given by these pathologists for each pathological image, together with the prognosis information, which is the data that determine whether a diagnosis result is correct, and stores them in the storage device 4.
 Using a cluster analysis technique, the control device 5 divides the pathologists into groups based on, among other things, the per-item match rate between the diagnosis results for the multiple diagnostic items set for the pathological images and the prognosis information.
 Then, among the groups of pathologists, the group whose diagnosis results have the highest match rate with the prognosis information, or a match rate at or above a predetermined threshold, is designated the best group, and the members of the best group are taken as the creators of teacher data.
 The control device 5 takes the set of the pathological images used for selecting the creators and the corresponding prognosis information as teacher data A for artificial intelligence training. Because the correct answer for teacher data A is known, the accuracy of the diagnosis results is assured.
 Alternatively, from among the pathological images used for selecting the creators, the control device 5 may select as teacher data A those pathological images for which the diagnosis results by the pathologists belonging to the best group both match the prognosis information at or above a predetermined threshold and show a high match rate among those pathologists, together with the corresponding diagnosis results by the pathologists. In this case, teacher data A not only assures the accuracy of the diagnosis results but also yields diagnosis results close to those of actual pathologists.
 The teacher data creation system 1A also acquires, with the information acquisition device 2, pathological images whose prognosis is unknown or which carry no prognosis information, so that the correctness of a diagnosis result is undetermined, and stores them in the storage device 4. Furthermore, the teacher data creation system 1A acquires, with the information acquisition device 2, the diagnosis results given for these pathological images by the pathologists belonging to the best group, and stores them in the storage device 4.
 The control device 5 selects, as teacher data B for artificial intelligence training, the set of those pathological images for which the diagnosis results show a high match rate among the multiple pathologists belonging to the best group, together with the corresponding diagnosis results by the pathologists. The diagnosis results in teacher data B can likewise be expected to be accurate.
 <Example of the teacher data creation method of the present embodiment>
 FIG. 2 is a flowchart showing an example of the teacher data creation method of the present embodiment; an example of selecting teacher data from pathological images is described.
 As the data used for selecting creators of teacher data, data are chosen for which the correct answer is known, so that the judgment results of creator candidates can be checked for correctness. In this example, multiple pathological images of cases whose prognosis is known, and for which it can therefore be determined whether a diagnosis result is correct, are acquired. The diagnosis results given for these pathological images by a large number of pathologists and the known prognosis information are also acquired (step SA1).
 Based on the diagnosis results of each pathologist, the pathologists are divided into groups using a cluster analysis technique or the like, according to, among other things, the match rate between the diagnosis results for multiple diagnostic items and the prognosis information (step SA2). The group whose classification correlated best with prognosis, for example the group whose match rate between diagnosis results and prognosis information is higher than that of the other groups, is extracted as the best group, and the pathologists belonging to the best group are selected as the creators of teacher data (step SA3).
 The set of the pathological images used for selecting the creators and the corresponding prognosis information is selected as teacher data A for artificial intelligence training (step SA4).
 As data that are candidates for teacher data, pathological images are acquired whose prognosis is unknown, which carry no prognosis information, and for which the correctness of a diagnosis result is therefore undetermined (step SA5).
 The diagnosis results given for the pathological images with an unknown prognosis by the multiple pathologists belonging to the best group are acquired (step SA6).
 The set of those pathological images for which the diagnosis results show a high match rate among the multiple pathologists belonging to the best group, together with the corresponding diagnosis results by the pathologists, is selected as teacher data B for artificial intelligence training (step SA7).
 FIG. 3 is a flowchart showing another example of the teacher data creation method of the present embodiment; an example of selecting teacher data from pathological images is described.
 In FIG. 3, steps SB1, SB2, and SB3 extract the best group by the same processing as steps SA1, SA2, and SA3 of FIG. 2, and the pathologists belonging to the best group are selected as the creators of teacher data.
 From among the pathological images used for selecting the creators, the set of pathological images of cases on which the pathologists belonging to the best group show a high match rate, together with the corresponding diagnosis results by the pathologists, is selected as teacher data A for artificial intelligence training (step SB4). It is desirable to further restrict teacher data A to images for which the match rate between the diagnosis results and the prognosis information is at or above a predetermined threshold.
 In FIG. 3, steps SB5, SB6, and SB7 perform the same processing as steps SA5, SA6, and SA7 of FIG. 2: from among the pathological images with an unknown prognosis that are candidates for teacher data, the set of pathological images for which the diagnosis results show a high match rate among the multiple pathologists belonging to the best group, together with the corresponding diagnosis results by the pathologists, is selected as teacher data B.
 <Example of the operation and effects of the teacher data creation system and creation method of the present embodiment>
 When pathologists who belong to a group with a high match rate between diagnosis results and prognosis information for pathological images with a known prognosis, that is, a group with a high correct answer rate, diagnose pathological images with an unknown prognosis, the match rate between their diagnosis results and the actual prognosis can likewise be expected to be high.
 Accordingly, in the teacher data creation system and creation method of the present embodiment, data such as pathological images with a known prognosis, that is, images or the like for which the correct answer is known and for which it can be determined whether a judgment result is correct, are used as the data for selecting creators of teacher data.
 Whether the creator candidates' judgment results for these data are correct can be determined by computer. Based on those judgment results, the creator candidates can be divided into groups, and the best group, the one with the highest correct answer rate, can be extracted.
 Teacher data A described above, namely the data used for selecting the creators, or those among them for which the best group's judgment results have a high correct answer rate, is well suited as teacher data for artificial intelligence training.
 In addition, the best group's judgment results for data with unknown correct answers are considered likely to have a higher correct answer rate than the judgment results of other groups. Teacher data B, which accumulates data with unknown correct answers together with the best group's judgment results for those data, is therefore also well suited as teacher data for artificial intelligence training.
 Thus, by selecting the best group and obtaining teacher data with the above system and method, a large amount of teacher data B, whose correct answers are unknown but whose correct answer rate can be expected to be high, can be added to teacher data A, for which the correct answers are known or for which the creators' correct answer rate is at or above a predetermined threshold.
 Consequently, artificial intelligence can be trained on teacher data with a high correct answer rate, and the amount of such teacher data available for training can be increased.
 For example, after the creators of teacher data have been selected and the initial teacher data A obtained, pathological images can be acquired in cooperation with medical institutions, the pathologists belonging to the best group (that is, the creators of teacher data) can diagnose those images, and teacher data B can be accumulated by acquiring their diagnosis results.
 The invention can also be applied beyond the medical field, for example to selecting teacher data in fields where food quality is judged by artificial intelligence.
 1A ... Teacher data creation system, 2 ... Information acquisition device, 3 ... Display device, 4 ... Storage device, 5 ... Control device

Claims (10)

  1.  A teacher data creation system comprising:
     an information acquisition device that acquires data with known correct answers used for selecting creators of teacher data, data with unknown correct answers that are candidates for teacher data, judgment results for each item of data, and data on the correctness of the judgment results for the data with known correct answers;
     a storage device that stores the data with known correct answers used for selecting creators of teacher data, the data with unknown correct answers that are candidates for teacher data, the judgment results for each item of data, and the data on the correctness of the judgment results for the data with known correct answers; and
     a control device that selects the creators of teacher data from the judgment results given by creator candidates for the data with known correct answers and the data on the correctness of those judgment results, selects first teacher data from among the data with known correct answers used for selecting the creators, and selects second teacher data from among the data with unknown correct answers that are candidates for teacher data, based on the judgment results of the creators of teacher data.
  2.  The teacher data creation system according to claim 1, wherein the control device:
     selects the creators of teacher data based on the correct answer rate, from the judgment results given by creator candidates for the data with known correct answers and the data on the correctness of those judgment results;
     selects, as the first teacher data, a set of the data with known correct answers used in selecting the creators and the corresponding data on the correctness of the judgment results; and
     selects, as the second teacher data, a set of those data with unknown correct answers for which the judgment results show a high match rate among multiple creators of teacher data, and the corresponding judgment results of the creators.
  3.  The teacher data creation system according to claim 1, wherein the control device:
     selects the creators of teacher data based on the correct answer rate, from the judgment results given by creator candidates for the data with known correct answers and the data on the correctness of those judgment results;
     selects, as the first teacher data, from among the data with known correct answers used in selecting the creators, a set of those data for which both the correct answer rate of the judgment results and the match rate of the judgment results among multiple creators of teacher data are high, and the corresponding judgment results of the creators; and
     selects, as the second teacher data, a set of those data with unknown correct answers for which the judgment results show a high match rate among multiple creators of teacher data, and the corresponding judgment results of the creators.
  4.  The teacher data creation system according to claim 2, wherein:
     the data with known correct answers used for selecting the creators of teacher data are pathological images with a known prognosis;
     the control device selects the creators of teacher data based on the match rate between the diagnosis results given by creator candidates for the pathological images with a known prognosis and the prognosis information, and
     selects, as the first teacher data, a set of the pathological images with a known prognosis used in selecting the creators and the corresponding prognosis information;
     the data with unknown correct answers that are candidates for teacher data are pathological images with an unknown prognosis; and
     the control device selects, as the second teacher data, a set of those pathological images with an unknown prognosis for which the diagnosis results show a high match rate among multiple creators of teacher data, and the corresponding diagnosis results.
  5.  The teacher data creation system according to claim 3, wherein:
     the data with known correct answers used for selecting the creators of teacher data are pathological images with a known prognosis;
     the control device selects the creators of teacher data based on the match rate between the diagnosis results given by creator candidates for the pathological images with a known prognosis and the prognosis information, and
     selects, as the first teacher data, from among the pathological images with a known prognosis used in selecting the creators, a set of those pathological images for which both the match rate between the diagnosis results and the prognosis information and the match rate of the diagnosis results among multiple creators of teacher data are high, and the corresponding diagnosis results;
     the data with unknown correct answers that are candidates for teacher data are pathological images with an unknown prognosis; and
     the control device selects, as the second teacher data, a set of those pathological images with an unknown prognosis for which the diagnosis results show a high match rate among multiple creators of teacher data, and the corresponding diagnosis results.
  6.  A teacher data creation method comprising:
     a step of acquiring data with known correct answers used for selecting creators of teacher data, judgment results given by creator candidates for those data, and data on the correctness of the judgment results;
     a step of selecting the creators of teacher data from the judgment results given by the candidates for the data with known correct answers and the data on the correctness of the judgment results;
     a step of selecting first teacher data from among the data with known correct answers used for selecting the creators;
     a step of acquiring data with unknown correct answers that are candidates for teacher data;
     a step of acquiring judgment results given by the creators of teacher data for the data with unknown correct answers that are candidates for teacher data; and
     a step of selecting second teacher data from among the data with unknown correct answers that are candidates for teacher data, based on the judgment results given by the creators of teacher data.
  7.  The teacher data creation method according to claim 6, wherein:
     the creators of teacher data are selected based on the correct answer rate, from the judgment results given by creator candidates for the data with known correct answers and the data on the correctness of those judgment results;
     a set of the data with known correct answers used in selecting the creators and the corresponding data on the correctness of the judgment results is selected as the first teacher data; and
     a set of those data with unknown correct answers for which the judgment results show a high match rate among multiple creators of teacher data, and the corresponding judgment results of the creators, is selected as the second teacher data.
  8.  The teacher data creation method according to claim 6, wherein:
     the creators of teacher data are selected based on the correct answer rate, from the judgment results given by creator candidates for the data with known correct answers and the data on the correctness of those judgment results;
     from among the data with known correct answers used in selecting the creators, a set of those data for which both the correct answer rate of the judgment results and the match rate of the judgment results among multiple creators of teacher data are high, and the corresponding judgment results of the creators, is selected as the first teacher data; and
     a set of those data with unknown correct answers for which the judgment results show a high match rate among multiple creators of teacher data, and the corresponding judgment results of the creators, is selected as the second teacher data.
  9.  The teacher data creation method according to claim 7, wherein:
     the data with known correct answers used for selecting the creators of teacher data are pathological images with a known prognosis;
     the creators of teacher data are selected based on the match rate between the diagnosis results given by creator candidates for the pathological images with a known prognosis and the prognosis information;
     a set of the pathological images with a known prognosis used in selecting the creators and the corresponding prognosis information is selected as the first teacher data;
     the data with unknown correct answers that are candidates for teacher data are pathological images with an unknown prognosis; and
     a set of those pathological images with an unknown prognosis for which the diagnosis results show a high match rate among multiple creators of teacher data, and the corresponding diagnosis results of the creators, is selected as the second teacher data.
  10.  The method for creating teacher data according to claim 8, characterized in that:
     the data with known correct answers used to select the teacher data creators are pathological images with a known prognosis;
     teacher data creators are selected on the basis of the concordance rate between the diagnoses made by candidate teacher data creators for the pathological images with a known prognosis and the prognosis information;
     from the pathological images with a known prognosis used in selecting the teacher data creators, pathological images with a high concordance rate between the diagnoses and the prognosis information and a high concordance rate of the diagnoses among a plurality of teacher data creators, together with the corresponding diagnoses of the teacher data creators, are selected as the first teacher data;
     the data with unknown correct answers that are candidates for teacher data are pathological images with an unknown prognosis; and
     pathological images with an unknown prognosis for which the diagnoses show a high concordance rate among a plurality of teacher data creators, together with the corresponding diagnoses of the teacher data creators, are selected as the second teacher data.
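
The selection steps recited in claims 8 and 10 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The function names, the threshold parameters, the image identifiers, the "good"/"poor" labels and the use of the majority judgment as the stored label are all assumptions made for illustration; the claims only require that the correct answer rate and the concordance rate be "high" and do not fix any threshold.

```python
from collections import Counter


def accuracy(judgments, truth):
    """Fraction of known-answer items that one candidate judged correctly."""
    return sum(judgments[item] == answer for item, answer in truth.items()) / len(truth)


def select_creators(candidates, truth, min_accuracy):
    """Keep candidates whose correct answer rate reaches the (assumed) threshold."""
    return [name for name, judgments in candidates.items()
            if accuracy(judgments, truth) >= min_accuracy]


def majority(labels):
    """Return the most common label and the share of creators who gave it."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


def first_teacher_data(creators, candidates, truth, min_item_accuracy, min_concordance):
    """Known-answer items with a high per-item correct answer rate and high concordance."""
    selected = {}
    for item, answer in truth.items():
        labels = [candidates[c][item] for c in creators]
        item_accuracy = sum(label == answer for label in labels) / len(labels)
        label, concordance = majority(labels)
        if item_accuracy >= min_item_accuracy and concordance >= min_concordance:
            selected[item] = label  # assumption: store the majority judgment
    return selected


def second_teacher_data(creators, judgments, items, min_concordance):
    """Unknown-answer items on which the selected creators largely agree."""
    selected = {}
    for item in items:
        label, concordance = majority([judgments[c][item] for c in creators])
        if concordance >= min_concordance:
            selected[item] = label
    return selected


# Hypothetical data: four images with a known prognosis, four candidate creators.
truth = {"img1": "good", "img2": "poor", "img3": "good", "img4": "poor"}
candidates = {
    "A": {"img1": "good", "img2": "poor", "img3": "good", "img4": "poor"},
    "B": {"img1": "good", "img2": "poor", "img3": "poor", "img4": "poor"},
    "C": {"img1": "good", "img2": "poor", "img3": "good", "img4": "good"},
    "D": {"img1": "poor", "img2": "good", "img3": "good", "img4": "good"},
}
# Hypothetical judgments by the selected creators on images with an unknown prognosis.
unknown = {
    "A": {"img5": "good", "img6": "poor"},
    "B": {"img5": "good", "img6": "good"},
    "C": {"img5": "good", "img6": "poor"},
}

creators = select_creators(candidates, truth, min_accuracy=0.75)
first = first_teacher_data(creators, candidates, truth,
                           min_item_accuracy=0.9, min_concordance=0.9)
second = second_teacher_data(creators, unknown, ["img5", "img6"], min_concordance=0.9)
```

With these made-up inputs, candidate D is excluded as a creator, and only the images on which the remaining creators agree unanimously are kept. Lowering the thresholds would admit more images at the cost of noisier labels.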
PCT/JP2020/000424 2020-01-09 2020-01-09 Teaching data generation system and teaching data generation method WO2021140604A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2021569658A JP7482537B2 (en) 2020-01-09 2020-01-09 Teacher data creation system
PCT/JP2020/000424 WO2021140604A1 (en) 2020-01-09 2020-01-09 Teaching data generation system and teaching data generation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2020/000424 WO2021140604A1 (en) 2020-01-09 2020-01-09 Teaching data generation system and teaching data generation method

Publications (1)

Publication Number Publication Date
WO2021140604A1 (en) 2021-07-15

Family

ID=76787908

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/000424 WO2021140604A1 (en) 2020-01-09 2020-01-09 Teaching data generation system and teaching data generation method

Country Status (2)

Country Link
JP (1) JP7482537B2 (en)
WO (1) WO2021140604A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2018106662A (en) * 2016-12-22 2018-07-05 キヤノン株式会社 Information processor, information processing method, and program

Also Published As

Publication number Publication date
JPWO2021140604A1 (en) 2021-07-15
JP7482537B2 (en) 2024-05-14


Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20912451; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2021569658; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 20912451; Country of ref document: EP; Kind code of ref document: A1)