JP7510760B2 - Medical information processing device and medical information processing system

Info

Publication number: JP7510760B2
Application number: JP2020004726A
Authority: JP
Inventors: 翔佐々木; 亮一長江; 智生藤戸; 公一寺井; 佑介狩野; 優大山崎; 佳史山形; 昌史吉田; 雄之介春; 滉平篠原
Original assignee: Canon Medical Systems Corp
Current assignee: Canon Medical Systems Corp
Priority date: 2020-01-15
Filing date: 2020-01-15
Publication date: 2024-07-04
Anticipated expiration: 2040-01-15
Also published as: JP2021111288A

Description

本明細書に開示の実施形態は、医用情報処理装置及び医用情報処理システムに関する。 The embodiments disclosed in this specification relate to a medical information processing device and a medical information processing system.

近年、医療分野においても多くのＡＩ技術が適用され応用されている。このような医療分野でのＡＩ技術では、実際の診療において利用された医用データが、その患者の許可を受け、トレーニングデータとして利用される。従って、医用システムに組み込まれた学習モデルは、被検者の許可を得たトレーニングデータを用いてトレーニングされたものとなる。 In recent years, many AI technologies have been applied in the medical field. With such AI technologies in the medical field, medical data used in actual medical treatment is used as training data with the patient's permission. Therefore, the learning model built into the medical system is trained using training data with the subject's permission.

しかしながら、患者からデータの使用を制限又は禁止する申し出が事後的に発生した場合、これまでの学習モデルを使用できなくなる可能性がある。同時に、この様な問題にあたり、トレーニングデータセットに含まれる膨大なデータの中から使用制限のかかったデータを手動で差し替え学習済モデルを再学習させることは、大きな負担となる。 However, if a patient subsequently requests to restrict or prohibit the use of their data, it may become impossible to use the trained model. At the same time, in order to deal with such a problem, manually replacing the data with the usage restrictions from the vast amount of data contained in the training dataset and retraining the trained model would be a huge burden.

特表２０１３－５３７３２６号公報JP 2013-537326 A

本明細書等に開示の実施形態が解決しようとする課題の一つは、学習モデルに用いたトレーニング用データに事後的に使用制限がかかった場合であっても、トレーニング用データセットを容易に再構築できるようにすることである。 One of the problems that the embodiments disclosed in this specification aim to solve is to make it possible to easily reconstruct a training data set even if restrictions are subsequently imposed on the training data used in a learning model.

本実施形態に係る医用情報処理装置は、データベースと、情報管理部と、加工部とを備える。前記データベースは、学習モデルのトレーニングデータとして用いられる、患者毎に管理された複数のトレーニング用データを含む第１のデータセットと、前記トレーニング用データと同じデータ構造を有する複数の予備用データを含む第２のデータセットと、を格納する。前記情報管理部は、前記第１のデータセットに含まれる前記複数のトレーニング用データのうち、前記患者毎に関連付けられたプライバシー情報の少なくとも一部の使用制限に応じて、使用不可となった前記プライバシー情報を特定する特定情報を生成する。前記加工部は、前記第１のデータセットに含まれる前記複数のトレーニング用データのうち、前記情報管理部が生成した前記特定情報によって特定される使用不可となった前記トレーニング用データを、前記第２のデータセットに含まれる複数の前記予備用データのうちの使用可能である前記予備用データと置換し、前記第１のデータセットを加工する。 The medical information processing device according to the present embodiment includes a database, an information management unit, and a processing unit. The database stores a first dataset including a plurality of training data managed for each patient, which are used as training data for a learning model, and a second dataset including a plurality of spare data having the same data structure as the training data. The information management unit generates identification information for identifying unusable privacy information among the plurality of training data included in the first dataset, according to a usage restriction of at least a part of privacy information associated with each patient. The processing unit replaces the unusable training data identified by the identification information generated by the information management unit among the plurality of training data included in the first dataset with the spare data that is usable among the plurality of spare data included in the second dataset, and processes the first dataset.

FIG. 1 is a diagram for explaining a situation in which the medical information processing device according to the embodiment is used.
FIG. 2 is a block diagram showing the configuration of the medical information processing device according to the embodiment.
FIG. 3 is a diagram for explaining the data structure of the training data and the spare data.
FIG. 4 is a flowchart for explaining the flow of the reconstruction process and the post-learning process of the learning model dataset.

以下、添付図面を参照して、実施形態に係る医用情報処理装置について説明する。 The medical information processing device according to the embodiment will be described below with reference to the attached drawings.

まず、実施形態に係る医用情報処理装置が使用される状況について説明する。図１は、実施形態に係る医用情報処理装置が使用される状況について説明するための図である。図１において、企業Ａは、医療において利用される学習モデル（例えば、低解像度の画像を入力とし高解像度の画像を出力する超解像モデル、診断に用いる画像を入力とし腫瘍等の有無を判定するＣＡＤ用モデル、領域抽出用モデル等）を開発・製造する。このような学習モデルは、医用画像診断装置や医用画像処理装置に実装される。また、このような学習モデルは、企業Ａが有するサーバ装置に実装し、病院Ｂが有するクライアント装置からの要求によりネットワークＮを介して利用されることもある。 First, a situation in which the medical information processing device according to the embodiment is used will be described. FIG. 1 is a diagram for explaining a situation in which the medical information processing device according to the embodiment is used. In FIG. 1, company A develops and manufactures learning models used in medicine (for example, a super-resolution model that takes a low-resolution image as input and outputs a high-resolution image, a CAD model that takes an image used for diagnosis as input and determines the presence or absence of a tumor, a model for region extraction, etc.). Such learning models are implemented in medical image diagnostic devices and medical image processing devices. In addition, such learning models may be implemented in a server device owned by company A and used via network N in response to a request from a client device owned by hospital B.

病院Ｂは、企業Ａが製造した学習モデルが内蔵された医用画像診断装置や医用画像処理装置、或いはネットワークＮを介して企業Ａが有するサーバ装置に実装された学習モデルを使用する。 Hospital B uses a medical image diagnostic device or a medical image processing device that has a built-in learning model manufactured by company A, or a learning model implemented on a server device owned by company A via network N.

また、企業Ａが製造した学習モデルは、複数の医用情報者から提供された医用情報をトレーニングデータとして利用することで製造されたものである。医用情報提供者Ｃは、このトレーニングデータとして利用された医用情報の提供者（典型的には患者）の一人であり、自身の判断と任意のタイミングで、自身が提供した医用情報の使用可否（提供した医用情報の一部の使用可否も含む）を企業Ａに意思表示することができる。 The learning model created by company A was produced by using medical information provided by multiple medical information providers as training data. Medical information provider C is one of the providers (typically patients) of the medical information used as this training data, and can express to company A at his/her own discretion and at any time whether or not the medical information he/she provided can be used (including whether or not part of the medical information provided can be used).

Note that only one medical information provider C is shown in FIG. 1 merely for convenience of explanation. Naturally, there are as many medical information providers as there are people from whom medical information was received. Furthermore, the person who informs company A whether the provided medical information may be used need not be the medical information provider; it may be a representative of the provider (e.g., a relative, legal representative, or agent). In the following description, for the sake of concreteness, the case where the information provider is the patient is used as an example.

図２は、本実施形態に係る医用情報処理装置１の構成を示したブロック図である。医用情報処理装置１は、典型的には企業Ａによって所有され管理されるものであり、専用又は汎用コンピュータである。 Figure 2 is a block diagram showing the configuration of a medical information processing device 1 according to this embodiment. The medical information processing device 1 is typically owned and managed by company A, and is a dedicated or general-purpose computer.

医用情報処理装置１は、記憶回路１０、処理回路１１、入力回路１２、通信Ｉ／Ｆ部1３、表示回路１４を備える。 The medical information processing device 1 includes a memory circuit 10, a processing circuit 11, an input circuit 12, a communication I/F unit 13, and a display circuit 14.

記憶回路１０は、ＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）、フラッシュメモリ（ｆｌａｓｈｍｅｍｏｒｙ）等の半導体メモリ素子、ハードディスク、光ディスク等によって構成される。記憶回路１０は、ＵＳＢ（ＵｎｉｖｅｒｓａｌＳｅｒｉａｌＢｕｓ）メモリ及びＤＶＤ（ＤｉｇｉｔａｌＶｉｄｅｏＤｉｓｋ）などの可搬型メディアによって構成されてもよい。 The memory circuit 10 is composed of semiconductor memory elements such as RAM (Random Access Memory) and flash memory, a hard disk, an optical disk, etc. The memory circuit 10 may also be composed of portable media such as a USB (Universal Serial Bus) memory and a DVD (Digital Video Disk).

The memory circuit 10 stores various processing programs used in the processing circuit 11 (including application programs as well as an OS (Operating System) and the like), data required to execute the programs, volume data, and medical images. The OS can also include a GUI (Graphical User Interface) that makes extensive use of graphics to display information to the operator on the display circuit 14 and allows basic operations to be performed through the input circuit 12.

また、記憶回路１０は、学習モデル毎に生成された複数の学習モデル用データセット１０－１～１０－ｎを記憶する。 The memory circuit 10 also stores multiple learning model data sets 10-1 to 10-n generated for each learning model.

The learning model datasets 10-1 to 10-n are databases in which medical data is registered for each patient according to a predetermined data structure, and they are managed for each learning model, such as a super-resolution model or a CAD model. Each of the learning model datasets 10-1 to 10-n consists of a training dataset (10-1a to 10-na) made up of multiple pieces of training data and a spare dataset (10-1b to 10-nb) made up of multiple pieces of spare data.

ここで、トレーニング用データとは、例えば深層学習モデル等のＡＩモデルのトレーニング（又は学習）に使用するトレーニングデータ（訓練データ）であり、患者毎に管理される。また、対応する学習モデルのトレーニングに必要とされる量の複数のトレーニング用データをトレーニング用データセットと呼ぶ。従って、トレーニング用データセットは、複数の患者から提供されたトレーニング用データによって構成される。 Here, training data refers to training data used for training (or learning) an AI model, such as a deep learning model, and is managed for each patient. In addition, multiple pieces of training data in the amount required for training the corresponding learning model are called a training dataset. Therefore, a training dataset is composed of training data provided by multiple patients.

Spare data has the same data structure as training data. It is not used directly to train the AI model; instead, it is data used to replace unusable data when the training dataset comes to contain unusable data after the learning model has been constructed. For this purpose, spare data is generated with reference to the training data (for example, so that its content resembles that of the training data). A collection of multiple pieces of spare data is called a spare dataset. Like a training dataset, a spare dataset is composed of spare data provided by multiple patients. Note that a spare dataset may be managed, for example, in association with the training dataset whose unusable data it is intended to replace.

図３は、トレーニング用データ、予備用データのデータ構造を説明するための図であり、トレーニング用データ、予備用データに付されるメタ情報の一例である。図３に示すように、トレーニング用データ及び予備用データは、患者毎すなわち患者ＩＤ毎に管理され、各患者について性別、生年月日、身長、体重、遺伝子情報、病歴、検査情報（ＣＴ画像検査、ＭＲ画像検査等）等の各種情報について体系的に分類されている。 Figure 3 is a diagram for explaining the data structure of training data and spare data, and is an example of meta information attached to the training data and spare data. As shown in Figure 3, the training data and spare data are managed for each patient, i.e., for each patient ID, and are systematically classified according to various information about each patient, such as gender, date of birth, height, weight, genetic information, medical history, and examination information (CT imaging, MR imaging, etc.).

患者ＩＤ及び当該患者ＩＤに関連づけされた各情報には、プライバシータグＰが付されている。ここで、プライバシータグとは、対応する情報の使用可否を示すタグである。例えば、患者ＩＤそのもののプライバシータグ（第１のタグ）は、当該患者に関連付けされた全ての情報の使用可否を一括して管理するタグである。図３に示した例では、患者ＩＤのプライバシータグＰについて「使用可」を示すものとしてチェックが付されている。このため、当該患者に関する情報は、その下に関連付けされた情報毎に使用可否が管理されることになる。一方、仮に図３に示した例において、患者ＩＤのプライバシータグＰについて「使用不可」を示すものとしてチェックが外されている場合には、当該患者に関連する全ての情報は、一括して使用不可として管理されることになる。 A privacy tag P is attached to the patient ID and each piece of information associated with the patient ID. Here, a privacy tag is a tag that indicates whether the corresponding information can be used. For example, the privacy tag (first tag) of the patient ID itself is a tag that collectively manages whether all information associated with the patient can be used. In the example shown in FIG. 3, the privacy tag P of the patient ID is checked to indicate "usable". Therefore, the information related to the patient is managed for usability for each piece of information associated below it. On the other hand, if, in the example shown in FIG. 3, the privacy tag P of the patient ID is unchecked to indicate "unusable", all information related to the patient is collectively managed as unusable.

また、図３に示した例では、性別、生年月日、身長、体重、病歴、検査情報といった患者ＩＤに関連する情報に付されたプライバシータグＰ（第２のタグ）には「使用可」を示すものとしてチェックが付されている。一方、遺伝子情報のプライバシータグＰには「使用不可」を示すものとしてチェックが外されている。 In the example shown in FIG. 3, the privacy tags P (second tags) attached to information related to the patient ID, such as gender, date of birth, height, weight, medical history, and examination information, are checked to indicate that they are "usable." On the other hand, the privacy tag P for genetic information is unchecked to indicate that it is "unusable."

従って、図３に示した当該患者のトレーニング用データ又は予備用データについては、遺伝子情報のみ「使用不可」として取り扱われることになる。 Therefore, for the training data or spare data for the patient shown in Figure 3, only the genetic information will be treated as "unusable."
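The two-level tagging described above can be sketched as follows. This is a minimal illustration only: the class name `PatientRecord`, its field names, and the sample values are assumptions made for this example and do not appear in the specification.

```python
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    """One patient's training or spare data with privacy tags (hypothetical layout)."""
    patient_id: str
    patient_usable: bool = True  # first tag: governs all of the patient's information
    items: dict = field(default_factory=dict)        # item name -> value
    item_usable: dict = field(default_factory=dict)  # item name -> second tag

    def usable_items(self) -> dict:
        """Return only the items whose use is currently permitted."""
        if not self.patient_usable:
            # An unchecked patient-level tag makes everything unusable at once.
            return {}
        return {
            name: value
            for name, value in self.items.items()
            if self.item_usable.get(name, False)
        }


# Mirrors the FIG. 3 example: every item is usable except the genetic information.
record = PatientRecord(
    patient_id="P001",
    items={"sex": "F", "height_cm": 160, "genetic_info": "sample-genome"},
    item_usable={"sex": True, "height_cm": True, "genetic_info": False},
)
print(sorted(record.usable_items()))
```

Treating an item with no tag as unusable is a conservative choice made for this sketch; the specification does not say how untagged items are handled.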

処理回路１１は、プログラムを記憶回路1０から読み出し、実行することで各プログラムに対応する機能を実現するプロセッサである。処理回路１１は、例えば、プライバシー管理機能１１１、データセット加工機能１１２、事後学習機能１１３を有する。処理回路１１は、記憶回路１０に格納されている各種制御プログラムを読み出してプライバシー管理機能１１１、データセット加工機能１１２、事後学習機能１１３を実現すると共に、記憶回路１０、入力回路１２、通信Ｉ／Ｆ部１３、表示回路１４における処理動作を統括的に制御する。換言すると、各プログラムを読み出した状態の処理回路１１は、プライバシー管理機能１１１、データセット加工機能１１２、事後学習機能１１３を有することとなる。 The processing circuit 11 is a processor that reads out the programs from the memory circuit 10 and executes them to realize the functions corresponding to each program. The processing circuit 11 has, for example, a privacy management function 111, a data set processing function 112, and a post-learning function 113. The processing circuit 11 reads out various control programs stored in the memory circuit 10 to realize the privacy management function 111, the data set processing function 112, and the post-learning function 113, and also comprehensively controls the processing operations in the memory circuit 10, the input circuit 12, the communication I/F unit 13, and the display circuit 14. In other words, the processing circuit 11 in a state in which each program has been read out has the privacy management function 111, the data set processing function 112, and the post-learning function 113.

The privacy management function 111 performs processing to protect privacy (privacy protection processing) on the medical information of each patient. Data that has undergone this privacy protection processing becomes, in principle, information from which an individual cannot be identified. Furthermore, if at least part of specific training data becomes unusable after a learning model has been generated (that is, if a usage restriction is subsequently placed on at least part of the training data), the privacy management function 111 generates information for identifying the unusable data (identification information). Together with the identification information, the privacy management function 111 instructs the dataset processing function 112 to reconstruct the training dataset that includes the unusable data. The privacy management function 111 is an example of an information management unit.

データセット加工機能１１２は、学習モデルのトレーニングを目的として患者から提供された情報を格納するデータベース（図示せず。また、このデータベースを「患者提供情報データベース」と呼ぶ。）から、学習モデルの種別に応じて事前に設定された条件に基づいて、トレーニング用データセットと予備用データセットに必要なデータを自動で収集し、学習モデル用データセットを自動的に構築する。なお、データセット加工機能１１２は加工部の一例である。 The dataset processing function 112 automatically collects data required for the training dataset and the spare dataset from a database (not shown; this database is also referred to as the "patient-provided information database") that stores information provided by patients for the purpose of training the learning model, based on conditions that are preset according to the type of learning model, and automatically constructs a dataset for the learning model. The dataset processing function 112 is an example of a processing unit.

すなわち、データセット加工機能１１２は、患者提供情報データベースに格納された情報から図３に示したデータ構造を有するトレーニング用データ又は予備用データを患者毎に生成し、トレーニング用データセットと予備用データセットとを学習モデル毎に自動的に構築する。データセット加工機能１１２は、トレーニング用データセット及び予備用データセットの構築に用いるデータのうち、トレーニング用データとして用いる割合と、予備用データとして用いる割合とを任意に設定することができる。 That is, the dataset processing function 112 generates training data or spare data having the data structure shown in FIG. 3 for each patient from the information stored in the patient-provided information database, and automatically constructs a training dataset and a spare dataset for each learning model. The dataset processing function 112 can arbitrarily set the proportion of data used to construct the training dataset and the spare dataset to be used as training data and the proportion to be used as spare data.
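As a rough illustration of setting the proportion of training data to spare data, the sketch below splits a pool of records at a caller-chosen ratio. The function name `split_records`, the random shuffle, and the `seed` parameter are assumptions for this example; the specification only states that the proportions can be set arbitrarily and does not prescribe how records are assigned.

```python
import random


def split_records(records, training_ratio, seed=0):
    """Split records into (training, spare) lists at the given ratio."""
    if not 0.0 <= training_ratio <= 1.0:
        raise ValueError("training_ratio must be between 0 and 1")
    shuffled = list(records)
    # A seeded shuffle keeps the split reproducible (an assumption of this sketch).
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * training_ratio)
    return shuffled[:cut], shuffled[cut:]


training, spare = split_records(range(10), training_ratio=0.8)
print(len(training), len(spare))
```

A purely random split does not guarantee that the spare data resembles the training data, which the description treats as desirable, so a practical implementation would likely stratify by the similarity indices discussed later.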

Note that the learning model dataset can also be constructed manually rather than automatically. It is also possible to specify in advance the data that should preferentially be used as training data. Furthermore, some data may be shared between the training dataset and the spare dataset.

Furthermore, the dataset processing function 112 automatically reconstructs the learning model dataset when triggered by a command from the privacy management function 111. That is, the dataset processing function 112 identifies the unusable data in the training dataset based on the identification information received from the privacy management function 111. The dataset processing function 112 then automatically reconstructs the learning model dataset by replacing the identified unusable data in the training dataset with data in the spare dataset that is highly similar to the unusable data and is therefore a candidate for replacement (candidate data).

Furthermore, the dataset processing function 112 calculates, for each piece of usable data in the spare dataset, its similarity to the unusable data to be replaced. The dataset processing function 112 then replaces the unusable data in the training dataset with the usable data in the spare dataset that has the highest similarity.

データセット加工機能１１２は、使用不可データに関する種々の指標を基準として類似度を計算することができる。 The dataset processing function 112 can calculate similarity based on various indicators related to unusable data.

類似度計算に用いる指標としては、例えば、「撮像プロトコル（Ｃａｒｄｉａｃ、Ｎｅｕｒｏ等）」、「撮像プログラム（造影剤使用の有無、ＤＡ、ＤＳＡ、回転撮像等）」、「患者情報（性別、年齢層、体格、症例等）」、「撮像時の装置の設定情報（管電圧値、管電流値、寝台の高さ、アームの角度等）」、「画像情報（検査目的、診断部位、モダリティ、画像種、解像度等）を挙げることができる。 Indices used in similarity calculations include, for example, "imaging protocol (Cardiac, Neuro, etc.)", "imaging program (whether contrast agent was used or not, DA, DSA, rotational imaging, etc.)", "patient information (gender, age group, physique, case, etc.)", "device setting information at the time of imaging (tube voltage value, tube current value, bed height, arm angle, etc.)", and "image information (purpose of examination, diagnostic area, modality, image type, resolution, etc.)".

複数の指標を用いた類似度計算は、例えば次のような類似度評価式を利用することができる。 To calculate similarity using multiple indices, the following similarity evaluation formula can be used, for example:

（類似度）＝１／｛（重み係数α）・（第１指標の差）＋（重み係数β）・（第２指標の差）＋・・・・｝ (Similarity) = 1/{(Weighting coefficient α) x (difference in first index) + (Weighting coefficient β) x (difference in second index) + ...}

Here, "difference in index" means the difference in that index between the unusable data and the spare data that is the subject of the similarity calculation. Each weighting coefficient adjusts how strongly the corresponding index is reflected in the similarity.
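The similarity evaluation formula above can be sketched in a few lines. This is only an illustration under stated assumptions: the indices are taken to be numeric, the difference is taken as an absolute difference, and the small constant `eps` is added so that identical data does not cause a division by zero. None of these details, nor the names `similarity`, `weights`, and `eps`, come from the specification.

```python
def similarity(unusable, spare, weights, eps=1e-9):
    """Inverse of the weighted sum of per-index differences.

    unusable, spare: dicts mapping index name -> numeric value
    weights: dict mapping index name -> weighting coefficient
    """
    weighted_sum = sum(
        weight * abs(unusable[name] - spare[name])
        for name, weight in weights.items()
    )
    return 1.0 / (weighted_sum + eps)


unusable = {"age": 50, "region_size_mm": 30}
near = {"age": 52, "region_size_mm": 31}
far = {"age": 70, "region_size_mm": 45}
weights = {"age": 1.0, "region_size_mm": 2.0}  # region size weighted more heavily

print(similarity(unusable, near, weights) > similarity(unusable, far, weights))
```

Categorical indices such as the imaging protocol would need their own notion of difference (for instance 0 for a match and 1 otherwise), which is left out of this sketch.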

例えば、特定の臓器の領域を自動抽出する学習モデルにおいて、入力データに対象臓器の領域をマーキングした画像データが使用されているとする。この学習モデルに使用されたトレーニングデータの差し替えにおいては、例えば対象領域の大きさや断面の位置を指標として含む類似度掲載を実行する。このとき、対象領域の大きさや断面の位置については、学習モデルの種別に鑑み、重み係数を大きくした上で類似度の計算をすることができる。 For example, in a learning model that automatically extracts the area of a specific organ, image data in which the area of the target organ is marked is used as input data. When replacing the training data used in this learning model, a similarity calculation is performed that includes, for example, the size of the target area and the position of the cross section as an index. In this case, the weighting coefficient for the size of the target area and the position of the cross section can be increased in consideration of the type of learning model before the similarity calculation is performed.

データセット加工機能１１２は、学習用モデルデータセットの構築又は再構築が完了した旨を、必要に応じて、事後学習機能１１３、企業Ｂの管理サーバに通知する。 The dataset processing function 112 notifies the post-learning function 113 and company B's management server, as necessary, that the construction or reconstruction of the learning model dataset has been completed.

事後学習機能１１３は、データセット加工機能１１２によって再構築されたトレーニング用データセットを用いて、対応する学習モデルについての事後学習を実行する。 The post-learning function 113 uses the training dataset reconstructed by the dataset processing function 112 to perform post-learning on the corresponding learning model.

また、事後学習機能１１３は、再構築されたトレーニング用データセットを用いた事後学習の結果を、必要に応じて、データセット加工機能１１２、企業Ｂの管理サーバに通知する。 In addition, the post-learning function 113 notifies the data set processing function 112 and company B's management server of the results of post-learning using the reconstructed training data set, as necessary.

The term "processor" used in the above description means a circuit such as a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), an application specific integrated circuit (ASIC), or a programmable logic device (for example, a Simple Programmable Logic Device (SPLD), a Complex Programmable Logic Device (CPLD), or a Field Programmable Gate Array (FPGA)). The processor realizes its functions by reading and executing the programs stored in the memory circuit 10. Note that instead of storing the programs in the memory circuit 10, the programs may be built directly into the circuit of the processor. In this case, the processor realizes its functions by reading and executing the programs built into the circuit.

The input circuit 12 is a circuit that receives signals from input devices operable by an operator, such as a pointing device (a mouse or the like) or a keyboard; here, the input devices themselves are also regarded as part of the input circuit 12. When the operator operates an input device, the input circuit 12 generates an input signal corresponding to the operation and outputs it to the processing circuit 11. The medical information processing device 1 may also include a touch panel in which the input device is integrated with the display circuit 14.

The input circuit 12 is realized by, for example, a trackball for setting a region of interest (ROI) and the like, a switch button, a mouse, a keyboard, a touchpad for performing input operations by touching the operation surface, a touch screen (touch panel display) in which a display screen and a touchpad are integrated, a non-contact input circuit using an optical sensor, and a voice input circuit.

Note that the input circuit 12 is not limited to one equipped with physical operating parts such as a mouse and a keyboard. For example, examples of the input circuit 12 also include an electrical signal processing circuit that receives an electrical signal corresponding to an input operation from an external input device provided separately from the device and outputs this electrical signal to a control circuit.

The display circuit 14 is a display that displays images, and is configured with an LCD (Liquid Crystal Display) or the like. In response to instructions from the processing circuit 11, the display circuit 14 displays various operation screens and various display information such as image data on the LCD.

通信Ｉ／Ｆ（ｉｎｔｅｒｆａｃｅ）回路１３は、所定の通信規格にしたがって、外部装置との通信動作を行う。医用情報処理装置１がネットワーク上に設けられる場合、通信Ｉ／Ｆ回路１３は、ネットワーク上の外部装置と情報の送受信を行なう。例えば、通信Ｉ／Ｆ回路１３は、撮像で得られたデータをＭＲＩ装置等の医用画像診断装置や医用画像管理装置から受信する。 The communication I/F (interface) circuit 13 communicates with an external device in accordance with a specified communication standard. When the medical information processing device 1 is provided on a network, the communication I/F circuit 13 transmits and receives information to and from external devices on the network. For example, the communication I/F circuit 13 receives data obtained by imaging from a medical image diagnostic device such as an MRI device or a medical image management device.

(Reconstruction process of the learning model dataset)
Next, the reconstruction process of the learning model dataset will be described.

まず、学習モデル用データセットの再構築処理が実行される典型的な場面について説明する。 First, we will explain a typical scenario in which the process of reconstructing a dataset for a learning model is performed.

データセット加工機能１１２によって最初のトレーニング用データセットが構築された時点では、トレーニング用データセット、予備用データセットのそれぞれに含まれている各データは、使用可能データとなっている。しかしながら、これらの各データが、例えば提供者である患者本人やその親族から使用を希望しない申し出があるなど、事後的に使用不可データとなる場合がある。係る場合には、使用不可データを含むトレーニング用データセットを用いてトレーニングされた学習モデルについても使用不可となる可能性がある。 When the initial training dataset is constructed by the dataset processing function 112, each piece of data contained in the training dataset and the spare dataset is usable data. However, there are cases where this data becomes unusable data later, for example, when the patient who provided the data or their relatives request that the data not be used. In such cases, there is a possibility that the learning model trained using the training dataset that includes the unusable data will also become unusable.

このため、トレーニング用データセット内の使用不可データを予備用データセット内の使用可能データと置換してトレーニング用データセットを再構築し、当該再構築されたトレーニング用データセットを用いて学習モデルを事後学習する必要がある。また、係る事後学習の結果、事後学習前の学習モデルと同程度の精度を担保する必要がある。 For this reason, it is necessary to reconstruct the training dataset by replacing the unusable data in the training dataset with the usable data in the spare dataset, and then to post-train the learning model using the reconstructed training dataset. Furthermore, it is necessary to ensure that the results of such post-training have the same level of accuracy as the learning model before the post-training.

以下、そのための学習モデル用データセットの再構築処理及び事後学習処理について説明する。 Below, we explain the reconstruction process of the learning model dataset and the post-learning process.

図４は、学習モデル用データセットの再構築処理及び事後学習処理の流れについて説明するためのフローチャートである。 Figure 4 is a flowchart explaining the process of reconstructing the learning model dataset and the post-learning process.

図４に示すように、まず、データセット加工機能１１２は、プライバシー管理機能１１１から使用不可データのデータ加工指示を受け付ける（ステップＳ１）。また、データセット加工機能１１２は、使用不可データに関する情報（すなわち、どの患者のどのデータが使用不可データであるのかを特定するための情報）を取得する（ステップＳ２）。 As shown in FIG. 4, first, the dataset processing function 112 receives a data processing instruction for unusable data from the privacy management function 111 (step S1). The dataset processing function 112 also acquires information about the unusable data (i.e., information for identifying which data of which patient is unusable data) (step S2).

データセット加工機能１１２は、取得した使用不可データに関する情報を用いて、トレーニング用データセット内の使用不可データを特定する（ステップＳ３）。 The dataset processing function 112 uses the acquired information about the unusable data to identify the unusable data in the training dataset (step S3).

データセット加工機能１１２は、予備用データセット内の各データと、特定された使用不可データとの類似度を計算し、計算された類似度に従って、特定された使用不可データとの置換対象となる候補データを決定する（ステップＳ４）。なお、このとき、予備用データセット内に特定された使用不可データと同一の提供者によるデータが含まれている場合がある。この様な特定された使用不可データと同一の提供者によるデータについては、類似度計算の対象から除外され、置換の対象とされない。 The dataset processing function 112 calculates the similarity between each data in the spare dataset and the identified unusable data, and determines candidate data to replace the identified unusable data according to the calculated similarity (step S4). At this time, the spare dataset may contain data provided by the same provider as the identified unusable data. Such data provided by the same provider as the identified unusable data is excluded from the similarity calculation and is not subject to replacement.
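Step S4 can be pictured as a filter followed by a sort. The sketch below assumes each record is a dict with hypothetical `patient_id` and `usable` keys and that a scoring function is supplied by the caller; the name `rank_candidates` and the sample `age_score` function are illustrative only.

```python
def rank_candidates(unusable, spare_dataset, score):
    """Return eligible spare records ordered from most to least similar."""
    eligible = [
        record
        for record in spare_dataset
        # Skip unusable spare data and data from the same provider.
        if record["usable"] and record["patient_id"] != unusable["patient_id"]
    ]
    return sorted(eligible, key=lambda record: score(unusable, record), reverse=True)


def age_score(a, b):
    """Toy similarity on a single index (assumption for this example)."""
    return 1.0 / (abs(a["age"] - b["age"]) + 1e-9)


unusable = {"patient_id": "P001", "age": 50, "usable": False}
spare_dataset = [
    {"patient_id": "P001", "age": 50, "usable": True},   # same provider: excluded
    {"patient_id": "P002", "age": 65, "usable": True},
    {"patient_id": "P003", "age": 51, "usable": True},
    {"patient_id": "P004", "age": 50, "usable": False},  # unusable: excluded
]
ranked = rank_candidates(unusable, spare_dataset, age_score)
print([record["patient_id"] for record in ranked])
```

The first element of the ranked list corresponds to the first candidate data and the next element to the second candidate data used if retraining falls short.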

The dataset processing function 112 replaces the unusable data in the training dataset with the first candidate data, which has the highest similarity in the spare dataset (step S5). At this time, the privacy tag of the patient ID of the replaced unusable data is unchecked to indicate "unusable," and the data is stored and managed in the spare dataset as spare data.

Note that, after the privacy tag of its patient ID has been unchecked to indicate "unusable," the replaced unusable data may be deleted from the spare dataset as necessary.

データセット加工機能１１２は、他の使用不可データが存在する場合には（ステップＳ６のＹｅｓ）、他の使用不可データについてもステップＳ４、５の処理を実行する。 If other unusable data exists (Yes in step S6), the dataset processing function 112 also performs the processing of steps S4 and S5 on the other unusable data.
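The loop over steps S3 to S6 can be sketched as follows. This is a minimal illustration under stated assumptions: records are plain dictionaries with hypothetical keys (`id`, `provider`, `usable`, `age`), and the ranking function is passed in as a parameter. Raising an error when no candidate exists is also an assumption, since the description does not say what happens in that case.

```python
def rebuild_training_set(training, spare, unusable_ids, rank):
    """Replace every unusable training record with its best spare candidate.

    training, spare: lists of dicts (hypothetical record layout).
    unusable_ids:    ids of the records identified in step S3.
    rank:            callable(target, spare) -> candidates, best match first.
    Returns new (training, spare) lists; the inputs are left unmodified.
    """
    training = list(training)
    spare = list(spare)
    for idx, rec in enumerate(training):
        if rec["id"] not in unusable_ids:
            continue
        candidates = rank(rec, spare)              # step S4
        if not candidates:
            raise LookupError(f"no usable spare data for {rec['id']}")
        best = candidates[0]
        spare.remove(best)
        training[idx] = best                       # step S5
        # The replaced record is kept in the spare dataset, marked unusable.
        spare.append({**rec, "usable": False})
    return training, spare


def simple_rank(target, spare):
    """Hypothetical stand-in for step S4: closest 'age' first."""
    ok = [r for r in spare
          if r["usable"] and r["provider"] != target["provider"]]
    return sorted(ok, key=lambda r: abs(r["age"] - target["age"]))
```

Because each unusable record is swapped for exactly one spare record, the size of the training dataset stays constant, which matches the stated aim of keeping the total number of training data unchanged.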

他の使用不可データが存在しない場合には（ステップＳ６のＮｏ）、事後学習機能１１３は、再構築されたトレーニング用データセットを用いて、対応する学習モデルについての事後学習を実行する（ステップＳ７）。 If there is no other unusable data (No in step S6), the post-learning function 113 uses the reconstructed training dataset to perform post-learning on the corresponding learning model (step S7).

事後学習機能１１３は、事後学習の結果得られた新たな学習モデルの精度は一定値以上であるか否かを判定する（ステップＳ８）。 The post-learning function 113 determines whether the accuracy of the new learning model obtained as a result of the post-learning is equal to or greater than a certain value (step S8).

一方、データセット加工機能１１２は、新たな学習モデルの精度は一定以下であると判定した場合には（ステップＳ８のＮｏ）、類似度が最も高い第１の候補データを類似度が二番目に高い第２の候補データに入れ替えて、トレーニング用データセットを再構築する（ステップＳ９）。事後学習機能１１３は、第１の候補データが第２の候補データと置換されたトレーニング用データセットを用いて、対応する学習モデルについての事後学習を実行する（ステップＳ７）。 On the other hand, when the accuracy of the new learning model is determined to be below the certain value (No in step S8), the dataset processing function 112 replaces the first candidate data, which has the highest similarity, with the second candidate data, which has the second-highest similarity, and reconstructs the training dataset (step S9). The post-learning function 113 then uses the training dataset in which the first candidate data has been replaced with the second candidate data to perform post-learning on the corresponding learning model (step S7).

なお、事後学習機能１１３は、事後学習によって新たに生成される学習モデルの精度が一定値になるまで、ステップＳ４において計算された類似度に基づいて候補データを置換して、ステップＳ７～Ｓ９までの処理を繰り返し実行する。 The processing of steps S7 to S9 is repeated, with the candidate data replaced in order of the similarity calculated in step S4, until the accuracy of the learning model newly generated by post-learning reaches the certain value or higher.

一方、事後学習機能１１３は、新たな学習モデルの精度は一定値以上であると判定した場合には（ステップＳ８のＹｅｓ）、事後学習を終了し、再構築されたトレーニング用データセットを用いた学習モデルの生成を完了する。また、事後学習機能１１３は、必要に応じて、再構築されたトレーニング用データセットを用いた事後学習が完了した旨を、企業Ｂの管理サーバに通知する。 On the other hand, if the post-learning function 113 determines that the accuracy of the new learning model is equal to or greater than a certain value (Yes in step S8), it ends the post-learning and completes the generation of the learning model using the reconstructed training dataset. In addition, the post-learning function 113 notifies the management server of company B, as necessary, that the post-learning using the reconstructed training dataset has been completed.
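The retry loop of steps S7 to S9 can be sketched as below. The `train` and `evaluate` callables, the single replacement `slot`, and the error raised when the candidates run out are all hypothetical; the description does not specify the training procedure, the accuracy metric, or the behavior when no candidate reaches the required accuracy.

```python
def post_learn_until_accurate(training, slot, candidates, train, evaluate,
                              threshold):
    """Try candidates (most similar first) until the model is accurate enough.

    training:   current training dataset (list).
    slot:       index of the unusable record being replaced.
    candidates: replacement candidates ordered by similarity (step S4).
    train:      callable(dataset) -> model          (assumed interface)
    evaluate:   callable(model) -> accuracy value   (assumed interface)
    """
    for candidate in candidates:
        trial = list(training)
        trial[slot] = candidate                  # step S5 / step S9
        model = train(trial)                     # step S7
        if evaluate(model) >= threshold:         # step S8
            return model, trial
    # Assumption: exhausting the candidates is treated as an error.
    raise RuntimeError("no candidate met the accuracy threshold")
```

Passing `train` and `evaluate` as parameters keeps the sketch independent of any particular machine learning library.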

本実施形態に係る医用情報処理装置は、データベースとしての学習モデル用データセットと、加工部としてのデータセット加工機能１１２と、事後学習部としての事後学習機能１１３とを備える。学習モデル用データセットは、学習モデルのトレーニングデータとして用いられる複数のトレーニング用データを含む第1のデータセットとしてのトレーニング用データセットと、トレーニング用データと同じデータ構造を有する複数の予備用データを含む第２のデータセットとしての予備用データセットと、を格納する。データセット加工機能１１２は、トレーニング用データセットに含まれる複数のトレーニング用データのうち使用不可となったトレーニング用データを、予備用データセットに含まれる複数の予備用データのうちの使用可能である予備用データと置換し、トレーニング用データセットを加工（再構築）する。 The medical information processing device according to this embodiment includes a learning model dataset as a database, a dataset processing function 112 as a processing unit, and a post-learning function 113 as a post-learning unit. The learning model dataset stores a training dataset as a first dataset including multiple training data used as training data for the learning model, and a spare dataset as a second dataset including multiple spare data having the same data structure as the training data. The dataset processing function 112 replaces unusable training data among the multiple training data included in the training dataset with usable spare data among the multiple spare data included in the spare dataset, and processes (reconstructs) the training dataset.

従って、トレーニング用データセットに含まれるデータに事後的に使用制限がかかった場合であっても、容易にトレーニング用データセットを再構築し、トレーニング（学習）に使ったデータ総数とバリエーションを一定に保つことができる。その結果、事後学習に必要なデータセットを、容易且つ迅速に更新することができ、トレーニング用データに使用制限がかかった場合であっても、新たに再学習したモデルが、これまで使用していた学習モデルと同様の性能を持つことを担保することができ、常に一定レベルの精度を有する学習モデルを提供することができる。 Therefore, even if restrictions are subsequently placed on the use of data included in the training dataset, the training dataset can be easily reconstructed, and the total number and variation of data used in training (learning) can be kept constant. As a result, the dataset required for post-learning can be easily and quickly updated, and even if restrictions are subsequently placed on the use of training data, it is possible to ensure that the newly retrained model has the same performance as the learning model that was previously used, and it is possible to provide a learning model that always has a certain level of accuracy.

また、医用システムで用いられている学習モデルで使用しているデータの差し替えと、学習モデルの再学習の手間をなくすことができる。その結果、トレーニング用データセットに含まれるデータに事後的に使用制限がかかった場合であっても、ユーザは特定の作業を行うことなく学習モデルの精度を担保することができる。 In addition, the manual effort of replacing the data used by a learning model in a medical system and of retraining that learning model can be eliminated. As a result, even if restrictions are subsequently placed on the use of data included in the training dataset, the user can ensure the accuracy of the learning model without performing any particular task.

（変形例１） (Variation 1)
上記実施形態においては、医用情報処理装置１がデータベースとしての学習モデル用データセット１０－１～１０－ｎ、プライバシー管理機能１１１、事後学習機能１１３を有する構成を例示した。これに対し、学習モデル用データセット１０－１～１０－ｎ、プライバシー管理機能１１１、事後学習機能１１３は、例えばクラウド上のサーバ等、ネットワークを介して医用情報処理装置１を通信可能な少なくとも一つ装置に、分散して又はまとめて設けるようにしてもよい。 In the above embodiment, the medical information processing device 1 includes the learning model datasets 10-1 to 10-n as a database, the privacy management function 111, and the post-learning function 113. Alternatively, the learning model datasets 10-1 to 10-n, the privacy management function 111, and the post-learning function 113 may be provided, either distributed or together, in at least one device that can communicate with the medical information processing device 1 via a network, such as a server on a cloud.

係る構成の場合、医用情報処理装置１は、医用画像情報処理システムとして構成されることになる。 In this configuration, the medical information processing device 1 is configured as a medical image information processing system.

（変形例２） (Variation 2)
上記実施形態においては、データセット加工機能１１２は、プライバシー管理機能からの命令をトリガとしてトレーニング用データセットの再構築を実行した。これに対し、データセット加工機能１１２は、例えば、企業Ｂが管理する別のサーバ装置からの命令をトリガとして、トレーニング用データセットの再構築を実行するようにしてもよい。 In the above embodiment, the dataset processing function 112 reconstructs the training dataset in response to a command from the privacy management function. Alternatively, the dataset processing function 112 may reconstruct the training dataset in response to, for example, a command from another server device managed by company B.

また、上記実施形態においては、事後学習機能１１３は、データセット加工機能１１２によってトレーニング用データセットが再構築されたことをトリガとして、事後学習を自動的に実行した。これに対し、事後学習機能１１３は、学習モデル用データベースの再構築の後、任意のタイミングで学習モデルの事後学習を開始することもできる。例えば、事後学習機能１１３は、企業Ｂが管理する別のサーバ装置からの命令や別途人為的な入力指示をトリガとして事後学習を実行するようにしてもよい。 In the above embodiment, the post-learning function 113 automatically executes post-learning, triggered by the reconstruction of the training dataset by the dataset processing function 112. Alternatively, the post-learning function 113 can start post-learning of the learning model at any time after the learning model database is reconstructed. For example, the post-learning function 113 may execute post-learning when triggered by a command from another server device managed by company B or by a separate manual input instruction.

（変形例３） (Variation 3)
上記実施形態においては、プライバシー管理機能１１１から使用不可データのデータ加工指示を受け付けたことをトリガとして、トレーニング用データセットを再構築する場合を例とした。これに対し、患者提供情報データベースに新規登録された患者提供情報の数が一定値以上になったことをトリガとして、トレーニング用データセットを再構築するようにしてもよい。このとき、トレーニング用データセットのみならず、予備用データセットについても患者提供情報データベース内の新たなデータと入れ替えるようにしてもよい。また、予備用データセットから入れ替え対象とされた古い予備用データについては、学習モデル用データセットとしての管理を解除されることが好ましい。 In the above embodiment, the training dataset is reconstructed when a data processing instruction for unusable data is received from the privacy management function 111. Alternatively, the training dataset may be reconstructed when the number of pieces of patient-provided information newly registered in the patient-provided information database reaches a certain value or more. In this case, data in the spare dataset, and not only in the training dataset, may be replaced with new data from the patient-provided information database. In addition, the old spare data removed from the spare dataset by this replacement is preferably released from management as part of the learning model dataset.

なお、本変形例においても、当然ながら、事後学習機能１１３は、新たなトレーニングデータの精度が担保されるまで、トレーニング用データセットの再構築は繰り返し実行されるのが好ましい。 In this variation as well, the reconstruction of the training dataset is of course preferably repeated until the post-learning function 113 confirms that accuracy is ensured with the new training data.

（変形例４） (Variation 4)
学習モデル用データセット内の予備用データセットの使用可能データの割合が閾値を下回った場合には、データセット加工機能１１２は、当該予備用データセットを再構築することが好ましい。このとき、データセット加工機能１１２は、患者提供情報データベースから新たな使用可能データを取得し、予備用データセット内の使用不可データと入れ替えることにより、予備用データセットを再構築する。また、係る再構築においては、予備用データセットから入れ替え対象とされた古い予備用データについては、学習モデル用データセットとしての管理を解除されることが好ましい。 When the proportion of usable data in the spare dataset of a learning model dataset falls below a threshold, the dataset processing function 112 preferably reconstructs that spare dataset. At this time, the dataset processing function 112 reconstructs the spare dataset by acquiring new usable data from the patient-provided information database and replacing the unusable data in the spare dataset with it. In addition, in such reconstruction, the old spare data removed from the spare dataset by the replacement is preferably released from management as part of the learning model dataset.

（変形例５） (Variation 5)
学習モデル用データセット１０－１～１０－ｎ内のトレーニングデータは、事後的に、例えば使用されなくなった装置によって取得された情報、医療ガイドラインの変更により不適切な条件で収集してしまった情報等を含むことになる場合がある。この様なトレーニングデータ（「適格性喪失データ」と呼ぶ）は、たとえ患者からの使用許可を得ている場合であっても、学習モデルのトレーニングに用いることはできない。 The training data in the learning model datasets 10-1 to 10-n may later turn out to include, for example, information acquired by a device that is no longer in use, or information that, owing to a change in medical guidelines, is now regarded as having been collected under inappropriate conditions. Such training data (referred to as "disqualified data") cannot be used for training the learning model even if the patient has given permission to use it.

そこで、このような適格性喪失データが学習モデル用データセットに含まれていることが判明した場合、データセット加工機能１１２は、予備用データセット内の使用可能データと入れ替えることにより、自動的にトレーニング用モデルデータベースを再構築することが好ましい。また、係る再構築においては、予備用データセットから入れ替え対象とされた古い予備用データについては、学習モデル用データセットとしての管理を解除されることが好ましい。 Therefore, when such disqualified data is found to be included in a learning model dataset, the dataset processing function 112 preferably reconstructs the training model database automatically by replacing the disqualified data with usable data in the spare dataset. In addition, in such reconstruction, the old spare data removed from the spare dataset by the replacement is preferably released from management as part of the learning model dataset.
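How disqualified data might be flagged alongside patient-restricted data can be sketched as below. The record keys (`device`, `acquired`), the set of retired devices, and the use of a single guideline date as the cut-off are all assumptions for illustration; the description only names retired devices and guideline changes as examples of causes.

```python
from datetime import date


def find_records_to_replace(training, retired_devices, guideline_date):
    """List (record id, reason) pairs for training data that must be replaced.

    training:        list of dicts with 'id', 'usable', 'device', 'acquired'
                     keys (hypothetical layout).
    retired_devices: set of device names no longer in use.
    guideline_date:  records acquired before this date are treated as
                     collected under now-inappropriate conditions (assumption).
    """
    flagged = []
    for rec in training:
        if not rec["usable"]:
            flagged.append((rec["id"], "usage restricted by patient"))
        elif rec["device"] in retired_devices:
            flagged.append((rec["id"], "acquired by retired device"))
        elif rec["acquired"] < guideline_date:
            flagged.append((rec["id"], "collected before guideline change"))
    return flagged
```

Records flagged for either reason can then be passed through the same replacement procedure used for unusable data, since a record with patient permission may still be disqualified.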

以上説明した少なくとも一つの実施形態によれば、学習モデルに用いたトレーニング用データに事後的に使用制限がかかった場合であっても、トレーニング用データセットを容易に再構築できるようにすることができる。 According to at least one of the embodiments described above, even if restrictions are subsequently imposed on the use of training data used in a learning model, it is possible to easily reconstruct the training data set.

また、本発明のいくつかの実施形態を説明したが、これらの実施形態は、例として提示したものであり、発明の範囲を限定することは意図していない。これら実施形態は、その他の様々な形態で実施されることが可能であり、発明の要旨を逸脱しない範囲で、種々の省略、置き換え、変更、実施形態同士の組み合わせを行うことができる。これら実施形態やその変形は、発明の範囲や要旨に含まれると同様に、特許請求の範囲に記載された発明とその均等の範囲に含まれる。 Although several embodiments of the present invention have been described, these embodiments are presented as examples and are not intended to limit the scope of the invention. These embodiments can be implemented in various other forms, and various omissions, substitutions, modifications, and combinations of embodiments can be made without departing from the spirit of the invention. These embodiments and their modifications are included in the scope of the invention and its equivalents as set forth in the claims, as well as in the scope and spirit of the invention.

１ 医用情報処理装置 1 Medical information processing device
１０ 記憶回路 10 Memory circuit
１０－１～１０－ｎ 学習モデル用データセット 10-1 to 10-n Learning model dataset
１０－１ａ～１０－ｎａ トレーニング用データセット 10-1a to 10-na Training dataset
１０－１ｂ～１０－ｎｂ 予備用データセット 10-1b to 10-nb Spare dataset
１１ 処理回路 11 Processing circuit
１２ 入力回路 12 Input circuit
１３ 通信Ｉ／Ｆ部 13 Communication I/F unit
１４ 表示回路 14 Display circuit
１１１ プライバシー管理機能 111 Privacy management function
１１２ データセット加工機能 112 Dataset processing function
１１３ 事後学習機能 113 Post-learning function

Claims

A medical information processing device comprising:
a database storing a first dataset including a plurality of training data managed for each patient, which are used as training data for a learning model, and a second dataset including a plurality of spare data having the same data structure as the training data;
an information management unit that generates identification information that identifies unusable privacy information among the plurality of training data included in the first dataset according to a usage restriction on at least a portion of the privacy information associated with each patient; and
a processing unit that replaces the unusable training data identified by the identification information generated by the information management unit, among the plurality of training data included in the first data set, with the usable spare data among the plurality of spare data included in the second data set, and processes the first data set.

The medical information processing device according to claim 1, wherein
the training data includes at least one of patient information, genetic information, and image information relating to a patient, and a tag for collectively managing, for each patient, whether or not privacy information included in the training data can be used, and
the processing unit processes the first data set based on the tag.

The medical information processing device according to claim 1, wherein
the training data includes at least one of patient information, genetic information, and image information relating to a patient, and a tag for individually managing, for each patient, the availability of privacy information included in the training data, and
the processing unit processes the first data set based on the tag.

The medical information processing device according to any one of claims 1 to 3, wherein the processing unit performs the replacement based on the similarity between the unusable training data included in the first data set and each of the multiple spare data included in the second data set.

The medical information processing device according to claim 4, wherein the processing unit calculates the similarity using at least one of an imaging protocol, an imaging program, patient information, imaging device setting information at the time of imaging, and image information.

The medical information processing device according to claim 4 or 5, further comprising a post-learning unit that post-learns the learning model using the processed first data set.

The medical information processing device according to claim 6, wherein
the processing unit further processes the first data set based on the similarity when the post-trained learning model does not satisfy a predetermined accuracy, and
the post-learning unit performs post-learning on the learning model using the first data set that has been further processed.

The medical information processing device according to any one of claims 1 to 7, wherein, when there are multiple pieces of unusable training data among the multiple pieces of training data included in the first data set, the processing unit replaces each of the multiple pieces of unusable training data with the spare data that is usable among the multiple spare data included in the second data set, and processes the first data set.

The medical information processing device according to any one of claims 1 to 8, wherein the processing unit reconstructs the first data set when a certain number of pieces of patient-provided information are newly registered in the patient-provided information database.

The medical information processing device according to any one of claims 1 to 9, wherein the processing unit reconstructs the first data set when the ratio of usable data in the second data set falls below a threshold value.

The medical information processing device according to any one of claims 1 to 10, wherein the processing unit reconstructs at least one of the first data set and the second data set when at least one of the first data set and the second data set includes disqualification data that is inappropriate for use in training.

A medical information processing system comprising:
a database storing a first dataset including a plurality of training data managed for each patient, which are used as training data for a learning model, and a second dataset including a plurality of spare data having the same data structure as the training data;
an information management unit that generates identification information that identifies unusable privacy information among the plurality of training data included in the first dataset according to a usage restriction on at least a portion of the privacy information associated with each patient; and
a processing unit that replaces the unusable training data identified by the identification information generated by the information management unit, among the plurality of training data included in the first data set, with the usable spare data among the plurality of spare data included in the second data set, and processes the first data set.