CN110348241A - A multi-center collaborative prognosis prediction system under a data sharing strategy - Google Patents

A multi-center collaborative prognosis prediction system under a data sharing strategy

Info

Publication number
CN110348241A
CN110348241A CN201910629800.2A CN201910629800A CN110348241A CN 110348241 A CN110348241 A CN 110348241A CN 201910629800 A CN201910629800 A CN 201910629800A CN 110348241 A CN110348241 A CN 110348241A
Authority
CN
China
Prior art keywords
data
center
medical institutions
prognosis prediction
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910629800.2A
Other languages
Chinese (zh)
Other versions
CN110348241B (en)
Inventor
李劲松
李谨
田雨
吴承凯
池胜强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhijiang Laboratory
Original Assignee
Zhijiang Laboratory
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhijiang Laboratory
Priority to CN201910629800.2A (CN110348241B)
Publication of CN110348241A
Priority to PCT/CN2020/083588 (WO2020233258A1)
Application granted
Publication of CN110348241B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Medical Informatics (AREA)
  • Bioethics (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a multi-center collaborative prognosis prediction system under a data sharing strategy. The system enables privacy-preserving data sharing across multiple medical institution centers, thereby providing enough data for model construction. The system is built on an ensemble learning algorithm, which achieves better prediction results than a weak classifier. It processes sensitive patient-level data at each center while building the sub-classifiers of the ensemble learning model there, and exchanges only less sensitive intermediate results to assemble the complete ensemble learning model, thus ensuring that the proposed multi-center model achieves results equal to or better than those of a centralized model. The multi-center collaborative prognosis prediction system protects patients' personal privacy and does not need to run the algorithm model on a large centralized data source. In clinical practice, it offers a reliable solution to the problem that a single medical institution has too few samples to build a prediction model.

Description

A multi-center collaborative prognosis prediction system under a data sharing strategy

Technical Field

The invention belongs to the fields of medicine and machine learning, and in particular relates to a multi-center collaborative prognosis prediction system under a data sharing strategy.

Background

Prognosis prediction plays an important role in clinical research and practice. A prediction model built on electronic health record (EHR) data from a single medical institution may lack sufficient statistical power and generalization ability. Building a prognosis prediction model through collaborative analysis of EHR data from multiple medical institution centers can therefore increase the number and coverage of patients used for model training, enrich the patients' prognostic features, and ultimately improve the accuracy and generalization ability of the model's prognosis predictions. Ensemble learning is widely used in clinical prognosis. Unlike linear models such as logistic regression and the Cox model, ensemble learning algorithms usually achieve higher accuracy, can capture nonlinear relationships between variables, and are good at avoiding the overfitting problem common in machine learning. Using an ensemble learning algorithm for model construction therefore provides an ideal basis for a multi-center collaborative prognosis prediction system. In addition, patient privacy must be protected while multi-center prognosis prediction is performed. Most existing privacy-preserving ensemble learning training schemes for the multi-center setting are based on encryption, for example additive homomorphic encryption. Aslett et al. proposed an ensemble learning model based on fully homomorphic encryption. Magkos et al. used a protocol framework based on homomorphic encryption to build an encryption module and thereby train an ensemble learning classifier. Although these encryption methods can prevent information leakage during data exchange, they significantly reduce computational and storage efficiency, scale poorly, and are not suitable for processing large multi-center clinical datasets.

Summary of the Invention

The purpose of the present invention is to address the deficiencies of the prior art by providing a multi-center collaborative prognosis prediction system under a novel data sharing strategy.

This purpose is achieved through the following technical solution: a multi-center collaborative prognosis prediction system under a data sharing strategy, comprising the following four modules:

(1) Data acquisition module: each medical institution center collects the data for every variable required for patient prognosis prediction; these data form the source dataset of that center.

(2) Data anonymization module: the source dataset of each medical institution center is randomly sampled at a percentage p; an anonymization algorithm is applied to the sampled data to generate anonymized data, and the remaining data serve as the local training set of that center. The anonymized data from all centers are collected by the central server and combined into an augmented dataset. The augmented dataset is divided into two parts, an additional training set and a validation set. The additional training set is sent back and distributed to each medical institution center; the validation set is used to select the hyperparameters of the ensemble learning model.

(3) Model training module: each medical institution center trains a sub-classifier of the ensemble learning model locally. Its training data consist of the center's local training set and the additional training set returned to it by the central server. The training set for each center's sub-classifier therefore draws not only on that center's own data but also on data from other centers, which increases the randomness of the dataset and improves the overall performance of the ensemble learning model. During training, the validation set created from the augmented dataset is used to select the hyperparameters of the ensemble learning model.

(4) Prognosis model application module: the central server collects the sub-classifiers trained locally at each medical institution center and combines them into the complete ensemble learning model; new patient data are input into this ensemble learning model to perform prognosis prediction.
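The data flow of modules (2) and (3) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the `validation_fraction` of 0.2, and the identity `anonymize` placeholder are all hypothetical, and the sketch assumes every center receives the full additional training set, which the text does not specify.

```python
import random


def identity(rows):
    """Placeholder anonymizer (assumption); a real one would transform rows."""
    return rows


def split_center(records, p, rng):
    """Randomly sample a fraction p of one center's records for anonymization.

    Returns (sampled, local_train); the two parts are disjoint.
    """
    indices = list(range(len(records)))
    rng.shuffle(indices)
    n_sampled = round(p * len(records))
    sampled = [records[i] for i in indices[:n_sampled]]
    local_train = [records[i] for i in indices[n_sampled:]]
    return sampled, local_train


def build_augmented(anonymized_by_center, validation_fraction, rng):
    """Central server: pool the anonymized records, then split the pool into
    an additional training set and a validation set."""
    pooled = [r for part in anonymized_by_center for r in part]
    rng.shuffle(pooled)
    n_val = round(validation_fraction * len(pooled))
    return pooled[n_val:], pooled[:n_val]  # (additional_train, validation)


def share_data(centers, p=0.5, validation_fraction=0.2,
               anonymize=identity, seed=0):
    """Return one training set per center plus the shared validation set."""
    rng = random.Random(seed)
    local_sets, anonymized = [], []
    for records in centers:
        sampled, local_train = split_center(records, p, rng)
        local_sets.append(local_train)
        anonymized.append(anonymize(sampled))
    additional_train, validation = build_augmented(
        anonymized, validation_fraction, rng)
    # Assumption: each center trains on its local set plus the whole
    # additional training set returned by the server.
    training_sets = [local + additional_train for local in local_sets]
    return training_sets, validation
```

With five centers of ten records each and p = 0.5, each center keeps five records locally, the server pools 25 anonymized records, and under the assumed 0.2 split each center ends up training on 5 local plus 20 returned records while 5 records are held out for validation.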

Further, in the data anonymization module, the random sampling percentage p for each medical institution center's source dataset is set to 50%. Fixing the proportion of anonymized data at 50% improves the prediction performance of the ensemble learning model; neither directly combining the sub-classifiers nor fully anonymizing the data and then training centrally achieves the best result. The value of p can be adjusted to suit complex decision support scenarios, for patient prognosis prediction in clinical practice under different scenarios.

Further, the anonymization algorithm may be k-anonymity, l-diversity, t-closeness, differential privacy, or a similar anonymization algorithm. One specific method for achieving k-anonymity is suppression, which hides certain information completely by not releasing certain data items.
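As a rough illustration of suppression, the sketch below drops every record whose combination of quasi-identifier values is shared by fewer than k records, so that each released combination appears at least k times. Record-level suppression is only one possible variant, and the function name and the choice of quasi-identifiers (`age_group`, `gender`) are hypothetical; the text does not say which items are suppressed.

```python
from collections import Counter


def suppress_to_k_anonymity(records, quasi_identifiers, k):
    """Drop records whose quasi-identifier combination occurs fewer than k
    times, so every remaining combination is shared by at least k records."""

    def key(record):
        return tuple(record[q] for q in quasi_identifiers)

    counts = Counter(key(r) for r in records)
    return [r for r in records if counts[key(r)] >= k]


# Hypothetical records; the field names are illustrative only.
rows = [
    {"age_group": "60-69", "gender": "male", "t_stage": "I"},
    {"age_group": "60-69", "gender": "male", "t_stage": "II"},
    {"age_group": "70-79", "gender": "female", "t_stage": "II"},
]
released = suppress_to_k_anonymity(rows, ["age_group", "gender"], k=2)
```

Here the single 70-79 female record is withheld because no other record shares its quasi-identifier values, while the two 60-69 male records are released.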

Further, the system considers horizontally partitioned data, that is, the source datasets of all medical institution centers contain the same kinds of variables.
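Horizontal partitioning means the centers hold different patients described by the same variables. A simple pre-flight check along these lines might look as follows; the function name and the dictionary-based record layout are assumptions made for illustration.

```python
def check_horizontal_partition(center_datasets):
    """Return the shared variable names if every record at every center has
    the same set of variables; raise ValueError otherwise.

    center_datasets maps a center name to a list of dict records.
    """
    schema = None
    for name, records in center_datasets.items():
        for record in records:
            variables = frozenset(record)
            if schema is None:
                schema = variables
            elif variables != schema:
                raise ValueError(
                    f"center {name!r} has variables {sorted(variables)}, "
                    f"expected {sorted(schema)}"
                )
    return sorted(schema) if schema is not None else []
```

Running such a check before sampling would catch a center whose export is missing a variable, which would otherwise break the pooled augmented dataset.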

The beneficial effects of the present invention are as follows. The invention proposes a multi-center data sharing strategy that enables privacy-preserving data sharing across multiple medical institution centers, thereby providing enough data for model construction. The system is built on an ensemble learning algorithm (such as the random forest algorithm), which achieves better prediction results than a weak classifier. It processes sensitive patient-level data at each center while building the sub-classifiers of the ensemble learning model there, and exchanges only less sensitive intermediate results to assemble the complete ensemble learning model, thus ensuring that the proposed multi-center model achieves results equal to or better than those of a centralized model. The multi-center collaborative prognosis prediction system protects patients' personal privacy and does not need to run the algorithm model on a large centralized data source. In clinical practice, it offers a reliable solution to the problem that a single medical institution has too few samples to build a prediction model.

Brief Description of the Drawings

Figure 1 is a framework diagram of the multi-center collaborative prognosis prediction system under the data sharing strategy;

Figure 2 is a schematic diagram of the data sharing strategy;

Figure 3 is a schematic diagram of data transmission among the centers;

Figure 4 compares the predictive ability of the multi-center collaborative prognosis prediction system under the data sharing strategy of the present invention with that of a prognosis prediction system trained centrally.

Detailed Description

The present invention is described in further detail below with reference to the accompanying drawings and a specific embodiment.

The multi-center collaborative prognosis prediction system under a novel data sharing strategy provided by the present invention, as shown in Figure 1, comprises the following four modules:

(1) Data acquisition module: each medical institution center collects the data for every variable required for patient prognosis prediction; these data form the source dataset of that center. This embodiment uses colorectal cancer data for experimental validation, with five medical institution centers. Sample electronic medical record data collected by each center through the data acquisition module are shown in Table 1; they cover six variables: age, gender, tumor size, T stage, N stage, and carcinoembryonic antigen index.

Table 1: Example of electronic medical record data collected at a single center for colorectal cancer patients

No.  Age  Gender  Tumor size (mm)  T stage  N stage  Carcinoembryonic antigen index
1    65   Male    4.8              I        III      Positive
2    74   Female  1.5              II       IV       Negative

(2) Data anonymization module: as shown in Figure 2, the source dataset of each medical institution center is randomly sampled at a percentage p; an anonymization algorithm is applied to the sampled data to generate anonymized data, and the remaining data serve as the local training set of that center. The anonymized data from all centers are collected by the central server and combined into an augmented dataset. The augmented dataset is divided into two parts, an additional training set and a validation set. The additional training set is sent back and distributed to each medical institution center; the validation set is used to select the hyperparameters of the ensemble learning model. In the experiment, the proportion of anonymized data p is set to 50%, and the anonymization algorithm is the suppression method for k-anonymity. Two hyperparameters are selected using the validation set: the maximum number of features used by a single decision tree, and the number of sub-classifiers.

(3) Model training module: as shown in Figure 2, each medical institution center trains a sub-classifier of the ensemble learning model locally. Its training data consist of the center's local training set and the additional training set returned to it by the central server. The training set for each center's sub-classifier therefore draws not only on that center's own data but also on data from other centers, which increases the randomness of the dataset and improves the overall performance of the ensemble learning model. During training, the validation set created from the augmented dataset is used to select the hyperparameters of the ensemble learning model. This addresses the problem that, in the multi-center setting, the out-of-bag (OOB) error is not exactly the same as in a standard random forest and is therefore no longer a valid unbiased estimate.

(4) Prognosis model application module: the central server collects the sub-classifiers trained locally at each medical institution center and combines them into the complete ensemble learning model; new patient data are input into this ensemble learning model to perform prognosis prediction. The experimental results are shown in Figure 4, where the predictive ability of each prognosis prediction system is measured by AUC. The multi-center collaborative prognosis prediction system under the proposed data sharing strategy achieves better prediction results than the prognosis prediction system trained centrally.
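The server-side steps described in modules (2) to (4) (combining sub-classifiers, scoring by AUC, and choosing hyperparameters on the validation set) can be sketched as follows. This is an illustrative sketch only: the text does not state how sub-classifier outputs are combined, so simple score averaging is an assumption, and sub-classifiers are modeled as plain callables returning a risk score. In the described experiment the settings being compared would be values of the two hyperparameters named above (maximum features per tree and number of sub-classifiers); the dictionary keys here are hypothetical stand-ins.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((sp > sn) + 0.5 * (sp == sn) for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))


def ensemble_score(sub_classifiers, x):
    """Average the scores of the sub-classifiers collected by the server
    (averaging is an assumed aggregation rule)."""
    return sum(clf(x) for clf in sub_classifiers) / len(sub_classifiers)


def select_by_validation(candidates, validation):
    """Return the hyperparameter setting whose ensemble has the highest AUC
    on the validation set.

    candidates maps a setting to the list of sub-classifiers trained with it;
    validation is a list of (features, label) pairs with labels 0 or 1.
    """
    labels = [y for _, y in validation]

    def validation_auc(setting):
        scores = [ensemble_score(candidates[setting], x)
                  for x, _ in validation]
        return auc(labels, scores)

    return max(candidates, key=validation_auc)
```

Using a held-out validation set in this way stands in for the out-of-bag estimate, which the text notes is no longer valid in the multi-center setting.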

The above embodiment is intended to illustrate the present invention, not to limit it. Any modification or change made to the present invention within its spirit and the scope of protection of the claims falls within the scope of protection of the present invention.

Claims (4)

1. A multi-center collaborative prognosis prediction system under a data sharing strategy, characterized by comprising:
(1) a data acquisition module: each medical institution center collects the data for every variable required for patient prognosis prediction, as the source dataset of that center;
(2) a data anonymization module: the source dataset of each medical institution center is randomly sampled at a percentage p; an anonymization algorithm is applied to the sampled data to generate anonymized data, and the remaining data serve as the local training set of that center; the anonymized data from all centers are collected by a central server and combined into an augmented dataset; the augmented dataset is divided into two parts, an additional training set and a validation set; the additional training set is sent back and distributed to each medical institution center; the validation set is used to select the hyperparameters of the ensemble learning model;
(3) a model training module: each medical institution center trains a sub-classifier of the ensemble learning model locally, the training data comprising the local training set of that center and the additional training set returned to that center by the central server, so that the training set for each center's sub-classifier draws not only on the center's own data but also on data from other centers, which increases the randomness of the dataset and improves the overall performance of the ensemble learning model; during training, the validation set created from the augmented dataset is used to select the hyperparameters of the ensemble learning model;
(4) a prognosis model application module: the central server collects the sub-classifiers trained locally at each medical institution center and combines them into the complete ensemble learning model; new patient data are input into the ensemble learning model to perform prognosis prediction.
2. The multi-center collaborative prognosis prediction system under a data sharing strategy according to claim 1, characterized in that, in the data anonymization module, the random sampling percentage p for each medical institution center's source dataset is set to 50%.
3. The multi-center collaborative prognosis prediction system under a data sharing strategy according to claim 1, characterized in that the anonymization algorithm may be k-anonymity, l-diversity, t-closeness, differential privacy, or a similar anonymization algorithm, wherein one specific method for achieving k-anonymity is suppression, which hides certain information completely by not releasing certain data items.
4. The multi-center collaborative prognosis prediction system under a data sharing strategy according to claim 1, characterized in that the system considers horizontally partitioned data, that is, the source datasets of all medical institution centers contain the same kinds of variables.
CN201910629800.2A, priority date 2019-07-12, filing date 2019-07-12: A multi-center collaborative prognosis prediction system under a data sharing strategy. Active, CN110348241B (en)

Priority Applications (2)

Application Number                  Priority Date  Filing Date  Title
CN201910629800.2A (CN110348241B)    2019-07-12     2019-07-12   A multi-center collaborative prognosis prediction system under a data sharing strategy
PCT/CN2020/083588 (WO2020233258A1)  2019-07-12     2020-04-07   Data sharing strategy-based multi-center collaborative prognosis prediction system

Applications Claiming Priority (1)

Application Number                Priority Date  Filing Date  Title
CN201910629800.2A (CN110348241B)  2019-07-12     2019-07-12   A multi-center collaborative prognosis prediction system under a data sharing strategy

Publications (2)

Publication Number  Publication Date
CN110348241A (en)   2019-10-18
CN110348241B (en)   2021-08-03

Family

ID=68175993

Family Applications (1)

Application Number                        Title                                                                                    Priority Date  Filing Date
CN201910629800.2A (Active, CN110348241B)  A multi-center collaborative prognosis prediction system under a data sharing strategy  2019-07-12     2019-07-12

Country Status (2)

Country Link
CN (1) CN110348241B (en)
WO (1) WO2020233258A1 (en)

Also Published As

Publication number Publication date
WO2020233258A1 (en) 2020-11-26
CN110348241B (en) 2021-08-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant