CN113642034A - Medical big data safety sharing method and system based on horizontal and vertical federal learning - Google Patents


Info

Publication number
CN113642034A
Authority
CN
China
Prior art keywords
big data
model
medical big
sharing
organization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110713157.9A
Other languages
Chinese (zh)
Inventor
顾东晓
曹林
李敏
王晓玉
杨雪洁
赵旺
谢懿
鲍超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology
Priority to CN202110713157.9A
Publication of CN113642034A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Bioethics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Primary Health Care (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Public Health (AREA)
  • Epidemiology (AREA)
  • Databases & Information Systems (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention provides a medical big data secure sharing method, system, storage medium and electronic device based on horizontal and vertical federated learning, and relates to the technical field of secure sharing of medical big data. In the method, a horizontal federated learning method is used to update the parameters of a first model built by a first institution on a medical big data sharing platform, according to the relevant medical big data of each second institution; a vertical federated learning method is used to update the parameters of a second model built by the first institution on the medical big data sharing platform, according to the relevant medical big data of each third institution; and an allocation model based on the Shapley value is established to determine the first-model and second-model training results obtained by the first institution. This solves the problem of data sharing both among medical-related institutions in different areas and among different institutions in the same area; the incentive mechanism ensures the reliability of the data and encourages more medical-related institutions to participate in the sharing process.

Description

Medical big data safety sharing method and system based on horizontal and vertical federal learning
Technical Field
The invention relates to the technical field of secure sharing of medical big data, and in particular to a medical big data secure sharing method, system, storage medium and electronic device based on horizontal and vertical federated learning.
Background
In recent years, data privacy protection has received increasing attention from society as a whole, and data exchange between enterprises and institutions is prohibited without user authorization. Data sharing between different enterprises and institutions is therefore very difficult, and data islands of all sizes have formed, which poses great challenges to artificial intelligence and machine learning. In the medical field in particular, accurate results can be obtained only after a large amount of data and a large number of cases have been analyzed, yet the exchange of medical big data is hampered by the special nature of such data and by differences among the information acquisition systems that hospitals use. Data sharing between hospitals is difficult, and sharing with other health and elderly care institutions is even more so. Sharing user data among different institutions without exposing the data or infringing on personal privacy can provide more comprehensive index analysis data, help decision makers make correct judgments, and contribute to the strategic goal of building a Healthy China.
Federated learning is a distributed machine learning approach, and researchers have begun to apply it to medical big data sharing. Federated learning modes include horizontal federated learning, vertical federated learning and federated transfer learning. Horizontal federated learning is generally used when the entities' user features overlap substantially but their users overlap little. Vertical federated learning is generally applicable when the entities have the same or similar user space but different feature spaces. Federated transfer learning is applicable when the participants have completely different sample dimensions and feature spaces.
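The distinction between the three modes can be illustrated with a minimal Python sketch that compares user overlap and feature overlap between two entities. The Jaccard measure, the function names and the 0.5 threshold are illustrative assumptions and are not taken from the invention.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard overlap |a & b| / |a | b| (0.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suggest_mode(users_a, users_b, feats_a, feats_b, threshold=0.5):
    """Map user/feature overlap to a federated learning mode.

    `threshold` is a hypothetical cut-off, not a value from the source.
    """
    user_overlap = jaccard(set(users_a), set(users_b))
    feat_overlap = jaccard(set(feats_a), set(feats_b))
    if feat_overlap >= threshold and user_overlap < threshold:
        return "horizontal"   # similar features, mostly different users
    if user_overlap >= threshold and feat_overlap < threshold:
        return "vertical"     # mostly the same users, different features
    if user_overlap < threshold and feat_overlap < threshold:
        return "transfer"     # little overlap in either dimension
    return "undetermined"     # both overlaps high: not covered by the text
```

For example, two hospitals in different areas that record the same indicators for different patients would fall under "horizontal", while a hospital and a care institution serving the same residents with different indicators would fall under "vertical".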
However, the above medical big data sharing technology cannot complete the intercommunication and interconnection of data on the premise of protecting the privacy of users.
Disclosure of Invention
(I) Technical problem to be solved
In view of the shortcomings of the prior art, the invention provides a medical big data secure sharing method, system, storage medium and electronic device based on horizontal and vertical federated learning, which solve the technical problem of user privacy being leaked during medical big data sharing.
(II) Technical solution
To achieve the above purpose, the invention is realized by the following technical solution:
A medical big data secure sharing method based on horizontal and vertical federated learning comprises the following steps:
S1, according to a first data sharing request and a second data sharing request initiated by a first institution on a medical big data sharing platform, determining the second institutions that respond to the first data sharing request and the third institutions that respond to the second data sharing request;
S2, updating the parameters of a first model built by the first institution on the medical big data sharing platform by using a horizontal federated learning method, according to the relevant medical big data of each second institution;
S3, updating the parameters of a second model built by the first institution on the medical big data sharing platform by using a vertical federated learning method, according to the relevant medical big data of each third institution;
S4, establishing an allocation model based on the Shapley value, and determining the first-model and second-model training results obtained by the first institution;
S5, according to the training results of the first model and the second model, combined with the relevant medical big data of the first institution, realizing the secure sharing of medical big data among the institutions.
Preferably, the medical big data sharing platform in step S1 includes a base layer, a middle layer and an application layer.
Preferably, step S2 specifically includes:
S21, after downloading the first model and its model hash digest from the medical big data sharing platform, each second institution records on-chain the information that it has reached the calculation-ready state, wherein the calculation-ready state means that the second institution has finished transferring the first model and the corresponding medical big data to its local data center;
S22, after the calculation-ready state information has been recorded on-chain, an instruction to start local model training is sent, and each second institution encrypts the model parameters obtained by its training and records them on-chain together with the hash digest of the first model;
S23, after the encrypted parameters and the hash digests have been recorded on-chain, the hash digest of the first model is verified, the encrypted parameters are decrypted once the verification passes, and the smart contract for aggregation calculation is triggered;
S24, the local model of each second institution is updated according to the aggregation calculation result, and these steps are repeated until the model error is smaller than the acceptable error, at which point the parameter update of the first model is complete.
Preferably, step S3 includes:
S31, obtaining the overlapping users of the first institution and each third institution by using the RSA algorithm and a hash algorithm, according to the user data features of the first institution and the third institution;
S32, obtaining the intermediate gradients of the federated computation between the two parties for each third institution, according to the user data labels of the overlapping users held by the first institution and the relevant medical big data of the third institution;
S33, updating the second model according to the intermediate gradients of the federated computation of the parties, and repeating until the model error is smaller than the acceptable error, at which point the parameter update of the second model is complete.
Preferably, the incentive parameter of each participating institution in the allocation model in step S4 is expressed as:
Figure BDA0003133753640000041
where
Figure BDA0003133753640000042
represents the incentive parameter of the i-th participating institution; N represents the total number of participating institutions; S represents a subset of the N participating institutions; v_S represents the contribution value of the subset S; v_(S∪{i}) represents the contribution value of the set S ∪ {i}; and N \ i represents a subset that does not include the i-th participating institution.
A medical big data secure sharing system based on horizontal and vertical federated learning comprises:
a response determining module, used for determining, according to a first data sharing request and a second data sharing request initiated by a first institution on a medical big data sharing platform, the second institutions that respond to the first data sharing request and the third institutions that respond to the second data sharing request;
a first updating module, used for updating the parameters of a first model built by the first institution on the medical big data sharing platform by using a horizontal federated learning method, according to the relevant medical big data of each second institution;
a second updating module, used for updating the parameters of a second model built by the first institution on the medical big data sharing platform by using a vertical federated learning method, according to the relevant medical big data of each third institution;
a result allocation module, used for establishing an allocation model based on the Shapley value and determining the first-model and second-model training results obtained by the first institution;
and a data sharing module, used for realizing the secure sharing of medical big data among the institutions according to the training results of the first model and the second model, combined with the relevant medical big data of the first institution.
A storage medium storing a computer program for secure sharing of medical big data based on horizontal and vertical federated learning, wherein the computer program causes a computer to execute the medical big data secure sharing method described above.
An electronic device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs comprising instructions for performing the medical big data secure sharing method described above.
(III) Advantageous effects
The invention provides a medical big data secure sharing method, system, storage medium and electronic device based on horizontal and vertical federated learning. Compared with the prior art, it has the following beneficial effects:
According to the relevant medical big data of each second institution, the parameters of a first model built by the first institution on the medical big data sharing platform are updated by using a horizontal federated learning method; according to the relevant medical big data of each third institution, the parameters of a second model built by the first institution on the medical big data sharing platform are updated by using a vertical federated learning method; an allocation model based on the Shapley value is established to determine the first-model and second-model training results obtained by the first institution; and, according to the training results of the first model and the second model, combined with the relevant medical big data of the first institution, the secure sharing of medical big data among the institutions is realized. Without directly acquiring the original medical big data of other medical-related institutions, this solves the problem of data sharing both among medical-related institutions in different areas and among different institutions in the same area; the incentive mechanism ensures the reliability of the data and encourages more medical-related institutions to participate in the sharing process.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings described below are only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of a medical big data secure sharing method based on horizontal and vertical federated learning according to an embodiment of the present invention;
Fig. 2 is a structural block diagram of a medical big data secure sharing system based on horizontal and vertical federated learning according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are clearly and completely described, and it is obvious that the described embodiments are a part of the embodiments of the present invention, but not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
By providing a medical big data secure sharing method, system, storage medium and electronic device based on horizontal and vertical federated learning, the embodiments of the application solve the technical problem of user privacy being leaked during medical big data sharing.
To solve this technical problem, the general idea of the embodiments of the application is as follows:
According to the relevant medical big data of each second institution, the parameters of a first model built by the first institution on the medical big data sharing platform are updated by using a horizontal federated learning method; according to the relevant medical big data of each third institution, the parameters of a second model built by the first institution on the medical big data sharing platform are updated by using a vertical federated learning method; an allocation model based on the Shapley value is established to determine the first-model and second-model training results obtained by the first institution; and, according to the training results of the first model and the second model, combined with the relevant medical big data of the first institution, the secure sharing of medical big data among the institutions is realized. Without directly acquiring the original medical big data of other medical-related institutions, this solves the problem of data sharing both among medical-related institutions in different areas and among different institutions in the same area; the incentive mechanism ensures the reliability of the data and encourages more medical-related institutions to participate in the sharing process.
In order to better understand the technical solution, the technical solution will be described in detail with reference to the drawings and the specific embodiments.
Embodiment:
In a first aspect, as shown in Fig. 1, an embodiment of the present invention provides a medical big data secure sharing method based on horizontal and vertical federated learning, including:
S1, according to a first data sharing request and a second data sharing request initiated by a first institution on a medical big data sharing platform, determining the second institutions that respond to the first data sharing request and the third institutions that respond to the second data sharing request;
S2, updating the parameters of a first model built by the first institution on the medical big data sharing platform by using a horizontal federated learning method, according to the relevant medical big data of each second institution;
S3, updating the parameters of a second model built by the first institution on the medical big data sharing platform by using a vertical federated learning method, according to the relevant medical big data of each third institution;
S4, establishing an allocation model based on the Shapley value, and determining the first-model and second-model training results obtained by the first institution;
S5, according to the training results of the first model and the second model, combined with the relevant medical big data of the first institution, realizing the secure sharing of medical big data among the institutions.
Under the condition that original medical big data of other medical related institutions are not directly acquired, the embodiment of the invention not only solves the problem of data sharing among medical related institutions in different areas, but also solves the problem of data sharing among different institutions in the same area; the incentive mechanism is set to ensure reliability of the data and encourage more medically-related institutions to participate in the sharing process.
The above steps will be described in detail with reference to the specific contents.
S1, according to a first data sharing request and a second data sharing request initiated by a first institution on a medical big data sharing platform, determining the second institutions that respond to the first data sharing request and the third institutions that respond to the second data sharing request.
It should be noted that the first institution and the second institutions are medical-related institutions located in different areas, characterized in that their user features overlap substantially while their users overlap little; the first institution and the third institutions are medical-related institutions located in the same area, characterized in that their user features overlap little while their users overlap substantially. Medical-related institutions include hospitals and other health and elderly care institutions. The second institutions and the third institutions are the participating institutions other than the first institution in the federated learning.
The medical big data sharing platform comprises a base layer, a middle layer and an application layer.
The base layer is an infrastructure and includes a plurality of resources such as computing, storage, and communication networks.
The middle layer comprises several modules: blockchain service, encryption component, federated learning service center, deployment management, container center, and operation and maintenance management. The blockchain uses the Hyperledger Fabric framework, which supports multi-channel isolation: data generated on different channels is stored in different distributed ledgers, which meets the privacy requirements of different consortia and ensures that messages on different channels are isolated and private. The encryption component encapsulates algorithms such as asymmetric encryption and supports encrypted storage of data and permission-controlled access to it. The federated learning service center provides two functions, a data interface and aggregation calculation: the data interface transfers the training parameters of each node's model, and the aggregation calculation is performed on the collected parameters. Deployment management supports user-defined platform parameters, enables one-click platform initialization, and deploys blockchain nodes quickly and efficiently. The container center provides configuration management, an image repository and task management, and supports running tasks in isolated environments. Operation and maintenance management covers environment configuration, component upgrades and log management, and is used to monitor and maintain platform operation.
The application layer comprises a blockchain browser, organization management, data query, data supervision, computing traceability and platform initialization.
S2, updating the parameters of the first model built by the first institution on the medical big data sharing platform by using a horizontal federated learning method, according to the relevant medical big data of each second institution. This specifically includes the following steps:
S21, after downloading the first model and its model hash digest from the medical big data sharing platform, each second institution records on-chain the information that it has reached the calculation-ready state, wherein the calculation-ready state means that the second institution has finished transferring the first model and the corresponding medical big data to its local data center.
Assume that the first institution is hospital A and that the data it owns is not sufficient to train the model well. Hospital A can build the model on the platform, write a smart contract for aggregation calculation, and then initiate a data sharing request. Suppose the second institutions, hospitals B, C and D, want the training results of the model; they can respond to hospital A on the platform. Hospitals B, C and D download the model and its hash digest, transfer the model together with relevant medical big data, such as patients' physiological data and case data, to their local data centers, and record the calculation-ready state on-chain.
S22, after the calculation-ready state information has been recorded on-chain, an instruction to start local model training is sent, and each second institution encrypts the model parameters obtained by its training and records them on-chain together with the hash digest of the first model.
Hospital A monitors, through on-chain queries in a real-time non-blocking manner, whether all the other hospitals are in the calculation-ready state, and then sends the instruction to start local model training. After each round of model training, the blockchain node encrypts the parameters and records the encrypted parameters and the model hash in the blockchain ledger.
S23, after the encrypted parameters and the hash digests have been recorded on-chain, the hash digest of the first model is verified, the encrypted parameters are decrypted once the verification passes, and the smart contract for aggregation calculation is triggered.
At the same time, the blockchain nodes start a non-blocking monitoring mechanism that dynamically monitors, in real time, each round of iterative calculation of every node on the chain; when it detects that all hospitals have uploaded the parameters computed in the current round, it triggers the smart contract for aggregation calculation.
Before the aggregation calculation, the model hash value of each hospital is verified to ensure that every hospital uses the model built by hospital A, so that the models are consistent. The aggregation calculation state is also queried on-chain before aggregation; the aggregation can be carried out only when the state is the calculation state. The RESTful interface provided by hospital A is then accessed, the corresponding records are queried, the encrypted parameters are decrypted, and the aggregation calculation is executed. The aggregation calculation result (the result obtained by training on the data of all participating hospitals with the model provided by hospital A) and the calculation state (now "calculation ended") are recorded on the blockchain, and the RESTful interface is accessed to deregister the records of the current round.
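The verification and aggregation described above can be sketched as follows. This is only an illustrative sketch: SHA-256 as the digest, the sample-count-weighted average as the aggregation rule, and the dictionary keys are all assumptions, and encryption, the smart contract and the RESTful interface are omitted (the parameters are taken as already decrypted).

```python
import hashlib


def model_digest(model_bytes: bytes) -> str:
    """SHA-256 hex digest of a serialized model (hash choice is an assumption)."""
    return hashlib.sha256(model_bytes).hexdigest()


def aggregate(reference_digest, submissions):
    """Verify each submission's model digest, then average the parameters.

    `submissions` is a list of dicts with hypothetical keys:
    "digest" (str), "params" (list of floats), "n_samples" (int).
    Parameters are assumed to be already decrypted.
    """
    # Consistency check: every hospital must have trained hospital A's model.
    for sub in submissions:
        if sub["digest"] != reference_digest:
            raise ValueError("model digest mismatch: submission rejected")
    # Aggregation: average weighted by local sample count (assumed rule).
    total = sum(sub["n_samples"] for sub in submissions)
    dim = len(submissions[0]["params"])
    return [
        sum(sub["params"][k] * sub["n_samples"] for sub in submissions) / total
        for k in range(dim)
    ]
```

Rejecting the whole round on a digest mismatch is one possible policy; a deployment could instead drop only the inconsistent submission.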
S24, the local model of each second institution is updated according to the aggregation calculation result, and these steps are repeated until the model error is smaller than the acceptable error, at which point the parameter update of the first model is complete.
Each hospital monitors the aggregation calculation state of each round through on-chain queries in a real-time non-blocking manner. When the aggregation calculation result of a round is detected, the hospital updates its local model and continues with the next round of model training. Iteration stops once the model error E is less than or equal to the acceptable error E', which completes the training of the first model.
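The round-by-round iteration with the stopping criterion E ≤ E' can be illustrated with a toy example. The one-parameter linear model, the learning rate and the way the error is combined from locally reported errors are assumptions chosen to keep the sketch short; the invention does not prescribe them.

```python
def local_step(w, data, lr=0.05):
    """One local gradient-descent step on mean squared error for y ~ w * x."""
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad


def local_error(w, data):
    """Mean squared error on one hospital's own data."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def train_until_acceptable(datasets, acceptable_error=1e-6, max_rounds=1000):
    """Repeat local training and aggregation until E <= E'."""
    total = sum(len(d) for d in datasets)
    w = 0.0
    for round_no in range(1, max_rounds + 1):
        # Each hospital trains locally, starting from the current global model.
        local = [local_step(w, d) for d in datasets]
        # Aggregation: sample-count-weighted average of the local models.
        w = sum(lw * len(d) for lw, d in zip(local, datasets)) / total
        # Model error E, combined from locally reported errors (assumption).
        error = sum(local_error(w, d) * len(d) for d in datasets) / total
        if error <= acceptable_error:
            return w, round_no
    return w, max_rounds
```

In this sketch only model parameters and scalar errors leave each hospital; the raw samples stay in the local datasets.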
S3, updating the parameters of the second model built by the first institution on the medical big data sharing platform by using a vertical federated learning method, according to the relevant medical big data of each third institution. This specifically includes the following steps:
S31, obtaining the overlapping users of the first institution and each third institution by using the RSA algorithm and a hash algorithm, according to the user data features of the first institution and the third institution.
Assume that the third institution, elderly care institution A, has users u1, u2, u3 and u4, and that the first institution, medical institution B, has users u1, u2, u3 and u5. The common users of the two parties, i.e. the overlapping users, must be found without revealing the non-common users. Institution B generates n, e and d with the RSA algorithm and sends the public key, consisting of n and e, to institution A. Institution A encrypts its own user data by hashing it and introducing a random number Ri, and sends the encrypted data YA to institution B. Institution B raises YA to the power d to obtain ZA; it also encrypts its own user data by hashing it and raising the hash to the power d to obtain ZB, and then sends ZA and ZB to institution A. After receiving them, institution A removes Ri from its encrypted user data ZA and hashes the result to obtain DA. From the intersection of DA and ZB, institution A determines that the overlapping users of institution A and institution B are u1, u2 and u3. Finally, the result I is sent to B, and sample alignment is complete.
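The sample alignment above follows the pattern of an RSA blind-signature set intersection, which can be sketched in a few lines of Python. This is a minimal sketch, not the invention's implementation: the fixed demo primes, SHA-256, and the outer hash applied on B's side (so that ZB is directly comparable with DA) are assumptions, and the key is not suitable for real use.

```python
import hashlib
import math
import secrets

# Demo-only RSA key built from two known Mersenne primes; a real deployment
# must generate keys with a cryptographic library.
P, Q, E = 2**61 - 1, 2**89 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent d, held by institution B


def h_int(user_id: str) -> int:
    """Hash a user identifier to an integer modulo N."""
    return int.from_bytes(hashlib.sha256(user_id.encode()).digest(), "big") % N


def h_out(value: int) -> str:
    """Outer hash applied to a signed value before comparison (assumption)."""
    return hashlib.sha256(str(value).encode()).hexdigest()


def overlapping_users(users_a, users_b):
    """Return the users of A that B also holds, without A seeing B's other IDs."""
    # A: blind each hashed ID with a random Ri to obtain YA.
    blinds = []
    for _ in users_a:
        r = secrets.randbelow(N - 2) + 2
        while math.gcd(r, N) != 1:
            r = secrets.randbelow(N - 2) + 2
        blinds.append(r)
    y_a = [(h_int(u) * pow(r, E, N)) % N for u, r in zip(users_a, blinds)]
    # B: raise YA to the power d to obtain ZA; sign and hash its own IDs as ZB.
    z_a = [pow(y, D, N) for y in y_a]
    z_b = {h_out(pow(h_int(u), D, N)) for u in users_b}
    # A: remove Ri from ZA and hash to obtain DA, then intersect DA with ZB.
    d_a = [h_out((z * pow(r, -1, N)) % N) for z, r in zip(z_a, blinds)]
    return {u for u, tag in zip(users_a, d_a) if tag in z_b}
```

Calling `overlapping_users(["u1", "u2", "u3", "u4"], ["u1", "u2", "u3", "u5"])` corresponds to the example above; the modular inverse via `pow(r, -1, N)` requires Python 3.8 or later.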
S32, obtaining the intermediate gradients of the federated computation between the two parties for each third institution, according to the user data labels of the overlapping users held by the first institution and the relevant medical big data of the third institution.
Assume that elderly care institution A has features X1 and X2, as shown in Table 1:
TABLE 1
Name    X1 (sleep time)    X2 (number of exercises)
u1      6                  2
u2      8                  1
u3      7                  3
u4      7                  1
Assume that medical institution B has a feature X3 and a label Y, as shown in Table 2:
TABLE 2
Name    X3 (body temperature)    Y (has the disease)
u1      36.3                     Yes
u2      37.1                     No
u3      36.8                     No
u5      36.9                     Yes
The medical big data secure sharing platform generates a key pair and distributes the public key to elderly care institution A and medical institution B, for encrypting the data that must be exchanged during training.
Elderly care institution A and medical institution B interact in encrypted form to calculate the intermediate gradients of the federated computation between the two parties. Elderly care institution A and medical institution B each perform their calculation on the encrypted gradient values; meanwhile, medical institution B calculates the loss from its label data (whether the user has a certain disease). Both send their results to the platform, which calculates the total gradient from the collected results and then returns the decrypted gradients to elderly care institution A and medical institution B respectively.
S33, updating the second model according to the intermediate gradients of the federated computation of the parties, and repeating until the model error is smaller than the acceptable error, at which point the parameter update of the second model is complete.
Medical institution B updates the parameters of the second model according to the intermediate gradients of the federated computation between the two parties, and this process is iterated until the model error is smaller than the acceptable error, which completes the parameter update of the second model.
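The flow of intermediate results in the vertical setting can be sketched with the overlapping users u1, u2 and u3 from Tables 1 and 2. This is a plaintext illustration only: the logistic regression model, the local standardization, the learning rate and the number of rounds are assumptions, and the encryption of exchanged values and the platform's role in decrypting gradients are omitted.

```python
import math

# Aligned samples for the overlapping users u1, u2, u3 (Tables 1 and 2).
X_A = [[6.0, 2.0], [8.0, 1.0], [7.0, 3.0]]   # institution A: X1, X2
X_B = [[36.3], [37.1], [36.8]]               # institution B: X3
Y = [1.0, 0.0, 0.0]                          # institution B: label (1 = yes)


def standardize(rows):
    """Scale each column to zero mean and unit variance (local preprocessing)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def partial_scores(w, rows):
    """One party's share of the linear score, computed on its own features."""
    return [sum(wi * xi for wi, xi in zip(w, row)) for row in rows]


def train(rounds=50, lr=0.1):
    xa, xb = standardize(X_A), standardize(X_B)
    w_a, w_b, bias = [0.0, 0.0], [0.0], 0.0
    n, losses = len(Y), []
    for _ in range(rounds):
        # Intermediate results exchanged between the parties (plaintext here).
        score = [a + b + bias for a, b in
                 zip(partial_scores(w_a, xa), partial_scores(w_b, xb))]
        pred = [1.0 / (1.0 + math.exp(-s)) for s in score]
        # B holds the labels, so B computes the residual and the loss.
        resid = [p - y for p, y in zip(pred, Y)]
        losses.append(-sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                           for p, y in zip(pred, Y)) / n)
        # Each party turns the shared residual into a gradient for its weights.
        w_a = [w - lr * sum(r * row[k] for r, row in zip(resid, xa)) / n
               for k, w in enumerate(w_a)]
        w_b = [w - lr * sum(r * row[k] for r, row in zip(resid, xb)) / n
               for k, w in enumerate(w_b)]
        bias -= lr * sum(resid) / n
    return w_a, w_b, losses
```

Neither party sees the other's feature columns in this sketch; only per-sample partial scores and residuals cross the boundary, which is the part the described scheme protects with encryption.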
S4, establishing a distribution model based on the Shapley value, and determining the training results of the first model and the second model that the first institution obtains.
Different holders of medical-related data provide different amounts of data of different value, so the sharing results that each party obtains also differ; this makes medical-related institutions with rich data willing to join the medical big data security sharing platform.
During training of the overall model, incentive rewards are given according to how much each participant's data features, data and sub-models contribute to improving the overall model. The incentive parameters of the institutions are calculated by the following formula; that is, the incentive parameter of each participating institution in the distribution model is expressed as:
Figure BDA0003133753640000131
wherein
Figure BDA0003133753640000132
represents the incentive parameter of the i-th participating institution; N represents the total number of participating institutions; S represents a subset of the N participating institutions; v_S represents the contribution value of the subset S; v_(S∪{i}) represents the contribution value of the set S ∪ {i}; N \ i represents a subset that does not include the i-th participating institution.
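The symbol definitions above match the standard Shapley value, in which each institution's incentive parameter is its marginal contribution averaged over all subsets of the other institutions. The sketch below implements that standard form; whether the formula in the patent's figure is written identically is an assumption, and the institution names and contribution values in `values` are invented for illustration.

```python
from itertools import combinations
from math import factorial

def shapley(players, v):
    """Standard Shapley value:
    phi_i = sum over S in N \\ {i} of |S|! (n - |S| - 1)! / n! * (v(S ∪ {i}) - v(S))
    """
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(n):                       # k = |S|
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (v(s | {i}) - v(s))
        phi[i] = total
    return phi

# Hypothetical contribution value of every coalition of institutions A, B, C.
values = {
    frozenset(): 0, frozenset("A"): 10, frozenset("B"): 20, frozenset("C"): 30,
    frozenset("AB"): 40, frozenset("AC"): 50, frozenset("BC"): 60,
    frozenset("ABC"): 90,
}

phi = shapley(["A", "B", "C"], values.__getitem__)
print(phi)  # approximately {'A': 20.0, 'B': 30.0, 'C': 40.0}
```

The incentive parameters sum to the value of the full coalition (90 here), so the whole modeling gain is distributed. The loop visits every subset, which is practical only for a small number of institutions.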
The incentive mechanism addresses the participation of different institutions in the federated joint modeling: after the model is built, its effect in practical application is displayed and recorded in a permanent data recording mechanism (such as a blockchain). The more data an institution provides, the higher the value of that data, and the more reliable the institution, the better the model effect it can see; this keeps some institutions from free-riding and attracts more high-quality institutions to the sharing platform.
The model effect that each institution is entitled to can be determined by a target combination principle, as follows. The model effect is divided into three grades: upper, middle and lower. Then, taking the highest incentive parameter T among the institutions participating in each data-sharing activity, two targets are determined: αT and βT (α and β are coefficients with 1 > α > β, whose values are negotiated by the participants in the current data-sharing activity). Institutions whose incentive parameter is greater than or equal to αT can see the upper-grade model effect; institutions whose incentive parameter is greater than or equal to βT and less than αT can see the middle-grade model effect; the remaining institutions can see only the lower-grade model effect.
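The grading rule can be written as a short function. This is a minimal sketch: the function name `assign_tiers`, the institution names, the incentive parameters and the coefficients 0.8 and 0.6 are all hypothetical, since the text leaves α and β to negotiation.

```python
def assign_tiers(incentives, alpha, beta):
    """Map each institution to the model-effect grade it may see."""
    if not 1 > alpha > beta:
        raise ValueError("coefficients must satisfy 1 > alpha > beta")
    top = max(incentives.values())               # T, the highest incentive parameter
    tiers = {}
    for name, value in incentives.items():
        if value >= alpha * top:
            tiers[name] = "upper"
        elif value >= beta * top:
            tiers[name] = "middle"
        else:
            tiers[name] = "lower"
    return tiers

tiers = assign_tiers({"A": 20, "B": 30, "C": 40}, alpha=0.8, beta=0.6)
print(tiers)  # {'A': 'lower', 'B': 'middle', 'C': 'upper'}
```

With T = 40 the thresholds are 32 and 24, so only C reaches the upper grade. The institution holding the highest incentive parameter always lands in the upper grade because α < 1.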
S5, according to the training results of the first model and the second model, combining the relevant medical big data of the first institution to realize the secure sharing of medical big data among the institutions.
The first institution enjoys model training results of different grades according to its incentive parameter; the higher the incentive parameter, the better the model training result it obtains. This completes the task of secure medical big data sharing based on horizontal and vertical federated learning: the value of the information is shared without revealing user data.
In a second aspect, as shown in FIG. 2, an embodiment of the present invention provides a medical big data security sharing system based on horizontal and vertical federated learning, including:
the response determining module is used for determining a second institution responding to a first data sharing request and a third institution responding to a second data sharing request according to the first data sharing request and the second data sharing request initiated by a first institution on a medical big data sharing platform;
the first updating module is used for updating the first model parameters set up by the first institution on the medical big data sharing platform by adopting a horizontal federated learning method according to the relevant medical big data of each second institution;
the second updating module is used for updating the second model parameters set up by the first institution on the medical big data sharing platform by adopting a vertical federated learning method according to the relevant medical big data of each third institution;
the result distribution module is used for establishing a distribution model based on the Shapley value and determining the training results of the first model and the second model that the first institution obtains;
and the data sharing module is used for combining the relevant medical big data of the first institution according to the training results of the first model and the second model to realize the secure sharing of medical big data among the institutions.
It can be understood that the medical big data security sharing system based on horizontal and vertical federated learning provided by the embodiment of the invention corresponds to the blockchain-based self-service mutual-aid elderly care medical big data security sharing method based on horizontal and vertical federated learning provided by the embodiment of the invention. For the explanation, examples and beneficial effects of the relevant contents, refer to the corresponding parts of the medical big data security sharing method; the details are not repeated here.
In a third aspect, an embodiment of the present invention provides a storage medium storing a computer program for secure sharing of medical big data based on horizontal and vertical federated learning, wherein the computer program causes a computer to execute the medical big data security sharing method described above.
In a fourth aspect, an embodiment of the present invention provides an electronic device, including:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs comprising instructions for performing the medical big data security sharing method described above.
In summary, compared with the prior art, the method has the following beneficial effects:
According to the embodiment of the invention, the first model parameters set up by the first institution on the medical big data sharing platform are updated by a horizontal federated learning method according to the relevant medical big data of each second institution; the second model parameters set up by the first institution on the medical big data sharing platform are updated by a vertical federated learning method according to the relevant medical big data of each third institution; a distribution model based on the Shapley value is established to determine the training results of the first model and the second model that the first institution obtains; and, according to the training results of the first model and the second model, the relevant medical big data of the first institution is combined to realize the secure sharing of medical big data among the institutions. Without directly acquiring the original medical big data of other medical-related institutions, the invention solves the problem of data sharing among medical-related institutions in different regions and also the problem of data sharing among different institutions in the same region; the incentive mechanism ensures the reliability of the data and encourages more medical-related institutions to participate in the sharing process.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (8)

1. A medical big data security sharing method based on horizontal and vertical federated learning, characterized by comprising the following steps:
S1, according to a first data sharing request and a second data sharing request initiated by a first institution on a medical big data sharing platform, determining a second institution responding to the first data sharing request and a third institution responding to the second data sharing request;
S2, updating the first model parameters set up by the first institution on the medical big data sharing platform by adopting a horizontal federated learning method according to the relevant medical big data of each second institution;
S3, updating the second model parameters set up by the first institution on the medical big data sharing platform by adopting a vertical federated learning method according to the relevant medical big data of each third institution;
S4, establishing a distribution model based on the Shapley value, and determining the training results of the first model and the second model that the first institution obtains;
S5, according to the training results of the first model and the second model, combining the relevant medical big data of the first institution to realize the secure sharing of medical big data among the institutions.
2. The medical big data security sharing method according to claim 1, wherein the medical big data sharing platform in the step S1 includes a base layer, a middle layer and an application layer.
3. The medical big data security sharing method according to claim 2, wherein the step S2 specifically includes:
S21, after downloading the first model and its model hash digest from the medical big data sharing platform, each second institution records on the chain the information that it has reached the computation-ready state, wherein the computation-ready state means that the second institution has finished transferring the first model and the corresponding medical big data to its local data center;
S22, after the computation-ready state information has been recorded on the chain, sending an instruction to start local model training, encrypting the model parameters obtained by each institution's training, and recording them on the chain together with the hash digest of the first model;
S23, after the encrypted parameters and the hash digest have been recorded on the chain, verifying the hash digest of the first model, decrypting the encrypted parameters after the verification passes, and triggering the smart contract for aggregation calculation;
S24, updating the local model of each second institution according to the aggregation result, and iterating until the model error is smaller than the acceptable error, thereby completing the update of the model parameters of the first model.
4. The medical big data security sharing method according to claim 2, wherein the step S3 includes:
S31, acquiring the overlapping users of the first institution and the third institution by adopting an RSA algorithm and a hash algorithm according to the user data features of the first institution and the third institution;
S32, obtaining the federated intermediate gradients of both parties for each third institution according to the user data labels of the overlapping users in the first institution and the relevant medical big data of the second institutions;
S33, updating the second model according to the federated intermediate gradients of both parties, and iterating until the model error is smaller than the acceptable error, thereby completing the update of the model parameters of the second model.
5. The medical big data security sharing method according to claim 1, wherein the incentive parameter of each participating institution in the distribution model in step S4 is expressed as:
Figure RE-FDA0003246281770000021
wherein
Figure RE-FDA0003246281770000022
represents the incentive parameter of the i-th participating institution; N represents the total number of participating institutions; S represents a subset of the N participating institutions; v_S represents the contribution value of the subset S; v_(S∪{i}) represents the contribution value of the set S ∪ {i}; N \ i represents a subset that does not include the i-th participating institution.
6. A medical big data security sharing system based on horizontal and vertical federated learning, characterized by comprising:
the response determining module is used for determining a second institution responding to a first data sharing request and a third institution responding to a second data sharing request according to the first data sharing request and the second data sharing request initiated by a first institution on a medical big data sharing platform;
the first updating module is used for updating the first model parameters set up by the first institution on the medical big data sharing platform by adopting a horizontal federated learning method according to the relevant medical big data of each second institution;
the second updating module is used for updating the second model parameters set up by the first institution on the medical big data sharing platform by adopting a vertical federated learning method according to the relevant medical big data of each third institution;
the result distribution module is used for establishing a distribution model based on the Shapley value and determining the training results of the first model and the second model that the first institution obtains;
and the data sharing module is used for combining the relevant medical big data of the first institution according to the training results of the first model and the second model to realize the secure sharing of medical big data among the institutions.
7. A storage medium storing a computer program for secure sharing of medical big data based on horizontal and vertical federated learning, wherein the computer program causes a computer to execute the medical big data security sharing method according to any one of claims 1 to 5.
8. An electronic device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs comprising instructions for performing the medical big data security sharing method according to any one of claims 1 to 5.
CN202110713157.9A 2021-06-25 2021-06-25 Medical big data safety sharing method and system based on horizontal and vertical federal learning Pending CN113642034A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110713157.9A CN113642034A (en) 2021-06-25 2021-06-25 Medical big data safety sharing method and system based on horizontal and vertical federal learning

Publications (1)

Publication Number Publication Date
CN113642034A true CN113642034A (en) 2021-11-12

Family

ID=78416155

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114638376A (en) * 2022-03-25 2022-06-17 支付宝(杭州)信息技术有限公司 Multi-party combined model training method and device in composite sample scene
WO2024060410A1 (en) * 2022-09-20 2024-03-28 天翼电子商务有限公司 Horizontal and vertical federated learning combined algorithm
CN114638376B (en) * 2022-03-25 2024-06-04 支付宝(杭州)信息技术有限公司 Multi-party joint model training method and device in composite sample scene

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111311257A (en) * 2020-01-20 2020-06-19 福州数据技术研究院有限公司 Medical data sharing excitation method and system based on block chain
CN111698322A (en) * 2020-06-11 2020-09-22 福州数据技术研究院有限公司 Medical data safety sharing method based on block chain and federal learning
WO2021004551A1 (en) * 2019-09-26 2021-01-14 深圳前海微众银行股份有限公司 Method, apparatus, and device for optimization of vertically federated learning system, and a readable storage medium
CN112257063A (en) * 2020-10-19 2021-01-22 上海交通大学 Cooperative game theory-based detection method for backdoor attacks in federal learning
CN112434313A (en) * 2020-11-11 2021-03-02 北京邮电大学 Data sharing method, system, electronic device and storage medium
US20210097381A1 (en) * 2019-09-27 2021-04-01 Canon Medical Systems Corporation Model training method and apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
何雯 et al.: "Discussion on Enterprise Data Sharing Based on Federated Learning", 信息与电脑(理论版), vol. 1, no. 08, pages 173 - 176 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination