CN114417398A - Data sharing method based on blockchain and federated learning - Google Patents
- Publication number
- CN114417398A (application number CN202111543907.9A)
- Authority
- CN
- China
- Prior art keywords
- node
- data
- nodes
- team
- task
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/64—Protecting data integrity, e.g. using checksums, certificates or signatures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/03—Credit; Loans; Processing thereof
Abstract
The invention discloses a data sharing method based on blockchain and federated learning. Mutually trusting blockchain nodes are organized into teams, and after a request task is received, a team meeting the credit-rating requirement is selected to respond to it. After a data sharing task is received, the nodes of the selected team train a verification model until it reaches a preset accuracy or the maximum training time, so that model sharing replaces raw-data sharing and protects the privacy of data providers. The model training process is packed into blocks and stored locally, consensus among blockchain nodes is reached through a consensus algorithm based on node contribution, and credit is awarded to the team. Every training step of the sharing process is thus recorded, which ensures that data providers supply high-quality data; credit is awarded once consensus is reached and the credit rating is updated in time, guaranteeing the reliability of the rating and alleviating the privacy-protection problem of data in the Internet of Things.
Description
Technical Field
The invention relates to the technical field of Internet of Things data sharing, and in particular to a data sharing method based on blockchain and federated learning.
Background
With the development of Internet technology, the Internet of Things (IoT) is widely used across industries. Sensors are an essential component of the IoT and its most important data source. The sensing data collected by a single sensor often cannot meet users' requirements; the real value of the IoT lies in the comprehensive utilization and sharing of diverse data and information. For example, in healthcare, data sharing can provide valuable health records, including treatment and physical-examination information, enabling targeted treatment for patients. In tourism, analyzing the collected data through sharing makes it possible to learn tourists' preferences accurately and predict future hotspots, improving service quality. However, data sharing in the IoT faces two problems. First, it is difficult for organizations to establish mutual trust, so they are unlikely to share reliable local data. Second, data privacy has become a major obstacle to sharing, as data owners fear privacy disclosure. Achieving efficient data sharing therefore remains a challenge as long as these two issues are unsolved.
Machine learning techniques are widely used for data sharing. Traditional machine learning first collects data and then trains models centrally. However, large-scale data collection is often impractical because data owners worry about privacy disclosure. Federated learning is a distributed machine-learning framework: it reduces the computational burden on centralized equipment by aggregating the data owners' locally trained models instead of their raw data, and thereby protects the owners' data privacy. The blockchain, as a distributed shared ledger and database, is decentralized, tamper-proof, traceable, collectively maintained, and open and transparent, and can provide reliable technical support for privacy protection in data sharing. For example, the blockchain can record the sharing behavior of each participant providing a data model, forcing participants to provide reliable models.
Secure data sharing in the Internet of Things is receiving growing attention, and many sharing mechanisms based on blockchain and federated learning have been proposed. Gao et al. ("Blockchain based secure IoT data sharing framework for SDN-enabled smart communities") propose a secure data sharing framework using blockchain and proxy re-encryption; Xu et al. ("BDSS-FA: A Blockchain-Based Data Security Sharing Platform With Fine-Grained Access Control") propose a new hierarchical-attribute-based encryption algorithm in which attributes are allocated by a blockchain-based authorization center to realize secure data sharing; Makhdoom et al. ("PrivySharing: A blockchain-based framework for privacy-preserving and secure data sharing in smart cities") embed access-control rules in smart contracts to control users' access to data and divide the blockchain into multiple channels to protect data privacy and security; K. Yu et al. ("Blockchain-Enhanced Data Sharing With Traceable and Direct Revocation in IIoT") propose an efficient and secure attribute-based-encryption data sharing model that resists various attacks; Hao et al. ("Efficient and Privacy-Enhanced Federated Learning for Industrial Artificial Intelligence") propose an efficient federated-learning scheme that guarantees data privacy, resists collusion attacks in distributed environments, and prevents personal-data leakage; Sattler et al. ("Robust and Communication-Efficient Federated Learning From Non-IID Data") propose a sparse compression framework suited to bandwidth-limited environments to reduce the communication overhead of model training; A. Imteaj and M. H. Amini ("Distributed Sensing Using Smart End-User Devices: Pathway to Federated Learning for Autonomous IoT") improve federated learning by evaluating participants' model feedback and updating participant weights; Lu et al. ("Blockchain and Federated Learning for Privacy-Preserved Data Sharing in Industrial IoT") combine data sharing, machine learning, blockchain, and federated learning to address privacy protection in data sharing; L. Yin et al. ("A Privacy-Preserving Federated Learning for Multiparty Data Sharing in Social IoTs") protect the data privacy of sharing participants in the social IoT by combining federated learning with cryptography, and improve transmission and storage efficiency with sparse differential gradients; Chai et al. ("A Hierarchical Blockchain-Enabled Federated Learning Algorithm for Knowledge Sharing in Internet of Vehicles") construct a secure hierarchical federated-learning scheme to protect the privacy of local data models and solve the resource-sharing security problem in the Internet of Vehicles.
Although the above work contributes positively to privacy protection, further research is needed to ensure the reliability of the data sharing process. Therefore, to realize safe and reliable data sharing among mutually untrusting parties, a data sharing mechanism based on blockchain and federated learning is proposed.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a data sharing method based on blockchain and federated learning that can effectively alleviate the privacy-protection problem of data in the Internet of Things.
In order to solve the technical problems, the invention adopts the technical scheme that:
a data sharing method based on block chains and federal learning comprises the following steps:
building mutually trusted block link points into a team;
receiving a request task, and selecting a team meeting the credit rating requirement to respond to the request task;
receiving a data sharing task, and using the nodes in the team meeting the credit rating requirement to train a verification model until the verification model reaches preset accuracy or maximum training time;
packing the model training process to the local, achieving consensus among block chain nodes based on a consensus algorithm of node contribution, and rewarding credit for the team meeting the credit rating requirement.
The invention has the beneficial effects that: mutually trusted blockchain nodes are organized into teams, and after a request task is received, a team meeting the credit-rating requirement is selected to respond to it; after a data sharing task is received, the nodes of the selected team train a verification model until it reaches a preset accuracy or the maximum training time, so that model sharing protects the privacy of the data provider; the model training process is packed into locally stored blocks, consensus among blockchain nodes is reached through a consensus algorithm based on node contribution, and credit is awarded to the team. Every training step of the sharing process is thus recorded, ensuring that data providers supply high-quality data; credit is awarded once consensus is reached and the credit rating is updated in time, which guarantees the reliability of the rating and effectively alleviates the privacy-protection problem of data in the Internet of Things.
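The four claimed steps can be sketched end-to-end as follows. This is a minimal illustration, not the patent's implementation: all field names (`credit`, `per_round_gain`, `ledger`, etc.) are hypothetical stand-ins for the components the text describes.

```python
def data_sharing_round(teams, task, min_credit):
    """End-to-end sketch of the claimed method: pick the highest-credit team
    that meets the credit-rating requirement, train until the accuracy target
    or the round budget is hit, record the training process locally, and
    award the credit reward."""
    # Step 2: select a qualified team (highest credit first).
    team = next(t for t in sorted(teams, key=lambda t: -t["credit"])
                if t["credit"] >= min_credit)
    # Step 3: train until preset accuracy or maximum training time.
    accuracy, rounds = 0.0, 0
    while accuracy < task["target_acc"] and rounds < task["max_rounds"]:
        accuracy += team["per_round_gain"]   # stand-in for one local training round
        rounds += 1
    # Step 4: record the training process and reward credit after consensus.
    team["ledger"].append({"task": task["id"], "rounds": rounds})
    team["credit"] += task["reward"]
    return team, accuracy
```

The `per_round_gain` stand-in abstracts away the actual federated training loop, which is detailed later in the method steps.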
Drawings
FIG. 1 is a general flow chart of a federated learning-based data sharing strategy according to an embodiment of the present invention;
FIG. 2 is a block diagram of a federated learning-based data sharing strategy according to an embodiment of the present invention;
FIG. 3 is a detailed flowchart of the federated learning-based data sharing strategy according to an embodiment of the present invention.
Detailed Description
In order to explain technical contents, achieved objects, and effects of the present invention in detail, the following description is made with reference to the accompanying drawings in combination with the embodiments.
Referring to FIG. 1 to FIG. 3, an embodiment of the present invention provides a data sharing method based on blockchain and federated learning, comprising the steps of:
organizing mutually trusted blockchain nodes into teams;
receiving a request task, and selecting a team meeting the credit-rating requirement to respond to it;
receiving a data sharing task, and training a verification model with the nodes of the selected team until the model reaches a preset accuracy or the maximum training time;
packing the model training process into locally stored blocks, reaching consensus among blockchain nodes through a consensus algorithm based on node contribution, and awarding credit to the team meeting the credit-rating requirement.
From the above description, the beneficial effects of the present invention are: mutually trusted blockchain nodes are organized into teams, and after a request task is received, a team meeting the credit-rating requirement is selected to respond to it; after a data sharing task is received, the nodes of the selected team train a verification model until it reaches a preset accuracy or the maximum training time, so that model sharing protects the privacy of the data provider; the model training process is packed into locally stored blocks, consensus among blockchain nodes is reached through a consensus algorithm based on node contribution, and credit is awarded to the team. Every training step of the sharing process is thus recorded, ensuring that data providers supply high-quality data; credit is awarded once consensus is reached and the credit rating is updated in time, which guarantees the reliability of the rating and effectively alleviates the privacy-protection problem of data in the Internet of Things.
Further, organizing mutually trusted blockchain nodes into teams comprises:
providing a preset amount of mortgage when a team is built, and calculating the penalty coefficient k of a misbehaving node:
where v denotes the total number of work rounds in which the node completed cooperative tasks, p the number of times the node quit temporarily, and q the number of times the node was lazy;
calculating the penalty value of the nodes in the team:
calculating the compensation value C1 of the well-behaved nodes in the team:
where N denotes the total number of nodes in the team.
According to the above description, when a team is built a preset amount of mortgage is provided; the penalty coefficient of a misbehaving node is calculated, the node's penalty value is derived from it, and the compensation value for each remaining node is then computed. This "mortgage-penalty" team-management mechanism addresses possible node misbehavior, compensates the losses of the other nodes, and lets each team manage and supervise its members so that data sharing tasks are completed efficiently and reliably.
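The concrete formulas for k, the penalty value, and C1 appear only as figures in the original and are not reproduced here. The sketch below therefore uses an assumed form of the penalty coefficient (misbehavior counts p and q relative to the completed rounds v) purely for illustration; only the inputs v, p, q, N and the mortgage come from the text.

```python
def penalty_coefficient(v, p, q):
    # ASSUMED form, not the patent's formula: misbehavior (p temporary exits
    # plus q lazy rounds) relative to the v completed cooperative work rounds.
    return (p + q) / v if v else 1.0

def settle_misbehaviour(mortgage, k, n):
    # Deduct a penalty of k * mortgage from the misbehaving node's mortgage
    # and split it evenly as compensation C1 among the other n - 1 members.
    penalty = k * mortgage
    c1 = penalty / (n - 1)
    return penalty, c1
```

A node that misbehaves more often relative to its useful work thus forfeits a larger share of its mortgage to its teammates.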
Further, organizing mutually trusted blockchain nodes into teams further comprises:
setting the original credit of the leader node and of each member node in the team to zero;
calculating the reward value of the leader node in the team:
where Ccredit denotes the credit reward provided by the task publisher, and Wk denotes the weighted contribution ratio of the data node to the global model;
calculating the reward value of each member node:
calculating the credit value C2 of each node:
C2 = Cbase + Cobtain;
where Cbase denotes the node's accumulated original credit and Cobtain denotes the newly obtained reward.
As can be seen from the above description, to promote honest and effective training of the nodes, a credit-rating mechanism is introduced that rewards or punishes nodes according to their contribution, ensuring the accuracy of the credit rating.
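The credit update C2 = Cbase + Cobtain can be sketched directly; the proportional split of the publisher's reward by the contribution weight Wk is an assumption about the (unreproduced) reward formulas, flagged in the comments.

```python
def credit_value(c_base, c_obtain):
    # C2 = Cbase + Cobtain: accumulated original credit plus the newly
    # obtained reward (formula stated in the text).
    return c_base + c_obtain

def node_reward(c_credit, w_k):
    # ASSUMED split: a node receives a share of the task publisher's credit
    # reward Ccredit proportional to its weighted contribution ratio Wk.
    return c_credit * w_k
```

With this split, a node contributing half of the global-model improvement would receive half of the published credit reward.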
Further, the data sharing task includes the ID of the task requester, the requested task category, a timestamp, and the task level.
As can be seen from the above description, the task requester's ID, the requested task category, the timestamp, and the task level contained in the data sharing task allow the subsequent team to respond to the task and train the corresponding data model normally.
Further, selecting a team meeting the credit-rating requirement to respond to the requested task comprises:
verifying the requester's identity through the node the requester is connected to;
and judging from the task identifier whether the request has already been processed: if so, directly returning the processing result queried from the blockchain; otherwise, broadcasting the request task on the blockchain and selecting a team meeting the credit-rating requirement to respond.
As can be seen from the above description, after a request task is received, the node connected to the requester verifies the task identifier; if the identifier is found on the blockchain, the request has already been processed and the query result is returned, otherwise the task is broadcast and a qualified team responds, which ensures that no request task is executed repeatedly.
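The identity check and duplicate-request lookup can be sketched as a small gateway. The class, method names, and the SHA-256 fingerprint of requester ID plus task category are illustrative assumptions; the text only specifies "verify identity, then search the chain before broadcasting".

```python
import hashlib

class RequestGateway:
    """Sketch of the response step: verify the requester's identity, then
    check whether the task was already processed before broadcasting it."""

    def __init__(self):
        self.known_requesters = set()   # identities the connected node accepts
        self.processed = {}             # task fingerprint -> cached chain result

    def _fingerprint(self, requester_id, category):
        # Hypothetical task identifier: hash of requester ID and task category.
        return hashlib.sha256(f"{requester_id}:{category}".encode()).hexdigest()

    def handle(self, requester_id, category):
        if requester_id not in self.known_requesters:
            return "rejected"                      # identity verification failed
        fp = self._fingerprint(requester_id, category)
        if fp in self.processed:
            return self.processed[fp]              # query result from the chain
        self.processed[fp] = f"result:{category}"  # recorded once a team finishes
        return "broadcast"                         # new task: select a team
```

Repeating the same request returns the cached result instead of triggering a second round of training, matching the no-duplicate-execution guarantee above.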
Further, training a verification model with the nodes of the team meeting the credit-rating requirement until the model reaches a preset accuracy or the maximum training time comprises:
training a verification model locally on one node of the team and signing the model parameters with the node's private key;
sending the signed model parameters to an unused node of the team, which updates them;
sending the updated model parameters to another unused node of the team, until the verification model reaches the preset accuracy or the maximum training time.
As can be seen from the above description, a verification model is generated while the data model is trained, and its parameters are signed with a private key; the signed parameters are sent at random to the next unused node, which updates them on its local data, and the updated parameters are again sent at random to another unused node; this process repeats until the verification model reaches the preset accuracy or the maximum training time. The nodes of the team thus train the global model through federated learning, which reduces the computational burden on centralized equipment and protects the data owners' privacy.
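The serial, peer-to-peer training loop described above can be sketched as follows. `local_update` and `evaluate` are hypothetical stand-ins for a node's local training step and the verification-model check; parameter signing is omitted for brevity.

```python
import random

def relay_training(nodes, params, target_acc, max_rounds):
    """Sketch of the serial federated-training loop: randomly chosen unused
    nodes update the model parameters in turn until the verification accuracy
    target or the round limit (maximum training time) is reached."""
    unused = list(nodes)
    rounds = 0
    while unused and rounds < max_rounds:
        node = unused.pop(random.randrange(len(unused)))  # random unused node
        params = node["local_update"](params)             # update on local data
        rounds += 1
        if node["evaluate"](params) >= target_acc:        # verification check
            break
    return params, rounds
```

Because each node is used at most once per task, the loop terminates even if neither the accuracy target nor the round limit is hit first.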
Further, training a verification model with the nodes of the team meeting the credit-rating requirement further comprises:
given a randomized algorithm G and two adjacent data sets differing in at most one record, bounding the probability that the algorithm produces the same output on both:
Pr[G(D) ∈ O] ≤ exp(ε) · Pr[G(D′) ∈ O];
where G denotes the randomized algorithm, ε the privacy budget (usually a small constant), and D, D′ the adjacent data sets;
calculating the sensitivity:
Δf = max over adjacent D, D′ of ||G(D) − G(D′)||;
applying the Laplace mechanism to the global model according to the sensitivity:
G = Gm + Lap(Δf/ε);
where Gm is the trained global model.
As can be seen from the above description, by adding Laplace noise to the global data-sharing model, differential privacy is applied to data sharing, preventing inference attacks initiated by data requesters and providing further privacy protection for the data.
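The G = Gm + Lap(Δf/ε) step above can be sketched with the standard library, drawing a Laplace sample as the difference of two exponential draws (a common identity); the fixed seed is only for reproducibility of the sketch.

```python
import random

def laplace_noise(scale, rng):
    # A Laplace(0, scale) draw: the difference of two Exp(1/scale) draws.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def perturb_global_model(model, sensitivity, epsilon, seed=0):
    """The differential-privacy step from the text, G = Gm + Lap(Δf/ε):
    add Laplace noise of scale Δf/ε to every parameter of the trained
    global model before the team leader releases it."""
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    return [w + laplace_noise(scale, rng) for w in model]
```

A smaller privacy budget ε yields a larger noise scale Δf/ε and therefore stronger protection at the cost of model accuracy.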
Further, packing the model training process into locally stored blocks comprises:
treating all sharing records between the data requester and the data nodes as sharing transactions;
and packing the sharing transactions into blocks through the transaction-recording node and storing the blocks locally.
As can be seen from the above description, all sharing records between the data requester and the data nodes are packed into blocks by the transaction-recording node; a blockchain is thus introduced into the data sharing process, and every training step is recorded to ensure that data providers supply high-quality data.
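The packing step can be sketched as hash-linked blocks of sharing transactions. The block layout and the use of SHA-256 over a canonical JSON body are illustrative assumptions; the text only requires that records be bundled into blocks and stored locally.

```python
import hashlib
import json

def pack_block(sharing_records, prev_hash):
    """Sketch of the transaction-recording step: bundle the sharing records
    between the data requester and the data nodes into a block linked to the
    previous block by its hash, ready for local storage."""
    body = {"transactions": sharing_records, "prev_hash": prev_hash}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {**body, "hash": digest}
```

Because each block's hash covers its transactions and the previous hash, any later tampering with a recorded training step breaks the chain and is detectable during the audit.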
Further, reaching consensus among blockchain nodes through the consensus algorithm based on node contribution and awarding credit to the team meeting the credit-rating requirement comprises:
executing the consensus process on the nodes that performed the data sharing task, each node competing through a work-contribution mechanism for the opportunity to write the transaction record into a block;
and having the node that wins this authority broadcast the block to the other nodes for verification, the block being appended to the blockchain for auditing once verification passes.
As can be seen from the above description, in the consensus process each node competes through the work-contribution mechanism for the opportunity to write the transaction record into a block; assigning the write according to contribution reduces the devices' computational burden.
Further, the consensus algorithm based on node contribution comprises the steps of:
calculating each node's contribution from cosine similarity:
where the quantities involved are, respectively, the actual update gradient of node k, the local update gradient of the k-th node, the model gradient before data node k's update, and the gradient of the global model;
executing the reward mechanism according to the contribution weight ratio;
calculating the contribution value through a mapping function:
calculating the weight ratio of each node's contribution to the global model with a soft-max function;
calculating the soft-max function value:
According to the above description, the nodes' contribution values are calculated accurately from the cosine values, and the consensus algorithm based on data-node contribution is used to reach consensus among the blockchain nodes, which facilitates the subsequent credit reward.
The invention discloses a data sharing method based on blockchain and federated learning, suited to the Internet of Things setting, in which peer-to-peer federated learning realizes model sharing to protect the privacy of data providers. It is described through the following specific embodiments:
example one
Referring to FIG. 1 to FIG. 3, a data sharing method based on blockchain and federated learning includes the steps of:
S1, organizing the mutually trusted blockchain nodes into teams.
Each team has a team leader responsible for receiving data sharing tasks, supervising the joint learning process in data sharing, and sending a global model with differential privacy to task publishers.
The data nodes in step S1 may behave selfishly; to address this, an internal team-management mechanism based on "mortgage-penalty" is designed, comprising the following steps:
S11, providing a preset amount of mortgage when a team is built, and calculating the penalty coefficient k of a misbehaving node:
where v denotes the total number of work rounds in which the node completed cooperative tasks, p the number of times the node quit temporarily, and q the number of times the node was lazy.
S12, calculating the penalty values of nodes in the team:
S13, calculating the compensation value C1 of the well-behaved nodes in the team:
where N denotes the total number of nodes in the team.
Further, to promote honest and effective training of the data nodes in step S1, a credit-rating mechanism is introduced that rewards or punishes data nodes according to their contribution, with the following steps:
S14, setting the original credit of the leader node and of each member node in the team to zero.
S15, calculating the reward value of the leader node in the team:
where Ccredit denotes the credit reward provided by the task publisher, and Wk denotes the weighted contribution ratio of the data node to the global model.
S16, calculating the reward value of each member node:
S17, calculating the credit value C2 of each node:
C2 = Cbase + Cobtain;
where Cbase denotes the node's accumulated original credit and Cobtain denotes the newly obtained reward.
And S2, receiving the request task, and selecting the team meeting the credit rating requirement to respond to the request task.
Specifically, step S2 further includes: federated learning is a distributed machine learning framework. It not only reduces the computational burden on centralized devices by aggregating local training models (rather than raw data) of data owners, but also protects data privacy of data owners.
S21, initiating a data sharing request task: the data requestor initiates a data sharing request. The task contains the requestor's ID, the requested task category, a timestamp, and the task level, and is signed by its private key.
S22, team response task: after a data requestor issues a request task, the node connected to it will first verify its identity and then search the blockchain to determine if the request has been previously processed. And if the cache records exist, directly returning the query result. If it is a new request, the task will be broadcast on the blockchain and the data team meeting the credit requirements will respond to the task.
And S3, receiving a data sharing task, and training a verification model by using the nodes in the team meeting the credit rating requirement until the verification model reaches the preset accuracy or the maximum training time.
The data sharing task comprises an ID of a task requester, a requested task category, a time stamp and a task level.
S31, training a verification model locally by using one node in the team meeting the credit rating requirement, and carrying out private key signature on model parameters of the verification model.
S32, sending the signed model parameters to an unused node in the team meeting the credit rating requirement, and updating the model parameters of the unused node.
S33, sending the updated model parameters to another unused node in the team meeting the credit rating requirement until the verification model reaches the preset accuracy or the maximum training time.
S4, packing the model training process into locally stored blocks, reaching consensus among blockchain nodes through the consensus algorithm based on node contribution, and awarding credit to the team meeting the credit-rating requirement.
Specifically, the blockchain, as a distributed shared ledger and database, is decentralized, tamper-proof, traceable, collectively maintained, and open and transparent, and can provide reliable technical support for privacy protection in data sharing.
And S41, taking all sharing records between the data requester and the data node as sharing transactions.
S42, packing the sharing transactions into blocks through the transaction-recording node and storing the blocks locally.
S43, executing the consensus process through the nodes executing the data sharing task, wherein each node competes for the opportunity of writing the transaction record into the block through the work contribution mechanism.
And S44, the node with the authority broadcasts the corresponding block to other nodes for verification, and the corresponding block is added to the block chain for auditing after the verification is passed.
Therefore, in this embodiment, the combination of the blockchain and the federal learning not only solves the privacy and security problems of data sharing in a distributed scenario, but also improves the quality of shared data. The shared records for each participant can be tracked, which enables security audits.
Example two
Referring to FIG. 1 to FIG. 3, this embodiment differs from the first in that it further specifies how differential privacy is applied to data sharing. Considering that a malicious data requester may launch an attack, the team leader should add interference to the model; a model-protection method based on differential privacy is used, with the following steps:
given a randomized algorithm G and two adjacent data sets D1 and D2 differing in at most one record;
calculating, according to equation (7), the probability that the randomized algorithm G produces the same result on both:
Pr[G(D) ∈ O] ≤ exp(ε) · Pr[G(D′) ∈ O];
where G denotes the randomized algorithm, ε the privacy budget (usually a small constant), and D, D′ the adjacent data sets;
calculating the sensitivity:
Δf = max over adjacent D, D′ of ||G(D) − G(D′)||;
applying the Laplace mechanism to the global model according to the sensitivity:
G = Gm + Lap(Δf/ε);
where Gm is the trained global model.
EXAMPLE III
Referring to FIG. 1 to FIG. 3, this embodiment differs from the first and second embodiments in that it further specifies the steps of the consensus algorithm based on node contribution, namely:
calculating each node's contribution from cosine similarity:
where the quantities involved are, respectively, the actual update gradient of node k, the local update gradient of the k-th node, the model gradient before data node k's update, and the gradient of the global model;
performing a reward mechanism based on the contribution weight ratio;
the contribution value is calculated by the mapping function:
calculating the weight ratio of the node contribution to the global model by using a soft-max function;
calculating the function value of soft-max:
Therefore, the consensus mechanism of this embodiment has the advantage of preventing lazy behavior by the nodes: in a multi-party cooperative training process, a lazy node might otherwise simply copy the previous model parameters and pass them to the next data node.
In summary, the data sharing method based on blockchain and federated learning provided by the invention has the following beneficial effects. (1) From the data sharing perspective: the invention provides a federated-learning-based sharing mechanism that converts the data sharing problem into a model sharing problem and realizes team-based sharing. A reward-and-punishment mechanism is also introduced: the data requester rewards or punishes each team according to the outcome of the sharing, so that team members complete data sharing with high quality and reliability. To further penalize members who provide unreliable data, a "mortgage-penalty" mechanism is introduced, so that each team can manage and supervise its members and complete data sharing tasks efficiently and reliably. (2) From the privacy-protection perspective: by adding Laplace noise to the global data-sharing model, differential privacy is applied to data sharing, preventing inference attacks initiated by data requesters and providing further privacy protection for the data.
The above description is only an embodiment of the present invention and is not intended to limit the scope of the invention; all equivalent changes made by using the contents of the present specification and drawings, or applied directly or indirectly in related technical fields, are included in the scope of the present invention.
Claims (10)
1. A data sharing method based on block chain and federal learning is characterized by comprising the following steps:
building mutually trusted block chain nodes into a team;
receiving a request task, and selecting a team meeting the credit rating requirement to respond to the request task;
receiving a data sharing task, and using the nodes in the team meeting the credit rating requirement to train a verification model until the verification model reaches preset accuracy or maximum training time;
packaging the model training process locally, reaching consensus among block chain nodes through a consensus algorithm based on node contribution, and rewarding credit to the team meeting the credit rating requirement.
2. The method for data sharing based on blockchain and federal learning of claim 1, wherein the grouping mutually trusted blockchain nodes into a team comprises:
providing a preset amount of mortgage when the team is built, and calculating a penalty coefficient k for a misbehaving node:
in the formula, v represents the total number of work rounds of the nodes for completing the cooperative task, p represents the temporary exit frequency of the nodes, and q represents the lazy frequency of the nodes;
calculating the penalty value of the nodes in the team:
calculating the compensation value C1 of non-adverse-behavior nodes in the team:
Where N represents the total number of nodes in the team.
3. The method for data sharing based on blockchain and federal learning of claim 2, wherein the grouping mutually trusted blockchain nodes into a team further comprises:
setting the original credit of the leader node or each member node in the team to be zero;
calculating the reward value of the leader node in the team:
in the formula, Credit represents the credit reward provided by the task publisher, and Wk represents the weight ratio of the data node's contribution to the global model;
calculating a reward value for each member node:
calculating the credit value C2 of each node:
C2 = Cbase + Cobtain;
in the formula, Cbase represents the node's accumulated original credit value.
4. The data sharing method based on the block chain and the federal learning of claim 1, wherein the data sharing task comprises an ID of a task requester, a requested task category, a time stamp and a task level.
5. The method of claim 1, wherein selecting teams meeting credit rating requirements to respond to the requested task comprises:
verifying the identity of the task requester through the nodes connected to the requesting task;
judging, according to the identity, whether the requested task has already been processed; if so, directly returning the processing result queried from the block chain; otherwise, broadcasting the requested task on the block chain and selecting a team meeting the credit rating requirement to respond to it.
6. The method of claim 1, wherein training a verification model using nodes in the team meeting the credit rating requirement until the verification model reaches a preset accuracy or a maximum training time comprises:
training a verification model locally by using one node in the team meeting the credit rating requirement, and carrying out private key signature on model parameters of the verification model;
sending the signed model parameter to an unused node in the team meeting the credit rating requirement, and updating the model parameter of the unused node;
sending the updated model parameters to another unused node in the team meeting the credit rating requirement until the verification model reaches a preset accuracy or a maximum training time.
7. The method of claim 1, wherein training a verification model using nodes in the team meeting the credit rating requirement further comprises:
for a random algorithm and two adjacent data sets differing in at most one record, calculating the probability that the random algorithm obtains the same result on both data sets:
Pr[G(D)∈O]≤exp(ε)·Pr[G(D′)∈O];
wherein G represents the random algorithm, ε represents the privacy budget (usually a small constant), and D and D′ represent the adjacent data sets;
calculating the sensitivity:
Δf=maxD,D′||G(D)-G(D′)||;
calculating, according to the sensitivity, the Laplace mechanism applied to the global model:
G=Gm+Lap(Δf/ε);
wherein Gm is the trained global model.
8. The data sharing method based on blockchain and federal learning of claim 1, wherein the packing of the model training process to local comprises:
all sharing records between the data requester and the data nodes are used as sharing transactions;
and packaging the shared transactions into blocks through the transaction recording node and storing the blocks locally.
9. The data sharing method based on block chain and federal learning as claimed in claim 1, wherein reaching consensus among block chain nodes based on the consensus algorithm of node contribution and rewarding credit to the team meeting the credit rating requirement comprises:
executing a consensus process by nodes executing a data sharing task, each node competing for the opportunity to write a transaction record into a block by a work contribution mechanism;
and broadcasting, by the node that obtains the authority, the corresponding block to other nodes for verification, and adding the block to the block chain for auditing after the verification passes.
10. The method for sharing data based on blockchain and federal learning as claimed in claim 1, wherein said consensus algorithm based on node contribution comprises the steps of:
and calculating the contribution of each node according to the cosine similarity:
wherein the four symbols in the formula represent, respectively, the actual update gradient of node k, the local update gradient of the kth node, the gradient of the model before data node k is updated, and the gradient of the global model;
performing a reward mechanism based on the contribution weight ratio;
the contribution value is calculated by the mapping function:
calculating the weight ratio of the node contribution to the global model by using a soft-max function;
calculating the function value of soft-max:
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111543907.9A CN114417398A (en) | 2021-12-16 | 2021-12-16 | Data sharing method based on block chain and federal learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114417398A true CN114417398A (en) | 2022-04-29 |
Family
ID=81267799
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111543907.9A Pending CN114417398A (en) | 2021-12-16 | 2021-12-16 | Data sharing method based on block chain and federal learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114417398A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115174404A (en) * | 2022-05-17 | 2022-10-11 | 南京大学 | Multi-device federal learning system based on SDN networking |
CN114726551A (en) * | 2022-06-06 | 2022-07-08 | 广州优刻谷科技有限公司 | Meta-universe credit assessment method and device based on federal management |
CN114726551B (en) * | 2022-06-06 | 2022-08-16 | 广州优刻谷科技有限公司 | Meta-universe credit assessment method and device based on federal management |
CN115510494A (en) * | 2022-10-13 | 2022-12-23 | 贵州大学 | Multi-party safety data sharing method based on block chain and federal learning |
CN115510494B (en) * | 2022-10-13 | 2023-11-21 | 贵州大学 | Multiparty safety data sharing method based on block chain and federal learning |
CN116029370A (en) * | 2023-03-17 | 2023-04-28 | 杭州海康威视数字技术股份有限公司 | Data sharing excitation method, device and equipment based on federal learning of block chain |
CN116029370B (en) * | 2023-03-17 | 2023-07-25 | 杭州海康威视数字技术股份有限公司 | Data sharing excitation method, device and equipment based on federal learning of block chain |
CN117472866A (en) * | 2023-12-27 | 2024-01-30 | 齐鲁工业大学(山东省科学院) | Federal learning data sharing method under block chain supervision and excitation |
CN117472866B (en) * | 2023-12-27 | 2024-03-19 | 齐鲁工业大学(山东省科学院) | Federal learning data sharing method under block chain supervision and excitation |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114417398A (en) | Data sharing method based on block chain and federal learning | |
US20200394471A1 (en) | Efficient database maching learning verification | |
Jiang et al. | A medical big data access control model based on fuzzy trust prediction and regression analysis | |
CN115510494A (en) | Multi-party safety data sharing method based on block chain and federal learning | |
Zhang et al. | TDTA: A truth detection based task assignment scheme for mobile crowdsourced Industrial Internet of Things | |
Miao et al. | An intelligent and privacy-enhanced data sharing strategy for blockchain-empowered Internet of Things | |
CN113779617B (en) | State channel-based federal learning task credible supervision and scheduling method and device | |
Tang et al. | A trust-based model for security cooperating in vehicular cloud computing | |
CN112530587A (en) | Construction method of two-dimensional dynamic trust evaluation model for medical big data access control | |
Wang et al. | The truthful evolution and incentive for large-scale mobile crowd sensing networks | |
Wu et al. | A blockchain based access control scheme with hidden policy and attribute | |
Sun | Research on the tradeoff between privacy and trust in cloud computing | |
Ahmadjee et al. | A study on blockchain architecture design decisions and their security attacks and threats | |
Halgamuge et al. | Trust model to minimize the influence of malicious attacks in sharding based blockchain networks | |
Wang et al. | Blockchain-based federated learning in mobile edge networks with application in internet of vehicles | |
Singh et al. | An adaptive mutual trust based access control model for electronic healthcare system | |
Rahmadika et al. | Reliable collaborative learning with commensurate incentive schemes | |
CN112968873B (en) | Encryption method and device for private data transmission | |
Liao et al. | Blockchain-based mobile crowdsourcing model with task security and task assignment | |
Xi et al. | CrowdLBM: A lightweight blockchain-based model for mobile crowdsensing in the Internet of Things | |
Kalapaaking et al. | Smart Policy Control for Securing Federated Learning Management System | |
Liu et al. | A fine‐grained medical data sharing scheme based on federated learning | |
CN114826684B (en) | Decentralized crowdsourcing method, system and terminal supporting efficient privacy protection | |
CN116776373A (en) | Medical data trusted sharing method based on blockchain and federal learning | |
Zhou et al. | Ensuring Long-Term Trustworthy Collaboration in IoT Networks using Contract Theory and Reputation Mechanism on Blockchain |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||