CN113570065A - Data management method, device and equipment based on alliance chain and federal learning - Google Patents


Publication number
CN113570065A
Authority
CN
China
Prior art keywords
data
local
demander
accuracy
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110773692.3A
Other languages
Chinese (zh)
Inventor
王少影
辛锐
王静
李启蒙
刘玮
黄镜宇
吴军英
肖帆
周文芳
吕鹏鹏
王智慧
连阳阳
方蓬勃
蔺鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Vectinfo Technologies Co ltd
State Grid Corp of China SGCC
Information and Telecommunication Branch of State Grid Hebei Electric Power Co Ltd
Original Assignee
Beijing Vectinfo Technologies Co ltd
State Grid Corp of China SGCC
Information and Telecommunication Branch of State Grid Hebei Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Vectinfo Technologies Co ltd, State Grid Corp of China SGCC, Information and Telecommunication Branch of State Grid Hebei Electric Power Co Ltd
Priority to CN202110773692.3A
Publication of CN113570065A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

The invention provides a data management method, device and equipment based on an alliance chain and federated learning. The method comprises: receiving a data request instruction sent by a data demander; calling a smart contract to share the data request instruction to an alliance chain; retrieving, on the alliance chain, data-related nodes associated with the requested data category, each data-related node including a local data set associated with the requested data category; training the local data sets based on federated machine learning to obtain a global model; and outputting target data to the data demander according to the global model and the demander identifier. The target data is therefore not supplied by a single node only: based on the federated machine learning model, related data from different data-related nodes can be fused and sent to the data demander, so that complete target data is obtained, its accuracy is effectively guaranteed, and the degree of matching between the target data and the required data is improved.

Description

Data management method, device and equipment based on alliance chain and federal learning
Technical Field
The invention relates to the technical field of data processing, and in particular to a data management method, device and equipment based on an alliance chain and federated learning.
Background
Urban computing is an emerging field that takes cities as its application scenario and fuses computer science and technology with urban planning, traffic, energy, environment, economics and other disciplines. It addresses the challenges cities face by continuously acquiring, integrating and analyzing heterogeneous big data from the city. Smart city construction has already had some success: with the help of modern technologies such as artificial intelligence, cloud computing and big data, urban computing plays a major role in fields such as smart traffic and smart healthcare. However, current smart city construction is still imperfect. Because of factors such as competitive relationships, security concerns and approval processes, barriers that are hard to break exist in the circulation of data among different owners, between cloud and edge, and among Internet of Things nodes. This creates the so-called data island problem, so that safe and trustworthy data sharing cannot be carried out to support urban computing. Federated machine learning is essentially a distributed machine learning technique with data encryption, in which the participating parties can build a model jointly without revealing the underlying data or its encrypted (obfuscated) form. With this technique, the private data of each participant never leaves its local site, and a virtual global model can be built through parameter exchange under an encryption mechanism without violating the relevant laws. A blockchain is a decentralized, encrypted and tamper-resistant distributed shared database. Combined with smart contracts, blockchain technology can serve as the service platform for a federated learning task and provide task distribution and data sharing functions for each federated learning participant.
At present, data sharing based on blockchain and federated learning generally stores the data in the blockchain. A data demander then sends a data demand to the blockchain system, and the blockchain system returns the related data to the data demander. In most cases the related data is provided by a single node.
However, when data is shared directly from the blockchain in this way, the degree of matching between the obtained target data and the required data is relatively low.
Disclosure of Invention
The invention provides a data management method, device and equipment based on an alliance chain and federated learning. It addresses the defect in the prior art that sharing data simply and directly through a blockchain tends to produce a low degree of matching between the target data and the required data, and it improves that degree of matching.
The invention provides a data management method based on an alliance chain and federated learning, which comprises the following steps:
receiving a data request instruction sent by a data demander, wherein the data request instruction comprises a demander identifier and a requested data category;
calling a smart contract to share the demander identifier and the requested data category to an alliance chain;
retrieving, on the alliance chain, data-related nodes associated with the requested data category, each of the data-related nodes including a local data set associated with the requested data category;
training the local data sets based on federated machine learning to obtain a global model;
and outputting target data to the data demander according to the global model and the demander identifier.
According to the data management method based on an alliance chain and federated learning provided by the invention, training the local data sets based on federated machine learning to obtain a global model comprises:
training each local data set separately, based on federated machine learning, to obtain a corresponding local model;
and training on the local models, based on federated machine learning, to obtain a global model.
According to the data management method based on alliance chain and federal learning provided by the invention, after outputting the target data to the data demand side according to the global model and the demand side identifier, the method further comprises the following steps:
identifying an amount of data in the target data provided by each of the data-dependent nodes;
determining an accuracy rate of each of the local models;
and determining the incentive benefit of each data-related node according to the provided data volume and the accuracy of the local model, and distributing the incentive benefits to the corresponding data-related nodes.
According to the data management method based on alliance chain and federal learning provided by the invention, the accuracy of each local model is determined, and the method comprises the following steps:
determining the accuracy of the local model based on a preset mean absolute error formula.
According to the data management method based on alliance chain and federal learning provided by the invention, the incentive income of each data-related node is determined according to the provided data volume and the accuracy of the local model, and the method comprises the following steps:
determining a unit data volume price;
determining a preset profit calculation formula according to the unit data volume price, the provided data volume and the accuracy of the local model;
and determining the incentive benefit of each data-related node through the preset benefit calculation formula.
According to the data management method based on alliance chain and federal learning provided by the invention, after the global model is obtained, the method further comprises the following steps:
determining a leader node among all the data-related nodes based on the accuracy of each of the local models;
collecting, through the leader node, all data received by the alliance chain, and integrating the data into a target block;
broadcasting the target block to each of the data-related nodes so that the data-related nodes authenticate the target block;
and if the data-related nodes all indicate that authentication has passed, storing the target block in the alliance chain.
According to the data management method based on alliance chain and federal learning provided by the invention, the step of determining the leader node in all the data related nodes based on the accuracy of each local model comprises the following steps:
determining committee nodes among the data-related nodes based on the accuracy of each of the local models;
calculating the accuracy of the local model corresponding to each committee node according to a preset verification formula;
and determining a leader node in the committee nodes according to the accuracy of the local model corresponding to the committee nodes.
According to the data management method based on alliance chain and federal learning provided by the invention, before receiving a data request instruction sent by a data demand party, the method further comprises the following steps:
receiving a registration application of a data provider, wherein the registration application comprises a public key, a data configuration file and a wallet account;
generating a data retrieval record according to the public key, the data configuration file and the wallet account;
uploading the data retrieval record to a federation chain so that any node in the federation chain verifies the data configuration file.
The invention also provides a data management device based on alliance chain and federal learning, which comprises:
the receiving module is used for receiving a data request instruction sent by a data demander, wherein the data request instruction comprises a demander identifier and a request data category;
the sharing module is used for calling a smart contract to share the demander identifier and the requested data category to an alliance chain;
a retrieval module for retrieving, over the federation chain, data-related nodes associated with the requested data category, each of the data-related nodes including a local data set associated with the requested data category;
the training module is used for training the local data set based on federal machine learning to obtain a global model;
and the output module is used for outputting target data to the data demander according to the global model and the demander identifier.
The invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the program to realize the steps of the data management method based on the alliance chain and the federal learning.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the data management method based on federation chain and federal learning as any one of the above.
The invention provides a data management method, device and equipment based on an alliance chain and federated learning. The method comprises: receiving a data request instruction sent by a data demander, wherein the data request instruction comprises a demander identifier and a requested data category; calling a smart contract to share the demander identifier and the requested data category to an alliance chain; retrieving, on the alliance chain, data-related nodes associated with the requested data category, each data-related node including a local data set associated with the requested data category; training the local data sets based on federated machine learning to obtain a global model; and outputting target data to the data demander according to the global model and the demander identifier. In this way the target data is not supplied by a single node only: based on the federated machine learning model, related data from different data-related nodes can be fused and sent to the data demander, so that complete target data is obtained, its accuracy is effectively guaranteed, and the degree of matching between the target data and the required data is improved.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a data management method based on federation chain and federal learning provided by the present invention;
FIG. 2 is a second flowchart of a data management method based on federation chain and federal learning according to the present invention;
FIG. 3 is a schematic structural diagram of a data management device based on federation chain and federal learning provided by the present invention;
fig. 4 is a schematic structural diagram of an electronic device provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The data management method, device and equipment based on federation chain and federal learning of the present invention are described below with reference to fig. 1-4.
Fig. 1 is a schematic flow chart of a data management method based on federation chain and federal learning provided in the present invention.
As shown in fig. 1, the data management method based on federation chain and federal learning provided in this embodiment includes the following steps:
101. Receiving a data request instruction sent by a data demander, wherein the data request instruction comprises a demander identifier and a requested data category.
The data demander sends a data request instruction according to its actual needs. The data request instruction comprises a demander identifier and a requested data category. The demander identifier allows the retrieved data to be returned efficiently to the data demander once matching data is found, which safeguards transmission speed; the requested data category describes the data the demander actually needs. In other words, the instruction states both what is required and where it should be delivered, which helps ensure that the whole data acquisition completes.
In a smart city, data providers acquire data as follows. A ubiquitous sensor network uses sensor nodes to link to the real city, and mass data is stored, calculated and analyzed. The sensor nodes of a wireless sensor network collect network data, integrate it, and send it to a nearby data aggregation base station. These base stations are defined as data aggregation nodes, and each city has a number of entities with computing and storage resources (such as base stations and government servers). The data aggregation nodes communicate over wired network connections, so that they can process data reliably and cooperatively. Specifically, a local data aggregation node collects and analyzes local data, shares it to the blockchain through the wired network, and obtains the data it needs on demand, which ensures the timeliness and accuracy of data analysis. The data aggregation node is responsible for collecting and managing local transactions. After these transactions pass auditing and the consensus mechanism completes, they are compressed and arranged into blocks. A block consists of a block header and a block body; the hash value of the preceding block header, a timestamp, and the Merkle root of the block body are written into the block header.
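The block layout described above (a header holding the preceding header's hash, a timestamp, and the Merkle root of the body) can be illustrated with a minimal sketch. The names `BlockHeader`, `Block`, `merkle_root` and `make_block`, the use of SHA-256, and the JSON serialization are assumptions made for illustration; the source does not specify a hash function or an encoding.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List, Optional


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest (the hash choice is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def merkle_root(transactions: List[str]) -> str:
    """Hash the transactions pairwise until a single root digest remains."""
    if not transactions:
        return sha256_hex(b"")
    level = [sha256_hex(tx.encode("utf-8")) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last digest on odd-sized levels
        level = [
            sha256_hex((level[i] + level[i + 1]).encode("utf-8"))
            for i in range(0, len(level), 2)
        ]
    return level[0]


@dataclass(frozen=True)
class BlockHeader:
    prev_header_hash: str   # hash of the preceding block header
    timestamp: float
    body_merkle_root: str   # Merkle root of this block's body

    def digest(self) -> str:
        payload = json.dumps(
            [self.prev_header_hash, self.timestamp, self.body_merkle_root]
        )
        return sha256_hex(payload.encode("utf-8"))


@dataclass(frozen=True)
class Block:
    header: BlockHeader
    body: tuple  # the transactions arranged into this block


def make_block(prev_header: Optional[BlockHeader],
               transactions: List[str],
               timestamp: float) -> Block:
    """Build a block whose header links back to the previous header."""
    prev_hash = prev_header.digest() if prev_header else "0" * 64
    header = BlockHeader(prev_hash, timestamp, merkle_root(transactions))
    return Block(header, tuple(transactions))
```

Because the header commits to the body through the Merkle root, changing any transaction changes the root and therefore the header digest that the next block stores.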
102. Calling a smart contract to share the demander identifier and the requested data category to the alliance chain.
Specifically, the smart contract is called to share the demander identifier and the requested data category to an alliance chain. An alliance chain is a blockchain that lies between a public chain and a private chain: it weakens decentralization in exchange for more efficient consensus. Its consensus efficiency is therefore higher than that of a public chain, and its degree of decentralization is higher than that of a private chain, so it better meets the needs of federated learning on city data.
103. On the federation chain, data-related nodes associated with the requested data category are retrieved, each data-related node including a local data set associated with the requested data category.
Data-related nodes associated with the requested data category are retrieved on the alliance chain, wherein each data-related node includes a local data set associated with the requested data category. When a data demander sends a data request, the requested data may be owned by several data providers. In that case the data must be obtained from each of them separately, so several data nodes may be involved.
In this embodiment there are two types of transactions: retrieval transactions and data sharing transactions. Since most data is sensitive and large, placing the data itself on a blockchain with limited storage space is resource-intensive and risky. The alliance chain is therefore used only to retrieve data, while the actual data is stored locally by its owner. When a new data provider joins, its unique identifier (ID) is recorded as a transaction in the blockchain, together with a summary of its data, including the data type and data size. All data profiles from the participants are recorded in the form of transactions. Each data sharing event is also stored as a transaction in the blockchain, so that the usage of the data can be tracked for later auditing.
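The two transaction types described above can be sketched as a small in-memory ledger: retrieval records hold only a provider's ID and a summary of its data, and sharing records log each data sharing event for audit. All class, field and method names here (`RetrievalRecord`, `SharingRecord`, `RetrievalLedger`, and so on) are hypothetical; the source does not define a record schema beyond ID, data type and data size.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class RetrievalRecord:
    """On-chain summary of a provider's data; the raw data stays local."""
    provider_id: str
    data_type: str
    data_size: int


@dataclass(frozen=True)
class SharingRecord:
    """On-chain log of one data sharing event, kept for auditing."""
    requester_id: str
    data_type: str
    provider_ids: Tuple[str, ...]


class RetrievalLedger:
    """Append-only list standing in for the alliance chain (an assumption)."""

    def __init__(self) -> None:
        self.transactions: List[object] = []

    def register_provider(self, provider_id: str, data_type: str,
                          data_size: int) -> RetrievalRecord:
        record = RetrievalRecord(provider_id, data_type, data_size)
        self.transactions.append(record)
        return record

    def find_providers(self, data_type: str) -> List[str]:
        """Return the IDs of providers whose summary matches the category."""
        return [
            tx.provider_id for tx in self.transactions
            if isinstance(tx, RetrievalRecord) and tx.data_type == data_type
        ]

    def log_sharing(self, requester_id: str, data_type: str) -> SharingRecord:
        """Record which providers served a request for the given category."""
        providers = tuple(self.find_providers(data_type))
        record = SharingRecord(requester_id, data_type, providers)
        self.transactions.append(record)
        return record

    def audit(self, data_type: str) -> List[SharingRecord]:
        return [
            tx for tx in self.transactions
            if isinstance(tx, SharingRecord) and tx.data_type == data_type
        ]
```

A request for a category is answered with node IDs only, which keeps sensitive and bulky data off the chain while still leaving a traceable usage history.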
104. Training the local data sets based on federated machine learning to obtain a global model.
Each data-related node holds a local data set, and these sets may contain duplicated or different data. The local data can be trained based on federated machine learning to obtain a global model, so that data acquisition can be completed more effectively. Specifically, training the local data sets based on federated machine learning to obtain a global model includes: training each local data set separately to obtain a corresponding local model, and then training on the local models to obtain the global model.
For any data participant, the steps for learning the local model are as follows:
(1) Local data set selection. The related node holding the data locates, locally, the local data set relevant to the request.
(2) Local model training. The data-related party runs a machine learning algorithm locally to train the local model.
(3) Collaborative multi-party learning. The local model is broadcast as a transaction to each participant to update the federated learning parameters. This process iterates until the training result reaches the expected value, which completes the training.
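The three steps above can be sketched for a toy one-parameter regression model y = w * x. The source does not state how local models are combined, so this sketch substitutes size-weighted parameter averaging (in the style of FedAvg) as a stand-in. The function names, the learning rate, the stopping threshold and the model itself are all assumptions made for illustration.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs held by one participant


def local_update(w: float, data: Dataset, lr: float = 0.01,
                 epochs: int = 5) -> float:
    """Step (2): a few gradient-descent steps on y = w * x using local data only."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def aggregate(values: List[float], sizes: List[int]) -> float:
    """Size-weighted average (assumed aggregation rule, not from the source)."""
    total = sum(sizes)
    return sum(v * n for v, n in zip(values, sizes)) / total


def local_mae(w: float, data: Dataset) -> float:
    """Mean absolute error of the shared model on one participant's data."""
    return sum(abs(y - w * x) for x, y in data) / len(data)


def federated_train(datasets: List[Dataset], target_mae: float = 0.05,
                    max_rounds: int = 200) -> float:
    """Step (3): iterate until the reported error reaches the expected value."""
    w = 0.0
    sizes = [len(d) for d in datasets]
    for _ in range(max_rounds):
        # Each participant trains locally; only parameters leave the node.
        local_ws = [local_update(w, d) for d in datasets]
        w = aggregate(local_ws, sizes)
        # Participants report local errors, never raw records.
        if aggregate([local_mae(w, d) for d in datasets], sizes) <= target_mae:
            break
    return w
```

Only model parameters and error values cross node boundaries in this sketch, which mirrors the requirement that private data does not leave its owner.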
105. Outputting the target data to the data demander according to the global model and the demander identifier.
Through the global model, the data that the demander wants can be obtained; the data required by the data demander is defined as the target data. Once the target data is obtained, it can be returned to the data demander according to the demander identifier, which completes the data management.
Suppose there are $N$ data-related parties and a common federated data set $D$. Each data-related party $P_i$ holds a local data set $D_i \in D$. These $N$ data-related parties agree to share their data without revealing any private information. Suppose a requester issues a data-sharing request with a query set $R = \{r_1, r_2, \ldots, r_m\}$. Instead of returning the original data in response to the query, the computed results are shared. All participants associated with the request cooperate, using the corresponding learning algorithm, to train a global model $M$ without revealing any private data. Finally, the trained global data model $M$ is returned to the data requester, which can conveniently obtain the answer $R(M)$ to its data-sharing request locally.
In the data management method based on an alliance chain and federated learning provided by this embodiment, a data request instruction sent by a data demander is received, the instruction comprising a demander identifier and a requested data category; a smart contract is called to share the demander identifier and the requested data category to an alliance chain; data-related nodes associated with the requested data category are retrieved on the alliance chain, each data-related node including a local data set associated with the requested data category; the local data sets are trained based on federated machine learning to obtain a global model; and target data is output to the data demander according to the global model and the demander identifier. In this way the target data is not supplied by a single node only: based on the federated machine learning model, related data from different data-related nodes can be fused and sent to the data demander, so that complete target data is obtained, its accuracy is effectively guaranteed, and the degree of matching between the target data and the required data is improved.
Fig. 2 is a second flowchart of the data management method based on federation chain and federal learning provided in the present invention.
As shown in fig. 2, the data management method based on an alliance chain and federated learning provided in this embodiment incentivizes data providers to supply comprehensive and accurate data, and includes the following steps:
201. Receiving a data request instruction sent by a data demander, wherein the data request instruction comprises a demander identifier and a requested data category.
202. Calling a smart contract to share the demander identifier and the requested data category to the alliance chain.
203. Retrieving, on the alliance chain, data-related nodes associated with the requested data category, each data-related node including a local data set associated with the requested data category.
204. Training the local data sets based on federated machine learning to obtain a global model.
205. Outputting the target data to the data demander according to the global model and the demander identifier.
The specific contents of step 201-.
206. Identifying the amount of data in the target data provided by each data-related node.
Specifically, after each federated learning task is completed, rewards are issued to the data-related parties as an incentive. Because the data volume differs between data-related nodes and the quality of the local models also varies, how to distribute the data fees fairly must be considered. Data volumes can be counted in a preset manner. Since a given piece of target data may be composed of data provided by several data providers, it is first necessary to determine how much data each provider supplied. Different incentives can then be given according to the different data volumes, which better motivates data providers and also improves fairness.
207. Determining the accuracy of each local model.
The quality of each data-related node's local model needs to be evaluated and verified in the process of reaching consensus. In this embodiment, the quality of a federated learning local model is represented quantitatively by the model's prediction accuracy. Specifically, in a classification task, accuracy can be represented by the proportion of correctly classified records. In a regression task, the accuracy of the local model is determined based on a preset mean absolute error (MAE) formula:
$$\mathrm{MAE}(m_i) = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - f(x_i)\right| \qquad (1)$$
where $y_i$ is the true result of a data record, $f(x_i)$ is the corresponding output of model $m_i$, and $n$ is the number of data records evaluated. The lower the MAE, the higher the accuracy of $m_i$, which indicates a higher-quality federated learning model. The accuracy of each local model can be obtained according to (1).
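The two accuracy measures described for step 207 can be written as short functions: mean absolute error for regression and the proportion of correct records for classification. The function names and the callable-model interface are assumptions made for illustration.

```python
from typing import Callable, Sequence


def mean_absolute_error(model: Callable[[float], float],
                        xs: Sequence[float],
                        ys: Sequence[float]) -> float:
    """Average of |y - f(x)| over the records; lower means a better model."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    return sum(abs(y - model(x)) for x, y in zip(xs, ys)) / len(xs)


def classification_accuracy(predicted: Sequence[object],
                            actual: Sequence[object]) -> float:
    """Proportion of records whose predicted label equals the true label."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)
```

For example, a model f(x) = 2x evaluated on inputs [1, 2, 3] with true values [2, 5, 5] has absolute errors 0, 1 and 1, giving an MAE of 2/3.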
208. And determining the incentive benefit of each data-related node according to the provided data volume and the accuracy of the local model, and distributing the incentive benefits to the corresponding data-related nodes.
In order to enable incentive reward distribution to be more reasonable, after the data volume provided by each data related node and the accuracy of the local model corresponding to each data related node are obtained, reward benefits are distributed by integrating the data volume and the accuracy.
Specifically, the method comprises the steps of firstly determining unit data volume price, then determining a preset profit calculation formula according to the unit data volume price, the provided data volume and the accuracy of a local model, and determining the incentive profit of each data related node through the preset profit calculation formula. It is necessary to determine not only the profit of the data provider but also the cost of the data consumer, and the calculation of the relevant cost can be performed according to the following formula.
For the data requester, the cost of one federated learning task is given by formula (2):
$$F_{\mathrm{pay}} = p \cdot \left|\sum_i d_i\right| \qquad (2)$$
where $p$ is the unit data price, $\left|\sum_i d_i\right|$ is the total amount of data, and $F_{\mathrm{pay}}$ is the fee that the data demander needs to pay.
For each data provider, the revenue for one federated learning task is given by formula (3):
[Formula (3): equation image BDA0003154893140000121, not reproduced in this text]
where $|d_j|$ represents the data amount of the data-related node,
[Symbol definition: image BDA0003154893140000122, not reproduced in this text]
and $n$ is a natural number. The MAE quantitatively represents the training quality of the node's local model: the higher the training quality, the higher the node's revenue.
Therefore, the payment fee of the data demander and the different profits of the data provider can be calculated more accurately and fairly, that is, the more comprehensive and accurate the data provided by the data provider is, the higher the obtained profits are, and the better the incentive is provided for the data provider.
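The fee and revenue logic can be sketched as follows. The requester's cost follows the stated relationship (unit price times total data volume). The revenue formula (3) is not reproduced in this text, so the split below is purely an assumed stand-in: each node's share of the fee is proportional to its data volume divided by its MAE, which gives the qualitative behavior described (more data and a lower MAE yield more revenue). The function names and the `eps` guard are hypothetical.

```python
from typing import List


def requester_cost(unit_price: float, volumes: List[float]) -> float:
    """Fee paid by the data demander: unit price times total data volume."""
    return unit_price * sum(volumes)


def provider_revenues(unit_price: float, volumes: List[float],
                      maes: List[float], eps: float = 1e-9) -> List[float]:
    """Assumed split of the fee, NOT the patent's formula (3).

    Each node is scored by volume / MAE; the fee is divided in proportion
    to the scores, so the revenues always sum to the requester's cost.
    """
    if len(volumes) != len(maes) or not volumes:
        raise ValueError("volumes and maes must be non-empty and of equal length")
    fee = requester_cost(unit_price, volumes)
    scores = [d / (mae + eps) for d, mae in zip(volumes, maes)]
    total = sum(scores)
    return [fee * s / total for s in scores]
```

With a unit price of 2, two nodes supplying 100 records each, and MAEs of 0.1 and 0.2, the demander pays 400 and the more accurate node receives roughly two thirds of it under this assumed rule.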
Further, in this embodiment, the method also includes: determining a leader node among all data-related nodes based on the accuracy of each local model; collecting, through the leader node, all data received by the alliance chain, and integrating the data into a target block; broadcasting the target block to each data-related node so that the data-related nodes authenticate the target block; and, if all the data-related nodes indicate that authentication has passed, storing the target block in the alliance chain. For example, authentication may be deemed passed when all data-related nodes approve, or when a certain percentage of the data-related nodes approve.
Specifically, based on the accuracy of each local model, a leader node is determined among all data-related nodes. First, committee nodes are determined among the data-related nodes based on the accuracy of each local model. The consensus process is carried out by committee nodes elected according to their federated learning behavior. The committee nodes are selected from all data-related party nodes, and consensus information only needs to be sent to the committee nodes rather than to all nodes, which reduces communication overhead to some degree. In response to the federated learning task, committee node $P_i$ sends its self-trained local model $m_i$ to the next committee node. This model transmission is recorded as a transaction
[Transaction notation: image BDA0003154893140000123, not reproduced in this text]
The record tuple for this transaction is shown in Table 1:
TABLE 1
[Table 1: image BDA0003154893140000131, not reproduced in this text]
Each committee node PiAll have a pair of public key and Private Key (PK)i,SKi) For encryption and for signing. Then PiThe encrypted own public key is broadcast to other committee nodes.
And then, calculating the accuracy of the local model corresponding to each committee node according to a preset verification formula. The MAE calculation formula (4) of the local model corresponding to the committee node is as follows:
Figure BDA0003154893140000132
where MAE(m_j) is the MAE calculated for the local model m_j, and γ quantifies the contribution weight of P_j to the global model M. γ is calculated by formula (5):
Figure BDA0003154893140000133
where d_j denotes the size of the training data of committee node P_j; γ is determined by d_j and the data sizes of the other data participants.
Finally, a leader node is determined among the committee nodes according to the accuracy of their local models; the committee node whose local model has the highest accuracy may serve as the leader node.
After the leader node is determined, it coordinates the consensus process for all nodes. The leader node first collects all received transactions and all data received in the alliance chain, including the global model M obtained by the final training, and then integrates them into a target block
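The leader selection described above can be illustrated with a short sketch. Formulas (4) and (5) are only available as images in this text, so the code below does not reproduce them: it assumes the textbook definition of MAE and assumes that γ is each node's share of the total training data. The function names and sample values are hypothetical.

```python
from typing import Dict, Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Textbook MAE: the mean of |y_i - f(x_i)| over the validation samples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def contribution_weights(data_sizes: Dict[str, int]) -> Dict[str, float]:
    """Assumed form of gamma: each node's share of the total training data."""
    total = sum(data_sizes.values())
    return {node: size / total for node, size in data_sizes.items()}


def select_leader(mae_by_node: Dict[str, float]) -> str:
    """Pick the committee node whose local model is most accurate (lowest MAE)."""
    return min(mae_by_node, key=lambda node: mae_by_node[node])


# Hypothetical validation labels and per-node predictions.
y_true = [1.0, 2.0, 3.0]
predictions = {
    "P1": [1.0, 2.5, 3.5],
    "P2": [1.1, 2.0, 3.1],
    "P3": [0.0, 2.0, 5.0],
}
maes = {node: mean_absolute_error(y_true, pred) for node, pred in predictions.items()}
leader = select_leader(maes)
```

Treating the lowest MAE as the highest accuracy is the only ranking rule used here; any tie-breaking or weighting by γ would have to follow the patent's own formulas.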
Figure BDA0003154893140000134
where H_k is the block header, and
Figure BDA0003154893140000135
is a timestamp. The leader node then broadcasts B_k to all data-related nodes for authentication. Each data-related node first verifies conventional information such as the block header format, the block size and the transaction timestamps, and, as in Bitcoin, also verifies the transmission trace of each transaction to check the correctness of the block. The committee nodes validate each transaction by computing MAE(m_i) and MAE(M). If the computed MAE is within a certain error range, a confirmation message is sent to the leader node. If the block is confirmed by all authenticating nodes, the leader node sends the block, bearing its own signature, to all nodes, and the block is stored in the blockchain in a tamper-proof manner. Therefore, when the same data demander requests the data again, the relevant data can be read directly from the alliance chain without performing federated machine learning anew. Meanwhile, payment and income both follow the relevant standards, and data acquisition efficiency is effectively improved.
Further, in this embodiment, the method further includes: receiving a registration application from a data provider, wherein the registration application includes a public key, a data configuration file and a wallet account; generating a data retrieval record according to the public key, the data configuration file and the wallet account; and uploading the data retrieval record to the alliance chain so that any node in the alliance chain can verify the data configuration file. Node registration: after a node registration request is received, a smart contract is called to register the node information on the blockchain, and the smart contract is then called to register the node's data record on the chain. Node information update: after a node information update request is received, a smart contract is called to update the node's data record. After a node registers on the blockchain for the first time, a smart contract is called to register its wallet account. Before initiating a data request, the data request node calls a function to deposit funds with the blockchain platform; after the federated learning task is finished, the data provider calls a function to collect its income.
An alliance chain platform is introduced to improve the security of federated learning; it replaces the centralized server and thereby avoids the risk of a single point of failure and a system performance bottleneck. Functions such as federated learning task distribution, multi-party federated learning and fee calculation are executed automatically by means of smart contracts. The federated-learning-driven PoTQ consensus mechanism avoids wasting computing resources, and the data-related nodes are charged and paid fairly according to data quantity and training quality.
Based on the same general inventive concept, the application also protects a data management device based on alliance chain and federated learning. The device provided by the invention is described below; the device described below and the method described above may be referred to in correspondence with each other.
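The block authentication round can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the block layout, the SHA-256 digest over a JSON body, the field names and the `quorum` parameter are all hypothetical, and signatures are omitted.

```python
import hashlib
import json
import time
from typing import Dict, List, Optional


def _digest(body: dict) -> str:
    """Deterministic SHA-256 digest of the block body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def build_block(prev_hash: str, transactions: List[dict], global_mae: float,
                timestamp: Optional[float] = None) -> dict:
    """Leader step: pack the collected transactions into a candidate block."""
    body = {
        "prev_hash": prev_hash,
        "transactions": transactions,
        "global_mae": global_mae,
        "timestamp": time.time() if timestamp is None else timestamp,
    }
    return {"header": {"hash": _digest(body), "prev_hash": prev_hash}, "body": body}


def verify_block(block: dict, recomputed_mae: Dict[str, float], tolerance: float) -> bool:
    """Verifier step: check the digest, then compare each reported MAE
    with the value this node recomputed locally."""
    if _digest(block["body"]) != block["header"]["hash"]:
        return False
    for tx in block["body"]["transactions"]:
        expected = recomputed_mae.get(tx["node"])
        if expected is None or abs(expected - tx["mae"]) > tolerance:
            return False
    return True


def commit_if_approved(chain: List[dict], block: dict, votes: List[bool],
                       quorum: float = 1.0) -> bool:
    """Append the block when the share of approving votes reaches the quorum."""
    if not votes or sum(votes) / len(votes) < quorum:
        return False
    chain.append(block)
    return True
```

With `quorum=1.0` every authenticating node must confirm the block; a lower value corresponds to the variant in which only a certain percentage of nodes needs to approve.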
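The registration and payment flow can be sketched with an in-memory stand-in for the contract state. Everything here is an assumption for illustration: the `Registry` class, its method names, and the use of a SHA-256 digest as the record identifier are hypothetical and do not correspond to any specific blockchain platform API.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Registry:
    """In-memory stand-in for the state a registration contract might keep."""
    records: Dict[str, dict] = field(default_factory=dict)    # data retrieval records by id
    escrow: Dict[str, float] = field(default_factory=dict)    # funds deposited per task
    balances: Dict[str, float] = field(default_factory=dict)  # wallet balances

    def register(self, public_key: str, data_profile: dict, wallet: str) -> str:
        """Build a data retrieval record from the three registration fields."""
        record = {"public_key": public_key, "data_profile": data_profile, "wallet": wallet}
        record_id = hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()
        self.records[record_id] = record
        self.balances.setdefault(wallet, 0.0)
        return record_id

    def deposit(self, task_id: str, amount: float) -> None:
        """Data demander locks funds before issuing a data request."""
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.escrow[task_id] = self.escrow.get(task_id, 0.0) + amount

    def withdraw(self, task_id: str, wallet: str, amount: float) -> None:
        """Data provider collects income after the learning task finishes."""
        if wallet not in self.balances:
            raise KeyError("wallet is not registered")
        if amount <= 0 or amount > self.escrow.get(task_id, 0.0):
            raise ValueError("amount is not covered by the escrowed funds")
        self.escrow[task_id] -= amount
        self.balances[wallet] += amount
```

A provider would call `register` once, the demander would call `deposit` before the request, and each provider would call `withdraw` for its share when the task ends.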
Fig. 3 is a schematic structural diagram of a data management device based on federation chain and federal learning provided by the present invention.
As shown in fig. 3, the data management apparatus based on federation chain and federal learning provided in this embodiment includes:
the receiving module 10 is configured to receive a data request instruction sent by a data demander, where the data request instruction includes a demander identifier and a request data category;
the sharing module 20 is used for calling a smart contract to share the demander identifier and the request data category to an alliance chain;
a retrieval module 30 for retrieving data-related nodes associated with the requested data category over the federation chain, each data-related node including a local data set associated with the requested data category;
the training module 40 is used for training the local data sets based on federated machine learning to obtain a global model;
and the output module 50 is used for outputting the target data to the data demander according to the global model and the demander identifier.
In the data management device based on alliance chain and federated learning provided by this embodiment, a data request instruction sent by a data demander is received, the data request instruction including a demander identifier and a request data category; a smart contract is called to share the demander identifier and the request data category to the alliance chain; data-related nodes associated with the request data category are retrieved over the alliance chain, each data-related node including a local data set associated with the request data category; the local data sets are trained based on federated machine learning to obtain a global model; and target data is output to the data demander according to the global model and the demander identifier. In this way, the target data is not provided by a single node only: based on the federated machine learning model, the relevant data of different data-related nodes can be fused and sent to the data demander. Complete target data is obtained while its accuracy is effectively guaranteed, and the degree of matching between the target data and the required data is improved.
Further, the training module 40 in this embodiment is specifically used for:
training each local data set separately, based on federated machine learning, to obtain a corresponding local model;
and training the local models based on federated machine learning to obtain a global model.
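The way the five modules hand work to one another can be sketched as a single class. This is only a structural illustration: the class name, the dictionary-based request format, the `chain_index` lookup and the injected `train` callable are hypothetical stand-ins for the on-chain retrieval and the federated training described above.

```python
from typing import Callable, Dict, List


class DataManagementDevice:
    """Toy pipeline mirroring modules 10 to 50 of the described device."""

    def __init__(self, chain_index: Dict[str, List[str]],
                 train: Callable[[List[str]], dict]) -> None:
        self.chain_index = chain_index        # data category -> node ids (stand-in for the chain)
        self.train = train                    # stand-in for federated training
        self.shared_requests: List[dict] = []

    def receive(self, request: dict) -> dict:
        """Receiving module 10: validate the data request instruction."""
        if "demander_id" not in request or "category" not in request:
            raise ValueError("request needs 'demander_id' and 'category'")
        return request

    def share(self, request: dict) -> None:
        """Sharing module 20: record the request where all nodes can see it."""
        self.shared_requests.append(
            {"demander_id": request["demander_id"], "category": request["category"]}
        )

    def retrieve(self, category: str) -> List[str]:
        """Retrieval module 30: find nodes holding data of the requested category."""
        return self.chain_index.get(category, [])

    def handle(self, request: dict) -> dict:
        """Run the whole flow and return the result addressed to the demander."""
        req = self.receive(request)
        self.share(req)
        nodes = self.retrieve(req["category"])
        model = self.train(nodes)             # training module 40
        return {"demander_id": req["demander_id"], "target_data": model}  # output module 50
```

For example, `DataManagementDevice({"load": ["n1", "n2"]}, lambda nodes: {"trained_on": len(nodes)}).handle({"demander_id": "d1", "category": "load"})` routes a request for the hypothetical category `"load"` through all five steps.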
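The two training steps above do not name an aggregation rule. As one possible illustration, the sketch below assumes federated averaging (FedAvg), in which the global parameters are the data-size-weighted mean of the local parameters; this choice and the flat parameter-list representation are assumptions, not something the text specifies.

```python
from typing import List, Sequence


def federated_average(local_models: Sequence[Sequence[float]],
                      data_sizes: Sequence[int]) -> List[float]:
    """Combine local parameter vectors into global parameters,
    weighting each node by the size of its local data set."""
    if not local_models or len(local_models) != len(data_sizes):
        raise ValueError("need one data size per local model")
    total = sum(data_sizes)
    if total <= 0:
        raise ValueError("total data size must be positive")
    dim = len(local_models[0])
    if any(len(m) != dim for m in local_models):
        raise ValueError("all local models must have the same number of parameters")
    return [
        sum(model[i] * size for model, size in zip(local_models, data_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical nodes holding 1 and 3 samples respectively.
global_model = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In a full system this aggregation would be repeated over several rounds, with each node retraining from the latest global parameters.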
Further, this embodiment also includes an incentive module, configured to:
identifying an amount of data in the target data provided by each data-related node;
determining the accuracy of each local model, which comprises determining the accuracy of the local model based on a preset mean absolute error formula;
and determining the incentive benefit of each data-related node according to the provided data volume and the accuracy of the local model, and distributing the incentive benefits to the corresponding data-related nodes.
Further, the incentive module in this embodiment is specifically configured to:
determining a unit data volume price;
determining a preset profit calculation formula according to the price of unit data volume, the provided data volume and the accuracy of the local model;
and determining the incentive benefit of each data related node through a preset benefit calculation formula.
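The preset benefit calculation formula is not reproduced in this part of the text. The sketch below therefore uses an assumed form purely to show how the three inputs could interact: the reward is the unit price times the data volume, scaled by a quality factor 1 / (1 + MAE) that equals 1 for a perfect model and shrinks as the error grows. The function name and the quality mapping are hypothetical.

```python
def incentive_benefit(unit_price: float, data_amount: int, mae: float) -> float:
    """Assumed reward rule: price * volume * quality, with quality = 1 / (1 + MAE)."""
    if unit_price < 0 or data_amount < 0 or mae < 0:
        raise ValueError("inputs must be non-negative")
    quality = 1.0 / (1.0 + mae)  # maps MAE in [0, inf) to a factor in (0, 1]
    return unit_price * data_amount * quality


# Hypothetical nodes: (data amount, MAE of the local model).
nodes = {"P1": (100, 0.0), "P2": (100, 1.0), "P3": (50, 0.0)}
rewards = {name: incentive_benefit(2.0, amount, mae) for name, (amount, mae) in nodes.items()}
```

Under this assumed rule, two nodes that supply the same volume are paid differently when their local models differ in accuracy, which matches the stated goal of paying by both data quantity and training quality.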
Further, the present embodiment further includes a storage module, configured to:
determining a leader node among all data-related nodes based on the accuracy of each local model;
collecting, through the leader node, all data received by the alliance chain, and integrating the data into a target block;
broadcasting the target block to each data-related node so that the data-related node authenticates the target block;
and if the data-related nodes all indicate that the authentication is passed, storing the target block into the alliance chain.
Further, the storage module in this embodiment is specifically configured to:
determining committee nodes among the data-related nodes based on the accuracy of each local model;
calculating the accuracy of a local model corresponding to each committee node according to a preset verification formula;
and determining a leader node in the committee nodes according to the accuracy of the local model corresponding to the committee nodes.
Further, the present embodiment further includes a registration module, configured to:
receiving a registration application of a data provider, wherein the registration application comprises a public key, a data configuration file and a wallet account;
generating a data retrieval record according to the public key, the data configuration file and the wallet account;
and uploading the data retrieval record to the alliance chain so that any node in the alliance chain verifies the data configuration file.
Fig. 4 illustrates the physical structure of an electronic device. As shown in fig. 4, the electronic device may include a processor 410, a communication interface 420, a memory 430 and a communication bus 440, wherein the processor 410, the communication interface 420 and the memory 430 communicate with each other via the communication bus 440. The processor 410 may invoke logic instructions in the memory 430 to perform a data management method based on alliance chain and federated learning, the method comprising: receiving a data request instruction sent by a data demander, wherein the data request instruction comprises a demander identifier and a request data category; calling a smart contract to share the demander identifier and the request data category to an alliance chain; retrieving, over the alliance chain, data-related nodes associated with the request data category, each of the data-related nodes including a local data set associated with the request data category; training the local data set based on federated machine learning to obtain a global model; and outputting target data to the data demander according to the global model and the demander identifier.
In addition, the logic instructions in the memory 430 may be implemented in the form of software functional units and stored in a computer readable storage medium when the software functional units are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the data management method based on alliance chain and federated learning provided above, the method comprising: receiving a data request instruction sent by a data demander, wherein the data request instruction comprises a demander identifier and a request data category; calling a smart contract to share the demander identifier and the request data category to an alliance chain; retrieving, over the alliance chain, data-related nodes associated with the request data category, each of the data-related nodes including a local data set associated with the request data category; training the local data set based on federated machine learning to obtain a global model; and outputting target data to the data demander according to the global model and the demander identifier.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the data management method based on alliance chain and federated learning provided above, the method comprising: receiving a data request instruction sent by a data demander, wherein the data request instruction comprises a demander identifier and a request data category; calling a smart contract to share the demander identifier and the request data category to an alliance chain; retrieving, over the alliance chain, data-related nodes associated with the request data category, each of the data-related nodes including a local data set associated with the request data category; training the local data set based on federated machine learning to obtain a global model; and outputting target data to the data demander according to the global model and the demander identifier.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A data management method based on alliance chain and federal learning is characterized by comprising the following steps:
receiving a data request instruction sent by a data demander, wherein the data request instruction comprises a demander identifier and a request data category;
calling a smart contract to share the demander identifier and the request data category to an alliance chain;
retrieving, over the federation chain, data-related nodes associated with the requested data category, each of the data-related nodes including a local dataset associated with the requested data category;
training the local data set based on federated machine learning to obtain a global model;
and outputting target data to the data demander according to the global model and the demander identifier.
2. A data management method according to claim 1, wherein said training the local data set based on federated machine learning to obtain a global model comprises:
training each local data set separately, based on federated machine learning, to obtain a corresponding local model;
and training the local models based on federated machine learning to obtain a global model.
3. A data management method according to claim 2, wherein after outputting target data to the data demander according to the global model and the demander identifier, the method further comprises:
identifying an amount of data in the target data provided by each of the data-dependent nodes;
determining an accuracy rate of each of the local models;
and determining the incentive benefit of each data-related node according to the provided data volume and the accuracy of the local model, and distributing the incentive benefits to the corresponding data-related nodes.
4. A method for data management based on federation chain and federal learning as claimed in claim 3, wherein said determining an accuracy for each of the local models comprises:
and determining the accuracy of the local model based on a preset average absolute error formula.
5. A method for data management based on federation chain and federal learning as claimed in claim 3, wherein said determining an incentive benefit for each of the data-related nodes based on the amount of data provided and the accuracy of the local model comprises:
determining a unit data volume price;
determining a preset profit calculation formula according to the unit data volume price, the provided data volume and the accuracy of the local model;
and determining the incentive benefit of each data-related node through the preset benefit calculation formula.
6. A method for data management based on federation chain and federal learning as claimed in claim 3, wherein after obtaining the global model, the method further comprises:
determining a leader node among all the data-dependent nodes based on the accuracy of each of the local models;
collecting, through the leader node, all data received by the alliance chain, and integrating the data into a target block;
broadcasting the target block to each of the data-related nodes so that the data-related nodes authenticate the target block;
and if the data-related nodes all indicate that the authentication is passed, storing the target block into the alliance chain.
7. A data management method based on federation chain and federal learning as claimed in claim 6, wherein said determining a leader node among all the data-related nodes based on the accuracy of each of the local models comprises:
determining committee nodes among the data-related nodes based on the accuracy of each of the local models;
calculating the accuracy of the local model corresponding to each committee node according to a preset verification formula;
and determining a leader node in the committee nodes according to the accuracy of the local model corresponding to the committee nodes.
8. A data management method according to claim 1, wherein before receiving the data request instruction sent by the data demander, the method further comprises:
receiving a registration application of a data provider, wherein the registration application comprises a public key, a data configuration file and a wallet account;
generating a data retrieval record according to the public key, the data configuration file and the wallet account;
uploading the data retrieval record to a federation chain so that any node in the federation chain verifies the data configuration file.
9. A data management apparatus based on federation chain and federal learning, comprising:
the receiving module is used for receiving a data request instruction sent by a data demander, wherein the data request instruction comprises a demander identifier and a request data category;
the sharing module is used for calling a smart contract to share the demander identifier and the request data category to an alliance chain;
a retrieval module for retrieving, over the federation chain, data-related nodes associated with the requested data category, each of the data-related nodes including a local data set associated with the requested data category;
the training module is used for training the local data set based on federated machine learning to obtain a global model;
and the output module is used for outputting target data to the data demander according to the global model and the demander identifier.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program implements the steps of the federation chain and federal learning based data management method of any one of claims 1 to 8.
CN202110773692.3A 2021-07-08 2021-07-08 Data management method, device and equipment based on alliance chain and federal learning Pending CN113570065A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110773692.3A CN113570065A (en) 2021-07-08 2021-07-08 Data management method, device and equipment based on alliance chain and federal learning

Publications (1)

Publication Number Publication Date
CN113570065A true CN113570065A (en) 2021-10-29

Family

ID=78164120

Country Status (1)

Country Link
CN (1) CN113570065A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination