CN117332871A - Federated learning processing method based on blockchain and related equipment

Federated learning processing method based on blockchain and related equipment

Info

Publication number: CN117332871A
Application number: CN202210726244.2A
Authority: CN (China)
Prior art keywords: node, training, computing, data, nodes
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 王鹏飞, 赵祎安
Current assignee: Dalian University of Technology; Tencent Technology Shenzhen Co Ltd
Application filed by Dalian University of Technology and Tencent Technology Shenzhen Co Ltd
Priority to CN202210726244.2A
Publication of CN117332871A


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005: Allocation of resources to service a request
    • G06F 9/5027: Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/54: Interprogram communication
    • G06F 9/542: Event management; Broadcasting; Multicasting; Notifications
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a blockchain-based federated learning processing method and related equipment, wherein the method includes the following steps: a first computing node receives a federated learning task generated by a task initiating node and, if it does not have federated learning computing capability for first training data, determines, among M second computing nodes, a second computing node that wins the competition and satisfies a node credibility condition as a target computing node; if a first data sharing request sent by the target computing node is received, the first training data is sent to the target computing node, so that the target computing node trains an initial model associated with the federated learning task according to the first training data and its stored training data to obtain a first branch training update parameter; the task initiating node is further configured to globally update the initial model through N branch training update parameters including the first branch training update parameter. By adopting the invention, the data utilization rate of federated learning can be improved.

Description

Federated learning processing method based on blockchain and related equipment
Technical Field
The present application relates to the field of computer technology, and in particular to a blockchain-based federated learning processing method and related equipment.
Background
As a collaborative machine learning paradigm, federated learning has attracted considerable attention from industry and academia in recent years. In a typical federated learning process, edge devices train local models; a central server aggregates all local model updates and takes their average as the global model update; each edge device then acquires the updated global model from the central server and continues training it with its own data until global model training is complete.
With the spread of intelligent Internet-of-Things devices, federated learning is gradually being applied to Internet-of-Things scenarios. Understandably, the more data participates in model training in a federated learning algorithm, the better the resulting model performs. In Internet-of-Things scenarios, however, some devices have limited computing resources and cannot participate in training even though they hold the training data the model requires. Constrained device computing power therefore leads to low data utilization and, ultimately, insufficient performance of the finally trained model.
Disclosure of Invention
The embodiments of the present application provide a blockchain-based federated learning processing method and related equipment, which can improve the data utilization rate of federated learning and thereby improve model performance.
In one aspect, an embodiment of the present application provides a blockchain-based federated learning processing method, executed by a first computing node in a blockchain network; the blockchain network further includes a task initiating node and M second computing nodes, M being a positive integer. The method includes the following steps:
the first computing node receives a federated learning task generated by the task initiating node through a task smart contract, and acquires training data associated with the federated learning task as first training data;
if the first computing node does not have federated learning computing capability for the first training data, it broadcasts a first training competition request for the federated learning task to the M second computing nodes, and determines, among the M second computing nodes, a second computing node that wins the competition and satisfies a node credibility condition as a target computing node; the target computing node has federated learning computing capability for the first training data;
if a first data sharing request sent by the target computing node is received, the first training data is sent to the target computing node, so that the target computing node trains an initial model associated with the federated learning task according to the first training data and its stored training data associated with the federated learning task to obtain a first branch training update parameter; the task initiating node is further configured to globally update the initial model through N branch training update parameters; the N branch training update parameters include the first branch training update parameter, N being a positive integer.
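As a concrete illustration of the first computing node's decision flow just described, here is a minimal runnable Python sketch; every function and variable name in it is an assumption for readability, not terminology from the claims.

```python
def first_node_flow(required_resource, available_resource,
                    first_training_data, competition_responses, share):
    """competition_responses: (node_id, satisfies_credibility, bid) tuples
    from the M second computing nodes; share(node_id, data) delivers the
    first training data once the target's data sharing request arrives."""
    if required_resource <= available_resource:
        return "train locally and respond to the task"
    # Compute-limited: auction the sharing right among trusted contenders.
    trusted = [(nid, bid) for nid, ok, bid in competition_responses if ok]
    if not trusted:
        return "no eligible target computing node"
    target, _ = max(trusted, key=lambda t: t[1])   # competition winner
    share(target, first_training_data)
    return f"shared first training data with {target}"

responses = [("node_a", True, 5.0), ("node_b", False, 9.0)]
print(first_node_flow(8, 2, [(1, 2)], responses, lambda nid, d: None))
# -> shared first training data with node_a
```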
In one aspect, an embodiment of the present application provides a blockchain-based federated learning processing method, executed by a task initiating node in a blockchain network; the blockchain network further includes W computing nodes, W being a positive integer. The method includes the following steps:
the task initiating node generates a federated learning task through a task smart contract and broadcasts the federated learning task to the W computing nodes; if a computation-limited node exists among the W computing nodes, the computation-limited node is configured to broadcast a training competition request for shared training data to the communicable computing nodes corresponding to the computation-limited node, and is further configured to determine, among the communicable computing nodes, a target communicable computing node that wins the competition and satisfies the node credibility condition; the shared training data is training data stored in the computation-limited node and associated with the federated learning task; the computation-limited node does not have federated learning computing capability for the shared training data;
receiving federated learning task response requests respectively sent by Y capability computing nodes, Y being a positive integer; the Y capability computing nodes include the target communicable computing node and do not include the computation-limited node;
determining X training computing nodes from the Y capability computing nodes according to the Y federated learning task response requests, and sending a federated learning task issuing instruction carrying an initial model associated with the federated learning task to the X training computing nodes, so that the X training computing nodes train the initial model according to their respective available training data to obtain branch training update parameters; X is a positive integer; if the X training computing nodes include the target communicable computing node, the available training data corresponding to the target communicable computing node includes the shared training data and the training data stored in the target communicable computing node;
globally updating the initial model according to an aggregate training update parameter until a target model satisfying the training condition indicated by the federated learning task is obtained; the aggregate training update parameter is generated based on the X branch training update parameters.
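To make the end-to-end loop concrete, here is a minimal runnable sketch reduced to a one-parameter model; the gradient step, learning rate, and averaging rule are illustrative assumptions standing in for the unspecified local training and aggregation.

```python
def federated_round(global_w, node_datasets, lr=0.1):
    # Each training computing node takes one local gradient pass on the
    # model y = w * x with its own data, then returns its branch parameter.
    branch_params = []
    for data in node_datasets:
        w = global_w
        for x, y in data:
            w -= lr * 2 * (w * x - y) * x   # gradient of (w*x - y)^2
        branch_params.append(w)
    # The task initiating node globally updates the model from the branch
    # training update parameters (simple averaging here, as an assumption).
    return sum(branch_params) / len(branch_params)

w = 0.0
node_datasets = [[(1.0, 2.0)], [(2.0, 4.0)], [(0.5, 1.0)]]  # all fit w = 2
for _ in range(50):                       # until the training condition is met
    w = federated_round(w, node_datasets)
print(round(w, 2))                        # -> 2.0
```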
In one aspect, an embodiment of the present application provides a blockchain-based federated learning processing apparatus, applied to a first computing node in a blockchain network; the blockchain network further includes a task initiating node and M second computing nodes, M being a positive integer. The federated learning processing apparatus includes:
a task receiving module, configured to receive the federated learning task generated by the task initiating node through a task smart contract, and acquire training data associated with the federated learning task as first training data;
a competition broadcasting module, configured to broadcast a first training competition request for the federated learning task to the M second computing nodes if the first computing node does not have federated learning computing capability for the first training data;
a node determining module, configured to determine, among the M second computing nodes, a second computing node that wins the competition and satisfies the node credibility condition as a target computing node; the target computing node has federated learning computing capability for the first training data;
a data sharing module, configured to send the first training data to the target computing node if a first data sharing request sent by the target computing node is received, so that the target computing node trains an initial model associated with the federated learning task according to the first training data and its stored training data associated with the federated learning task to obtain a first branch training update parameter; the task initiating node is further configured to globally update the initial model through N branch training update parameters; the N branch training update parameters include the first branch training update parameter, N being a positive integer.
Wherein the node determining module includes:
a competition information receiving unit, configured to receive first training competition response information respectively sent by L competing nodes within a competition period; one piece of first training competition response information includes a competition digital resource amount; the L competing nodes are the nodes, among the M second computing nodes, that have federated learning computing capability for the first training data; L is a positive integer less than or equal to M;
a trusted node screening unit, configured to remove the competing nodes that do not satisfy the node credibility condition from the L competing nodes to obtain S trusted competing nodes; S is a positive integer less than or equal to L;
a node pre-selection unit, configured to take the trusted competing node with the highest competition digital resource amount among the S trusted competing nodes as a pre-selected computing node;
a competition confirmation unit, configured to send a first training competition confirmation request to the pre-selected computing node;
the competition confirmation unit is further configured to determine the pre-selected computing node as the target computing node if first training competition confirmation response information sent by the pre-selected computing node according to the first training competition confirmation request is received within a confirmation period.
Wherein the L competing nodes include a competing node Z_i, i being a positive integer less than or equal to L;
the federal learning processing device further includes:
the trusted node determining module is used for acquiring a node trusted probability table;
the trusted node determining module is also used for inquiring from the node trusted probability tableCompeting node Z i Is a trusted probability of (1);
the trusted node determining module is further configured to determine a trusted node according to the competing node Z i Connection relation between them and competing node Z i Determining a probability of trust of a competing node Z i Is the confidence level of (2);
the trusted node determining module is further configured to determine if the node Z is competing i If the reliability of the (a) is smaller than the reliability threshold, determining the competing node Z i The node credibility condition is not satisfied;
the trusted node determining module is further configured to determine if the node Z is competing i If the reliability of (a) is greater than or equal to the reliability threshold, determining the competing node Z i The node reliability condition is satisfied.
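A small runnable sketch of this credibility check follows. The description states only that credibility is determined from the node's trusted probability and the connection relation; the multiplicative combination and all names here are assumptions.

```python
def satisfies_credibility(node_id, trust_table, connection_weight, threshold=0.5):
    """Return True if competing node Z_i meets the node credibility condition.
    trust_table maps node ids to trusted probabilities; connection_weight is
    an assumed numeric encoding of the connection relation with Z_i."""
    credibility = trust_table.get(node_id, 0.0) * connection_weight
    return credibility >= threshold

trust_table = {"Z1": 0.9, "Z2": 0.4}
print(satisfies_credibility("Z1", trust_table, connection_weight=0.8))  # True
print(satisfies_credibility("Z2", trust_table, connection_weight=0.8))  # False
```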
Wherein the federated learning task includes a task deadline and initial model information;
the federated learning processing apparatus further includes:
a computing capability determining module, configured to determine, according to the initial model information, a first required computing resource corresponding to the federated learning task;
the computing capability determining module is further configured to determine that the first computing node does not have federated learning computing capability for the first training data if the first required computing resource is greater than the available computing resource; the available computing resource refers to the idle computing resource of the first computing node;
the computing capability determining module is further configured to determine, according to the data amount of the first training data, a first training duration corresponding to federated learning computation for the first training data if the first required computing resource is less than or equal to the available computing resource;
the computing capability determining module is further configured to determine that the first computing node has federated learning computing capability for the first training data if the first training duration is less than or equal to the task deadline;
the computing capability determining module is further configured to determine that the first computing node does not have federated learning computing capability for the first training data if the first training duration is greater than the task deadline.
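The two-stage test above (resources first, then time) can be sketched as follows; mapping the data amount to a training duration via a throughput constant is an assumption, since the text only says the duration is determined according to the data amount.

```python
def has_fl_capability(required_resource, available_resource,
                      data_amount, throughput, task_deadline):
    """Two-stage capability check: enough idle resources, then fast enough.
    throughput (samples per second) is an assumed stand-in for however the
    node converts data amount into a first training duration."""
    if required_resource > available_resource:
        return False                                  # not enough idle resources
    first_training_duration = data_amount / throughput
    return first_training_duration <= task_deadline

print(has_fl_capability(4, 8, data_amount=1000, throughput=50, task_deadline=30))
# -> True: resources suffice and 20 s of training fits a 30 s deadline
```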
Wherein the blockchain network further includes P third computing nodes, P being a positive integer;
the federated learning processing apparatus further includes:
a competition request receiving module, configured to receive second training competition requests for the federated learning task respectively sent by the P third computing nodes; the second training competition request sent by the j-th third computing node includes the data amount of the j-th second training data; the j-th second training data is training data stored in the j-th third computing node and associated with the federated learning task; the j-th third computing node does not have federated learning computing capability for the j-th second training data; j is a positive integer less than or equal to P;
an acquisition node determining module, configured to determine, among the P third computing nodes, a third computing node that wins the competition as a target acquisition node if the first computing node has federated learning computing capability for the first training data; the first computing node also has federated learning computing capability for the target training data; the target training data is the second training data corresponding to the target acquisition node;
a task response module, configured to send a federated learning task response request carrying training data information to the task initiating node; the training data information is generated according to the first training data and the target training data;
a data acquisition module, configured to send a second data sharing request to the target acquisition node if a federated learning task issuing instruction sent by the task initiating node is received; the federated learning task issuing instruction includes the initial model associated with the federated learning task;
a model training module, configured to receive the target training data sent by the target acquisition node according to the second data sharing request, and train the initial model according to the first training data and the target training data to obtain a second branch training update parameter; the task initiating node is further configured to globally update the initial model through H branch training update parameters; the H branch training update parameters include the second branch training update parameter, H being a positive integer.
Wherein the acquisition node determining module includes:
a node selection unit, configured to traverse the second training competition requests for the federated learning task respectively sent by the P third computing nodes if the first computing node has federated learning computing capability for the first training data;
the node selection unit is further configured to acquire the second training data G_k in the second training competition request sent by the k-th third computing node, and determine, according to the data amount of the second training data G_k, a second training duration T_k for that data amount; k is a positive integer less than or equal to P;
the node selection unit is further configured to add the second training duration T_k to the first training duration to obtain a total training duration T_k^total;
the node selection unit is further configured to, if the total training duration T_k^total is less than or equal to the task deadline, determine that the first computing node has federated learning computing capability for the second training data G_k, and send second training competition response information R_k to the k-th third computing node;
a node confirmation unit, configured to receive second training competition confirmation requests respectively sent by Q third computing nodes within a competition waiting period; the Q third computing nodes are nodes that have received the second training competition response information sent by the first computing node, and Q is a positive integer; one second training competition confirmation request includes a target competition digital resource amount;
the node confirmation unit is further configured to determine the third computing node corresponding to the second training competition confirmation request with the highest target competition digital resource amount as the third computing node winning the competition, send second training competition confirmation response information to that third computing node, and take it as the target acquisition node.
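The feasibility test and winner selection above can be compressed into a short runnable sketch; the two-phase respond/confirm exchange is folded into one step here, and all names are assumptions.

```python
def pick_target_acquisition_node(first_training_duration, task_deadline, requests):
    """requests: (node_id, T_k, target_bid) per third computing node, where
    T_k is the second training duration derived from the data amount of G_k.
    A node is feasible if T_k^total = first duration + T_k fits the task
    deadline; among feasible confirmations, the highest target competition
    digital resource amount wins."""
    feasible = [(node_id, bid) for node_id, t_k, bid in requests
                if first_training_duration + t_k <= task_deadline]
    return max(feasible, key=lambda t: t[1])[0] if feasible else None

requests = [("n1", 4.0, 2.5), ("n2", 9.0, 9.0), ("n3", 5.0, 3.0)]
print(pick_target_acquisition_node(3.0, 10.0, requests))  # -> "n3" (n2 too slow)
```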
Wherein the federated learning processing apparatus further includes:
an encryption module, configured to, if random differential privacy noise is received while receiving the federated learning task issuing instruction sent by the task initiating node, encrypt the second branch training update parameter according to the received random differential privacy noise to obtain a target encrypted branch training update parameter, and send the target encrypted branch training update parameter to the task initiating node; the task initiating node is further configured to perform information aggregation processing on S encrypted branch training update parameters to obtain an encrypted aggregate training update parameter; S is a positive integer; the task initiating node is further configured to add the encrypted aggregate training update parameter and the differential privacy key to obtain the aggregate training update parameter, and globally update the initial model according to the aggregate training update parameter; the S encrypted branch training update parameters include the target encrypted branch training update parameter.
Wherein the federated learning processing apparatus further includes:
a data sharing recording module, configured to acquire the agreed digital resource amount with the target computing node;
the data sharing recording module is further configured to package the agreed digital resource amount, the data amount of the first training data, and the data sharing relation with the target computing node into a data sharing record transaction, and cache the data sharing record transaction in a record transaction pool;
the data sharing recording module is further configured to perform consensus uplink on the data sharing record transaction when the first data sharing request sent by the target computing node is received;
the data sharing recording module is further configured to receive the digital resources, corresponding to the agreed digital resource amount, sent by the target computing node.
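A minimal sketch of this record flow, with the chain represented as a plain list; the field names and hashing scheme are assumptions, not the concrete transaction format of the patent.

```python
import hashlib, json, time

def make_data_sharing_record(agreed_amount, data_amount, provider, consumer):
    """Package the agreed digital resource amount, the shared data amount,
    and the sharing relation into a data sharing record transaction."""
    body = {"agreed_digital_resources": agreed_amount,
            "data_amount": data_amount,
            "sharing_relation": [provider, consumer],
            "timestamp": time.time()}
    tx_hash = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {"tx_hash": tx_hash, **body}

record_pool = [make_data_sharing_record(7.0, 1000, "first_node", "target_node")]
ledger = []                      # stand-in for the blockchain
# On receiving the target node's first data sharing request, the cached
# transaction goes through consensus and is written on chain.
ledger.append(record_pool.pop(0))
print(ledger[0]["tx_hash"][:16])
```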
In one aspect, an embodiment of the present application provides a blockchain-based federated learning processing apparatus, applied to a task initiating node in a blockchain network; the blockchain network further includes W computing nodes, W being a positive integer. The federated learning processing apparatus includes:
a task broadcasting module, configured to generate a federated learning task through a task smart contract and broadcast the federated learning task to the W computing nodes; if a computation-limited node exists among the W computing nodes, the computation-limited node is configured to broadcast a training competition request for shared training data to the communicable computing nodes corresponding to the computation-limited node, and is further configured to determine, among the communicable computing nodes, a target communicable computing node that wins the competition and satisfies the node credibility condition; the shared training data is training data stored in the computation-limited node and associated with the federated learning task; the computation-limited node does not have federated learning computing capability for the shared training data;
a response receiving module, configured to receive federated learning task response requests respectively sent by Y capability computing nodes; Y is a positive integer; the Y capability computing nodes include the target communicable computing node and do not include the computation-limited node;
a training node determining module, configured to determine X training computing nodes from the Y capability computing nodes according to the Y federated learning task response requests, X being a positive integer;
a task issuing module, configured to send a federated learning task issuing instruction carrying the initial model associated with the federated learning task to the X training computing nodes, so that the X training computing nodes train the initial model according to their respective available training data to obtain branch training update parameters; if the X training computing nodes include the target communicable computing node, the available training data corresponding to the target communicable computing node includes the shared training data and the training data stored in the target communicable computing node;
a model updating module, configured to globally update the initial model according to the aggregate training update parameter until a target model satisfying the training condition indicated by the federated learning task is obtained; the aggregate training update parameter is generated based on the X branch training update parameters.
Wherein one federated learning task response request includes a unit data resource consumption and a training data total amount;
the training node determining module includes:
a consumption acquiring unit, configured to acquire the digital resource predicted consumption corresponding to the federated learning task;
an ascending sorting unit, configured to sort the Y unit data resource consumptions in ascending order according to the Y federated learning task response requests to obtain the Y sorted unit data resource consumptions;
a training node selection unit, configured to traverse the Y sorted unit data resource consumptions, sequentially acquire the e-th unit data resource consumption, and determine the e-th node digital resource predicted consumption according to the e-th unit data resource consumption and the training data total amount corresponding to the e-th unit data resource consumption; e is a positive integer less than or equal to Y;
the training node selection unit is further configured to add the capability computing node corresponding to the e-th unit data resource consumption to a node pre-selection queue;
the training node selection unit is further configured to, if the e-th node digital resource predicted consumption is less than the current remaining digital resource predicted consumption, subtract the e-th node digital resource predicted consumption from the current remaining digital resource predicted consumption to obtain an updated remaining digital resource predicted consumption, and continue traversing to acquire the (e+1)-th unit data resource consumption;
the training node selection unit is further configured to stop traversing if the e-th node digital resource predicted consumption is greater than or equal to the current remaining digital resource predicted consumption, or e is equal to Y, and take all X capability computing nodes in the node pre-selection queue as the training computing nodes.
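This traversal is a greedy, budget-bounded selection over bids sorted by unit cost. A runnable sketch, with names assumed:

```python
def select_training_nodes(responses, predicted_budget):
    """responses: (node_id, unit_cost, data_total) from each capability
    computing node. Nodes are taken in ascending unit cost; traversal stops
    once a node's predicted consumption reaches the remaining budget (that
    node is still enqueued, mirroring the order of steps described above)."""
    queue, remaining = [], predicted_budget
    for node_id, unit_cost, data_total in sorted(responses, key=lambda r: r[1]):
        node_consumption = unit_cost * data_total   # e-th node's prediction
        queue.append(node_id)
        if node_consumption >= remaining:           # budget exhausted: stop
            break
        remaining -= node_consumption
    return queue

responses = [("a", 0.02, 500), ("b", 0.01, 400), ("c", 0.05, 300)]
print(select_training_nodes(responses, predicted_budget=16.0))  # -> ['b', 'a', 'c']
```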
Wherein the federated learning processing apparatus further includes:
a digital resource recording module, configured to sequentially acquire the first X-1 training computing nodes from the node pre-selection queue;
the digital resource recording module is further configured to take the node digital resource predicted consumptions respectively corresponding to the X-1 training computing nodes as the transaction digital resource consumptions respectively corresponding to the X-1 training computing nodes;
the digital resource recording module is further configured to acquire the X-th training computing node from the node pre-selection queue;
the digital resource recording module is further configured to, if the node digital resource predicted consumption corresponding to the X-th training computing node is less than the target remaining digital resource predicted consumption, take the node digital resource predicted consumption corresponding to the X-th training computing node as the transaction digital resource consumption corresponding to the X-th training computing node; the target remaining digital resource predicted consumption is the value obtained by subtracting the node digital resource predicted consumptions respectively corresponding to the X-1 training computing nodes from the digital resource predicted consumption;
the digital resource recording module is further configured to take the target remaining digital resource predicted consumption as the transaction digital resource consumption corresponding to the X-th training computing node if the node digital resource predicted consumption corresponding to the X-th training computing node is greater than or equal to the target remaining digital resource predicted consumption;
the digital resource recording module is further configured to package each training computing node and the transaction digital resource consumption corresponding to each training computing node in association into a digital resource record transaction, and cache the digital resource record transaction in the record transaction pool;
the digital resource recording module is further configured to, when the target model satisfying the training condition indicated by the federated learning task is obtained, respectively send the digital resources corresponding to the associated transaction digital resource consumptions to each training computing node, and perform consensus uplink on the digital resource record transaction.
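In short, the first X-1 selected nodes are paid their full predicted consumption, and the X-th (last) node is paid at most whatever remains of the budget. A sketch under assumed names:

```python
def settle_transaction_consumptions(selected, predicted_budget):
    """selected: (node_id, node_predicted_consumption) in pre-selection-queue
    order. Returns the transaction digital resource consumption per node."""
    payments, remaining = {}, predicted_budget
    for node_id, consumption in selected[:-1]:      # first X-1 nodes: full amount
        payments[node_id] = consumption
        remaining -= consumption
    last_id, last_consumption = selected[-1]        # X-th node: capped at remainder
    payments[last_id] = min(last_consumption, remaining)
    return payments

print(settle_transaction_consumptions([("b", 4.0), ("a", 10.0), ("c", 15.0)], 16.0))
# -> {'b': 4.0, 'a': 10.0, 'c': 2.0}
```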
Wherein the federated learning processing apparatus further includes:
a noise processing module, configured to generate a random differential privacy noise set; the differential privacy noise set includes the random differential privacy noise respectively corresponding to each training computing node;
the noise processing module is further configured to add the X random differential privacy noises to obtain a random differential privacy total noise;
the noise processing module is further configured to take the negative of the random differential privacy total noise as the differential privacy key;
a noise sending module, configured to send the federated learning task issuing instruction carrying the initial model associated with the federated learning task to the X training computing nodes, and respectively send the corresponding random differential privacy noise to the X training computing nodes, so that the X training computing nodes respectively encrypt their branch training update parameters according to the received random differential privacy noise to obtain encrypted branch training update parameters;
a parameter aggregation module, configured to perform information aggregation processing on the encrypted branch training update parameters returned by the X training computing nodes to obtain the encrypted aggregate training update parameter;
a parameter decryption module, configured to add the encrypted aggregate training update parameter and the differential privacy key to obtain the aggregate training update parameter.
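The noise set is zero-sum by construction: the key is the negative of the total noise, so adding it to the aggregated masked updates cancels every node's mask exactly. A runnable scalar sketch (vector parameters work the same componentwise; the Gaussian noise distribution is an assumption):

```python
import random

def make_noise_set(num_nodes, scale=1.0):
    noises = [random.gauss(0.0, scale) for _ in range(num_nodes)]
    key = -sum(noises)                 # differential privacy key
    return noises, key

true_updates = [0.2, -0.1, 0.4]        # branch training update parameters
noises, key = make_noise_set(len(true_updates))

masked = [u + n for u, n in zip(true_updates, noises)]   # per-node encryption
encrypted_aggregate = sum(masked)                        # information aggregation
aggregate = encrypted_aggregate + key                    # decryption with the key

assert abs(aggregate - sum(true_updates)) < 1e-9         # masks cancel exactly
```

No single masked update reveals a node's true parameter, yet the initiator still recovers the exact aggregate; only the sum of all X masks, not each individual mask, needs to cancel.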
Wherein the parameter aggregation module includes:
a precision verification unit, configured to acquire a training test set;
the precision verification unit is further configured to respectively verify, according to the training test set, the encrypted branch training update parameters returned by the X training computing nodes to obtain X update precisions;
a target parameter aggregation unit, configured to take the encrypted branch training update parameters returned by the training computing nodes whose update precision satisfies the training precision condition as target encrypted branch training update parameters;
the target parameter aggregation unit is further configured to perform information aggregation processing on the target encrypted branch training update parameters to obtain the encrypted aggregate training update parameter.
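A short sketch of this precision gate; the evaluation callback, threshold, and summation-style aggregation are assumptions, since the text does not specify how an encrypted branch update is scored against the test set.

```python
def aggregate_verified(encrypted_updates, evaluate, precision_threshold):
    """encrypted_updates: {node_id: update}; evaluate(update) -> precision on
    the training test set. Only updates meeting the training precision
    condition are aggregated."""
    kept = [u for u in encrypted_updates.values()
            if evaluate(u) >= precision_threshold]
    return sum(kept) if kept else None

updates = {"a": 0.25, "b": 0.5, "c": -0.2}
precisions = {0.25: 0.92, 0.5: 0.85, -0.2: 0.40}          # toy scores per update
print(aggregate_verified(updates, precisions.get, 0.8))    # -> 0.75 (c filtered out)
```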
Wherein the federated learning processing apparatus further includes:
a precision recording module, configured to package the X training computing nodes and the X update precisions into an update precision record transaction, and initiate consensus processing for the update precision record transaction to the blockchain network, so that each node in the blockchain network updates its stored precision record table according to the X update precisions when the update precision record transaction passes consensus; the precision record table is used to record each node in the blockchain network and the current update precision corresponding to each node.
Wherein the federated learning processing apparatus further includes:
a block-out weight determining module, configured to acquire the precision record table according to a vote confirmation request;
the block-out weight determining module is further configured to determine the contribution degree corresponding to each node according to the current update precision corresponding to each node in the precision record table;
the block-out weight determining module is further configured to determine the resource equity corresponding to each node according to the contribution degree weight, the contribution degree corresponding to each node, and the digital resource holding duration corresponding to each node;
the block-out weight determining module is further configured to send ballots to each node in proportion to the resource equity corresponding to each node, so that each node votes with the received ballots;
the block-out weight determining module is further configured such that, if a computation-limited node exists among the nodes, the computation-limited node is configured to vote its received ballots to the target communicable computing node;
the block-out weight determining module is further configured to receive the ballot count of each node after voting, and determine the block-out weight acquisition proportion corresponding to each node according to the ballot count of each node;
the block-out weight determining module is further configured to randomly determine a block-out node from the nodes according to the block-out weight acquisition proportions; the block-out node has the block-out right for a new block;
the block-out weight determining module is further configured to send reward digital resources to the block-out node, so that the block-out node distributes the reward digital resources according to a ballot composition proportion; the ballot composition proportion is the ratio between the number of ballots, received by the block-out node, that were sent by the task initiating node and the number of ballots voted to it by computation-limited nodes.
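This election is a stake-weighted lottery in the style of delegated proof of stake: equity derives from contribution (update precision) and digital resource holding duration, ballots are issued in proportion to equity, and the block-out node is drawn at random with probability proportional to ballots received. A sketch with an assumed linear equity formula (the text names the inputs but not the formula):

```python
import random

def elect_block_out_node(nodes, contribution_weight=0.6):
    """nodes: {node_id: (contribution_degree, holding_duration)}. Returns
    the node winning the block-out right for the new block."""
    equity = {nid: contribution_weight * c + (1 - contribution_weight) * d
              for nid, (c, d) in nodes.items()}
    total = sum(equity.values())
    ballots = {nid: e / total for nid, e in equity.items()}  # proportional ballots
    ids, weights = zip(*ballots.items())
    return random.choices(ids, weights=weights, k=1)[0]      # weighted lottery

nodes = {"a": (0.9, 0.2), "b": (0.4, 0.9), "c": (0.7, 0.5)}
print(elect_block_out_node(nodes))
```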
In one aspect, an embodiment of the present application provides a computer device, including a processor, a memory, and a network interface;
the processor is connected to the memory and the network interface, wherein the network interface is configured to provide data communication functions, the memory is configured to store a computer program, and the processor is configured to invoke the computer program to execute the method in the embodiments of the present application.
In one aspect, an embodiment of the present application provides a computer-readable storage medium in which a computer program is stored, the computer program being adapted to be loaded by a processor to perform the method in the embodiments of the present application.
In one aspect, an embodiment of the present application provides a computer program product or computer program including computer instructions stored in a computer-readable storage medium; a processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the method in the embodiments of the present application.
In the embodiments of the present application, when a first computing node in a blockchain network receives a federated learning task generated by a task initiating node through a task smart contract, it acquires training data associated with the federated learning task as first training data. If the first computing node does not have federated learning computing capability for the first training data, it broadcasts a first training competition request for the federated learning task to M second computing nodes, and determines, among the M second computing nodes, a second computing node that wins the competition and satisfies the node credibility condition as a target computing node. If a first data sharing request sent by the target computing node is received, the first training data is sent to the target computing node, and the target computing node can train an initial model associated with the federated learning task according to the first training data and its stored training data associated with the federated learning task to obtain a first branch training update parameter. Here M is a positive integer, and the target computing node has federated learning computing capability for the first training data; the task initiating node is further configured to globally update the initial model through N branch training update parameters; the N branch training update parameters include the first branch training update parameter, N being a positive integer.
With the method provided by the embodiments of the present application, if the first computing node is computation-limited, it can share its stored first training data with a target computing node that has federated learning computing capability for the first training data and satisfies the node credibility condition; the target computing node can then train the model with both its own stored training data and the first training data. This avoids wasting the training data held by computation-limited computing nodes and improves data utilization in the federated learning process, so that more training data participates in training and the finally trained model performs better. Moreover, since the first computing node and the target computing node are nodes in a blockchain network and satisfy the credibility condition, the data interaction is safer.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained from them by a person skilled in the art without inventive effort.
FIG. 1 is a schematic structural diagram of a blockchain network according to an embodiment of the present application;
FIG. 2a is a schematic diagram of a federated learning task broadcast scenario according to an embodiment of the present application;
FIG. 2b is a schematic diagram of a scenario of a federated learning processing method according to an embodiment of the present application;
FIG. 2c is a schematic diagram of a federated learning process scenario according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of a blockchain-based federated learning processing method according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a data structure for record transactions according to an embodiment of the present application;
FIG. 5 is a schematic flowchart of a blockchain-based federated learning processing method according to an embodiment of the present application;
FIG. 6 is a flowchart of a blockchain-based federated learning processing method according to an embodiment of the present application;
FIG. 7 is a flowchart of a blockchain-based federated learning processing method according to an embodiment of the present application;
FIG. 8 is a flowchart of a data competition method according to an embodiment of the present application;
FIG. 9 is a flowchart of a task allocation method according to an embodiment of the present application;
FIG. 10 is a flowchart of encrypted model training according to an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a blockchain-based federated learning processing apparatus according to an embodiment of the present application;
FIG. 12 is a schematic structural diagram of a computer device according to an embodiment of the present application;
FIG. 13 is a schematic structural diagram of another blockchain-based federated learning processing apparatus according to an embodiment of the present application;
FIG. 14 is a schematic structural diagram of another computer device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
For ease of understanding, the concepts relevant to the present application are first introduced below:
Federated machine learning (federated learning), also known as joint learning or alliance learning. Federated machine learning is a machine learning framework that can effectively help multiple institutions perform data usage and machine learning modeling while meeting the requirements of user privacy protection, data security, and government regulations. The general federated learning process is: edge devices train local models; a central server aggregates all local model updates and takes their average as the global model update; each edge device then acquires the updated global model from the central server and continues training it with its own data until global model training is complete.
Social Internet of Things (SIoT): with the advance of Internet-of-Things applications, IoT technology is being combined with social networks, so that the Internet of Things covers not only thing-to-thing and thing-to-person associations but can also introduce person-to-person relationships, better depicting a world where everything is interconnected. A conventionally established social network is a network of person-to-person relationships: people are the nodes of the network, organized together by friendship.
Blockchain-enhanced federated learning market (Blockchain-enhanced Federated Learning Market, BFL market): a social Internet-of-Things distributed market with fair competition and data transactions, in which IoT devices can profit by completing published federated learning tasks or by trading data.
A blockchain is the carrier and organizational form on which blockchain technology runs. Blockchain technology (Blockchain technology, BT), also known as distributed ledger technology, is an Internet database technology characterized by decentralization and open transparency, allowing everyone to participate in database records. Blockchain technology is a distributed infrastructure and computing method that uses a blockchain data structure to verify and store data, a distributed node consensus algorithm to generate and update data, cryptography to secure data transmission and access, and smart contracts composed of automated script code to program and manipulate data.
A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. It is mainly used to order data chronologically and encrypt it into a ledger, so that the ledger cannot be tampered with or forged, while the data is verified, stored, and updated. A blockchain is essentially a decentralized database in which each node stores an identical copy of the chain. A blockchain network can distinguish nodes into consensus nodes and service nodes, where the consensus nodes are responsible for the consensus of the entire blockchain network. The process by which transaction data is written into the ledger may be: a client sends the transaction data to service nodes; the transaction data is then relayed between service nodes in the blockchain network until a consensus node receives it; the consensus node packages the transaction data into a block and reaches consensus with the other consensus nodes; after consensus passes, the block carrying the transaction data is written into the ledger.
Block: a data packet carrying transaction data (i.e., transaction business) on the blockchain network; it is a data structure marked with a timestamp and the hash value of the previous block. A block is verified, and the transactions in it confirmed, by the network's consensus mechanism.
Hash value: also called an information characteristic value or eigenvalue, a hash value is generated by converting input data of arbitrary length into a fixed-length output through a hash algorithm; the original input data cannot be recovered from the hash value, making it a one-way function. In a blockchain, each block (except the initial block) contains the hash value of the preceding block, which is called the parent block of the current block. The hash value is the core underlying foundation of blockchain technology: it preserves the authenticity of recorded and viewed data and the integrity of the blockchain as a whole.
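A tiny runnable illustration of this parent-hash chaining; the block fields are assumptions for the example.

```python
import hashlib, json

def block_hash(block):
    # Deterministic hash over the block's contents.
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

genesis = {"height": 0, "parent": None, "txs": []}
child = {"height": 1, "parent": block_hash(genesis), "txs": ["tx1"]}

# Any tampering with the parent changes its hash and breaks the link.
assert child["parent"] == block_hash(genesis)
genesis["txs"].append("forged")
assert child["parent"] != block_hash(genesis)
```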
Transaction: a transaction sent by a blockchain account carries a transaction hash as its unique identifier and includes the account address identifying the blockchain account that sent it.
Smart contract: a smart contract may refer to code that each node of a blockchain (including consensus nodes) can understand and execute; it can execute arbitrary logic and produce results. A blockchain may include one or more smart contracts, distinguished by an identification number (Identity document, ID) or name; a transaction request may carry the identification number or name of a smart contract, thereby specifying which smart contract the blockchain is to run.
Computing resources: the hardware or network resources a device needs to occupy when performing computing tasks, generally including CPU (central processing unit) resources, GPU (graphics processing unit) resources, memory resources, network bandwidth resources, and disk resources.
Referring to FIG. 1, FIG. 1 is a schematic structural diagram of a blockchain network according to an embodiment of the present application. The blockchain network shown in FIG. 1 may include, but is not limited to, a blockchain network corresponding to a consortium chain. The blockchain network may include a plurality of blockchain nodes, such as blockchain node 10a, blockchain node 10b, blockchain node 10c, blockchain node 10d, ..., and blockchain node 10n. During normal operation, each blockchain node can receive data sent from outside, perform blockchain uplink processing based on the received data, and also send data outward. To ensure data interworking, a data connection may exist between blockchain nodes, for example between blockchain node 10a and blockchain node 10b, between blockchain node 10a and blockchain node 10c, and between blockchain node 10b and blockchain node 10c.
It will be appreciated that data or blocks may be transferred between blockchain nodes via the data connections described above. The blockchain network may implement data connections between blockchain nodes based on node identifiers: each blockchain node may store the node identifiers of the other blockchain nodes it is connected to, so that acquired data or generated blocks can be broadcast to those other blockchain nodes according to their node identifiers. For example, blockchain node 10a may maintain a node identifier list as shown in Table 1, which stores the node names and node identifiers of the other nodes:
TABLE 1

Node name    Node identification
Node 10a     AAA.AAA.AAA.AAA
Node 10b     BBB.BBB.BBB.BBB
Node 10c     CCC.CCC.CCC.CCC
Node 10d     DDD.DDD.DDD.DDD
Node 10n     EEE.EEE.EEE.EEE
The node identifier may be an Internet Protocol (IP) address or any other information that can identify a node in the blockchain network; Table 1 uses IP addresses only as an illustration. For example, blockchain node 10a may send information (e.g., a block) to blockchain node 10b via the node identification BBB.BBB.BBB.BBB, and blockchain node 10b may determine, via the node identification AAA.AAA.AAA.AAA, that the information was sent by blockchain node 10a.
In a blockchain, a block must pass consensus among the consensus nodes in the blockchain network before it is uplinked; only after consensus passes can the block be added to the blockchain. It will be appreciated that when a blockchain is used in some government or commercial scenarios, not all participating nodes (i.e., the blockchain nodes in the blockchain node system described above) have sufficient resources, or sufficient need, to become consensus nodes of the blockchain. For example, in the blockchain network shown in FIG. 1, blockchain node 10a, blockchain node 10b, blockchain node 10c, and blockchain node 10d may be regarded as the consensus nodes of the blockchain network. The consensus nodes participate in consensus, that is, they reach consensus on blocks (each comprising a batch of transactions), including generating blocks and voting on blocks; non-consensus nodes do not participate in consensus but help propagate block and voting messages, synchronize state with each other, and so on.
In the blockchain network shown in FIG. 1, some of the blockchain nodes (whether or not they are consensus nodes) may be Internet-of-Things devices, and a blockchain-enhanced social Internet-of-Things federated learning market may be formed among those nodes. For example, if blockchain nodes 10a, 10b, 10c, and 10d are all IoT devices, they may form a blockchain-enhanced social IoT federated learning market. In this market, any IoT device can act as a task initiating node: it generates a federated learning task through a task smart contract on the blockchain and broadcasts it to the computing node cluster (i.e., the other IoT devices in the market); each computing node in the cluster can then decide, according to its own computing resources, whether to respond to the federated learning task or to share its stored training data with other computing nodes that have abundant computing resources.
Taking a first computing node in the computing node cluster as an example: the first computing node receives the federated learning task generated by the task initiating node through the task smart contract, and can acquire training data associated with the federated learning task as first training data. If the first computing node does not have federated learning computing capability for the first training data, that is, its computing resources are insufficient to complete the federated learning task, it may broadcast a first training competition request for the federated learning task to M second computing nodes (which may be the computing nodes within the first computing node's communicable range), and then determine, among the M second computing nodes, a second computing node that wins the competition and satisfies the node credibility condition as the target computing node; the target computing node is required to have federated learning computing capability for the first training data. Then, if the first computing node receives a first data sharing request sent by the target computing node, it sends the first training data to the target computing node; the target computing node can then train an initial model associated with the federated learning task according to the first training data and its stored training data associated with the federated learning task to obtain a first branch training update parameter, which it sends to the task initiating node. The task initiating node receives the first branch training update parameter returned by the target computing node as well as the branch training update parameters returned by the other computing nodes, and finally globally updates the initial model corresponding to the federated learning task through the N received branch training update parameters. M and N are positive integers.
It should be understood that the above data connection is not limited in its connection manner: it may be a direct or indirect wired connection, a direct or indirect wireless connection, or another connection manner, which is not limited here.
It can be appreciated that the federated learning processing method provided in the embodiments of the present application may be executed by an Internet-of-Things device, including but not limited to a server or a terminal device. The server may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms. The terminal may be, but is not limited to, a smartphone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, or a smart watch.
It is appreciated that embodiments of the present application may be applied to a variety of scenarios, including, but not limited to, smart internet of things, artificial intelligence, and the like.
It will be appreciated that the specific embodiments of the present application involve related data such as training data. When the above embodiments of the present application are applied to specific products or technologies, user permission or consent is required, and the collection, use and processing of the related data must comply with the relevant laws, regulations and standards of the relevant countries and regions.
For a better understanding of the architecture of the blockchain enhanced social internet of things federal learning market and the federal learning process described above, please refer to fig. 2a, which is a schematic diagram of a scenario of federal learning task broadcasting provided in an embodiment of the present application. As shown in fig. 2a, the federal learning market 2000 is a blockchain enhanced social internet of things federal learning market, and includes a task initiating node 200, a computing node 201, a computing node 202, and a computing node 203. The task initiating node 200, the computing node 201, the computing node 202 and the computing node 203 are all internet of things devices, and each node may be any one of the blockchain nodes in the blockchain network shown in fig. 1; for example, the task initiating node 200 may be the blockchain node 10b shown in fig. 1, the computing node 201 may be the blockchain node 10d shown in fig. 1, the computing node 202 may be the blockchain node 10c shown in fig. 1, and the computing node 203 may be the blockchain node 10a shown in fig. 1. In other words, the federal learning market 2000 may be deployed in the blockchain network shown in fig. 1, and all nodes in the federal learning market 2000 are blockchain nodes in the blockchain network, so that each node may store an identical blockchain 2001. Interactions between nodes in the federal learning market 2000 may take the form of transactions, and after these transactions pass consensus in the blockchain network, the interactions are recorded in the blockchain 2001.
As shown in fig. 2a, in the federal learning market 2000, assuming that the task initiating node 200 needs to train an initial model A by means of federal learning, it may invoke the task intelligent contract, generate a federal learning task B for the initial model A, and then broadcast the federal learning task B to all computing nodes in the federal learning market 2000. The task intelligent contract belongs to the above-mentioned intelligent contracts in the blockchain network and is a piece of code that can be understood and executed by the task initiating node 200. It should be noted that, in the federal learning market 2000, any node may be a task initiating node, in which case the other nodes may be the computing nodes corresponding to that task initiating node. In addition, the same task initiating node can initiate different federal learning tasks at the same time, and different task initiating nodes can initiate different federal learning tasks at the same time; in the same federal learning market, the processing of different federal learning tasks does not interfere with one another.
Each computing node, upon receiving the federal learning task B, may determine whether to respond to the federal learning task B based on its stored data and computing resources. For ease of understanding, the computing node 201 is taken as an example of the first computing node. Referring to fig. 2b, fig. 2b is a schematic view of a scenario of a federal learning processing method according to an embodiment of the present application. As shown in fig. 2b, after receiving the federal learning task B generated by the task initiating node 200 through the task intelligent contract, the computing node 201 may acquire the locally stored training data C associated with the federal learning task B, and then determine whether it has the federal learning computing capability for the training data C. In short, the computing node 201 predicts the computing resources required for training the initial model A associated with the federal learning task B, and then queries its own available computing resources; if the available computing resources are smaller than the required computing resources, the computing node 201 determines that it does not have the federal learning computing capability for the training data C, and at this time the computing node 201 may be referred to as a computation-restricted node and cannot directly respond to the federal learning task B. However, to increase the utilization of data in the federal learning market 2000, the computing node 201 may share the training data C with computing nodes that have rich computing resources, so that those computing nodes have more training data with which to complete the federal learning task B. Accordingly, the computing node that obtains the training data C of the computing node 201 needs to give corresponding digital resources to the computing node 201. To obtain more digital resources, the computing node 201 may generate a training competition request D and broadcast it to other computing nodes, for example, the computing node 202 and the computing node 203. The training competition request D may in fact be regarded as an auction request; after receiving the training competition request D, the other computing nodes may choose whether to compete for the sharing right of the training data C according to the idle condition of their own computing resources. It may be appreciated that when the number of computing nodes in the federal learning market 2000 is small, the training competition request of a computation-restricted node may be broadcast to all the remaining computing nodes, and when the number of computing nodes in the federal learning market 2000 is large, the training competition request of a computation-restricted node may be broadcast only to some computing nodes that are close to it.
As shown in fig. 2b, after receiving the training competition request D, if the computing node 202 determines that it has the federal learning computing capability for its locally stored training data and, further, for the training data C, it may generate training competition response information E and return it to the computing node 201. The training competition response information E includes a competing digital resource quantity e, which informs the computing node 201 that, if it is willing to share the training data C with the computing node 202, the computing node 202 will send it digital resources corresponding to the competing digital resource quantity e. Similarly, the computing node 203 may send the computing node 201 training competition response information F carrying a competing digital resource quantity f. The computing node 201 may then determine, from the computing node 202 and the computing node 203, the computing node that competes successfully and satisfies the node reliability condition as the target computing node. The node reliability condition means that the reliability assigned to a computing node by the computing node 201 needs to be greater than a reliability threshold. Since the federal learning market 2000 belongs to the social internet of things scenario, relationships of varying closeness exist between computing nodes; that is, for the computing node 201, the reliabilities of the other computing nodes in the federal learning market 2000 differ. For example, the reliability assigned by the computing node 201 to computing nodes with which it frequently exchanges data may be high, while the reliability assigned to computing nodes with which it has never interacted may be low. The computing node that competes successfully generally refers to the computing node that offers the largest competing digital resource quantity.
Assuming that the target computing node determined by the computing node 201 is the computing node 202, the training data available to the computing node 202 for the federal learning task comprises its locally stored training data and the training data C. As can be seen from fig. 2b, when a computing node in the federal learning market 2000 receives the federal learning task B, if its computing resources are limited, it can share the training data it stores with computing nodes that have rich computing resources; if its computing resources are rich, it can compete for the training data of other computation-restricted computing nodes. After the competition for training data among the computing nodes is completed, the computing nodes with abundant computing resources and available training data can act as capability computing nodes and send a federal learning task response request to the task initiating node 200. For ease of understanding, please refer to fig. 2c, which is a schematic view of a scenario of a federal learning process according to an embodiment of the present application. As shown in fig. 2c, assuming that the computing node 202 and the computing node 203 are both capability computing nodes, each of them may send a federal learning task response request to the task initiating node 200. Upon receiving the federal learning task response requests, the task initiating node 200 can select all or some of the responding capability computing nodes as training computing nodes. Assuming that the task initiating node 200 selects the computing node 202 and the computing node 203 as training computing nodes, the task initiating node 200 issues federal learning task issuing instructions carrying the initial model A to the computing node 202 and the computing node 203 respectively. As shown in fig. 2c, because the computing node 202 competed successfully for the training data C stored in the computing node 201, after receiving the federal learning task issuing instruction the computing node 202 generates a data sharing request and sends it to the computing node 201; after receiving the data sharing request, the computing node 201 sends the training data C to the computing node 202; after receiving the training data C, the computing node 202 further obtains its locally stored training data H, trains the initial model A according to the training data C and the training data H, finally obtains its branch training update parameters, and returns them to the task initiating node 200. As shown in fig. 2c, the computing node 203 did not compete for the training data of other computing nodes, so it directly trains the initial model A according to its locally stored training data I, finally obtains the branch training update parameters J, and returns them to the task initiating node 200. The task initiating node 200 may globally update the initial model A based on all of the received branch training update parameters.
Further, referring to fig. 3, fig. 3 is a flow chart of a blockchain-based federal learning processing method according to an embodiment of the present application. The method may be performed by a first computing node in a blockchain network (which may be any blockchain node in the blockchain network); the blockchain network further includes a task initiating node (which may be any blockchain node in the blockchain network other than the first computing node) and M second computing nodes (which may be any M blockchain nodes in the blockchain network other than the first computing node and the task initiating node), where M is a positive integer. The method is described below as being performed by the first computing node; the blockchain-based federal learning processing method may include at least the following steps S101-S103:
in step S101, the first computing node receives a federal learning task generated by the task initiating node through a task intelligent contract, and obtains training data associated with the federal learning task as first training data.
Specifically, the federal learning task may carry information such as initial model information, a task deadline, and the type of training data required. The first computing node may first acquire training data associated with the federal learning task as the first training data according to the type of training data required. Then, the first computing node may determine, according to the initial model information, the task deadline and the like, whether it has the federal learning computing capability for the first training data. Having the federal learning computing capability for the first training data means that the first computing node can complete, within the task deadline, the training of the initial model associated with the federal learning task using the first training data.
Optionally, a feasible implementation by which the first computing node determines whether it has the federal learning computing capability for the first training data may be as follows. The first computing node may determine, according to the initial model information, a first required computing resource corresponding to the federal learning task; if the first required computing resource is greater than the available computing resource, it is determined that the first computing node does not have the federal learning computing capability for the first training data, where the available computing resource refers to the free computing resource of the first computing node. If the first required computing resource is less than or equal to the available computing resource, the first computing node may determine, according to the data amount of the first training data, a first training duration corresponding to the federal learning calculation for the first training data; if the first training duration is less than or equal to the task deadline, it is determined that the first computing node has the federal learning computing capability for the first training data; if the first training duration is greater than the task deadline, it is determined that the first computing node does not have the federal learning computing capability for the first training data.
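For ease of understanding, the following Python sketch illustrates the two-stage capability check described above. It is a minimal illustration only: the function name, the throughput parameter (used here to predict the first training duration from the data amount), and the numeric types are hypothetical and are not prescribed by the embodiments of the present application.

```python
def has_federal_learning_capability(required_resource: float,
                                    available_resource: float,
                                    data_amount: float,
                                    throughput: float,
                                    task_deadline: float) -> bool:
    """Two-stage check: free computing resources first, then whether the
    predicted first training duration fits within the task deadline.

    throughput (data trainable per unit time) is a hypothetical stand-in
    for however the node predicts its training duration."""
    if required_resource > available_resource:
        return False  # computation-restricted: cannot train the model at all
    first_training_duration = data_amount / throughput
    return first_training_duration <= task_deadline
```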
Step S102, if the first computing node does not have the federal learning computing capability for the first training data, broadcasting a first training competition request for the federal learning task to the M second computing nodes, and determining a second computing node which is successful in competition and meets a node reliability condition in the M second computing nodes as a target computing node; the target computing node has federal learning computing capabilities for the first training data.
Specifically, the absence of the federal learning computing capability for the first training data means that, although the first computing node has the first training data associated with the federal learning task, it does not have the computing resources to train the initial model, or its computing resources are insufficient to guarantee that training on the first training data can be completed within the task deadline. At this time, the first computing node may choose to share the first training data through a safe and reliable data sharing policy, so as to obtain a certain benefit. Thus, the first computing node may broadcast a first training competition request for the federal learning task to the M second computing nodes in the blockchain network, where the M second computing nodes may be computing nodes in the blockchain network within a communicable range of the first computing node. The first training competition request may be understood as an auction request: it informs the M second computing nodes that they may choose to bid a certain quantity of digital resources to compete for the first training data. A digital resource refers to a virtual digital asset available for trading in the blockchain network.
Specifically, a feasible implementation by which the first computing node determines, from the M second computing nodes, a second computing node that competes successfully and satisfies the node reliability condition as the target computing node may be: in a competition time period, receiving first training competition response information sent by L competing nodes respectively, where one piece of first training competition response information includes a competing digital resource quantity, the L competing nodes are the nodes among the M second computing nodes that have the federal learning computing capability for the first training data, and L is a positive integer less than or equal to M; removing the competing nodes that do not satisfy the node reliability condition from the L competing nodes to obtain S trusted competing nodes, where S is a positive integer less than or equal to L; acquiring, from the S trusted competing nodes, the trusted competing node with the highest competing digital resource quantity as a pre-selected computing node; transmitting a first training competition confirmation request to the pre-selected computing node; and if first training competition confirmation response information sent by the pre-selected computing node according to the first training competition confirmation request is received within a confirmation time period, determining the pre-selected computing node as the target computing node.
It may be appreciated that not all of the M second computing nodes necessarily have enough computing resources to complete the training on the first training data, so the first computing node may receive first training competition response information only from the L competing nodes that have the federal learning computing capability for the first training data; the first training competition response information informs the first computing node of the competing digital resource quantity, namely the digital resources the competing node is willing to pay for the first training data. The first computing node will naturally prefer the competing node that offers the greatest competing digital resource quantity. However, it should be noted that the first computing node does not consider whether a second computing node is trusted when broadcasting the first training competition request. In the social internet of things scenario, trust problems exist between nodes, and data interaction should not be performed between untrusted nodes, because data security problems may occur. Therefore, untrusted competing nodes may exist among the L competing nodes; to ensure data security, the first computing node should first reject the untrusted competing nodes to obtain the trusted competing nodes, and then select the trusted competing node with the highest competing digital resource quantity.
Specifically, whether a competing node is trusted may be determined by the node reliability condition: a competing node satisfies the node reliability condition when the reliability assigned to it by the first computing node is greater than a reliability threshold.
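The selection of the pre-selected computing node described above can be illustrated with a short Python sketch. The names and the reliability mapping are hypothetical; the sketch only shows the filter-then-maximize order (reject untrusted competing nodes first, then take the highest bid).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class CompetitionResponse:
    node_id: str
    bid: float  # competing digital resource quantity offered

def pick_preselected_node(responses: List[CompetitionResponse],
                          reliability: Dict[str, float],
                          threshold: float) -> Optional[CompetitionResponse]:
    """Reject competing nodes whose reliability (as assessed by the first
    computing node) is below the threshold, then take the trusted
    competing node with the highest bid as the pre-selected computing
    node; it still has to confirm before becoming the target node."""
    trusted = [r for r in responses
               if reliability.get(r.node_id, 0.0) >= threshold]
    return max(trusted, key=lambda r: r.bid) if trusted else None
```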
Optionally, assume that the L competing nodes include a competing node Z_i, where i is a positive integer less than or equal to L. A feasible implementation by which the first computing node determines whether Z_i satisfies the node reliability condition may be: acquiring a node trusted probability table; querying the trusted probability of the competing node Z_i from the node trusted probability table; determining the reliability of the competing node Z_i according to the connection relationship with the competing node Z_i and the trusted probability of the competing node Z_i; if the reliability of the competing node Z_i is smaller than the reliability threshold, determining that the competing node Z_i does not satisfy the node reliability condition; and if the reliability of the competing node Z_i is greater than or equal to the reliability threshold, determining that the competing node Z_i satisfies the node reliability condition. The node trusted probability table contains the trusted probabilities assigned by the first computing node to the nodes directly adjacent to it in the blockchain network; the trusted probability may be divided into 11 levels from 0 to 1 with a step size of 0.1, and the first computing node may select the trusted probability of a node according to its social trust relationship with that node, and then determine the reliability of the competing node Z_i according to the connection relationship with the competing node Z_i and the trusted probability of the competing node Z_i.
Specifically, a feasible implementation by which the first computing node determines the reliability of the competing node Z_i according to the connection relationship with the competing node Z_i and the trusted probability of the competing node Z_i may be as follows. If the competing node Z_i is directly connected to the first computing node, the first computing node may calculate the reliability of the competing node Z_i using an entropy-based approach, according to the following formula:

ψ(p) = -p·log_2(p) - (1 - p)·log_2(1 - p)   formula (1)

where p refers to the trusted probability of the competing node Z_i; when p is between 0 and 0.5, the reliability of the competing node Z_i is ψ(p) - 1, and when p is between 0.5 and 1, the reliability of the competing node Z_i is 1 - ψ(p).
If the competing node Z_i is not directly connected to the first computing node, so that the first computing node cannot find the trusted probability of the competing node Z_i in the node trusted probability table, the first computing node may calculate the reliability of the competing node Z_i through the reliabilities of the indirectly connected nodes between them, by means of cascade and multipath propagation, where the cascade is calculated using a multiplication rule and the multipath propagation is calculated using a weighted average rule. For ease of understanding, assume that the first computing node corresponds to device a and the competing node Z_i corresponds to device d, device a is directly connected to device b and device c respectively, device d is directly connected to device b and device c respectively, and device b is not connected to device c. The reliability between device a and device d is then calculated as:

L_{ad} = (L_{ab}·L_{bd} + L_{ac}·L_{cd}) / (L_{ab} + L_{ac})   formula (2)

where L_{ad} is the reliability between device a and device d; L_{ab} is the reliability between device a and device b, which, since device a is directly connected to device b, can be obtained based on formula (1) above; and similarly L_{ac}, the reliability between device a and device c, L_{bd}, the reliability between device b and device d, and L_{cd}, the reliability between device c and device d, can each be determined based on formula (1) above.
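The following Python sketch illustrates formulas (1) and (2) as reconstructed above. The denominator of formula (2) reflects the weighted average rule with first-hop reliabilities as weights, which is an assumption; the function names and the example probabilities are hypothetical.

```python
import math

def entropy_trust(p: float) -> float:
    """Formula (1): reliability of a directly connected node from its
    trusted probability p (0 <= p <= 1)."""
    if p in (0.0, 1.0):
        psi = 0.0
    else:
        psi = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return 1 - psi if p >= 0.5 else psi - 1

def multipath_trust(l_ab: float, l_bd: float,
                    l_ac: float, l_cd: float) -> float:
    """Formula (2): cascade each path a-b-d and a-c-d by multiplication,
    then combine the two paths with a weighted average whose weights are
    the first-hop reliabilities (an assumption)."""
    return (l_ab * l_bd + l_ac * l_cd) / (l_ab + l_ac)

# Hypothetical example: device a knows b and c directly; b and c know d.
l_ab, l_ac = entropy_trust(0.9), entropy_trust(0.8)
l_bd, l_cd = entropy_trust(0.9), entropy_trust(0.7)
print(multipath_trust(l_ab, l_bd, l_ac, l_cd))  # reliability L_ad
```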
Specifically, after the first computing node obtains the trusted competing node with the highest competing digital resource quantity, that node is only taken as the pre-selected computing node, because the same competing node may simultaneously compete for the training data of a plurality of computation-restricted nodes, and when the first computing node selects a competing node, the competing node may also decide whether to accept the selection of the first computing node. Therefore, the first computing node further needs to send a first training competition confirmation request to the pre-selected computing node, where the first training competition confirmation request may carry an agreed digital resource quantity. If the first computing node receives, within the confirmation time period, the first training competition confirmation response information sent by the pre-selected computing node according to the first training competition confirmation request, the first computing node takes the pre-selected computing node as the target computing node; this means that after the first computing node shares the first training data with the target computing node, the target computing node needs to pay the first computing node digital resources corresponding to the agreed digital resource quantity. It may be appreciated that the agreed digital resource quantity may be smaller than or equal to the competing digital resource quantity originally offered by the target computing node, and the specific value may be determined according to different situations; for example, the agreed digital resource quantity may be equal to the second largest competing digital resource quantity received by the first computing node, which helps to better establish the data sharing relationship with the target computing node.
Step S103, if a first data sharing request sent by the target computing node is received, sending the first training data to the target computing node, so that the target computing node trains an initial model associated with the federal learning task according to the first training data and stored training data associated with the federal learning task, and obtains a first branch training update parameter; the task initiating node is further used for globally updating the initial model through N branch training update parameters; the N branch training update parameters include the first branch training update parameter, and N is a positive integer.
Specifically, after the first computing node determines the target computing node, the data sharing relationship between the two is established, but at this time the first computing node does not immediately send the first training data to the target computing node; rather, it waits for the target computing node to send the first data sharing request to it, and only then sends the first training data. The reason is that the budget digital resources of the task initiating node for a federal learning task are limited, while there are typically multiple computable nodes in the blockchain network that possess training data associated with the federal learning task and have the capability to complete it, the target computing node being one of these computable nodes. Therefore, the task initiating node generally needs to perform task allocation, obtaining as much training data as possible on the condition that the cost does not exceed the budget digital resources. If the computable nodes selected by the task initiating node include the target computing node, the target computing node sends the first data sharing request to the first computing node. After the target computing node acquires the first training data, it can train the initial model associated with the federal learning task according to the first training data and its stored training data associated with the federal learning task, obtain the first branch training update parameter, and return the first branch training update parameter to the task initiating node. Similarly, the other selected computable nodes also return branch training update parameters to the task initiating node respectively; in other words, the task initiating node finally receives N branch training update parameters and globally updates the initial model according to the N branch training update parameters.
Optionally, when the first computing node determines the target computing node, the first computing node may acquire the agreed digital resource quantity negotiated with the target computing node, package the agreed digital resource quantity, the data amount of the first training data and the data sharing relationship with the target computing node into a data sharing record transaction, and cache the data sharing record transaction in a record transaction pool. Upon receiving the first data sharing request sent by the target computing node, the first computing node can perform consensus uplink on the data sharing record transaction, and then receive the digital resources corresponding to the agreed digital resource quantity sent by the target computing node. That is, the data sharing between the first computing node and the target computing node is stored in the blockchain through the data sharing record transaction, which facilitates subsequent verification. Referring to fig. 4, fig. 4 is a schematic diagram of the data structure of a record transaction according to an embodiment of the present application. As shown in fig. 4, a record transaction includes a transaction identification number, a transaction type, a message receiver, a message sender, load data, and a digital signature. The transaction identification number uniquely identifies a transaction and is determined by a transaction generation rule; the transaction types include the data sharing record transaction, and for the data sharing record transaction generated by the first computing node, the message receiver is the target computing node, the message sender is the first computing node, the load data includes the agreed digital resource quantity, the data amount of the first training data and the data sharing relationship between the first computing node and the target computing node, and the digital signature is generated by the first computing node through its private key.
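For illustration, a minimal Python sketch of the record transaction data structure of fig. 4 and of the construction of a data sharing record transaction is given below. The field names, the transaction type string and the sign callback are hypothetical, and the hash-based transaction identification number merely stands in for whatever transaction generation rule is used.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class RecordTransaction:
    tx_id: str        # transaction identification number (unique)
    tx_type: str      # transaction type, e.g. a data sharing record
    receiver: str     # message receiver: here the target computing node
    sender: str       # message sender: here the first computing node
    payload: Dict     # load data (see below)
    signature: bytes  # digital signature from the sender's private key

def build_data_sharing_record(sender: str, receiver: str,
                              agreed_quantity: int, data_amount: int,
                              sign: Callable[[bytes], bytes]) -> RecordTransaction:
    # Load data: agreed digital resource quantity, data amount of the
    # first training data, and the data sharing relationship.
    payload = {"agreed_digital_resource_quantity": agreed_quantity,
               "shared_data_amount": data_amount,
               "sharing_relation": f"{sender}->{receiver}"}
    body = repr(payload).encode()
    return RecordTransaction(tx_id=hashlib.sha256(body).hexdigest(),
                             tx_type="DATA_SHARING_RECORD",
                             receiver=receiver, sender=sender,
                             payload=payload, signature=sign(body))
```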
By adopting the method provided in the embodiments of the present application, if the first computing node is computation-restricted, it can share the first training data it stores with a target computing node that has the federal learning computing capability for the first training data and satisfies the node reliability condition, and the target computing node can train the model using both its own stored training data and the first training data. This avoids wasting the training data held by computation-restricted computing nodes, improves the utilization of data in the federal learning process, and allows more training data to participate in training, so that the finally trained model has higher performance. Moreover, since the first computing node and the target computing node are nodes in the blockchain network and satisfy the reliability condition, the data interaction is safer.
Further, referring to fig. 5, fig. 5 is a flowchart of a blockchain-based federal learning processing method according to an embodiment of the present application. The method may be performed by a first computing node in a blockchain network (which may be any blockchain node in the blockchain network); the blockchain network further includes a task initiating node (which may be any blockchain node in the blockchain network other than the first computing node) and M second computing nodes (which may be any M blockchain nodes in the blockchain network other than the first computing node and the task initiating node), where M is a positive integer. The method is described below as being performed by the first computing node; the blockchain-based federal learning processing method may include at least the following steps S201-S206:
In step S201, the first computing node receives the federal learning task generated by the task initiating node through the task intelligent contract, and obtains training data associated with the federal learning task as first training data.
Specifically, the implementation of step S201 may refer to step S101, which is not described herein.
Step S202, receiving second training competition requests for the federal learning task, which are sent by the P third computing nodes respectively.
Specifically, among the P third computing nodes, the second training competition request sent by the jth third computing node includes the data size of the jth second training data; the j second training data is the training data which is stored in the j third computing node and is associated with the federal learning task; the j-th third computing node does not have federal learning computing capability for the j-th second training data; j is a positive integer less than or equal to P.
In particular, the federal learning task of the task initiating node may be broadcast in the federal learning market in the blockchain network, so that multiple computing nodes may receive the federal learning task, and among these computing nodes there may be multiple computation-restricted nodes that possess training data associated with the federal learning task but do not have the capability to complete the task; each computation-restricted node may send a training competition request to the computing nodes within its communicable range. For the first computing node, a third computing node refers to a computation-restricted node within its communicable range that possesses second training data but does not have the federal learning computing capability for that second training data.
Step S203, if the first computing node has the federal learning computing capability for the first training data, determining, from the P third computing nodes, a third computing node that competes successfully as a target acquisition node; the first computing node also has the federal learning computing capability for the target training data; the target training data is the second training data corresponding to the target acquisition node.
Specifically, before responding to the P second training competition requests, the first computing node needs to determine whether it has the federal learning computing capability for the first training data, because if the first computing node cannot complete training with its own first training data, it also belongs to the computation-restricted nodes and cannot compete for the second training data of other computation-restricted nodes, namely the third computing nodes. For the implementation process by which the first computing node determines whether it has the federal learning computing capability for the first training data, reference may be made to the optional implementation process of step S101 in the embodiment corresponding to fig. 3, which is not repeated here.
Specifically, if the first computing node has the federal learning computing capability for the first training data, a feasible implementation of determining, from the P third computing nodes, a third computing node that competes successfully as the target acquisition node may be: traversing the second training competition requests for the federal learning task sent by the P third computing nodes respectively; acquiring, from the second training competition request sent by the k-th third computing node, the data amount of the second training data G_k, and determining, according to the data amount of the second training data G_k, a second training duration T_k for that data amount, where k is a positive integer less than or equal to P; adding the second training duration T_k to the first training duration to obtain a total training duration T_k^total; if the total training duration T_k^total is less than or equal to the task deadline, determining that the first computing node has the federal learning computing capability for the second training data G_k, and transmitting second training competition response information R_k to the k-th third computing node; within a competition waiting time period, receiving second training competition confirmation requests sent by Q third computing nodes respectively, where Q is a positive integer less than or equal to P and one second training competition confirmation request includes a target competing digital resource quantity; determining the third computing node corresponding to the second training competition confirmation request with the lowest unit data resource consumption as the third computing node that competes successfully, sending second training competition confirmation response information to that third computing node, and taking it as the target acquisition node; the unit data resource consumption corresponding to a third computing node is determined according to the target competing digital resource quantity corresponding to that third computing node and the data amount of its second training data. Briefly, the first computing node determines the third computing nodes whose stored second training data it could additionally finish training, and then sends second training competition response information to those third computing nodes respectively, in response to their second training competition requests. As can be seen from step S102 above, a third computing node that receives second training competition response information from the first computing node may also receive second training competition response information from other computing nodes, so it does not necessarily send a second training competition confirmation request to the first computing node; only when the first computing node is its pre-selected computing node does the third computing node send the second training competition confirmation request to the first computing node.
Therefore, the first computing node may receive second training competition confirmation requests sent by Q third computing nodes, where Q is a positive integer less than or equal to P, and the first computing node will naturally select, from the Q third computing nodes, the third computing node corresponding to the second training competition confirmation request with the lowest unit data resource consumption as the third computing node that competes successfully, where the unit data resource consumption is the quantity of digital resources required to acquire a unit amount of the second training data. It can be understood that if the first computing node has abundant computing resources, it may select a plurality of third computing nodes from the Q third computing nodes as third computing nodes that compete successfully, as long as the training of the second training data corresponding to all the successfully competed third computing nodes, together with the first training data stored by the first computing node itself, can be completed within the task deadline.
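A short Python sketch of this winner selection is given below, assuming the deadline check of step S203 and the lowest-unit-consumption rule described above; the tuple layout and the duration_for callback are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# One confirmation request: (node id, target competing digital resource
# quantity demanded, data amount of that node's second training data).
ConfirmRequest = Tuple[str, float, float]

def pick_competition_winner(requests: List[ConfirmRequest],
                            first_training_duration: float,
                            duration_for: Callable[[float], float],
                            task_deadline: float) -> Optional[ConfirmRequest]:
    """Choose the request with the lowest unit data resource consumption,
    i.e. digital resources demanded per unit of second training data,
    among those whose extra training still fits the task deadline."""
    feasible = [r for r in requests
                if first_training_duration + duration_for(r[2]) <= task_deadline]
    if not feasible:
        return None
    return min(feasible, key=lambda r: r[1] / r[2])
```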
Step S204, a federal learning task response request carrying training data information is sent to the task initiating node; the training data information is generated from the first training data and the target training data.
Specifically, the training data information includes a training data total amount and a unit data resource consumption amount. The training data total amount is the sum of the data amount corresponding to the first training data and the data amount corresponding to the target training data, and may be calculated by the following formula (3):

ḡ_{i,j} = g_{i,j} + Σ_{k=1}^{μ} γ_{i,k}   formula (3)

where g_{i,j} represents the amount of data that the first computing node e_j itself owns for the task initiating node r_i, γ_{i,k} represents the amount of data that the first computing node e_j acquires from the successfully competed third computing node e_k, and μ represents the number of third computing nodes that competed successfully.

The unit data resource consumption amount may be calculated by the following formula (4):

v̄_{i,j} = (v_{i,j}·g_{i,j} + Σ_{k=1}^{μ} c_{k,j}·γ_{i,k}) / ḡ_{i,j}   formula (4)

where ḡ_{i,j} is the training data total amount corresponding to the first computing node e_j, v_{i,j} represents the initial unit data resource consumption of the first computing node e_j, c_{k,j} represents the unit data resource consumption that the first computing node e_j needs to pay to the successfully competed third computing node e_k, and v̄_{i,j} represents the final unit data resource consumption corresponding to the first computing node e_j.
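The following Python sketch evaluates formulas (3) and (4) as reconstructed above on a small hypothetical example; formula (4)'s blending of the node's own unit cost with the unit costs paid to competed nodes is an assumption based on the symbol definitions.

```python
from typing import List, Tuple

def training_data_total(own_amount: float,
                        acquired_amounts: List[float]) -> float:
    # Formula (3): own data g_ij plus the data gamma_ik acquired from
    # each of the mu successfully competed third computing nodes.
    return own_amount + sum(acquired_amounts)

def unit_data_resource_consumption(own_amount: float, own_unit_cost: float,
                                   acquired: List[Tuple[float, float]]) -> float:
    # Formula (4) as reconstructed (assumption): blend the node's own
    # unit cost v_ij with the unit costs c_kj paid for acquired data,
    # weighted by the respective data amounts.
    total = training_data_total(own_amount, [a for a, _ in acquired])
    paid = sum(amount * unit_cost for amount, unit_cost in acquired)
    return (own_unit_cost * own_amount + paid) / total

# Hypothetical example: 30 units of own data at unit cost 1.0 plus 20
# units acquired at unit cost 1.5 -> total 50, blended unit cost 1.2.
print(training_data_total(30, [20]))                          # 50
print(unit_data_resource_consumption(30, 1.0, [(20, 1.5)]))   # 1.2
```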
Step S205, if a federal learning task issuing instruction sent by the task initiating node is received, sending a second data sharing request to the target acquisition node; the federal learning task issuing instruction includes the initial model associated with the federal learning task.
Specifically, in addition to the request from the first computing node, the task initiating node may receive federal learning task response requests sent by other computable nodes; from the computable nodes corresponding to the received federal learning task response requests, the task initiating node may select some of them and send them the federal learning task issuing instruction.
Step S206, receiving the target training data sent by the target acquisition node according to the second data sharing request, and training the initial model according to the first training data and the target training data to obtain second branch training update parameters; the task initiating node is also used for globally updating the initial model through H branch training updating parameters; the H branch training update parameters include the second branch training update parameter, H being a positive integer.
Optionally, if random differential privacy noise is received together with the federal learning task issuing instruction sent by the task initiating node, the second branch training update parameter is encrypted according to the received random differential privacy noise to obtain a target encrypted branch training update parameter, and the target encrypted branch training update parameter is sent to the task initiating node. The task initiating node is further used for performing information aggregation processing on S encrypted branch training update parameters to obtain an encrypted aggregate training update parameter, where S is a positive integer; the task initiating node is further used for adding the encrypted aggregate training update parameter and the differential privacy key to obtain the aggregate training update parameter, and globally updating the initial model according to the aggregate training update parameter; the S encrypted branch training update parameters include the target encrypted branch training update parameter. The encryption formula used for encrypting the second branch training update parameter is:

Ŵ_{i,j} = W′_{i,j} + δ_j   formula (5)

where W_i represents the initial parameters corresponding to the initial model, W′_{i,j} represents the second branch training update parameter obtained after training, namely the parameters corresponding to the initial model after training, δ_j represents the random differential privacy noise received by the computing node, and Ŵ_{i,j} represents the target encrypted branch training update parameter.
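A one-line Python sketch of formula (5) follows; the array representation of the parameters is an assumption.

```python
import numpy as np

def encrypt_branch_update(trained_params: np.ndarray,
                          dp_noise: np.ndarray) -> np.ndarray:
    # Formula (5): add the received random differential privacy noise
    # delta_j to the branch training update parameters before upload.
    return trained_params + dp_noise
```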
By adopting the method provided in the embodiments of the present application, if the computing resources of the first computing node are sufficient, the first computing node can compete for the second training data of the third computing nodes, so that more training data can be used when completing the federal learning task of the task initiating node, the trained model has higher performance, and the waste of training data held by computation-restricted computing nodes is avoided.
Further, referring to fig. 6, fig. 6 is a flowchart of a blockchain-based federal learning processing method according to an embodiment of the present application. The method may be performed by a task initiating node in a blockchain network (which may be any blockchain node in the blockchain network); the blockchain network further includes W computing nodes (which may be any W blockchain nodes in the blockchain network other than the task initiating node), where W is a positive integer. The method is described below as being performed by the task initiating node; the blockchain-based federal learning processing method may include at least the following steps S301-S304:
Step S301, a task initiating node generates a federal learning task through a task intelligent contract, and broadcasts the federal learning task to the W computing nodes.
Specifically, if a computation-restricted node exists among the W computing nodes, the computation-restricted node is used for broadcasting a training competition request for shared training data to the communicable computing nodes corresponding to the computation-restricted node, and is further used for determining, among the communicable computing nodes, a target communicable computing node that competes successfully and satisfies the node reliability condition; the shared training data is the training data stored in the computation-restricted node and associated with the federal learning task; the computation-restricted node does not have the federal learning computing capability for the shared training data. For the specific implementation by which the computation-restricted node determines the target communicable computing node, reference may be made to the process by which the first computing node determines the target computing node in step S102 in the embodiment corresponding to fig. 3, which is not repeated here.
Step S302, receiving federal learning task response requests sent by Y capability computing nodes respectively; y is a positive integer; the Y capability computing nodes include the target communicable computing node, and the Y capability computing nodes do not include the computation restricted node.
Specifically, a federal learning task response request includes a unit data resource consumption amount and a training data total amount. For the process of determining the corresponding unit data resource consumption amount and the training data total amount by each capability calculation node, reference may be made to the description of determining the training data total amount and the unit data resource consumption amount by the first calculation node in step S204 in the embodiment corresponding to fig. 5.
Step S303, determining X training computing nodes from the Y capability computing nodes according to the Y federal learning task response requests, and sending federal learning task issuing instructions carrying the initial model associated with the federal learning task to the X training computing nodes, so that the X training computing nodes train the initial model according to their respective available training data to obtain branch training update parameters; X is a positive integer; if the X training computing nodes include the target communicable computing node, the available training data corresponding to the target communicable computing node includes the shared training data and the training data already stored in the target communicable computing node.
Specifically, a feasible implementation of determining X training computing nodes from the Y capability computing nodes according to the Y federal learning task response requests may be: obtaining the digital resource predicted consumption corresponding to the federal learning task; sorting the Y unit data resource consumptions in ascending order according to the Y federal learning task response requests to obtain Y sorted unit data resource consumptions; traversing the Y sorted unit data resource consumptions, sequentially obtaining the e-th unit data resource consumption, and determining the e-th node digital resource predicted consumption according to the e-th unit data resource consumption and the training data total amount corresponding to it, where e is a positive integer less than or equal to Y; adding the capability computing node corresponding to the e-th unit data resource consumption to a node pre-selection queue; if the e-th node digital resource predicted consumption is smaller than the current remaining digital resource predicted consumption, subtracting the e-th node digital resource predicted consumption from the current remaining digital resource predicted consumption to obtain an updated remaining digital resource predicted consumption, and continuing the traversal to obtain the (e+1)-th unit data resource consumption; if the e-th node digital resource predicted consumption is greater than or equal to the current remaining digital resource predicted consumption, or e is equal to Y, stopping the traversal and taking the X capability computing nodes in the node pre-selection queue as the training computing nodes. For ease of understanding, assume that the digital resource predicted consumption of the task initiating node is 100, capability computing node A has a training data total amount of 20 and a unit data resource consumption of 2, capability computing node B has a training data total amount of 40 and a unit data resource consumption of 1, and capability computing node C has a training data total amount of 50 and a unit data resource consumption of 1.5; the three unit data resource consumptions in ascending order are then 1, 1.5 and 2. The task initiating node first selects capability computing node B, determines that its node digital resource predicted consumption is 1 × 40 = 40, and adds it to the node pre-selection queue; since 40 is smaller than 100, 40 is subtracted from 100 to obtain a new remaining digital resource predicted consumption of 60. The task initiating node then considers capability computing node C and determines that its node digital resource predicted consumption is 50 × 1.5 = 75; since 75 is greater than 60, capability computing node C is added to the node pre-selection queue and the traversal ends. At this time, capability computing node B and capability computing node C in the node pre-selection queue are the training computing nodes.
It should be noted that, for capability computing node B, the task initiating node can allocate the digital resources corresponding to its node digital resource predicted consumption, because the budget digital resources of the task initiating node are still sufficient at that point; however, for capability computing node C, the task initiating node can only allocate the digital resources corresponding to the remaining digital resource predicted consumption of 60, so when capability computing node C completes the federal learning task issuing instruction issued by the task initiating node, it only acquires training data in the amount corresponding to the remaining digital resource predicted consumption of 60 for training.
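The greedy selection and the capped payment for the last admitted node can be reproduced with the following Python sketch, using the worked numbers above; the tuple layout is hypothetical.

```python
from typing import List, Tuple

def select_training_nodes(responses: List[Tuple[str, float, float]],
                          budget: float) -> List[Tuple[str, float]]:
    """Greedy selection of step S303. responses holds one tuple per
    federal learning task response request: (node id, unit data resource
    consumption, training data total amount). Returns the pre-selection
    queue as (node id, allocated digital resources); the last admitted
    node may only receive the remaining budget."""
    queue: List[Tuple[str, float]] = []
    remaining = budget
    for node_id, unit_cost, data_total in sorted(responses, key=lambda r: r[1]):
        predicted = unit_cost * data_total  # node digital resource consumption
        if predicted < remaining:
            queue.append((node_id, predicted))
            remaining -= predicted
        else:
            # Budget boundary reached: admit the node with a capped
            # payment and stop the traversal.
            queue.append((node_id, remaining))
            break
    return queue

# The worked example above: A=(2, 20), B=(1, 40), C=(1.5, 50), budget 100.
print(select_training_nodes([("A", 2, 20), ("B", 1, 40), ("C", 1.5, 50)], 100))
# -> [('B', 40), ('C', 60)]
```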
Optionally, X-1 training computing nodes are sequentially acquired from the node pre-selection queue, and the node digital resource predicted consumption corresponding to each of the X-1 training computing nodes is taken as the transaction digital resource consumption corresponding to that training computing node. The X-th training computing node is then acquired from the node pre-selection queue; if the node digital resource predicted consumption corresponding to the X-th training computing node is smaller than the target remaining digital resource predicted consumption, the node digital resource predicted consumption corresponding to the X-th training computing node is taken as its transaction digital resource consumption, where the target remaining digital resource predicted consumption is the value obtained by subtracting from the digital resource predicted consumption the node digital resource predicted consumptions corresponding to the X-1 training computing nodes respectively; if the node digital resource predicted consumption corresponding to the X-th training computing node is greater than or equal to the target remaining digital resource predicted consumption, the target remaining digital resource predicted consumption is taken as the transaction digital resource consumption corresponding to the X-th training computing node. Each training computing node and its corresponding transaction digital resource consumption are associated and packaged into a digital resource record transaction, which is cached in the record transaction pool. When a target model satisfying the training conditions indicated by the federal learning task is obtained, the digital resources corresponding to the associated transaction digital resource consumption are sent to each training computing node respectively, and the digital resource record transactions undergo consensus uplink. Referring again to fig. 4, the transaction types shown in fig. 4 further include the digital resource record transaction; the task initiating node generates one digital resource record transaction for each training computing node, and in the data structure of the digital resource record transaction, the message receiver is the training computing node, the message sender is the task initiating node, the load data is the training computing node and its corresponding transaction digital resource consumption, and the digital signature is generated by the task initiating node through its private key.
Step S304, globally updating the initial model according to the aggregate training updating parameters until a target model meeting training conditions indicated by the federal learning task is obtained; the aggregate training update parameter is generated based on the X branch training update parameters.
Specifically, it can be known from the above that, after the task initiating node receives the X branch training update parameters, the task initiating node performs information aggregation processing on the X branch training update parameters to obtain an aggregated training update parameter.
Specifically, the training conditions indicated by the federal learning task generally include that the iterative training reaches the target number of rounds or that the parameter accuracy reaches the target accuracy, so, if the parameter accuracy of the aggregate training update parameter does not reach the target accuracy, the task initiating node may issue the globally updated initial model to the X training computing nodes, so that the X training computing nodes perform a new round of training on the globally updated initial model according to the available training data, respectively.
Optionally, for the security of data transmission, the task initiating node may generate a random differential privacy noise set, which includes the random differential privacy noise corresponding to each training computing node; the X random differential privacy noises are added to obtain a random differential privacy total noise, and the opposite number of the random differential privacy total noise is taken as the differential privacy key. Then, while sending the federal learning task issuing instruction carrying the initial model associated with the federal learning task to the X training computing nodes, the corresponding random differential privacy noises are sent to the X training computing nodes respectively, so that the X training computing nodes respectively encrypt their branch training update parameters according to the received random differential privacy noise to obtain encrypted branch training update parameters.
Optionally, at this time, the task initiating node determines the aggregate training update parameter by performing information aggregation processing on the encrypted branch training update parameters returned by the X training computing nodes to obtain the encrypted aggregate training update parameter, and adding the encrypted aggregate training update parameter and the differential privacy key to obtain the aggregate training update parameter. A feasible implementation of performing information aggregation processing on the encrypted branch training update parameters returned by the X training computing nodes to obtain the encrypted aggregate training update parameter may be: acquiring a training test set; verifying the encrypted branch training update parameters returned by the X training computing nodes respectively according to the training test set to obtain X update precisions; taking the encrypted branch training update parameters returned by the training computing nodes whose update precision satisfies a training precision condition as target encrypted branch training update parameters; and performing information aggregation processing on the target encrypted branch training update parameters to obtain the encrypted aggregate training update parameter.
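The noise and key mechanism described above can be illustrated end to end with the following Python sketch. It assumes the information aggregation processing is an elementwise sum, so that the differential privacy key (the opposite number of the total noise) cancels the noise exactly; the names and the Gaussian noise choice are hypothetical.

```python
import numpy as np

def run_protected_round(rng: np.random.Generator, branch_updates, dim: int):
    """Sketch of the noise/key mechanism: one noise vector per training
    computing node, the key being the opposite number of the total noise.

    branch_updates: list of callables, each returning one (unmasked)
    branch training update parameter vector of length dim."""
    noises = [rng.normal(size=dim) for _ in branch_updates]
    dp_key = -sum(noises)  # differential privacy key

    # Each training computing node masks its update (formula (5)).
    masked = [fn() + delta for fn, delta in zip(branch_updates, noises)]

    # Assumed information aggregation processing: elementwise sum; the
    # key then removes the total noise, leaving the true aggregate.
    encrypted_aggregate = sum(masked)
    return encrypted_aggregate + dp_key

rng = np.random.default_rng(0)
updates = [lambda: np.ones(3), lambda: 2 * np.ones(3)]
print(run_protected_round(rng, updates, 3))  # ~ [3. 3. 3.]
```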
Optionally, the X training computing nodes and the X update precisions may be packaged into an update precision record transaction, and consensus processing for the update precision record transaction may be initiated to the blockchain network, so that each node in the blockchain network updates its stored precision record table according to the X update precisions when the update precision record transaction passes consensus; the precision record table is used for recording each node in the blockchain network and the current update precision corresponding to each node. The data structure of the update precision record transaction may also refer to the data structure of the record transaction in the embodiment corresponding to fig. 4, and is not described herein again.
Alternatively, consistency of the distributed ledger may be guaranteed based on a contribution-driven consensus mechanism. When the task initiating node serves as the executing node of a certain round to determine which node produces the new block, it may obtain the precision record table according to a vote confirmation request; determine the contribution degree corresponding to each node according to the current update precision corresponding to each node in the precision record table; determine the resource benefit corresponding to each node according to the contribution degree weight, the contribution degree corresponding to each node, and the digital resource possession duration corresponding to each node; and send ballots to each node according to the proportion between the resource benefits corresponding to the nodes, so that each node votes with the ballots it has received. If a computing restricted node exists among the nodes, the computing restricted node votes its received ballots to its target communicable computing node. The task initiating node then receives the number of votes of each node after voting, determines the block-out weight acquisition proportion corresponding to each node according to its number of votes, and randomly determines the block-out node from the nodes according to the block-out weight acquisition proportions; the block-out node has the block-out weight for the new block. Reward digital resources are sent to the block-out node so that the block-out node distributes the reward digital resources according to the ballot composition proportion; the ballot composition proportion is the number ratio, among the ballots received by the block-out node, between the ballots sent by the task initiating node and the ballots voted to it by computing restricted nodes.
The contribution degree corresponding to each node may be determined, according to the current update precision corresponding to each node in the precision record table, using the following formula:
ψ(ε) = -ε·log₂(ε) - (1-ε)·log₂(1-ε)    formula (6)
Wherein ε is the current update precision corresponding to the node; when ε is between 0 and 0.5, the contribution degree obtained by the node's current update is 1-ψ(ε), and when ε is between 0.5 and 1, the contribution degree obtained by the node's current update is ψ(ε)-1.
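For ease of understanding, formula (6) and the contribution degree rule above can be written as a short Python sketch (illustrative only; the handling of the boundary values ε ∈ {0, 0.5, 1} is an assumption, since the text does not specify it):

```python
import math

def psi(eps):
    # Formula (6): psi(eps) = -eps*log2(eps) - (1-eps)*log2(1-eps)
    if eps <= 0.0 or eps >= 1.0:
        return 0.0
    return -eps * math.log2(eps) - (1 - eps) * math.log2(1 - eps)

def contribution_degree(eps):
    # Contribution degree obtained by a node's current update precision eps.
    if 0.0 < eps < 0.5:
        return 1.0 - psi(eps)
    if 0.5 < eps < 1.0:
        return psi(eps) - 1.0
    return 0.0  # sketch assumption for the unspecified boundary values
```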
The resource benefit corresponding to each node is determined according to the contribution degree weight, the contribution degree corresponding to each node, and the digital resource possession duration corresponding to each node through a weight benefit calculation formula (rendered as an image in the original publication and not reproduced here), wherein CoinAge represents the digital resource possession duration corresponding to the node, namely the number of digital resources held by the device multiplied by the holding time, and l_{i,j} represents the contribution degree obtained by node e_j in the federal learning task of task initiator r_i.
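Because the weight benefit calculation formula itself is not reproduced in this text, the sketch below only illustrates the kind of combination it describes: a weighted sum of CoinAge and the accumulated contribution degrees is assumed purely for illustration, followed by the proportional, randomized determination of the block-out node described above.

```python
import random

def resource_benefit(coin_age, contribution_degrees, w=0.5):
    # Sketch assumption: a weighted sum standing in for the (unreproduced)
    # weight benefit calculation formula. w is the contribution degree weight,
    # coin_age is the number of digital resources held multiplied by the
    # holding time, and contribution_degrees are the l_{i,j} values.
    return w * sum(contribution_degrees) + (1 - w) * coin_age

def pick_block_out_node(nodes, benefits, seed=None):
    # Ballots are sent in proportion to each node's resource benefit, and the
    # block-out node is then drawn at random in proportion to its ballots
    # (benefits are assumed nonnegative here).
    rng = random.Random(seed)
    return rng.choices(nodes, weights=benefits, k=1)[0]
```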
By adopting the method provided by the embodiments of the application, related transactions are recorded using the blockchain, market order is normalized using intelligent contracts, and the influence of noise on model precision is eliminated by adopting an improved, cancellable differential privacy noise encryption algorithm, thereby effectively preventing attacks from malicious devices and ensuring that all federal learning tasks can safely carry out model training.
Further, referring to fig. 7, fig. 7 is a flowchart of a blockchain-based federal learning processing method according to an embodiment of the present application. The method may be implemented jointly by the Internet of Things devices in the federal learning market (BFL market) in the blockchain network; any one Internet of Things device in the BFL market may serve as the task initiating node, and the other Internet of Things devices may then be regarded as computing nodes. The following describes an example in which the method is executed jointly by the task initiating node and the computing nodes; the blockchain-based federal learning processing method may include at least the following steps S401-S405:
in step S401, the task initiating node generates a federal learning task, and broadcasts the federal learning task in the BFL market.
Specifically, the task initiating node requests to initiate federal learning, and invokes the intelligent contract to create a federal learning task. The task initiation node may then broadcast the created federal learning task in the BFL market, including specifically the type of training data required for the task, the task deadline, and the global model size.
Step S402, a computing node receives the federal learning task sent by the task initiating node and judges whether it has the training data required by the federal learning task and whether it has the capability to complete the federal learning task. A computing restricted node that has the training data but is unable to complete the federal learning task may initiate data competition within its communication range and share the data with surrounding trusted devices; computable nodes capable of completing the federal learning task participate in competing for the data resources, and the computable node that wins the competition can acquire the training data of the computing restricted node when participating in the federal learning task.
Specifically, it is understood that both the computing restricted node and the computable node are computing nodes.
Specifically, the data competition initiated by the computing restricted node may be completed by using a trust-enhanced collaborative learning strategy (TCL) based on data sharing; the TCL algorithm is a specific embodiment of step S102 in the embodiment corresponding to fig. 3. For ease of understanding, please refer to fig. 8; fig. 8 is a flowchart of a data competition method according to an embodiment of the present application. As shown in fig. 8, the data competition includes the following steps:
in step S501, the computing restricted node initiates data contention.
Specifically, reference may be made to the implementation procedure, described in the embodiment corresponding to fig. 3, in which the first computing node that does not have federal learning computing capability for the first training data broadcasts the first training competition request to the M second computing nodes; the computing restricted node may broadcast its data competition request within its communication range, the request including the data amount of the training data owned by the computing restricted node and the task deadline.
In step S502, the computable nodes participate in the competition and send their numbers of competing digital resources.
Specifically, after receiving the data competition request, a computable node judges, according to the data amount of the competed training data and the task deadline, whether it can complete training on the competed training data within the specified time; if so, it bids to the computing restricted node, that is, it sends training competition response information carrying its number of competing digital resources.
In step S503, the computing restricted node obtains a set of competing digital resource amounts, setting n=1.
Specifically, after broadcasting the data competition request, the computing restricted node may receive bids within a fixed period of time; after the fixed period ends, no new bids are accepted. The computing restricted node needs to reject the numbers of competing digital resources from bidders whose reliability is lower than the reliability threshold, then selects, from the remaining numbers of competing digital resources, the computable node corresponding to the highest number of competing digital resources as the potential winner (i.e., the pre-selected node) of the current data competition, calculates the agreed number of digital resources for the current data competition according to the next highest number of competing digital resources, and sends a confirmation request carrying that agreed number to the potential winner. First, the computing restricted node puts the numbers of competing digital resources received within the fixed period into the competing digital resource number set, and then sets n=1.
In step S504, the computing restricted node takes the nth number of competing digital resources from the set of competing digital resources.
In step S505, the computing restricted node determines whether the reliability of the n-th computable node is greater than the reliability threshold; if so, step S506 is executed; otherwise, n++ and step S504 is executed.
Specifically, the determination by the computing restricted node of the reliability of the n-th computable node may refer to the determination of the reliability of competing node Z_i by the first computing node in step S102 in the embodiment corresponding to fig. 3, and is not described in detail herein. In addition, n++ means that n is incremented by one.
Step S506, the computing restricted node determines whether the n-th number of competing digital resources is greater than the current highest number of competing digital resources, if the n-th number of competing digital resources is greater than the current highest number of competing digital resources, step S507 is executed; otherwise, step S508 is performed.
Step S507, the computing restricted node updates the highest number of competing digital resources and the next highest number of competing digital resources.
Specifically, the computing restricted node updates the next highest number of digital resources to the current highest number of competing digital resources, and then updates the highest number of digital resources to the nth number of competing digital resources.
Step S508, the computing restricted node determines whether n is equal to the total number of competing digital resource amounts in the competing digital resource number set; if so, step S509 is executed; otherwise, n++ and step S504 is executed.
Specifically, n being equal to the total number of competing digital resource amounts in the competing digital resource number set indicates that the computing restricted node has finished traversing the competing digital resource number set.
Step S509, the computing limited node sends a confirmation request to the computable node corresponding to the highest number of competing digital resources.
Specifically, the confirmation request is actually the first training competition confirmation request in the embodiment corresponding to fig. 3; the confirmation request carries an agreed number of digital resources, which is generally generated according to the next highest number of competing digital resources.
Step S510, the computing restricted node determines whether a confirmation response from the computable node corresponding to the highest number of competing digital resources is received; if not received, step S501 is executed again or the flow ends; if received, step S511 is executed.
Specifically, the acknowledgement response is actually the first training contention acknowledgement response information described in the embodiment corresponding to fig. 3.
In step S511, the computing restricted node stores the data race transaction record in the blockchain (temporarily inactive).
Specifically, a feasible process for the computing restricted node to store the data competition transaction record in the blockchain may refer to the generation of the data sharing record transaction described in step S103 in the embodiment corresponding to fig. 3. At this time, the data sharing record transaction may be cached in the record transaction pool without yet undergoing consensus uplink.
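For ease of understanding, the winner-selection portion of this flow (steps S503 to S509) is sketched below in Python. Settling at the next highest bid follows the description above; the single-bidder fallback is an assumption of the sketch.

```python
def select_data_competition_winner(bids, reliability, threshold):
    """bids: computable node id -> number of competing digital resources;
    reliability: computable node id -> reliability score."""
    # Reject bidders whose reliability does not exceed the reliability threshold.
    trusted = {n: b for n, b in bids.items() if reliability[n] > threshold}
    if not trusted:
        return None, 0
    ranked = sorted(trusted.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]                 # potential winner (pre-selected node)
    # Agreed number of digital resources, generated from the next highest bid;
    # with a single trusted bidder the sketch falls back to that bidder's bid.
    agreed = ranked[1][1] if len(ranked) > 1 else ranked[0][1]
    return winner, agreed
```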
In step S403, the task initiation node assigns the federal learning task to the appropriate training computing node.
Specifically, a training computing node is a computable node selected by the task initiating node. When allocating the task, the task initiating node needs to ensure that as much data as possible is obtained under the given budget, and each training computing node needs to possess training data for the allocated federal learning task.
Specifically, when the task initiating node performs federal learning task allocation, a quality-oriented task allocation algorithm (QTA) may be implemented in combination with a greedy policy; the QTA algorithm is a specific embodiment of step S303 in the embodiment corresponding to fig. 6. For ease of understanding, please refer to fig. 9; fig. 9 is a flowchart of a task allocation method according to an embodiment of the present application. As shown in fig. 9, the task allocation method includes the following steps:
in step S601, each computable node determines and sends its data holding amount and unit data resource consumption amount.
Specifically, the process of determining the own data holding amount and the unit data resource consumption amount by the computing node may refer to the description of determining the total amount of training data and the unit data resource consumption amount by the first computing node in step S204 in the embodiment corresponding to fig. 5.
In step S602, the task initiating node obtains the unit data resource consumption set, sorts it in ascending order, and sets m=1.
In particular, in order to save budget, the task initiating node should start selecting from the computable node with the lowest unit data resource consumption; therefore, the unit data resource consumptions need to be sorted from low to high.
In step S603, the task initiating node takes the mth computable node to join the node pre-selection queue.
In step S604, the task initiating node determines the node digital resource consumption corresponding to the mth computable node, and uses the node digital resource consumption corresponding to the mth computable node as the transaction digital resource consumption.
Specifically, the node digital resource consumption = data holding amount × unit data resource consumption amount. The transaction digital resource consumption refers to the number of digital resources that the task initiating node will ultimately send to the corresponding computable node.
Step S605, the task originating node updates the remaining digital resource prediction consumption amount.
Step S606, the task initiating node determines whether the predicted consumption of the residual digital resources is greater than the consumption of the digital resources of the node corresponding to the (m+1) th computable node; if yes, m++, go to step S603; if not, step S607 is performed.
In step S607, the task initiating node takes the predicted consumption of the remaining digital resources as the consumption of the transaction digital resources corresponding to the m+1th computable node, and adds the m+1th computable node to the node pre-selection queue.
In step S608, the task originating node stores the task allocation transaction record in the blockchain (validated).
Specifically, the process by which the task initiating node stores the task allocation transaction record in the blockchain may refer to the process from generation to consensus uplink of the digital resource record transaction in step S303 in the embodiment corresponding to fig. 6; the task allocation transaction record takes effect when the consensus uplink of the digital resource record transaction succeeds.
In step S609, the data competition transaction record corresponding to the computable node in the node pre-selection queue takes effect.
In step S610, the task initiation node determines a set of training computing nodes.
Specifically, the computable nodes in the node pre-selection queue are training computing nodes finally determined by the task initiating node.
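The greedy allocation of steps S601 to S610 can be sketched as follows (illustrative only; the node tuples and the budget variable are assumptions of the sketch):

```python
def allocate_federal_learning_task(nodes, budget):
    """nodes: list of (node_id, unit_data_resource_consumption, data_holding_amount);
    budget: the digital resource predicted consumption of the task."""
    queue, remaining = [], budget
    # Step S602: sort ascending by unit data resource consumption.
    for node_id, unit_cost, data_amount in sorted(nodes, key=lambda n: n[1]):
        cost = unit_cost * data_amount          # node digital resource consumption (S604)
        if remaining > cost:
            queue.append((node_id, cost))       # pay the predicted consumption in full
            remaining -= cost                   # update the remaining prediction (S605)
        else:
            queue.append((node_id, remaining))  # last node is paid the remainder (S607)
            break
    return queue                                # the training computing nodes (S610)
```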
Step S404, after the allocation of the federal learning task is completed, the task initiating node sends data such as the initial model parameters and the model training hyper-parameters to the training computing nodes; each training computing node completes training of the initial model locally and sends its local update information to the task initiating node; the task initiating node aggregates the local update information and then performs a global update on the initial model, iterating repeatedly until the accuracy requirement is met or the initially set number of iteration rounds is reached.
Specifically, considering the security of data during transmission, a simple noise-cancelling encrypted model training (EMT) scheme can be provided based on a differential privacy encryption algorithm to realize secure data transmission. The EMT architecture is a specific embodiment of step S304 in the embodiment corresponding to fig. 6. For ease of understanding, please refer to fig. 10; fig. 10 is a flowchart of encryption model training provided in an embodiment of the present application. As shown in fig. 10, the encryption model training includes:
step S701, a task initiating node sends a federal learning task issuing instruction and random differential privacy noise to a training computing node.
Specifically, the task initiating node may generate one random differential privacy noise for each training computing node in the training computing node set. The federal learning task issuing instruction includes the model parameters and the model training hyper-parameters corresponding to the initial model.
In step S702, the training computing node downloads the initial model and performs training using the available training data.
Specifically, the available training data includes shared training data acquired from the computation-constrained node and locally stored training data.
Step S703, the training computing node obtains local update information through training and encrypts it with the random differential privacy noise to obtain encrypted local update information.
Wherein the local update information is the branch training update parameter described above, and the encrypted local update information is the encrypted branch training update parameter described above; the encryption process may refer to formula (5) above.
Step S704, the task originating node receives the encrypted local update information and verifies it.
Specifically, after receiving all the encrypted local update information, the task initiating node sequentially verifies each piece of encrypted local update information using a random test set, and records the accuracy obtained by each piece of encrypted local update information on the random test set, namely its update precision, into the corresponding transaction as a measure of each training computing node's local update quality.
Step S705, a task initiating node determines whether the local update precision meets the standard; if the standard is reached, executing step S706; if not, step S704 is performed.
Step S706, the task initiating node adds the up-to-standard training computing nodes into the candidate aggregation queue.
In step S707, the task initiation node determines whether a task deadline is reached. If yes, go to step S708; if not, step S704 is performed.
In step S708, the task initiating node aggregates the local update information and completes the global update of the initial model.
Step S709, the task initiating node determines whether the globally updated initial model meets the precision requirement or reaches the initially set number of iteration rounds; if neither is satisfied, step S701 is executed; if one or both are satisfied, step S710 is executed.
In step S710, the task initiating node stores the federal learning task transaction record in the blockchain.
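The server-side verification loop of steps S704 to S708 might look like the following sketch, where evaluate is a caller-supplied callback, an assumption of this sketch, that measures an encrypted local update's accuracy on the random test set:

```python
def build_candidate_aggregation_queue(encrypted_updates, evaluate, accuracy_bar):
    """encrypted_updates: training computing node id -> encrypted local update.
    Returns the candidate aggregation queue and the update precisions that are
    recorded into the update precision record transactions."""
    candidate_queue, update_precisions = [], {}
    for node_id, update in encrypted_updates.items():
        precision = evaluate(update)          # accuracy on the random test set (S704)
        update_precisions[node_id] = precision
        if precision >= accuracy_bar:         # local update meets the standard (S705)
            candidate_queue.append(node_id)   # join the candidate aggregation queue (S706)
    return candidate_queue, update_precisions
```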
Step S405, after the federation learning task is completed, the task initiating node pays corresponding rewards to the training computing nodes which successfully participate in the local model training, and notifies the intelligent contract to cancel the federation learning task.
Specifically, after the federal learning task is completed, the task initiating node pays, according to the corresponding transaction records, the reward digital resources to those training computing nodes that submitted local updates before the deadline, and does not pay those training computing nodes that did not complete training on time.
By adopting the method provided by the embodiments of the application, the BFL market is organized using blockchain technology, and every step uses intelligent contracts to bind each participant, ensuring the fairness of data transactions and the security of federal learning. Adopting the data-sharing-based trust-enhanced collaborative learning strategy (TCL) and the quality-oriented task allocation algorithm (QTA) can greatly improve data utilization in the BFL market: by means of TCL, a computing restricted node that is too weak to undertake a federal learning task can send its data to surrounding trusted computable nodes, which protects data privacy and effectively prevents data waste. Moreover, combined with the competitive mechanism of the market, the value of a computing restricted node's own data is often below the average market level; and in QTA, the task initiating node preferentially selects computable nodes with low bids for training during task allocation, so TCL and QTA enable the task initiating node to obtain more data under a fixed budget. Meanwhile, the EMT encryption architecture is adopted for model training, ensuring the security of data transmission.
Referring to fig. 11, fig. 11 is a schematic structural diagram of a blockchain-based federal learning processing device according to an embodiment of the present application. The federal learning processing device may be a computer program (including program code) running on a computer device; for example, the federal learning processing device is application software. The device can be used for executing the corresponding steps in the federal learning processing method provided by the embodiments of the present application. As shown in fig. 11, the federal learning processing device 1 may include: a task receiving module 101, a competition broadcasting module 102, a node determining module 103, and a data sharing module 104.
The task receiving module 101 is configured to receive a federation learning task generated by a task initiating node through a task intelligent contract, and acquire training data associated with the federation learning task as first training data;
a competition broadcasting module 102, configured to broadcast a first training competition request for a federal learning task to M second computing nodes if the first computing node does not have federal learning computing capability for the first training data;
a node determining module 103, configured to determine, as a target computing node, a second computing node that is successful in competition and satisfies a node reliability condition, from among the M second computing nodes; the target computing node has federal learning computing capability for the first training data;
The data sharing module 104 is configured to send the first training data to the target computing node if the first data sharing request sent by the target computing node is received, so that the target computing node trains an initial model associated with the federation learning task according to the first training data and stored training data associated with the federation learning task, and obtains a first branch training update parameter; the task initiating node is also used for globally updating the initial model through N branch training updating parameters; the N branch training update parameters include a first branch training update parameter, N being a positive integer.
The specific implementation manners of the task receiving module 101, the competition broadcasting module 102, the node determining module 103, and the data sharing module 104 may refer to the specific descriptions of step S101 to step S103 in the embodiment corresponding to fig. 3, and are not described herein again.
Wherein the node determining module 103 comprises: a contention information receiving unit 1031, a trusted node screening unit 1032, a node pre-selection unit 1033, and a contention confirmation unit 1034.
A contention information receiving unit 1031, configured to receive, in a contention period, first training contention response information sent by each of the L contention nodes; a first training competition response message comprises a competition digital resource quantity; the L competing nodes are nodes with federal learning computing capacity aiming at the first training data in the M second computing nodes; l is a positive integer less than or equal to M;
A trusted node screening unit 1032, configured to reject, from the L competing nodes, competing nodes that do not meet the node reliability condition, to obtain S trusted competing nodes; s is a positive integer less than or equal to L;
a node pre-selection unit 1033, configured to obtain, from the S trusted competition nodes, the trusted competition node having the highest number of competing digital resources, as a pre-selected computing node;
a contention confirmation unit 1034 for sending a first training contention confirmation request to the pre-selected computing node;
the contention confirmation unit 1034 is further configured to determine that the pre-selected computing node is the target computing node if the first training contention confirmation response message sent by the pre-selected computing node according to the first training contention confirmation request is received within the confirmation period.
The specific implementation manner of the contention information receiving unit 1031, the trusted node screening unit 1032, the node pre-selecting unit 1033, and the contention confirmation unit 1034 may be referred to the specific description of step S102 in the embodiment corresponding to fig. 3, and will not be repeated here.
Wherein the L competing nodes include a competing node Z_i, i being a positive integer less than or equal to L;
the federal learning processing device 1 further includes: the trusted node determination module 105.
A trusted node determining module 105, configured to obtain a node trusted probability table;
the trusted node determining module 105 is further configured to query the node trusted probability table for the trusted probability of the competing node Z_i;

the trusted node determining module 105 is further configured to determine the reliability of the competing node Z_i according to the connection relationship with the competing node Z_i and the trusted probability of the competing node Z_i;

the trusted node determining module 105 is further configured to determine that the competing node Z_i does not satisfy the node reliability condition if the reliability of the competing node Z_i is smaller than the reliability threshold;

the trusted node determining module 105 is further configured to determine that the competing node Z_i satisfies the node reliability condition if the reliability of the competing node Z_i is greater than or equal to the reliability threshold.
The specific implementation manner of the trusted node determining module 105 may refer to the optional description of step S102 in the embodiment corresponding to fig. 3, which is not described herein.
The federal learning task comprises a task period and initial model information;
the federal learning processing device 1 further includes: the computing power determination module 106.
The computing capability determining module 106 is configured to determine a first required computing resource corresponding to the federal learning task according to the initial model information;
The computing capability determining module 106 is further configured to determine that the first computing node does not have federal learning computing capability for the first training data if the first required computing resource is greater than the available computing resource; the available computing resources refer to idle computing resources of the first computing node;
the computing capability determining module 106 is further configured to determine, according to the data amount of the first training data, a first training duration corresponding to federal learning computation for the first training data if the first required computing resource is less than or equal to the available computing resource;
the computing capability determining module 106 is further configured to determine that the first computing node has federal learning computing capability for the first training data if the first training duration is less than or equal to the task deadline;
the computing power determining module 106 is further configured to determine that the first computing node does not have federal learning computing power for the first training data if the first training time period is longer than the task time period.
The specific implementation manner may be referred to the optional description of step S101 in the embodiment corresponding to fig. 3, which is not described herein.
The block chain network further comprises P third computing nodes, wherein P is a positive integer;
The federal learning processing device 1 further includes: a competition request receiving module 107, an acquisition node determining module 108, a task response module 109, a data acquisition module 110, and a model training module 111.
A competition request receiving module 107, configured to receive the second training competition requests for the federal learning task sent respectively by the P third computing nodes; the second training competition request sent by the j-th third computing node includes the data amount of the j-th second training data; the j-th second training data is the training data stored in the j-th third computing node and associated with the federal learning task; the j-th third computing node does not have federal learning computing capability for the j-th second training data; j is a positive integer less than or equal to P;
the acquiring node determining module 108 is configured to determine a third computing node with successful competition among the P third computing nodes as a target acquiring node if the first computing node has the federal learning computing capability of the first training data; the first computing node also has federal learning capability for the target training data; the target training data is second training data corresponding to the target acquisition node;
the task response module 109 is configured to send a federal learning task response request carrying training data information to a task initiating node; the training data information is generated according to the first training data and the target training data;
The data acquisition module 110 is configured to send a second data sharing request to the target acquisition node if a federal learning task issuing instruction sent by the task initiation node is received; the federation learning task issuing instruction comprises an initial model associated with the federation learning task;
the model training module 111 is configured to receive target training data sent by the target acquisition node according to the second data sharing request, train the initial model according to the first training data and the target training data, and obtain second branch training update parameters; the task initiating node is also used for globally updating the initial model through H branch training updating parameters; the H branch training update parameters include a second branch training update parameter, H being a positive integer.
The specific implementation manners of the contention request receiving module 107, the acquisition node determining module 108, the task response module 109, the data acquisition module 110, and the model training module 111 may be referred to the specific description of step S201 to step S206 in the embodiment corresponding to fig. 5, and will not be described herein.
Wherein the acquisition node determining module 108 includes: node selection unit 1081 and node confirmation unit 1082.
Node selecting unit 1081, configured to traverse second training competition requests for the federal learning task sent by the P third computing nodes, if the first computing node has the federal learning computing capability of the first training data;
node selection unit 1081 is further configured to obtain the second training data G_k in the second training competition request sent by the k-th third computing node, and determine, according to the data amount of the second training data G_k, a second training duration T_k for the second training data G_k; k is a positive integer less than or equal to P;

node selection unit 1081 is further configured to add the second training duration T_k to the first training duration to obtain a total training duration T_k^total;

node selection unit 1081 is further configured to, if the total training duration T_k^total is less than or equal to the task deadline, determine that the first computing node has federal learning computing capability for the second training data G_k, and send second training competition response information R_k to the k-th third computing node;
A node confirmation unit 1082, configured to receive, within a competition waiting period, the second training competition confirmation requests sent respectively by Q third computing nodes; the Q third computing nodes are nodes that have received the second training competition response information sent by the first computing node, and Q is a positive integer; one second training competition confirmation request includes one target number of competing digital resources;

the node confirmation unit 1082 is further configured to determine the third computing node corresponding to the second training competition confirmation request with the highest target number of competing digital resources as the third computing node that wins the competition, send second training competition confirmation response information to that third computing node, and take the third computing node that wins the competition as the target acquisition node.
The specific implementation manner may be referred to the specific description of step S203 in the embodiment corresponding to fig. 5, and the detailed description is omitted here.
Wherein, the above-mentioned federal study processing apparatus 1 further includes: an encryption module 112.
The encryption module 112 is configured to, if a federal learning task issuing instruction sent by a task initiating node is received and random differential privacy noise is received, encrypt a second branch training update parameter according to the received random differential privacy noise to obtain a target encrypted branch training update parameter, and send the target encrypted branch training update parameter to the task initiating node; the task initiating node is also used for carrying out information aggregation processing on the S encrypted branch training updating parameters to obtain encrypted aggregation training updating parameters; s is a positive integer; the task initiating node is also used for adding the encrypted aggregate training updating parameters and the differential privacy key to obtain aggregate training updating parameters, and globally updating the initial model according to the aggregate training updating parameters; the S encrypted branch training update parameters include a target encrypted branch training update parameter.
The specific implementation of the encryption module 112 may be referred to the optional description of the steps in the embodiment corresponding to fig. 5, which is not described herein.
Wherein, the above-mentioned federal study processing apparatus 1 further includes: a data sharing record module 113.
A data sharing record module 113, configured to obtain the number of digital resources agreed with the target computing node;
the data sharing record module 113 is further configured to package the number of agreed digital resources, the data amount of the first training data, and the data sharing relationship with the target computing node into a data sharing record transaction, and cache the data sharing record transaction in a record transaction pool;
the data sharing record module 113 is further configured to perform consensus uplink on the data sharing record transaction while receiving the first data sharing request sent by the target computing node;
the data sharing record module 113 is further configured to receive a digital resource corresponding to the number of agreed digital resources sent by the target computing node.
The specific implementation of the data sharing record module 113 may be referred to the optional description in the embodiment corresponding to fig. 3, and will not be described herein.
Referring to fig. 12, fig. 12 is a schematic structural diagram of a computer device according to an embodiment of the present application. As shown in fig. 12, the federal learning processing device 1 in the embodiment corresponding to fig. 11 described above may be applied to a computer apparatus 1000, and the computer apparatus 1000 may include: processor 1001, network interface 1004, and memory 1005, and in addition, the above-described computer device 1000 may further include: a user interface 1003, and at least one communication bus 1002. Wherein the communication bus 1002 is used to enable connected communication between these components. The user interface 1003 may include a Display (Display), a Keyboard (Keyboard), and the optional user interface 1003 may further include a standard wired interface, a wireless interface, among others. The network interface 1004 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (non-volatile memory), such as at least one disk memory. The memory 1005 may also optionally be at least one storage device located remotely from the processor 1001. As shown in fig. 12, an operating system, a network communication module, a user interface module, and a device control application program may be included in the memory 1005, which is one type of computer-readable storage medium.
In the computer device 1000 shown in fig. 12, the network interface 1004 may provide a network communication function, and the user interface 1003 is primarily used as an interface for providing input to a user; the processor 1001 may be configured to invoke the device control application stored in the memory 1005, so that the computer device 1000 may serve as the first computing node to implement:
receiving a federation learning task generated by a task initiating node through a task intelligent contract, and acquiring training data associated with the federation learning task as first training data;
if the first computing node does not have the federal learning computing capability aiming at the first training data, broadcasting a first training competition request aiming at a federal learning task to M second computing nodes, and determining the second computing node which is successful in competition and meets the node reliability condition in the M second computing nodes as a target computing node; the target computing node has federal learning computing capability for the first training data;
if a first data sharing request sent by a target computing node is received, sending first training data to the target computing node, so that the target computing node trains an initial model associated with a federation learning task according to the first training data and stored training data associated with the federation learning task to obtain first branch training update parameters; the task initiating node is also used for globally updating the initial model through N branch training updating parameters; the N branch training update parameters include a first branch training update parameter, N being a positive integer.
It should be understood that the computer device 1000 described in the embodiments of the present application may perform the foregoing description of the federal learning processing method in any of the embodiments corresponding to fig. 3 and 5, and will not be described herein. In addition, the description of the beneficial effects of the same method is omitted.
Furthermore, it should be noted here that: the embodiment of the present application further provides a computer readable storage medium, where the computer readable storage medium stores a computer program executed by the foregoing federal learning processing device 1, where the computer program includes program instructions, and when the processor executes the program instructions, the processor is capable of executing the description of the federal learning processing method in any one of the corresponding embodiments of fig. 3 and 5, and therefore, the description will not be repeated here. In addition, the description of the beneficial effects of the same method is omitted. For technical details not disclosed in the embodiments of the computer-readable storage medium according to the present application, please refer to the description of the method embodiments of the present application.
Further, referring to fig. 13, fig. 13 is a schematic structural diagram of another blockchain-based federal learning processing device according to an embodiment of the present application. The federal learning processing device 2 may be a computer program (including program code) running in a computer device; for example, the federal learning processing device 2 is application software. The device can be used for executing the corresponding steps in the method provided by the embodiments of the present application. As shown in fig. 13, the federal learning processing device 2 may include: a task broadcasting module 21, a response receiving module 22, a training node determining module 23, a task issuing module 24, and a model updating module 25.
The task broadcasting module 21 is configured to generate a federal learning task through a task intelligent contract, and broadcast the federal learning task to W computing nodes; if the calculation limited nodes exist in the W calculation nodes, the calculation limited nodes are used for broadcasting training competition requests aiming at shared training data to the communicable calculation nodes corresponding to the calculation limited nodes, and the calculation limited nodes are also used for determining target communicable calculation nodes which are successful in competition and meet the node credibility conditions in the communicable calculation nodes; the shared training data is the training data stored in the computing limited node and associated with the federal learning task; the computing restricted node does not have federal learning computing capability for shared training data;
the response receiving module 22 is configured to receive federal learning task response requests sent by the Y capability computing nodes respectively; y is a positive integer; the Y capability computing nodes include target communicable computing nodes, and the Y capability computing nodes do not include computation-constrained nodes;
the training node determining module 23 is configured to determine X training computing nodes from the Y capability computing nodes according to the Y federal learning task response requests, where X is a positive integer;
the task issuing module 24 is configured to send a federation learning task issuing instruction carrying an initial model associated with a federation learning task to the X training computing nodes, so that the X training computing nodes train the initial model according to available training data respectively, and obtain a branch training update parameter; if the X training computing nodes comprise target communicable computing nodes, the available training data corresponding to the target communicable computing nodes comprise shared training data and training data stored in the target communicable computing nodes;
The model updating module 25 is configured to globally update the initial model according to the aggregate training update parameter until a target model that meets the training condition indicated by the federal learning task is obtained; the aggregate training update parameter is generated based on the X branch training update parameters.
The specific implementation manners of the task broadcasting module 21, the response receiving module 22, the training node determining module 23, the task issuing module 24, and the model updating module 25 may be referred to the specific description of step S301 to step S304 in the embodiment corresponding to fig. 6, and will not be described herein.
Wherein, a federal learning task response request includes a unit data resource consumption amount and a training data total amount;
The training node determining module 23 includes: a consumption obtaining unit 231, an ascending sorting unit 232, and a training node selection unit 233.
A consumption obtaining unit 231, configured to obtain a predicted consumption of digital resources corresponding to the federal learning task;
an ascending sorting unit 232, configured to sort the Y unit data resource consumptions in ascending order according to the Y federal learning task response requests, to obtain the sorted Y unit data resource consumptions;
the training node selection unit 233 is configured to traverse the sorted Y unit data resource consumptions, sequentially obtain the e-th unit data resource consumption, and determine the e-th node digital resource predicted consumption according to the e-th unit data resource consumption and the training data total amount corresponding to the e-th unit data resource consumption; e is a positive integer less than or equal to Y;
The training node selection unit 233 is further configured to add the capability calculation node corresponding to the e-th unit data resource consumption amount to the node pre-selection queue;
the training node selection unit 233 is further configured to, if the e-th node digital resource predicted consumption is less than the current remaining digital resource predicted consumption, subtract the e-th node digital resource predicted consumption from the current remaining digital resource predicted consumption to obtain an updated remaining digital resource predicted consumption, and continue traversing to obtain the (e+1)-th unit data resource consumption;
the training node selection unit 233 is further configured to stop traversing if the predicted consumption of the digital resource of the e-th node is greater than or equal to the predicted consumption of the current remaining digital resource, or e is equal to Y, and take all the X capability computing nodes in the node pre-selection queue as training computing nodes.
The specific implementation manners of the consumption obtaining unit 231, the ascending sorting unit 232, and the training node selection unit 233 may refer to the specific description of step S303 in the embodiment corresponding to fig. 6, and are not described herein again.
Wherein, the federal learning processing device 2 further comprises: a digital resource recording module 26.
A digital resource recording module 26 for sequentially acquiring X-1 training computing nodes from the node pre-selection queue;
the digital resource recording module 26 is further configured to take the node digital resource predicted consumption corresponding to each of the X-1 training computing nodes as the transaction digital resource consumption corresponding to that training computing node;

the digital resource recording module 26 is further configured to obtain the X-th training computing node from the node pre-selection queue;
The digital resource recording module 26 is further configured to, if the node digital resource predicted consumption corresponding to the X-th training computing node is less than the target remaining digital resource predicted consumption, take the node digital resource predicted consumption corresponding to the X-th training computing node as the transaction digital resource consumption corresponding to the X-th training computing node; the target remaining digital resource predicted consumption is the value obtained by subtracting, from the digital resource predicted consumption, the node digital resource predicted consumptions corresponding to the X-1 training computing nodes;

the digital resource recording module 26 is further configured to, if the node digital resource predicted consumption corresponding to the X-th training computing node is greater than or equal to the target remaining digital resource predicted consumption, take the target remaining digital resource predicted consumption as the transaction digital resource consumption corresponding to the X-th training computing node;
The digital resource recording module 26 is further configured to associate and package each training computing node and the transaction digital resource consumption amount corresponding to each training computing node into a digital resource recording transaction, and cache the digital resource recording transaction in a recording transaction pool;
the digital resource recording module 26 is further configured to, when obtaining a target model that meets training conditions indicated by the federal learning task, send digital resources corresponding to the associated transaction digital resource consumption to each training computing node, and perform consensus uplink on the digital resource recording transactions.
The specific implementation of the digital resource recording module 26 may be referred to the optional description of step S303 in the embodiment corresponding to fig. 6, which is not repeated here.
Wherein, the federal learning processing device 2 further comprises: noise processing module 27, noise transmission module 28, parameter aggregation module 29, and parameter decryption module 210.
A noise processing module 27 for generating a set of random differential privacy noise; the differential privacy noise set comprises random differential privacy noise respectively corresponding to each training computing node;
the noise processing module 27 is further configured to add the X random differential privacy noises to obtain a random differential privacy total noise;
The noise processing module 27 is further configured to take the inverse number of the random differential privacy total noise as the differential privacy key;
the noise sending module 28 is configured to send, while sending a federal learning task issuing instruction carrying an initial model associated with a federal learning task to the X training computing nodes, the corresponding random differential privacy noise to the X training computing nodes, so that the X training computing nodes encrypt the branch training update parameters according to the received random differential privacy noise, respectively, to obtain encrypted branch training update parameters;
the parameter aggregation module 29 is configured to perform information aggregation processing on the encrypted branch training update parameters returned by the X training computing nodes, to obtain encrypted aggregate training update parameters;
the parameter decryption module 210 is configured to add the encrypted aggregate training update parameter to the differential privacy key to obtain the aggregate training update parameter.
The specific implementation manners of the noise processing module 27, the noise transmitting module 28, the parameter aggregation module 29, and the parameter decryption module 210 may be referred to the optional description of step S304 in the embodiment corresponding to fig. 6, which is not repeated here.
Wherein the parameter aggregation module 29 comprises: the accuracy verification unit 291 and the target parameter aggregation unit 292.
A precision verification unit 291, configured to obtain a training test set;
the precision verification unit 291 is further configured to verify the encrypted branch training update parameters returned by the X training computing nodes according to the training test set, so as to obtain X update precision;
a target parameter aggregation unit 292, configured to use, as a target encrypted branch training update parameter, an encrypted branch training update parameter returned by a training computing node whose update accuracy meets a training accuracy condition;
the target parameter aggregation unit 292 is further configured to perform information aggregation processing on the target encrypted branch training update parameter to obtain an encrypted aggregate training update parameter.
The specific implementation manner of the accuracy verification unit 291 and the target parameter aggregation unit 292 may refer to the optional description of step S304 in the embodiment corresponding to fig. 6, which is not described herein.
Wherein, the federal learning processing device 2 further comprises: the accuracy recording module 211.
The precision recording module 211 is configured to package the X training computing nodes and the X update precisions into an update precision recording transaction, and initiate consensus processing for the update precision recording transaction to the blockchain network, so that each node in the blockchain network updates the stored precision recording table according to the X update precisions when the update precision recording transaction consensus passes; the precision record table is used for recording each node in the blockchain network and the current update precision corresponding to each node.
The specific implementation manner of the accuracy recording module 211 may refer to the optional description of step S304 in the embodiment corresponding to fig. 6, which is not described herein.
Wherein, the federal learning processing device 2 further comprises:
the block-out weight determining module 212 is configured to obtain an accuracy record table according to the vote confirmation request;
the block-out weight determining module 212 is further configured to determine a contribution degree corresponding to each node according to a current update precision corresponding to each node in the precision record table;
the block-out weight determining module 212 is further configured to determine the resource rights corresponding to each node according to the contribution degree weight, the contribution degree corresponding to each node, and the digital resource possession duration corresponding to each node;
the block-out weight determining module 212 is further configured to send votes to each node according to the proportion between the resource rights corresponding to the nodes, so that each node votes with the received votes;
the block-out weight determining module 212 is further configured to, if a computation-limited node exists among the nodes, cause the computation-limited node to cast its received votes to the target communicable computing node;
the block-out weight determining module 212 is further configured to receive the number of votes of each node after voting, and determine a block-out weight acquisition ratio corresponding to each node according to the number of votes of each node;
the block-out weight determining module 212 is further configured to randomly determine a block-out node from the nodes according to the block-out weight acquisition ratio; the block-out node owns the block-out right of the new block;
the block-out weight determining module 212 is further configured to send reward digital resources to the block-out node, so that the block-out node allocates the reward digital resources according to a ballot composition ratio; the ballot composition ratio is the number ratio between the votes received by the block-out node that were sent by the task initiating node and the votes that were cast by the computation-limited node.
The specific implementation manner of the block-out right determining module 212 may refer to the optional description of step S304 in the embodiment corresponding to fig. 6, which is not described herein.
Further, referring to fig. 14, fig. 14 is a schematic structural diagram of another computer device according to an embodiment of the present application. As shown in fig. 14, the federal learning processing device 2 in the embodiment corresponding to fig. 13 may be applied to the computer device 2000. The computer device 2000 may include: a processor 2001, a network interface 2004, and a memory 2005; in addition, the computer device 2000 further includes: a user interface 2003 and at least one communication bus 2002, where the communication bus 2002 is used to implement connection communication between these components. The user interface 2003 may include a display (Display) and a keyboard (Keyboard), and optionally may further include a standard wired interface and a wireless interface. The network interface 2004 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 2005 may be a high-speed RAM memory or a non-volatile memory (non-volatile memory), such as at least one disk memory. The memory 2005 may optionally also be at least one storage device located remotely from the processor 2001. As shown in fig. 14, the memory 2005, as a computer-readable storage medium, may include an operating system, a network communication module, a user interface module, and a device control application program.
In the computer device 2000 shown in fig. 14, the network interface 2004 may provide a network communication function; the user interface 2003 is mainly configured to provide an input interface for the user; and the processor 2001 may be configured to invoke the device control application stored in the memory 2005, so that the computer device 2000, acting as a task initiating node, implements:
generating a federation learning task through a task intelligent contract, and broadcasting the federation learning task to W computing nodes; if a computation-limited node exists among the W computing nodes, the computation-limited node is configured to broadcast a training competition request for shared training data to the communicable computing nodes corresponding to the computation-limited node, and the computation-limited node is further configured to determine, among the communicable computing nodes, a target communicable computing node that is successful in competition and satisfies a node credibility condition; the shared training data is the training data stored in the computation-limited node and associated with the federation learning task; the computation-limited node does not have federal learning computing capability for the shared training data;
receiving federal learning task response requests respectively sent by Y capability computing nodes, Y being a positive integer; the Y capability computing nodes include the target communicable computing node, and the Y capability computing nodes do not include the computation-limited node;
determining X training computing nodes from the Y capability computing nodes according to the Y federation learning task response requests, and sending a federation learning task issuing instruction carrying the initial model associated with the federation learning task to the X training computing nodes, so that the X training computing nodes respectively train the initial model according to available training data to obtain branch training update parameters; X is a positive integer; if the X training computing nodes include the target communicable computing node, the available training data corresponding to the target communicable computing node includes the shared training data and the training data stored in the target communicable computing node;
globally updating the initial model according to the aggregate training updating parameters until a target model meeting training conditions indicated by the federal learning task is obtained; the aggregate training update parameter is generated based on the X branch training update parameters.
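By way of illustration only, a minimal Python sketch of the global update loop described above follows; `node.train`, `evaluate`, and simple averaging of the X branch updates are assumptions for readability, since the embodiment leaves the concrete aggregation and training-condition checks to the steps detailed elsewhere:

```python
import numpy as np

def global_update_loop(initial_model, training_nodes, evaluate, max_rounds, target_metric):
    """Run federated rounds until the training condition indicated by the task is met."""
    model = np.asarray(initial_model, dtype=float)
    for _ in range(max_rounds):
        # Each training computing node returns branch training update parameters
        # computed on its own available training data (node.train is hypothetical).
        branch_updates = [node.train(model) for node in training_nodes]
        aggregate = np.mean(branch_updates, axis=0)  # averaging is an assumption
        model = model + aggregate                    # global update of the initial model
        if evaluate(model) >= target_metric:         # training condition satisfied
            return model                             # the target model
    return model
```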
It should be understood that the computer device 2000 described in the embodiments of the present application can execute the foregoing description of the federal learning processing method in the foregoing embodiments, and can also execute the foregoing description of the federal learning processing device 2 in the embodiment corresponding to fig. 13, which is not repeated here. In addition, the description of the beneficial effects of the same method is also omitted.
Furthermore, it should be noted here that: the embodiments of the present application further provide a computer-readable storage medium, in which the computer program executed by the aforementioned federal learning processing device 2 is stored; when a processor loads and executes the computer program, the description of the federal learning processing method in any of the foregoing embodiments can be executed, so details are not repeated here. In addition, the description of the beneficial effects of the same method is also omitted. For technical details not disclosed in the embodiments of the computer-readable storage medium of the present application, please refer to the description of the method embodiments of the present application.
The computer-readable storage medium may be an internal storage unit of the federal learning processing apparatus provided in any of the foregoing embodiments or of the foregoing computer device, such as a hard disk or a memory of the computer device. The computer-readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the computer device. Further, the computer-readable storage medium may also include both an internal storage unit and an external storage device of the computer device. The computer-readable storage medium is used to store the computer program and other programs and data required by the computer device, and may also be used to temporarily store data that has been output or is to be output.
Furthermore, it should be noted here that: embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium and executes the computer instructions to cause the computer device to perform the method provided by any of the corresponding embodiments described above.
The terms "first", "second", and the like in the description, claims, and drawings of the embodiments of the present application are used to distinguish different objects, not to describe a particular order. Furthermore, the term "include" and any variations thereof are intended to cover a non-exclusive inclusion. For example, a process, method, apparatus, product, or device that comprises a series of steps or units is not limited to the listed steps or units, but may optionally further include steps or units not listed, or other steps or units inherent to the process, method, apparatus, product, or device.
Those of ordinary skill in the art will appreciate that the units and algorithm steps described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of the examples have been described above generally in terms of function. Whether these functions are implemented in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled artisans may implement the described functions differently for each particular application, but such implementations should not be considered beyond the scope of the present application.
The foregoing disclosure is merely illustrative of the preferred embodiments of the present application and is of course not intended to limit the scope of the claims of the present application; equivalent variations made according to the claims of the present application therefore still fall within the scope of the present application.

Claims (19)

1. A blockchain-based federal learning processing method, wherein the method is performed by a first computing node in a blockchain network, the blockchain network further comprising a task initiating node and M second computing nodes, M being a positive integer; the method comprises the following steps:
the first computing node receives a federation learning task generated by the task initiating node through a task intelligent contract, and acquires training data associated with the federation learning task as first training data;
if the first computing node does not have the federal learning computing capability aiming at the first training data, broadcasting a first training competition request aiming at the federal learning task to the M second computing nodes, and determining a second computing node which is successful in competition and meets a node reliability condition from the M second computing nodes as a target computing node; the target computing node has federal learning computing capability for the first training data;
If a first data sharing request sent by the target computing node is received, sending the first training data to the target computing node, so that the target computing node trains an initial model associated with the federation learning task according to the first training data and stored training data associated with the federation learning task to obtain a first branch training update parameter; the task initiating node is further used for globally updating the initial model through N branch training updating parameters; the N branch training update parameters include the first branch training update parameter, N being a positive integer.
2. The method of claim 1, wherein the determining, as the target computing node, a second computing node that is successful in contention and satisfies a node reliability condition among the M second computing nodes, comprises:
receiving, in a competition time period, first training competition response information respectively sent by L competing nodes; each piece of first training competition response information comprises a competitive digital resource quantity; the L competing nodes are nodes, among the M second computing nodes, that have federal learning computing capability for the first training data; L is a positive integer less than or equal to M;
removing, from the L competing nodes, the competing nodes that do not satisfy the node credibility condition, so as to obtain S trusted competing nodes; S is a positive integer less than or equal to L;
acquiring, from the S trusted competing nodes, the trusted competing node with the largest competitive digital resource quantity as a pre-selected computing node;
transmitting a first training contention confirmation request to the pre-selected computing node;
and if the first training competition confirmation response information sent by the pre-selected computing node according to the first training competition confirmation request is received in the confirmation time period, determining the pre-selected computing node as a target computing node.
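For illustration, a sketch of the contention rule in claim 2 follows; the node fields and helper names are hypothetical, and the confirmation handshake of the last two steps is omitted:

```python
from dataclasses import dataclass

@dataclass
class CompetingNode:
    node_id: str
    competitive_resources: int  # competitive digital resource quantity from the response
    credibility: float          # derived as in claim 3

def pick_preselected_node(competing_nodes, credibility_threshold):
    # Remove competing nodes that do not satisfy the node credibility condition.
    trusted = [n for n in competing_nodes if n.credibility >= credibility_threshold]
    if not trusted:
        return None  # no trusted competing node, so contention cannot proceed
    # The trusted competing node with the largest competitive digital resource
    # quantity becomes the pre-selected computing node.
    return max(trusted, key=lambda n: n.competitive_resources)
```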
3. The method of claim 2, wherein the L competing nodes comprise a competing node Z_i, i being a positive integer less than or equal to L; the method further comprises the steps of:
acquiring a node trusted probability table;
querying the trusted probability of the competing node Z_i from the node trusted probability table;
determining the credibility of the competing node Z_i according to the connection relation with the competing node Z_i and the trusted probability of the competing node Z_i;
if the credibility of the competing node Z_i is smaller than a credibility threshold, determining that the competing node Z_i does not satisfy the node credibility condition;
if the credibility of the competing node Z_i is greater than or equal to the credibility threshold, determining that the competing node Z_i satisfies the node credibility condition.
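The claim leaves open how the connection relation and the table probability combine; the sketch below assumes a simple product with a connection score in [0, 1], purely for illustration:

```python
def node_credibility(trusted_probability: float, connection_score: float) -> float:
    # Assumed combination rule: scale the table probability by a connection score.
    return trusted_probability * connection_score

def satisfies_credibility_condition(trusted_probability, connection_score, threshold):
    return node_credibility(trusted_probability, connection_score) >= threshold
```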
4. The method of claim 1, wherein the federal learning task includes a task deadline and initial model information;
the method further comprises the steps of:
determining a first demand computing resource corresponding to the federal learning task according to the initial model information;
if the first required computing resource is greater than the available computing resource, determining that the first computing node does not have federal learning computing capability for the first training data; the available computing resources refer to idle computing resources of the first computing node;
if the first required computing resource is smaller than or equal to the available computing resource, determining a first training duration corresponding to federal learning computation of the first training data according to the data amount of the first training data;
if the first training duration is less than or equal to the task deadline, determining that the first computing node has federal learning computing capability for the first training data;
and if the first training time period is longer than the task deadline, determining that the first computing node does not have federal learning computing capability for the first training data.
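A compact sketch of the two-stage capability test in claim 4 (the training duration estimated from the data amount is assumed to be supplied by the caller):

```python
def has_fl_computing_capability(first_required_resource, available_resource,
                                first_training_duration, task_deadline):
    if first_required_resource > available_resource:
        return False  # not enough idle computing resources
    # Otherwise the training duration estimated from the data amount decides.
    return first_training_duration <= task_deadline
```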
5. The method of claim 4, wherein the blockchain network further includes P third computing nodes, P being a positive integer;
the method further comprises the steps of:
receiving second training competition requests for the federal learning task respectively sent by the P third computing nodes; the second training competition request sent by the jth third computing node comprises the data amount of jth second training data; the jth second training data is the training data that is stored in the jth third computing node and is associated with the federal learning task; the jth third computing node does not have federal learning computing capability for the jth second training data; j is a positive integer less than or equal to P;
if the first computing node has the federal learning computing capability of the first training data, determining a third computing node with successful competition from the P third computing nodes as a target acquisition node; the first computing node also has federal learning capability for target training data; the target training data is second training data corresponding to the target acquisition node;
transmitting a federal learning task response request carrying training data information to the task initiating node; the training data information is generated according to the first training data and the target training data;
If a federation learning task issuing instruction sent by the task initiating node is received, a second data sharing request is sent to the target acquiring node; the federation learning task issuing instruction comprises an initial model associated with the federation learning task;
receiving the target training data sent by the target acquisition node according to the second data sharing request, and training the initial model according to the first training data and the target training data to obtain second branch training update parameters; the task initiating node is also used for globally updating the initial model through H branch training updating parameters; the H branch training update parameters include the second branch training update parameter, H being a positive integer.
6. The method of claim 5, wherein if the first computing node has federal learning computing capability of the first training data, determining a third computing node with successful contention among the P third computing nodes as a target acquisition node comprises:
if the first computing node has federal learning computing capability aiming at the first training data, traversing second training competition requests aiming at the federal learning tasks and respectively sent by the P third computing nodes;
acquiring second training data G_k in a second training competition request sent by a kth third computing node, and determining, according to the data amount of the second training data G_k, a second training duration T_k corresponding to federal learning computation of the second training data G_k; k is a positive integer less than or equal to P;
adding the second training duration T_k to the first training duration to obtain a total training duration T_k^total;
if the total training duration T_k^total is less than or equal to the task deadline, determining that the first computing node has federal learning computing capability for the second training data G_k, and transmitting second training competition response information R_k to the kth third computing node;
receiving, in a competition waiting time period, second training competition confirmation requests respectively sent by Q third computing nodes; Q is a positive integer less than or equal to P; each second training competition confirmation request comprises a target competitive digital resource amount;
determining the third computing node corresponding to the second training competition confirmation request with the lowest unit data resource consumption as the third computing node with successful competition, sending second training competition confirmation response information to the third computing node with successful competition, and taking the third computing node with successful competition as the target acquisition node; the unit data resource consumption corresponding to one third computing node is determined according to the target competitive digital resource amount corresponding to that third computing node and the data amount of the corresponding second training data.
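The final selection step of claim 6 reduces to a minimum over the per-unit cost implied by each confirmation request; a sketch with assumed tuple fields:

```python
def pick_target_acquisition_node(confirmation_requests):
    """confirmation_requests: iterable of (node_id, target_resource_amount, data_amount)."""
    # Lowest digital resource consumption per unit of shared training data wins.
    node_id, _, _ = min(confirmation_requests, key=lambda req: req[1] / req[2])
    return node_id
```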
7. The method as recited in claim 5, further comprising:
if random differential privacy noise is received while a federal learning task issuing instruction sent by the task initiating node is received, encrypting the second branch training updating parameter according to the received random differential privacy noise to obtain a target encrypted branch training updating parameter, and sending the target encrypted branch training updating parameter to the task initiating node; the task initiating node is further used for carrying out information aggregation processing on the S encrypted branch training update parameters to obtain encrypted aggregation training update parameters; s is a positive integer; the task initiating node is further configured to add the encrypted aggregate training update parameter to the differential privacy key to obtain an aggregate training update parameter, and globally update the initial model according to the aggregate training update parameter; the S encrypted branch training update parameters include the target encrypted branch training update parameters.
8. The method as recited in claim 1, further comprising:
acquiring an appointed digital resource quantity agreed upon with the target computing node;
packaging the number of the appointed digital resources, the data amount of the first training data and the data sharing relation with the target computing node into a data sharing record transaction, and caching the data sharing record transaction in a record transaction pool;
When a first data sharing request sent by the target computing node is received, the data sharing record transaction is subjected to consensus uplink;
and receiving the digital resources which are transmitted by the target computing node and correspond to the number of the appointed digital resources.
9. A blockchain-based federal learning processing method, wherein the method is performed by a task initiating node in a blockchain network, the blockchain network further comprising W computing nodes, W being a positive integer; the method comprises the following steps:
the task initiating node generates a federation learning task through a task intelligent contract, and broadcasts the federation learning task to the W computing nodes; if a computation-limited node exists among the W computing nodes, the computation-limited node is configured to broadcast a training competition request for shared training data to the communicable computing nodes corresponding to the computation-limited node, and the computation-limited node is further configured to determine, among the communicable computing nodes, a target communicable computing node that is successful in competition and satisfies a node credibility condition; the shared training data is the training data that is stored in the computation-limited node and is associated with the federation learning task; the computation-limited node does not have federal learning computing capability for the shared training data;
receiving federal learning task response requests respectively sent by Y capability computing nodes; Y is a positive integer; the Y capability computing nodes include the target communicable computing node, and the Y capability computing nodes do not include the computation-limited node;
determining X training computing nodes from the Y capability computing nodes according to the Y federation learning task response requests, and sending a federation learning task issuing instruction carrying the initial model associated with the federation learning task to the X training computing nodes, so that the X training computing nodes respectively train the initial model according to available training data to obtain branch training update parameters; X is a positive integer; if the X training computing nodes include the target communicable computing node, the available training data corresponding to the target communicable computing node includes the shared training data and the training data stored in the target communicable computing node;
globally updating the initial model according to the aggregate training updating parameters until a target model meeting training conditions indicated by the federal learning task is obtained; the aggregate training update parameter is generated based on the X branch training update parameters.
10. The method of claim 9, wherein a federal learning task response request includes a unit data resource consumption amount and a training data total amount;
the determining X training computing nodes from the Y capability computing nodes according to the Y federal learning task response requests includes:
obtaining the predicted consumption of the digital resources corresponding to the federal learning task;
performing ascending sorting on the Y unit data resource consumption amounts according to the Y federal learning task response requests to obtain Y sorted unit data resource consumption amounts;
traversing the Y unit data resource consumption after sequencing, sequentially obtaining the e unit data resource consumption, and determining the e node digital resource prediction consumption according to the e unit data resource consumption and the training data total amount corresponding to the e unit data resource consumption; e is a positive integer less than or equal to Y;
adding the capacity calculation node corresponding to the e-th unit data resource consumption into a node preselected queue;
if the e-th node digital resource predicted consumption is smaller than the current residual digital resource predicted consumption, subtracting the e-th node digital resource predicted consumption from the current residual digital resource predicted consumption to obtain an updated residual digital resource predicted consumption, and continuing the traversal to acquire the (e+1)-th unit data resource consumption;
and stopping the traversal if the e-th node digital resource predicted consumption is greater than or equal to the current residual digital resource predicted consumption or e is equal to Y, and taking the X capability computing nodes in the node pre-selection queue as the training computing nodes.
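Claim 10 describes a greedy budget traversal; the sketch below mirrors it with illustrative names (the node is admitted to the pre-selection queue before the remaining budget is checked, as in the claim):

```python
def select_training_nodes(responses, predicted_budget):
    """responses: list of (node_id, unit_consumption, training_data_total)."""
    queue, remaining = [], predicted_budget
    for node_id, unit_cost, data_total in sorted(responses, key=lambda r: r[1]):
        node_predicted = unit_cost * data_total  # e-th node predicted consumption
        queue.append(node_id)                    # admit to the node pre-selection queue
        if node_predicted < remaining:
            remaining -= node_predicted          # budget left, continue the traversal
        else:
            break                                # budget exhausted, stop the traversal
    return queue                                 # the X training computing nodes
```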
11. The method as recited in claim 10, further comprising:
sequentially acquiring X-1 training calculation nodes from the node pre-selection queue;
taking the node digital resource predicted consumption respectively corresponding to the X-1 training computing nodes as the transaction digital resource consumption respectively corresponding to the X-1 training computing nodes;
acquiring the Xth training computing node from the node pre-selection queue;
if the node digital resource predicted consumption corresponding to the Xth training computing node is smaller than the target residual digital resource predicted consumption, taking the node digital resource predicted consumption corresponding to the Xth training computing node as the transaction digital resource consumption corresponding to the Xth training computing node; the target residual digital resource predicted consumption is the value obtained by subtracting, from the digital resource predicted consumption, the node digital resource predicted consumption respectively corresponding to the X-1 training computing nodes;
if the node digital resource predicted consumption corresponding to the Xth training computing node is greater than or equal to the target residual digital resource predicted consumption, taking the target residual digital resource predicted consumption as the transaction digital resource consumption corresponding to the Xth training computing node;
associating each training computing node with the transaction digital resource consumption corresponding to that training computing node, packaging them into a digital resource record transaction, and caching the digital resource record transaction in a record transaction pool;
when a target model meeting training conditions indicated by the federal learning task is obtained, digital resources corresponding to the associated transaction digital resource consumption are respectively sent to each training computing node, and the digital resource record transaction is subjected to consensus uplink.
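The settlement rule of claim 11 pays the first X-1 training nodes their full predicted consumption and caps the Xth node by the residual budget; a sketch with assumed inputs:

```python
def settle_transaction_consumption(node_predictions, predicted_total):
    """node_predictions: predicted consumption per training node, in queue order."""
    payouts = list(node_predictions[:-1])        # first X-1 nodes: full prediction
    remaining = predicted_total - sum(payouts)   # target residual predicted consumption
    payouts.append(min(node_predictions[-1], remaining))  # Xth node is capped
    return payouts
```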
12. The method as recited in claim 9, further comprising:
generating a random differential privacy noise set; the differential privacy noise set comprises random differential privacy noise respectively corresponding to each training computing node;
adding the X random differential privacy noises to obtain a random differential privacy total noise;
taking the opposite number of the random differential privacy total noise as a differential privacy key;
Transmitting corresponding random differential privacy noise to the X training computing nodes while transmitting a federation learning task issuing instruction carrying an initial model associated with the federation learning task to the X training computing nodes, so that the X training computing nodes encrypt the branch training updating parameters according to the received random differential privacy noise to obtain encrypted branch training updating parameters;
and performing information aggregation processing on the encrypted branch training update parameters returned by the X training computing nodes to obtain encrypted aggregate training update parameters, and adding the encrypted aggregate training update parameters and the differential privacy key to obtain aggregate training update parameters.
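By way of illustration only, the following sketch shows the noise scheme of claim 12 under the assumption of vector-valued update parameters and additive random noise (the claim does not fix a noise distribution; Gaussian noise and all identifiers here are assumptions). Because the differential privacy key is the opposite number of the total noise, summing the encrypted branch updates and adding the key recovers the exact aggregate without exposing any individual branch update:

```python
import numpy as np

rng = np.random.default_rng(0)
X, dim = 4, 8                                        # training nodes, parameter size
branch_updates = [rng.normal(size=dim) for _ in range(X)]

# Task initiating node: random differential privacy noise set and its key.
noises = [rng.normal(size=dim) for _ in range(X)]    # one noise per training node
dp_key = -sum(noises)                                # opposite number of the total noise

# Training computing nodes: encrypt branch updates with the received noise.
encrypted = [u + n for u, n in zip(branch_updates, noises)]

# Task initiating node: aggregate the encrypted updates, then add the key.
aggregate = sum(encrypted) + dp_key
assert np.allclose(aggregate, sum(branch_updates))   # the noise cancels exactly
```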
13. The method of claim 12, wherein the performing information aggregation processing on the encrypted branch training update parameters returned by the X training computing nodes to obtain encrypted aggregate training update parameters includes:
acquiring a training test set;
verifying, according to the training test set, the encrypted branch training update parameters respectively returned by the X training computing nodes to obtain X update precisions;
taking the encrypted branch training update parameters returned by the training computing nodes whose update precision meets the training precision condition as target encrypted branch training update parameters;
and carrying out information aggregation processing on the target encryption branch training updating parameters to obtain the encryption aggregation training updating parameters.
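A sketch of the precision gate in claim 13; `evaluate_update` is a hypothetical verifier, and summation is assumed as the aggregation operation:

```python
def aggregate_with_precision_gate(encrypted_updates, test_set, evaluate_update, min_precision):
    precisions = [evaluate_update(u, test_set) for u in encrypted_updates]
    # Keep only the target encrypted branch updates meeting the precision condition.
    targets = [u for u, p in zip(encrypted_updates, precisions) if p >= min_precision]
    encrypted_aggregate = sum(targets)
    return encrypted_aggregate, precisions  # the X update precisions also feed claim 14
```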
14. The method as recited in claim 13, further comprising:
packaging the X training computing nodes and the X update precisions into an update precision record transaction, and initiating consensus processing for the update precision record transaction to the blockchain network, so that each node in the blockchain network updates the stored precision record table according to the X update precisions when the update precision record transaction passes consensus; the precision record table is used for recording each node in the blockchain network and the current update precision corresponding to each node.
15. The method as recited in claim 14, further comprising:
acquiring the precision record table according to the vote confirmation request;
determining the contribution degree corresponding to each node according to the current updating precision corresponding to each node in the precision record table;
Determining the resource rights and interests corresponding to each node according to the contribution degree weight, the contribution degree corresponding to each node and the digital resource possession duration corresponding to each node;
according to the proportion between the corresponding resource rights of each node, sending votes to each node so that each node votes on the received votes;
if a computation-limited node exists among the nodes, the computation-limited node is configured to cast its received votes to the target communicable computing node;
receiving the number of votes of each node after voting, and determining the block-out weight acquisition proportion corresponding to each node according to the number of votes of each node;
randomly determining a block-out node from the nodes according to the block-out weight acquisition proportion; the block-out node owns the block-out right of the new block;
transmitting reward digital resources to the block-out node, so that the block-out node distributes the reward digital resources according to a ballot composition ratio; the ballot composition ratio refers to the number ratio between the ballots received by the block-out node that were sent by the task initiating node and the ballots that were cast by the computation-limited node.
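For illustration, a sketch of the vote-weighted block-out selection in claim 15; the contribution weight, the linear combination, and the vote tally are assumptions, and the forwarding of votes by computation-limited nodes is noted but not modelled:

```python
import random

def resource_rights(contribution, holding_duration, contribution_weight=0.7):
    # Assumed linear combination of contribution degree and possession duration.
    return contribution_weight * contribution + (1 - contribution_weight) * holding_duration

def pick_block_out_node(nodes, total_votes=1000, seed=None):
    """nodes: dict of node_id -> (contribution_degree, resource_possession_duration)."""
    rights = {nid: resource_rights(c, d) for nid, (c, d) in nodes.items()}
    total = sum(rights.values())
    # Votes are distributed in proportion to resource rights; computation-limited
    # nodes would forward theirs to a target communicable computing node first.
    votes = {nid: total_votes * r / total for nid, r in rights.items()}
    # The block-out weight acquisition proportion equals the vote share.
    rnd = random.Random(seed)
    return rnd.choices(list(votes), weights=list(votes.values()), k=1)[0]
```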
16. A blockchain-based federal learning processing device, applied to a first computing node in a blockchain network, wherein the blockchain network further comprises a task initiating node and M second computing nodes, M being a positive integer; the device comprises:
the task receiving module is used for receiving a federation learning task generated by a task initiating node through a task intelligent contract, and acquiring training data associated with the federation learning task as first training data;
the competition broadcasting module is used for broadcasting a first training competition request aiming at the federal learning task to the M second computing nodes if the first computing node does not have federal learning computing capability aiming at the first training data;
the node determining module is used for determining a second computing node which is successful in competition and meets the node credibility condition from the M second computing nodes, and the second computing node is used as a target computing node; the target computing node has federal learning computing capability for the first training data;
the data sharing module is used for sending the first training data to the target computing node if a first data sharing request sent by the target computing node is received, so that the target computing node trains an initial model associated with the federation learning task according to the first training data and stored training data associated with the federation learning task to obtain a first branch training update parameter; the task initiating node is further used for globally updating the initial model through N branch training updating parameters; the N branch training update parameters include the first branch training update parameter, N being a positive integer.
17. A computer device, comprising: a processor, a memory, and a network interface;
the processor is connected to the memory, the network interface for providing data communication functions, the memory for storing program code, the processor for invoking the program code to perform the method of any of claims 1-15.
18. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program adapted to be loaded by a processor and to perform the method of any of claims 1-15.
19. A computer program product comprising computer programs/instructions which, when executed by a processor, are adapted to carry out the method of any one of claims 1-15.