CN114662705A - Federated learning method, device, electronic equipment and computer readable storage medium - Google Patents

Federated learning method, device, electronic equipment and computer readable storage medium

Info

Publication number
CN114662705A
CN114662705A
Authority
CN
China
Prior art keywords
model
federal
verification information
information
local
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210272358.4A
Other languages
Chinese (zh)
Inventor
程勇
蒋杰
韦康
刘煜宏
陈鹏
陶阳宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202210272358.4A
Publication of CN114662705A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/20Ensemble learning
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/08Network architectures or network communication protocols for network security for authentication of entities

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiment of the invention discloses a method and a device for federated learning, electronic equipment and a computer readable storage medium; after a local model is trained and first model information of a trained first local model is sent to a server, the server returns a converged first federated model, the first federated model is verified, disturbance processing is carried out on first verification information, a preset disturbed first verification information set is updated based on the disturbed first verification information, the target disturbed first verification information is screened out from the updated disturbed first verification information set, a model index corresponding to the target disturbed first verification information is constructed, and the target disturbed first verification information and the model index are sent to the server so that the server can generate the trained federated model; the method and the device can improve the training efficiency of federal learning, and the embodiment of the invention can be applied to various scenes such as cloud technology, artificial intelligence, intelligent traffic, auxiliary driving and the like.

Description

Federated learning method, device, electronic equipment and computer readable storage medium
Technical Field
The invention relates to the technical field of communication, and in particular to a federated learning method and device, electronic equipment and a computer readable storage medium.
Background
With the development of computing power and the progress of artificial intelligence technology, federated learning has gradually become a popular subject; federated learning completes the training of machine learning and deep learning models through multi-party cooperation. Federated learning includes horizontal federated learning, in which multiple computing devices each use their respective sample data to train the same model together; to ensure sample data security, the training data of the participants are usually protected by a security mechanism based on cryptography.
In the research and practice of the prior art, the inventors found that a cryptography-based security mechanism requires the information exchanged between the server and the participants to be encrypted, which increases communication overhead; moreover, the server and the participants need to interact many times, which places higher demands on network communication bandwidth and stability, so the training efficiency of federated learning is low.
Disclosure of Invention
The embodiment of the invention provides a federated learning method, a federated learning device, electronic equipment and a computer-readable storage medium, which can improve the training efficiency of federated learning.
A method for federated learning, comprising:
training a local model, and sending first model information of the trained first local model to a server, so that the server aggregates at least one trained first local model based on the first model information;
receiving a first federated model after aggregation returned by the server, and verifying the first federated model to obtain first verification information of the first federated model;
disturbing the first verification information, and updating a preset disturbed first verification information set based on the disturbed first verification information;
screening out target disturbed first verification information from the updated disturbed first verification information set, and constructing a model index of a target federal model corresponding to the target disturbed first verification information;
and sending the first verification information after the target disturbance and the model index to the server so that the server can generate a post-training federated model based on the first verification information after the target disturbance and the model index.
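The five steps above can be sketched as a single participant-side round. This is a minimal illustration, not the claimed implementation: the function and parameter names (train_fn, evaluate_fn, send_fn, receive_fn), the use of accuracy as the verification information, Laplace noise as the perturbation, and "largest perturbed value so far" as the target-screening rule are all assumptions for exposition.

```python
import math
import random

def laplace_noise(scale):
    """Sample Laplace(0, scale) noise via the inverse-CDF transform."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def participant_round(train_fn, evaluate_fn, send_fn, receive_fn,
                      perturbed_history, epsilon, sensitivity=1.0):
    """One federated round on a participating node, following the five
    steps above: train, send model info, verify the aggregated model,
    perturb the verification metric, and report the target metric plus
    a model index back to the server."""
    # Step 1: train the local model and send its model information (plaintext).
    first_model_info = train_fn()
    send_fn({"model_info": first_model_info})
    # Step 2: receive the aggregated first federated model and verify it.
    federated_model = receive_fn()
    accuracy = evaluate_fn(federated_model)  # first verification information
    # Step 3: perturb the metric with noise calibrated to the privacy budget.
    perturbed = accuracy + laplace_noise(sensitivity / epsilon)
    # Step 4: update the perturbed-verification set and screen out the target
    # (here: the best perturbed accuracy seen so far) and its model index.
    perturbed_history.append(perturbed)
    target = max(perturbed_history)
    model_index = perturbed_history.index(target)  # round of the target model
    # Step 5: send the target perturbed metric and the model index back.
    send_fn({"target_metric": target, "model_index": model_index})
    return target, model_index
```

Because only a scalar perturbed metric and an index are sent in the final step, the per-round upstream traffic after model submission is a few bytes of plaintext, which is the communication saving the scheme relies on.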
Optionally, an embodiment of the present invention further provides a federated learning method, including:
receiving first model information of a trained first local model sent by at least one participating node participating in current federal learning;
aggregating the trained first local model based on the first model information to obtain a first federated model;
respectively sending the first federated model to the participating nodes, and receiving first verification information and model indexes after target disturbance returned by the participating nodes;
determining the federal model convergence state in the current federal learning according to the first verification information after the target disturbance;
and generating a post-training federal model based on the federal model convergence state and the model index.
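The aggregation step on the server side can be sketched as a FedAvg-style weighted average. This is illustrative only: the patent text says "aggregating" without fixing a rule, so the sample-count weighting and the (params, num_samples) data shape here are assumptions.

```python
def aggregate_fedavg(client_infos):
    """Aggregate first model information from participating nodes into a
    first federated model via a per-parameter weighted average, weighted
    by each client's sample count. client_infos is a list of
    (params, num_samples) pairs, where params maps layer names to lists
    of floats."""
    total = sum(n for _, n in client_infos)
    first_params = client_infos[0][0]
    aggregated = {}
    for name in first_params:
        length = len(first_params[name])
        aggregated[name] = [
            sum(params[name][i] * n for params, n in client_infos) / total
            for i in range(length)
        ]
    return aggregated
```

A usage example: two clients holding 1 and 3 samples contribute `{"w": [1.0, 2.0]}` and `{"w": [3.0, 4.0]}`; the aggregate weights the second client three times as heavily.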
Correspondingly, an embodiment of the present invention provides a federated learning apparatus, including:
the training unit is used for training the local models and sending first model information of the trained first local model to the server so that the server can aggregate at least one trained first local model based on the first model information;
the verification unit is used for receiving the aggregated first federal model returned by the server and verifying the first federal model to obtain first verification information of the first federal model;
the disturbance unit is used for carrying out disturbance processing on the first verification information and updating a preset disturbed first verification information set based on the disturbed first verification information;
the screening unit is used for screening out the first verification information after the target disturbance from the updated disturbed first verification information set and constructing a model index of a target federal model corresponding to the first verification information after the target disturbance;
and the sending unit is used for sending the first verification information after the target disturbance and the model index to the server so that the server can generate a post-training federated model based on the first verification information after the target disturbance and the model index.
Optionally, an embodiment of the present invention may further provide a federated learning apparatus, including:
the first receiving unit is used for receiving first model information of a trained first local model, which is sent by at least one participating node participating in current federal learning;
the aggregation unit is used for aggregating the trained first local model based on the first model information to obtain a first federated model;
the second receiving unit is used for respectively sending the first federated model to the participating nodes and receiving first verification information and model indexes after target disturbance returned by the participating nodes;
the determining unit is used for determining the federal model convergence state in the current federal learning according to the first verification information after the target disturbance;
and the generating unit is used for generating the post-training federal model based on the state of convergence of the federal model and the model index.
Optionally, in some embodiments, the perturbation unit may be specifically configured to obtain random noise information of the first verification information according to a preset differential privacy budget; and adding the random noise information into the first verification information to obtain disturbed first verification information.
Optionally, in some embodiments, the perturbation unit may be specifically configured to determine a noise distribution parameter of the noise information according to a preset differential privacy budget; generating a noise information set obeying the noise distribution parameter distribution based on the noise distribution parameter; and randomly screening out noise information in the noise information set to obtain random noise information of the first verification information.
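The three operations in this paragraph — derive a noise distribution parameter from the budget, generate a noise set obeying that distribution, and randomly screen one value from the set — can be sketched as follows. Assuming the Laplace mechanism (the patent does not name the distribution), the distribution parameter is the Laplace scale b = sensitivity / epsilon; the set size of 1000 is arbitrary.

```python
import math
import random

def laplace_scale(epsilon, sensitivity=1.0):
    """Noise distribution parameter (Laplace scale b = sensitivity/epsilon)
    determined from the preset differential privacy budget."""
    return sensitivity / epsilon

def noise_set(scale, size=1000):
    """Generate a set of noise values obeying the Laplace(0, scale)
    distribution, via inverse-CDF sampling."""
    values = []
    for _ in range(size):
        u = random.random() - 0.5
        values.append(-scale * math.copysign(1.0, u)
                      * math.log(1.0 - 2.0 * abs(u)))
    return values

def random_noise(epsilon, sensitivity=1.0, size=1000):
    """Randomly screen one value out of the generated noise set; this is
    the random noise added to the first verification information."""
    return random.choice(noise_set(laplace_scale(epsilon, sensitivity), size))
```

Note the standard differential-privacy trade-off: a smaller budget epsilon gives a larger scale b and therefore noisier (more private, less accurate) verification information.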
Optionally, in some embodiments, the training unit may be specifically configured to obtain local training sample data, and calculate a gradient of the local model through the local training sample data; carrying out differential privacy processing on the gradient, and converging the local model based on the processed gradient to obtain a trained first local model; the sending the first model information of the trained first local model to a server so that the server aggregates at least one trained first local model based on the first model information includes: extracting a current gradient or model parameters from the trained first local model to serve as first model information, and sending the first model information to a server so that the server can aggregate at least one trained first local model based on the first model information.
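The gradient-side differential privacy processing described here can be sketched as clip-then-noise. This is an assumption-laden illustration: the patent says only "differential privacy processing on the gradient", so the L2 clipping bound, the Laplace (rather than Gaussian) noise, and all parameter names are illustrative.

```python
import math
import random

def laplace_noise(scale):
    """Sample Laplace(0, scale) noise via the inverse-CDF transform."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_process_gradient(grad, clip_norm, epsilon):
    """Differential-privacy processing of one gradient vector: clip its
    L2 norm to clip_norm (bounding the sensitivity of the update), then
    add per-coordinate Laplace noise scaled to the privacy budget."""
    norm = math.sqrt(sum(g * g for g in grad))
    factor = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [g * factor for g in grad]
    scale = clip_norm / epsilon  # sensitivity equals clip_norm after clipping
    return [g + laplace_noise(scale) for g in clipped]
```

The same clip-then-noise step applies equally to the variant in the next paragraph, where the noise is added to the extracted model information rather than to each per-step gradient.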
Optionally, in some embodiments, the training unit may be specifically configured to obtain local training sample data, and train a local model through the local training sample data to obtain a first local model after initial training; extracting initial gradient or initial model parameters from the first local model after the initial training to obtain initial model information; and carrying out differential privacy processing on the initial model information to obtain first model information of the trained first local model.
Optionally, in some embodiments, the federal learning apparatus may further include a first detection unit, where the first detection unit may be specifically configured to, when the server is not detected, obtain a node address of a participating node participating in current federal learning; and sending the first model information of the trained first local model to the participating node based on the node address so that the participating node generates a post-training federated model based on the first model information.
Optionally, in some embodiments, the first detection unit may be specifically configured to, when second model information of a trained second local model sent by the participating node is received, aggregate the trained first local model and the trained second local model based on the second model information to obtain a second federated model; verify the second federated model to obtain second verification information, and perform disturbance processing on the second verification information to obtain disturbed second verification information; and update a preset post-disturbance second verification information set based on the post-disturbance second verification information, and generate a post-training federated model based on the updated post-disturbance second verification information set.
Optionally, in some embodiments, the first detecting unit may be specifically configured to screen target post-disturbance second verification information from the updated post-disturbance second verification information set, and send the target post-disturbance second verification information to the participating node; when third disturbed verification information sent by the participating node is received, fusing the second disturbed verification information of the target and the third disturbed verification information to obtain fused disturbed verification information; and generating a post-training federal model based on the fused post-disturbance verification information, the target post-disturbance second verification information and the post-disturbance third verification information.
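In this serverless (decentralized) case, the fusion of the target perturbed second verification information with the perturbed third verification information received from peer nodes might look like the sketch below. Averaging is an illustrative fusion rule; the patent text does not specify one.

```python
def fuse_verification(target_second, third_values):
    """Decentralized case (no federated server): fuse the node's own
    target perturbed second verification value with the perturbed third
    verification values received from peer participating nodes, by
    taking their mean."""
    values = [target_second] + list(third_values)
    return sum(values) / len(values)
```

Because every exchanged value is already perturbed, the fused result is itself differentially private, so the nodes can exchange and combine it in plaintext.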
Optionally, in some embodiments, the federal learning device may further include a second detection unit, where the second detection unit is specifically configured to receive a post-training federal model returned by the server when it is detected that the server generates the post-training federal model; when detecting that the server does not generate the post-training federated model, receiving training information returned by the server, and returning to execute the step of training the local model based on the training information until the server generates the post-training federated model, and receiving the post-training federated model returned by the server.
Optionally, in some embodiments, the determining unit may be specifically configured to fuse the first verification information after the target disturbance, so as to obtain current fusion verification information corresponding to the first federal model; acquiring historical fusion verification information corresponding to a historical federated model, and calculating an information increment between the current fusion verification information and the historical fusion verification information; and determining the convergence state of the federal model in the current federal learning based on the information increment.
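The convergence determination in this paragraph — fuse the current round's perturbed verification values, compare against the fused historical value, and decide from the information increment — can be sketched as below. The mean as the fusion rule and the threshold value are illustrative assumptions.

```python
def convergence_state(current_metrics, history_metrics, threshold=1e-3):
    """Fuse (here: average) the target perturbed verification values
    returned by the participating nodes, compute the information
    increment against the fused historical verification value, and
    declare the federated model converged when the increment falls
    below an illustrative threshold."""
    current_fused = sum(current_metrics) / len(current_metrics)
    if not history_metrics:
        # No historical federated model yet: cannot have converged.
        return current_fused, False
    history_fused = sum(history_metrics) / len(history_metrics)
    increment = abs(current_fused - history_fused)
    return current_fused, increment < threshold
```

When the state is "converged", the server builds the trained federated model from the reported model indexes; otherwise it sends training information so the nodes run another round.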
Optionally, in some embodiments, the generating unit may be specifically configured to generate a trained federated model based on the model index when the federated model convergence state is converged, and send the trained federated model to the participating node; and when the federal model convergence state is not converged, sending training information to the participating nodes so that the participating nodes can return to execute the step of training the local model based on the training information until the federal model convergence state is converged, so as to obtain a trained federal model, and sending the trained federal model to the participating nodes.
In addition, an embodiment of the present invention further provides an electronic device, which includes a processor and a memory, where the memory stores an application program, and the processor is configured to run the application program in the memory to implement the federal learning method provided in the embodiment of the present invention.
In addition, an embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium stores a plurality of instructions, and the instructions are suitable for being loaded by a processor to perform the steps in any of the federal learning methods provided by the embodiments of the present invention.
The embodiment of the invention trains the local model, sends the first model information of the trained first local model to the server, so that the server aggregates at least one trained first local model based on the first model information, receives the aggregated first federal model returned by the server, verifies the first federal model to obtain the first verification information of the first federal model, then disturbs the first verification information, updates the preset disturbed first verification information set based on the disturbed first verification information, screens the target disturbed first verification information from the updated disturbed first verification information set, constructs a model index of the target federal model corresponding to the target disturbed first verification information, and then sends the target disturbed first verification information and the model index to the server, so that the server generates a post-training federal model based on the first verification information after target disturbance and the model index; according to the scheme, the first verification information is disturbed by adopting a differential privacy mechanism, information interaction is carried out between the participants and the server in a plaintext form, any cryptography mode is not needed, the problem of ciphertext expansion is avoided, the communication overhead is very low, and the requirements on network bandwidth and delay are very low, so that the training efficiency of federal learning can be improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings required to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the description below are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained based on these drawings without creative efforts.
FIG. 1 is a system diagram of a federated learning system as provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of a scenario of a federated learning method provided by an embodiment of the present invention;
FIG. 3 is a flow chart of a federated learning method provided by an embodiment of the present invention;
FIG. 4 is a flow diagram illustrating a current federated learning provided by an embodiment of the present invention in the presence of a federated server;
FIG. 5 is a flow chart illustrating a current federated learning provided by an embodiment of the present invention without a federated server;
FIG. 6 is another flow chart of a federated learning method provided by an embodiment of the present invention;
FIG. 7 is a flow diagram illustrating federated learning using a differential privacy algorithm according to an embodiment of the present invention;
FIG. 8 is another flow chart diagram of a federated learning method provided by an embodiment of the present invention;
fig. 9 is a schematic structural diagram of a first federal learning device provided in an embodiment of the present invention;
fig. 10 is another schematic structural diagram of a first federal learning device provided in an embodiment of the present invention;
fig. 11 is another schematic structural diagram of a first federal learning device provided in an embodiment of the present invention;
FIG. 12 is a schematic structural diagram of a second federated learning device provided by the embodiments of the present invention;
fig. 13 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides a method and a device for federated learning, electronic equipment and a computer-readable storage medium. The federal learning device may be integrated in an electronic device, and the electronic device may be a server or a terminal. Specifically, the embodiments of the present invention provide a federal learning apparatus (may be referred to as a first federal learning apparatus for differentiation) suitable for a first electronic device, and a federal learning apparatus (may be referred to as a second federal learning apparatus for differentiation) suitable for a second electronic device. The first electronic device may be a distributed node, and the distributed node may be a terminal or a server, and the terminal includes, but is not limited to, a mobile phone, a computer, an intelligent voice interaction device, an intelligent household appliance, a vehicle-mounted terminal, an aircraft, and the like. The second electronic device may be a Network-side device such as a server, the server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a Network acceleration service (CDN), and a big data and artificial intelligence platform. The terminal and the server may be directly or indirectly connected through wired or wireless communication, and the application is not limited herein. The embodiment of the invention can be applied to various scenes including but not limited to cloud technology, artificial intelligence, intelligent traffic, driving assistance and the like.
In the embodiment of the present application, for example, referring to fig. 1, a federal learning system provided by the embodiment of the present invention includes a local server 10 and a federal server 20, where the local server 10 serves as a participating node or a participating party, the number of the local servers 10 may be one or more, and the local server 10 and the federal server 20 are connected through a network, for example, through a wired or wireless network connection. The local servers 10 may also be connected via a network, such as a ring topology or a mesh topology (P2P), to communicate between the local servers 10.
Wherein, the local server 10 may be configured to train a local model, and send first model information of the trained first local model to the federation server 20, so that the federation server 20 aggregates at least one trained first local model based on the first model information, receives the aggregated first federation model returned by the federation server 20, and verifies the first federation model to obtain first verification information of the first federation model, then performs perturbation processing on the first verification information, and updates a preset post-perturbation first verification information set based on the post-perturbation first verification information, screens out the post-perturbation first verification information of a target from the updated post-perturbation first verification information set, and constructs a model index of the target federation model corresponding to the post-perturbation first verification information, and then sends the post-perturbation first verification information of the target and the model index to the federation server 20, so that the federal server 20 generates the post-training federal model based on the first verification information after the target disturbance and the model index, and further improves the training efficiency of federal learning, which may be specifically shown in fig. 2.
The federal server 20 is configured to aggregate the trained first local models, and generate the trained federal model according to the target disturbed first verification information and the model index, and may specifically be as follows:
after first model information of a trained first local model sent by at least one participatory node participating in current federal learning is received, the trained first local model is aggregated based on the first model information to obtain a first federal model, then the first federal model is respectively sent to the participatory nodes, first verification information and model indexes after target disturbance returned by the participatory nodes are received, the state of federal model convergence in current federal learning is determined according to the first verification information after target disturbance, and then the trained federal model is generated based on the state of federal model convergence and the model indexes.
Federated Learning: federated learning, also called joint learning, can realize "availability without visibility" of data on the premise of protecting data security; that is, the training task of a machine learning model is completed through multi-party cooperation, and reasoning services of the machine learning model can also be provided. Unlike traditional centralized machine learning, in the federated learning process, one or more machine learning models are cooperatively trained by two or more participants together. In terms of classification, based on the distribution characteristics of the data, federated learning can be divided into Horizontal Federated Learning, Vertical Federated Learning, and Federated Transfer Learning. Horizontal federated learning is also called sample-based federated learning, and is suitable for cases where the sample sets share the same feature space but differ in sample space; vertical federated learning is also called feature-based federated learning, and is suitable for cases where the sample sets share the same sample space but differ in feature space; federated transfer learning applies to cases where the sample sets differ in both sample space and feature space. The federated learning method provided by the embodiment of the application belongs to horizontal federated learning, whose application scenario is that the respective sample data of each participating node (local server) has the same feature space but a different sample space.
It should be noted that the federated learning method provided in the embodiment of the present application relates to a machine learning technique in the field of artificial intelligence; that is, in the embodiment of the present application, a federated model can be trained by the participating nodes and the federated server using artificial intelligence machine learning techniques, so that a trained federated model is obtained.
So-called Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making. The artificial intelligence technology is a comprehensive subject and relates to the field of extensive technology, namely the technology of a hardware level and the technology of a software level. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning, automatic driving, intelligent traffic and the like.
Machine Learning (ML) is a multi-domain cross discipline, and relates to a plurality of disciplines such as probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and the like. The special research on how a computer simulates or realizes the learning behavior of human beings so as to acquire new knowledge or skills and reorganize the existing knowledge structure to continuously improve the performance of the computer. Machine learning is the core of artificial intelligence, is the fundamental approach for computers to have intelligence, and is applied to all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and formal education learning.
The participating nodes in the federal learning and the federal server can be cloud platforms. The cloud platform is also called a cloud computing platform, and is a service based on hardware resources and software resources, and provides computing, network and storage capabilities. Cloud computing (cloud computing) is a computing model that distributes computing tasks over a pool of resources formed by a large number of computers, enabling various application systems to obtain computing power, storage space, and information services as needed. The network that provides the resources is referred to as the "cloud". Resources in the "cloud" appear to the user as being infinitely expandable and available at any time, available on demand, expandable at any time, and paid for on-demand.
As a basic capability provider of cloud computing, a cloud computing resource pool (referred to as an IaaS (Infrastructure as a Service) platform for short) is established, and multiple types of virtual resources are deployed in the resource pool for external clients to use selectively.
According to logical function division, a PaaS (Platform as a Service) layer can be deployed on the IaaS (Infrastructure as a Service) layer, and a SaaS (Software as a Service) layer can be deployed on the PaaS layer; SaaS can also be deployed directly on IaaS. PaaS is a platform on which software runs, such as a database or a web container. SaaS comprises various business software, such as web portals and bulk SMS services. Generally speaking, SaaS and PaaS are upper layers relative to IaaS.
Detailed descriptions are given below. It should be noted that the order of description of the following embodiments is not intended to limit the preferred order of the embodiments.
This embodiment will be described from the perspective of a first federal learning device, which may be specifically integrated in an electronic device, where the electronic device may be a device such as a local server. The local server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery network (CDN) services, big data, and artificial intelligence platforms.
A method for federated learning, comprising:
training a local model, and sending first model information of the trained first local model to a server, so that the server aggregates at least one trained first local model based on the first model information; receiving the aggregated first federal model returned by the server, and verifying the first federal model to obtain first verification information of the first federal model; performing disturbance processing on the first verification information, and updating a preset disturbed first verification information set based on the disturbed first verification information; screening out target disturbed first verification information from the updated disturbed first verification information set, and constructing a model index of the target federal model corresponding to the target disturbed first verification information; and sending the target disturbed first verification information and the model index to the server, so that the server generates a post-training federal model based on the target disturbed first verification information and the model index.
As shown in fig. 3, the specific flow of the federal learning method is as follows:
101. Training a local model, and sending first model information of the trained first local model to a server, so that the server can aggregate at least one trained first local model based on the first model information.
The server may be a federal server, and the so-called federal server may be used to aggregate local models trained by the participants in a federal learning scenario and generate a post-training federal model based on verification information of the federal model returned by the participants.
The first model information may be understood as model information of the trained first local model, and the model information may be a model gradient or a model parameter of the trained first local model.
The local model may be trained in a plurality of ways, for example, the differential privacy processing may be performed in the training process of the local model, or the differential privacy processing may be performed after the local model is trained, so as to obtain a trained first local model meeting the requirements of the differential privacy, which may specifically be as follows:
(1) differential privacy processing during training of local models
For example, local training sample data may be acquired, the gradient of the local model is calculated through the local training sample data, differential privacy processing is performed on the gradient, and the local model is converged based on the processed gradient, so that a trained first local model is obtained.
Differential privacy (DP) is a statistical means of privacy protection that aims to maximize the accuracy of queries against a statistical database while minimizing the chance of identifying individual records. One key concept in differential privacy is the adjacent data set: given two data sets x and x', if they differ in one and only one record, the two data sets are called adjacent data sets. A random algorithm M is considered to satisfy differential privacy if, when it acts on two adjacent data sets (for example, training two machine learning models respectively), it is difficult to distinguish, beyond a certain probability, which data set a given output came from. Mathematically, ε-differential privacy may be defined as shown in formula (1):

Pr[M(x) = o] ≤ e^ε · Pr[M(x') = o]    (1)

where o represents an output and ε represents the privacy loss metric. The meaning of formula (1) is that, for any adjacent data sets, the probabilities of training to a particular output parameter are nearly the same. Therefore, an observer can hardly perceive a small change of the data set by observing the output parameters, and cannot reversely derive a specific training record from the output parameters. In this way, the purpose of protecting local data is achieved.
The gradient may be subjected to differential privacy processing in various manners; for example, two-norm clipping and Gaussian noise addition may be performed on the gradient, or other differential privacy processing manners may be adopted. The privacy budget of the differential privacy processing may be a preset training privacy budget ε1.
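As an illustrative sketch (not the patent's own implementation) of the two-norm clipping and Gaussian noise step described above, where `clip_norm` and `sigma` are assumed hyperparameters:

```python
import math
import random

def dp_sanitize_gradient(grad, clip_norm=1.0, sigma=1.0, rng=None):
    """Clip a gradient vector to a two-norm bound, then add Gaussian noise.

    clip_norm and sigma are illustrative; the text only specifies
    "two-norm clipping and Gaussian noise addition".
    """
    rng = rng or random.Random(0)
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [g * scale for g in grad]  # two-norm clipping
    # Gaussian noise scaled by the clipping bound
    return [g + rng.gauss(0.0, sigma * clip_norm) for g in clipped]
```

A gradient whose norm exceeds `clip_norm` is scaled down to the bound; one within the bound is left unchanged before noise is added.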
(2) Differential privacy processing after local model training
For example, local training sample data may be acquired, the local model is trained through the local training sample data to obtain a first local model after initial training, an initial gradient or an initial model parameter is extracted from the first local model after the initial training to obtain initial model information, and differential privacy processing is performed on the initial model information to obtain first model information of the first local model after training.
For example, two-norm clipping and Gaussian noise addition may be performed on the initial gradient or initial model parameters in the initial model information, or other differential privacy processing methods may be used; the privacy budget of the differential privacy processing may be a preset training privacy budget ε1.
After the local model is trained, the first model information of the trained first local model may be sent to the server. For example, when the trained first local model is obtained by performing differential privacy processing during training, the current gradient or model parameters may be extracted from the trained first local model as the first model information and sent to the server, so that the server aggregates at least one trained first local model based on the first model information. When the trained first local model is obtained by performing differential privacy processing after training, the first model information of the trained first local model is likewise sent to the server, so that the server aggregates at least one trained first local model based on the first model information.
The aggregation of the at least one trained first local model by the server can be understood as the integrated fusion of the trained first local models of all participants in the current federal learning, so that a first federal model Mn is obtained. The trained first local models may be aggregated in various ways; for example, the first federal model may be generated by federated averaging (Federated Average), or by an ensemble learning method. After the server generates the first federal model, the first federal model may be sent to each participant in the current federal learning.
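A minimal sketch of the federated-averaging aggregation mentioned above; weighting by local sample counts is an assumption, since the text only names the Federated Average method:

```python
def federated_average(local_params, weights=None):
    """Aggregate per-participant parameter vectors by (weighted) averaging.

    local_params: list of equal-length parameter lists, one per participant.
    weights: optional per-participant weights (e.g. local sample counts).
    """
    if weights is None:
        weights = [1.0] * len(local_params)
    total = sum(weights)
    dim = len(local_params[0])
    return [
        sum(w * p[i] for w, p in zip(weights, local_params)) / total
        for i in range(dim)
    ]
```

With equal weights this reduces to the plain coordinate-wise mean of the participants' model parameters.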
102. And receiving the aggregated first federal model returned by the server, and verifying the first federal model to obtain first verification information of the first federal model.
The first verification information is obtained by verifying or evaluating the effect of the first federal model by a current federal learning participant.
The aggregated first federal model returned by the server may be received in a plurality of ways, which may specifically be as follows:
for example, the first federal model returned by the server may be directly received, or federal model information of the first federal model returned by the server may also be received, and the first federal model is generated based on the federal model information.
The method for generating the first federal model based on the federal model information may be various, for example, a base model may be obtained, and the model parameters or the gradients of the base model are updated and adjusted based on the federal model information, so as to obtain the first federal model, or the gradients or the model parameters of the local model may be adjusted based on the federal model information, so as to obtain the first federal model.
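One of the options above, updating a base model from the federal model information, could be sketched as follows, assuming the information carries a gradient and an illustrative learning rate `lr` (not specified in the text):

```python
def apply_federal_update(base_params, federal_grad, lr=1.0):
    """Rebuild the first federal model locally by applying the gradient
    carried in the federal model information to the base model's
    parameters (a hypothetical sketch; lr is an assumed hyperparameter).
    """
    return [p - lr * g for p, g in zip(base_params, federal_grad)]
```

If the federal model information instead carries full model parameters, the base model's parameters would simply be overwritten rather than updated by a gradient step.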
After the first federal model is received, the first federal model can be verified; verification here means evaluating the quality of the federal model. The first federal model may be verified in various ways; for example, verification sample data can be obtained, the first federal model verified based on the verification sample data to obtain verification information of the first federal model, and the verification information evaluated to obtain the first verification information of the first federal model.
The first federal model may be verified in various ways; for example, the number of verification sample data correctly identified or classified by the first federal model may be counted, or at least one model metric of the first federal model may be evaluated based on the verification sample data, so as to obtain the verification information.
The model index may be of various types, and for example, the model index may include AUC (area under ROC curve, which is a model evaluation index), Accuracy, Precision, Recall, and/or F1-Score (a classification index).
After the verification information is obtained, the verification information can be evaluated in various manners. For example, when the verification information contains the verification result of only one verification mode, the verification information can be directly converted into an effect score Qn,k, which serves as the first verification information. When the verification information contains verification results of multiple verification modes, each verification result may be weighted, the weighted verification results fused, and the effect score Qn,k of the first federal model Mn determined based on the fused verification results; the effect score Qn,k then serves as the first verification information.
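The weighting-and-fusion step can be sketched as below; the metric names and the normalized weighted sum are assumptions, as the text only says the results are weighted and then fused:

```python
def effect_score(metrics, metric_weights):
    """Fuse several verification metrics into one effect score Q.

    metrics / metric_weights: dicts keyed by metric name (e.g. "auc",
    "accuracy"); the normalized weighted-sum fusion is illustrative.
    """
    total_w = sum(metric_weights.values())
    return sum(metrics[k] * w for k, w in metric_weights.items()) / total_w
```

With a single metric and weight 1, the score is just that metric's value, matching the single-verification-mode case above.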
103. And disturbing the first verification information, and updating a preset disturbed first verification information set based on the disturbed first verification information.
The perturbation processing may be understood as performing differential privacy processing on the first verification information, or may be understood as adding random noise information to the first verification information, so as to perturb the first verification information.
The preset disturbed first verification information set may be a set storing disturbed first verification information of the first federal model to be verified, which is received by a participant of current federal learning each time, and when the nth first federal model to be verified is received, at this moment, (n-1) disturbed first verification information is stored in the preset disturbed first verification information set. After the disturbance processing of the currently received first verification information of the first federal model is completed, the preset disturbed first verification information set may be updated, and at this time, the updated disturbed first verification information set may include n pieces of disturbed first verification information.
The method for performing perturbation processing on the first verification information may be various, and specifically may be as follows:
for example, the random noise information of the first verification information may be obtained according to the preset differential privacy budget, and the random noise information is added to the first verification information to obtain the disturbed first verification information.
The random noise information may be noise information randomly added to the first verification information; taking the effect score Qn,k as the first verification information, for example, the random noise information may be a random number. There are various ways of obtaining the random noise information of the first verification information according to the preset differential privacy budget; for example, a noise distribution parameter of the noise information may be determined according to the preset differential privacy budget, a noise information set obeying the noise distribution parameter generated based on that parameter, and noise information randomly screened from the noise information set to obtain the random noise information of the first verification information.

The preset differential privacy budget can also be understood as the total privacy overhead in the current federal learning scene, and the privacy overhead required for the differential privacy processing of the first verification information may be the verification differential privacy budget ε2. Accordingly, there are various ways to determine the noise distribution parameter of the noise information according to the preset differential privacy budget; for example, the difference between the preset differential privacy budget ε and the preset training privacy budget ε1 may be calculated, the verification differential privacy budget ε2 determined based on that difference, and the noise distribution parameter of the noise information determined based on the verification differential privacy budget. The noise distribution parameter may be understood as a parameter indicating the distribution of the noise information.
After the random noise information of the first verification information is obtained, the random noise information may be added to the first verification information to obtain the disturbed first verification information. The random noise information may be added in various ways; for example, taking the effect score Qn,k as the first verification information and a Laplace-distributed random number as the random noise information, the random number may be added directly to the effect score Qn,k to obtain the disturbed first verification information, as shown in formula (2):

Vn,k = Qn,k + Lap(Δf/ε2)    (2)

where Vn,k is the disturbed first verification information, and Lap(Δf/ε2) is a random number drawn from a Laplace distribution with scale Δf/ε2. The verification differential privacy budget ε2 is the privacy overhead consumed by this differential privacy processing, and may be computed as shown in formula (3):

ε2 = (ε − ε1) / N    (3)

where ε2 is the verification differential privacy budget, ε is the preset differential privacy budget (the total differential privacy budget preset in the current federal learning), ε1 is the training differential privacy budget, and N is the total number of iterations; one disturbance of the first verification information counts as one iteration, which may also be understood as the number of aggregations of the federal model.
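Formulas (2) and (3) can be sketched in code as follows; the inverse-CDF Laplace sampler and the default sensitivity of 1 are assumptions for illustration:

```python
import math
import random

def verification_budget(total_eps, train_eps, n_rounds):
    # epsilon_2 = (epsilon - epsilon_1) / N, as in formula (3)
    return (total_eps - train_eps) / n_rounds

def perturb_score(q, eps2, sensitivity=1.0, rng=None):
    # V = Q + Lap(sensitivity / eps2), as in formula (2);
    # Laplace noise is sampled here via the inverse-CDF method
    rng = rng or random.Random()
    u = rng.random() - 0.5
    b = sensitivity / eps2
    noise = -b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return q + noise
```

A smaller ε2 yields a larger Laplace scale b and therefore stronger perturbation of the effect score.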
The method for disturbing the first verification information adopts a Laplace privacy mechanism. The Laplace mechanism perturbs the numerical answer of a query with Laplace noise, so that outputs on adjacent data sets occur with similar probabilities, thereby guaranteeing ε-differential privacy. For a query function f: D → R^k, the Laplace mechanism can be shown in formula (4):

M(x) = f(x) + (Y1, Y2, …, Yk)    (4)

where Y1, …, Yk are independent and identically distributed random variables obeying the Laplace distribution Lap(Δf/ε), and Δf is the sensitivity, i.e., the maximum change in f caused by replacing one record of the data set. Therefore, for any adjacent data sets x and x', the ratio of the probabilities of producing the same output is at most e^ε.
It should be noted that, in addition to the Laplace differential privacy mechanism, other differential privacy mechanisms may also be adopted to disturb the first verification information, for example, an exponential mechanism, a Gaussian noise mechanism, a Best Response (optimal response) mechanism, and the like.
After the first verification information is disturbed, the preset disturbed first verification information may be updated based on the disturbed first verification information, and the updating manner may be various, for example, the disturbed first verification information may be directly added to the preset disturbed first verification information set to obtain an updated disturbed first verification information set, and at this time, the updated disturbed first verification information set includes the disturbed first verification information for each received first verification information of the first federated model.
104. Screening out the target disturbed first verification information from the updated disturbed first verification information set, and constructing a model index of the target federal model corresponding to the target disturbed first verification information.
The model index may be understood as index information indicating a target federal model.
The method for screening out the target disturbed first verification information from the updated disturbed first verification information set may be various, and specifically may be as follows:
for example, the post-disturbance first verification information with the largest effect score may be screened from the updated post-disturbance first verification information set to obtain the target post-disturbance first verification information, or the post-disturbance first verification information in the updated post-disturbance first verification information set may be sorted, based on the sorting result, at least one candidate post-disturbance first verification information is screened from the updated post-disturbance first verification information set, and the post-disturbance first verification information with the largest effect score or the largest effect score is screened from the candidate post-disturbance first verification information as the target post-disturbance first verification information. Optionally, the disturbed first verification information may be screened out by an exponential mechanism (exponentiallmechnishm) as the target disturbed first verification information.
It should be noted that each piece of disturbed first verification information corresponds to a first federal model. Therefore, after the target disturbed first verification information is screened out, a model index of the corresponding target federal model can be constructed, and this can be done in various ways; for example, the set of received first federal models may be obtained, the federal model corresponding to the target disturbed first verification information screened out from the set to obtain the target federal model, and the model index of the target federal model then constructed according to the model information of the target federal model.
105. And sending the first verification information and the model index after the target disturbance to a server so that the server can generate a post-training federal model based on the first verification information and the model index after the target disturbance.
The method for sending the first verification information and the model index after the target disturbance to the server may be various, and specifically may be as follows:
for example, the first verification information after the target disturbance and the model index may be directly sent to the server, so that the server generates the post-training federal model based on the first verification information after the target disturbance and the model index, or a training stopping request may be sent to the server, where the training stopping request carries a storage address of the first verification information after the target disturbance and the model index, so that the server obtains the first verification information after the target disturbance and the model index based on the storage address, and then the server may generate the post-training federal model based on the first verification information after the target disturbance and the model index.
Optionally, after the target disturbed first verification information and the model index are sent to the server so that the server generates the post-training federal model, it may also be detected whether the server has generated the post-training federal model. For example, when it is detected that the server has generated the post-training federal model, the post-training federal model returned by the server is received; when it is detected that the server has not generated the post-training federal model, training information returned by the server is received, and based on the training information, the step of training the local model is executed again until the server generates the post-training federal model, whereupon the post-training federal model returned by the server is received.
Optionally, after the local model is trained, it may also be detected whether a server exists in the current federal learning scenario. When a server is detected, the first model information of the trained first local model may be sent to the server. When no server is detected, it can be concluded that no federal server exists in the current federal learning scenario; in this case it can also be determined that two participants exist in the current federal learning, and the two participants can communicate directly without going through a federal server. The trained first local model can then be sent to the participating node of the other participant in the current federal learning; for example, when no server is detected, the node address of a participating node participating in the current federal learning is obtained, and the first model information of the trained first local model is sent to that participating node based on the node address, so that the participating node can generate the trained federal model based on the first model information.
It should be noted that, in a current federal learning scenario without a server, a participant may act both as a participant and in the role of a server in federal learning. That is, the first model information of the trained first local model may be sent to the participating node, and second model information of a trained second local model sent by the participating node may also be received. In this case, the trained first local model and the trained second local model may be aggregated by the local participating node to obtain a second federal model, and the second federal model may be further verified to generate the trained federal model. For example, when the second model information of the trained second local model sent by the participating node is received, the trained first local model and the trained second local model are aggregated based on the second model information to obtain a second federal model; the second federal model is verified to obtain second verification information; the second verification information is disturbed to obtain disturbed second verification information; a preset disturbed second verification information set is updated based on the disturbed second verification information; and the trained federal model is generated based on the updated disturbed second verification information set.
For example, the target disturbed second verification information is screened from the updated disturbed second verification information set and sent to the participating node. When disturbed third verification information sent by the participating node is received, the target disturbed second verification information and the disturbed third verification information are fused to obtain fused disturbed verification information, and the post-training federal model is generated based on the fused disturbed verification information, the target disturbed second verification information, and the disturbed third verification information.
The method for screening the second verification information after the target disturbance from the updated disturbed second verification information set is the same as the method for screening the first verification information after the target disturbance, which is described above in detail and is not repeated here.
The post-training federal model may be generated based on the fused disturbed verification information, the target disturbed second verification information, and the disturbed third verification information in various ways. For example, historical disturbed verification information may be obtained, the information increment between the fused disturbed verification information and the historical disturbed verification information calculated, and the federal model convergence state of the current federal learning determined based on the information increment. When the federal model convergence state is converged, a first candidate federal model corresponding to the target disturbed second verification information and a second candidate federal model corresponding to the disturbed third verification information are obtained, and the post-training federal model is generated according to the first candidate federal model and the second candidate federal model. When the federal model convergence state is not converged, the step of training the local model is executed again until the federal model convergence state is converged, so as to obtain the trained federal model.
The post-training federal model may be generated from the first candidate federal model and the second candidate federal model in various ways; for example, the post-training federal model may be screened out from the first candidate federal model and the second candidate federal model, or obtained by fusing the first candidate federal model and the second candidate federal model, and so on.
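The information-increment convergence check described above can be sketched as follows; the threshold is an assumed hyperparameter, as the text does not specify how the increment is compared:

```python
def is_converged(fused_info, history, threshold=1e-3):
    """Declare convergence when the increment between the newly fused
    disturbed verification information and the most recent historical
    value falls below a threshold (an assumed hyperparameter)."""
    if not history:
        return False  # no historical information yet: keep training
    increment = abs(fused_info - history[-1])
    return increment < threshold
```

When this returns False, the flow loops back to the local-model training step; when it returns True, the candidate federal models are retrieved and the post-training federal model is produced.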
It should be noted that, in a federal learning scenario including a server, the participating nodes of at least two participants interact with the server. Taking the disturbance processing mode as the Laplace mechanism, the first verification information as an effect score, and the server as a federal server, for example, the flow of current federal learning with a federal server may be as shown in fig. 4. Each participating node trains its local model based on differential privacy and sends the trained first local model Mn,1 to the federal server; the federal server aggregates the received trained first local models to obtain a federal model Mn and sends the federal model Mn to the participating node of each participant; the participating node evaluates the effect of the federal model Mn to obtain an effect score Qn,k; the participant then disturbs the effect score Qn,k based on the Laplace mechanism, and sends the disturbed score Vn,k and the corresponding model index Ln,k to the federal server, so that the federal server generates the trained federal model. In a federal learning scenario that does not include a server, participant 1 and participant 2 exist, and the two communicate through a ring topology or a mesh topology (P2P). Taking the disturbance processing mode as the Laplace mechanism and the first verification information as an effect score, for example, the flow of current federal learning without a federal server may be as shown in fig. 5. Participant 1 trains a local model based on a differential privacy mechanism and sends the first model information of the trained first local model to participant 2; participant 1 may also receive the second model information of the trained second local model sent by participant 2, aggregate the trained first local model and the trained second local model, and evaluate the effect of the aggregated federal model. Participant 1 then disturbs the effect score of the federal model using the Laplace mechanism and sends the disturbed effect score to participant 2, and meanwhile can receive the disturbed effect score sent by participant 2, so that the post-training federal model is generated.
In the current federal learning scenario, participants use DP-SGD and a Laplace-based differential privacy mechanism to protect their training data: a participant can train its local model and send it to the federal server or to other participants in plaintext form. Overfitting in horizontal federal model training can be effectively prevented through effect evaluation of the federal model, and because the verification information generated by the effect evaluation is disturbed, the security of the participants can also be guaranteed during model effect evaluation. In the data interaction of one iteration between the participants and the server, each participant only needs to send model information and disturbed verification information (including the model index) to the federal server once, all in plaintext form; likewise, the federal model (or its model information) is transmitted from the federal server to each participant only once, also in plaintext. The communication overhead is therefore very small, and the requirements on the bandwidth and latency of the communication network are very low. In addition, the scheme supports two-party horizontal federal learning, in which one party can communicate directly with the other party without a (federal) server.
As can be seen from the above, in the embodiment of the present application, after a local model is trained, first model information of the trained first local model is sent to a server, so that the server aggregates at least one trained first local model based on the first model information; the aggregated first federal model returned by the server is received and verified to obtain first verification information of the first federal model; the first verification information is then disturbed, and a preset disturbed first verification information set is updated based on the disturbed first verification information; target disturbed first verification information is screened out from the updated disturbed first verification information set, and a model index of the target federal model corresponding to the target disturbed first verification information is constructed; the target disturbed first verification information and the model index are then sent to the server, so that the server generates a post-training federal model based on the target disturbed first verification information and the model index. In this scheme, a differential privacy mechanism is adopted to disturb the first verification information, and information interaction between the participant and the server is carried out in plaintext; no cryptographic method is needed, so there is no ciphertext expansion problem, the communication overhead is very low, and the requirements on network bandwidth and latency are very low, thereby improving the training efficiency of federal learning.
This embodiment will be described from the perspective of a second federal learning apparatus, which may be integrated in an electronic device, where the electronic device may be a server or another device. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery network (CDN) services, big data, and artificial intelligence platforms.
A method for federated learning, comprising:
After first model information of a trained first local model sent by at least one participating node in the current federal learning is received, the trained first local model is aggregated based on the first model information to obtain a first federal model. The first federal model is then sent to each participating node, and the first verification information after target disturbance and the model indexes returned by the participating nodes are received. The federal model convergence state in the current federal learning is then determined according to the first verification information after target disturbance, and the trained federal model is generated based on the federal model convergence state and the model indexes.
As shown in fig. 6, the federal learning method includes the following specific procedures:
201. And receiving first model information of the trained first local model sent by at least one participant node participating in the current federal learning.
For example, the first model information of the trained first local model sent by the participating node participating in the current federated learning may be directly received, or a model aggregation request sent by the participating node participating in the current federated learning may also be received, where the model aggregation request carries a storage address of the trained first local model, and the first model information of the trained first local model is obtained based on the storage address.
202. And aggregating the trained first local model based on the first model information to obtain a first federal model.
For example, a trained first local model may be constructed based on the first model information, and the trained first local models may be aggregated to obtain a first federated model.
For example, the local model of the participant may be obtained, information such as a gradient or a model parameter of the model is extracted from the first model information, and the gradient or the model parameter is set in the local model, so as to obtain the trained first local model, or the information such as the gradient or the model parameter of the model is extracted from the first model information, and the trained first local model is generated based on the model gradient or the model parameter.
After the trained first local models are constructed, they may be aggregated in multiple ways. For example, the trained first local models of the participants may be aggregated by federated averaging (FedAvg) to obtain the first federal model, or the trained first local models of the participating nodes may be integrated and fused in an ensemble learning manner to obtain the first federal model.
The federated averaging may be performed in various ways. For example, the sample data size of each participating node may be obtained, a weighting coefficient of each participating node determined based on that sample data size, and the trained first local models of the participating nodes weighted and averaged based on the weighting coefficients, so as to obtain the first federal model.
The method for determining the weighting coefficient of each participating node may be various based on the sample data size, for example, the sample data size is accumulated to obtain the total sample data amount of the current federal study, the quantity ratio of the sample data size of each participating node to the total sample data amount is calculated, and the quantity ratio is used as the weighting coefficient of the trained first local model of the corresponding participating node.
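The weighted federated-averaging step described above can be sketched as follows. This is an illustrative Python sketch, not code from the patent: the function and variable names are assumptions, and each participant's model is assumed to be a flat parameter vector.

```python
def federated_average(local_params, sample_sizes):
    """Aggregate trained local models by weighting each participant's
    parameters with its share of the total sample data (FedAvg)."""
    total = sum(sample_sizes)
    # Weighting coefficient = node's sample count / total sample count.
    weights = [n / total for n in sample_sizes]
    dim = len(local_params[0])
    # Weighted sum over participants, per parameter, gives the federal model.
    return [sum(w * p[i] for w, p in zip(weights, local_params))
            for i in range(dim)]

params = [[1.0, 2.0], [3.0, 4.0]]   # two participants' parameter vectors
sizes = [100, 300]                  # their local sample counts
print(federated_average(params, sizes))  # → [2.5, 3.5]
```

The node with three times as much data contributes three times as much to each aggregated parameter.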
203. And respectively sending the first federated model to the participating nodes, and receiving the first verification information and the model index after the target disturbance returned by the participating nodes.
For example, the first federation model may be directly sent to each participating node, or a model gradient or a model parameter and the like may be extracted from the first federation model as federation model information and the federation model information is sent to the participating nodes, or a verification request carrying a storage address of the first federation model or the federation model information of the first federation model may be sent to each participating node, so that the participating nodes obtain the federation model information of the first federation model or the first federation model through the storage address.
After receiving the first federal model or federal model parameters of the first federal model, the participating nodes can verify the first federal model to obtain first verification information of the first federal model, disturb the first verification information, update a preset disturbed first verification information set based on the disturbed first verification information, screen out target disturbed first verification information from the updated disturbed first verification information set, and construct a model index of the target federal model corresponding to the target disturbed first verification information.
After the participating node verifies the first federated model and screens out the first verification information and the model index after the target disturbance, the first verification information and the model index after the target disturbance sent by the participating node can be received, and the modes for receiving the first verification information and the model index after the target disturbance can be various, for example, the first verification information and the model index after the target disturbance sent by the participating node can be directly received, or a model generation request can be received, the model generation request carries the storage address of the first verification information and the model index after the target disturbance, and the first verification information and the model index after the target disturbance can be obtained based on the storage address.
204. And determining the federal model convergence state in the current federal learning according to the first verification information after the target disturbance.
The state of the federal model is used for indicating whether the federal model is converged in the current federal learning, and whether the current federal learning needs to stop training or not can be judged through convergence of the federal model, so that overfitting of the federal model is prevented.
The method for determining the federal model convergence state in the current federal learning according to the first verification information after the target disturbance may be various, and specifically may be as follows:
for example, the first verification information after the target disturbance is fused to obtain current fusion verification information corresponding to the first federal model, historical fusion verification information corresponding to the historical federal model is obtained, information increment between the current fusion verification information and the historical fusion verification information is calculated, and the federal model convergence state in the current federal learning is determined based on the information increment.
The method for fusing the first verification information after the target disturbance may be various, for example, the weighting coefficients of the participating nodes may be obtained, and the weighted average processing is performed on the first verification information after the target disturbance based on the weighting coefficients, so as to obtain the current fusion verification information.
The historical fusion verification information may be fusion verification information corresponding to other federated models aggregated by the server before the first federated model is aggregated. The information increment may be an increment between the current fusion verification information and the historical fusion verification information, or may be an information difference. The method for calculating the information increment between the current fusion verification information and the historical fusion verification information may be various, for example, the historical fusion verification information is sorted based on the aggregation time of the federation model, a preset number of historical fusion verification information adjacent to the first federation model is screened from the historical fusion verification information based on the sorting information and the aggregation time to obtain target historical fusion verification information, and the information increment between the current fusion verification information and each target historical fusion verification information is calculated respectively.
After the information increment is calculated, the federal model convergence state in the current federal learning can be determined based on the information increment, and the federal model convergence state can be determined in various ways, for example, increment difference values between the information increments are respectively calculated, when the increment difference values do not exceed a preset increment threshold value, the federal model convergence state is determined to be converged, and when the increment difference values exceed the preset increment threshold value, the federal model convergence state is determined to be not converged.
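The increment-based convergence check described above can be sketched as follows, under the assumption that the fused verification information is a single scalar score; the function name, the window size k, and the threshold value are all illustrative choices, not values from the patent.

```python
def is_converged(current_score, history, k=3, threshold=1e-3):
    """Decide the federal model convergence state from score increments."""
    recent = history[-k:]                        # k most recent fused scores
    # Increment between the current fused score and each historical score.
    increments = [current_score - h for h in recent]
    # Differences between successive increments; a small spread means the
    # score has stopped improving, i.e. the federal model has converged.
    diffs = [abs(increments[i + 1] - increments[i])
             for i in range(len(increments) - 1)]
    return all(d <= threshold for d in diffs)

print(is_converged(0.900, [0.899, 0.8995, 0.8999]))  # plateau: converged
print(is_converged(0.900, [0.500, 0.700, 0.850]))    # still improving
```

With a plateaued score history the increment differences stay below the threshold, so the state is reported as converged; while the score is still climbing they exceed it.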
In the process of determining the federal model convergence state in the current federal learning according to the first verification information after target disturbance, taking the fusion verification information as the model score as an example, the federal server can be understood as comparing the weighted average scores of the n received federal models: if the federal model has converged, the model score can hardly keep improving as the number of iteration rounds grows, so the federal model convergence state can be judged accurately.
205. And generating a post-training federal model based on the federal model convergence state and the model index.
For example, when the convergence state of the federal model is converged, a trained federal model is generated based on the model index and is sent to the participating nodes, and when the convergence state of the federal model is not converged, training information is sent to the participating nodes so that the participating nodes return to the step of training the local model based on the training information until the convergence state of the federal model is converged, so that the trained federal model is obtained, and the trained federal model is sent to the participating nodes.
There may be various ways to generate the trained federal model based on the model indexes. For example, the target federal models corresponding to the model indexes {Ln,1, Ln,2, …, Ln,k} may be screened out from the aggregated federal model set, and the trained federal model screened out from those target federal models; or the target federal model corresponding to each model index may be screened out from the converged federal model set, and the target federal models fused to obtain the trained federal model.
The method for fusing the target federal model may be various, for example, the target federal model is subjected to average processing to obtain the post-training federal model, or the target federal model is subjected to weighted average processing to obtain the post-training federal model.
In the current federal learning scenario with a server, the process of federal learning using a differential privacy algorithm may be as shown in fig. 7, where each participant in horizontal federal learning performs DP-SGD (a differential privacy algorithm) algorithm training locally, and sends the trained model to the federal server in a plaintext form after the local training is completed. And the federal server performs integration and fusion on the received local model parameters and then sends the obtained federal integrated learning model to each participant. After each participant receives the federal model, the model effect evaluation is carried out on the federal model by using a verification set, the result of the model evaluation is sent to a server side, and the server side determines whether to stop training or not. And when the training is determined not to stop, the participant K returns to execute the step of performing local training, and when the training is determined to stop, the optimal federal model is selected as the post-training federal model by using the model average evaluation score.
As can be seen from the above, in the embodiment of the application, after first model information of a trained first local model sent by at least one participating node in the current federal learning is received, the trained first local model is aggregated based on the first model information to obtain a first federal model; the first federal model is then sent to each participating node, and the first verification information after target disturbance and the model indexes returned by the participating nodes are received. Because the server and the participants interact in plaintext form, no cryptographic processing is needed, the problem of ciphertext expansion is avoided, the communication overhead is very low, and the requirements on network bandwidth and latency are very low, so the training efficiency of federal learning can be improved.
The method described in the above examples is further illustrated in detail below by way of example.
In this embodiment, the description takes the example of a federal learning scenario in which a federal server participates, with the first federal learning apparatus integrated in a participant's local server and the second federal learning apparatus integrated in the federal server; in a federal learning scenario without a federal server, the local models may further include a first local model of the local node and a second local model corresponding to a participating node. The verification information is taken to be the effect score of the federal model as an example.
As shown in fig. 8, a federal learning method specifically includes the following steps:
301. The local server trains the local model and sends first model information of the trained first local model to the federal server.
For example, the local server may perform the differential privacy processing in the training process of the local model, or may also perform the differential privacy processing after the local model is trained, so as to obtain the trained first local model meeting the differential privacy requirement, which may specifically be as follows:
(1) the local server performs differential privacy processing in the training process of the local model;
for example, the local server may obtain local training sample data, calculate a gradient of the local model through the local training sample data, perform two-norm clipping on the gradient, add gaussian noise, and the like, or may perform processing by using other differential privacy processing methods, where a privacy budget of the differential privacy processing may be a preset training privacy budget (epsilon)1). And converging the local model based on the processed gradient to obtain a trained first local model.
(2) And the local server performs differential privacy processing after the local model is trained.
For example, the local server may obtain local training sample data, train the local model on that data to obtain an initially trained first local model, and extract an initial gradient or initial model parameters from the initially trained first local model to obtain initial model information. It then performs two-norm clipping and Gaussian-noise addition on the initial gradient or initial model parameters in the initial model information, or applies another differential privacy processing method, to obtain first model information of the trained first local model. The privacy budget of the differential privacy processing may be a preset training privacy budget ε1.
When the trained first local model is obtained by performing differential privacy processing during the training of the local model, the local server may extract the current gradient or model parameters from the trained first local model as the first model information and send the first model information to the server, so that the server aggregates at least one trained first local model based on the first model information. When the trained first local model is obtained by performing differential privacy processing after the local model is trained, the local server sends the first model information of the trained first local model to the server, so that the server aggregates at least one trained first local model based on the first model information.
302. And the federal server aggregates the trained first local model based on the first model information to obtain a first federal model.
For example, the federal server may obtain a local model of each participant, extract information such as a gradient or a model parameter of the model from each piece of first model information, and set the gradient or the model parameter in the local model, thereby obtaining a trained first local model of each participant, or may extract information such as a gradient or a model parameter of the model from the first model information, and generate a trained first local model of each participant based on the gradient or the model parameter of the model.
The federated server obtains the sample data size of each participating node, accumulates the sample data size to obtain the total sample data of the current federated learning, calculates the quantity ratio of the sample data size of each participating node to the total sample data, and takes the quantity ratio as the weighting coefficient of the trained first local model of the corresponding participating node. And carrying out weighted average on the trained first local models of the participating nodes based on the weighting coefficients so as to obtain a first federal model, or integrating and fusing the trained first local models of the participating nodes in an integrated learning mode so as to obtain the first federal model.
303. And the federal server respectively sends the first federal model to the participating nodes.
For example, the federation server may directly send the first federation model to the local server of each participating node, or may extract a model gradient or a model parameter, etc., in the first federation model as federation model information and send the federation model information to the local servers of the participating nodes, or may send a verification request to the local server of each participating node, where the verification request carries a storage address of the federation model information of the first federation model or the first federation model, so that the local server of the participating node obtains the federation model information of the first federation model or the first federation model through the storage address.
304. And the local server verifies the first federal model to obtain a first effect score of the first federal model.
For example, the local server obtains a base model, and updates and adjusts the model parameters or the gradients of the base model based on the federal model information, thereby obtaining a first federal model, or may also adjust the gradients or the model parameters of the local model based on the federal model information, thereby obtaining the first federal model.
The local server obtains verification sample data and verifies the number of test sample data that the first federal model correctly identifies or classifies, or may verify at least one model metric of the first federal model based on the verification sample data, thereby obtaining the verification information. The model metrics may include AUC, Accuracy, Precision, Recall and/or F1-Score, among others.
When the verification information contains the verification result of only one verification mode, the local server can directly convert the verification information into an effect score and take the effect score Qn,k as the first effect score; when the verification information contains the verification results of multiple verification modes, each verification result may be weighted, the weighted verification results fused, the effect score Qn,k of the first federal model Mn determined based on the fused verification results, and the effect score Qn,k taken as the first effect score.
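The weighted fusion of multiple verification results into a single effect score can be sketched as follows; the metric names and weights are illustrative assumptions, not values from the patent.

```python
def fuse_scores(metric_results, weights):
    """Weight each verification result (e.g. AUC, Accuracy, F1-Score) and
    fuse them into a single effect score Qn,k for the federal model."""
    return sum(weights[name] * value for name, value in metric_results.items())

# e.g. AUC and Accuracy weighted equally:
q = fuse_scores({"auc": 0.90, "accuracy": 0.80}, {"auc": 0.5, "accuracy": 0.5})
print(round(q, 4))  # → 0.85
```

With a single verification mode, the same function degenerates to passing one metric with weight 1.0.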
305. And the local server carries out disturbance processing on the first effect score and updates a preset disturbed first effect score set based on the disturbed first effect score.
For example, the local server may calculate the differential privacy budget difference between a preset differential privacy budget ε and a preset training privacy budget ε1, determine a verification differential privacy budget ε2 based on that difference, and determine the noise distribution parameters of the noise information based on the verification differential privacy budget. A noise information set obeying the noise distribution parameters is generated based on those parameters, and noise information is randomly screened out of the set to obtain the random noise information for the first effect score. A random number is added to the effect score Qn,k to obtain the disturbed first effect score, as shown specifically in formula (2).
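A hedged sketch of this score perturbation, assuming the budget split ε2 = ε − ε1 and a Laplace mechanism as the concrete noise distribution (the patent leaves the exact form to formula (2)); all names and the sensitivity value are illustrative.

```python
import math
import random

def perturb_score(score, total_eps, train_eps, sensitivity=1.0):
    """Spend the leftover budget ε2 = ε − ε1 on the verification score by
    adding Laplace noise of scale sensitivity/ε2 (inverse-CDF sampling)."""
    eps2 = total_eps - train_eps        # verification differential privacy budget
    b = sensitivity / eps2              # Laplace scale parameter
    u = random.random() - 0.5           # uniform in (-0.5, 0.5)
    noise = -b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return score + noise
```

A larger ε2 yields smaller noise and a more accurate reported score, at the cost of a weaker privacy guarantee for the verification information.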
The local server may directly add the disturbed first effect score to the preset disturbed first effect score set to obtain an updated disturbed first effect score set, and at this time, the updated disturbed first effect score set includes the disturbed first effect score for each received first effect score of the first federal model.
306. And screening out a first effect score after the target disturbance from the updated disturbed first effect score set by the local server, and constructing a model index of the target federal model corresponding to the first effect score after the target disturbance.
For example, the local server may screen the disturbed first effect score with the largest score from the updated disturbed first effect score set to obtain the target disturbed first effect score; or it may sort the updated disturbed first effect score set, screen out at least one candidate disturbed first effect score based on the sorting result, and select the candidate with the largest score as the target disturbed first effect score; or it may screen out the disturbed first verification information as the target disturbed first verification information using an exponential mechanism.
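The exponential-mechanism selection mentioned above can be sketched as follows; the function name, ε, and the sensitivity Δq are illustrative parameters for this sketch.

```python
import math
import random

def exponential_mechanism(scores, eps, sensitivity=1.0):
    """Select an index with probability proportional to exp(ε·q / (2·Δq)),
    so higher-scoring disturbed results are chosen more often while the
    choice itself remains differentially private."""
    m = max(scores)                     # subtract max for numerical stability
    weights = [math.exp(eps * (q - m) / (2.0 * sensitivity)) for q in scores]
    r = random.random() * sum(weights)
    cum = 0.0
    for i, w in enumerate(weights):     # inverse-CDF sampling over weights
        cum += w
        if r <= cum:
            return i
    return len(scores) - 1
```

With a small ε the selection is close to uniform; as ε grows it concentrates on the best-scoring entry, which matches the "largest score" strategy in the limit.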
The local server can obtain the received first federal model set, screen out the federal model corresponding to the first effect score after the target disturbance from the first federal model set to obtain a target federal model, and then construct a model index of the target federal model according to the model information of the target federal model.
307. And the local server sends the first effect score and the model index after the target disturbance to the federal server.
For example, the local server may directly send the target disturbed first effect score and the model index to the federal server, so that the federal server generates a trained federal model based on the target disturbed first effect score and the model index, or may also send a training stopping request to the federal server, where the training stopping request carries the target disturbed first effect score and the storage address of the model index, so that the federal server obtains the target disturbed first effect score and the model index based on the storage address.
308. And the federal server determines the state of convergence of the federal model in the current federal learning according to the first effect score after the target disturbance.
For example, the federal server may obtain a weighting coefficient of the participating node, and perform weighted average processing on the first effect score after the target disturbance based on the weighting coefficient, so as to obtain the current federal model effect score. The method comprises the steps of obtaining historical federal model effect scores corresponding to a historical federal model, sorting the historical federal model effect scores based on the aggregation time of the federal model, screening out a preset number of historical federal model effect scores adjacent to a first federal model from the historical federal model effect scores based on sorting information and the aggregation time to obtain target historical federal model effect scores, and calculating effect score increments between the current federal model effect scores and each target historical federal model effect score. And respectively calculating increment difference values among the effect score increments, determining that the federal model convergence state is converged when the increment difference values do not exceed a preset increment threshold, and determining that the federal model convergence state is not converged when the increment difference values exceed the preset increment threshold.
309. And the federal server generates a trained federal model based on the state of convergence of the federal model and the model index.
For example, when the federal model convergence state is converged, the federal server screens out from the aggregated federal model set the target federal models corresponding to the model indexes {Ln,1, Ln,2, …, Ln,k}, or screens out the target federal model corresponding to each model index from the converged federal model set, and then performs average processing, or weighted average processing, on the target federal models to obtain the trained federal model. The trained federal model is then sent to the local server of each participating node.
And when the convergence state of the federal model is not converged, the federal server sends training information to the local servers of the participating nodes, so that the local servers of the participating nodes return to the step of training the local model based on the training information until the convergence state of the federal model is converged, the trained federal model is obtained, and the trained federal model is sent to the local servers of the participating nodes.
Optionally, after the target disturbed first effect score and the model index are sent to the federal server so that the federal server generates a trained federal model based on them, the local server may further detect whether the federal server has generated the trained federal model. For example, when it detects that the federal server has generated the trained federal model, it receives the trained federal model returned by the federal server; when it detects that the federal server has not, it receives training information returned by the federal server and returns to the step of training the local model based on that training information, until the federal server generates the trained federal model and the local server receives it.
Optionally, after the local server trains the local model, it may also detect whether a federal server exists in the current federal learning scenario. When a federal server is detected, the first model information of the trained first local model may be sent to the federal server. When no federal server is detected, it can be concluded that no federal server exists in the current federal learning scenario; in that case it may further be determined that two participants exist in the current federal learning. Taking the local model of the current participant as the first local model, the local model of the other participant may be the second local model, and the two participants may communicate directly. The first local server may then send the trained first local model to the second local server of the participating node in the current federal learning; for example, when no server is detected, the first local server obtains the node address of the participating node in the current federal learning and sends the first model information of the trained first local model to the second local server of that participating node based on the node address, so that the second local server generates the trained federal model based on the first model information.
It should be noted that in a federal learning scenario without a server, a participant may act both as a participant and in the role of the server in federal learning. That is, the first model information of the trained first local model may be sent to the second local server of the participating node, and second model information of a trained second local model sent by the second local server of the participating node may also be received. When the second model information of the trained second local model sent by the second local server of the participating node is received, the first local server aggregates the trained first local model and the trained second local model based on the second model information to obtain a second federal model, verifies the second federal model to obtain a second effect score, and performs perturbation processing on the second effect score to obtain a disturbed second effect score. It then updates a preset disturbed second effect score set based on the disturbed second effect score, screens out a target disturbed second effect score from the updated set, and sends the target disturbed second effect score to the participating node. When a disturbed third effect score sent by the participating node is received, the target disturbed second effect score and the disturbed third effect score are fused to obtain a fused disturbed effect score.
The first local server obtains the historical fused disturbed effect scores of the locally aggregated historical federal models, calculates the effect score increment between the fused disturbed effect score and the historical fused disturbed effect scores, and determines the federal model convergence state of the current federal learning based on the effect score increment. When the federal model convergence state is converged, it obtains a first candidate federal model corresponding to the target disturbed second effect score and a second candidate federal model corresponding to the disturbed third effect score, and screens out the trained federal model from the first candidate federal model and the second candidate federal model, or may fuse the first candidate federal model and the second candidate federal model to obtain the trained federal model, and so on. When the federal model convergence state is not converged, it returns to the step of training the local model until the federal model convergence state is converged, so as to obtain the trained federal model.
As can be seen from the above, in this embodiment, the local server trains the local model and sends the first model information of the trained first local model to the federal server, so that the federal server aggregates at least one trained first local model based on the first model information. The local server receives the aggregated first federal model returned by the federal server, verifies the first federal model to obtain a first effect score of the first federal model, performs perturbation processing on the first effect score, and updates a preset perturbed first effect score set based on the perturbed first effect score. It then screens out a target perturbed first effect score from the updated perturbed first effect score set, constructs a model index of the target federal model corresponding to the target perturbed first effect score, and sends the target perturbed first effect score and the model index to the federal server, so that the federal server generates a trained federal model based on the target perturbed first effect score and the model index. In this scheme, a differential privacy mechanism is adopted to perturb the first effect score, and information is exchanged between the participants and the server in plaintext form without any cryptographic scheme, which avoids the problem of ciphertext expansion; the communication overhead is very low, and the requirements on network bandwidth and latency are very low, so that the training efficiency of federal learning can be improved.
In order to better implement the above method, an embodiment of the present invention further provides a federal learning device (i.e., a first federal learning device), which may be integrated in a terminal or a server, where the terminal may include a smartphone, a tablet computer, a laptop computer, and/or a personal computer. The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery network (CDN) services, big data, and artificial intelligence platforms.
For example, as shown in fig. 9, the first federal learning device may include a training unit 401, a verification unit 402, a perturbation unit 403, a screening unit 404, and a transmission unit 405 as follows:
(1) a training unit 401;
the training unit 401 is configured to train a local model, and send first model information of the trained first local model to the server, so that the server aggregates at least one trained first local model based on the first model information.
For example, the training unit 401 may be specifically configured to perform differential privacy processing during training of the local model, or to perform differential privacy processing after the local model is trained, so as to obtain a trained first local model meeting the differential privacy requirement. When the trained first local model is obtained by performing differential privacy processing during training of the local model, the current gradient or model parameters may be extracted from the trained first local model as the first model information, and the first model information is sent to the server, so that the server aggregates the at least one trained first local model based on the first model information. When the trained first local model is obtained by performing differential privacy processing after the local model is trained, the trained first local model is sent to the server, so that the server aggregates the at least one trained first local model based on the first model information.
(2) A verification unit 402;
and the verification unit 402 is configured to receive the aggregated first federal model returned by the server, and verify the first federal model to obtain first verification information of the first federal model.
For example, the verification unit 402 may be specifically configured to receive the first federal model returned by the server, or to receive federal model information of the first federal model returned by the server and generate the first federal model based on the federal model information. The verification unit then obtains test sample data, tests the first federal model based on the test sample data to obtain test information of the first federal model, and evaluates the test information to obtain the verification information of the first federal model.
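As a concrete sketch of this verification step (illustrative only; the function names and the use of plain accuracy as the evaluation metric are assumptions, not specified by the patent):

```python
import numpy as np

def verify_model(predict_fn, test_x, test_y):
    """Test a federated model on held-out test sample data and evaluate
    the resulting test information into a single verification score;
    plain accuracy is used here as the (assumed) evaluation metric."""
    preds = predict_fn(test_x)               # test information: predictions on the test samples
    return float(np.mean(preds == test_y))   # evaluated verification information
```

Any scalar quality metric (AUC, F1, loss) could play the same role; the rest of the pipeline only needs a comparable score.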
(3) A perturbation unit 403;
the perturbation unit 403 is configured to perform perturbation processing on the first verification information, and update a preset perturbed first verification information set based on the perturbed first verification information.
For example, the perturbation unit 403 may be specifically configured to determine a noise distribution parameter of the noise information according to a preset differential privacy budget, generate a noise information set obeying the noise distribution parameter based on the noise distribution parameter, and randomly screen noise information from the noise information set to obtain random noise information for the first verification information. The random noise information is then added to the first verification information to obtain the perturbed first verification information.
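The perturbation steps above can be sketched as follows, assuming a Laplace mechanism whose scale is derived from the differential privacy budget (the function name, parameters, and pool size are illustrative assumptions, not taken from the patent):

```python
import numpy as np

def perturb_score(score, epsilon, sensitivity=1.0, pool_size=100, seed=None):
    """Perturb a verification score under a differential privacy budget:
    derive the noise distribution parameter, generate a noise set obeying
    that distribution, randomly screen one noise sample, and add it."""
    rng = np.random.default_rng(seed)
    scale = sensitivity / epsilon                        # noise distribution parameter
    noise_set = rng.laplace(0.0, scale, size=pool_size)  # noise information set
    noise = rng.choice(noise_set)                        # randomly screened noise information
    return score + noise
```

With a larger budget epsilon the scale shrinks and the perturbed score stays close to the raw score; a smaller budget yields stronger perturbation and therefore stronger privacy.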
(4) A screening unit 404;
and the screening unit 404 is configured to screen out the target post-disturbance first verification information from the updated post-disturbance first verification information set, and construct a model index of the target federated model corresponding to the target post-disturbance first verification information.
For example, the screening unit 404 may be specifically configured to screen the post-disturbance first verification information with the largest effect score from the updated post-disturbance first verification information set to obtain the target post-disturbance first verification information. Alternatively, the post-disturbance first verification information in the updated post-disturbance first verification information set may be sorted, at least one candidate post-disturbance first verification information may be screened from the updated set based on the sorting result, and the post-disturbance first verification information with the largest effect score may then be screened from the candidates as the target post-disturbance first verification information.
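A minimal sketch of this screening logic (the dictionary-of-scores representation and the names are assumptions made for illustration):

```python
def screen_target(perturbed_scores, top_k=None):
    """Screen the target perturbed verification information: sort the
    perturbed set by effect score, optionally keep only the top-k
    candidates, and return the (model identifier, score) pair with the
    largest effect score."""
    items = sorted(perturbed_scores.items(), key=lambda kv: kv[1], reverse=True)
    if top_k is not None:
        items = items[:top_k]  # candidate subset screened from the sorted result
    return items[0]            # entry with the largest effect score
```

The returned model identifier can then serve as the model index that is sent back to the server.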
(5) A transmitting unit 405;
the sending unit 405 is configured to send the first verification information after the target disturbance and the model index to a server, so that the server generates a post-training federal model based on the first verification information after the target disturbance and the model index.
For example, the sending unit 405 may be specifically configured to send the first verification information after the target disturbance and the model index to the server, so that the server generates the post-training federal model based on the first verification information after the target disturbance and the model index, or may also send a verification information request to the server, where the verification information request carries a storage address of the first verification information after the target disturbance and the model index, so that the server obtains the first verification information after the target disturbance and the model index based on the storage address, and then the server may generate the post-training federal model based on the first verification information after the target disturbance and the model index.
Optionally, in some embodiments, the first federal learning device may further include a first detection unit 406, as shown in fig. 10, which may specifically be as follows:
and a first detecting unit 406, configured to detect a server in current federal learning, and generate a post-training federal model when the server is not detected.
For example, the first detecting unit 406 may be specifically configured to, when the server is not detected, obtain a node address of a participating node participating in current federal learning, and send first model information of the trained first local model to the participating node based on the node address, so that the participating node generates the trained federal model based on the first model information. When second model information of a trained second local model sent by a participating node is received, the trained first local model and the trained second local model are aggregated based on the second model information to obtain a second federated model, the second federated model is verified to obtain second verification information, the second verification information is disturbed to obtain disturbed second verification information, a preset disturbed second verification information set is updated based on the disturbed second verification information, and a trained federated model is generated based on the updated disturbed second verification information.
Optionally, in some embodiments, the first federal learning device may further include a second detection unit 407, as shown in fig. 11, which may specifically be as follows:
and a second detecting unit 407, configured to detect the post-training federal model generated by the server, so as to receive the post-training federal model.
For example, the second detecting unit 407 may be specifically configured to receive the post-training federal model returned by the server when it is detected that the server generates the post-training federal model, receive training information returned by the server when it is detected that the server does not generate the post-training federal model, and return to execute the step of training the local model based on the training information until the server generates the post-training federal model, and receive the post-training federal model returned by the server.
As can be seen from the above, in this embodiment, the training unit 401 trains the local model and sends the first model information of the trained first local model to the server, so that the server aggregates at least one trained first local model based on the first model information. The verification unit 402 receives the aggregated first federal model returned by the server and verifies the first federal model to obtain the first verification information of the first federal model. The perturbation unit 403 then perturbs the first verification information and updates the preset perturbed first verification information set based on the perturbed first verification information, and the screening unit 404 screens out the target perturbed first verification information from the updated perturbed first verification information set and constructs a model index of the target federal model corresponding to the target perturbed first verification information. Finally, the sending unit 405 sends the target perturbed first verification information and the model index to the server, so that the server generates a post-training federal model based on the target perturbed first verification information and the model index. In this scheme, a differential privacy mechanism is adopted to perturb the first verification information, and information is exchanged between the participant and the server in plaintext form without any cryptographic scheme, which avoids the problem of ciphertext expansion; the communication overhead is very low, and the requirements on network bandwidth and latency are very low, so that the training efficiency of federal learning can be improved.
In order to better implement the above method, an embodiment of the present invention further provides a federated learning apparatus (i.e., a second federated learning apparatus), where the second federated learning apparatus may be integrated in a server, and the server may be a single server or a server cluster formed by multiple servers.
For example, as shown in fig. 12, the second federated learning apparatus may include a first receiving unit 501, an aggregation unit 502, a second receiving unit 503, a determination unit 504, and a generation unit 505, as follows:
(1) a first receiving unit 501;
a first receiving unit 501, configured to receive first model information of a trained first local model, which is sent by at least one participating node participating in current federal learning.
For example, the first receiving unit 501 may be specifically configured to receive first model information of a trained first local model sent by a participating node participating in current federal learning, or may also receive a model aggregation request sent by a participating node participating in current federal learning, where the model aggregation request carries a storage address of the trained first local model, and obtain the first model information of the trained first local model based on the storage address.
(2) A polymerization unit 502;
an aggregating unit 502, configured to aggregate the trained first local model based on the first model information, so as to obtain a first federated model.
For example, the aggregation unit 502 may be specifically configured to construct the trained first local models based on the first model information and aggregate the trained first local models of the participating parties using federated averaging to obtain the first federal model, or to integrate and fuse the trained first local models of the participating nodes in an ensemble learning manner to obtain the first federal model.
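Federated averaging, the first aggregation option mentioned, can be sketched as a weighted parameter-wise average (the list-of-arrays model representation and the function name are assumptions for illustration):

```python
import numpy as np

def federated_average(local_models, weights=None):
    """Aggregate trained local models into a federated model by taking a
    weighted parameter-wise average; each model is a list of parameter
    arrays, and weights are, e.g., relative local dataset sizes."""
    if weights is None:
        weights = [1.0 / len(local_models)] * len(local_models)
    aggregated = [np.zeros_like(layer, dtype=float) for layer in local_models[0]]
    for model, w in zip(local_models, weights):
        for agg_layer, layer in zip(aggregated, model):
            agg_layer += w * layer  # accumulate this participant's weighted parameters
    return aggregated
```

Weighting by local dataset size keeps the aggregate unbiased when participants hold different amounts of data.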
(3) A second receiving unit 503;
the second receiving unit 503 is configured to send the first federal model to the participating nodes respectively, and receive the target post-disturbance first verification information and the model index returned by the participating nodes.
For example, the second receiving unit 503 may be specifically configured to send the first federated model to the participating nodes, and receive the first verification information after the target disturbance and the model index sent by the participating nodes, or may also receive a model generation request, where the model generation request carries a storage address of the first verification information after the target disturbance and the model index, and the first verification information after the target disturbance and the model index may be obtained based on the storage address.
(4) A determination unit 504;
the determining unit 504 is configured to determine a federal model convergence state in the current federal learning according to the first verification information after the target disturbance.
For example, the determining unit 504 may be specifically configured to fuse the first verification information after the target disturbance to obtain current fusion verification information corresponding to the first federal model, obtain historical fusion verification information corresponding to the historical federal model, calculate an information increment between the current fusion verification information and the historical fusion verification information, and determine a federal model convergence state in current federal learning based on the information increment.
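The convergence decision can be sketched as comparing the fused verification information across rounds against a threshold (the threshold value and all names here are illustrative assumptions):

```python
def convergence_state(current_fused, historical_fused, threshold=1e-3):
    """Determine the federated model convergence state from the
    information increment between the current fused verification
    information and that of the historical federated model."""
    increment = abs(current_fused - historical_fused)
    return "converged" if increment < threshold else "not_converged"
```

When the increment stays below the threshold the score has stopped improving meaningfully, so another training round is unlikely to help.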
(5) A generation unit 505;
and the generating unit 505 is configured to generate the post-training federated model based on the federated model convergence state and the model index.
For example, the generating unit 505 may be specifically configured to generate a post-training federal model based on the model index and send the post-training federal model to the participating node when the state of convergence of the federal model is converged, and send training information to the participating node when the state of convergence of the federal model is not converged, so that the participating node returns to the step of training the local model based on the training information until the state of convergence of the federal model is converged, to obtain the post-training federal model, and send the post-training federal model to the participating node.
In a specific implementation, the above units may be implemented as independent entities, or may be combined arbitrarily to be implemented as the same or several entities, and the specific implementation of the above units may refer to the foregoing method embodiments, which are not described herein again.
As can be seen from the above, in this embodiment, after the first receiving unit 501 receives the first model information of the trained first local model sent by at least one participating node participating in the current federal learning, the aggregation unit 502 aggregates the trained first local model based on the first model information to obtain the first federal model. The second receiving unit 503 then sends the first federal model to the participating nodes respectively and receives the target post-disturbance first verification information and the model indexes returned by the participating nodes, the determining unit 504 determines the federal model convergence state in the current federal learning according to the target post-disturbance first verification information, and the generating unit 505 generates the post-training federal model based on the federal model convergence state and the model index. Because the information interaction between the server and the participating nodes is carried out in plaintext form, no cryptographic scheme is needed and the problem of ciphertext expansion is avoided; the communication overhead is very low, and the requirements on network bandwidth and latency are very low, so that the training efficiency of federal learning can be improved.
An embodiment of the present invention further provides an electronic device, as shown in fig. 13, which shows a schematic structural diagram of the electronic device according to the embodiment of the present invention, specifically:
the electronic device may include components such as a processor 601 of one or more processing cores, memory 602 of one or more computer-readable storage media, a power supply 603, and an input unit 604. Those skilled in the art will appreciate that the electronic device configuration shown in fig. 13 does not constitute a limitation of the electronic device and may include more or fewer components than shown, or some components may be combined, or a different arrangement of components. Wherein:
the processor 601 is a control center of the electronic device, connects various parts of the entire electronic device using various interfaces and lines, performs various functions of the electronic device and processes data by operating or executing software programs and/or modules stored in the memory 602 and calling data stored in the memory 602. Optionally, processor 601 may include one or more processing cores; preferably, the processor 601 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 601.
The memory 602 may be used to store software programs and modules, and the processor 601 executes various functional applications and data processing by operating the software programs and modules stored in the memory 602. The memory 602 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data created according to use of the electronic device, and the like. Further, the memory 602 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. Accordingly, the memory 602 may also include a memory controller to provide the processor 601 with access to the memory 602.
The electronic device further comprises a power supply 603 for supplying power to the various components, and preferably, the power supply 603 is logically connected to the processor 601 through a power management system, so that functions of managing charging, discharging, power consumption, and the like are realized through the power management system. The power supply 603 may also include any component including one or more dc or ac power sources, recharging systems, power failure detection circuitry, power converters or inverters, power status indicators, and the like.
The electronic device may further include an input unit 604, and the input unit 604 may be used to receive input numeric or character information and generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.
Although not shown, the electronic device may further include a display unit and the like, which are not described in detail herein. Specifically, in this embodiment, the processor 601 in the electronic device loads the executable file corresponding to the process of one or more application programs into the memory 602 according to the following instructions, and the processor 601 runs the application program stored in the memory 602, thereby implementing various functions as follows:
training the local model, and sending first model information of the trained first local model to a server, so that the server aggregates at least one trained first local model based on the first model information; receiving the aggregated first federal model returned by the server, and verifying the first federal model to obtain first verification information of the first federal model; perturbing the first verification information, and updating a preset perturbed first verification information set based on the perturbed first verification information; screening out target perturbed first verification information from the updated perturbed first verification information set, and constructing a model index of the target federal model corresponding to the target perturbed first verification information; and sending the target perturbed first verification information and the model index to the server, so that the server generates a post-training federal model based on the target perturbed first verification information and the model index. When the server is not detected, the node address of a participating node participating in the current federal learning is obtained, and the first model information of the trained first local model is sent to the participating node based on the node address, so that the participating node generates the trained federal model based on the first model information.
When second model information of a trained second local model sent by a participating node is received, the trained first local model and the trained second local model are aggregated based on the second model information to obtain a second federated model, the second federated model is verified to obtain second verification information, the second verification information is disturbed to obtain disturbed second verification information, a preset disturbed second verification information set is updated based on the disturbed second verification information, and a trained federated model is generated based on the updated disturbed second verification information.
Or
After first model information of a trained first local model sent by at least one participatory node participating in current federal learning is received, the trained first local model is aggregated based on the first model information to obtain a first federal model, then the first federal model is respectively sent to the participatory nodes, first verification information and model indexes after target disturbance returned by the participatory nodes are received, then the state of federal model convergence in current federal learning is determined according to the first verification information after target disturbance, and the trained federal model is generated based on the state of federal model convergence and the model indexes.
The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.
As can be seen from the above, in the embodiment of the present application, after a local model is trained, first model information of the trained first local model is sent to a server, so that the server aggregates at least one trained first local model based on the first model information. The aggregated first federated model returned by the server is received and verified to obtain first verification information of the first federated model; the first verification information is then perturbed, and a preset perturbed first verification information set is updated based on the perturbed first verification information. Target perturbed first verification information is screened out from the updated perturbed first verification information set, a model index of the target federated model corresponding to the target perturbed first verification information is constructed, and the target perturbed first verification information and the model index are sent to the server, so that the server generates a post-training federal model based on the target perturbed first verification information and the model index. In this scheme, the first verification information is perturbed using a differential privacy mechanism, and information is exchanged between the participants and the server in plaintext form without any cryptographic scheme, which avoids the problem of ciphertext expansion; the communication overhead is very low, and the requirements on network bandwidth and latency are very low, so that the training efficiency of federal learning can be improved.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be performed by instructions or by associated hardware controlled by the instructions, which may be stored in a computer readable storage medium and loaded and executed by a processor.
To this end, embodiments of the present invention provide a computer-readable storage medium having stored therein a plurality of instructions that can be loaded by a processor to perform the steps of any of the federal learning methods provided by embodiments of the present invention. For example, the instructions may perform the steps of:
training the local model, and sending first model information of the trained first local model to a server, so that the server aggregates at least one trained first local model based on the first model information; receiving the aggregated first federal model returned by the server, and verifying the first federal model to obtain first verification information of the first federal model; perturbing the first verification information, and updating a preset perturbed first verification information set based on the perturbed first verification information; screening out target perturbed first verification information from the updated perturbed first verification information set, and constructing a model index of the target federal model corresponding to the target perturbed first verification information; and sending the target perturbed first verification information and the model index to the server, so that the server generates a post-training federal model based on the target perturbed first verification information and the model index. When the server is not detected, the node address of a participating node participating in the current federal learning is obtained, and the first model information of the trained first local model is sent to the participating node based on the node address, so that the participating node generates the trained federal model based on the first model information.
When second model information of a trained second local model sent by a participating node is received, the trained first local model and the trained second local model are aggregated based on the second model information to obtain a second federated model, the second federated model is verified to obtain second verification information, the second verification information is disturbed to obtain disturbed second verification information, a preset disturbed second verification information set is updated based on the disturbed second verification information, and a trained federated model is generated based on the updated disturbed second verification information.
Or
After first model information of a trained first local model sent by at least one participatory node participating in current federal learning is received, the trained first local model is aggregated based on the first model information to obtain a first federal model, then the first federal model is respectively sent to the participatory nodes, first verification information and model indexes after target disturbance returned by the participatory nodes are received, then the state of federal model convergence in current federal learning is determined according to the first verification information after target disturbance, and the trained federal model is generated based on the state of federal model convergence and the model indexes.
The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.
Wherein the computer-readable storage medium may include: read Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
Since the instructions stored in the computer-readable storage medium may execute the steps in any federal learning method provided in the embodiments of the present invention, the beneficial effects that any federal learning method provided in the embodiments of the present invention can achieve may be achieved; for details, see the foregoing embodiments, which are not described herein again.
According to an aspect of the application, a computer program product or computer program is provided, comprising computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the methods provided in the various alternative implementations of the federal learning aspect or the federal model training aspect described above.
The federated learning method, apparatus, electronic device, and computer-readable storage medium provided by the embodiments of the present invention are described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is intended only to help understand the method and its core idea. Meanwhile, those skilled in the art may, following the idea of the present invention, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (17)

1. A federated learning method, characterized by comprising the following steps:
training a local model, and sending first model information of the trained first local model to a server, so that the server aggregates at least one trained first local model based on the first model information;
receiving a first federated model after aggregation returned by the server, and verifying the first federated model to obtain first verification information of the first federated model;
disturbing the first verification information, and updating a preset disturbed first verification information set based on the disturbed first verification information;
screening out target disturbed first verification information from the updated disturbed first verification information set, and constructing a model index of a target federal model corresponding to the target disturbed first verification information;
and sending the first verification information after the target disturbance and the model index to the server so that the server can generate a post-training federated model based on the first verification information after the target disturbance and the model index.
2. The federal learning method as claimed in claim 1, wherein the perturbing the first verification information includes:
acquiring random noise information of the first verification information according to a preset differential privacy budget;
and adding the random noise information into the first verification information to obtain disturbed first verification information.
3. The federal learning method as claimed in claim 2, wherein the obtaining of the random noise information of the first verification information according to the preset differential privacy budget includes:
determining a noise distribution parameter of noise information according to a preset differential privacy budget;
generating a noise information set obeying the noise distribution parameter distribution based on the noise distribution parameter;
and randomly screening out noise information in the noise information set to obtain random noise information of the first verification information.
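Claims 2 and 3 describe deriving a noise distribution parameter from a differential privacy budget, generating a noise set that obeys that distribution, and randomly screening one sample from the set. A hedged sketch using the Laplace mechanism follows; the choice of mechanism, the sensitivity parameter, and the pool size are assumptions for illustration and are not fixed by the claims:

```python
import numpy as np

def perturb_metric(value, epsilon, sensitivity=1.0, pool_size=64, seed=None):
    """Perturb a scalar verification metric under a DP budget epsilon (assumed Laplace mechanism)."""
    rng = np.random.default_rng(seed)
    scale = sensitivity / epsilon              # noise distribution parameter from the budget
    pool = rng.laplace(0.0, scale, pool_size)  # noise set obeying that distribution
    noise = rng.choice(pool)                   # randomly screen one sample from the set
    return value + noise
```

A smaller `epsilon` (tighter privacy budget) yields a larger Laplace scale and therefore stronger perturbation of the reported metric.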
4. A federal learning method as claimed in any one of claims 1 to 3, wherein said training of the local model comprises:
obtaining local training sample data, and calculating the gradient of the local model through the local training sample data;
carrying out differential privacy processing on the gradient, and converging the local model based on the processed gradient to obtain a trained first local model;
the sending of the first model information of the trained first local model to the server so that the server aggregates at least one trained first local model based on the first model information includes: extracting a current gradient or model parameters from the trained first local model to serve as first model information, and sending the first model information to a server so that the server can aggregate at least one trained first local model based on the first model information.
5. The federal learning method as claimed in claim 3, wherein the training of local models comprises:
obtaining local training sample data, and training a local model through the local training sample data to obtain a first local model after initial training;
extracting an initial gradient or an initial model parameter from the first local model after the initial training to obtain initial model information;
and carrying out differential privacy processing on the initial model information to obtain first model information of the trained first local model.
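Claims 4 and 5 apply differential privacy processing to the gradient, or to the extracted model information, before it leaves the node. One common way to realize this, sketched below under the assumption of gradient clipping plus Gaussian noise (DP-SGD style; the clip norm and noise multiplier are illustrative values, not specified by the patent):

```python
import numpy as np

def dp_process_gradient(grad, clip_norm=1.0, noise_multiplier=1.1, seed=None):
    """Clip a gradient to bound sensitivity, then add Gaussian-mechanism noise."""
    rng = np.random.default_rng(seed)
    g = np.asarray(grad, dtype=float)
    norm = np.linalg.norm(g)
    g = g * min(1.0, clip_norm / max(norm, 1e-12))  # clip to the sensitivity bound
    g = g + rng.normal(0.0, noise_multiplier * clip_norm, g.shape)  # add calibrated noise
    return g
```

The same processing can be applied to the initial gradient or initial model parameters of claim 5 before they are sent as first model information.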
6. The federal learning method as claimed in claim 1, wherein after the training of the local model, the method further comprises:
when the server is not detected, acquiring the node address of a participating node participating in current federal learning;
and sending the first model information of the trained first local model to the participating node based on the node address so that the participating node generates a post-training federated model based on the first model information.
7. The federal learning method as claimed in claim 6, further comprising:
when second model information of a trained second local model sent by the participating node is received, aggregating the trained first local model and the trained second local model based on the second model information to obtain a second federated model;
verifying the second federated model to obtain second verification information, and performing disturbance processing on the second verification information to obtain disturbed second verification information;
and updating a preset post-disturbance second verification information set based on the post-disturbance second verification information, and generating a post-training federal model based on the updated post-disturbance second verification information set.
8. The federal learning method as claimed in claim 7, wherein the generating a trained federal model based on the updated post-perturbation second verification information set comprises:
screening out target disturbed second verification information from the updated disturbed second verification information set, and sending the target disturbed second verification information to the participating node;
when post-disturbance third verification information sent by the participating node is received, fusing the target post-disturbance second verification information and the post-disturbance third verification information to obtain fused post-disturbance verification information;
and generating a post-training federal model based on the fused post-disturbance verification information, the target post-disturbance second verification information and the post-disturbance third verification information.
9. The federal learning method as claimed in any one of claims 1 to 3, wherein after the sending of the target post-disturbance first verification information and the model index to the server so that the server generates a post-training federated model based on the target post-disturbance first verification information and the model index, the method further comprises:
when detecting that the server generates a post-training federated model, receiving the post-training federated model returned by the server;
when detecting that the server has not generated the post-training federated model, receiving training information returned by the server, and returning to execute the step of training the local model based on the training information until the server generates the post-training federated model, and then receiving the post-training federated model returned by the server.
10. A method for federated learning, comprising:
receiving first model information of a trained first local model sent by at least one participating node participating in current federal learning;
aggregating the trained first local model based on the first model information to obtain a first federated model;
respectively sending the first federated model to the participating nodes, and receiving target post-disturbance first verification information and model indexes returned by the participating nodes;
determining the federal model convergence state in the current federal learning according to the first verification information after the target disturbance;
and generating a post-training federal model based on the federal model convergence state and the model index.
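The aggregation step of the server-side method above is typically a weighted average of participant parameters (FedAvg-style). A minimal sketch, assuming equal weights by default; the weighting scheme is an assumption, since the claim does not fix one:

```python
import numpy as np

def aggregate(local_models, weights=None):
    """Weighted average of participant model parameters (FedAvg-style sketch)."""
    n = len(local_models)
    if weights is None:
        weights = [1.0 / n] * n  # equal weighting unless e.g. sample counts are supplied
    return sum(w * np.asarray(m, dtype=float) for w, m in zip(weights, local_models))
```

In practice the weights are often proportional to each node's local training sample count, so that nodes with more data contribute more to the first federated model.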
11. The federal learning method as claimed in claim 10, wherein the determining the federal model convergence state in current federal learning according to the target post-disturbance first verification information comprises:
fusing the first verification information after the target disturbance to obtain current fusion verification information corresponding to the first federal model;
acquiring historical fusion verification information corresponding to a historical federated model, and calculating an information increment between the current fusion verification information and the historical fusion verification information;
and determining the convergence state of the federal model in the current federal learning based on the information increment.
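Claim 11 fuses the per-node perturbed verification information and compares it with the fused value of the previous round. A minimal sketch, assuming fusion by simple averaging and a fixed increment threshold (both assumptions; the patent does not specify the fusion operator or the threshold):

```python
def converged(current_metrics, history, tol=1e-3):
    """Decide convergence from the increment between fused validation metrics."""
    fused = sum(current_metrics) / len(current_metrics)  # fuse per-node perturbed metrics
    if not history:
        history.append(fused)
        return False  # no historical fusion information yet
    increment = abs(fused - history[-1])  # information increment vs. the last round
    history.append(fused)
    return increment < tol
```

When the increment falls below the threshold, the server can declare the federal model convergence state "converged" and build the post-training federated model from the reported model indexes.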
12. The federal learning method as claimed in claim 10, wherein the generating a post-training federal model based on the federal model convergence state and model index comprises:
when the federal model convergence state is converged, generating a trained federal model based on the model index, and sending the trained federal model to the participating nodes;
and when the federal model convergence state is not converged, sending training information to the participating nodes so that the participating nodes can return to execute the step of training the local model based on the training information until the federal model convergence state is converged, so as to obtain a trained federal model, and sending the trained federal model to the participating nodes.
13. A federated learning apparatus, characterized by comprising:
the training unit is used for training the local models and sending first model information of the trained first local model to the server so that the server can aggregate at least one trained first local model based on the first model information;
the verification unit is used for receiving the aggregated first federal model returned by the server and verifying the first federal model to obtain first verification information of the first federal model;
the disturbance unit is used for carrying out disturbance processing on the first verification information and updating a preset disturbed first verification information set based on the disturbed first verification information;
the screening unit is used for screening out the first verification information after the target disturbance from the updated disturbed first verification information set and constructing a model index of a target federal model corresponding to the first verification information after the target disturbance;
and the sending unit is used for sending the first verification information after the target disturbance and the model index to the server so that the server can generate a post-training federated model based on the first verification information after the target disturbance and the model index.
14. A federated learning apparatus, characterized by comprising:
the first receiving unit is used for receiving first model information of a trained first local model, which is sent by at least one participating node participating in current federal learning;
the aggregation unit is used for aggregating the trained first local model based on the first model information to obtain a first federated model;
the second receiving unit is used for respectively sending the first federated model to the participating nodes and receiving target post-disturbance first verification information and model indexes returned by the participating nodes;
the determining unit is used for determining the federal model convergence state in the current federal learning according to the first verification information after the target disturbance;
and the generating unit is used for generating the post-training federal model based on the state of convergence of the federal model and the model index.
15. An electronic device comprising a processor and a memory, the memory storing an application program, the processor being configured to execute the application program in the memory to perform the steps of the federal learning method as claimed in any of claims 1 to 12.
16. A computer program product comprising computer program/instructions, characterized in that the computer program/instructions, when executed by a processor, implement the steps of the federal learning method as claimed in any of claims 1 to 12.
17. A computer readable storage medium storing instructions adapted to be loaded by a processor to perform the steps of the federal learning method as claimed in any of claims 1 to 12.
CN202210272358.4A 2022-03-18 2022-03-18 Federal learning method, device, electronic equipment and computer readable storage medium Pending CN114662705A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210272358.4A CN114662705A (en) 2022-03-18 2022-03-18 Federal learning method, device, electronic equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210272358.4A CN114662705A (en) 2022-03-18 2022-03-18 Federal learning method, device, electronic equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN114662705A true CN114662705A (en) 2022-06-24

Family

ID=82029930

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210272358.4A Pending CN114662705A (en) 2022-03-18 2022-03-18 Federal learning method, device, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN114662705A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115329032A (en) * 2022-10-14 2022-11-11 杭州海康威视数字技术股份有限公司 Federal dictionary based learning data transmission method, device, equipment and storage medium
CN115329032B (en) * 2022-10-14 2023-03-24 杭州海康威视数字技术股份有限公司 Learning data transmission method, device, equipment and storage medium based on federated dictionary

Similar Documents

Publication Publication Date Title
CN112181666B (en) Equipment assessment and federal learning importance aggregation method based on edge intelligence
Lu et al. Differentially private asynchronous federated learning for mobile edge computing in urban informatics
CN108924836B (en) A kind of edge side physical layer channel authentication method based on deep neural network
CN108764453B (en) Modeling method and action prediction system for multi-agent synchronous game
CN111932386B (en) User account determining method and device, information pushing method and device, and electronic equipment
CN113435472A (en) Vehicle-mounted computing power network user demand prediction method, system, device and medium
Fan et al. Federated generative adversarial learning
EP4350572A1 (en) Method, apparatus and system for generating neural network model, devices, medium and program product
CN113128701A (en) Sample sparsity-oriented federal learning method and system
Liu et al. Blockchain-based task offloading for edge computing on low-quality data via distributed learning in the internet of energy
CN110889759A (en) Credit data determination method, device and storage medium
Qu et al. Fl-sec: Privacy-preserving decentralized federated learning using signsgd for the internet of artificially intelligent things
CN112817563B (en) Target attribute configuration information determining method, computer device, and storage medium
CN114662705A (en) Federal learning method, device, electronic equipment and computer readable storage medium
CN110874638A (en) Behavior analysis-oriented meta-knowledge federation method, device, electronic equipment and system
Zhang et al. D2D-LSTM: LSTM-based path prediction of content diffusion tree in device-to-device social networks
CN112541556A (en) Model construction optimization method, device, medium, and computer program product
CN104965846A (en) Virtual human establishing method on MapReduce platform
CN116502709A (en) Heterogeneous federal learning method and device
Yan et al. Federated clustering with GAN-based data synthesis
CN112044082B (en) Information detection method and device and computer readable storage medium
CN111984842B (en) Bank customer data processing method and device
CN115130536A (en) Training method of feature extraction model, data processing method, device and equipment
CN104955059B (en) Cellular network base stations state time-varying model method for building up based on Bayesian network
CN104427546A (en) Method, simulation platform and system for matching mobile network indexes with user experience

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40070910

Country of ref document: HK